Copying Changes Between Branches

Now you and Sally are working on parallel branches of the project: you're working on a private branch, and Sally is working on the trunk, or main line of development.

For projects that have a large number of contributors, it's common for most people to have working copies of the trunk. Whenever someone needs to make a long-running change that is likely to disrupt the trunk, a standard procedure is to create a private branch and commit changes there until all the work is complete.

So, the good news is that you and Sally aren't interfering with each other. The bad news is that it's very easy to drift too far apart. Remember that one of the problems with the “crawl in a hole” strategy is that by the time you're finished with your branch, it may be near-impossible to merge your changes back into the trunk without a huge number of conflicts.

Instead, you and Sally might continue to share changes as you work. It's up to you to decide which changes are worth sharing; Subversion gives you the ability to selectively “copy” changes between branches. And when you're completely finished with your branch, your entire set of branch changes can be copied back into the trunk.

Copying Specific Changes

In the previous section, we mentioned that both you and Sally made changes to integer.c on different branches. If you look at Sally's log message for revision 344, you can see that she fixed some spelling errors. No doubt, your copy of the same file still has the same spelling errors. It's likely that your future changes to this file will be affecting the same areas that have the spelling errors, so you're in for some potential conflicts when you merge your branch someday. It's better, then, to receive Sally's change now, before you start working too heavily in the same places.

It's time to use the svn merge command. This command, it turns out, is a very close cousin to the svn diff command (which you read about in Chapter 2, Basic Usage). Both commands are able to compare any two objects in the repository and describe the differences. For example, you can ask svn diff to show you the exact change made by Sally in revision 344:

$ svn diff -c 344 http://svn.example.com/repos/calc/trunk

Index: integer.c
===================================================================
--- integer.c	(revision 343)
+++ integer.c	(revision 344)
@@ -147,7 +147,7 @@
     case 6:  sprintf(info->operating_system, "HPFS (OS/2 or NT)"); break;
     case 7:  sprintf(info->operating_system, "Macintosh"); break;
     case 8:  sprintf(info->operating_system, "Z-System"); break;
-    case 9:  sprintf(info->operating_system, "CPM"); break;
+    case 9:  sprintf(info->operating_system, "CP/M"); break;
     case 10:  sprintf(info->operating_system, "TOPS-20"); break;
     case 11:  sprintf(info->operating_system, "NTFS (Windows NT)"); break;
     case 12:  sprintf(info->operating_system, "QDOS"); break;
@@ -164,7 +164,7 @@
     low = (unsigned short) read_byte(gzfile);  /* read LSB */
     high = (unsigned short) read_byte(gzfile); /* read MSB */
     high = high << 8;  /* interpret MSB correctly */
-    total = low + high; /* add them togethe for correct total */
+    total = low + high; /* add them together for correct total */

     info->extra_header = (unsigned char *) my_malloc(total);
     fread(info->extra_header, total, 1, gzfile);
@@ -241,7 +241,7 @@
      Store the offset with ftell() ! */

   if ((info->data_offset = ftell(gzfile))== -1) {
-    printf("error: ftell() retturned -1.\n");
+    printf("error: ftell() returned -1.\n");
     exit(1);
   }

@@ -249,7 +249,7 @@
   printf("I believe start of compressed data is %u\n", info->data_offset);
   #endif

-  /* Set postion eight bytes from the end of the file. */
+  /* Set position eight bytes from the end of the file. */

   if (fseek(gzfile, -8, SEEK_END)) {
     printf("error: fseek() returned non-zero\n");

The svn merge command is almost exactly the same. Instead of printing the differences to your terminal, however, it applies them directly to your working copy as local modifications:

$ svn merge -c 344 http://svn.example.com/repos/calc/trunk
U  integer.c

$ svn status
M  integer.c

The output of svn merge shows that your copy of integer.c was patched. It now contains Sally's change—the change has been “copied” from the trunk to your working copy of your private branch, and now exists as a local modification. At this point, it's up to you to review the local modification and make sure it works correctly.

In another scenario, it's possible that things may not have gone so well, and that integer.c may have entered a conflicted state. You might need to resolve the conflict using standard procedures (see Chapter 2, Basic Usage), or if you decide that the merge was a bad idea altogether, simply give up and svn revert the local change.

But assuming that you've reviewed the merged change, you can svn commit the change as usual. At that point, the change has been merged into your repository branch. In version control terminology, this act of copying changes between branches is commonly called porting changes.

When you commit the local modification, make sure your log message mentions that you're porting a specific change from one branch to another. For example:

$ svn commit -m "integer.c: ported r344 (spelling fixes) from trunk."
Sending        integer.c
Transmitting file data .
Committed revision 360.

As you'll see in the next sections, this is a very important “best practice” to follow.

A word of warning: while svn diff and svn merge are very similar in concept, they do have different syntax in many cases. Be sure to read about them in Chapter 9, Subversion Complete Reference for details, or ask svn help. For example, svn merge requires a working-copy path as a target, i.e. a place where it should apply the tree-changes. If the target isn't specified, it assumes you are trying to perform one of the following common operations:

  1. You want to merge directory changes into your current working directory.

  2. You want to merge the changes in a specific file into a file by the same name which exists in your current working directory.

If you are merging a directory and haven't specified a target path, svn merge assumes the first case above and tries to apply the changes into your current directory. If you are merging a file, and that file (or a file by the same name) exists in your current working directory, svn merge assumes the second case and tries to apply the changes to a local file with the same name.

If you want changes applied somewhere else, you'll need to say so. For example, if you're sitting in the parent directory of your working copy, you'll have to specify the target directory to receive the changes:

$ svn merge -c 344 http://svn.example.com/repos/calc/trunk my-calc-branch
U   my-calc-branch/integer.c

The Key Concept Behind Merging

You've now seen an example of the svn merge command, and you're about to see several more. If you're feeling confused about exactly how merging works, you're not alone. Many users (especially those new to version control) are initially perplexed about the proper syntax of the command, and about how and when the feature should be used. But fear not, this command is actually much simpler than you think! There's a very easy technique for understanding exactly how svn merge behaves.

The main source of confusion is the name of the command. The term “merge” somehow denotes that branches are combined together, or that there's some sort of mysterious blending of data going on. That's not the case. A better name for the command might have been svn diff-and-apply, because that's all that happens: two repository trees are compared, and the differences are applied to a working copy.

The command takes three arguments:

  1. An initial repository tree (often called the left side of the comparison),

  2. A final repository tree (often called the right side of the comparison),

  3. A working copy to accept the differences as local changes (often called the target of the merge).

Once these three arguments are specified, the two trees are compared, and the resulting differences are applied to the target working copy as local modifications. When the command is done, the results are no different than if you had hand-edited the files, or run various svn add or svn delete commands yourself. If you like the results, you can commit them. If you don't like the results, you can simply svn revert all of the changes.

The syntax of svn merge allows you to specify the three necessary arguments rather flexibly. Here are some examples:

$ svn merge http://svn.example.com/repos/branch1@150 \
            http://svn.example.com/repos/branch2@212 \
            my-working-copy

$ svn merge -r 100:200 http://svn.example.com/repos/trunk my-working-copy

$ svn merge -r 100:200 http://svn.example.com/repos/trunk

The first syntax lays out all three arguments explicitly, naming each tree in the form URL@REV and naming the working copy target. The second syntax can be used as a shorthand for situations when you're comparing two different revisions of the same URL. The last syntax shows how the working-copy argument is optional; if omitted, it defaults to the current directory.

Best Practices for Merging

Tracking Merges Manually

Merging changes sounds simple enough, but in practice it can become a headache. The problem is that if you repeatedly merge changes from one branch to another, you might accidentally merge the same change twice. When this happens, sometimes things will work fine. When patching a file, Subversion typically notices if the file already has the change, and does nothing. But if the already-existing change has been modified in any way, you'll get a conflict.

Ideally, your version control system should prevent the double-application of changes to a branch. It should automatically remember which changes a branch has already received, and be able to list them for you. It should use this information to help automate merges as much as possible.

Unfortunately, Subversion is not such a system; it does not yet record any information about merge operations. [22] When you commit local modifications, the repository has no idea whether those changes came from running svn merge, or from just hand-editing the files.

What does this mean to you, the user? It means that until the day Subversion grows this feature, you'll have to track merge information yourself. The best place to do this is in the commit log-message. As demonstrated in prior examples, it's recommended that your log-message mention a specific revision number (or range of revisions) that are being merged into your branch. Later on, you can run svn log to review which changes your branch already contains. This will allow you to carefully construct a subsequent svn merge command that won't be redundant with previously ported changes.

In the next section, we'll show some examples of this technique in action.

Previewing Merges

First, always remember to do your merge into a working copy that has no local edits and has been recently updated. If your working copy isn't “clean” in these ways, you can run into some headaches.

Assuming your working copy is tidy, merging isn't a particularly high-risk operation. If you get the merge wrong the first time, simply svn revert the changes and try again.

If you've merged into a working copy that already has local modifications, the changes applied by a merge will be mixed with your pre-existing ones, and running svn revert is no longer an option. The two sets of changes may be impossible to separate.

In cases like this, people take comfort in being able to predict or examine merges before they happen. One simple way to do that is to run svn diff with the same arguments you plan to pass to svn merge, as we already showed in our first example of merging. Another method of previewing is to pass the --dry-run option to the merge command:

$ svn merge --dry-run -c 344 http://svn.example.com/repos/calc/trunk
U  integer.c

$ svn status
#  nothing printed, working copy is still unchanged.

The --dry-run option doesn't actually apply any local changes to the working copy. It only shows status codes that would be printed in a real merge. It's useful for getting a “high level” preview of the potential merge, for those times when running svn diff gives too much detail.

Merge Conflicts

Just like the svn update command, svn merge applies changes to your working copy. And therefore it's also capable of creating conflicts. The conflicts produced by svn merge, however, are sometimes different, and this section explains those differences.

To begin with, assume that your working copy has no local edits. When you svn update to a particular revision, the changes sent by the server will always apply “cleanly” to your working copy. The server produces the delta by comparing two trees: a virtual snapshot of your working copy, and the revision tree you're interested in. Because the left-hand side of the comparison is exactly equal to what you already have, the delta is guaranteed to correctly convert your working copy into the right-hand tree.

But svn merge has no such guarantees and can be much more chaotic: the user can ask the server to compare any two trees at all, even ones that are unrelated to the working copy! This means there's large potential for human error. Users will sometimes compare the wrong two trees, creating a delta that doesn't apply cleanly. svn merge will do its best to apply as much of the delta as possible, but some parts may be impossible. Just as the Unix patch command sometimes complains about “failed hunks”, svn merge will complain about “skipped targets”:

$ svn merge -r 1288:1351 http://svn.example.com/repos/branch
U  foo.c
U  bar.c
Skipped missing target: 'baz.c'
U  glub.c
C  glorb.h

$

In the previous example it might be the case that baz.c exists in both snapshots of the branch being compared, and the resulting delta wants to change the file's contents, but the file doesn't exist in the working copy. Whatever the case, the “skipped” message means that the user is most likely comparing the wrong two trees; they're the classic sign of user error. When this happens, it's easy to recursively revert all the changes created by the merge (svn revert --recursive), delete any unversioned files or directories left behind after the revert, and re-run svn merge with different arguments.

Also notice that the previous example shows a conflict happening on glorb.h. We already stated that the working copy has no local edits: how can a conflict possibly happen? Again, because the user can use svn merge to define and apply any old delta to the working copy, that delta may contain textual changes that don't cleanly apply to a working file, even if the file has no local modifications.

Another small difference between svn update and svn merge are the names of the full-text files created when a conflict happens. In the section called “Resolve Conflicts (Merging Others' Changes)”, we saw that an update produces files named filename.mine, filename.rOLDREV, and filename.rNEWREV. When svn merge produces a conflict, though, it creates three files named filename.working, filename.left, and filename.right. In this case, the terms “left” and “right” are describing which side of the double-tree comparison the file came from. In any case, these differing names will help you distinguish between conflicts that happened as a result of an update versus ones that happened as a result of a merge.

Noticing or Ignoring Ancestry

When conversing with a Subversion developer, you might very likely hear reference to the term ancestry. This word is used to describe the relationship between two objects in a repository: if they're related to each other, then one object is said to be an ancestor of the other.

For example, suppose you commit revision 100, which includes a change to a file foo.c. Then foo.c@99 is an “ancestor” of foo.c@100. On the other hand, suppose you commit the deletion of foo.c in revision 101, and then add a new file by the same name in revision 102. In this case, foo.c@99 and foo.c@102 may appear to be related (they have the same path), but in fact are completely different objects in the repository. They share no history or “ancestry”.

The reason for bringing this up is to point out an important difference between svn diff and svn merge. The former command ignores ancestry, while the latter command is quite sensitive to it. For example, if you asked svn diff to compare revisions 99 and 102 of foo.c, you would see line-based diffs; the diff command is blindly comparing two paths. But if you asked svn merge to compare the same two objects, it would notice that they're unrelated and first attempt to delete the old file, then add the new file; the output would indicate a deletion followed by an add:

D  foo.c
A  foo.c

Most merges involve comparing trees that are ancestrally related to one another, and therefore svn merge defaults to this behavior. Occasionally, however, you may want the merge command to compare two unrelated trees. For example, you may have imported two source-code trees representing different vendor releases of a software project (see the section called “Vendor branches”). If you asked svn merge to compare the two trees, you'd see the entire first tree being deleted, followed by an add of the entire second tree! In these situations, you'll want svn merge to do a path-based comparison only, ignoring any relations between files and directories. Add the --ignore-ancestry option to your merge command, and it will behave just like svn diff. (And conversely, the --notice-ancestry option will cause svn diff to behave like the merge command.)

Merges and Moves

A common desire is to refactor source code, especially in Java-based software projects. Files and directories are shuffled around and renamed, often causing great disruption to everyone working on the project. Sounds like a perfect case to use a branch, doesn't it? Just create a branch, shuffle things around, then merge the branch back to the trunk, right?

Alas, this scenario doesn't work so well right now, and is considered one of Subversion's current weak spots. The problem is that Subversion's update command isn't as robust as it should be, particularly when dealing with copy and move operations.

When you use svn copy to duplicate a file, the repository remembers where the new file came from, but it fails to transmit that information to the client which is running svn update or svn merge. Instead of telling the client, “Copy that file you already have to this new location”, it instead sends down an entirely new file. This can lead to problems, especially because the same thing happens with renamed files. A lesser-known fact about Subversion is that it lacks “true renames”—the svn move command is nothing more than an aggregation of svn copy and svn delete.

For example, suppose that while working on your private branch, you rename integer.c to whole.c. Effectively you've created a new file in your branch that is a copy of the original file, and deleted the original file. Meanwhile, back on trunk, Sally has committed some improvements to integer.c. Now you decide to merge your branch to the trunk:

$ cd calc/trunk

$ svn merge -r 341:405 http://svn.example.com/repos/calc/branches/my-calc-branch
D   integer.c
A   whole.c

This doesn't look so bad at first glance, but it's also probably not what you or Sally expected. The merge operation has deleted the latest version of integer.c file (the one containing Sally's latest changes), and blindly added your new whole.c file—which is a duplicate of the older version of integer.c. The net effect is that merging your “rename” to the branch has removed Sally's recent changes from the latest revision!

This isn't true data-loss; Sally's changes are still in the repository's history, but it may not be immediately obvious that this has happened. The moral of this story is that until Subversion improves, be very careful about merging copies and renames from one branch to another.



[22] However, at the time of writing, this feature is being worked on!