What makes some version control systems better at merging?
I’ve heard that many of the distributed VCSs (Git, Mercurial, etc.) are better at merging than traditional ones like Subversion. What does this mean? What sort of things do they do to make merging better? Could those things be done in a traditional VCS?
Bonus question: does SVN 1.5’s merge-tracking level the playing field at all?
Most answers seem to be about Subversion, so here is one about Git (and other DVCSs).
In a distributed version control system, when you merge one branch into another you create a new merge commit, which records how you resolved the merge and remembers all of the merge’s parents. This information was simply lacking in Subversion prior to version 1.5; you had to use additional tools such as SVK or svnmerge for this. It is very important when doing repeated merges.
Thanks to this information, distributed version control systems (DVCSs) can automatically find the common ancestor (or common ancestors), also known as the merge base, for any two branches. Take a look at the ASCII-art diagram of revisions below (I hope that it didn’t get too horribly mangled):
---O---*---*----M---*---*---1
    \          /
     \---*---A/--*----2
If we want to merge branch ‘2’ into branch ‘1’, the common ancestor we would want to use to generate the merge is the version (commit) marked ‘A’. However, if the version control system didn’t record information about merge parents (‘M’ is a previous merge of the same branches), it wouldn’t be able to find commit ‘A’, and would pick commit ‘O’ as the common ancestor (merge base) instead, which would repeat already-included changes and result in a large merge conflict.
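The effect of recording merge parents can be sketched in a few lines of Python. This is only an illustration, not how Git or Mercurial actually implement it: the commit graph is a plain dict mapping each commit to its parents, the helper names `ancestors` and `merge_bases` are made up for this example, and the unnamed `*` commits from the diagram are given the hypothetical labels `p1`..`p4` and `q1`, `q2`.

```python
from collections import deque

def ancestors(parents, commit):
    """Return the set holding `commit` and every commit reachable from it."""
    seen = set()
    queue = deque([commit])
    while queue:
        c = queue.popleft()
        if c in seen:
            continue
        seen.add(c)
        queue.extend(parents.get(c, ()))
    return seen

def merge_bases(parents, a, b):
    """Common ancestors of a and b that are not ancestors of another common ancestor."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return {c for c in common
            if not any(c != d and c in ancestors(parents, d) for d in common)}

# The diagram above, with 'M' recorded as a merge of branch 1 and commit 'A'.
with_merge_parents = {
    "O": [], "p1": ["O"], "p2": ["p1"], "M": ["p2", "A"],
    "p3": ["M"], "p4": ["p3"], "1": ["p4"],
    "q1": ["O"], "A": ["q1"], "q2": ["A"], "2": ["q2"],
}
# The same history if the tool forgets that 'M' had a second parent.
without_merge_parents = dict(with_merge_parents, M=["p2"])

print(merge_bases(with_merge_parents, "1", "2"))     # {'A'}
print(merge_bases(without_merge_parents, "1", "2"))  # {'O'}
```

With the second parent of ‘M’ recorded, the search stops at ‘A’; without it, the only shared history left is ‘O’.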
Distributed version control systems had to get this right, i.e. they had to make merging very easy (without needing to mark/tag merge parents or supply merge information by hand) from the very beginning, because the way to get somebody else’s code into a project was not to give them commit access, but to pull from their repository: get commits from the other repository and perform a merge.
You can find information about merging in Subversion 1.5 in the Subversion 1.5 Release Notes. Issues of note: you need different (!) options to merge a branch into trunk than to merge trunk into a branch, i.e. not all branches are equal (in distributed version control systems they are [usually] technically equivalent).
SVN’s merging capabilities are decent, and the simple merging scenarios work fine – e.g. release branch and trunk, where trunk tracks the commits on the RB.
More complex scenarios get complicated fast. For example, let’s start with a stable branch (stable) and trunk. You want to demo a new feature, and prefer to base it on stable as it’s, well, more stable than trunk, but you want all your commits to be propagated to trunk as well, while the rest of the developers are still fixing things in stable and developing things on trunk. So you create a demo branch, and the merging graph looks like:
stable -> demo -> trunk(you)
stable -> trunk(other developers)
But what happens when you merge changes from stable into demo, then merge demo into trunk, while all the time other developers are also merging stable into trunk? SVN gets confused with the merges from stable being merged twice into trunk.
There are ways around this, but with git/Bazaar/Mercurial this simply doesn’t happen – they realize whether the commits have already been merged because they ID each commit across the merging paths it takes.
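As a rough sketch of why commit identity avoids the double merge: if every commit has an ID and records its parents, the set of commits a merge still has to bring in is just "reachable from the source but not from the target". The graph, the commit names (`root`, `s1`, `d1`, `d2`, `t1`, `t2`) and the helper functions below are hypothetical, chosen to mirror the stable/demo/trunk scenario; real tools are considerably more elaborate.

```python
def reachable(parents, tip):
    """Return every commit reachable from `tip`, including `tip` itself."""
    seen, stack = set(), [tip]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, ()))
    return seen

def commits_to_merge(parents, source, target):
    """Commits on `source` that `target` does not already contain."""
    return reachable(parents, source) - reachable(parents, target)

history = {
    "root": [],
    "s1": ["root"],       # a fix committed on stable
    "d1": ["root"],       # your feature commit on demo
    "d2": ["d1", "s1"],   # stable merged into demo
    "t1": ["root"],       # ordinary development on trunk
    "t2": ["t1", "s1"],   # other developers merged stable into trunk
}

# Merging demo (tip d2) into trunk (tip t2): s1 is already there, so it is skipped.
print(sorted(commits_to_merge(history, "d2", "t2")))  # ['d1', 'd2']
```

The stable fix `s1` reaches trunk by two paths, but because it is the same commit on both, it is only ever counted once.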
The merge tracking in 1.5 is better than no merge tracking, but it is still very much a manual process. I do like the way that it records which revisions are and aren’t merged, but it’s nowhere near perfect.
Merge has a nice dialog in 1.5. You can pick which revisions you wish to merge individually, or the whole branch. You then trigger the merge, which occurs locally (and takes FOREVER) and then gives you a bunch of files to read through. You need to check each file logically for the correct behaviour (preferably running unit tests on the files), and if you have conflicts you have to resolve them. Once you’re happy, you commit your change, and at that point the branch is considered merged.
If you do it piecemeal, SVN will remember what you have previously said you merged, allowing you to merge the remaining revisions later. I found the process and the result of some of the merges to be strange, to say the least, however…
These version control systems can do better because they have more information.
SVN pre-1.5, along with most VCSs before the latest generation, doesn’t actually record anywhere that you merged two commits. It remembers that the two branches share a common ancestor way back when they first branched off, but it doesn’t know about any more recent merges that could be used as common ground.
I know nothing of SVN post 1.5 though, so maybe they’ve improved on this.
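To illustrate why more recent common ground matters, here is a toy three-way merge. It is a sketch under heavy simplifying assumptions: a "file" is modelled as a dict of settings rather than lines of text, and `three_way_merge` and the sample values are invented for this example, not taken from any real VCS.

```python
def three_way_merge(base, ours, theirs):
    """Merge two versions of a {setting: value} file against a common base.

    Returns (merged, conflicts); a key conflicts when both sides changed it
    differently relative to the base.
    """
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:        # both sides agree
            value = o
        elif o == b:      # only their side changed it
            value = t
        elif t == b:      # only our side changed it
            value = o
        else:             # both changed it, differently
            conflicts.append(key)
            continue
        if value is not None:
            merged[key] = value
    return merged, conflicts

branch_point = {"timeout": 10, "retries": 1}   # where the branch was created
last_merge   = {"timeout": 20, "retries": 1}   # branch state at the previous merge
ours         = {"timeout": 30, "retries": 1}   # trunk: tweaked timeout after that merge
theirs       = {"timeout": 20, "retries": 2}   # branch: changed retries since then

print(three_way_merge(last_merge, ours, theirs))    # ({'retries': 2, 'timeout': 30}, [])
print(three_way_merge(branch_point, ours, theirs))  # ({'retries': 2}, ['timeout'])
```

Against the last merge, each side changed a different setting and the merge is clean. Against the original branch point, the already-merged timeout change looks like a fresh edit on the branch and collides with the later trunk edit.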
Flippant answer: Why are some programming languages better at text/math than others?
Real answer: because they have to be. Distributed VCSs do much of their merging at a point where neither of the authors of the conflicting code can tweak the merge manually, because the merge is being done by a third party. As a result, the merge tool has to get it right most of the time.
In contrast, with SVN you are doing something funky (and wrong?) if you ever end up merging something where you didn’t write one side or the other.
IIRC most VCSs can shell out the merge to whatever you ask them to use, so there is (theoretically) nothing preventing SVN from using the Git/Mercurial merge engines. YMMV.