For home projects, can Mercurial or Git (or other DVCS) provide more advantages over Subversion?
Which free source control system is preferable for home projects and documents, and why?
I am thinking of using Subversion (as I am familiar with it).
Characteristics of a home project:
- Most likely a single person will be committing changes.
- Maybe one day (not now) I will share a project with a friend who lives in another city.
- I would like to store other documents (non-programming files) as well.
Can Mercurial or Git (distributed version control systems) give me any advantage over Subversion for home projects?
Take a look at the part about version control for a single developer in my answer to the “Difference between GIT and CVS” question here on StackOverflow. Some of those issues still apply to Subversion versus Git (or another distributed VCS: Mercurial, Bazaar, or the less known Monotone and Darcs), even if Subversion is an improvement over CVS.
DISCLAIMER: I use Git (so I am biased), and know Subversion only from documentation (and other resources), never having used it myself. I might therefore be mistaken about Subversion capabilities.
Below is a list of differences between Git and Subversion for a single developer on a single machine (single account):
Setting up repository. Git stores the repository in the .git directory in the top directory of your project. Starting a new project from an unversioned tree of files is as easy as doing “git init” in the top directory of your project (and then of course “git add .” to add files, and e.g. “git commit -m 'Initial commit'” to create the first commit).
In Subversion (as in any centralized version control system) you need to set up a central repository (unless you did that earlier) using “svnadmin create” (well, you need to do that only once). Then you have to import files into Subversion using “svn import” (or “svn add”)… But note that after the import is finished, the original tree is not converted into a working copy. To start working, you still need to “svn checkout” a fresh working copy of the tree.
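The Git side of that setup can be sketched as a few shell commands. This is only a minimal illustration: the temporary directory, the file name notes.txt and the example identity are made up, and the identity is passed inline only so that the sketch runs without any global configuration.

```shell
# Put an existing, unversioned directory under Git.
project=$(mktemp -d)            # stands in for your project directory (hypothetical)
cd "$project"
echo "hello" > notes.txt        # a pre-existing, unversioned file (hypothetical)

git init -q                     # creates the .git directory right here
git add .                       # stage everything in the tree
git -c user.name="Example" -c user.email="example@example.com" \
    commit -q -m "Initial commit"

git log --oneline               # lists the single commit just created
```

The directory you ran “git init” in is already the working copy, so there is no separate import or checkout step.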
Repository and repository metadata. Git stores both the repository (i.e. information about revisions and branches, etc.) and repository metadata (e.g. your identity, list of ignored files, which branch is currently checked out) in the .git directory in the top directory of your project.
Subversion stores the repository in a separate area you have to set aside for that purpose, and stores repository metadata (e.g. where the central repository is, the identity used to contact the central repository, and I think also properties like svn:ignore) in a .svn directory in each directory of your project (since Subversion 1.7, in a single .svn directory at the top of the working copy). (Note that Subversion stores a pristine copy of your checkout, to have fast “svn status” and “svn diff”.)
Naming revisions / version numbers. Subversion uses global revision identifiers in the form of a single number specifying the revision (so you can for example refer to r344, revision 344). Subversion also supports a few symbolic revision specifiers: HEAD, BASE, COMMITTED, PREV.
In Git each version of a project (each commit) has its unique name given by a 40 hex digit SHA-1 id; usually the first 7-8 characters are enough to identify a commit (you can’t use a simple numbering scheme for versions in a distributed version control system, since that requires a central numbering authority). But Git also offers other kinds of revision specifiers, for example HEAD^ means the parent of the current commit, master~5 means the revision 5 ancestors back (in straight first-parent line) from the top commit on the ‘master’ branch, and v1.6.3-rc2 might mean the revision tagged with that name.
See also the “Many different kinds of revision specifiers” blog post by Elijah Newren.
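To get a feel for these specifiers, here is a small sketch that builds a throwaway repository with three commits and resolves a few names with “git rev-parse”. The repository location, the file name, the tag name v0.1 and the identity are all invented for the example.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example"                 # local, repository-only identity
git config user.email "example@example.com"

for i in 1 2 3; do
    echo "$i" > file.txt
    git add file.txt
    git commit -q -m "commit $i"
done
git tag v0.1 HEAD~1              # tag the second commit (hypothetical tag name)

git rev-parse --short HEAD       # abbreviated SHA-1 of the current commit
git rev-parse HEAD^              # parent of the current commit
git rev-parse v0.1               # the same commit, named by its tag
git log -1 --format=%s HEAD~2    # subject of the commit two ancestors back
```

Here HEAD^ and v0.1 resolve to the same SHA-1, because both name the second commit.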
Easy branching and merging. In Git creating and merging branches is very easy; Git remembers all required info by itself (so merging a branch is as easy as “git merge branchname”)… it had to, because distributed development naturally leads to multiple branches. Git uses heuristic similarity-based rename detection, so while merging it can deal with the case where one side renamed a file (and other similar cases related to renaming). This means that you are able to use a topic branch workflow, i.e. develop a separate feature in multiple steps in a separate feature branch.
Branches have an unusual implementation in Subversion; they are handled by a namespacing convention: a branch is the combination of revisions within the global repository that exist within a certain namespace. Creating a new branch is done by copying an existing set of files from one namespace to another, recorded as a revision itself. Subversion made it easy to create a new branch… but up till version 1.5 you had to use extra tools such as SVK or the svnmerge extension to be able to merge easily. Subversion 1.5 introduced the svn:mergeinfo property, but even then merging is slightly more complicated than in Git; also you need to use extra options to show and make use of merge tracking information in tools such as “svn log” and “svn blame”. I have heard that it doesn’t work correctly in more complicated situations (criss-cross merge), and currently cannot deal with renames (there is even a chance of silent corruption in such a case). See also (for example) this post on the git mailing list by Dmitry Potapov, explaining the intended use case for svn:mergeinfo and its (current) limitations.
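A topic branch workflow on the Git side might look like the sketch below. The branch name “feature”, the file names and the identity are placeholders; the name of the initial branch is read from Git rather than assumed, since it can be “master” or something else depending on configuration.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "base" > main.txt
git add main.txt && git commit -q -m "base"
trunk=$(git rev-parse --abbrev-ref HEAD)     # name of the initial branch

git checkout -q -b feature                   # create and switch to a topic branch
echo "feature work" > feature.txt
git add feature.txt && git commit -q -m "add feature"

git checkout -q "$trunk"                     # back to the main line
echo "fix" > fix.txt
git add fix.txt && git commit -q -m "fix on trunk"

git merge -q -m "Merge feature" feature      # Git works out the merge base itself
git log --oneline --graph                    # history with both lines of development
```

Because both branches gained a commit, the merge creates a merge commit with two parents; no extra bookkeeping was needed to tell Git where the branch started.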
Tagging. In Git, tags are immutable, can have a comment associated with them, and can be signed using a PGP/GPG signature (and verified). They are made using “git tag”. You can refer to a revision using its tag name.
In Subversion, tags use the same path_info-like namespace convention as branches (the recommended convention is svnroot/project/tags/tagname), and are not protected against changing. They are made using “svn copy”. They can have a comment associated with [the commit creating a tag].
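As a small sketch of Git tagging, the commands below create a tag with a comment (an annotated tag) and look at it. The tag name v1.0, its message and the identity are made up. Signing is left out because “git tag -s” needs a GPG key to be configured.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
echo "release" > app.txt
git add app.txt && git commit -q -m "release candidate"

git tag -a v1.0 -m "First release"     # annotated tag carrying its own comment
git tag -n1                            # list tags with the first line of their message
git cat-file -t v1.0                   # object type of the tag itself
git rev-parse "v1.0^{commit}"          # the commit the tag refers to
```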
Keyword expansion. Git offers a very, very limited set of keywords as compared to Subversion (by default). This is because of two facts: changes in Git are per repository and not per file, and Git avoids modifying files that did not change when switching to another branch or rewinding to another point in history. If you want to embed a revision number using Git, you should do this using your build system, e.g. following the example of the GIT-VERSION-GEN script in Linux kernel sources and in Git sources. There is also the 'ident' gitattribute, which allows expansion of the “$Id$” keyword to the SHA-1 identifier of the file contents (not the identifier of a commit).
Both Git and Subversion do keyword expansion only on request.
Binary files. Both Git and Subversion deal correctly with binary files. Git does binary file detection using an algorithm similar to the one used by e.g. GNU diff, unless overridden on a per-path basis using gitattributes. Subversion does it in a slightly different way, by detecting the type of a file when adding it and setting the svn:mime-type property, which you can then modify. Both Git and Subversion can do end of line character conversion on demand; Git additionally has the core.safecrlf config option, which warns about and prevents irreversible changes (all LF to all CRLF is reversible, mixed LF and CRLF is not reversible).
Ignoring files. Git stores ignore patterns in the in-tree .gitignore file, which can be put under version control and distributed and usually contains patterns for build products and other generated files, and in the .git/info/exclude file, which usually contains ignore patterns specific to the user or system, e.g. ignore patterns for backup files of your editor. Git patterns apply recursively, unless the pattern contains a directory delimiter, i.e. the forward slash character ‘/’; then it is anchored to the directory the .gitignore file is in, or to the top directory for .git/info/exclude. (There is also the core.excludesfile configuration variable; this variable can exist in the per-user ~/.gitconfig configuration file, and point to a per-user ignore file.)
Subversion takes ignore patterns from the global-ignores runtime configuration option (which generally applies to a particular computer or to a particular user of a computer), and from the “svn:ignore” property on SVN-versioned directories. However, unlike the global-ignores option (and unlike .gitignore), the patterns found in the “svn:ignore” property apply only to the directory on which that property is set, and not to any of its subdirectories. Also, Subversion does not recognize the use of the ! prefix to a pattern as an exception mechanism.
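The Git pattern rules (recursive by default, anchored when the pattern has a leading slash, “!” for exceptions) can be tried out with a sketch like this one. The directory layout and file names are invented for the illustration.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
mkdir -p build src/build
touch build/out.o src/build/out.o src/main.c debug.log keep.log

# Versioned patterns, shared with everyone who clones the repository.
cat > .gitignore <<'EOF'
*.log
!keep.log
/build/
EOF

# List untracked files that are NOT ignored.
git ls-files --others --exclude-standard
```

In the listing, debug.log is hidden by the recursive “*.log” pattern while keep.log is rescued by the “!” exception, and only the top-level build directory is ignored, because “/build/” is anchored to the directory containing the .gitignore file.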
Amending commits. Because in a distributed VCS such as Git the act of publishing is separate from creating a commit, one can change (edit, rewrite) the unpublished part of history without inconveniencing other users. In particular, if you notice a typo (or other error) in a commit message, or a bug in a commit, you can simply use “git commit --amend”. (Note: technically it is re-creating a commit, not changing the existing commit; the changed commit has a different identifier.)
Subversion only allows modifying the commit message after the fact, by changing the appropriate property.
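A minimal sketch of amending in Git, with a made-up file, message and identity, showing that the amended commit really is a new commit with a different identifier:

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "draft" > report.txt
git add report.txt
git commit -q -m "Add reprot"              # oops, typo in the message
before=$(git rev-parse HEAD)

git commit -q --amend -m "Add report"      # re-create the commit with a fixed message
after=$(git rev-parse HEAD)

echo "$before -> $after"                   # old and new commit identifiers
git log --oneline                          # still a single commit in the history
```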
Tools. On one hand, Git offers a richer set of commands. One of the more important is “git bisect”, which can be used to find the commit (revision) that introduced a bug; if your commits are small and self-contained, it should then be fairly easy to discover where the bug is.
On the other hand, because Subversion has existed longer, it has perhaps a wider set of third party tools, and wider Subversion support in tools, than Git. Or at least more mature ones. Especially on MS Windows.
And there is another issue, which might be quite important later:
Publishing repository. If (when?) at some time you want to share your repository, turning it from a one-person project developed on a single home computer into something others contribute to, with Git it is as simple as creating an empty repository on a server or on one of the existing git hosting sites / software hosting sites with git support (like http://repo.or.cz, GitHub, Gitorious, InDefero; more, also for other DVCS, are listed in that answer), and then pushing your project to this public repository.
I guess it is more complicated with Subversion if you don’t start at a software hosting site with Subversion support (like SourceForge) from the beginning, unless you don’t want to preserve the existing revision history. On the other hand, Google Code for example suggests using the svnsync tool (part of the standard Subversion distribution), as explained in the Google Products > Project Hosting (the Data Liberation Front) article.
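The Git publishing step can be sketched locally, with a bare repository in a temporary directory standing in for the server or hosting site. All paths, the remote name “origin”, the file and the identity are assumptions made for the example; with a real server the path would be replaced by its URL.

```shell
work=$(mktemp -d)       # your existing home project (hypothetical)
server=$(mktemp -d)     # stands in for a server or hosting site
friend=$(mktemp -d)     # where a collaborator clones to

git init -q --bare "$server/project.git"    # empty public repository

cd "$work"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
echo "hello" > README
git add README && git commit -q -m "Initial commit"

git remote add origin "$server/project.git"
git push -q origin HEAD                     # publish the current branch with its history

git clone -q "$server/project.git" "$friend/project"   # what your friend would do
```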
Take a look also at http://whygitisbetterthanx.com/ site.
One obvious benefit: you can develop when you’re away from the server. For instance, you may have a laptop with its own local git repository, and push to your server (or github). Now suppose you went somewhere without internet connectivity… in Subversion you’d have to make do without any commits until you were connected again. With a DVCS you can commit locally (and revert, branch etc) then push those commits back up when you get home.
A really strong advantage of both git and mercurial in a ‘home project’ setting is that a new repository is trivial to set up. In git you just do “git init” at the root of your code tree and you have a new repository.
You can then add, commit, branch, etc. straight away. svn has a larger cost to set up as you need a separate repository location and url before you can create a working copy and start your usual VCS operations.
Storing documents is no problem in git or mercurial, but certainly with git (not sure about hg) I would advise against storing large media files (anything from 100M upwards), as it tends not to perform very well in some operations.
I use git on personal projects to sort of “collaborate with myself.” I have repositories on a linux box on my home network that’s accessible via a tunnel from anywhere. I then clone them to my home desktop, my laptop, maybe a machine at work, and I can see them or work on them anywhere I go. I can commit changes, get the latest, and have backups in various places. The ease and speed with which git allows you to switch branches is very nice. Found a bug? Switch to ‘master’, fix it, commit, push, then switch back to what you’re doing. Easier and faster than cvs or subversion.
Also, I use git a lot for small directories that aren’t even projects. The config directory for the apache server hosting my web site is git’d, and likewise the tomcat config directory for the same web site.
I use it at work for everything, even though at work we’re on CVS moving to Subversion. I don’t use git-cvs or git-svn, I just use git alongside either product, and keep my branches local. Very handy to be able to switch to another developer’s latest commit, check something, then switch back.
Then, of course, there’s bisect, which can be a huge help, for work or home projects.
Also, if at work they’re still using punch cards, cvs, or subversion, then using git at home is a great way to stay current, and find out for yourself the impact it can have.
I don’t get excited about technologies unless they bring something genuinely new to the table. Git does. I’m a fan. You probably figured that out already.
Aside from details about the many wonderful features of hg, git, darcs, bzr, and friends (no sarcasm; I’m a huge fan), the essentials are here:
With svn you have to choose between storing your repo offsite and storing it onsite. Onsite means if your disk fails your project is toast. Offsite means you can’t commit from an airplane or other disconnected situations, and when network connectivity is bad, commits can be slow.
With any distributed VCS, it is trivial to create one or more “clones” of your “repo”. You can commit changes locally at any time, fast, then push those changes to a remote repo when connectivity is available.
git, hg, and others are loaded with features (and misfeatures) that make them different from svn and cvs. But those are the essentials.
Sharing projects is much easier with dvcs’s because you don’t need to give others access to your central repository computer. You can have a collaborator create a copy and not allow him to do any commits anywhere if you wish. If you want his changes you could pull them from his computer to wherever you like rather than allowing him to push them. This way you can, if you so wish, check the changes first. You are in total control (if you want).
The main benefit is still that you carry the entire repository in your laptop! It might take a while to appreciate what this really means when you are used to massive central repositories and all the hassle that goes with them. Of course in many situations it is beneficial to have one, but again, you can control who has access to it and what kind. With centralized vcs’s, not allowing someone to commit directly to the central repository means a lot of extra work for some unfortunate sod. The commits would have to be done more or less by hand, while with dvcs’s the sod responsible for checking the committed code can commit it in just the same way from his computer as he would his own code.
There are ways of easing the aforementioned in vcs’s but they still require extra maintenance (create components/views/whatever is access controlled and allow commits to this only). In git/mercurial there really isn’t any of this overhead.
I would say that the more the work is done outside your company network, the more beneficial dvcs’s are. If you all have fast access to the central repository and all can be trusted to commit their changes there, then the major strengths of dvcs’s are not so important (although there still are some, but at the moment they are IMO balanced by the poor UIs available).
I don’t know about mercurial, but my favourite thing to do in git which is impossible in subversion is history editing. For example, you can:
- edit previous commits to make changes to the list of files that were changed
- change the order of previous commits
- remove commits from the history altogether
- merge two or more commits together
- split commits apart
- append subsequent changes to previous commits
In short, if you think you should be able to do it, you probably can. This is very powerful once you realise you have this ability, and is probably more important in many ways than just the distributed advantage.
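As one small, non-interactive illustration of history editing, the sketch below merges the last two commits into one using “git reset --soft” followed by a new commit. The interactive tool usually used for reordering, splitting and dropping commits is “git rebase -i”, which is harder to show in a script. The file, messages and identity are made up, and this should only be done on history you have not published yet.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

for i in 1 2 3; do
    echo "$i" >> log.txt
    git add log.txt
    git commit -q -m "step $i"
done

git reset -q --soft HEAD~2            # forget the last two commits, keep their changes staged
git commit -q -m "steps 2 and 3"      # one commit in place of the two

git log --oneline                     # history is now two commits long
```

The content of log.txt is unchanged by this; only the way the history records it is different.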
I want to emphasize that Mercurial and Git get you started with source control without the need for a server, and make it easy to move your repository to another server when necessary.
Other advantages have been covered in the other answers.
I used to have a subversion server for home syncing. For some years I’ve been running git, but I am now moving to hg. The reason is mainly simplicity. I should be able to ‘master’ git, but still, I don’t.
Here’s a great tutorial from Joel (one of the StackOverflow masterminds) on hg.