Svn update vs git pull – implementation details
How is svn up actually performed? Does it get diffs from the server and apply them, or does the server just send the latest version of a file (is it compressed?), which is then merged with my local file?
I ask because, afaik, in git each file revision is a blob, so I expect that when I run git pull it gets blobs from the server and not diffs.
And the real question is: which is theoretically faster, svn up or git pull? (Of course it depends on the repo size and the changes in the repo, but let’s consider only the network traffic.)
Thanks in advance
2 solutions collected from the web for “Svn update vs git pull – implementation details”
svn up determines the base revision of the file in your working copy and communicates that to the server. The server then uses that knowledge to compute a deltified version of the file to send back. The delta is applied to the stored base copy to produce the new working base, and the same change is also applied to your working file, where it is merged with any local modifications.
There’s a little more going on here, and a few optimizations in certain circumstances, but at its most basic level this is what’s happening. It’s a little complicated because Subversion supports mixed-revision working copies, that is, working copies that have different parts of the tree at different revisions.
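The basic exchange (client reports its base revision, server answers with a delta against it) can be sketched roughly as follows. This is only an illustration: the `revisions` store and the `server_update` function are hypothetical, and the copy/insert instructions built with Python's difflib stand in for Subversion's real delta format, which is different.

```python
from difflib import SequenceMatcher

def make_delta(base: str, target: str):
    """Build copy/insert instructions that turn `base` into `target`."""
    ops = []
    matcher = SequenceMatcher(a=base, b=target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # The client already has this range, so only offsets are sent.
            ops.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # Only genuinely new text travels over the network.
            ops.append(("insert", target[j1:j2]))
        # "delete" needs no instruction: the range is simply never copied.
    return ops

def apply_delta(base: str, ops) -> str:
    """Rebuild the new text from the old base plus the instructions."""
    out = []
    for op in ops:
        if op[0] == "copy":
            out.append(base[op[1]:op[2]])
        else:
            out.append(op[1])
    return "".join(out)

# Hypothetical server-side store: revision number -> file contents.
revisions = {1: "hello world\n", 2: "hello brave new world\n"}

def server_update(base_rev: int, head_rev: int):
    """What the server might do after being told the client's base revision."""
    return make_delta(revisions[base_rev], revisions[head_rev])

# Client side: report base revision 1, receive a delta, rebuild revision 2.
delta = server_update(1, 2)
new_base = apply_delta(revisions[1], delta)
inserted_chars = sum(len(op[1]) for op in delta if op[0] == "insert")
```

The point is that `inserted_chars` is smaller than the full length of the new file, because the unchanged ranges are referenced rather than resent.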
git pull is a little different. Since git doesn’t support mixed-revision working copies, it really only cares about which commit you’re currently at. So git communicates the current state of its references (branches), and the server then computes which commits need to be sent, packs them, and sends them to the client. Once the new commits are on the client, the remote ref is updated and an attempt is made to merge the new contents into the working copy.
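The "compute which commits need to be sent" step can be sketched as a graph walk: start at the tip the client wants and stop wherever you reach history the client already has. This is a simplified sketch, not git's actual negotiation, which is an iterative exchange of want/have lines; the `commits_to_send` function and the tiny commit graph are made up for illustration.

```python
def ancestors(parents, tips):
    """All commits reachable from `tips`, including the tips themselves."""
    seen = set()
    stack = [c for c in tips if c in parents]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents[commit])
    return seen

def commits_to_send(parents, want, have):
    """Commits reachable from `want` that the client cannot already reach."""
    known = ancestors(parents, have)
    missing = []
    seen = set()
    stack = [want]
    while stack:
        commit = stack.pop()
        if commit in known or commit in seen:
            continue  # the client has it, or we already queued it
        seen.add(commit)
        missing.append(commit)
        stack.extend(parents[commit])
    return missing

# Hypothetical server history: A <- B <- C <- D (commit -> list of parents).
server_parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"]}

# The client's branch points at B; the server's branch points at D.
to_pack = commits_to_send(server_parents, want="D", have=["B"])
```

Here only C and D end up in `to_pack`; A and B are never sent, because the client announced that it already has B.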
From a high-level view, I think the two are awfully similar: they compute “diffs”, compress them, and send them to the client. But the details differ dramatically. Neither one necessarily sends whole file contents.
I believe that git pull is a little more network efficient, for a few reasons:
Git only ever needs to communicate the SHA1 of the branches, while Subversion needs to crawl the whole tree being updated and report the parts of the tree that are at a different revision from the base of the working tree (which is common in Subversion). This latter bit also means that there’s more disk I/O as well, though that’s been improved greatly in more recent versions of Subversion.
Git generally does a better job of aligning like content which means it generally achieves better compression of the data.
Git tracks and communicates content better. So if you have three copies of a file in Git, it will only communicate one blob to represent that file while Subversion would communicate three.
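The "three copies, one blob" point follows from how git names blobs: the object id is the SHA-1 of a short header plus the file contents, so identical contents always get the same id regardless of path. A minimal sketch, where the file names and the `object_store` dict are made up for illustration:

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Object id git assigns to a blob: SHA-1 over 'blob <size>\\0' + content."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()

# Three paths in the tree that happen to hold identical contents.
files = {
    "a/LICENSE": b"same text\n",
    "b/LICENSE": b"same text\n",
    "c/LICENSE": b"same text\n",
}

object_store = {}  # object id -> content (what would need to be transferred)
tree = {}          # path -> object id (what the tree records)

for path, content in files.items():
    oid = git_blob_id(content)
    object_store[oid] = content  # identical content lands on the same key
    tree[path] = oid
```

After the loop, `tree` has three entries but `object_store` has only one, so only one blob would have to be packed and sent.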
OTOH, you can choose to update only part of the tree in Subversion. If you have many developers working on the code base with lots of activity on the repository, and your part is fairly isolated, then you could svn up only in the tree you care about and not have to worry about grabbing the other bits until you’re ready. Git only works on the whole repository and doesn’t support mixed-revision working copies, so it’ll want to grab all the commits, which could be quite a lot of data.
So, in the end, it probably depends on your use case and what’s important to you. But in general, I believe Git has the upper hand.
Git stores blobs, but communicates diffs, and it is very efficient at doing this. When fetching changes from a server, git first determines the differences between server and client and then transfers a pack. The pack is a kind of compressed diff between the fetching git and the server git. It also handles large binary files very efficiently.