Does Subversion 1.5 performance stink?

We’ve been working on a project where everyone is working off of trunk. The project has reached the point where our development team is growing and we’ve finally started doing client releases (all good things). So, to help co-ordinate all this we started following the SVN best practices of branching/merging etc.

The problem we’re facing is that merges are taking upwards of 20 minutes and very often fail with “connection reset by peer” or “PROPFIND” errors. Branching and merging are just so much of a pain that they are very near unusable. We only have about 1000 files or so and very often we’re merging less than 20 files and it still takes 20 minutes. We’re using Apache to access SVN.

My question is, is this typical or do we have something configured wrong? How big are your SVN repositories and how long do merges take?

    Edit: The server is accessed via the Internet, we have some rather large binary files, we use Mac, Linux and Windows clients. No Internet or network problems we know of.

8 Solutions collected from the web for “Does Subversion 1.5 performance stink?”

    This is due to Apache, see Stack Overflow question “Svnserve VS mod_dav_svn”.

    To recapitulate:

    It seems to be less known that the choice of the server variant used – the Apache Subversion mod_dav_svn module or the standalone svnserve server – has a great impact on measured and perceived Subversion performance. Usually svnserve is significantly faster than Apache mod_dav_svn.
    ……………
    The most significant performance penalty was measured during svn log and svn merge operations against the mod_dav_svn server – you’ll notice worse svn log performance immediately if, e.g., you are using the Eclipse Subversion plugin Subclipse.

    Full Disclosure: I work with Larf and I will tell him not to mark my answer as selected so that it doesn’t look like we set this up somehow to game the system. I’d of course love your upvotes 🙂

    We recently tried something at work that might have sped things up and I wanted to capture it on Stack Overflow. I’m not sure if it will work for you, but it looks like it’s working for us.

    The Background

    1. Our repository was originally a 1.4 server.
    2. It was dumped and reloaded into a 1.5 server
    3. During the dump and load, a master repository of the form /svn/Projects/Project[A|B|C] was moved to many smaller repositories /svn/projectA , /svn/projectB

    More Symptoms

    1. ‘svn merge’ likes to take random files and change properties. We had a folder of test scripts (100 or so of them) and for some reason 3-5 of them randomly had properties changed (prop was svn:mergeinfo).
    2. The apache logs showed propgets and history inquiries for /svn/Projects/ProjectA when doing a merge, despite the fact that the dir structure and name change happened long ago.
    3. Looking at the svn:mergeinfo on some projects showed some bizarro things: some files that had been around forever showed ‘tags’ ancestors for some tags but not all, some of them had ancestors for the original svn paths AND the new paths, sometimes 5+ paths and repository layouts.
    4. I noticed that another employee who used TortoiseSVN (I use OSX commandline) was checking the “ignore ancestors” box and his merges had “correct” apache logs as compared to mine. His merges also appeared to start quicker.

    While all of these may be completely normal, they certainly didn’t sound like what I had expected.
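
    To check how many distinct merge sources a file's svn:mergeinfo records (symptom 3 above), the property value can be split line by line. This is a minimal Python sketch, assuming the value has the usual one-source-per-line form /path:revisions. The function name and the sample value are made up for illustration; the real value would come from something like svn propget svn:mergeinfo FILE.

```python
def merge_sources(mergeinfo_value):
    """Return the source paths listed in an svn:mergeinfo property value.

    Assumes each non-empty line looks like '/source/path:rev-ranges'.
    """
    sources = []
    for line in mergeinfo_value.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the LAST colon so a colon inside the path is preserved.
        path, sep, _revisions = line.rpartition(":")
        if sep:  # skip lines that do not match the expected form
            sources.append(path)
    return sources


# Hypothetical value mixing an old repository layout with a new one.
sample = "/Projects/ProjectA/trunk:10-20\n/projectA/trunk:30-45,50\n"
print(merge_sources(sample))
```

    A file that lists both old and new repository layouts, as in the sample, is the kind of entry that made us suspicious.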

    What We Did

    1. Tried to move as many large, relatively static, binary files as possible out of the main code folders. This way, dev branches do not need to clone them.
    2. Removed the svn:mergeinfo property from EVERY file. We wrote a shell script to do this and let it run.
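
    The shell script itself is not reproduced here. As a rough stand-in, this Python sketch builds the equivalent recursive svn propdel call for a working copy. The function names and the dry_run switch are assumptions for illustration, not the script we ran.

```python
import subprocess


def build_propdel_command(working_copy, prop="svn:mergeinfo"):
    """Build the argv that deletes `prop` from everything under working_copy."""
    return ["svn", "propdel", "--recursive", prop, working_copy]


def strip_mergeinfo(working_copy, dry_run=True):
    """Return the command; only execute it when dry_run is False."""
    cmd = build_propdel_command(working_copy)
    if not dry_run:
        # Needs the svn client on PATH; changes the working copy only.
        subprocess.run(cmd, check=True)
    return cmd


# Dry run: show what would be executed for the current directory.
print(" ".join(strip_mergeinfo(".")))
```

    The deletion only touches the working copy, so it still has to be committed afterwards. Note that a recursive delete also removes the property from the branch root, which discards the merge history recorded there.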

    The Aftermath

    Larf had created a dev branch and then after a few days tried to merge the trunk into his branch. Previously, this type of merge appeared to stall for 13+ minutes before starting the merge. Now, it started almost immediately and ran to completion in < 4 minutes.

    We may have shot ourselves in the foot in terms of merging code to other older branches (because we removed svn:mergeinfo), but that happens so infrequently we were willing to take the risk of improving the dev branch merge time (and all branches going forward). Also, we are currently doing monthly releases/branches so the next one will have correct svn:mergeinfo properties set on them.

    That’s a network problem more than a subversion problem per se. I had it at one point, can’t remember what I did to solve it, but quick googling suggests that it’s not uncommon. Some things to check:

    1. Do you have a proxy involved?
    2. Are you using a GUI client like Tortoise? (If so, try a command line for the same operation)

    I’m assuming you can do svn ls and a plain svn co without problems.

    How big are your files? Binary or text? Big binary files create a lot of load on the server.

    Also “connection reset by peer” indicates that something is wrong on the server side. Have you checked the load of the server, checked the logs for errors?

    Do you have problems with other network applications (browsers, accessing network shares)? This would indicate that there is something wrong with the network itself.

    I’m accessing a big repository (> 100’000 files) on a server on the Internet and updates take a few minutes at most! That’s on Linux, though. Collecting the files locally isn’t a strength of Windows….

    [EDIT] Your problem is the large binary files. The server probably runs out of processing time or memory while trying to generate the xdelta. I suggest you read this article: Performance tuning Subversion

    We have well over a thousand files and consistently merge large numbers of changed files with conflicts. I’ve never actually timed a merge, but I would be surprised if it ever took more than a few minutes. Maybe time to put some hot water on, but not enough to make a proper cup of tea!

    Where is your repository located with respect to your client machine? Have you been able to rule out any other connection issues?

    Also, I probably should add that we use svnserve over SSH. We’ve always thought that we could speed things up a bit by going with Apache to access SVN rather than SSH. Maybe we’re wrong!

    It might be worth taking a copy of the repository and doing a quick test with svnserve instead to see if it’s any faster. It might not be the right solution for you overall, but it might give an indication of where the bottleneck is.
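
    A quick way to run that comparison is to time the same read-only command against both servers. This is a minimal Python sketch; the helper names are made up, and the two URLs in the usage comment are placeholders for wherever the Apache and svnserve copies of the repository happen to live.

```python
import subprocess
import sys
import time

# Baseline command that does nothing, to gauge process start-up overhead.
NOOP = [sys.executable, "-c", "pass"]


def time_command(argv):
    """Run argv once and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def compare(commands):
    """Time each labelled command and return {label: seconds}."""
    return {label: time_command(argv) for label, argv in commands.items()}


# Example usage (hypothetical URLs, requires the svn client):
# print(compare({
#     "apache": ["svn", "log", "-q", "http://svn.example.com/svn/projectA/trunk"],
#     "svnserve": ["svn", "log", "-q", "svn://svn.example.com/projectA/trunk"],
# }))
print(compare({"baseline": NOOP}))
```

    A single run is noisy, so it is worth repeating each command a few times before drawing conclusions.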

    With 1.5, merging requires getting the svn:mergeinfo property for everything, which means another web call for each file, and that is something Apache has never been known for doing quickly.

    We use svnserve; an svn ls -R has just returned over 600,000 files in some 40 projects. I’ve never noticed that merging is particularly slow at all.

    We faced something similar when we stored a lot of binary files in Subversion (many of them 25 MB or larger). What we observed is that after we stopped and restarted the Apache service, performance improved immediately and then slowly degraded over the next week. So we wrote a script that stops and starts the Apache service once a day. This improved things significantly.

    I am not sure if this is the correct solution for the problem you are facing, but it may be worth a try.

    This is the kind of problem that a centralized model will ultimately run into, and one of the motivating factors behind systems like BitKeeper, Git, Bzr, and Mercurial.
