Migrate SVN to git with cleanup

I want to migrate my project from SVN to git, but I would like a clean history for all files and, basically, to get rid of the branches. A few of the issues I have:

  • Some modules (subdirectories) were created, worked on and later discarded. I don’t want them in my history.
  • Some code was developed elsewhere in the repository (not in a branch of this project) and then svn copy’ed in, i.e. /otherproject/trunk/foo was incorporated into /myproject/trunk/bar. It should look like it was developed there from the start.
  • Once, a huge rework happened in a branch, so we decided to backport from trunk to that branch and then move the branch to trunk (replacing trunk with the branch). It should look like it was trunk since its inception.

When I use svn2git I end up with all kinds of branches (some “imported” from the other project), and the history of the trunk is not as helpful as it could be. What I would like instead is to recover the history of each single file (or directory, which is cleaner) while stripping out all the moving around that took place. It doesn’t particularly matter if other branches that are still relevant are not preserved by this.

A related question to this is: From svn to git, with a moved trunk

I would be grateful for any suggestions, e.g. doing some magic in SVN before the migration, a clever way of migrating, or cleaning up in git after the migration.

One solution collected from the web for “Migrate SVN to git with cleanup”

    1 Simple cases like #1 (the discarded modules) can be filtered out of a dump with svndumpfilter.
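As a rough illustration of the svndumpfilter route, the sketch below pipes `svnadmin dump` through `svndumpfilter exclude` into `svnadmin load`. The function names and the path `myproject/trunk/oldmodule` are hypothetical, chosen only for this example. It assumes the Subversion command-line tools are on the PATH and that the new repository location does not exist yet. One known limitation: svndumpfilter tends to fail when a path you keep was copied from a path you exclude, so it may only be suitable for modules that nothing else was copied from.

```python
import subprocess


def exclude_command(paths, drop_empty_revs=True):
    """Build the svndumpfilter command line that drops the given path prefixes."""
    cmd = ["svndumpfilter", "exclude"]
    if drop_empty_revs:
        # Remove revisions that become empty and renumber the remaining ones.
        cmd += ["--drop-empty-revs", "--renumber-revs"]
    return cmd + list(paths)


def filter_repo(old_repo, new_repo, paths):
    """Dump old_repo, strip the given paths, and load the result into new_repo."""
    subprocess.run(["svnadmin", "create", new_repo], check=True)
    dump = subprocess.Popen(
        ["svnadmin", "dump", "--quiet", old_repo], stdout=subprocess.PIPE
    )
    filt = subprocess.Popen(
        exclude_command(paths), stdin=dump.stdout, stdout=subprocess.PIPE
    )
    dump.stdout.close()  # let the filter own the read end of the pipe
    subprocess.run(
        ["svnadmin", "load", "--quiet", new_repo], stdin=filt.stdout, check=True
    )
    filt.stdout.close()
    for proc in (dump, filt):
        if proc.wait() != 0:
            raise subprocess.CalledProcessError(proc.returncode, proc.args)
```

A call might look like `filter_repo("/srv/svn/old", "/srv/svn/clean", ["myproject/trunk/oldmodule"])`, where both repository paths are placeholders. Note that paths in a dump have no leading slash.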

    2 More complex cases can be solved by rewriting history, though you should understand that this is a complex and error-prone process.

    So, basically you should:

    • make a dump of the repository
    • process it with Python + svndump lib as you want: remove nodes, replace node paths, rearrange revisions and so on
    • load the dump into a new repository
    • compare (if needed) the old repository with the new one to be sure that no history has been lost

    If you want, I can provide Python script examples.
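To give an idea of what the "replace node paths" step could look like, here is a minimal sketch in plain Python. It does not use the svndump lib mentioned above, and it is not the answerer's script. The names `read_records` and `rewrite_paths` are made up for this example. It assumes the usual dump layout (a block of `Key: value` headers, a blank line, then `Content-length` bytes of body) and that `svnadmin load` tolerates the slightly different blank-line spacing this writes. Both assumptions should be checked by loading the result into a scratch repository.

```python
def read_records(stream):
    """Yield (headers, body) pairs from a binary SVN dump stream.

    headers is a list of (key, value) byte pairs in their original order.
    """
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if line == b"\n":
            continue  # blank separator between records
        headers = []
        while line not in (b"", b"\n"):
            key, _, value = line.rstrip(b"\n").partition(b": ")
            headers.append((key, value))
            line = stream.readline()
        # The body is read by length, so file contents are never parsed.
        length = int(dict(headers).get(b"Content-length", b"0"))
        yield headers, stream.read(length)


def rewrite_paths(src, dst, old_prefix, new_prefix):
    """Copy a dump from src to dst, moving old_prefix to new_prefix."""
    path_keys = (b"Node-path", b"Node-copyfrom-path")
    for headers, body in read_records(src):
        for key, value in headers:
            under_prefix = value == old_prefix or value.startswith(old_prefix + b"/")
            if key in path_keys and under_prefix:
                value = new_prefix + value[len(old_prefix):]
            dst.write(key + b": " + value + b"\n")
        dst.write(b"\n" + body + b"\n")
```

With both files opened in binary mode, `rewrite_paths(src, dst, b"otherproject/trunk/foo", b"myproject/trunk/bar")` would make the code from the question appear under its final location. This covers only the renaming. The original `svn copy` record then turns into a copy of a path onto itself and would still have to be dropped, and the parent directories must already exist at the revision where the files first appear, which is part of why the process is error prone.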
