Migrate SVN to git with cleanup

I want to migrate my project from SVN to git but I would like to have a clean history of all files and basically get rid of branches. A few of the issues I have:

  • Some modules (subdirectories) were created, worked on, and later discarded. I don’t want them in my history.
  • Some code was developed elsewhere in the repository (not in a branch of this project) and then brought in with svn copy, e.g. /otherproject/trunk/foo was incorporated into /myproject/trunk/bar. It should look like it was developed there from the start.
  • Once, a huge rework happened in a branch, so we decided to backport from trunk to that branch and then replace trunk with the branch. It should look as if the branch had been trunk since its inception.

When I use svn2git I end up with all kinds of branches (some “imported” from the other project), and the history of trunk is not as helpful as it could be. What I would like instead is to recover the history of each single file (or, more cleanly, each directory) while stripping out all the moving around that took place. If other branches that are still relevant are not preserved by this, it doesn’t particularly matter.

A related question to this is: From svn to git, with a moved trunk

I would be grateful for any suggestions, e.g. some magic in SVN before the migration, a clever way of migrating, or a “clean up” in git after the migration.

One solution collected from the web for “Migrate SVN to git with cleanup”

    1. Simple cases like #1 (the discarded modules) can be filtered out with svndumpfilter.
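As a rough illustration of the svndumpfilter route, the sketch below wraps the command in Python. It assumes a dump file already produced by svnadmin dump. The file names and the module path in the usage note are hypothetical. The `--drop-empty-revs` and `--renumber-revs` options are optional and only tidy up revisions that become empty after filtering.

```python
import subprocess

def svndumpfilter_exclude_cmd(prefixes):
    """Build an svndumpfilter command line that drops the given path prefixes."""
    return ["svndumpfilter", "exclude",
            "--drop-empty-revs", "--renumber-revs", *prefixes]

def filter_dump(src_dump, dst_dump, prefixes):
    """Pipe src_dump through svndumpfilter and write the result to dst_dump."""
    # svndumpfilter reads the dump on stdin and writes the filtered dump to stdout.
    with open(src_dump, "rb") as src, open(dst_dump, "wb") as dst:
        subprocess.run(svndumpfilter_exclude_cmd(prefixes),
                       stdin=src, stdout=dst, check=True)
```

A call might look like `filter_dump("full.dump", "filtered.dump", ["myproject/trunk/discarded_module"])`, after which the filtered dump can be loaded into a fresh repository.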

    2. More complex cases can be solved by rewriting history, though you should understand that this is a complex and error-prone process.

    So, basically you should:

    • make a dump of the repository
    • process it with Python + the svndump lib as you want: remove nodes, replace node paths, rearrange revisions, and so on
    • load the dump into a new repository
    • compare (if needed) the old repository with the new one to be sure that no history is lost
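To make the processing step more concrete, here is a minimal sketch of rewriting node paths in a dump. It does not use the svndump lib (whose API is not shown here), only the Python standard library. It assumes a non-delta dump in which every record that carries content has a `Content-length` header. The function names and the prefixes in the usage note are hypothetical, and anything beyond plain prefix renaming (dropping nodes, reordering revisions, creating missing parent directories) is left out.

```python
def read_records(stream):
    """Yield (headers, body) pairs from a binary SVN dump stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of dump
        if line == b"\n":
            continue  # blank separator between records
        headers = []
        while line not in (b"", b"\n"):
            key, _, value = line.rstrip(b"\n").partition(b": ")
            headers.append((key, value))
            line = stream.readline()
        # Read the body by length so file contents are never parsed as headers.
        length = next((int(v) for k, v in headers if k == b"Content-length"), 0)
        yield headers, stream.read(length)

def rename_prefix(path, old, new):
    """Replace a leading directory prefix, matching whole path components only."""
    if path == old or path.startswith(old + b"/"):
        return new + path[len(old):]
    return path

def rewrite_paths(src, dst, old, new):
    """Copy a dump from src to dst, moving everything under old to new."""
    for headers, body in read_records(src):
        for key, value in headers:
            if key in (b"Node-path", b"Node-copyfrom-path"):
                value = rename_prefix(value, old, new)
            dst.write(key + b": " + value + b"\n")
        dst.write(b"\n" + body + b"\n")
```

With both files opened in binary mode, `rewrite_paths(src, dst, b"otherproject/trunk/foo", b"myproject/trunk/bar")` would make the code appear under its final location. Header lines are not counted in the content lengths, so changing paths does not require recomputing them, but the result should still be loaded into a scratch repository and compared with the original before it is trusted.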

    If you want, I can provide Python script examples.
