Migrate SVN to git with cleanup

I want to migrate my project from SVN to git but I would like to have a clean history of all files and basically get rid of branches. A few of the issues I have:

  • Some modules (subdirectories) were created, worked on and later discarded. I don’t want them in my history.
  • Some code was developed elsewhere in the repository (not in a branch of this project) and then brought in with svn copy, e.g. /otherproject/trunk/foo was incorporated into /myproject/trunk/bar. It should look like it was developed there from the start.
  • A huge rework once happened in a branch, so we decided to backport from trunk to that branch and then move the branch to trunk (replacing trunk with the branch). It should look like it was trunk from the beginning.

When I use svn2git I end up with all kinds of branches (some “imported” from the other project) and the history of the trunk is not as helpful as it could be. What I would like instead is to recover the history of each single file (or of each directory, which is cleaner) while stripping out all the moving around that took place. If other branches that are still relevant are not preserved by this, it doesn’t particularly matter.

  • A related question to this is: From svn to git, with a moved trunk

    I would be happy for any suggestions, e.g. do some magic in SVN before the migration, have a clever way of migration or “clean up” in git after the migration.

One solution collected from the web for “Migrate SVN to git with cleanup”

    1 Simple cases like #1 (modules that were later discarded) can be filtered out of a repository dump with svndumpfilter.
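A minimal sketch of that filtering step, driven from Python. The function names, the repository path and the module path are hypothetical; the `svnadmin dump` and `svndumpfilter exclude` invocations with `--drop-empty-revs` and `--renumber-revs` are the standard ones. Note that svndumpfilter may refuse to continue if a path you keep was copied from a path you exclude, which is where the more complex approach comes in.

```python
import subprocess


def build_filter_commands(repo_path, discarded_prefixes):
    """Return the argv lists for dumping a repository and excluding paths.

    discarded_prefixes are repository paths (hypothetical examples below)
    whose whole history should be dropped from the dump.
    """
    dump_cmd = ["svnadmin", "dump", "--quiet", repo_path]
    filter_cmd = [
        "svndumpfilter", "exclude",
        "--drop-empty-revs",   # drop revisions that become empty
        "--renumber-revs",     # close the gaps left by dropped revisions
        *discarded_prefixes,
    ]
    return dump_cmd, filter_cmd


def write_filtered_dump(repo_path, discarded_prefixes, out_path):
    """Pipe `svnadmin dump` through `svndumpfilter` into out_path."""
    dump_cmd, filter_cmd = build_filter_commands(repo_path, discarded_prefixes)
    with open(out_path, "wb") as out:
        dump = subprocess.Popen(dump_cmd, stdout=subprocess.PIPE)
        filt = subprocess.Popen(filter_cmd, stdin=dump.stdout, stdout=out)
        dump.stdout.close()  # let svnadmin see a broken pipe if the filter dies
        filter_rc = filt.wait()
        dump_rc = dump.wait()
    if dump_rc or filter_rc:
        raise RuntimeError("dump/filter failed: %s, %s" % (dump_rc, filter_rc))
```

Calling `write_filtered_dump("/srv/svn/repo", ["myproject/trunk/oldmodule"], "filtered.dump")` (both paths are placeholders) would produce a dump file that can then be loaded into a fresh repository.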

    2 More complex cases can be solved by rewriting history, though you should understand that this is a complex and error-prone process.

    So, basically you should:

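    • Make a dump of the repository (svnadmin dump).
    • Process it with Python and the svndump library as needed: remove nodes, replace node paths, rearrange revisions and so on.
    • Load the processed dump into a fresh repository (svnadmin load).
    • Compare the old repository with the new one, if needed, to be sure that no history was lost.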
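The "replace node paths" step can be sketched with the standard library alone. This is not the svndump library mentioned above (its API is not shown here); it is a hand-rolled pass over the dump format that rewrites only the `Node-path` and `Node-copyfrom-path` headers and copies everything else through byte for byte. The function names and the example paths are assumptions for illustration. It does not touch paths stored inside properties such as svn:mergeinfo, and it does not create missing parent directories or remove copy operations that become redundant, so the result still needs checking before and after loading.

```python
import io

# Headers that carry repository paths in an SVN dump stream.
PATH_HEADERS = (b"Node-path", b"Node-copyfrom-path")


def remap(path, old, new):
    """Replace the leading directory `old` with `new`, if present."""
    if path == old:
        return new
    if path.startswith(old + b"/"):
        return new + path[len(old):]
    return path


def rewrite_paths(src, dst, old, new):
    """Copy a dump from src to dst, remapping path headers old -> new."""
    while True:
        line = src.readline()
        if not line:
            break                      # end of dump
        if line == b"\n":
            dst.write(line)            # blank separator between records
            continue
        content_length = 0
        while line not in (b"\n", b""):  # one header block
            key, sep, value = line.rstrip(b"\n").partition(b": ")
            if sep and key in PATH_HEADERS:
                line = key + b": " + remap(value, old, new) + b"\n"
            elif sep and key == b"Content-length":
                content_length = int(value)
            dst.write(line)
            line = src.readline()
        dst.write(line)                # blank line that ends the headers
        # Properties and file text are copied untouched, so nothing that
        # merely looks like a header inside file content gets rewritten.
        dst.write(src.read(content_length))


def rewrite_bytes(data, old, new):
    """Convenience wrapper for small in-memory dumps."""
    out = io.BytesIO()
    rewrite_paths(io.BytesIO(data), out, old, new)
    return out.getvalue()
```

For a real repository you would open the dump files in binary mode and call `rewrite_paths(src, dst, b"otherproject/trunk/foo", b"myproject/trunk/bar")`; dump paths carry no leading slash. Because the path headers are not counted in `Content-length`, changing them does not invalidate the length fields.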

    If you want, I can provide Python script examples.
