Merge changes of copied repository without true common ancestor in git

I have a project, DemoA, that was built off of a Git repository, Project1.

Unfortunately, DemoA started as simply a copy of the files from Project1, before itself turning into an actual long-term project. I would now like to make Project1 a submodule of DemoA, but, more importantly, I want to merge the changes that DemoA made to the Project1-derived code back into Project1.

I have done a subtree split on DemoA to create a branch P1, which has all the changes done to the Project1 codebase in DemoA.

I have also managed to add to Project1 the changes that were made in DemoA before DemoA was set up as a repository (commit E below).

Project1
A - B - C - D                     - E
DemoA/P1
    (untracked changes) F - G - H - I

where the files in E are identical to those in F.

What I want:

Project1
A - B - C - D - E - G - H - I

Obviously the hashes for E and F are different, so when I added DemoA/P1 as a remote to Project1 and tried to merge, it complained about no common ancestor.

I have tried using format-patch, but git am complained:

error: file.xyz: already exists in index

and I was trying to rebase onto a different branch, doing:

git rebase -s recursive -X subtree=project1dir --onto (E-hash) (F-hash) emptybranch

but I clearly don’t understand what that is actually doing, as it didn’t seem to actually do anything.

Is there a clean way to do this? I don’t mind some manual work in the process, but I would like to preserve the history.

Answer:

    This is all moderately difficult (actual difficulty level varies depending on circumstances and your familiarity with Git).

    If the files in E and F are truly identical, one easy way to do this is to put in a graft (with git replace or the grafts file) so that Git pretends that G's parent commit is commit E. That is, you have:

    A--B--C--D--E   <-- master
    
    F-------------G--H--I   <-- refs/remotes/rem/P1
    

    and git diff master rem/P1~3 produces no output at all (master names commit E, rem/P1~3 names commit F, three steps back from I, and the two trees for E and F match exactly).
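
    One way to run that check, sketched in a throwaway repository. The temporary directory, the file name file.txt, and the local branch P1 (standing in for rem/P1) are assumptions made for illustration only. With the four-commit chain F, G, H, I drawn above, F sits three steps back from the tip:

```shell
set -eu
# Hypothetical identity so the commits work on a machine with no Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # force the branch name used in the text

# A..E on master
for c in A B C D E; do
  echo "$c" >> file.txt
  git add file.txt
  git commit -q -m "$c"
done

# F is a new root commit holding exactly E's files; then G, H, I on top.
git checkout -q --orphan P1
git commit -q -m F
for c in G H I; do
  echo "$c" >> file.txt
  git add file.txt
  git commit -q -m "$c"
done

# No diff output and exit status 0 means the two trees match.
if git diff --quiet master P1~3; then
  echo "E and F have identical trees"
fi

# The same check, comparing the tree object IDs directly.
e_tree=$(git rev-parse 'master^{tree}')
f_tree=$(git rev-parse 'P1~3^{tree}')
```

    If the two tree IDs differ, the graft would hide real changes between E and F, so it is worth checking before going further.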

    You wish, at least as an intermediate product perhaps, that you had this:

    A--B--C--D--E   <-- master
                 \
    F             G--H--I   <-- refs/remotes/rem/P1
    

    That is, you’d like Git to pretend, at least for some purposes and some period of time, that commit G has commit E as its parent.

    Using git replace to emulate the old horrible-hack grafts

    Git grafts do precisely that: they tell Git to pretend that the parents of some commit are some other commits. But these have been deprecated in favor of the more generic git replace. You can use git replace to make a new commit G' that resembles G (and supersedes it, at least for most Git commands), with the one difference being that G' has E as its parent.
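
    A minimal sketch of that replacement, using git replace --graft in a throwaway repository. The directory, file.txt, and the local branch P1 (standing in for rem/P1) are assumptions for illustration:

```shell
set -eu
# Hypothetical identity so the commits work on a machine with no Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master

for c in A B C D E; do            # A..E on master
  echo "$c" >> file.txt
  git add file.txt
  git commit -q -m "$c"
done
git checkout -q --orphan P1       # F: new root commit with E's files
git commit -q -m F
for c in G H I; do                # G, H, I on top of F
  echo "$c" >> file.txt
  git add file.txt
  git commit -q -m "$c"
done

g=$(git rev-parse P1~2)           # commit G
e=$(git rev-parse master)         # commit E

# Create G' (a copy of G whose parent is E) and record it under refs/replace/.
git replace --graft "$g" "$e"

# Most commands now follow the replacement: I, H, G, then E..A.
grafted_count=$(git rev-list --count P1)
parent_of_g=$(git rev-parse 'P1~2^')

# The underlying objects are untouched: without replacements it is still F..I.
raw_count=$(git --no-replace-objects rev-list --count P1)
```

    The replacement lives only in this repository's refs/replace/ namespace, which is why a later rewrite is needed to make it part of the real history.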

    You can then use git filter-branch to re-copy the commits in the repository so that this replacement becomes real and permanent, rather than something Git merely pretends. You will, of course, get new commit hashes for the new commits (G' can keep its hash, but H and I must be copied to new commits H' and I'). See the answer by Jakub Narębski, and then "How do git grafts and replace differ? (Are grafts now deprecated?)", where VonC links to Jakub’s answers.
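
    A sketch of making the graft permanent, again in a throwaway repository with an assumed file.txt and a local branch P1 standing in for rem/P1. FILTER_BRANCH_SQUELCH_WARNING only silences the start-up warning in newer Git versions; older versions should simply ignore it:

```shell
set -eu
# Hypothetical identity so the commits work on a machine with no Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master

for c in A B C D E; do            # A..E on master
  echo "$c" >> file.txt
  git add file.txt
  git commit -q -m "$c"
done
git checkout -q --orphan P1       # F: new root commit with E's files
git commit -q -m F
for c in G H I; do                # G, H, I on top of F
  echo "$c" >> file.txt
  git add file.txt
  git commit -q -m "$c"
done

g=$(git rev-parse P1~2)
git replace --graft "$g" master   # pretend G's parent is E

# Re-copy every commit reachable from P1 with no filters. Because
# filter-branch honors replacements, the copies bake the graft in.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -- P1 >/dev/null

# The replacement and the backup ref are no longer needed.
git replace -d "$g" >/dev/null
git update-ref -d refs/original/refs/heads/P1

# Even with replacements disabled, P1 now descends from master.
real_count=$(git --no-replace-objects rev-list --count P1)
```

    Rewriting only P1 keeps master untouched; passing -- --all instead would walk every ref, which is more than this case needs.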

    (Git grafts do still work, and you can just put the hashes for commits G and E into .git/info/grafts: echo $(git rev-parse rem/P1~2) $(git rev-parse master) > .git/info/grafts, for instance. But they are a horrible hack, and if you do this sort of trick it’s best to run your filter-branch immediately afterward, as Jakub notes.)

    Using git rebase

    You can also use git rebase --onto, as you were attempting, but you must start this rebase using an existing (ordinary, local) branch name (I’m not sure where emptybranch came from here) that points to commit I. I think maybe the step you are missing might be making this regular ordinary local branch name:

    git checkout -b rewrite rem/P1
    

    for instance, assuming the name rem/P1 resolves to commit I. Or git checkout -b rewrite <hash-of-I>, if you have that hash in front of you for easy cut/paste. At that point you will have this:

    A--B--C--D--E   <-- master
    
    F-------------G--H--I   <-- HEAD -> rewrite, rem/P1
    

    That is, you’re now on this new rewrite branch, which points to commit I. Now you can git rebase --onto master HEAD~3 to copy the most recent 3 commits on the current branch—G, H, and I. The copies will be G', H', and I', with the parent of G' being E—the commit to which master points—and the parent of H' being G' and so on:

                  G'-H'-I'   <-- HEAD -> rewrite
                 /
    A--B--C--D--E   <-- master
    
    F-------------G--H--I   <-- rem/P1
    

    Now you can delete the remote and its remote-tracking branch since you have the commit chain you want. You can also fast-forward master to point to commit I' at any time, if that’s what you want.
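
    The whole rebase route can be sketched end to end as follows. The two temporary repositories, the file name file.txt, and the remote name rem are assumptions chosen to mirror the diagrams; the real repositories will have their own paths and names:

```shell
set -eu
# Hypothetical identity so the commits work on a machine with no Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

proj=$(mktemp -d)   # stand-in for Project1
demo=$(mktemp -d)   # stand-in for the split-out P1 history

cd "$proj"
git init -q .
git symbolic-ref HEAD refs/heads/master
for c in A B C D E; do            # A..E on master
  echo "$c" >> file.txt
  git add file.txt
  git commit -q -m "$c"
done

cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/P1
cp "$proj/file.txt" .             # F starts from a plain copy of E's files
git add file.txt
git commit -q -m F
for c in G H I; do                # G, H, I on top of F
  echo "$c" >> file.txt
  git add file.txt
  git commit -q -m "$c"
done

cd "$proj"
git remote add rem "$demo"
git fetch -q rem

# Make an ordinary local branch at I, then copy G, H, I onto E.
git checkout -q -b rewrite rem/P1
git rebase -q --onto master HEAD~3

# Fast-forward master to I' and drop the remote.
git checkout -q master
git merge -q --ff-only rewrite
git remote remove rem

total=$(git rev-list --count master)
tip=$(git log -1 --format=%s master)
```

    Using --ff-only makes the last step fail loudly if master has moved since the rebase, rather than silently creating a merge commit.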
