Merge two repositories (original project and changed project WITHOUT HISTORY)
I have two repositories:
- Gephi (a big open source project), hosted on GitHub
- My company’s project, based on Gephi
Seven months ago, when our project started, somebody took a snapshot of the Gephi project on GitHub and saved it to our corporate SVN, so the change history was lost.
Now I have decided to move our project to a Git repository and merge our changes with the original project.
I now have a Git repository migrated from SVN with git-svn.
My files do not have any change history from before the time our project started.
Can I map the initial state of our repository to a state of the original repository? In other words, I would like to start applying our changes to the original repository from a specific revision.
Today I found another obstacle. Schema first:
- The red branch is the original project.
- <alpha2> are commits of plugins for the main project (unrelated to the code committed in <E' E'' E'''>).
- <E'> <E''> <E'''> added code from the main project (red) repository <E> (each commit contains roughly one third of the project).
I have fetched the red and blue repositories into one. The second schema shows the desired state. Is it possible to do this? (For example, make just one commit (<E'>) from <E' E'' E''>, and then mark that commit as merged from branches?)
Thank you Julien for your response. It seems very helpful.
2 Answers
Disclaimer: I have now tested this, and it seems like it works as expected (assuming I understood you correctly, of course). However, there’s still a lot that can go wrong. Absolutely only try this out on a separate working copy of your project’s repository, and make sure to examine everything before pushing it anywhere. Keep full directory backups of the state before you did this.
So I assume you have two independent repositories. The original project (Gephi):
A---B---C---D---E
                ^ HEAD of Gephi
And your project, whose first revision looks identical to the original project’s last revision:
E'---V---W---Y---...---Z
                       ^ HEAD of your project
(possibly with some branches, but that doesn’t really matter here.)
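What you’d like to have (if I understood correctly) is a single history in which V follows E directly:
A---B---C---D---E---V---W---Y---...---Z
                ^                     ^
                HEAD of Gephi         HEAD of your project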
You could try the following. Again, do this on your own, separate working tree, and make sure everything is in order before pushing this to any central repository!
While in the directory of your own working tree, fetch the heads and objects of the original Gephi repository:
git fetch /path/to/original/gephi
If you haven’t cloned the Gephi repository, you can specify the GitHub URL instead of a local filesystem path.
This will result in the following situation in your current working tree:
A---B---C---D---E
                ^ FETCH_HEAD

E'---V---W---Y---...---Z
                       ^ HEAD
We haven’t changed a lot. Currently, the two heads coexist peacefully and completely independently from each other, but you now have access to the objects from both repositories and can try to combine them.
We now want to discard E’ (it should be identical to E), and instead make E the parent of your project’s first commit, which is V. To do this, you can use
git filter-branch -f --parent-filter 'test $GIT_COMMIT = <V> && echo "-p <E>" || cat'
Replace <V> and <E> with the commit hashes of V and E respectively. To find those out, you can run git log to examine your project’s commits and, since we’ve fetched them, git log FETCH_HEAD to examine Gephi’s commits.
This will effectively connect V directly to E.
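The re-parenting step can be tried out safely on throwaway repositories first. The following is only a minimal sketch: the repository names, file names, commit messages and the `commit` helper are made up for the demo, and `FILTER_BRANCH_SQUELCH_WARNING` merely skips the warning pause that newer Git versions add to filter-branch.

```shell
#!/bin/sh
# Throwaway demo: two unrelated repositories, then re-parent V onto E.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's warning delay in newer Git
demo=$(mktemp -d)

# Hypothetical helper: commit one file with a fixed demo identity.
commit() {  # usage: commit <repo> <file> <content> <message>
    echo "$3" > "$1/$2"
    git -C "$1" add "$2"
    git -C "$1" -c user.name=demo -c user.email=demo@example.com commit -q -m "$4"
}

# Stand-in for the original project: commits D and E.
git init -q "$demo/original"
commit "$demo/original" core.txt "d" "D"
commit "$demo/original" core.txt "e" "E"

# Stand-in for the company project: snapshot E' (same content as E), then V and W.
git init -q "$demo/company"
commit "$demo/company" core.txt "e" "E'"
commit "$demo/company" plugin.txt "v" "V"
commit "$demo/company" plugin.txt "w" "W"

cd "$demo/company"
git fetch -q "$demo/original" HEAD
E=$(git rev-parse FETCH_HEAD)   # commit the snapshot was based on
V=$(git rev-parse HEAD~1)       # first real change in the company project

# Same parent filter as above, with the two hashes filled in from variables.
git filter-branch -f --parent-filter \
    "test \$GIT_COMMIT = $V && echo '-p $E' || cat" HEAD >/dev/null

git log --format=%s   # history is now W, V, E, D; E' is no longer reachable
```

Note that V and every commit after it get new hashes, because their ancestry changed. That is why this should only be done before anyone else has based work on the migrated repository.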
This should even work if it turns out that the head (i.e. the latest commit) of the original Gephi repository isn’t what you based your project on, meaning that there have been new commits in Gephi that you haven’t (yet?) taken care of. Just be sure, again, to substitute <E> with the hash of the commit that you have based your changes on, not with the head.
Conversely, be sure that you substitute <V> with the hash of the first change you made. Maybe your repository doesn’t contain an E’ identical to E, and the very first commit already contains changes relative to the original. Then this first commit’s hash will be your <V>, instead of the one after it.
To summarize both last paragraphs: the above command should also work if your situation looks like, for example, this:
A---B---C---D---E---F---G---H---I
                ^               ^
                |               FETCH_HEAD
                point where your project branched off

V---W---Y---...---Z
^                 ^
|                 HEAD
first change based on E
Just make sure to use the commit hashes that make sense in this context.
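If it is unclear which Gephi commit the snapshot was taken from, one possible way to find out is to compare tree hashes: two commits with identical content share the same tree object. The sketch below assumes the snapshot was stored byte for byte; if the SVN import changed line endings or expanded keywords, no tree will match and you would have to fall back to comparing diffs by hand. All repository and file names, and the `commit` helper, are invented for the demo.

```shell
#!/bin/sh
# Sketch: find the upstream commit whose tree equals the tree of our root commit.
set -e
demo=$(mktemp -d)

# Hypothetical helper: commit one file with a fixed demo identity.
commit() {  # usage: commit <repo> <file> <content> <message>
    echo "$3" > "$1/$2"
    git -C "$1" add "$2"
    git -C "$1" -c user.name=demo -c user.email=demo@example.com commit -q -m "$4"
}

# Stand-in for the original project; it has moved on to F since the snapshot.
git init -q "$demo/original"
commit "$demo/original" core.txt "d" "D"
commit "$demo/original" core.txt "e" "E"
commit "$demo/original" core.txt "f" "F"

# Stand-in for the company project, whose root commit is a snapshot of E.
git init -q "$demo/company"
commit "$demo/company" core.txt "e" "E'"
commit "$demo/company" plugin.txt "v" "V"

cd "$demo/company"
git fetch -q "$demo/original" HEAD

# Tree of our first (parentless) commit.
root=$(git rev-list --max-parents=0 HEAD)
root_tree=$(git rev-parse "$root^{tree}")

# Walk upstream history, newest first, until a commit has the same tree.
match=""
for c in $(git rev-list FETCH_HEAD); do
    if [ "$(git rev-parse "$c^{tree}")" = "$root_tree" ]; then
        match=$c
        break
    fi
done
echo "snapshot matches upstream commit: ${match:-none found}"
```

A match gives you the hash to use as <E>; the commit that follows your root commit is then <V>.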
It sounds like you might want to investigate grafts (or possibly replacements). These methods differ from rebase in that they add metadata to change the visible state of the repository, rather than rewriting history to actually effect the change. This is useful when you have people using the existing branches, as it avoids changing history from under them.
In your case, you’d want to add a graft that records E as an extra parent of the commit where your history should join the original project.
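As a rough illustration of the replacement approach, the sketch below uses `git replace --graft`, which newer Git versions offer in place of the older grafts file. The throwaway repositories, file names and the `commit` helper are assumptions for the demo, and the graft here gives V the single parent E rather than adding an extra one.

```shell
#!/bin/sh
# Sketch: join two histories with a replacement ref instead of rewriting commits.
set -e
demo=$(mktemp -d)

# Hypothetical helper: commit one file with a fixed demo identity.
commit() {  # usage: commit <repo> <file> <content> <message>
    echo "$3" > "$1/$2"
    git -C "$1" add "$2"
    git -C "$1" -c user.name=demo -c user.email=demo@example.com commit -q -m "$4"
}

# Stand-in for the original project: commits D and E.
git init -q "$demo/original"
commit "$demo/original" core.txt "d" "D"
commit "$demo/original" core.txt "e" "E"

# Stand-in for the company project: snapshot E', then V and W.
git init -q "$demo/company"
commit "$demo/company" core.txt "e" "E'"
commit "$demo/company" plugin.txt "v" "V"
commit "$demo/company" plugin.txt "w" "W"

cd "$demo/company"
git fetch -q "$demo/original" HEAD
E=$(git rev-parse FETCH_HEAD)
V=$(git rev-parse HEAD~1)

# Record a replacement: wherever V is read, Git uses a copy whose parent is E.
git replace --graft "$V" "$E"

git log --format=%s                        # W, V, E, D (replacement in effect)
git --no-replace-objects log --format=%s   # W, V, E' (the untouched real history)
```

To add E as an extra parent instead, list the commit’s existing parents before E in the `git replace --graft` call. The replacement lives under refs/replace/, so other clones only see it if that ref is pushed and fetched explicitly, and `git replace -d` removes it again.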