Is there a way to easily convert a series of tarballs of a source tree into a git repository?
I’m new to git and I have a moderately large number of weekly tarballs from a long running project. Each tarball has on average a few hundred files in it. I’m looking for a git strategy that will allow me to add the expanded contents of each tarball to a new git repository, starting from version 1.001 and going through version 1.650. As of this stage of the project 99.5% of tarball(n) is just a copy of version(n-1) – in other words, a perfect candidate for git. The desired end result is to have only the master branch remaining at the end of the process.
I think I know git well enough to do this “by hand”. As I understand it there is no possibility of a merge conflict since there will be no opportunity to change the master before the next version is added and committed. A shell script is my first guess, but I’m not sure how well bash will like it when git checkout branch_n gets processed while bash is executing in branch_n-1. For the purposes of this project the host environment is Ubuntu 10.4, resources available are 8 Gig RAM, 500 Gig Disk space free and 4 CPU processor at 3.ghz .
I don’t need someone else to solve the problem but I could use a nudge in the right direction as to how a git expert would approach it. Any advice from someone who’s “been there done that” would be appreciated.
PS: I have looked at site’s suggested “related questions” and found nothing relevant.
Regarding this comment:
I’m not sure how well bash will like it when git checkout branch_n gets processed while bash is executing in branch_n-1
Are you concerned about two operations running concurrently and getting in each others’ way? This shouldn’t be a problem unless you intentionally run operations in parallel.
Assuming the tarballs follow a linear evolution, branching shouldn’t come into this at all.
The process should be fairly straightforward:
- untar tarball n
- git add --all . ; git commit (with appropriate flags)
- git tag -a v1.001 -m "Version 1.001." (using that tarball's own version number)
- rm -rf * (to handle deletions in the history; you want to leave .git intact, of course)
- repeat from the first step with the next tarball
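The loop above could be scripted roughly as follows. This is only a sketch under assumptions that are not in the question: the function name import_tarballs is made up, the tarballs are assumed to be named like project-1.001.tar.gz, and each one is assumed to unpack into a single top-level directory (hence --strip-components=1). It clears the tree with find rather than rm -rf * so that hidden files other than .git are removed as well.

```shell
#!/usr/bin/env bash
# Sketch: import_tarballs REPO_DIR TARBALL...
# Hypothetical naming: <project>-<version>.tar.gz, one top-level directory inside.

import_tarballs() {
    local repo=$1; shift
    local tarball abs version

    mkdir -p "$repo" && ( cd "$repo" && git init -q ) || return 1

    for tarball in "$@"; do
        # Absolute path, because the work below happens inside the repository.
        abs=$(cd "$(dirname "$tarball")" && pwd)/$(basename "$tarball")
        # project-1.001.tar.gz -> 1.001
        version=$(basename "$tarball" .tar.gz)
        version=${version##*-}
        (
            cd "$repo" &&
            # Empty the working tree but keep .git, so deleted files vanish.
            find . -mindepth 1 -maxdepth 1 ! -name .git -exec rm -rf {} + &&
            tar -xzf "$abs" --strip-components=1 &&
            git add --all . &&
            # --allow-empty keeps one commit per tarball even if nothing changed.
            git commit -q --allow-empty -m "Version $version" &&
            git tag -a "v$version" -m "Version $version."
        ) || return 1
    done
}
```

Called as import_tarballs myrepo /path/to/project-1.*.tar.gz, the shell expands the glob in lexical order, which matches version order only because the version numbers are zero-padded (1.001 through 1.650).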
Take a look at
What I would do in this situation, as you have tarballs that are in the end ‘tagged versions’:
- create empty git repository
- extract a tarball to that directory overwriting any files
- add all files
git add .
git commit -a -m 'version foo'
- git tag current version
- remove all files
- repeat from step 2 for each tarball
In your case it’s not necessary to create branches as all your tarballs are distinct, successive versions; each iteration overwrites previous one.
Without having been exactly there, you should simply:
- untar an archive anywhere you want
- rsync it with the git working directory in order to:
- change the relevant files
- add the new files from that archive to the working directory
- remove the files from the working directory that are no longer part of the current archive
git add -A
git commit -m "archive n"
The idea is not to checkout branch_n+1, but to stay within the same branch, committing each tar content one after the other within the same branch of the same git repo.
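A minimal sketch of that rsync step, assuming rsync is installed and each archive unpacks into a single top-level directory. The function name sync_archive and its arguments are hypothetical. The --delete option removes files that are no longer in the archive, and --exclude=/.git keeps rsync from touching the repository data.

```shell
#!/usr/bin/env bash
# Sketch: sync_archive TARBALL WORKTREE MESSAGE
# WORKTREE must already be a git working directory (git init done once).

sync_archive() {
    local tarball=$1 worktree=$2 message=$3
    local staging status

    # Unpack somewhere outside the repository first.
    staging=$(mktemp -d) || return 1

    tar -xzf "$tarball" -C "$staging" --strip-components=1 &&
    # Trailing slashes: copy the contents of staging into worktree.
    rsync -a --delete --exclude=/.git "$staging"/ "$worktree"/ &&
    ( cd "$worktree" && git add -A && git commit -q -m "$message" )
    status=$?

    rm -rf "$staging"
    return $status
}
```

Note that git commit fails when two successive archives are identical, so a caller looping over many archives may want to decide whether that case should stop the run.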
Should you truly have somehow two concurrent processes, you could then:
git clone the first git repo
git checkout -b a_new_branch to make sure you isolate that parallel process in its own branch, which you will be able to push back to the first repo when done.
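For that parallel case, the clone-and-branch step might look like this. The function name isolate_work and the paths are placeholders, not anything from the question.

```shell
#!/usr/bin/env bash
# Sketch: isolate_work ORIGIN_REPO CLONE_DIR BRANCH
# Clones the first repository and switches the clone to a fresh branch.

isolate_work() {
    local origin=$1 clone=$2 branch=$3
    git clone -q "$origin" "$clone" &&
    ( cd "$clone" && git checkout -q -b "$branch" )
}
```

Once the work in the clone is committed, git push origin a_new_branch run from inside the clone sends that branch back to the first repo.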