git revert vs. git show -R & git apply -3
My git repository has a huge number of objects (nearly 600k, a mix of text and binary files; the packfiles total 20 GiB), and git revert takes a very long time (around 15 minutes) to finish. git gc, which itself takes even longer (around 40 minutes), didn't help.
Then I came up with the idea of making a reverse patch of the commit of interest with git show --binary -R <sha1> > /tmp/x, merge-applying it to the index and the working directory with git apply -3 /tmp/x, and running git commit right after that to produce a "revert commit". It worked, and finished its job in less than 30 seconds.
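Concretely, the procedure looks like this on a throwaway repository (the repository name, file, and commit messages below are made up for illustration; in the real repo the reverse patch targets the <sha1> of the commit to revert):

```shell
#!/bin/sh
# Demonstration of the reverse-patch "revert" on a throwaway repo.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q demo && cd demo
git config user.email demo@example.com
git config user.name demo
echo one > f.txt && git add f.txt && git commit -qm "add f"
echo two > f.txt && git commit -qam "change f"

# 1. Make a reverse patch of the commit (--binary covers binary files too):
git show --binary -R HEAD > ../revert.patch
# 2. Merge-apply it to the index and working tree (-3 implies --index):
git apply -3 ../revert.patch
# 3. Commit the staged result as a "revert commit":
git commit -qm "Revert \"change f\""

cat f.txt    # back to the pre-change content
```

Because -3 implies --index, the applied reverse diff is already staged, so the final git commit needs no further arguments.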
What does git revert do exactly, and what might I miss if I use git show --binary -R, git apply -3, and then git commit instead?
I assumed the two operations were equivalent, but from the observation above, git revert evidently does much more than git show --binary -R followed by git apply -3.
Note that merge conflicts are not an issue; they happen frequently and I resolve them each time without problems. git apply -3 also appears to detect them correctly. auto-gc is turned off.
I discovered one difference: if the reverse patch contains a file deletion and that file's current content differs from the patch, git apply -3 prints error: <filename>: patch does not apply and exits without touching the index. In the same situation git revert puts such files into a conflicted state, reported by git status as deleted by them.
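This difference can be reproduced on a scratch repository (the file names and contents below are made up, and the exact error text of git apply -3 may vary slightly between git versions):

```shell
#!/bin/sh
# Reproduction sketch: reverting a commit that added a file which was
# later modified, i.e. the reverse patch deletes a file whose content
# no longer matches the patch.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q repo && cd repo
git config user.email demo@example.com
git config user.name demo
echo base > base.txt && git add base.txt && git commit -qm "base"
echo v1 > pkg.txt && git add pkg.txt && git commit -qm "install pkg"
install=$(git rev-parse HEAD)
echo "v1 locally modified" > pkg.txt && git commit -qam "local change"

# The reverse patch deletes pkg.txt but expects its content to be "v1":
git show --binary -R "$install" > ../undo.patch
git apply -3 ../undo.patch || echo "git apply refused the patch"
git reset --hard -q    # make sure we are back to a clean state

# git revert records a modify/delete conflict instead:
git revert --no-edit "$install" || true
git status --porcelain -- pkg.txt    # UD = "deleted by them"
```

Here git revert leaves pkg.txt unmerged in the index (porcelain status UD), so the conflict can be resolved and committed, whereas a refused git apply leaves nothing to resolve.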
I use git as a backup system for my entire Linux system, whose distribution is Linux From Scratch. Being skeptical of package management systems, I manage packages on my OS manually using git: I make a pair of commits for each package I install, snapshotting the state before and after installation. These pairs let me uninstall a package with git revert <the-commit-after-install>.
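A minimal sketch of that snapshot-pair workflow (the package name foo and the file layout are hypothetical, and the real tree is the whole filesystem rather than a toy directory):

```shell
#!/bin/sh
# Sketch of the before/after snapshot-pair workflow.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q sys && cd sys
git config user.email demo@example.com
git config user.name demo

echo conf > etc.conf
git add -A && git commit -qm "before installing foo"

# "install" the package:
mkdir -p usr/bin && echo foo-binary > usr/bin/foo
git add -A && git commit -qm "after installing foo"
after=$(git rev-parse HEAD)

# later: uninstall the package by reverting the after-install commit
git revert --no-edit "$after"
test ! -e usr/bin/foo && echo "foo uninstalled"
```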
I know git is not meant to be used as a backup system. I use metastore to store metadata (owners, permissions, etc.) in a binary file, which regularly causes conflicts on reverts. I chose git because it is widely used and seemed stable; "being skeptical" also plays a part in choosing non-backup software for a backup purpose.
I had heard that git is not good at handling binary files, especially big ones. In practice it did a decent job with binary files, until I installed TeXLive 2016, the second revision of TeXLive on my OS.
In most cases I use git revert when upgrading a package: first I git revert the old package to uninstall it, then I fresh-install the new one so that each file's history doesn't get convoluted (a chain of modifications makes it harder when I really want to uninstall it later). I cannot leave the shell until git revert finishes, because I have to install the now-missing new package right away or risk the stability of my OS.