Git recovery: “object file is empty”. How to recreate trees?

Note: I don’t have any before-corruption clone of this repository. I believe my situation is different from others described here, because I’m missing a tree, not a blob.

What happened:

  • When I tried to clone a repository over LAN (via SSH), Git returned an error saying that the repository is corrupted:

    remote: error: object file ./objects/2e/223ce259e9e33998d434acc778bc64b393d5d4 is empty
    remote: fatal: loose object 2e223ce259e9e33998d434acc778bc64b393d5d4 (stored in ./objects/2e/223ce259e9e33998d434acc778bc64b393d5d4) is corrupt
    error: git upload-pack: git-pack-objects died with error.
    fatal: git upload-pack: aborting due to possible repository corruption on the remote side.
    remote: aborting due to possible repository corruption on the remote side.
    

    I read that git fsck can be used to diagnose corruption, but it didn’t tell me anything new:

    git fsck --full
    error: object file ./objects/2e/223ce259e9e33998d434acc778bc64b393d5d4 is empty
    fatal: loose object 2e223ce259e9e33998d434acc778bc64b393d5d4 (stored in ./objects/2e/223ce259e9e33998d434acc778bc64b393d5d4) is corrupt
    

    I’ve tried cloning the repository locally (using --no-hardlinks) to see what happens, but I got exactly the same results.

    Then I stumbled upon this question, where the person who answered simply deleted the empty file (step 3), so I did the same (i.e. I deleted the file 223ce259e9e33998d434acc778bc64b393d5d4 from the subdirectory objects/2e/).
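
    Before deleting anything, it can help to check whether other loose objects are also empty. The following is a minimal sketch, not a Git command: list_empty_objects is a hypothetical helper name, and it assumes you pass the path of the repository's objects directory (./objects in a bare repository, .git/objects otherwise).

```shell
# list_empty_objects <objects-dir>
# Hypothetical helper: prints every zero-byte file below the given objects
# directory, sorted by path. Loose objects are never legitimately empty,
# so each hit is a candidate for the "object file is empty" error.
list_empty_objects() {
    find "$1" -type f -size 0 | sort
}
```

    Usage, assuming a bare repository as in the error messages above: list_empty_objects ./objects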

    I ran git fsck again, and now I see:

    Checking object directories: 100% (256/256), done.
    broken link from    tree 838e437f371c652fa4393d25473ce21cbf697d7a
                  to    tree 2e223ce259e9e33998d434acc778bc64b393d5d4
    dangling commit 54146bc0dc4eb3eede82a0405b749e05c11c5522
    missing tree 2e223ce259e9e33998d434acc778bc64b393d5d4
    dangling commit 864864feec207786b84158e526b2faec7799fd4e
    dangling blob d3cfd7cc7718d5b76df70cf9865db01c25181bfb
    

    So, there is now a problem with tree 838e437f37. That’s not what happened to the guy mentioned above, so I went googling and found some information from Linus.

    So, I did git ls-tree 838e437f371c652fa4393d25473ce21cbf697d7a and in the output there was a line reading:

    040000 tree 2e223ce259e9e33998d434acc778bc64b393d5d4    moje
    

    Now, “moje” is a directory (unlike in the example Linus explained, which was a file). I guess that’s why the next step suggested by Linus, git hash-object moje, returned fatal: Unable to hash moje.

    But anyway, there was just a small chance that it was what I needed, so I went looking further. I ran git log --raw --all --full-history -- moje/ and according to Linus’ guide, there should be a commit which lists 2e223 as the SHA-1 hash of some content, but there’s none. And the list ends with

    fatal: unable to read source tree (2e223ce259e9e33998d434acc778bc64b393d5d4)
    

    I tried looking at the last commit listed before that error, but I didn’t find this hash. I’ve seen this, but it didn’t help me, probably because there were some changes between the problematic version and current state of working tree.

    There’s one thing that may be important: inside moje/ there’s a directory cli/ which is a Git repository itself (a submodule). I’ve looked for the problematic SHA-1 hash there, but haven’t found it.

    What should I do?

2 solutions collected from the web for “Git recovery: “object file is empty”. How to recreate trees?”

    Use this command to get a list of the commits that contain your missing tree:

    git rev-list --all | xargs -l -I '{}' sh -c 'if git ls-tree -rt {} > /dev/null 2>&1 ; then true; else git log --oneline -1 {}; git ls-tree -r -t {} | tail -1; fi'
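
    The same check can be written as a plain loop, which may be easier to read and adapt. This is only a sketch of the one-liner above: list_commits_with_broken_trees is a hypothetical function name, and unlike the one-liner it prints only the commit line, not the last readable tree entry.

```shell
# list_commits_with_broken_trees
# Hypothetical helper: prints one "abbrev-sha subject" line for every commit
# reachable from any ref whose tree cannot be read in full.
# Run it from inside the damaged repository.
list_commits_with_broken_trees() {
    git rev-list --all | while read -r commit; do
        # ls-tree -r -t walks every subtree, so it fails if any tree is missing.
        if ! git ls-tree -r -t "$commit" >/dev/null 2>&1; then
            git --no-pager log --oneline -1 "$commit"
        fi
    done
}
```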
    

    Now you need to recreate the missing tree, by placing the exact same contents in there as you had in there back then and then adding that tree to the repo. The easiest way to do that is probably to just recreate the contents and then commit them to the repo (you can remove that commit afterwards).
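
    If you would rather not create and then remove a commit, the tree can also be written through a temporary index. This is a sketch under assumptions, not part of the original answer: write_tree_for_dir is a hypothetical helper name, it must be run from the top level of the working tree, and the directory must already hold exactly the contents it had back then. Files excluded by .gitignore are not staged by git add, so a tree that contained such files would not be reproduced this way.

```shell
# write_tree_for_dir <dir>
# Hypothetical helper: stages <dir> in a throwaway index, writes the matching
# tree objects into the object database, and prints the ID of the tree for
# <dir>. The real index and the working tree are left untouched.
write_tree_for_dir() (
    dir=${1%/}
    scratch=$(mktemp -d) || exit 1
    # Point Git at a temporary index file; it must not exist yet.
    GIT_INDEX_FILE="$scratch/index"
    export GIT_INDEX_FILE
    git add -- "$dir" && git write-tree --prefix="$dir/"
    status=$?
    rm -rf "$scratch"
    exit "$status"
)
```

    If the printed ID equals the missing one (2e223ce… in the question), the object is back in the database and git fsck should stop reporting it.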

    The command (suggested by Chronial)

    git rev-list --all | xargs -l -I '{}' sh -c 'if git ls-tree -rt {} > /dev/null 2>&1 ; then true; else git log --oneline -1 {}; git ls-tree -r -t {} | tail -1; fi'
    

    returned the first commit that depended on the missing 2e223ce object; its SHA-1 hash was 499b8fb. Its parent was all right (I could see its content, check it out, etc.), and I was also able to check out the next commit after the broken one (89b0fc4).

    Now I needed to see what changes happened between these two “good” commits – that was easy: git diff 499b8fb~ 89b0fc4 returned

    diff --git a/somefile b/somefile
    deleted file mode 100644
    index f5d1e1e..0000000
    --- a/somefile
    +++ /dev/null
    @@ -1,79 +0,0 @@
    [ contents of the deleted "somefile"... ]
    diff --git a/moje/cli b/moje/cli
    index 640a825..c0b1a24 160000
    --- a/moje/cli
    +++ b/moje/cli
    @@ -1 +1 @@
    -Subproject commit 640a825cd671dfba83601d6271e7e027665eaca8
    +Subproject commit c0b1a24aa246289831ec7db3a8596376db1f625a
    

    Now I know that between the parent of the bad commit and the good commit the file somefile was deleted, and submodule’s HEAD changed from 640a825 to c0b1a24. I went to the submodule repository and asked what commits happened between those two:

    git log --oneline 640a825..c0b1a24
    

    which returned

    c0b1a24 <commit message>
    8be9433 <commit message>
    02564e1 <commit message>
    

    Now I knew that four things happened between 499b8fb~ and 89b0fc4:

    • somefile was deleted
    • /moje/cli HEAD was changed from 640a825 to 02564e1
    • /moje/cli HEAD was changed from 02564e1 to 8be9433
    • /moje/cli HEAD was changed from 8be9433 to c0b1a24

    I didn’t know which part happened in 499b8fb (the bad commit), and which in 89b0fc4. But fortunately there are not that many possibilities, so I just tried every one of them. With each combination I made a commit so that Git would calculate the appropriate objects and store them in the database. It turned out that when /moje/cli HEAD was at 8be9433, git commit resulted in creating the missing 2e223ce object – hooray!
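
    The trial-and-error step can be sketched as a loop over the candidate submodule commits. This is not what I actually ran (I made real commits); it is a hedged alternative that uses a temporary index, and find_gitlink_for_tree is a hypothetical helper name. It assumes the only unknown inside the directory is the submodule pointer, and that the wanted tree ID and all candidates are given as full IDs, because git update-index does not expand abbreviated ones.

```shell
# find_gitlink_for_tree <base-commit> <dir> <gitlink-path> <wanted-tree> <candidate>...
# Hypothetical helper. Starting from the tree of <base-commit>, it points the
# submodule entry <gitlink-path> at each candidate commit ID in turn and prints
# the first candidate for which the tree of <dir> hashes to <wanted-tree>.
# Exits non-zero if no candidate matches. Every tree it writes is stored in
# the object database, so a match also restores the missing object.
find_gitlink_for_tree() (
    base=$1 dir=${2%/} link=$3 wanted=$4
    shift 4
    scratch=$(mktemp -d) || exit 1
    GIT_INDEX_FILE="$scratch/index"
    export GIT_INDEX_FILE
    found=1
    for candidate in "$@"; do
        # Load the known-good commit into the throwaway index, then swap the gitlink.
        git read-tree "$base" &&
        git update-index --add --cacheinfo "160000,$candidate,$link" &&
        tree=$(git write-tree --prefix="$dir/") || break
        if [ "$tree" = "$wanted" ]; then
            echo "$candidate"
            found=0
            break
        fi
    done
    rm -rf "$scratch"
    exit "$found"
)
```

    In my case the call would have had the form find_gitlink_for_tree 499b8fb~ moje moje/cli 2e223ce259e9e33998d434acc778bc64b393d5d4 followed by the full IDs of 02564e1, 8be9433 and c0b1a24, as printed by git rev-parse in the submodule.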

    Note: if you’re in a similar situation and you’re poking around to see which commits are good and what Git can tell you about them, remember that being able to check out a commit and being able to show it are two different things. For example, I initially thought that if git show somesha throws an error it means that the somesha commit is corrupted, and I cannot use it for anything. That turned out to be false: while git show 89b0fc4 returned an error, I was able to git checkout 89b0fc4, and git diff 499b8fb~ 89b0fc4 also worked.

    I suppose that’s because git show somesha shows what changes are introduced by somesha, and for that Git needs to read the content of the previous commit (in this case a corrupted one). Apparently, Git doesn’t need to look at the previous commit to check out one.
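
    One way to test that guess is to check the commit's own tree and each parent's tree separately. The helper below is a sketch with a hypothetical name (report_tree_health); it only checks whether trees can be read, not whether blobs are intact.

```shell
# report_tree_health <commit>
# Hypothetical helper: prints one line for <commit> and one per parent, saying
# whether the full tree of that commit can be read.
report_tree_health() (
    # rev-list --parents prints the commit ID followed by its parent IDs.
    ids=$(git rev-list --parents -n 1 "$1") || exit 1
    label=commit
    for id in $ids; do
        if git ls-tree -r -t "$id" >/dev/null 2>&1; then
            echo "$label $id: tree readable"
        else
            echo "$label $id: tree unreadable"
        fi
        label=parent
    done
)
```

    A commit whose own tree is readable but whose parent's tree is not would match what I saw with 89b0fc4: checking it out works, while showing its changes against the parent fails.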

    (I managed to do this thanks to Chronial’s answer – kudos to him! I was advised to post this as my own answer.)
