Find Git Revision of a Working Directory Missing the .git Directory

I’ve got a) a working directory without the .git directory and b) a repository. a is some revision in the middle of the history of b.

How can I find out which revision in b matches a?

I thought of a shell script that diffs the working directory against every revision and picks the one with the fewest (hopefully 0) differences.

That would be a bit crude (and I’m not sure how to do it). Is there an easier way?

4 Solutions collected from the web for “Find Git Revision of a Working Directory Missing the .git Directory”

    You could write a script that runs diff gitdir workdir | wc -c for each commit, then collates the results and reports the commit with the smallest difference (as measured by wc -c) as the closest match to the working directory that lacks .git.

    Here is what it might look like in Python:

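    #!/usr/bin/env python3
    # Arguments: GITDIR WORKDIR. GITDIR should have no uncommitted changes,
    # because every commit is checked out in turn.
    import os
    import subprocess
    import sys

    def run(*cmd):
        return subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL).stdout

    gitdir, workdir = map(os.path.realpath, sys.argv[1:3])
    os.chdir(gitdir)
    shas = run('git', 'rev-list', '--all').decode().split()
    # Remember the current branch (or commit, if HEAD is detached) to restore it later
    head = run('git', 'rev-parse', '--abbrev-ref', 'HEAD').decode().strip()
    if head == 'HEAD':
        head = run('git', 'rev-parse', 'HEAD').decode().strip()
    data = {}
    for sha1 in shas:
        run('git', 'checkout', sha1)
        # Size in bytes of the recursive diff (ignoring .git); 0 means an exact match
        data[sha1] = len(run('diff', '-r', '-x', '.git', gitdir, workdir))
    answer = min(data, key=data.get)
    print('closest match: {s}'.format(s=answer))
    run('git', 'checkout', head)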


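    Example session (the last command stands for running the script above with gitdir and workdir as its two arguments):

    % rsync -a gitdir/ workdir/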
    % cd workdir
    % git checkout HEAD~10
    HEAD is now at b9fcebf... fix foo
    % cd ..
    % /bin/rm -rf workdir/.git
    % gitdir workdir
    closest match: b9fcebfb170785c19390ebb4a9076d11350ade79

    You could pare down the number of revisions you have to check with the pickaxe. Diff your working directory against the latest revision, and select some differing line that looks as rare as possible. Say your latest revision has a line containing foobar but your work directory does not; run git log -Sfoobar which outputs all commits adding or removing foobar. You can now move your repository back to the first (latest) revision on that list, since all of the revisions after that one are going to be different from your work directory. Repeat with another difference until you find the correct revision.
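The narrowing step described above could be sketched roughly as follows. This is only an illustration, assuming a linear history; the function names `pickaxe_commits` and `prune_candidates` are made up for this example. It relies on `git log -S<string>`, which lists commits that change the number of occurrences of the string, and it treats the newest such commit and everything after it as ruled out.

```python
import subprocess


def pickaxe_commits(repo, needle):
    """Hashes of commits that add or remove `needle`, newest first.

    Runs `git log -S<needle> --format=%H` inside `repo`.
    """
    out = subprocess.check_output(
        ["git", "-C", repo, "log", "--format=%H", "-S" + needle], text=True
    )
    return out.split()


def prune_candidates(history, touching):
    """Drop every commit that can no longer match the work directory.

    `history` is the candidate list, newest first. `touching` is the
    pickaxe result for a string that the newest candidate contains but
    the work directory does not. The newest touching commit and all
    commits after it still contain the string, so only strictly older
    commits remain candidates.
    """
    if not touching or touching[0] not in history:
        return list(history)
    return history[history.index(touching[0]) + 1:]


if __name__ == "__main__":
    # Hypothetical usage: repository in the current directory, string "foobar".
    all_commits = subprocess.check_output(
        ["git", "rev-list", "HEAD"], text=True
    ).split()
    remaining = prune_candidates(all_commits, pickaxe_commits(".", "foobar"))
    print(len(remaining), "candidate commits left")
```

Repeating this with a new differing string against the newest remaining candidate shrinks the list further each time.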

    Since git uses a content-addressable file store, it should be possible to find an arbitrary tree in there somewhere, but I don’t know the details. I’m guessing you could copy the files from the detached work directory into the repository’s work directory, commit everything, find out the hash of the tree object created by that commit, and then search the existing commits for one that references the same tree.

    For this to work, the tree will obviously need to match perfectly, so you must not get any non-tracked files into the commit (such as object files, editor backups, etc).
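The "search the existing commits for one that references the same tree" step might look something like the sketch below. It assumes the text comes from `git log --pretty=raw`, where each commit starts with a `commit <hash>` line followed by a `tree <hash>` line and the message lines are indented. The function name `commits_with_tree` and the short hashes in the sample are placeholders for illustration, not real object ids.

```python
def commits_with_tree(raw_log, tree_id):
    """Return the commits in `raw_log` whose tree is `tree_id`.

    `raw_log` is expected to be the output of `git log --pretty=raw`.
    """
    matches = []
    current = None
    for line in raw_log.splitlines():
        if line.startswith("commit "):
            current = line.split()[1]
        elif line.startswith("tree ") and current is not None:
            if line.split()[1] == tree_id:
                matches.append(current)
    return matches


# Placeholder log text in the raw format (hashes shortened and made up).
SAMPLE = """\
commit bbbb
tree 2222
parent aaaa

    second commit
    tree 1111 mentioned in a message is ignored
commit aaaa
tree 1111

    first commit
"""

print(commits_with_tree(SAMPLE, "2222"))  # ['bbbb']
```

Several commits can share one tree (for example a revert that restores an earlier state), which is why the function returns a list.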

    Edit: I just tried this on one repository (with git cat-file commit HEAD to show the tree object at HEAD, and searching the output of git log --pretty=raw for that tree hash), and it didn’t work (I didn’t find the hash in the history). I did get a bunch of warnings about CRLF conversion when I did the commit, so that might have been the problem, i.e. you probably get different hashes for the same tree depending on how your git is configured to mangle text files. I’m marking this answer community wiki in case someone knows how to do this reliably.
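The CRLF suspicion is plausible because of how git names file contents: a blob id is the SHA-1 of a short header plus the exact bytes, so a file stored with different line endings gets a different id, and every tree above it changes too. A small sketch, assuming a repository that uses the default SHA-1 object format (the helper name `git_blob_id` is made up here):

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Object id git assigns to a blob with exactly these bytes.

    The hashed data is the header b"blob <size>\\0" followed by the content.
    """
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


lf = git_blob_id(b"line one\nline two\n")
crlf = git_blob_id(b"line one\r\nline two\r\n")
print(lf == crlf)  # False
```

So if line-ending conversion (for example through the core.autocrlf setting) is active when the files are added, the recreated tree may not match any tree in the history even though the text looks identical.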

    Assuming that the in-tree and b/.git ignore settings are as they were when the commit was created, and that there aren’t any non-ignored untracked files in the working tree, you should be able to run something like this.

    The strategy is to recreate the git id of the working tree and then search for any commit that contains this tree.

    # work from detached working tree
    cd a
    # Use existing repository and a temporary index file
    # (/path/to/b is a placeholder for the absolute path of repository b)
    GIT_DIR=/path/to/b/.git
    GIT_INDEX_FILE=/tmp/tmp-index
    export GIT_DIR GIT_INDEX_FILE
    # find out the id of the current working tree
    git add . &&
    tree_id=$(git write-tree) &&
    rm /tmp/tmp-index
    # find a commit that matches the tree
    for commit in $(git rev-list --all)
    do
        if test "$tree_id" = "$(git rev-parse "${commit}^{tree}")"; then
            git show "$commit"
            break
        fi
    done
    # clean up
    unset GIT_DIR
    unset GIT_INDEX_FILE