How to find individual breaking change in a large commit

I made some source changes to lint my code. I did not run unit tests, and later I found that a large commit broke the code. How do I go back and find out which code change in a number of files broke the tests?

I could:

    1. diff the working (earlier) commit against the breaking (later) commit
    2. save the result in a patch file
    3. patch/test cycle
      1. apply parts of the patch file to the earlier commit
      2. run the tests

    I am hoping git has something less manual than this.

  • 3 solutions collected from the web for “How to find individual breaking change in a large commit”

    There’s a decent way to do this with git stash, which can stash away and reapply not only your changes but also the state of the index. It goes something like this:

    # check out the bad commit
    git checkout bad-commit
    # and then reset to the commit before, leaving the bad changes in the work tree
    git reset HEAD^
    # stage the things you want to keep/test first
    git add -p
    # stash away the rest (keep the staged parts)
    git stash --keep-index
    # now build/test. if it works, go ahead and commit it
    git commit
    # bring back the stashed changes
    git stash pop
    # repeat!
    git add -p
    git stash --keep-index
    # now suppose the broken part is in what you kept, and you want to split it up more
    # unstage the changes
    git reset
    # and then repeat!
    git add -p
    git stash --keep-index
    # if you do this, you'll end up with multiple stashes; you can check on them:
    git stash list
    git stash show
    git stash show -p stash@{1}

    Using stashes and testing as you go along has the advantage that if you manage to pick a subset of the changes which simply breaks the build, you can just pop the stash back off, and try again.

    You could do something similar to split up the commit into many, then run git bisect on it, but often that’s more work, since it’s more difficult to know how to split things up without testing as you go along.

    Of course, now you know that you should make smaller commits. But I don’t always do it right the first time either!

    You can split your commit into several small ones, and then use git bisect to find the offending commit. The easiest way to do that is to use interactive rebase:

    1. If you already pushed your commit, then create a new branch with git checkout -b <name>; otherwise you’re good to go.
    2. Run git rebase -i <bad commit>^
    3. An editor will pop up with one or more lines, each indicating a commit. On the line of the offending commit, change the first word from pick to edit, then save and close the editor.
    4. Git will now stop in the middle of the rebase, right after applying the offending commit. Now we separate this large commit into smaller ones by running git reset HEAD^, which undoes the commit but leaves its changes in the working tree, and then adding and committing each change on its own.
    5. When you’re done, do git rebase --continue to complete the rebase procedure.
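
    As a rough sketch of step 4, assuming that one commit per changed file is a fine enough split (changes within a single file would still need git add -p), the commands could be generated as below. The helper names split_commands and changed_files and the "split: " commit message prefix are made up for illustration; they are not part of git.

```python
import subprocess


def changed_files():
    """List the files touched by the commit at HEAD.

    Call this while the rebase is stopped at the commit marked `edit`,
    BEFORE running the reset produced by split_commands().
    """
    out = subprocess.check_output(
        # --no-renames lists both the old and the new path of a renamed file
        ["git", "diff", "--no-renames", "--name-only", "HEAD^", "HEAD"],
        text=True,
    )
    return [line for line in out.splitlines() if line]


def split_commands(files):
    """Build the git commands that redo one big commit as one commit per file."""
    # Undo the commit but keep its changes in the working tree.
    commands = [["git", "reset", "HEAD^"]]
    for path in files:
        commands.append(["git", "add", "--", path])
        # Hypothetical commit message; pick whatever helps you read the log.
        commands.append(["git", "commit", "-m", "split: " + path])
    return commands
```

    Running each command from split_commands(changed_files()) in order with subprocess.check_call, and then git rebase --continue, should leave one small commit per file for git bisect to search.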

    You’ve now split your commit into several smaller ones, and you can more easily find out exactly which change is broken. A very efficient way of doing that is using git bisect. What that does is perform a binary search of commits in order to find exactly the one that is broken, which is precisely what you want to achieve. This is how you use it:

    1. Run git bisect start.
    2. Run git bisect good <good commit>, where <good commit> is a hash/tag/whatever of a commit you know worked. You can use the commit before the first of the split commits you just made.
    3. Run git bisect bad <bad commit>, where <bad commit> is a hash/tag/whatever of a commit you know doesn’t work. You can use HEAD for this since you know it’s broken.
    4. Git will now check out some commit in that range and ask you to test it. Perform your test, then tell git whether it’s good or bad by running git bisect good or git bisect bad, respectively. Git will continue to check out other commits, according to what you tell it.
    5. After some steps, git will tell you exactly which commit is the one broken. You can then use git show <commit> to see what it contains.
    6. Run git bisect reset to quit bisect mode.
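
    If the test can be run as a single command, the good/bad answers in step 4 can be left to git bisect run, which marks each commit from the command's exit status. The following is a minimal sketch of that variation, not the procedure described above; the helper names bisect_commands and run_bisect are hypothetical, and the test command is whatever your project uses.

```python
import subprocess


def bisect_commands(good, bad, test_cmd):
    """Return the git commands for an automated bisect between good and bad."""
    return [
        # `git bisect start <bad> <good>` marks both ends in one step.
        ["git", "bisect", "start", bad, good],
        # Exit status 0 marks a commit good, 125 skips it,
        # and any other status from 1 to 127 marks it bad.
        ["git", "bisect", "run"] + list(test_cmd),
        # Return to the branch that was checked out before bisecting.
        ["git", "bisect", "reset"],
    ]


def run_bisect(good, bad, test_cmd):
    """Run the bisect and return git's report, always leaving bisect mode."""
    start, run, reset = bisect_commands(good, bad, test_cmd)
    subprocess.check_call(start)
    try:
        return subprocess.check_output(run, text=True)
    finally:
        subprocess.check_call(reset)
```

    For example, run_bisect("<good commit>", "HEAD", ["python", "-m", "unittest"]) would bisect with the standard library test runner, assuming that is how the tests are started.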

    To learn more about these subjects, see:

    1. Git Book about git bisect and interactive rebase.
    2. Pro Git book about git bisect and interactive rebase.
    3. The git man pages (which are excellent, by the way!), with git rebase --help and git bisect --help.

    I ended up doing essentially what Jefromi suggested. I wrote this python code, naming it

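    #!/usr/bin/env python3
    from subprocess import Popen, PIPE
    from sys import argv, stderr


    def run_command(cmd):
        """Run cmd and raise if it exits with a non-zero status."""
        p = Popen(cmd, stdout=PIPE, stderr=PIPE)
        output, errors = p.communicate()
        if p.returncode:
            raise Exception(errors.decode())


    def process_all(filename):
        """Stage the listed files one at a time, stashing the rest each round."""
        try:
            with open(filename) as f:
                all_files = [fn.strip() for fn in f if fn.strip()]
            for i, path in enumerate(all_files, start=1):
                print(i, path, file=stderr)
                run_command(['git', 'add', path])
                run_command(['git', 'stash', '--keep-index'])
                # The test command itself is not shown in the source; it would run
                # here, so that a failing test raises and stops at the first bad file.
                run_command(['git', 'stash', 'pop'])
        except Exception as exc:
            print(exc, file=stderr)


    if __name__ == '__main__':
        for filename in argv[1:]:
            process_all(filename)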

    And ran it against a list of files that had been affected.

    Here was my complete workflow:

    # Get the commit just before things went bad
    git checkout 8f5c3d7
    # Diff good code against bad, make patch
    git diff 8f5c3d7 cb8ddf0 > 8f5c3d7-cb8ddf0.patch
    # Apply patch
    git apply 8f5c3d7-cb8ddf0.patch
    # Now my working tree has all the changes that went into the bad commit,
    # uncommitted and unstaged.
    # Run code that finds bad file in commit
    python ~/Dropbox/src/ ~/Dropbox/src/bad-commit.txt

    And this chugged along until it stopped on the first bad file. Then I would just:

    git reset --hard HEAD

    and reapply the patch, put the bad file at the end of the file list and start over. Soon I knew every file that had changes that broke my build.
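
    The reset-and-retry loop could also be automated by testing each file on its own. The sketch below is a variation on the workflow above, not the script I actually ran. It assumes the good commit is checked out, that each file's change can be tested in isolation, and that no listed file was deleted in the bad commit (git checkout <bad> -- <file> fails for those). The name find_breaking_files and the run parameter are made up for illustration.

```python
import subprocess


def find_breaking_files(files, bad, test_cmd, run=subprocess.call):
    """Return the files whose version from `bad` makes test_cmd fail by itself.

    `run` takes a command list and returns its exit status; it defaults to
    subprocess.call and can be swapped for a fake when experimenting.
    """
    breaking = []
    for path in files:
        # Throw away the previous round and go back to the good commit.
        run(["git", "reset", "--hard", "HEAD"])
        # Take only this one file from the bad commit.
        run(["git", "checkout", bad, "--", path])
        # A non-zero exit status means the tests failed with this file changed.
        if run(list(test_cmd)) != 0:
            breaking.append(path)
    # Leave the working tree clean again.
    run(["git", "reset", "--hard", "HEAD"])
    return breaking
```

    With the commits above, find_breaking_files(all_files, "cb8ddf0", test_cmd) would return the list of bad files in one pass, where all_files is the list read from bad-commit.txt and test_cmd is the project's test command as a list.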
