GitHub v3 API: Get full commit list for large comparison

I’m trying to use the GitHub v3 API to get the full list of commits between two SHAs, using the comparison API (/repos/:owner/:repo/compare/:base...:head), but it only returns the first 250 commits and I need to get all of them.

I found the API pagination docs, but the compare API doesn’t appear to support either the page or per_page parameters, either with counts or SHAs (EDIT: the last_sha parameter doesn’t work either). And unlike the commits API, the compare API doesn’t seem to return a Link HTTP header.

Is there any way to either increase the commit count limit on the compare API or to fetch a second page of commits?

7 solutions collected from the web for “GitHub v3 API: Get full commit list for large comparison”

    Try using the parameter sha, for example:

    https://api.github.com/repos/junit-team/junit/commits?sha=XXX, where the XXX is the SHA of the last returned commit in the current round of the query. Then iterate this process until you reach the ending SHA.

    Sample python code:

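    import requests

    startSHA = ''  # SHA (or branch name) to start walking back from
    endSHA = ''    # SHA at which to stop
    done = False
    while not done:
        url = 'https://api.github.com/repos/junit-team/junit/commits?sha=' + startSHA
        r = requests.get(url)
        data = r.json()
        for item in data:
            commit = item['sha']
            if commit == endSHA:
                # reached the ending SHA, stop here
                done = True
                break
            startSHA = commit
        if len(data) < 2:
            # the page holds at most the starting commit itself: no more history
            done = True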
    

    It’s relatively easy. Here is an example:

    import requests
    next_url = 'https://api.github.com/repos/pydanny/django-admin2/commits'
    while next_url:
        response = requests.get(next_url)
        # DO something with response
        # ...
        # ...
        if 'next' in response.links:
            next_url = response.links['next']['url']
        else:
            next_url = ''
    

    UPDATE:

    Keep in mind that the next URLs are different from the initial one. For example:
    Initial URL:

    https://api.github.com/repos/pydanny/django-admin2/commits

    Next URL:

    https://api.github.com/repositories/10054295/commits?top=develop&last_sha=eb204104bd40d2eaaf983a5a556e38dc9134f74e

    So it’s a totally new URL structure.

    Try using the last_sha parameter. The commits API seems to use that for pagination rather than page.

    From: https://developer.github.com/v3/repos/commits/#working-with-large-comparisons

    Working with large comparisons

    The response will include a comparison of up to 250 commits. If you are working with a larger commit range, you can use the Commit List API to enumerate all commits in the range.

    For comparisons with extremely large diffs, you may receive an error response indicating that the diff took too long to generate. You can typically resolve this error by using a smaller commit range.
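
A quick way to tell whether a comparison hit the 250-commit cap is to compare the `total_commits` field of the compare response with the length of its `commits` array. The sketch below assumes those two fields are present in the response; `fetch_comparison` and `is_truncated` are illustrative helper names, not part of any library, and the request is unauthenticated.

```python
import json
import urllib.request


def fetch_comparison(owner, repo, base, head):
    """Call the compare endpoint and return the decoded JSON response."""
    url = f"https://api.github.com/repos/{owner}/{repo}/compare/{base}...{head}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def is_truncated(comparison):
    """True when compare reports more commits than it actually returned."""
    return comparison["total_commits"] > len(comparison["commits"])
```

If `is_truncated(fetch_comparison(owner, repo, base, head))` is true, the commit list has to be completed through the Commit List API.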

    I tried solving this again. My notes:

    • Compare (or pull request commits) list only shows 250 entries. For the pull request one, you can paginate, but you will only get a maximum of 250 commits, no matter what you do.

    • Commit list API can traverse the entire commit chain with paging all the way to the beginning of the repository.

    • For a pull request, the “base” commit is not necessarily in the history reachable from the pull request “head” commit. This is the same for comparison, the “base_commit” is not necessarily a part of the history of the current head.

    • The “merge_base_commit” is, however, a part of the history, so the correct approach is to start from the “head” commit, and iterate commit list queries until you reach the “merge_base_commit”. For a pull request, this means that it is mandatory to make a compare between “head” and “base” of the pull separately.

    • An alternative approach is to use the “total_commits” value returned by compare, and just iterate backwards until reaching that number of commits. This seems to work, but I am not 100% certain that it is correct in all corner cases with merges and such.

    So the commit list API, pagination and “merge_base_commit” together solve this dilemma.
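
The approach in these notes could be sketched roughly as follows: page through the commit list starting at the head SHA and stop on reaching the merge base. This is a minimal sketch, not a tested client. `fetch_commit_page` and `commits_between` are hypothetical helper names, the request is unauthenticated, and it relies on the commit list endpoint's `sha`, `per_page` and `page` query parameters. As the notes say, merge-heavy histories may have corner cases that this simple loop does not handle.

```python
import json
import urllib.request

API = "https://api.github.com"


def fetch_commit_page(owner, repo, sha, page, per_page=100):
    """Fetch one page of the commit list, walking back from `sha`."""
    url = (f"{API}/repos/{owner}/{repo}/commits"
           f"?sha={sha}&per_page={per_page}&page={page}")
    req = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def commits_between(head_sha, merge_base_sha, get_page):
    """Collect SHAs listed from head_sha, stopping before merge_base_sha.

    get_page(sha, page) must return a list of commit objects (dicts with a
    "sha" key), and an empty list once the history is exhausted.
    """
    shas = []
    page = 1
    while True:
        commits = get_page(head_sha, page)
        if not commits:
            return shas  # ran out of history without meeting the merge base
        for commit in commits:
            if commit["sha"] == merge_base_sha:
                return shas  # merge base reached; it is not included
            shas.append(commit["sha"])
        page += 1
```

To run it against a real repository, pass something like `lambda sha, page: fetch_commit_page(owner, repo, sha, page)` as `get_page`, with `merge_base_sha` taken from the `merge_base_commit` field of a compare response. Keeping the fetcher injectable also lets the loop be exercised with canned pages instead of network calls.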

    Here’s a sample that gets all commits for a pull request, written using Octokit.NET (https://github.com/octokit/octokit.net):

            var owner = "...";
            var repository = "...";
            var gitHubClient = new GitHubClient(
                new ProductHeaderValue("MyApp"),
                new InMemoryCredentialStore(new Credentials("GitHubToken")));
            var pullRequest = await gitHubClient.PullRequest.Get(owner, repository, pullRequestNumber);
            Console.WriteLine("Summarising Pull Request #{0} - {1}", pullRequest.Number, pullRequest.Title);
            var commits = new List<GitHubCommit>();
            var moreToGet = true;
            var headSha = pullRequest.Head.Sha;
            while (moreToGet)
            {
                var comparison =
                    await
                    gitHubClient.Repository.Commits.Compare(
                        owner,
                        repository,
                        pullRequest.Base.Sha,
                        headSha);
    
                // Because we're working backwards from the head towards the base, but the oldest commits are at the start of the list
                commits.InsertRange(0, comparison.Commits);
                moreToGet = comparison.Commits.Count == 250;
                if (moreToGet)
                {
                    headSha = commits.First().Sha;
                }
            }
    

    I originally tried setting moreToGet based on whether a commit with the base SHA was found, but it’s never included in the list of commits (not sure why), so I’m just assuming there is more to get if the comparison hits the limit of 250.

    /commits?per_page=* will give you all commits
