How to use GitHub V3 API to get commit count for a repo?

I am trying to count commits for many large GitHub repos using the API, so I would like to avoid fetching the entire list of commits (for example, api.github.com/repos/jasonrudolph/keyboard/commits) and counting them.

If I had the hash of the first (initial) commit, I could use the compare API to compare the first commit to the latest; it happily reports the total_commits between them (so I'd need to add one for the initial commit). Unfortunately, I cannot see an elegant way to get the first commit's hash from the API.
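
For reference, the compare call I have in mind looks roughly like this (a sketch using the Python requests library; the initial-commit SHA is only a placeholder, since getting it is exactly the problem, and master is assumed to be the default branch):

    import requests

    FIRST_SHA = '...'  # hash of the initial commit -- the part I don't know how to get

    # compare the initial commit to the tip of the default branch
    r = requests.get(
        'https://api.github.com/repos/jasonrudolph/keyboard/compare/%s...master' % FIRST_SHA
    )
    # total_commits excludes the base commit itself, hence the + 1
    print(r.json()['total_commits'] + 1)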

The base repo URL does give me created_at (for example, api.github.com/repos/jasonrudolph/keyboard), so I could get a reduced commit set by limiting the commits to those up to the creation date (for example, api.github.com/repos/jasonrudolph/keyboard/commits?until=2013-03-30T16:01:43Z) and then taking the earliest one (is it always listed last?), or perhaps the one with no parents (I'm not sure whether forked projects have parentless initial commits).
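
In code, that until= idea would look something like this (again only a sketch with requests; it assumes the pre-creation history fits on one page, which may not hold for repos pushed with pre-existing history):

    import requests

    BASE = 'https://api.github.com/repos/jasonrudolph/keyboard'

    created_at = requests.get(BASE).json()['created_at']
    # commits whose dates fall before the repo's creation date
    commits = requests.get(BASE + '/commits',
                           params={'until': created_at, 'per_page': 100}).json()

    # prefer the commit with no parents (the root commit), otherwise take the last listed
    root = next((c for c in commits if not c['parents']), commits[-1])
    print(root['sha'])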

Any better way to get the first commit hash for a repo?

Better yet, this whole thing seems convoluted for a simple statistic, and I wonder if I’m missing something. Any better ideas for using the API to get the repo commit count?

Edit: This somewhat similar question is about filtering by certain files ("and within them to specific files"), so it has a different answer.

3 Solutions for “How to use GitHub V3 API to get commit count for a repo?”

    If you’re looking for the total number of commits in the default branch, you might consider a different approach.

    Use the Repo Contributors API to fetch a list of all contributors:

    https://developer.github.com/v3/repos/#list-contributors

    Each item in the list will contain a contributions field which tells you how many commits the user authored in the default branch. Sum those fields across all contributors and you should get the total number of commits in the default branch.

    The list of contributors is often much shorter than the list of commits, so it should take fewer requests to compute the total number of commits in the default branch.
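
    A minimal sketch of that approach with the requests library (anon=1 is passed so that anonymous contributors, whose commits would otherwise be missing from the sum, are included):

    import requests

    def count_commits(owner, repo):
        # page through the contributors list and sum the "contributions" fields
        total, page = 0, 1
        while True:
            contributors = requests.get(
                'https://api.github.com/repos/%s/%s/contributors' % (owner, repo),
                params={'per_page': 100, 'page': page, 'anon': 1},
            ).json()
            if not contributors:
                break
            total += sum(c['contributions'] for c in contributors)
            page += 1
        return total

    print(count_commits('jasonrudolph', 'keyboard'))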

    I just made a little script to do this: it walks every branch of a repository and counts the unique commits across them.
    It may not work with large repositories since it does not handle GitHub’s rate limits. Also it requires the Python requests package.

    #!/usr/bin/env python3
    import requests
    
    # %(token)s is either empty (anonymous access) or "username:personal-access-token"
    GITHUB_API_BRANCHES = 'https://%(token)s@api.github.com/repos/%(namespace)s/%(repository)s/branches'
    GITHUB_API_COMMITS = 'https://%(token)s@api.github.com/repos/%(namespace)s/%(repository)s/commits?sha=%(sha)s&page=%(page)i'
    
    
    def github_commit_counter(namespace, repository, access_token=''):
        # collect every commit SHA across all branches, then count the unique set
        commit_store = list()
    
        branches = requests.get(GITHUB_API_BRANCHES % {
            'token': access_token,
            'namespace': namespace,
            'repository': repository,
        }).json()
    
        print('Branch'.ljust(47), 'Commits')
        print('-' * 55)
    
        for branch in branches:
            page = 1
            branch_commits = 0
    
            while True:
                commits = requests.get(GITHUB_API_COMMITS % {
                    'token': access_token,
                    'namespace': namespace,
                    'repository': repository,
                    'sha': branch['name'],
                    'page': page
                }).json()
    
                page_commits = len(commits)
    
                for commit in commits:
                    commit_store.append(commit['sha'])
    
                branch_commits += page_commits
    
                if page_commits == 0:
                    break
    
                page += 1
    
            print(branch['name'].ljust(45), str(branch_commits).rjust(9))
    
        # a commit reachable from several branches should only be counted once
        commit_store = set(commit_store)
        print('-' * 55)
        print('Total'.ljust(42), str(len(commit_store)).rjust(12))
    
    # for private repositories, get your own token from
    # https://github.com/settings/tokens
    # github_commit_counter('github', 'gitignore', access_token='fnkr:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx')
    github_commit_counter('github', 'gitignore')
    

    Simple solution: look at the page numbers. GitHub paginates for you, so you can calculate the number of commits by taking the last page number from the Link header, subtracting one, multiplying by the page size, then fetching that last page and adding the number of results on it (that last page has to be counted manually). It's a maximum of two API calls!

    Here is my implementation of grabbing the total number of commits for an entire organization using the octokit gem in ruby:

    @github = Octokit::Client.new access_token: key, auto_traversal: true, per_page: 100
    
    Octokit.auto_paginate = true
    repos = @github.org_repos('my_company', per_page: 100)
    
    # * take the pagination number
    # * get the last page
    # * see how many items are on it
    # * multiply the number of pages - 1 by the page size
    # * and add the two together. Boom. Commit count in 2 api calls
    def calc_total_commits(repos)
        total_sum_commits = 0
    
        repos.each do |e| 
            repo = Octokit::Repository.from_url(e.url)
            number_of_commits_in_first_page = @github.commits(repo).size
            repo_sum = 0
            if number_of_commits_in_first_page >= 100
                links = @github.last_response.rels
    
                unless links.empty?
                    last_page_url = links[:last].href
    
                    /.*page=(?<page_num>\d+)/ =~ last_page_url
                    repo_sum += (page_num.to_i - 1) * 100 # we add the last page manually
                    repo_sum += links[:last].get.data.size
                end
            else
                repo_sum += number_of_commits_in_first_page
            end
            puts "Commits for #{e.name} : #{repo_sum}"
            total_sum_commits += repo_sum
        end
        puts "TOTAL COMMITS #{total_sum_commits}"
    end
    

    And yes, I know the code is dirty; it was just thrown together in a few minutes.
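
    For comparison, here is the same two-request arithmetic sketched in Python with requests (the master branch name and the 100-commit page size are assumptions, and unauthenticated calls are subject to the usual rate limit):

    import re
    import requests

    def count_commits(owner, repo, sha='master'):
        url = 'https://api.github.com/repos/%s/%s/commits' % (owner, repo)
        first = requests.get(url, params={'sha': sha, 'per_page': 100})
        link = first.headers.get('Link', '')
        # the Link header names the last page, e.g. <...&page=7>; rel="last"
        match = re.search(r'<([^>]*[?&]page=(\d+)[^>]*)>; rel="last"', link)
        if not match:
            # everything fit on a single page, so there is no Link header
            return len(first.json())
        last_url, last_page = match.group(1), int(match.group(2))
        # full pages before the last one, plus whatever is on the last page
        return 100 * (last_page - 1) + len(requests.get(last_url).json())

    print(count_commits('jasonrudolph', 'keyboard'))

    With per_page=1 the trick collapses even further: the last page number reported in the Link header is itself the total commit count.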
