Which Git commit stats are easy to pull

Previously I have enjoyed TortoiseSVN’s ability to generate simple commit stats for a given SVN repository. I wonder what is available in Git, and I am particularly interested in:

  • Number of commits per user
  • Number of lines changed per user
  • Activity over time (for instance, changes aggregated weekly)

Any ideas?

12 solutions collected from the web for “Which Git commit stats are easy to pull”

    Actually, git already has a command for this:

    git shortlog
    

    In your case, it sounds like you’re interested in this form:

    git shortlog -sne
    

    See git shortlog --help for the other options.

    You may also be interested in the GitStats project. They have a few examples, including the stats for the Git project. From the GitStats main page:

    Here is a list of some statistics generated currently:

    • General statistics: total files, lines, commits, authors.
    • Activity: commits by hour of day, day of week, hour of week, month of year, year and month, and year.
    • Authors: list of authors (name, commits (%), first commit date, last commit date, age), author of month, author of year.
    • Files: file count by date, extensions
    • Lines: Lines of Code by date

    First, you don’t have to pull anything (as in a network pull), because you have the whole repository and its whole history locally. I’m pretty sure there are tools that will give you statistics, but sometimes you can just be creative with the command line. For instance, this (off the top of my head) will give you the number of commits per user:

    git log --pretty=format:%ae \
    | gawk -- '{ ++c[$0]; } END { for(cc in c) printf "%5d %s\n",c[cc],cc; }'
    

    The other statistics you asked for may need more thought. You may want to look at the tools available. Googling for git statistics points to the GitStats tool, which I have no experience with, and I have even less idea of what it takes to get it running on Windows, but you can try.

    Thanks to hacker for answering this question. However, I found these modified versions to be better for my particular usage:

    git log --pretty=format:%an \
    | awk '{ ++c[$0]; } END { for(cc in c) printf "%5d %s\n",c[cc],cc; }' \
    | sort -rn
    

    (This uses awk because I don’t have gawk on my Mac, and it sorts with the most active committer on top.)
    It outputs a list like so:

     1205 therikss
     1026 lsteinth
      771 kmoes
      720 minielse
      507 pagerbak
      269 anjohans
      205 mfoldbje
      188 nstrandb
      133 pmoller
       58 jronn
       10 madjense
        3 nlindhol
        2 shartvig
        2 THERIKSS
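
The list above counts therikss and THERIKSS separately, because the names are compared exactly as written. Here is a minimal Ruby sketch of the same tally that folds names to lower case first. Treating author names as case-insensitive is my assumption, not something the answer states, and commits_per_author is a hypothetical helper name.

```ruby
# Hypothetical helper: tally commits per author from the text produced by
#   git log --pretty=format:%an
# Names are folded to lower case (an assumption) so that "THERIKSS" and
# "therikss" are counted as one person.
def commits_per_author(log_text)
  counts = Hash.new(0)
  log_text.each_line do |line|
    name = line.strip
    next if name.empty?
    counts[name.downcase] += 1
  end
  # Most active committer first; ties are broken alphabetically.
  counts.sort_by { |name, count| [-count, name] }
end

# Made-up sample input. With a real repository, something like this should work:
#   commits_per_author(`git log --pretty=format:%an`)
sample = "therikss\nkmoes\nTHERIKSS\ntherikss\n"
commits_per_author(sample).each do |name, count|
  printf("%5d %s\n", count, name)
end
```

Git's own .mailmap file is another way to merge several spellings of one author, and git shortlog honors it.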
    

    Here are ways to get stats for a specific branch or for the range between two hashes.

    The key here is the ability to use a HASH..HASH range.

    Below I use the range from the first hash of a branch to HEAD, which is the tip of that branch. Note that a range A..B does not include commit A itself.

    Show total commits in a branch

    • git log FIRST_HASH..HEAD --pretty=oneline | wc -l
    • Output 53

    Show total commits per author

    • git shortlog FIRST_HASH..HEAD -sne
    • Output
    • 24 Author Name
    • 9 Author Name

    Note that if your repo is on GitHub, there is now (as of May 2013) a new set of GitHub API endpoints for getting interesting statistics.
    See “File CRUD and repository statistics now available in the API”.

    That would include:

    • Contributors
    • Commit Activity
    • Code Frequency
    • Participation
    • Punch Card
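
As a rough illustration of the Contributors statistic, here is a Ruby sketch that summarizes a response from GET /repos/{owner}/{repo}/stats/contributors. The helper names are hypothetical, the sample JSON is made up in the shape I understand that endpoint to return (total, plus weeks with a = additions, d = deletions, c = commits), and the fetch function is defined but not called because it needs network access.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical helper: fetch the raw JSON for one repository's contributor
# statistics. owner and repo are placeholders supplied by the caller.
def fetch_contributor_stats(owner, repo)
  uri = URI("https://api.github.com/repos/#{owner}/#{repo}/stats/contributors")
  response = Net::HTTP.get_response(uri)
  # GitHub may answer 202 while it is still computing the statistics.
  raise "stats not available (HTTP #{response.code})" unless response.code == '200'
  response.body
end

# Reduce the response to [login, commits, lines added, lines deleted] rows,
# sorted with the most commits first.
def summarize_contributors(json_text)
  rows = JSON.parse(json_text).map do |entry|
    weeks = entry['weeks']
    [entry.dig('author', 'login'),
     entry['total'],
     weeks.sum { |w| w['a'] },
     weeks.sum { |w| w['d'] }]
  end
  rows.sort_by { |row| -row[1] }
end

# Made-up sample data; alice and bob are not real contributors.
sample = <<~SAMPLE
  [{"author": {"login": "alice"}, "total": 3,
    "weeks": [{"w": 1367712000, "a": 10, "d": 2, "c": 1},
              {"w": 1368316800, "a": 5, "d": 1, "c": 2}]},
   {"author": {"login": "bob"}, "total": 7,
    "weeks": [{"w": 1367712000, "a": 40, "d": 8, "c": 7}]}]
SAMPLE

summarize_contributors(sample).each { |row| puts row.join("\t") }
```

For a real repository, summarize_contributors(fetch_contributor_stats(owner, repo)) would be the intended use; unauthenticated requests are rate limited.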

    I’ve written a small shell script that calculates merge statistics (useful when dealing with a feature-branch-based workflow). Here’s an example output on a small repository:

    [$]> git merge-stats
    % of Total Merges               Author  # of Merges  % of Commits
                57.14     Daniel Beardsley            4          5.63
                42.85        James Pearson            3         30.00
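
The script itself is not shown, so the following Ruby sketch is only my guess at how such a table could be computed. It assumes that “% of Commits” means the share of an author's own commits that are merges; that reading is an assumption. The name merge_stats and the sample totals are hypothetical.

```ruby
# Hypothetical sketch, not the script from the answer above.
#   merge_authors: one author name per merge commit, e.g. the lines of
#     git log --merges --pretty=format:%an
#   all_authors:   one author name per commit, e.g. the lines of
#     git log --pretty=format:%an
def merge_stats(merge_authors, all_authors)
  count = ->(names) { names.each_with_object(Hash.new(0)) { |n, h| h[n] += 1 } }
  merges = count.call(merge_authors)
  commits = count.call(all_authors)
  rows = merges.map do |author, n|
    {
      author: author,
      merges: n,
      pct_of_merges: (100.0 * n / merge_authors.size).round(2),
      # Assumed meaning: share of this author's own commits that are merges.
      pct_of_commits: (100.0 * n / commits.fetch(author)).round(2)
    }
  end
  rows.sort_by { |row| -row[:merges] }
end

# Made-up input; the totals are chosen only to resemble the table above.
merge_authors = ['Daniel Beardsley'] * 4 + ['James Pearson'] * 3
all_authors   = ['Daniel Beardsley'] * 71 + ['James Pearson'] * 10

merge_stats(merge_authors, all_authors).each do |row|
  printf("%17.2f %20s %12d %13.2f\n",
         row[:pct_of_merges], row[:author], row[:merges], row[:pct_of_commits])
end
```

Percentages are rounded here, so the last digit can differ from a script that truncates.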
    

    Here is a simple Ruby script that I used to get author, lines added, lines removed, and commit count from git. It does not cover commits over time.

    Note that it ignores any commit that adds/removes more than 10,000 lines, because I assume that such a commit is a code import of some sort; feel free to modify the logic for your needs. You can put the code below into a file called gitstats-simple.rb and then run:

    git log --numstat --pretty='%an' | ruby gitstats-simple.rb
    

    Contents of gitstats-simple.rb:

    #!/usr/bin/ruby
    
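    # Takes the output of this on stdin: git log --numstat --pretty='%an'
    # The result has one row per author: name, lines added, lines removed, commits.

    # Commits that add/remove more lines than this are treated as code imports.
    IMPORT_LIMIT = 10_000

    map = Hash.new{|h,k| h[k] = [0,0,0]}
    who = nil
    memo = nil

    # Add the commit collected in memo to its author's totals.
    flush = lambda do
      if who && memo[0] + memo[1] <= IMPORT_LIMIT
        map[who][0] += memo[0]
        map[who][1] += memo[1]
        map[who][2] += 1
      end
    end

    STDIN.read.split("\n").each do |line|
      parts = line.split("\t")
      next if parts.size == 0
      # numstat lines start with a number (or "-" for binary files);
      # any other line is an author name and starts a new commit.
      unless parts[0].match(/\A(\d+|-)\z/)
        flush.call
        who = line.strip
        memo = [0,0]
        next
      end
      if who
        memo[0] += parts[0].to_i  # lines added
        memo[1] += parts[1].to_i  # lines removed
      end
    end
    flush.call  # the last commit in the log has no author line after it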
    
    puts map.to_a.map{|x| [x[0], x[1][0], x[1][1], x[1][2]]}.sort_by{|x| -x[1] - x[2]}.map{|x|x.inspect.gsub("[", "").gsub("]","")}.join("\n")
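
For the activity-over-time part of the question, here is a minimal Ruby sketch that counts commits per ISO week. It assumes the input is one YYYY-MM-DD date per commit, as printed by git log --date=short --pretty=format:%ad; commits_per_week is a hypothetical helper name and the sample dates are made up.

```ruby
require 'date'

# Hypothetical helper: count commits per ISO week from the text produced by
#   git log --date=short --pretty=format:%ad
# which prints one YYYY-MM-DD author date per commit.
def commits_per_week(log_text)
  counts = Hash.new(0)
  log_text.each_line do |line|
    stamp = line.strip
    next if stamp.empty?
    # %G is the ISO 8601 week-based year, %V the ISO week number (01..53).
    counts[Date.iso8601(stamp).strftime('%G-W%V')] += 1
  end
  counts.sort.to_h
end

# Made-up sample dates. With a real repository, something like this should work:
#   commits_per_week(`git log --date=short --pretty=format:%ad`)
sample = "2016-04-25\n2016-04-26\n2016-04-26\n2016-05-02\n"
commits_per_week(sample).each { |week, count| puts "#{week} #{count}" }
```

Grouping by ISO week keeps weeks that span a year boundary together; swapping the strftime pattern for '%Y-%m' would aggregate by month instead.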
    

    See this gitstat project

    http://mirror.celinuxforum.org/gitstat/

    DataHero now makes it easy to pull in GitHub data and get stats.
    We use it internally to track our progress on each milestone.

    https://datahero.com/partners/github/

    How we use it internally: https://datahero.com/blog/2013/08/13/managing-github-projects-with-datahero/

    Disclosure: I work for DataHero

    You can use the gitlogged gem (https://github.com/dexcodeinc/gitlogged) to get activity by author and date. For example, run:

    gitlogged 2016-04-25 2016-04-26
    

    This returns the following output:

    ################################################################
    
    Date: 2016-04-25
    
    Yunan (4):
          fix attachment form for IE (#4407)
          fix (#4406)
          fix merge & indentation attachment form
          fix (#4394) unexpected after edit wo
    
    gilang (1):
          #4404 fix orders cart
    
    
    ################################################################
    ################################################################
    
    Date: 2016-04-26
    
    Armin Primadi (2):
          Fix document approval logs controller
          Adding git tool to generate summary on what each devs are doing on a given day for reporting purpose
    
    Budi (1):
          remove validation user for Invoice Processing feature
    
    Yunan (3):
          fix attachment in edit mode (#4405) && (#4430)
          fix label attachment on IE (#4407)
          fix void method (#4427)
    
    gilang (2):
          Fix show products list in discussion summary
          #4437 define CApproved_NR status id in order
    
    
    ################################################################
    

    This is a modified version of https://stackoverflow.com/a/18797915/3243930. Its output is much closer to the data in GitHub's graphs.

    #!/usr/bin/ruby
    
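    # Takes the output of this on stdin: git log --numstat --pretty='%an'

    map = Hash.new{|h,k| h[k] = [0,0,0]}
    who = nil
    memo = nil

    # Add the commit collected in memo to its author's totals.
    # Commits that change no lines are not counted as commits.
    flush = lambda do
      if who
        map[who][0] += memo[0]
        map[who][1] += memo[1]
        map[who][2] += 1 if memo[0] > 0 || memo[1] > 0
      end
    end

    STDIN.read.split("\n").each do |line|
      parts = line.split("\t")
      next if parts.size == 0
      # An author line contains letters (ASCII or not); numstat lines
      # start with a number, or with "-" for binary files.
      if parts[0].match(/[a-zA-Z]+|[^\u0000-\u007F]+/)
        flush.call
        who = parts[0]
        memo = [0,0]
        next
      end
      if who
        memo[0]+=parts[0].to_i
        memo[1]+=parts[1].to_i
      end
    end
    flush.call  # the last commit in the log has no author line after it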
    
    puts map.to_a.map{|x| [x[0], x[1][0], x[1][1], x[1][2]]}.sort_by{|x| -x[1] - x[2]}.map{|x|x.inspect.gsub("[", "").gsub("]","")}.join("\n")
    

    The best tool I have identified so far is gitinspector. It gives a set of reports per user, per week, and so on.

    You can install it with npm as shown below:

    npm install -g gitinspector
    

    More details are available at the links below:

    https://www.npmjs.com/package/gitinspector
    https://github.com/ejwa/gitinspector/wiki/Documentation
    https://github.com/ejwa/gitinspector
    

    Example commands are:

    gitinspector -lmrTw
    gitinspector --since=1-1-2017
    

    and so on.
