Checking for duplicate file (contents) in git?

In my ‘project/repo’ I have two MS Visual Studio projects, one for the main code, and an independent one for tests. I have some files that are common to both (in the copy and paste sense) and I’d like to see / check which ones they are.

What are the right Git commands (or GUI menu clicks) to see whether I have used the same content blob twice in the overall repo tree? If I have read the tutorials correctly, Git should have a single SHA-1 for the two copies of the same file content and already know about it. I am hoping Git has a command that finds and displays the file paths that share a blob.

Eventually I’d like to be able to find out the diffs between the versions when there is a common ancestor blob SHA1 (but not a common location). [i.e. during testing one version gets updated ahead of the other version…]

I know it isn’t best practice to have such duplicates, but it is the way the work has ended up 🙁

I have Msysgit and GitExtensions on Windows…

One solution collected from the web for “Checking for duplicate file (contents) in git?”

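You can list every blob in the tree, together with its path, with:

git ls-tree -r HEAD

Each output line shows the mode, the object type, the object hash, and the path. Two paths listed with the same hash have identical contents.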

If you don’t want to compare the hashes by eye, sort the output by hash and highlight the lines whose blob repeats the one on the previous line:

git ls-tree -r HEAD |
    sort -t ' ' -k 3 |
        perl -ne '$1 && / $1\t/ && print "\e[0;31m" ; / ([0-9a-f]{40})\t/; print "$_\e[0m"'
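
If colour highlighting is not convenient (for example when redirecting to a file), the same idea can be sketched as a small shell function that groups paths by blob hash and prints only the hashes that occur more than once. This is a minimal sketch, not part of the quoted answer: the function name `find_duplicate_blobs` is made up for illustration, and it assumes the usual `git ls-tree -r` line format (mode, type, hash, then a tab and the path).

```shell
# Hypothetical helper (name is an assumption, not a Git command).
# Reads `git ls-tree -r <rev>` output on stdin and prints each blob hash
# that is used by more than one path, followed by those paths.
# Usage: git ls-tree -r HEAD | find_duplicate_blobs
find_duplicate_blobs() {
    awk -F'\t' '
        {
            # $1 is "<mode> <type> <hash>", $2 is the path
            split($1, meta, " ")
            if (meta[2] != "blob") next
            hash = meta[3]
            count[hash]++
            paths[hash] = paths[hash] "    " $2 "\n"
        }
        END {
            # Group order is unspecified; only hashes seen twice or more are shown
            for (hash in count)
                if (count[hash] > 1)
                    printf "%s\n%s", hash, paths[hash]
        }
    '
}
```

Run it from Git Bash as `git ls-tree -r HEAD | find_duplicate_blobs`. Each group starts with the shared hash, with the duplicate paths indented beneath it.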

From: Git: Find duplicate blobs (files) in this tree
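
For the follow-up in the question (seeing how two copies have drifted apart), one possible approach is to diff the two paths as they are recorded in a single commit, using the `<rev>:<path>` syntax. The sketch below is an assumption about what would be useful here, not something from the answer above; the helper name `diff_copies` and the example paths are hypothetical. It only compares current contents and does not try to locate a common ancestor blob.

```shell
# Hypothetical helper: diff two paths as stored in the same commit.
# Usage: diff_copies <rev> <path-a> <path-b>
diff_copies() {
    # "<rev>:<path>" names the blob at that path in that commit
    git diff "$1:$2" "$1:$3"
}

# Example (paths are made up for illustration):
# diff_copies HEAD main/util.h tests/util.h
```

For uncommitted files in the working tree, `git diff --no-index <path-a> <path-b>` compares two files directly.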
