Reducing the size of a git repository
About a year ago, when I first started using git (on an existing git project), I pushed some large files to a Stash directory, which I shouldn’t have. A year later, this project has grown to almost 60 GB and, probably unsurprisingly, I’m now having problems cloning the repo. I’ve tried to delete these files (two VBox virtual machine images and a human genome) using the instructions given here: https://confluence.atlassian.com/bitbucket/reduce-repository-size-321848262.html
I followed the instructions from “Manually reviewing large files in your repository” onwards, which suggest running a script they provide to identify the large files, then running a series of commands to delete them.
However, after doing this, the size of the repo as shown in Stash remains the same, as does the problem cloning it.
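For reference, the kind of check that script performs can be approximated with plain git plumbing. This is only a sketch and not the Atlassian script itself; the function name `list_largest_blobs` and its arguments are made up for illustration. It prints the largest blobs reachable from any ref as `<size in bytes> <sha> <path>`.

```shell
# Hypothetical helper (not the Atlassian script): list the N largest blobs
# reachable from any ref, largest first, as "<size-in-bytes> <sha> <path>".
list_largest_blobs() {
    repo_dir="$1"      # path to the repository to inspect
    count="${2:-10}"   # how many entries to print (defaults to 10)
    # rev-list emits "<sha> <path>" for every object in history;
    # cat-file --batch-check adds the type and size for each one.
    git -C "$repo_dir" rev-list --objects --all |
        git -C "$repo_dir" cat-file \
            --batch-check='%(objecttype) %(objectsize) %(objectname) %(rest)' |
        awk '$1 == "blob"' |   # keep file contents, drop commits and trees
        cut -d' ' -f2- |       # strip the leading "blob" column
        sort -rn |             # largest size first
        head -n "$count"
}
```

Calling it as `list_largest_blobs . 20` from inside the repo should show whether the VM images and the genome file are still referenced by some commit.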
Two problems that could be worth noting:
Running the following command to delete the problem files:
$ git filter-branch --force --index-filter "git rm --cached --ignore-unmatch 'src/Base/Base4.vdi'" HEAD
Results in this warning (putting quotes around the path fixed it once, but it has since returned, so there may be another cause):
WARNING: Ref 'refs/heads/master' is unchanged
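As far as I understand it, this warning means the filter did not change any commit reachable from the ref, which is what I would expect if the path never matched. A sketch of how the path could be checked against the whole history; the helper name `commits_touching_path` is hypothetical, and `git log --all` looks at every ref, not just `HEAD`.

```shell
# Hypothetical helper: print every commit (on any ref) that touched the given
# path, together with the change type (A/M/D), so the exact spelling of the
# path as recorded in history can be compared with the one passed to git rm.
commits_touching_path() {
    repo_dir="$1"   # path to the repository
    path="$2"       # path to look for, relative to the repository root
    git -C "$repo_dir" log --all --oneline --name-status -- "$path"
}
```

For example, `commits_touching_path . 'src/Base/Base4.vdi'` printing nothing would suggest that the file is not recorded under that exact path.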
And secondly, my git clone commands (because of the initial problems, I followed some additional instructions, which seemed to be based on downloading the repo in smaller chunks):
`git config --global http.postBuffer 1048576000`
`git clone --recursive ssh://git@<the project repo>`
`git config --global core.compression 0`
`git clone --depth 1 ssh://git@<the project repo>`
`git fetch --unshallow`
Receiving objects: 54% (13945/25407), 9.19 GiB | 1.46 MiB/s
error: inflate: data stream error (invalid distance code)
fatal: pack has bad object at offset 9654044026: inflate returned -3
fatal: index-pack failed
Also worth noting: when I re-run the ./git_find_big.sh script from that website, the large files continue to show up. Any ideas? I’m considering just creating a new git repo and mothballing the previous one if I can’t resolve this issue.
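My understanding from the git documentation is that, even after a successful `filter-branch`, the removed objects stay in the local repository until the backup refs under `refs/original/` are deleted, the reflogs are expired, and the repository is garbage-collected. A sketch of that local cleanup, assuming the rewrite itself actually changed the history; the function name `purge_unreachable_objects` is made up for illustration, and this says nothing about what the server does with its own copy.

```shell
# Hypothetical helper: drop everything that still keeps rewritten-away
# objects alive in a local repository, then garbage-collect them.
purge_unreachable_objects() {
    repo_dir="$1"   # path to the repository to clean up
    # filter-branch saves the pre-rewrite refs under refs/original/;
    # as long as they exist, the old commits and their blobs stay reachable.
    git -C "$repo_dir" for-each-ref --format='%(refname)' refs/original/ |
        while read -r ref; do
            git -C "$repo_dir" update-ref -d "$ref"
        done
    # Reflog entries also reference the old commits, so expire them all.
    git -C "$repo_dir" reflog expire --expire=now --all
    # Finally remove every object that is no longer reachable.
    git -C "$repo_dir" gc --quiet --prune=now
}
```

After running `purge_unreachable_objects .`, `git count-objects -vH` should report a smaller `size-pack` if the large files really are gone from every ref.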