Git – filter before/on clone

I have a massive repository: it’s more than a gigabyte, and cloning it takes hours. Most of that size comes from a data directory that isn’t needed to work on the project locally, but I don’t have the authority to simply remove the directory from the repository.

Is there any way to apply a filter to the repository before it’s cloned, so that I only download the files I actually need to work on?


    No, by Git’s design that is not possible. You will have to change the central repository.

    As an interim solution, you could create a new branch, filter only that branch to strip out the data directory, and do an empty merge from master into it. People can then clone just your branch and work on it. You will still have to merge back to master at some point, but because of that empty merge you can merge between the two branches whenever you want, as long as you don’t change the data directory on master.

    Edit: Sorry, the empty merge would defeat the whole purpose, as clients would then pull down all the data again.
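The filtered-branch part of this idea, without the empty merge, can be sketched roughly as follows. This is only an illustration in a throwaway repository: the directory names `data` and `src`, the branch name `slim`, and the demo identity are all made up for the example, and `git filter-branch` is used simply because it ships with Git (it is deprecated and prints a warning unless squelched).

```shell
#!/bin/sh
set -e

# Throwaway demo repository; data/, src/ and "slim" are hypothetical names.
work=$(mktemp -d)
git init -q "$work/origin"
cd "$work/origin"
git config user.email "demo@example.com"
git config user.name "Demo"

mkdir data src
echo "big blob" > data/huge.bin
echo "code" > src/main.c
git add .
git commit -q -m "Initial commit with data directory"

# Create a branch and rewrite only that branch so data/ never appears in its history.
git branch slim
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f \
    --index-filter 'git rm -r -q --cached --ignore-unmatch data' -- slim >/dev/null 2>&1

# Clone only the filtered branch; file:// forces a real transfer instead of a local copy.
git clone -q --single-branch --branch slim "file://$work/origin" "$work/clone"

# List what the single-branch clone checked out.
ls "$work/clone"
```

In this demo the rewritten branch no longer shares any commits with the original branch, which is why getting work back onto master is the awkward part: any merge that ties the two histories together makes the data reachable from the branch again.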
