git — handling frozen content

Almost all projects that I’ve worked on have some kind of “frozen content” that should always be present in a fresh clone, yet rarely change (see below for an example). I’ve tried different approaches using git, but they’re all error-prone: people frequently wind up accidentally committing changes.

It’s a subtle case to be sure: the files/folders must be versioned, but the vast majority of changes shouldn’t get pushed.

Looking around, I see a few options:

  • git update-index --assume-unchanged <file>. Problem: this appears to be a local setting, so it only solves the problem on a given machine. Developers working in new clones are prone to forget it and still commit changes by accident.
  • git update-index --skip-worktree <file>. Problem: appears to have the same issue, since I don’t think changes to the index are ever propagated.
  • git rm --cached <file>. Problem: not really a solution at all, since this toasts everyone’s copy when pushed!
  • echo <file> >> .gitignore. Problem: not really a solution, since this only controls whether an object is added to the repo.
  • use a smudge/clean filter to exclude file changes from commits (see jthill’s answer). Problem: complicated and error-prone, and it still needs each developer to configure it locally.
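
To check the suspicion that these index flags never leave the machine, here is a minimal sketch that sets skip-worktree in one repository and then inspects a fresh clone of it. The temporary directory, the file name frozen.txt and the commit identity are made up for the demo. git ls-files -v marks a skip-worktree entry with S and an ordinary tracked entry with H.

```shell
#!/bin/sh
# Sketch: the skip-worktree bit lives in the local index, so a clone
# does not inherit it. All names below are hypothetical demo values.
set -e
work=$(mktemp -d)

git init -q "$work/origin"
echo "base" > "$work/origin/frozen.txt"
git -C "$work/origin" add frozen.txt
git -C "$work/origin" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "Add frozen file"

# Set the flag in the original repository only.
git -C "$work/origin" update-index --skip-worktree frozen.txt
origin_flag=$(git -C "$work/origin" ls-files -v frozen.txt | cut -c1)

# A clone builds its own index from the commit, without the flag.
git clone -q "$work/origin" "$work/clone"
clone_flag=$(git -C "$work/clone" ls-files -v frozen.txt | cut -c1)

echo "origin: $origin_flag, clone: $clone_flag"
```

The same experiment with --assume-unchanged should show a lowercase h in the original repository and H in the clone.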

An acceptable answer to this question doesn’t require special actions by each new developer.

Why? Because this is precisely the problem with the above solutions, which, in my experience, leads to situations where “somebody has committed that file again”.

Searches easily turn up many questions. We need a final answer!

  • How to freeze a file in a repository
  • Prevent local changes getting pushed in Git
  • git assume-unchanged implications
  • Howto prevent git from pushing changes to some files
  • Ignore modified (but not committed) files in git?
  • Preventing a file overwrite with Git

Example:

Here’s the case I’m dealing with ATM. My project is a website that embeds wiki software (which I did not write). The wiki component needs a non-trivial folder structure which is used rather like a database (and should probably be one). It needs to find the folders and files already there to work. After a while these files get big, and we don’t want to track those changes! This folder structure also contains some config (I know). If I could include the bare copy in the repo, and somehow (almost) never track its changes, that would be perfect.

3 solutions collected from the web for this question:

    It’s not really clear what types of files you’re talking about.

    In the case of configuration files, I always recommend the following approach:

    1. Commit a sample configuration file with sane defaults, e.g. config.sample.ini.
    2. Ignore the non-sample file, e.g. config.ini.
    3. As part of the “start up” procedure on new machines, the config.sample.ini file must be copied to config.ini and customised. Document this process in your README, or wiki, or wherever.
    4. Make sure your code “fails properly” if the config file is missing, e.g. when the software starts it will immediately error out with “Could not find config.ini. Did you copy the sample file?”

    This ensures that the sample file can be easily updated. It ensures that the configuration file is not committed. It fails quickly if it’s done wrong. And it’s relatively simple to implement.
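
As a rough illustration of step 4, assuming the application is started from a shell script, the check could look like the sketch below. require_config is a hypothetical helper name; the error text is the one suggested above.

```shell
#!/bin/sh
# Sketch of the "fail properly" check. require_config is a made-up name.
require_config() {
    local config_file="${1:-config.ini}"
    if [ ! -f "$config_file" ]; then
        # Report on stderr and let the caller decide how to stop.
        echo "Could not find $config_file. Did you copy the sample file?" >&2
        return 1
    fi
}
```

A start-up script would then begin with something like `require_config config.ini || exit 1`, so a fresh clone without a copied config stops immediately instead of failing later in a less obvious way.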

    The quickest way to get from a committed config.ini to an ignored config.ini and a committed config.sample.ini is likely to do something like

    git mv config.ini config.sample.ini
    echo config.ini >> .gitignore
    git commit -a -m "Replace config file with sample config file"
    

    Update:

    Your wiki example is an interesting one.

    You’re right that a deep tree of .sample files would be difficult to manage, but you can get the same effect with a zip file or tarball:

    1. Create an archive that will be committed to the repository containing your base wiki content, e.g. wiki.zip.
    2. Ignore the directory where the wiki will be extracted, e.g. path/to/wiki-root.
    3. Add “extract wiki with unzip wiki.zip -d path/to/wiki-root” to your documentation. This is now part of your install / deploy procedure.

    Now you can update the zip file as necessary and commit those changes, while ignoring changes to the extracted files.
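
The extraction step can also be scripted so that it never overwrites a wiki that is already in use. The sketch below uses the tarball variant rather than a zip file; extract_wiki and the archive name wiki.tar.gz are assumptions for illustration, not part of any existing tooling.

```shell
#!/bin/sh
# Sketch: unpack the committed base archive only when the ignored wiki
# directory does not exist yet. extract_wiki is a hypothetical helper.
extract_wiki() {
    local archive="$1" target="$2"
    if [ -d "$target" ]; then
        # Existing wiki data is local state; leave it alone.
        echo "$target already exists, leaving it untouched"
        return 0
    fi
    mkdir -p "$target"
    tar -xzf "$archive" -C "$target"
}
```

It would be called from the install or deploy script as `extract_wiki wiki.tar.gz path/to/wiki-root`. Running it a second time is harmless, which matters when the deploy procedure is repeated on a machine that already has wiki content.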

    AFAIK, the only way to enforce a “don’t change this” policy without voluntary action in each clone of the repo is to install a server-side hook for them. A pre-receive hook can inspect the incoming commits for changes to a frozen file and reject them (unless the commit message matches some magic pattern that makes it pass).

    Server-side hooks only work when you control the central repo, i.e. it can’t be on GitHub or some such service.
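
A pre-receive hook along those lines might look roughly like the sketch below. Git passes it one line per pushed ref on stdin, in the form "old-sha new-sha ref-name". The frozen path and the "[unfreeze]" marker are hypothetical choices, and this sketch does not inspect merge commits.

```shell
#!/bin/sh
# Sketch of a pre-receive check. FROZEN_PATH and MAGIC are assumptions.
FROZEN_PATH="path/to/wiki-root"
MAGIC="[unfreeze]"

# True when the argument consists only of zeros (a created or deleted ref).
is_null_sha() { case "$1" in *[!0]*) return 1 ;; *) return 0 ;; esac; }

check_push() {
    while read -r old new ref; do
        is_null_sha "$new" && continue          # ref deletion: nothing to inspect
        if is_null_sha "$old"; then
            # New ref: look only at commits no existing ref can reach.
            commits=$(git rev-list "$new" --not --all)
        else
            commits=$(git rev-list "$old..$new")
        fi
        for commit in $commits; do
            touched=$(git diff-tree --root --no-commit-id --name-only -r \
                "$commit" -- "$FROZEN_PATH")
            [ -z "$touched" ] && continue
            # Frozen content changed: accept only with the marker in the message.
            if ! git log -1 --format=%B "$commit" | grep -qF -- "$MAGIC"; then
                echo "Rejected: $commit changes $FROZEN_PATH (put $MAGIC in the message to override)" >&2
                return 1
            fi
        done
    done
    return 0
}
```

The real hook file would end with a bare `check_push` call, so that its status becomes the exit status of the hook. It goes into hooks/pre-receive of the central repository and must be executable. A non-zero exit makes git refuse the whole push.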

    Save the following script in your project as <proj-root>/.git/hooks/pre-commit and make it executable:

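    #!/bin/bash
    # Ask for confirmation before committing changes to frozen files.
    # Uses bash features ([[ ]], $'\n'), so it cannot run under plain sh.
    
    [ -f .gitverify ] || exit 0
    numFilesToVerify=0
    verify=""
    
    # Compare each staged path with each pattern listed in .gitverify.
    while IFS= read -r l; do
        while read -r m; do
            # $m is left unquoted so that * in a pattern acts as a wildcard.
            if [[ -n $m && $l == $m ]]; then
                verify+=" * $l"$'\n'
                numFilesToVerify=$((numFilesToVerify + 1))
                break
            fi
        done < .gitverify
    done < <(git diff --name-only --cached)
    
    if [ "$numFilesToVerify" -gt 0 ]; then
        echo "You're changing $numFilesToVerify frozen files:"
        echo "$verify"
        # Hooks do not get the keyboard on stdin, so read from the terminal.
        read -r -p "Do you know what you're doing? (yes/no): " reply < /dev/tty
        if [ "$reply" != "yes" ]; then
            echo 'commit aborted.'
            exit 1
        fi
    fi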
    

    Then list the files or directories you want to freeze in the file <proj-root>/.gitverify. Use the * wildcard if you want:

    testdir/test* 
    

    Here’s what happens if I try to commit changes to the files testdir/test and testdir/test2:

    $ git commit -m 'test'
    You're changing 2 frozen files:
     * testdir/test
     * testdir/test2
    
    Do you know what you're doing? (yes/no): no
    commit aborted.
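
One detail worth knowing: the hook matches paths with shell patterns, not with .gitignore rules, so a * in .gitverify also matches across "/". The sketch below shows approximately the same check written with a portable case statement instead of the hook's [[ ... ]] test. is_frozen is a hypothetical helper, not part of the script above.

```shell
#!/bin/sh
# Sketch of the wildcard check. is_frozen returns 0 when the given path
# matches any pattern in the list file. The helper name is made up.
is_frozen() {
    local file="$1" list="$2" pattern
    while read -r pattern; do
        [ -n "$pattern" ] || continue
        # An unquoted pattern in case is matched as a shell glob;
        # unlike in .gitignore, * also matches "/" here.
        case "$file" in
            $pattern) return 0 ;;
        esac
    done < "$list"
    return 1
}
```

With the pattern testdir/test* from the example, both testdir/test2 and a nested path such as testdir/test/deep/file count as frozen.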
    

    Advantages:

    • I can easily add / remove files from the freeze list
    • If updating part of a tar (see Chris’s answer) one is prone to overwrite a part not intended to be changed. No tars needed here, and it tells you what potentially dumb thing you’re doing.
    • Doesn’t use server-side hooks, so there are no issues with access or with having multiple remotes.