Is using git reset --hard instead of checkout a good idea when you have unpublished commits?

I was helping a coworker today on his machine, and part of the troubleshooting involved returning to a previous commit. He had an unpublished commit, i.e. his branch was one commit ahead of the remote branch. I thought he was going to call git checkout; instead, I watched in horror while he made a note of his unpublished commit’s SHA1 and then called git reset --hard on the target commit we were supposed to investigate. After a while, he returned to his previous state with another hard reset back to the noted SHA1, which worked. He told me he always uses a hard reset in this situation.

I then ended up with the following questions:

    • Is this operation secure against data loss? Is there a chance that his commit is lost?

    • Wouldn’t this operation leave a dangling reference to his commit that can be removed by running git gc? Sometimes I see git running housekeeping operations on large repositories that are triggered automatically after I call some unrelated command. Could one of these operations wipe out the commit?

    • Is there a way to recover the commit if the note is lost?

  2 solutions collected from the web for “Is using git reset --hard instead of checkout a good idea when you have unpublished commits?”

    Let’s take these in order:

    Is this operation secure against data loss? Is there a chance that his commit is lost?

    No and no, in that order. Or maybe “not yet” is better for the second answer.

    Git’s reset --hard adjusts (writes to) three things: the recorded branch tip, which changes from wherever it is now, to the argument commit; the index, which changes to match the new current commit once the branch is updated; and the work-tree, which changes to match the index once the index is updated.

    Some of these writes are completely unrecoverable, some are somewhat recoverable with difficulty, and some are easily recoverable. In particular, Git itself saves nothing of the work-tree, so these are unrecoverable (except by means outside Git, e.g., file system snapshot / backup).

    You can remove the “write to work-tree” aspect of git reset by using git reset --mixed. That still writes to the index. Because the index contains only metadata (the file contents themselves are stored as blob objects in the repository), some file contents, if lost in this phase, can be retrieved. How hard this is varies, but it’s generally no fun. The index metadata are of course gone (except by means outside Git, again: the index is by default stored in one or sometimes two files within the .git directory, or in the per-worktree area for secondary worktrees created with git worktree add).

    You can even remove the “write to index” aspect of git reset by using git reset --soft. That writes only the recorded branch-tip. This is the one that is easy to recover, at least for a short while, as the previous branch-tip value is immediately saved in ORIG_HEAD.
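    As a rough sketch of the ORIG_HEAD mechanism described above, the script below replays the coworker's round trip in a throwaway repository. The repository location, file name, and identity are made up for the demonstration; only the git commands themselves matter.

```shell
#!/bin/sh
set -e

# Throwaway repository so the demonstration cannot touch real work.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"  # hypothetical identity

echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two > file.txt
git commit -q -am "second (unpublished)"

tip=$(git rev-parse HEAD)        # the ID the coworker wrote down by hand

git reset -q --hard HEAD~1       # moves branch tip, index and work-tree back

saved=$(git rev-parse ORIG_HEAD) # Git recorded the previous tip by itself

git reset -q --hard "$saved"     # go back; this reset overwrites ORIG_HEAD again
restored=$(git rev-parse HEAD)

echo "noted tip:     $tip"
echo "ORIG_HEAD had: $saved"
echo "restored tip:  $restored"
```

    The value has to be read from ORIG_HEAD before the next reset, since that reset replaces it.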

    The ORIG_HEAD name does, of course, get overwritten by another git reset, so if you do not save it quickly enough, that might lose it. However, there is a second mechanism by which all previous values of all references—both HEAD itself, and the branch name—are saved for at least 30 days by default, in Git’s reflogs. So even if you lose ORIG_HEAD, you still have some reflog entries, unless you have turned off reflogs.

    (Reflogs are off by default in new --bare repositories, but on by default in all others.)
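    A minimal sketch of recovering through the reflog, assuming a fresh non-bare repository with default settings. The repository, file name, and identity are invented for illustration.

```shell
#!/bin/sh
set -e

# Throwaway repository; reflogs are on by default because it is not bare.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"  # hypothetical identity

echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two > file.txt
git commit -q -am "second (unpublished)"
tip=$(git rev-parse HEAD)

git reset -q --hard HEAD~1       # the branch no longer points at "second"

# List the recorded HEAD movements; the newest entry is HEAD@{0}.
git reflog

# HEAD@{1} is where HEAD was just before the reset.
found=$(git rev-parse 'HEAD@{1}')

git reset -q --hard "$found"     # restore the unpublished commit
echo "recovered: $found"
```

    In a real repository the index in HEAD@{n} depends on how many operations happened since, so reading the git reflog listing first is the safer habit.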

    Wouldn’t this operation leave a dangling reference to his commit that can be removed by running git gc?

    If it were not for the reflogs and the saved ORIG_HEAD, yes. Those protect the commit from gc, though. As long as ORIG_HEAD or a reflog entry (or both) protects any given commit, that commit will remain in the repository, along with everything reachable through that commit.

    Is there a way to recover the commit if the [noted commit ID] is lost?

    Reflogs (and ORIG_HEAD) are the usual way. Should those be lost as well, git fsck --lost-found finds unreachable commits (and other unreachable objects) and restores them into the lost-found subdirectory within .git, assuming they have not been git gc-ed. This is also the way to find modified files that were git add-ed but never committed (they become “dangling blobs” and --lost-found resurrects them).
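    The “dangling blob” case can be sketched as follows: a file is git add-ed, wiped by a hard reset, and then pulled back out with git fsck --lost-found. The file names and contents are hypothetical, and the script assumes no git gc has pruned the object in the meantime.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"  # hypothetical identity

echo base > base.txt
git add base.txt
git commit -q -m "base"

echo "uncommitted draft" > draft.txt
git add draft.txt                 # the contents are now stored as a blob object
blob=$(git hash-object draft.txt) # the ID that blob was stored under

git reset -q --hard               # the index entry for draft.txt is discarded

# Nothing references the blob any more, so fsck reports it as dangling
# and --lost-found writes its contents to .git/lost-found/other/<id>.
git fsck --lost-found >/dev/null 2>&1

recovered=$(cat ".git/lost-found/other/$blob")
echo "recovered contents: $recovered"
```

    Only the contents come back; the original file name lived in the index and is gone, so with several dangling blobs you have to inspect each one.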

    All that said, git reset is the wrong way to do this

    The right way to look at old commits is indeed to just use git checkout to extract them. You will get a “detached HEAD”, but that’s normal enough while looking around: just git checkout <branch> to re-attach your HEAD later. Or, if you’re trying to track down the point where a bug was introduced, use git bisect, which repeatedly checks out old commits as you narrow in on the problem. With git bisect, you identify some (earlier) commit where things are good and some (later) commit where things are bad, and then it picks a commit about halfway between for you to test. You then test it, declare it good or bad, and bisect picks the next commit to test, halving the remaining range each time.
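    For comparison with the reset approach, here is a small sketch of the checkout route in a throwaway repository (file name and identity are invented). The point it tries to show is that the branch itself never moves.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"  # hypothetical identity

echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two > file.txt
git commit -q -am "second (unpublished)"

branch=$(git symbolic-ref --short HEAD)   # whatever the default branch is called
tip=$(git rev-parse HEAD)

git checkout -q HEAD~1            # detached HEAD on the older commit
inspected=$(cat file.txt)         # work-tree now shows the old contents
during=$(git rev-parse "$branch") # the branch still points at the new commit

git checkout -q "$branch"         # re-attach HEAD; nothing to write down
after=$(git rev-parse HEAD)

echo "old contents: $inspected"
echo "branch tip unchanged: $during"
```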

    If you have an automated test, that’s even better: you can git bisect run your automated test, and let that do all the work. But even if not you can still git bisect manually, as long as you have a way to declare any given commit “good” or “bad”.
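    A hedged sketch of git bisect run follows. It builds ten artificial commits, plants a pretend bug (the word BUG in a hypothetical app.txt) in the sixth, and uses a one-line grep as the stand-in automated test; a real project would run its own test command instead.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"  # hypothetical identity

# Ten commits; from commit 6 onward app.txt contains the word BUG.
i=1
while [ "$i" -le 10 ]; do
  if [ "$i" -ge 6 ]; then
    echo "BUG $i" > app.txt
  else
    echo "ok $i" > app.txt
  fi
  git add app.txt
  git commit -q -m "commit $i"
  if [ "$i" -eq 6 ]; then
    culprit=$(git rev-parse HEAD)   # remembered only to check the result
  fi
  i=$((i + 1))
done

first=$(git rev-list --max-parents=0 HEAD)  # root commit, known good

# Arguments are <bad> then <good>.
git bisect start HEAD "$first" >/dev/null

# Exit status 0 marks a commit good, 1 marks it bad.
git bisect run sh -c '! grep -q BUG app.txt' >/dev/null

found=$(git rev-parse refs/bisect/bad)      # first bad commit after the run
git bisect reset >/dev/null 2>&1            # return to the original branch

echo "first bad commit: $found"
```

    An exit status of 125 from the test command tells bisect to skip a commit that cannot be tested, which is worth knowing if some old commits do not build.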

    Is this operation secure against data loss? Is there a chance that his commit is lost?

    Yes, it’s secure. Every commit you have made in the local repo is kept for quite a long time. The data belonging to a commit are not lost, but the reference to the commit may be hard to find again.

    Wouldn’t this operation leave a dangling reference to his commit that can be removed by running git gc? Sometimes I see git running housekeeping operations on large repositories that are triggered automatically after I call some unrelated command. Could one of these operations wipe out the commit?

    These dangling commits are referenced by the reflog, so an ordinary git gc cannot remove them, at least not until the reflog entries expire.

    Is there a way to recover the commit if the note is lost?

    git reflog can help find it.

    PS

    What your coworker does is a flexible but tricky operation. It’s absolutely okay if he understands the scenario and knows what consequences and side effects follow. In git, we are manipulating commits in most cases. References help us do that clearly, but they are not always necessary, and sometimes they trap us in a narrow circle.
