Will Git garbage-collect commit in submodule referred to by a top-level repository?
top.git
└── sub.git => 75fc7
- The top-level Git repository top.git refers to commit 75fc7 of sub.git.
- The submodule Git repository sub.git has neither branches nor tags leading to commit 75fc7.
Will sub.git eventually garbage-collect this commit 75fc7 because nothing can reach it?
AFAIK, Git submodules are designed in such a way that, in this example, sub.git is not able to establish the fact that it is a submodule of any other repository. In other words, commit 75fc7 is effectively a candidate for garbage collection. If so, restoring the state of all submodules would be unreliable, because they may "forget" required commits.
Yes, the commit will eventually be garbage-collected.
But don't forget that, to be reused, a submodule referenced by its parent repo must also publish the recorded SHA1 (recorded as a gitlink, a special entry in the index of the parent repo).
If that SHA1 is not published (pushed to an upstream repo), then any clone of the parent repo would not be able to check out the submodule anyway.
That means a submodule must push the recorded SHA1, which makes that SHA1 referenced (by a branch or tag, as pushed on the upstream repo).
So the issue here is not so much the garbage collector as the ability of a parent repo to check out its submodule at the right SHA1.
My scenario (not explicitly mentioned in the question) is actually different and more specific. What if the commits are actually pushed upstream for both repositories?
Then you don't need to wait for a gc to remove an inaccessible SHA1 for the issue to manifest.
If the published SHA1 is no longer referenced, any clone of top.git won't be able to check out the sub.git submodule repo at the right SHA1 (even if gc hasn't run yet), because the unreferenced SHA1 won't be part of the sub.git clone anyway.
The key point to understand: an upstream repo sub.git has no idea it is used as a submodule by another upstream repo (like top.git).
If sub.git does not include the right SHA1 (used by top.git) for any reason (gc or other rebase/push --force or …), a clone of top.git will fail to restore the submodule to its proper state.
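The condition described above can be checked before it causes a failed clone. What follows is a minimal Python sketch, not part of the original answer, and it assumes git is available on the PATH. The helper names recorded_gitlink and refs_containing are hypothetical. It reads the SHA1 that top.git records for a submodule path, then asks a local clone of sub.git which branches or tags can still reach that commit.

```python
import subprocess


def recorded_gitlink(top_repo, path, rev="HEAD"):
    """Return the SHA1 that `rev` of top_repo records for the submodule at `path`."""
    out = subprocess.run(
        ["git", "ls-tree", rev, path],
        cwd=top_repo, check=True, capture_output=True, text=True,
    ).stdout
    # A gitlink entry is printed as: "160000 commit <sha>\t<path>"
    for line in out.splitlines():
        meta, _, _name = line.partition("\t")
        mode, kind, sha = meta.split()
        if mode == "160000" and kind == "commit":
            return sha
    raise ValueError(f"{path!r} is not a gitlink in {rev}")


def refs_containing(sub_repo, sha):
    """List refs (branches, tags, remote-tracking branches) that can reach `sha`.

    An empty list means nothing in this clone keeps the commit alive,
    or the commit object is not present at all.
    """
    present = subprocess.run(
        ["git", "cat-file", "-e", sha + "^{commit}"],
        cwd=sub_repo, capture_output=True,
    )
    if present.returncode != 0:
        return []  # the object is already missing from this clone
    proc = subprocess.run(
        ["git", "for-each-ref", "--contains", sha, "--format=%(refname)",
         "refs/heads", "refs/tags", "refs/remotes"],
        cwd=sub_repo, capture_output=True, text=True,
    )
    return proc.stdout.split() if proc.returncode == 0 else []
```

An empty result from refs_containing(sub_clone, recorded_gitlink(top_clone, "sub")) would indicate that the pinned commit is exactly the kind of unreferenced SHA1 discussed above. The check only sees the refs present in the local clone of sub.git, so it should be run against a fresh fetch.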
Actually, it was easy to test thanks to this answer.
Yes, the commit was garbage-collected even though it was referenced by the top-level repository.
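A test along these lines can be sketched as follows. This is not the original test, only a minimal Python reconstruction under stated assumptions: git is on the PATH, the gitlink is written directly with git update-index --cacheinfo rather than through git submodule add, and pruning is forced immediately with --expire=now and --prune=now instead of waiting for the default grace periods. The function names are hypothetical.

```python
import pathlib
import subprocess
import tempfile

# Throwaway identity so commits work without any global git configuration.
IDENTITY = ["-c", "user.name=demo", "-c", "user.email=demo@example.com"]


def git(cwd, *args):
    """Run git in `cwd` and return its stripped stdout."""
    proc = subprocess.run(
        ["git", *IDENTITY, *args],
        cwd=cwd, check=True, capture_output=True, text=True,
    )
    return proc.stdout.strip()


def has_commit(repo, sha):
    """True if `repo` still holds the commit object `sha`."""
    proc = subprocess.run(
        ["git", "cat-file", "-e", sha + "^{commit}"],
        cwd=repo, capture_output=True,
    )
    return proc.returncode == 0


def reproduce():
    with tempfile.TemporaryDirectory() as tmp:
        sub = pathlib.Path(tmp, "sub")
        top = pathlib.Path(tmp, "top")
        sub.mkdir()
        top.mkdir()

        # sub: one commit stays on the default branch, one lives on a
        # branch that is deleted afterwards, leaving it unreferenced.
        git(sub, "init", "-q")
        git(sub, "commit", "-q", "--allow-empty", "-m", "base")
        default = git(sub, "symbolic-ref", "--short", "HEAD")
        git(sub, "checkout", "-q", "-b", "topic")
        git(sub, "commit", "-q", "--allow-empty", "-m", "pinned by top")
        pinned = git(sub, "rev-parse", "HEAD")
        git(sub, "checkout", "-q", default)
        git(sub, "branch", "-q", "-D", "topic")

        # top: record `pinned` as a gitlink (mode 160000) at path "sub".
        git(top, "init", "-q")
        git(top, "update-index", "--add", "--cacheinfo", f"160000,{pinned},sub")
        git(top, "commit", "-q", "-m", "pin sub")
        recorded = git(top, "ls-tree", "HEAD", "sub").split()[2]

        before = has_commit(sub, pinned)
        # Drop reflog entries, then prune unreachable objects immediately.
        git(sub, "reflog", "expire", "--expire=now",
            "--expire-unreachable=now", "--all")
        git(sub, "gc", "-q", "--prune=now")
        after = has_commit(sub, pinned)

        return {
            "recorded_matches": recorded == pinned,
            "before_gc": before,
            "after_gc": after,
        }
```

Calling reproduce() returns a small dictionary. If the behaviour reported above holds, after_gc is False even though recorded_matches is True, meaning that top still points at a commit sub no longer has.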
This demands some discipline about which commits may be used in the top-level repository, so that the entire tree spanning submodules can be reliably restored at any time in the future. Such commits must be ancestors of a long-term maintained branch or tag.