How do GitHub and other cloud-based repository services (GitLab, Bitbucket) store source code files and directories?

Do they store the metadata of files and directories in databases, and the actual files and directories on the file system of the server instances?

One solution collected from the web for “How do GitHub and other cloud-based repository services (GitLab, Bitbucket) store source code files and directories?”

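    The source code is most likely stored in bare Git repositories (repositories with no working tree), not as plain files plus a database of file metadata.
    But the details depend on the scale of the Git hosting service you are talking about.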
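
    To make "stored in a bare repository" concrete, here is a minimal sketch of how Git itself stores a file's content as a loose "blob" object inside the repository directory. It assumes the default SHA-1 object format; the helper names (blob_object_id, write_loose_blob) and the demo.git path are made up for illustration, and real servers normally keep most objects in packfiles rather than as loose files.

```python
import hashlib
import tempfile
import zlib
from pathlib import Path


def blob_header(content: bytes) -> bytes:
    # Git prefixes every blob with "blob <size in bytes>\0".
    return b"blob " + str(len(content)).encode("ascii") + b"\0"


def blob_object_id(content: bytes) -> str:
    # The object ID is the SHA-1 of the header followed by the raw content
    # (assumption: the repository uses the default SHA-1 object format).
    return hashlib.sha1(blob_header(content) + content).hexdigest()


def write_loose_blob(git_dir: Path, content: bytes) -> Path:
    # Loose objects live at objects/<first 2 hex chars>/<remaining 38>,
    # zlib-compressed. In a bare repository, git_dir is the repository itself.
    oid = blob_object_id(content)
    path = git_dir / "objects" / oid[:2] / oid[2:]
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(zlib.compress(blob_header(content) + content))
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # "demo.git" is a hypothetical bare-repository directory name.
        stored = write_loose_blob(Path(tmp) / "demo.git", b"hello world\n")
        print(blob_object_id(b"hello world\n"))
        # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
        print(stored.relative_to(tmp))
```

    File names and directory structure are not kept in the blob at all: Git records them in separate tree objects, which is why no external metadata database is needed for the source tree itself.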

    For instance, GitHub uses DGit:

    DGit is short for “Distributed Git.”

    As many readers already know, Git itself is distributed—any copy of a Git repository contains every file, branch, and commit in the project’s entire history.
    DGit uses this property of Git to keep three copies of every repository, on three different servers.
    The design of DGit keeps repositories fully available without interruption even if one of those servers goes down. Even in the extreme case that two copies of a repository become unavailable at the same time, the repository remains readable; i.e., fetches, clones, and most of the web UI continue to work.
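
    The availability behaviour described in that quote can be written down as a toy model. This is only a sketch of the quoted guarantees, not GitHub's implementation: the names are hypothetical, and treating "one healthy replica" as read-only is an inference from the phrase "the repository remains readable".

```python
from enum import Enum

# DGit keeps three copies of every repository (from the quote above).
REPLICA_COUNT = 3


class Availability(Enum):
    READ_WRITE = "read-write"
    READ_ONLY = "read-only"
    UNAVAILABLE = "unavailable"


def repository_availability(healthy_replicas: int) -> Availability:
    """Map the number of reachable replicas to what clients can do."""
    if not 0 <= healthy_replicas <= REPLICA_COUNT:
        raise ValueError(f"expected 0..{REPLICA_COUNT} replicas, got {healthy_replicas}")
    if healthy_replicas >= 2:
        # One server down: the repository stays fully available.
        return Availability.READ_WRITE
    if healthy_replicas == 1:
        # Two servers down: fetches, clones and most of the web UI still work
        # (assumption: writes are refused in this state).
        return Availability.READ_ONLY
    return Availability.UNAVAILABLE


if __name__ == "__main__":
    for healthy in range(REPLICA_COUNT, -1, -1):
        print(healthy, repository_availability(healthy).value)
```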

    The point is: a hosting service cannot just store the bare repositories; it also has to deal with everything around them:

    • authentication (https or ssh)
    • authorization (tied to authentication, or also membership)
    • diff: as GitHub found out, you cannot just query a diff from Git and return it. See “How we made diff pages three times faster”.
    • search (see “How to search for a commit message on GitHub?”), in the master branch or in all branches: as of 2017, only the master branch is supported.

    And all of this does not even take into account the additional services and metadata those hosting platforms offer (projects, wikis, issues, …). See for instance “Moving persistent data out of Redis” for a glimpse of the kind of technical challenge that poses.

    And that is just GitHub. Bitbucket and GitLab have their own technical challenges and solutions.
