BigQuery GitHub data: How to handle repo name changes?
My goal is to track the total number of stars of my repo. However, its repo.name changed over time. How to achieve this with the
(related to https://stackoverflow.com/a/42930963/132438)
GitHub project names can change over time, so it is safer to query by repo.id than by repo.name. You can look up the project id in a separate query, or do it all in a single query like this:
SELECT
  COUNT(*) naive_count,
  COUNT(DISTINCT actor.id) unique_by_actor_id,
  COUNT(DISTINCT actor.login) unique_by_actor_login
FROM `githubarchive.month.*`
WHERE repo.id = (
  SELECT repo.id
  FROM `githubarchive.month.201702`
  WHERE repo.name='bazelbuild/bazel'
  LIMIT 1)
AND type = "WatchEvent"
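To see which names the repo has actually gone by, the same id lookup can drive a grouping by repo.name. This is only a sketch, not part of the original answer: it reuses the 201702 table and the 'bazelbuild/bazel' name from the query above, the output column aliases are made up for illustration, and it relies on the _TABLE_SUFFIX pseudo-column that BigQuery exposes for wildcard tables. Like the query above, it scans every monthly table.

```sql
-- Sketch: list every name recorded for one repo id, ordered by first appearance.
-- Assumption: the repo carried the name 'bazelbuild/bazel' in February 2017.
SELECT
  repo.name AS repo_name,                  -- a name seen for this repo id
  MIN(_TABLE_SUFFIX) AS first_month_seen,  -- earliest monthly table with that name
  MAX(_TABLE_SUFFIX) AS last_month_seen,   -- latest monthly table with that name
  COUNT(*) AS events                       -- events of any type under that name
FROM `githubarchive.month.*`
WHERE repo.id = (
  SELECT repo.id
  FROM `githubarchive.month.201702`
  WHERE repo.name = 'bazelbuild/bazel'
  LIMIT 1)
GROUP BY repo_name
ORDER BY first_month_seen
```

If more than one row comes back, the repo was renamed at some point, and a count filtered on a single repo.name would miss the stars recorded under the other names.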