Add Codespace usage metrics #173

Merged
merged 4 commits into from
May 31, 2024

Commits on May 31, 2024

  1. Add GitHub API query, dataclass, and test

    Define a Codespace dataclass containing required fields (see discussion
    in opensafely-core/codespaces-initiative#42).
    Rather than using an instance of the existing Repo dataclass to store
    repo data, store just the repo name as a string: we only need the
    name, and the API returns only a minimal amount of repo data. This is
    hopefully less confusing than modifying the Repo class or populating
    the extra fields it requires with dummy data.
    
    An additional PAT is required to query codespaces for the opensafely
    GitHub organisation. Any future querying of codespaces for other
    organisations will require similarly permissioned PATs.
    
    The organisation codespaces endpoint is queried and returned data is
    passed unmodified to the Codespace dataclass's from_dict() method, which
    does the required data conversion. This follows the pattern established
    for the other domain dataclasses.
    Jongmassey committed May 31, 2024
    5828f9c
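A minimal sketch of the dataclass pattern described in the first commit, for illustration only. The field set and the payload keys (`owner.login`, `repository.name`, `state`, `created_at`, `last_used_at`) are assumptions based on the general shape of GitHub's codespace objects; the fields actually chosen are those agreed in the linked discussion, and `parse_timestamp` is a hypothetical helper.

```python
from dataclasses import dataclass
from datetime import datetime


def parse_timestamp(value: str) -> datetime:
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11,
    # so normalise it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass
class Codespace:
    # Illustrative fields only; not necessarily those used in the PR.
    name: str
    owner: str
    repo: str  # just the repo name as a string, not a Repo instance
    state: str
    created_at: datetime
    last_used_at: datetime

    @classmethod
    def from_dict(cls, data: dict) -> "Codespace":
        # Accepts one unmodified item from the API response and performs
        # all conversion here, so callers never reshape the payload.
        return cls(
            name=data["name"],
            owner=data["owner"]["login"],
            repo=data["repository"]["name"],
            state=data["state"],
            created_at=parse_timestamp(data["created_at"]),
            last_used_at=parse_timestamp(data["last_used_at"]),
        )
```

Keeping `repo` as a plain string means the sketch never has to invent values for fields the API does not return.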
  2. Define Codespaces database table

    Use a broad composite key to ensure uniqueness and stability over time.
    The candidate "id" fields from the Codespaces API are not used because
    they are minimally documented by GitHub and thus untrustworthy. A large
    composite primary key does carry performance penalties, but for the
    expected data volume these are not expected to be significant.
    Jongmassey committed May 31, 2024
    ee51522
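To illustrate the composite-key idea from the second commit, here is a rough sketch using the standard-library `sqlite3` module in place of the project's actual database layer. The table name, column names, and the choice of key columns are all assumptions made for the example, as is the `try_insert` helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE codespaces (
        org TEXT NOT NULL,
        repo TEXT NOT NULL,
        owner TEXT NOT NULL,
        created_at TEXT NOT NULL,
        state TEXT,
        last_used_at TEXT,
        -- Hypothetical composite key: no single API "id" field is relied on.
        PRIMARY KEY (org, repo, owner, created_at)
    )
    """
)


def try_insert(row: tuple) -> bool:
    """Insert one row; return False if the composite key already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO codespaces VALUES (?, ?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


row = ("opensafely", "example-repo", "alice", "2024-05-01T09:00:00+00:00", "Available", None)
inserted_first = try_insert(row)
inserted_again = try_insert(row)  # identical key columns: rejected
# Same org, repo, and owner but a later creation time: a distinct codespace.
inserted_later = try_insert(row[:3] + ("2024-05-02T09:00:00+00:00",) + row[4:])
```

A plain `INSERT` fails on a repeated key, which is why a changed state for an existing codespace needs an upsert rather than a second insert.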
  3. Support upserts in database

    This is required as codespaces' state will change over time, and their
    data will disappear from the API once deleted so we are unable to
    do a full refresh of the data each time.
    
    Uses the PostgreSQL "INSERT ... ON CONFLICT ... DO UPDATE" style, as
    the newer "MERGE" statement is not yet supported in SQLAlchemy.
    Jongmassey committed May 31, 2024
    b7051a4
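The upsert behaviour described in the third commit can be sketched as follows. This is not the PR's SQLAlchemy code: it uses the standard-library `sqlite3` module, whose `ON CONFLICT ... DO UPDATE` clause follows the same style as PostgreSQL's, and the schema, the `upsert` signature, and the `row` helper are assumptions made for the example.

```python
import sqlite3

CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS codespaces (
    org TEXT NOT NULL,
    repo TEXT NOT NULL,
    owner TEXT NOT NULL,
    created_at TEXT NOT NULL,
    state TEXT,
    last_used_at TEXT,
    PRIMARY KEY (org, repo, owner, created_at)
)
"""

UPSERT = """
INSERT INTO codespaces (org, repo, owner, created_at, state, last_used_at)
VALUES (:org, :repo, :owner, :created_at, :state, :last_used_at)
ON CONFLICT (org, repo, owner, created_at) DO UPDATE SET
    state = excluded.state,
    last_used_at = excluded.last_used_at
"""


def upsert(conn: sqlite3.Connection, rows: list) -> None:
    # Ensure the table exists, then merge: new keys are inserted and
    # existing keys have their mutable columns overwritten.
    with conn:
        conn.execute(CREATE_TABLE)
        conn.executemany(UPSERT, rows)


def row(owner: str, state: str, last_used_at: str) -> dict:
    # Hypothetical helper that fills in the columns held constant here.
    return {
        "org": "opensafely",
        "repo": "example-repo",
        "owner": owner,
        "created_at": "2024-05-01T09:00:00+00:00",
        "state": state,
        "last_used_at": last_used_at,
    }


conn = sqlite3.connect(":memory:")
# First fetch: the API reports two codespaces.
upsert(conn, [row("alice", "Available", "2024-05-30"), row("bob", "Available", "2024-05-29")])
# Second fetch: bob's codespace was deleted, so the API no longer returns it.
upsert(conn, [row("alice", "Shutdown", "2024-05-31")])
states = dict(conn.execute("SELECT owner, state FROM codespaces ORDER BY owner"))
```

After the second call, alice's row carries the new state while bob's row is still present, which a drop-and-recreate refresh would have lost.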
  4. Add task to fetch and store codespaces metrics

    As the codespaces data will change over time and deleted codespaces
    will disappear from the API, this task, unlike the other tasks, does
    not drop and recreate the table each time it is run. Instead, calling
    the upsert() method ensures the table exists, then merges the new
    data with the existing data.
    
    The conversion method added to metrics.py is not a metric in the usual
    sense of the word, but instead renames some fields to match the
    database schema. This is necessary due to different naming conventions
    in the domain dataclasses and database tables.
    Jongmassey committed May 31, 2024
    3b7b084
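The field-renaming conversion mentioned in the fourth commit might look roughly like the sketch below. The dataclass fields, the column names, the `FIELD_TO_COLUMN` mapping, and the `to_rows` function are all hypothetical; they only illustrate mapping domain-object names onto differently named database columns.

```python
from dataclasses import asdict, dataclass


@dataclass
class Codespace:
    # Hypothetical domain-side field names.
    org: str
    repo_name: str
    user: str
    created_at: str
    state: str


# Hypothetical mapping from dataclass field names to table column names.
# Fields not listed keep their name unchanged.
FIELD_TO_COLUMN = {
    "org": "organisation",
    "repo_name": "repo",
}


def to_rows(codespaces: list) -> list:
    """Convert domain objects to dicts keyed by database column name."""
    return [
        {FIELD_TO_COLUMN.get(field, field): value for field, value in asdict(cs).items()}
        for cs in codespaces
    ]


rows = to_rows(
    [Codespace("opensafely", "example-repo", "alice", "2024-05-01T09:00:00+00:00", "Available")]
)
```

Keeping the renaming in one mapping means the dataclass and the table schema can each follow their own naming convention.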