PROJ: GitHub action for publishing code.json files for each repo within a GitHub org #7

Open
decause-gov opened this issue Jun 6, 2023 · 7 comments

Comments

@decause-gov
Contributor

MENTOR

@decause-gov

BRIEF DESCRIPTION

Code.gov is the canonical source of truth for federal open source code repositories. The website lists an index for this purpose, and has a process for including repos within it. To help improve HHS compliance with this policy, and enable others to do the same, we could automate uploading the index metadata!

SCOPING

  • weeks

REQUIRED DELIVERABLES

  • GitHub action for publishing code.json files for each repo within a GitHub org

DESIRED DATE WINDOW (June, July, August, Any)

  • June/July

NOTES/INSPO

GitHub Action Script to Generate Code.gov Metadata

This is an example GitHub Action script that generates a code.json file for each repository in a GitHub organization, using the Code.gov schema version 2.0.0, and commits the file to the root directory of the repo.

Usage

To use this script, you'll need to:

  1. Modify the script to include your specific organization name and API key.
  2. Modify the schema version or other parameters depending on your project's requirements.

GitHub Action Script

name: Generate Code.gov Metadata

on:
  schedule:
    - cron: "0 0 * * *"

jobs:
  generate-metadata:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v2

      - name: Install Code.gov CLI
        run: npm install -g code-gov-cli

      - name: Authenticate with Code.gov
        uses: code-gov/code-gov-cli-actions@<version>
        with:
          api_key: ${{ secrets.CODE_GOV_API_KEY }}

      - name: Generate and commit metadata for each repository
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          for repo in $(curl -s -H "Authorization: Bearer $GITHUB_TOKEN" "https://api.github.com/orgs/<org-name>/repos?per_page=100" | jq -r '.[].name'); do
            echo "Generating metadata for $repo..."
            cd "$repo"
            code-gov init -s "https://raw.githubusercontent.com/GSA/code-gov-data/master/schemas/schema-2.0.0.json"
            code-gov validate
            git add code.json
            git commit -m "Add Code.gov metadata"
            git push
            cd ..
          done
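
A minimal sketch of the same idea without depending on any Code.gov CLI: it maps repository objects, as returned by the GitHub REST API's "list organization repositories" endpoint, onto a code.json document. The function names `build_release` and `build_inventory` are hypothetical, the field mapping is an assumption based on a reading of the 2.0.0 schema, and `laborHours`, `usageType` and the contact email are placeholders that would need real values before publishing.

```python
import json


def build_release(repo: dict, contact_email: str) -> dict:
    """Map one GitHub API repository object to a code.json release entry.

    Assumption: field names follow the Code.gov 2.0.0 schema as I read it.
    """
    license_info = repo.get("license") or {}
    licenses = []
    if license_info:
        licenses.append(
            {"name": license_info.get("spdx_id"), "URL": license_info.get("url")}
        )
    return {
        "name": repo["name"],
        "repositoryURL": repo["html_url"],
        # GitHub returns null for a missing description; code.json expects text.
        "description": repo.get("description") or "",
        "permissions": {
            "licenses": licenses,
            # Assumption: a detected license on a public repo means open source.
            "usageType": "openSource" if licenses else "governmentWideReuse",
        },
        "laborHours": 0,  # placeholder: GitHub has no such figure
        "tags": repo.get("topics") or [],
        "contact": {"email": contact_email},
    }


def build_inventory(agency: str, repos: list, contact_email: str) -> dict:
    """Build a code.json document from public repositories only."""
    public = [r for r in repos if not r.get("private")]
    return {
        "version": "2.0.0",
        "agency": agency,
        "measurementType": {"method": "projects"},
        "releases": [build_release(r, contact_email) for r in public],
    }


if __name__ == "__main__":
    # Hypothetical repository object, shaped like a GitHub API response.
    sample = {
        "name": "example-repo",
        "html_url": "https://github.com/example-org/example-repo",
        "description": "An example",
        "license": None,
        "topics": [],
        "private": False,
    }
    print(json.dumps(build_inventory("HHS", [sample], "opensource@example.gov"), indent=2))
```

Private repositories are skipped here because their correct `usageType` cannot be inferred from GitHub metadata alone.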
@Gaelan
Contributor

Gaelan commented Jun 14, 2023

ChatGPT tells lies: there's no code-gov-cli on NPM, nor is there a code-gov/code-gov-cli-actions :(

Also, AIUI, this data eventually needs to be served from https://hhs.gov/code.json (which currently details various HHS, but no CMS, repos) - do we know where that file gets served from, and how we get data into it?

@Gaelan
Contributor

Gaelan commented Jun 15, 2023

Aha, it's generated out of https://github.com/HHS/Source-Code-Inventory

@Gaelan
Contributor

Gaelan commented Jun 16, 2023

As good a place to dump this as any - spent a few hours yesterday trying to get a quick list of all the open-source projects at CMS. Here are my notes:

at least three orgs:
  gh/CMSgov
    the biggest one, based out of CMS Communications

    recently active (last commit <=6mo) and notable:
      ab2d-bcda-dpc-platform - common devops stuff for AB2D, Beneficiary Claims Data, Data @ Point of Care
      AB2D: part of https://ab2d.cms.gov, provides claims data (from Medicare Parts A+B) to prescription drug sponsors (Medicare Part D)
        ab2d
        ab2d-contracts
        ab2d-events
        ab2d-gradle
        ab2d-lambdas
        AB2D-Libs [sic]
        ab2d-pdp-documentation: PDP = prescription drug plan
        ab2d-properties
        ab2d-sample-client-bash
        ab2d-sample-client-powershell
        ab2d-sample-client-python
      Acceptable Risk Safeguards: cybersecurity compliance checklists and automatic auditing thereof
        ars-machine-readable
        *-stig-baseline (4 repos)
        cms-ars-*-overlay (~60 repos)
      ai_website - static html page for https://ai.cms.gov
      Beneficiary Claims Data API: lets ACOs get data for their patients
        bcda-app
        bcda-ssas-app - "System-to-System Authentication Service"
        bcda-static-site
      beneficiary-reporting-validation: JSON validation code, unclear what the underlying schema is
      beneficiary-fhir-data - common backend for AB2D, Blue Button, Data @ Point of Care, etc
      Blue Button API (2.0): lets patients get data about themselves
        bluebutton-css
        bluebutton-sample-client-python-react
        bluebutton-sample-client-nodejs-react
        bluebutton-sample-client-rails
        bluebutton-site-static
        bluebutton-web-deployment
        bluebutton-web-server
        cms-bb2-node-sdk
        cms-bb2-python-sdk
      Data @ Point of Care: show patient's history to their clinician
        dpc-app
        dpc-static-site
      distributed-load-testing-on-aws: cloudformation templates to do what the name implies, forked from an aws project with CMS-specific customizations
      design-system
      Easy Access to System Information (EASi): ticketing system for CMS IT Governance, https://easi.cms.gov
        easi-app - ts/react fe, go be
        easi-shared
      ECTA: looks like the very beginning of a project to provide machine-readable documentation of criteria as to whether we pay for various procedures
      HealthCare.gov-Styleguide: deprecated in favor of design-system
      heimdall-lite.cms.gov: actions+gh pages stub to deploy https://github.com/mitre/heimdall2 for cms
      Hospital Price Transparency validation:
        hpt-validator-cli
        hpt-validator
        hpt-validator-tool
        price-transparency-guide - just docs, might be deprecated? unclear
        price-transparency-guide-validator - older, unsure how this relates to hpt-validator
      kmp_sca: Knowledge Management Platform (CMS Office of IT initiative) SCAnner - analyzes dependencies of CMS github repos
      MES Certification Repository: static website at https://cmsgov.github.io/CMCS-DSG-DSS-Certification/ serving as "a space for states, CMS, and vendors to learn, share, and contribute information about the [Medicaid Enterprise Systems] Certification process and its related artifacts"
        CMCS-DSG-DSS-Certification-Staging
        CMCS-DSG-DSS-Certification
      mint-app - "model innovation tool", ts/react fe, go be
      Quality Payment Program: various tools, docs and data for QPP, which I believe is related to the ACO system
        qpp-conversion-tool - converts QRDA3 to QPP for eCQI submission
        qpp-eu-data - "county-zipcode crosswalk data used for determining the providers eligible Extreme And Uncontrollable Circumstances Hardship"
        qpp-file-upload-api-client
        qpp-measures-data - "source of truth for [Quality Payment Program] measures data"
        qpp-shared-api-versioning-node - shared middleware for api versioning in qpp services
        qpp-shared-healthcheck-node - shared "health check" route for qpp services
        qpp-shared-logger-node - shared logging code for qpp services
        qpp-submissions-docs - docs for the QPP submissions API
      saf: https://saf.cms.gov/, website for the Security Automation Framework
      T-MSIS-Data-Quality-Measures-Generation-Code - data science stuff (notebooks etc) measuring data quality in the Transformed Medicaid Statistical Information System

    recent but not notable:
      ab2d-bcda-sample-python-fhirconnectathon-July2023 - empty for now
      dpc_aop - just flask project boilerplate right now, presumably related to data @ point of care, no idea what AOP is

    unmaintained but significant external interest (>= 10 stars):
      QHP-provider-formulary-APIs: JSON schemas for data about coverage by Qualified Health Plans on healthcare.gov
      qpp-claims-to-quality-public - computes QPP data quality metrics from claims data
      bluebutton-data-server - "Migrated into monorepo: https://github.com/CMSgov/beneficiary-fhir-data"
      BenefitAssist - rules engine to determine eligibility for various benefits, incl Medic{are,aid} but also SNAP and many others

    old and little external interest:
      ato-blueprint - appears to be a fork of https://github.com/GovReady/govready-q, no idea what it is or 
      macpro-quickstart-serverless-infra - ~empty repo presumably related to Medicaid And CHIP PROgram
      MES-StateOfficerMD: seemingly-abandoned course on IT procurement - MES presumably is Medicaid Enterprise Systems
      many others I haven't catalogued!
  gh/DSACMS
    DSAC's org
    repos: open (open-source program website), .github, dsacms.github.io
  gh/CMS-Enterprise
    not sure which division this is/why it's separate
    todo: detailed list; I know it hosts SBOM Harbor and several more of the Acceptable Risk Safeguards things
    repos:
      SBOM Harbor: tool for storing and analyzing Software Bills of Materials (lists of dependencies)
        sbom-harbor
        sbom-harbor-ui
      cms-ars-*-overlay: Acceptable Risk Safeguards, cybersecurity auditing - not sure how this differs from repos in CMSgov
      ai_explorers: monorepo of various departments' AI pilot projects
      ccar: appears to be a fork of https://github.com/mitre-attack/car, unclear what's different if anything

ONC has a good list of open-source stuff they publish for eCQI: https://ecqi.healthit.gov/ecqi-tools-key-resources
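
The recency and star-count buckets used in the notes above could be computed roughly like this. This is a sketch under assumptions: it takes repository objects shaped like GitHub REST API responses, uses `pushed_at` as a stand-in for "last commit" (it records the last push to any branch, so it can differ), and approximates six months as 183 days. The function name `classify` and the bucket labels are hypothetical; the "notable" judgment in the notes was made by hand and is not reproduced.

```python
from datetime import datetime, timedelta, timezone

RECENT_WINDOW = timedelta(days=183)  # roughly six months
STAR_THRESHOLD = 10                  # ">= 10 stars" counts as external interest


def classify(repo: dict, now: datetime) -> str:
    """Sort one repository into a bucket by last push date and star count."""
    pushed = datetime.strptime(repo["pushed_at"], "%Y-%m-%dT%H:%M:%SZ")
    pushed = pushed.replace(tzinfo=timezone.utc)
    if now - pushed <= RECENT_WINDOW:
        return "recently active"
    if repo.get("stargazers_count", 0) >= STAR_THRESHOLD:
        return "unmaintained, external interest"
    return "old, little external interest"


if __name__ == "__main__":
    # Hypothetical repository object for illustration.
    example = {"pushed_at": "2020-01-01T00:00:00Z", "stargazers_count": 3}
    print(classify(example, datetime.now(timezone.utc)))
```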

@Gaelan
Contributor

Gaelan commented Jun 16, 2023

ok there are at least six tools for this, which GSA has helpfully catalogued: https://github.com/GSA/code-gov/blob/master/docs/code_json_generators.md

Open questions:

  • Are any of those tools worth using?
  • Once we've generated our code.json, how do we merge it with the HHS one? There don't seem to be any tools to merge code.json files at the minute, though it wouldn't be too hard to write one.
  • How do we get the updated json to HHS? Other orgs within HHS (eg CDC) seem to be doing a manual PR every quarter or so - is that good enough, or do we want to be sending automatic PRs (there's a risk of annoying HHS here if not done carefully), or working with them to build a fully automated pathway?
  • Most of CMS's projects span multiple repos - it seems like the relatedCode field is intended to allow multi-repo projects, but few orgs use it in their published code.json and every one that does uses it differently. Do we want to try to use this, or just one entry per repo no matter what? I've requested to join the 18F Slack (which, afaict, is open to anyone) to ask them about this.
  • My reading of the requirements is that code.json should include all code developed at/for CMS, even if it isn't OSS - how do we go about indexing that?
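
For the merge question above, a rough sketch of what such a tool might look like. It assumes both files keep their entries in a top-level `releases` array and that `repositoryURL` (falling back to `name`) identifies an entry; the function name `merge_inventories` is hypothetical, and "the later file wins on conflict" is one possible policy, not an established convention.

```python
def merge_inventories(base: dict, addition: dict) -> dict:
    """Merge the releases of two code.json documents.

    Top-level metadata (agency, version, ...) is taken from `base`.
    Entries with the same key are replaced by the one from `addition`.
    """
    by_key = {}
    for release in base.get("releases", []) + addition.get("releases", []):
        # Normalize so trailing slashes and letter case do not cause duplicates.
        key = (release.get("repositoryURL") or release["name"]).lower().rstrip("/")
        by_key[key] = release  # a later entry overwrites an earlier one
    merged = dict(base)
    merged["releases"] = list(by_key.values())
    return merged


if __name__ == "__main__":
    # Hypothetical inputs for illustration.
    hhs = {"agency": "HHS", "releases": [{"name": "a", "repositoryURL": "https://example.gov/a"}]}
    cms = {"agency": "HHS", "releases": [{"name": "b", "repositoryURL": "https://example.gov/b"}]}
    print(len(merge_inventories(hhs, cms)["releases"]))
```

Keying on the repository URL keeps the merge idempotent, so re-running it with an unchanged CMS file would leave the HHS file unchanged.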

@decause-gov decause-gov added the summer2024 Summer Fellowship and Internship Related Tasks label May 28, 2024
@decause-gov decause-gov added the open DSAC Open Repo label Jun 3, 2024
@CreativeNick CreativeNick self-assigned this Jul 22, 2024
@CreativeNick

7/26:

  • Meeting set with Natalia & Ricardo for next Monday, July 29

@decause-gov
Contributor Author

07/29:

  • Based on code.json findings (Nat/Remy met 1x1) @natalialuzuriaga is going to update this ticket with new info

@CreativeNick CreativeNick removed their assignment Jul 29, 2024
@decause-gov
Contributor Author

07/29:

  • This ticket is from last year, and a lot of changes have happened since.
  • Looking at this ticket, we're still early in this step, and because we have only a week left, I would recommend other tickets or other code.json-related tasks.

Alternative Code.json ticket options:

  • Explore agency-wide code.json generators
  • Add required fields to code.json on cookiecutter (similar work to code.json backend ticket that Ricardo worked on)

@decause-gov decause-gov removed the summer2024 Summer Fellowship and Internship Related Tasks label Jul 29, 2024