This repository showcases an MLOps use case using GCP as the cloud provider and GitHub Actions as the CI/CD flow manager.
A detailed explanation of this repository can be found in my Towards Data Science article, "Part 1: Let's Build an Operational MLOps Framework from Scratch".
This repository contains the resources to carry out what is described in the following flow
To adapt this repository to your use case, I recommend following the guide in the article "Part 1: Let's Build an Operational MLOps Framework from Scratch". Alternatively, the next section explains which files you would have to modify to use this content.
First, in GCP you will need to create a Workload Identity Provider which will enable the connection between GitHub and GCP.
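As a rough sketch of this step, the snippet below assembles the gcloud commands that would create a Workload Identity Pool and a GitHub OIDC provider inside it. The project, pool, provider, and repository names are hypothetical placeholders, and the attribute mapping and condition are one common choice, not necessarily the setup used in the article. The script only prints the commands so you can review them before running anything.

```python
import shlex

# GitHub's OIDC token issuer, used by GitHub Actions workflows.
GITHUB_ISSUER = "https://token.actions.githubusercontent.com"


def workload_identity_commands(project_id, pool, provider, repository):
    """Build gcloud commands that create a pool and a GitHub OIDC provider.

    All argument values are placeholders chosen by the caller.
    `repository` is expected in "owner/name" form.
    """
    create_pool = [
        "gcloud", "iam", "workload-identity-pools", "create", pool,
        f"--project={project_id}",
        "--location=global",
    ]
    create_provider = [
        "gcloud", "iam", "workload-identity-pools", "providers",
        "create-oidc", provider,
        f"--project={project_id}",
        "--location=global",
        f"--workload-identity-pool={pool}",
        f"--issuer-uri={GITHUB_ISSUER}",
        # Assumed mapping: expose the repository claim as an attribute.
        "--attribute-mapping=google.subject=assertion.sub,"
        "attribute.repository=assertion.repository",
        # Assumed condition: only accept tokens from one repository.
        f"--attribute-condition=assertion.repository == '{repository}'",
    ]
    return [create_pool, create_provider]


if __name__ == "__main__":
    # Hypothetical values; replace them with your own.
    for cmd in workload_identity_commands(
        "my-project", "github-pool", "github-provider", "my-org/my-repo"
    ):
        print(shlex.join(cmd))
```

Restricting the provider to a single repository through the attribute condition is a design choice assumed here; without some condition, tokens from other GitHub repositories could also be exchanged against the pool.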
You will also need to create a service account with the following roles:
"roles/artifactregistry.writer"
"roles/bigquery.readSessionUser"
"roles/cloudbuild.builds.builder"
"roles/cloudbuild.tokenAccessor"
"roles/cloudbuild.workerPoolUser"
"roles/logging.logWriter"
"roles/iam.serviceAccountUser"
"roles/aiplatform.user"
"roles/developerconnect.user"
"roles/storage.objectCreator"
Then, the service account you created will have to be associated with the Workload Identity Pool.
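The service account steps above could be scripted roughly as follows. This is a minimal sketch that prints gcloud commands rather than executing them. The account name, project number, pool name, and repository are hypothetical, and the final binding assumes the provider maps the GitHub repository claim to attribute.repository, which may differ from your setup.

```python
import shlex

# Roles listed in this README for the service account.
ROLES = [
    "roles/artifactregistry.writer",
    "roles/bigquery.readSessionUser",
    "roles/cloudbuild.builds.builder",
    "roles/cloudbuild.tokenAccessor",
    "roles/cloudbuild.workerPoolUser",
    "roles/logging.logWriter",
    "roles/iam.serviceAccountUser",
    "roles/aiplatform.user",
    "roles/developerconnect.user",
    "roles/storage.objectCreator",
]


def service_account_commands(project_id, project_number, account_name, pool, repository):
    """Build gcloud commands to create the account, grant roles, and bind it to the pool."""
    email = f"{account_name}@{project_id}.iam.gserviceaccount.com"
    commands = [[
        "gcloud", "iam", "service-accounts", "create", account_name,
        f"--project={project_id}",
    ]]
    # One project-level binding per role.
    for role in ROLES:
        commands.append([
            "gcloud", "projects", "add-iam-policy-binding", project_id,
            f"--member=serviceAccount:{email}",
            f"--role={role}",
        ])
    # Assumption: identities are selected by the attribute.repository mapping.
    principal = (
        "principalSet://iam.googleapis.com/"
        f"projects/{project_number}/locations/global/"
        f"workloadIdentityPools/{pool}/attribute.repository/{repository}"
    )
    # Let identities from the pool impersonate the service account.
    commands.append([
        "gcloud", "iam", "service-accounts", "add-iam-policy-binding", email,
        f"--project={project_id}",
        "--role=roles/iam.workloadIdentityUser",
        f"--member={principal}",
    ])
    return commands


if __name__ == "__main__":
    # Hypothetical values; replace them with your own.
    for cmd in service_account_commands(
        "my-project", "123456789012", "mlops-cicd", "github-pool", "my-org/my-repo"
    ):
        print(shlex.join(cmd))
```

Note that the principalSet member uses the numeric project number, not the project ID.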
Finally, in the .github/workflows/cicd.yaml file, you will need to adjust the variables PROJECT_ID, WORKLOAD_IDENTITY_PROVIDER and SERVICE_ACCOUNT in the cd job for the step GCP Auth. This change must look like:
- name: 'GCP Auth'
  uses: 'google-github-actions/[email protected]'
  with:
    project_id: ${{ vars.PROJECT_ID }}
    workload_identity_provider: ${{ vars.WORKLOAD_IDENTITY_PROVIDER }}
    service_account: ${{ vars.SERVICE_ACCOUNT }}
- name: Register
  run: |
    curl -X POST \
      -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      -F tags=latest \
      -F [email protected] \
      ${{ vars.PIPELINE_REPOSITORY }}
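Because the auth step reads these values from GitHub configuration variables, a malformed value usually only surfaces when the workflow runs. The helper below is a small, hypothetical sanity check you could run locally before saving the variables. The patterns are deliberately loose approximations of the expected shapes (a full provider resource name with a numeric project number, and a service account email), not an official validation rule.

```python
import re

# Expected shape: projects/<number>/locations/global/workloadIdentityPools/<pool>/providers/<provider>
PROVIDER_RE = re.compile(
    r"^projects/\d+/locations/global/workloadIdentityPools/[^/]+/providers/[^/]+$"
)
# Expected shape: <name>@<project>.iam.gserviceaccount.com (loose approximation)
SERVICE_ACCOUNT_RE = re.compile(
    r"^[a-z][a-z0-9-]*@[a-z0-9.:-]+\.iam\.gserviceaccount\.com$"
)


def check_auth_variables(project_id, workload_identity_provider, service_account):
    """Return a list of problems; an empty list means the values look plausible."""
    problems = []
    if not project_id.strip():
        problems.append("PROJECT_ID is empty")
    if not PROVIDER_RE.match(workload_identity_provider):
        problems.append(
            "WORKLOAD_IDENTITY_PROVIDER should look like "
            "projects/<number>/locations/global/workloadIdentityPools/<pool>/providers/<provider>"
        )
    if not SERVICE_ACCOUNT_RE.match(service_account):
        problems.append(
            "SERVICE_ACCOUNT should look like <name>@<project>.iam.gserviceaccount.com"
        )
    return problems


if __name__ == "__main__":
    # Hypothetical values; replace them with your own.
    issues = check_auth_variables(
        "my-project",
        "projects/123456789012/locations/global/workloadIdentityPools/github-pool/providers/github-provider",
        "mlops-cicd@my-project.iam.gserviceaccount.com",
    )
    print(issues or "values look plausible")
```

A passing check only means the strings have the right shape; it does not confirm that the pool, provider, or service account actually exist in your project.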
And that is pretty much it!
This repo will be updated as I add other MLOps resources (continuous training, model monitoring, data validation, etc.), and I will update this README and the repository code as soon as I can.
Happy coding 🤓!