Commit

Fixing documentation (#268)
* Fixing documentation
gushob21 committed Mar 1, 2024
1 parent ca4086b commit 2596ced
Showing 2 changed files with 7 additions and 11 deletions.
2 changes: 1 addition & 1 deletion ml-platform/04_setup_clusters/README.md
@@ -54,7 +54,7 @@ You just followed `GitOps` to promote changes from `dev` to higher environments.
Open the configsync repo and go to `manifests/clusters`, you will see there is a cluster selector created for each cluster via yaml files.

### Install a cluster scoped software
-This section describes how platform admins will use the configsync repo to manage cluster scoped software or cluster level objects. These softwares could be used by multiple teams in their namespaces. An example of such softwares is [kuberay][kuberay] that can manage ray clusters in multiple namespace.
+This section describes how platform admins will use the configsync repo to manage cluster scoped software or cluster level objects. These software could be used by multiple teams in their namespaces. An example of such software is [kuberay][kuberay] that can manage ray clusters in multiple namespace.


Let's install [Kuberay][kuberay] as a cluster level software that includes CRDs and deployments. Kuberay has a component called operator that facilitates `ray` on Kubernetes. We will install Kuberay operator in default namespace. The operator will then orchestrate `ray clusters` created in different namespace by different teams in the future.
16 changes: 6 additions & 10 deletions ml-platform/README.md
@@ -35,26 +35,22 @@ It addresses following personae and provides means to automate and simplify thei

**CUJ 1** : Use ML tools like `ray` to perform their day to day tasks like data pre-processing, ML training etc.

-**CUJ 2** : Use a development environment like Jupyter Notebook for faster inner loop of ML development.
+**CUJ 2** : Use a development environment like Jupyter Notebook for faster inner loop of ML development. **[TBD]**

### Operators

-**CUJ 1**: Act as a bridge between the Platform admins and the ML Engineers by providing and maintaining softwares needed by the ML engineers so they can focus on their job.
+**CUJ 1**: Act as a bridge between the Platform admins and the ML Engineers by providing and maintaining software needed by the ML engineers so they can focus on their job.

-**CUJ 2**: Deploying the models.
+**CUJ 2**: Deploying the models. **[TBD]**

-**CUJ 3**: Building observability on the models.
+**CUJ 3**: Building observability on the models. **[TBD]**

-**CUJ 4**: Operationalizing the models.
+**CUJ 4**: Operationalizing the models. **[TBD]**

## Prerequistes

1. This tutorial has been tested on [Cloud Shell](https://shell.cloud.google.com) which comes preinstalled with [Google Cloud SDK](https://cloud.google.com/sdk) is required to complete this tutorial.

2. It is recommended to start the tutorial in a fresh project since the easiest way to clean up once complete is to delete the project. See [here](https://cloud.google.com/resource-manager/docs/creating-managing-projects) for more details.

3. This tutorial requires a number of different GCP Quotas (>= 60 T4 GPUs and 400 CPU cores) in the region of your choosing. Please visit the [IAM -> Quotas page](https://console.cloud.google.com/iam-admin/quotas) in the context of your project and region to request additional quota before proceeding with this tutorial.

## Deploy resources.

Follow these steps in order to build the platform and use it.
@@ -69,7 +65,7 @@ Follow these steps in order to build the platform and use it.

- Run steps in [05_setup_teams][setup-teams]. This modules walks through how as platform admin you can set up spaces for ML teams on the cluster and transfer ownership to operators to maintain that space.

-- Run steps in [06_operating_teams][operating-teams]. This module walks through how as an operator you will provide the softwares required by ML engineers.
+- Run steps in [06_operating_teams][operating-teams]. This module walks through how as an operator you will provide the software required by ML engineers.


[projects]: ./01_gcp_project/README.md