Commit

Added Questions sections for classroom discussion
carlmes committed Sep 5, 2024
1 parent 6ae940b commit 764cf7f
Showing 9 changed files with 104 additions and 22 deletions.
2 changes: 1 addition & 1 deletion content/modules/ROOT/nav.adoc
@@ -12,7 +12,7 @@
* 7. xref:32_model_training_car.adoc[Model Training]
* 8. xref:34_boto3.adoc[Boto3]
* 8. xref:34_using_s3_storage.adoc[Using S3 Storage]
* 9. xref:36_deploy_model.adoc[Deploy Model]
11 changes: 10 additions & 1 deletion content/modules/ROOT/pages/01_welcome.adoc
@@ -33,4 +33,13 @@ It's possible to install all the various components by hand, making notes such a

The source code for the AI Accelerator can be found at: https://github.com/redhat-ai-services/ai-accelerator

The accelerator was created and is currently maintained by the Red Hat AI Services organization; however, contributions from anywhere are always greatly appreciated!

## Questions for Further Consideration

Additional questions that could be discussed for this topic:

. When would I want to manually install RHOAI and associated components, instead of using the AI Accelerator framework?
. Who maintains the AI Accelerator framework project?
. What if I find a bug, or have a new feature to contribute?
. Can I add my own additional components into the AI Accelerator?
12 changes: 11 additions & 1 deletion content/modules/ROOT/pages/05_environment_provisioning.adoc
@@ -47,4 +47,14 @@ The provisioning process will take a while to complete, so why not take some tim

Once the clusters have been provisioned, you should receive an email containing the cluster URLs as well as an administrative user (such as `kubeadmin`) and password.

You can also obtain these URLs and credentials from your services dashboard at https://demo.redhat.com/[demo.redhat.com]. The dashboard also allows you to perform administrative functions on your clusters, such as starting/stopping or extending the lifespan if desired.

## Questions for Further Consideration

Additional questions that could be discussed for this topic:

. How long can we use the demo.redhat.com OpenShift cluster? When will it get deleted?
. I want to install a demonstration cluster that might last several months for a RHOAI evaluation period. What options are available?
. Can we use our own AWS based OpenShift cluster, other than one from demo.redhat.com?
. Could I install this on my own hardware, such as my desktop PC that is running a single node OpenShift cluster?
. The ability to easily repeat an installation, as discussed in the following GitOps sections, may also be interesting to discuss, since it means that the work done to configure an environment is not lost if that environment is destroyed.
13 changes: 12 additions & 1 deletion content/modules/ROOT/pages/07_installation.adoc
@@ -70,4 +70,15 @@ oc delete deployment granite-predictor-00001-deployment -n ai-example-single-mod
We will cover the ai-accelerator project overview in a later section.

---
Continue using the _**DEMO**_ cluster for the subsequent exercises.

## Questions for Further Consideration

Additional questions that could be discussed for this topic:

. What's the difference between "bootstrapping" and "installing" the new OpenShift cluster?
. Why is forking an open source project a good idea?
. How can a project fork be used to contribute back to the parent project with bug fixes, updates and new features?
. Could, or should, the bootstrap shell script be converted to Ansible?
. How does the bootstrap script provision GPU resources in the new OpenShift cluster? Hint: a quick walk through the logic in the source code should be a useful exercise, time permitting.
. Where can I get help if the bootstrap process breaks?
10 changes: 9 additions & 1 deletion content/modules/ROOT/pages/20_ai-accelerator_review.adoc
@@ -16,6 +16,7 @@ This project is set up with ArgoCD and Kustomize in mind. Meaning ArgoCD will ha
If you are unfamiliar with Kustomize, this is a very good tutorial: https://devopscube.com/kustomize-tutorial/[Learn more about Kustomize].

### Overview of Kustomize in AI-Accelerator Project

Let's try to understand how Kustomize is being used to deploy the different resources in the ai-accelerator.

1. When running the _**bootstrap.sh**_ script, it will apply the OpenShift GitOps operator by using Kustomize on the https://github.com/redhat-ai-services/ai-accelerator/blob/b90f025691e14d8e8a8d5ff3452107f8a0c8f48d/scripts/bootstrap.sh#L11[GitOps_Overlay] https://github.com/redhat-ai-services/ai-accelerator/tree/b90f025691e14d8e8a8d5ff3452107f8a0c8f48d/components/operators/openshift-gitops/operator/overlays/latest[folder]:
@@ -126,4 +127,11 @@ If you are using a disconnected environment, you will need to first setup:
* Red Hat Blog: https://www.redhat.com/en/blog/your-guide-to-continuous-delivery-with-openshift-gitops-and-kustomize[Your Guide to Continuous Delivery with OpenShift GitOps and Kustomize] - a good article explaining more GitOps concepts
* GitHub: https://github.com/gnunn-gitops/standards/blob/master/folders.md[GitOps Folder Structure] - the original inspiration for the folder structure in the AI Accelerator project
* Red Hat Blog: https://www.redhat.com/en/blog/enterprise-mlops-reference-design[Enterprise MLOps Reference Design] - a conceptual reference design for performing Machine Learning Operations (MLOps)
* Topic: https://www.redhat.com/en/topics/devops/what-is-gitops[What is GitOps?] - 7-minute read on the topic of GitOps

## Questions for Further Consideration

Additional questions that could be discussed for this topic:

. Where can I find a list of curated components that follow the GitOps pattern? Hint: see the https://github.com/redhat-cop/gitops-catalog[GitOps Catalog] GitHub page.
. Wow, this project structure is complicated! Is there a way to simplify the project folder structures? Hint: a good discussion could be had on where we came from and how we got here in terms of project design and layout.
9 changes: 9 additions & 0 deletions content/modules/ROOT/pages/30_gitops_env_setup_dev_prod.adoc
@@ -1,6 +1,7 @@
# Environment Install and Setup: DEV and PROD Cluster

## Parasol-insurance-dev cluster

Follow these steps to complete the install and setup:

* After the cluster is running and ready, log in as the admin
@@ -313,3 +314,11 @@ When running the bootstrap script, select `bootstrap/overlays/parasol-insurance-
====
To check your work please refer to https://github.com/redhat-ai-services/ai-accelerator-qa/tree/30_gitops_env_setup_prod[This Prod Branch]
====

## Questions for Further Consideration

Additional questions that could be discussed for this topic:

. How familiar are your development teams with CI/CD concepts?
. How do you currently deploy projects to development, QA, and production environments?
. Is ArgoCD new to the team?
11 changes: 11 additions & 0 deletions content/modules/ROOT/pages/31_custom_notebook.adoc
@@ -304,3 +304,14 @@ image::01_custom_workbench.png[Custom workbench]
====
Verify your work against https://github.com/redhat-ai-services/ai-accelerator-qa/pull/new/31_custom_notebook:[This custom-workbench branch]
====

## Questions for Further Consideration

Additional questions that could be discussed for this topic:

. How many Python packages are included in your typical data scientist development environment? Are there any packages that are unique to your team?
. How do you handle continuous updates in your development environment? Remember that AI/ML is an evolving landscape: new packages are released all the time, and existing packages undergo very frequent updates.
. Can data scientists ask for new packages in a securely controlled development environment?
. Where do you store source code for model experimentation and training?
. Do you think that cluster storage (such as an OpenShift PVC) is a good permanent location for source code, so that in the event of failure the source is not lost?
. How do your teams of data scientists collaborate on notebooks when training models or performing other experiments?
4 changes: 2 additions & 2 deletions content/modules/ROOT/pages/32_model_training_car.adoc
@@ -1,8 +1,8 @@
# Model Training with Custom Notebook

## In this module you will sync a git repo and run through a the parasol-insurnace Jupyter notebooks.
In this module you will sync a git repo, and then execute the logic contained in the parasol-insurance Jupyter notebooks.

We will use the custom image we created and uploaded before and we will create a workbench with the customer image we uploaded in module 04.
To perform this task, we will use the custom data science notebook image we created and uploaded in the previous module.

## Steps to create workbench with a custom notebook

@@ -1,13 +1,26 @@
# Boto3 exploration
# Using S3 Storage

In this module you will use boto3 to explore the existing minio configuration and set up scripts to automate uploading the models to the locations you need for your pipelines to run.
In this module you will set up some S3-based storage in the OpenShift cluster, and then utilize the Python package called https://pypi.org/project/boto3/[Boto3] to explore the existing S3-based storage configuration.

## Minio S3 Storage
Let's add Minio s3 storage to our dev environment project so Argo can deploy it.
We will also set up scripts to automate uploading the ML models to the locations required by the pipelines in subsequent modules.

If you need a reference, the ai-accelerator project has minio set up in under the `tenants/ai-examples folder`.
## What is S3 Storage?

### Set up minio
Amazon Simple Storage Service (S3) is a service offered by Amazon Web Services (AWS) that provides object storage through a web service interface.

This lab uses https://github.com/minio/minio[MinIO] to implement S3 storage. MinIO is a high-performance object store released under the GNU Affero General Public License v3.0, and it is API compatible with the Amazon S3 cloud storage service. MinIO can be used to build high-performance infrastructure for machine learning, analytics, and application data workloads.

For more information about S3:

* AWS documentation: https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html[What is Amazon S3?]
* Wikipedia: https://en.wikipedia.org/wiki/Amazon_S3[Amazon S3]
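
To make the object-storage model concrete, here is a minimal, hypothetical sketch of storing and retrieving an object with Python and Boto3. The bucket name and object key are made up for illustration, and with no endpoint override the client targets AWS S3 itself rather than MinIO:

[source, python]
----
import boto3

# With no endpoint_url argument, the client targets AWS S3 itself.
s3 = boto3.client("s3")

# Objects live in buckets and are addressed by a key (a path-like string).
# The bucket name and key below are hypothetical.
s3.put_object(Bucket="example-models", Key="models/model.onnx", Body=b"model bytes")

# Retrieve the same object by bucket name and key.
response = s3.get_object(Bucket="example-models", Key="models/model.onnx")
print(response["Body"].read())
----

The same calls work against any S3-compatible store, such as the MinIO instance deployed later in this module, once the client is pointed at its endpoint.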

## Adding MinIO to the Cluster

In this section we will add the MinIO S3 storage operator to our dev OpenShift cluster configuration project so Argo can deploy it.

If you need a reference, the ai-accelerator project has MinIO set up under the `tenants/ai-examples` folder.

### Set up MinIO

. Create an `object-datastore.yaml` file in the `tenants/parasol-insurance/namespaces/base` directory with the following content:

@@ -85,7 +98,8 @@ resources:
The same content will work for both overlays (dev and prod).
====

Commit your changes to your fork of ai-accelerator project. Let ArgoCD sync and deploy minio.
Commit your changes to your fork of ai-accelerator project. Wait for ArgoCD to sync and deploy MinIO.

You should find your minio resource in the _**object-datastore**_ namespace.

The *minio-ui* route can be found in _**object-datastore**_ namespace under _**Routes**_. Open this in a new tab and log in with `minio:minio123`.
@@ -97,9 +111,9 @@ Compare your change to this in git https://github.com/redhat-ai-services/ai-acce

Explore the S3 storage.

## Set up an standard data science workbench to explore S3 with boto3
## Explore S3 with Boto3

We have previously used a custom workbench to explore how to train a model. Now we will use a standard workbench to explore the S3 storage.
We have previously used a custom workbench to explore how to train a model. Now we will use a standard workbench to explore the S3 storage using the Boto3 Python package.

### Create a standard workbench

@@ -293,7 +307,7 @@ resources:
- ../../base
----

. Push the changes to git, and wait for the synchrnization to complete.
. Push the changes to git, and wait for the synchronization to complete.

+
[TIP]
@@ -307,8 +321,11 @@ Validate against https://github.com/redhat-ai-services/ai-accelerator-qa/pull/n
[.bordershadow]
image::standard-workbench.png[Standard workbench]

## Explore S3 in RHOAI Workbench:
Some S3 technologies do not come with UI or CLI to interact with the buckets or files. A common tool that can be used accross all S3 technologies is boto3. Boto3 is the AWS SDK for Python. It allows you to directly interact with AWS services such as S3, EC2, and more.
## Explore S3 in RHOAI Workbench

https://pypi.org/project/boto3/[Boto3] is the AWS SDK for Python, and is commonly used to communicate with S3-compatible storage providers. It allows you to directly interact with AWS services such as S3, EC2, and more.
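
As a preview of the notebook steps below, here is a minimal sketch of creating a Boto3 client that points at an S3-compatible endpoint such as MinIO. The environment variable names and the fallback endpoint and credentials are assumptions for illustration; substitute whatever values you configured on your workbench (the lab's MinIO login is `minio:minio123`).

[source, python]
----
import os
import boto3

# Endpoint and credentials are read from environment variables; the variable
# names and fallback values here are assumptions -- substitute the MinIO route
# or service URL and the credentials configured for your workbench.
s3 = boto3.client(
    "s3",
    endpoint_url=os.environ.get("AWS_S3_ENDPOINT", "http://minio.object-datastore.svc:9000"),
    aws_access_key_id=os.environ.get("AWS_ACCESS_KEY_ID", "minio"),
    aws_secret_access_key=os.environ.get("AWS_SECRET_ACCESS_KEY", "minio123"),
)

# List the buckets visible to these credentials.
for bucket in s3.list_buckets()["Buckets"]:
    print(bucket["Name"])
----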

Let's create some Python code in a Jupyter notebook to interact with our S3 storage:

. Go to the RHOAI Dashboard and open the _**parasol-insurance**_ Data Science Project.

@@ -324,19 +341,19 @@ image::standard-workbench.png[Standard workbench]
[.bordershadow]
image::Workbench_env_vars.png[]

. Launch the workbench and wait for the Jupyter notebook to spin up.
. Launch the workbench and wait for the Jupyter notebook to start up.

. Create a new Notebook.

. In a new cell, add and run the content below to install boto3 and ultralytics.
. In a new cell, add and run the content below to install the `boto3` and `ultralytics` packages using pip.

+
[source, python]
----
!pip install boto3 ultralytics
----

. Configure the connection to minio S3
. Configure the connection to MinIO S3

+
[source, python]
Expand Down Expand Up @@ -449,3 +466,10 @@ def get_minio_content(bucket):
get_minio_content('models')
----
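
The snippet above lists bucket contents; uploading a trained model is just as simple. The following is a hedged sketch that reuses the `s3` client configured earlier in the notebook; the local file name, bucket, and key are hypothetical, so adjust them to the layout your pipelines expect:

[source, python]
----
# Upload a local model file to the "models" bucket; the key layout shown here
# is an assumption -- use whatever path your pipeline expects.
s3.upload_file("model.onnx", "models", "models/model.onnx")

# Confirm the upload by listing the bucket's objects.
for obj in s3.list_objects_v2(Bucket="models").get("Contents", []):
    print(obj["Key"], obj["Size"])
----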

## Questions for Further Consideration

Additional questions that could be discussed for this topic:

* What other tools exist for interacting with S3? Hint: https://s3tools.org/s3cmd[s3cmd] is another popular S3 CLI tool.
* Could a shortcut to the MinIO Console be added to OpenShift? Hint: see the OpenShift `ConsoleLink` API; https://github.com/redhat-na-ssa/demo-lab-config/blob/main/demo/run-mlflow/link-minio.yaml[here's an example].
* What's the maximum size of an object, such as an ML model, that can be stored in S3?
