01: intro updates
katilp committed Oct 21, 2024
1 parent 0576110 commit cb9642c
Showing 2 changed files with 8 additions and 6 deletions.
10 changes: 6 additions & 4 deletions episodes/01-intro.md
@@ -31,17 +31,19 @@ To learn about CMS open data and the different data formats, work through the tu
::::::::::::::::::::::::::::::::::::::::::::::::


This is an option for you if you do not have enough computing resources and want to run some heavy processing. In this tutorial, we use as an example the [processing](https://opendata.cern.ch/record/12504) of CMS open data MiniAOD into a "custom" NanoAOD, which includes more information than the standard NanoAOD but keeps the same flat file format.
Using public cloud resources is an option for you if you do not have enough computing resources and want to run some heavy processing. In this tutorial, we use as an example the [processing](https://opendata.cern.ch/record/12504) of CMS open data MiniAOD into a "custom" NanoAOD, which includes more information than the standard NanoAOD but keeps the same flat file format.

We assume that you will want to download the output files to your local area and analyse them with your own resources. Note that analysing the files with GCP resources, keeping them stored on GCP, is also possible, but that is not covered in this tutorial.
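
If the workflow writes its outputs to a Google Cloud Storage bucket (an assumption here; the lesson defines the actual output location), downloading them can look roughly like the sketch below, with the bucket name and paths as placeholders:

```bash
# Copy the output files from a GCS bucket to a local directory
# (bucket name and paths are placeholders, not the lesson's actual values)
gsutil cp -r gs://my-output-bucket/pfnano-output/ ./local-output/
```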


## Google Cloud Platform

Public cloud providers are companies that offer computing resources and services over the internet to multiple users or organizations. Google Cloud Platform (GCP) is one of them. You define the resources that you need and pay for what you use. Like many other such providers (for example AWS, Azure, OVH), it offers some free getting-started "credits".
Public cloud providers are companies that offer computing resources and services over the internet to multiple users or organizations. Google Cloud Platform (GCP) is one of them. You define and deploy the resources that you need and pay for what you use. Like many other such providers (for example AWS, Azure, OVH), it offers some free getting-started "credits".

::::::::::::::::::::::::::::::::::::: callout

GCP offers $300 of [free trial](https://cloud.google.com/free/docs/free-cloud-features#free-trial) credits for a new account. These credits are valid for 90 days.

This tutorial was set up using [Google Cloud Research credits](https://cloud.google.com/edu/researchers). You can apply for similar credits for your research projects. Note that these credits must be used within 12 months.

::::::::::::::::::::::::::::::::::::::::::::::::
@@ -70,10 +72,10 @@ The processing workflow consists of some sequential and parallel steps. We use [A
In this tutorial, the Argo Workflows services are set up using a `kubectl` command. We use the command-line tool `argo` to submit and manage the workflows.
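
As a rough sketch of these two steps (the manifest URL, namespace, and workflow file name are assumptions; follow the lesson's own setup commands):

```bash
# Install the Argo Workflows services into their own namespace
# (the release manifest below is an assumption; use the one given in the lesson)
kubectl create namespace argo
kubectl apply -n argo -f https://github.com/argoproj/argo-workflows/releases/download/v3.5.5/quick-start-minimal.yaml

# Submit a workflow with the argo CLI and watch it run
argo submit -n argo --watch myworkflow.yaml

# List workflows and fetch the logs of the most recent one
argo list -n argo
argo logs -n argo @latest
```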


:::::::::::::::::: checklist

## Ready to go?

:::::::::::::::::: checklist

Check the instructions in [Software setup](index.html#software-setup)

- [ ] a GCP account and a GCP project
4 changes: 2 additions & 2 deletions learners/setup.md
@@ -59,11 +59,11 @@ The example processing workflow is defined as an "Argo workflow". To be able to

This tutorial uses a CMSSW open data container image with the [pfnano producer code](https://opendata.cern.ch/record/12504) downloaded and compiled. You do not need to install Docker to use it in the context of this tutorial. You need Docker if you want to modify the code with your own selections and build a new container image.
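
If you do modify the code, the usual Docker flow looks roughly like the sketch below; the image name, tag, and registry are placeholders, not the lesson's actual values:

```bash
# Build a new container image from a Dockerfile with your modified selections
docker build -t myregistry/pfnano-opendata:mytag .

# Try it out locally before using it in the workflow
docker run -it --rm myregistry/pfnano-opendata:mytag /bin/bash

# Push it to a registry so that the cluster nodes can pull it
docker push myregistry/pfnano-opendata:mytag
```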

### Optional for building an image disk: go
### Building an image disk: go

A secondary boot disk image with the CMSSW container image preinstalled makes the processing workflow step start immediately. Otherwise, the container image needs to be pulled at the start of the processing step, separately on each cluster node.

This disk image can be created and stored using GCP resources. A `go` script is [available](https://github.com/GoogleCloudPlatform/ai-on-gke/tree/main/tools/gke-disk-image-builder) for creating the image. To run it, install `go` following [these instructions](https://go.dev/doc/install) (WSL2 users should follow the Linux instructions).
This disk image can be created and stored using GCP resources. A `go` script is available for creating the image. To run it, install `go` following [these instructions](https://go.dev/doc/install) (WSL2 users should follow the Linux instructions).
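
As a sketch of what running the builder can look like (the flags and values below are assumptions based on the [gke-disk-image-builder](https://github.com/GoogleCloudPlatform/ai-on-gke/tree/main/tools/gke-disk-image-builder) repository; check its README for the current usage):

```bash
# Fetch the disk image builder and run it with go
git clone https://github.com/GoogleCloudPlatform/ai-on-gke.git
cd ai-on-gke/tools/gke-disk-image-builder

# Create a disk image with the container image preloaded
# (project, zone, bucket, image names, and size are placeholders)
go run ./cli \
  --project-name=my-gcp-project \
  --image-name=pfnano-secondary-disk \
  --zone=europe-west4-a \
  --gcs-path=gs://my-scratch-bucket \
  --disk-size-gb=100 \
  --container-image=docker.io/library/example-cmssw-image:latest
```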


