From 27880f73cf0d01ddf22b39cb7ce882a1931de72d Mon Sep 17 00:00:00 2001 From: nuwang <2070605+nuwang@users.noreply.github.com> Date: Sun, 30 Jun 2024 00:05:47 +0530 Subject: [PATCH] Update the intro to k8s tutorial --- .../tutorials/backup-cleanup/slides.html | 2 +- .../k8s-deploying-galaxy/tutorial.md | 251 +++++++----------- 2 files changed, 102 insertions(+), 151 deletions(-) diff --git a/topics/admin/tutorials/backup-cleanup/slides.html b/topics/admin/tutorials/backup-cleanup/slides.html index 28e08bf0855bd4..c81d1cf4b5fc27 100644 --- a/topics/admin/tutorials/backup-cleanup/slides.html +++ b/topics/admin/tutorials/backup-cleanup/slides.html @@ -51,7 +51,7 @@ ??? -- One of the important admimn tasks is to keep an eye on the storage consumption. +- One of the important admin tasks is to keep an eye on the storage consumption. - Every instance should have a data rention policy defined and shared with users. - Codebase contains scripts that can assist with cleaning up and reclaiming space. - The gxadmin tool can assist with invoking these scripts. diff --git a/topics/admin/tutorials/k8s-deploying-galaxy/tutorial.md b/topics/admin/tutorials/k8s-deploying-galaxy/tutorial.md index 001ab951f505c9..84404f0e5ca1b0 100644 --- a/topics/admin/tutorials/k8s-deploying-galaxy/tutorial.md +++ b/topics/admin/tutorials/k8s-deploying-galaxy/tutorial.md @@ -34,37 +34,27 @@ priority: 2 ## Overview - -Galaxy has a minimal number of required dependencies, which makes its basic -installation quick for both users and developers. However, configuring a -multi-user production instance is a complex undertaking due to Galaxy’s many -interacting and dependent systems, components, and configurations. Software -containerization has become the preferred method of addressing deployment -challenges across operating environments. Containerization also requires -orchestration, so that multiple containers can work together to deliver a -complex application. 
[Kubernetes](https://kubernetes.io/) has emerged as the -primary container orchestration technology, as it is both container agnostic and -widely adopted. Kubernetes allows managing, scaling, and deploying different -pieces of an application–in a standardized way–while providing excellent tooling -for doing so. - -In this tutorial, we'll take a look at Kubernetes and [Helm](https://helm.sh/) -as tools for deploying containerized Galaxy. The goals for this model of -deploying Galaxy is to use best-practices from the Galaxy community on how to -deploy the Galaxy application in a well-defined package. This model can simplify -deployment and management burden of running Galaxy. While it is possible to -follow this tutorial by simply copying and pasting supplied commands, and a -production-grade Galaxy will be installed, it is desirable to have a basic -understanding of the container concepts and Kubernetes and Helm technologies. - -Some of the goals for deploying and running Galaxy in this mode include: -- Design a mostly stateless model for running Galaxy where processes can be - horizontally scaled as needed -- Integrate components from the Galaxy project ecosystem to leverage existing - resources -- Provide a unified handling of Galaxy configurations -- Minimize customized dependencies -- Minimize the need to build custom components +This tutorial describes how to use the Galaxy Helm Chart to deploy a production-grade +instance of Galaxy on a Kubernetes cluster. 
The Helm Chart has been designed +to follow best practices adopted by the community, including the usegalaxy.* federation, +and will install a Galaxy with the following features by default: +- Zero-downtime configuration changes and upgrades +- Scalable web and job handlers +- Automatic failure recovery based on liveness and readiness probes +- A built-in nginx for efficiently serving large files +- TUSD for resumable uploads +- Celery for background jobs +- Access to CVMFS reference data +- A toolset matching the usegalaxy.* federation (also served off CVMFS) +- Interactive tools (wildcard DNS mapping required) +- Minimal privileges, with jobs running as non-root and only having access to the datasets they need +- Automatic maintenance scripts to clean up the Galaxy database and tmp directories + +Optionally, the chart can be configured with: +- High-availability components: this includes trivial scaling of clustered Postgres, RabbitMQ, etc. +- Replacement components: you can replace the built-in operators with a managed or existing Postgres database (e.g. Amazon RDS), RabbitMQ cluster, etc. +- S3 as an alternative to CVMFS +- Automatic scraping of metrics, which can be sent to InfluxDB > > @@ -74,11 +64,13 @@ Some of the goals for deploying and running Galaxy in this mode include: {: .agenda} ## Prerequisites -We'll be using the [Galaxy Helm chart] to install and manage a Galaxy -deployment. To be able to use this chart, we'll need access to a Kubernetes -cluster, with Helm installed. For development and testing purposes this can be -easily achieved by installing -[Docker Desktop locally and enabling Kubernetes][DfD]. Afterwards, also install +Some familiarity with Kubernetes is assumed. This includes general administrative +familiarity and knowing how to install and configure Helm charts. + +A running Kubernetes cluster is also required (1.27 or higher), with Helm +(3.5 or higher) configured to access it. 
+For development and testing purposes this can be easily achieved by installing +[Docker Desktop locally and enabling Kubernetes][DfD]. Afterwards, install [Helm](https://helm.sh). For production deployments, we'll also need some storage resources for data @@ -87,41 +79,6 @@ creating a [Persistent Volume and a corresponding Persistent Volume Claim][PV]. Once created, just keep a note of the resources Persistent Volume Claim ID and to use later. -For the CVMFS-enabled version of the chart (more on this below), it is also -necessary to run Kubernetes version 1.13 or newer because we'll be using the -[Container Storage Interface (CSI)][CSI]. - -# Downloading the Galaxy Helm Chart -The Galaxy Helm Chart is currently under active development with enhancements -continuously trickling in. As a result, there are no regular releases yet and -instead we recommend just cloning the GitHub repository with the chart -implementation. This will be the easiest method to keep up with chart changes -for the time being. - -> Download the chart -> -> Clone the chart repository from the machine where you would like to deploy -> Galaxy and change into the chart directory. -> -> {% raw %} -> ```bash -> git clone https://github.com/galaxyproject/galaxy-helm -> cd galaxy-helm/galaxy -> ``` -> {% endraw %} -{: .hands_on} - -# Deploying Galaxy -The Galaxy Helm chart packages best-practice solutions for deploying Galaxy -into a single package that can be readily deployed as a unit. Behind the -scenes, all the supporting services are started and configured into an -interoperable system. Specifically, this involves starting a database service -based on Postgres, using Nginx as a web proxy, and running an independently -scalable set of web and job handler processes for Galaxy. 
This follows -the production-quality deployment recommendation setup for Galaxy and leverages -some of the Kubernetes features to help with running long-term services (e.g., -liveness probes that automatically restart stuck processes). - ## Deploying the Default Configuration The default set of values for the Galaxy chart configures only a minimal set of Galaxy options necessary. The configured options are required for suitable @@ -132,129 +89,123 @@ chart later in this tutorial. > Deploying the Galaxy Helm Chart > -> 1. First, we need to fetch any dependencies for the chart. One of the -> advantages of using Helm is that we can reuse best-practice deployment -> methods for other software right out of the box by relying on published -> charts and integrating them into the Galaxy chart. +> 1. First, we need to add the Helm repository for the chart. The chart is +> automatically packaged, versioned and uploaded to a Helm repository on GitHub +> with each accepted PR. Therefore, the latest version of the chart can be installed +> directly from that repository. > > {% raw %} > ```bash -> helm dependency update +> helm repo add galaxyproject https://raw.githubusercontent.com/galaxyproject/helm-charts/master/ +> helm repo update > ``` > {% endraw %} > -> 2. We can now deploy Galaxy via the Chart. Before running this command make -> sure you are in the chart source code directory (where `values.yaml` file -> resides) and note the trailing dot. Running this command will create a new -> Helm release (i.e., chart installation) called `galaxy`. +> 2. We can now deploy Galaxy via the Chart. Running this command will create a new +> Helm release (i.e., chart installation) called `mygalaxy`. > > {% raw %} > ```bash -> helm install --name galaxy . +> helm install mygalaxy galaxyproject/galaxy > ``` > {% endraw %} > -> 3. It will take about a minute or two for the database to be initialized, -> necessary containers downloaded, and Galaxy processes started. Ultimately, +> 3. 
It will take about a minute or two for the necessary containers to download, +> the database to initialize, and Galaxy processes to start. Ultimately, > while this may depend on the Kubernetes cluster setup you are using, -> Galaxy should be available at the root URI for the given machine. We can +> Galaxy should be available at https:///galaxy for the given machine. We can > always check the status of our release by typing `helm status galaxy`. > {: .hands_on} -## Deploying a CVMFS-enabled Configuration -The Galaxy Helm chart also comes with a more comprehensive set of configuration -options that leverage more of the Galaxy project ecosystem. In practice this -means deploying Galaxy with the same toolset as that of -_[usegalaxy.org](https://usegalaxy.org/)_ right out of the box. It's important to note -that this deployment configuration leverages all the same chart components but -just defines more configuration options. Namely, we attach to the -[Galaxy CVMFS][CVMFS] ready-only file system for retrieving the tool -configurations while leveraging [BioContainers] for resolving tool dependencies. +## Setting the admin user and changing the brand +The chart is designed to follow standard Kubernetes and Helm idioms, so changing +its configuration should feel similar to changing the configuration of +any other Helm chart. For example, ingress paths, resource allocations, container +images, etc. can be changed following standard Helm conventions. The list of +available configuration options is also documented in the Galaxy Helm +_[Chart repository](https://github.com/galaxyproject/galaxy-helm/tree/master?tab=readme-ov-file#configuration)_. -> Deploying the CVMFS-enabled Configuration +To change Galaxy-specific configuration, such as setting the admin user or changing the brand in `galaxy.yml`, +we can follow the steps below. Once done, we will also roll back our change to demonstrate how Helm manages +configuration. 
+ +> Setting admin user and changing the brand +> +> 1. Modify the following entries in your `mygalaxy.yml`. Make sure to add these +> keys under the `configs:` section of the file. > -> 1. If you are following this tutorial sequentially and have a release of -> Galaxy already running, let's delete it (assuming that's fine and you have -> no data to keep). More details about the deletion process are available in -> the [Deleting a Deployment section](#deleting-a-deployed-helm-release). If -> you're just playing around, run `helm delete --purge galaxy`. +> {% raw %} +> ``` +> configs: +> galaxy.yml: +> galaxy: +> brand: "Hello World" +> admin_users: "admin@mydomain.com" +> ``` +> {% endraw %} > -> 2. The CVMFS variant of the Galaxy chart has an additional dependency on the -> [Galaxy CVMFS chart](https://github.com/CloudVE/galaxy-cvmfs-csi-chart). -> We'll deploy this chart into its own [Namespace] to keep its resources -> nicely grouped. We'll also fetch the chart from a packaged chart -> repository instead of its GitHub repo. +> 2. Now, let’s upgrade the chart to apply the new configuration. > > {% raw %} > ```bash -> kubectl create namespace cvmfs -> helm repo add galaxy https://raw.githubusercontent.com/CloudVE/helm-charts/master/ -> helm repo update -> helm install --name cvmfs --namespace cvmfs galaxy/galaxy-cvmfs-csi +> helm upgrade --reuse-values -f mygalaxy.yml mygalaxy galaxyproject/galaxy > ``` > {% endraw %} > -> 3. We can now install the CVMFS-enabled set of values. +> 3. Inspect the currently set Helm values by: > > {% raw %} > ```bash -> helm install --name galaxy galaxy/galaxy +> helm get values mygalaxy > ``` > {% endraw %} > -> 4. Again, it will take a few minutes for Galaxy to start up. This time most of -> the waiting is due to the tool definition files to be cached on CVMFS and -> loaded into the tool panel. We can check the status of the deployment by -> running `helm status galaxy`. 
We can also watch the boot process by tailing -> the logs of the relevant container with a command similar to -> `kubectl logs -f galaxy-web-7568c58b94-hjl9w` where the last argument is -> the name of the desired pod, as printed following the `helm install` -> command. Once the boot process has completed, we can access Galaxy at -> `/galaxy/` URI (note the trailing `/`; it's significant). +> 4. List the installed Helm charts again and note that the revision of the chart has changed as expected. > -{: .hands_on} - -## Deleting a Deployed Helm Release -After we're done experimenting with an installation of the chart, we can just -as easily delete all the resources as we've created them. However, that may -not be desirable so make sure you understand the system you're working on to -avoid undesired surprises. Namely, deleting and recreating a Helm release is -generally not a problem where the processes will just respawn and everything will go back to operational; however, underlying storage configuration may -interfere here with all the application data being potentially lost. This -predominantly depends on how the relevant storage class was configured. - -> Deleting a Deployed Helm Release +> {% raw %} +> ```bash +> helm list +> NAME REVISION UPDATED STATUS CHART APP VERSION NAMESPACE +> mygalaxy 2 Wed Jun 26 14:51:17 2023 DEPLOYED galaxy-5.14.2 v24.0.2 default +> ``` +> {% endraw %} +> +> 5. Revisit the Galaxy Application in your browser to check whether the settings have changed. This will +> take a short while (< 1 minute) for the new container to come up. You should experience no downtime. > -> 1. Before we delete a deployment, let's ensure we understand what will happen -> with the underlying storage used by Galaxy. +> 6. Let’s now roll back to the previous revision. 
> > {% raw %} > ```bash -> $ kubectl get pv -> NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE -> cvmfs-cache-pv 1000Mi RWX Retain Bound cvmfs/cvmfs-cache-pvc manual 31m -> pvc-55806281-96c6-11e9-8e96-0251cc6c62f4 1Gi ROX Delete Bound default/galaxy-cvmfs-gxy-data-pvc cvmfs-gxy-data 28m -> pvc-5580c830-96c6-11e9-8e96-0251cc6c62f4 1Gi ROX Delete Bound default/galaxy-cvmfs-gxy-main-pvc cvmfs-gxy-main 28m -> pvc-55814757-96c6-11e9-8e96-0251cc6c62f4 10Gi RWX Delete Bound default/galaxy-galaxy-pvc nfs-provisioner 28m -> pvc-70d4cc48-96be-11e9-8e96-0251cc6c62f4 8Gi RWO Delete Bound default/data-galaxy-galaxy-postgres-0 nfs-provisioner 84m -> pvc-8cb27bc9-9679-11e9-8e96-0251cc6c62f4 100Gi RWO Delete Bound cloudman/data-nfs-provisioner-0 ebs-provisioner 9h +> helm rollback mygalaxy 1 > ``` > {% endraw %} > -> As we can see in the command output, the storage resources associated with -> the current deployment have the reclaim policy set to `Delete`, which will -> happen once no resources are using the given resource. If what you see is -> the not the intended behavior, you can change the [reclaim policy]. -> -> 2. Once we're ok with the state of the resources and are ready to delete a -> a deployment, we can do so with the following commands: +> Use `helm get values` again to observe that the values have reverted to +> the previous revision. After a short while, once the new container is up +> and running, Kubernetes will automatically switch over to it and you can +> see that the previous configuration has been restored. > +{: .hands_on} + +## Deleting a Deployed Helm Release +By default, the Helm chart is designed to install all required dependencies, so that it's easy +to get an instance up and running quickly for experimentation. 
However, in production, we +recommend installing the dependency charts separately, once per cluster, and installing +Galaxy with the Helm options +`--set postgresql.deploy=false --set s3csi.deploy=false --set cvmfs.deploy=false --set rabbitmq.deploy=false`. + +This is particularly important during uninstallation, where orderly destruction of dependencies is often required. +For example, if the RabbitMQ operator is uninstalled before the rest of the Galaxy Helm chart is deleted, there will be +no operator left to clean up RabbitMQ resources. Installing the aforementioned operators separately sidesteps this problem. + +> Deleting a Deployed Helm Release > > {% raw %} > ```bash -> helm delete --purge galaxy -> helm delete --purge cvmfs +> helm delete mygalaxy +> helm delete mycvmfs # and any other operators > ``` > {% endraw %} >