In this hands-on we take a detailed look at the KFP-Tekton project: how to compile Kubeflow Pipelines to Tekton YAML and how to run the resulting pipeline with Tekton on a Kubernetes cluster.
- Prerequisites
- Compiling a Kubeflow Pipelines DSL Script
- Running the Pipeline on a Tekton Cluster
- Finding the PipelineRun in the Tekton Dashboard
- Optional: Compiling to Argo YAML (KFP Default)
- Python: version `3.7` or later
- Kubernetes Cluster: version `1.20` (required by Kubeflow and Tekton 0.30)
- `kubectl` CLI: required to deploy Tekton pipelines to the Kubernetes cluster
- Tekton Deployment: version `0.30.0` (or greater, to support Tekton API version `v1beta1`), required for end-to-end testing
- `tkn` CLI: required to work with Tekton pipelines
- Kubeflow Pipelines Deployment: required for some end-to-end tests
A working Tekton cluster deployment is required to perform end-to-end tests of the pipelines generated by the `kfp_tekton` compiler. The Tekton CLI is useful for starting a pipeline and analyzing its logs.
Follow the instructions listed here or simply run:
kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/previous/v0.30.0/release.yaml
Note: if your container runtime does not support `image-reference:tag@digest` (like `cri-o` used in OpenShift 4.x), use `release.notags.yaml` instead.
Enable the custom task controller and the other feature flags required by `kfp-tekton`:
kubectl patch cm feature-flags -n tekton-pipelines \
-p '{"data":{"enable-custom-tasks": "true", "enable-api-fields": "alpha"}}'
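For reference, the `-p` argument above is a JSON merge patch that merges new keys into the ConfigMap's `data` section without touching other entries. The same payload can be produced programmatically — a minimal sketch using only the Python standard library:

```python
import json

# The JSON merge patch passed to `kubectl patch` above: these keys are
# merged into the ConfigMap's `data` section, leaving other entries intact.
patch = {"data": {"enable-custom-tasks": "true", "enable-api-fields": "alpha"}}
print(json.dumps(patch))
```

Building the string with `json.dumps` avoids quoting mistakes when the patch grows beyond two keys.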
Optionally, for convenience, set the default namespace to `tekton-pipelines`:
kubectl config set-context --current --namespace=tekton-pipelines
Follow the instructions here.
macOS users can install the Tekton CLI using the Homebrew formula:
brew tap tektoncd/tools
brew install tektoncd/tools/tektoncd-cli
Follow the installation instructions here, i.e.:
kubectl apply --filename https://storage.googleapis.com/tekton-releases/dashboard/latest/tekton-dashboard-release.yaml
The Tekton Dashboard can be accessed through its `ClusterIP` service by running `kubectl proxy`, or the service can be patched to expose a public `NodePort` IP:
kubectl patch svc tekton-dashboard -n tekton-pipelines --type='json' -p '[{"op":"replace","path":"/spec/type","value":"NodePort"}]'
To open the dashboard, run:
TKN_UI_PORT=$(kubectl -n tekton-pipelines get service tekton-dashboard -o jsonpath='{.spec.ports[0].nodePort}')
PUBLIC_IP=$(kubectl get nodes -o jsonpath='{.items[0].status.addresses[?(@.type=="ExternalIP")].address}')
open "http://${PUBLIC_IP}:${TKN_UI_PORT}/#/pipelineruns"
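The two lookups above just fill in a URL template. As an illustration, with stand-in values where the shell reads them from the cluster, the assembled address looks like this:

```python
# Stand-in values: the shell commands above read these from the cluster.
public_ip = "192.0.2.10"   # ExternalIP of the first node
node_port = 31234          # nodePort of the tekton-dashboard service

# The same URL shape the `open` command receives; the fragment
# lands directly on the PipelineRuns view of the dashboard.
url = f"http://{public_ip}:{node_port}/#/pipelineruns"
print(url)
```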
- Clone the `kfp-tekton` repo:

  git clone https://github.com/kubeflow/kfp-tekton.git
  cd kfp-tekton

- Set up a Python virtual environment:

  python3 -m venv .venv
  source .venv/bin/activate

- Install the `kfp_tekton` compiler:

  pip install -e sdk/python

- Run the compiler tests (optional):

  make test
The `kfp-tekton` Python package comes with the `dsl-compile-tekton` command line executable, which becomes available in your terminal shell environment once the package is installed.
If you cloned the `kfp-tekton` project, you can find example pipelines in the `samples` folder or under the `sdk/python/tests/compiler/testdata` folder.
dsl-compile-tekton \
--py sdk/python/tests/compiler/testdata/parallel_join.py \
--output pipeline.yaml
After compiling the `sdk/python/tests/compiler/testdata/parallel_join.py` DSL script in the step above, we need to deploy the generated Tekton YAML to our Kubernetes cluster with `kubectl`. The Tekton server will automatically start a pipeline run, whose logs we can follow with the `tkn` CLI.
The pipeline has to be deployed in the `kubeflow` namespace, because pipelines with metadata and artifact tracking rely on the MinIO object storage credentials stored in that namespace.
kubectl apply -f pipeline.yaml -n kubeflow
tkn pipelinerun logs --last -n kubeflow
Once the Tekton Pipeline is running, the logs should start streaming:
Waiting for logs to be available...
[gcs-download : main] With which he yoketh your rebellious necks Razeth your cities and subverts your towns And in a moment makes them desolate
[gcs-download : copy-artifacts] Added `storage` successfully.
[gcs-download : copy-artifacts] tar: removing leading '/' from member names
[gcs-download : copy-artifacts] tekton/results/data
[gcs-download : copy-artifacts] `data.tgz` -> `storage/mlpipeline/artifacts/parallel-pipeline/gcs-download/data.tgz`
[gcs-download : copy-artifacts] Total: 0 B, Transferred: 195 B, Speed: 1 B/s
[gcs-download-2 : main] I find thou art no less than fame hath bruited And more than may be gatherd by thy shape Let my presumption not provoke thy wrath
[gcs-download-2 : copy-artifacts] Added `storage` successfully.
[gcs-download-2 : copy-artifacts] tar: removing leading '/' from member names
[gcs-download-2 : copy-artifacts] tekton/results/data
[gcs-download-2 : copy-artifacts] `data.tgz` -> `storage/mlpipeline/artifacts/parallel-pipeline/gcs-download-2/data.tgz`
[gcs-download-2 : copy-artifacts] Total: 0 B, Transferred: 205 B, Speed: 1 B/s
[echo : main] Text 1: With which he yoketh your rebellious necks Razeth your cities and subverts your towns And in a moment makes them desolate
[echo : main]
[echo : main] Text 2: I find thou art no less than fame hath bruited And more than may be gatherd by thy shape Let my presumption not provoke thy wrath
[echo : main]
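The run above fans out two `gcs-download` tasks and joins their results in `echo`. Stripped of Tekton, the data flow amounts to the following schematic sketch (plain Python, not KFP DSL; function names mirror the task names, and the URLs are illustrative):

```python
def gcs_download(url: str) -> str:
    # Stand-in for the pipeline's download component.
    return f"<contents of {url}>"

def echo(text1: str, text2: str) -> str:
    # Stand-in for the join step that prints both downloaded texts.
    return f"Text 1: {text1}\nText 2: {text2}"

# Two independent downloads run in parallel, then a single join step:
# the shape of the parallel_join sample.
a = gcs_download("gs://sample-bucket/text1.txt")
b = gcs_download("gs://sample-bucket/text2.txt")
print(echo(a, b))
```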
- From the terminal, run the following commands to open the PipelineRuns page in the Tekton Dashboard:

  TKN_UI_PORT=$(kubectl get service tekton-dashboard -n tekton-pipelines -o jsonpath='{.spec.ports[0].nodePort}')
  PUBLIC_IP=$(kubectl get nodes -o jsonpath='{.items[0].status.addresses[?(@.type=="ExternalIP")].address}')
  open "http://${PUBLIC_IP}:${TKN_UI_PORT}/#/pipelineruns"
-
The Tekton Dashboard should open with the PipelineRuns tab selected:
-
Click on `parallel-pipeline` in the table and select the individual tasks to see the log output:
If a Tekton cluster deployment is not available, compiling the Kubeflow Pipelines DSL scripts to Argo YAML works very similarly to the compilation step described above. Instead of the `dsl-compile-tekton` command, use the `dsl-compile` executable, which should be available in your terminal shell environment after installing either the `kfp-tekton` or the `kfp` Python package. The output should be a `.tar.gz` file so that the compiled pipeline can be uploaded to the Kubeflow Pipelines web interface.
dsl-compile \
--py sdk/python/tests/compiler/testdata/parallel_join.py \
--output pipeline.tar.gz
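The `.tar.gz` produced here is an ordinary gzipped tarball wrapping the compiled workflow YAML, so it can be inspected with the Python standard library. A sketch with a stand-in archive (the member name and contents are illustrative; in practice you would open the real `pipeline.tar.gz`):

```python
import io
import tarfile

# Build a stand-in archive shaped like dsl-compile's output:
# a gzipped tarball containing the workflow YAML.
yaml_bytes = b"apiVersion: argoproj.io/v1alpha1\nkind: Workflow\n"
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:gz") as tar:
    info = tarfile.TarInfo(name="pipeline.yaml")
    info.size = len(yaml_bytes)
    tar.addfile(info, io.BytesIO(yaml_bytes))

# List the archive members, the same way you could inspect the real file.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r:gz") as tar:
    names = tar.getnames()
print(names)
```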
Take a look at the Kubeflow Pipelines documentation to learn more about compiling Kubeflow Pipeline samples on the command line and read through the Pipelines Quickstart tutorial to learn about using the Kubeflow Pipelines web interface.