In this hands-on tutorial we take a detailed look at the KFP-Tekton project: how to compile Kubeflow Pipelines to Tekton YAML and how to run the resulting pipeline with Tekton on a Kubernetes cluster.
- Python: version 3.5 or later
- Kubernetes Cluster: version 1.15 (required by Kubeflow and Tekton 0.11)
- kubectl CLI: required to deploy Tekton pipelines to the Kubernetes cluster
- Tekton Deployment: version 0.13.0 (or greater, to support Tekton API version v1beta1), required for end-to-end testing
- tkn CLI: required to work with Tekton pipelines
- Kubeflow Pipelines Deployment: required for some end-to-end tests
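The minimum versions listed above can be checked with a few lines of Python. The following is only a sketch: the helper names parse_version and meets_minimum are hypothetical, and the version strings passed in are made-up examples of what the respective tools might report, not output captured from a real cluster.

```python
import re
import sys


def parse_version(text, width=3):
    """Return the first dotted number in `text` as a tuple padded to `width` parts."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError("no version number found in {!r}".format(text))
    parts = [int(part) for part in match.group(0).split(".")]
    # Pad so that "1.15" and "1.15.0" compare as equal.
    parts += [0] * (width - len(parts))
    return tuple(parts[:width])


def meets_minimum(found, minimum):
    """True if the `found` version string is at least the `minimum` version."""
    return parse_version(found) >= parse_version(minimum)


# Minimum versions taken from the prerequisites list.
python_ok = sys.version_info[:2] >= (3, 5)
# Hypothetical version strings, for illustration only.
kubernetes_ok = meets_minimum("v1.18.3", "1.15")
tekton_ok = meets_minimum("v0.13.0", "0.13.0")
print(python_ok, kubernetes_ok, tekton_ok)
```

Padding the tuples matters because Python compares tuples element by element, so without it (1, 15) would sort below (1, 15, 0).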
A working Tekton cluster deployment is required to perform end-to-end tests of the pipelines generated by the kfp_tekton compiler. The Tekton CLI is useful to start a pipeline and analyze the pipeline logs.
To install Tekton Pipelines on the cluster, follow the instructions listed here or simply run:
kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/previous/v0.13.0/release.yaml
Note: if your container runtime does not support image-reference:tag@digest (like cri-o used in OpenShift 4.x), use release.notags.yaml instead.
Note: for KFP, the default working directory of a component must not be modified. Therefore, run the following command to disable Tekton's default overwrite of the home and working directories:
kubectl patch cm feature-flags -n tekton-pipelines \
-p '{"data":{"disable-home-env-overwrite":"true","disable-working-directory-overwrite":"true"}}'
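The patch argument above is a JSON document that sets two keys in the data section of the feature-flags ConfigMap. As a rough sketch, the same payload can be generated in Python rather than typed by hand; the helper name feature_flag_patch is hypothetical, and the only flag names used are the two from the command above.

```python
import json


def feature_flag_patch(flags):
    """Build the JSON payload for `kubectl patch cm feature-flags -p ...`.

    ConfigMap data values are strings, so booleans are written as
    "true" / "false".
    """
    data = {name: str(enabled).lower() for name, enabled in flags.items()}
    return json.dumps({"data": data}, separators=(",", ":"))


patch = feature_flag_patch({
    "disable-home-env-overwrite": True,
    "disable-working-directory-overwrite": True,
})
print(patch)
```

The printed string can be passed to the -p option of kubectl patch in place of the hand-written JSON.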
Optionally, for convenience, set the default namespace to tekton-pipelines:
kubectl config set-context --current --namespace=tekton-pipelines
To install the Tekton CLI (tkn), follow the instructions here.
macOS users can install the Tekton CLI using the Homebrew formula:
brew tap tektoncd/tools
brew install tektoncd/tools/tektoncd-cli
To install the Tekton Dashboard, follow the installation instructions here, i.e.:
kubectl apply -f https://github.com/tektoncd/dashboard/releases/download/v0.7.1/tekton-dashboard-release.yaml
The Tekton Dashboard can be accessed through its ClusterIP service by running kubectl proxy, or the service can be patched to type NodePort so that it is reachable on a port of the node's public IP:
kubectl patch svc tekton-dashboard -n tekton-pipelines --type='json' -p '[{"op":"replace","path":"/spec/type","value":"NodePort"}]'
To open the dashboard, run the following (the open command is macOS-specific; on Linux, xdg-open is the usual equivalent):
TKN_DASHBOARD_SVC_PORT=$(kubectl -n tekton-pipelines get service tekton-dashboard -o jsonpath='{.spec.ports[0].nodePort}')
PUBLIC_IP=$(kubectl get nodes -o jsonpath='{.items[0].status.addresses[?(@.type=="ExternalIP")].address}')
open "http://${PUBLIC_IP}:${TKN_DASHBOARD_SVC_PORT}/#/pipelineruns"
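The two jsonpath expressions above pick the nodePort of the first service port and the ExternalIP address of the first node. The sketch below mirrors that lookup in Python on data shaped like the output of kubectl get ... -o json. The function names are hypothetical, and the sample IP address and port are invented for illustration.

```python
def external_ip(nodes):
    """Mirror .items[0].status.addresses[?(@.type=="ExternalIP")].address."""
    for address in nodes["items"][0]["status"]["addresses"]:
        if address["type"] == "ExternalIP":
            return address["address"]
    return None  # the first node has no ExternalIP entry


def dashboard_url(nodes, service):
    """Build the PipelineRuns URL from node and service descriptions."""
    ip = external_ip(nodes)
    if ip is None:
        raise LookupError("first node has no ExternalIP address")
    port = service["spec"]["ports"][0]["nodePort"]
    return "http://{}:{}/#/pipelineruns".format(ip, port)


# Hypothetical sample data, trimmed to the fields that are actually read.
sample_nodes = {"items": [{"status": {"addresses": [
    {"type": "InternalIP", "address": "10.0.0.5"},
    {"type": "ExternalIP", "address": "203.0.113.10"},
]}}]}
sample_service = {"spec": {"ports": [{"nodePort": 32123}]}}

print(dashboard_url(sample_nodes, sample_service))
```

The explicit LookupError covers the case where the node reports no ExternalIP address. In that case the shell version would most likely leave PUBLIC_IP empty and produce an unusable URL.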
- Clone the kfp-tekton repo:
  git clone https://github.com/kubeflow/kfp-tekton.git
  cd kfp-tekton
- Set up a Python virtual environment:
  python3 -m venv .venv
  source .venv/bin/activate
- Install the kfp_tekton compiler:
  pip install -e sdk/python
- Run the compiler tests (optional):
  make test
The kfp-tekton Python package comes with the dsl-compile-tekton command line executable, which should be available in your terminal shell environment after installing the package.
If you cloned the kfp-tekton project, you can find example pipelines in the samples folder or under the sdk/python/tests/compiler/testdata folder.
dsl-compile-tekton \
--py sdk/python/tests/compiler/testdata/parallel_join.py \
--output pipeline.yaml
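To compile several DSL scripts in one go, the command above can be wrapped in a small Python helper. This is a minimal sketch that uses only the --py and --output options shown above. The function names and the convention of naming each output file after its script are assumptions made for illustration.

```python
import os
import subprocess


def compile_command(py_path, output_dir="."):
    """Return the dsl-compile-tekton argument list for one DSL script."""
    name = os.path.splitext(os.path.basename(py_path))[0]
    output = os.path.join(output_dir, name + ".yaml")
    return ["dsl-compile-tekton", "--py", py_path, "--output", output]


def compile_all(py_paths, output_dir="."):
    """Compile each script in turn, stopping at the first failure."""
    for py_path in py_paths:
        subprocess.run(compile_command(py_path, output_dir), check=True)


print(compile_command("sdk/python/tests/compiler/testdata/parallel_join.py"))
```

Calling compile_all requires dsl-compile-tekton to be on the PATH, which is the case inside the virtual environment once the kfp-tekton package is installed.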
After compiling the sdk/python/tests/compiler/testdata/parallel_join.py DSL script in the step above, we need to deploy the generated Tekton YAML to our Kubernetes cluster with kubectl. The Tekton server will automatically start a pipeline run, whose logs we can follow using the tkn CLI.
The pipeline has to be deployed in the kubeflow namespace, because all pipelines with metadata and artifact tracking rely on the MinIO object storage credentials stored in that namespace.
kubectl apply -f pipeline.yaml -n kubeflow
tkn pipelinerun logs --last -n kubeflow
Once the Tekton Pipeline is running, the logs should start streaming:
Waiting for logs to be available...
[gcs-download : main] With which he yoketh your rebellious necks Razeth your cities and subverts your towns And in a moment makes them desolate
[gcs-download : copy-artifacts] Added `storage` successfully.
[gcs-download : copy-artifacts] tar: removing leading '/' from member names
[gcs-download : copy-artifacts] tekton/results/data
[gcs-download : copy-artifacts] `data.tgz` -> `storage/mlpipeline/artifacts/parallel-pipeline/gcs-download/data.tgz`
[gcs-download : copy-artifacts] Total: 0 B, Transferred: 195 B, Speed: 1 B/s
[gcs-download-2 : main] I find thou art no less than fame hath bruited And more than may be gatherd by thy shape Let my presumption not provoke thy wrath
[gcs-download-2 : copy-artifacts] Added `storage` successfully.
[gcs-download-2 : copy-artifacts] tar: removing leading '/' from member names
[gcs-download-2 : copy-artifacts] tekton/results/data
[gcs-download-2 : copy-artifacts] `data.tgz` -> `storage/mlpipeline/artifacts/parallel-pipeline/gcs-download-2/data.tgz`
[gcs-download-2 : copy-artifacts] Total: 0 B, Transferred: 205 B, Speed: 1 B/s
[echo : main] Text 1: With which he yoketh your rebellious necks Razeth your cities and subverts your towns And in a moment makes them desolate
[echo : main]
[echo : main] Text 2: I find thou art no less than fame hath bruited And more than may be gatherd by thy shape Let my presumption not provoke thy wrath
[echo : main]
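In the output above, each line is prefixed with the task and step that produced it, in the form [task : step]. When the interleaved stream gets long, it can help to regroup the lines per task and step. The following sketch does that in Python. The line format is inferred from the sample output above, not from a documented tkn guarantee, and the function name is hypothetical.

```python
import re
from collections import OrderedDict

# "[task : step] message", as seen in the sample output above.
LOG_LINE = re.compile(r"^\[(?P<task>[^\]:]+?) : (?P<step>[^\]]+)\] ?(?P<message>.*)$")


def group_by_task(lines):
    """Group log messages by (task, step), keeping first-seen order."""
    grouped = OrderedDict()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # e.g. "Waiting for logs to be available..."
        key = (match.group("task"), match.group("step"))
        grouped.setdefault(key, []).append(match.group("message"))
    return grouped


sample = [
    "Waiting for logs to be available...",
    "[gcs-download : copy-artifacts] Added `storage` successfully.",
    "[gcs-download-2 : copy-artifacts] tekton/results/data",
    "[gcs-download : copy-artifacts] tekton/results/data",
    "[echo : main]",
]
for (task, step), messages in group_by_task(sample).items():
    print(task, step, len(messages))
```

Lines that do not carry a [task : step] prefix, such as the initial waiting message, are skipped rather than attached to a task.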
- From the terminal, run the following commands to open the PipelineRuns on the Tekton dashboard:
  TKN_UI_PORT=$(kubectl get service tekton-dashboard -n tekton-pipelines -o jsonpath='{.spec.ports[0].nodePort}')
  PUBLIC_IP=$(kubectl get nodes -o jsonpath='{.items[0].status.addresses[?(@.type=="ExternalIP")].address}')
  open "http://${PUBLIC_IP}:${TKN_UI_PORT}/#/pipelineruns"
- The Tekton Dashboard should open with the PipelineRuns tab selected.
- Click on the parallel-pipeline in the table and select the individual tasks to see the log output.