Kubeflow Pipelines with Tekton - Dojo Day 2 HandsOn

In this hands-on session we look at the KFP-Tekton project in detail: how to compile Kubeflow Pipelines to Tekton YAML, and how to run the compiled pipeline with Tekton on a Kubernetes cluster.

Prerequisites

  1. Python: version 3.5 or later
  2. Kubernetes Cluster: version 1.15 (required by Kubeflow and Tekton 0.11)
  3. kubectl CLI: required to deploy Tekton pipelines to Kubernetes cluster
  4. Tekton Deployment: version 0.13.0 or later (needed for Tekton API version v1beta1); required for end-to-end testing
  5. tkn CLI: required to work with Tekton pipelines
  6. Kubeflow Pipelines Deployment: required for some end-to-end tests
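The minimum versions listed above can be checked with a few lines of Python before starting. This is a minimal sketch and not part of kfp-tekton: the helper names `parse_version` and `meets_minimum` are hypothetical, and only the thresholds (Python 3.5, Tekton 0.13.0) come from the list above. It assumes plain dotted version strings with the same number of components.

```python
import sys


def parse_version(text):
    """Turn a version string such as 'v0.13.0' into a tuple of ints, e.g. (0, 13, 0)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def meets_minimum(found, required):
    """Return True if the `found` version is at least the `required` version."""
    return parse_version(found) >= parse_version(required)


# Python itself: the prerequisites ask for 3.5 or later.
python_ok = sys.version_info >= (3, 5)

# Tekton: the version string would come from your own cluster;
# "v0.13.0" is simply the release installed later in this guide.
tekton_ok = meets_minimum("v0.13.0", "0.13.0")

print("Python OK:", python_ok)
print("Tekton OK:", tekton_ok)
```

Comparing tuples of integers avoids the pitfall of comparing version strings lexically, where "0.9.0" would sort after "0.13.0".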

Installing Tekton

A working Tekton cluster deployment is required to perform end-to-end tests of the pipelines generated by the kfp_tekton compiler. The Tekton CLI is useful to start a pipeline and analyze the pipeline logs.

Tekton Cluster

Follow the instructions listed here or simply run:

kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/previous/v0.13.0/release.yaml

Note: if your container runtime does not support image references of the form image-reference:tag@digest (for example cri-o, as used in OpenShift 4.x), use release.notags.yaml instead.

Note: KFP components expect their default working directory to be left unchanged. Run the following command to disable Tekton's default overwrite of the home environment variable and the working directory:

kubectl patch cm feature-flags -n tekton-pipelines \
    -p '{"data":{"disable-home-env-overwrite":"true","disable-working-directory-overwrite":"true"}}'
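The `-p` argument above is a JSON document whose `data` keys are merged into the feature-flags ConfigMap. Hand-quoting JSON inside a shell command is easy to get wrong, so the sketch below builds the same payload with `json.dumps` and prints a shell-safe command line. It is an illustration only: it prints the command and does not run it, and the variable names are assumptions.

```python
import json
import shlex

# Flag names and values are copied from the kubectl command above.
# ConfigMap values must be strings, so "true" is a quoted string, not a JSON boolean.
feature_flags = {
    "disable-home-env-overwrite": "true",
    "disable-working-directory-overwrite": "true",
}

# Compact separators reproduce the payload exactly as written in the command above.
patch = json.dumps({"data": feature_flags}, separators=(",", ":"))

command = [
    "kubectl", "patch", "cm", "feature-flags",
    "-n", "tekton-pipelines",
    "-p", patch,
]

# shlex.quote wraps the JSON so the printed line can be pasted into a shell.
print(" ".join(shlex.quote(arg) for arg in command))
```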

Optionally, for convenience, set the default namespace to tekton-pipelines:

kubectl config set-context --current --namespace=tekton-pipelines

Tekton CLI

Follow the instructions here.

macOS users can install the Tekton CLI using the Homebrew formula:

brew tap tektoncd/tools
brew install tektoncd/tools/tektoncd-cli

Tekton Dashboard

Follow the installation instructions here, or run:

kubectl apply -f https://github.com/tektoncd/dashboard/releases/download/v0.7.1/tekton-dashboard-release.yaml

The Tekton Dashboard can be accessed through its ClusterIP service by running kubectl proxy. Alternatively, patch the service to expose it on a public NodePort:

kubectl patch svc tekton-dashboard -n tekton-pipelines --type='json' -p '[{"op":"replace","path":"/spec/type","value":"NodePort"}]'

To open the dashboard, run the following (the open command is specific to macOS; on Linux, xdg-open is the usual equivalent):

TKN_DASHBOARD_SVC_PORT=$(kubectl -n tekton-pipelines get service tekton-dashboard -o jsonpath='{.spec.ports[0].nodePort}')
PUBLIC_IP=$(kubectl get nodes -o jsonpath='{.items[0].status.addresses[?(@.type=="ExternalIP")].address}')
open "http://${PUBLIC_IP}:${TKN_DASHBOARD_SVC_PORT}/#/pipelineruns"
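The two jsonpath expressions above pick the service's first nodePort and the first node's ExternalIP address. The same lookup can be sketched in Python against the parsed output of `kubectl ... -o json`. The function names and the sample data are hypothetical (203.0.113.10 is a documentation-only address, and the port numbers are arbitrary); only the field paths and the URL shape come from the commands above.

```python
def external_ip(nodes):
    """Return the first ExternalIP address of the first node, or None if absent."""
    addresses = nodes["items"][0]["status"]["addresses"]
    for entry in addresses:
        if entry["type"] == "ExternalIP":
            return entry["address"]
    return None


def dashboard_url(nodes, service):
    """Build the PipelineRuns URL from parsed `kubectl ... -o json` output."""
    ip = external_ip(nodes)
    if ip is None:
        raise ValueError("first node has no ExternalIP address")
    port = service["spec"]["ports"][0]["nodePort"]
    return "http://{}:{}/#/pipelineruns".format(ip, port)


# Hypothetical sample data standing in for real cluster output.
sample_nodes = {"items": [{"status": {"addresses": [
    {"type": "InternalIP", "address": "10.0.0.5"},
    {"type": "ExternalIP", "address": "203.0.113.10"},
]}}]}
sample_service = {"spec": {"ports": [{"port": 9097, "nodePort": 32321}]}}

print(dashboard_url(sample_nodes, sample_service))
```

Raising an error when no ExternalIP exists makes a common failure visible: on clusters whose nodes have only internal addresses, the shell version silently produces a URL with an empty host.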

Install KFP-Tekton Compiler

  1. Clone the kfp-tekton repo:

    git clone https://github.com/kubeflow/kfp-tekton.git
    cd kfp-tekton
    
  2. Set up a Python virtual environment:

    python3 -m venv .venv
    source .venv/bin/activate
    
  3. Install the kfp_tekton compiler:

    pip install -e sdk/python
    
  4. Run the compiler tests (optional):

    make test
    

Compiling a Kubeflow Pipelines DSL Script

The kfp-tekton Python package provides the dsl-compile-tekton command line executable, which should be available in your shell once the package is installed.

If you cloned the kfp-tekton project, you can find example pipelines in the samples folder or under the sdk/python/tests/compiler/testdata folder.

dsl-compile-tekton \
    --py sdk/python/tests/compiler/testdata/parallel_join.py \
    --output pipeline.yaml
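When compiling several DSL scripts, it can be convenient to drive the compiler from Python. The sketch below wraps the command shown above, using only the `--py` and `--output` flags that appear in this guide. The function names are hypothetical, and the top-level code only prints the command rather than running it.

```python
import shutil
import subprocess


def build_compile_command(dsl_script, output="pipeline.yaml"):
    """Return the argv list for dsl-compile-tekton, using only --py and --output."""
    return ["dsl-compile-tekton", "--py", dsl_script, "--output", output]


def compile_pipeline(dsl_script, output="pipeline.yaml"):
    """Run the compiler if it is on PATH.

    Returns False if the executable is missing, True after a successful run.
    A failing compile raises subprocess.CalledProcessError.
    """
    if shutil.which("dsl-compile-tekton") is None:
        return False
    subprocess.run(build_compile_command(dsl_script, output), check=True)
    return True


command = build_compile_command("sdk/python/tests/compiler/testdata/parallel_join.py")
print(" ".join(command))
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with paths that contain spaces.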

Running the Pipeline on a Tekton Cluster

After compiling the sdk/python/tests/compiler/testdata/parallel_join.py DSL script in the step above, we need to deploy the generated Tekton YAML to our Kubernetes cluster with kubectl. The Tekton server will automatically start a pipeline run for which we can follow the logs using the tkn CLI.

We deploy the pipeline in the kubeflow namespace because pipelines with metadata and artifact tracking rely on the MinIO object storage credentials stored in that namespace.

kubectl apply -f pipeline.yaml -n kubeflow

tkn pipelinerun logs --last -n kubeflow

Once the Tekton Pipeline is running, the logs should start streaming:

Waiting for logs to be available...

[gcs-download : main] With which he yoketh your rebellious necks Razeth your cities and subverts your towns And in a moment makes them desolate

[gcs-download : copy-artifacts] Added `storage` successfully.
[gcs-download : copy-artifacts] tar: removing leading '/' from member names
[gcs-download : copy-artifacts] tekton/results/data
[gcs-download : copy-artifacts] `data.tgz` -> `storage/mlpipeline/artifacts/parallel-pipeline/gcs-download/data.tgz`
[gcs-download : copy-artifacts] Total: 0 B, Transferred: 195 B, Speed: 1 B/s

[gcs-download-2 : main] I find thou art no less than fame hath bruited And more than may be gatherd by thy shape Let my presumption not provoke thy wrath

[gcs-download-2 : copy-artifacts] Added `storage` successfully.
[gcs-download-2 : copy-artifacts] tar: removing leading '/' from member names
[gcs-download-2 : copy-artifacts] tekton/results/data
[gcs-download-2 : copy-artifacts] `data.tgz` -> `storage/mlpipeline/artifacts/parallel-pipeline/gcs-download-2/data.tgz`
[gcs-download-2 : copy-artifacts] Total: 0 B, Transferred: 205 B, Speed: 1 B/s

[echo : main] Text 1: With which he yoketh your rebellious necks Razeth your cities and subverts your towns And in a moment makes them desolate
[echo : main]
[echo : main] Text 2: I find thou art no less than fame hath bruited And more than may be gatherd by thy shape Let my presumption not provoke thy wrath
[echo : main]
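Each log line above is prefixed with `[task : step]`, and output from different tasks may be interleaved. To inspect one task at a time, the lines can be grouped by that prefix. The following is a minimal sketch that assumes the prefix format seen in the output above; the regular expression and the function name are illustrative and not part of the tkn CLI.

```python
import re
from collections import defaultdict

# Matches the "[task : step] message" prefix seen in the output above.
LOG_LINE = re.compile(
    r"^\[(?P<task>[^\]:]+?) : (?P<step>[^\]]+)\]\s?(?P<message>.*)$"
)


def group_by_task(lines):
    """Group log messages by (task, step), skipping lines without the prefix."""
    grouped = defaultdict(list)
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            key = (match.group("task"), match.group("step"))
            grouped[key].append(match.group("message"))
    return dict(grouped)


# A few lines copied from the output above.
sample = [
    "Waiting for logs to be available...",
    "[gcs-download : copy-artifacts] Added `storage` successfully.",
    "[gcs-download : copy-artifacts] tekton/results/data",
    "[echo : main]",
]

for (task, step), messages in group_by_task(sample).items():
    print(task, step, len(messages))
```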

Find the PipelineRun in the Tekton Dashboard

  1. From the Terminal, run the following commands to open the PipelineRuns on the Tekton dashboard:

    TKN_UI_PORT=$(kubectl get service tekton-dashboard -n tekton-pipelines -o jsonpath='{.spec.ports[0].nodePort}')
    PUBLIC_IP=$(kubectl get nodes -o jsonpath='{.items[0].status.addresses[?(@.type=="ExternalIP")].address}')
    open "http://${PUBLIC_IP}:${TKN_UI_PORT}/#/pipelineruns"
  2. The Tekton Dashboard should open with the PipelineRuns tab selected.

  3. Click on parallel-pipeline in the table and select the individual tasks to see their log output.