Workflow based on the ML training presented in Orion and RMMap.
In this workflow we train a random forest model using the MNIST 10k dataset.
To run the workflow on Faasm, first upload the images:
faasmctl s3.clear-bucket --bucket ${BUCKET_NAME}
# Upload all data files in the directory
faasmctl s3.upload-dir \
--bucket ${BUCKET_NAME} \
--host-path ${PROJ_ROOT}/datasets/ml-training/mnist-images-2k \
--s3-path ml-training/mnist-images-2k
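Before uploading, it can help to confirm that the bucket name is set and that the dataset directory actually contains files. The following is a minimal sketch only: `check_upload_inputs` is a hypothetical helper, not part of `faasmctl`, and it assumes the dataset is stored as regular files under the host path.

```shell
# Hypothetical pre-flight helper (not part of faasmctl).
# Prints the number of files found, or returns non-zero with a message.
check_upload_inputs() {
    local bucket="$1" dataset_dir="$2" n
    if [ -z "${bucket}" ]; then
        echo "bucket name is empty" >&2
        return 1
    fi
    if [ ! -d "${dataset_dir}" ]; then
        echo "dataset directory not found: ${dataset_dir}" >&2
        return 1
    fi
    # Count regular files so that an empty directory is caught early.
    n=$(find "${dataset_dir}" -type f | wc -l | tr -d ' ')
    if [ "${n}" -eq 0 ]; then
        echo "dataset directory is empty: ${dataset_dir}" >&2
        return 1
    fi
    echo "${n}"
}
```

For example, running `check_upload_inputs "${BUCKET_NAME}" "${PROJ_ROOT}/datasets/ml-training/mnist-images-2k"` before the upload command reports how many files would be uploaded.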
Second, upload the WASM files for each stage in the workflow:
faasmctl upload.workflow \
ml-training \
faasm.azurecr.io/tless-experiments:$(cat ${PROJ_ROOT}/VERSION):/usr/local/faasm/wasm/ml-training
Lastly, invoke the driver function to trigger workflow execution with 2 PCA functions and 8 random forest trees:
faasmctl invoke ml-training driver --cmdline "ml-training/mnist-images-2k 2 8"
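Judging from the example above, the `--cmdline` string appears to hold three space-separated fields: the S3 path of the dataset, the number of PCA functions, and the number of random forest trees. As a sketch under that assumption, a small wrapper can build the string and reject non-numeric counts before the driver is invoked. `ml_training_cmdline` is a hypothetical name introduced here for illustration.

```shell
# Hypothetical helper: build the driver's command line string.
# Assumed field order: <s3 path> <num PCA functions> <num trees>.
ml_training_cmdline() {
    local s3_path="$1" num_pca="$2" num_trees="$3" n
    for n in "${num_pca}" "${num_trees}"; do
        case "${n}" in
            # Reject empty values and anything containing a non-digit.
            ''|*[!0-9]*)
                echo "expected a non-negative integer, got '${n}'" >&2
                return 1
                ;;
        esac
    done
    echo "${s3_path} ${num_pca} ${num_trees}"
}
```

It could then be used as `faasmctl invoke ml-training driver --cmdline "$(ml_training_cmdline ml-training/mnist-images-2k 2 8)"`.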
Training on the full 10k images inside SGX can take almost 30 minutes. You can run it with the following command:
faasmctl invoke ml-training driver --cmdline "ml-training/mnist-images-10k 4 8"
To run the workflow on Knative, first deploy it to the k8s cluster with bare-metal access to SEV nodes:
export RUNTIME_CLASS_NAME=kata-qemu-sev
export TLESS_VERSION=$(cat ${PROJ_ROOT}/VERSION)
kubectl apply -f ${PROJ_ROOT}/workflows/k8s_common.yaml
envsubst < ${PROJ_ROOT}/workflows/ml-training/knative/workflow.yaml | kubectl apply -f -
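Note that `envsubst` silently replaces unset variables with empty strings, so a missing export can yield a manifest that is broken in non-obvious ways. The sketch below checks that the variables are set before rendering. `require_vars` is a hypothetical helper, and it is an assumption that `workflow.yaml` references exactly the variables exported above.

```shell
# Hypothetical helper: fail if any of the named variables is unset or empty.
require_vars() {
    local name val missing=0
    for name in "$@"; do
        # Indirect lookup of the variable called ${name}.
        eval "val=\${${name}:-}"
        if [ -z "${val}" ]; then
            echo "required variable is unset or empty: ${name}" >&2
            missing=1
        fi
    done
    return "${missing}"
}
```

For example, `require_vars PROJ_ROOT RUNTIME_CLASS_NAME TLESS_VERSION` can be run right before the `envsubst` line, and only if it succeeds is the manifest applied.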
Second, upload the images:
export MINIO_URL=$(kubectl -n tless get services -o jsonpath='{.items[?(@.metadata.name=="minio")].spec.clusterIP}')
# Clean dir
invrs s3 clear-dir --prefix ml-training
# Upload all data files in the directory
invrs upload-dir --host-path ${PROJ_ROOT}/datasets/ml-training/mnist-images-2k --s3-path ml-training/mnist-images-2k
Finally, execute the workflow by running:
${PROJ_ROOT}/workflows/ml-training/knative/curl_cmd.sh