Container for AWS SageMaker TrainingJob with pre-installed Jupyter Lab
- You need to have a jump host (e.g. EC2 instance) with an SSH server
- Make sure you have the AWS CLI and Docker installed
- Clone the repository
- Change to the directory and build the container with docker
docker build -t sagemaker-notebook .
- Create a repository named "sagemaker-notebook" in AWS ECR (Elastic Container Registry)
- Tag the container you just built
docker tag sagemaker-notebook:latest <put your ecr registry here>/sagemaker-notebook:latest
- Log in to your ECR registry. You can also find the login instructions in the ECR console
aws ecr get-login-password --region <region> | docker login --username AWS --password-stdin <put your ecr registry here>
- Push your container to ECR
docker push <put your ecr registry here>/sagemaker-notebook:latest
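The build, tag, login and push steps above can be collected into one helper. The following is a minimal sketch, not part of the repository: the function names `ecr_registry` and `push_image` are hypothetical, and it assumes a private ECR registry, whose host name has the form `<account-id>.dkr.ecr.<region>.amazonaws.com`. The account ID and region shown in the usage comment are placeholders.

```shell
#!/usr/bin/env bash
# Sketch only: ecr_registry and push_image are hypothetical helper names.

# Print the private ECR registry host for an account ID and a region.
ecr_registry() {
  printf '%s.dkr.ecr.%s.amazonaws.com\n' "$1" "$2"
}

# Build the image in the current directory, tag it, log in to ECR and push it.
# Arguments: account ID, region, optional repository name.
push_image() {
  local account_id="$1"
  local region="$2"
  local repo="${3:-sagemaker-notebook}"
  local registry
  registry="$(ecr_registry "$account_id" "$region")"

  docker build -t "$repo" . &&
  docker tag "$repo:latest" "$registry/$repo:latest" &&
  aws ecr get-login-password --region "$region" |
    docker login --username AWS --password-stdin "$registry" &&
  docker push "$registry/$repo:latest"
}

# Usage (placeholder account ID and region):
#   push_image 123456789012 us-east-1
```

The chain stops at the first failing command, so a failed build or login does not lead to a push attempt.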
- Create a training job using "Your own algorithm container in ECR" and specify the previously created repository.
- Provide your SSH jump host IP, your SSH key (base64 encoded), and a token for notebook access in the hyperparameters. The following is an example:
"SSH_HOST": "",
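To produce the base64-encoded SSH key mentioned above, something like the following sketch may help. The helper name `encode_key` and the key path in the usage comment are hypothetical, and it is an assumption that the container expects the encoded key as a single line without line breaks.

```shell
#!/usr/bin/env bash
# Sketch only: encode_key is a hypothetical helper name.

# Print the base64 encoding of a file as a single line.
# Piping through tr removes the line wrapping that some base64
# implementations add, so the result is the same with GNU and BSD base64.
encode_key() {
  base64 < "$1" | tr -d '\n'
}

# Usage (the key path is a placeholder):
#   encode_key ~/.ssh/id_rsa
```

The output can then be pasted as the value of the corresponding hyperparameter.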
- (optional but recommended) Add a VPC configuration to your training job if your jump host is located in a VPC
- Wait until the training job status becomes "training"
- Your notebook should then be accessible from your jump host:
- The following is an example command that forwards a port on your local machine to the Jupyter Lab port on your jump host:
ssh -NR 8888: <jump host IP>
- Check the training job log for potential error messages.
- Make sure your jump host IP is reachable from the training job
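The step of waiting for the training job to start can also be scripted with the AWS CLI. The following is a minimal sketch under assumptions: `wait_for_training` is a hypothetical helper name, the job name is whatever you chose when creating the job, and the "training" status mentioned above is assumed to correspond to the `SecondaryStatus` value `Training` returned by `aws sagemaker describe-training-job`.

```shell
#!/usr/bin/env bash
# Sketch only: wait_for_training is a hypothetical helper name.

# Poll a training job until its SecondaryStatus is "Training".
# Arguments: job name, optional polling interval in seconds (default 30).
# Returns 0 once the job is training, 1 if the job ended or the query failed.
wait_for_training() {
  local job_name="$1"
  local interval="${2:-30}"
  local status
  while true; do
    status="$(aws sagemaker describe-training-job \
      --training-job-name "$job_name" \
      --query 'SecondaryStatus' --output text)" || return 1
    echo "status: $status"
    case "$status" in
      Training) return 0 ;;
      Failed|Stopped|Completed|MaxRuntimeExceeded|MaxWaitTimeExceeded) return 1 ;;
    esac
    sleep "$interval"
  done
}

# Usage (the job name is a placeholder):
#   wait_for_training my-notebook-job
```

If the function returns a non-zero status, the training job log is the first place to look for the cause.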