JupyterHub Binder deployment strategies on AWS
(https://jupyter.org/hub ; https://tljh.jupyter.org/en/latest/topic/whentouse.html#topic-whentouse)
The first link above describes what JupyterHub is and lists its key features.
Both links describe the JupyterHub distributions and how to choose between them.
There are two distributions: Kubernetes and Littlest
- Kubernetes -
- allows JupyterHub to scale to many thousands of users
- can flexibly grow/shrink the size of resources it needs
- uses container technology (Docker) in administering user sessions
- allows users to interact with a computing environment through a webpage, which makes it easy to provide and standardize the computing environment of a group of people
- Littlest -
- also known as The Littlest JupyterHub (TLJH)
- an opinionated and pre-configured distribution to deploy a JupyterHub on a single virtual machine (in the cloud or on your own hardware)
- designed to be a more lightweight and maintainable solution for use-cases where size, scalability, and cost-savings are not a huge concern
- distribution for a small (0-100) number of users
Although we are testing with 1-5 users, we are choosing the Kubernetes deployment because we spread users across a cluster of smaller machines that is scaled up or down, and we need to be able to run containers (Docker or Singularity). This also lets us scale up the number of users as needed.
open ports 80, 443, and 22
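The ports can also be opened from the command line instead of the console. The following is a minimal sketch, assuming the AWS CLI is installed and configured; the security group ID is a hypothetical placeholder, and the `0.0.0.0/0` source range is an assumption (SSH on port 22 is usually better restricted to a known address range). It only prints the commands so they can be reviewed before running them.

```shell
#!/usr/bin/env bash
# Hypothetical security group ID; replace with the group attached to the instance.
SG_ID="sg-0123456789abcdef0"
# Source range allowed to connect; 0.0.0.0/0 means "anywhere" (assumption).
CIDR="0.0.0.0/0"

# Print one AWS CLI command per port: 80 (HTTP), 443 (HTTPS), 22 (SSH).
print_open_port_commands() {
  local port
  for port in 80 443 22; do
    echo "aws ec2 authorize-security-group-ingress --group-id $SG_ID --protocol tcp --port $port --cidr $CIDR"
  done
}

print_open_port_commands
```

Pipe the output to `bash` once the group ID and source range look right.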
extend the EBS volume
reference: https://hackernoon.com/tutorial-how-to-extend-aws-ebs-volumes-with-no-downtime-ec7d9e82426e
a. log in to the AWS console
b. choose "EC2" from the services list
c. click on "Volumes" under the ELASTIC BLOCK STORE menu (on the left)
d. choose the volume to resize, right-click it, and choose "Modify Volume"
e. set the new size for the volume
`# extend from 8GB to 50GB`
`# need at least ~15-20GB`
f. click on modify
g. make sure the partition is extended
`lsblk`
-OR-
`df -h`
`# if partition is not extended see reference`
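The check in step g can be scripted. This is a minimal sketch, assuming GNU `df` (for `-BG` and `--output`) and assuming the volume is mounted at `/`; the 15 GB threshold is taken from the lower end of the "~15-20GB" note above, and the function names are made up for this example.

```shell
#!/usr/bin/env bash
# Minimum acceptable size in GB (assumed from the "~15-20GB" note above).
MIN_GB=15

# Print the total size of the filesystem mounted at $1 in whole GB (GNU df).
fs_size_gb() {
  df -BG --output=size "$1" | tail -n 1 | tr -dc '0-9'
}

# Succeed when size $1 (GB) is at least the minimum $2 (GB).
meets_minimum() {
  [ "$1" -ge "$2" ]
}

if meets_minimum "$(fs_size_gb /)" "$MIN_GB"; then
  echo "root filesystem is at least ${MIN_GB}GB"
else
  echo "root filesystem is smaller than ${MIN_GB}GB; see the reference to extend the partition"
fi
```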
-
set up Let's Encrypt & nginx
reference: https://github.com/dandi/infrastructure/wiki/Girder-setup-on-aws
install pre-requisites:
`apt-get update && apt-get upgrade -y # update package list`
`apt-get install -y git python3.7 python3-setuptools python3-pip nginx vim fail2ban`
setup nginx:
`vim /etc/nginx/sites-enabled/hub.dandiarchive.org`
edit nginx site file:
reference: https://jupyterhub.readthedocs.io/en/stable/reference/config-proxy.html
`# top-level http config for websocket headers`
`# If Upgrade is defined, Connection = upgrade`
`# If Upgrade is empty, Connection = close`
`map $http_upgrade $connection_upgrade {`
`    default upgrade;`
`    '' close;`
`}`
`server {`
`    server_name hub.dandiarchive.org;`
`    location / {`
`        # proxy_pass http://localhost:8080/;`
`        proxy_pass http://localhost:8000/;`
`        proxy_set_header Host $host;`
`        proxy_set_header X-Real-IP $remote_addr;`
`        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;`
`        # websocket headers`
`        proxy_set_header Upgrade $http_upgrade;`
`        proxy_set_header Connection $connection_upgrade;`
`    }`
`    listen 443 ssl; # managed by Certbot`
`    ssl_certificate /etc/letsencrypt/live/hub.dandiarchive.org/fullchain.pem; # managed by Certbot`
`    ssl_certificate_key /etc/letsencrypt/live/hub.dandiarchive.org/privkey.pem; # managed by Certbot`
`    include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot`
`    ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot`
`    ssl_session_cache shared:SSL:50m;`
`    ssl_stapling on;`
`    ssl_stapling_verify on;`
`    add_header Strict-Transport-Security max-age=15768000;`
`}`
`server {`
`    if ($host = hub.dandiarchive.org) {`
`        return 301 https://$host$request_uri;`
`    } # managed by Certbot`
`    listen 80;`
`    server_name hub.dandiarchive.org;`
`    return 404; # managed by Certbot`
`}`
restart nginx:
`nginx -t # test nginx configuration`
`service nginx restart # restart nginx`
`service nginx status # check nginx status`
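`nginx -t` validates syntax, but it does not confirm that the websocket settings JupyterHub relies on are present. The following is a small sketch of such a check, assuming the site file path used above; the function name `has_websocket_config` is made up for this example.

```shell
#!/usr/bin/env bash
# Succeed only if the nginx site file at $1 contains the websocket pieces
# from the configuration above: the map block and both proxy headers.
has_websocket_config() {
  local site_file="$1"
  grep -Eq 'map[[:space:]]+\$http_upgrade[[:space:]]+\$connection_upgrade' "$site_file" &&
    grep -Eq 'proxy_set_header[[:space:]]+Upgrade[[:space:]]+\$http_upgrade' "$site_file" &&
    grep -Eq 'proxy_set_header[[:space:]]+Connection[[:space:]]+\$connection_upgrade' "$site_file"
}

SITE_FILE="/etc/nginx/sites-enabled/hub.dandiarchive.org"
if [ -f "$SITE_FILE" ]; then
  if has_websocket_config "$SITE_FILE"; then
    echo "websocket headers found"
  else
    echo "websocket headers missing in $SITE_FILE"
  fi
fi
```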
setup Let's Encrypt:
`apt-get install -y software-properties-common`
`add-apt-repository universe`
`add-apt-repository -y ppa:certbot/certbot`
`apt-get update`
`apt-get install -y certbot python-certbot-nginx`
`certbot --nginx`
-
install docker
reference: https://phoenixnap.com/kb/install-kubernetes-on-ubuntu
`apt-get update && apt-get upgrade -y # update package list`
`apt-get install docker.io`
`docker -v # check docker version`
`systemctl enable docker # set docker to launch at boot`
`systemctl status docker # check docker is running`
`systemctl start docker # start docker if it is not running`
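The "start docker if it is not running" step can be written as a conditional. This is a minimal sketch, assuming a systemd-based host and root privileges; the function name `ensure_docker_running` is made up for this example.

```shell
#!/usr/bin/env bash
# Start the docker service only when systemd reports it as not active,
# then report success only if it is active afterwards.
ensure_docker_running() {
  if ! systemctl is-active --quiet docker; then
    systemctl start docker
  fi
  systemctl is-active --quiet docker
}

# Usage (as root):
#   ensure_docker_running && echo "docker is running"
```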
-
create jupyterhub docker image
reference: https://medium.com/@bluedme/connecting-jupyterhub-to-auth0-e92f0bb6efb0
`docker pull jupyterhub/jupyterhub # download jupyterhub container`
`docker run -p 8000:8000 -d --name jupyterhub jupyterhub/jupyterhub jupyterhub # launch jupyterhub server`
`docker exec -it jupyterhub bash # open a bash shell inside the container`
`useradd --create-home <username> # create user to log into JupyterHub server`
`passwd <username> # set the password for that user`
`conda install notebook # install jupyter notebook`
`conda install jupyterlab # install jupyter lab`
`apt-get update && apt-get upgrade -y # update package list`
`apt-get install python3-pip # install pip`
`exit # exit container`
`docker restart jupyterhub # restart jupyterhub server`
-
install and start minikube
install kubectl:
reference: https://kubernetes.io/docs/tasks/tools/install-kubectl/#install-kubectl-on-linux
`curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl # download latest release`
`chmod +x ./kubectl # make the kubectl binary executable`
`sudo mv ./kubectl /usr/local/bin/kubectl # move the binary into your PATH`
`kubectl version # check kubectl is installed and version is up-to-date`
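Before moving a downloaded binary into the PATH, it can be compared against a published checksum. The following is a generic sketch, assuming `sha256sum` is available and that an expected SHA-256 digest has been obtained separately from the release page; the function name `verify_sha256` and the `KUBECTL_SHA256` variable are hypothetical names for this example.

```shell
#!/usr/bin/env bash
# Succeed when the SHA-256 digest of file $1 equals the expected hex digest $2.
verify_sha256() {
  local file="$1" expected="$2" actual
  actual="$(sha256sum "$file" | awk '{print $1}')"
  [ "$actual" = "$expected" ]
}

# Only runs when a kubectl download and an expected digest are both present.
if [ -f ./kubectl ] && [ -n "${KUBECTL_SHA256:-}" ]; then
  if verify_sha256 ./kubectl "$KUBECTL_SHA256"; then
    echo "kubectl checksum OK"
  else
    echo "kubectl checksum mismatch; do not install this binary"
  fi
fi
```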
install minikube:
reference: https://kubernetes.io/docs/tasks/tools/install-minikube/
`curl -Lo minikube https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64 # download latest release`
`chmod +x minikube # make the minikube binary executable`
`sudo mkdir -p /usr/local/bin/ # make sure the target directory exists`
`sudo install minikube /usr/local/bin/ # install the binary into your PATH`
check minikube version and start:
`minikube version`
`sudo minikube start --vm-driver=none # start minikube without a VM; if the command returns an error, see reference`
-
deploy jupyterhub to minikube pod
reference: https://sweetcode.io/learning-kubernetes-getting-started-minikube/
reference: https://kubernetes.io/docs/tasks/access-application-cluster/port-forward-access-application-cluster/
setup pod:
check which nodes and pods are up:
`kubectl get nodes`
`kubectl get pods # no pods should be deployed to the cluster yet`
install pre-requisites:
`apt-get update && apt-get upgrade -y # update package list`
`apt-get install socat # required for port-forwarding`
edit pod configurations:
`vim pod.yaml # create pod configuration file`
`apiVersion: v1`
`kind: Pod`
`metadata:`
`  name: pod-jupyter-test`
`  labels:`
`    app: pod-jupyter-test`
`spec: # specification of the pod's contents`
`  restartPolicy: Never`
`  containers:`
`  - name: pod-jupyter-test`
`    image: jupyterhub/jupyterhub`
`    ports:`
`    - containerPort: 8000`
deploy pod:
`kubectl create -f pod.yaml`
`kubectl get pods`
`kubectl describe pod pod-jupyter-test`
`nohup kubectl port-forward pod-jupyter-test 8000:8000 & # run port forwarder in the background (even after logout)`
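Port-forwarding fails if the pod is not ready yet, so it can help to wait for readiness first. This is a minimal sketch using `kubectl wait`; the 120-second default timeout is an assumption, and the function name `wait_for_pod` is made up for this example.

```shell
#!/usr/bin/env bash
# Block until pod $1 reports the Ready condition, or until the timeout $2
# (default 120s, an assumed value) expires.
wait_for_pod() {
  kubectl wait --for=condition=Ready "pod/$1" --timeout="${2:-120s}"
}

# Usage:
#   wait_for_pod pod-jupyter-test && nohup kubectl port-forward pod-jupyter-test 8000:8000 &
```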