JupyterHub Binder deployment strategies on AWS
(https://jupyter.org/hub ; https://tljh.jupyter.org/en/latest/topic/whentouse.html#topic-whentouse)
The first link above describes what JupyterHub is and lists its key features.
Both links describe the available JupyterHub distributions and how to choose between them.
There are two distributions: Kubernetes and Littlest
- Kubernetes -
- allows JupyterHub to scale to many thousands of users
- can flexibly grow/shrink the size of resources it needs
- uses container technology (Docker) in administering user sessions
- allows users to interact with a computing environment through a webpage, which makes it easy to provide and standardize the computing environment of a group of people
- Littlest -
- also known as The Littlest JupyterHub (TLJH)
- an opinionated and pre-configured distribution to deploy a JupyterHub on a single virtual machine (in the cloud or on your own hardware)
- designed to be a more lightweight and maintainable solution for use-cases where size, scalability, and cost-savings are not a huge concern
- distribution for a small (0-100) number of users
Although we are testing with 1-5 users, we are choosing the Kubernetes deployment because we are spreading users across a cluster of smaller machines that are scaled up or down, and we need to be able to run containers (Docker or Singularity). This will also allow us to scale up the number of users as needed.
Ubuntu Server 18.04 LTS (HVM), SSD Volume Type - ami-05c1fa8df71875112 (64-bit x86) / ami-0606a0d9f566249d3 (64-bit Arm)
t3a.medium (Variable ECUs, 2 vCPUs, 2.2 GHz, AMD EPYC 7571, 4 GiB memory, EBS only)
instance configuration details: T2/T3 Unlimited enabled
storage: root volume, device /dev/sda1, size 8 GiB, General Purpose SSD (gp2), not encrypted
tags: DANDI-HUB, Webserver, on instances and volumes
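| Type | Protocol | Port Range | Source | Description |
|-------|----------|------------|-----------|-------------|
| HTTP | TCP | 80 | 0.0.0.0/0 | |
| HTTP | TCP | 80 | ::/0 | |
| SSH | TCP | 22 | 0.0.0.0/0 | |
| HTTPS | TCP | 443 | 0.0.0.0/0 | |
| HTTPS | TCP | 443 | ::/0 | |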
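
The inbound rules above could also be created from the command line instead of the console. The sketch below only prints the `aws ec2 authorize-security-group-ingress` calls rather than running them; the function name `print_ingress_cmds` and the security group ID `sg-EXAMPLE` are placeholders, not values from this deployment.

```shell
#!/bin/sh
# Print the AWS CLI calls that would open ports 80, 22 and 443 as listed above.
# Assumption: $1 is the ID of an existing security group (hypothetical here).
print_ingress_cmds() {
    sg_id=$1
    # IPv4 rules: HTTP, SSH and HTTPS from 0.0.0.0/0
    for port in 80 22 443; do
        echo "aws ec2 authorize-security-group-ingress --group-id $sg_id --protocol tcp --port $port --cidr 0.0.0.0/0"
    done
    # IPv6 rules: HTTP and HTTPS only, matching the ::/0 rows
    for port in 80 443; do
        echo "aws ec2 authorize-security-group-ingress --group-id $sg_id --ip-permissions IpProtocol=tcp,FromPort=$port,ToPort=$port,Ipv6Ranges='[{CidrIpv6=::/0}]'"
    done
}

# Review the printed commands before running any of them
print_ingress_cmds sg-EXAMPLE
```

Printing first makes it easy to review the rules, for example to narrow the SSH source range, before anything is applied.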
hub 3.19.206.158
https://github.com/dandi/infrastructure/wiki/Girder-setup-on-aws
#### install pre-requisites:
`apt-get update`
`apt-get install -y git docker-compose python3.7 python3-setuptools nginx vim fail2ban`
#### setup nginx:
`vim /etc/nginx/sites-enabled/hub.dandiarchive.org`
#### edit nginx site file:
`server {`
`listen 80;`
`server_name hub.dandiarchive.org;`
`location / {`
`proxy_pass http://localhost:8080/;`
`proxy_set_header Host $host;`
`proxy_set_header X-Real-IP $remote_addr;`
`proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;`
`}`
`}`
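As a non-interactive alternative to editing the file by hand, the same server block can be written from a script. This is a minimal sketch; the function name `write_hub_site` and its arguments are illustrative, and the destination path is passed in so the result can be tried on a scratch file first.

```shell
#!/bin/sh
# Write the reverse-proxy server block shown above to a file.
# $1: server name, $2: local port to proxy to (8080 above), $3: destination file
write_hub_site() {
    # \$ keeps nginx variables literal; $1 and $2 are filled in by the shell
    cat > "$3" <<EOF
server {
    listen 80;
    server_name $1;
    location / {
        proxy_pass http://localhost:$2/;
        proxy_set_header Host \$host;
        proxy_set_header X-Real-IP \$remote_addr;
        proxy_set_header X-Forwarded-For \$proxy_add_x_forwarded_for;
    }
}
EOF
}
```

For this deployment the call would be `write_hub_site hub.dandiarchive.org 8080 /etc/nginx/sites-enabled/hub.dandiarchive.org`, and running `nginx -t` afterwards checks the syntax before the service is restarted.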
#### restart nginx:
`service nginx restart`
`service nginx status`
#### setup lets encrypt:
`apt-get install -y software-properties-common`
`add-apt-repository universe`
`add-apt-repository -y ppa:certbot/certbot`
`apt-get update`
`apt-get install -y certbot python-certbot-nginx`
`certbot --nginx`
https://github.com/kubernetes/kops/blob/master/docs/install.md
#### kops:
`curl -Lo kops https://github.com/kubernetes/kops/releases/download/$(curl -s https://api.github.com/repos/kubernetes/kops/releases/latest | grep tag_name | cut -d '"' -f 4)/kops-linux-amd64`
`chmod +x ./kops`
`sudo mv ./kops /usr/local/bin/`
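The command substitution in the `curl` line above picks the latest release tag out of the GitHub API response. The sketch below isolates that parsing step so it can be tried offline; `latest_tag` is an illustrative helper name and the JSON fragment is a made-up sample, not real API output.

```shell
#!/bin/sh
# Extract the value of "tag_name" from a GitHub release JSON document on stdin,
# using the same grep/cut pipeline as the kops download command.
latest_tag() {
    grep tag_name | cut -d '"' -f 4
}

# Hypothetical sample response (the version string is invented for illustration)
sample='{
  "url": "https://api.github.com/repos/kubernetes/kops/releases/0",
  "tag_name": "v0.0.0-example",
  "name": "example"
}'
printf '%s\n' "$sample" | latest_tag   # prints v0.0.0-example
```

Splitting the line on double quotes puts the tag in the fourth field, which is why `cut -d '"' -f 4` is used. Replacing the substitution with a fixed tag is an option when a reproducible install matters more than getting the newest release.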
#### kubectl:
`curl -Lo kubectl https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl`
`chmod +x ./kubectl`
`sudo mv ./kubectl /usr/local/bin/kubectl`
#### aws cli tools:
#### install pip:
(python3.7 and python3-setuptools were already installed in the pre-requisites step above)
???????? `apt update`
`apt-get install python3-setuptools`
`curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py`
`python3.7 get-pip.py`
#### install aws cli tools:
`pip install awscli`
`pip install --user --upgrade awscli`
`export REGION=$(curl -s http://169.254.169.254/latest/dynamic/instance-identity/document | grep region | awk -F '"' '{print $4}')`
`sudo apt-get update`
`sudo apt-get install awscli`
`export ZONES=$(aws ec2 describe-availability-zones --region $REGION | grep ZoneName | awk '{print $2}' | tr -d '"')`
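The `REGION` and `ZONES` exports above rely on scraping JSON with `grep` and `awk`. The sketch below shows the field extraction on made-up sample input; `parse_region` and `parse_zones` are illustrative helper names. As an addition not in the original commands, `parse_zones` also strips the trailing comma and joins the zones into the comma-separated form that `kops --zones` accepts.

```shell
#!/bin/sh
# Pull the region out of an EC2 instance identity document read from stdin.
# Splitting on double quotes, the value is the 4th field of the "region" line.
parse_region() {
    grep '"region"' | awk -F '"' '{print $4}'
}

# Pull zone names out of `aws ec2 describe-availability-zones` JSON on stdin.
# The 2nd whitespace-separated field is the quoted name, possibly with a comma.
parse_zones() {
    grep '"ZoneName"' | awk '{print $2}' | tr -d '",' | paste -sd, -
}

# Hypothetical sample input, shaped like the real responses
printf '{\n  "instanceId" : "i-example",\n  "region" : "us-east-2"\n}\n' | parse_region
printf '    "ZoneName": "us-east-2a",\n    "ZoneName": "us-east-2b"\n' | parse_zones
```

On the instance itself the helpers would be fed from the real sources, for example `export REGION=$(curl -s http://169.254.169.254/latest/dynamic/instance-identity/document | parse_region)`.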
`ssh-keygen`
https://kubernetes.io/docs/setup/production-environment/tools/kops/#creating-a-cluster
*** sub domain not found as per instructions on website above:
ubuntu@ip-172-31-42-61:~/.ssh$ dig ns hub.dandiarchive.org
; <<>> DiG 9.11.3-1ubuntu1.8-Ubuntu <<>> ns hub.dandiarchive.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 54931
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;hub.dandiarchive.org. IN NS
;; Query time: 36 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Tue Sep 17 02:09:26 UTC 2019
;; MSG SIZE rcvd: 49
*** showing failure when trying to use sub-domain:
ubuntu@ip-172-31-42-61:~$ kops create cluster --name=cluster1.hub.dandiarchive.org \
--zones=us-east-2a \
--authorization=RBAC \
--master-size=t3a.medium \
--master-volume-size=4 \
--node-size=t3a.medium \
--node-volume-size=4 \
--state=s3://bucket1.hub.dandiarchive.org \
--topology=private \
--networking=weave \
--yes
I0917 18:14:33.124809 17849 create_cluster.go:519] Inferred --cloud=aws from zone "us-east-2a"
I0917 18:14:33.171711 17849 subnets.go:184] Assigned CIDR 172.20.32.0/19 to subnet us-east-2a
I0917 18:14:33.171745 17849 subnets.go:198] Assigned CIDR 172.20.0.0/22 to subnet utility-us-east-2a
I0917 18:14:33.462670 17849 create_cluster.go:1486] Using SSH public key: /home/ubuntu/.ssh/id_rsa.pub
error doing DNS lookup for NS records for "hub.dandiarchive.org": lookup hub.dandiarchive.org on 127.0.0.53:53: no such host
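Since kops fails on the NS lookup before creating anything, the same check can be run up front. A minimal sketch, assuming `dig` is installed; `check_ns` is an illustrative name and the `DIG` variable is a hypothetical hook for substituting another lookup command.

```shell
#!/bin/sh
# Report whether a domain has NS records visible to the local resolver.
# $1: domain to check. Set DIG to override the lookup command (assumption for testing).
check_ns() {
    records=$(${DIG:-dig} +short ns "$1")
    if [ -n "$records" ]; then
        echo "ok: $1 has NS records"
    else
        echo "missing: no NS records for $1"
        return 1
    fi
}
```

Running `check_ns hub.dandiarchive.org` before `kops create cluster` would report the missing delegation without waiting for kops to get that far.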
*** NS records are only found for the apex domain (dandiarchive.org), not for the sub-domain:
ubuntu@ip-172-31-42-61:~/.ssh$ dig ns dandiarchive.org
; <<>> DiG 9.11.3-1ubuntu1.8-Ubuntu <<>> ns dandiarchive.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 33252
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;dandiarchive.org. IN NS
;; ANSWER SECTION:
dandiarchive.org. 21600 IN NS ns-cloud-d4.googledomains.com.
dandiarchive.org. 21600 IN NS ns-cloud-d1.googledomains.com.
dandiarchive.org. 21600 IN NS ns-cloud-d2.googledomains.com.
dandiarchive.org. 21600 IN NS ns-cloud-d3.googledomains.com.
;; Query time: 104 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Tue Sep 17 02:09:31 UTC 2019
;; MSG SIZE rcvd: 166
*** cluster creation works when using the apex domain as the cluster name, but api.dandiarchive.org does not resolve afterwards:
ubuntu@ip-172-31-42-61:~$ kops create cluster --name=dandiarchive.org --zones=us-east-2a --authorization=RBAC --master-size=t3a.medium --master-volume-size=4 --node-size=t3a.medium --node-volume-size=4 --state=s3://bucket1.hub.dandiarchive.org --topology=private --networking=weave --yes
I0917 02:23:23.999970 7524 create_cluster.go:519] Inferred --cloud=aws from zone "us-east-2a"
I0917 02:23:24.050888 7524 subnets.go:184] Assigned CIDR 172.20.32.0/19 to subnet us-east-2a
I0917 02:23:24.050918 7524 subnets.go:198] Assigned CIDR 172.20.0.0/22 to subnet utility-us-east-2a
I0917 02:23:24.345531 7524 create_cluster.go:1486] Using SSH public key: /home/ubuntu/.ssh/id_rsa.pub
I0917 02:23:25.170525 7524 executor.go:103] Tasks: 0 done / 103 total; 48 can run
I0917 02:23:25.978820 7524 vfs_castore.go:729] Issuing new certificate: "etcd-manager-ca-events"
I0917 02:23:26.030063 7524 vfs_castore.go:729] Issuing new certificate: "etcd-peers-ca-main"
I0917 02:23:26.655894 7524 vfs_castore.go:729] Issuing new certificate: "apiserver-aggregator-ca"
I0917 02:23:26.808139 7524 vfs_castore.go:729] Issuing new certificate: "ca"
I0917 02:23:27.132766 7524 vfs_castore.go:729] Issuing new certificate: "etcd-clients-ca"
I0917 02:23:27.157690 7524 vfs_castore.go:729] Issuing new certificate: "etcd-manager-ca-main"
I0917 02:23:27.399647 7524 vfs_castore.go:729] Issuing new certificate: "etcd-peers-ca-events"
I0917 02:23:27.566378 7524 executor.go:103] Tasks: 48 done / 103 total; 27 can run
I0917 02:23:28.791094 7524 vfs_castore.go:729] Issuing new certificate: "kubelet-api"
I0917 02:23:29.160236 7524 vfs_castore.go:729] Issuing new certificate: "kubecfg"
I0917 02:23:29.352976 7524 vfs_castore.go:729] Issuing new certificate: "apiserver-proxy-client"
I0917 02:23:29.406408 7524 vfs_castore.go:729] Issuing new certificate: "kube-proxy"
I0917 02:23:29.497207 7524 vfs_castore.go:729] Issuing new certificate: "kubelet"
I0917 02:23:29.525997 7524 vfs_castore.go:729] Issuing new certificate: "kube-scheduler"
I0917 02:23:30.551625 7524 vfs_castore.go:729] Issuing new certificate: "apiserver-aggregator"
I0917 02:23:30.780717 7524 vfs_castore.go:729] Issuing new certificate: "kube-controller-manager"
I0917 02:23:31.094066 7524 vfs_castore.go:729] Issuing new certificate: "kops"
I0917 02:23:31.410342 7524 vfs_castore.go:729] Issuing new certificate: "master"
I0917 02:23:31.616379 7524 executor.go:103] Tasks: 75 done / 103 total; 22 can run
I0917 02:23:31.752264 7524 launchconfiguration.go:364] waiting for IAM instance profile "nodes.dandiarchive.org" to be ready
I0917 02:23:31.772118 7524 launchconfiguration.go:364] waiting for IAM instance profile "masters.dandiarchive.org" to be ready
I0917 02:23:42.071310 7524 executor.go:103] Tasks: 97 done / 103 total; 4 can run
I0917 02:23:42.587616 7524 executor.go:103] Tasks: 101 done / 103 total; 2 can run
I0917 02:23:42.615335 7524 natgateway.go:286] Waiting for NAT Gateway "nat-0adac19b60471232b" to be available (this often takes about 5 minutes)
I0917 02:25:28.329877 7524 executor.go:103] Tasks: 103 done / 103 total; 0 can run
I0917 02:25:28.329926 7524 dns.go:153] Pre-creating DNS records
I0917 02:25:28.566035 7524 update_cluster.go:291] Exporting kubecfg for cluster
kops has set your kubectl context to dandiarchive.org
Cluster is starting. It should be ready in a few minutes.
Suggestions:
- validate cluster: kops validate cluster
- list nodes: kubectl get nodes --show-labels
- ssh to the master: ssh -i ~/.ssh/id_rsa admin@api.dandiarchive.org
- the admin user is specific to Debian. If not using Debian please use the appropriate user based on your OS.
- read about installing addons at: https://github.com/kubernetes/kops/blob/master/docs/addons.md.
ubuntu@ip-172-31-42-61:~$ kubectl get nodes --show-labels
Unable to connect to the server: dial tcp: lookup api.dandiarchive.org on 127.0.0.53:53: no such host
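The `no such host` error above means the API hostname is not resolvable from this machine. One way to tell a record that is still propagating from one that is never going to appear is to poll for it a bounded number of times. This is a sketch under assumptions: `wait_for_host`, its arguments, and the `DIG` override are illustrative and not part of kops or kubectl.

```shell
#!/bin/sh
# Poll until a hostname resolves to an address, or give up.
# $1: hostname, $2: maximum attempts, $3: seconds to sleep between attempts
# Set DIG to override the lookup command (assumption for testing).
wait_for_host() {
    attempt=1
    while [ "$attempt" -le "$2" ]; do
        if [ -n "$(${DIG:-dig} +short "$1")" ]; then
            echo "$1 resolves after $attempt attempt(s)"
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$3"
    done
    echo "$1 still does not resolve after $2 attempt(s)"
    return 1
}
```

A call such as `wait_for_host api.dandiarchive.org 30 10 && kops validate cluster` would only move on to validation once the name resolves. If it never does, the DNS setup for the domain is the next thing to inspect.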