First, connect to the VM instances over SSH. Follow the steps below to set up SSH:
-
create ssh key on your local environment
-
add the generated public key to the project metadata
- copy the contents of the .pub file and follow the guide here
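The key-generation and copy steps above can be sketched as follows. This is a minimal sketch, not the project's own script: `make_vm_key` is a hypothetical helper, and the key file name and username in the usage lines are placeholders. The `ssh-keygen` options follow the form commonly used for GCP VMs, with the username as the key comment.

```shell
# Hypothetical helper: generate an RSA key pair for the VMs.
# Arguments: key file path, username (stored as the key comment).
# Any further arguments are passed straight to ssh-keygen.
make_vm_key() {
  key_file="$1"
  username="$2"
  shift 2
  mkdir -p "$(dirname "$key_file")"
  # ssh-keygen asks for a passphrase unless one is supplied with -N.
  ssh-keygen -t rsa -b 2048 -f "$key_file" -C "$username" "$@"
}

# Usage (placeholder names, adjust to your own):
# make_vm_key ~/.ssh/gcp_key my_username
# cat ~/.ssh/gcp_key.pub   # this is the text to paste into the project metadata
```

The private key (`~/.ssh/gcp_key` in this example) is the file that `IdentityFile` should point to in the config file.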
-
create a config file in the .ssh directory
touch ~/.ssh/config
-
paste the text below into the config file and edit accordingly
Host kafka-vm
    HostName <External IP Address>
    User <username>
    IdentityFile <~/.ssh/private_keyfile>
    LocalForward 9021 localhost:9021

Host spark-master-node
    HostName <External IP Address Of Master Node>
    User <username>
    IdentityFile <~/.ssh/private_keyfile>
    LocalForward 4040 localhost:4040

Host airflow-vm
    HostName <External IP Address>
    User <username>
    IdentityFile <~/.ssh/private_keyfile>
    LocalForward 8080 localhost:8080
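If you prefer not to edit the placeholders by hand, one entry of the config can be generated from its values. The sketch below assumes a hypothetical helper named `ssh_host_block`; the IP address, username and key path in the usage line are made-up placeholders, not values from this project.

```shell
# Hypothetical helper: print one Host entry for ~/.ssh/config.
# Arguments: host alias, external IP, username, private key path, port to forward.
ssh_host_block() {
  printf 'Host %s\n' "$1"
  printf '    HostName %s\n' "$2"
  printf '    User %s\n' "$3"
  printf '    IdentityFile %s\n' "$4"
  printf '    LocalForward %s localhost:%s\n' "$5" "$5"
}

# Usage (placeholder values), appending the entry to the config file:
# ssh_host_block kafka-vm 203.0.113.10 my_username ~/.ssh/gcp_key 9021 >> ~/.ssh/config
```

The external IP of a VM may change when the instance is restarted, so the `HostName` value may need updating after a restart.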
-
connect to the vms in separate terminal windows
ssh kafka-vm
ssh spark-master-node
ssh airflow-vm
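Once the three sessions are open, the forwarded ports from the config file should be listening on your local machine. A rough way to check this from another local terminal is sketched below. The helper names `port_open` and `check_forwards` are hypothetical, and the check assumes bash and the coreutils `timeout` command are available.

```shell
# Return 0 if something accepts TCP connections on host:port, non-zero otherwise.
# Relies on bash's /dev/tcp pseudo-device and on `timeout` from coreutils.
port_open() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Check the three ports named in the LocalForward lines of ~/.ssh/config.
check_forwards() {
  for port in 9021 4040 8080; do
    if port_open localhost "$port"; then
      echo "localhost:$port is listening"
    else
      echo "localhost:$port is not listening (is the ssh session open?)"
    fi
  done
}

# Usage:
# check_forwards
```

A listening port only shows that the local end of the tunnel is up. The service behind it on the VM may not be running yet at this stage of the setup.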
-
clone git repo and change directory to kafka
git clone https://github.com/topefolorunso/musicaly-project.git ~/musicaly-project
The following setup applies only to the Kafka and Airflow VMs. The Spark VM is managed by GCP, so the necessary software is already installed when it is provisioned.
-
install python (anaconda dist), docker and docker-compose in the vm
bash ~/musicaly-project/vm_setup/vm_setup.sh && exec newgrp docker
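After the script finishes, it can be useful to confirm that the tools are on the PATH. The sketch below uses a hypothetical helper, `check_tools`. The command names in the usage line are assumptions based on the step description above, not taken from the contents of `vm_setup.sh`.

```shell
# Hypothetical helper: report which of the given commands are on PATH.
# Returns 0 only if every command is found.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found:   $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Usage on the VM (assumed command names):
# check_tools python docker docker-compose
```

If a tool is reported missing right after installation, opening a new shell session may be enough for the updated PATH to take effect.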