Flowmap.query on DigitalOcean: Step by step guide
Create a 2GB RAM Ubuntu 18.04 droplet on DigitalOcean (costs $10 per month). If your database is small, you can later scale it down to a 1GB RAM droplet, which costs $5 per month. However, 1GB is not enough to build the app from source with Node.js, so start with the larger size.
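On your DigitalOcean projects page you should see the IP address of your new droplet. Connect to it via SSH from your terminal (replace YOUR_DROPLET_IP with that address; root is the default user on a new droplet):
ssh root@YOUR_DROPLET_IP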
Alternatively, you can use "Access console" directly on the projects page.
Install ClickHouse:
sudo apt-add-repository "deb http://repo.yandex.ru/clickhouse/deb/stable/ main/"
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv E0C56BD4 # optional
sudo apt-get update
sudo apt-get install clickhouse-client clickhouse-server
sudo service clickhouse-server start
You will need to set a password for ClickHouse during this process.
Download the data for 2018:
for month in `seq 1 12`; do wget -P citibike-trips/ \
`printf "https://s3.amazonaws.com/tripdata/2018%02d-citibike-tripdata.csv.zip" $month`; done
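Before starting twelve downloads, it can help to confirm what the loop will request. The following is a dry-run sketch of the same URL pattern that only prints the addresses; the variable name `urls` is introduced here for illustration.

```shell
# Build the twelve monthly archive URLs without downloading anything.
urls=$(for month in $(seq 1 12); do
  printf 'https://s3.amazonaws.com/tripdata/2018%02d-citibike-tripdata.csv.zip\n' "$month"
done)

# One URL per line, from 201801 through 201812.
printf '%s\n' "$urls"
```

The `%02d` format pads single-digit months with a leading zero, which is what the file names on the server use.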
Unzip it:
sudo apt-get install unzip
unzip "citibike-trips/*.zip" -d citibike-trips/
Create the table (bike_id and trip_duration use UInt32, since bike IDs and long trip durations can exceed the range of the smaller integer types):
clickhouse-client --password YOUR_CLICKHOUSE_PASSWORD --query="
CREATE TABLE nyc_citibike_trips (
  start_date Date,
  trip_duration UInt32,
  start_time DateTime,
  stop_time DateTime,
  start_station_id String,
  end_station_id String,
  bike_id UInt32,
  user_type Enum8('Subscriber'=1,'Customer'=2),
  birth_year UInt16,
  gender Enum8('0'=0,'1'=1,'2'=2)
) ENGINE = MergeTree(start_date, (start_date, start_time), 8192);"
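The unsigned integer types in the table cap what each column can store, so it is worth comparing them against the largest values in your CSV files. As a rough sketch, the upper bound of each type is 2^bits - 1 and can be computed in the shell; the variable names are illustrative only.

```shell
# Largest value each unsigned ClickHouse integer type can hold: 2^bits - 1.
# (Assumes 64-bit shell arithmetic, as on a standard Ubuntu droplet.)
uint8_max=$(( (1 << 8) - 1 ))
uint16_max=$(( (1 << 16) - 1 ))
uint32_max=$(( (1 << 32) - 1 ))

printf 'UInt8 %s\nUInt16 %s\nUInt32 %s\n' "$uint8_max" "$uint16_max" "$uint32_max"
```

For example, a trip duration above 65535 seconds (roughly 18 hours) would not fit in a UInt16 column, and a five-digit bike ID would not fit in a UInt8 column.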
Import the data, keeping only the columns the table uses:
for csvfile in citibike-trips/*.csv; do cat "$csvfile" | \
awk -F, -v OFS=',' '{print substr($2,2,10),$1,substr($2,2,19),substr($3,2,19),$4,$8,$12,$13,$14,$15,$16}' | \
sed '1d;$d' | \
clickhouse-client --password YOUR_CLICKHOUSE_PASSWORD --query="INSERT INTO nyc_citibike_trips FORMAT CSV"; \
done
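To see what the awk step in the import loop produces, it can be run on a single row first. The row below is made up for illustration and assumes the 15-column layout (trip duration, quoted start and stop times, station fields, bike ID, user type, birth year, gender); check it against the header of your own files.

```shell
# Hypothetical sample row in the assumed 15-column layout.
row='970,"2018-01-01 13:50:57.4340","2018-01-01 14:07:08.1860",72,"W 52 St & 11 Ave",40.76727216,-73.99392888,505,"6 Ave & W 33 St",40.74901271,-73.98848395,31956,"Subscriber",1992,1'

# Same awk program as in the import loop: substr() skips the opening quote
# and keeps the date (10 chars) or the timestamp without fractions (19 chars).
converted=$(printf '%s\n' "$row" | \
  awk -F, -v OFS=',' '{print substr($2,2,10),$1,substr($2,2,19),substr($3,2,19),$4,$8,$12,$13,$14,$15,$16}')

printf '%s\n' "$converted"
```

Under this assumed layout field 16 does not exist, so awk prints it as an empty string and the converted row ends with a trailing comma. Note also that `-F,` splits on every comma, so a station name containing a comma would shift the later fields.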
Optional: do this only if you want to be able to connect to ClickHouse from outside the droplet. Open the server configuration:
sudo vi /etc/clickhouse-server/config.xml
Find and uncomment the line <listen_host>::</listen_host>, then restart ClickHouse:
sudo service clickhouse-server restart
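The uncommenting step can also be done without an editor. Here is a minimal sketch using sed, run against a scratch file rather than the real config; it assumes the line appears in config.xml as `<!-- <listen_host>::</listen_host> -->`, which may differ between ClickHouse versions.

```shell
# Work on a scratch file so the real config is untouched.
cfg=$(mktemp)
printf '%s\n' '<yandex>' '    <!-- <listen_host>::</listen_host> -->' '</yandex>' > "$cfg"

# Strip the comment markers around the listen_host line only.
uncommented=$(sed 's|<!-- *\(<listen_host>::</listen_host>\) *-->|\1|' "$cfg")

printf '%s\n' "$uncommented"
rm -f "$cfg"
```

If the output looks right, the same sed expression could be applied to a backed-up copy of /etc/clickhouse-server/config.xml with `sudo sed -i`, followed by the restart.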
Install Node.js using nvm:
sudo apt-get install build-essential libssl-dev
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.34.0/install.sh | bash
source ~/.bashrc
nvm install v10.16.0
Download flowmap.query, install its dependencies, and point it at ClickHouse:
git clone https://github.com/teralytics/flowmap.query.git
cd flowmap.query
npm install
echo CLICKHOUSE_URL="http://localhost:8123?enable_http_compression=1&password=YOUR_CLICKHOUSE_PASSWORD" > .env
You need to provide a Mapbox access token for the base map to work. Sign up for a Mapbox account if you don't have one yet, then save the token and build the client:
echo REACT_APP_MapboxAccessToken=pk… > client/.env
cd client && npm install && npm run build && cd ..
Run the server under PM2, which will automatically restart flowmap.query if it crashes or when the server reboots:
npm install pm2@latest -g
pm2 startup systemd
env $(cat .env) NODE_ENV=production pm2 start backend/server.js
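The last command relies on `env $(cat .env)` to turn each NAME=value line of the .env file into an environment variable for the process it starts. A small sketch of that mechanism, using a scratch directory and `sh` as a stand-in for pm2 (both are illustrative choices):

```shell
# Recreate the .env file in a scratch directory.
workdir=$(mktemp -d)
echo CLICKHOUSE_URL="http://localhost:8123?enable_http_compression=1&password=YOUR_CLICKHOUSE_PASSWORD" > "$workdir/.env"

# env reads the NAME=value words and exports them to the command that follows;
# sh stands in for pm2 and simply reports what it received.
seen=$(env $(cat "$workdir/.env") NODE_ENV=production \
  sh -c 'printf "%s %s" "$NODE_ENV" "$CLICKHOUSE_URL"')

printf '%s\n' "$seen"
rm -rf "$workdir"
```

Because the shell removes the double quotes when writing the file, the stored value must not contain spaces; the unquoted `$(cat .env)` would otherwise split it into separate words.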