Update INSTALL.pipeline #514

Open; wants to merge 6 commits into base: master
32 changes: 31 additions & 1 deletion INSTALL.pipeline
@@ -45,6 +45,37 @@
him know a username you'd like for yourself, if you don't already have one.
He will set things up so your new pipeline server can coordinate with the
others, and will be allowed to upload finished WARCs to the Internet Archive.

** STEP 2.5: PYTHON SETUP **

On a recent OS, you will likely not be able to run ArchiveBot with the system
version of Python. One way around this is to use pyenv. These instructions
use the pyenv installer script (pyenv.run); we suggest reading that script
before running it, or installing pyenv manually instead. For a manual
installation, see <https://github.com/pyenv/pyenv>.

sudo apt install git make build-essential libssl-dev zlib1g-dev libbz2-dev \
libreadline-dev libsqlite3-dev libffi-dev curl
curl https://pyenv.run | bash

Follow the directions the script prints. The shims lines it tells you to add
to .profile probably need to go near the beginning of that file.

Restart your shell, and then:

pyenv install -v 3.6.14 # or whatever the newest python 3.6 is
pyenv virtualenv 3.6.14 archivebot
pyenv activate archivebot
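
To confirm that the virtualenv is active and provides Python 3.6, a quick
check along these lines may help. This is only a sketch: is_python36 is a
hypothetical helper name, not part of pyenv or ArchiveBot.

```shell
# Hypothetical helper: succeeds only if the given "python --version"
# output names a 3.6.x release.
is_python36() {
    case "$1" in
        "Python 3.6."*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check whichever python is first on PATH in the current shell.
if is_python36 "$(python --version 2>&1)"; then
    echo "OK: python on PATH is 3.6.x"
else
    echo "WARNING: python on PATH is not 3.6.x; is the archivebot virtualenv active?" >&2
fi
```

If the warning appears, run `pyenv activate archivebot` again and recheck.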

** STEP 2.7: TCP-CLOSER **

Sometimes the pipeline clogs up due to hung sockets. tcp-closer works around
this by closing the stuck connections.

sudo apt install cmake libmnl-dev
git clone https://github.com/kristrev/tcp_closer.git

Follow the directions in its README.md to install. Run as follows:

screen -dmS tcp-closer-4 -- bash -c 'sudo /usr/local/bin/tcp-closer --dport 443 -i 30 --idle_time 21601000 --last_recv_limit 43200000; exec bash'
screen -dmS tcp-closer-6 -- bash -c 'sudo /usr/local/bin/tcp-closer -6 --dport 443 -i 30 --idle_time 21601000 --last_recv_limit 43200000; exec bash'
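
The two timeout values are easier to read as durations. Assuming they are
given in milliseconds (check the tcp_closer README to confirm the units), the
conversion can be sketched as follows; ms_to_hms is a hypothetical helper
used only for illustration.

```shell
# Hypothetical helper: convert a millisecond count to hours/minutes/seconds.
ms_to_hms() {
    total_s=$(( $1 / 1000 ))
    printf '%dh %dm %ds\n' \
        $(( total_s / 3600 )) \
        $(( total_s % 3600 / 60 )) \
        $(( total_s % 60 ))
}

ms_to_hms 21601000   # --idle_time value:       6h 0m 1s
ms_to_hms 43200000   # --last_recv_limit value: 12h 0m 0s
```

Under that assumption, the idle time is just over six hours and the last
receive limit is twelve hours.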

** STEP 3: FINALLY, IT'S TIME TO INSTALL ARCHIVEBOT CODE **

@@ -198,4 +229,3 @@
If the pipeline runs out of RAM, you will likely have to kill the job
that is consuming all the RAM; wpull instances will pause to avoid the
OOM killer being run. Consider creating a small swap file if your VM
does not have any swap.
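
As a rough sketch of the swap suggestion, the snippet below checks whether
any swap is configured and, if not, prints one common way to add a swap file.
The helper name swap_total_kb, the 1G size, and the /swapfile path are
assumptions for illustration; adjust them for your VM.

```shell
# Hypothetical helper: print SwapTotal (in kB) from a meminfo-style file.
# Defaults to /proc/meminfo, which exists on Linux.
swap_total_kb() {
    awk '/^SwapTotal:/ { print $2 }' "${1:-/proc/meminfo}"
}

if [ "$(swap_total_kb 2>/dev/null)" = "0" ]; then
    # Only prints the commands; review them and run them yourself.
    # The 1G size and the /swapfile path are assumptions.
    cat <<'EOF'
No swap is configured. One common way to add a swap file:
    sudo fallocate -l 1G /swapfile
    sudo chmod 600 /swapfile
    sudo mkswap /swapfile
    sudo swapon /swapfile
EOF
fi
```

A swap file created this way does not persist across reboots unless it is
also added to /etc/fstab.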