README: documentation about the repo
Adds information about how to use the Ansible scripts. Signed-off-by: José Guilherme Vanz <[email protected]>
Showing 1 changed file with 75 additions and 0 deletions.
@@ -0,0 +1,75 @@
querido-diario-automation
=========================

This repository contains the Ansible scripts that deploy the spiders created in the
Querido Diário project. They install the packages the spiders need and set up the
systemd services that run them.

#### How to use it?

The deployment is done with Ansible. To use it, launch a server somewhere,
configure Ansible to access it, and run the playbook.

### Inventory

The playbooks expect a `querido_diario` group, so the first step is to
configure the inventory. There are several ways to do that; check the
[Ansible documentation](https://docs.ansible.com/ansible/latest/user_guide/intro_inventory.html)
and choose the one that suits you best. One of the simplest is a plain
inventory file, something like this:

```
[querido_diario]
161.35.151.103
```

After you set up the inventory, you can test access to the server with the
following command:

```bash
ansible -i inventoryfile -m ping querido_diario
```
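
As a minimal sketch of the two steps above, the script below writes the inventory file and checks that it contains the group the playbooks expect. The file name `inventoryfile` matches the command above. The `INVENTORY_FILE`, `SERVER_ADDRESS`, and `RUN_PING` variables and the `203.0.113.10` placeholder address are assumptions made for illustration and are not part of this repository.

```bash
#!/bin/sh
# Hypothetical defaults: override them from the environment as needed.
INVENTORY_FILE="${INVENTORY_FILE:-inventoryfile}"
# Placeholder from the documentation address range; replace it with your server.
SERVER_ADDRESS="${SERVER_ADDRESS:-203.0.113.10}"

# Write a minimal INI inventory with the group the playbooks expect.
cat > "$INVENTORY_FILE" <<EOF
[querido_diario]
$SERVER_ADDRESS
EOF

# Confirm the group header is present before involving Ansible.
if grep -q '^\[querido_diario\]$' "$INVENTORY_FILE"; then
    echo "inventory ok: $INVENTORY_FILE"
else
    echo "inventory is missing the querido_diario group" >&2
    exit 1
fi

# The connectivity test is opt-in, so the script also works offline.
if [ "${RUN_PING:-no}" = "yes" ]; then
    ansible -i "$INVENTORY_FILE" -m ping querido_diario
fi
```

Running it with `RUN_PING=yes SERVER_ADDRESS=<your server>` should create the file and then run the same ping test shown above.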

### Variables

The default Scrapy pipeline configured by these scripts only downloads the
gazette files and uploads them to a remote storage system. The only storage
system tested so far is DigitalOcean Spaces, which works with the S3 protocol.
For this reason, you need to define the variables that hold the access information.

All the available variables and their default values can be found in
`roles/spider/defaults/main.yml`. The playbooks already reference the variables
file at `vars/configure_spiders.yml`, so you can just update that file with
your values and run the playbook.

The variables that configure the S3-compatible storage system are:

```
# DigitalOcean Spaces config
AWS_ACCESS_KEY_ID: ""
AWS_SECRET_ACCESS_KEY: ""
AWS_ENDPOINT_URL: ""
AWS_REGION_NAME: ""
```

You should be able to get this information from your PaaS provider.
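
Because all four variables default to empty strings, it can help to check that they were filled in before running a playbook. The function below is a rough sketch of such a check. The name `check_storage_vars` is hypothetical, and the function assumes each variable sits on its own line in the `NAME: "value"` form shown above.

```bash
# check_storage_vars FILE: succeed only if every storage variable has a value.
# Hypothetical helper, not part of this repository.
check_storage_vars() {
    vars_file="$1"
    status=0
    for name in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_ENDPOINT_URL AWS_REGION_NAME; do
        # Keep the text after "NAME:" and strip quotes and whitespace.
        value=$(sed -n "s/^[[:space:]]*$name:[[:space:]]*//p" "$vars_file" | tr -d "\"' \t\r")
        if [ -z "$value" ]; then
            echo "missing value for $name in $vars_file" >&2
            status=1
        fi
    done
    return "$status"
}
```

A possible use is `check_storage_vars vars/configure_spiders.yml` right before the playbook run, so an unset credential is reported locally instead of surfacing later when a spider tries to upload a file.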

### Playbooks

After configuring the inventory and, if needed, the variable values, you can
run the playbooks to configure the server. Two playbooks are available:
`configure_everything.yaml` and `configure_spider.yaml`.
`configure_everything.yaml` installs all the packages necessary to run the
spiders, updates all the packages, creates the user, and installs the systemd
services and timers that run the spiders every day. `configure_spider.yaml`
only reconfigures the systemd services and timers; it does not reconfigure the
host machine. To run a playbook, use one of the following commands:

```bash
ansible-playbook -i inventoryfile configure_everything.yaml
# OR
ansible-playbook -i inventoryfile configure_spider.yaml
```
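
The choice between the two playbooks can also be wrapped in a small helper. This is only a sketch: the function names `playbook_for` and `deploy` and the mode names `full` and `spiders` are hypothetical, while the playbook file names and the `ansible-playbook -i` invocation come from the commands above.

```bash
# playbook_for MODE: print the playbook that matches the requested mode.
# "full" sets up the whole host; "spiders" only refreshes services and timers.
playbook_for() {
    case "$1" in
        full)    echo "configure_everything.yaml" ;;
        spiders) echo "configure_spider.yaml" ;;
        *)       echo "usage: playbook_for full|spiders" >&2; return 1 ;;
    esac
}

# deploy MODE [INVENTORY]: run the selected playbook against the inventory.
deploy() {
    playbook=$(playbook_for "$1") || return 1
    ansible-playbook -i "${2:-inventoryfile}" "$playbook"
}
```

With these definitions, `deploy full` would be the natural first run on a new server, and `deploy spiders` the lighter run after changing only the spider configuration.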