`sigh` is a simple program made to scrape, save and print information about disaster events around the world listed on rsoe-edis.org: epidemics, floods, heatwaves, earthquakes, leaks of toxic chemicals, cyclones, plane crashes, zoonoses, you name it. Enough to cure your optimism for the day, if not longer.

`sigh` is meant to be used either through `fzf` submenus when run with no flags or arguments, or directly from the CLI with queries.
- `bs4`
- `requests`
- `json`
- `jtbl`
- `argparse`

These are the Python dependencies of the autogenerated `.py` scripts. `json` and `argparse` ship with Python's standard library, so only the others need installing: `pip3 install bs4 requests jtbl`.
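To see which of the modules listed above are importable before running the scripts, something like the following could be used. This is only a sketch: `missing_modules` is a hypothetical helper that is not part of the project, and it assumes each dependency is importable under the name given in the list.

```python
import importlib.util

# Module names taken from the dependency list above.
REQUIRED_MODULES = ["bs4", "requests", "json", "jtbl", "argparse"]


def missing_modules(names):
    """Return the names that cannot be imported in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All dependencies are importable.")
```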
```
Scrap worldwide events from rsoe-edis.org

Usage: we [OPTION]

'we' depends on its directory structure, do not manually move the executable
to your PATH. './we --setup' will create a symbolic link to your ~/.local/bin.

Options:
  -g, --get [CODE]             Scrap data for event type to data/CODE.json.
  -p, --print [CODE]           Print saved data as json for event type.
  -t, --table [CODE]           Print saved data as table for event type.
  -l, --list-types             List types currently associated to scripts.
  -s, --setup                  Create event scrap scripts and optionally link 'we' to PATH.
  -u, --update-types           Fetch current event types and save them.
  -R, --rm-scripts             Remove existing script(s) and optionally remove 'we' from PATH.
  -C, --clear-data ([CODE])    Remove queried data or all data files (clean data/ directory).
  -v, --version                Print version.
  -h, --help                   Print this help.

Improve me:
  https://git.teknik.io/matf/worldevents
```
#### Example for animal epidemic (type EPA):
```bash
$ we --list-types
AAT EPA EPD EPH INH HEC LSC PSI SDM TER
EVP PPP IND SUE IBE OUD ERQ LSL VOE FLD
CBE MIA OHI OTE TRI AIR PRA WTR CYC DRT
EXR HAI HEW LIT PTF SEW STO
$ we --get epa
Appended 3 event(s) to /home/user/Projects/worldevents/data/EPA.json. ✔
$ we --print epa
{
  "Date": [
    "2021-08-25 20:28:42",
    "2021-08-25 19:49:22",
    "2021-08-25 16:39:46"
  ],
  "Location": [
    "Benin, Africa",
    "Nigeria, Africa",
    "South Africa, Africa"
  ],
  "Title": [
    "Benin - Benin confirms H5N1 avian flu outbreak",
    "Nigeria - Nigeria's southern state reports bird flu outbreak",
    "South Africa - Khayelitsha animal clinic records two rabies cases after more than 20 years"
  ],
  "Details": [
    "https://rsoe-edis.org/eventList/details/111380/0",
    "https://rsoe-edis.org/eventList/details/111370/0",
    "https://rsoe-edis.org/eventList/details/111325/0"
  ]
}
###
# Not implemented yet
###
$ we --table epa
Date                 Title                                                                        Details
-------------------  ---------------------------------------------------------------------------  ------------------------------------------------
2021-08-25 20:28:42  Benin - Benin confirms H5N1 avian flu outbreak                               https://rsoe-edis.org/eventList/details/111380/0
2021-08-25 19:49:22  Nigeria - Nigeria's southern state reports bird flu outbreak                 https://rsoe-edis.org/eventList/details/111370/0
2021-08-25 16:39:46  South Africa - Khayelitsha animal clinic records two rabies cases after mor  https://rsoe-edis.org/eventList/details/111325/0
$ we --clear-data air
Wrn: permanently delete AIR.json? This cannot be undone. [y/N] y
/home/user/Projects/worldevents/data/AIR.json removed. ✔
$ we --clear-data
Wrn: you are about to permanently delete 8 previously scraped data file(s). Type YES to confirm. YES
/home/user/Projects/worldevents/data/ directory cleaned. ✔
$ we --rm-scripts
Wrn: remove 37 scrap scripts? [y/N] y
/home/user/Projects/worldevents/scripts/ directory cleaned. ✔
Run 'we -s' to regenerate scrap scripts.
Also remove 'we' symbolic link from your PATH? [y/N] n
```
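The `--print` output in the example stores each field as a parallel list, while the `--table` output shows one event per row. The sketch below shows one way to go from the first shape to the second. It is a dependency-free stand-in and does not use `jtbl`; `columns_to_rows` and `format_table` are hypothetical names, and the 75-character truncation is an assumption read off the sample table, not taken from the project's code.

```python
def columns_to_rows(data):
    """Turn {"Date": [...], "Title": [...]} into one dict per event."""
    keys = list(data)
    return [dict(zip(keys, values)) for values in zip(*(data[key] for key in keys))]


def format_table(rows, columns, max_width=75):
    """Render rows as a plain-text table, truncating long cells to max_width."""
    cells = [[str(row[col])[:max_width] for col in columns] for row in rows]
    # Each column is as wide as its header or its longest (truncated) cell.
    widths = [
        max([len(col)] + [len(line[i]) for line in cells])
        for i, col in enumerate(columns)
    ]

    def render(values):
        return "  ".join(v.ljust(w) for v, w in zip(values, widths)).rstrip()

    lines = [render(columns), "  ".join("-" * w for w in widths)]
    lines.extend(render(line) for line in cells)
    return "\n".join(lines)


if __name__ == "__main__":
    # One event copied from the sample output.
    sample = {
        "Date": ["2021-08-25 20:28:42"],
        "Title": ["Benin - Benin confirms H5N1 avian flu outbreak"],
        "Details": ["https://rsoe-edis.org/eventList/details/111380/0"],
    }
    print(format_table(columns_to_rows(sample), ["Date", "Title", "Details"]))
```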
- Automate fetching the list of event types into `setup/types.txt`
- Add an option to scrape all categories at once instead of putting strain on the website with a request for every event category; done, but it makes one request per event type, plus details for every event.
- Implement `-t`
- Fix conflict in `fzf` for custom commands when `fzf` feeds an array (print, table), see 2604
- Check that `fzf` headers are correct once the above bullet is done
- Implement multiple arguments for `-R`
- Interactive mode
- Human-readable categories, not only types
- Better `fzf` interactions (prompt for another option or sequence of options when relevant, get back to main menu, etc.)
- Better appending (avoid duplicates, add request date, merge into the same JSON objects instead of creating new ones); duplicates not handled, better to do it either externally or switch to a db system
- Make functions into a master script that would show data, delete data, and fetch events (not done yet)
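For the "better appending" item, one possible approach is to treat the `Details` URL as a unique key and skip events that are already stored. The sketch below is an illustration only: `merge_events` is a hypothetical helper, and it assumes the column-oriented layout shown in the `--print` example, with both inputs sharing the same columns.

```python
def merge_events(saved, new):
    """Return saved data plus events from `new` whose Details URL is not stored yet.

    Both arguments use the column-oriented layout {"Date": [...], "Details": [...]}.
    Neither input is modified.
    """
    keys = list(new)
    merged = {key: list(saved.get(key, [])) for key in keys}
    seen = set(merged.get("Details", []))
    for i, url in enumerate(new.get("Details", [])):
        if url in seen:
            continue  # already stored, skip the duplicate
        seen.add(url)
        for key in keys:
            merged[key].append(new[key][i])
    return merged
```

Keying on the URL rather than the title means an event whose title was edited upstream is still recognised as the same event.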
I was just reading about the `gemini` protocol and testing it with the cool `amfora` client, then stumbled upon `gemini://aetin.art/earth.gmi`. I found the concept pretty cool, so I started playing with it. I am not a programmer, let alone in Python, so I do not know if this will ever become feature complete. Cassandra helped (a lot) with `getevents.py`.

This is merely a way for me to play with web scraping and Python for something I find useful, but I am not responsible for what you may use this for. Please just don't abuse `sigh --get` so that folks at rsoe-edis.org do not feel the need to add reCaptchas to their website.
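One way to stay polite when fetching several event types in a row is to enforce a minimum delay between requests. The following is a minimal sketch, not something the project currently does: `Throttle` is a hypothetical class, and the 5-second interval in the usage comment is an arbitrary example value.

```python
import time


class Throttle:
    """Enforce a minimum delay between consecutive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock  # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None

    def wait(self):
        """Sleep just long enough that calls are at least min_interval apart."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Usage sketch (the fetch step is left out on purpose):
# throttle = Throttle(min_interval=5.0)
# for code in ["EPA", "FLD"]:
#     throttle.wait()
#     ...  # fetch one event type here
```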