Oxford COVID-19 (OxCOVID19) Data Fetcher Repository

This is the Python data fetcher repository for the OxCOVID19 Database, a large, single-centre, multimodal database consisting of information relating to the COVID-19 pandemic.

The OxCOVID19 Project (https://covid19.eng.ox.ac.uk/) aims to increase our understanding of the COVID-19 pandemic and to develop possible strategies for reducing its impact on society through the combined power of statistical and mathematical modelling and machine learning techniques. This repository contains the OxCOVID19 data source fetchers, written in Python 3.

Cite as: Adam Mahdi, Piotr Błaszczyk, Paweł Dłotko, Dario Salvi, Tak-Shing Chan, John Harvey, Davide Gurnari, Yue Wu, Ahmad Farhat, Niklas Hellmer, Alexander Zarebski, Lionel Tarassenko, Oxford COVID-19 Database: multimodal data repository for understanding the global impact of COVID-19. University of Oxford, 2020.


Currently implemented fetchers:

| Name | Country | Country Code | Data source | Status | Regional levels mapping | Terms of Use |
| --- | --- | --- | --- | --- | --- | --- |
| APPLE_MOBILITY | World | several | COVID‑19 Mobility Trends Reports - Apple | release | adm_area_1, adm_area_2: depending on the country | Standard "all rights reserved" notice. No licensing information. |
| GOOGLE_MOBILITY | World | several | COVID-19 Community Mobility Reports - Google | release | adm_area_1, adm_area_2: depending on the country | Attribution required |
| GOVTRACK | World | several | Oxford COVID-19 Government Response Tracker | release | NA | CC-BY-4.0 |
| WEATHER | World | several | MET Informatics Lab | release | adm_area_1, adm_area_2, adm_area_3: depending on the country | Open Government 3.0 |
| WRD_ECDC | World | several | European Centre for Disease Prevention and Control | release | NA | Attribution required |
| AUS_C1A | Australia | AUS | The Real-time COVID-19 Status in Australia | release | adm_area_1: NA or state | Strictly for educational and academic research purposes |
| BEL_WY | Belgium | BEL | github:eschnou | release | adm_area_1: NA | CC0 1.0 Universal (CC0 1.0) Public Domain Dedication |
| BRA_MSHM | Brazil | BRA | github: elhenrico | release | adm_area_1: province | CC0 1.0 Universal |
| CAN_GOV | Canada | CAN | Government of Canada | release | adm_area_1: province | Attribution required, non-commercial use |
| CHE_OPGOV | Switzerland | CHE | Kanton Zürich Statistisches Amt | release | adm_area_1: canton | CC 4.0 |
| CHN_ICL | China | CHN | MRC Centre Imperial College London | release | adm_area_1: province or None | CC BY NC ND 4.0 |
| DEU_JPGG | Germany | DEU | Jan-Philip Gehrcke, from the Public Health Offices (Gesundheitsaemter) | release | adm_area_1: German "länder" | MIT |
| ESP_MSVP | Spain | ESP | Ministerio de Sanidad | release, source stopped updating on 22/05/20 | adm_area_1: comunidades autónomas | Apache License 2.0 |
| EU_ZH | Austria | AUT | Novel Coronavirus Outbreak in Europe - Chinese language | release | adm_area_1: state | MIT |
| EU_ZH | Belgium | BEL | Novel Coronavirus Outbreak in Europe - Chinese language | release | adm_area_1: region; adm_area_2: province | MIT |
| EU_ZH | Czech Republic | CZE | Novel Coronavirus Outbreak in Europe - Chinese language | release | adm_area_1: region | MIT |
| EU_ZH | Germany | DEU | Novel Coronavirus Outbreak in Europe - Chinese language | release | adm_area_1: state | MIT |
| EU_ZH | Hungary | HUN | Novel Coronavirus Outbreak in Europe - Chinese language | release | adm_area_1: NA | MIT |
| EU_ZH | Ireland | IRL | Novel Coronavirus Outbreak in Europe - Chinese language | release | adm_area_1: county | MIT |
| EU_ZH | Norway | NOR | Novel Coronavirus Outbreak in Europe - Chinese language | release | adm_area_1: county | MIT |
| EU_ZH | Poland | POL | Novel Coronavirus Outbreak in Europe - Chinese language | release | adm_area_1: voivodeship | MIT |
| EU_ZH | Slovenia | SVN | Novel Coronavirus Outbreak in Europe - Chinese language | release | adm_area_1: NA | MIT |
| EU_ZH | Sweden | SWE | Novel Coronavirus Outbreak in Europe - Chinese language | release | adm_area_1: province | MIT |
| FRA_SPF | France | FRA | Données hospitalières relatives à l'épidémie de COVID-19 | release | adm_area_1: France "régions", adm_area_2: France "départements" | License Ouverte/Open License 2.0 |
| FRA_SPFCG | France | FRA | Cedric Guadalupe from Santé Publique France | release | adm_area_1: France "régions" | GPL 3.0 |
| GBR_PHE | United Kingdom | GBR | Public Health England | release | adm_area_3: English lower tier local authority | Open Government Licence v3.0 |
| GBR_PHTW | United Kingdom | GBR | Coronavirus (COVID-19) UK Historical Data | release | adm_area_1: NA or country, adm_area_2: NA or upper tier/health boards | The Unlicense |
| GBR_PHW | United Kingdom | GBR | Public Health Wales | release | adm_area_2: Welsh health board for deaths, local authority for tests | Open Government Licence v3.0 |
| IDN_GTPPC | Indonesia | IDN | Government of Indonesia - Coronavirus Disease Response Acceleration Task Force | release | adm_area_1: province | Standard "all rights reserved" notice. No licensing information. |
| IND_COVIND | India | IND | COVID19-India API | release | adm_area_1: NA or state | GPL 3.0 |
| ITA_PC | Italy | ITA | Protezione Civile | release | adm_area_1: Italian regions, adm_area_2: Italian provinces | CC-BY-4.0 |
| ITA_PCDM | Italy | ITA | Davide Magno, from Protezione Civile | release | adm_area_1: Italian region | CC0 1.0 Universal |
| JPN_C1JACD | Japan | JPN | COVID-19 Japan Anti-Coronavirus Dashboard | release | adm_area_1: prefecture | CC BY |
| KOR_DS4C | South Korea | KOR | Data Science for COVID-19 in South Korea | release | adm_area_1: NA or province | CC-BY-NC-SA 4.0 |
| LAT_DSRP | Latin America | several | Latin America Covid-19 Data Repository by DSRP | release | adm_area_1: subdivision | CC-BY-NC-SA 4.0 |
| MYS_MHYS | Malaysia | MYS | ynshung | release | adm_area_1: NA or province | Public Domain Dedication and License v1.0 |
| NGA_CDC | Nigeria | NGA | Nigeria Centre for Disease Control | release | adm_area_1: state | No licensing information. |
| NGA_SO | Nigeria | NGA | Covid-19 Nigeria API | release | adm_area_1: state | No licensing information. |
| NLD_CW | Netherlands | NLD | CoronaWatchNL | release | adm_area_1: NA | CC0 |
| PAK_GOV | Pakistan | PAK | Government of Pakistan | release | adm_area_1: province | No licensing information. |
| POL_WIKI | Poland | POL | Wikipedia | release | adm_area_1: NA or voivodeship | CC-BY-SA |
| PRT_MSDS | Portugal | PRT | Data Science for Social Good Portugal | release | adm_area_1: NA or province | MIT |
| RUS_GOV | Russia | RUS | Russian Government | release | adm_area_1: federal subjects | |
| SWE_GM | Sweden | SWE | github: elinlutz | release | adm_area_1: province | MIT |
| SWE_SIR | Sweden | SWE | Svenska Intensivvårdsregistret (SIR) | release | adm_area_1: Swedish counties (Län) | Public data may be used, but the source must be reported: Svenska Intensivvårdsregistret https://portal.icuregswe.org/siri/report/corona.inrapp (2020) |
| THA-STAT | Thailand | THA | Thailand Ministry Of Public Health | | adm_area_1: NA | DGA Open Government License |
| TUR_MHOE | Turkey | TUR | github:ozanerturk | release | adm_area_1: NA | MIT |
| USA_CTP | United States | USA | The COVID Tracking Project | release | adm_area_1: state | CC-BY-NC-4.0 |
| USA_NYT | United States | USA | New York Times | release | adm_area_1: US State, adm_area_2: county (exception is New York City, which includes more counties) | Attribution required, non-commercial use |
| ZAF_DSFSI | South Africa | ZAF | Data Science for Social Impact research group, the University of Pretoria | release | adm_area_1: province | MIT |

Explanation of status:

  • Draft: being developed, should not be tested yet
  • Candidate: development complete, being tested on a private test database
  • Release: tested, data are fed into the official public database

Database structure

See https://covid19.eng.ox.ac.uk/data_schema.html

Develop and test

You need:

  • Python3
  • (optional) Running instance of a PostgreSQL database
  • (optional) Docker

Run locally

  1. Set the DB_ADDRESS, DB_PORT, DB_NAME, DB_USERNAME and DB_PASSWORD environment variables
  2. Install the requirements: pip install -r requirements.txt
  3. Run the fetcher: python3 ./main.py
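
Step 1 can be checked before starting a run. The following is a minimal sketch, not code from this repository: it reads the database variables named above from a mapping such as os.environ, reports any that are missing, and falls back to port 5432, the default listed in the environment variables table. The helper name read_db_settings is hypothetical.

```python
import os

# Variables named in step 1. DB_PORT has a documented default of 5432,
# so it is treated as optional here.
REQUIRED = ("DB_ADDRESS", "DB_NAME", "DB_USERNAME", "DB_PASSWORD")


def read_db_settings(env=os.environ):
    """Return the database settings as a dict, or raise if any are missing.

    Hypothetical helper for illustration; the fetcher's own startup code
    may read these variables differently.
    """
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED}
    settings["DB_PORT"] = int(env.get("DB_PORT", "5432"))
    return settings
```

Calling read_db_settings() with no arguments inspects the current process environment, so a missing variable is reported by name before any database connection is attempted.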

Run locally using Docker

  1. Set STAGE=test and the DB_ADDRESS, DB_PORT, DB_NAME, DB_USERNAME and DB_PASSWORD environment variables
  2. Run docker-compose up

Environment variables

| Variable name | Default value | Description |
| --- | --- | --- |
| DB_USERNAME | | Postgres database adapter user name |
| DB_PASSWORD | | Postgres database adapter password |
| DB_ADDRESS | | Postgres database adapter address |
| DB_NAME | | Postgres database adapter name |
| DB_PORT | 5432 | Postgres database adapter port |
| SQLITE | | SQLITE adapter file path |
| CSV | | CSV adapter file path |
| VALIDATE_INPUT_DATA | False | Validate input data |
| SLIDING_WINDOW_DAYS | | Sliding window, number of days in the past to process |
| RUN_ONLY_PLUGINS | ALL | Run selected plugins from given list, run all plugins if empty |
| LOGLEVEL | DEBUG | Log level |
| SYS_EMAIL | | Notifications SMTP username |
| SYS_EMAIL_PASS | | Notifications SMTP password |
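
The defaults in the table can be expressed as a small settings loader. This is a sketch under assumptions rather than the repository's implementation: the Settings class and the load_settings and window_start functions are hypothetical, RUN_ONLY_PLUGINS is assumed to be a comma-separated list, and VALIDATE_INPUT_DATA is assumed to be written as the string True or False. It also shows one plausible reading of SLIDING_WINDOW_DAYS as a cutoff date.

```python
import datetime
import os
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Settings:
    db_port: int
    validate_input_data: bool
    sliding_window_days: Optional[int]  # None means no window is applied
    run_only_plugins: Tuple[str, ...]   # empty tuple means "run all plugins"
    loglevel: str


def load_settings(env=os.environ) -> Settings:
    """Read the documented variables, applying the documented defaults."""
    window = env.get("SLIDING_WINDOW_DAYS", "")
    plugins = env.get("RUN_ONLY_PLUGINS", "ALL")
    # Assumption: plugin names are comma-separated; "ALL" or empty selects everything.
    selected = () if plugins.strip() in ("", "ALL") else tuple(
        name.strip() for name in plugins.split(",") if name.strip()
    )
    return Settings(
        db_port=int(env.get("DB_PORT", "5432")),
        # Assumption: the flag is written as the string "True" or "False".
        validate_input_data=env.get("VALIDATE_INPUT_DATA", "False").strip().lower() == "true",
        sliding_window_days=int(window) if window.strip() else None,
        run_only_plugins=selected,
        loglevel=env.get("LOGLEVEL", "DEBUG"),
    )


def window_start(settings: Settings, today: datetime.date) -> Optional[datetime.date]:
    """Earliest date to process, or None when no sliding window is set."""
    if settings.sliding_window_days is None:
        return None
    return today - datetime.timedelta(days=settings.sliding_window_days)
```

Passing a plain dict instead of os.environ makes the loader easy to exercise in tests without touching the real environment.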

Contribute

We need fetchers!

Create a fetcher for a country that is not listed yet and send us a pull request. Use only official sources, or sources derived from official sources.

You can find example code for a fetcher in /src/plugins/_EXAMPLE/example_fetcher.py
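
To give a rough idea of what a fetcher does before opening the example file, the sketch below parses a CSV from a source and maps each row to a record tagged with a country code and a first-level administrative area. It is illustrative only and does not use the repository's plugin interface: parse_rows, the input column names, and the output field names are assumptions (consult the data schema for the real ones), and the sample values are placeholders rather than real data.

```python
import csv
import io


def parse_rows(csv_text, source, country, countrycode):
    """Map rows of a source CSV to flat records (hypothetical field names)."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        records.append({
            "source": source,
            "date": row["date"],
            "country": country,
            "countrycode": countrycode,
            # Assumption: an empty region marks a country-level row.
            "adm_area_1": row["region"] or None,
            "confirmed": int(row["confirmed"]),
        })
    return records


# Placeholder input, not real figures.
sample = "date,region,confirmed\n2020-01-01,Region A,10\n2020-01-01,,25\n"
for record in parse_rows(sample, "XXX_SRC", "Exampleland", "XXX"):
    print(record)
```

A real fetcher would download the source data and hand the records to a database adapter; follow example_fetcher.py for the structure the project actually expects.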
