
Datasets

JaredNaik edited this page Apr 19, 2020 · 22 revisions

If you can't find what you're looking for, try exploring the resources in Other dataset collections at the bottom.

Please be clear and succinct when adding new sites, and try to maintain organization and format for new entries. Thanks!

Case tracking

COVID Tracking Project

COVID testing, hospitalizations, deaths; also rates the reporting reliability for each state
US only, state-by-state
date,state,positive,negative,pending,hospitalizedCurrently,hospitalizedCumulative,inIcuCurrently,inIcuCumulative,onVentilatorCurrently,onVentilatorCumulative,recovered,hash,dateChecked,death,hospitalized,total,totalTestResults,posNeg,fips,deathIncrease,hospitalizedIncrease,negativeIncrease,positiveIncrease,totalTestResultsIncrease
JSON, CSV
https://covidtracking.com/data/
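
As a rough illustration of working with the columns listed above, the sketch below computes a per-state test positivity rate as `positive / totalTestResults`. The sample rows are made up, use only a subset of the columns, and the `positivity_by_state` helper is a hypothetical name, not part of the COVID Tracking Project's own tooling.

```python
import csv
import io

# Illustrative rows only: invented numbers, subset of the real columns.
SAMPLE_CSV = """date,state,positive,negative,totalTestResults,death
20200418,NY,1000,3000,4000,50
20200418,WA,200,1800,2000,10
20200418,XX,,,0,
"""


def positivity_by_state(csv_text):
    """Return {state: positive / totalTestResults}, skipping unusable rows."""
    rates = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            positive = int(row["positive"])
            total = int(row["totalTestResults"])
        except (ValueError, TypeError):
            continue  # blank or non-numeric field, so no rate can be computed
        if total > 0:
            rates[row["state"]] = positive / total
    return rates


rates = positivity_by_state(SAMPLE_CSV)
print(rates)
```

Rows with blank counts are skipped rather than treated as zero, since a missing value and a reported zero mean different things.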

NYTimes

Cumulative cases and deaths
US only, at state and county level
date,county,state,fips,cases,deaths
CSV
https://github.com/nytimes/covid-19-data
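
Because the counts here are cumulative, daily new cases have to be derived by differencing consecutive days. A minimal sketch, assuming ISO-formatted dates and using invented sample rows; the `daily_new_cases` helper is a hypothetical name introduced for illustration:

```python
import csv
import io
from collections import defaultdict

# Illustrative rows only: invented counties and counts in the listed column layout.
SAMPLE_CSV = """date,county,state,fips,cases,deaths
2020-04-01,Example,Washington,00001,10,0
2020-04-02,Example,Washington,00001,15,1
2020-04-03,Example,Washington,00001,15,1
2020-04-01,Other,Washington,00002,3,0
2020-04-02,Other,Washington,00002,7,0
"""


def daily_new_cases(csv_text):
    """Map (state, county) -> list of (date, new_cases) from cumulative counts."""
    series_by_place = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["state"], row["county"])
        series_by_place[key].append((row["date"], int(row["cases"])))

    result = {}
    for key, series in series_by_place.items():
        series.sort()  # ISO dates sort chronologically as plain strings
        previous = 0
        daily = []
        for date, cumulative in series:
            # The first row for a place counts its whole cumulative total as new.
            daily.append((date, cumulative - previous))
            previous = cumulative
        result[key] = daily
    return result


new_cases = daily_new_cases(SAMPLE_CSV)
print(new_cases[("Washington", "Example")])
```

Grouping by `(state, county)` rather than `fips` is a design choice in this sketch so that rows with a missing `fips` value would still be grouped.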

Euro CDC (ECDC)

Compares countries and regions (e.g., Africa, Oceania) by cases and deaths
Global, by country and territory
dateRep,day,month,year,cases,deaths,countriesAndTerritories,geoId,countryterritoryCode,popData2018
CSV, JSON, XML
https://www.ecdc.europa.eu/en/geographical-distribution-2019-ncov-cases
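
The `popData2018` column makes per-capita comparisons between countries straightforward. The sketch below sums `cases` per country and scales to cases per 100,000 people. It assumes `cases` holds daily counts rather than cumulative ones; the sample rows and the `cases_per_100k` helper are invented for illustration.

```python
import csv
import io
from collections import defaultdict

# Illustrative rows only: fictional countries and counts in the listed column layout.
SAMPLE_CSV = """dateRep,day,month,year,cases,deaths,countriesAndTerritories,geoId,countryterritoryCode,popData2018
02/04/2020,2,4,2020,50,2,Exampleland,EX,EXL,1000000
01/04/2020,1,4,2020,30,1,Exampleland,EX,EXL,1000000
02/04/2020,2,4,2020,5,0,Otherland,OT,OTL,
"""


def cases_per_100k(csv_text):
    """Return {country: total cases per 100,000 people}, skipping unknown populations."""
    totals = defaultdict(int)
    population = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        country = row["countriesAndTerritories"]
        totals[country] += int(row["cases"])  # assumes daily, not cumulative, counts
        if row["popData2018"]:
            population[country] = int(row["popData2018"])
    return {
        country: 100_000 * total / population[country]
        for country, total in totals.items()
        if population.get(country)
    }


per_capita = cases_per_100k(SAMPLE_CSV)
print(per_capita)
```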

Nextstrain

Phylogeny and spread for COVID-19, with cool visualizations; Nextstrain has similar info for lots of other viruses
Global; by clade and by lab
name,node_attrs,division,num_date,confidence,gisaid_epi_isl,author,url,country,confidence,entropy,age,sex,host,recency,submitting_lab,originating_lab,div,region,clade_membership,branch_attrs,mutations
JSON
https://nextstrain.org/ncov

Worldometer

Many COVID statistics and charts, but apparently no downloadable data files
Global and by country and territory
Total cases, new cases, active cases, recoveries, serious+critical cases, growth factor, closed-case CFR
https://www.worldometers.info/coronavirus/coronavirus-cases/
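
Two of the metrics above are simple ratios that can be reproduced from any case-count dataset. The sketch below assumes the commonly used definitions: growth factor as one day's new cases divided by the previous day's new cases, and closed-case CFR as deaths divided by all cases with an outcome (deaths plus recoveries). Both function names are hypothetical, and the site's own methodology should be checked before comparing numbers.

```python
def growth_factor(new_cases_today, new_cases_yesterday):
    """Ratio of today's new cases to yesterday's; None if yesterday had none."""
    if new_cases_yesterday == 0:
        return None  # ratio is undefined without a nonzero baseline
    return new_cases_today / new_cases_yesterday


def closed_case_cfr(deaths, recoveries):
    """Share of closed cases (deaths + recoveries) that ended in death."""
    closed = deaths + recoveries
    if closed == 0:
        return None  # no closed cases yet
    return deaths / closed


# Invented numbers, for illustration only.
print(growth_factor(120, 100))
print(closed_case_cfr(20, 80))
```

A growth factor above 1 means new cases are rising day over day; below 1 means they are falling.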

Fatality rates

CEBM, Oxford

Defines and explains CFR and IFR and how to estimate and understand them; compares and connects COVID with other diseases (Swine Flu, cardiovascular disease, concurrent infections); discusses particular populations (China, Italy, Diamond Princess)
Fatality rates by country, age range, sex
https://www.cebm.net/covid-19/global-covid-19-case-fatality-rates/
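
The difference between CFR and IFR comes down to the denominator: CFR divides deaths by confirmed cases, while IFR divides deaths by all infections, including those never diagnosed. A minimal sketch with invented numbers (the function names are hypothetical, and the infection count would in practice be an estimate):

```python
def case_fatality_rate(deaths, confirmed_cases):
    """CFR: deaths as a fraction of confirmed (diagnosed) cases."""
    if confirmed_cases <= 0:
        raise ValueError("confirmed_cases must be positive")
    return deaths / confirmed_cases


def infection_fatality_rate(deaths, estimated_infections):
    """IFR: deaths as a fraction of all infections, diagnosed or not."""
    if estimated_infections <= 0:
        raise ValueError("estimated_infections must be positive")
    return deaths / estimated_infections


# Invented numbers: 10 deaths, 500 confirmed cases, 2000 estimated infections.
cfr = case_fatality_rate(10, 500)
ifr = infection_fatality_rate(10, 2000)
print(f"CFR = {cfr:.1%}, IFR = {ifr:.1%}")
```

Since undiagnosed infections only enlarge the denominator, the IFR for a given outbreak cannot exceed its CFR when both use the same death count.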

Lancet

"Estimates of the severity of coronavirus disease 2019: a model-based analysis." These early estimates give an indication of the fatality ratio across the spectrum of COVID-19 disease and show a strong age gradient in risk of death.
CFR and IFR total and by age group; cases, severe cases, hospitalization rates by age group
https://www.thelancet.com/journals/laninf/article/PIIS1473-3099(20)30243-7/fulltext

Mobility tracking

Citymapper

Mobility index based on the number of trips planned with the Citymapper navigation app
Movement rate by city and date
Major cities only, but worldwide
https://citymapper.com/cmi

Cuebiq

Mobility index by county and week, based on cell location data
Anywhere within US
https://www.cuebiq.com/visitation-insights-covid19/?utm_source=nyt&utm_medium=article&utm_campaign=organic

US general health stats

CDC NHANES

The National Health and Nutrition Examination Survey (NHANES) is a program of studies designed to assess the health and nutritional status of adults and children in the United States. The survey is unique in that it combines interviews and physical examinations.
Prevalence rates by major disease and year; these physical risk factors could be useful for a COVID model
US only
https://www.cdc.gov/nchs/nhanes/about_nhanes.htm

Other dataset collections

Academic Data Sci Alliance

datasets + lots of other cool COVID-19 resources
https://www.academicdatascience.org/covid

Figshare Dimensions

datasets + other COVID-19 research
Mostly biological research, but also some data on Africa, data from tweets about coronavirus, and "attitude studies" in Spain on how learning about the virus affects attitudes towards it.
https://covid-19.dimensions.ai

UW HGIS

Tons of data sources and code in their GitHub: https://github.com/jakobzhao/virus
Global situational heat-map: https://hgis.uw.edu/virus/