This project is described in detail in the corresponding blog post series:
Part I: https://medium.com/analytics-vidhya/google-big-query-with-r-875facef7844
Part II: https://medium.com/analytics-vidhya/live-data-extraction-with-cron-and-r-f29324bf153e
Part III: https://medium.com/analytics-vidhya/easy-api-building-for-data-scientists-with-r-673b381c4ae1
These REST APIs provide platform- and language-independent access to the air pollution data in the public Google BigQuery dataset bigquery-public-data:openaq.global_air_quality, limited to Indian measurement points. The dataset is updated daily, but older data seem to get deleted. To keep the data accessible, a Cron job fetches new data from Google every 12 hours through the R script get_data_big_query.R and appends the new rows to the saved dataset. The data can be requested in full or aggregated over date intervals through the APIs provided in the R script API.R. The Cron data import and the APIs run separately in two Docker containers that share a volume for the data, as specified in docker-compose.yml.
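The "add new rows to the saved dataset" step can be sketched as follows. This is only an illustration in Python, not the logic of get_data_big_query.R, which does this work in R. The function name append_new_rows and the choice of location, pollutant and timestamp as the columns that identify a row are assumptions, and the sample rows are made up.

```python
def append_new_rows(saved, fetched, key_fields=("location", "pollutant", "timestamp")):
    """Return the saved rows plus those fetched rows whose key is not stored yet.

    Assumption: the columns in key_fields together identify one measurement.
    """
    # Keys of everything already stored, so repeated fetches add no duplicates.
    seen = {tuple(row[field] for field in key_fields) for row in saved}
    merged = list(saved)
    for row in fetched:
        key = tuple(row[field] for field in key_fields)
        if key not in seen:
            seen.add(key)
            merged.append(row)
    return merged


# Made-up sample rows, for illustration only.
saved_rows = [
    {"location": "A", "pollutant": "pm25", "timestamp": "2020-01-01T00:00", "value": 40.0},
]
fetched_rows = [
    {"location": "A", "pollutant": "pm25", "timestamp": "2020-01-01T00:00", "value": 40.0},
    {"location": "A", "pollutant": "pm25", "timestamp": "2020-01-01T12:00", "value": 55.0},
]
merged_rows = append_new_rows(saved_rows, fetched_rows)
print(len(merged_rows))
```

Because rows already stored are skipped, the job can run every 12 hours against a source that overlaps with earlier fetches, and the saved dataset keeps rows that have since disappeared upstream.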
The Cloud Storage and BigQuery APIs must first be activated for the Google account in use at https://console.cloud.google.com/. A service account token (not included here; it should go in cron/src/) must then be generated and downloaded for authentication, as described in the section "Service account token" at https://gargle.r-lib.org/articles/get-api-credentials.html. For information about the R package bigrquery, see https://github.com/r-dbi/bigrquery.
-
Get complete data from all Indian measurement points
POST */all
NO parameters
content-type: application/json
-
Get all Indian measurement locations
POST */locations
NO parameters
content-type: application/json
-
Get median air quality metrics for the current date. The date horizon for averaging and the measurement location can be set.
POST */summed_quality_now?measurment_location=&date=
measurment_location: Takes any of the Indian measurement locations, defaults to all
date: The date range, one of today, week, month, quarter, year; defaults to today
content-type: application/json
-
Get median air quality for all dates. The time range for averaging and the measurement location can be set.
POST */summed_quality?measurment_location=&date=
measurment_location: Takes any of the Indian measurement locations, defaults to all
date: The date range, one of day, week, month, quarter, year; defaults to day
content-type: application/json
-
Plot a test histogram
GET */plot
NO parameters
content-type: application/json
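
Since the endpoints are meant to be usable from any language, a client call could look roughly like the Python sketch below, which uses only the standard library. The base URL http://localhost:8000 is an assumption (replace it with wherever the API container is reachable), and build_request and fetch_json are hypothetical helper names. The parameter names, including the spelling measurment_location, are taken from the endpoint list above.

```python
import json
import urllib.parse
import urllib.request

# Assumption: host and port where the API container is reachable.
BASE_URL = "http://localhost:8000"


def build_request(endpoint, method="POST", **params):
    """Build (but do not send) a request for one of the endpoints listed above."""
    # Leave out unset parameters so the server-side defaults apply.
    query = urllib.parse.urlencode(
        {name: value for name, value in params.items() if value is not None}
    )
    url = f"{BASE_URL}/{endpoint}" + (f"?{query}" if query else "")
    return urllib.request.Request(
        url, method=method, headers={"Content-Type": "application/json"}
    )


def fetch_json(request, timeout=30):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    request = build_request("summed_quality", measurment_location="all", date="week")
    print(request.get_method(), request.full_url)
    # rows = fetch_json(request)  # needs the API container to be running
```

The same helper covers the parameterless endpoints, for example build_request("locations") or build_request("plot", method="GET"). The format of the /plot response body is not specified here, so fetch_json may not apply to it.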