-
Notifications
You must be signed in to change notification settings - Fork 0
/
README.Rmd
146 lines (103 loc) · 4.36 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
message = FALSE,
warning = FALSE,
comment = "#>"
)
```
# habre_pm <a href='https://degauss.org'><img src='https://github.com/degauss-org/degauss_hex_logo/raw/main/PNG/degauss_hex.png' align='right' height='138.5' /></a>
[![](https://img.shields.io/github/v/release/degauss-org/habre_pm?color=469FC2&label=version&sort=semver)](https://github.com/degauss-org/habre_pm/releases)
[![container build status](https://github.com/degauss-org/habre_pm/workflows/build-deploy-release/badge.svg)](https://github.com/degauss-org/habre_pm/actions/workflows/build-deploy-release.yaml)
## Using
If `my_address_file_geocoded.csv` is a file in the current working directory with coordinate columns named `lat`, `lon`, (within the state of California) `start_date`, and `end_date` then the [DeGAUSS command](https://degauss.org/using_degauss.html#DeGAUSS_Commands):
```sh
docker run --rm -v $PWD:/tmp ghcr.io/degauss-org/habre_pm:0.2.1 my_address_file_geocoded.csv
```
will produce `my_address_file_geocoded_habre_pm_0.2.1.csv` with added columns:
- **`pm`**: time weighted average of weekly PM2.5
- **`sd`**: time weighted square root of average sum of squared weekly standard deviation
## Geomarker Methods
- Geocoded points are overlaid with weekly PM2.5 rasters corresponding to the input date range.
- For date ranges that span weeks, exposures are a time-weighted average.
#### Example code for time weighted average PM and SD
```{r}
library(tidyverse)
library(dht)
library(sf)
```
**Read in data**
Here we will only use row 3 of our test file because it spans two calendar weeks.
```{r}
d <- read_csv('test/my_address_file_geocoded_habre_pm25_leo.csv',
col_types = cols(start_date = col_date(format = "%m/%d/%y"),
end_date = col_date(format = "%m/%d/%y"))) |>
slice(3) |>
select(-prepm25, -prepm25_sd)
d
```
**Expand start and end dates to daily**
```{r}
d_daily <- dht::expand_dates(d, by = "day")
d_daily
```
**Read in date lookup / week index and expand start and end dates to daily**
```{r}
date_lookup <-
readr::read_csv("pm25_iweek_startdate.csv") |>
dht::expand_dates(by = "day")
date_lookup |>
filter(iweek %in% 413:414)
date_lookup <-
date_lookup |>
dplyr::select(week = iweek, date)
```
**Join week index to input data using date (daily)**
```{r}
d_week <- d_daily |>
dplyr::left_join(date_lookup, by = "date")
d_week
```
**Read in and join raster values**
```{r}
r <- terra::rast("habre.tif")
d_dedup <- d_week |>
select(lat, lon, week) |>
group_by(lat, lon, week) |>
filter(!duplicated(lat, lon, week)) |>
mutate(layer_name_mean = glue::glue("week{week}_mean"),
layer_name_sd = glue::glue("week{week}_std"))
d_for_extract <- d_dedup |>
st_as_sf(coords = c("lon", "lat"), crs = 4326) |>
st_transform(st_crs(r)) |>
terra::vect()
d_pm <- terra::extract(r,
d_for_extract,
layer = "layer_name_mean")
d_sd <- terra::extract(r,
d_for_extract,
layer = "layer_name_sd")
d_dedup <- d_dedup |>
ungroup() |>
mutate(pm = d_pm$value,
sd = d_sd$value)
d_out <- left_join(d_week, d_dedup, by = c("lat", "lon", "week"))
d_out
d_out <- d_out |>
group_by(id) |>
summarize(pm = round(sum(pm)/n(),2),
sd = round(sqrt((sum(sd^2))/n()),2))
d_out <- left_join(d, d_out, by = "id")
d_out
```
## Geomarker Data
- PM2.5 rasters were created using a model developed by Rima Habre and Lianfa Li.
> Li L, Girguis M, Lurmann F, Pavlovic N, McClure C, Franklin M, Wu J, Oman LD, Breton C, Gilliland F, Habre R. Ensemble-based deep learning for estimating PM2. 5 over California with multisource big data including wildfire smoke. Environment international. 2020 Dec 1;145:106143. https://doi.org/10.1016/j.envint.2020.106143
- The raster stack used in this container is stored in S3 at [`s3://habre/habre.tif`](https://habre.s3-us-east-2.amazonaws.com/habre.tif)
- Individual rasters that make up the raster stack are stored at [`s3://habre/li_2020/`](https://habre.s3-us-east-2.amazonaws.com/li_2020/)
## DeGAUSS Details
For detailed documentation on DeGAUSS, including general usage and installation, please see the [DeGAUSS homepage](https://degauss.org).