Figure out if we can rescue the negative depth observations #173
I know one file that has a lot of negative depths is |
Just following up on this to see where we are now (2 years later). There are 412,218 temperature observations with negative depths, in 166 site/source combinations (104 unique sites) across 67 sources.
library(scipiper)
library(tidyverse)
wqp_in <- sc_retrieve('7a_wqp_munge/out/temp_wqp_munged_linked.feather.ind')
wqp_dat <- feather::read_feather(wqp_in)
coop_in <- sc_retrieve('7a_temp_coop_munge/out/all_coop_dat_linked.feather.ind')
coop_dat <- feather::read_feather(coop_in)
f_all_dat <- dplyr::select(wqp_dat, date = Date, time, timezone, depth, temp = wtemp,
                           site_id = id, source_id = MonitoringLocationIdentifier,
                           source_site_id = MonitoringLocationIdentifier) %>%
  mutate(source = sprintf('wqp_%s', source_id)) %>%
  bind_rows(dplyr::select(coop_dat, date = DateTime, time,
                          timezone, depth, temp, site_id, source_id = state_id,
                          source_site_id = site, source)) %>%
  mutate(month = lubridate::month(date)) %>%
  filter(!(month %in% c(1, 2) & temp > 10)) %>%
  filter(!(month %in% c(7, 8) & depth < 0.5 & temp < 10)) %>%
  mutate(timezone = ifelse(is.na(time), NA, timezone))
# recent_f_dat is not defined in this snippet; it is presumably f_all_dat or a subset of it
recent_f_dat_neg <- recent_f_dat %>% filter(depth < 0)
nrow(recent_f_dat_neg)
[1] 412218
recent_f_dat_neg %>%
  group_by(site_id, source) %>%
  summarize(n = n()) %>%
  arrange(desc(n))
# A tibble: 166 x 3
# Groups: site_id [104]
site_id source n
<chr> <chr> <int>
1 nhdhr_32671150 7a_temp_coop_munge/tmp/Water_Temp.rds 205839
2 nhdhr_58125241 7a_temp_coop_munge/tmp/Water_Temp.rds 137455
3 nhdhr_120020307 7a_temp_coop_munge/tmp/Water_Temp.rds 46365
4 nhdhr_32672122 7a_temp_coop_munge/tmp/Water_Temp.rds 13820
5 nhdhr_120018008 7a_temp_coop_munge/tmp/Water_Temp.rds 8425
6 nhdhr_152517574 wqp_GNLK01_WQX-INGS 12
7 nhdhr_132544104 7a_temp_coop_munge/tmp/Iowa_DNR_LimnoProfiles_2000_2020.~ 11
8 nhdhr_133551903 7a_temp_coop_munge/tmp/Iowa_DNR_LimnoProfiles_2000_2020.~ 8
9 nhdhr_137044605 7a_temp_coop_munge/tmp/Iowa_DNR_LimnoProfiles_2000_2020.~ 8
10 nhdhr_60090166 wqp_LRBOI_WQX-TMan 7
# ... with 156 more rows
recent_f_dat_neg %>%
  group_by(source) %>%
  summarize(n = n()) %>%
  arrange(desc(n))
# A tibble: 67 x 2
source n
<chr> <int>
1 7a_temp_coop_munge/tmp/Water_Temp.rds 411904
2 7a_temp_coop_munge/tmp/Iowa_DNR_LimnoProfiles_2000_2020.rds 129
3 7a_temp_coop_munge/tmp/1945_2020_All_MNDNR_MPCA_Temp_DO_Profiles.rds 38
4 7a_temp_coop_munge/tmp/MPCA_temp_data_all.rds 36
5 wqp_GNLK01_WQX-INGS 12
6 wqp_LRBOI_WQX-TMan 7
7 7a_temp_coop_munge/tmp/DNRdatarequest_Secchi_DO_and_Temp_1083_2016_AllLa~ 5
8 wqp_LRBOI_WQX-TPine 5
9 wqp_SRSTEPA-SW-BD-2017-01 5
10 wqp_SRSTEPA-SW-FD-2017-01 4
# ... with 57 more rows
412009 temperature records were removed because they had negative depth values.
I think this is all from coop data, since depths < 0 m are silently removed from WQP at an earlier stage. Losing 400k+ records seems like a lot, and it would be good to know whether they all come from one provider that we could contact for clarification on the fields and their meaning.
code reference here
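To check whether the negative depths come mostly from one provider, it may help to compute each source's share of the negative-depth records rather than only the raw counts. The sketch below is a minimal illustration on made-up data: `toy_dat`, its values, and the source names `coop_x` and `wqp_y` are hypothetical, and only the column names (`site_id`, `source`, `depth`, `temp`) follow the snippet in this thread.

```r
library(dplyr)

# Hypothetical toy data; only the column names mirror the thread's snippet
toy_dat <- tibble(
  site_id = c("a", "a", "b", "b", "c"),
  source  = c("coop_x", "coop_x", "coop_x", "wqp_y", "wqp_y"),
  depth   = c(-1.0, -2.5, 0.5, -0.1, 3.0),
  temp    = c(20.1, 18.4, 21.0, 19.5, 12.2)
)

# Count negative-depth records per source and express each as a share of the total
neg_by_source <- toy_dat %>%
  filter(depth < 0) %>%
  count(source, name = "n_neg") %>%
  mutate(share = n_neg / sum(n_neg)) %>%
  arrange(desc(n_neg))

print(neg_by_source)
```

Applied to the per-source tally in this thread, the same calculation would put 7a_temp_coop_munge/tmp/Water_Temp.rds at 411,904 of 412,218 records, or about 99.9% of the negative depths.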