Rebuild data pipeline for Water Year 2022 #77

Merged
merged 8 commits into from
Nov 9, 2022


lindsayplatt
Contributor

This created and pushed the files needed by the viz for Water Year 2022 (Oct 1, 2021 to Sep 30, 2022) to our S3 bucket, including uploading a copy of the timeseries data to the vizlab-data bucket on the Dev VPC.

Member

@cnell-usgs cnell-usgs left a comment

Cool cool cool 😎 I ran the pipeline and did a local build, the updated animation appears when I run it locally
image

There seems to be a minor issue with labelling on the timeseries chart, probably because of the addition of times to the dates (and therefore less than a day of Sept is included in the month labels).

I'm also getting a flash of super-drought in the first frame that is different than expected for this date range (see below). That is likely because the initial SVG load has styles encoded from the prior run and then updates to the current one?

image

@lindsayplatt
Contributor Author

I was digging into the issue with the labels and from what I can tell, I actually over-engineered my timezone thing 😄 It appears that NWIS always reports things in the standard time of whatever timezone, not the daylight saving time. I was adjusting for the need to convert from daylight time (DT) to standard time (ST) for some times of the year by subtracting an hour. What this ended up doing was moving those dates that fall into the "daylight time" part of the year back an hour even though they were already in standard time. I've added a note in the code and changed how it adjusts.

I don't think it is worth regenerating the historic data at this point since it is a simple hour shift. Though, I can't tell how I missed that issue in my tests and checks before. Don't really have time to think about that, but am re-running the current dates with this fix now. Code and updated data to come soon.
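For illustration, here is a minimal sketch of that over-correction in base R. It is not the pipeline's actual code: the timestamp is made up, and the fixed UTC-5 offset (`Etc/GMT+5`, i.e. Eastern Standard Time) is an assumption chosen for the example.

```r
# Hypothetical reading, assumed to already be in standard time (UTC-5).
# Note: the POSIX-style name "Etc/GMT+5" means 5 hours *behind* UTC.
reported <- as.POSIXct("2021-10-01 00:30:00", tz = "Etc/GMT+5")

# The over-correction: subtract an hour as if the value were in daylight time
over_adjusted <- reported - 3600

# Date each timestamp falls on, evaluated in the same standard-time zone
date_reported <- as.Date(reported, tz = "Etc/GMT+5")
date_over_adjusted <- as.Date(over_adjusted, tz = "Etc/GMT+5")

print(date_reported)       # first day of the water year
print(date_over_adjusted)  # pushed back into the previous day
```

Any value in the first hour of a day slips into the previous date, which would explain a sliver of Sept showing up at the start of the chart.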

@lindsayplatt
Contributor Author

Gah, but sadly not every site is reported in standard time, so now I end up with some sites on Oct 1, 2022. It seems impossible to win here. I might just do a filter here and call it. Alternatively, I could make an in file that represents these sites that go against the norm and use that to do something different, and we can add to it in the future if we need to. @ceenell let me know what you think is best.

library(tidyverse)
# Count how many values are present on each date
n_vals_per_date <- read_csv("1_fetch/out/gw_data.csv") %>%
  group_by(Date) %>%
  tally()
head(n_vals_per_date, 3)
tail(n_vals_per_date, 3)

> head(n_vals_per_date, 3)
# A tibble: 3 x 2
  Date           n
  <date>     <int>
1 2021-10-01  2130
2 2021-10-02  2133
3 2021-10-03  2130
> tail(n_vals_per_date, 3)
# A tibble: 3 x 2
  Date           n
  <date>     <int>
1 2022-09-29  1521
2 2022-09-30  1518
3 2022-10-01    76  # <--- GAH! Now there is some bleeding into the next year.

There are 76 sites across 4 states that have this issue.

library(tidyverse)
library(sf) # for st_drop_geometry()

# Records dated after the end of Water Year 2022
sites_in_fy23 <- read_csv("1_fetch/out/gw_data.csv") %>%
  filter(Date > as.Date('2022-09-30'))

# Site locations from the pipeline, limited to the affected sites
gw_sites_sf <- scipiper::scmake('gw_sites_sf')
gw_sites_sf_fy23 <- gw_sites_sf %>%
  filter(site_no %in% sites_in_fy23$site_no) %>%
  mutate(state_abbr = dataRetrieval::stateCdLookup(state_cd))

# Count the affected sites per state
gw_sites_sf_fy23 %>%
  st_drop_geometry() %>%
  group_by(state_abbr) %>%
  tally()

# A tibble: 4 x 2
  state_abbr     n
  <chr>      <int>
1 CA             4
2 IL            45
3 IN            10
4 MI            17

@cnell-usgs
Member

Document this in an issue as a future improvement and then filter it. You were thinking to chop the last day off for the 76 sites, right? I wouldn't want to drop the sites altogether, and the time lag is going to be insignificant in the animation.

@lindsayplatt
Contributor Author

Sounds good. And yes, going to chop off just the last day, not dump those sites.
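A rough sketch of that filter, assuming the data keep the `site_no` and `Date` columns used above. The toy data frame, its `value` column, and the `wy_end` name are made up for illustration and are not the pipeline's actual code.

```r
library(dplyr)

# Toy stand-in for 1_fetch/out/gw_data.csv (site numbers and values are made up)
gw_data <- data.frame(
  site_no = c("A", "A", "B", "B"),
  Date = as.Date(c("2022-09-30", "2022-10-01", "2022-09-29", "2022-09-30")),
  value = c(1.2, 1.3, 4.5, 4.4)
)

# Last day of Water Year 2022
wy_end <- as.Date("2022-09-30")

# Drop only the records that spill past the water year; keep every site
gw_data_wy22 <- gw_data %>%
  filter(Date <= wy_end)

# Every site is still present, and no date exceeds the water year end
n_sites_before <- n_distinct(gw_data$site_no)
n_sites_after <- n_distinct(gw_data_wy22$site_no)
```

Filtering on `Date` rather than on `site_no` is what keeps the 76 sites in the animation while trimming only their stray Oct 1 values.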

@lindsayplatt
Contributor Author

See #78

Member

@cnell-usgs cnell-usgs left a comment

Apologies - think this was waiting on me. I'm going to go ahead and merge this to get the new WY out to the public.

@cnell-usgs cnell-usgs merged commit cfeec46 into DOI-USGS:main Nov 9, 2022