This repository has been archived by the owner on Jun 2, 2023. It is now read-only.

Data release buildout 1 #1

Draft · wants to merge 4 commits into base: main
91 changes: 14 additions & 77 deletions 1_spatial.yml
@@ -6,7 +6,7 @@ packages:
- rgdal
- sf
- zip

sources:
- src/spatial_functions.R
- src/fetch_filter_functions.R
@@ -15,82 +15,19 @@ sources:
targets:
1_spatial:
depends:
- river_metadata
- out_data/study_stream_reaches.zip
- reservoir_metadata
- out_data/study_reservoirs.zip
- out_data/study_monitoring_sites.zip

# not sure what this metadata should be, if anything
# one option is to include a reach to HRU crosswalk
#out_data/river_reach_metadata.csv:
# command: create_metadata_file(target_name,
# streams_sf = modeled_streams_sf,
# stream_to_hru = 'XX')

modeled_network_sf:
command: retrieve_network(network_sf_fl = '../delaware-model-prep/1_network/out/network.rds')

network_vertices_sf:
command: retrieve_vertices(network_sf_fl = '../delaware-model-prep/1_network/out/network.rds')

# should include shapefile of HRUs
#hrus_sf:
#command: retrieve_hrus(hrus_sf_fl = 'XX')
- out_data/XX_geospatial_area_WG84.zip
- geospatial_area_metadata

# include map of network, maybe with HRUs?
#out_data/modeling_domain_map.png:
# command: plot_domain_map(target_name,
# network_sf = modeled_network_sf,
#plot_crs = I("+init=epsg:2811"))

river_metadata:
command: extract_feature(network_vertices_sf)

river_metadata2:
command: extract_feature(modeled_network_sf)

out_data/study_stream_reaches.zip:
command: sf_to_zip(target_name,
sf_object = modeled_network_sf,
layer_name = I('study_stream_reaches'))

#out_data/01_spatial_hru.zip:
# command: sf_to_zip(target_name,
# sf_object = hrus_sf,
# layer_name = I('study_hrus'))

network_lordville:
command: readRDS(file = '../delaware-model-prep/9_collaborator_data/umn/network_subset_lordville.rds')

lordville_sites:
command: get_sites(network_lordville)
# create target for the spatial file
geospatial_area_WG84:
command: read_spatial_file(path = 'in_data/example_data/drb_shp/physiographic_regions_DRB.shp', selected_crs = 4326)

# Define scope of reservoir modeling data in this repo
reservoir_modeling_site_ids:
command: c(I(c(
Pepacton = 'nhdhr_151957878',
Cannonsville = 'nhdhr_120022743')))

reservoir_polygons:
command: fetch_filter_res_polygons(
out_rds = target_name,
in_ind = "../lake-temperature-model-prep/1_crosswalk_fetch/out/canonical_lakes_sf.rds.ind",
in_repo = I('../lake-temperature-model-prep/'),
site_ids = reservoir_modeling_site_ids)

reservoir_metadata:
command: extract_feature(reservoir_polygons)

out_data/study_reservoirs.zip:
command: sf_to_zip(zip_filename = target_name,
sf_object = reservoir_polygons, layer_name = I('study_reservoirs'))

monitoring_sites:
command: readRDS(file = '../delaware-model-prep/2_observations/out/drb_filtered_sites.rds')
# grab metadata
geospatial_area_metadata:
command: extract_feature(geospatial_area_WG84)

out_data/study_monitoring_sites.zip:
command: reduce_and_zip(zip_filename = target_name,
in_dat = monitoring_sites,
layer_name = I('study_monitoring_sites'))

# Output geospatial area shp
out_data/XX_geospatial_area_WG84.zip:
command: sf_to_zip(target_name,
sf_object = geospatial_area_WG84,
layer_name = I('geospatial_area_WG84'))
95 changes: 5 additions & 90 deletions 2_observations.yml
@@ -7,99 +7,14 @@ packages:

sources:
- src/file_functions.R
- src/fetch_filter_functions.R

targets:
2_observations:
depends:
- out_data/temperature_observations_drb.zip
- out_data/temperature_observations_lordville.zip
- out_data/temperature_observations_forecast_sites.zip
- out_data/flow_observations_drb.zip
- out_data/flow_observations_lordville.zip
- out_data/flow_observations_forecast_sites.zip
- out_data/reservoir_releases_total.csv
- out_data/reservoir_releases_by_type_drb.csv
- out_data/reservoir_releases_by_type_lordville.csv
- out_data/reservoir_realsat_monthly_surface_area.csv
- out_data/reservoir_io_obs.csv
- out_data/reservoir_temp_obs.csv
- out_data/reservoir_level_obs.csv
- out_data/XX_observations.zip

##### Transfer of files from delaware-model-prep #####
# daily flow and temperature data
out_data/temperature_observations_drb.zip:
command: zip_obs(out_file = target_name, in_file = '../delaware-model-prep/9_collaborator_data/umn/obs_temp_full.csv')

out_data/temperature_observations_lordville.zip:
command: zip_obs(out_file = target_name, in_file = '../delaware-model-prep/9_collaborator_data/umn/obs_temp_subset_lordville.csv')

out_data/temperature_observations_forecast_sites.zip:
command: zip_obs(out_file = target_name, in_file = '../delaware-model-prep/2_observations/out/obs_temp_priority_sites.csv')

out_data/flow_observations_drb.zip:
command: zip_obs(out_file = target_name, in_file = '../delaware-model-prep/9_collaborator_data/umn/obs_flow_full.csv')

out_data/flow_observations_lordville.zip:
command: zip_obs(out_file = target_name, in_file = '../delaware-model-prep/9_collaborator_data/umn/obs_flow_subset_lordville.csv')

out_data/flow_observations_forecast_sites.zip:
command: zip_obs(out_file = target_name, in_file = '../delaware-model-prep/2_observations/out/obs_flow_priority_sites.csv')

out_data/reservoir_releases_total.csv:
command: file.copy(from = '../delaware-model-prep/2_observations/out/complete_reservoir_releases.csv', to = target_name)

out_data/reservoir_releases_by_type_drb.csv:
command: file.copy(from = '../delaware-model-prep/2_observations/out/reservoir_releases.csv', to = target_name)

out_data/reservoir_releases_by_type_lordville.csv:
command: filter_reservoirs(out_file = target_name, in_dat = 'out_data/reservoir_releases_by_type.csv', keep = I(c('Pepacton', 'Cannonsville')))

# get GRAND IDs of reservoirs above Lordville
reservoir_lordville:
command: c(I(c('1550', '2192')))

out_data/reservoir_realsat_monthly_surface_area.csv:
command: file.copy(from = '../delaware-model-prep/2_observations/out/realsat_monthly_surface_area.csv', to = target_name)

out_data/reservoir_io_obs.csv:
command: copy_filter_feather(
out_csv = target_name,
in_feather = '../delaware-model-prep/9_collaborator_data/res/res_io_obs.feather',
site_ids = reservoir_modeling_site_ids)

##### Transfer and filtering of files from lake-temperature-model-prep #####
out_data/reservoir_temp_obs.csv:
command: fetch_filter_tibble(
out_csv = target_name,
in_ind = '../lake-temperature-model-prep/7b_temp_merge/out/drb_daily_reservoir_temps.rds.ind',
in_repo = I('../lake-temperature-model-prep/'),
site_ids = reservoir_modeling_site_ids)

out_data/reservoir_level_nwis.csv:
command: fetch_filter_tibble(
out_csv = target_name,
in_ind = '../lake-temperature-model-prep/7a_nwis_munge/out/drb_reservoirs_waterlevels_munged.rds.ind',
in_repo = I('../lake-temperature-model-prep/'),
site_ids = reservoir_modeling_site_ids)

out_data/reservoir_level_nycdep.rds:
command: fetch_filter_nycdep(
out_rds = target_name,
in_ind = '../lake-temperature-model-prep/7a_nwis_munge/out/NYC_DEP_reservoir_waterlevels.rds.ind',
in_repo = I('../lake-temperature-model-prep/'),
site_ids = reservoir_modeling_site_ids)

out_data/reservoir_level_usgs_historical.rds:
command: fetch_filter_historical(
out_rds = target_name,
in_ind = '../delaware-model-prep/2_observations/out/interpolated_daily_reservoir_water_budget_components.csv.ind',
in_repo = I('../delaware-model-prep/'),
xwalk = I(c('Cannonsville' = 'nhdhr_120022743', 'Pepacton' = 'nhdhr_151957878')))

out_data/reservoir_level_obs.csv:
command: combine_level_sources(
out_csv = target_name,
nwis_levels = 'out_data/reservoir_level_nwis.csv',
nyc_levels = 'out_data/reservoir_level_nycdep.rds',
hist_levels = 'out_data/reservoir_level_usgs_historical.rds')

out_data/XX_observations.zip:
command: zip_obs(out_file = target_name, in_file = 'in_data/example_data/example_temp_drb_220101.csv')

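Several of the transfer targets above share one pattern: read a table from a sibling repo and keep only the rows for the two modeled reservoirs. A hypothetical Python sketch of that filtering step (the `site_id` column name is an assumption; the repo's `copy_filter_feather()`/`fetch_filter_tibble()` are R functions):

```python
import csv
import io

# Site ids defined in 1_spatial.yml for the two modeled reservoirs
RESERVOIR_SITE_IDS = {"nhdhr_151957878", "nhdhr_120022743"}  # Pepacton, Cannonsville

def filter_by_site(rows, site_ids=RESERVOIR_SITE_IDS, id_col="site_id"):
    """Keep only rows whose id column matches a modeled reservoir."""
    return [row for row in rows if row[id_col] in site_ids]

# Tiny in-memory stand-in for a feather/rds table from a sibling repo
sample = io.StringIO(
    "site_id,date,temp_c\n"
    "nhdhr_151957878,2021-04-16,9.1\n"
    "nhdhr_999999999,2021-04-16,12.3\n"
    "nhdhr_120022743,2021-04-16,8.7\n"
)
kept = filter_by_site(csv.DictReader(sample))
print(len(kept))  # 2 rows survive the filter
```

Centralizing the site-id set (as `reservoir_modeling_site_ids` does in 1_spatial.yml) means adding a reservoir later touches one target rather than every filter call.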
45 changes: 20 additions & 25 deletions 3_driver.yml
@@ -1,37 +1,32 @@
target_default: 3_config
target_default: 3_driver

include:

packages:
- dplyr
- RJSONIO
- zip
- readr
- sf

sources:
- src/fetch_filter_functions.R
- src/file_functions.R
- src/fetch_filter_functions.R

targets:
3_config:
3_driver:
depends:
- out_data/reservoir_nml_values.json

##### Transfer and filtering of files from lake-temperature-model-prep #####

# nml values in a nested list with one top-level element per site, saved as JSON
# read in with RJSONIO::fromJSON()
out_data/reservoir_nml_values.json:
command: fetch_filter_nml(
out_json = target_name,
in_ind = '../lake-temperature-model-prep/7_config_merge/out/nml_list.rds.ind',
in_repo = I('../lake-temperature-model-prep/'),
site_ids = reservoir_modeling_site_ids)

##### Archive of files from res-temperature-process-models #####

# .nml files manually copied and renamed from res-temperature-process-models
out_data/reservoir_nml_files.zip:
command: zip_files(
out_file = target_name,
'in_data/glm3_cal_nhdhr_120022743.nml',
'in_data/glm3_cal_nhdhr_151957878.nml')
- out_data/XX_driver_data_1.zip
- out_data/XX_driver_data_2.zip

## gridMET
out_data/driver_data_1.csv:
command: file.copy(from = 'in_data/example_data/gridmet_sample1.csv', to = target_name)

out_data/XX_driver_data_1.zip:
command: zip_this(out_file = target_name, .object = 'out_data/driver_data_1.csv')

out_data/driver_data_2.csv:
command: file.copy(from = 'in_data/example_data/gridmet_sample2.csv', to = target_name)

out_data/XX_driver_data_2.zip:
command: zip_this(out_file = target_name, .object = 'out_data/driver_data_2.csv')
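The comments in the removed section note that `out_data/reservoir_nml_values.json` holds a nested list with one top-level element per site, to be read with `RJSONIO::fromJSON()`. An equivalent read in Python (the per-site field names below are illustrative, not taken from the actual file):

```python
import json

# Assumed shape of out_data/reservoir_nml_values.json: one top-level
# entry per site id, per the "one element per site" comment above.
sample = json.loads("""
{
  "nhdhr_120022743": {"lake_name": "Cannonsville", "lake_depth": 50.0},
  "nhdhr_151957878": {"lake_name": "Pepacton",     "lake_depth": 55.0}
}
""")

# Pull one parameter for each modeled reservoir
depths = {site: vals["lake_depth"] for site, vals in sample.items()}
print(depths["nhdhr_151957878"])  # 55.0
```

Keying the JSON by site id keeps downstream filtering trivial: consumers subset by the same ids used throughout the pipeline.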
36 changes: 14 additions & 22 deletions 4_predictions.yml
@@ -4,32 +4,24 @@ include:

packages:
- dplyr
- zip
- readr
- sf

sources:
- src/file_functions.R
- src/fetch_filter_functions.R

targets:
5_predictions:
4_predictions:
depends:
- out_data/reservoir_io_sntemp.csv
- out_data/reservoir_downstream_preds.csv
- out_data/reservoir_outlet_depth_preds.csv
- out_data/dwallin_stream_preds.csv
- out_data/XX_forecasts1.zip
- out_data/XX_forecasts2.zip

out_data/reservoir_io_sntemp.csv:
command: copy_filter_feather(
out_csv = target_name,
in_feather = '../delaware-model-prep/9_collaborator_data/res/res_io_sntemp.feather',
site_ids = reservoir_modeling_site_ids)

out_data/reservoir_downstream_preds.csv:
command: file.copy(from = '../res-temperature-process-models/5_extract/out/downstream_preds.csv', to = target_name)

out_data/reservoir_outlet_depth_preds.csv:
command: file.copy(from = '../res-temperature-process-models/5_extract/out/outlet_depth_preds.csv', to = target_name)

out_data/dwallin_stream_preds.csv:
command: file.copy(from = '../delaware-model-prep/3_predictions/out/dwallin_stream_preds.csv', to = target_name)

out_data/forecast[2021-04-16_2021-07-16]_files.zip:
command: file.copy(from = 'in_data/forecast[2021-04-16_2021-07-16]_files.zip', to = target_name)
# Forecast dataset 1
out_data/XX_forecasts1.zip:
command: zip_this(out_file = target_name, .object = 'in_data/example_data/predictions_data_1.csv')

# Forecast dataset 2
out_data/XX_forecasts2.zip:
command: zip_this(out_file = target_name, .object = 'in_data/example_data/predictions_data_2.csv')
Empty file added in_data/.empty
Empty file.
22 changes: 13 additions & 9 deletions in_text/text_00_parent.yml
@@ -2,23 +2,27 @@ title: >-
Estimating Stream Salinity Impairment Across the Delaware River Basin Using Space- and Time-Aware Machine Learning

abstract: >-
[CHANGE] This data release contains information to support water quality modeling in the Delaware River Basin (DRB).
These data support both process-based and machine learning approaches to water quality modeling, including
the prediction of stream temperature. Reservoirs in the DRB serve an important role as a source of drinking
water, but also affect downstream water quality. Therefore, this data release includes data that
characterize both rivers and a subset of reservoirs in the basin. This release provides an update
to many of the files provided in a previous data release (Oliver et al., 2021).

Stream salinity has been increasing for the last four decades across the Delaware River Basin (DRB),
partially due to groundwater storage and release of road salt applied for deicing. If left unmanaged,
this can have proximate consequences for infrastructure and ecosystems. This data release contains data used
by machine learning models that integrate finer-scale spatio-temporal dynamics to estimate stream salinity across
the DRB, providing insight into salinity-impaired reaches and the processes that drive changes in stream salinity,
which is critical to informed salinity management. It includes data that characterize both
rivers and a subset of reservoirs in the basin.


The data are stored in 4 child folders: 1) spatial information, 2) observations, 3) model driver data, and 4) predictions.

[CHANGE] <li><a href="https://www.sciencebase.gov/catalog/item/623e54c4d34e915b67d83580"> 1) Spatial Information </a>- Spatial data used for modeling efforts in the Delaware River Basin</li> -
<li><a href="https://www.sciencebase.gov/catalog/item/XX"> 1) Spatial Information </a>- [CHANGE] Spatial data used for modeling efforts in the Delaware River Basin</li> -
a shapefile of polylines for the river segments, point data for observation locations, and polygons for the three (Pepacton, Cannonsville, and Neversink) reservoirs in this dataset.
<li><a href="https://www.sciencebase.gov/catalog/item/623e550ad34e915b67d8366e"> 2) Observations </a>- Reservoir (surface levels, releases, diversions, water temperature) and river (water temperature and flow)
<li><a href="https://www.sciencebase.gov/catalog/item/XX"> 2) Observations </a>- [CHANGE] Reservoir (surface levels, releases, diversions, water temperature) and river (water temperature and flow)
observations that can be used to train and test water quality models. </li>
<li><a href="https://www.sciencebase.gov/catalog/item/623e5587d34e915b67d83806"> 3) Model driver data </a>- Driver data used to force water quality models, including
<li><a href="https://www.sciencebase.gov/catalog/item/XX"> 3) Model driver data </a>- [CHANGE] Driver data used to force water quality models, including
stream reach distance matrices and daily meteorology data from NOAA GEFS and gridMET. This child item also
includes the inputs and outputs of an uncalibrated run of PRMS-SNTemp which predicts mean water temperature
at all reaches in the DRB.
<li><a href="https://www.sciencebase.gov/catalog/item/XX"> 4) Predictions </a>-

This data compilation was funded by the USGS.

Empty file added out_data/.empty
Empty file.