482 create tadamonitoringlocationidentifier in tada autoclean #523

Open
wants to merge 42 commits into
base: develop
Commits
01ba18e
Update Utilities.R
hillarymarler Aug 30, 2024
8dc776f
Update Utilities.R
hillarymarler Aug 30, 2024
2659fba
Update Utilities.R
hillarymarler Aug 30, 2024
765a28f
Update Utilities.R
hillarymarler Sep 6, 2024
88901cc
Use TADA.MonitoringLocationIdentifier in DepthProfile.R
hillarymarler Sep 6, 2024
4516d91
Use TADA.MonitoringLocationIdentifier in grouping in functions
hillarymarler Sep 9, 2024
45ba95e
Documentation updates
hillarymarler Sep 9, 2024
b6e8a17
Merge branch 'develop' into 482-create-tadamonitoringlocationidentifi…
hillarymarler Sep 9, 2024
877946d
Update Utilities.R
hillarymarler Sep 9, 2024
72f8459
Update example data
hillarymarler Sep 9, 2024
1fbf1be
Update Utilities.R
hillarymarler Sep 9, 2024
edec4d9
Update CriteriaComparison.R
hillarymarler Sep 10, 2024
6b06998
Update depth profile functions and documentation
hillarymarler Sep 10, 2024
2883495
Update Figures.R
hillarymarler Sep 10, 2024
63761b5
Update TADA_OverviewMap.Rd
hillarymarler Sep 10, 2024
a20bb05
Updated OverviewMap and added TADA.MonitorinigLocationTypeName
hillarymarler Sep 10, 2024
d1ba61a
Update Documentation
hillarymarler Sep 10, 2024
2c55ea9
Updates incorporating TADA.MonitoringLocationTypeName
hillarymarler Sep 10, 2024
a58be61
Update Utilities.R
hillarymarler Sep 11, 2024
dc76337
Updates to TADA_FlaggedSitesMap
hillarymarler Sep 11, 2024
ff1a6b7
Update Figures.R
hillarymarler Sep 12, 2024
825cf6f
Update Figures.R
hillarymarler Sep 12, 2024
9d9dca4
Update documentation
hillarymarler Sep 12, 2024
85a5ade
Documentation updates
hillarymarler Sep 12, 2024
72deada
Analysis data filter bug fix
hillarymarler Sep 12, 2024
33d961b
Update Filtering.R
hillarymarler Sep 13, 2024
1b8fecc
Update Filtering.R
hillarymarler Sep 13, 2024
6dbbbba
Update TADAModule1_BeginnerTraining.Rmd
hillarymarler Sep 18, 2024
2bc5b93
Update TADAModule1_BeginnerTraining.Rmd
hillarymarler Sep 19, 2024
84c76bc
Update TADAModule1_BeginnerTraining.Rmd
hillarymarler Sep 19, 2024
cb1693e
Update TADAModule1_BeginnerTraining.Rmd
hillarymarler Sep 19, 2024
8d43b6c
Update TADAModule1_BeginnerTraining.Rmd
hillarymarler Sep 19, 2024
2139687
Update TADAModule1_BeginnerTraining.Rmd
hillarymarler Sep 19, 2024
f9a3c54
Update example data
hillarymarler Sep 20, 2024
ad9029b
Update Filtering.R
hillarymarler Sep 23, 2024
bbfd1f7
Update TADAModule1_AdvancedTraining.Rmd
wokenny13 Sep 24, 2024
9bf9db0
Merge branch 'develop' into 482-create-tadamonitoringlocationidentifi…
hillarymarler Oct 29, 2024
e3d1ff2
Update from develop
hillarymarler Oct 29, 2024
d6d80ad
Update Utilities.R
hillarymarler Oct 31, 2024
940bcdc
Merge branch 'develop' into 482-create-tadamonitoringlocationidentifi…
hillarymarler Nov 1, 2024
a2ebe60
Merge branch 'develop' into 482-create-tadamonitoringlocationidentifi…
hillarymarler Nov 7, 2024
6ea8b11
Merge updates from develop
hillarymarler Nov 7, 2024
18 changes: 10 additions & 8 deletions R/CriteriaComparison.R
@@ -186,15 +186,17 @@ TADA_CreatePairRef <- function(.data, ph = TRUE, hardness = TRUE, temp = TRUE,
#' Pair Results for Numeric Criteria Calculation (UNDER ACTIVE DEVELOPMENT)
#'
#' This function pairs TADA results with results from specified characteristics from the same
#' MonitoringLocation within a user-specified time window to facilitate the calculation of numeric
#' criteria. The columns created by TADA_AutoClean are required to run this function. If they are not
#' present in the data frame, the function will stop and print an error message.
#' TADA.MonitoringLocation within a user-specified time window to facilitate the calculation of
#' numeric criteria. The columns created by TADA_AutoClean are required to run this function. If
#' they are not present in the data frame, the function will stop and print an error message.
#'
#' Users can provide a pairing reference file (can be created using TADA_CreatePairRef) to specify
#' which combinations of TADA.CharacteristicName, TADA.ResultMeasure.MeasureUnit,
#' TADA.MethodSpeciationName, and TADA.ResultSampleFractionText should be used for hardness, pH,
#' temperature, salinity, chloride or other user-defined groups. If no ref is specified, all possible
#' combinations for hardness, pH, temperature, salinity and chloride will be used.
#' temperature, salinity, chloride or other user-defined groups. If no ref is specified, all
#' possible combinations for hardness, pH, temperature, salinity and chloride will be used. It is
#' highly recommended that users perform all unit conversion and synonym harmonization before using
#' TADA_PairForCriteriaCalc.
#'
#' @param .data TADA dataframe
#'
@@ -282,7 +284,7 @@ TADA_PairForCriteriaCalc <- function(.data, ref = "null", hours_range = 4) {
) %>%
dplyr::select(
TADA.CharacteristicName, TADA.ResultMeasureValue, TADA.ResultMeasure.MeasureUnitCode,
ActivityIdentifier, MonitoringLocationIdentifier, ActivityStartDateTime,
ActivityIdentifier, TADA.MonitoringLocationIdentifier, ActivityStartDateTime,
TADA.ResultSampleFractionText, TADA.MethodSpeciationName
) %>%
dplyr::left_join(ref.subset,
@@ -340,11 +342,11 @@ TADA_PairForCriteriaCalc <- function(.data, ref = "null", hours_range = 4) {
dplyr::filter(
!ResultIdentifier %in% pair.activityid$ResultIdentifier,
!is.na(ActivityStartDateTime),
MonitoringLocationIdentifier %in% pair.subset$MonitoringLocationIdentifier
TADA.MonitoringLocationIdentifier %in% pair.subset$TADA.MonitoringLocationIdentifier
) %>%
dplyr::left_join(pair.subset2,
relationship = "many-to-many",
by = dplyr::join_by(MonitoringLocationIdentifier)
by = dplyr::join_by(TADA.MonitoringLocationIdentifier)
) %>%
dplyr::group_by(ResultIdentifier) %>%
# Figure out fastest time comparison method - needs to be absolute time comparison
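The hunk above joins candidate results to pairing results on TADA.MonitoringLocationIdentifier and then compares times within each ResultIdentifier. The following is a minimal sketch of that idea on toy data, assuming dplyr 1.1.0 or later. The data frames, values, and the closest-in-time selection are illustrative assumptions, not the package's implementation.

```r
library(dplyr)

# Toy results to be paired (hypothetical values)
results <- tibble(
  ResultIdentifier = c("R1", "R2"),
  TADA.MonitoringLocationIdentifier = c("SITE-A", "SITE-B"),
  ActivityStartDateTime = as.POSIXct(
    c("2024-09-01 10:00:00", "2024-09-01 10:00:00"),
    tz = "UTC"
  )
)

# Toy pH results that could be paired with them (hypothetical values)
ph <- tibble(
  TADA.MonitoringLocationIdentifier = c("SITE-A", "SITE-A", "SITE-B"),
  ph_time = as.POSIXct(
    c("2024-09-01 09:00:00", "2024-09-01 20:00:00", "2024-09-02 10:00:00"),
    tz = "UTC"
  ),
  ph_value = c(7.1, 7.4, 6.9)
)

hours_range <- 4

paired <- results %>%
  # every result is matched to every pH result at the same location
  left_join(ph,
    relationship = "many-to-many",
    by = join_by(TADA.MonitoringLocationIdentifier)
  ) %>%
  # absolute time difference in hours
  mutate(
    hours_apart = abs(as.numeric(
      difftime(ph_time, ActivityStartDateTime, units = "hours")
    ))
  ) %>%
  # keep only candidates inside the window, then the closest one per result
  filter(hours_apart <= hours_range) %>%
  group_by(ResultIdentifier) %>%
  slice_min(hours_apart, n = 1, with_ties = FALSE) %>%
  ungroup()
```

In this toy data only R1 has a pH result within 4 hours at the same location, so R2 drops out of `paired`.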
96 changes: 48 additions & 48 deletions R/DepthProfile.R

Large diffs are not rendered by default.

198 changes: 130 additions & 68 deletions R/Figures.R

Large diffs are not rendered by default.

21 changes: 12 additions & 9 deletions R/Filtering.R
@@ -44,7 +44,7 @@ TADA_FieldCounts <- function(.data, display = c("key", "most", "all"), character
"ActivityMediaSubdivisionName",
"ActivityCommentText",
"ResultCommentText",
"MonitoringLocationTypeName",
"TADA.MonitoringLocationTypeName",
"StateCode",
"OrganizationFormalName",
"TADA.CharacteristicName",
@@ -79,8 +79,10 @@ TADA_FieldCounts <- function(.data, display = c("key", "most", "all"), character
"ActivityRelativeDepthName",
"ProjectIdentifier",
"ProjectName",
"TADA.MonitoringLocationIdentifier",
"MonitoringLocationIdentifier",
"MonitoringLocationName",
"MonitoringLocationTypeName",
"ActivityCommentText",
"SampleAquifer",
"HydrologicCondition",
@@ -112,7 +114,6 @@ TADA_FieldCounts <- function(.data, display = c("key", "most", "all"), character
"ResultDetectionQuantitationLimitUrl",
"DetectionQuantitationLimitTypeName",
"ProviderName",
"MonitoringLocationTypeName",
"MonitoringLocationDescriptionText",
"HUCEightDigitCode",
"HorizontalCollectionMethodName",
@@ -191,10 +192,14 @@ TADA_FieldValuesTable <- function(.data, field = "null", characteristicName = "n
if (!field %in% names(.data)) {
stop("Field input does not exist in dataset. Please populate the 'field' argument with a valid field name. Enter ?TADA_FieldValuesTable in console for more information.")
}

# change NAs to "NA" (character string)
.data[[field]][is.na(.data[[field]])] <- "NA"

# filter to characteristic if provided
if (!characteristicName %in% c("null")) {
.data <- subset(.data, .data$TADA.CharacteristicName %in% c(characteristicName))
.data <- .data %>%
dplyr::filter(TADA.CharacteristicName %in% characteristicName)
if (dim(.data)[1] < 1) {
stop("Characteristic name(s) provided are not contained within the input dataset. Note that TADA converts characteristic names to ALL CAPS for easier harmonization.")
}
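The change above recodes missing values in the chosen field as the character string "NA" before filtering to the requested characteristic, presumably so that missing values show up in the value counts. A small sketch of that behavior on toy data follows. The data frame `dat` and its values are made up for illustration, and base `table()` stands in for whatever counting the function does afterwards.

```r
library(dplyr)

# Toy data with one missing site type (hypothetical values)
dat <- tibble(
  TADA.CharacteristicName = c("PH", "PH", "TEMPERATURE"),
  TADA.MonitoringLocationTypeName = c("STREAM", NA, "LAKE")
)

field <- "TADA.MonitoringLocationTypeName"
characteristicName <- "PH"

# change NAs to "NA" (character string) so they are counted like any other value
dat[[field]][is.na(dat[[field]])] <- "NA"

# filter to the requested characteristic(s)
ph_only <- dat %>%
  filter(TADA.CharacteristicName %in% characteristicName)

# count the values of the chosen field
counts <- table(ph_only[[field]])
```

Without the recoding step, `table()` would silently drop the missing value and the field would appear to have a single value for PH.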
@@ -278,11 +283,9 @@ TADA_AnalysisDataFilter <- function(.data,
# import MonitoringLocationTypeNames and TADA.Media.Flags
sw.sitetypes <- utils::read.csv(system.file("extdata", "WQXMonitoringLocationTypeNameRef.csv", package = "EPATADA")) %>%
dplyr::select(Name, TADA.Media.Flag) %>%
dplyr::rename(
ML.Media.Flag = TADA.Media.Flag,
MonitoringLocationTypeName = Name
)

dplyr::rename(ML.Media.Flag = TADA.Media.Flag) %>%
dplyr::mutate(MonitoringLocationTypeName = toupper(Name)) %>%
dplyr::select(-Name)

# add TADA.Media.Flag column
.data <- .data %>%
@@ -300,7 +303,7 @@ TADA_AnalysisDataFilter <- function(.data,
ActivityMediaSubdivisionName == "Surface Water" ~ "Surface Water",
!ActivityMediaName %in% c("WATER", "Water", "water") ~ ActivityMediaName
)) %>%
# add TADA.Media.Flag for additional rows based on MonitoringLocationTypeName
# add TADA.Media.Flag for additional rows based on TADA.MonitoringLocationTypeName
dplyr::left_join(sw.sitetypes, by = "MonitoringLocationTypeName") %>%
dplyr::mutate(
TADA.Media.Flag = ifelse(is.na(TADA.Media.Flag),
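The reference table change in this file upper-cases the site type names before the join, which matters only if the data side of the join is also upper case. A hedged sketch of the pattern follows. The reference rows, the data rows, and the fill-in rule at the end are assumptions for illustration; the real reference values come from WQXMonitoringLocationTypeNameRef.csv, and the rest of the `ifelse()` call is not shown in this diff.

```r
library(dplyr)

# Hypothetical stand-in for the reference table (mixed-case names)
ref <- tibble(
  Name = c("Stream", "Lake", "Well"),
  TADA.Media.Flag = c("Surface Water", "Surface Water", "Groundwater")
)

# Same reshaping as in the diff: rename the flag, upper-case the name, drop Name
sw.sitetypes <- ref %>%
  rename(ML.Media.Flag = TADA.Media.Flag) %>%
  mutate(MonitoringLocationTypeName = toupper(Name)) %>%
  select(-Name)

# Toy data whose site types are already upper case (assumption)
dat <- tibble(
  MonitoringLocationTypeName = c("STREAM", "WELL", "UNKNOWN TYPE"),
  TADA.Media.Flag = c(NA, NA, NA_character_)
)

flagged <- dat %>%
  left_join(sw.sitetypes, by = "MonitoringLocationTypeName") %>%
  # assumed fill-in rule: use the site-type flag where no flag exists yet
  mutate(
    TADA.Media.Flag = ifelse(is.na(TADA.Media.Flag), ML.Media.Flag, TADA.Media.Flag)
  ) %>%
  select(-ML.Media.Flag)
```

Had the reference names stayed in mixed case, none of the toy rows would match and every flag would remain NA.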
2 changes: 2 additions & 0 deletions R/RequiredCols.R
@@ -170,7 +170,9 @@ require.cols <- c(
"StateCode",
"CountyCode",
"MonitoringLocationName", # required
"TADA.MonitoringLocationName", # generated
"MonitoringLocationTypeName",
"TADA.MonitoringLocationTypeName", # generated
"MonitoringLocationDescriptionText",
"LatitudeMeasure",
"TADA.LatitudeMeasure", # generated
10 changes: 5 additions & 5 deletions R/Tables.R
@@ -21,7 +21,7 @@ TADA_SummarizeColumn <- function(.data, col = "TADA.CharacteristicName") {
wqp_summary <- .data %>%
dplyr::group_by(summ) %>%
dplyr::summarize(
n_sites = length(unique(MonitoringLocationIdentifier)),
n_sites = length(unique(TADA.MonitoringLocationIdentifier)),
n_records = length(TADA.ResultMeasureValue),
.groups = "drop"
) %>%
@@ -51,16 +51,16 @@ TADA_SummarizeColumn <- function(.data, col = "TADA.CharacteristicName") {
#' columns 'TADA.ResultMeasureValue', 'TADA.ResultMeasure.MeasureUnitCode',
#' 'TADA.ResultSampleFractionText', 'TADA.MethodSpeciationName',
#' 'TADA.ComparableDataIdentifier', 'TADA.CensoredData.Flag',
#' 'DetectionQuantitationLimitTypeName', and 'MonitoringLocationIdentifier' to
#' 'DetectionQuantitationLimitTypeName', and 'TADA.MonitoringLocationIdentifier' to
#' run this function. The 'TADA.ComparableDataIdentifier' can be added to the
#' data frame by running the function TADA_CreateComparableID().
#'
#' @param group_cols This function automatically uses
#' 'TADA.ComparableDataIdentifier' as a grouping column. However, the user may
#' want to summarize their dataset by additional grouping columns. For
#' example, a user may want to create a summary table where each row is
#' specific to one comparable data identifier AND one monitoring location.
#' This input would look like: group_cols = c("MonitoringLocationIdentifier")
#' specific to one comparable data identifier AND one TADA monitoring location.
#' This input would look like: group_cols = c("TADA.MonitoringLocationIdentifier")
#'
#' @return stats table
#'
@@ -92,7 +92,7 @@ TADA_Stats <- function(.data, group_cols = c("TADA.ComparableDataIdentifier")) {
dplyr::filter(!is.na(TADA.ResultMeasureValue)) %>%
dplyr::group_by(dplyr::across(dplyr::all_of(group_cols))) %>%
dplyr::summarize(
Location_Count = length(unique(MonitoringLocationIdentifier)),
Location_Count = length(unique(TADA.MonitoringLocationIdentifier)),
Measurement_Count = length(unique(ResultIdentifier)),
Non_Detect_Count = length(TADA.CensoredData.Flag[TADA.CensoredData.Flag %in% c("Non-Detect")]),
Non_Detect_Pct = length(TADA.CensoredData.Flag[TADA.CensoredData.Flag %in% c("Non-Detect")]) / length(TADA.CensoredData.Flag) * 100,
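The documentation above suggests passing "TADA.MonitoringLocationIdentifier" in group_cols to get one summary row per comparable data identifier and location. The grouping pattern used in the hunk, `group_by(across(all_of(group_cols)))`, can be sketched on toy data as follows. All values, including the "Uncensored" flag, are made up, and only a few of the summary columns are reproduced.

```r
library(dplyr)

# Toy data (hypothetical values); the third row has no measured value
dat <- tibble(
  TADA.ComparableDataIdentifier = c("PH_NA_NA_NA", "PH_NA_NA_NA", "PH_NA_NA_NA"),
  TADA.MonitoringLocationIdentifier = c("SITE-A", "SITE-A", "SITE-B"),
  ResultIdentifier = c("R1", "R2", "R3"),
  TADA.ResultMeasureValue = c(7.0, 7.2, NA),
  TADA.CensoredData.Flag = c("Uncensored", "Non-Detect", "Uncensored")
)

# Default grouping column plus the TADA location identifier
group_cols <- c("TADA.ComparableDataIdentifier", "TADA.MonitoringLocationIdentifier")

stats <- dat %>%
  filter(!is.na(TADA.ResultMeasureValue)) %>%
  # group by a character vector of column names
  group_by(across(all_of(group_cols))) %>%
  summarize(
    Location_Count = length(unique(TADA.MonitoringLocationIdentifier)),
    Measurement_Count = length(unique(ResultIdentifier)),
    Non_Detect_Count = sum(TADA.CensoredData.Flag %in% c("Non-Detect")),
    .groups = "drop"
  )
```

When the location identifier is itself a grouping column, Location_Count is 1 in every row by construction. With only the default grouping column it would count distinct locations per comparable data identifier.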
8 changes: 4 additions & 4 deletions R/Transformations.R
@@ -233,7 +233,7 @@ TADA_HarmonizeSynonyms <- function(.data, ref, np_speciation = TRUE) {
#' @param .data TADA dataframe, ideally harmonized using TADA_HarmonizeSynonyms.
#' If user wants to consider grouping N or P subspecies across multiple
#' organizations, user should have run TADA_FindNearbySites and grouped all
#' nearby sites to one common MonitoringLocationIdentifier,
#' nearby sites to one common TADA.MonitoringLocationIdentifier,
#' TADA.LatitudeMeasure, TADA.LongitudeMeasure, etc.
#' @param sum_ref Optional. A custom summation reference dataframe the user has
#' loaded into the R environment. Dataframe must have same columns as default
@@ -267,7 +267,7 @@ TADA_CalculateTotalNP <- function(.data, sum_ref, daily_agg = c("max", "min", "m
"TADA.ResultMeasure.MeasureUnitCode",
"TADA.ResultMeasureValue",
"ActivityStartDate",
"MonitoringLocationIdentifier",
"TADA.MonitoringLocationIdentifier",
"ActivityTypeCode"
)
TADA_CheckColumns(.data, expected_cols = req_cols)
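Because the required columns now include TADA.MonitoringLocationIdentifier, a data frame that carries only the original MonitoringLocationIdentifier would be expected to fail this check. The sketch below illustrates that consequence with a hypothetical helper, `check_required_cols()`. It is not the package's TADA_CheckColumns, whose implementation is not shown in this diff.

```r
# Hypothetical helper for illustration only (not TADA_CheckColumns)
check_required_cols <- function(df, expected_cols) {
  missing_cols <- setdiff(expected_cols, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing required column(s): ", paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

# Toy data frame that still uses the original identifier column
dat <- data.frame(
  TADA.ResultMeasureValue = 1.2,
  MonitoringLocationIdentifier = "SITE-A"
)

req_cols <- c("TADA.ResultMeasureValue", "TADA.MonitoringLocationIdentifier")

# Capture the error message instead of stopping the script
result <- tryCatch(
  check_required_cols(dat, req_cols),
  error = function(e) conditionMessage(e)
)
```

Here `result` holds the error message naming the missing TADA.MonitoringLocationIdentifier column.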
@@ -293,7 +293,7 @@ TADA_CalculateTotalNP <- function(.data, sum_ref, daily_agg = c("max", "min", "m
"ActivityStartDate",
# "ActivityStartDateTime", #does not make sense to include for daily agg
"ActivityRelativeDepthName",
"MonitoringLocationIdentifier",
"TADA.MonitoringLocationIdentifier",
"MonitoringLocationName",
"TADA.LongitudeMeasure",
"TADA.LatitudeMeasure",
@@ -317,7 +317,7 @@ TADA_CalculateTotalNP <- function(.data, sum_ref, daily_agg = c("max", "min", "m
thecols <- grpcols[!grpcols %in% c("TADA.ComparableDataIdentifier")]

# # find nearby sites
# nearsites = unique(sum_dat[,c("MonitoringLocationIdentifier","TADA.LatitudeMeasure","TADA.LongitudeMeasure")])
# nearsites = unique(sum_dat[,c("TADA.MonitoringLocationIdentifier","TADA.LatitudeMeasure","TADA.LongitudeMeasure")])
# nearsites = TADA_FindNearbySites(nearsites)
# nearsites = subset(nearsites, !nearsites$TADA.NearbySiteGroups%in%c("No nearby sites"))
