You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
@rdstern I wanted to check something on the start of rains dialog which I think is a bug:
(1) On the start of rains dialog, the first year is always giving NA (even if there is a start)
We are running a rolling sum of rainfall from the right. So, if am doing a rolling sum of n = 2 values, I have
9, 10, 3, 2
This becomes
NA, 19, 13, 5
This is okay except for that NA. Because, we then say that start rain is:
If the first rainfall value is NA, then the start of rains is NA
If the first rolling sum value is NA, then the start of rains is NA
Otherwise, take the first DOY value
From our second argument there, this means that our first year will always take that NA even if there is a start day
Is this what we want or expect? If so, then I'll stop here. But, I don't think we want the first year to always be NA!
Question 1: Why are we saying that if the first rolling sum is NA, then the start of rains is NA?
(As someone who wrote this code seven years ago, this could have just been a misunderstanding at the time on my part, or I could now be missing something major). It seems this is where both of our problems are coming from. Instead, don't we just want to look at:
If the first rainfall value is NA, then the start of rains is NA
Otherwise, take the first DOY value
This is just removing a simple line from the filter in the R Code. (solution (a))
An alternative option (b) is: If we set fill to be a value (e.g., 0), then it will set the first n - 1 values as 0.
> x <- c(3, 4, NA, 1, 2, 3)
> RcppRoll::roll_sumr(x, n = 2, fill = NA, na.rm = FALSE)
# [1] NA 7 NA NA 3 5
> RcppRoll::roll_sumr(x, n = 2, fill = 0, na.rm = FALSE)
# [1] 0 7 NA NA 3 5
This helps, but, I would like to understand why we "care" about NAs in the rolling sum anyway
(2) If the last n days of the previous year is missing, then the next year has no start date (NA)
(e.g. if the rolling sum is n = 2, then, the 366th doy on the last year being NA, so the first day on the following year is NA since it's doing a rolling sum with an NA and a value, and so it gives NA)
This is fixed by solution (a) above (but not solution (b)). But, this also leads to a side-question to potentially fix/look into even if we do do solution (a) above:
If there is a missing value in the rainfall, the rolling sum set is NA. Do we want this?
Presumably, if the rolling sum of the rainfall exceeds a threshold despite an NA in it, then we are happy to ignore the NA? Because, the NA isn't going to be negative rainfall -- if we are looking for the rolling sum of over 50mm of rain in 3 days, and we have
DAY 1: 1mm
DAY 2: 48mm
DAY 3: NA
DAY 4: 20mm
Then currently that would give: NA, NA, NA, NA. But, on day 4 we have exceeded 50mm of rainfall so we don't care what the NA value is?
x <- c(1, 48, NA, 20)
RcppRoll::roll_sumr(x, n = 3, fill = NA, na.rm = FALSE)
# [1] NA NA NA NA
RcppRoll::roll_sumr(x, n = 3, fill = NA, na.rm = TRUE)
# [1] NA NA 49 68
So: Side Question: If there is a missing value in the rainfall, the rolling sum set is NA. Do we want this?
The text was updated successfully, but these errors were encountered:
lilyclements
changed the title
Bug in Start Rains dialog
Bug in Start of Rains - First Year getting "NA"
Sep 11, 2024
@lilyclements I agree that the rolling sum is missing (for a given year/season) if it finds a missing rainfall value. But I hope the first year isn't always missing. It is missing in the first year if day 1 is also the earliest starting day. Then the rolling sum needs to start at the end of the previous year - which isn't there.
@rdstern makes sense - and you're right that its only missing if the starting day is day one. Is this what we want? If you're happy with how this is all working then I'll close the issue. Thanks!
@rdstern I wanted to check something on the start of rains dialog which I think is a bug:
(1) On the start of rains dialog, the first year is always giving NA (even if there is a start)
We are running a rolling sum of rainfall from the right. So, if am doing a rolling sum of
n = 2
values, I have9, 10, 3, 2
This becomes
NA, 19, 13, 5
This is okay except for that NA. Because, we then say that start rain is:
From our second argument there, this means that our first year will always take that NA even if there is a start day
Is this what we want or expect? If so, then I'll stop here. But, I don't think we want the first year to always be NA!
Question 1: Why are we saying that if the first rolling sum is NA, then the start of rains is NA?
(As someone who wrote this code seven years ago, this could have just been a misunderstanding at the time on my part, or I could now be missing something major). It seems this is where both of our problems are coming from. Instead, don't we just want to look at:
This is just removing a simple line from the
filter
in the R Code. (solution (a))An alternative option (b) is: If we set
fill
to be a value (e.g., 0), then it will set the firstn - 1
values as0
.This helps, but, I would like to understand why we "care" about NAs in the rolling sum anyway
(2) If the last
n
days of the previous year is missing, then the next year has no start date (NA)(e.g. if the rolling sum is n = 2, then, the 366th doy on the last year being NA, so the first day on the following year is NA since it's doing a rolling sum with an NA and a value, and so it gives NA)
This is fixed by solution (a) above (but not solution (b)). But, this also leads to a side-question to potentially fix/look into even if we do do solution (a) above:
If there is a missing value in the rainfall, the rolling sum set is NA. Do we want this?
Presumably, if the rolling sum of the rainfall exceeds a threshold despite an NA in it, then we are happy to ignore the NA? Because, the NA isn't going to be negative rainfall -- if we are looking for the rolling sum of over 50mm of rain in 3 days, and we have
DAY 1: 1mm
DAY 2: 48mm
DAY 3: NA
DAY 4: 20mm
Then currently that would give: NA, NA, NA, NA. But, on day 4 we have exceeded 50mm of rainfall so we don't care what the NA value is?
So: Side Question: If there is a missing value in the rainfall, the rolling sum set is NA. Do we want this?
The text was updated successfully, but these errors were encountered: