This repository has been archived by the owner on Jun 1, 2023. It is now read-only.
I noticed that seg_id_nat 1638 had a high RMSE for the RGCN model (RMSE = 6.13) after I updated with the most recent data. This reach is right below the Neversink Reservoir. It looks like there are multiple monitoring sites on this reach, and the older sites from EcoSheds were farther downstream than the new USGS site, which is capturing colder dynamics from the reservoir.
We may want to reconsider what data we're keeping, particularly at these reservoir sites where different places along the reach can have really different temperature signals. I don't think this is the reason this site is doing so badly (I think it's doing poorly because the model is clearly not picking up the reservoir influence, and I think the EcoSheds data was added after this model was trained, so the model didn't get a chance to see any of the NYCDEC data):
I see your point that the red points still differ a lot from the model predictions, but the blue points still can't be helping the RMSE, right?
The predictions go down to near zero in winter, and the red points seem to be concentrated in the summer - is part of the impressive difference in your first plot due to the fact that the blue points are more year-round?
Hmm, that's tough. Maybe we could add a separate distance criterion for which observation sites to keep if the segment is directly below a reservoir, like only keeping sites that are within 1000 m of the top of the segment. But then again, we're trying to predict the entire segment's mean temperature, so we can't really throw away the downstream sites either.
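A rough sketch of what that reservoir-specific distance rule could look like, in case it helps the discussion. Everything here is hypothetical: the `Site` fields, the `dist_from_top_m` attribute (distance along the reach from its upstream end), and the `filter_sites` helper are illustrative names, not anything from the existing pipeline. Only the 1000 m threshold comes from the comment above.

```python
from dataclasses import dataclass


@dataclass
class Site:
    site_id: str            # hypothetical site identifier
    seg_id_nat: int         # segment the site is matched to
    dist_from_top_m: float  # assumed: distance from the upstream end of the segment


def filter_sites(sites, below_reservoir_segs, max_dist_m=1000.0):
    """Drop sites far from the top of a segment, but only on segments
    that sit directly below a reservoir. Other segments are untouched."""
    kept = []
    for site in sites:
        is_reservoir_seg = site.seg_id_nat in below_reservoir_segs
        if is_reservoir_seg and site.dist_from_top_m > max_dist_m:
            continue  # too far downstream of the reservoir outlet
        kept.append(site)
    return kept
```

For example, with a near-outlet site and a far-downstream site on seg_id_nat 1638, `filter_sites(sites, {1638})` would keep only the near-outlet one, which is exactly the trade-off raised above: the retained data would no longer represent the whole segment.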
I wonder if this is a scenario where satellite temperatures could be useful since they have more spatial coverage - it might help represent the segment's mean temperature rather than training / testing on sites from either end of the segment. Or maybe we could somehow tell the model about where in the segment the data are coming from or add observation error?
I think PRMS seeks to predict temperatures at the downstream point of each reach, so our observed temperatures should actually prefer the downstream end when there are choices (or just accept the noise and average them all anyway).
We might want to keep a separate copy of nearest-to-reservoir observations for validation of reservoir model predictions.
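To make the options above concrete, here is a minimal sketch of collapsing multiple sites into one value per segment and date: take the downstream-most site, average everything, or take the upstream-most (nearest-to-reservoir) site for a separate validation copy. The tuple layout, the `dist_from_top_m` field, and the `aggregate_obs` name are assumptions for illustration, not the project's actual data structures.

```python
from collections import defaultdict
from statistics import mean


def aggregate_obs(obs, mode="downstream"):
    """Collapse observations to one temperature per (seg_id_nat, date).

    obs: iterable of (seg_id_nat, date, dist_from_top_m, temp_c) tuples,
         where dist_from_top_m is an assumed along-reach distance.
    mode: "downstream" keeps the site farthest from the top of the segment,
          "upstream" keeps the site nearest the top (nearest the reservoir),
          "mean" averages all sites and accepts the noise.
    """
    groups = defaultdict(list)
    for seg, date, dist, temp in obs:
        groups[(seg, date)].append((dist, temp))

    result = {}
    for key, values in groups.items():
        if mode == "downstream":
            result[key] = max(values, key=lambda v: v[0])[1]
        elif mode == "upstream":
            result[key] = min(values, key=lambda v: v[0])[1]
        elif mode == "mean":
            result[key] = mean(temp for _, temp in values)
        else:
            raise ValueError(f"unknown mode: {mode}")
    return result
```

Running it once with `mode="downstream"` for training targets and once with `mode="upstream"` for a reservoir-validation copy would keep both views of the same reach without discarding any raw observations.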