Evaluation of bad RMSE lakes against PB0 #171
For future reference, in easy-to-copy/paste code form:
pb0_matched_to_observations %>%
  group_by(site_id) %>%
  summarize(rmse = sqrt(mean((pred-obs)^2, na.rm=TRUE)), n = length(depth)) %>%
  arrange(desc(rmse))
# A tibble: 2,377 x 3
site_id rmse n
<chr> <dbl> <int>
1 nhdhr_109986912 17.2 29
2 nhdhr_109989488 15.3 14
3 nhdhr_121207127 14.7 20
4 nhdhr_121650602 13.2 69
5 nhdhr_145608202 12.1 27
6 nhdhr_109984628 11.8 21
7 nhdhr_121650552 11.6 26
8 nhdhr_121650633 11.4 83
9 nhdhr_121207134 11.3 61
10 nhdhr_109987472 11.3 20
11 nhdhr_121627799 11.3 51
12 nhdhr_121628955 11.3 34
13 nhdhr_121650613 11.2 84
14 nhdhr_109990726 10.9 68
15 nhdhr_69545019 10.8 86
16 nhdhr_85083102 10.8 53
17 nhdhr_109989482 10.2 32
18 nhdhr_121650592 10.2 59
19 nhdhr_109986464 9.60 48
20 nhdhr_121625003 8.98 105
# … with 2,367 more rows
Out of all of these sites, table(d$`ResultAnalyticalMethod/MethodIdentifier`)
FIELD   LAB
 2714  1609
median of
I have tacked on the monitoring ID to the source column. Now I can group by source:
mutate(pb0_matched_to_observations, pred_diff = pred-obs) %>%
  group_by(source) %>%
  summarize(rmse = sqrt(mean((pred_diff)^2, na.rm=TRUE)), n = length(source)) %>%
  arrange(desc(rmse)) %>%
  print(n=100)
# A tibble: 2,924 x 3
source rmse n
<chr> <dbl> <int>
1 wqp_LCOWIS_WQX-E16 19.7 7
2 wqp_LCOWIS_WQX-E-16 15.5 154
3 wqp_IL_EPA-RML-1 15.4 7
4 wqp_USGS-475150098210000 15.1 2
5 wqp_LCOWIS_WQX-E-9 14.7 36
6 wqp_SDDENR_WQX-WHITELAWL03 13.9 26
7 wqp_WIDNR_WQX-10031157 11.6 74
8 wqp_MNPCA-21-0057-00-206 11.4 14
9 wqp_SDWRAP-SWLAZZZ3703A 10.9 6
10 wqp_MNPCA-21-0103-00-202 10.7 24
11 wqp_SDDENR_WQX-WALLZZZWL08 10.7 8
12 wqp_LCOWIS_WQX-E17 10.6 12
13 wqp_MNPCA-21-0106-01-204 10.5 24
14 wqp_MNPCA-21-0106-02-201 10.4 8
15 wqp_IL_EPA_WQX-WGZJ-2 10.3 2
16 wqp_WIDNR_WQX-10029926 9.92 174
17 wqp_MNPCA-21-0085-00-207 9.76 24
18 7a_temp_coop_munge/tmp/South_Center_DO_2018_09_11_All.rds 9.61 853
19 7a_temp_coop_munge/tmp/Carlos_DO_2018_11_05_All.rds 9.57 996
20 wqp_MNPCA-21-0054-00-205 9.53 23
21 7a_temp_coop_munge/tmp/Greenwood_DO_2018_09_14_All.rds 9.51 1043
22 wqp_MNPCA-77-0150-02-205 9.34 52
23 wqp_MNPCA-69-0939-02-203 9.23 18
24 wqp_MNPCA-82-0001-00-206 8.98 2
25 wqp_WIDNR_WQX-10033610 8.92 4
26 wqp_IL_EPA_WQX-RGE-2 8.91 5
27 wqp_NARS_WQX-NLA06608-0859 8.83 20
28 wqp_LCOWIS_WQX-E-17 8.78 82
29 wqp_IL_EPA_WQX-RGE-1 8.65 92
30 wqp_IL_EPA_WQX-RTI-3 8.48 3
31 wqp_WIDNR_WQX-443514 8.36 12
32 wqp_MNPCA-27-0139-00-201 8.33 253
33 wqp_MNPCA-21-0052-00-205 8.32 24
34 wqp_USGS-454616092082100 8.25 6
35 wqp_IL_EPA_WQX-RGL-1 8.17 140
36 wqp_MNPCA-70-0091-00-452 8.05 1
37 wqp_MNPCA-11-0246-00-201 8.03 1
38 wqp_IL_EPA_WQX-RPC-2 8.02 7
39 wqp_MNPCA-19-0071-00-202 7.98 5
40 wqp_MNPCA-69-0790-00-201 7.87 43
41 wqp_MNPCA-27-0133-10-101 7.85 120
42 wqp_NALMS-6703 7.83 4
43 wqp_WIDNR_WQX-403112 7.78 75
44 wqp_MNPCA-69-0694-00-117 7.73 1
45 wqp_IL_EPA_WQX-RHD-2 7.71 3
46 wqp_MNPCA-18-0372-00-101 7.69 95
47 wqp_NALMS-3283 7.63 1
48 wqp_USGS-480352099093800 7.61 11
49 wqp_USGS-425235088075302 7.60 1
50 wqp_WIDNR_WQX-403107 7.59 485
51 wqp_MNPCA-29-0142-00-201 7.58 10
52 wqp_IL_EPA_WQX-RTW-1 7.58 134
53 wqp_MNPCA-21-0080-00-204 7.56 24
54 wqp_USGS-482018092292001 7.48 36
55 wqp_MNPCA-73-0139-00-204 7.45 57
56 wqp_WIDNR_WQX-193050 7.40 17
57 wqp_IL_EPA-WGX-1 7.38 7
58 wqp_IL_EPA_WQX-WGZJ-1 7.37 63
59 wqp_MNPCA-27-0062-03-202 7.36 1
60 wqp_MNPCA-18-0044-00-201 7.31 1
61 wqp_MNPCA-69-0859-02-201 7.28 5
62 wqp_USGS-423755088341700 7.26 40
63 wqp_USGS-435721084561801 7.25 5
64 wqp_MNPCA-15-0068-00-207 7.24 38
65 wqp_IL_EPA_WQX-RPA-1 7.17 99
66 wqp_LCOWIS_WQX-W-4 7.17 216
67 wqp_MNPCA-82-0033-00-201 7.15 50
68 wqp_MNPCA-62-0005-00-201 7.14 2
69 wqp_WIDNR_WQX-403110 7.13 1712
70 wqp_MNPCA-82-0031-00-201 7.12 5
71 wqp_MNPCA-21-0123-00-218 7.10 24
72 wqp_MNPCA-27-0014-00-201 7.09 2286
73 wqp_MNPCA-77-0215-00-209 7.07 101
74 7a_temp_coop_munge/tmp/Tenmile_1997_Temperatures.rds 7.06 28
75 wqp_SDDENR_WQX-KINGSBUC03 7.02 16
76 wqp_MNPCA-71-0159-00-203 7.00 5
77 wqp_USGS-454856094544602 6.99 37
78 wqp_MNPCA-77-0215-00-202 6.98 80
79 wqp_IL_EPA_WQX-RGE-3 6.98 8
80 wqp_21NDHDWQ-385455 6.98 5
81 wqp_MNPCA-82-0110-00-451 6.92 22
82 wqp_MNPCA-16-0253-00-202 6.86 1
83 wqp_USGS-444016085310201 6.82 6
84 wqp_MNPCA-19-0024-00-451 6.80 11
85 wqp_MNPCA-27-0129-00-201 6.77 1
86 wqp_IL_EPA_WQX-RGB-2 6.77 9
87 wqp_MNPCA-18-0358-00-201 6.75 4
88 wqp_MNPCA-69-0939-01-204 6.73 89
89 wqp_LCOWIS_WQX-RND-3 6.71 168
90 wqp_WIDNR_WQX-513088 6.67 307
91 wqp_WIDNR_WQX-013144 6.66 103
92 wqp_MNPCA-61-0023-00-204 6.66 10
93 wqp_WIDNR_WQX-10007592 6.63 13
94 wqp_USGS-425235088075300 6.62 28
95 wqp_MNPCA-27-0133-02-205 6.56 2
96 wqp_IL_EPA_WQX-RTW-2 6.55 2
97 wqp_USGS-435009088550100 6.54 9
98 wqp_LCOWIS_WQX-W7 6.51 14
99 7a_temp_coop_munge/tmp/grant_mnlakedata_historicalfiles_manualentry.rds 6.50 64
100 wqp_IL_EPA_WQX-VTJ-1 6.46 129
# … with 2,824 more rows
Taking the first one off the top, since it has a small number of obs:
pb0_matched_to_observations %>% filter(source == 'wqp_LCOWIS_WQX-E16')
# A tibble: 7 x 6
site_id date depth obs pred source
<chr> <date> <dbl> <dbl> <dbl> <chr>
1 nhdhr_74926427 2013-07-15 7.62 5.78 24.1 wqp_LCOWIS_WQX-E16
2 nhdhr_74926427 2013-07-15 10.7 4.39 24.0 wqp_LCOWIS_WQX-E16
3 nhdhr_74926427 2013-07-15 13.7 3.83 23.9 wqp_LCOWIS_WQX-E16
4 nhdhr_74926427 2013-07-15 16.8 3.83 23.8 wqp_LCOWIS_WQX-E16
5 nhdhr_74926427 2013-07-15 19.8 3.83 23.8 wqp_LCOWIS_WQX-E16
6 nhdhr_74926427 2013-07-15 22.9 3.83 23.7 wqp_LCOWIS_WQX-E16
7 nhdhr_74926427 2013-07-15 24.4 3.83 23.7 wqp_LCOWIS_WQX-E16
This is Lake Chippewa in Sawyer, WI.
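As a quick sanity check, the RMSE for this source can be recomputed by hand from the seven rows printed above. A minimal sketch, with the obs and pred values typed in from the table (they are rounded as printed, so the result only matches the summary table to about three significant figures):

```r
# Values copied from the printed rows for source 'wqp_LCOWIS_WQX-E16'
obs  <- c(5.78, 4.39, 3.83, 3.83, 3.83, 3.83, 3.83)
pred <- c(24.1, 24.0, 23.9, 23.8, 23.8, 23.7, 23.7)

# Same formula as the summarize() call: root of the mean squared (pred - obs)
rmse <- sqrt(mean((pred - obs)^2, na.rm = TRUE))
round(rmse, 1)
```

This comes out at roughly 19.7, in line with the first row of the per-source table, so the large error is carried by every row of the profile rather than by a single outlier.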
The second worst: modeled (red) and observed (black) are very different. The pb0 model thinks this is a well-mixed lake (at least up to 25 m deep), while the obs show a strongly stratified system that looks more like a small lake to me. Perhaps this is a bay.
Other sources seem clearly wrong, like
@limnoliver heads up on that one ☝️ but note we haven't done any kind of comprehensive look.
Looks like at least
Yikes! The explainer file for
And that was interpreted (by me) as simply needing to multiply by -1. And, it turns out, I processed South Center, Carlos, and Greenwood with the same parser and did the same thing, since all had negative depth values. So, more likely, this is distance from the bottom, where 0 is the bottom and ~-28 m is the surface? In that case, I'm guessing we will lose these data because we can't be certain of the depth? Or do we assume the first measurement is taken at 0 m?
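If the values really are recorded relative to the bottom (0 at the bottom, most negative at the surface), the "assume the first measure is at 0 m" option could look something like the sketch below. The column names `profile_id` and `raw_depth` and the toy values are made up for illustration, and the assumption itself is unverified:

```r
library(dplyr)

# Toy profile: hypothetical raw values where 0 = bottom and -28 = surface
raw <- data.frame(
  profile_id = "example_profile",
  raw_depth  = c(-28, -20, -10, 0),
  temp       = c(22.0, 18.5, 9.0, 6.5),
  stringsAsFactors = FALSE
)

# ASSUMPTION: the most negative raw value in each profile was taken at 0 m,
# so depth below the surface = raw value minus the profile minimum
converted <- raw %>%
  group_by(profile_id) %>%
  mutate(depth = raw_depth - min(raw_depth)) %>%
  ungroup()

converted$depth
```

For the toy profile this gives depths 0, 8, 18, 28, whereas multiplying by -1 gives 28, 20, 10, 0, i.e. the same profile turned upside down.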
Perhaps looping in with Holly about these files and #173 would be good. It doesn't help us with this immediate issue, but it's probably good to get on the radar.
Now that we have the 6_evaluation stage set up, I was looking at some of the worst performers
Some of these have observations that don't make sense, such as 1° temperatures in October
Same kind of integer pattern in obs for another one, with values that don't make sense.
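One way to screen for this kind of integer pattern is to compute, per site, the share of obs values that are whole numbers. A rough sketch on made-up data; the 0.9 cutoff is an arbitrary assumption, and whole-number readings alone don't prove the values are wrong:

```r
library(dplyr)

# Made-up stand-in for pb0_matched_to_observations (same column names)
toy <- data.frame(
  site_id = rep(c("site_A", "site_B"), each = 4),
  obs     = c(1, 2, 1, 3, 12.4, 11.9, 8.3, 6.1),
  stringsAsFactors = FALSE
)

# Share of whole-number observations per site
integer_share <- toy %>%
  group_by(site_id) %>%
  summarize(
    n = n(),
    frac_whole = mean(obs == round(obs), na.rm = TRUE)
  ) %>%
  arrange(desc(frac_whole))

# ASSUMPTION: 0.9 is an arbitrary screening threshold, not a tested one
suspect <- filter(integer_share, frac_whole >= 0.9)
suspect$site_id
```

Sites flagged this way would still need a manual look at the profiles before deciding to drop anything.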
I thought maybe these would be a coop source where a column was flipped or something, but the top worst sites all have wqp as the only source. This pattern seems to continue to at least the 20th worst site.
I wonder if these are all from the same provider?
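A possible way to check the provider question, assuming each observation carries a `source` string of the form `wqp_<organization>-<site>` (which is how the IDs elsewhere in this thread look; the regular expression below is a guess at that format, not a documented rule):

```r
library(dplyr)

# A few source IDs and RMSE values copied from the thread
toy <- data.frame(
  source = c("wqp_LCOWIS_WQX-E16", "wqp_LCOWIS_WQX-E-16", "wqp_IL_EPA-RML-1",
             "wqp_MNPCA-21-0057-00-206", "wqp_MNPCA-21-0103-00-202"),
  rmse   = c(19.7, 15.5, 15.4, 11.4, 10.7),
  stringsAsFactors = FALSE
)

# ASSUMPTION: the provider is everything between "wqp_" and the first hyphen.
# Strings that don't match (e.g. coop file paths) are left unchanged by sub().
by_provider <- toy %>%
  mutate(provider = sub("^wqp_([^-]+)-.*$", "\\1", source)) %>%
  count(provider, name = "n_sources", sort = TRUE)

by_provider
```

Run against the full table of bad sources, a tally like this would show whether a handful of organizations account for most of the worst RMSEs.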