-
Notifications
You must be signed in to change notification settings - Fork 4
Edit search radius used for site-to-reach matching #116
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, but I haven't run this yet because I have a question for you (see comment).
flowline_indices <- nhdplusTools::get_flowline_index(flines = reaches_nhd_fields, | ||
points = sites_sf, | ||
max_matches = max_matches, | ||
search_radius = search_radius) %>% | ||
search_radius = search_radius*2, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why *2 now? Is this to allow NWIS sites to be matched and later retained? If that's why, I think it would be better to add a separate search radius column for the NWIS sites to retain. This would allow for still retaining NWIS sites when the search_radius
is reduced to something smaller than would match the NWIS sites
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, good question. I made this change to initially use a wider search radius when we apply nhdplusTools::get_flowline_index()
. That function uses a radius search implemented in RANN::nn2()
and I noticed when looking at DO data that a couple sites were not being picked up with a 500 m search radius (but were picked up with a 1000 m radius) even though the distance between those sites was < 500 m, as confirmed by the offset returned and separately using {sf}. More details are here.
So this doesn't have anything to do with the two NWIS sites but rather, to ensure we're picking up all the sites we would expect initially, which we then filter to the requested search_radius
a couple lines down. I'm open to other suggestions, too, so happy to hear your thoughts!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strange that those 2 sites do not get picked up! I'm not familiar with the nn2
function, and the source is in C, so I'm less inclined to look into if it's a matter of using a different argument in get_flowline_index
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I wasn't worried about those two sites in particular, but the behavior was puzzling to me. The distances returned seem to be correct, so for now I decided to just adjust the argument in get_flowline_index
to make sure we're capturing all the sites we'd expect. I tried to explain this reasoning in the comments because I know the different argument to search_radius is a little odd.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can merge when you're ready
Addresses #112.
When matching SC sites to PRMS segments we're currently using a search radius of 0.1 degrees (~10 km), although this is probably too broad and results in headwater tributaries getting improperly snapped to the larger mainstem rivers represented in PRMS (see #112 for further details).
I've edited
get_site_flowlines()
so that the search_radius is given in meters rather than degrees, and that the offset returned bynhdplusTools::get_flowline_index()
is also in meters. The following changes are also included:bird_dist_to_subseg_m
isNA
for unmatched locations), the function now returns a sparse data set that only contains sites that are matched to a segment as determined by the search_radius or are specified by the user as sites to retain.If we use 500 m here (while adding an argument to retain the two NWIS sites on the Delaware River that are >500 m away from their respective segments),
p2_sites_w_segs
goes from 3,523 rows to 1,845 rows and we lose 9,048 NWIS observation-days ( (~5% of our total NWIS obs-days for SC).Here's a preview of
p2_sites_w_segs
: