This repository has been archived by the owner on Jun 30, 2023. It is now read-only.

Reassess sites-to-segs search radius #112

Closed
2 tasks done
lekoenig opened this issue Mar 17, 2022 · 8 comments · Fixed by #116

@lekoenig
Collaborator

lekoenig commented Mar 17, 2022

In USGS-R/drb-do-ml#36, we realized that the search radius being used to match sites to segments likely resulted in many headwater streams being matched to the mainstem rivers represented in the PRMS network. This issue includes the following steps:

@lekoenig lekoenig self-assigned this Mar 17, 2022
@jds485
Member

jds485 commented Mar 17, 2022

Thanks for discovering this issue! I'm supportive of reducing the search radius to 1 km (or shorter) and dropping sites that are further away than that. It would be helpful to know how many sites, particularly continuous, are dropped to potentially encourage modeling at finer NHD scale in the future.

@lekoenig
Collaborator Author

We're currently snapping points to segments by searching for the nearest segment within a radius of 0.1 degrees (~10 km). The plot below shows the distance between our SC sites (including discrete + NWIS observations) and the matched segment, showing that there are quite a few sites that are matched to reaches that are relatively far away (i.e., ~1360 sites out of 3449 total sites are > 1 km away from their matched segment).

snapped_dist

Most of these sites that don't get matched/are matched at long distances are discrete sites. However, there are some NWIS sites that are getting snapped to PRMS segments even though the sites probably aren't located on those segments. NWIS 01478950 (on Pike Creek and not the Christina River, which is shown in red) is an example of that:

snap_ex
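To illustrate why a ~10 km radius pulls tributary sites onto the mainstem, here is a minimal sketch of nearest-segment matching with a distance cutoff. It is not the pipeline's actual implementation: it assumes coordinates are already projected to meters, and the function names and the toy geometry are hypothetical.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the straight line piece a-b (planar, meters)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate piece: both vertices coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the line and clamp to the piece's endpoints
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_to_polyline(p, vertices):
    """Shortest distance from p to a polyline given as a list of vertices."""
    return min(point_segment_distance(p, a, b)
               for a, b in zip(vertices, vertices[1:]))

def nearest_segment(site_xy, segments, search_radius_m):
    """Return (seg_id, distance_m) for the nearest segment within the radius,
    or (None, None) if no segment is close enough."""
    best_id, best_dist = None, math.inf
    for seg_id, vertices in segments.items():
        d = distance_to_polyline(site_xy, vertices)
        if d < best_dist:
            best_id, best_dist = seg_id, d
    if best_dist <= search_radius_m:
        return best_id, best_dist
    return None, None

# Hypothetical network: only the mainstem is represented, as in a coarse network
segments = {"mainstem": [(0.0, 0.0), (5000.0, 0.0)]}
tributary_site = (2000.0, 1500.0)  # hypothetical site 1.5 km off the mainstem

wide = nearest_segment(tributary_site, segments, search_radius_m=10_000)
narrow = nearest_segment(tributary_site, segments, search_radius_m=500)
```

With the wide radius the tributary site is matched to the mainstem at 1500 m; with the 500 m radius it is left unmatched, which is the behavior proposed later in this thread.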

@lekoenig
Collaborator Author

So I'd suggest we reduce our search radius. In other projects such as temperature and now DO, we use a search radius of 500 m (e.g. see lines 56-61 in the 2wp-temp repo). If we reduce our search radius to 500 m, we lose 26 NWIS gages out of 125. I quickly checked the location of these 26 gages and most of them are located on streams not represented by the PRMS network. But there are also edge cases where sites on larger rivers might not get snapped if the river is wider than 500 m (e.g. issue USGS-R/delaware-model-prep#34 in delaware-model-prep). I noticed this with sites on the Delaware River (e.g. 01474703 and 01477050, which are ~620 and ~630 m away from that segment, respectively).

I'm leaning towards keeping the search radius at 500 m, but adding special handling to retain 01474703 and 01477050. Implementing this change would result in a loss of 24 NWIS gages and 9,048 NWIS observation-days (~5% of our total NWIS obs-days for SC).
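The rule proposed above (a 500 m cutoff plus explicit retention of the two Delaware River sites) can be sketched as a simple filter. This is an illustrative sketch only, not the code in the linked pull request: `keep_match`, the constant names, and the distances for the last two entries are hypothetical, while the ~620 m and ~630 m values come from the comment above.

```python
SEARCH_RADIUS_M = 500.0

# Wide-river sites named in this thread that fall slightly beyond 500 m
RETAIN_SITES = {"01474703", "01477050"}

def keep_match(site_id, dist_m, radius_m=SEARCH_RADIUS_M, retain=RETAIN_SITES):
    """Keep a site-to-segment match if it is within the radius
    or the site is on the explicit retention list."""
    return dist_m <= radius_m or site_id in retain

matches = [
    ("01474703", 620.0),        # approximate distance from the thread
    ("01477050", 630.0),        # approximate distance from the thread
    ("01478950", 2500.0),       # Pike Creek site; distance is hypothetical
    ("example_site", 120.0),    # hypothetical site close to its segment
]

kept = [site for site, dist in matches if keep_match(site, dist)]
kept_radius_only = [site for site, dist in matches
                    if keep_match(site, dist, retain=frozenset())]
```

Retaining 2 of the 26 gages that fall outside 500 m is what yields the net loss of 24 gages quoted above.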

@jds485
Member

jds485 commented Mar 18, 2022

Thanks for the update. Using 500 m and special handling for points where the river width is large sounds good to me. River width is an attribute that could be used to determine which segments should receive special handling: see seg_width in p1_sntemp_inputs_outputs. However, the maximum width there is 125.7 m, so this attribute must be measuring something other than river width, even though the metadata says it is river width in m.

@jds485
Member

jds485 commented Mar 18, 2022

I used Google Earth for a quick river width estimation. It seems like the main stem south of Trenton is > 500 m wide. It doesn't look like any of the tributaries are > 500 m wide. I can quickly grab all of the main stem segment numbers that could use a > 500 m tolerance, but I would not have a quick way to say what the tolerance should be for these reaches.

@lekoenig
Collaborator Author

Thanks for looking into this (I haven't looked at p1_sntemp_inputs_outputs closely, but that's odd that 125.7 m is the largest width). The other large river sites seem to be retained OK using a 500 m search radius, so that's why I assigned those two NWIS sites on the Delaware. We could retain segs instead of sites (i.e., all sites that match to a seg downstream of Trenton), but I'd have to do a quick check that we don't re-introduce any tributaries that way.

@jds485
Member

jds485 commented Mar 18, 2022

I don't think it would introduce tributaries if we use a recursive function to grab all PRMS segments downstream of 333_1. Or do you mean that the samples could be on tributaries and not the main stem?
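A minimal sketch of such a recursive downstream walk is shown below. It assumes each segment drains to exactly one downstream segment, stored in a lookup table; the `to_seg` table and every segment ID other than 333_1 are hypothetical and do not reflect the actual PRMS network.

```python
def downstream_segments(seg_id, to_seg, visited=None):
    """Recursively collect seg_id and every segment downstream of it."""
    if visited is None:
        visited = []
    # Stop at the outlet (no downstream segment) or if a cycle is detected
    if seg_id is None or seg_id in visited:
        return visited
    visited.append(seg_id)
    return downstream_segments(to_seg.get(seg_id), to_seg, visited)

# Hypothetical routing table: segment -> the segment it flows into
to_seg = {
    "333_1": "main_2",
    "main_2": "main_3",
    "main_3": None,       # outlet
    "trib_1": "main_2",   # tributary joining the main stem at main_2
}

mainstem = downstream_segments("333_1", to_seg)
```

Because the walk only follows flow direction, a tributary such as `trib_1` that joins the main stem is never visited, so the resulting list contains main stem segments only.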

@lekoenig
Collaborator Author

> I don't think it would introduce tributaries if we use a recursive function to grab all PRMS segments downstream of 333_1. Or do you mean that the samples could be on tributaries and not the main stem?

Yeah, thanks for clarifying. I meant tributary samples that got matched to the mainstem. I don't expect that to be an issue if we're sticking with a 500 m radius, though.

@lekoenig lekoenig linked a pull request Mar 18, 2022 that will close this issue