Intersection proximity seems inaccurate for a number of labels #24

Open
nch0w opened this issue Jul 9, 2019 · 13 comments
Labels
bug Something isn't working question Further information is requested

Comments

@nch0w
Contributor

nch0w commented Jul 9, 2019

For example, label # 17088, which is at lat: 47.6554679870605, lng: -122.32576751709
The middleness I get from intersection-proximity is 90.00%, but the label is at an intersection:
Screen Shot 2019-07-09 at 12 55 19 PM

@jonfroehlich
Member

Can you provide a more quantitative assessment? Analyze, say, 100 different examples with screenshots.

@nch0w
Contributor Author

nch0w commented Jul 10, 2019

There is a bug in label-intersection-proximity. It sometimes assigns a high middleness to a point that is close to an intersection, or a low middleness to a point that is far from one, because the street segments used to compute intersection proximity don’t always correspond to streets. I tested 100 random labels: 77 were correct, 4 were classified as closer to an intersection than they actually were, and 19 were classified as farther from an intersection than they actually were.

Here is an example. The point (red dot) has a high middleness even though it is close to the intersection (two red dots) and the street extends further to the left.
Screen Shot 2019-07-09 at 5 52 16 PM

And this has a low middleness even though it is relatively far from the intersection.
Screen Shot 2019-07-09 at 6 04 38 PM

I assessed this by comparing the middleness of a label to whether or not the label seemed close to an intersection on the dashboard.

@jonfroehlich
Member

jonfroehlich commented Jul 10, 2019

Capturing our Slack discussion about this. The tl;dr is that this is a known limitation in our approach related to how we incorporate OSM data and segment streets. CC @misaugstad @tongning.

@NChowder, when you get a chance, can you add in the full zip file of your analysis to this thread?

Jon Froehlich 5:56 PM
@neil Chowdhury thanks so much for investigating this and for posting a summary of your findings. Can you provide more details on what you tested, how you evaluated it, and the results with supporting data? Also, if there is indeed a bug, this affects any project that uses our deep learning model since it relies on this input. We also need to see if the Washington DC data is also affected, as that’s the core of our ASSETS paper.

Neil Chowdhury 5:58 PM
I have to leave soon but here are 100 screenshots of points + the endpoints of the street segments they identified.

Jon Froehlich 5:58 PM
Please post more information to the relevant GitHub issue ASAP

Neil Chowdhury 6:07 PM
Everything is at #24

Anthony Li 6:13 PM
Hey @neil Chowdhury, thanks for bringing this up. Unfortunately I think this is effectively expected behavior for the approach we are taking to calculate middleness, which is to assume that OSM segments correspond to blocks. This is not always true, but I believe we OK'd this approach in the interest of time.

Jon Froehlich 6:16 PM
Hmm, I don’t remember discussing this. Can you point me to the discussion thread to refresh my memory? Sounds like a potential summer task to improve this. Perhaps @neil Chowdhury can take this on? @antli, did you ever perform any systematic analysis of your street code performance that you could share?

Anthony Li 6:20 PM
This was in our initial email thread in Feb. We decided to go with the street segment approach knowing that it wouldn't be perfect, but I don't think we did any formal performance analysis. In my testing I just hand-selected some points and checked that the results seemed reasonable.

Mikey Saugstad 8:43 PM
Note that this will be worse in Seattle and Newberg than in DC, because in the newer cities we split street segments at neighborhood boundaries. In DC that didn't happen, so a street could span multiple neighborhoods.
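
To make the effect of segment splitting concrete, here is a minimal Python sketch. The thread does not spell out the exact middleness formula, so the definition below (0% at either segment endpoint, 100% at the segment midpoint) is an assumption, and the function name and the distances are hypothetical.

```python
def middleness(pos_m: float, seg_start_m: float, seg_end_m: float) -> float:
    """Assumed definition: 0% at either segment endpoint, 100% at the midpoint.

    All arguments are distances in meters measured along the street.
    """
    half_length = (seg_end_m - seg_start_m) / 2.0
    dist_to_nearest_end = min(pos_m - seg_start_m, seg_end_m - pos_m)
    return 100.0 * dist_to_nearest_end / half_length


# Hypothetical block: real intersections at 0 m and 200 m, label at 95 m.
whole_block = middleness(95.0, 0.0, 200.0)

# Same street, but the segment was split at a neighborhood boundary at 100 m,
# so one "endpoint" is not an intersection at all.
split_segment = middleness(95.0, 0.0, 100.0)

print(f"whole block:   {whole_block:.0f}%")
print(f"split segment: {split_segment:.0f}%")
```

Under these assumptions the same label reads as 95% on the whole block but 10% on the split segment, which is the kind of "closer than it actually is" error that a non-intersection endpoint could produce.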

@nch0w
Contributor Author

nch0w commented Jul 11, 2019

Zip file: https://drive.google.com/open?id=1rOC8sl0Dk5rk86LmwFoLhxXPZRC0CRJ_
So what should we do about the issue? I think it's limiting the accuracy of the label classifier.

@jonfroehlich
Member

jonfroehlich commented Jul 11, 2019 via email

@nch0w
Contributor Author

nch0w commented Jul 11, 2019

@jonfroehlich
Member

jonfroehlich commented Jul 11, 2019 via email

@nch0w
Contributor Author

nch0w commented Jul 11, 2019

TODO: Create ground truth set to test intersection proximity algorithm
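
One possible shape for such a check, as a rough sketch only: compare each label's computed middleness against a hand-labeled "near an intersection" flag and tally the same three outcomes used in the 100-label test above. The threshold, the record layout, and the sample values are all hypothetical and are not data from this issue.

```python
from collections import Counter

# Hypothetical cutoff: middleness below this counts as "near an intersection".
NEAR_THRESHOLD = 50.0


def evaluate(records):
    """Tally outcomes for (middleness_percent, truly_near_intersection) pairs."""
    tally = Counter()
    for middleness, truly_near in records:
        predicted_near = middleness < NEAR_THRESHOLD
        if predicted_near == truly_near:
            tally["correct"] += 1
        elif predicted_near:
            # Algorithm says near, ground truth says far.
            tally["closer_than_actual"] += 1
        else:
            # Algorithm says far, ground truth says near.
            tally["farther_than_actual"] += 1
    return tally


# Made-up records for illustration only.
sample = [(90.0, True), (10.0, True), (85.0, False), (20.0, False)]
print(evaluate(sample))
```

A binary near/far flag is the simplest ground truth to collect by hand; a set with measured distances to the nearest real intersection would allow a finer comparison.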

@nch0w
Contributor Author

nch0w commented Jul 16, 2019

There are some points where middleness is just not a reliable metric, even though absolute distance is. For example:
Screen Shot 2019-07-16 at 10 33 46 AM
The middleness is 40%, but the distance to the intersection is 8 meters.
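
A small sketch of how both numbers can be true at once. It assumes middleness is the distance to the nearest segment endpoint divided by half the segment length, which is my reading and not confirmed in this thread. The coordinates are hypothetical, chosen to give a segment of roughly 40 m, and the label is assumed to lie on the segment.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters between two lat/lng points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def proximity(label, seg_start, seg_end):
    """Return (meters to nearest endpoint, assumed middleness in percent)."""
    nearest = min(haversine_m(*label, *seg_start), haversine_m(*label, *seg_end))
    half_length = haversine_m(*seg_start, *seg_end) / 2
    return nearest, 100.0 * nearest / half_length


# Hypothetical north-south segment of about 40 m, label about 8 m from one end.
seg_start = (47.655000, -122.3258)
seg_end = (47.655360, -122.3258)
label = (47.655072, -122.3258)

distance_m, middleness_pct = proximity(label, seg_start, seg_end)
print(f"{distance_m:.1f} m from the nearest endpoint, middleness {middleness_pct:.0f}%")
```

Because middleness is relative to segment length, a short segment can yield a moderate percentage for a label that is only a few meters from the endpoint, so reporting the absolute distance alongside it may be more informative.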

@jonfroehlich
Member

jonfroehlich commented Jul 16, 2019 via email

@nch0w
Contributor Author

nch0w commented Jul 16, 2019

Yes. If you saw that the middleness was 40%, you might think that the label is far from an intersection even though it is very close.

@jonfroehlich
Member

jonfroehlich commented Jul 16, 2019 via email

@jonfroehlich
Member

Where did we leave this? Can you write up a report describing your findings and, if you have one, a solution.

@daotyl000 daotyl000 added invalid This doesn't seem right question Further information is requested bug Something isn't working and removed invalid This doesn't seem right labels Aug 15, 2019
3 participants