
Performance as a function of training set size #21

Open
jonfroehlich opened this issue Apr 24, 2019 · 5 comments

@jonfroehlich
Member

@galenweld ran these experiments yesterday and briefly showed us graphs. I believe this was for pre-crop performance only (validation scenario). Could we copy those results as a table (and graph) into this GitHub issue?

Also, are we planning on running this experiment for the other scenario (labeling scenario)? I imagine this experiment will take significantly more time.

@galenweld
Collaborator

I was thinking that it may be worthwhile to run this for the labeling scenario; it won't take more than 5-6 hours to run. But to do that, we should run it on the ground truth labels that Esther and I made, and I want to run those on the final model first. I'm still creating crops for those right now; then we'll be able to run it. Creating crops is slow, but once you have crops, it's pretty fast to run with a new model.

Here are the results (for pre-crop):
Performance Improvement with Additional Training Data

| Dataset Size | 500 | 1000 | 5000 | 10000 | 25000 | 50000 | 100000 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Overall | 62.24 | 63.9 | 72.8 | 74.28 | 76.41 | 77.65 | 78.21 |
| Curb Ramp | 77 | 83.53 | 83.95 | 87.25 | 90.26 | 89.09 | 92.08 |
| Missing Ramp | 26.46 | 27.01 | 38.23 | 41.06 | 44.89 | 47.99 | 45.69 |
| Obstruction | 43.11 | 47.13 | 63.88 | 64.94 | 66.04 | 70.04 | 71.58 |
| Sfc Problem | 8.2 | 12.38 | 27.13 | 37.46 | 40.4 | 42.45 | 46.96 |
| Null Crop | 79.12 | 81.73 | 86.36 | 86.83 | 87.64 | 87.89 | 87.96 |
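For anyone who wants to regenerate the graph from the table above, here is a minimal sketch (assuming matplotlib is available; the numbers are just copied from the table, and the accuracy-percentage interpretation of the values is an assumption):

```python
import matplotlib.pyplot as plt

# Accuracy (%) per label type as a function of training set size,
# copied from the table above.
sizes = [500, 1000, 5000, 10000, 25000, 50000, 100000]
results = {
    "Overall":      [62.24, 63.9, 72.8, 74.28, 76.41, 77.65, 78.21],
    "Curb Ramp":    [77, 83.53, 83.95, 87.25, 90.26, 89.09, 92.08],
    "Missing Ramp": [26.46, 27.01, 38.23, 41.06, 44.89, 47.99, 45.69],
    "Obstruction":  [43.11, 47.13, 63.88, 64.94, 66.04, 70.04, 71.58],
    "Sfc Problem":  [8.2, 12.38, 27.13, 37.46, 40.4, 42.45, 46.96],
    "Null Crop":    [79.12, 81.73, 86.36, 86.83, 87.64, 87.89, 87.96],
}

fig, ax = plt.subplots()
for label, accs in results.items():
    ax.plot(sizes, accs, marker="o", label=label)

# Log-scale x-axis, since dataset sizes span 500 to 100,000.
ax.set_xscale("log")
ax.set_xlabel("Training set size (log scale)")
ax.set_ylabel("Accuracy (%)")
ax.set_title("Performance Improvement with Additional Training Data")
ax.legend()
plt.show()
```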

@jonfroehlich
Member Author

Love it! We should discuss whether we need it for the labeling scenario as well. I'm leaning yes, but not if it means we have to sacrifice, for example, the cross-city analysis...

@galenweld
Collaborator

galenweld commented Apr 24, 2019 via email

@jonfroehlich
Member Author

jonfroehlich commented Jul 11, 2019

This is still an open question, since I believe we only did this for the validation task (and not for the labeling task), so I'm marking it for future work (however, said future work would not be for the ASSETS'19 CR).

@galenweld
Collaborator

Certainly, although I believe that our results on validation should give us an excellent estimate of our performance for the labeling task. I'd say lower priority.
