Add context-based post processing for linear features #342
Codecov Report
Additional details and impacted files
@@ Coverage Diff @@
## main #342 +/- ##
==========================================
+ Coverage 59.64% 60.49% +0.85%
==========================================
Files 35 37 +2
Lines 6165 6334 +169
==========================================
+ Hits 3677 3832 +155
- Misses 2488 2502 +14
To run, first run inference on some patches and save the outputs. Then:

import pandas as pd
from mapreader.process.post_process import PatchDataFrame

# Load the saved patch predictions
df = pd.read_csv("./predictions_patch_df.csv", index_col=0)

# Map numeric class indices to label names
labels_map = {
    0: "no",
    1: "railspace",
    2: "building",
    3: "railspace+building",
}

patches = PatchDataFrame(df, labels_map=labels_map)
# Get the context (surrounding patches) of all patches with these labels
patches.get_context(labels=["railspace", "railspace+building"])
# Remap predictions for isolated patches with confidence below 0.8
patches.update_preds(remap={"railspace": "no", "railspace+building": "building"}, conf=0.8)

This selects all railspace/railspace+building patches, gets their context, then updates the predictions for patches with no surrounding railspace and a confidence score of less than 0.8. Can also set
See here for stats on post-processing: https://github.com/Living-with-machines/railspace/issues/14
TBC
MapReader's post-processing sub-package currently contains one method for post-processing the predictions from your model. It is based on the idea that features such as railways, roads and coastlines are continuous, so patches with these labels should be found near other patches that also have these labels.

For example, if a patch is predicted to be railspace but is surrounded by patches predicted to be non-railspace, then the railspace prediction is likely to be a false positive.
I guess you could be even more explicit and say: "The current method checks whether any of the 8 surrounding patches have the same label as a given patch (e.g. railspace), and if not, assumes this to be a false positive".
Perhaps you could also mention: "Future releases may add functionality to create custom filter rules for your use case".
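The neighbour check described in this comment can be sketched in plain pandas as follows. This is only an illustration under stated assumptions, not MapReader's implementation: the column names (`col`, `row`, `pred`, `conf`) and the helpers `has_matching_neighbour` and `update_isolated` are hypothetical, and the sketch assumes that a neighbour carrying any of the selected labels counts as a match.

```python
import pandas as pd

# Hypothetical patch table: grid position, predicted label and confidence.
# These column names are assumptions for this sketch, not MapReader's schema.
patches = pd.DataFrame(
    {
        "col": [0, 1, 2, 5, 0],
        "row": [0, 0, 1, 5, 4],
        "pred": ["railspace", "railspace", "railspace", "railspace", "railspace+building"],
        "conf": [0.95, 0.60, 0.70, 0.55, 0.90],
    }
)


def has_matching_neighbour(df, labels):
    """Return a boolean Series: True where any of the 8 surrounding
    grid cells holds a patch whose label is in `labels`."""
    occupied = {
        (c, r) for c, r, p in zip(df["col"], df["row"], df["pred"]) if p in labels
    }
    # The 8 offsets around a cell (the cell itself is excluded).
    offsets = [
        (dc, dr) for dc in (-1, 0, 1) for dr in (-1, 0, 1) if (dc, dr) != (0, 0)
    ]
    flags = [
        any((c + dc, r + dr) in occupied for dc, dr in offsets)
        for c, r in zip(df["col"], df["row"])
    ]
    return pd.Series(flags, index=df.index)


def update_isolated(df, remap, conf):
    """Remap labels of patches that have no matching neighbour and whose
    confidence is below `conf`. Returns a copy; the input is unchanged."""
    out = df.copy()
    labels = set(remap)
    isolated = ~has_matching_neighbour(out, labels)
    mask = out["pred"].isin(labels) & isolated & (out["conf"] < conf)
    out.loc[mask, "pred"] = out.loc[mask, "pred"].map(remap)
    return out


updated = update_isolated(
    patches,
    remap={"railspace": "no", "railspace+building": "building"},
    conf=0.8,
)
print(updated)
```

In this toy grid the three adjacent railspace patches are kept, the isolated railspace patch at (5, 5) with confidence 0.55 is remapped to "no", and the isolated railspace+building patch is kept because its confidence (0.90) is above the threshold.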
One comment, but otherwise LGTM
Summary
As per #339, this PR implements a post-processing script so that users can filter out false positives.
This works for linear features, or for anything where you expect multiple patches to be clustered together and an isolated patch would be a false positive.
It also adds a save_predictions() method to the classifier to make sure predictions and confidence scores are saved in the format expected for post-processing.

Fixes #218
Addresses #339