Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add compatibility for Satellite data format #8

Merged
merged 15 commits into from
Apr 5, 2021

Conversation

avirkud
Copy link
Contributor

@avirkud avirkud commented Dec 15, 2020

Resolves #5

  • Satellite v1:
    • process all tagging and answers files
    • add nested field for answer IPs
  • v2
    • add field for trials
    • add fields for fetch (Satellite)

@avirkud avirkud added the enhancement New feature or request label Dec 15, 2020
@avirkud avirkud self-assigned this Dec 15, 2020
@avirkud avirkud changed the title Add compatibility for v1 and v2 data formats Add compatibility for Satellite data format Jan 13, 2021
@avirkud avirkud marked this pull request as ready for review January 13, 2021 22:33
@avirkud avirkud changed the base branch from master to blockpage-matching January 13, 2021 22:34
Copy link
Member

@ramakrishnansr ramakrishnansr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@avirkud Looks good, can you verify that this matches all the new changes to Satellitev2?

Comment on lines 99 to 100
'false_positive': ('boolean', 'nullable'),
'indicators': ('string', 'nullable'),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use the same exclude and exclude_reason format we use in Satellitev2, just to maintain compatibility?

@@ -122,6 +131,9 @@
ROWS_PCOLLECION_NAME = 'rows'

SATELLITE_TAGS = {'ip', 'http', 'asnum', 'asname', 'cert'}
CDN_REGEX = re.compile("AMAZON|Akamai|OPENDNS|CLOUDFLARENET|GOOGLE")
VERIFY_THRESHOLD = 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be at least 2 by default, I've seen that setting the value to 2 or 3 is best for getting a good FP:TP ratio. (maybe we could add that as a comment here).

@@ -800,8 +836,48 @@ def _calculate_confidence(scan: Dict[str, Any]) -> Dict[str, Any]:
confidence['matches'].append(ip_match)

confidence['average'] = sum(confidence['matches']) / len(confidence['matches'])
scan['confidence'] = confidence
# Sanity check for untagged responses: do not claim interference
if confidence['untagged_response'] and confidence['average'] > 0:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is consistent with the latest additions to Satellite, right? Any answers which are untagged, any answers from control resolvers that are untagged should both not be marked as anomalies.

@avirkud
Copy link
Contributor Author

avirkud commented Mar 10, 2021

@avirkud Looks good, can you verify that this matches all the new changes to Satellitev2?

Yes - the fields, verify threshold, and untagged logic are updated

@ramakrishnansr ramakrishnansr merged commit 728972f into blockpage-matching Apr 5, 2021
@ramakrishnansr ramakrishnansr deleted the read-satellite branch April 5, 2021 18:17
ohnorobo added a commit that referenced this pull request Apr 28, 2021
Adding CI using github actions
ohnorobo pushed a commit that referenced this pull request Apr 28, 2021
Add compatibility for Satellite data format
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support multiple data formats
2 participants