Add method to analyzercontext to complain about data. #20

Open
gubser opened this issue Jul 1, 2016 · 0 comments
gubser commented Jul 1, 2016

Analyzers should emit warnings, errors, or complaints when they cannot work with the data, and state what resolution they took.

A concrete example: if we ran pathspider3 now, the data would be affected by issue mami-project/pathspider#33. In effect, all IPv4 measurements are fine but all IPv6 measurements are defective. Although the measurement is half broken, we still want to use the IPv4 measurements.

Idea A

If we want to represent this kind of partial validity of uploads, we could make the valid attribute in the upload database hold information on how to isolate valid from invalid data.

For example:

valid: [ip4:true, ip6:false]

This is not a good idea because the isolation description looks different for every data format, and it needs to be powerful enough to cover every problem of every data format. In the end, most likely only the analyzer itself will know what to do with that description.

Idea B

The analyzer module knows best what correct data should look like. When it processes a measurement, it needs to make sure that the data used for generating the observations looks sane.
If it detects a flaw in the data, it should report the flaw and say what it has done to resolve it (e.g. "ignored all IPv6 in this file" or "encountered IPv6 with port 0, ignored these") so that a human can take it into account when drawing conclusions from these observations (or derived observations).
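
A minimal sketch of what such a reporting method could look like, in Python. Everything here is hypothetical and only illustrates the idea: the names `AnalyzerContext`, `Complaint`, `complain`, and `analyze`, as well as the record layout (`ip_version`, `port`), are assumptions and not the existing API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Complaint:
    """One flaw found by an analyzer, plus what the analyzer did about it."""
    severity: str    # e.g. "warning" or "error"
    problem: str     # human-readable description of the flaw
    resolution: str  # what the analyzer did to work around it


@dataclass
class AnalyzerContext:
    """Hypothetical stand-in for the real analyzer context."""
    complaints: List[Complaint] = field(default_factory=list)

    def complain(self, problem: str, resolution: str,
                 severity: str = "warning") -> None:
        # Record the complaint so a human can read it next to the observations.
        self.complaints.append(Complaint(severity, problem, resolution))


def analyze(ctx: AnalyzerContext, records: List[Dict]) -> List[Dict]:
    """Keep sane records; report and drop IPv6 records with port 0."""
    kept, dropped = [], 0
    for rec in records:
        if rec["ip_version"] == 6 and rec["port"] == 0:
            dropped += 1
        else:
            kept.append(rec)
    if dropped:
        ctx.complain(
            problem=f"encountered {dropped} IPv6 record(s) with port 0",
            resolution="ignored these",
        )
    return kept
```

The complaints would then be stored alongside the observations (or derived observations) so that the resolution stays visible to whoever interprets them.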

TODO: write out in words

  • we cannot possibly detect all flaws beforehand, especially when writing a novel analysis
  • the analyzer can scan container formats (such as IPFIX) and, if a measurement includes all required fields, take it into account
  • human operators should set valid to false only if the analyzer cannot possibly detect the flaw by itself, e.g. a bad measurement setup
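
The last two points could be sketched roughly as follows, in Python. The function names, the `operator_valid` flag, and the particular set of required fields are assumptions chosen for illustration. A real analyzer would declare its own requirements.

```python
from typing import FrozenSet, Iterable, List

# Illustrative only: fields a hypothetical analyzer needs from a container.
REQUIRED_FIELDS: FrozenSet[str] = frozenset({
    "sourceIPv4Address",
    "destinationIPv4Address",
    "tcpControlBits",
})


def missing_fields(available: Iterable[str],
                   required: Iterable[str] = REQUIRED_FIELDS) -> List[str]:
    """Return the required fields that the container does not provide."""
    return sorted(set(required) - set(available))


def usable(operator_valid: bool, available: Iterable[str]) -> bool:
    """Decide whether a measurement should be taken into account.

    operator_valid is the human-set flag: False only for flaws the analyzer
    cannot detect by itself, such as a bad measurement setup.
    """
    if not operator_valid:
        return False
    return not missing_fields(available)
```

With this split, the valid attribute stays a simple boolean owned by human operators, while format-specific checks live in the analyzer.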