Analyzers should emit warnings or errors when they cannot work with the data, and state what resolution they have taken.
A concrete example: if we ran pathspider3 now, the data would be affected by issue mami-project/pathspider#33. In effect, all IPv4 measurements are fine but all IPv6 measurements are defective. Although the measurement is half broken, we still want to use the IPv4 measurements.
Idea A
If we want to represent this kind of partial validity of uploads, we could make the valid attribute in the upload database hold information on how to isolate valid from invalid data.
For example:
valid: [ip4:true, ip6:false]
This is not a good idea because the isolation description looks different for every data format, and it needs to be powerful enough to cover all problems of every data format. In the end, most likely only the analyzer itself will know what to do with that description.
Idea B
The analyzer module knows best what correct data should look like. When it processes a measurement, it needs to make sure that the data used for generating the observations looks sane.
If it detects a flaw in the data, it should report the flaw and state what it has done to resolve the issue (e.g. "ignored all IPv6 in this file" or "encountered IPv6 with port 0, ignored these"), so that a human can take it into account when drawing conclusions from these observations (or derived observations).
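A minimal sketch of what such a report could look like, assuming records are plain dicts with a hypothetical "dip" (destination IP) key. The names AnalysisReport and filter_sane are made up for illustration and are not part of any existing analyzer API. The analyzer drops IPv6 records and records both the flaw and the resolution once per file:

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class AnalysisReport:
    """Collects flaws found in the data and what was done about them."""
    warnings: list = field(default_factory=list)

    def warn(self, flaw, resolution):
        self.warnings.append({"flaw": flaw, "resolution": resolution})


def filter_sane(records, report):
    """Keep IPv4 records; drop IPv6 records and report the decision.

    The "dip" key is a hypothetical field name used only in this sketch.
    """
    kept = []
    dropped_ip6 = 0
    for rec in records:
        if ipaddress.ip_address(rec["dip"]).version == 6:
            dropped_ip6 += 1
            continue
        kept.append(rec)
    if dropped_ip6:
        report.warn(
            flaw=f"{dropped_ip6} IPv6 record(s) affected by a known defect",
            resolution="ignored all IPv6 in this file",
        )
    return kept
```

The report would be stored alongside the generated observations, so that anyone reading them later can see which parts of the measurement were left out and why.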
TODO: write out in words
The analyzer cannot possibly detect all flaws beforehand, especially when writing a novel analysis.
The analyzer can scan container formats (such as IPFIX); if a measurement includes all required fields, the analyzer takes it into account.
Human operators should set valid to false only if the analyzer cannot possibly detect a flaw by itself, e.g. a bad measurement setup.
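The required-fields check for container formats might be sketched as follows. The set of required fields is an assumption chosen for illustration (the names are standard IPFIX information elements, but which ones a given analyzer needs is up to that analyzer), and accept_measurement is a hypothetical helper:

```python
# Assumed for illustration: the fields this particular analyzer needs.
REQUIRED_FIELDS = frozenset(
    {"sourceIPv4Address", "destinationIPv4Address", "tcpControlBits"}
)


def accept_measurement(available_fields, required=REQUIRED_FIELDS):
    """Decide whether a measurement can be analyzed.

    Returns (True, None) if every required field is present, otherwise
    (False, message) where the message names the missing fields.
    """
    missing = sorted(set(required) - set(available_fields))
    if missing:
        return False, "skipped measurement: missing fields " + ", ".join(missing)
    return True, None
```

Extra fields in the container are ignored; only the absence of a required field causes the measurement to be skipped, and the returned message can be attached to the analyzer's report.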