Add method to analyzercontext to complain about data. #20

gubser · 2016-07-01T09:08:22Z

Analyzers should emit warnings, errors, or complaints when they cannot work with the data, and report what resolution they have taken.

A concrete example: if we ran pathspider3 now, the data would be affected by issue mami-project/pathspider#33. In effect, all IPv4 measurements are fine but all IPv6 measurements are defective. Although the measurement is half broken, we still want to use the IPv4 measurements.

Idea A

If we want to represent this kind of partial validity of uploads, we could make the valid attribute in the upload database hold information on how to isolate valid from invalid data.

For example:

valid: [ip4:true, ip6:false]

This is not a good idea: the isolation description looks different for every data format, and it needs to be powerful enough to cover every problem of every data format. In the end, most likely only the analyzer itself will know what to do with that description.

Idea B

The analyzer module knows best what correct data should look like. When it processes a measurement, it needs to make sure that the data used for generating the observations looks sane.
If it detects a flaw in the data, it should report the flaw and state what it has done to resolve the issue (e.g. "ignored all IPv6 in this file" or "encountered IPv6 with port 0, ignored these"), so that a human can take it into account when drawing conclusions from these observations (or from derived observations).
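
A minimal sketch of what such a method could look like, assuming a Python analyzer. The names `AnalyzerContext.complain`, `Complaint`, `usable_records`, and the record layout (`ip_version` key) are hypothetical and only illustrate the idea; none of them is an existing API.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator, List


@dataclass
class Complaint:
    """One reported data flaw plus the resolution taken (hypothetical type)."""
    upload_id: str
    flaw: str
    resolution: str


@dataclass
class AnalyzerContext:
    """Stand-in for the real analyzer context; only complaints are sketched."""
    complaints: List[Complaint] = field(default_factory=list)

    def complain(self, upload_id: str, flaw: str, resolution: str) -> None:
        # Keep the complaint so a human can read it next to the observations.
        self.complaints.append(Complaint(upload_id, flaw, resolution))


def usable_records(ctx: AnalyzerContext, upload_id: str,
                   records: Iterable[Dict]) -> Iterator[Dict]:
    """Yield IPv4 records; skip IPv6 records and report that once per upload."""
    dropped = 0
    for rec in records:
        if rec["ip_version"] == 6:  # assumed record layout
            dropped += 1
            continue
        yield rec
    if dropped:
        ctx.complain(
            upload_id,
            flaw=f"{dropped} IPv6 records affected by mami-project/pathspider#33",
            resolution="ignored all IPv6 in this file",
        )
```

Because `usable_records` is a generator, the complaint is only recorded once the caller has consumed all records. Reporting once per upload rather than once per record keeps the complaint list readable.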

TODO: write out in words

Not all flaws can be detected beforehand, especially when writing a novel analysis.
The analyzer can scan container formats (such as IPFIX), and if a measurement includes all required fields, the analyzer takes it into account.
Human operators should set valid to false only if the analyzer cannot possibly detect a flaw by itself, e.g. a bad measurement setup.
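
The required-fields check and the operator override could be combined roughly as follows. This is a sketch under assumptions: `REQUIRED_FIELDS`, `missing_fields`, and `should_analyze` are hypothetical names, and the listed IPFIX information elements are only an illustration of what one particular analysis might require.

```python
from typing import FrozenSet, Iterable, Set

# Illustrative only: the fields an analyzer requires depend on the analysis.
REQUIRED_FIELDS: FrozenSet[str] = frozenset({
    "sourceIPv4Address",
    "destinationIPv4Address",
    "flowStartMilliseconds",
})


def missing_fields(available: Iterable[str],
                   required: Iterable[str] = REQUIRED_FIELDS) -> Set[str]:
    """Return the required fields that the measurement does not provide."""
    return set(required) - set(available)


def should_analyze(available: Iterable[str], operator_valid: bool = True) -> bool:
    """Use a measurement only if no operator marked it invalid
    and it carries every required field."""
    return operator_valid and not missing_fields(available)
```

The `operator_valid` flag models the case the analyzer cannot detect on its own, such as a bad measurement setup; everything else is decided from the data.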

gubser mentioned this issue Jul 1, 2016

Adapt analyzer to actually use Spark rather than just collect data from HDFS gubser/analyzer-ecnspider1#2

Open

gubser added the ideas label Jul 20, 2016
