Threshold of anomaly #8

Closed
GodofRap opened this issue May 26, 2023 · 4 comments

Comments

@GodofRap

Hi, I have a quick question about how to decide the threshold for the predicted anomaly. I see there is always a pre-defined threshold in thresholds.json for NAB. How do you decide the threshold in ARTime? Do you find it with the ROC curve or something? What do you think is a good threshold strategy (e.g., a dynamic threshold) for online settings?

@markNZed
Owner

NAB calculates the threshold that maximizes the score across the benchmark. This version of ARTime is built for NAB. In a real system, an ART-based solution would take the real-time score into account and adapt based on that feedback. Ideally, NAB would be extended to allow for this strategy; I raised an issue about that at numenta/NAB#391.

@GodofRap
Author

If NAB calculates the threshold that maximizes the score, doesn't that mean it uses the ground-truth labels to decide the threshold? Doesn't that violate the unsupervised setting, since the threshold is a hyperparameter?

@markNZed
Owner

markNZed commented May 28, 2023

The threshold is not passed to the detectors; it is calculated as part of the scoring algorithm. NAB assumes the detector outputs the likelihood of an anomaly (a value between 0 and 1). A detector like ARTime can be configured to give a binary indication of anomalies, but that would be a major disadvantage given the way NAB is scored.

One major issue with NAB is that it provides the range of the input values, which gives away a lot of information about the future values of a given time series.

@GodofRap
Author

Okay, I see. I just went through the calcScoreByThreshold function. Each candidate threshold is tried against all the prediction probabilities, and the one with the best score is chosen. Thanks for the clarification.
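
For anyone reading later, the sweep described above can be sketched roughly as follows. This is a simplified illustration, not NAB's code: `simple_score` and `best_threshold` are hypothetical names, the per-point weights are made up, and NAB's real scoring is window-based rather than per-point. The sketch also assumes the candidate thresholds are the distinct anomaly scores the detector produced.

```python
from typing import Sequence, Tuple


def simple_score(flags: Sequence[bool], labels: Sequence[bool],
                 tp_weight: float = 1.0, fp_weight: float = 0.5,
                 fn_weight: float = 1.0) -> float:
    """Toy per-point score (hypothetical stand-in for NAB's window-based scoring)."""
    score = 0.0
    for flagged, is_anomaly in zip(flags, labels):
        if flagged and is_anomaly:
            score += tp_weight      # detected a labelled anomaly
        elif flagged and not is_anomaly:
            score -= fp_weight      # false alarm
        elif not flagged and is_anomaly:
            score -= fn_weight      # missed anomaly
    return score


def best_threshold(scores: Sequence[float],
                   labels: Sequence[bool]) -> Tuple[float, float]:
    """Try each distinct anomaly score as a threshold; return (threshold, score)."""
    # Baseline: a threshold above every score, so nothing is flagged.
    best = (float("inf"), simple_score([False] * len(scores), labels))
    for candidate in sorted(set(scores), reverse=True):
        flags = [s >= candidate for s in scores]
        result = simple_score(flags, labels)
        if result > best[1]:        # keep the first threshold reaching the best score
            best = (candidate, result)
    return best


labels = [False, False, True, False, True, False]
graded = [0.1, 0.2, 0.9, 0.3, 0.8, 0.05]   # likelihood-style output
binary = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]    # detector that only emits 0 or 1

print(best_threshold(graded, labels))
print(best_threshold(binary, labels))
```

Note that the labels are only used here, at scoring time, and never reach the detector. With the graded output the sweep has six candidate thresholds to choose from; with the binary output it only has two, which is one way to see why a binary detector is at a disadvantage under this kind of scoring.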

2 participants