Chauvenet criterion wrong results #8
Comments
Thanks for spotting this! That being said, perhaps
This would produce what is expected for a normal distribution while also catering better for non-normal distributions.
Thanks for your feedback.
Hi @baptistelabat-syroco,
Describe the bug
Chauvenet's criterion should flag roughly one sample drawn from a normal distribution as an outlier.
Here it flags 30 to 40% of the points as outliers.
To Reproduce
from pythresh.thresholds.chau import CHAU
import numpy as np
normal_array = np.random.randn(99)
outlier_array = CHAU().eval(normal_array)
print(np.vstack([normal_array, outlier_array]).T)
np.sum(outlier_array)
Expected behavior
When tested on a normal distribution, only a few points, or none, should be considered outliers.
Additional context
https://www.statisticshowto.com/chauvenets-criterion/
This table can be obtained with the following code:
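from scipy import stats as scipy_stats
n = 99  # sample size (example value, same as in the reproduction above)
prob_threshold = 1.0 / (2.0 * n)
number_of_tails = 2
# isf(1 - q) is the lower-tail quantile (negative here), so negating it gives the positive critical z-score
threshold = -scipy_stats.norm.isf(1 - prob_threshold/number_of_tails)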
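To make the expected behavior concrete, here is a minimal sketch of the textbook criterion applied to a normal sample. It is not pythresh's implementation: `chauvenet_mask` is a hypothetical helper name, the seed is arbitrary, and standardizing with the sample mean and sample standard deviation (`ddof=1`) is an assumption.

```python
import numpy as np
from scipy import stats


def chauvenet_mask(values):
    """Return a boolean array that is True where Chauvenet's criterion flags a point.

    Hypothetical helper for illustration only; not part of pythresh.
    """
    values = np.asarray(values, dtype=float)
    n = values.size
    # Two-sided probability below which a point is rejected.
    prob_threshold = 1.0 / (2.0 * n)
    # Critical z-score: upper-tail quantile for half of that probability.
    z_crit = stats.norm.isf(prob_threshold / 2.0)
    # Assumption: standardize with the sample mean and sample standard deviation.
    z = np.abs(values - values.mean()) / values.std(ddof=1)
    return z > z_crit


rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
normal_array = rng.standard_normal(99)
flagged = chauvenet_mask(normal_array)
print(int(flagged.sum()), "of", normal_array.size, "points flagged")
```

With the rejection probability set to 1/(2n), the expected number of flagged points in a normal sample of any size is about 0.5, so a count of zero to a few is what this sketch should print, far from 30 to 40% of the sample.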