-
I'm sorry, I don't know enough about drift detection to be of help. @jacobmontiel is a bit busy at the moment, but I'm sure he'll pop by at some point.
-
@TawabG Did you figure it out? Why are the two drifts not correct when printing every 100 steps?
-
Hi all!
Thank you for developing this awesome package!
I have been experimenting with drift detection methods; however, I cannot understand why certain things are happening. I followed @jacobmontiel's tutorial, part of the "Open Source Machine Learning for Data Streams" tutorial at DSAA 2021.
Let me take you through my process:
The data is generated using the AGRAWAL data generator with 3 gradual drifts at the 5k, 10k, and 15k marks. Dataset: agr_a_20k.csv
I then used progressive_val_score() to print the accuracy scores and store them in a log file. After that, I parsed the log file, converted the scores into a readable format, and collected them in a list.
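For anyone reproducing this step, here is a minimal sketch of how the log could be turned into a list. It assumes each checkpoint line looks like `[1,000] Accuracy: 64.20%`; that pattern, the helper name `parse_scores`, and the sample lines are illustrative assumptions, not the actual log from this post, so the regular expression may need adjusting.

```python
import re

# Assumed checkpoint format, e.g. "[1,000] Accuracy: 64.20%".
# Adjust the pattern if the real log lines look different.
LINE_RE = re.compile(r"\[([\d,]+)\]\s+Accuracy:\s+([\d.]+)%")


def parse_scores(lines):
    """Return (step, accuracy) pairs, with accuracy as a fraction in [0, 1]."""
    scores = []
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip lines that are not metric checkpoints
        step = int(match.group(1).replace(",", ""))
        accuracy = float(match.group(2)) / 100.0
        scores.append((step, accuracy))
    return scores


# Made-up sample lines, only to show the shape of the input and output.
sample_log = [
    "[100] Accuracy: 58.00%",
    "[200] Accuracy: 61.50%",
    "some unrelated line",
    "[1,000] Accuracy: 64.20%",
]
scores = parse_scores(sample_log)
```

Keeping the step next to each score avoids having to reconstruct the stream position from the list index later on.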
I then plotted the results, which gives this graph:
However, when I try to find out where a drift happens by using ADWIN, I get these results:
I can understand why ADWIN is detecting so many changes, but I wonder if there is a specific way to focus only on the gradual changes at the 5k/10k/15k marks. I really hope somebody can explain this topic to me!
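One thing that may be worth checking: if the logged values are running accuracies computed over all samples seen so far (which is how a plain accuracy metric behaves unless it is wrapped in a rolling window), then later drops get averaged away and the signal given to the detector is heavily smoothed. The sketch below recovers the accuracy of each window between checkpoints from such running values. The function name `windowed_accuracy` and the numbers are hypothetical, and the result is only as precise as the rounding in the log allows.

```python
def windowed_accuracy(cumulative):
    """Convert running accuracies into per-window accuracies.

    cumulative: list of (step, running_accuracy) pairs with increasing steps.
    Returns (step, accuracy over the samples since the previous checkpoint).
    """
    result = []
    prev_step, prev_correct = 0, 0.0
    for step, acc in cumulative:
        correct = acc * step          # estimated number of correct predictions so far
        window = step - prev_step     # number of samples in this window
        result.append((step, (correct - prev_correct) / window))
        prev_step, prev_correct = step, correct
    return result


# Hypothetical running accuracies: the running value only falls from 0.80 to 0.70,
# but the last window on its own is at 0.50.
cumulative = [(100, 0.80), (200, 0.80), (300, 0.70)]
per_window = windowed_accuracy(cumulative)
```

Whether a windowed signal like this (or the per-sample 0/1 correctness) suits the detector better than the running accuracy depends on the setup, so treat this as something to try rather than a fix.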
Update 1: When using the metric to print every 100 steps, it starts to get interesting:
However, this is still not very accurate. We know the gradual drifts are at the 5k/10k/15k marks, which should translate to drift detections at around 50 (×100), 100 (×100), and 150 (×100). That is clearly not the case, as can be seen from the graph above.
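To make the index arithmetic explicit, here is a small sketch that maps a checkpoint index back to a stream position and reports how far it is from the nearest known drift mark. It assumes checkpoint k (1-based) corresponds to sample k × 100; the helper names and the detected indices are hypothetical, not results from this experiment.

```python
PRINT_EVERY = 100                          # metric printed every 100 samples
EXPECTED_DRIFTS = [5_000, 10_000, 15_000]  # drift marks of the generated stream


def to_stream_position(checkpoint_index, print_every=PRINT_EVERY):
    """Map a 1-based checkpoint index to a position in the original stream."""
    return checkpoint_index * print_every


def nearest_expected(position, expected=EXPECTED_DRIFTS):
    """Return (nearest expected drift, signed offset of position from it)."""
    nearest = min(expected, key=lambda e: abs(e - position))
    return nearest, position - nearest


# Hypothetical detections, given as checkpoint indices.
detected_checkpoints = [63, 127]
for idx in detected_checkpoints:
    pos = to_stream_position(idx)
    nearest, offset = nearest_expected(pos)
    print(f"checkpoint {idx} -> sample {pos}, nearest drift {nearest}, offset {offset:+d}")
```

A positive offset means the detection came after the drift mark. Some delay is to be expected, since a detector can only react once enough post-drift samples have been seen, and a gradual drift spreads the change over many samples.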