Fix for Issues with CluStream clustering #1634

Open: wants to merge 2 commits into base: main


PhilJahn

Two fixes regarding issues with the clustering procedure of CluStream:

  1. Fixes a mismatch between the centers used in the distance calculation and those used for cluster assignment, which led to incorrect assignments.
  2. Ensures that k-Means clustering runs at every time gap, rather than being skipped when a data point is added to an existing micro-cluster.

A more in-depth description can be found here: Issues with CluStream clustering #1633

The first fix deals with a mismatch between the centers used for distance calculation (mc.center) and the centers used for cluster assignment (_mc_centers). This causes issues because _mc_centers is only updated when k-Means runs. Micro-clusters can be merged or deleted in between, so the index found by the check for the closest micro-cluster does not necessarily refer to the same micro-cluster as the center stored at that index in _mc_centers.
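The mismatch can be illustrated with a minimal, self-contained sketch. It does not use the library's code: `live_centers` stands in for the per-micro-cluster `mc.center` values, `cached_centers` for a `_mc_centers` snapshot taken at the last k-Means run, and `macro_centers` and `nearest_index` are hypothetical names introduced only for this illustration.

```python
import math


def nearest_index(point, centers):
    """Return the index of the center closest to `point` (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


# Hypothetical macro-cluster centers produced by the last k-Means run.
macro_centers = [(0.0, 0.0), (45.0, 45.0)]

# Snapshot of micro-cluster centers taken at that run (stand-in for _mc_centers).
cached_centers = [(40.0, 40.0), (10.0, 10.0)]

# Live micro-cluster centers (stand-in for mc.center). Slot 0 has since been
# reused for a different micro-cluster after a merge or deletion.
live_centers = [(2.0, 2.0), (10.0, 10.0)]

point = (1.0, 1.0)

# The distance check runs against the live centers and picks slot 0.
mc_index = nearest_index(point, live_centers)

# Looking that index up in the stale snapshot yields a center that belongs
# to a micro-cluster which no longer occupies slot 0.
label_from_cache = nearest_index(cached_centers[mc_index], macro_centers)
label_from_live = nearest_index(live_centers[mc_index], macro_centers)

print(mc_index, label_from_cache, label_from_live)
```

In this toy setup the same index leads to two different cluster labels depending on which set of centers is consulted, which is the kind of incorrect assignment described above.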
The second fix deals with an issue where CluStream can skip the k-Means step if the data point arriving at a time step where "self._timestamp % self.time_gap == self.time_gap - 1" holds is added to an existing micro-cluster.
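A simplified sketch of the control flow shows how the k-Means step can be missed. It is an assumption-laden model, not the library's implementation: `count_reclusterings`, `absorbed_flags`, and `always_check` are hypothetical names, and only the time-gap condition is taken from the description above.

```python
def count_reclusterings(absorbed_flags, time_gap, always_check):
    """Count how often the k-Means step would be triggered.

    absorbed_flags[t] is True when the point at time step t is added to an
    existing micro-cluster. With always_check=False, absorption exits early
    and the time-gap check is never reached (the behaviour described as the
    issue). With always_check=True, the check runs for every point.
    """
    runs = 0
    for timestamp, absorbed in enumerate(absorbed_flags):
        if absorbed and not always_check:
            continue  # early exit: the time-gap check below is skipped
        if timestamp % time_gap == time_gap - 1:
            runs += 1  # the k-Means step would run here
    return runs


# With time_gap=3 the condition holds at time steps 2 and 5.
# The point at time step 2 is absorbed by an existing micro-cluster.
flags = [False, False, True, False, False, False]

skipping = count_reclusterings(flags, time_gap=3, always_check=False)
fixed = count_reclusterings(flags, time_gap=3, always_check=True)
print(skipping, fixed)
```

With the early exit, the re-clustering due at time step 2 is lost and only the one at time step 5 runs. Checking the condition regardless of how the point was handled restores both.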