Parallel #83

Merged on Jul 19, 2018 (4 commits)


rphes (Contributor) commented Jul 19, 2018

Parallel implementations of K-Modes and K-Prototypes. See #76 for more information.

The approach is mostly identical to scikit-learn's implementation of parallel K-Means. The differences are:

  • allowing multi-core reproducibility by propagating random seeds to all jobs
  • extracting the parallel unit of work, a single run, to a separate function
  • performing units of work sequentially if n_jobs=1, or using joblib otherwise
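
The three changes listed above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `single_run`, `fit`, and `hamming` are hypothetical names, and the "run" is a toy stand-in that picks random centroids and scores them without any iterative updates. It assumes joblib is installed when `n_jobs` is not 1.

```python
import random


def hamming(a, b):
    """Number of positions at which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def single_run(points, n_clusters, seed):
    """The parallel unit of work: one initialization run (toy version).

    Each run builds its own generator from the seed it receives, so the
    result does not depend on which worker executes it or in what order.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, n_clusters)
    cost = sum(min(hamming(p, c) for c in centroids) for p in points)
    return cost, centroids, seed


def fit(points, n_clusters, n_init=4, n_jobs=1, random_state=0):
    """Run n_init independent runs and keep the one with the lowest cost."""
    # Draw one seed per run up front and pass it to the job explicitly.
    master = random.Random(random_state)
    seeds = [master.randrange(2**32) for _ in range(n_init)]

    if n_jobs == 1:
        # Sequential path: no joblib overhead.
        results = [single_run(points, n_clusters, s) for s in seeds]
    else:
        # Parallel path: joblib returns results in submission order.
        from joblib import Parallel, delayed
        results = Parallel(n_jobs=n_jobs)(
            delayed(single_run)(points, n_clusters, s) for s in seeds
        )

    # min() keeps the first of any tied runs, so the choice is deterministic.
    return min(results, key=lambda r: r[0])
```

Because the seeds are derived from `random_state` before any work is dispatched, a call with `n_jobs=1` and a call with `n_jobs=2` evaluate the same set of runs in this sketch.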

Note that parallel execution only makes sense for n_init>1: the parallel unit of work is a single run, so with n_init=1 there is only one run to execute. This is not enforced in code.
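
Since the constraint is not enforced, a caller could check the combination before fitting. The helper below is purely hypothetical (it is not part of this PR or of the library) and only sketches one way to flag a configuration where extra jobs would have nothing to do.

```python
import warnings


def check_parallel_args(n_init, n_jobs):
    """Hypothetical helper: return False and warn if extra jobs would sit idle."""
    if n_jobs != 1 and n_init <= 1:
        warnings.warn(
            "n_jobs != 1 has no effect when n_init <= 1: "
            "there is only a single run to execute.",
            UserWarning,
        )
        return False
    return True
```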

I updated the documentation and tests accordingly. The expected outcomes of K-Modes tests had to be changed due to the new method of seeding.

When this PR is merged, I will look into updating the examples/benchmarks to make use of the changes.

coveralls commented Jul 19, 2018


Coverage increased (+0.1%) to 96.706% when pulling e2ca4b1 on rphes:parallel into 73b4ffe on nicodv:master.

nicodv (Owner) commented Jul 19, 2018

Beautiful work, @rphes!

nicodv merged commit caa40b3 into nicodv:master on Jul 19, 2018