scikit-learn-contrib/radius_clustering

Radius Clustering

Radius Clustering is a Python package for clustering under a radius constraint, based on the Minimum Dominating Set (MDS) problem. The MDS problem is NP-hard, but it has been studied in the literature and shown to be closely linked to the radius-constrained clustering problem (see the references for details).
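To make the link between dominating sets and radius-constrained clustering concrete, here is a minimal sketch in plain NumPy. It is not the package's implementation: the function name greedy_radius_clustering is hypothetical, and the greedy rule is a simple stand-in for the exact and approximate solvers the package ships. The idea is to connect every pair of points that lie within the radius of each other, pick a set of points that dominates this graph, and use the picked points as cluster centers.

```python
import numpy as np


def greedy_radius_clustering(X, radius):
    """Illustrative greedy dominating-set heuristic (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances between all points.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # covers[i, j] is True when point i lies within `radius` of point j.
    covers = dist <= radius
    uncovered = np.ones(len(X), dtype=bool)
    centers = []
    while uncovered.any():
        # Pick the point that dominates the most still-uncovered points.
        gains = (covers & uncovered).sum(axis=1)
        best = int(np.argmax(gains))
        centers.append(best)
        uncovered &= ~covers[best]
    centers = np.array(centers)
    # Assign every point to its nearest chosen center.
    labels = np.argmin(dist[:, centers], axis=1)
    return centers, labels


rng = np.random.default_rng(0)
X = rng.random((50, 2))
centers, labels = greedy_radius_clustering(X, radius=0.3)
```

Because the chosen centers dominate the graph, every point ends up within the radius of its assigned center. A greedy choice does not guarantee the smallest possible number of centers, which is the part that makes the exact problem hard.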

Features

  • Implements both exact and approximate MDS-based algorithms for radius-constrained clustering
  • Compatible with scikit-learn's API for clustering algorithms
  • Easy to use and integrate with existing Python data science workflows
  • Includes comprehensive documentation and examples
  • Full test coverage to ensure reliability and correctness
  • Supports custom MDS solvers for flexibility in clustering approaches

Caution

Deprecation Notice: The threshold parameter of the RadiusClustering class is deprecated and is planned to be removed entirely in version 2.0.0. Use the radius parameter instead to specify the clustering radius. This change is part of our effort to make parameter names more intuitive and user-friendly.
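If your own code still forwards a threshold value, a small shim can ease the migration. The sketch below is a hypothetical helper for calling code, not part of the package: the name resolve_radius and its behavior are assumptions made for illustration. It accepts either keyword, warns when the old one is used, and returns the value to pass as radius.

```python
import warnings


def resolve_radius(radius=None, threshold=None):
    """Hypothetical helper: map a legacy `threshold` value onto `radius`."""
    if threshold is not None:
        warnings.warn(
            "'threshold' is deprecated; pass 'radius' instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        # Prefer an explicit radius if both were given.
        if radius is None:
            radius = threshold
    if radius is None:
        raise ValueError("A radius must be provided.")
    return radius
```

The returned value can then be passed to the constructor, for example RadiusClustering(manner="approx", radius=resolve_radius(threshold=0.5)).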

Note

NEW VERSIONS: The package is currently under active development for new features and improvements, including some refactoring and enhancements to the existing codebase. Backwards compatibility is not guaranteed, so please check the CHANGELOG for details on changes and updates.

Roadmap

  • Version 1.4.0:
    • Add support for custom MDS solvers
    • Improve documentation and examples
    • Add more examples and tutorials

Installation

You can install Radius Clustering using pip:

pip install radius-clustering

Usage

Here's a basic example of how to use Radius Clustering:

import numpy as np
from radius_clustering import RadiusClustering

# Example usage
X = np.random.rand(100, 2)  # Generate random data

# Create a RadiusClustering instance that uses the approximate MDS solver
rad_clustering = RadiusClustering(manner="approx", radius=0.5)

# Fit the model to the data
rad_clustering.fit(X)

# Get cluster labels
labels = rad_clustering.labels_

print(labels)
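Once you have labels, you may want to confirm that they respect the radius constraint. The following is a minimal sketch using only NumPy. The function name clusters_respect_radius is hypothetical, and it assumes that each cluster center is one of the data points, which is how a dominating-set formulation picks centers. It does not rely on any attribute of the fitted estimator other than labels_.

```python
import numpy as np


def clusters_respect_radius(X, labels, radius):
    """Return True if each cluster has a member within `radius` of all its members."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    for label in np.unique(labels):
        members = X[labels == label]
        # Pairwise distances inside the cluster.
        diff = members[:, None, :] - members[None, :, :]
        dist = np.sqrt((diff ** 2).sum(axis=-1))
        # For each candidate center, the distance to its farthest member;
        # the cluster is valid if the best candidate stays within the radius.
        if dist.max(axis=1).min() > radius:
            return False
    return True
```

With the example above, the check would be called as clusters_respect_radius(X, rad_clustering.labels_, 0.5).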

Documentation

You can find the full documentation for Radius Clustering here.

Building the documentation

To build the documentation, run the following commands, assuming all required dependencies are installed:

cd docs
make html

Then you can open the index.html file in the build directory to view the full documentation.

More information

For more information please refer to the official documentation.

If you want insights on how the algorithm works, please refer to the presentation.

If you want to know more about the experiments conducted with the package, please refer to the experiments.

Contributing

Contributions to Radius Clustering are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.

Acknowledgments

MDS Algorithms

The two MDS algorithms implemented are adapted from code by the following authors:

  • Alejandra Casado for the minimum dominating set heuristic code [1]. We rewrote the code in C++ to suit the needs of the Python interface.
  • Hua Jiang for the minimum dominating set exact algorithm code [2]. The code has been adapted to suit the needs of the Python interface.

Funders

The Radius Clustering work has been funded by:

Contributors

References
