Contributing guidelines

Introduction

Thank you for contributing to SHAP. SHAP is an open source collective effort, and contributions of all forms are welcome!

You can contribute by:

If you are looking for a good place to get started, look for issues with the good first issue label.

Writing helpful bug reports

When submitting bug reports on the issue tracker, please include a good Minimal Reproducible Example (MRE). This is very helpful for the maintainers.

An MRE should be:

  • Minimal: Use as little code as possible that still produces the same problem.
  • Self-contained: Include everything needed to reproduce your problem, including imports and input data.
  • Reproducible: Test the code you're about to provide to make sure it reproduces the problem.

For more information, see How To Craft Minimal Bug Reports.

Installing the latest version

To get the very latest version of shap, you can pip-install the library directly from the master branch:

pip install git+https://github.com/shap/shap.git@master

This can be useful to test if a particular issue or bug has been fixed since the most recent release.

Alternatively, if you are considering making changes to the code you can clone the repository and install your local copy as described below.

Setting up a local development environment

Fork the repository

Click this link to fork the repository on GitHub to your user area.

Clone the repository to your local environment, using the URL provided by the green <> Code button on your project's home page.

Creating a python environment

Create a new isolated environment for the project, e.g. with conda:

conda create -n shap python=3.11
conda activate shap

Installing from source

To build from source, you need a compiler to build the C extension.

  • On Linux, you can install gcc with:

    sudo apt install build-essential
  • Or on Windows, one way of getting a compiler is to install mingw64.

Pip-install the project with the --editable flag, which ensures that any changes you make to the source code are immediately reflected in your environment.

pip install --editable '.[test,plots,docs]'

The various pip extras are defined in pyproject.toml:

  • test-core: a minimal set of dependencies to run pytest.
  • test: a wider set of 3rd party packages for the full test suite such as tensorflow, pytorch, xgboost.
  • plots: includes matplotlib.
  • docs: dependencies for building the docs with Sphinx.

Note: When installing from source, shap will attempt to build the C extension and the CUDA extension. If CUDA is not available, shap will retry the build without CUDA support.

Consequently, it is quite normal to see warnings such as WARNING: Could not compile cuda extensions when building from source if you do not have CUDA available.

Code checks with pre-commit

We use pre-commit hooks to run code checks. Enable pre-commit in your local environment with:

pip install pre-commit
pre-commit install

To run the checks on all files, use:

pre-commit run --all-files

Ruff is used as a linter, and it is enabled as a pre-commit hook. You can also run ruff locally with:

pip install ruff
ruff check .

Unit tests with pytest

The unit test suite can be run locally with:

pytest

Pull Requests (PRs)

Etiquette for creating PRs

Before starting on a PR, please check for duplicates and then make a proposal by opening an Issue. This isn't necessary for trivial PRs such as fixing a typo.

Keep the scope small. This makes PRs a lot easier to review. Separate functional code changes (such as bug fixes) from refactoring changes (such as style improvements). PRs should contain one or the other, but not both.

Open a Draft PR as early as possible; do not wait until the feature is ready. Work on a feature branch with a descriptive name such as fix/lightgbm-warnings or doc/contributing.

Use a descriptive title, such as:

  • FIX: Update parameters to remove DeprecationWarning in TreeExplainer
  • ENH: Add support for python 3.11
  • DOCS: Fix formatting of ExactExplainer docstring
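The title convention above can be expressed as a small check. This is only an illustrative sketch: the helper name is hypothetical, it is not part of the project's tooling, and the set of accepted prefixes is assumed from the three examples listed.

```python
import re

# Prefixes taken from the example titles above. The full set of prefixes
# accepted by the maintainers is an assumption and may be wider.
ASSUMED_PREFIXES = ("FIX", "ENH", "DOCS")

# A title has the shape "PREFIX: description" with a non-empty description.
TITLE_PATTERN = re.compile("^(" + "|".join(ASSUMED_PREFIXES) + r"): \S.*$")


def looks_like_descriptive_title(title: str) -> bool:
    """Return True if the title follows the 'PREFIX: description' shape."""
    return TITLE_PATTERN.match(title) is not None


print(looks_like_descriptive_title("DOCS: Fix formatting of ExactExplainer docstring"))  # True
print(looks_like_descriptive_title("fixed stuff"))  # False
```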

Checklist for publishing PRs

Before marking your PR as "ready for review" (by removing the Draft status), please ensure:

  • Your feature branch is up-to-date with the master branch,
  • All pre-commit hooks pass, and
  • Unit tests have been added (if your PR adds any new features or fixes a bug).

Documentation

The documentation is hosted at shap.readthedocs.io. If you have modified the docstrings or notebooks, please also check that the changes are rendered properly in the generated HTML files.

Previewing changes on Pull Requests

The documentation is built automatically on each Pull Request, to facilitate previewing how your changes will render. To see the preview:

  1. Look for "All checks have passed", and click "Show all checks".
  2. Browse to the check called "docs/readthedocs.org".
  3. Click the Details hyperlink to open a preview of the docs.

The PR previews are typically hosted on a URL of the form below, replacing <pr-number>:

https://shap--<pr-number>.org.readthedocs.build/en/<pr-number>
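As a convenience, the URL pattern above can be filled in programmatically. A minimal sketch, assuming the pattern holds for your PR; the function name and the PR number 1234 are hypothetical.

```python
def docs_preview_url(pr_number: int) -> str:
    """Build the typical docs preview URL for a PR, following the pattern above."""
    if pr_number <= 0:
        raise ValueError("PR number must be a positive integer")
    return f"https://shap--{pr_number}.org.readthedocs.build/en/{pr_number}"


# 1234 is a made-up PR number used purely for illustration.
print(docs_preview_url(1234))
```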

Building the docs locally

To build the documentation locally:

  1. Navigate to the docs directory.
  2. Run make html.
  3. Open "_build/html/index.html" in your browser to inspect the documentation.

Note that nbsphinx currently requires the stand-alone program pandoc. If you get an error "Pandoc wasn't found", install pandoc as described in the nbsphinx installation guide.

Jupyter notebook style guide

If you are contributing changes to the Jupyter notebooks in the documentation, please adhere to the following style guidelines.

General Jupyter guidelines

Before committing your notebook(s),

  • Ensure that you "Restart Kernel and Run All Cells...", making sure that cells are executed in order, the notebook is reproducible and does not have any hidden states.
  • Ensure that the notebook does not raise syntax warnings in the Sphinx build logs as a result of your changes.
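One rough way to spot a notebook that was not run top to bottom is to inspect the execution counts stored in the .ipynb JSON. The sketch below is an illustration only and not part of the project's checks; the helper names are hypothetical, and it assumes that a clean "Restart Kernel and Run All Cells" leaves non-empty code cells numbered 1, 2, 3, and so on.

```python
import json


def cells_executed_in_order(notebook: dict) -> bool:
    """Return True if non-empty code cells carry execution counts 1, 2, 3, ..."""
    counts = [
        cell.get("execution_count")
        for cell in notebook.get("cells", [])
        # Skip markdown cells and empty code cells, which are never executed.
        if cell.get("cell_type") == "code" and cell.get("source")
    ]
    return counts == list(range(1, len(counts) + 1))


def load_notebook(path: str) -> dict:
    """Read a notebook file from disk as a plain dictionary."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# A tiny in-memory notebook used for illustration.
example = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code", "source": ["import shap"], "execution_count": 1},
        {"cell_type": "code", "source": ["x = 1"], "execution_count": 2},
    ]
}
print(cells_executed_in_order(example))  # True
```

To check a file on disk, pass the result of `load_notebook("notebook1.ipynb")` to `cells_executed_in_order`, replacing the file name with your own notebook.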

Links / Cross-references

Include links in the notebooks wherever they give the reader more background or context on the topic at hand.

Here's an example of how you would accomplish this in a Markdown cell in the notebook:

# Force Plot Colors

The [scatter][scatter_doclink] plot creates Python matplotlib plots that can be customized at will.

[scatter_doclink]: ../../../generated/shap.plots.scatter.rst#shap.plots.scatter

where the link specified is a relative path to the rst file generated by Sphinx. Prefer relative links over absolute paths.

Notebook linting and formatting

We use ruff to perform code linting and auto-formatting on our notebooks. Assuming you have set up pre-commit as described above, these checks will run automatically whenever you commit any changes.

To run the code-quality checks manually, you can do, e.g.:

pre-commit run --files notebook1.ipynb notebook2.ipynb

replacing notebook1.ipynb and notebook2.ipynb with any notebook(s) you have modified.

Maintainer guide

Issue triage

Bug reports and feature requests are managed on the GitHub issue tracker. We use automation to help prioritise and organise the issues.

The good first issue label should be assigned to any issue that could be suitable for new contributors.

The awaiting feedback label should be assigned if more information is required from the author, such as a reproducible example.

The stale bot will mark issues and PRs that have not had any activity for a long period of time with the stale label, and comment to solicit feedback from our community. If there is still no activity, the issue will be closed after a further period of time.

We value feedback from our users very highly, so the bot is configured with long time periods before marking issues as stale.

Issues marked with the todo label will never be marked as stale, so this label should be assigned to any issues that should be kept open such as long-running feature requests.

PR triage

Pull Requests should generally be assigned a category label such as bug, enhancement or BREAKING. These labels are used to categorise the PR in the release notes, as described below.

All PRs should have at least one review before being merged. In particular, maintainers should generally ensure that PRs have sufficient unit tests to cover any fixed bugs or new features.

PRs are usually completed with "squash and merge" in order to maintain a clear linear history and make it easier to debug any issues.

Versioning

shap uses a PEP 440-compliant versioning scheme of MAJOR.MINOR.PATCH. Like numpy, shap does not use semantic versioning, and has never made a major release. Most releases increment MINOR and are typically made every month or two. PATCH releases are sometimes made for important bugfixes.
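To make the scheme concrete, the sketch below computes the next version number for a minor or patch release. The function is hypothetical and is not used by the project, which derives versions from git tags. It assumes a plain MAJOR.MINOR.PATCH string with no pre-release or development suffix.

```python
def next_version(current: str, kind: str) -> str:
    """Return the next version for a 'minor' or 'patch' release."""
    major, minor, patch = (int(part) for part in current.split("."))
    if kind == "minor":
        # A minor release resets the patch component to zero.
        return f"{major}.{minor + 1}.0"
    if kind == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError("kind must be 'minor' or 'patch'")


print(next_version("0.43.0", "minor"))  # 0.44.0
print(next_version("0.43.0", "patch"))  # 0.43.1
```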

Breaking changes are done with care, given that shap is a very popular package. When breaking changes are made, the PR should be tagged with the BREAKING label to ensure it is highlighted in the release notes. Deprecation cycles are used to mitigate the impact on downstream users.

GitHub milestones can be used to track any actions that need to be completed for a given release, such as those relating to deprecation cycles.

We use setuptools-scm to source the version number from the git history automatically. At build time, the version number is determined from the git tag.
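To see which version string a given environment ended up with, you can query the installed package metadata. A minimal sketch using only the standard library; the helper name is hypothetical. As an assumption about setuptools-scm's default behaviour, a build from a commit after the latest tag usually reports a development version string and not a plain release number.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str = "shap"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version recorded at build time, or None if shap is not installed.
print(installed_version())
```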

Minimum supported dependencies

We aim to follow the SPEC 0 convention on minimum supported dependencies.

  • Support for a Python version is dropped 3 years after its initial release.
  • Support for a core package dependency is dropped 2 years after its initial release.

We may support python versions for slightly longer than this window where it does not add any extra maintenance burden.
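The windows above amount to simple date arithmetic. The sketch below is a simplified reading of them, with a hypothetical helper and an illustrative release date; consult SPEC 0 itself for the exact rules on when support ends.

```python
from datetime import date

PYTHON_WINDOW_YEARS = 3  # from the convention above
CORE_DEPENDENCY_WINDOW_YEARS = 2  # from the convention above


def support_dropped_on(released: date, years: int) -> date:
    """Return the date `years` after `released`."""
    try:
        return released.replace(year=released.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year.
        return released.replace(year=released.year + years, day=28)


# Illustrative release date only, not looked up from any release calendar.
example_release = date(2022, 10, 24)
print(support_dropped_on(example_release, PYTHON_WINDOW_YEARS))  # 2025-10-24
```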

Making releases

We try to use automation to make the release process reliable, transparent and reproducible. This also helps us make releases more frequently.

A release is made by publishing a GitHub Release, tagged with an appropriately incremented version number.

When a release is published, the wheels will be built and published to PyPI automatically by the build_wheels GitHub action. This workflow can also be triggered manually at any time to do a dry-run of cibuildwheel.

In the run-up to a release, create a GitHub issue for the release such as [Meta issue] Release 0.43.0. This can be used to co-ordinate with other maintainers and agree to make a release.

Suggested release checklist:

- [ ] Dry-run cibuildwheel & test
- [ ] Make GitHub release & tag
- [ ] Confirm PyPI wheels published
- [ ] Conda forge published

The conda package is managed in a separate repo. The conda-forge bot will automatically make a PR to this repo to update the conda package, typically within a few hours of the PyPI package being published.

Release notes from PR labels

Release notes can be automatically drafted by GitHub using the titles and labels of PRs that were merged since the previous release. See the GitHub docs on automatically generated release notes for more information.

The generated notes will follow the template defined in .github/release.yml, arranging PRs into subheadings by label and excluding PRs made by bots. See the docs for the available configuration options.

It's helpful to assign labels such as BREAKING, bug, enhancement or skip-changelog to each PR, so that the change will show up in the notes under the right section. It also helps to ensure each PR has a descriptive name.
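The effect of labels on the drafted notes can be pictured with a simplified model. This sketch does not reproduce GitHub's implementation or the contents of .github/release.yml: the section headings are hypothetical, and it assumes that skip-changelog removes a PR from the notes.

```python
# Hypothetical label-to-section mapping; the real one lives in .github/release.yml.
SECTIONS = {
    "BREAKING": "Breaking changes",
    "enhancement": "Enhancements",
    "bug": "Bug fixes",
}
EXCLUDE_LABEL = "skip-changelog"  # assumed to drop a PR from the notes
FALLBACK_SECTION = "Other changes"  # hypothetical catch-all heading


def draft_notes(pull_requests):
    """Group PR titles under the first matching section, in SECTIONS order."""
    notes = {}
    for pr in pull_requests:
        labels = set(pr["labels"])
        if EXCLUDE_LABEL in labels:
            continue
        section = next(
            (name for label, name in SECTIONS.items() if label in labels),
            FALLBACK_SECTION,
        )
        notes.setdefault(section, []).append(pr["title"])
    return notes


# Example PRs reuse the sample titles from the PR etiquette section.
merged = [
    {"title": "FIX: Update parameters to remove DeprecationWarning in TreeExplainer", "labels": ["bug"]},
    {"title": "ENH: Add support for python 3.11", "labels": ["enhancement"]},
    {"title": "DOCS: Fix formatting of ExactExplainer docstring", "labels": ["skip-changelog"]},
]
for section, titles in draft_notes(merged).items():
    print(section, titles)
```

A PR with no recognised label lands in the catch-all section in this model, which is one reason to label every PR before a release.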

The notes can be edited (both before and after release) to remove information that is unlikely to be of high interest to users, such as maintenance updates.