diff --git a/.github/workflows/pkgdown.yaml b/.github/workflows/pkgdown.yaml index 9fc1a221b..94a581931 100644 --- a/.github/workflows/pkgdown.yaml +++ b/.github/workflows/pkgdown.yaml @@ -41,6 +41,13 @@ jobs: extra-packages: any::pkgdown, local::. needs: website + - name: Render python README to assets folder + run: | + rmarkdown::render(input = "python/README.md", + output_format = "html_document", + output_file = "../pkgdown/assets/README_py.html") + shell: Rscript {0} + - name: Build site run: pkgdown::build_site_github_pages(new_process = FALSE, install = FALSE) shell: Rscript {0} diff --git a/README.Rmd b/README.Rmd index ae925e8fc..ae2e80860 100644 --- a/README.Rmd +++ b/README.Rmd @@ -24,12 +24,17 @@ knitr::opts_chunk$set( [![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/license/mit) [![DOI](https://joss.theoj.org/papers/10.21105/joss.02027/status.svg)](https://doi.org/10.21105/joss.02027) + +See the pkgdown site at [norskregnesentral.github.io/shapr/](https://norskregnesentral.github.io/shapr/) +for a complete introduction with examples and documentation of the package. 
-## Brief NEWS -This is `shapr` version 1.0.0 (Released on GitHub Nov 2024), which provides a full restructuring of the code based, and -provides a full suit of new functionality, including: +## NEWS + +With `shapr` version 1.0.0 (GitHub only, Nov 2024) and version 1.0.1 (CRAN, Jan 2025), +the package was subject to a major update, providing a full restructuring of the code based, and +a full suit of new functionality, including: * A long list of approaches for estimating the contribution/value function $v(S)$, including Variational Autoencoders, and regression-based methods @@ -40,23 +45,24 @@ and regression-based methods * Several other methodological, computational and user-experience improvements * Python wrapper making the core functionality of `shapr` available in Python -Below we provide a brief overview of the breaking changes. -See the [NEWS](https://github.com/NorskRegnesentral/shapr/blob/master/NEWS.md) for the full list of details. - -### Breaking changes +See the [NEWS](https://github.com/NorskRegnesentral/shapr/blob/master/NEWS.md) for a complete list. -The new syntax for explaining models essentially amounts to using a single function (`explain()`) instead of two functions (`shapr()` and `explain()`). +### Coming from shapr < 1.0.0? +`shapr` version > 1.0.0 comes with a number of breaking changes. +Most notably, we moved from using two function (`shapr()` and `explain()`) to +a single function (`explain()`). In addition, custom models are now explained by passing the prediction function directly to `explain()`, -some input arguments got new names, and a few functions for edge cases was removed to simplify the code base. +quite a few input arguments got new names, and a few functions for edge cases was removed to simplify the code base. -Note that the CRAN version of `shapr` (v0.2.2) still uses the old syntax. -The examples below uses the new syntax. 
-[Here](https://github.com/NorskRegnesentral/shapr/blob/cranversion_0.2.2/README.md) is a version of this README with the syntax of the CRAN version (v0.2.2). +Click [here](https://github.com/NorskRegnesentral/shapr/blob/cranversion_0.2.2/README.md) to view a version of this +README with old syntax (v0.2.2). ### Python wrapper -We now also provide a Python wrapper (`shaprpy`) which allows explaining python models with the methodology implemented in `shapr`, directly from Python. -The wrapper is available [here](https://github.com/NorskRegnesentral/shapr/tree/master/python). +We provide an (experimental) Python wrapper (`shaprpy`) which allows explaining Python models with the methodology +implemented in `shapr`, directly from Python. +The wrapper calls `R` internally, and therefore requires an installation of `R`. +See [here](https://github.com/NorskRegnesentral/shapr/tree/master/python) for installation instructions and examples. ## The package @@ -72,28 +78,28 @@ shapr is as a highly efficient and user-friendly tool, delivering precise estima which are critical for understanding how features truly contribute to predictions. A basic example is provided below. -Otherwise we refer to the [pkgdown website](https://norskregnesentral.github.io/shapr/) and the vignettes there -for details and further examples. +Otherwise we refer to the [pkgdown website](https://norskregnesentral.github.io/shapr/) and the different vignettes +there for details and further examples. 
## Installation -We highly recommend to install the development version of shapr (with the new explanation syntax and all functionality), +`shapr` is available on [CRAN](https://cran.r-project.org/package=shapr) and can be installed in R as: ```{r, eval = FALSE} -remotes::install_github("NorskRegnesentral/shapr") +install.packages("shapr") ``` -To also install all dependencies, use +To install the development version of `shapr`, available on GitHub, use ```{r, eval = FALSE} -remotes::install_github("NorskRegnesentral/shapr", dependencies = TRUE) +remotes::install_github("NorskRegnesentral/shapr") ``` -**The CRAN version of `shapr` (NOT RECOMMENDED) can be installed with** +To also install all dependencies, use ```{r, eval = FALSE} -install.packages("shapr") +remotes::install_github("NorskRegnesentral/shapr", dependencies = TRUE) ``` @@ -150,7 +156,7 @@ model <- xgboost( # Specifying the phi_0, i.e. the expected prediction without any features p0 <- mean(y_train) -# Computing the actual Shapley values with kernelSHAP accounting for feature dependence using +# Computing the Shapley values with kernelSHAP accounting for feature dependence using # the empirical (conditional) distribution approach with bandwidth parameter sigma = 0.1 (default) explanation <- explain( model = model, @@ -168,8 +174,8 @@ print(explanation$shapley_values_est) plot(explanation) ``` -See the [vignette](https://norskregnesentral.github.io/shapr/articles/general_usage.html) for further basic usage -examples. +See the [general usage vignette](https://norskregnesentral.github.io/shapr/articles/general_usage.html) for further +basic usage examples. 
## Contribution diff --git a/README.md b/README.md index 92cb743b8..9f96515a6 100644 --- a/README.md +++ b/README.md @@ -14,13 +14,19 @@ experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](h [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/license/mit) [![DOI](https://joss.theoj.org/papers/10.21105/joss.02027/status.svg)](https://doi.org/10.21105/joss.02027) + +See the pkgdown site at +[norskregnesentral.github.io/shapr/](https://norskregnesentral.github.io/shapr/) +for a complete introduction with examples and documentation of the +package. -## Brief NEWS +## NEWS -This is `shapr` version 1.0.0 (Released on GitHub Nov 2024), which -provides a full restructuring of the code based, and provides a full -suit of new functionality, including: +With `shapr` version 1.0.0 (GitHub only, Nov 2024) and version 1.0.1 +(CRAN, Jan 2025), the package was subject to a major update, providing a +full restructuring of the code based, and a full suit of new +functionality, including: - A long list of approaches for estimating the contribution/value function $v(S)$, including Variational Autoencoders, and @@ -34,31 +40,31 @@ suit of new functionality, including: - Python wrapper making the core functionality of `shapr` available in Python -Below we provide a brief overview of the breaking changes. See the +See the [NEWS](https://github.com/NorskRegnesentral/shapr/blob/master/NEWS.md) -for the full list of details. +for a complete list. -### Breaking changes +### Coming from shapr \< 1.0.0? -The new syntax for explaining models essentially amounts to using a -single function (`explain()`) instead of two functions (`shapr()` and -`explain()`). In addition, custom models are now explained by passing -the prediction function directly to `explain()`, some input arguments -got new names, and a few functions for edge cases was removed to -simplify the code base. 
+`shapr` version \> 1.0.0 comes with a number of breaking changes. Most +notably, we moved from using two function (`shapr()` and `explain()`) to +a single function (`explain()`). In addition, custom models are now +explained by passing the prediction function directly to `explain()`, +quite a few input arguments got new names, and a few functions for edge +cases was removed to simplify the code base. -Note that the CRAN version of `shapr` (v0.2.2) still uses the old -syntax. The examples below uses the new syntax. -[Here](https://github.com/NorskRegnesentral/shapr/blob/cranversion_0.2.2/README.md) -is a version of this README with the syntax of the CRAN version -(v0.2.2). +Click +[here](https://github.com/NorskRegnesentral/shapr/blob/cranversion_0.2.2/README.md) +to view a version of this README with old syntax (v0.2.2). ### Python wrapper -We now also provide a Python wrapper (`shaprpy`) which allows explaining -python models with the methodology implemented in `shapr`, directly from -Python. The wrapper is available -[here](https://github.com/NorskRegnesentral/shapr/tree/master/python). +We provide an (experimental) Python wrapper (`shaprpy`) which allows +explaining Python models with the methodology implemented in `shapr`, +directly from Python. The wrapper calls `R` internally, and therefore +requires an installation of `R`. See +[here](https://github.com/NorskRegnesentral/shapr/tree/master/python) +for installation instructions and examples. ## The package @@ -76,29 +82,28 @@ precise estimates of conditional Shapley values, which are critical for understanding how features truly contribute to predictions. A basic example is provided below. Otherwise we refer to the [pkgdown -website](https://norskregnesentral.github.io/shapr/) and the vignettes -there -for details and further examples. +website](https://norskregnesentral.github.io/shapr/) and the different +vignettes there for details and further examples. 
## Installation -We highly recommend to install the development version of shapr (with -the new explanation syntax and all functionality), +`shapr` is available on [CRAN](https://cran.r-project.org/package=shapr) +and can be installed in R as: ``` r -remotes::install_github("NorskRegnesentral/shapr") +install.packages("shapr") ``` -To also install all dependencies, use +To install the development version of `shapr`, available on GitHub, use ``` r -remotes::install_github("NorskRegnesentral/shapr", dependencies = TRUE) +remotes::install_github("NorskRegnesentral/shapr") ``` -**The CRAN version of `shapr` (NOT RECOMMENDED) can be installed with** +To also install all dependencies, use ``` r -install.packages("shapr") +remotes::install_github("NorskRegnesentral/shapr", dependencies = TRUE) ``` ## Example @@ -163,7 +168,7 @@ model <- xgboost( # Specifying the phi_0, i.e. the expected prediction without any features p0 <- mean(y_train) -# Computing the actual Shapley values with kernelSHAP accounting for feature dependence using +# Computing the Shapley values with kernelSHAP accounting for feature dependence using # the empirical (conditional) distribution approach with bandwidth parameter sigma = 0.1 (default) explanation <- explain( model = model, @@ -178,14 +183,14 @@ explanation <- explain( #> max_n_coalitions is NULL or larger than or 2^n_features = 16, #> and is therefore set to 2^n_features = 16. 
#> -#> ── Starting `shapr::explain()` at 2024-11-20 12:23:18 ────────────────────────── +#> ── Starting `shapr::explain()` at 2025-01-21 13:30:06 ────────────────────────── #> • Model class: #> • Approach: empirical #> • Iterative estimation: FALSE #> • Number of feature-wise Shapley values: 4 #> • Number of observations to explain: 6 #> • Computations (temporary) saved at: -#> '/tmp/Rtmp4yBCHY/shapr_obj_17459f7fdc4b8f.rds' +#> '/tmp/Rtmpf5zleu/shapr_obj_3676de5b39f33b.rds' #> #> ── Main computation started ── #> @@ -209,8 +214,8 @@ plot(explanation) -See the -[vignette](https://norskregnesentral.github.io/shapr/articles/general_usage.html) +See the [general usage +vignette](https://norskregnesentral.github.io/shapr/articles/general_usage.html) for further basic usage examples. ## Contribution diff --git a/_pkgdown.yml b/_pkgdown.yml index 5a61c4c94..b768a879b 100644 --- a/_pkgdown.yml +++ b/_pkgdown.yml @@ -1,9 +1,11 @@ url: https://norskregnesentral.github.io/shapr/ - +template: + bootstrap: 5 + light-switch: true navbar: structure: - left: [home, articles, news, reference] - right: [github] + left: [home, articles, news, reference, python] + right: [search, github, lightswitch] components: articles: text: Vignettes @@ -22,3 +24,6 @@ navbar: reference: text: Manual href: reference/index.html + python: + text: Python + href: README_py.html diff --git a/man/figures/README-basic_example-1.png b/man/figures/README-basic_example-1.png index 7c3f4ee4a..ca8efec52 100644 Binary files a/man/figures/README-basic_example-1.png and b/man/figures/README-basic_example-1.png differ diff --git a/pkgdown/assets/README_py.html b/pkgdown/assets/README_py.html new file mode 100644 index 000000000..e21de469f --- /dev/null +++ b/pkgdown/assets/README_py.html @@ -0,0 +1,471 @@ + + + + + + + + + + + + + +README + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + +
+

shaprpy

+

Python wrapper for the R package shapr.

+

NOTE: This wrapper is in an EXPERIMENTAL state. Bugs and breaking
+changes are not unlikely to occur.

+
+

Install

+

The below instructions assume you already have pip and
+R installed and exposed to the python environment where you
+want to run shaprpy. Official instructions for installing
+pip can be found here, and for
+R here. R can
+also be installed with pip as follows:

+
pip install rbase
+

and conda:

+
conda install -c r r
+
+

Install R-package

+

The shaprpy Python wrapper requires the development
+version of the shapr R-package (from the master branch on
+GitHub). Install it by running the following terminal command from the
+folder of this readme file (.../shapr/python):

+
Rscript install_r_packages.R
+
+
+
+
+

Install python wrapper

+

In the folder of this readme file (.../shapr/python),
+run

+
pip install -e .
+
+

Demo

+
from sklearn.ensemble import RandomForestRegressor
+from shaprpy import explain
+from shaprpy.datasets import load_california_housing
+
+dfx_train, dfx_test, dfy_train, dfy_test = load_california_housing()
+
+## Fit model
+model = RandomForestRegressor()
+model.fit(dfx_train, dfy_train.values.flatten())
+
+## Shapr
+explanation = explain(
+    model = model,
+    x_train = dfx_train,
+    x_explain = dfx_test,
+    approach = 'empirical',
+    phi0 = dfy_train.mean().item(),
+)
+print(explanation["shapley_values_est"])
+

shaprpy knows how to explain predictions from models
+from sklearn, keras and xgboost.
+For other models, one can provide a custom predict_model
+function (and optionally a custom get_model_specs) to
+shaprpy.explain.

+

See /examples for runnable examples, including an
+example of a custom PyTorch model.

+

The /examples/regression_paradigm.py file demonstrates
+how to use the regression paradigm explained in Olsen
+et al. (2024). We describe how to specify the regression model, how
+to enable automatic cross-validation of the model’s hyperparameters, and
+applying pre-processing steps to the data before fitting the regression
+models. We refer to Olsen
+et al. (2024) for when one should use the different paradigms,
+method classes, and methods.

+
+
+ + + + +
+ + + + + + + + + + + + + + + diff --git a/python/README.md b/python/README.md index 6f57ea4fb..2be39efe4 100644 --- a/python/README.md +++ b/python/README.md @@ -1,3 +1,8 @@ +--- +output: + html_document: default + pdf_document: default +--- ## shaprpy Python wrapper for the R package [shapr](https://github.com/NorskRegnesentral/shapr). diff --git a/vignettes/regression.Rmd b/vignettes/regression.Rmd index 8eb146306..3e76c4958 100644 --- a/vignettes/regression.Rmd +++ b/vignettes/regression.Rmd @@ -79,7 +79,7 @@ natively support categorical data to encode the categorical features. We use the same data and predictive models in this vignette as in the general usage. -See the end of the [continious data](#summary) and +See the end of the [continious data](#summary_figures) and [mixed data](#summary_mixed) sections for summary figures of all the methods used in this vignette to compute the Shapley value explanations. @@ -1755,7 +1755,7 @@ plot_MSEv_scores(explanation_list, method_line = "MC_empirical") ![](figure_regression/ppr-plot-1.png) -# Summary figures {#summary} +# Summary figures {#summary_figures} In this section, we compute the Shapley value explanations for the Monte Carlo-based methods in the `shapr` package and compare the results diff --git a/vignettes/regression.Rmd.orig b/vignettes/regression.Rmd.orig index 4e53a2441..a6fea7350 100644 --- a/vignettes/regression.Rmd.orig +++ b/vignettes/regression.Rmd.orig @@ -91,7 +91,7 @@ natively support categorical data to encode the categorical features. We use the same data and predictive models in this vignette as in the general usage. -See the end of the [continious data](#summary) and +See the end of the [continious data](#summary_figures) and [mixed data](#summary_mixed) sections for summary figures of all the methods used in this vignette to compute the Shapley value explanations. 
@@ -1087,7 +1087,7 @@ plot_MSEv_scores(explanation_list, method_line = "MC_empirical") ``` -# Summary figures {#summary} +# Summary figures {#summary_figures} In this section, we compute the Shapley value explanations for the Monte Carlo-based methods in the `shapr` package and compare the results