diff --git a/.github/workflows/build-test.yml b/.github/workflows/build-test.yml
index 8276100..781c3b8 100644
--- a/.github/workflows/build-test.yml
+++ b/.github/workflows/build-test.yml
@@ -14,16 +14,17 @@ jobs:
strategy:
fail-fast: false
matrix:
- python-version: ["3.8", "3.9", "3.10", "3.11"]
+ python-version: ["3.8", "3.9", "3.10", "3.11", "3.12"]
steps:
- - uses: actions/checkout@v2
+ - uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python-version }}
- uses: actions/setup-python@v2
+ uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}
+ allow-prereleases: true
- name: Install dependencies
run: |
diff --git a/CHANGELOG.md b/CHANGELOG.md
index f3e29be..c0538e8 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,8 +1,19 @@
# Changelog
-## 0.1.11, in development
-
-- Coming soon...
+## 0.2.0, 3 September 2023
+
+- Moved to something more closely resembling semantic versioning, which is the main reason this is version 0.2.0.
+- Builds and tests pass on Python 3.11, so this version is now supported. Started testing on Python 3.12, which is not supported for the time being.
+- Added custom 'alarm' `Detector`, which can be instantiated with a function and a warning to emit when the function returns True for a 1D array. You can easily write your own detectors with this class.
+- Added `make_detector_pipeline()`, which can take sequences of functions and warnings (or a mapping of functions to warnings) and returns a `sklearn.pipeline.Pipeline` containing a `Detector` for each function.
+- Added `RegressionMultimodalDetector` to allow detection of non-unimodal distributions in features, when considered across the entire dataset. (Coming soon, a similar detector for classification tasks that will partition the data by class.)
+- Redefined `is_standardized` (deprecated) as `is_standard_normal`, which implements the Kolmogorov–Smirnov test. It seems more reliable than assuming the data will have a mean of almost exactly 0 and standard deviation of exactly 1, when all we really care about is that the feature is roughly normal.
+- Changed the wording slightly in the existing detector warning messages.
+- No longer warning if `y` is `None` in, e.g., `ImportanceDetector`, since you most likely know this.
+- Some changes to `ImportanceDetector`. It now uses KNN estimators instead of SVMs as the third measure of importance; the SVMs were too unstable, causing numerical issues. It is also now triggered only if the number of important features is less than the total number of features. So if you have 2 features and both are important, it does not trigger.
+- Improved `is_continuous()`, which was erroneously classifying integer arrays with many consecutive values as non-continuous.
+- Added a `Tutorial.ipynb` notebook to the docs.
+- Added a **Copy** button to code blocks in the docs.
## 0.1.10, 21 November 2022
diff --git a/README.md b/README.md
index b3d8178..579c271 100644
--- a/README.md
+++ b/README.md
@@ -8,8 +8,6 @@
🚩 `redflag` aims to be an automatic safety net for machine learning datasets. The vision is to accept input of a Pandas `DataFrame` or NumPy `ndarray` (one for each of the input `X` and target `y` in a machine learning task). `redflag` will provide an analysis of each feature, and of the target, including aspects such as class imbalance, leakage, outliers, anomalous data patterns, threats to the IID assumption, and so on. The goal is to complement other projects like `pandas-profiling` and `greatexpectations`.
-⚠️ **This project is very rough and does not do much yet. The API will very likely change without warning. Please consider contributing!**
-
## Installation
diff --git a/docs/conf.py b/docs/conf.py
index d7482db..68b39b6 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -48,11 +48,12 @@ def setup(app):
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
- 'sphinx.ext.githubpages',
'sphinxcontrib.apidoc',
+ 'sphinx.ext.githubpages',
'sphinx.ext.napoleon',
- 'myst_nb',
'sphinx.ext.coverage',
+ 'sphinx_copybutton',
+ 'myst_nb',
]
myst_enable_extensions = ["dollarmath", "amsmath"]
diff --git a/docs/index.rst b/docs/index.rst
index 5669fc3..7703273 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -41,6 +41,7 @@ User guide
installation
_notebooks/Basic_usage.ipynb
_notebooks/Using_redflag_with_sklearn.ipynb
+ _notebooks/Tutorial.ipynb
API reference
@@ -82,5 +83,5 @@ Indices and tables
PyPI releases
Code in GitHub
Issue tracker
- Community guidelines
- Scienxlab
+ Community guidelines
+ Scienxlab
diff --git a/docs/make.bat b/docs/make.bat
deleted file mode 100644
index 153be5e..0000000
--- a/docs/make.bat
+++ /dev/null
@@ -1,35 +0,0 @@
-@ECHO OFF
-
-pushd %~dp0
-
-REM Command file for Sphinx documentation
-
-if "%SPHINXBUILD%" == "" (
- set SPHINXBUILD=sphinx-build
-)
-set SOURCEDIR=.
-set BUILDDIR=_build
-
-if "%1" == "" goto help
-
-%SPHINXBUILD% >NUL 2>NUL
-if errorlevel 9009 (
- echo.
- echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
- echo.installed, then set the SPHINXBUILD environment variable to point
- echo.to the full path of the 'sphinx-build' executable. Alternatively you
- echo.may add the Sphinx directory to PATH.
- echo.
- echo.If you don't have Sphinx installed, grab it from
- echo.https://www.sphinx-doc.org/
- exit /b 1
-)
-
-%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
-goto end
-
-:help
-%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
-
-:end
-popd
diff --git a/docs/notebooks/Tutorial.ipynb b/docs/notebooks/Tutorial.ipynb
index 5fa283a..8830a0b 100644
--- a/docs/notebooks/Tutorial.ipynb
+++ b/docs/notebooks/Tutorial.ipynb
@@ -80,7 +80,7 @@
"X_scaled = scaler.transform(X)\n",
"\n",
"clf.fit(X_scaled, y)\n",
- "clf.predict(X)"
+ "clf.predict(X) # <-- Oops, we predicted on unscaled data."
]
},
{
@@ -100,7 +100,7 @@
{
"data": {
"text/plain": [
- "array(['ms', 'ss'], dtype='"
+ ""
]
},
- "execution_count": 11,
+ "execution_count": 10,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
- "image/png": "\n",
+ "image/png": "\n",
"text/plain": [
"