diff --git a/.doctrees/_notebooks/Basic_usage.doctree b/.doctrees/_notebooks/Basic_usage.doctree
index 3510506..4f56556 100644
Binary files a/.doctrees/_notebooks/Basic_usage.doctree and b/.doctrees/_notebooks/Basic_usage.doctree differ
diff --git a/.doctrees/_notebooks/Tutorial.doctree b/.doctrees/_notebooks/Tutorial.doctree
index 30601cb..5a8cc15 100644
Binary files a/.doctrees/_notebooks/Tutorial.doctree and b/.doctrees/_notebooks/Tutorial.doctree differ
diff --git a/.doctrees/_notebooks/Using_redflag_with_Pandas.doctree b/.doctrees/_notebooks/Using_redflag_with_Pandas.doctree
index 91864cc..12d9c93 100644
Binary files a/.doctrees/_notebooks/Using_redflag_with_Pandas.doctree and b/.doctrees/_notebooks/Using_redflag_with_Pandas.doctree differ
diff --git a/.doctrees/_notebooks/Using_redflag_with_sklearn.doctree b/.doctrees/_notebooks/Using_redflag_with_sklearn.doctree
index 5fcbcea..215c071 100644
Binary files a/.doctrees/_notebooks/Using_redflag_with_sklearn.doctree and b/.doctrees/_notebooks/Using_redflag_with_sklearn.doctree differ
diff --git a/.doctrees/environment.pickle b/.doctrees/environment.pickle
index aacb62d..9e0a4c2 100644
Binary files a/.doctrees/environment.pickle and b/.doctrees/environment.pickle differ
diff --git a/_images/74d2d18146189daf45b1ee624ebeb7cd87cb427a8ee1b9a267a1ae8870cbaaa6.png b/_images/74d2d18146189daf45b1ee624ebeb7cd87cb427a8ee1b9a267a1ae8870cbaaa6.png
deleted file mode 100644
index 6b92aea..0000000
Binary files a/_images/74d2d18146189daf45b1ee624ebeb7cd87cb427a8ee1b9a267a1ae8870cbaaa6.png and /dev/null differ
diff --git a/_images/c565f9dff80f5405d7d514d7c2afe2f4941fe1c6fb1ddecc4458ba39f014f3bd.png b/_images/c565f9dff80f5405d7d514d7c2afe2f4941fe1c6fb1ddecc4458ba39f014f3bd.png
new file mode 100644
index 0000000..15c8936
Binary files /dev/null and b/_images/c565f9dff80f5405d7d514d7c2afe2f4941fe1c6fb1ddecc4458ba39f014f3bd.png differ
diff --git a/_images/d78e85605ed7b0b7927b040c65b314bea5f357944ad18bb72003f8c4b14217bf.png b/_images/d78e85605ed7b0b7927b040c65b314bea5f357944ad18bb72003f8c4b14217bf.png
deleted file mode 100644
index 04e5989..0000000
Binary files a/_images/d78e85605ed7b0b7927b040c65b314bea5f357944ad18bb72003f8c4b14217bf.png and /dev/null differ
diff --git a/_notebooks/Basic_usage.html b/_notebooks/Basic_usage.html
index 07eda8c..122b699 100644
--- a/_notebooks/Basic_usage.html
+++ b/_notebooks/Basic_usage.html
@@ -254,7 +254,7 @@

🚩 Basic usage -
'0.4.2rc1'
+
'0.4.2'
 
@@ -560,7 +560,7 @@

Imbalance metrics -
-
<seaborn.axisgrid.FacetGrid at 0x7f8da12554c0>
+
<seaborn.axisgrid.FacetGrid at 0x7fe6b03e67b0>
 
../_images/5488f73cdf034061545c78f25fe27c514af3927c350d04e04bbee7c2b07662e0.png @@ -772,7 +772,7 @@

Distribution shape -
<seaborn.axisgrid.FacetGrid at 0x7f8da016a060>
+
<seaborn.axisgrid.FacetGrid at 0x7fe6b0424da0>
 
../_images/07bc624e812626af7a9a216f8aa7b233e9130872991e9e5a1b2c98f0be17d7a4.png @@ -921,7 +921,7 @@

Feature importance -
array([0.36923691, 0.31790105, 0.22145207, 0.08204644, 0.        ])
+
array([0.41621216, 0.2725526 , 0.23427556, 0.07695968, 0.        ])
 
@@ -966,7 +966,7 @@

Feature importance -
-
---------------------------------------------------------------------------
-ValueError                                Traceback (most recent call last)
-Cell In[3], line 11
-      8 X_train_scaled = scaler.transform(X_train)
-      9 X_test_scaled = scaler.transform(X_test)
----> 11 clf.fit(X_train_scaled, y_train)
-     12 clf.predict(X_test_scaled)
-
-File /opt/hostedtoolcache/Python/3.12.0/x64/lib/python3.12/site-packages/sklearn/base.py:1152, in _fit_context.<locals>.decorator.<locals>.wrapper(estimator, *args, **kwargs)
-   1145     estimator._validate_params()
-   1147 with config_context(
-   1148     skip_parameter_validation=(
-   1149         prefer_skip_nested_validation or global_skip_validation
-   1150     )
-   1151 ):
--> 1152     return fit_method(estimator, *args, **kwargs)
-
-File /opt/hostedtoolcache/Python/3.12.0/x64/lib/python3.12/site-packages/sklearn/svm/_base.py:199, in BaseLibSVM.fit(self, X, y, sample_weight)
-    189 else:
-    190     X, y = self._validate_data(
-    191         X,
-    192         y,
-   (...)
-    196         accept_large_sparse=False,
-    197     )
---> 199 y = self._validate_targets(y)
-    201 sample_weight = np.asarray(
-    202     [] if sample_weight is None else sample_weight, dtype=np.float64
-    203 )
-    204 solver_type = LIBSVM_IMPL.index(self._impl)
-
-File /opt/hostedtoolcache/Python/3.12.0/x64/lib/python3.12/site-packages/sklearn/svm/_base.py:747, in BaseSVC._validate_targets(self, y)
-    745 self.class_weight_ = compute_class_weight(self.class_weight, classes=cls, y=y_)
-    746 if len(cls) < 2:
---> 747     raise ValueError(
-    748         "The number of classes has to be greater than one; got %d class"
-    749         % len(cls)
-    750     )
-    752 self.classes_ = cls
-    754 return np.asarray(y, dtype=np.float64, order="C")
-
-ValueError: The number of classes has to be greater than one; got 1 class
+
array(['ss', 'ms'], dtype='<U2')
 
@@ -372,7 +331,7 @@

A quick look at

-
is_categorical_dtype is deprecated and will be removed in a future version. Use isinstance(dtype, CategoricalDtype) instead
-use_inf_as_na option is deprecated and will be removed in a future version. Convert inf values to NaN before operating instead.
-The figure layout has changed to tight
-
-
-
<seaborn.axisgrid.FacetGrid at 0x7efc2f1dff10>
+
<seaborn.axisgrid.FacetGrid at 0x7f153fc84830>
 
-../_images/d78e85605ed7b0b7927b040c65b314bea5f357944ad18bb72003f8c4b14217bf.png
+../_images/5488f73cdf034061545c78f25fe27c514af3927c350d04e04bbee7c2b07662e0.png
@@ -678,7 +632,7 @@

Importance - +
🚩 Feature 3 has low importance; check for relevance.
+ℹ️ Dummy classifier scores: {'f1': 0.25488459423559595, 'roc_auc': 0.5} (most_frequent strategy).
 
Pipeline(steps=[('detector',
-                 Detector(func=<function BaseRedflagDetector.__init__.<locals>.<lambda> at 0x7efc26efbf60>,
+                 Detector(func=<function BaseRedflagDetector.__init__.<locals>.<lambda> at 0x7f153c8e3d80>,
                           message='are negative')),
                 ('svc', SVC())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.

The noise feature we added has negative values; the others are all positive, which is what we expect for these data.

diff --git a/_notebooks/Using_redflag_with_Pandas.html b/_notebooks/Using_redflag_with_Pandas.html
index ff4be60..bf1e362 100644
--- a/_notebooks/Using_redflag_with_Pandas.html
+++ b/_notebooks/Using_redflag_with_Pandas.html
@@ -247,7 +247,7 @@

🚩 Using redf

-
'0.4.2rc1'
+
'0.4.2'
 
@@ -443,8 +443,8 @@

Series accessor -
{'f1': 0.24642601013909113,
- 'roc_auc': 0.5071664777645397,
+
{'f1': 0.2411344733492839,
+ 'roc_auc': 0.5030196416166594,
  'strategy': 'stratified',
  'task': 'classification'}
 
@@ -476,9 +476,9 @@

Series accessor
Continuous data suitable for regression
-Outliers:    [ 141  142  175  532  575  581  583  633  662  757  768  769  773  801
- 1316 1498 1547 1744 1745 1754 1756 1778 1779 1780 1784 1785 1788 1808
- 1809 1812 2884 2932 2973 2974 3004 3087 3094 3095 3100 3109]
+Outliers:    [  34   35  136  140  141  142  143  145  175  180  181  182  581  633
+  662  768  769  801 1316 1547 1731 1732 1744 1754 1756 1778 1779 1780
+ 1784 1788 1808 1812 2884 2973 2974 3004 3079 3080 3087 3109]
 Correlated:  True
 Dummy scores:{'mean': {'mean_squared_error': 47528.78263092096, 'r2': 0.0}}
 
@@ -498,7 +498,7 @@

DataFrame accessor -
array([0.23175155, 0.21627564, 0.34215632, 0.20981648])
+
array([0.23155584, 0.21912608, 0.33738409, 0.21193399])
 
diff --git a/_notebooks/Using_redflag_with_sklearn.html b/_notebooks/Using_redflag_with_sklearn.html
index 345adf0..74d2ba2 100644
--- a/_notebooks/Using_redflag_with_sklearn.html
+++ b/_notebooks/Using_redflag_with_sklearn.html
@@ -581,7 +581,7 @@

Using the pre-built
🚩 There are more outliers than expected in the training data (349 vs 31).
 

-
ℹ️ Dummy classifier scores: {'f1': 0.26207946089678574, 'roc_auc': 0.49778842936294404} (stratified strategy).
+
ℹ️ Dummy classifier scores: {'f1': 0.25488459423559595, 'roc_auc': 0.5} (most_frequent strategy).
 
Pipeline(steps=[('standardscaler', StandardScaler()),
@@ -750,7 +750,7 @@ 

The imbalance comparator
🚩 There is a different number of minority classes (2) compared to the training data (4).
-🚩 The minority classes (sandstone, dolomite) are different from those in the training data (sandstone, dolomite, wackestone, mudstone).
+🚩 The minority classes (sandstone, dolomite) are different from those in the training data (wackestone, sandstone, mudstone, dolomite).
 
array([[  66.276     , 2359.73324716,    3.591     ],
@@ -786,12 +786,12 @@ 

Making your own smoke detector
Pipeline(steps=[('detector',
-                 Detector(func=<function BaseRedflagDetector.__init__.<locals>.<lambda> at 0x7fa46947d760>,
+                 Detector(func=<function BaseRedflagDetector.__init__.<locals>.<lambda> at 0x7f9c489831a0>,
                           message='are NaNs')),
                 ('svc', SVC())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.

There are no NaNs.

@@ -818,30 +818,30 @@

Making your own smoke detector
Pipeline(steps=[('standardscaler', StandardScaler()),
                 ('pipeline',
                  Pipeline(steps=[('detector-1',
-                                  Detector(func=<function BaseRedflagDetector.__init__.<locals>.<lambda> at 0x7fa424977600>,
+                                  Detector(func=<function BaseRedflagDetector.__init__.<locals>.<lambda> at 0x7f9c489822a0>,
                                            message='fail custom func '
                                                    'has_nans()')),
                                  ('detector-2',
-                                  Detector(func=<function BaseRedflagDetector.__init__.<locals>.<lambda> at 0x7fa41db54fe0>,
+                                  Detector(func=<function BaseRedflagDetector.__init__.<locals>.<lambda> at 0x7f9c489b4fe0>,
                                            message='fail custom func '
                                                    'has_outliers()'))])),
                 ('svc', SVC())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.

diff --git a/reports/_notebooks/Tutorial.err.log b/reports/_notebooks/Tutorial.err.log
deleted file mode 100644
index 33ba613..0000000
--- a/reports/_notebooks/Tutorial.err.log
+++ /dev/null
@@ -1,78 +0,0 @@
-Traceback (most recent call last):
- File "/opt/hostedtoolcache/Python/3.12.0/x64/lib/python3.12/site-packages/jupyter_cache/executors/utils.py", line 58, in single_nb_execution
- executenb(
- File "/opt/hostedtoolcache/Python/3.12.0/x64/lib/python3.12/site-packages/nbclient/client.py", line 1314, in execute
- return NotebookClient(nb=nb, resources=resources, km=km, **kwargs).execute()
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- File "/opt/hostedtoolcache/Python/3.12.0/x64/lib/python3.12/site-packages/jupyter_core/utils/__init__.py", line 173, in wrapped
- return loop.run_until_complete(inner)
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- File "/opt/hostedtoolcache/Python/3.12.0/x64/lib/python3.12/asyncio/base_events.py", line 664, in run_until_complete
- return future.result()
- ^^^^^^^^^^^^^^^
- File "/opt/hostedtoolcache/Python/3.12.0/x64/lib/python3.12/site-packages/nbclient/client.py", line 709, in async_execute
- await self.async_execute_cell(
- File "/opt/hostedtoolcache/Python/3.12.0/x64/lib/python3.12/site-packages/nbclient/client.py", line 1062, in async_execute_cell
- await self._check_raise_for_error(cell, cell_index, exec_reply)
- File "/opt/hostedtoolcache/Python/3.12.0/x64/lib/python3.12/site-packages/nbclient/client.py", line 918, in _check_raise_for_error
- raise CellExecutionError.from_cell_and_msg(cell, exec_reply_content)
-nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
-------------------
-from sklearn.model_selection import train_test_split
-
-scaler = StandardScaler()
-scaler.fit(X)
-
-X_train, X_test, y_train, y_test = train_test_split(X, y)
-
-X_train_scaled = scaler.transform(X_train)
-X_test_scaled = scaler.transform(X_test)
-
-clf.fit(X_train_scaled, y_train)
-clf.predict(X_test_scaled)
-------------------
-
-
----------------------------------------------------------------------------
-ValueError Traceback (most recent call last)
-Cell In[3], line 11
- 8 X_train_scaled = scaler.transform(X_train)
- 9 X_test_scaled = scaler.transform(X_test)
----> 11 clf.fit(X_train_scaled, y_train)
- 12 clf.predict(X_test_scaled)
-
-File /opt/hostedtoolcache/Python/3.12.0/x64/lib/python3.12/site-packages/sklearn/base.py:1152, in _fit_context..decorator..wrapper(estimator, *args, **kwargs)
- 1145 estimator._validate_params()
- 1147 with config_context(
- 1148 skip_parameter_validation=(
- 1149 prefer_skip_nested_validation or global_skip_validation
- 1150 )
- 1151 ):
--> 1152 return fit_method(estimator, *args, **kwargs)
-
-File /opt/hostedtoolcache/Python/3.12.0/x64/lib/python3.12/site-packages/sklearn/svm/_base.py:199, in BaseLibSVM.fit(self, X, y, sample_weight)
- 189 else:
- 190 X, y = self._validate_data(
- 191 X,
- 192 y,
- (...)
- 196 accept_large_sparse=False,
- 197 )
---> 199 y = self._validate_targets(y)
- 201 sample_weight = np.asarray(
- 202 [] if sample_weight is None else sample_weight, dtype=np.float64
- 203 )
- 204 solver_type = LIBSVM_IMPL.index(self._impl)
-
-File /opt/hostedtoolcache/Python/3.12.0/x64/lib/python3.12/site-packages/sklearn/svm/_base.py:747, in BaseSVC._validate_targets(self, y)
- 745 self.class_weight_ = compute_class_weight(self.class_weight, classes=cls, y=y_)
- 746 if len(cls) < 2:
---> 747 raise ValueError(
- 748 "The number of classes has to be greater than one; got %d class"
- 749 % len(cls)
- 750 )
- 752 self.classes_ = cls
- 754 return np.asarray(y, dtype=np.float64, order="C")
-
-ValueError: The number of classes has to be greater than one; got 1 class
-
diff --git a/searchindex.js b/searchindex.js
index b536e9b..c92c772 100644
--- a/searchindex.js
+++ b/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"docnames": ["_notebooks/Basic_usage", "_notebooks/Tutorial", 
"_notebooks/Using_redflag_with_Pandas", "_notebooks/Using_redflag_with_sklearn", "authors", "changelog", "contributing", "development", "index", "installation", "license", "redflag", "redflag.distributions", "redflag.imbalance", "redflag.importance", "redflag.independence", "redflag.markov", "redflag.outliers", "redflag.pandas", "redflag.sklearn", "redflag.target", "redflag.utils", "what_is_redflag"], "filenames": ["_notebooks/Basic_usage.ipynb", "_notebooks/Tutorial.ipynb", "_notebooks/Using_redflag_with_Pandas.ipynb", "_notebooks/Using_redflag_with_sklearn.ipynb", "authors.md", "changelog.md", "contributing.md", "development.md", "index.rst", "installation.md", "license.md", "redflag.rst", "redflag.distributions.rst", "redflag.imbalance.rst", "redflag.importance.rst", "redflag.independence.rst", "redflag.markov.rst", "redflag.outliers.rst", "redflag.pandas.rst", "redflag.sklearn.rst", "redflag.target.rst", "redflag.utils.rst", "what_is_redflag.md"], "titles": ["\ud83d\udea9 Basic usage", "\ud83d\udea9 Tutorial", "\ud83d\udea9 Using redflag with Pandas", "\ud83d\udea9 Using redflag with sklearn", "Authors", "Changelog", "Contributing", "Development", "Redflag: safer ML by design", "\ud83d\udea9 Installation", "License", "redflag package", "redflag.distributions module", "redflag.imbalance module", "redflag.importance module", "redflag.independence module", "redflag.markov module", "redflag.outliers module", "redflag.pandas module", "redflag.sklearn module", "redflag.target module", "redflag.utils module", "\ud83d\udea9 What is redflag?"], "terms": {"welcom": [0, 2], "redflag": [0, 5, 7, 9], "It": [0, 1, 5, 13, 18, 19, 21, 22], "": [0, 1, 2, 3, 5, 6, 7, 8, 10, 15, 16, 18, 19, 20, 21], "still": [0, 3, 5, 22], "earli": [0, 5], "dai": 0, "thi": [0, 1, 2, 3, 5, 6, 7, 10, 13, 14, 16, 17, 18, 19, 20, 21, 22], "librari": [0, 1, 8, 22], "ar": [0, 1, 2, 3, 5, 6, 7, 8, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22], "few": [0, 3], "thing": [0, 1, 3], "you": [0, 1, 2, 3, 5, 
6, 7, 8, 9, 10, 12, 13, 16, 17, 18, 19, 21], "can": [0, 1, 2, 3, 5, 6, 7, 9, 12, 13, 16, 17, 18, 19, 20, 21, 22], "do": [0, 1, 5, 8, 10, 16, 18, 22], "detect": [0, 1, 3, 5, 17, 19], "label": [0, 1, 3, 5, 12, 13, 14, 18, 19, 20, 21], "ani": [0, 1, 3, 5, 10, 12, 13, 14, 15, 17, 19, 20, 21, 22], "other": [0, 1, 3, 5, 6, 7, 10, 12, 21], "variabl": [0, 3, 16, 17, 18, 20], "rf": [0, 1, 2, 3, 8], "__version__": [0, 1, 2], "0": [0, 1, 2, 3, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21], "4": [0, 1, 2, 3, 12, 13, 14, 15, 17, 18, 19, 20, 21], "2rc1": [0, 2], "panda": [0, 1, 3, 5, 8, 11, 22], "pd": [0, 1, 2, 3, 21], "df": [0, 1, 2, 3, 8, 22], "read_csv": [0, 1, 2, 3], "http": [0, 1, 2, 3, 10, 13, 16, 21], "raw": [0, 1, 2, 3], "githubusercont": [0, 1, 2, 3], "com": [0, 1, 2, 3, 16], "scienxlab": [0, 1, 2, 3, 6], "dataset": [0, 1, 2, 3, 5, 12, 13, 15, 17, 18, 19, 21, 22], "main": [0, 1, 2, 3, 5, 7, 8], "kg": [0, 1, 2, 3], "panoma_training_data": [0, 1, 2, 3], "csv": [0, 1, 2, 3], "look": [0, 2, 3, 8], "transpos": [0, 3], "summari": [0, 3], "each": [0, 3, 5, 10, 12, 14, 18, 19, 21], "column": [0, 1, 3, 5, 21], "datafram": [0, 3, 5, 8, 22], "i": [0, 1, 2, 3, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21], "row": [0, 3, 5, 15], "here": [0, 3, 6], "describ": [0, 3, 10], "t": [0, 1, 3, 5, 7, 19, 21], "count": [0, 3, 13, 17, 20, 21], "mean": [0, 1, 2, 3, 5, 10, 12, 18, 20, 21, 22], "std": [0, 3], "min": [0, 1, 3, 13, 21], "25": [0, 3, 14, 18, 20, 21], "50": [0, 3, 10], "75": [0, 3, 21], "max": [0, 1, 3, 12, 21], "depth": [0, 1, 2, 3], "3966": [0, 3], "882": [0, 3], "674555": [0, 3], "40": [0, 3, 12, 21], "150056": [0, 3], "784": [0, 3], "402800": [0, 3], "858": [0, 3], "012000": [0, 3], "888": [0, 3], "339600": [0, 3], "913": [0, 3], "028400": [0, 3], "963": [0, 3], "320400": [0, 3], "relpo": [0, 1, 2, 3], "524999": [0, 3], "286375": [0, 3], "010000": [0, 3], "282000": [0, 3], "531000": [0, 3], "773000": [0, 3], "1": [0, 1, 2, 3, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21], 
"000000": [0, 3], "marin": [0, 1, 2, 3], "325013": [0, 3], "589539": [0, 3], "2": [0, 1, 2, 3, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21], "gr": [0, 1, 2, 3, 21], "64": [0, 1, 3], "367899": [0, 3], "28": [0, 3], "414603": [0, 3], "12": [0, 1, 2, 3, 5, 12], "036000": [0, 3], "45": [0, 1, 2, 3, 14, 18, 21], "311250": [0, 3], "840000": [0, 3], "78": [0, 1, 2, 3], "809750": [0, 3], "200": [0, 3, 12, 18, 19, 20], "ild": [0, 1, 2, 3], "5": [0, 1, 2, 3, 5, 12, 13, 14, 17, 18, 19, 20, 21], "240308": [0, 3], "3": [0, 1, 2, 3, 12, 13, 14, 17, 18, 19, 20, 21], "190416": [0, 3], "340408": [0, 3], "169567": [0, 3], "305266": [0, 3], "6": [0, 1, 2, 3, 12, 13, 15, 17, 18, 20, 21], "664234": [0, 3], "32": [0, 3], "136605": [0, 3], "deltaphi": [0, 1, 2, 3], "469088": [0, 3], "922310": [0, 3], "21": [0, 3], "832000": [0, 3], "292500": [0, 3], "124750": [0, 3], "18": [0, 3], "600000": [0, 3], "phind": [0, 1, 2, 3], "13": [0, 1, 2, 3, 21], "008807": [0, 3], "936391": [0, 3], "550000": [0, 3], "8": [0, 1, 2, 3, 12, 13, 14, 15, 18, 20, 21], "196250": [0, 3], "11": [0, 1, 2, 3, 12], "781500": [0, 3], "16": [0, 3], "050000": [0, 3], "52": [0, 3], "369000": [0, 3], "pe": [0, 1, 2, 3], "686427": [0, 3], "815113": [0, 3], "200000": [0, 3], "123000": [0, 3], "514500": [0, 3], "241750": [0, 3], "094000": [0, 3], "faci": [0, 1, 2, 3], "471004": [0, 3], "406180": [0, 3], "9": [0, 1, 2, 3, 10, 12, 15, 18, 20, 21], "latitud": [0, 1, 2, 3], "37": [0, 1, 2, 3], "632575": [0, 3], "299398": [0, 3], "180732": [0, 3], "356426": [0, 3], "500380": [0, 3], "910583": [0, 3], "38": [0, 3], "063373": [0, 3], "longitud": [0, 1, 2, 3], "101": [0, 3], "294895": [0, 3], "230454": [0, 3], "646452": [0, 3], "389189": [0, 3], "325130": [0, 3], "106045": [0, 3], "100": [0, 1, 2, 3, 12, 17, 20, 21], "987305": [0, 1, 2, 3], "ild_log10": [0, 1, 2, 3], "648860": [0, 3], "251542": [0, 3], "468000": [0, 3], "501000": [0, 3], "634000": [0, 3], "823750": [0, 3], "507000": [0, 3], "rhob": [0, 1, 2, 3], "2288": [0, 3], 
"861692": [0, 3], "218": [0, 3], "038459": [0, 3], "1500": [0, 3], "2201": [0, 3], "007475": [0, 3], "2342": [0, 3], "202051": [0, 3], "2434": [0, 3], "166399": [0, 3], "2802": [0, 3], "871147": [0, 3], "fairli": 0, "easi": [0, 1], "tell": [0, 1, 21], "numer": [0, 5, 21], "harder": 0, "decid": [0, 3, 18, 20, 21], "we": [0, 1, 2, 3, 5, 6, 17, 19, 21], "us": [0, 1, 5, 7, 8, 9, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22], "is_continu": [0, 5, 11, 20], "check": [0, 1, 3, 5, 13, 15, 18, 19, 21, 22], "target": [0, 2, 3, 5, 8, 11, 14, 18, 19, 21, 22], "heurist": [0, 3, 5], "definit": [0, 5, 10, 21], "foolproof": 0, "intern": 0, "sometim": [0, 21], "how": [0, 1, 3, 5, 6], "treat": 0, "col": 0, "print": [0, 2, 5, 19, 20, 21], "f": 0, "20": [0, 3, 12, 15, 18, 19, 21], "well": [0, 1, 2, 3, 5, 16], "name": [0, 1, 2, 3, 5, 10, 12, 13, 16, 19], "fals": [0, 1, 5, 12, 15, 16, 17, 18, 19, 20, 21], "true": [0, 1, 2, 3, 5, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21], "format": [0, 1, 2, 19], "lithologi": [0, 1, 2, 3], "mineralogi": [0, 1, 2], "siliciclast": [0, 1, 2], "These": [0, 1, 5, 22], "all": [0, 1, 2, 3, 5, 7, 9, 10, 12, 13, 17, 18, 19, 20], "correct": [0, 17], "first": [0, 1, 2, 3, 12, 13, 14, 16, 18, 19, 21], "ll": [0, 1, 3], "measur": [0, 1, 3, 5, 12, 13, 14, 18], "class_imbal": [0, 5], "For": [0, 1, 2, 3, 5, 9, 10, 16, 17, 19, 21, 22], "binari": [0, 13, 20], "imbalac": 0, "ratio": [0, 1, 13, 21], "between": [0, 5, 12, 13, 18, 19, 21], "major": [0, 1, 13], "minor": [0, 1, 3, 5, 13, 18, 19], "class": [0, 1, 5, 8, 13, 16, 18, 19, 20, 21], "multiclass": [0, 13, 20], "degre": [0, 1, 5, 13, 18, 19], "ortigosa": [0, 13, 18], "hernandez": [0, 13, 18], "et": [0, 13, 18], "al": [0, 13, 18], "2017": [0, 13, 18], "singl": [0, 3, 5, 17, 18, 20], "valu": [0, 1, 3, 5, 12, 17, 19, 20, 21], "explain": [0, 3], "mani": [0, 3, 5, 17, 20], "b": [0, 10, 12, 14, 18, 20, 21], "skew": 0, "support": [0, 1, 3, 5, 10], "imbalance_degre": [0, 1, 2, 5, 8, 11, 13, 18], "378593040846633": [0, 1, 2], 
"To": [0, 1, 3, 5, 7, 19, 22], "interpret": [0, 1], "number": [0, 1, 3, 5, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21], "two": [0, 1, 3, 5, 7, 13, 21, 22], "part": [0, 1, 3, 5, 6, 10, 13, 18, 19], "The": [0, 1, 2, 4, 5, 7, 8, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22], "integ": [0, 1, 5, 13, 16, 18, 19, 20, 21], "equal": [0, 1], "m": [0, 1, 3, 5, 7, 13, 14, 16, 18], "where": [0, 1, 5, 10, 13, 14, 19], "fraction": [0, 1, 13, 18, 19, 21], "378": [0, 1], "amount": [0, 1], "balanc": [0, 1], "perfectli": [0, 1], "999": [0, 1, 21], "realli": [0, 1, 5], "bad": [0, 1], "If": [0, 1, 3, 5, 6, 7, 9, 10, 12, 13, 14, 16, 17, 18, 19, 21], "have": [0, 1, 2, 3, 4, 5, 10, 19, 21], "In": [0, 1, 3, 5, 10, 14, 18, 20, 21], "gener": [0, 1, 3, 5, 6, 7, 10, 16, 18, 20, 21], "statist": [0, 1, 3, 12, 16, 19], "more": [0, 1, 2, 3, 5, 7, 8, 10, 16, 17, 19, 20, 21, 22], "inform": [0, 1, 3, 10], "than": [0, 1, 3, 5, 12, 16, 17, 19, 20, 21], "commonli": [0, 1], "imbalance_ratio": [0, 1, 5, 11, 13], "which": [0, 1, 3, 5, 7, 10, 12, 13, 14, 17, 18, 19, 20, 22], "maximum": [0, 1, 21], "minimum": [0, 1, 3], "regard": [0, 1, 10], "get": [0, 1, 2, 7, 12, 13, 18, 19, 21], "those": [0, 1, 3, 10, 22], "fewer": [0, 1, 20], "sampl": [0, 1, 3, 15, 17, 19, 20, 21], "expect": [0, 1, 3, 5, 13, 14, 17, 18, 19], "return": [0, 1, 3, 5, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21], "order": [0, 1, 3, 4, 5, 12, 13, 14, 16, 18, 20, 21], "smallest": [0, 1], "minority_class": [0, 1, 3, 5, 11, 13, 18], "dolomit": [0, 1, 3], "sandston": [0, 1, 3], "mudston": [0, 1, 3], "wackeston": [0, 1, 3], "dtype": [0, 1, 3, 12, 13, 14, 15, 17, 20, 21], "u10": [0, 1], "empir": [0, 3, 13, 17, 18, 21, 22], "observ": [0, 5, 11, 16], "frequenc": [0, 16], "\u03b6": [0, 13], "e": [0, 1, 3, 5, 8, 13, 14, 16, 18, 20, 21, 22], "empirical_distribut": [0, 11, 13], "39989914": 0, "18582955": 0, "15834594": 0, "04790721": 0, "13691377": 0, "07110439": 0, "same": [0, 1, 3, 5, 12, 13, 16, 18], "uniqu": [0, 12, 16, 20, 21], "note": [0, 3, 5, 12, 17, 
19, 21], "differ": [0, 1, 3, 5, 10, 12, 13, 18], "from": [0, 1, 3, 5, 8, 10, 12, 13, 14, 16, 18, 19, 21, 22], "np": [0, 1, 3, 12, 16, 17, 18, 19, 20, 21], "sort": [0, 16, 21], "siltston": [0, 1, 2, 3], "limeston": [0, 1], "object": [0, 1, 2, 3, 5, 10, 16, 18, 19, 22], "also": [0, 1, 3, 5, 13, 16, 17, 18, 19, 22], "inspect": [0, 5, 13, 18, 19], "displai": [0, 10], "ax": [0, 3], "value_count": 0, "plot": 0, "kind": [0, 1, 3, 5, 10, 18, 20, 22], "bar": 0, "add": [0, 1, 3, 5, 6, 9, 10, 12], "line": [0, 1, 9], "level": [0, 3, 16, 17, 18, 19, 20, 21], "axhlin": 0, "len": [0, 1, 12], "c": [0, 1, 3, 5, 9, 10, 12, 14, 18, 19, 21], "r": [0, 12, 20], "matplotlib": 0, "line2d": 0, "0x7f8da2fd35c0": 0, "get_outli": [0, 3, 5, 11, 17], "function": [0, 1, 2, 3, 5, 7, 8, 12, 13, 15, 16, 17, 18, 19, 20, 21, 22], "indic": [0, 3, 10, 14, 17, 19, 20, 21], "point": [0, 3, 17, 21], "301": 0, "302": 0, "303": 0, "415": 0, "416": 0, "417": 0, "418": 0, "799": 0, "896": 0, "897": 0, "898": 0, "899": [0, 3], "996": 0, "997": 0, "1843": 0, "1844": 0, "2278": 0, "2279": 0, "2280": 0, "2638": 0, "2639": 0, "2640": 0, "2641": 0, "2642": 0, "2643": 0, "2920": 0, "2921": 0, "2922": 0, "3070": 0, "3071": 0, "3074": 0, "3075": 0, "3076": 0, "3079": 0, "3080": 0, "3081": 0, "3580": 0, "3581": 0, "3582": 0, "3583": 0, "see": [0, 1, 2, 3, 5, 6, 7, 14, 18, 21], "lie": [0, 21], "seaborn": [0, 1, 3], "sn": [0, 1, 3], "kdeplot": [0, 3], "rugplot": 0, "loc": [0, 1, 3, 12], "c1": 0, "lw": 0, "alpha": 0, "xlabel": [0, 3], "ylabel": [0, 3], "densiti": [0, 5, 12], "By": [0, 6, 19, 21], "default": [0, 3, 5, 12, 13, 14, 16, 17, 18, 19, 20, 21], "an": [0, 2, 3, 5, 6, 9, 10, 12, 14, 17, 18, 21], "isol": [0, 3, 17], "forest": [0, 3, 14, 17, 18], "99": [0, 3, 12, 17, 19, 21], "confid": [0, 3, 16, 17, 18, 19, 20, 21], "opt": [0, 1], "local": [0, 1, 3, 7, 17], "factor": [0, 17, 19], "ellipt": [0, 17], "envelop": [0, 17], "mahalanobi": [0, 5, 11, 17, 19, 21], "distanc": [0, 3, 5, 12, 13, 16, 17, 18, 19, 21], "set": [0, 
3, 9, 14, 16, 17, 19, 21], "choos": [0, 10], "equival": [0, 16, 19, 21], "threshold": [0, 1, 3, 5, 12, 13, 14, 15, 17, 18, 19, 21], "standard": [0, 1, 3, 5, 12, 14, 17, 18, 21], "deviat": [0, 3, 5, 17, 21], "awai": [0, 3], "signal": 0, "accept": [0, 10, 18, 19], "univari": [0, 5, 17, 19, 21, 22], "multivari": [0, 3, 5, 12, 19], "method": [0, 2, 3, 5, 12, 13, 17, 18, 19, 20, 21, 22], "mah": [0, 3, 17], "jointplot": 0, "x": [0, 1, 3, 5, 8, 12, 14, 17, 18, 19, 21, 22], "y": [0, 1, 3, 5, 8, 12, 14, 18, 19, 20, 21, 22], "hue": 0, "index_to_bool": [0, 11, 21], "n": [0, 14, 15, 16, 17, 18, 20, 21], "axisgrid": [0, 1], "jointgrid": 0, "0x7f8da12c57f0": 0, "A": [0, 3, 8, 10, 13, 16, 18, 19, 20, 21], "helper": [0, 5], "comput": [0, 5, 10, 12, 13, 16, 17, 19, 21], "given": [0, 3, 8, 12, 14, 16, 17, 18, 19, 21], "size": [0, 1, 12, 18, 19, 20, 21], "assum": [0, 5, 10, 12, 14, 18], "gaussian": [0, 3, 12, 21], "expected_outli": [0, 3, 11, 17], "80": [0, 3, 14, 18, 21], "44": 0, "so": [0, 1, 2, 3, 5, 9], "becaus": [0, 1, 3, 5, 19], "ha": [0, 1, 2, 3, 7, 10, 16, 19, 20, 21, 22], "lot": [0, 1, 3, 5, 19, 21], "truncat": 0, "tail": 0, "test": [0, 3, 5, 8, 9, 12, 14, 18, 19, 21, 22], "directli": [0, 2, 3, 5, 12, 19, 22], "has_outli": [0, 3, 5, 11, 17, 19], "compar": [0, 5, 8, 12, 13, 17, 18, 19, 21, 22], "result": [0, 3, 5, 10, 12, 21], "numpi": [0, 1, 3, 20, 21], "random": [0, 1, 3, 12, 14, 16, 18, 19, 20, 21], "normal": [0, 1, 5, 10, 12, 14, 18, 19, 21], "10_000": [0, 17, 20], "d": [0, 1, 3, 7, 10, 17, 19, 21], "p": [0, 3, 17, 19, 21], "displot": [0, 1, 3], "facetgrid": [0, 1], "0x7f8da2565b50": 0, "onli": [0, 1, 2, 3, 5, 10, 14, 16, 18, 19, 20, 22], "about": [0, 5, 7, 8, 19, 21, 22], "60": 0, "10": [0, 1, 2, 12, 13, 16, 17, 18, 20, 21], "000": [0, 1, 2, 17], "record": [0, 1, 3, 5, 19], "been": [0, 1, 3, 5, 10, 22], "multipl": [0, 1, 5, 17, 21], "instanc": [0, 1, 19, 21], "its": [0, 1, 2, 5, 10, 12], "There": [0, 1, 3, 6, 7, 8, 22], "legitim": [0, 1], "reason": [0, 1, 3, 5, 10], 
"why": [0, 1, 3, 14, 18], "might": [0, 1, 3], "happen": [0, 1, 7], "exampl": [0, 1, 2, 3, 5, 6, 7, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22], "mai": [0, 1, 2, 3, 10, 17, 19, 21], "natur": [0, 1, 3], "bound": [0, 1, 20], "g": [0, 1, 3, 5, 8, 18, 20, 21, 22], "poros": [0, 1], "alwai": [0, 1, 5], "greater": [0, 1], "deliber": [0, 1, 10], "prepar": [0, 1, 10], "process": [0, 1, 5, 22], "is_clip": [0, 1, 5, 11, 21], "0x7f8da12554c0": 0, "tri": [0, 5], "guess": [0, 5], "follow": [0, 1, 3, 4, 5, 7, 10, 13, 21], "scipi": [0, 12], "stat": [0, 21], "norm": [0, 12, 13, 18], "cosin": 0, "expon": 0, "exponpow": 0, "gamma": [0, 1], "gumbel_l": 0, "gumbel_r": 0, "powerlaw": 0, "triang": [0, 12], "trapz": 0, "uniform": [0, 19], "along": [0, 3, 10], "paramet": [0, 3, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22], "locat": [0, 3, 19], "scale": [0, 1, 3, 12, 21], "spite": 0, "find": [0, 1, 3, 5, 12, 17], "nearli": 0, "best_distribut": [0, 11, 12], "36789939485628": 0, "411020184908292": 0, "contrast": 0, "andbest": 0, "model": [0, 1, 3, 5, 12, 19, 21, 22], "gumbel": 0, "040572536302586": 0, "93432972751726": 0, "0x7f8da016a060": 0, "often": [0, 1, 3], "like": [0, 1, 2, 3, 5, 7, 9, 14, 16, 18, 19, 20, 21, 22], "implicit": 0, "our": [0, 1, 3, 19], "across": [0, 5, 12, 14, 18], "variou": [0, 1, 5], "respect": [0, 6], "both": [0, 3, 5, 7, 13, 22], "wasserstein": [0, 3, 5, 11, 12, 19], "facilit": 0, "calcul": [0, 12, 17], "aka": [0, 21], "earth": [0, 3], "mover": [0, 3], "train": [0, 1, 3, 5, 19, 21, 22], "score": [0, 1, 2, 3, 5, 12, 18, 20, 21], "substanti": 0, "w": 0, "25985545": 0, "28404634": 0, "49139232": 0, "33701782": 0, "22736457": 0, "13473663": 0, "33672956": 0, "20969657": 0, "41216725": 0, "34568777": 0, "39729747": 0, "48092099": 0, "0801856": 0, "10675027": 0, "13740318": 0, "10325295": 0, "19913347": 0, "21828753": 0, "26995735": 0, "33063277": 0, "24612402": 0, "23889923": 0, "26699721": 0, "2350674": 0, "20666445": 0, "44112543": 0, "16229232": 0, "63527036": 0, 
"18187639": 0, "34992043": 0, "19400917": 0, "74988182": 0, "31761526": 0, "27206283": 0, "30255291": 0, "24779581": 0, "could": [0, 3, 22], "heatmap": 0, "yticklabel": 0, "xticklabel": 0, "show": [0, 1, 3, 5, 14, 18], "u": [0, 1, 21], "log": [0, 1, 3], "7": [0, 3, 12, 14, 15, 17, 18, 20, 21], "somewhat": 0, "anomal": [0, 5, 8], "suggest": [0, 17], "cross": [0, 1, 10, 12], "h": 0, "cattl": 0, "sklearn": [0, 1, 2, 5, 8, 11, 13, 17, 18, 22], "model_select": [0, 1], "train_test_split": [0, 1], "preprocess": [0, 1, 3], "standardscal": [0, 1, 3], "x_train": [0, 1, 3, 21], "x_": 0, "test_siz": 0, "random_st": [0, 14, 18, 19, 20, 21], "42": [0, 1, 12, 14, 18, 20], "re": [0, 1, 3, 6, 19], "illustr": 0, "purpos": [0, 10], "valid": [0, 1, 3, 12, 19], "wai": [0, 1, 2, 3, 5, 6, 8, 13, 18, 22], "indeped": 0, "x_val": [0, 21], "x_test": [0, 1, 3], "should": [0, 1, 3, 5, 7, 12], "scaler": [0, 1], "fit_transform": [0, 8, 11, 19, 22], "transform": [0, 1, 5, 8, 10, 11, 19, 21, 22], "case": [0, 5, 14, 17, 18, 19, 21], "pass": [0, 3, 5, 12, 17, 19, 21], "them": [0, 3, 5, 13, 16, 17, 18, 22], "list": [0, 10, 13, 16, 18, 19, 20, 21], "tupl": [0, 12, 13, 16, 21], "03860982": 0, "02506236": 0, "04321734": 0, "03437337": 0, "04402681": 0, "02528225": 0, "0385111": 0, "05694201": 0, "04388196": 0, "049464": 0, "05560379": 0, "04002712": 0, "quit": [0, 5], "low": [0, 1, 3, 5, 18, 19, 20], "randomli": [0, 1, 3, 16], "correl": [0, 1, 2, 3, 15, 19], "lag": [0, 1], "shift": [0, 1, 3], "version": [0, 1, 3, 5, 7, 10, 19], "itself": [0, 1, 3, 6, 21], "sever": [0, 1, 3, 5, 6], "themselv": [0, 1, 3, 16, 19], "is_correl": [0, 1, 11, 15], "depend": [0, 1, 5, 8, 13, 18, 21], "That": [0, 1, 3, 12, 20], "shuffl": [0, 1], "remov": [0, 1, 3, 5], "doe": [0, 1, 3, 5, 10, 12, 13, 18, 19, 21, 22], "to_numpi": [0, 1], "copi": [0, 1, 5, 10, 21], "know": [0, 3, 5], "most": [0, 1, 3, 5, 7, 12, 14, 18, 20, 21, 22], "seri": [0, 5, 8, 12, 22], "your": [0, 5, 8, 10], "assess": [0, 14, 18], "averag": [0, 14, 18], 
"serv": [0, 5], "control": [0, 10], "let": [0, 1, 2, 3], "small": [0, 3, 5, 19], "come": [0, 5, 19], "veri": [0, 1, 2, 3, 5], "close": [0, 5, 21, 22], "zero": [0, 16, 21], "constant": 0, "classif": [0, 2, 5, 14, 18, 20], "task": [0, 1, 2, 5, 14, 18, 19, 20], "imagin": 0, "try": [0, 1, 2, 3, 12, 17], "predict": [0, 1, 3, 5, 18, 19, 20, 22], "feature_import": [0, 1, 2, 5, 11, 14, 18], "36923691": 0, "31790105": 0, "22145207": 0, "08204644": 0, "unsurprisingli": 0, "useless": 0, "help": [0, 1, 5, 6, 7, 9, 22], "least": [0, 1, 5, 10, 14], "least_important_featur": [0, 5, 11, 14], "And": 0, "complementari": [0, 5], "report": [0, 2, 5, 6, 11, 18], "most_important_featur": [0, 5, 11, 14], "now": [0, 1, 2, 3, 5], "regress": [0, 2, 5, 14, 18, 20], "includ": [0, 1, 3, 5, 10, 16], "dummi": [0, 1, 2, 3, 5, 19, 20], "09514279": 0, "35815121": 0, "52083108": 0, "02587492": 0, "less": [0, 5, 21], "again": 0, "go": 1, "featur": [1, 2, 3, 5, 6, 8, 12, 14, 17, 18, 19, 21, 22], "problem": [1, 3, 13, 22], "machin": [1, 8, 22], "learn": [1, 3, 5, 8, 17, 19, 20, 22], "need": [1, 5, 7, 18, 21], "packag": [1, 5, 8, 9, 17, 22], "run": [1, 3, 5, 7, 12, 21], "code": [1, 5, 10, 16, 19], "burn": 1, "ourselv": 1, "19": [1, 12], "23": 1, "35": [1, 14, 18, 21], "59": 1, "31": [1, 3, 17], "rai": 1, "ss": 1, "svm": [1, 3, 5], "svc": [1, 3], "clf": 1, "kernel": [1, 5, 12], "linear": 1, "fit": [1, 3, 10, 11, 12, 19, 22], "arrai": [1, 2, 3, 5, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21], "u2": 1, "far": [1, 2, 3], "good": [1, 20, 21], "everyth": 1, "work": [1, 3, 5, 10, 13, 19], "someon": 1, "x_scale": 1, "oop": 1, "unscal": 1, "easili": [1, 3, 5, 22], "done": 1, "peopl": [1, 4], "stack": [1, 19], "overflow": 1, "wonder": 1, "thei": [1, 2, 3, 5, 14, 18, 22], "ve": 1, "someth": [1, 3, 5, 16], "even": [1, 2, 10], "easier": [1, 5], "common": [1, 5, 10, 17], "pattern": [1, 8, 13], "y_train": [1, 3, 21], "y_test": [1, 3], "x_train_scal": 1, "x_test_scal": 1, "valueerror": [1, 12, 21], "traceback": [1, 12, 20, 
21], "recent": [1, 12, 20, 21, 22], "call": [1, 2, 3, 5, 12, 18, 19, 20, 21, 22], "last": [1, 12, 20, 21], "cell": [1, 3], "file": [1, 5, 7, 10], "hostedtoolcach": 1, "python": [1, 5, 7, 8, 22], "x64": 1, "lib": 1, "python3": 1, "site": 1, "base": [1, 3, 10, 16, 18, 19, 22], "py": [1, 5, 19], "1152": 1, "_fit_context": 1, "decor": [1, 18, 21], "wrapper": 1, "estim": [1, 5, 12, 19, 20, 21], "arg": [1, 16, 18, 19], "kwarg": [1, 19], "1145": 1, "_validate_param": 1, "1147": 1, "config_context": 1, "1148": 1, "skip_parameter_valid": 1, "1149": 1, "prefer_skip_nested_valid": 1, "global_skip_valid": 1, "1150": 1, "1151": 1, "fit_method": 1, "_base": 1, "199": 1, "baselibsvm": 1, "self": [1, 3, 16, 19], "sample_weight": 1, "189": 1, "els": 1, "190": 1, "_validate_data": 1, "191": 1, "192": 1, "196": 1, "accept_large_spars": 1, "197": 1, "_validate_target": 1, "201": 1, "asarrai": 1, "202": 1, "none": [1, 2, 5, 12, 13, 14, 16, 17, 18, 19, 20, 21], "float64": 1, "203": 1, "204": 1, "solver_typ": 1, "libsvm_impl": 1, "index": [1, 8, 21], "_impl": 1, "747": 1, "basesvc": 1, "745": 1, "class_weight_": 1, "compute_class_weight": 1, "class_weight": 1, "cl": 1, "y_": 1, "746": 1, "rais": [1, 3, 19, 20, 22], "748": 1, "one": [1, 3, 5, 10, 12, 16, 19, 21], "got": 1, "749": 1, "750": 1, "752": 1, "classes_": 1, "754": 1, "three": [1, 3, 8, 17, 22], "block": [1, 5], "split": [1, 3, 5, 21], "total": [1, 5, 13, 18], "stratifi": [1, 2, 3, 5, 20], "preserv": 1, "wa": [1, 5, 10, 16, 21], "entir": [1, 5, 21], "leak": 1, "hidden": 1, "cannot": [1, 3, 10, 19], "plenti": 1, "too": [1, 3, 5, 20, 21], "reproduc": [1, 5, 10], "enough": [1, 3], "etc": [1, 3, 21], "error": 1, "everywher": [1, 6], "want": [1, 3, 9, 12, 13, 18, 19, 21], "chang": [1, 3, 5, 10], "sure": [1, 3, 5], "v0": 1, "otherwis": [1, 10, 13], "pip": [1, 7, 8, 9], "instal": [1, 2, 5, 8], "environ": [1, 3, 5, 9], "dev18": 1, "g743d11f": 1, "d20230927": 1, "head": [1, 2], "shrimplin": [1, 2], "851": [1, 2], "3064": [1, 2], "a1": [1, 
2], "sh": [1, 2], "77": [1, 2, 3], "613176": [1, 2], "915": [1, 2], "978076": [1, 2], "664": [1, 2], "2393": [1, 2], "499945": [1, 2], "4588": [1, 2], "979": [1, 2], "26": [1, 2], "581419": [1, 2], "14": [1, 2], "565": [1, 2], "661": [1, 2], "2416": [1, 2], "119814": [1, 2], "6112": [1, 2], "957": [1, 2], "79": [1, 2], "05": [1, 2, 14, 21], "549881": [1, 2], "050": [1, 2], "658": [1, 2], "2404": [1, 2], "576056": [1, 2], "7636": [1, 2], "936": [1, 2], "86": [1, 2], "518559": [1, 2], "115": [1, 2], "655": [1, 2], "249071": [1, 2], "9160": [1, 2], "74": [1, 2], "58": [1, 2], "436086": [1, 2], "300": [1, 2], "647": [1, 2], "2382": [1, 2], "602601": [1, 2], "later": [1, 3, 19], "spuriou": [1, 22], "rng": [1, 12, 18, 19, 20, 21], "default_rng": [1, 12, 18, 19, 20, 21], "nois": [1, 3], "algorithm": 1, "flag": [1, 3, 17, 21], "outlier": [1, 2, 3, 5, 8, 11, 19], "distribut": [1, 3, 5, 8, 10, 11, 13, 18, 19, 21, 22], "shape": [1, 3, 8, 12, 17, 19], "is_categorical_dtyp": 1, "deprec": [1, 5, 11, 21], "futur": [1, 2, 3, 5, 9, 16], "isinst": 1, "categoricaldtyp": 1, "instead": [1, 3, 5, 19], "use_inf_as_na": 1, "option": [1, 3, 7, 8, 17, 19, 21, 22], "convert": [1, 21], "inf": 1, "nan": [1, 3, 21], "befor": [1, 3, 12, 17, 19, 22], "oper": 1, "figur": [1, 21], "layout": 1, "tight": 1, "0x7efc2f1dff10": 1, "But": [1, 3], "around": 1, "issu": [1, 3, 5, 6, 10, 14, 18, 22], "41495144": 1, "20256686": 1, "3156612": 1, "0668205": 1, "As": [1, 2, 3, 8], "hope": 1, "attribut": [1, 10, 19], "shown": 1, "possibl": [1, 3, 5, 10], "would": [1, 12], "nice": 1, "smoke": [1, 8], "alarm": [1, 5, 19], "prebuilt": 1, "won": 1, "abl": 1, "catch": 1, "howev": [1, 5, 10], "hard": [1, 5], "spot": 1, "alert": [1, 19, 22], "user": [1, 22], "potenti": [1, 20, 22], "provid": [1, 3, 5, 10, 13, 16, 18, 20, 22], "wrap": [1, 5, 18, 20], "anywai": 1, "sensibl": 1, "test_wel": [1, 3], "crawford": [1, 3], "stuart": [1, 3], "test_flag": [1, 3], "isin": [1, 3], "step": [1, 3, 12, 16, 19, 21, 22], "x27": [1, 3], 
"imbalancedetector": [1, 5, 8, 11, 13, 18, 19, 22], "clipdetector": [1, 5, 11, 19], "correlationdetector": [1, 5, 11, 19], "multimod": [1, 3, 5, 12, 19], "multimodalitydetector": [1, 3, 5, 11, 19], "outlierdetector": [1, 5, 11, 19], "distributioncompar": [1, 5, 11, 19, 22], "importancedetector": [1, 5, 11, 19], "dummypredictor": [1, 3, 11, 19], "jupyt": [1, 3], "pleas": [1, 3, 6, 7, 9], "rerun": [1, 3], "html": [1, 3, 7], "represent": [1, 3], "trust": [1, 3], "notebook": [1, 3, 5], "On": [1, 3], "github": [1, 3, 7, 8, 16], "unabl": [1, 3], "render": [1, 3], "page": [1, 3, 5, 7, 8], "nbviewer": [1, 3], "org": [1, 3, 10, 13, 21], "pipelinepipelin": [1, 3], "imbalancedetectorimbalancedetector": [1, 3], "clipdetectorclipdetector": [1, 3], "correlationdetectorcorrelationdetector": [1, 3], "multimodalitydetectormultimodalitydetector": [1, 3], "outlierdetectoroutlierdetector": [1, 3], "distributioncomparatordistributioncompar": [1, 3], "importancedetectorimportancedetector": [1, 3], "dummypredictordummypredictor": [1, 3], "make_pipelin": [1, 3, 19], "pipe": [1, 3, 19], "standardscalerstandardscal": [1, 3], "svcsvc": [1, 3], "imbalanc": [1, 3, 13, 18], "420": [1, 3], "400": [1, 3], "minority_classes_": [1, 3, 19], "succeed": [1, 3], "group": [1, 3, 5, 12, 21], "316": 1, "v": [1, 3, 12], "relev": [1, 5], "\u2139": [1, 3], "classifi": [1, 3, 5], "f1": [1, 2, 3, 5, 18, 20], "2777104337392946": 1, "roc_auc": [1, 2, 3, 18, 20], "5013219070647543": 1, "strategi": [1, 2, 3, 5, 18, 20], "643721188696941": 1, "detector": [1, 5, 8, 11, 13, 18, 19, 22], "def": [1, 3], "has_neg": [1, 19], "bool": [1, 3, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21], "trigger": [1, 3, 5, 19], "neg": [1, 3, 19], "negative_detector": [1, 3], "nb": 1, "func": [1, 3, 12, 19], "lt": [1, 3], "baseredflagdetector": [1, 3, 11, 19], "__init__": [1, 3], "gt": [1, 3], "lambda": [1, 3, 12, 19], "0x7efc26efbf60": 1, "messag": [1, 3, 5, 19], "detectordetector": [1, 3], "ad": [1, 5], "posit": [1, 5, 21], "what": [1, 5, 8, 
17, 18, 20], "care": [1, 5], "basic_usag": [2, 3, 5], "ipynb": [2, 3, 5], "using_redflag_with_panda": 2, "some": [2, 3, 5, 6, 8, 13, 18, 19, 22], "give": [2, 3, 5, 10], "access": [2, 5, 22], "almost": [2, 5], "were": [2, 3, 5, 12], "best": [2, 5, 12, 20], "idea": [2, 3], "though": 2, "import": [2, 3, 5, 6, 8, 10, 11, 18, 19, 20, 21], "long": 2, "regist": 2, "time": [2, 3, 12, 19], "being": [2, 3, 14, 18, 21], "simplic": 2, "notic": [2, 10], "extra": 2, "insert": [2, 22], "Or": [2, 9], "ask": 2, "new": [2, 3, 5, 6, 7], "dummy_scor": [2, 5, 11, 18, 20], "24642601013909113": 2, "5071664777645397": 2, "mean_squared_error": [2, 18, 20], "47528": 2, "78263092096": 2, "r2": [2, 5, 18, 20], "simpl": [2, 8], "continu": [2, 5, 8, 18, 19, 20], "data": [2, 3, 5, 8, 12, 14, 15, 16, 17, 18, 19, 20, 21, 22], "suitabl": [2, 5], "141": 2, "142": 2, "175": 2, "532": 2, "575": 2, "581": 2, "583": 2, "633": 2, "662": 2, "757": 2, "768": 2, "769": 2, "773": 2, "801": 2, "1316": 2, "1498": 2, "1547": 2, "1744": 2, "1745": 2, "1754": 2, "1756": 2, "1778": 2, "1779": 2, "1780": 2, "1784": 2, "1785": 2, "1788": 2, "1808": 2, "1809": 2, "1812": 2, "2884": 2, "2932": 2, "2973": 2, "2974": 2, "3004": 2, "3087": 2, "3094": 2, "3095": 2, "3100": 2, "3109": 2, "experiment": [2, 5, 18], "releas": [2, 5, 7], "feedback": 2, "correlation_detector": [2, 5, 11, 18], "implement": [2, 3, 5, 16, 19, 21, 22], "23175155": 2, "21627564": 2, "34215632": 2, "20981648": 2, "appear": [2, 5, 10, 12, 14, 18, 21], "autocorrel": 2, "inde": 2, "red": 3, "load": [3, 8], "independ": [3, 5, 8, 11], "furthermor": 3, "clip": [3, 5, 8, 11, 19, 21, 22], "histplot": 3, "subsequ": [3, 5, 10, 19, 22], "product": [3, 5, 10], "mostli": [3, 5], "unsupervis": [3, 14, 18, 19], "iid": [3, 8], "particular": [3, 10], "univariateoutlierdetector": [3, 11, 19], "consid": [3, 5, 6, 17, 19], "separ": [3, 10, 19], "usual": 3, "probabl": [3, 5, 13, 17, 19, 20, 21], "multivariateoutlierdetector": [3, 11, 19], "togeth": [3, 19], "dure": [3, 
5, 19, 22], "word": [3, 5, 16, 21], "examin": 3, "final": [3, 19], "bit": [3, 5], "mode": 3, "supervis": 3, "fulli": 3, "triger": 3, "similar": [3, 5], "seen": 3, "ordinari": 3, "rfpipelin": [3, 5, 11, 19], "contain": [3, 5, 7, 10, 13, 16, 18, 21], "out": [3, 10, 22], "read": [3, 6, 7, 9, 22], "compat": 3, "requir": [3, 5, 7, 10, 17, 18, 19, 20], "comparison": [3, 5], "vector": [3, 19, 20], "avail": [3, 10], "anoth": [3, 5, 6, 21], "compos": [3, 22], "multi": [3, 13, 20], "make_rf_pipelin": [3, 5, 11, 19], "just": [3, 5, 7, 13, 19, 21], "carri": [3, 8, 10], "phase": 3, "categor": [3, 5, 8, 18, 19, 20], "input": [3, 19, 21], "349": 3, "26207946089678574": 3, "49778842936294404": 3, "3682141715600706": 3, "when": [3, 5, 19, 21, 22], "categori": [3, 20], "y_pred": 3, "30": [3, 21], "argument": [3, 5, 12, 18], "element": [3, 16, 21], "redflag_pipelin": 3, "compon": [3, 5, 8, 19, 22], "yet": [3, 5], "sensit": [3, 21], "instanti": [3, 5, 19], "construct": [3, 19], "drop": 3, "leav": 3, "don": [3, 7, 21], "think": 3, "troubl": 3, "lower": [3, 20], "qualifi": 3, "rememb": 3, "longer": [3, 5], "839": 3, "626": 3, "154443705823081": 3, "higher": 3, "fail": [3, 5, 22], "mention": 3, "whether": [3, 10, 12, 14, 16, 17, 18, 20], "never": 3, "rfpipelinerfpipelin": 3, "imbalancecomparatorimbalancecompar": 3, "therefor": [3, 19], "infer": [3, 14, 16, 18], "66": 3, "276": 3, "2359": 3, "73324716": 3, "591": 3, "252": 3, "2354": 3, "54679144": 3, "341": 3, "82": 3, "2330": 3, "35783664": 3, "064": 3, "90": [3, 5, 14, 18, 21], "49": [3, 13, 14, 18], "2193": 3, "06953439": 3, "168": 3, "975": 3, "2192": 3, "32922081": 3, "154": 3, "108": 3, "2176": 3, "62535394": 3, "125": 3, "emit": [3, 5, 21], "has_nan": [3, 5, 11, 21], "isnan": 3, "0x7fa46947d760": 3, "make_detector_pipelin": [3, 5, 11, 19], "combin": [3, 10, 12], "ab": [3, 12], "custom": [3, 5, 19], "0x7fa424977600": 3, "0x7fa41db54fe0": 3, "class_count": [3, 11, 13], "worri": 3, "concern": 3, "seem": [3, 5, 19], "lose": 3, 
"dynam": 3, "rang": [3, 5, 16, 17, 18, 20], "daili": 3, "temperatur": [3, 18, 20], "europ": 3, "deg": 3, "dealt": 3, "attenu": 3, "larg": [3, 6, 21], "sens": [3, 5, 19], "simpli": 3, "suspici": 3, "without": [3, 10], "perform": [3, 5, 10, 19, 21], "awar": 3, "research": [3, 22], "contigu": 3, "space": 3, "spatial": [3, 12], "rock": 3, "properti": [3, 16], "assumpt": [3, 8, 14, 18], "One": 3, "big": 3, "pitfal": [3, 22], "non": [3, 5, 10], "must": [3, 10, 12, 17, 19, 21], "leakag": [3, 8], "thu": [3, 20], "over": [3, 12, 21], "optimist": 3, "evaul": 3, "date": [3, 10], "patient": 3, "id": [3, 13, 18, 19], "borehol": 3, "robust": [3, 19], "covari": [3, 17, 19], "insensit": 3, "dimension": 3, "analog": [3, 17], "varianc": [3, 14, 18], "certain": 3, "fall": 3, "centr": 3, "within": [3, 10, 19, 21], "sd": [3, 17], "1000": [3, 17, 21], "val": 3, "iso": [3, 17], "okai": 3, "keep": 3, "bin": [3, 12, 19, 21], "No": [3, 5, 12, 19, 21], "evalu": [3, 5], "turn": [3, 16], "treatment": 3, "crack": 3, "sign": 3, "violat": 3, "ident": [3, 8, 12], "current": [3, 5, 16], "visual": 3, "especi": 3, "ignor": [3, 12, 13, 17, 18], "forget": 3, "appli": [3, 5, 10, 12, 17, 19, 22], "domain": 3, "geograph": 3, "type": [3, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21], "widget": 3, "select": 3, "unintend": 3, "classic": 3, "medic": 3, "diagnosi": 3, "encod": 3, "hand": [3, 13], "distract": 3, "improv": [3, 5, 6, 10], "desir": 3, "contribut": [4, 8, 10], "project": [4, 6, 7], "alphabet": 4, "matt": 4, "hall": 4, "agil": [4, 6], "scientif": 4, "canada": 4, "orcid": 4, "0000": 4, "0002": 4, "4054": 4, "8295": 4, "make": [5, 6, 7, 8, 10, 13, 16, 19, 20], "document": [5, 6, 7, 9, 10], "repons": 5, "review": 5, "submiss": [5, 10], "journal": [5, 12, 21], "open": 5, "sourc": [5, 9, 10], "softwar": [5, 10, 22], "joss": [5, 22], "89": 5, "91": 5, "92": 5, "93": 5, "94": 5, "95": [5, 16, 18, 20, 21], "build": 5, "window": 5, "maco": 5, "linux": 5, "ci": 5, "intend": 5, "preview": 5, "relat": [5, 12, 15, 
16, 17, 20], "accessor": [5, 8, 18, 22], "is_imbalanc": [5, 11, 13, 18, 22], "conda": [5, 7, 8, 9], "manag": [5, 10], "forg": [5, 8, 9], "warn": [5, 8, 14, 18, 19, 21, 22], "valueexcept": 5, "allow": [5, 21], "pipelin": [5, 8, 19, 22], "break": 5, "is_ord": [5, 11, 18, 20], "markov": [5, 11], "chain": [5, 16, 19], "analysi": 5, "chi": [5, 16], "squar": [5, 12, 16, 19, 20, 21], "transit": [5, 16], "matrix": [5, 12, 16], "boolean": [5, 17, 21], "perhap": 5, "below": [5, 8, 10, 13, 18], "is_multimod": [5, 11, 12], "present": [5, 19], "modal": 5, "partit": [5, 12], "insufficientdatadetector": [5, 11, 19], "regressionmultimodaldetector": 5, "multimodaldetector": 5, "via": 5, "subject": [5, 10], "text": [5, 10], "dummy_classification_scor": [5, 11, 20], "dummy_regression_scor": [5, 11, 20], "naiv": [5, 19], "mse": [5, 20], "roc": [5, 20], "auc": [5, 20], "addition": 5, "most_frequ": [5, 18, 20], "emploi": 5, "suit": [5, 20], "appropri": [5, 10, 18, 20], "move": 5, "update_p": [5, 11, 21], "util": [5, 11, 17], "imbal": [5, 8, 11, 18, 19], "up": [5, 19], "debat": 5, "has_low_distance_stdev": 5, "resembl": 5, "semant": 5, "success": 5, "1d": [5, 12, 17, 19, 21], "write": [5, 6, 10], "own": [5, 8, 10, 22], "take": [5, 21], "sequenc": [5, 12, 16, 19, 21], "map": 5, "scikit": [5, 8, 17, 19, 20, 22], "unimod": 5, "soon": 5, "redefin": 5, "is_standard": [5, 11, 21], "is_standard_norm": [5, 11, 21], "kolmogorov": [5, 21], "smirnov": [5, 21], "reliabl": [5, 22], "exactli": [5, 20], "roughli": 5, "slightli": 5, "exist": [5, 22], "eg": 5, "sinc": 5, "knn": [5, 14, 18], "third": [5, 10, 21], "unstabl": 5, "caus": [5, 10], "erron": 5, "consecut": [5, 11, 21], "tutori": [5, 6, 8], "doc": 5, "button": 5, "half": [5, 14], "high": [5, 18, 19, 20], "imbalancecompar": [5, 11, 19], "throw": 5, "garden": 5, "special": [5, 10], "straight": 5, "fork": [5, 8], "claus": [5, 19], "bsd": [5, 19], "licens": [5, 8, 19], "using_redflag_with_sklearn": 5, "buggi": 5, "convers": [5, 10, 17], "discret": 
[5, 13], "ones": [5, 21], "test_redflag": 5, "wherea": 5, "doctest": [5, 7], "onc": 5, "pytest": [5, 7], "coverag": 5, "excess": [5, 19], "reorgan": 5, "modul": [5, 8], "namespac": 5, "doesn": 5, "affect": 5, "confus": 5, "either": [5, 7, 10, 12, 14, 16, 18], "conveni": [5, 21], "oneclasssvm": 5, "ellipticenvelop": 5, "zscore_outli": 5, "kde_peak": [5, 11, 12], "peak": [5, 12], "fit_kd": [5, 11, 12], "get_kd": [5, 11, 12], "find_large_peak": [5, 11, 12], "bandwidth": [5, 12], "bw_silverman": [5, 11, 12], "bw_scott": [5, 11, 12], "overrid": 5, "fix": [5, 6, 22], "bug": [5, 6], "using_redflag": 5, "has_monoton": [5, 11, 21], "has_flat": [5, 11, 21], "interpol": 5, "iter_group": [5, 11, 21], "ecdf": [5, 11, 21], "flatten": [5, 11, 21], "stdev_to_proport": [5, 11, 17, 21], "proportion_to_stdev": [5, 11, 21], "wrote": 5, "has_few_sampl": [5, 11, 21], "z": [5, 21], "goe": 5, "workflow": [5, 7, 8, 22], "stabl": 5, "flail": 5, "auto": [5, 15, 18, 19, 20], "thank": 6, "submit": [6, 10, 22], "request": [6, 7], "propos": 6, "pull": [6, 7], "typo": 6, "fortun": 6, "profession": 6, "commun": [6, 10], "mutual": 6, "consider": 6, "protect": 6, "everyon": 6, "wish": 6, "identifi": [6, 12, 17, 22], "author": [6, 8, 10], "yourself": 6, "md": [6, 7], "agre": [6, 10], "shall": [6, 10], "govern": 6, "term": [6, 10], "unless": [6, 10], "specif": [6, 21], "agreement": [6, 10], "made": [6, 10, 14, 18], "start": [7, 21], "dev": [7, 9], "back": [7, 13], "cov": 7, "docstr": [7, 21], "further": 7, "folder": 7, "repo": 7, "pep": 7, "518": 7, "style": 7, "tar": 7, "gz": 7, "whl": 7, "command": [7, 9], "cd": 7, "sphinx": 7, "manual": 7, "stuff": 7, "makefil": 7, "script": 7, "updat": [7, 21], "publish": [7, 21], "action": 7, "push": 7, "upload": 7, "pypi": 7, "interfac": [7, 10, 19], "lightweight": 8, "safeti": 8, "net": 8, "ndarrai": [8, 12, 13, 14, 16, 17, 19, 21], "analys": 8, "threat": 8, "channel": [8, 9], "program": 8, "standalon": [8, 22], "explor": 8, "overview": 8, "basic": 8, "usag": 
8, "metric": [8, 12, 13, 14, 18], "pre": 8, "built": [8, 19], "submodul": 8, "content": [8, 10], "develop": [8, 9], "changelog": 8, "search": [8, 12], "At": 9, "config": 9, "channel_prior": 9, "strict": 9, "apach": 10, "januari": 10, "2004": 10, "www": 10, "AND": 10, "condit": [10, 19], "FOR": 10, "reproduct": 10, "defin": [10, 13, 18], "section": 10, "through": [10, 22], "licensor": 10, "copyright": 10, "owner": 10, "entiti": 10, "grant": 10, "legal": 10, "union": 10, "act": 10, "under": [10, 19], "power": 10, "direct": [10, 16], "indirect": 10, "contract": 10, "ii": 10, "ownership": 10, "fifti": 10, "percent": 10, "outstand": 10, "share": 10, "iii": 10, "benefici": 10, "individu": 10, "exercis": 10, "permiss": 10, "form": 10, "prefer": 10, "modif": 10, "limit": 10, "configur": 10, "mechan": 10, "translat": 10, "compil": 10, "media": 10, "authorship": 10, "attach": 10, "appendix": 10, "deriv": [10, 13], "editori": 10, "revis": 10, "annot": 10, "elabor": 10, "repres": [10, 12, 14, 16, 18], "whole": [10, 19], "origin": [10, 16, 19], "remain": 10, "mere": 10, "link": 10, "bind": 10, "thereof": 10, "addit": [10, 19], "intention": 10, "inclus": 10, "behalf": 10, "electron": 10, "verbal": 10, "written": 10, "sent": 10, "mail": 10, "system": [10, 16], "track": 10, "discuss": 10, "exclud": 10, "conspicu": 10, "mark": [10, 21], "design": 10, "Not": [10, 13, 19], "contributor": [10, 19], "whom": 10, "receiv": 10, "incorpor": [10, 22], "herebi": 10, "perpetu": 10, "worldwid": 10, "exclus": 10, "charg": 10, "royalti": 10, "free": 10, "irrevoc": 10, "publicli": 10, "sublicens": 10, "patent": 10, "except": [10, 22], "state": [10, 14, 16, 18], "offer": [10, 22], "sell": 10, "transfer": 10, "claim": 10, "necessarili": 10, "infring": 10, "alon": 10, "institut": 10, "litig": 10, "against": [10, 12, 22], "counterclaim": 10, "lawsuit": 10, "alleg": 10, "constitut": 10, "contributori": 10, "termin": 10, "redistribut": 10, "medium": 10, "meet": [10, 19], "recipi": 10, "modifi": 10, 
"promin": 10, "retain": 10, "trademark": 10, "pertain": 10, "readabl": 10, "place": 10, "wherev": 10, "parti": 10, "alongsid": 10, "addendum": 10, "constru": 10, "statement": 10, "compli": 10, "explicitli": 10, "notwithstand": 10, "abov": [10, 19], "noth": [10, 18, 19], "herein": 10, "supersed": 10, "execut": 10, "trade": 10, "servic": 10, "customari": 10, "disclaim": 10, "warranti": 10, "applic": 10, "law": 10, "AS": 10, "basi": 10, "OR": 10, "OF": 10, "express": [10, 19], "impli": 10, "titl": 10, "merchant": 10, "sole": 10, "respons": 10, "determin": [10, 17], "risk": 10, "associ": 10, "liabil": 10, "event": [10, 13, 18], "theori": 10, "tort": 10, "neglig": 10, "grossli": 10, "liabl": 10, "damag": 10, "incident": 10, "consequenti": 10, "charact": [10, 16], "aris": 10, "inabl": 10, "loss": 10, "goodwil": 10, "stoppag": 10, "failur": 10, "malfunct": 10, "commerci": 10, "advis": 10, "while": [10, 18, 19, 20], "fee": 10, "indemn": 10, "oblig": 10, "right": 10, "consist": 10, "indemnifi": 10, "defend": 10, "hold": 10, "harmless": 10, "incur": 10, "assert": 10, "end": [10, 21], "cv_kde": [11, 12], "wasserstein_multi": [11, 12], "wasserstein_ovo": [11, 12], "wasserstein_ovr": [11, 12], "diverg": [11, 13, 18], "furthest_distribut": [11, 13], "major_minor": [11, 13], "markov_chain": [11, 16], "chi_squar": [11, 16], "degrees_of_freedom": [11, 16], "expected_freq": [11, 16], "from_sequ": [11, 16], "generate_st": [11, 16], "normalized_differ": [11, 16], "observed_freq": [11, 16], "hollow_matrix": [11, 16], "regular": [11, 16], "mahalanobis_outli": [11, 17], "dataframeaccessor": [11, 18], "seriesaccessor": [11, 18], "null_decor": [11, 18], "formatwarn": [11, 19], "is_binari": [11, 20], "is_multiclass": [11, 20], "is_multioutput": [11, 20], "n_class": [11, 20], "bool_to_index": [11, 21], "cv": [11, 12, 21], "docstring_from": [11, 21], "generate_data": [11, 13, 18, 21], "get_idx": [11, 21], "is_numer": [11, 21], "ordered_uniqu": [11, 21], "split_and_standard": [11, 21], 
"zscore": [11, 21], "understand": [12, 15, 17, 20], "buffer": [12, 13, 14, 15, 17, 20, 21], "_supportsarrai": [12, 13, 14, 15, 17, 20, 21], "_nestedsequ": [12, 13, 14, 15, 17, 20, 21], "int": [12, 13, 14, 15, 16, 17, 18, 20, 21], "float": [12, 13, 14, 15, 16, 17, 18, 20, 21], "complex": [12, 13, 14, 15, 17, 20, 21, 22], "str": [12, 13, 14, 15, 16, 17, 18, 19, 20, 21], "byte": [12, 13, 14, 15, 17, 20, 21], "namedtupl": 12, "histogram": [12, 19], "8771812708978117": 12, "5001419889107208": 12, "3286356643172673": 12, "3406453953773365": 12, "scott": [12, 19], "6162678270732356": 12, "1e": 12, "silverman": 12, "bw": 12, "1981": 12, "investig": 12, "royal": 12, "societi": 12, "vol": 12, "43": 12, "pp": 12, "97": 12, "581810759152688": 12, "n_bandwidth": 12, "grid": 12, "optim": 12, "fold": 12, "5212113989811242": 12, "largest": [12, 21], "amplitud": 12, "cut": 12, "off": 12, "smaller": [12, 19, 20], "x_peak": 12, "y_peak": 12, "15": [12, 14, 18, 20, 21], "kde": 12, "2124714013056916": 12, "014367259502733645": 12, "rule": 12, "thumb": 12, "354649738246933": 12, "162332012191087": 12, "per": [12, 15, 21], "concaten": 12, "67243035": 12, "88998226": 12, "22014721": 12, "19729456": 12, "ovr": 12, "reduc": 12, "callabl": [12, 13], "ovo": 12, "full": 12, "axi": 12, "2d": [12, 17, 21], "latter": 12, "implicitli": 12, "reshap": [12, 17], "97490053": 12, "1392715": 12, "11417203": 12, "69635752": 12, "22475": 12, "39754762": 12, "71161667": 12, "24495": 12, "pairwis": 12, "squareform": 12, "match": [12, 17, 21], "k": [12, 13], "55708601": 12, "39271504": 12, "83562902": 12, "rest": 12, "refer": 13, "jonathan": 13, "inaki": 13, "inza": 13, "jose": 13, "lozano": 13, "extent": 13, "recognit": 13, "letter": 13, "98": 13, "doi": [13, 21], "1016": 13, "j": 13, "patrec": 13, "08": 13, "002": 13, "dict": [13, 18, 19, 20], "counter": 13, "recommend": [13, 18], "omit": [13, 18], "encount": [13, 22], "helling": [13, 18], "string": [13, 16, 18, 19, 21], "euclidean": [13, 18], "manhattan": 
[13, 18], "kl": [13, 18], "tv": [13, 18], "actual": [13, 17, 18, 19], "zeta": [13, 18], "equat": 13, "length": [13, 19, 21], "discov": 13, "ir": [13, 18], "furthest": 13, "reflect": [13, 18], "minu": [13, 18], "accord": [13, 18], "eq": [13, 18], "mathrm": [13, 18], "frac": [13, 18], "d_": [13, 18], "delta": [13, 18], "mathbf": [13, 18], "iota": [13, 18], "_m": [13, 18], "l1": [13, 18], "l2": [13, 18], "variat": [13, 18, 21], "kullback": [13, 18], "leibner": [13, 18], "288": [13, 18], "round": [13, 18], "76": [13, 18], "629": [13, 18], "333": [13, 18], "511": [13, 18], "81": [13, 18], "61": [13, 18], "73": [13, 18], "65": [13, 18, 21], "maj": 13, "logist": [14, 18], "permut": [14, 18], "lasso": [14, 18], "cluster": [14, 18], "highest": [14, 18], "kept": [14, 18], "55": [14, 17, 18, 21], "85": [14, 18, 21], "99416839": [14, 18], "00583161": [14, 18], "x0": [14, 18], "x1": [14, 18], "x2": [14, 18], "cutoff": 14, "01": 14, "24": 14, "int64": [14, 17, 21], "revers": 14, "chunk": 15, "agilescientif": 16, "striplog": 16, "observed_count": 16, "include_self": 16, "q": [16, 18, 20], "critic": 16, "bigger": 16, "second": 16, "reject": 16, "hypothesi": 16, "classmethod": 16, "strings_are_st": 16, "pars": 16, "specifi": [16, 19], "upward": 16, "inner": 16, "token": 16, "sst": 16, "mud": 16, "lst": 16, "previou": 16, "dimens": [16, 20, 21], "current_st": 16, "next": 16, "hollow": 16, "diagon": 16, "seq_of_seq": 16, "plu": [16, 21], "atleast_2d": 16, "137": 17, "contamin": 17, "approxim": [17, 21], "lof": 17, "ee": 17, "mahanalobi": 17, "inlier": 17, "convent": [17, 21], "four": 17, "33": 17, "multipli": 17, "rousseeuw": 17, "van": [17, 22], "driessen": 17, "n_sampl": [17, 19], "n_featur": [17, 19], "6583124": 17, "1055416": 17, "5527708": 17, "01173463": 17, "67448975": 17, "33724488": 17, "stdev": [17, 21], "api": 17, "outsid": 17, "70": 17, "89163847": 17, "million": 17, "datapoint": 17, "billion": 17, "pandas_obj": 18, "automat": [18, 19, 20], "tomorrow": [18, 20], "rain": 
[18, 20], "cloud": [18, 20], "sun": [18, 20], "seed": [18, 20, 21], "dictionari": [18, 20], "3333333333333333": [18, 20], "top": [18, 20], "middl": [18, 20], "bottom": [18, 20], "baseestim": [19, 22], "transformermixin": [19, 22], "fit_param": 19, "n_output": 19, "x_new": 19, "n_features_new": 19, "sin": 19, "linspac": 19, "38077051": 19, "42977406": 19, "05260728": 19, "92571458": 19, "81188195": 19, "7482485": 19, "84147098": 19, "warn_if_zero": 19, "memori": 19, "expens": 19, "anyth": 19, "bother": 19, "min_class_diff": 19, "imbalance_": 19, "adjust": 19, "unusu": 19, "difficult": 19, "suffici": 19, "mutlivari": 19, "1_000": 19, "12573022": 19, "13210486": 19, "64042265": 19, "10490012": 19, "53566937": 19, "36159505": 19, "24972527": 19, "75063397": 19, "55581573": 19, "01881162": 19, "90942756": 19, "36922933": 19, "outliers_": 19, "beyond": 19, "covarianc": 19, "verbos": 19, "adapt": 19, "handl": 19, "prior": [19, 21], "iter": [19, 21], "fulfil": 19, "xt": 19, "n_transformed_featur": 19, "presenc": 19, "mappabl": 19, "correspond": 19, "safer": 19, "shorthand": 19, "constructor": 19, "permit": 19, "lowercas": 19, "joblib": 19, "cach": 19, "path": 19, "directori": 19, "enabl": 19, "clone": 19, "named_step": 19, "advantag": 19, "consum": 19, "elaps": 19, "complet": 19, "baselin": [20, 22], "dummyclassifi": 20, "20000000000000004": 20, "35654761904761906": 20, "dummyregressor": 20, "root": 20, "whichev": 20, "arr": [20, 21], "randint": 20, "output": [20, 21], "typeerror": 20, "cond": 21, "stepsiz": 21, "coeffici": 21, "decim": 21, "5163977794943222": 21, "instruct": 21, "human": 21, "friendli": 21, "source_func": 21, "downsampl": 21, "cdf": 21, "switch": 21, "weight": 21, "mid": 21, "halfwai": 21, "formal": 21, "unbias": 21, "everi": [21, 22], "foo": 21, "l": 21, "toler": [21, 22], "flat": 21, "interv": 21, "monoton": 21, "idx": 21, "atol": 21, "001": 21, "faster": 21, "isclos": 21, "\u03bc": 21, "\u03c3": 21, "allclos": 21, "absolut": 21, "yield": 21, "mask": 
21, "item": 21, "unord": 21, "fast": 21, "reli": 21, "job": 21, "slow": 21, "1000000000": 21, "invers": 21, "magnif": 21, "hyperellipsoid": 21, "sdhe": 21, "proport": 21, "2816": 21, "tabl": 21, "1371": 21, "pone": 21, "0118537": 21, "decent": 21, "precis": 21, "1e9": 21, "575829302496098": 21, "039137525465009": 21, "8000000000000003": 21, "y_val": 21, "whose": 21, "68": 21, "27": 21, "39": 21, "signific": 21, "beta": 21, "paper": [21, 22], "poseidon": 21, "csd": 21, "auth": 21, "pdf": 21, "ververidis08a": 21, "exact": 21, "6826894921370859": 21, "6826894916531445": 21, "9973002039367398": 21, "9973002039633309": 21, "39346933952920327": 21, "9946544947734935": 21, "bayesian": 21, "rate": 21, "posterior": 21, "4999999999999998": 21, "54919334": 21, "161895": 21, "77459667": 21, "38729833": 21, "practition": 22, "field": 22, "ensur": 22, "safe": 22, "lead": 22, "overconfid": 22, "wildli": 22, "integr": 22, "enhanc": 22, "qualiti": 22, "hazard": 22, "situat": 22, "harm": 22, "concept": 22, "known": 22, "prevent": 22, "civil": 22, "engin": 22, "industri": 22, "decad": 22, "gelder": 22, "etal": 22, "2021": 22, "motiv": 22, "draft": 22, "scientist": 22, "alreadi": 22, "clippingdetector": 22, "although": 22, "subclass": 22, "attempt": 22}, "objects": {"": [[11, 0, 0, "-", "redflag"]], "redflag": [[12, 0, 0, "-", "distributions"], [13, 0, 0, "-", "imbalance"], [14, 0, 0, "-", "importance"], [15, 0, 0, "-", "independence"], [16, 0, 0, "-", "markov"], [17, 0, 0, "-", "outliers"], [18, 0, 0, "-", "pandas"], [19, 0, 0, "-", "sklearn"], [20, 0, 0, "-", "target"], [21, 0, 0, "-", "utils"]], "redflag.distributions": [[12, 1, 1, "", "best_distribution"], [12, 1, 1, "", "bw_scott"], [12, 1, 1, "", "bw_silverman"], [12, 1, 1, "", "cv_kde"], [12, 1, 1, "", "find_large_peaks"], [12, 1, 1, "", "fit_kde"], [12, 1, 1, "", "get_kde"], [12, 1, 1, "", "is_multimodal"], [12, 1, 1, "", "kde_peaks"], [12, 1, 1, "", "wasserstein"], [12, 1, 1, "", "wasserstein_multi"], [12, 1, 1, "", 
"wasserstein_ovo"], [12, 1, 1, "", "wasserstein_ovr"]], "redflag.imbalance": [[13, 1, 1, "", "class_counts"], [13, 1, 1, "", "divergence"], [13, 1, 1, "", "empirical_distribution"], [13, 1, 1, "", "furthest_distribution"], [13, 1, 1, "", "imbalance_degree"], [13, 1, 1, "", "imbalance_ratio"], [13, 1, 1, "", "is_imbalanced"], [13, 1, 1, "", "major_minor"], [13, 1, 1, "", "minority_classes"]], "redflag.importance": [[14, 1, 1, "", "feature_importances"], [14, 1, 1, "", "least_important_features"], [14, 1, 1, "", "most_important_features"]], "redflag.independence": [[15, 1, 1, "", "is_correlated"]], "redflag.markov": [[16, 2, 1, "", "Markov_chain"], [16, 1, 1, "", "hollow_matrix"], [16, 1, 1, "", "observations"], [16, 1, 1, "", "regularize"]], "redflag.markov.Markov_chain": [[16, 3, 1, "", "chi_squared"], [16, 4, 1, "", "degrees_of_freedom"], [16, 4, 1, "", "expected_freqs"], [16, 3, 1, "", "from_sequence"], [16, 3, 1, "", "generate_states"], [16, 4, 1, "", "normalized_difference"], [16, 4, 1, "", "observed_freqs"]], "redflag.outliers": [[17, 1, 1, "", "expected_outliers"], [17, 1, 1, "", "get_outliers"], [17, 1, 1, "", "has_outliers"], [17, 1, 1, "", "mahalanobis"], [17, 1, 1, "", "mahalanobis_outliers"]], "redflag.pandas": [[18, 2, 1, "", "DataFrameAccessor"], [18, 2, 1, "", "SeriesAccessor"], [18, 1, 1, "", "null_decorator"]], "redflag.pandas.DataFrameAccessor": [[18, 3, 1, "", "correlation_detector"], [18, 3, 1, "", "feature_importances"]], "redflag.pandas.SeriesAccessor": [[18, 3, 1, "", "dummy_scores"], [18, 3, 1, "", "imbalance_degree"], [18, 3, 1, "", "is_imbalanced"], [18, 3, 1, "", "is_ordered"], [18, 3, 1, "", "minority_classes"], [18, 3, 1, "", "report"]], "redflag.sklearn": [[19, 2, 1, "", "BaseRedflagDetector"], [19, 2, 1, "", "ClipDetector"], [19, 2, 1, "", "CorrelationDetector"], [19, 2, 1, "", "Detector"], [19, 2, 1, "", "DistributionComparator"], [19, 2, 1, "", "DummyPredictor"], [19, 2, 1, "", "ImbalanceComparator"], [19, 2, 1, "", 
"ImbalanceDetector"], [19, 2, 1, "", "ImportanceDetector"], [19, 2, 1, "", "InsufficientDataDetector"], [19, 2, 1, "", "MultimodalityDetector"], [19, 2, 1, "", "MultivariateOutlierDetector"], [19, 2, 1, "", "OutlierDetector"], [19, 2, 1, "", "RfPipeline"], [19, 2, 1, "", "UnivariateOutlierDetector"], [19, 1, 1, "", "formatwarning"], [19, 1, 1, "", "make_detector_pipeline"], [19, 1, 1, "", "make_rf_pipeline"]], "redflag.sklearn.BaseRedflagDetector": [[19, 3, 1, "", "fit"], [19, 3, 1, "", "fit_transform"], [19, 3, 1, "", "transform"]], "redflag.sklearn.DistributionComparator": [[19, 3, 1, "", "fit"], [19, 3, 1, "", "fit_transform"], [19, 3, 1, "", "transform"]], "redflag.sklearn.DummyPredictor": [[19, 3, 1, "", "fit"], [19, 3, 1, "", "transform"]], "redflag.sklearn.ImbalanceComparator": [[19, 3, 1, "", "fit"], [19, 3, 1, "", "fit_transform"], [19, 3, 1, "", "transform"]], "redflag.sklearn.ImbalanceDetector": [[19, 3, 1, "", "fit"], [19, 3, 1, "", "transform"]], "redflag.sklearn.ImportanceDetector": [[19, 3, 1, "", "fit"], [19, 3, 1, "", "transform"]], "redflag.sklearn.InsufficientDataDetector": [[19, 3, 1, "", "fit"], [19, 3, 1, "", "fit_transform"], [19, 3, 1, "", "transform"]], "redflag.sklearn.MultimodalityDetector": [[19, 3, 1, "", "fit"], [19, 3, 1, "", "transform"]], "redflag.sklearn.MultivariateOutlierDetector": [[19, 3, 1, "", "fit"], [19, 3, 1, "", "fit_transform"], [19, 3, 1, "", "transform"]], "redflag.sklearn.OutlierDetector": [[19, 3, 1, "", "fit"], [19, 3, 1, "", "fit_transform"], [19, 3, 1, "", "transform"]], "redflag.sklearn.RfPipeline": [[19, 3, 1, "", "transform"]], "redflag.target": [[20, 1, 1, "", "dummy_classification_scores"], [20, 1, 1, "", "dummy_regression_scores"], [20, 1, 1, "", "dummy_scores"], [20, 1, 1, "", "is_binary"], [20, 1, 1, "", "is_continuous"], [20, 1, 1, "", "is_multiclass"], [20, 1, 1, "", "is_multioutput"], [20, 1, 1, "", "is_ordered"], [20, 1, 1, "", "n_classes"]], "redflag.utils": [[21, 1, 1, "", "bool_to_index"], [21, 1, 
1, "", "clipped"], [21, 1, 1, "", "consecutive"], [21, 1, 1, "", "cv"], [21, 1, 1, "", "deprecated"], [21, 1, 1, "", "docstring_from"], [21, 1, 1, "", "ecdf"], [21, 1, 1, "", "flatten"], [21, 1, 1, "", "generate_data"], [21, 1, 1, "", "get_idx"], [21, 1, 1, "", "has_few_samples"], [21, 1, 1, "", "has_flat"], [21, 1, 1, "", "has_monotonic"], [21, 1, 1, "", "has_nans"], [21, 1, 1, "", "index_to_bool"], [21, 1, 1, "", "is_clipped"], [21, 1, 1, "", "is_numeric"], [21, 1, 1, "", "is_standard_normal"], [21, 1, 1, "", "is_standardized"], [21, 1, 1, "", "iter_groups"], [21, 1, 1, "", "ordered_unique"], [21, 1, 1, "", "proportion_to_stdev"], [21, 1, 1, "", "split_and_standardize"], [21, 1, 1, "", "stdev_to_proportion"], [21, 1, 1, "", "update_p"], [21, 1, 1, "", "zscore"]]}, "objtypes": {"0": "py:module", "1": "py:function", "2": "py:class", "3": "py:method", "4": "py:property"}, "objnames": {"0": ["py", "module", "Python module"], "1": ["py", "function", "Python function"], "2": ["py", "class", "Python class"], "3": ["py", "method", "Python method"], "4": ["py", "property", "Python property"]}, "titleterms": {"basic": 0, "usag": 0, "load": [0, 1], "some": [0, 1], "data": [0, 1], "categor": 0, "continu": [0, 7], "imbal": [0, 1, 3, 13], "metric": [0, 1], "outlier": [0, 17], "clip": [0, 1], "distribut": [0, 12], "shape": 0, "ident": 0, "assumpt": [0, 1], "alreadi": 0, "split": 0, "out": 0, "group": 0, "arrai": 0, "independ": [0, 1, 15], "featur": 0, "import": [0, 1, 14], "tutori": 1, "A": 1, "simpl": 1, "ml": [1, 8], "workflow": 1, "quick": [1, 8], "look": 1, "redflag": [1, 2, 3, 8, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22], "pipelin": [1, 3], "make": [1, 3], "your": [1, 3], "own": [1, 3], "test": [1, 7], "us": [2, 3], "panda": [2, 18], "seri": 2, "accessor": 2, "datafram": 2, "sklearn": [3, 19], "The": 3, "detector": 3, "class": 3, "pre": 3, "built": 3, "transform": 3, "compar": 3, "smoke": 3, "what": [3, 22], "do": 3, "about": 3, "warn": 3, "imbalancedetector": 3, 
"imbalancecompar": 3, "clipdetector": 3, "correlationdetector": 3, "outlierdetector": 3, "distributioncompar": 3, "importancedetector": 3, "author": 4, "changelog": 5, "0": 5, "4": 5, "2": 5, "10": 5, "decemb": 5, "2023": 5, "1": 5, "octob": 5, "28": 5, "septemb": 5, "3": 5, "21": 5, "novemb": 5, "2022": 5, "9": 5, "25": 5, "august": 5, "8": 5, "juli": 5, "7": 5, "11": 5, "februari": 5, "31": 5, "januari": 5, "30": 5, "contribut": [6, 7], "code": 6, "conduct": 6, "authorship": 6, "licens": [6, 10], "develop": 7, "instal": [7, 9], "build": 7, "packag": [7, 11], "doc": 7, "integr": 7, "safer": 8, "design": [8, 22], "start": 8, "user": 8, "guid": 8, "api": 8, "refer": 8, "other": 8, "resourc": 8, "indic": 8, "tabl": 8, "option": 9, "depend": 9, "submodul": 11, "modul": [11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21], "content": 11, "markov": 16, "target": 20, "util": 21, "i": 22, "overview": 22, "safeti": 22, "": 22}, "envversion": {"sphinx.domains.c": 3, "sphinx.domains.changeset": 1, "sphinx.domains.citation": 1, "sphinx.domains.cpp": 9, "sphinx.domains.index": 1, "sphinx.domains.javascript": 3, "sphinx.domains.math": 2, "sphinx.domains.python": 4, "sphinx.domains.rst": 2, "sphinx.domains.std": 2, "sphinx": 60}, "alltitles": {"\ud83d\udea9 Basic usage": [[0, "basic-usage"]], "Load some data": [[0, "load-some-data"], [1, "load-some-data"]], "Categorical or continuous?": [[0, "categorical-or-continuous"]], "Imbalance metrics": [[0, "imbalance-metrics"], [1, "imbalance-metrics"]], "Outliers": [[0, "outliers"]], "Clipping": [[0, "clipping"], [1, "clipping"]], "Distribution shape": [[0, "distribution-shape"]], "Identical distribution assumption": [[0, "identical-distribution-assumption"]], "Already split out group arrays": [[0, "already-split-out-group-arrays"]], "Independence assumption": [[0, "independence-assumption"], [1, "independence-assumption"]], "Feature importance": [[0, "feature-importance"]], "\ud83d\udea9 Tutorial": [[1, "tutorial"]], "A simple ML workflow": 
[[1, "a-simple-ml-workflow"]], "A quick look at redflag": [[1, "a-quick-look-at-redflag"]], "Importance": [[1, "importance"]], "Pipelines": [[1, "pipelines"]], "Making your own tests": [[1, "making-your-own-tests"]], "\ud83d\udea9 Using redflag with Pandas": [[2, "using-redflag-with-pandas"]], "Series accessor": [[2, "series-accessor"]], "DataFrame accessor": [[2, "dataframe-accessor"]], "\ud83d\udea9 Using redflag with sklearn": [[3, "using-redflag-with-sklearn"]], "The redflag detector classes": [[3, "the-redflag-detector-classes"]], "Using the pre-built redflag pipeline": [[3, "using-the-pre-built-redflag-pipeline"]], "Using the \u2018detector\u2019 transformers": [[3, "using-the-detector-transformers"]], "The imbalance comparator": [[3, "the-imbalance-comparator"]], "Making your own smoke detector": [[3, "making-your-own-smoke-detector"]], "What to do about the warnings": [[3, "what-to-do-about-the-warnings"]], "ImbalanceDetector and ImbalanceComparator": [[3, "imbalancedetector-and-imbalancecomparator"]], "ClipDetector": [[3, "clipdetector"]], "CorrelationDetector": [[3, "correlationdetector"]], "OutlierDetector": [[3, "outlierdetector"]], "DistributionComparator": [[3, "distributioncomparator"]], "ImportanceDetector": [[3, "importancedetector"]], "Authors": [[4, "authors"]], "Changelog": [[5, "changelog"]], "0.4.2, 10 December 2023": [[5, "december-2023"]], "0.4.1, 2 October 2023": [[5, "october-2023"]], "0.4.0, 28 September 2023": [[5, "september-2023"]], "0.3.0, 21 September 2023": [[5, "id1"]], "0.2.0, 4 September 2023": [[5, "id2"]], "0.1.10, 21 November 2022": [[5, "november-2022"]], "0.1.9, 25 August 2022": [[5, "august-2022"]], "0.1.8, 8 July 2022": [[5, "july-2022"]], "0.1.3 to 0.1.7, 9\u201311 February 2022": [[5, "to-0-1-7-911-february-2022"]], "0.1.2, 1 February 2022": [[5, "february-2022"]], "0.1.1, 31 January 2022": [[5, "january-2022"]], "0.1.0, 30 January 2022": [[5, "id3"]], "Contributing": [[6, "contributing"], [7, "contributing"]], "Code of 
conduct": [[6, "code-of-conduct"]], "Authorship": [[6, "authorship"]], "License": [[6, "license"], [10, "license"]], "Development": [[7, "development"]], "Installation": [[7, "installation"]], "Testing": [[7, "testing"]], "Building the package": [[7, "building-the-package"]], "Building the docs": [[7, "building-the-docs"]], "Continuous integration": [[7, "continuous-integration"]], "Redflag: safer ML by design": [[8, "redflag-safer-ml-by-design"]], "Quick start": [[8, "quick-start"]], "User guide": [[8, "user-guide"], [8, null]], "API reference": [[8, "api-reference"], [8, null]], "Other resources": [[8, "other-resources"], [8, null]], "Indices and tables": [[8, "indices-and-tables"]], "\ud83d\udea9 Installation": [[9, "installation"]], "Optional dependencies": [[9, "optional-dependencies"]], "redflag package": [[11, "redflag-package"]], "Submodules": [[11, "submodules"]], "Module contents": [[11, "module-redflag"]], "redflag.distributions module": [[12, "module-redflag.distributions"]], "redflag.imbalance module": [[13, "module-redflag.imbalance"]], "redflag.importance module": [[14, "module-redflag.importance"]], "redflag.independence module": [[15, "module-redflag.independence"]], "redflag.markov module": [[16, "module-redflag.markov"]], "redflag.outliers module": [[17, "module-redflag.outliers"]], "redflag.pandas module": [[18, "module-redflag.pandas"]], "redflag.sklearn module": [[19, "module-redflag.sklearn"]], "redflag.target module": [[20, "module-redflag.target"]], "redflag.utils module": [[21, "module-redflag.utils"]], "\ud83d\udea9 What is redflag?": [[22, "what-is-redflag"]], "Overview": [[22, "overview"]], "Safety by design": [[22, "safety-by-design"]], "What\u2019s in redflag": [[22, "what-s-in-redflag"]]}, "indexentries": {"module": [[11, "module-redflag"], [12, "module-redflag.distributions"], [13, "module-redflag.imbalance"], [14, "module-redflag.importance"], [15, "module-redflag.independence"], [16, "module-redflag.markov"], [17, 
"module-redflag.outliers"], [18, "module-redflag.pandas"], [19, "module-redflag.sklearn"], [20, "module-redflag.target"], [21, "module-redflag.utils"]], "redflag": [[11, "module-redflag"]], "best_distribution() (in module redflag.distributions)": [[12, "redflag.distributions.best_distribution"]], "bw_scott() (in module redflag.distributions)": [[12, "redflag.distributions.bw_scott"]], "bw_silverman() (in module redflag.distributions)": [[12, "redflag.distributions.bw_silverman"]], "cv_kde() (in module redflag.distributions)": [[12, "redflag.distributions.cv_kde"]], "find_large_peaks() (in module redflag.distributions)": [[12, "redflag.distributions.find_large_peaks"]], "fit_kde() (in module redflag.distributions)": [[12, "redflag.distributions.fit_kde"]], "get_kde() (in module redflag.distributions)": [[12, "redflag.distributions.get_kde"]], "is_multimodal() (in module redflag.distributions)": [[12, "redflag.distributions.is_multimodal"]], "kde_peaks() (in module redflag.distributions)": [[12, "redflag.distributions.kde_peaks"]], "redflag.distributions": [[12, "module-redflag.distributions"]], "wasserstein() (in module redflag.distributions)": [[12, "redflag.distributions.wasserstein"]], "wasserstein_multi() (in module redflag.distributions)": [[12, "redflag.distributions.wasserstein_multi"]], "wasserstein_ovo() (in module redflag.distributions)": [[12, "redflag.distributions.wasserstein_ovo"]], "wasserstein_ovr() (in module redflag.distributions)": [[12, "redflag.distributions.wasserstein_ovr"]], "class_counts() (in module redflag.imbalance)": [[13, "redflag.imbalance.class_counts"]], "divergence() (in module redflag.imbalance)": [[13, "redflag.imbalance.divergence"]], "empirical_distribution() (in module redflag.imbalance)": [[13, "redflag.imbalance.empirical_distribution"]], "furthest_distribution() (in module redflag.imbalance)": [[13, "redflag.imbalance.furthest_distribution"]], "imbalance_degree() (in module redflag.imbalance)": [[13, 
"redflag.imbalance.imbalance_degree"]], "imbalance_ratio() (in module redflag.imbalance)": [[13, "redflag.imbalance.imbalance_ratio"]], "is_imbalanced() (in module redflag.imbalance)": [[13, "redflag.imbalance.is_imbalanced"]], "major_minor() (in module redflag.imbalance)": [[13, "redflag.imbalance.major_minor"]], "minority_classes() (in module redflag.imbalance)": [[13, "redflag.imbalance.minority_classes"]], "redflag.imbalance": [[13, "module-redflag.imbalance"]], "feature_importances() (in module redflag.importance)": [[14, "redflag.importance.feature_importances"]], "least_important_features() (in module redflag.importance)": [[14, "redflag.importance.least_important_features"]], "most_important_features() (in module redflag.importance)": [[14, "redflag.importance.most_important_features"]], "redflag.importance": [[14, "module-redflag.importance"]], "is_correlated() (in module redflag.independence)": [[15, "redflag.independence.is_correlated"]], "redflag.independence": [[15, "module-redflag.independence"]], "markov_chain (class in redflag.markov)": [[16, "redflag.markov.Markov_chain"]], "chi_squared() (redflag.markov.markov_chain method)": [[16, "redflag.markov.Markov_chain.chi_squared"]], "degrees_of_freedom (redflag.markov.markov_chain property)": [[16, "redflag.markov.Markov_chain.degrees_of_freedom"]], "expected_freqs (redflag.markov.markov_chain property)": [[16, "redflag.markov.Markov_chain.expected_freqs"]], "from_sequence() (redflag.markov.markov_chain class method)": [[16, "redflag.markov.Markov_chain.from_sequence"]], "generate_states() (redflag.markov.markov_chain method)": [[16, "redflag.markov.Markov_chain.generate_states"]], "hollow_matrix() (in module redflag.markov)": [[16, "redflag.markov.hollow_matrix"]], "normalized_difference (redflag.markov.markov_chain property)": [[16, "redflag.markov.Markov_chain.normalized_difference"]], "observations() (in module redflag.markov)": [[16, "redflag.markov.observations"]], "observed_freqs 
(redflag.markov.markov_chain property)": [[16, "redflag.markov.Markov_chain.observed_freqs"]], "redflag.markov": [[16, "module-redflag.markov"]], "regularize() (in module redflag.markov)": [[16, "redflag.markov.regularize"]], "expected_outliers() (in module redflag.outliers)": [[17, "redflag.outliers.expected_outliers"]], "get_outliers() (in module redflag.outliers)": [[17, "redflag.outliers.get_outliers"]], "has_outliers() (in module redflag.outliers)": [[17, "redflag.outliers.has_outliers"]], "mahalanobis() (in module redflag.outliers)": [[17, "redflag.outliers.mahalanobis"]], "mahalanobis_outliers() (in module redflag.outliers)": [[17, "redflag.outliers.mahalanobis_outliers"]], "redflag.outliers": [[17, "module-redflag.outliers"]], "dataframeaccessor (class in redflag.pandas)": [[18, "redflag.pandas.DataFrameAccessor"]], "seriesaccessor (class in redflag.pandas)": [[18, "redflag.pandas.SeriesAccessor"]], "correlation_detector() (redflag.pandas.dataframeaccessor method)": [[18, "redflag.pandas.DataFrameAccessor.correlation_detector"]], "dummy_scores() (redflag.pandas.seriesaccessor method)": [[18, "redflag.pandas.SeriesAccessor.dummy_scores"]], "feature_importances() (redflag.pandas.dataframeaccessor method)": [[18, "redflag.pandas.DataFrameAccessor.feature_importances"]], "imbalance_degree() (redflag.pandas.seriesaccessor method)": [[18, "redflag.pandas.SeriesAccessor.imbalance_degree"]], "is_imbalanced() (redflag.pandas.seriesaccessor method)": [[18, "redflag.pandas.SeriesAccessor.is_imbalanced"]], "is_ordered() (redflag.pandas.seriesaccessor method)": [[18, "redflag.pandas.SeriesAccessor.is_ordered"]], "minority_classes() (redflag.pandas.seriesaccessor method)": [[18, "redflag.pandas.SeriesAccessor.minority_classes"]], "null_decorator() (in module redflag.pandas)": [[18, "redflag.pandas.null_decorator"]], "redflag.pandas": [[18, "module-redflag.pandas"]], "report() (redflag.pandas.seriesaccessor method)": [[18, "redflag.pandas.SeriesAccessor.report"]], 
"baseredflagdetector (class in redflag.sklearn)": [[19, "redflag.sklearn.BaseRedflagDetector"]], "clipdetector (class in redflag.sklearn)": [[19, "redflag.sklearn.ClipDetector"]], "correlationdetector (class in redflag.sklearn)": [[19, "redflag.sklearn.CorrelationDetector"]], "detector (class in redflag.sklearn)": [[19, "redflag.sklearn.Detector"]], "distributioncomparator (class in redflag.sklearn)": [[19, "redflag.sklearn.DistributionComparator"]], "dummypredictor (class in redflag.sklearn)": [[19, "redflag.sklearn.DummyPredictor"]], "imbalancecomparator (class in redflag.sklearn)": [[19, "redflag.sklearn.ImbalanceComparator"]], "imbalancedetector (class in redflag.sklearn)": [[19, "redflag.sklearn.ImbalanceDetector"]], "importancedetector (class in redflag.sklearn)": [[19, "redflag.sklearn.ImportanceDetector"]], "insufficientdatadetector (class in redflag.sklearn)": [[19, "redflag.sklearn.InsufficientDataDetector"]], "multimodalitydetector (class in redflag.sklearn)": [[19, "redflag.sklearn.MultimodalityDetector"]], "multivariateoutlierdetector (class in redflag.sklearn)": [[19, "redflag.sklearn.MultivariateOutlierDetector"]], "outlierdetector (class in redflag.sklearn)": [[19, "redflag.sklearn.OutlierDetector"]], "rfpipeline (class in redflag.sklearn)": [[19, "redflag.sklearn.RfPipeline"]], "univariateoutlierdetector (class in redflag.sklearn)": [[19, "redflag.sklearn.UnivariateOutlierDetector"]], "fit() (redflag.sklearn.baseredflagdetector method)": [[19, "redflag.sklearn.BaseRedflagDetector.fit"]], "fit() (redflag.sklearn.distributioncomparator method)": [[19, "redflag.sklearn.DistributionComparator.fit"]], "fit() (redflag.sklearn.dummypredictor method)": [[19, "redflag.sklearn.DummyPredictor.fit"]], "fit() (redflag.sklearn.imbalancecomparator method)": [[19, "redflag.sklearn.ImbalanceComparator.fit"]], "fit() (redflag.sklearn.imbalancedetector method)": [[19, "redflag.sklearn.ImbalanceDetector.fit"]], "fit() (redflag.sklearn.importancedetector method)": 
[[19, "redflag.sklearn.ImportanceDetector.fit"]], "fit() (redflag.sklearn.insufficientdatadetector method)": [[19, "redflag.sklearn.InsufficientDataDetector.fit"]], "fit() (redflag.sklearn.multimodalitydetector method)": [[19, "redflag.sklearn.MultimodalityDetector.fit"]], "fit() (redflag.sklearn.multivariateoutlierdetector method)": [[19, "redflag.sklearn.MultivariateOutlierDetector.fit"]], "fit() (redflag.sklearn.outlierdetector method)": [[19, "redflag.sklearn.OutlierDetector.fit"]], "fit_transform() (redflag.sklearn.baseredflagdetector method)": [[19, "redflag.sklearn.BaseRedflagDetector.fit_transform"]], "fit_transform() (redflag.sklearn.distributioncomparator method)": [[19, "redflag.sklearn.DistributionComparator.fit_transform"]], "fit_transform() (redflag.sklearn.imbalancecomparator method)": [[19, "redflag.sklearn.ImbalanceComparator.fit_transform"]], "fit_transform() (redflag.sklearn.insufficientdatadetector method)": [[19, "redflag.sklearn.InsufficientDataDetector.fit_transform"]], "fit_transform() (redflag.sklearn.multivariateoutlierdetector method)": [[19, "redflag.sklearn.MultivariateOutlierDetector.fit_transform"]], "fit_transform() (redflag.sklearn.outlierdetector method)": [[19, "redflag.sklearn.OutlierDetector.fit_transform"]], "formatwarning() (in module redflag.sklearn)": [[19, "redflag.sklearn.formatwarning"]], "make_detector_pipeline() (in module redflag.sklearn)": [[19, "redflag.sklearn.make_detector_pipeline"]], "make_rf_pipeline() (in module redflag.sklearn)": [[19, "redflag.sklearn.make_rf_pipeline"]], "redflag.sklearn": [[19, "module-redflag.sklearn"]], "transform() (redflag.sklearn.baseredflagdetector method)": [[19, "redflag.sklearn.BaseRedflagDetector.transform"]], "transform() (redflag.sklearn.distributioncomparator method)": [[19, "redflag.sklearn.DistributionComparator.transform"]], "transform() (redflag.sklearn.dummypredictor method)": [[19, "redflag.sklearn.DummyPredictor.transform"]], "transform() 
(redflag.sklearn.imbalancecomparator method)": [[19, "redflag.sklearn.ImbalanceComparator.transform"]], "transform() (redflag.sklearn.imbalancedetector method)": [[19, "redflag.sklearn.ImbalanceDetector.transform"]], "transform() (redflag.sklearn.importancedetector method)": [[19, "redflag.sklearn.ImportanceDetector.transform"]], "transform() (redflag.sklearn.insufficientdatadetector method)": [[19, "redflag.sklearn.InsufficientDataDetector.transform"]], "transform() (redflag.sklearn.multimodalitydetector method)": [[19, "redflag.sklearn.MultimodalityDetector.transform"]], "transform() (redflag.sklearn.multivariateoutlierdetector method)": [[19, "redflag.sklearn.MultivariateOutlierDetector.transform"]], "transform() (redflag.sklearn.outlierdetector method)": [[19, "redflag.sklearn.OutlierDetector.transform"]], "transform() (redflag.sklearn.rfpipeline method)": [[19, "redflag.sklearn.RfPipeline.transform"]], "dummy_classification_scores() (in module redflag.target)": [[20, "redflag.target.dummy_classification_scores"]], "dummy_regression_scores() (in module redflag.target)": [[20, "redflag.target.dummy_regression_scores"]], "dummy_scores() (in module redflag.target)": [[20, "redflag.target.dummy_scores"]], "is_binary() (in module redflag.target)": [[20, "redflag.target.is_binary"]], "is_continuous() (in module redflag.target)": [[20, "redflag.target.is_continuous"]], "is_multiclass() (in module redflag.target)": [[20, "redflag.target.is_multiclass"]], "is_multioutput() (in module redflag.target)": [[20, "redflag.target.is_multioutput"]], "is_ordered() (in module redflag.target)": [[20, "redflag.target.is_ordered"]], "n_classes() (in module redflag.target)": [[20, "redflag.target.n_classes"]], "redflag.target": [[20, "module-redflag.target"]], "bool_to_index() (in module redflag.utils)": [[21, "redflag.utils.bool_to_index"]], "clipped() (in module redflag.utils)": [[21, "redflag.utils.clipped"]], "consecutive() (in module redflag.utils)": [[21, 
"redflag.utils.consecutive"]], "cv() (in module redflag.utils)": [[21, "redflag.utils.cv"]], "deprecated() (in module redflag.utils)": [[21, "redflag.utils.deprecated"]], "docstring_from() (in module redflag.utils)": [[21, "redflag.utils.docstring_from"]], "ecdf() (in module redflag.utils)": [[21, "redflag.utils.ecdf"]], "flatten() (in module redflag.utils)": [[21, "redflag.utils.flatten"]], "generate_data() (in module redflag.utils)": [[21, "redflag.utils.generate_data"]], "get_idx() (in module redflag.utils)": [[21, "redflag.utils.get_idx"]], "has_few_samples() (in module redflag.utils)": [[21, "redflag.utils.has_few_samples"]], "has_flat() (in module redflag.utils)": [[21, "redflag.utils.has_flat"]], "has_monotonic() (in module redflag.utils)": [[21, "redflag.utils.has_monotonic"]], "has_nans() (in module redflag.utils)": [[21, "redflag.utils.has_nans"]], "index_to_bool() (in module redflag.utils)": [[21, "redflag.utils.index_to_bool"]], "is_clipped() (in module redflag.utils)": [[21, "redflag.utils.is_clipped"]], "is_numeric() (in module redflag.utils)": [[21, "redflag.utils.is_numeric"]], "is_standard_normal() (in module redflag.utils)": [[21, "redflag.utils.is_standard_normal"]], "is_standardized() (in module redflag.utils)": [[21, "redflag.utils.is_standardized"]], "iter_groups() (in module redflag.utils)": [[21, "redflag.utils.iter_groups"]], "ordered_unique() (in module redflag.utils)": [[21, "redflag.utils.ordered_unique"]], "proportion_to_stdev() (in module redflag.utils)": [[21, "redflag.utils.proportion_to_stdev"]], "redflag.utils": [[21, "module-redflag.utils"]], "split_and_standardize() (in module redflag.utils)": [[21, "redflag.utils.split_and_standardize"]], "stdev_to_proportion() (in module redflag.utils)": [[21, "redflag.utils.stdev_to_proportion"]], "update_p() (in module redflag.utils)": [[21, "redflag.utils.update_p"]], "zscore() (in module redflag.utils)": [[21, "redflag.utils.zscore"]]}}) \ No newline at end of file 
+Search.setIndex({"docnames": ["_notebooks/Basic_usage", "_notebooks/Tutorial", "_notebooks/Using_redflag_with_Pandas", "_notebooks/Using_redflag_with_sklearn", "authors", "changelog", "contributing", "development", "index", "installation", "license", "redflag", "redflag.distributions", "redflag.imbalance", "redflag.importance", "redflag.independence", "redflag.markov", "redflag.outliers", "redflag.pandas", "redflag.sklearn", "redflag.target", "redflag.utils", "what_is_redflag"], "filenames": ["_notebooks/Basic_usage.ipynb", "_notebooks/Tutorial.ipynb", "_notebooks/Using_redflag_with_Pandas.ipynb", "_notebooks/Using_redflag_with_sklearn.ipynb", "authors.md", "changelog.md", "contributing.md", "development.md", "index.rst", "installation.md", "license.md", "redflag.rst", "redflag.distributions.rst", "redflag.imbalance.rst", "redflag.importance.rst", "redflag.independence.rst", "redflag.markov.rst", "redflag.outliers.rst", "redflag.pandas.rst", "redflag.sklearn.rst", "redflag.target.rst", "redflag.utils.rst", "what_is_redflag.md"], "titles": ["\ud83d\udea9 Basic usage", "\ud83d\udea9 Tutorial", "\ud83d\udea9 Using redflag with Pandas", "\ud83d\udea9 Using redflag with sklearn", "Authors", "Changelog", "Contributing", "Development", "Redflag: safer ML by design", "\ud83d\udea9 Installation", "License", "redflag package", "redflag.distributions module", "redflag.imbalance module", "redflag.importance module", "redflag.independence module", "redflag.markov module", "redflag.outliers module", "redflag.pandas module", "redflag.sklearn module", "redflag.target module", "redflag.utils module", "\ud83d\udea9 What is redflag?"], "terms": {"welcom": [0, 2], "redflag": [0, 5, 7, 9], "It": [0, 1, 5, 13, 18, 19, 21, 22], "": [0, 1, 2, 3, 5, 6, 7, 8, 10, 15, 16, 18, 19, 20, 21], "still": [0, 3, 5, 22], "earli": [0, 5], "dai": 0, "thi": [0, 1, 2, 3, 5, 6, 7, 10, 13, 14, 16, 17, 18, 19, 20, 21, 22], "librari": [0, 1, 8, 22], "ar": [0, 1, 2, 3, 5, 6, 7, 8, 10, 12, 13, 14, 15, 16, 17, 
18, 19, 20, 21, 22], "few": [0, 3], "thing": [0, 1, 3], "you": [0, 1, 2, 3, 5, 6, 7, 8, 9, 10, 12, 13, 16, 17, 18, 19, 21], "can": [0, 1, 2, 3, 5, 6, 7, 9, 12, 13, 16, 17, 18, 19, 20, 21, 22], "do": [0, 1, 5, 8, 10, 16, 18, 22], "detect": [0, 1, 3, 5, 17, 19], "label": [0, 1, 3, 5, 12, 13, 14, 18, 19, 20, 21], "ani": [0, 1, 3, 5, 10, 12, 13, 14, 15, 17, 19, 20, 21, 22], "other": [0, 1, 3, 5, 6, 7, 10, 12, 21], "variabl": [0, 3, 16, 17, 18, 20], "rf": [0, 1, 2, 3, 8], "__version__": [0, 1, 2], "0": [0, 1, 2, 3, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21], "4": [0, 1, 2, 3, 12, 13, 14, 15, 17, 18, 19, 20, 21], "2": [0, 1, 2, 3, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21], "panda": [0, 1, 3, 5, 8, 11, 22], "pd": [0, 1, 2, 3, 21], "df": [0, 1, 2, 3, 8, 22], "read_csv": [0, 1, 2, 3], "http": [0, 1, 2, 3, 10, 13, 16, 21], "raw": [0, 1, 2, 3], "githubusercont": [0, 1, 2, 3], "com": [0, 1, 2, 3, 16], "scienxlab": [0, 1, 2, 3, 6], "dataset": [0, 1, 2, 3, 5, 12, 13, 15, 17, 18, 19, 21, 22], "main": [0, 1, 2, 3, 5, 7, 8], "kg": [0, 1, 2, 3], "panoma_training_data": [0, 1, 2, 3], "csv": [0, 1, 2, 3], "look": [0, 2, 3, 8], "transpos": [0, 3], "summari": [0, 3], "each": [0, 3, 5, 10, 12, 14, 18, 19, 21], "column": [0, 1, 3, 5, 21], "datafram": [0, 3, 5, 8, 22], "i": [0, 1, 2, 3, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21], "row": [0, 3, 5, 15], "here": [0, 3, 6], "describ": [0, 3, 10], "t": [0, 1, 3, 5, 7, 19, 21], "count": [0, 3, 13, 17, 20, 21], "mean": [0, 1, 2, 3, 5, 10, 12, 18, 20, 21, 22], "std": [0, 3], "min": [0, 1, 3, 13, 21], "25": [0, 3, 14, 18, 20, 21], "50": [0, 3, 10], "75": [0, 3, 21], "max": [0, 1, 3, 12, 21], "depth": [0, 1, 2, 3], "3966": [0, 3], "882": [0, 3], "674555": [0, 3], "40": [0, 3, 12, 21], "150056": [0, 3], "784": [0, 3], "402800": [0, 3], "858": [0, 3], "012000": [0, 3], "888": [0, 3], "339600": [0, 3], "913": [0, 3], "028400": [0, 3], "963": [0, 3], "320400": [0, 3], "relpo": [0, 1, 2, 3], "524999": [0, 3], "286375": [0, 3], 
"010000": [0, 3], "282000": [0, 3], "531000": [0, 3], "773000": [0, 3], "1": [0, 1, 2, 3, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21], "000000": [0, 3], "marin": [0, 1, 2, 3], "325013": [0, 3], "589539": [0, 3], "gr": [0, 1, 2, 3, 21], "64": [0, 1, 3], "367899": [0, 3], "28": [0, 3], "414603": [0, 3], "12": [0, 1, 2, 3, 5, 12], "036000": [0, 3], "45": [0, 1, 2, 3, 14, 18, 21], "311250": [0, 3], "840000": [0, 3], "78": [0, 1, 2, 3], "809750": [0, 3], "200": [0, 3, 12, 18, 19, 20], "ild": [0, 1, 2, 3], "5": [0, 1, 2, 3, 5, 12, 13, 14, 17, 18, 19, 20, 21], "240308": [0, 3], "3": [0, 1, 2, 3, 12, 13, 14, 17, 18, 19, 20, 21], "190416": [0, 3], "340408": [0, 3], "169567": [0, 3], "305266": [0, 3], "6": [0, 1, 2, 3, 12, 13, 15, 17, 18, 20, 21], "664234": [0, 3], "32": [0, 3], "136605": [0, 3], "deltaphi": [0, 1, 2, 3], "469088": [0, 3], "922310": [0, 3], "21": [0, 3], "832000": [0, 3], "292500": [0, 3], "124750": [0, 3], "18": [0, 3], "600000": [0, 3], "phind": [0, 1, 2, 3], "13": [0, 1, 2, 3, 21], "008807": [0, 3], "936391": [0, 3], "550000": [0, 3], "8": [0, 1, 2, 3, 12, 13, 14, 15, 18, 20, 21], "196250": [0, 3], "11": [0, 1, 2, 3, 12], "781500": [0, 3], "16": [0, 3], "050000": [0, 3], "52": [0, 3], "369000": [0, 3], "pe": [0, 1, 2, 3], "686427": [0, 3], "815113": [0, 3], "200000": [0, 3], "123000": [0, 3], "514500": [0, 3], "241750": [0, 3], "094000": [0, 3], "faci": [0, 1, 2, 3], "471004": [0, 3], "406180": [0, 3], "9": [0, 1, 2, 3, 10, 12, 15, 18, 20, 21], "latitud": [0, 1, 2, 3], "37": [0, 1, 2, 3], "632575": [0, 3], "299398": [0, 3], "180732": [0, 3], "356426": [0, 3], "500380": [0, 3], "910583": [0, 3], "38": [0, 3], "063373": [0, 3], "longitud": [0, 1, 2, 3], "101": [0, 3], "294895": [0, 3], "230454": [0, 3], "646452": [0, 3], "389189": [0, 3], "325130": [0, 3], "106045": [0, 3], "100": [0, 1, 2, 3, 12, 17, 20, 21], "987305": [0, 1, 2, 3], "ild_log10": [0, 1, 2, 3], "648860": [0, 3], "251542": [0, 3], "468000": [0, 3], "501000": [0, 3], "634000": [0, 3], 
"823750": [0, 3], "507000": [0, 3], "rhob": [0, 1, 2, 3], "2288": [0, 3], "861692": [0, 3], "218": [0, 3], "038459": [0, 3], "1500": [0, 3], "2201": [0, 3], "007475": [0, 3], "2342": [0, 3], "202051": [0, 3], "2434": [0, 3], "166399": [0, 3], "2802": [0, 3], "871147": [0, 3], "fairli": 0, "easi": [0, 1], "tell": [0, 1, 21], "numer": [0, 5, 21], "harder": 0, "decid": [0, 3, 18, 20, 21], "we": [0, 1, 2, 3, 5, 6, 17, 19, 21], "us": [0, 1, 5, 7, 8, 9, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22], "is_continu": [0, 5, 11, 20], "check": [0, 1, 3, 5, 13, 15, 18, 19, 21, 22], "target": [0, 2, 3, 5, 8, 11, 14, 18, 19, 21, 22], "heurist": [0, 3, 5], "definit": [0, 5, 10, 21], "foolproof": 0, "intern": 0, "sometim": [0, 21], "how": [0, 1, 3, 5, 6], "treat": 0, "col": 0, "print": [0, 2, 5, 19, 20, 21], "f": 0, "20": [0, 3, 12, 15, 18, 19, 21], "well": [0, 1, 2, 3, 5, 16], "name": [0, 1, 2, 3, 5, 10, 12, 13, 16, 19], "fals": [0, 1, 5, 12, 15, 16, 17, 18, 19, 20, 21], "true": [0, 1, 2, 3, 5, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21], "format": [0, 1, 2, 19], "lithologi": [0, 1, 2, 3], "mineralogi": [0, 1, 2], "siliciclast": [0, 1, 2], "These": [0, 1, 5, 22], "all": [0, 1, 2, 3, 5, 7, 9, 10, 12, 13, 17, 18, 19, 20], "correct": [0, 17], "first": [0, 1, 2, 3, 12, 13, 14, 16, 18, 19, 21], "ll": [0, 1, 3], "measur": [0, 1, 3, 5, 12, 13, 14, 18], "class_imbal": [0, 5], "For": [0, 1, 2, 3, 5, 9, 10, 16, 17, 19, 21, 22], "binari": [0, 13, 20], "imbalac": 0, "ratio": [0, 1, 13, 21], "between": [0, 5, 12, 13, 18, 19, 21], "major": [0, 1, 13], "minor": [0, 1, 3, 5, 13, 18, 19], "class": [0, 1, 5, 8, 13, 16, 18, 19, 20, 21], "multiclass": [0, 13, 20], "degre": [0, 1, 5, 13, 18, 19], "ortigosa": [0, 13, 18], "hernandez": [0, 13, 18], "et": [0, 13, 18], "al": [0, 13, 18], "2017": [0, 13, 18], "singl": [0, 3, 5, 17, 18, 20], "valu": [0, 1, 3, 5, 12, 17, 19, 20, 21], "explain": [0, 3], "mani": [0, 3, 5, 17, 20], "b": [0, 10, 12, 14, 18, 20, 21], "skew": 0, "support": [0, 1, 3, 5, 10], 
"imbalance_degre": [0, 1, 2, 5, 8, 11, 13, 18], "378593040846633": [0, 1, 2], "To": [0, 1, 3, 5, 7, 19, 22], "interpret": [0, 1], "number": [0, 1, 3, 5, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21], "two": [0, 1, 3, 5, 7, 13, 21, 22], "part": [0, 1, 3, 5, 6, 10, 13, 18, 19], "The": [0, 1, 2, 4, 5, 7, 8, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22], "integ": [0, 1, 5, 13, 16, 18, 19, 20, 21], "equal": [0, 1], "m": [0, 1, 3, 5, 7, 13, 14, 16, 18], "where": [0, 1, 5, 10, 13, 14, 19], "fraction": [0, 1, 13, 18, 19, 21], "378": [0, 1], "amount": [0, 1], "balanc": [0, 1], "perfectli": [0, 1], "999": [0, 1, 21], "realli": [0, 1, 5], "bad": [0, 1], "If": [0, 1, 3, 5, 6, 7, 9, 10, 12, 13, 14, 16, 17, 18, 19, 21], "have": [0, 1, 2, 3, 4, 5, 10, 19, 21], "In": [0, 1, 3, 5, 10, 14, 18, 20, 21], "gener": [0, 1, 3, 5, 6, 7, 10, 16, 18, 20, 21], "statist": [0, 1, 3, 12, 16, 19], "more": [0, 1, 2, 3, 5, 7, 8, 10, 16, 17, 19, 20, 21, 22], "inform": [0, 1, 3, 10], "than": [0, 1, 3, 5, 12, 16, 17, 19, 20, 21], "commonli": [0, 1], "imbalance_ratio": [0, 1, 5, 11, 13], "which": [0, 1, 3, 5, 7, 10, 12, 13, 14, 17, 18, 19, 20, 22], "maximum": [0, 1, 21], "minimum": [0, 1, 3], "regard": [0, 1, 10], "get": [0, 1, 2, 7, 12, 13, 18, 19, 21], "those": [0, 1, 3, 10, 22], "fewer": [0, 1, 20], "sampl": [0, 1, 3, 15, 17, 19, 20, 21], "expect": [0, 1, 3, 5, 13, 14, 17, 18, 19], "return": [0, 1, 3, 5, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21], "order": [0, 1, 3, 4, 5, 12, 13, 14, 16, 18, 20, 21], "smallest": [0, 1], "minority_class": [0, 1, 3, 5, 11, 13, 18], "dolomit": [0, 1, 3], "sandston": [0, 1, 3], "mudston": [0, 1, 3], "wackeston": [0, 1, 3], "dtype": [0, 1, 3, 12, 13, 14, 15, 17, 20, 21], "u10": [0, 1], "empir": [0, 3, 13, 17, 18, 21, 22], "observ": [0, 5, 11, 16], "frequenc": [0, 16], "\u03b6": [0, 13], "e": [0, 1, 3, 5, 8, 13, 14, 16, 18, 20, 21, 22], "empirical_distribut": [0, 11, 13], "39989914": 0, "18582955": 0, "15834594": 0, "04790721": 0, "13691377": 0, "07110439": 0, "same": [0, 1, 3, 
5, 12, 13, 16, 18], "uniqu": [0, 12, 16, 20, 21], "note": [0, 3, 5, 12, 17, 19, 21], "differ": [0, 1, 3, 5, 10, 12, 13, 18], "from": [0, 1, 3, 5, 8, 10, 12, 13, 14, 16, 18, 19, 21, 22], "np": [0, 1, 3, 12, 16, 17, 18, 19, 20, 21], "sort": [0, 16, 21], "siltston": [0, 1, 2, 3], "limeston": [0, 1], "object": [0, 1, 2, 3, 5, 10, 16, 18, 19, 22], "also": [0, 1, 3, 5, 13, 16, 17, 18, 19, 22], "inspect": [0, 5, 13, 18, 19], "displai": [0, 10], "ax": [0, 3], "value_count": 0, "plot": 0, "kind": [0, 1, 3, 5, 10, 18, 20, 22], "bar": 0, "add": [0, 1, 3, 5, 6, 9, 10, 12], "line": [0, 9], "level": [0, 3, 16, 17, 18, 19, 20, 21], "axhlin": 0, "len": [0, 1, 12], "c": [0, 3, 5, 9, 10, 12, 14, 18, 19, 21], "r": [0, 12, 20], "matplotlib": 0, "line2d": 0, "0x7fe6b127ae10": 0, "get_outli": [0, 3, 5, 11, 17], "function": [0, 1, 2, 3, 5, 7, 8, 12, 13, 15, 16, 17, 18, 19, 20, 21, 22], "indic": [0, 3, 10, 14, 17, 19, 20, 21], "point": [0, 3, 17, 21], "301": 0, "302": 0, "303": 0, "415": 0, "416": 0, "417": 0, "418": 0, "799": 0, "896": 0, "897": 0, "898": 0, "899": [0, 3], "996": 0, "997": 0, "1843": 0, "1844": 0, "2278": 0, "2279": 0, "2280": 0, "2638": 0, "2639": 0, "2640": 0, "2641": 0, "2642": 0, "2643": 0, "2920": 0, "2921": 0, "2922": 0, "3070": 0, "3071": 0, "3074": 0, "3075": 0, "3076": 0, "3079": [0, 2], "3080": [0, 2], "3081": 0, "3580": 0, "3581": 0, "3582": 0, "3583": 0, "see": [0, 1, 2, 3, 5, 6, 7, 14, 18, 21], "lie": [0, 21], "seaborn": [0, 1, 3], "sn": [0, 1, 3], "kdeplot": [0, 3], "rugplot": 0, "loc": [0, 1, 3, 12], "c1": 0, "lw": 0, "alpha": 0, "xlabel": [0, 3], "ylabel": [0, 3], "densiti": [0, 5, 12], "By": [0, 6, 19, 21], "default": [0, 3, 5, 12, 13, 14, 16, 17, 18, 19, 20, 21], "an": [0, 2, 3, 5, 6, 9, 10, 12, 14, 17, 18, 21], "isol": [0, 3, 17], "forest": [0, 3, 14, 17, 18], "99": [0, 3, 12, 17, 19, 21], "confid": [0, 3, 16, 17, 18, 19, 20, 21], "opt": 0, "local": [0, 1, 3, 7, 17], "factor": [0, 17, 19], "ellipt": [0, 17], "envelop": [0, 17], "mahalanobi": [0, 5, 11, 
17, 19, 21], "distanc": [0, 3, 5, 12, 13, 16, 17, 18, 19, 21], "set": [0, 3, 9, 14, 16, 17, 19, 21], "choos": [0, 10], "equival": [0, 16, 19, 21], "threshold": [0, 1, 3, 5, 12, 13, 14, 15, 17, 18, 19, 21], "standard": [0, 1, 3, 5, 12, 14, 17, 18, 21], "deviat": [0, 3, 5, 17, 21], "awai": [0, 3], "signal": 0, "accept": [0, 10, 18, 19], "univari": [0, 5, 17, 19, 21, 22], "multivari": [0, 3, 5, 12, 19], "method": [0, 2, 3, 5, 12, 13, 17, 18, 19, 20, 21, 22], "mah": [0, 3, 17], "jointplot": 0, "x": [0, 1, 3, 5, 8, 12, 14, 17, 18, 19, 21, 22], "y": [0, 1, 3, 5, 8, 12, 14, 18, 19, 20, 21, 22], "hue": 0, "index_to_bool": [0, 11, 21], "n": [0, 14, 15, 16, 17, 18, 20, 21], "axisgrid": [0, 1], "jointgrid": 0, "0x7fe6b06cd340": 0, "A": [0, 3, 8, 10, 13, 16, 18, 19, 20, 21], "helper": [0, 5], "comput": [0, 5, 10, 12, 13, 16, 17, 19, 21], "given": [0, 3, 8, 12, 14, 16, 17, 18, 19, 21], "size": [0, 1, 12, 18, 19, 20, 21], "assum": [0, 5, 10, 12, 14, 18], "gaussian": [0, 3, 12, 21], "expected_outli": [0, 3, 11, 17], "80": [0, 3, 14, 18, 21], "44": 0, "so": [0, 1, 2, 3, 5, 9], "becaus": [0, 1, 3, 5, 19], "ha": [0, 1, 2, 3, 7, 10, 16, 19, 20, 21, 22], "lot": [0, 1, 3, 5, 19, 21], "truncat": 0, "tail": 0, "test": [0, 3, 5, 8, 9, 12, 14, 18, 19, 21, 22], "directli": [0, 2, 3, 5, 12, 19, 22], "has_outli": [0, 3, 5, 11, 17, 19], "compar": [0, 5, 8, 12, 13, 17, 18, 19, 21, 22], "result": [0, 3, 5, 10, 12, 21], "numpi": [0, 1, 3, 20, 21], "random": [0, 1, 3, 12, 14, 16, 18, 19, 20, 21], "normal": [0, 1, 5, 10, 12, 14, 18, 19, 21], "10_000": [0, 17, 20], "d": [0, 1, 3, 7, 10, 17, 19, 21], "p": [0, 3, 17, 19, 21], "displot": [0, 1, 3], "facetgrid": [0, 1], "0x7fe6b0501070": 0, "onli": [0, 1, 2, 3, 5, 10, 14, 16, 18, 19, 20, 22], "about": [0, 5, 7, 8, 19, 21, 22], "60": 0, "10": [0, 1, 2, 12, 13, 16, 17, 18, 20, 21], "000": [0, 1, 2, 17], "record": [0, 1, 3, 5, 19], "been": [0, 1, 3, 5, 10, 22], "multipl": [0, 1, 5, 17, 21], "instanc": [0, 1, 19, 21], "its": [0, 1, 2, 5, 10, 12], "There": 
[0, 1, 3, 6, 7, 8, 22], "legitim": [0, 1], "reason": [0, 1, 3, 5, 10], "why": [0, 1, 3, 14, 18], "might": [0, 1, 3], "happen": [0, 1, 7], "exampl": [0, 1, 2, 3, 5, 6, 7, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22], "mai": [0, 1, 2, 3, 10, 17, 19, 21], "natur": [0, 1, 3], "bound": [0, 1, 20], "g": [0, 1, 3, 5, 8, 18, 20, 21, 22], "poros": [0, 1], "alwai": [0, 1, 5], "greater": [0, 1], "deliber": [0, 1, 10], "prepar": [0, 1, 10], "process": [0, 1, 5, 22], "is_clip": [0, 1, 5, 11, 21], "0x7fe6b03e67b0": 0, "tri": [0, 5], "guess": [0, 5], "follow": [0, 1, 3, 4, 5, 7, 10, 13, 21], "scipi": [0, 12], "stat": [0, 21], "norm": [0, 12, 13, 18], "cosin": 0, "expon": 0, "exponpow": 0, "gamma": [0, 1], "gumbel_l": 0, "gumbel_r": 0, "powerlaw": 0, "triang": [0, 12], "trapz": 0, "uniform": [0, 19], "along": [0, 3, 10], "paramet": [0, 3, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22], "locat": [0, 3, 19], "scale": [0, 1, 3, 12, 21], "spite": 0, "find": [0, 1, 3, 5, 12, 17], "nearli": 0, "best_distribut": [0, 11, 12], "36789939485628": 0, "411020184908292": 0, "contrast": 0, "andbest": 0, "model": [0, 1, 3, 5, 12, 19, 21, 22], "gumbel": 0, "040572536302586": 0, "93432972751726": 0, "0x7fe6b0424da0": 0, "often": [0, 1, 3], "like": [0, 1, 2, 3, 5, 7, 9, 14, 16, 18, 19, 20, 21, 22], "implicit": 0, "our": [0, 1, 3, 19], "across": [0, 5, 12, 14, 18], "variou": [0, 1, 5], "respect": [0, 6], "both": [0, 3, 5, 7, 13, 22], "wasserstein": [0, 3, 5, 11, 12, 19], "facilit": 0, "calcul": [0, 12, 17], "aka": [0, 21], "earth": [0, 3], "mover": [0, 3], "train": [0, 1, 3, 5, 19, 21, 22], "score": [0, 1, 2, 3, 5, 12, 18, 20, 21], "substanti": 0, "w": 0, "25985545": 0, "28404634": 0, "49139232": 0, "33701782": 0, "22736457": 0, "13473663": 0, "33672956": 0, "20969657": 0, "41216725": 0, "34568777": 0, "39729747": 0, "48092099": 0, "0801856": 0, "10675027": 0, "13740318": 0, "10325295": 0, "19913347": 0, "21828753": 0, "26995735": 0, "33063277": 0, "24612402": 0, "23889923": 0, "26699721": 0, 
"2350674": 0, "20666445": 0, "44112543": 0, "16229232": 0, "63527036": 0, "18187639": 0, "34992043": 0, "19400917": 0, "74988182": 0, "31761526": 0, "27206283": 0, "30255291": 0, "24779581": 0, "could": [0, 3, 22], "heatmap": 0, "yticklabel": 0, "xticklabel": 0, "show": [0, 1, 3, 5, 14, 18], "u": [0, 1, 21], "log": [0, 1, 3], "7": [0, 3, 12, 14, 15, 17, 18, 20, 21], "somewhat": 0, "anomal": [0, 5, 8], "suggest": [0, 17], "cross": [0, 1, 10, 12], "h": 0, "cattl": 0, "sklearn": [0, 1, 2, 5, 8, 11, 13, 17, 18, 22], "model_select": [0, 1], "train_test_split": [0, 1], "preprocess": [0, 1, 3], "standardscal": [0, 1, 3], "x_train": [0, 1, 3, 21], "x_": 0, "test_siz": 0, "random_st": [0, 14, 18, 19, 20, 21], "42": [0, 1, 12, 14, 18, 20], "re": [0, 1, 3, 6, 19], "illustr": 0, "purpos": [0, 10], "valid": [0, 1, 3, 12, 19], "wai": [0, 1, 2, 3, 5, 6, 8, 13, 18, 22], "indeped": 0, "x_val": [0, 21], "x_test": [0, 1, 3], "should": [0, 1, 3, 5, 7, 12], "scaler": [0, 1], "fit_transform": [0, 8, 11, 19, 22], "transform": [0, 1, 5, 8, 10, 11, 19, 21, 22], "case": [0, 5, 14, 17, 18, 19, 21], "pass": [0, 3, 5, 12, 17, 19, 21], "them": [0, 3, 5, 13, 16, 17, 18, 22], "list": [0, 10, 13, 16, 18, 19, 20, 21], "tupl": [0, 12, 13, 16, 21], "03860982": 0, "02506236": 0, "04321734": 0, "03437337": 0, "04402681": 0, "02528225": 0, "0385111": 0, "05694201": 0, "04388196": 0, "049464": 0, "05560379": 0, "04002712": 0, "quit": [0, 5], "low": [0, 1, 3, 5, 18, 19, 20], "randomli": [0, 1, 3, 16], "correl": [0, 1, 2, 3, 15, 19], "lag": [0, 1], "shift": [0, 1, 3], "version": [0, 1, 3, 5, 7, 10, 19], "itself": [0, 1, 3, 6, 21], "sever": [0, 1, 3, 5, 6], "themselv": [0, 1, 3, 16, 19], "is_correl": [0, 1, 11, 15], "depend": [0, 1, 5, 8, 13, 18, 21], "That": [0, 1, 3, 12, 20], "shuffl": [0, 1], "remov": [0, 1, 3, 5], "doe": [0, 1, 3, 5, 10, 12, 13, 18, 19, 21, 22], "to_numpi": [0, 1], "copi": [0, 1, 5, 10, 21], "know": [0, 3, 5], "most": [0, 3, 5, 7, 12, 14, 18, 20, 21, 22], "seri": [0, 5, 8, 12, 22], 
"your": [0, 5, 8, 10], "assess": [0, 14, 18], "averag": [0, 14, 18], "serv": [0, 5], "control": [0, 10], "let": [0, 1, 2, 3], "small": [0, 3, 5, 19], "come": [0, 5, 19], "veri": [0, 1, 2, 3, 5], "close": [0, 5, 21, 22], "zero": [0, 16, 21], "constant": 0, "classif": [0, 2, 5, 14, 18, 20], "task": [0, 1, 2, 5, 14, 18, 19, 20], "imagin": 0, "try": [0, 1, 2, 3, 12, 17], "predict": [0, 1, 3, 5, 18, 19, 20, 22], "feature_import": [0, 1, 2, 5, 11, 14, 18], "41621216": 0, "2725526": 0, "23427556": 0, "07695968": 0, "unsurprisingli": 0, "useless": 0, "help": [0, 1, 5, 6, 7, 9, 22], "least": [0, 1, 5, 10, 14], "least_important_featur": [0, 5, 11, 14], "And": 0, "complementari": [0, 5], "report": [0, 2, 5, 6, 11, 18], "most_important_featur": [0, 5, 11, 14], "now": [0, 1, 2, 3, 5], "regress": [0, 2, 5, 14, 18, 20], "includ": [0, 1, 3, 5, 10, 16], "dummi": [0, 1, 2, 3, 5, 19, 20], "09713124": 0, "36666505": 0, "50735191": 0, "0288518": 0, "less": [0, 5, 21], "again": 0, "go": 1, "featur": [1, 2, 3, 5, 6, 8, 12, 14, 17, 18, 19, 21, 22], "problem": [1, 3, 13, 22], "machin": [1, 8, 22], "learn": [1, 3, 5, 8, 17, 19, 20, 22], "need": [1, 5, 7, 18, 21], "packag": [1, 5, 8, 9, 17, 22], "run": [1, 3, 5, 7, 12, 21], "code": [1, 5, 10, 16, 19], "burn": 1, "ourselv": 1, "19": [1, 12], "23": 1, "35": [1, 2, 14, 18, 21], "59": 1, "31": [1, 3, 17], "rai": 1, "ss": 1, "svm": [1, 3, 5], "svc": [1, 3], "clf": 1, "kernel": [1, 5, 12], "linear": 1, "fit": [1, 3, 10, 11, 12, 19, 22], "arrai": [1, 2, 3, 5, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21], "u2": 1, "far": [1, 2, 3], "good": [1, 20, 21], "everyth": 1, "work": [1, 3, 5, 10, 13, 19], "someon": 1, "x_scale": 1, "oop": 1, "unscal": 1, "easili": [1, 3, 5, 22], "done": 1, "peopl": [1, 4], "stack": [1, 19], "overflow": 1, "wonder": 1, "thei": [1, 2, 3, 5, 14, 18, 22], "ve": 1, "someth": [1, 3, 5, 16], "even": [1, 2, 10], "easier": [1, 5], "common": [1, 5, 10, 17], "pattern": [1, 8, 13], "y_train": [1, 3, 21], "y_test": [1, 3], "x_train_scal": 1, 
"x_test_scal": 1, "three": [1, 3, 8, 17, 22], "block": [1, 5], "split": [1, 3, 5, 21], "total": [1, 5, 13, 18], "stratifi": [1, 2, 5, 20], "preserv": 1, "wa": [1, 5, 10, 16, 21], "entir": [1, 5, 21], "leak": 1, "hidden": 1, "cannot": [1, 3, 10, 19], "plenti": 1, "too": [1, 3, 5, 20, 21], "reproduc": [1, 5, 10], "enough": [1, 3], "etc": [1, 3, 21], "error": 1, "everywher": [1, 6], "want": [1, 3, 9, 12, 13, 18, 19, 21], "chang": [1, 3, 5, 10], "sure": [1, 3, 5], "v0": 1, "otherwis": [1, 10, 13], "python": [1, 5, 7, 8, 22], "pip": [1, 7, 8, 9], "instal": [1, 2, 5, 8], "environ": [1, 3, 5, 9], "head": [1, 2], "shrimplin": [1, 2], "851": [1, 2], "3064": [1, 2], "a1": [1, 2], "sh": [1, 2], "77": [1, 2, 3], "613176": [1, 2], "915": [1, 2], "978076": [1, 2], "664": [1, 2], "2393": [1, 2], "499945": [1, 2], "4588": [1, 2], "979": [1, 2], "26": [1, 2], "581419": [1, 2], "14": [1, 2], "565": [1, 2], "661": [1, 2], "2416": [1, 2], "119814": [1, 2], "6112": [1, 2], "957": [1, 2], "79": [1, 2], "05": [1, 2, 14, 21], "549881": [1, 2], "050": [1, 2], "658": [1, 2], "2404": [1, 2], "576056": [1, 2], "7636": [1, 2], "936": [1, 2], "86": [1, 2], "518559": [1, 2], "115": [1, 2], "655": [1, 2], "249071": [1, 2], "9160": [1, 2], "74": [1, 2], "58": [1, 2], "436086": [1, 2], "300": [1, 2], "647": [1, 2], "2382": [1, 2], "602601": [1, 2], "later": [1, 3, 19], "spuriou": [1, 22], "rng": [1, 12, 18, 19, 20, 21], "default_rng": [1, 12, 18, 19, 20, 21], "nois": [1, 3], "algorithm": 1, "flag": [1, 3, 17, 21], "outlier": [1, 2, 3, 5, 8, 11, 19], "distribut": [1, 3, 5, 8, 10, 11, 13, 18, 19, 21, 22], "shape": [1, 3, 8, 12, 17, 19], "0x7f153fc84830": 1, "But": [1, 3], "around": 1, "issu": [1, 3, 5, 6, 10, 14, 18, 22], "40996915": 1, "20901374": 1, "31748025": 1, "06353685": 1, "As": [1, 2, 3, 8], "hope": 1, "attribut": [1, 10, 19], "shown": 1, "possibl": [1, 3, 5, 10], "would": [1, 12], "nice": 1, "smoke": [1, 8], "alarm": [1, 5, 19], "prebuilt": 1, "won": 1, "abl": 1, "catch": 1, "howev": [1, 5, 
10], "hard": [1, 5], "spot": 1, "self": [1, 3, 16, 19], "alert": [1, 19, 22], "user": [1, 22], "potenti": [1, 20, 22], "provid": [1, 3, 5, 10, 13, 16, 18, 20, 22], "wrap": [1, 5, 18, 20], "anywai": 1, "sensibl": 1, "test_wel": [1, 3], "crawford": [1, 3], "stuart": [1, 3], "test_flag": [1, 3], "isin": [1, 3], "step": [1, 3, 12, 16, 19, 21, 22], "x27": [1, 3], "imbalancedetector": [1, 5, 8, 11, 13, 18, 19, 22], "clipdetector": [1, 5, 11, 19], "correlationdetector": [1, 5, 11, 19], "multimod": [1, 3, 5, 12, 19], "multimodalitydetector": [1, 3, 5, 11, 19], "outlierdetector": [1, 5, 11, 19], "distributioncompar": [1, 5, 11, 19, 22], "importancedetector": [1, 5, 11, 19], "dummypredictor": [1, 3, 11, 19], "jupyt": [1, 3], "pleas": [1, 3, 6, 7, 9], "rerun": [1, 3], "cell": [1, 3], "html": [1, 3, 7], "represent": [1, 3], "trust": [1, 3], "notebook": [1, 3, 5], "On": [1, 3], "github": [1, 3, 7, 8, 16], "unabl": [1, 3], "render": [1, 3], "page": [1, 3, 5, 7, 8], "nbviewer": [1, 3], "org": [1, 3, 10, 13, 21], "pipelinepipelin": [1, 3], "imbalancedetectorimbalancedetector": [1, 3], "clipdetectorclipdetector": [1, 3], "correlationdetectorcorrelationdetector": [1, 3], "multimodalitydetectormultimodalitydetector": [1, 3], "outlierdetectoroutlierdetector": [1, 3], "distributioncomparatordistributioncompar": [1, 3], "importancedetectorimportancedetector": [1, 3], "dummypredictordummypredictor": [1, 3], "make_pipelin": [1, 3, 19], "pipe": [1, 3, 19], "standardscalerstandardscal": [1, 3], "svcsvc": [1, 3], "imbalanc": [1, 3, 13, 18], "420": [1, 3], "400": [1, 3], "minority_classes_": [1, 3, 19], "\u2139": [1, 3], "succeed": [1, 3], "group": [1, 3, 5, 12, 21], "316": 1, "v": [1, 3, 12], "relev": [1, 5], "classifi": [1, 3, 5], "f1": [1, 2, 3, 5, 18, 20], "25488459423559595": [1, 3], "roc_auc": [1, 2, 3, 18, 20], "most_frequ": [1, 3, 5, 18, 20], "strategi": [1, 2, 3, 5, 18, 20], "643721188696941": 1, "detector": [1, 5, 8, 11, 13, 18, 19, 22], "def": [1, 3], "has_neg": [1, 19], "bool": 
[1, 3, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21], "trigger": [1, 3, 5, 19], "neg": [1, 3, 19], "negative_detector": [1, 3], "nb": 1, "func": [1, 3, 12, 19], "lt": [1, 3], "baseredflagdetector": [1, 3, 11, 19], "__init__": [1, 3], "gt": [1, 3], "lambda": [1, 3, 12, 19], "0x7f153c8e3d80": 1, "messag": [1, 3, 5, 19], "detectordetector": [1, 3], "ad": [1, 5], "posit": [1, 5, 21], "what": [1, 5, 8, 17, 18, 20], "care": [1, 5], "basic_usag": [2, 3, 5], "ipynb": [2, 3, 5], "using_redflag_with_panda": 2, "some": [2, 3, 5, 6, 8, 13, 18, 19, 22], "give": [2, 3, 5, 10], "access": [2, 5, 22], "almost": [2, 5], "were": [2, 3, 5, 12], "best": [2, 5, 12, 20], "idea": [2, 3], "though": 2, "import": [2, 3, 5, 6, 8, 10, 11, 18, 19, 20, 21], "long": 2, "regist": 2, "time": [2, 3, 12, 19], "being": [2, 3, 14, 18, 21], "call": [2, 3, 5, 12, 18, 19, 20, 21, 22], "simplic": 2, "notic": [2, 10], "extra": 2, "insert": [2, 22], "Or": [2, 9], "ask": 2, "new": [2, 3, 5, 6, 7], "dummy_scor": [2, 5, 11, 18, 20], "2411344733492839": 2, "5030196416166594": 2, "mean_squared_error": [2, 18, 20], "47528": 2, "78263092096": 2, "r2": [2, 5, 18, 20], "simpl": [2, 8], "continu": [2, 5, 8, 18, 19, 20], "data": [2, 3, 5, 8, 12, 14, 15, 16, 17, 18, 19, 20, 21, 22], "suitabl": [2, 5], "34": 2, "136": 2, "140": 2, "141": 2, "142": 2, "143": 2, "145": 2, "175": 2, "180": 2, "181": 2, "182": 2, "581": 2, "633": 2, "662": 2, "768": 2, "769": 2, "801": 2, "1316": 2, "1547": 2, "1731": 2, "1732": 2, "1744": 2, "1754": 2, "1756": 2, "1778": 2, "1779": 2, "1780": 2, "1784": 2, "1788": 2, "1808": 2, "1812": 2, "2884": 2, "2973": 2, "2974": 2, "3004": 2, "3087": 2, "3109": 2, "experiment": [2, 5, 18], "futur": [2, 3, 5, 9, 16], "releas": [2, 5, 7], "feedback": 2, "correlation_detector": [2, 5, 11, 18], "implement": [2, 3, 5, 16, 19, 21, 22], "23155584": 2, "21912608": 2, "33738409": 2, "21193399": 2, "none": [2, 5, 12, 13, 14, 16, 17, 18, 19, 20, 21], "appear": [2, 5, 10, 12, 14, 18, 21], "autocorrel": 2, "inde": 2, 
"rais": [3, 19, 20, 22], "red": 3, "load": [3, 8], "independ": [3, 5, 8, 11], "furthermor": 3, "clip": [3, 5, 8, 11, 19, 21, 22], "histplot": 3, "subsequ": [3, 5, 10, 19, 22], "product": [3, 5, 10], "mostli": [3, 5], "unsupervis": [3, 14, 18, 19], "iid": [3, 8], "particular": [3, 10], "univariateoutlierdetector": [3, 11, 19], "consid": [3, 5, 6, 17, 19], "separ": [3, 10, 19], "usual": 3, "probabl": [3, 5, 13, 17, 19, 20, 21], "instead": [3, 5, 19], "multivariateoutlierdetector": [3, 11, 19], "togeth": [3, 19], "dure": [3, 5, 19, 22], "word": [3, 5, 16, 21], "examin": 3, "final": [3, 19], "one": [3, 5, 10, 12, 16, 19, 21], "bit": [3, 5], "mode": 3, "supervis": 3, "base": [3, 10, 16, 18, 19, 22], "fulli": 3, "triger": 3, "similar": [3, 5], "seen": 3, "ordinari": 3, "rfpipelin": [3, 5, 11, 19], "contain": [3, 5, 7, 10, 13, 16, 18, 21], "out": [3, 10, 22], "read": [3, 6, 7, 9, 22], "compat": 3, "requir": [3, 5, 7, 10, 17, 18, 19, 20], "comparison": [3, 5], "vector": [3, 19, 20], "avail": [3, 10], "anoth": [3, 5, 6, 21], "compos": [3, 22], "multi": [3, 13, 20], "make_rf_pipelin": [3, 5, 11, 19], "just": [3, 5, 7, 13, 19, 21], "carri": [3, 8, 10], "phase": 3, "categor": [3, 5, 8, 18, 19, 20], "input": [3, 19, 21], "349": 3, "3682141715600706": 3, "when": [3, 5, 19, 21, 22], "categori": [3, 20], "y_pred": 3, "30": [3, 21], "argument": [3, 5, 12, 18], "element": [3, 16, 21], "redflag_pipelin": 3, "compon": [3, 5, 8, 19, 22], "yet": [3, 5], "sensit": [3, 21], "instanti": [3, 5, 19], "construct": [3, 19], "drop": 3, "leav": 3, "don": [3, 7, 21], "think": 3, "troubl": 3, "lower": [3, 20], "qualifi": 3, "rememb": 3, "longer": [3, 5], "839": 3, "626": 3, "154443705823081": 3, "higher": 3, "fail": [3, 5, 22], "mention": 3, "whether": [3, 10, 12, 14, 16, 17, 18, 20], "never": 3, "rfpipelinerfpipelin": 3, "imbalancecomparatorimbalancecompar": 3, "therefor": [3, 19], "infer": [3, 14, 16, 18], "66": 3, "276": 3, "2359": 3, "73324716": 3, "591": 3, "252": 3, "2354": 3, "54679144": 3, 
"341": 3, "82": 3, "2330": 3, "35783664": 3, "064": 3, "90": [3, 5, 14, 18, 21], "49": [3, 13, 14, 18], "2193": 3, "06953439": 3, "168": 3, "975": 3, "2192": 3, "32922081": 3, "154": 3, "108": 3, "2176": 3, "62535394": 3, "125": 3, "emit": [3, 5, 21], "has_nan": [3, 5, 11, 21], "nan": [3, 21], "isnan": 3, "0x7f9c489831a0": 3, "make_detector_pipelin": [3, 5, 11, 19], "combin": [3, 10, 12], "ab": [3, 12], "custom": [3, 5, 19], "0x7f9c489822a0": 3, "0x7f9c489b4fe0": 3, "class_count": [3, 11, 13], "worri": 3, "concern": 3, "option": [3, 7, 8, 17, 19, 21, 22], "seem": [3, 5, 19], "lose": 3, "dynam": 3, "rang": [3, 5, 16, 17, 18, 20], "daili": 3, "temperatur": [3, 18, 20], "europ": 3, "deg": 3, "dealt": 3, "attenu": 3, "larg": [3, 6, 21], "sens": [3, 5, 19], "simpli": 3, "suspici": 3, "without": [3, 10], "perform": [3, 5, 10, 19, 21], "awar": 3, "research": [3, 22], "contigu": 3, "space": 3, "spatial": [3, 12], "rock": 3, "properti": [3, 16], "assumpt": [3, 8, 14, 18], "One": 3, "big": 3, "pitfal": [3, 22], "non": [3, 5, 10], "must": [3, 10, 12, 17, 19, 21], "leakag": [3, 8], "thu": [3, 20], "over": [3, 12, 21], "optimist": 3, "evaul": 3, "date": [3, 10], "patient": 3, "id": [3, 13, 18, 19], "borehol": 3, "robust": [3, 19], "covari": [3, 17, 19], "insensit": 3, "dimension": 3, "analog": [3, 17], "varianc": [3, 14, 18], "certain": 3, "fall": 3, "centr": 3, "within": [3, 10, 19, 21], "sd": [3, 17], "1000": [3, 17, 21], "val": 3, "iso": [3, 17], "okai": 3, "keep": 3, "befor": [3, 12, 17, 19, 22], "bin": [3, 12, 19, 21], "No": [3, 5, 12, 19, 21], "evalu": [3, 5], "turn": [3, 16], "treatment": 3, "crack": 3, "sign": 3, "violat": 3, "ident": [3, 8, 12], "current": [3, 5, 16], "visual": 3, "especi": 3, "ignor": [3, 12, 13, 17, 18], "forget": 3, "appli": [3, 5, 10, 12, 17, 19, 22], "domain": 3, "geograph": 3, "type": [3, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21], "widget": 3, "select": 3, "unintend": 3, "classic": 3, "medic": 3, "diagnosi": 3, "encod": 3, "hand": [3, 13], 
"distract": 3, "improv": [3, 5, 6, 10], "desir": 3, "contribut": [4, 8, 10], "project": [4, 6, 7], "alphabet": 4, "matt": 4, "hall": 4, "agil": [4, 6], "scientif": 4, "canada": 4, "orcid": 4, "0000": 4, "0002": 4, "4054": 4, "8295": 4, "make": [5, 6, 7, 8, 10, 13, 16, 19, 20], "document": [5, 6, 7, 9, 10], "repons": 5, "review": 5, "submiss": [5, 10], "journal": [5, 12, 21], "open": 5, "sourc": [5, 9, 10], "softwar": [5, 10, 22], "joss": [5, 22], "89": 5, "91": 5, "92": 5, "93": 5, "94": 5, "95": [5, 16, 18, 20, 21], "build": 5, "window": 5, "maco": 5, "linux": 5, "ci": 5, "intend": 5, "preview": 5, "relat": [5, 12, 15, 16, 17, 20], "accessor": [5, 8, 18, 22], "is_imbalanc": [5, 11, 13, 18, 22], "conda": [5, 7, 8, 9], "manag": [5, 10], "forg": [5, 8, 9], "warn": [5, 8, 14, 18, 19, 21, 22], "valueexcept": 5, "allow": [5, 21], "pipelin": [5, 8, 19, 22], "break": 5, "is_ord": [5, 11, 18, 20], "markov": [5, 11], "chain": [5, 16, 19], "analysi": 5, "chi": [5, 16], "squar": [5, 12, 16, 19, 20, 21], "transit": [5, 16], "matrix": [5, 12, 16], "boolean": [5, 17, 21], "perhap": 5, "below": [5, 8, 10, 13, 18], "is_multimod": [5, 11, 12], "present": [5, 19], "modal": 5, "partit": [5, 12], "insufficientdatadetector": [5, 11, 19], "regressionmultimodaldetector": 5, "multimodaldetector": 5, "via": 5, "subject": [5, 10], "text": [5, 10], "dummy_classification_scor": [5, 11, 20], "dummy_regression_scor": [5, 11, 20], "naiv": [5, 19], "mse": [5, 20], "roc": [5, 20], "auc": [5, 20], "addition": 5, "emploi": 5, "suit": [5, 20], "appropri": [5, 10, 18, 20], "move": 5, "update_p": [5, 11, 21], "util": [5, 11, 17], "imbal": [5, 8, 11, 18, 19], "up": [5, 19], "debat": 5, "has_low_distance_stdev": 5, "resembl": 5, "semant": 5, "success": 5, "1d": [5, 12, 17, 19, 21], "write": [5, 6, 10], "own": [5, 8, 10, 22], "take": [5, 21], "sequenc": [5, 12, 16, 19, 21], "map": 5, "scikit": [5, 8, 17, 19, 20, 22], "unimod": 5, "soon": 5, "redefin": 5, "is_standard": [5, 11, 21], "deprec": [5, 11, 21], 
"is_standard_norm": [5, 11, 21], "kolmogorov": [5, 21], "smirnov": [5, 21], "reliabl": [5, 22], "exactli": [5, 20], "roughli": 5, "slightli": 5, "exist": [5, 22], "eg": 5, "sinc": 5, "knn": [5, 14, 18], "estim": [5, 12, 19, 20, 21], "third": [5, 10, 21], "unstabl": 5, "caus": [5, 10], "erron": 5, "consecut": [5, 11, 21], "tutori": [5, 6, 8], "doc": 5, "button": 5, "half": [5, 14], "high": [5, 18, 19, 20], "imbalancecompar": [5, 11, 19], "throw": 5, "garden": 5, "special": [5, 10], "straight": 5, "fork": [5, 8], "claus": [5, 19], "bsd": [5, 19], "licens": [5, 8, 19], "using_redflag_with_sklearn": 5, "buggi": 5, "convers": [5, 10, 17], "discret": [5, 13], "ones": [5, 21], "test_redflag": 5, "py": [5, 19], "file": [5, 7, 10], "wherea": 5, "doctest": [5, 7], "onc": 5, "pytest": [5, 7], "coverag": 5, "excess": [5, 19], "reorgan": 5, "modul": [5, 8], "namespac": 5, "doesn": 5, "affect": 5, "confus": 5, "either": [5, 7, 10, 12, 14, 16, 18], "conveni": [5, 21], "oneclasssvm": 5, "ellipticenvelop": 5, "zscore_outli": 5, "kde_peak": [5, 11, 12], "peak": [5, 12], "fit_kd": [5, 11, 12], "get_kd": [5, 11, 12], "find_large_peak": [5, 11, 12], "bandwidth": [5, 12], "bw_silverman": [5, 11, 12], "bw_scott": [5, 11, 12], "overrid": 5, "fix": [5, 6, 22], "bug": [5, 6], "using_redflag": 5, "has_monoton": [5, 11, 21], "has_flat": [5, 11, 21], "interpol": 5, "iter_group": [5, 11, 21], "ecdf": [5, 11, 21], "flatten": [5, 11, 21], "stdev_to_proport": [5, 11, 17, 21], "proportion_to_stdev": [5, 11, 21], "wrote": 5, "has_few_sampl": [5, 11, 21], "z": [5, 21], "goe": 5, "workflow": [5, 7, 8, 22], "stabl": 5, "flail": 5, "auto": [5, 15, 18, 19, 20], "thank": 6, "submit": [6, 10, 22], "request": [6, 7], "propos": 6, "pull": [6, 7], "typo": 6, "fortun": 6, "profession": 6, "commun": [6, 10], "mutual": 6, "consider": 6, "protect": 6, "everyon": 6, "wish": 6, "identifi": [6, 12, 17, 22], "author": [6, 8, 10], "yourself": 6, "md": [6, 7], "agre": [6, 10], "shall": [6, 10], "govern": 6, "term": [6, 
10], "unless": [6, 10], "specif": [6, 21], "agreement": [6, 10], "made": [6, 10, 14, 18], "start": [7, 21], "dev": [7, 9], "back": [7, 13], "cov": 7, "docstr": [7, 21], "further": 7, "folder": 7, "repo": 7, "pep": 7, "518": 7, "style": 7, "tar": 7, "gz": 7, "whl": 7, "command": [7, 9], "cd": 7, "sphinx": 7, "manual": 7, "stuff": 7, "makefil": 7, "script": 7, "updat": [7, 21], "publish": [7, 21], "action": 7, "push": 7, "upload": 7, "pypi": 7, "interfac": [7, 10, 19], "lightweight": 8, "safeti": 8, "net": 8, "ndarrai": [8, 12, 13, 14, 16, 17, 19, 21], "analys": 8, "threat": 8, "channel": [8, 9], "program": 8, "standalon": [8, 22], "explor": 8, "overview": 8, "basic": 8, "usag": 8, "metric": [8, 12, 13, 14, 18], "pre": 8, "built": [8, 19], "submodul": 8, "content": [8, 10], "develop": [8, 9], "changelog": 8, "index": [8, 21], "search": [8, 12], "At": 9, "config": 9, "channel_prior": 9, "strict": 9, "apach": 10, "januari": 10, "2004": 10, "www": 10, "AND": 10, "condit": [10, 19], "FOR": 10, "reproduct": 10, "defin": [10, 13, 18], "section": 10, "through": [10, 22], "licensor": 10, "copyright": 10, "owner": 10, "entiti": 10, "grant": 10, "legal": 10, "union": 10, "act": 10, "under": [10, 19], "power": 10, "direct": [10, 16], "indirect": 10, "contract": 10, "ii": 10, "ownership": 10, "fifti": 10, "percent": 10, "outstand": 10, "share": 10, "iii": 10, "benefici": 10, "individu": 10, "exercis": 10, "permiss": 10, "form": 10, "prefer": 10, "modif": 10, "limit": 10, "configur": 10, "mechan": 10, "translat": 10, "compil": 10, "media": 10, "authorship": 10, "attach": 10, "appendix": 10, "deriv": [10, 13], "editori": 10, "revis": 10, "annot": 10, "elabor": 10, "repres": [10, 12, 14, 16, 18], "whole": [10, 19], "origin": [10, 16, 19], "remain": 10, "mere": 10, "link": 10, "bind": 10, "thereof": 10, "addit": [10, 19], "intention": 10, "inclus": 10, "behalf": 10, "electron": 10, "verbal": 10, "written": 10, "sent": 10, "mail": 10, "system": [10, 16], "track": 10, "discuss": 10, 
"exclud": 10, "conspicu": 10, "mark": [10, 21], "design": 10, "Not": [10, 13, 19], "contributor": [10, 19], "whom": 10, "receiv": 10, "incorpor": [10, 22], "herebi": 10, "perpetu": 10, "worldwid": 10, "exclus": 10, "charg": 10, "royalti": 10, "free": 10, "irrevoc": 10, "publicli": 10, "sublicens": 10, "patent": 10, "except": [10, 22], "state": [10, 14, 16, 18], "offer": [10, 22], "sell": 10, "transfer": 10, "claim": 10, "necessarili": 10, "infring": 10, "alon": 10, "institut": 10, "litig": 10, "against": [10, 12, 22], "counterclaim": 10, "lawsuit": 10, "alleg": 10, "constitut": 10, "contributori": 10, "termin": 10, "redistribut": 10, "medium": 10, "meet": [10, 19], "recipi": 10, "modifi": 10, "promin": 10, "retain": 10, "trademark": 10, "pertain": 10, "readabl": 10, "place": 10, "wherev": 10, "parti": 10, "alongsid": 10, "addendum": 10, "constru": 10, "statement": 10, "compli": 10, "explicitli": 10, "notwithstand": 10, "abov": [10, 19], "noth": [10, 18, 19], "herein": 10, "supersed": 10, "execut": 10, "trade": 10, "servic": 10, "customari": 10, "disclaim": 10, "warranti": 10, "applic": 10, "law": 10, "AS": 10, "basi": 10, "OR": 10, "OF": 10, "express": [10, 19], "impli": 10, "titl": 10, "merchant": 10, "sole": 10, "respons": 10, "determin": [10, 17], "risk": 10, "associ": 10, "liabil": 10, "event": [10, 13, 18], "theori": 10, "tort": 10, "neglig": 10, "grossli": 10, "liabl": 10, "damag": 10, "incident": 10, "consequenti": 10, "charact": [10, 16], "aris": 10, "inabl": 10, "loss": 10, "goodwil": 10, "stoppag": 10, "failur": 10, "malfunct": 10, "commerci": 10, "advis": 10, "while": [10, 18, 19, 20], "fee": 10, "indemn": 10, "oblig": 10, "right": 10, "consist": 10, "indemnifi": 10, "defend": 10, "hold": 10, "harmless": 10, "incur": 10, "assert": 10, "end": [10, 21], "cv_kde": [11, 12], "wasserstein_multi": [11, 12], "wasserstein_ovo": [11, 12], "wasserstein_ovr": [11, 12], "diverg": [11, 13, 18], "furthest_distribut": [11, 13], "major_minor": [11, 13], "markov_chain": 
[11, 16], "chi_squar": [11, 16], "degrees_of_freedom": [11, 16], "expected_freq": [11, 16], "from_sequ": [11, 16], "generate_st": [11, 16], "normalized_differ": [11, 16], "observed_freq": [11, 16], "hollow_matrix": [11, 16], "regular": [11, 16], "mahalanobis_outli": [11, 17], "dataframeaccessor": [11, 18], "seriesaccessor": [11, 18], "null_decor": [11, 18], "formatwarn": [11, 19], "is_binari": [11, 20], "is_multiclass": [11, 20], "is_multioutput": [11, 20], "n_class": [11, 20], "bool_to_index": [11, 21], "cv": [11, 12, 21], "docstring_from": [11, 21], "generate_data": [11, 13, 18, 21], "get_idx": [11, 21], "is_numer": [11, 21], "ordered_uniqu": [11, 21], "split_and_standard": [11, 21], "zscore": [11, 21], "understand": [12, 15, 17, 20], "buffer": [12, 13, 14, 15, 17, 20, 21], "_supportsarrai": [12, 13, 14, 15, 17, 20, 21], "_nestedsequ": [12, 13, 14, 15, 17, 20, 21], "int": [12, 13, 14, 15, 16, 17, 18, 20, 21], "float": [12, 13, 14, 15, 16, 17, 18, 20, 21], "complex": [12, 13, 14, 15, 17, 20, 21, 22], "str": [12, 13, 14, 15, 16, 17, 18, 19, 20, 21], "byte": [12, 13, 14, 15, 17, 20, 21], "namedtupl": 12, "histogram": [12, 19], "8771812708978117": 12, "5001419889107208": 12, "3286356643172673": 12, "3406453953773365": 12, "scott": [12, 19], "6162678270732356": 12, "1e": 12, "silverman": 12, "bw": 12, "1981": 12, "investig": 12, "royal": 12, "societi": 12, "vol": 12, "43": 12, "pp": 12, "97": 12, "581810759152688": 12, "n_bandwidth": 12, "grid": 12, "optim": 12, "fold": 12, "5212113989811242": 12, "traceback": [12, 20, 21], "recent": [12, 20, 21, 22], "last": [12, 20, 21], "valueerror": [12, 21], "largest": [12, 21], "amplitud": 12, "cut": 12, "off": 12, "smaller": [12, 19, 20], "x_peak": 12, "y_peak": 12, "15": [12, 14, 18, 20, 21], "kde": 12, "2124714013056916": 12, "014367259502733645": 12, "rule": 12, "thumb": 12, "354649738246933": 12, "162332012191087": 12, "per": [12, 15, 21], "concaten": 12, "67243035": 12, "88998226": 12, "22014721": 12, "19729456": 12, 
"ovr": 12, "reduc": 12, "callabl": [12, 13], "ovo": 12, "full": 12, "axi": 12, "2d": [12, 17, 21], "latter": 12, "implicitli": 12, "reshap": [12, 17], "97490053": 12, "1392715": 12, "11417203": 12, "69635752": 12, "22475": 12, "39754762": 12, "71161667": 12, "24495": 12, "pairwis": 12, "squareform": 12, "match": [12, 17, 21], "k": [12, 13], "55708601": 12, "39271504": 12, "83562902": 12, "rest": 12, "refer": 13, "jonathan": 13, "inaki": 13, "inza": 13, "jose": 13, "lozano": 13, "extent": 13, "recognit": 13, "letter": 13, "98": 13, "doi": [13, 21], "1016": 13, "j": 13, "patrec": 13, "08": 13, "002": 13, "dict": [13, 18, 19, 20], "counter": 13, "recommend": [13, 18], "omit": [13, 18], "encount": [13, 22], "helling": [13, 18], "string": [13, 16, 18, 19, 21], "euclidean": [13, 18], "manhattan": [13, 18], "kl": [13, 18], "tv": [13, 18], "actual": [13, 17, 18, 19], "zeta": [13, 18], "equat": 13, "length": [13, 19, 21], "discov": 13, "ir": [13, 18], "furthest": 13, "reflect": [13, 18], "minu": [13, 18], "accord": [13, 18], "eq": [13, 18], "mathrm": [13, 18], "frac": [13, 18], "d_": [13, 18], "delta": [13, 18], "mathbf": [13, 18], "iota": [13, 18], "_m": [13, 18], "l1": [13, 18], "l2": [13, 18], "variat": [13, 18, 21], "kullback": [13, 18], "leibner": [13, 18], "288": [13, 18], "round": [13, 18], "76": [13, 18], "629": [13, 18], "333": [13, 18], "511": [13, 18], "81": [13, 18], "61": [13, 18], "73": [13, 18], "65": [13, 18, 21], "maj": 13, "logist": [14, 18], "permut": [14, 18], "lasso": [14, 18], "cluster": [14, 18], "highest": [14, 18], "kept": [14, 18], "55": [14, 17, 18, 21], "85": [14, 18, 21], "99416839": [14, 18], "00583161": [14, 18], "x0": [14, 18], "x1": [14, 18], "x2": [14, 18], "cutoff": 14, "01": 14, "24": 14, "int64": [14, 17, 21], "revers": 14, "chunk": 15, "agilescientif": 16, "striplog": 16, "observed_count": 16, "include_self": 16, "q": [16, 18, 20], "critic": 16, "bigger": 16, "second": 16, "reject": 16, "hypothesi": 16, "classmethod": 16, 
"strings_are_st": 16, "pars": 16, "specifi": [16, 19], "upward": 16, "inner": 16, "token": 16, "sst": 16, "mud": 16, "lst": 16, "previou": 16, "dimens": [16, 20, 21], "current_st": 16, "next": 16, "hollow": 16, "diagon": 16, "arg": [16, 18, 19], "seq_of_seq": 16, "plu": [16, 21], "atleast_2d": 16, "137": 17, "contamin": 17, "approxim": [17, 21], "lof": 17, "ee": 17, "mahanalobi": 17, "inlier": 17, "convent": [17, 21], "four": 17, "33": 17, "multipli": 17, "rousseeuw": 17, "van": [17, 22], "driessen": 17, "n_sampl": [17, 19], "n_featur": [17, 19], "6583124": 17, "1055416": 17, "5527708": 17, "01173463": 17, "67448975": 17, "33724488": 17, "stdev": [17, 21], "api": 17, "outsid": 17, "70": 17, "89163847": 17, "million": 17, "datapoint": 17, "billion": 17, "pandas_obj": 18, "automat": [18, 19, 20], "tomorrow": [18, 20], "rain": [18, 20], "cloud": [18, 20], "sun": [18, 20], "seed": [18, 20, 21], "dictionari": [18, 20], "3333333333333333": [18, 20], "top": [18, 20], "middl": [18, 20], "bottom": [18, 20], "decor": [18, 21], "kwarg": 19, "baseestim": [19, 22], "transformermixin": [19, 22], "fit_param": 19, "n_output": 19, "x_new": 19, "n_features_new": 19, "sin": 19, "linspac": 19, "38077051": 19, "42977406": 19, "05260728": 19, "92571458": 19, "81188195": 19, "7482485": 19, "84147098": 19, "warn_if_zero": 19, "memori": 19, "expens": 19, "anyth": 19, "bother": 19, "min_class_diff": 19, "imbalance_": 19, "adjust": 19, "unusu": 19, "difficult": 19, "suffici": 19, "mutlivari": 19, "1_000": 19, "12573022": 19, "13210486": 19, "64042265": 19, "10490012": 19, "53566937": 19, "36159505": 19, "24972527": 19, "75063397": 19, "55581573": 19, "01881162": 19, "90942756": 19, "36922933": 19, "outliers_": 19, "beyond": 19, "covarianc": 19, "verbos": 19, "adapt": 19, "handl": 19, "prior": [19, 21], "iter": [19, 21], "fulfil": 19, "xt": 19, "n_transformed_featur": 19, "presenc": 19, "mappabl": 19, "correspond": 19, "safer": 19, "shorthand": 19, "constructor": 19, "permit": 19, "lowercas": 
19, "joblib": 19, "cach": 19, "path": 19, "directori": 19, "enabl": 19, "clone": 19, "named_step": 19, "advantag": 19, "consum": 19, "elaps": 19, "complet": 19, "baselin": [20, 22], "dummyclassifi": 20, "20000000000000004": 20, "35654761904761906": 20, "dummyregressor": 20, "root": 20, "whichev": 20, "arr": [20, 21], "randint": 20, "output": [20, 21], "typeerror": 20, "cond": 21, "stepsiz": 21, "coeffici": 21, "decim": 21, "5163977794943222": 21, "instruct": 21, "human": 21, "friendli": 21, "source_func": 21, "downsampl": 21, "cdf": 21, "switch": 21, "weight": 21, "mid": 21, "halfwai": 21, "formal": 21, "unbias": 21, "everi": [21, 22], "foo": 21, "l": 21, "toler": [21, 22], "flat": 21, "interv": 21, "monoton": 21, "idx": 21, "convert": 21, "atol": 21, "001": 21, "faster": 21, "isclos": 21, "\u03bc": 21, "\u03c3": 21, "allclos": 21, "absolut": 21, "yield": 21, "mask": 21, "item": 21, "unord": 21, "fast": 21, "reli": 21, "job": 21, "slow": 21, "1000000000": 21, "invers": 21, "magnif": 21, "hyperellipsoid": 21, "sdhe": 21, "proport": 21, "2816": 21, "tabl": 21, "1371": 21, "pone": 21, "0118537": 21, "decent": 21, "precis": 21, "1e9": 21, "575829302496098": 21, "039137525465009": 21, "8000000000000003": 21, "y_val": 21, "whose": 21, "68": 21, "27": 21, "39": 21, "signific": 21, "figur": 21, "beta": 21, "paper": [21, 22], "poseidon": 21, "csd": 21, "auth": 21, "pdf": 21, "ververidis08a": 21, "exact": 21, "6826894921370859": 21, "6826894916531445": 21, "9973002039367398": 21, "9973002039633309": 21, "39346933952920327": 21, "9946544947734935": 21, "bayesian": 21, "rate": 21, "posterior": 21, "4999999999999998": 21, "54919334": 21, "161895": 21, "77459667": 21, "38729833": 21, "practition": 22, "field": 22, "ensur": 22, "safe": 22, "lead": 22, "overconfid": 22, "wildli": 22, "integr": 22, "enhanc": 22, "qualiti": 22, "hazard": 22, "situat": 22, "harm": 22, "concept": 22, "known": 22, "prevent": 22, "civil": 22, "engin": 22, "industri": 22, "decad": 22, "gelder": 22, 
"etal": 22, "2021": 22, "motiv": 22, "draft": 22, "scientist": 22, "alreadi": 22, "clippingdetector": 22, "although": 22, "subclass": 22, "attempt": 22}, "objects": {"": [[11, 0, 0, "-", "redflag"]], "redflag": [[12, 0, 0, "-", "distributions"], [13, 0, 0, "-", "imbalance"], [14, 0, 0, "-", "importance"], [15, 0, 0, "-", "independence"], [16, 0, 0, "-", "markov"], [17, 0, 0, "-", "outliers"], [18, 0, 0, "-", "pandas"], [19, 0, 0, "-", "sklearn"], [20, 0, 0, "-", "target"], [21, 0, 0, "-", "utils"]], "redflag.distributions": [[12, 1, 1, "", "best_distribution"], [12, 1, 1, "", "bw_scott"], [12, 1, 1, "", "bw_silverman"], [12, 1, 1, "", "cv_kde"], [12, 1, 1, "", "find_large_peaks"], [12, 1, 1, "", "fit_kde"], [12, 1, 1, "", "get_kde"], [12, 1, 1, "", "is_multimodal"], [12, 1, 1, "", "kde_peaks"], [12, 1, 1, "", "wasserstein"], [12, 1, 1, "", "wasserstein_multi"], [12, 1, 1, "", "wasserstein_ovo"], [12, 1, 1, "", "wasserstein_ovr"]], "redflag.imbalance": [[13, 1, 1, "", "class_counts"], [13, 1, 1, "", "divergence"], [13, 1, 1, "", "empirical_distribution"], [13, 1, 1, "", "furthest_distribution"], [13, 1, 1, "", "imbalance_degree"], [13, 1, 1, "", "imbalance_ratio"], [13, 1, 1, "", "is_imbalanced"], [13, 1, 1, "", "major_minor"], [13, 1, 1, "", "minority_classes"]], "redflag.importance": [[14, 1, 1, "", "feature_importances"], [14, 1, 1, "", "least_important_features"], [14, 1, 1, "", "most_important_features"]], "redflag.independence": [[15, 1, 1, "", "is_correlated"]], "redflag.markov": [[16, 2, 1, "", "Markov_chain"], [16, 1, 1, "", "hollow_matrix"], [16, 1, 1, "", "observations"], [16, 1, 1, "", "regularize"]], "redflag.markov.Markov_chain": [[16, 3, 1, "", "chi_squared"], [16, 4, 1, "", "degrees_of_freedom"], [16, 4, 1, "", "expected_freqs"], [16, 3, 1, "", "from_sequence"], [16, 3, 1, "", "generate_states"], [16, 4, 1, "", "normalized_difference"], [16, 4, 1, "", "observed_freqs"]], "redflag.outliers": [[17, 1, 1, "", "expected_outliers"], [17, 1, 1, "", 
"get_outliers"], [17, 1, 1, "", "has_outliers"], [17, 1, 1, "", "mahalanobis"], [17, 1, 1, "", "mahalanobis_outliers"]], "redflag.pandas": [[18, 2, 1, "", "DataFrameAccessor"], [18, 2, 1, "", "SeriesAccessor"], [18, 1, 1, "", "null_decorator"]], "redflag.pandas.DataFrameAccessor": [[18, 3, 1, "", "correlation_detector"], [18, 3, 1, "", "feature_importances"]], "redflag.pandas.SeriesAccessor": [[18, 3, 1, "", "dummy_scores"], [18, 3, 1, "", "imbalance_degree"], [18, 3, 1, "", "is_imbalanced"], [18, 3, 1, "", "is_ordered"], [18, 3, 1, "", "minority_classes"], [18, 3, 1, "", "report"]], "redflag.sklearn": [[19, 2, 1, "", "BaseRedflagDetector"], [19, 2, 1, "", "ClipDetector"], [19, 2, 1, "", "CorrelationDetector"], [19, 2, 1, "", "Detector"], [19, 2, 1, "", "DistributionComparator"], [19, 2, 1, "", "DummyPredictor"], [19, 2, 1, "", "ImbalanceComparator"], [19, 2, 1, "", "ImbalanceDetector"], [19, 2, 1, "", "ImportanceDetector"], [19, 2, 1, "", "InsufficientDataDetector"], [19, 2, 1, "", "MultimodalityDetector"], [19, 2, 1, "", "MultivariateOutlierDetector"], [19, 2, 1, "", "OutlierDetector"], [19, 2, 1, "", "RfPipeline"], [19, 2, 1, "", "UnivariateOutlierDetector"], [19, 1, 1, "", "formatwarning"], [19, 1, 1, "", "make_detector_pipeline"], [19, 1, 1, "", "make_rf_pipeline"]], "redflag.sklearn.BaseRedflagDetector": [[19, 3, 1, "", "fit"], [19, 3, 1, "", "fit_transform"], [19, 3, 1, "", "transform"]], "redflag.sklearn.DistributionComparator": [[19, 3, 1, "", "fit"], [19, 3, 1, "", "fit_transform"], [19, 3, 1, "", "transform"]], "redflag.sklearn.DummyPredictor": [[19, 3, 1, "", "fit"], [19, 3, 1, "", "transform"]], "redflag.sklearn.ImbalanceComparator": [[19, 3, 1, "", "fit"], [19, 3, 1, "", "fit_transform"], [19, 3, 1, "", "transform"]], "redflag.sklearn.ImbalanceDetector": [[19, 3, 1, "", "fit"], [19, 3, 1, "", "transform"]], "redflag.sklearn.ImportanceDetector": [[19, 3, 1, "", "fit"], [19, 3, 1, "", "transform"]], "redflag.sklearn.InsufficientDataDetector": [[19, 3, 
1, "", "fit"], [19, 3, 1, "", "fit_transform"], [19, 3, 1, "", "transform"]], "redflag.sklearn.MultimodalityDetector": [[19, 3, 1, "", "fit"], [19, 3, 1, "", "transform"]], "redflag.sklearn.MultivariateOutlierDetector": [[19, 3, 1, "", "fit"], [19, 3, 1, "", "fit_transform"], [19, 3, 1, "", "transform"]], "redflag.sklearn.OutlierDetector": [[19, 3, 1, "", "fit"], [19, 3, 1, "", "fit_transform"], [19, 3, 1, "", "transform"]], "redflag.sklearn.RfPipeline": [[19, 3, 1, "", "transform"]], "redflag.target": [[20, 1, 1, "", "dummy_classification_scores"], [20, 1, 1, "", "dummy_regression_scores"], [20, 1, 1, "", "dummy_scores"], [20, 1, 1, "", "is_binary"], [20, 1, 1, "", "is_continuous"], [20, 1, 1, "", "is_multiclass"], [20, 1, 1, "", "is_multioutput"], [20, 1, 1, "", "is_ordered"], [20, 1, 1, "", "n_classes"]], "redflag.utils": [[21, 1, 1, "", "bool_to_index"], [21, 1, 1, "", "clipped"], [21, 1, 1, "", "consecutive"], [21, 1, 1, "", "cv"], [21, 1, 1, "", "deprecated"], [21, 1, 1, "", "docstring_from"], [21, 1, 1, "", "ecdf"], [21, 1, 1, "", "flatten"], [21, 1, 1, "", "generate_data"], [21, 1, 1, "", "get_idx"], [21, 1, 1, "", "has_few_samples"], [21, 1, 1, "", "has_flat"], [21, 1, 1, "", "has_monotonic"], [21, 1, 1, "", "has_nans"], [21, 1, 1, "", "index_to_bool"], [21, 1, 1, "", "is_clipped"], [21, 1, 1, "", "is_numeric"], [21, 1, 1, "", "is_standard_normal"], [21, 1, 1, "", "is_standardized"], [21, 1, 1, "", "iter_groups"], [21, 1, 1, "", "ordered_unique"], [21, 1, 1, "", "proportion_to_stdev"], [21, 1, 1, "", "split_and_standardize"], [21, 1, 1, "", "stdev_to_proportion"], [21, 1, 1, "", "update_p"], [21, 1, 1, "", "zscore"]]}, "objtypes": {"0": "py:module", "1": "py:function", "2": "py:class", "3": "py:method", "4": "py:property"}, "objnames": {"0": ["py", "module", "Python module"], "1": ["py", "function", "Python function"], "2": ["py", "class", "Python class"], "3": ["py", "method", "Python method"], "4": ["py", "property", "Python property"]}, "titleterms": 
{"basic": 0, "usag": 0, "load": [0, 1], "some": [0, 1], "data": [0, 1], "categor": 0, "continu": [0, 7], "imbal": [0, 1, 3, 13], "metric": [0, 1], "outlier": [0, 17], "clip": [0, 1], "distribut": [0, 12], "shape": 0, "ident": 0, "assumpt": [0, 1], "alreadi": 0, "split": 0, "out": 0, "group": 0, "arrai": 0, "independ": [0, 1, 15], "featur": 0, "import": [0, 1, 14], "tutori": 1, "A": 1, "simpl": 1, "ml": [1, 8], "workflow": 1, "quick": [1, 8], "look": 1, "redflag": [1, 2, 3, 8, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22], "pipelin": [1, 3], "make": [1, 3], "your": [1, 3], "own": [1, 3], "test": [1, 7], "us": [2, 3], "panda": [2, 18], "seri": 2, "accessor": 2, "datafram": 2, "sklearn": [3, 19], "The": 3, "detector": 3, "class": 3, "pre": 3, "built": 3, "transform": 3, "compar": 3, "smoke": 3, "what": [3, 22], "do": 3, "about": 3, "warn": 3, "imbalancedetector": 3, "imbalancecompar": 3, "clipdetector": 3, "correlationdetector": 3, "outlierdetector": 3, "distributioncompar": 3, "importancedetector": 3, "author": 4, "changelog": 5, "0": 5, "4": 5, "2": 5, "10": 5, "decemb": 5, "2023": 5, "1": 5, "octob": 5, "28": 5, "septemb": 5, "3": 5, "21": 5, "novemb": 5, "2022": 5, "9": 5, "25": 5, "august": 5, "8": 5, "juli": 5, "7": 5, "11": 5, "februari": 5, "31": 5, "januari": 5, "30": 5, "contribut": [6, 7], "code": 6, "conduct": 6, "authorship": 6, "licens": [6, 10], "develop": 7, "instal": [7, 9], "build": 7, "packag": [7, 11], "doc": 7, "integr": 7, "safer": 8, "design": [8, 22], "start": 8, "user": 8, "guid": 8, "api": 8, "refer": 8, "other": 8, "resourc": 8, "indic": 8, "tabl": 8, "option": 9, "depend": 9, "submodul": 11, "modul": [11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21], "content": 11, "markov": 16, "target": 20, "util": 21, "i": 22, "overview": 22, "safeti": 22, "": 22}, "envversion": {"sphinx.domains.c": 3, "sphinx.domains.changeset": 1, "sphinx.domains.citation": 1, "sphinx.domains.cpp": 9, "sphinx.domains.index": 1, "sphinx.domains.javascript": 3, 
"sphinx.domains.math": 2, "sphinx.domains.python": 4, "sphinx.domains.rst": 2, "sphinx.domains.std": 2, "sphinx": 60}, "alltitles": {"\ud83d\udea9 Basic usage": [[0, "basic-usage"]], "Load some data": [[0, "load-some-data"], [1, "load-some-data"]], "Categorical or continuous?": [[0, "categorical-or-continuous"]], "Imbalance metrics": [[0, "imbalance-metrics"], [1, "imbalance-metrics"]], "Outliers": [[0, "outliers"]], "Clipping": [[0, "clipping"], [1, "clipping"]], "Distribution shape": [[0, "distribution-shape"]], "Identical distribution assumption": [[0, "identical-distribution-assumption"]], "Already split out group arrays": [[0, "already-split-out-group-arrays"]], "Independence assumption": [[0, "independence-assumption"], [1, "independence-assumption"]], "Feature importance": [[0, "feature-importance"]], "\ud83d\udea9 Tutorial": [[1, "tutorial"]], "A simple ML workflow": [[1, "a-simple-ml-workflow"]], "A quick look at redflag": [[1, "a-quick-look-at-redflag"]], "Importance": [[1, "importance"]], "Pipelines": [[1, "pipelines"]], "Making your own tests": [[1, "making-your-own-tests"]], "\ud83d\udea9 Using redflag with Pandas": [[2, "using-redflag-with-pandas"]], "Series accessor": [[2, "series-accessor"]], "DataFrame accessor": [[2, "dataframe-accessor"]], "\ud83d\udea9 Using redflag with sklearn": [[3, "using-redflag-with-sklearn"]], "The redflag detector classes": [[3, "the-redflag-detector-classes"]], "Using the pre-built redflag pipeline": [[3, "using-the-pre-built-redflag-pipeline"]], "Using the \u2018detector\u2019 transformers": [[3, "using-the-detector-transformers"]], "The imbalance comparator": [[3, "the-imbalance-comparator"]], "Making your own smoke detector": [[3, "making-your-own-smoke-detector"]], "What to do about the warnings": [[3, "what-to-do-about-the-warnings"]], "ImbalanceDetector and ImbalanceComparator": [[3, "imbalancedetector-and-imbalancecomparator"]], "ClipDetector": [[3, "clipdetector"]], "CorrelationDetector": [[3, 
"correlationdetector"]], "OutlierDetector": [[3, "outlierdetector"]], "DistributionComparator": [[3, "distributioncomparator"]], "ImportanceDetector": [[3, "importancedetector"]], "Authors": [[4, "authors"]], "Changelog": [[5, "changelog"]], "0.4.2, 10 December 2023": [[5, "december-2023"]], "0.4.1, 2 October 2023": [[5, "october-2023"]], "0.4.0, 28 September 2023": [[5, "september-2023"]], "0.3.0, 21 September 2023": [[5, "id1"]], "0.2.0, 4 September 2023": [[5, "id2"]], "0.1.10, 21 November 2022": [[5, "november-2022"]], "0.1.9, 25 August 2022": [[5, "august-2022"]], "0.1.8, 8 July 2022": [[5, "july-2022"]], "0.1.3 to 0.1.7, 9\u201311 February 2022": [[5, "to-0-1-7-911-february-2022"]], "0.1.2, 1 February 2022": [[5, "february-2022"]], "0.1.1, 31 January 2022": [[5, "january-2022"]], "0.1.0, 30 January 2022": [[5, "id3"]], "Contributing": [[6, "contributing"], [7, "contributing"]], "Code of conduct": [[6, "code-of-conduct"]], "Authorship": [[6, "authorship"]], "License": [[6, "license"], [10, "license"]], "Development": [[7, "development"]], "Installation": [[7, "installation"]], "Testing": [[7, "testing"]], "Building the package": [[7, "building-the-package"]], "Building the docs": [[7, "building-the-docs"]], "Continuous integration": [[7, "continuous-integration"]], "Redflag: safer ML by design": [[8, "redflag-safer-ml-by-design"]], "Quick start": [[8, "quick-start"]], "User guide": [[8, "user-guide"], [8, null]], "API reference": [[8, "api-reference"], [8, null]], "Other resources": [[8, "other-resources"], [8, null]], "Indices and tables": [[8, "indices-and-tables"]], "\ud83d\udea9 Installation": [[9, "installation"]], "Optional dependencies": [[9, "optional-dependencies"]], "redflag package": [[11, "redflag-package"]], "Submodules": [[11, "submodules"]], "Module contents": [[11, "module-redflag"]], "redflag.distributions module": [[12, "module-redflag.distributions"]], "redflag.imbalance module": [[13, "module-redflag.imbalance"]], "redflag.importance 
module": [[14, "module-redflag.importance"]], "redflag.independence module": [[15, "module-redflag.independence"]], "redflag.markov module": [[16, "module-redflag.markov"]], "redflag.outliers module": [[17, "module-redflag.outliers"]], "redflag.pandas module": [[18, "module-redflag.pandas"]], "redflag.sklearn module": [[19, "module-redflag.sklearn"]], "redflag.target module": [[20, "module-redflag.target"]], "redflag.utils module": [[21, "module-redflag.utils"]], "\ud83d\udea9 What is redflag?": [[22, "what-is-redflag"]], "Overview": [[22, "overview"]], "Safety by design": [[22, "safety-by-design"]], "What\u2019s in redflag": [[22, "what-s-in-redflag"]]}, "indexentries": {"module": [[11, "module-redflag"], [12, "module-redflag.distributions"], [13, "module-redflag.imbalance"], [14, "module-redflag.importance"], [15, "module-redflag.independence"], [16, "module-redflag.markov"], [17, "module-redflag.outliers"], [18, "module-redflag.pandas"], [19, "module-redflag.sklearn"], [20, "module-redflag.target"], [21, "module-redflag.utils"]], "redflag": [[11, "module-redflag"]], "best_distribution() (in module redflag.distributions)": [[12, "redflag.distributions.best_distribution"]], "bw_scott() (in module redflag.distributions)": [[12, "redflag.distributions.bw_scott"]], "bw_silverman() (in module redflag.distributions)": [[12, "redflag.distributions.bw_silverman"]], "cv_kde() (in module redflag.distributions)": [[12, "redflag.distributions.cv_kde"]], "find_large_peaks() (in module redflag.distributions)": [[12, "redflag.distributions.find_large_peaks"]], "fit_kde() (in module redflag.distributions)": [[12, "redflag.distributions.fit_kde"]], "get_kde() (in module redflag.distributions)": [[12, "redflag.distributions.get_kde"]], "is_multimodal() (in module redflag.distributions)": [[12, "redflag.distributions.is_multimodal"]], "kde_peaks() (in module redflag.distributions)": [[12, "redflag.distributions.kde_peaks"]], "redflag.distributions": [[12, 
"module-redflag.distributions"]], "wasserstein() (in module redflag.distributions)": [[12, "redflag.distributions.wasserstein"]], "wasserstein_multi() (in module redflag.distributions)": [[12, "redflag.distributions.wasserstein_multi"]], "wasserstein_ovo() (in module redflag.distributions)": [[12, "redflag.distributions.wasserstein_ovo"]], "wasserstein_ovr() (in module redflag.distributions)": [[12, "redflag.distributions.wasserstein_ovr"]], "class_counts() (in module redflag.imbalance)": [[13, "redflag.imbalance.class_counts"]], "divergence() (in module redflag.imbalance)": [[13, "redflag.imbalance.divergence"]], "empirical_distribution() (in module redflag.imbalance)": [[13, "redflag.imbalance.empirical_distribution"]], "furthest_distribution() (in module redflag.imbalance)": [[13, "redflag.imbalance.furthest_distribution"]], "imbalance_degree() (in module redflag.imbalance)": [[13, "redflag.imbalance.imbalance_degree"]], "imbalance_ratio() (in module redflag.imbalance)": [[13, "redflag.imbalance.imbalance_ratio"]], "is_imbalanced() (in module redflag.imbalance)": [[13, "redflag.imbalance.is_imbalanced"]], "major_minor() (in module redflag.imbalance)": [[13, "redflag.imbalance.major_minor"]], "minority_classes() (in module redflag.imbalance)": [[13, "redflag.imbalance.minority_classes"]], "redflag.imbalance": [[13, "module-redflag.imbalance"]], "feature_importances() (in module redflag.importance)": [[14, "redflag.importance.feature_importances"]], "least_important_features() (in module redflag.importance)": [[14, "redflag.importance.least_important_features"]], "most_important_features() (in module redflag.importance)": [[14, "redflag.importance.most_important_features"]], "redflag.importance": [[14, "module-redflag.importance"]], "is_correlated() (in module redflag.independence)": [[15, "redflag.independence.is_correlated"]], "redflag.independence": [[15, "module-redflag.independence"]], "markov_chain (class in redflag.markov)": [[16, 
"redflag.markov.Markov_chain"]], "chi_squared() (redflag.markov.markov_chain method)": [[16, "redflag.markov.Markov_chain.chi_squared"]], "degrees_of_freedom (redflag.markov.markov_chain property)": [[16, "redflag.markov.Markov_chain.degrees_of_freedom"]], "expected_freqs (redflag.markov.markov_chain property)": [[16, "redflag.markov.Markov_chain.expected_freqs"]], "from_sequence() (redflag.markov.markov_chain class method)": [[16, "redflag.markov.Markov_chain.from_sequence"]], "generate_states() (redflag.markov.markov_chain method)": [[16, "redflag.markov.Markov_chain.generate_states"]], "hollow_matrix() (in module redflag.markov)": [[16, "redflag.markov.hollow_matrix"]], "normalized_difference (redflag.markov.markov_chain property)": [[16, "redflag.markov.Markov_chain.normalized_difference"]], "observations() (in module redflag.markov)": [[16, "redflag.markov.observations"]], "observed_freqs (redflag.markov.markov_chain property)": [[16, "redflag.markov.Markov_chain.observed_freqs"]], "redflag.markov": [[16, "module-redflag.markov"]], "regularize() (in module redflag.markov)": [[16, "redflag.markov.regularize"]], "expected_outliers() (in module redflag.outliers)": [[17, "redflag.outliers.expected_outliers"]], "get_outliers() (in module redflag.outliers)": [[17, "redflag.outliers.get_outliers"]], "has_outliers() (in module redflag.outliers)": [[17, "redflag.outliers.has_outliers"]], "mahalanobis() (in module redflag.outliers)": [[17, "redflag.outliers.mahalanobis"]], "mahalanobis_outliers() (in module redflag.outliers)": [[17, "redflag.outliers.mahalanobis_outliers"]], "redflag.outliers": [[17, "module-redflag.outliers"]], "dataframeaccessor (class in redflag.pandas)": [[18, "redflag.pandas.DataFrameAccessor"]], "seriesaccessor (class in redflag.pandas)": [[18, "redflag.pandas.SeriesAccessor"]], "correlation_detector() (redflag.pandas.dataframeaccessor method)": [[18, "redflag.pandas.DataFrameAccessor.correlation_detector"]], "dummy_scores() 
(redflag.pandas.seriesaccessor method)": [[18, "redflag.pandas.SeriesAccessor.dummy_scores"]], "feature_importances() (redflag.pandas.dataframeaccessor method)": [[18, "redflag.pandas.DataFrameAccessor.feature_importances"]], "imbalance_degree() (redflag.pandas.seriesaccessor method)": [[18, "redflag.pandas.SeriesAccessor.imbalance_degree"]], "is_imbalanced() (redflag.pandas.seriesaccessor method)": [[18, "redflag.pandas.SeriesAccessor.is_imbalanced"]], "is_ordered() (redflag.pandas.seriesaccessor method)": [[18, "redflag.pandas.SeriesAccessor.is_ordered"]], "minority_classes() (redflag.pandas.seriesaccessor method)": [[18, "redflag.pandas.SeriesAccessor.minority_classes"]], "null_decorator() (in module redflag.pandas)": [[18, "redflag.pandas.null_decorator"]], "redflag.pandas": [[18, "module-redflag.pandas"]], "report() (redflag.pandas.seriesaccessor method)": [[18, "redflag.pandas.SeriesAccessor.report"]], "baseredflagdetector (class in redflag.sklearn)": [[19, "redflag.sklearn.BaseRedflagDetector"]], "clipdetector (class in redflag.sklearn)": [[19, "redflag.sklearn.ClipDetector"]], "correlationdetector (class in redflag.sklearn)": [[19, "redflag.sklearn.CorrelationDetector"]], "detector (class in redflag.sklearn)": [[19, "redflag.sklearn.Detector"]], "distributioncomparator (class in redflag.sklearn)": [[19, "redflag.sklearn.DistributionComparator"]], "dummypredictor (class in redflag.sklearn)": [[19, "redflag.sklearn.DummyPredictor"]], "imbalancecomparator (class in redflag.sklearn)": [[19, "redflag.sklearn.ImbalanceComparator"]], "imbalancedetector (class in redflag.sklearn)": [[19, "redflag.sklearn.ImbalanceDetector"]], "importancedetector (class in redflag.sklearn)": [[19, "redflag.sklearn.ImportanceDetector"]], "insufficientdatadetector (class in redflag.sklearn)": [[19, "redflag.sklearn.InsufficientDataDetector"]], "multimodalitydetector (class in redflag.sklearn)": [[19, "redflag.sklearn.MultimodalityDetector"]], "multivariateoutlierdetector (class in 
redflag.sklearn)": [[19, "redflag.sklearn.MultivariateOutlierDetector"]], "outlierdetector (class in redflag.sklearn)": [[19, "redflag.sklearn.OutlierDetector"]], "rfpipeline (class in redflag.sklearn)": [[19, "redflag.sklearn.RfPipeline"]], "univariateoutlierdetector (class in redflag.sklearn)": [[19, "redflag.sklearn.UnivariateOutlierDetector"]], "fit() (redflag.sklearn.baseredflagdetector method)": [[19, "redflag.sklearn.BaseRedflagDetector.fit"]], "fit() (redflag.sklearn.distributioncomparator method)": [[19, "redflag.sklearn.DistributionComparator.fit"]], "fit() (redflag.sklearn.dummypredictor method)": [[19, "redflag.sklearn.DummyPredictor.fit"]], "fit() (redflag.sklearn.imbalancecomparator method)": [[19, "redflag.sklearn.ImbalanceComparator.fit"]], "fit() (redflag.sklearn.imbalancedetector method)": [[19, "redflag.sklearn.ImbalanceDetector.fit"]], "fit() (redflag.sklearn.importancedetector method)": [[19, "redflag.sklearn.ImportanceDetector.fit"]], "fit() (redflag.sklearn.insufficientdatadetector method)": [[19, "redflag.sklearn.InsufficientDataDetector.fit"]], "fit() (redflag.sklearn.multimodalitydetector method)": [[19, "redflag.sklearn.MultimodalityDetector.fit"]], "fit() (redflag.sklearn.multivariateoutlierdetector method)": [[19, "redflag.sklearn.MultivariateOutlierDetector.fit"]], "fit() (redflag.sklearn.outlierdetector method)": [[19, "redflag.sklearn.OutlierDetector.fit"]], "fit_transform() (redflag.sklearn.baseredflagdetector method)": [[19, "redflag.sklearn.BaseRedflagDetector.fit_transform"]], "fit_transform() (redflag.sklearn.distributioncomparator method)": [[19, "redflag.sklearn.DistributionComparator.fit_transform"]], "fit_transform() (redflag.sklearn.imbalancecomparator method)": [[19, "redflag.sklearn.ImbalanceComparator.fit_transform"]], "fit_transform() (redflag.sklearn.insufficientdatadetector method)": [[19, "redflag.sklearn.InsufficientDataDetector.fit_transform"]], "fit_transform() (redflag.sklearn.multivariateoutlierdetector 
method)": [[19, "redflag.sklearn.MultivariateOutlierDetector.fit_transform"]], "fit_transform() (redflag.sklearn.outlierdetector method)": [[19, "redflag.sklearn.OutlierDetector.fit_transform"]], "formatwarning() (in module redflag.sklearn)": [[19, "redflag.sklearn.formatwarning"]], "make_detector_pipeline() (in module redflag.sklearn)": [[19, "redflag.sklearn.make_detector_pipeline"]], "make_rf_pipeline() (in module redflag.sklearn)": [[19, "redflag.sklearn.make_rf_pipeline"]], "redflag.sklearn": [[19, "module-redflag.sklearn"]], "transform() (redflag.sklearn.baseredflagdetector method)": [[19, "redflag.sklearn.BaseRedflagDetector.transform"]], "transform() (redflag.sklearn.distributioncomparator method)": [[19, "redflag.sklearn.DistributionComparator.transform"]], "transform() (redflag.sklearn.dummypredictor method)": [[19, "redflag.sklearn.DummyPredictor.transform"]], "transform() (redflag.sklearn.imbalancecomparator method)": [[19, "redflag.sklearn.ImbalanceComparator.transform"]], "transform() (redflag.sklearn.imbalancedetector method)": [[19, "redflag.sklearn.ImbalanceDetector.transform"]], "transform() (redflag.sklearn.importancedetector method)": [[19, "redflag.sklearn.ImportanceDetector.transform"]], "transform() (redflag.sklearn.insufficientdatadetector method)": [[19, "redflag.sklearn.InsufficientDataDetector.transform"]], "transform() (redflag.sklearn.multimodalitydetector method)": [[19, "redflag.sklearn.MultimodalityDetector.transform"]], "transform() (redflag.sklearn.multivariateoutlierdetector method)": [[19, "redflag.sklearn.MultivariateOutlierDetector.transform"]], "transform() (redflag.sklearn.outlierdetector method)": [[19, "redflag.sklearn.OutlierDetector.transform"]], "transform() (redflag.sklearn.rfpipeline method)": [[19, "redflag.sklearn.RfPipeline.transform"]], "dummy_classification_scores() (in module redflag.target)": [[20, "redflag.target.dummy_classification_scores"]], "dummy_regression_scores() (in module redflag.target)": [[20, 
"redflag.target.dummy_regression_scores"]], "dummy_scores() (in module redflag.target)": [[20, "redflag.target.dummy_scores"]], "is_binary() (in module redflag.target)": [[20, "redflag.target.is_binary"]], "is_continuous() (in module redflag.target)": [[20, "redflag.target.is_continuous"]], "is_multiclass() (in module redflag.target)": [[20, "redflag.target.is_multiclass"]], "is_multioutput() (in module redflag.target)": [[20, "redflag.target.is_multioutput"]], "is_ordered() (in module redflag.target)": [[20, "redflag.target.is_ordered"]], "n_classes() (in module redflag.target)": [[20, "redflag.target.n_classes"]], "redflag.target": [[20, "module-redflag.target"]], "bool_to_index() (in module redflag.utils)": [[21, "redflag.utils.bool_to_index"]], "clipped() (in module redflag.utils)": [[21, "redflag.utils.clipped"]], "consecutive() (in module redflag.utils)": [[21, "redflag.utils.consecutive"]], "cv() (in module redflag.utils)": [[21, "redflag.utils.cv"]], "deprecated() (in module redflag.utils)": [[21, "redflag.utils.deprecated"]], "docstring_from() (in module redflag.utils)": [[21, "redflag.utils.docstring_from"]], "ecdf() (in module redflag.utils)": [[21, "redflag.utils.ecdf"]], "flatten() (in module redflag.utils)": [[21, "redflag.utils.flatten"]], "generate_data() (in module redflag.utils)": [[21, "redflag.utils.generate_data"]], "get_idx() (in module redflag.utils)": [[21, "redflag.utils.get_idx"]], "has_few_samples() (in module redflag.utils)": [[21, "redflag.utils.has_few_samples"]], "has_flat() (in module redflag.utils)": [[21, "redflag.utils.has_flat"]], "has_monotonic() (in module redflag.utils)": [[21, "redflag.utils.has_monotonic"]], "has_nans() (in module redflag.utils)": [[21, "redflag.utils.has_nans"]], "index_to_bool() (in module redflag.utils)": [[21, "redflag.utils.index_to_bool"]], "is_clipped() (in module redflag.utils)": [[21, "redflag.utils.is_clipped"]], "is_numeric() (in module redflag.utils)": [[21, "redflag.utils.is_numeric"]], 
"is_standard_normal() (in module redflag.utils)": [[21, "redflag.utils.is_standard_normal"]], "is_standardized() (in module redflag.utils)": [[21, "redflag.utils.is_standardized"]], "iter_groups() (in module redflag.utils)": [[21, "redflag.utils.iter_groups"]], "ordered_unique() (in module redflag.utils)": [[21, "redflag.utils.ordered_unique"]], "proportion_to_stdev() (in module redflag.utils)": [[21, "redflag.utils.proportion_to_stdev"]], "redflag.utils": [[21, "module-redflag.utils"]], "split_and_standardize() (in module redflag.utils)": [[21, "redflag.utils.split_and_standardize"]], "stdev_to_proportion() (in module redflag.utils)": [[21, "redflag.utils.stdev_to_proportion"]], "update_p() (in module redflag.utils)": [[21, "redflag.utils.update_p"]], "zscore() (in module redflag.utils)": [[21, "redflag.utils.zscore"]]}}) \ No newline at end of file