[ENH] Implement Proximity Forest classifier #1729

itsdivya1309 · 2024-06-27T16:09:49Z

Reference Issues/PRs

Closes #159

What does this implement/fix? Explain your changes.

Implementation of the Proximity Forest algorithm using Proximity Trees.

aeon-actions-bot · 2024-06-27T16:10:18Z

Thank you for contributing to `aeon`

I have added the following labels to this PR based on the title: [ enhancement ].
I have added the following labels to this PR based on the changes made: [ classification ]. Feel free to change these if they do not properly represent the PR.

The Checks tab will show the status of our automated tests. You can click on individual test runs in the tab or "Details" in the panel below to see more information if there is a failure.

If our pre-commit code quality check fails, any trivial fixes will automatically be pushed to your PR unless it is a draft.

Don't hesitate to ask questions on the aeon Slack channel if you have any.

PR CI actions

These checkboxes will add labels to enable/disable CI functionality for this PR. This may not take effect immediately, and a new commit may be required to run the new configuration.

- Run pre-commit checks for all files
- Run all pytest tests and configurations
- Run all notebook example tests
- Run numba-disabled codecov tests
- Stop automatic pre-commit fixes (always disabled for drafts)

MatthewMiddlehurst

Great work, we'll have to give it a run through on the UCR archive datasets as we discussed. Next steps are up to you really, we can discuss on Slack.

aeon/classification/distance_based/_proximity_forest.py

MatthewMiddlehurst · 2024-07-04T14:56:13Z

This needs to be included in the API documentation also.

baraline · 2024-07-07T21:51:36Z

aeon/classification/distance_based/_proximity_forest.py

+    def _fit_tree(self, X, y):
+        clf = ProximityTree(
+            n_splitters=self.n_splitters,
+            max_depth=self.max_depth,
+            min_samples_split=self.min_samples_split,
+            random_state=self.random_state,
+            n_jobs=self.n_jobs,
+        )
+        clf.fit(X, y)
+        return clf


A similar comment applies to predict. I think it would be better to define the function you parallelize with joblib outside of the object you call it from. If I remember right, joblib pickles the objects you parallelize, so this might create a copy of the ProximityForest object every time you call _fit_tree.

To avoid that, you would define _fit_tree as a function outside ProximityForest.
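A minimal sketch of that suggestion, using a toy stand-in rather than the real classifier. `ToyTree`, `ToyForest`, and `_fit_one_tree` are hypothetical names made up for illustration and are not aeon's implementation; only `joblib.Parallel` and `joblib.delayed` are real API. The helper lives at module level and receives only the arguments it needs, so a process-based backend would serialize those arguments rather than the whole forest.

```python
from collections import Counter

from joblib import Parallel, delayed


class ToyTree:
    """Hypothetical stand-in for ProximityTree: predicts the majority class."""

    def __init__(self, random_state=None):
        self.random_state = random_state

    def fit(self, X, y):
        self.majority_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.majority_ for _ in X]


def _fit_one_tree(X, y, random_state):
    """Module-level helper: takes plain arguments, no reference to the forest."""
    return ToyTree(random_state=random_state).fit(X, y)


class ToyForest:
    """Hypothetical forest that delegates tree fitting to the helper above."""

    def __init__(self, n_trees=3, n_jobs=1, random_state=0):
        self.n_trees = n_trees
        self.n_jobs = n_jobs
        self.random_state = random_state

    def fit(self, X, y):
        # Each task gets (X, y, seed); `self` is not captured by the callable.
        self.trees_ = Parallel(n_jobs=self.n_jobs, prefer="threads")(
            delayed(_fit_one_tree)(X, y, self.random_state + i)
            for i in range(self.n_trees)
        )
        return self


X = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
y = ["a", "a", "b"]
forest = ToyForest(n_trees=3, n_jobs=2).fit(X, y)
```

Giving each tree its own seed is an assumption of this sketch, not something taken from the PR.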

itsdivya1309

Thanks for pointing this out.

MatthewMiddlehurst

Is this true? I think we have functions elsewhere that do this. It would be interesting to see whether that needs to be changed.
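One way to check the claim without running joblib at all is to look at how Python itself would serialize a bound method. The sketch below uses only the standard library; `ToyForest` and `fit_tree` are hypothetical names for illustration. It shows that a bound method keeps a reference to its instance and that its pickle recipe goes through that instance. This only matters for backends that serialize tasks to send them to worker processes; a thread-based backend shares memory, so as far as I understand no copy would be made there.

```python
class ToyForest:
    """Hypothetical stand-in for an estimator with a private fit helper."""

    def __init__(self, n_splitters=5):
        self.n_splitters = n_splitters

    def _fit_tree(self, X, y):
        return (self.n_splitters, len(X))


def fit_tree(X, y, n_splitters):
    """Module-level equivalent that takes its parameters explicitly."""
    return (n_splitters, len(X))


forest = ToyForest()

# A bound method carries a reference to its instance.
bound = forest._fit_tree
holds_instance = bound.__self__ is forest

# Its pickle recipe is "call getattr(instance, name)", so the instance itself
# is part of whatever gets serialized along with the method.
reducer, reducer_args = bound.__reduce__()
reduces_via_instance = reducer is getattr and reducer_args[0] is forest

# A plain function has no __self__, so there is no instance to drag along.
function_has_instance = hasattr(fit_tree, "__self__")
```

So the concern seems valid in principle for process-based backends, while the cost in practice would depend on how large the estimator's state is.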

baraline

Other than this testing issue, the rest LGTM!

aeon/classification/distance_based/tests/test_proximity_forest.py

MatthewMiddlehurst

You should investigate how we use joblib in other estimators in the classification module. Typically we use "threads" as the default backend, with a parameter to change that.

The n_jobs docstring needs updating.

baraline

Only this small parameter is missing, then it should be good to go!

baraline · 2024-07-14T10:25:39Z

aeon/classification/distance_based/_proximity_forest.py

+        )
+
+    def _predict_proba(self, X):
+        output_probas = Parallel(n_jobs=self._n_jobs, prefer="threads")(


We discussed the need for a parameter for the joblib backend (i.e. threads vs. processes). You should add a class parameter that defaults to threads.

It might be better to use the backend parameter instead of prefer (see docs) to have more fine-grained control over the chosen backend.
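A rough sketch of what such a parameter could look like, assuming a toy ensemble rather than the real classifier. The parameter name `parallel_backend`, the class `ToyEnsemble`, and the helper `_predict_one` are hypothetical and chosen only for illustration. According to the joblib documentation, `backend=` names the backend explicitly, whereas `prefer="threads"` is a soft hint that can be overridden from outside by a joblib context manager.

```python
from joblib import Parallel, delayed


def _predict_one(offset, X):
    """Module-level helper: a toy 'member' that shifts every value by offset."""
    return [x + offset for x in X]


class ToyEnsemble:
    """Hypothetical ensemble exposing the joblib backend as a parameter."""

    def __init__(self, offsets=(0, 1, 2), n_jobs=1, parallel_backend="threading"):
        self.offsets = offsets
        self.n_jobs = n_jobs
        # Hypothetical parameter name; defaults to joblib's thread backend.
        self.parallel_backend = parallel_backend

    def predict_all(self, X):
        # backend= selects the backend explicitly, unlike the prefer= hint.
        return Parallel(n_jobs=self.n_jobs, backend=self.parallel_backend)(
            delayed(_predict_one)(offset, X) for offset in self.offsets
        )


X = [10, 20]
threaded = ToyEnsemble(n_jobs=2).predict_all(X)
single = ToyEnsemble(n_jobs=1).predict_all(X)
```

Under this design a caller who wants processes could pass `parallel_backend="loky"` instead, at the price of serializing each task's arguments.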

itsdivya1309 and others added 2 commits June 26, 2024 14:52

Proximity Forest draft

dc8bd1d

Merge branch 'aeon-toolkit:main' into proximityForest

d4ce3e3

aeon-actions-bot bot added the classification and enhancement labels Jun 27, 2024

itsdivya1309 and others added 8 commits June 28, 2024 15:57

Update init

7406dff

Merge branch 'aeon-toolkit:main' into proximityForest

7ed886f

Tests for forest

d063b99

Docstring

81f86ee

Fix initialization error

1ab94aa

Merge branch 'main' into proximityForest

3359fe3

Update tags

80f1ca8

Fix tests

db136c6

itsdivya1309 marked this pull request as ready for review July 3, 2024 04:58

itsdivya1309 requested review from MatthewMiddlehurst and TonyBagnall as code owners July 3, 2024 04:58

MatthewMiddlehurst reviewed Jul 4, 2024

aeon/classification/distance_based/_proximity_forest.py (outdated)

aeon/classification/distance_based/_proximity_forest.py (outdated)

itsdivya1309 and others added 3 commits July 5, 2024 16:01

Review comments resolved

4631e8d

Review comments resolved

b7d0461

Merge branch 'main' into proximityForest

59a8175

itsdivya1309 requested a review from MatthewMiddlehurst July 5, 2024 11:19

Parallelization using joblib

8905461

baraline reviewed Jul 7, 2024


itsdivya1309 and others added 2 commits July 8, 2024 19:53

Merge branch 'aeon-toolkit:main' into proximityForest

0294311

pickling objects

2d74d4d

baraline requested changes Jul 8, 2024

aeon/classification/distance_based/tests/test_proximity_forest.py (outdated)

MatthewMiddlehurst reviewed Jul 9, 2024


itsdivya1309 and others added 2 commits July 11, 2024 17:31

Merge branch 'aeon-toolkit:main' into proximityForest

7953c00

Parallel threading

b7505ad

itsdivya1309 and others added 2 commits July 11, 2024 22:51

Using unit test dataset

c853e55

Merge branch 'main' into proximityForest

8959efa

itsdivya1309 requested review from baraline and MatthewMiddlehurst July 12, 2024 04:30

Merge branch 'main' into proximityForest

1cb74a4

baraline requested changes Jul 14, 2024
