[DOC] Update the projects lists #2146

Draft
wants to merge 28 commits into
base: main
67afd52
remove Theta
TonyBagnall Jul 21, 2024
3ac8417
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Aug 1, 2024
d97ff65
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Aug 1, 2024
fc78f2d
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Aug 1, 2024
f33cadc
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Aug 2, 2024
863b48b
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Aug 2, 2024
cb4ecc5
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Aug 5, 2024
347fd25
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Aug 6, 2024
342e3bf
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Aug 8, 2024
e5f41f5
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Aug 12, 2024
b9c1984
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Aug 21, 2024
edff6a6
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Sep 7, 2024
b83f7e7
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Sep 7, 2024
9742327
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Sep 14, 2024
ce763e3
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Sep 17, 2024
fc35148
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Sep 17, 2024
ee31e90
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Sep 18, 2024
d42c6dc
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Sep 24, 2024
e55d296
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Sep 25, 2024
08eb529
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Sep 25, 2024
a561fa3
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Sep 30, 2024
5b597ce
projects
TonyBagnall Oct 2, 2024
1563e11
Merge branch 'main' into ajb/projects
TonyBagnall Oct 5, 2024
872ef40
draft
TonyBagnall Oct 5, 2024
1be97dd
add some papers
TonyBagnall Oct 13, 2024
2949e7d
Merge branch 'main' into ajb/projects
TonyBagnall Oct 31, 2024
c4e26e6
Merge branch 'main' into ajb/projects
TonyBagnall Nov 25, 2024
dc1e5de
Merge branch 'main' into ajb/projects
TonyBagnall Nov 25, 2024
40 changes: 40 additions & 0 deletions docs/completed_projects.md
@@ -0,0 +1,40 @@

[//]: # (Try to put references in harvard style for consistency.)

# aeon projects: completed

This is a list of people who have completed projects with `aeon` or `aeon-neuro`,
either as paid interns, as part of undergraduate or postgraduate projects, or as
volunteers mentored by `aeon` core developers.

## 2024

# Interns

1. Divya ({user}`itsadivya`): Proximity forest 1 and 2.
Divya was a Google Summer of Code student.

Her blog is here
2. Aadya ({user}`MatthewMiddlehurst`): Deep clustering

3. Gabriel ({user}`MatthewMiddlehurst`): EEG classification with aeon-neuro

4. Danieli ({user}`MatthewMiddlehurst`): catch22 and

5. Ivan ({user}`MatthewMiddlehurst`): shapelets

# MSc projects

1. X: Riemannian EEG
2. X: shapelet quality measures

# BSc projects

1. X
2. X
3. X

# Volunteers

1. Frank
2. Adam
112 changes: 23 additions & 89 deletions docs/mentoring.md
@@ -1,17 +1,17 @@

[//]: # (Try to put references in harvard style for consistency.)

# Mentoring and Projects
# aeon projects: ongoing or potential

`aeon` runs a range of short to medium duration projects interacting with the community
and the code
`aeon` runs a range of short to medium duration projects that involve
developing or using aeon and interacting with the community and the code
base. These projects are designed for internships, usage as part of
undergraduate/postgraduate projects at academic institutions, and as options for
programs such as [Google Summer of Code (GSoC)](https://summerofcode.withgoogle.com/).

For those interested in undertaking a project outside these scenarios, we recommend
joining the [Slack](https://join.slack.com/t/aeon-toolkit/shared_invite/zt-22vwvut29-HDpCu~7VBUozyfL_8j3dLA)
and discussing with the project mentors. We aim to run schemes to
and discussing with the community. We aim to run schemes to
help new contributors to become more familiar with `aeon`, time series machine learning
research, and open-source software development.

@@ -20,7 +20,7 @@ majority of them will require some knowledge of machine learning and time series

## Current aeon projects

This is a list of some of the projects we are interested in running in 2024. Feel
This is a list of some of the projects we are interested in running in 2024/25. Feel
free to propose your own project ideas, but please discuss them with us first. We have
an active community of researchers and students who work on `aeon`. Please get in touch
via Slack if you are interested in any of these projects or have any questions.
@@ -33,13 +33,13 @@ to open source. We list projects by time series task

[Classification](#classification)
1. Optimizing the Shapelet Transform for classification and similarity search
2. EEG classification with aeon-neuro (Listed for GSoC 2024)
3. Improved Proximity Forest for classification (listed for GSoC 2024)
2. EEG classification with aeon-neuro
3. Implement TS-CHIEF
4. Improved HIVE-COTE implementation.
5. Compare distance based classification.

[Forecasting](#forecasting)
1. Machine Learning for Time Series Forecasting (listed in GSoC 2024)
1. Machine Learning for Time Series Forecasting
2. Deep Learning for Time Series Forecasting
3. Implement ETS forecasters in aeon

@@ -48,7 +48,7 @@ to open source. We list projects by time series task
2. Deep learning based clustering algorithms

[Anomaly Detection](#anomaly-detection)
1. Anomaly detection with the Matrix Profile and MERLIN
1. Anomaly detection with the Matrix Profile, MERLIN and MADRID
Review comment from @itsdivya1309 (Contributor), Oct 7, 2024:

I think MERLIN anomaly detector is already present in the API and there are no open issues for this.


[Segmentation](#segmentation)
1. Time series segmentation
@@ -58,7 +58,7 @@ to open source. We list projects by time series task
2. Implement channel selection algorithms

[Visualisation](#visualisation)
1. Explainable AI with the shapelet transform (Southampton intern project).
1. Explainable AI with the shapelet transform

[Regression](#regression)
1. Adapt forecasting regressors to time series extrinsic regression.
@@ -147,7 +147,7 @@ transform: A new approach for time series shapelets. In International Conference
Pattern Recognition and Artificial Intelligence (pp. 653-664). Cham: Springer
International Publishing.

#### 2. EEG classification with aeon-neuro (Listed for GSoC 2024)
#### 2. EEG classification with aeon-neuro

Mentors: Tony Bagnall ({user}`TonyBagnall`) and Aiden Rushbrooke

@@ -193,73 +193,7 @@ Time Series Classification of Electroencephalography Data, IWANN 2023.
2. MNE Toolkit, https://mne.tools/stable/index.html
3. The Brain Imaging Data Structure (BIDS) standard, https://bids.neuroimaging.io/

#### 3. Improved Proximity Forest for classification (listed for GSoC 2024)

Mentors: Matthew Middlehurst ({user}`MatthewMiddlehurst`) and Tony Bagnall
({user}`TonyBagnall`)

##### Related Issues
[#159](https://github.com/aeon-toolkit/aeon/issues/159)
[#428](https://github.com/aeon-toolkit/aeon/issues/428)


##### Description

Distance-based classifiers such as k-Nearest Neighbours are popular approaches to time
series classification. They primarily use elastic distance measures such as Dynamic Time
Warping (DTW) to compare two series. The Proximity Forest algorithm [1] is a
distance-based classifier for time series. The classifier creates a forest of decision
trees, where the tree splits are based on the distance between time series using
various distance measures. A recent review of time series classification algorithms [2]
found that Proximity Forest was the most accurate distance-based algorithm of those
compared.
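
As a rough illustration of the distance-based approach described above, the following is a minimal pure-Python sketch of Dynamic Time Warping combined with a 1-nearest-neighbour rule. It is not the Proximity Forest algorithm and it is not `aeon`'s implementation: the function names `dtw_distance` and `predict_1nn` are hypothetical, the series are assumed to be univariate lists of numbers, and the pointwise cost is assumed to be the squared difference with no warping window.

```python
import math


def dtw_distance(a, b):
    """Unconstrained DTW between two univariate series (squared pointwise cost).

    Illustrative helper only; the name and signature are assumptions.
    """
    n, m = len(a), len(b)
    # cost[i][j] = cheapest cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            # extend the cheapest of the three admissible predecessor alignments
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def predict_1nn(train_X, train_y, query):
    """Return the label of the training series closest to `query` under DTW."""
    best = min(range(len(train_X)), key=lambda i: dtw_distance(train_X[i], query))
    return train_y[best]


if __name__ == "__main__":
    train_X = [[0, 0, 1, 0], [1, 1, 1, 1]]
    train_y = ["peak", "flat"]
    print(predict_1nn(train_X, train_y, [0, 1, 0, 0]))
```

Because DTW allows points to be repeated in the alignment, a series with a shifted peak can still match a training series with the same shape, which is the property that elastic distances add over a pointwise Euclidean comparison.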

`aeon` previously had an implementation of the Proximity Forest algorithm, but it was
not as accurate as the original implementation (the one used in the study) and was
unstable on benchmark datasets. The goal of this project is to significantly overhaul
the previous implementation or completely re-implement Proximity Forest in `aeon` to
match the accuracy of the original algorithm. This will involve comparing against the
authors' Java implementation of the algorithm as well as alternate Python versions.
The mentors will provide results for both alternative methods. While knowing
Java is not a requirement for this project, it could be beneficial.

Recently, the group which published the algorithm has proposed a new version of the
Proximity Forest algorithm, Proximity Forest 2.0 [3]. This algorithm is more accurate
than the original Proximity Forest algorithm, and does not currently have an
implementation in `aeon` or elsewhere in Python. If time allows, the project could also
involve implementing and evaluating the Proximity Forest 2.0 algorithm.

##### Project stages

1. Learn about `aeon` best practices, coding standards and testing policies.
2. Study the Proximity Forest algorithm and previous `aeon` implementation.
3. Improve/re-implement the Proximity Forest implementation in `aeon`, with
the aim being to have an implementation that is as accurate as the original algorithm,
while remaining feasible to run.
4. Evaluate the improved implementation against the original `aeon` Proximity Forest
and the authors' Java implementation.
5. If time, implement the Proximity Forest 2.0 algorithm and repeat the above
evaluation.

##### Expected Outcomes

We expect the mentee to engage with the aeon community and produce a high quality
implementation of the Proximity Forest algorithm(s) that gets accepted into the toolkit.

##### References

1. Lucas, B., Shifaz, A., Pelletier, C., O’Neill, L., Zaidi, N., Goethals,
B., Petitjean, F. and Webb, G.I., 2019. Proximity forest: an effective and scalable
distance-based classifier for time series. Data Mining and Knowledge Discovery, 33(3),
pp.607-635.
2. Middlehurst, M., Schäfer, P. and Bagnall, A., 2023. Bake off redux: a review and
experimental evaluation of recent time series classification algorithms. arXiv preprint
arXiv:2304.13029.
3. Herrmann, M., Tan, C.W., Salehi, M. and Webb, G.I., 2023. Proximity Forest 2.0: A
new effective and scalable similarity-based classifier for time series. arXiv
preprint arXiv:2304.05800.

#### 4. Improved HIVE-COTE implementation
#### 3. Improved HIVE-COTE implementation

Mentors: Matthew Middlehurst ({user}`MatthewMiddlehurst`) and Tony Bagnall
({user}`TonyBagnall`)
@@ -302,7 +236,7 @@ alternative structures. This can easily develop into a research project.
experimental evaluation of recent time series classification algorithms. arXiv preprint
arXiv:2304.13029.

#### 5. Compare distance based classification and regression
#### 4. Compare distance based classification and regression

Mentors: Chris Holder ({user}`cholder`) and Tony Bagnall
({user}`TonyBagnall`)
@@ -327,9 +261,9 @@ datasets.

### Forecasting

#### 1. Machine Learning for Time Series Forecasting (listed in GSoC 2024)
#### 1. Machine Learning for Time Series Forecasting

Mentors: Tony Bagnall ({user}`TonyBagnall`) and Matthew Middlehurst (@MatthewMiddlehurst).
Mentors: Tony Bagnall ({user}`TonyBagnall`) and Leo Tsaprounis ({user}`ltsaprounis`).

##### Related Issues
[#265](https://github.com/aeon-toolkit/aeon/issues/265)
@@ -353,7 +287,7 @@ SETAR-Tree [3].

##### Expected Outcomes

1. Contributions to the aeon forecasting module.
1. Contributions to the new experimental aeon forecasting module.
2. Implementation of machine learning forecasting algorithms.
3. Help write up results for a technical report/academic paper (depending on outcomes).

@@ -497,7 +431,7 @@ Published: 07 September 2020 Volume 34, pages 1936–1962, (2020)
### Anomaly detection


#### 1. Anomaly detection with the Matrix Profile and MERLIN
#### 1. Anomaly detection with the Matrix Profile, MERLIN and MADRID

Mentors: Matthew Middlehurst ({user}`MatthewMiddlehurst`)

@@ -553,7 +487,7 @@ mining. Journal of Open Source Software, 4(39), p.1504.

#### 1. Time series segmentation

Mentors: Tony Bagnall ({user}`TonyBagnall`) and TBC
Mentors: Tony Bagnall ({user}`TonyBagnall`)

##### Description

@@ -723,7 +657,7 @@ Series Classification. AALTD, ECML-PKDD, Springer, 2021

### Visualisation

#### 1. Explainable AI with the shapelet transform (Southampton intern project).
#### 1. Explainable AI with the shapelet transform.

Mentors: Tony Bagnall ({user}`TonyBagnall`) and David Guijo-Rubio
({user}`dguijo`)
@@ -741,13 +675,13 @@ source toolkits, familiarisation with the shapelet code and the development of a
visualisation tool to help relate shapelets back to the training data. An outline
for the project is

Weeks 1-2: Familiarisation with open source, aeon and the visualisation module. Make
1. Familiarisation with open source, aeon and the visualisation module. Make
contribution for a good first issue.
Weeks 3-4: Understand the shapelet transform algorithm, engage in ongoing discussions
2. Understand the shapelet transform algorithm, engage in ongoing discussions
for possible improvements, run experiments to create predictive models for a test data set
Weeks 5-6: Design and prototype visualisation tools for shapelets, involving a range
3. Design and prototype visualisation tools for shapelets, involving a range
of summary measures and visualisation techniques, including plotting shapelets on training data, calculating frequency, measuring similarity between
Weeks 7-8: Debug, document and make PRs to merge contributions into the aeon toolkit.
4. Debug, document and make PRs to merge contributions into the aeon toolkit.

[1] Bagnall, A., Lines, J., Bostrom, A., Large, J. and Keogh, E. The great time series classification bake off: a review and experimental evaluation of recent algorithmic advances. Data Mining and Knowledge Discovery, Volume 31, pages 606–660, (2017)
[2] Ye, L., Keogh, E. Time series shapelets: a novel technique that allows accurate, interpretable and fast classification. Data Min Knowl Disc 22, 149–182 (2011). https://doi.org/10.1007/s10618-010-0179-5
29 changes: 20 additions & 9 deletions docs/papers_using_aeon.md
@@ -14,9 +14,17 @@ the paper and a link to the code in your personal GitHub or other repository.

## Classification

- Middlehurst, M. and Schäfer, P. and Bagnall, A. (2024). Bake off redux: a review
- Dempster, A., Tan, C. W., Miller, L., Foumani, N., Schmidt, D. and Webb, G. (2024).
Highly Scalable Time Series Classification for Very Large Datasets, ECML/PKDD Workshop on Advanced Analytics and Learning on Temporal Data. [Paper](https://ecml-aaltd.github.io/aaltd2024/articles/Dempster_AALTD24.pdf)
- Dempster, A., Schmidt, D. and Webb, G. (2024). QUANT: a minimalist interval method
for time series classification, Data Mining and Knowledge Discovery, Volume 38,
pages 2377–2402. [Paper](https://link.springer.com/article/10.1007/s10618-024-01036-9)
- Serramazza, D., Nguyen, T. and Ifrim, G. (2024). Improving the Evaluation and
Actionability of Explanation Methods for Multivariate Time Series Classification.
Proc. ECML/PKDD [ArXiV](https://arxiv.org/abs/2406.12507)
- Middlehurst, M., Schäfer, P. and Bagnall, A. (2024). Bake off redux: a review
and experimental evaluation of recent time series classification algorithms.
Data Mining and Knowledge Discovery, online first, open access.
Data Mining and Knowledge Discovery, Volume 38, pages 1958–2031.
[Paper](https://link.springer.com/article/10.1007/s10618-024-01022-1) [Webpage/Code](https://tsml-eval.readthedocs.io/en/stable/publications/2023/tsc_bakeoff/tsc_bakeoff_2023.html)
- Spinnato, F. and Guidotti, R. and Monreale, A. and Nanni, M. (2024). Fast, Interpretable,
and Deterministic Time Series Classification With a Bag-of-Receptive-Fields.
@@ -39,6 +47,12 @@ the paper and a link to the code in your personal GitHub or other repository.
Learning on Temporal Data (pp. 39-55).
[Paper](https://link.springer.com/chapter/10.1007/978-3-031-49896-1_4)

## Ordinal classification

- Ayllón-Gavilán, R., Guijo-Rubio, D., Gutiérrez, P.A., Bagnall, A., and Hervás-Martínez, C. Convolutional and Deep Learning based techniques for Time Series Ordinal Classification. [ArXiV](https://arxiv.org/abs/2306.10084).
- Ayllón-Gavilán, R., Guijo-Rubio, D., Gutiérrez, P. A., and Hervás-Martínez, C. (2024). O-Hydra: A Hybrid Convolutional and Dictionary-Based Approach to Time Series Ordinal Classification. In Conference of the Spanish Association for Artificial Intelligence (pp. 50-60). [Paper](https://link.springer.com/chapter/10.1007/978-3-031-62799-6_6).
- Ayllón-Gavilán, R., Guijo-Rubio, D., Gutiérrez, P.A., and Hervás-Martínez, C. (2023). A Dictionary-Based Approach to Time Series Ordinal Classification. In: Rojas, I., Joya, G., Catala, A. (eds) Advances in Computational Intelligence. IWANN 2023. Lecture Notes in Computer Science, vol 14135. [Paper](https://link.springer.com/chapter/10.1007/978-3-031-43078-7_44).

## Regression

- Guijo-Rubio, D., Middlehurst, M., Arcencio, G., Silva, D. and Bagnall, A. (2024).
@@ -51,17 +65,14 @@ the paper and a link to the code in your personal GitHub or other repository.
(pp. 113-126).
[Paper](https://link.springer.com/chapter/10.1007/978-3-031-49896-1_8) [Webpage/Code](https://tsml-eval.readthedocs.io/en/stable/publications/2023/rist_pipeline/rist_pipeline.html)

## Ordinal classification

- Ayllón-Gavilán, R., Guijo-Rubio, D., Gutiérrez, P.A., Bagnall, A., and Hervás-Martínez, C. Convolutional and Deep Learning based techniques for Time Series Ordinal Classification. [ArXiV](https://arxiv.org/abs/2306.10084).
- Ayllón-Gavilán, R., Guijo-Rubio, D., Gutiérrez, P. A., and Hervás-Martínez, C. (2024). O-Hydra: A Hybrid Convolutional and Dictionary-Based Approach to Time Series Ordinal Classification. In Conference of the Spanish Association for Artificial Intelligence (pp. 50-60). [Paper](https://link.springer.com/chapter/10.1007/978-3-031-62799-6_6).
- Ayllón-Gavilán, R., Guijo-Rubio, D., Gutiérrez, P.A., and Hervás-Martínez, C. (2023). A Dictionary-Based Approach to Time Series Ordinal Classification. In: Rojas, I., Joya, G., Catala, A. (eds) Advances in Computational Intelligence. IWANN 2023. Lecture Notes in Computer Science, vol 14135. [Paper](https://link.springer.com/chapter/10.1007/978-3-031-43078-7_44).

## Prototyping

- Ismail-Fawaz, A. and Ismail Fawaz, H. and Petitjean, F. and Devanne, M. and Weber,
J. and Berretti, S. and Webb, GI. and Forestier, G. (2023 December "ShapeDBA: Generating Effective Time Series Prototypes Using ShapeDTW Barycenter Averaging." ECML/PKDD Workshop on Advanced Analytics and Learning on Temporal Data. [Paper](https://doi.org/10.1007/978-3-031-49896-1_9) [code](https://github.com/MSD-IRIMAS/ShapeDBA)
J. and Berretti, S. and Webb, GI. and Forestier, G. (2023) ShapeDBA: Generating
Effective Time Series Prototypes Using ShapeDTW Barycenter Averaging. ECML/PKDD
Workshop on Advanced Analytics and Learning on Temporal Data. [Paper](https://doi.org/10.1007/978-3-031-49896-1_9) [code](https://github.com/MSD-IRIMAS/ShapeDBA)
- Holder, C., Guijo-Rubio, D., & Bagnall, A. J. (2023). Barycentre Averaging for the Move-Split-Merge Time Series Distance Measure. In Proceedings of the 15th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, Volume 1, pp. 51-62. [Paper](https://www.scitepress.org/Link.aspx?doi=10.5220/0012164900003598)
[Paper](https://www.scitepress.org/Link.aspx?doi=10.5220/0012164900003598)

## Generation Evaluation
