[ENH] Tidy up the rocket transformers #1699

Open
4 of 10 tasks
TonyBagnall opened this issue Jun 18, 2024 · 4 comments
Labels: enhancement (new feature, improvement request or other non-bug code enhancement), transformations (Transformations package)

Comments


TonyBagnall commented Jun 18, 2024

Describe the feature or idea you want to propose

It's time to tidy up the convolutional transformers. I will collate all issues here and make tasks for smaller PRs. Replaces #208.

To Do

Done


TonyBagnall commented Jun 29, 2024

Timing results are as expected, and comparable to other implementations. This benchmark transforms univariate collections of 100 series of different lengths: series length on the x axis, time in seconds on the y axis.

[figure: transform time (seconds) against series length for a univariate collection of 100 series]
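For reference, a minimal sketch of this kind of benchmark, assuming aeon's Rocket transformer and the make_example_3d_numpy generator used later in this issue; the length grid and default parameters here are illustrative, not the exact setup behind the figure:

import time

from aeon.testing.data_generation import make_example_3d_numpy
from aeon.transformations.collection.convolution_based import Rocket

# Warm-up call so numba compilation is excluded from the timings
Rocket(random_state=0).fit_transform(
    make_example_3d_numpy(n_cases=10, n_channels=1, n_timepoints=50, return_y=False)
)

# Collection of 100 univariate series; vary the series length and time fit_transform
for n_timepoints in range(100, 1100, 100):
    X = make_example_3d_numpy(
        n_cases=100, n_channels=1, n_timepoints=n_timepoints, return_y=False
    )
    rocket = Rocket(random_state=0)
    start = time.perf_counter()
    rocket.fit_transform(X)
    print(n_timepoints, ",", round(time.perf_counter() - start, 3))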

TonyBagnall commented

After #1781, the next issue is to allow variable-length series. The #1746 PR does this by taking the numba call inside the loop, but that slows it down unacceptably. An alternative is not to give types in the njit arguments; surprisingly, Rocket already omits them:

@njit(fastmath=True, cache=True)
def _apply_kernel_univariate(X, weights, length, bias, dilation, padding):
    n_timepoints = len(X)

    output_length = (n_timepoints + (2 * padding)) - ((length - 1) * dilation)

whereas MiniRocket specifies the full signature:

@njit(
    "float32[:,:](float32[:,:],Tuple((int32[:],int32[:],int32[:],int32[:],float32["
    ":])), int32[:,:])",
    fastmath=True,
    parallel=True,
    cache=True,
)
def _static_transform_uni(X, parameters, indices):

Without the annotations it is much easier to handle multivariate and variable-length input, but crucially we need to assess any performance hit. So, some timing experiments:

  1. Run MiniRocket both with and without the annotation and time it, passing 3D numpy.
  2. Run Rocket both with and without the annotation and time it (a possible speed up for main).

Based on these results, either remove the type annotations and adapt to lists of numpy arrays, or add them and use split functions. A minimal sketch of the with/without comparison is given below.
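As a reference for the style of experiment, a minimal sketch (not aeon code; the kernel and its names are made up) comparing the same njit function compiled with an explicit signature against one with lazily inferred types:

import time

import numpy as np
from numba import njit

# Same kernel twice: once with an explicit signature, once with inferred types
@njit("float32[:](float32[:,:])", fastmath=True, cache=True)
def _row_means_typed(X):
    out = np.zeros(X.shape[0], dtype=np.float32)
    for i in range(X.shape[0]):
        out[i] = X[i].mean()
    return out

@njit(fastmath=True, cache=True)
def _row_means_lazy(X):
    out = np.zeros(X.shape[0], dtype=np.float32)
    for i in range(X.shape[0]):
        out[i] = X[i].mean()
    return out

X = np.random.random(size=(20000, 500)).astype(np.float32)
_row_means_typed(X)  # warm up: compile both variants before timing
_row_means_lazy(X)

for name, f in (("typed", _row_means_typed), ("lazy", _row_means_lazy)):
    start = time.perf_counter()
    f(X)
    print(name, time.perf_counter() - start)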


TonyBagnall commented Jul 14, 2024

Code to test 1 and 2, including timing direct calls to the numba functions:

import time

import numpy as np
from aeon.testing.data_generation import make_example_3d_numpy
from aeon.transformations.collection.convolution_based import MiniRocket
from aeon.transformations.collection.convolution_based._minirocket import (
    _static_fit,
    _static_transform_multi,
    _static_transform_uni,
)


def timing_experiment_n_cases_main():
    # Small fit_transform calls to build the numba functions before timing
    X = np.random.random(size=(10, 1, 100)).astype(np.float32)
    r = MiniRocket(random_state=0)
    r.fit_transform(X)
    r2 = MiniRocket(random_state=0)
    r2.fit_transform(X)
    r3 = BadPlaceMiniRocket(random_state=0)  # local experimental variants (not in aeon)
    r4 = BadPlaceMiniRocketMultivariate(random_state=0)
    r3.fit_transform(X)
    r4.fit_transform(X)

    for i in range(500, 31000, 500):
        X1 = make_example_3d_numpy(
            n_cases=i, n_channels=1, n_timepoints=500, return_y=False
        ).astype(np.float32)
        X2 = make_example_3d_numpy(
            n_cases=i, n_channels=5, n_timepoints=100, return_y=False
        ).astype(np.float32)

        # Full fit_transform, univariate and multivariate
        start = time.time()
        r.fit_transform(X1)
        t1 = time.time() - start
        start = time.time()
        r.fit_transform(X2)
        t2 = time.time() - start

        # Direct calls to the numba functions
        start = time.time()
        p1 = _static_fit(X1)
        X_ = X1.squeeze(1)
        _static_transform_uni(X_, p1, MiniRocket._indices)
        t3 = time.time() - start
        start = time.time()
        p1 = _static_fit(X2)
        _static_transform_multi(X2, p1, MiniRocket._indices)
        t4 = time.time() - start

        # _fit_transform, bypassing the base class wrapper
        start = time.time()
        r2._fit_transform(X1)
        t5 = time.time() - start
        start = time.time()
        r2._fit_transform(X2)
        t6 = time.time() - start

        # Local variants
        start = time.time()
        r3.fit_transform(X1)
        t7 = time.time() - start
        start = time.time()
        r4.fit_transform(X2)
        t8 = time.time() - start

        print(i, ",", t1, ",", t2, ",", t3, ",", t4, ",", t5, ",", t6, ",", t7, ",", t8)


TonyBagnall commented Jul 14, 2024

So, if anything, adding the type annotations makes it a bit slower.

n_cases  With types (s)  Without types (s)  Diff (s)
500      0.49            0.48               -0.02
1000     0.88            0.86               -0.02
1500     1.30            1.28               -0.03
2000     1.74            1.64               -0.10
2500     2.02            2.00               -0.02
3000     2.46            2.40               -0.06
3500     2.81            2.77               -0.04
4000     3.19            3.23               +0.04
4500     3.66            3.56               -0.10
5000     4.08            4.06               -0.02
5500     4.57            4.43               -0.14
6000     4.75            4.73               -0.02
6500     5.17            5.29               +0.13
7000     5.80            5.64               -0.16
7500     6.14            5.86               -0.27
8000     6.51            6.26               -0.25
8500     6.85            6.74               -0.11
9000     7.33            7.03               -0.30
9500     7.55            7.45               -0.10
10000    7.92            7.78               -0.15
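Given that, the "lists of numpy" route mentioned above looks viable for variable length. A minimal sketch (hypothetical function names, not aeon code) of an untyped njit kernel over a numba typed List of unequal-length univariate series:

import numpy as np
from numba import njit
from numba.typed import List

# No signature on the decorator, so numba infers types from the typed List
@njit(fastmath=True, cache=True)
def _series_maxima(series_list):
    out = np.zeros(len(series_list), dtype=np.float32)
    for i in range(len(series_list)):
        out[i] = series_list[i].max()
    return out

rng = np.random.default_rng(0)
X = List()
for length in (80, 100, 120):  # unequal-length series
    X.append(rng.random(length).astype(np.float32))
print(_series_maxima(X))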
