Onset length as per the onset frames paper #6

Open
falaktheoptimist opened this issue Oct 20, 2019 · 1 comment

Comments

@falaktheoptimist
Contributor

First, thank you so much for your very useful implementation of the Onsets and Frames model in PyTorch. It has been valuable both for understanding the paper and for our project. I was wondering about the length of the onset in the labels, which is 1 frame in this implementation:

label[left:onset_right, f] = 3

However, the Onsets and Frames paper mentions that:

We performed a coarse hyperparameter search over onset length (we tried 16, 32 and 48ms) and found that 32ms worked best. In hindsight this is not surprising as it is also the length of our frames and so almost all onsets will end up spanning exactly two frames.

In that case, would making the onset 2 frames long help (either by changing this line or by doubling the ONSET_LENGTH constant)? I was also curious because the model took about 6k steps on the MAESTRO dataset before it started predicting onsets (not surprising, since onsets are sparse across samples); before that it predicted only frames and no onsets. I'd like to know your take on these values.

Thanks.

@jongwook
Owner

jongwook commented Nov 4, 2019

I didn't fully understand that part of their paper, since their hop length is also 32 ms (hop length 512 / sample rate 16000 Hz). Maybe they used a 16 ms hop at first and didn't update that part of the paper after they switched to a 32 ms hop.
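To make the arithmetic concrete, here is a minimal sketch of how an onset length in milliseconds maps to a number of whole hops. The 16000 Hz rate, the 512-sample hop, and the 16/32/48 ms candidates come from the discussion above; the helper names `hop_ms` and `onset_frames`, and the rule of flooring to whole hops with a minimum of one, are assumptions made for illustration and are not necessarily how this repository or the paper computes it.

```python
SAMPLE_RATE = 16000  # samples per second, as quoted above
HOP_LENGTH = 512     # samples between consecutive frames, as quoted above


def hop_ms(sample_rate=SAMPLE_RATE, hop_length=HOP_LENGTH):
    """Duration of one hop in milliseconds."""
    return 1000 * hop_length / sample_rate


def onset_frames(onset_ms, sample_rate=SAMPLE_RATE, hop_length=HOP_LENGTH):
    """Whole hops covered by an onset of `onset_ms` (assumed minimum of 1)."""
    onset_samples = sample_rate * onset_ms // 1000
    return max(1, onset_samples // hop_length)


print(hop_ms())  # 32.0
for ms in (16, 32, 48, 64):
    print(ms, "ms ->", onset_frames(ms), "frame(s)")
```

Under these assumptions a 32 ms onset covers one hop and a 64 ms onset (that is, a doubled length) covers two. The sketch only counts whole hops; it does not model how an onset that starts mid-frame could straddle two frames.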

As for what the 3 means in the quoted code, it's just a value that marks the onset in a combined byte array, which saves some runtime memory. The actual onset/offset/frame data is decoded from it during training.
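The idea of a combined label array can be sketched as follows. Only the value 3 for the onset is confirmed above; the values 2 (sustained frame) and 1 (offset), the helper names `encode_note` and `decode`, and the use of a plain `bytearray` for a single key are assumptions chosen to keep the example self-contained.

```python
ONSET, FRAME, OFFSET = 3, 2, 1  # only ONSET = 3 is confirmed above; 2 and 1 are assumed


def encode_note(n_steps, left, onset_right, frame_right, offset_right):
    """Pack one note on one key into a single byte per time step."""
    label = bytearray(n_steps)  # zero-filled: 0 means the key is silent
    for t in range(left, min(onset_right, n_steps)):
        label[t] = ONSET
    for t in range(onset_right, min(frame_right, n_steps)):
        label[t] = FRAME
    for t in range(frame_right, min(offset_right, n_steps)):
        label[t] = OFFSET
    return label


def decode(label):
    """Recover separate binary onset/offset/frame targets from the packed codes."""
    onset = [int(v == ONSET) for v in label]
    offset = [int(v == OFFSET) for v in label]
    frame = [int(v in (ONSET, FRAME)) for v in label]  # key held during onset and sustain
    return onset, offset, frame


label = encode_note(n_steps=8, left=1, onset_right=2, frame_right=5, offset_right=6)
onset, offset, frame = decode(label)
print(list(label))  # [0, 3, 2, 2, 2, 1, 0, 0]
print(onset)        # [0, 1, 0, 0, 0, 0, 0, 0]
print(frame)        # [0, 1, 1, 1, 1, 0, 0, 0]
```

Storing one small integer per step instead of three separate float arrays is what saves memory; the three targets are only materialized when a batch is needed.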
