Explanation of frame_sizes and ns_frame_samples #17
Comments
It's been a long time, but if you have found answers to this I'll gladly take them. I'm also confused.
I remember getting an intuition of how that worked, but I can't remember exactly what it was. I think that the actual frame size was the product between the two vectors, or something like that... Sorry I can't be of much help :(
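The "product" intuition above can be sketched as a cumulative product. This is only an illustration under an assumption not confirmed in this thread: that `frame_sizes` lists, lowest RNN tier first, how many lower-tier frames (raw samples for the lowest tier) make up one frame. The helper name `samples_per_frame` is hypothetical and not part of the repository.

```python
from itertools import accumulate
from operator import mul


def samples_per_frame(frame_sizes):
    """Hypothetical helper: number of raw samples covered by one frame per tier.

    Assumption: frame_sizes is ordered lowest RNN tier first, and each entry
    counts lower-tier frames (or raw samples for the lowest tier).
    """
    # Running product: each tier spans its own size times everything below it.
    return list(accumulate(frame_sizes, mul))


print(samples_per_frame([16, 4]))  # [16, 64]
```

Under that assumption, `frame_sizes = [16, 4]` would mean a lowest RNN tier that sees 16 samples per frame and a tier above it that sees 4 of those frames, i.e. 64 samples.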
Hello, it's been an even longer time, but I think I'm starting to understand it (I had to read the helper description at least 50 times...). The helper description says:

So I think you have to give the number of samples you want in a given frame as a function of all the other frames from the lower tiers.

Ex: From the paper "HIGH-QUALITY SPEECH CODING WITH SAMPLE RNN" I need the following frame sizes: FS(1) = FS(2) = 2, FS(3) = 16 and FS(4) = 160. Intuitively I would put as argument:

But from what I understand I would need to put [as argument]:
- 2 because it's the lowest tier

That would explain the use of

However, I might just not understand it either. I don't know why they would do it that way, because if so it is really confusing (at least for me).

Hope it helped. If you have more info, please do correct me.
Tks
-bert
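The conversion described in this comment, from frame lengths in samples to per-tier arguments, might look like the sketch below. It assumes that each argument is the ratio between a tier's length in samples and the length of the tier below it, lowest tier first. Both that reading and the helper name `ratios_from_sample_counts` are assumptions, not something the maintainers confirm here. Only the frame-level tiers FS(2), FS(3) and FS(4) are passed in, on the further assumption that the sample-level tier shares the lowest frame size, as FS(1) = FS(2) suggests.

```python
def ratios_from_sample_counts(sample_counts):
    """Hypothetical helper: convert frame lengths in samples to tier ratios.

    Assumption: sample_counts is ordered lowest tier first, and every tier's
    length is an exact multiple of the length of the tier below it.
    """
    ratios = []
    previous = 1  # one raw sample is the unit below the lowest tier
    for count in sample_counts:
        if count % previous != 0:
            raise ValueError(f"{count} is not a multiple of {previous}")
        ratios.append(count // previous)
        previous = count
    return ratios


# FS(2) = 2, FS(3) = 16, FS(4) = 160 from the comment above
print(ratios_from_sample_counts([2, 16, 160]))  # [2, 8, 10]
```

Taking the cumulative product of the result gives back the original lengths in samples, so the two representations carry the same information.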
Hello,

can you please explain the purpose of `frame_sizes` and `ns_frame_samples` in the `SampleRNN` constructor?

I get the meaning of `frame_sizes` from the paper. However, there's something strange (at least to me): in the paper, especially in the main figure, it seems that the frame size at tier 3 is 16 and the frame size at tier 2 is 4. In the code, you use the same values (`frame_sizes = [16, 4]`), however it seems that the order is reversed, because in `Predictor`'s `forward()` you scan the RNNs in `reversed` order, so apparently you use 4 for tier 3 and 16 for tier 2. Is there something I'm not getting right here...?

Besides, what's the purpose of `n_frame_samples` for each RNN?

Thanks!