Why is overlap length in dataloader needed? #25

williamFalcon opened this issue Sep 21, 2018 · 1 comment

Comments

@williamFalcon

Again, great project guys, good implementation!

I have some inline questions about the dataloader file. I don't understand why it segments the data the way it does.

# dataset.py    

# A) what is this??? 
# B) why? 
# C) Is this related to the number of samples per frame for tier 3?
self.overlap_len = 64     

# length of music clip   
n_samples = 128064    

# desired seq size we want for each training example
# D) why did you pick 1024? 
self.seq_len = 1024   

# iterate the full song 1024 units at a time    
for seq_begin in range(self.overlap_len, n_samples, self.seq_len):
    # 0 in first loop
    from_index = seq_begin - self.overlap_len  

    # 1088 in first loop.
    # E) Why not 1024? 
    # F) what is the overlap?
    to_index = seq_begin + self.seq_len   

    # (128 x 1088)  
    sequences = batch[:, from_index : to_index]   

    # G) why is this dropping off the last sample??
    input_sequences = sequences[:, : -1]   

    # H) why is the label such an odd subset? 
    target_sequences = sequences[:, self.overlap_len:].contiguous()   
    
    # I) Is X not trying to predict the next sequence making that missing chunk Y?
    # ie: full_seq = [1,2,3,4,5,6].   X = [1,2,3,4].   Y = [5, 6]?
    # currently this is not how the data are laid out.   
    yield (input_sequences, reset, target_sequences)    

Thanks! @koz4k

@rohan1561

C) Yes, it is equal to that.
E) They are probably trying to generate the entire song, so they use a dummy input for the first time step in all RNN tiers. This also answers H: every part of the actual song is in the target sequence, and the very first frame for every tier is generated from dummy inputs.
G) Because the last sample is generated by the sample-level MLP.
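
To make the index arithmetic concrete, here is a minimal sketch that replays the slicing from the question on a single row, using plain Python lists instead of the tensor `batch`. The `segments` helper and the `clip` list are hypothetical stand-ins, not code from the repo; only `overlap_len`, `seq_len` and `n_samples` are taken from the quoted snippet. It suggests that each target sample is preceded by exactly `overlap_len` input samples, which would explain why the input is 1087 long while the target is 1024 long.

```python
# Values copied from the snippet in the question.
overlap_len = 64
seq_len = 1024
n_samples = 128064

# Hypothetical stand-in for one row of `batch`: sample i has value i,
# so every value below is also a position in the clip.
clip = list(range(n_samples))


def segments(clip, overlap_len, seq_len):
    """Yield (inputs, targets) pairs using the same slicing as the quoted loop."""
    for seq_begin in range(overlap_len, len(clip), seq_len):
        from_index = seq_begin - overlap_len
        to_index = seq_begin + seq_len
        sequence = clip[from_index:to_index]  # overlap_len + seq_len samples
        inputs = sequence[:-1]                # drop the last sample
        targets = sequence[overlap_len:]      # drop the leading overlap
        yield inputs, targets


pairs = list(segments(clip, overlap_len, seq_len))
first_inputs, first_targets = pairs[0]

n_segments = len(pairs)          # 125, since 128064 = 125 * 1024 + 64
input_len = len(first_inputs)    # 1087 = 1024 + 64 - 1
target_len = len(first_targets)  # 1024

# The overlap_len samples that come immediately before a given target sample.
context_for_first_target = first_inputs[0:overlap_len]
context_for_last_target = first_inputs[seq_len - 1:seq_len - 1 + overlap_len]

print(n_segments, input_len, target_len)
print(first_targets[0], context_for_first_target[-1])  # target 64 follows input 63
```

Read this way, X is not asked to predict a separate future chunk as in question I. The targets are the same samples shifted forward, with the first `overlap_len` positions left out of the target because they serve only as context, and `n_samples = 128064` splits into 125 full segments with no remainder.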
