1d Attention Example / recommendation #11

ragulpr · 2025-01-03T21:51:10Z

It would be nice to have an example for use with attention. I have experimented, but I'm not up to date enough to know what the most popular approach for compression / pruning / regularization is. Preferably I'd rewrite this section to be useful with attention rather than RNNs, since RNNs have fallen out of favor:

taildropout/README.md

Lines 88 to 101 in 5ed14a0

    
    #### Sequences
    "Recurrent dropout" == Keep mask constant over time. Popular approach.
    ```
    x = torch.randn(n_timesteps,n_sequences,n_features)
    gru = nn.GRU(n_features,n_features)
    taildropout = TailDropout(batch_dim = 1, dropout_dim = 2)
    x, _ = gru(x)
    x = taildropout(x)
    ```
    If you want to have mask vary for each timestep and sequence
    ```
    taildropout = TailDropout(batch_dim = [0,1], dropout_dim = 2)

Ref #10 also see #4
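
To make the question concrete, here is a minimal sketch of how I read the `batch_dim` / `dropout_dim` semantics in the README snippet above: the mask varies along the batch dimension(s) and the dropout dimension, and is shared (broadcast) along every other dimension. `mask_shape` is a hypothetical helper written only for illustration, not part of taildropout, and the `(batch, n_heads, n_tokens, head_dim)` layout for attention activations is an assumption. Whether the library behaves this way on 4-D inputs would need to be checked.

```python
def mask_shape(input_shape, batch_dim, dropout_dim):
    """Shape of a mask that varies along batch_dim(s) and dropout_dim
    and has size 1 (is broadcast) along every other dimension.

    Hypothetical helper for illustration only; not part of taildropout.
    """
    batch_dims = [batch_dim] if isinstance(batch_dim, int) else list(batch_dim)
    ndim = len(input_shape)
    # Normalize negative indices such as dropout_dim=-1.
    varying = {d % ndim for d in batch_dims + [dropout_dim]}
    return tuple(n if i in varying else 1 for i, n in enumerate(input_shape))


n_timesteps, n_sequences, n_features = 7, 4, 16
rnn_shape = (n_timesteps, n_sequences, n_features)

# README case 1: mask kept constant over time.
print(mask_shape(rnn_shape, batch_dim=1, dropout_dim=2))       # (1, 4, 16)
# README case 2: mask varies for each timestep and sequence.
print(mask_shape(rnn_shape, batch_dim=[0, 1], dropout_dim=2))  # (7, 4, 16)

# Assumed attention layout: (batch, n_heads, n_tokens, head_dim).
batch, n_heads, n_tokens, head_dim = 4, 8, 7, 16
attn_shape = (batch, n_heads, n_tokens, head_dim)

# One mask per sequence, shared across heads and tokens.
print(mask_shape(attn_shape, batch_dim=0, dropout_dim=3))       # (4, 1, 1, 16)
# One mask per sequence and head, shared across tokens.
print(mask_shape(attn_shape, batch_dim=[0, 1], dropout_dim=3))  # (4, 8, 1, 16)
```

The open question for an attention example is which of these sharing patterns is most useful, for instance sharing the mask across tokens (the analogue of "recurrent dropout") versus letting it vary per head.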
