We built an input pipeline that saves our datasets locally after creation and reloads them on later runs (a sketch follows the list below).
- TensorFlow >= 2.6 is required (runs on CPU as well as GPU)
- ensure that you are in the /iannwtf_hw7 directory
- to run: `python -m pipeline`
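
A minimal sketch of the save-and-reload idea behind the pipeline, assuming a `tf.data` dataset cached on disk. The path, the dummy data, and the shapes are illustrative assumptions, not the repo's actual values:

```python
import os
import tensorflow as tf

DATA_DIR = "./saved_datasets/train"  # hypothetical cache location

if os.path.exists(DATA_DIR):
    # reuse the previously created dataset
    train_ds = tf.data.experimental.load(DATA_DIR)
else:
    # create the dataset (dummy data here) and persist it for future runs
    train_ds = tf.data.Dataset.from_tensor_slices(tf.random.normal((64000, 25, 1)))
    tf.data.experimental.save(train_ds, DATA_DIR)

train_ds = train_ds.shuffle(1000).batch(32).prefetch(tf.data.AUTOTUNE)
```

On newer TensorFlow versions, `dataset.save()` / `tf.data.Dataset.load()` replace the experimental API used here.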
- BPTT becomes computationally expensive as the number of timesteps increases and can lead to gradient problems (vanishing/exploding gradients).
- TBPTT cuts down computation and memory requirements (but the truncation length has to be chosen carefully for it to work well).
- To use TBPTT we would need to implement backpropagation differently, because the model would have to be optimized after each bundle of timesteps instead of once at the end for the whole sequence (see the sketch after this list).
- We could theoretically use TBPTT to reduce computation and memory requirements while training our model.
- The gating mechanism in the LSTM cells (including the forget gate) already helps with vanishing and exploding gradients.
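
A minimal sketch of what TBPTT could look like, not the repo's code: the sequence is split into chunks, the LSTM state is carried across chunks, and gradients are computed and applied per chunk. The truncation length, layer sizes, and per-chunk loss are illustrative assumptions:

```python
import tensorflow as tf

TRUNC_LEN = 5  # assumed truncation length
lstm = tf.keras.layers.LSTM(8, return_state=True)
head = tf.keras.layers.Dense(1, activation="sigmoid")
optimizer = tf.keras.optimizers.Adam(1e-3)
loss_fn = tf.keras.losses.BinaryCrossentropy()

def tbptt_step(x, y):
    # x: (batch, seq_len, 1), y: (batch, 1) -- one truncated-BPTT pass
    states = None
    for start in range(0, x.shape[1], TRUNC_LEN):
        chunk = x[:, start:start + TRUNC_LEN, :]
        with tf.GradientTape() as tape:
            out, h, c = lstm(chunk, initial_state=states)
            # note: in our task the target only exists for the full sequence,
            # which is why per-chunk optimization does not map cleanly here
            loss = loss_fn(y, head(out))
        variables = lstm.trainable_variables + head.trainable_variables
        grads = tape.gradient(loss, variables)
        optimizer.apply_gradients(zip(grads, variables))
        # carry the state forward but cut the gradient graph at the chunk border
        states = [tf.stop_gradient(h), tf.stop_gradient(c)]
```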
- In our problem, the input consists of n numbers (n = sequence length) and the target is either 1 or 0, depending on the sum of all input numbers (a data sketch follows below).
- This corresponds to a function `f: R^n -> {0, 1}`, and is therefore a classification problem.
- Nice to note: the LSTM layers themselves perform regression, for example with a function `f: R^n -> R` (for the last hidden output).
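
A minimal sketch of how one sample of this task could be generated, assuming uniform inputs in [-1, 1] and a target of 1 when the sequence sum is positive (the actual distribution and threshold used in the repo may differ):

```python
import numpy as np

def make_sample(seq_length=25):
    # n random numbers form the input sequence
    x = np.random.uniform(-1.0, 1.0, size=(seq_length, 1)).astype(np.float32)
    # target: f: R^n -> {0, 1}, here based on the sign of the sum (assumption)
    y = np.float32(1.0) if x.sum() > 0 else np.float32(0.0)
    return x, y
```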
- data_samples: 96000 total, split 64000 / 16000 / 16000 (train / validation / test)
- data_seq_length: 25
- batch_size: 32
- learning_rate: 0.001
- optimizer: Adam
- loss: BinaryCrossentropy
- epochs: 3
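
A minimal sketch of a training setup matching the hyperparameters above. Only the optimizer, loss, learning rate, sequence length, batch size, and epochs come from this README; the layer sizes and exact architecture are assumptions:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(25, 1)),      # assumed layer size
    tf.keras.layers.Dense(1, activation="sigmoid"),     # binary output
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    loss=tf.keras.losses.BinaryCrossentropy(),
    metrics=["accuracy"],
)

# model.fit(train_ds, validation_data=val_ds, epochs=3)  # datasets from the pipeline
```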