Commit
* Positional encodings and initial arguments for transformer (positional encodings and the auto-regressive bias are sketched after this list)
* Stub for TransformerEncoder
* WIP self-attention
* ffn
* Unmasked self-attention prototype
* Cleaned up code. Still not tested
* Put things together so we can run and debug, some cleanup
* Separate layer construction from application in encoder
* Added masking for self-attention
* More fixes, now runs on CPUs with default args
* Removed unused code
* Fix inference for transformer
* Docstrings
* Added multi-head dot attention for the actual attention mechanism. Enable with --attention-type mhdot
* Fixed existing tests
* Import fix
* Precompute positional encodings in variable initialization
* Temporary fix. Will change later
* Pass max_seq_len to Embedding if needed for positional encodings
* Fix import
* More control over positional encodings
* Fix masking for MultiheadAttention
* Fix nasty bug with layer normalization quietly accepting 3d input
* WIP: decoder
* Added transformer test
* WIP full transformer with decoder. Inference and RNN are currently broken, work in progress
* Fix auto-regressive bias
* Revised Configs and Decoder interface
* Moved attention into (RNN) decoder
* Defined proper Decoder interface for inference. Rewrote RecurrentDecoder to adhere to the new interface.
* Fixed bias variable/length problem by writing a custom operator
* Custom operator for positional encodings
* Added integration tests
* Improve consistency
* Fixed a last bug in inference regarding lengths. All tests pass now
* Bump version
* Update tests
* Make mypy happy
* Support transformer with convolutional embedding encoder
* Fix to actually use layer normalization
* Allow projecting segment embeddings to arbitrary size
* Typo fix
* Correct path in the PyPI upload documentation (#92)
* Uniform weight initialization (#93)
* Added transformer dropout
* Learning rate warmup
* Fix
* Changed eps for layer normalization
* Docstrings and cleanup
* Better coverage for ConvolutionalEmbeddingEncoder
* Warmup WIP
* Fix Travis builds
* Removed source_length from inference code; it is now computed in the encoder graph
* Added transformer module to doc generation
* Small fixes
* Fixed doc generation
* Fix tests
* Refactored the read_metrics_file method to separate out its multiple responsibilities. The new read_metrics_file method can now easily be used for other things, e.g. offline analysis.
* Removed old method
* Fixed argument description
* Revised arguments according to David's and Tobi's comments
* Fix system tests
* Removed duplicate query scaling in DotAttention
* Addressed Tobi's comments
* Pass correct argument to RNN attention num heads
* Moved check for batch2timemajor encoder being the last encoder to the encoder sequence
* Fixed RNN decoder after decoder rewrite
* Fix #2
* Do not truncate the metrics file in the callback_monitor constructor. Restructured saving and loading of the metrics file to make it consistent.
* Make pylint happy
* Addressed Tobi's comments
* Test averaging in integration/system tests
* Addressed Tobi's (last?) comments
* Revised abstract class
* Addressed Tobi's comments
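Two recurring themes in this change set are the sinusoidal positional encodings and the auto-regressive (causal) bias used to mask decoder self-attention. As a rough orientation only, here is a minimal NumPy sketch of both ideas as described in "Attention Is All You Need"; it is not this repository's actual implementation, and the function names (`positional_encodings`, `autoregressive_bias`) are made up for illustration.

```python
import numpy as np

def positional_encodings(max_seq_len: int, num_embed: int) -> np.ndarray:
    """Sinusoidal positional encodings of shape (max_seq_len, num_embed).
    Even embedding dimensions use sine, odd dimensions use cosine.
    Assumes num_embed is even."""
    positions = np.arange(max_seq_len)[:, np.newaxis]              # (max_seq_len, 1)
    dims = np.arange(num_embed // 2)[np.newaxis, :]                # (1, num_embed // 2)
    angles = positions / np.power(10000.0, 2 * dims / num_embed)   # (max_seq_len, num_embed // 2)
    enc = np.zeros((max_seq_len, num_embed))
    enc[:, 0::2] = np.sin(angles)
    enc[:, 1::2] = np.cos(angles)
    return enc

def autoregressive_bias(seq_len: int) -> np.ndarray:
    """Additive attention bias of shape (seq_len, seq_len): zero on and below
    the diagonal, a large negative value above it, so that position i cannot
    attend to positions > i once the bias is added to the attention logits
    before the softmax."""
    upper = np.triu(np.ones((seq_len, seq_len)), k=1)  # ones strictly above the diagonal
    return upper * -1e9
```

In a Transformer, the encodings are typically added to the (scaled) token embeddings once per sequence, and the bias is broadcast onto the decoder's self-attention logits at every layer.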
Showing 31 changed files with 2,250 additions and 772 deletions.