Install with:

conda env create --file environment.yml
conda activate pytorch_transformer
If the data does not exist yet, run the dataset (en-de, de-en) download:

# cd is important, otherwise the data goes into the wrong folder
cd helpers
python iwslt_setup.py
Please check before running training: the data must be in the same directory as the Python script from which you call torchtext.datasets.IWSLT.splits, e.g.:
transformer-annotated
|- transformer.py
|- .data
|  |- iwslt
|  |  |- en-de.tgz
|  |  |- de-en.tgz
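To verify the layout above before starting a training run, a small check like the following can help (a sketch; the helper name and its use are illustrative, not part of the repo):

```python
from pathlib import Path

def missing_iwslt_archives(base: str = ".") -> list:
    """Return the expected IWSLT archive paths (relative to `base`)
    that are not present on disk."""
    expected = [
        Path(base) / ".data" / "iwslt" / name
        for name in ("en-de.tgz", "de-en.tgz")
    ]
    return [p for p in expected if not p.exists()]
```

Running it from the directory that contains your training script and printing the result shows exactly which archives still need to be downloaded.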
To copy saved experiment results from the cluster to a local machine:

scp -r [email protected]:/work/ws/nemo/fr_as1464-transformer_work-0/transformer-main/experiments_save/<NAME_OF_EXPERIMENT> /home/mrrobot/PycharmProjects/transformer-main/experiments_save
# Within the project main folder:
tensorboard --logdir experiments_save/runs
- Paper link: https://arxiv.org/pdf/1706.03762.pdf
- Suggested theory: https://jalammar.github.io/illustrated-transformer/
Contains an implementation of the original transformer paper "Attention Is All You Need".
Certain modifications:
- LayerNorm (applied before each sublayer instead of after)
- Dropout (additionally applied to the attention weights and to the point-wise feed-forward sublayer)
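The pre-norm residual pattern behind the first modification can be sketched as follows (a minimal illustration; the module and names are not taken from the repo):

```python
import torch
import torch.nn as nn

class PreNormResidual(nn.Module):
    """Residual sublayer that applies LayerNorm *before* the wrapped
    sublayer (pre-norm, deviating from the post-norm of the original
    paper) and dropout to the sublayer output before the residual add."""

    def __init__(self, d_model: int, sublayer: nn.Module, dropout: float = 0.1):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.sublayer = sublayer
        self.dropout = nn.Dropout(dropout)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Pre-norm: x + Dropout(Sublayer(LayerNorm(x)))
        return x + self.dropout(self.sublayer(self.norm(x)))
```

Wrapping, say, the point-wise feed-forward net in such a block keeps the residual path free of normalization, which tends to make deep transformer stacks easier to train.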