This repository contains a basic implementation of the Transformer model, built for a simple sentence transformation task: mapping sentences of Latin characters to sentences of Chinese characters.
- `transformer.py`: Contains the implementation of the Transformer model.
- `train.py`: Contains the code for training and testing the Transformer model.
- Ensure you have PyTorch installed.
- Run `train.py` to train the model.
- The training script, `train.py`, uses the Transformer model defined in `transformer.py`.
Sample data consists of predefined sentences that map characters (like "a" and "b") to corresponding Chinese characters (like "一" and "二").
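As a rough illustration of how such sentence pairs could be turned into token ids, here is a minimal sketch. The pairs, the special tokens, and the helper names `build_vocab` and `encode` are assumptions made for this example; the actual data and preprocessing in `train.py` may differ.

```python
# Illustrative pairs only, built from the "a" -> "一" and "b" -> "二" mapping.
# The real sentences used by train.py may be different.
PAIRS = [("a b", "一 二"), ("b a", "二 一"), ("a a b", "一 一 二")]

# Hypothetical special tokens for padding, start and end of sentence.
SPECIALS = ["<pad>", "<sos>", "<eos>"]


def build_vocab(sentences):
    """Map every whitespace-separated token to an integer id."""
    tokens = sorted({tok for s in sentences for tok in s.split()})
    return {tok: i for i, tok in enumerate(SPECIALS + tokens)}


def encode(sentence, vocab):
    """Wrap a sentence in <sos>/<eos> and convert it to a list of ids."""
    body = [vocab[tok] for tok in sentence.split()]
    return [vocab["<sos>"]] + body + [vocab["<eos>"]]


src_vocab = build_vocab(src for src, _ in PAIRS)
tgt_vocab = build_vocab(tgt for _, tgt in PAIRS)

encoded_src = encode("a b", src_vocab)
encoded_tgt = encode("一 二", tgt_vocab)
```

Keeping separate source and target vocabularies mirrors the two sides of the mapping; a shared vocabulary would work just as well for data this small.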
The model is based on the standard Transformer architecture. It has an encoder and a decoder. The encoder reads the input sentence and produces a continuous representation. The decoder then uses this representation to produce the output sentence.
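The encoder-decoder flow described above can be sketched with PyTorch's built-in `torch.nn.Transformer`. This is not the code in `transformer.py`: the class name `TinySeq2Seq`, the learned positional embedding, and all hyperparameters are assumptions chosen to keep the example short.

```python
import torch
import torch.nn as nn


class TinySeq2Seq(nn.Module):
    """Hypothetical wrapper; not the class defined in transformer.py."""

    def __init__(self, src_vocab, tgt_vocab, d_model=32, nhead=4, max_len=16):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, d_model)
        self.tgt_embed = nn.Embedding(tgt_vocab, d_model)
        # Learned positions (an assumption; the original paper uses sinusoids).
        self.pos_embed = nn.Embedding(max_len, d_model)
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=nhead,
            num_encoder_layers=2, num_decoder_layers=2,
            dim_feedforward=64, dropout=0.0, batch_first=True,
        )
        self.out = nn.Linear(d_model, tgt_vocab)

    def _embed(self, embed, ids):
        # Add a position vector to each token embedding.
        positions = torch.arange(ids.size(1), device=ids.device)
        return embed(ids) + self.pos_embed(positions)

    def forward(self, src_ids, tgt_ids):
        # Encoder: input sentence -> continuous representation ("memory").
        memory = self.transformer.encoder(self._embed(self.src_embed, src_ids))
        # Causal mask so each target position only attends to earlier ones.
        causal = self.transformer.generate_square_subsequent_mask(tgt_ids.size(1))
        # Decoder: attends to the memory while producing the output sentence.
        hidden = self.transformer.decoder(
            self._embed(self.tgt_embed, tgt_ids), memory, tgt_mask=causal
        )
        return self.out(hidden), memory


model = TinySeq2Seq(src_vocab=10, tgt_vocab=12)
src = torch.randint(0, 10, (2, 5))   # batch of 2 source sentences, length 5
tgt = torch.randint(0, 12, (2, 4))   # batch of 2 target prefixes, length 4
logits, memory = model(src, tgt)
```

Here `memory` holds one vector per source token, and `logits` holds one score per target-vocabulary entry for each target position.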
@article{vaswani2017attention,
  title={Attention is all you need},
  author={Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N and Kaiser, {\L}ukasz and Polosukhin, Illia},
  journal={Advances in neural information processing systems},
  volume={30},
  year={2017}
}