Transformer Model Implementation

This repository contains a basic implementation of the Transformer model, applied to a toy sentence transformation task: translating sentences made of characters such as "a" and "b" into the corresponding Chinese characters.

File Structure

transformer.py: Contains the implementation of the Transformer model.
train.py: Contains the code for training and testing the Transformer model.

How to Use

Ensure you have PyTorch installed.
Run train.py to train the model (for example, python train.py).
The training script uses the Transformer model defined in transformer.py.

Data

Sample data consists of predefined sentences that map characters (like "a" and "b") to corresponding Chinese characters (like "一" and "二").
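
As a rough illustration of what such data can look like once tokenized, here is a minimal sketch in plain Python. The sentence pairs, the space-separated tokenization, the special tokens, and the helper names build_vocab and encode are all assumptions made for this example; the actual sample sentences and preprocessing are the ones defined in train.py.

```python
# Hypothetical parallel sentences; the real pairs are defined in train.py.
PAIRS = [
    ("a b", "一 二"),
    ("b a", "二 一"),
    ("a a b", "一 一 二"),
]

# Assumed special tokens: padding, start of sentence, end of sentence.
SPECIALS = ["<pad>", "<sos>", "<eos>"]


def build_vocab(sentences):
    """Map each token to an integer id, reserving ids 0-2 for the specials."""
    vocab = {tok: i for i, tok in enumerate(SPECIALS)}
    for sentence in sentences:
        for tok in sentence.split():
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(sentence, vocab):
    """Wrap a sentence in <sos>/<eos> and convert it to a list of ids."""
    body = [vocab[tok] for tok in sentence.split()]
    return [vocab["<sos>"]] + body + [vocab["<eos>"]]


src_vocab = build_vocab(src for src, _ in PAIRS)
tgt_vocab = build_vocab(tgt for _, tgt in PAIRS)

print(encode("a b", src_vocab))
print(encode("二 一", tgt_vocab))
```

Separate source and target vocabularies are used here because the two sides share no characters; a single shared vocabulary would work as well.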

Model

The model is based on the standard Transformer architecture. It has an encoder and a decoder. The encoder reads the input sentence and produces a continuous representation. The decoder then uses this representation to produce the output sentence.
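
The step where the decoder "uses" the encoder's representation is cross-attention: each decoder position forms a weighted average over the encoder outputs. The sketch below shows scaled dot-product attention in plain Python so it runs without dependencies; it is not the code in transformer.py. The vectors memory and decoder_states are made-up values, and the learned projections and multiple heads of a full Transformer are left out.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


# Hypothetical encoder output ("memory"): one vector per source token.
memory = [[1.0, 0.0], [0.0, 1.0]]
# Hypothetical decoder state: one vector per target position so far.
decoder_states = [[4.0, 0.0]]

# Cross-attention: decoder states are the queries, memory gives keys and values.
context = attention(decoder_states, memory, memory)
print(context)
```

Because the query points in the same direction as the first memory vector, most of the attention weight falls on the first source token.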

Acknowledgments

this detailed article

@article{vaswani2017attention,
  title={Attention is all you need},
  author={Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N and Kaiser, {\L}ukasz and Polosukhin, Illia},
  journal={Advances in neural information processing systems},
  volume={30},
  year={2017}
}
