This repository implements several architectures for training word embeddings: Continuous Bag-of-Words (CBOW), skip-gram, and Global Vectors for Word Representation (GloVe). Wikipedia articles are used as training data, while the Google Analogy dataset and the WordSim353 dataset are used to validate the word embeddings.
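As a rough illustration of what analogy-based validation measures, the sketch below answers "a is to b as c is to ?" with vector arithmetic and cosine similarity. The function names and the toy 2-dimensional vectors are made up for this example; they are not taken from this repository's code, and the actual evaluation in source/main.py may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def solve_analogy(embeddings, a, b, c):
    """Return the word d that best completes 'a is to b as c is to d'.

    Hypothetical helper: builds the target vector b - a + c and picks the
    closest remaining word by cosine similarity.
    """
    target = [
        vb - va + vc
        for va, vb, vc in zip(embeddings[a], embeddings[b], embeddings[c])
    ]
    # The three query words are excluded from the candidates.
    candidates = [w for w in embeddings if w not in {a, b, c}]
    return max(candidates, key=lambda w: cosine(embeddings[w], target))


# Toy vectors invented for illustration only, not trained embeddings.
toy_embeddings = {
    "man": [1.0, 0.0],
    "woman": [1.0, 1.0],
    "king": [3.0, 0.0],
    "queen": [3.0, 1.0],
    "apple": [0.0, -1.0],
}

prediction = solve_analogy(toy_embeddings, "man", "woman", "king")
print(prediction)
```

A word-similarity benchmark such as WordSim353 is typically scored differently: the cosine similarity of each word pair is compared against human ratings with a rank correlation.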
- Continuous Bag-of-Words (CBOW) architecture implementation
- Skip-gram architecture implementation
- Global Vectors for Word Representation (GloVe) architecture implementation
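To make the difference between the first two architectures concrete, here is a minimal sketch of how training examples can be built from a context window. CBOW predicts the center word from its surrounding words, while skip-gram predicts each surrounding word from the center word. The function names and the `window` parameter are assumptions for this example and may not match the code in this repository. GloVe is not shown; it is fitted on global co-occurrence counts rather than on individual windows.

```python
def skipgram_pairs(tokens, window=2):
    """Build (center, context) pairs, one per context word in the window."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def cbow_examples(tokens, window=2):
    """Build (context_words, center) examples, one per position."""
    examples = []
    for i, center in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + window + 1]
        if context:  # skip positions without any context word
            examples.append((context, center))
    return examples


sentence = "the quick brown fox".split()
sg = skipgram_pairs(sentence, window=1)
cb = cbow_examples(sentence, window=1)

print(sg)
print(cb)
```

With a window of 1, the four-word sentence yields six skip-gram pairs and four CBOW examples, since the words at the edges have only one neighbor.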
- Install Python 3.11, the required version
- Install the required packages (we recommend using a virtual environment instead: follow the guide under Virtual Environment Setup below and skip this step)
pip install -r source/requirements.txt
- Run the program
python source/main.py
Virtual Environment Setup (Windows, from a Bash shell such as Git Bash):
- Get the virtualenv package
pip install virtualenv
- Create a new, empty Python environment
py -3.11 -m venv ./.venv
- Activate the environment
source .venv/Scripts/activate
- Install the packages required by this project
pip install -r source/requirements.txt
Virtual Environment Setup (Linux/macOS):
- Get the virtualenv package
pip install virtualenv
- Create a new, empty Python environment
python -m venv ./.venv
- Activate the environment
source .venv/bin/activate
- Install the packages required by this project
pip install -r source/requirements.txt
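After activating the environment, a quick sanity check can confirm that the interpreter is Python 3.11 and that it is running inside a virtual environment. The sketch below is not part of this repository; the function names are made up for illustration. It relies on the fact that environments created with the standard venv module set sys.prefix to the environment directory while sys.base_prefix keeps pointing at the base installation.

```python
import sys


def in_virtual_environment():
    """True when the interpreter runs inside a venv-style environment."""
    # In a virtual environment, sys.prefix differs from sys.base_prefix.
    return sys.prefix != sys.base_prefix


def is_required_python(version_info=sys.version_info, required=(3, 11)):
    """True when the major and minor version match the required one."""
    return tuple(version_info[:2]) == required


print("Python version:", ".".join(str(part) for part in sys.version_info[:3]))
print("Python 3.11:", is_required_python())
print("Inside a virtual environment:", in_virtual_environment())
```

If either check prints False, the environment was probably not activated in the current shell, or it was created with a different Python version.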