Word-Vectors

Overview

This repository implements several architectures for training word embeddings: Continuous Bag-of-Words (CBOW), skip-gram, and Global Vectors for Word Representation (GloVe). Wikipedia articles are used as training data, while the Google Analogy dataset and the WordSim353 dataset are used to validate the word embeddings.
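To illustrate what validating embeddings on an analogy dataset involves, here is a minimal sketch of the common vector-offset approach: for "a is to b as c is to ?", pick the word whose vector is closest (by cosine similarity) to b - a + c. This is an assumption about the general technique, not a description of this repository's evaluation code. The function names and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def solve_analogy(vectors, a, b, c):
    """Return the word d that best completes 'a is to b as c is to d'."""
    # Vector offset: target = b - a + c
    target = [vb - va + vc for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    # The three query words are excluded from the candidates.
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy 2-dimensional vectors, invented for illustration only.
toy_vectors = {
    "man": [1.0, 0.0],
    "woman": [1.0, 1.0],
    "king": [3.0, 0.0],
    "queen": [3.0, 1.0],
    "apple": [0.0, 3.0],
}

answer = solve_analogy(toy_vectors, "man", "woman", "king")
print(answer)
```

Accuracy on an analogy dataset would then be the fraction of questions for which the returned word matches the expected one.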

Continuous Bag-of-Words (CBOW) architecture implementation
Skip-gram architecture implementation
Global Vectors for Word Representation (GloVe) architecture implementation
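As a rough illustration of how the first two architectures differ, the sketch below builds training examples from a tokenized sentence. CBOW predicts a center word from its surrounding context, while skip-gram predicts each context word from the center word. GloVe is not shown because it is trained on global co-occurrence counts rather than on individual windows. The helper names and the window size are hypothetical and are not taken from this repository's source.

```python
def context_window(tokens, index, window):
    """Return the tokens within `window` positions of `index`, excluding the center."""
    start = max(0, index - window)
    left = tokens[start:index]
    right = tokens[index + 1:index + 1 + window]
    return left + right


def cbow_examples(tokens, window):
    """CBOW: (context words, center word) pairs."""
    return [(context_window(tokens, i, window), tokens[i]) for i in range(len(tokens))]


def skipgram_examples(tokens, window):
    """Skip-gram: one (center word, context word) pair per context word."""
    return [
        (tokens[i], ctx)
        for i in range(len(tokens))
        for ctx in context_window(tokens, i, window)
    ]


sentence = "the cat sat on the mat".split()
cbow = cbow_examples(sentence, window=1)
skipgram = skipgram_examples(sentence, window=1)

print(cbow[1])       # context and target for the second token
print(skipgram[:3])  # first few (center, context) pairs
```

In a real training pipeline the tokens would be mapped to vocabulary indices before being fed to a model; that step is omitted here.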

Setup

Install Python 3.11.
Install the required packages: pip install -r source/requirements.txt (We recommend using a virtual environment. If you do, follow the guide under Virtual Environment Setup below and skip this step.)
Run the program: python source/main.py

Virtual Environment Setup

Windows

Install the virtualenv package: pip install virtualenv
Create a new, empty Python environment: py -3.11 -m venv ./.venv
Activate the environment (the source command requires a Bash shell such as Git Bash): source .venv/Scripts/activate
Install the packages required by this project: pip install -r source/requirements.txt

Linux

Install the virtualenv package: pip install virtualenv
Create a new, empty Python environment: python -m venv ./.venv
Activate the environment: source .venv/bin/activate
Install the packages required by this project: pip install -r source/requirements.txt
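After activating the environment on either platform, a quick check like the following may help confirm that the interpreter is Python 3.11 and that a virtual environment is active. This is a hypothetical helper, not part of the repository. It relies on the standard-library behavior that sys.prefix differs from sys.base_prefix inside a virtual environment.

```python
import sys


def environment_problems(version_info, prefix, base_prefix):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if tuple(version_info[:2]) != (3, 11):
        problems.append(
            f"expected Python 3.11, found {version_info[0]}.{version_info[1]}"
        )
    # Inside a virtual environment, sys.prefix points at the environment
    # directory while sys.base_prefix points at the base installation.
    if prefix == base_prefix:
        problems.append("no virtual environment appears to be active")
    return problems


if __name__ == "__main__":
    found = environment_problems(sys.version_info, sys.prefix, sys.base_prefix)
    for problem in found:
        print(problem)
    if not found:
        print("environment looks OK")
```

Running the script with the activated interpreter prints any mismatches it detects.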
