lct-rug-2022/lft-assignment-3

Learning From Data Assignment 3: Multi-Class Classification for Text Reviews

Installation

Note: Python 3.10 is required.

python -m venv .venv
source .venv/bin/activate
python -m pip install -U -r requirements.txt

Dataset split

Split the default dataset (the reviews.txt file in the datasets folder) into train/val/test sets with a 0.7/0.15/0.15 ratio, saved as CSV files:

python dataset_split.py

For more information and additional parameters, please refer to the script help:

python dataset_split.py --help

The split dataset used for our experiments is committed to the repository and can be found in the datasets folder.
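As an illustration of what a 0.7/0.15/0.15 split means, here is a minimal sketch that shuffles a list of rows and cuts it by ratio. It is not the code of dataset_split.py: the function name `split_dataset`, the `seed` parameter, and the placeholder rows are assumptions made for this example only.

```python
import random


def split_dataset(rows, ratios=(0.7, 0.15, 0.15), seed=42):
    """Shuffle rows and cut them into train/val/test parts by ratio.

    Hypothetical helper for illustration; the real script may differ.
    """
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    train_end = round(n * ratios[0])
    val_end = train_end + round(n * ratios[1])
    # The test part takes whatever remains after train and val.
    return shuffled[:train_end], shuffled[train_end:val_end], shuffled[val_end:]


# Placeholder rows standing in for the lines of datasets/reviews.txt.
rows = [f"review {i}" for i in range(100)]
train, val, test = split_dataset(rows)
print(len(train), len(val), len(test))
```

Fixing the seed is one way to make sure everyone works with the same split; using the files already committed in the datasets folder achieves the same thing without re-running the script.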

Experiments

Notebooks with experiments are available in the experiments folder. They are optimized for use with Google Colab and require the dataset files to be uploaded.

Training model from scratch

Train the best model on the default dataset (the train.csv, val.csv and test.csv files in the datasets folder):

python train.py

For more information and additional parameters, please refer to the script help:

python train.py --help

The best model we trained can be found on the Hugging Face Model Hub.

Predict with trained model

Download the best model and run it on the provided dataset file (datasets/test.csv by default):

python predict.py

For more information and additional parameters, please refer to the script help:

python predict.py --help
