Learning From Data Assignment 3: Multi-Class Classification for Text Reviews

Installation

Note: Python 3.10 is required.

python -m venv .venv
source .venv/bin/activate
python -m pip install -U -r requirements.txt 

Dataset split

Split the default dataset (datasets/reviews.txt) into train/val/test sets in a 0.7/0.15/0.15 ratio and save them as CSV files.

python dataset_split.py

For more information and additional parameters, please refer to the script help:

python dataset_split.py --help

The split dataset used for our experiments is committed to the repository and can be found in the datasets folder.
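To illustrate what a 0.7/0.15/0.15 split means in practice, here is a minimal sketch. It is not the code in dataset_split.py; the function name split_rows, the fixed seed, and the use of a plain shuffle (rather than, say, a stratified split) are all assumptions made for illustration.

```python
import random


def split_rows(rows, ratios=(0.7, 0.15, 0.15), seed=42):
    """Shuffle rows and cut them into train/val/test lists.

    Hypothetical helper for illustration only; the real script may
    split differently (for example, stratified by label).
    """
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")

    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)

    n = len(shuffled)
    n_train = round(n * ratios[0])
    n_val = round(n * ratios[1])

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    # The test set takes whatever remains, so no row is dropped.
    test = shuffled[n_train + n_val:]
    return train, val, test
```

With 100 input rows this yields 70 training, 15 validation and 15 test rows. Giving the remainder to the test set means rounding never loses a row.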

Experiments

Notebooks with the experiments are available in the experiments folder. They are intended to be run in Google Colab and require the dataset files to be uploaded.

Training model from scratch

Train the best model on the default dataset (train.csv, val.csv and test.csv in the datasets folder):

python train.py

For more information and additional parameters, please refer to the script help:

python train.py --help

The best model we trained can be found on the Hugging Face Model Hub.

Predict with trained model

Download the best model and run it on the provided dataset file (datasets/test.csv by default).

python predict.py

For more information and additional parameters, please refer to the script help:

python predict.py --help