Tested with Ubuntu 22, Python 3.10, and an Nvidia A10.
- Ensure you have the submodule initialized in case you did not clone with the --recursive flag: git submodule update --init --recursive
- Set up a working Python environment and install the requirements from the MaChAmp directory.
- Download the datasets with
All experiments here run with MaChAmp. Configuration files were updated to work with v0.4.2 of the software.
To get MaChAmp and install its requirements, you can run:
git submodule update --init --recursive # (if you didn't clone CreoleVal with --recursive flag)
pip install -r machamp/requirements.txt
It's recommended to do this in a virtual environment. (PS: if there are issues with the jsonnet installation, try installing it via conda-forge, i.e. conda install -c conda-forge jsonnet)
The configs folder contains all the MaChAmp configuration files for the model hyperparameters and datasets.
Filenames beginning with params_ contain model hyperparameters. To use a different pretrained transformer, change the following line to any Hugging Face transformer model:
"transformer_model": "bert-base-multilingual-cased",
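As a rough illustration of that edit, the sketch below swaps the "transformer_model" value in a config's text with a regular expression. The helper name `set_transformer_model` and the surrounding example config (including the `batch_size` entry) are assumptions made for the demo, not part of this repo or of MaChAmp.

```python
import re

# Captures the key and colon so that only the quoted value is replaced.
_MODEL_LINE = re.compile(r'("transformer_model"\s*:\s*)"[^"]*"')


def set_transformer_model(config_text: str, model_name: str) -> str:
    """Return config_text with the "transformer_model" value set to model_name."""
    new_text, count = _MODEL_LINE.subn(
        lambda m: f'{m.group(1)}"{model_name}"', config_text
    )
    if count == 0:
        raise ValueError('no "transformer_model" entry found')
    return new_text


# Hypothetical config snippet; "batch_size" is only here to show that
# other entries are left untouched.
example = '{\n  "transformer_model": "bert-base-multilingual-cased",\n  "batch_size": 32\n}'
updated = set_transformer_model(example, "xlm-roberta-base")
print(updated)
```

Working on the raw text rather than a parsed object keeps the rest of the file (ordering, spacing, comments if any) exactly as it was.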
- For all experiments, there are config files for mBERT, mT5, and XLM-R.
- Filenames beginning with
contain the data filepaths and configurations. Ensure that the filepaths in these files match the locations of the data files on your system (this should be the case if you use the instructions and scripts in this repo).
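One way to check the filepaths before training is sketched below. It assumes the dataset config can be parsed as plain JSON and that path entries use keys ending in "_path"; the helper `find_missing_paths`, the task name, and the key names in the demo are illustrative assumptions, so adjust them to what your config files actually contain.

```python
import json
import os
import tempfile


def find_missing_paths(config, base_dir="."):
    """Collect values of keys ending in "_path" that do not exist on disk."""
    missing = []
    if isinstance(config, dict):
        for key, value in config.items():
            if isinstance(value, str) and key.endswith("_path"):
                if not os.path.exists(os.path.join(base_dir, value)):
                    missing.append(value)
            else:
                # Recurse into nested task blocks and lists.
                missing.extend(find_missing_paths(value, base_dir))
    elif isinstance(config, list):
        for item in config:
            missing.extend(find_missing_paths(item, base_dir))
    return missing


# Demo in a throwaway directory; the config contents are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "train.tsv"), "w").close()
    demo_config = json.loads(
        '{"EXAMPLE_TASK": {"train_data_path": "train.tsv", "dev_data_path": "dev.tsv"}}'
    )
    missing = find_missing_paths(demo_config, base_dir=tmp)

print(missing)
```

In the demo only train.tsv is created, so dev.tsv is the one entry reported as missing.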
Running a train script creates a logs directory where all your models, metrics, scores, and test files are stored.
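Since each run adds a new entry under logs, a small helper can pick out the most recent one for a given experiment. This is a minimal sketch: `latest_run_dir` is a hypothetical helper, and the directory names in the demo are made up, so check the actual layout of your logs directory before relying on a prefix match.

```python
import os
import tempfile


def latest_run_dir(logs_dir, prefix):
    """Return the most recently modified subdirectory of logs_dir whose
    name starts with prefix, or None if there is no match."""
    candidates = [
        entry.path
        for entry in os.scandir(logs_dir)
        if entry.is_dir() and entry.name.startswith(prefix)
    ]
    return max(candidates, key=os.path.getmtime, default=None)


# Demo with made-up run directories and explicit modification times.
with tempfile.TemporaryDirectory() as tmp:
    older = os.path.join(tmp, "senti_afri_mbert_run1")
    newer = os.path.join(tmp, "senti_afri_mbert_run2")
    os.mkdir(older)
    os.mkdir(newer)
    os.utime(older, (1_000, 1_000))
    os.utime(newer, (2_000, 2_000))
    found = os.path.basename(latest_run_dir(tmp, "senti_afri_mbert"))
    nothing = latest_run_dir(tmp, "senti_oyewusi")

print(found, nothing)
```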
- Data: AfriSenti-SemEval Shared Task 12
- Download script:
- Config:
- Train:
./train.sh senti_afri {mbert,mt5,xlmr}
- Predict:
./predict.sh logs/senti_afri_<model>_<date> data/afrisenti/pcm_test.tsv
- Data: The data is originally from Data Science Nigeria, but we include our train/dev/test splits in this repo for reproducibility.
- Config:
- Train:
./train.sh senti_oyewusi {mbert,mt5,xlmr}
- Predict:
./predict.sh logs/senti_oyewusi_xlmr_baseline/<date>/ data/oyewusi/test.tsv
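To train every dataset and model combination listed here in one go, the commands can be generated programmatically. The sketch below only builds and prints the ./train.sh invocations; `training_commands` is a hypothetical helper, and actually running the commands (for example with `subprocess.run(cmd, check=True)` from the directory containing train.sh) is left to the reader.

```python
import itertools
import shlex

# Dataset and model names as used in the train commands of this README.
DATASETS = ["senti_afri", "senti_oyewusi"]
MODELS = ["mbert", "mt5", "xlmr"]


def training_commands(datasets=DATASETS, models=MODELS):
    """Build one ./train.sh argument list per dataset/model pair."""
    return [["./train.sh", d, m] for d, m in itertools.product(datasets, models)]


commands = training_commands()
for cmd in commands:
    # shlex.join renders the argument list as a copy-pasteable shell line.
    print(shlex.join(cmd))
```

Keeping each command as an argument list avoids shell quoting issues if the list is later passed to subprocess.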