- Handles over 100 datasets
- Generates a statistics report about the processed dataset
- Supports many pre-processing methods
- Provides a panel for entering your parameters at runtime
- Easy to adapt to your own dataset and pre-processing utility
https://voidful.github.io/NLPrep-Datasets/
Learn more from the docs.
pip install nlprep
nlprep --dataset clas_udicstm --outdir sentiment
You can also try nlprep in Google Colab:
$ nlprep
arguments:
  --dataset     which dataset to use
  --outdir      processed result output directory
optional arguments:
  -h, --help    show this help message and exit
  --util        data preprocessing utility, multiple utilities are supported
  --cachedir    dir for caching raw dataset
  --infile      local dataset path
  --report      generate an HTML statistics report
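If you want to call nlprep from a script rather than typing the command by hand, the options above can be assembled programmatically. The following is a minimal sketch, not part of nlprep itself: `build_nlprep_command` is a hypothetical helper, and it assumes that `--util` accepts several space-separated names and that `--report` is a switch that takes no value.

```python
import shlex
from typing import List, Optional, Sequence


def build_nlprep_command(
    dataset: str,
    outdir: str,
    utils: Sequence[str] = (),
    cachedir: Optional[str] = None,
    infile: Optional[str] = None,
    report: bool = False,
) -> List[str]:
    """Assemble an argv list for the nlprep CLI (hypothetical helper)."""
    # --dataset and --outdir are the two arguments listed first in the help text.
    cmd = ["nlprep", "--dataset", dataset, "--outdir", outdir]
    if utils:
        # Assumption: --util accepts several space-separated utility names.
        cmd += ["--util", *utils]
    if cachedir is not None:
        cmd += ["--cachedir", cachedir]
    if infile is not None:
        cmd += ["--infile", infile]
    if report:
        # Assumption: --report is a switch that takes no value.
        cmd.append("--report")
    return cmd


if __name__ == "__main__":
    # Same dataset and output directory as the quick-start command above.
    cmd = build_nlprep_command("clas_udicstm", "sentiment", report=True)
    print(shlex.join(cmd))
    # To actually run it (requires nlprep to be installed):
    #   import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems when a path contains spaces; `shlex.join` is used only to print a copy-pasteable version.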
Thanks for your interest. There are many ways to contribute to this project. Get started here.
Icons modified from Darius Dan from www.flaticon.com
Icons modified from Freepik from www.flaticon.com