Read this Medium article for a full discussion.
The semantic model resources have been added to Lanfrica.
The Amharic RoBERTa model is uploaded to Hugging Face: Amharic RoBERTa Model
The Amharic FLAIR embedding model is integrated into the FLAIR library as am-forward
The model will be accessible in the next FLAIR release. Details
The Amharic Segmenter, Tokenizer, and Transliterator is released and can be installed with pip install amseg
The Flair-based Amharic NER classifier model is now released: am-flair-ner
The Flair-based Amharic sentiment classifier model is now released: am-flair-sent
The Flair-based Amharic POS tagger is now released: am-flair-pos
- The NLP tasks for which we built models using the benchmark datasets are described under: Tasks
  - NER
  - Sentiment
  - POS tagging
  - Question classification
- The different datasets and resources are available under: Datasets
  - Named entity recognition dataset
  - POS dataset
  - Sentiment dataset
  - Question classification dataset
- For Amharic word segmentation, tokenization, and transliteration, check this project: Segmentation
To cite the different Amharic NLP models and resources, use the following paper:
@Article{fi13110275,
AUTHOR = {Yimam, Seid Muhie and Ayele, Abinew Ali and Venkatesh, Gopalakrishnan and Gashaw, Ibrahim and Biemann, Chris},
TITLE = {Introducing Various Semantic Models for Amharic: Experimentation and Evaluation with Multiple Tasks and Datasets},
JOURNAL = {Future Internet},
VOLUME = {13},
YEAR = {2021},
NUMBER = {11},
ARTICLE-NUMBER = {275},
URL = {https://www.mdpi.com/1999-5903/13/11/275},
ISSN = {1999-5903},
DOI = {10.3390/fi13110275}
}
To cite the impacts of homophone normalization, use the following paper:
@inproceedings{belay2021impacts,
title={Impacts of Homophone Normalization on Semantic Models for Amharic},
author={Belay, Tadesse Destaw and Ayele, Abinew Ali and Gelaye, Getie and Yimam, Seid Muhie and Biemann, Chris},
booktitle={2021 International Conference on Information and Communication Technology for Development for Africa (ICT4DA)},
pages={101-106},
year={2021},
ISBN = {978-1-6654-3666-3},
DOI = {10.1109/ICT4DA53266.2021.9672229},
publisher={IEEE}
}
To cite the Question Answering Classification for Amharic, use the following paper:
@inproceedings{belay2022question,
title={Question Answering Classification for Amharic Social Media Community Based Questions},
author={Belay, Tadesse Destaw and Yimam, Seid Muhie and Gelaye, Getie and Ayele, Abinew Ali and Biemann, Chris},
booktitle={2022 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages (SIGUL)},
pages={Page will appear},
year={2022},
publisher={arXiv}
}