This project tracks Natural Language Processing (NLP) resources for the Arabic language and gives an overview of the state of the art (SOTA) across the most common NLP tasks and their corresponding datasets.
The idea is to track papers, datasets, pre-trained models, and SOTA results for each NLP task. This inventory is built with business applications in mind: the objective is to provide a starting point for NLP-based solutions rather than to focus purely on research.
The main goal is to give readers an overview of the benchmark datasets and the state of the art for their task of interest, which helps them quickly build a proof of concept (POC) for their target application.
- Automatic speech recognition
- Speaker diarization
- Machine translation
- Question answering
- Language modeling
- Diacritization
More tasks are coming soon, stay tuned! 🤩 You are welcome to contribute to this project!
- Arabic Natural Language Processing Workshop: WANLP 2020, WANLP 2019.
Read our Contributing Guidelines and Code of Conduct.
This project is inspired by NLP-progress.