Hindi-and-Tamil-Question-Answering-System

This repository contains the code for the results obtained using XLM-RoBERTa on the chaii dataset.

Dataset:

The chaii dataset is used for fine-tuning, together with the MLQA (MultiLingual Question Answering) and XQuAD (Cross-lingual Question Answering Dataset) datasets, on an XLM-RoBERTa model that has been pre-trained on SQuAD2. The chaii dataset consists of the following fields:

id: unique id for each example
context: a paragraph based on which the questions have to be answered
question: the question that has to be answered
answer_start: the index from which the answer starts (only in the train set)
answer_text: the answer in string format (only in the train set)
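
The relationship between the fields above can be illustrated with a small consistency check. This is a minimal sketch, not code from this repository: it assumes that answer_start is a character offset into context, and the record and the helper name answer_span_is_consistent are hypothetical.

```python
def answer_span_is_consistent(example):
    """Check that the answer text appears in the context at answer_start.

    Assumes answer_start is a character offset into the context
    (an assumption, not confirmed by this README).
    """
    start = example["answer_start"]
    end = start + len(example["answer_text"])
    return example["context"][start:end] == example["answer_text"]


# Hypothetical record using the field names listed above.
example = {
    "id": "demo-0",
    "context": "Chennai is the capital of Tamil Nadu.",
    "question": "What is the capital of Tamil Nadu?",
    "answer_text": "Chennai",
    "answer_start": 0,
}

is_consistent = answer_span_is_consistent(example)
```

A check like this is useful before fine-tuning, because a misaligned answer_start would give the model a wrong target span.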

Models:

The models used are mBERT, mDeBERTa, and XLM-RoBERTa.

Result:

Jaccard scores obtained with each model:

mBERT (pre-trained on SQuAD v1.1, fine-tuned with chaii): 0.55
mDeBERTa: 0.579
mDeBERTa (fine-tuned with MLQA, XQuAD, chaii): 0.59
XLM-RoBERTa (pre-trained on SQuAD2, fine-tuned with chaii): 0.586
XLM-RoBERTa (pre-trained on SQuAD2, fine-tuned with MLQA, XQuAD, chaii): 0.616
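
For reference, a word-level Jaccard score between a predicted answer and the true answer can be computed as sketched below. This is an illustrative version that assumes lowercasing and whitespace tokenization; it is not necessarily identical to the code in the metric folder, and the empty-input handling is an added assumption.

```python
def jaccard(prediction, truth):
    """Word-level Jaccard similarity between two answer strings."""
    a = set(prediction.lower().split())
    b = set(truth.lower().split())
    if not a and not b:
        # Assumption: two empty answers count as a perfect match.
        return 1.0
    common = a & b
    return len(common) / (len(a) + len(b) - len(common))


# Two of the three distinct words are shared, so the score is 2/3.
score = jaccard("Tamil Nadu", "Tamil Nadu state")
```

The reported score for a model would then be the mean of this value over all evaluated examples.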

