CS224n Homeworks and final project: submission from team import-winning-model
Updated Oct 13, 2020 - Jupyter Notebook
Repository for the paper titled: "When is BERT Multilingual? Isolating Crucial Ingredients for Cross-lingual Transfer"
This work was done as part of a Kaggle InClass competition hosted by Sharechat, where the task was to develop AI solutions for predicting abusive comments posted on the Moj app in 10+ languages, given natural-language data and user-context data.
Drawing on hypotheses from historical linguistics, we found a way to improve the performance of multilingual transformers with a limited amount of data.
Zero-shot and Translation Experiments on XQuAD, MLQA and TyDiQA
In this project, we compared Spanish BERT and Multilingual BERT in the Sentiment Analysis task.
Language Model Decomposition: Quantifying the Dependency and Correlation of Language Models
Cross-lingual misinformation detection
Code and data for the EMNLP 2020 paper: "Detecting Fine-Grained Cross-Lingual Semantic Divergences without Supervision by Learning to Rank"
Code for the short essay project for the MPhil TAL by Yingjia Wan in 2023.
This repository contains a number of experiments with multilingual Transformer models (Multilingual BERT, DistilBERT, XLM-RoBERTa, mT5 and ByT5) focused on the Dutch language.
Match celebrity users with their respective tweets via Semantic Textual Similarity over 2.5 million+ scraped tweets from 900+ celebrity accounts, using SBERT, Streamlit, Tweepy and FastAPI.
Align parallel sentences of 104 languages with the help of mBERT and LaBSE.
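Embedding-based alignment of this kind typically reduces to nearest-neighbour search in a shared multilingual embedding space. A minimal sketch, assuming sentence embeddings have already been computed (the `align_sentences` helper and the toy vectors below are illustrative, not taken from any of the repositories above; real embeddings would come from an encoder such as LaBSE or mBERT):

```python
import numpy as np

def align_sentences(src_emb: np.ndarray, tgt_emb: np.ndarray) -> np.ndarray:
    """Match each source sentence to its most similar target sentence
    by cosine similarity (greedy argmax over the similarity matrix)."""
    src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    sim = src @ tgt.T          # cosine-similarity matrix, shape (n_src, n_tgt)
    return sim.argmax(axis=1)  # index of best target per source sentence

# Toy stand-in embeddings; target order is shuffled relative to source.
src_emb = np.array([[1.0, 0.1], [0.1, 1.0]])
tgt_emb = np.array([[0.2, 0.9], [0.9, 0.2]])
print(align_sentences(src_emb, tgt_emb))  # → [1 0]
```

Greedy argmax matching is the simplest strategy; production aligners often add margin-based scoring or mutual-nearest-neighbour filtering to reduce spurious matches.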