Showing 8 changed files with 35,840 additions and 0 deletions.
Binary file not shown.

deep-learning/Transformer-Tutorials/ViLT/Fine_tuning_ViLT_for_VQA.ipynb: 22,864 additions, 0 deletions (large diff not rendered by default)

deep-learning/Transformer-Tutorials/ViLT/Inference_with_ViLT_(visual_question_answering).ipynb: 1,051 additions, 0 deletions (large diff not rendered by default)

deep-learning/Transformer-Tutorials/ViLT/Masked_language_modeling_with_ViLT.ipynb: 3,196 additions, 0 deletions (large diff not rendered by default)
@@ -0,0 +1,10 @@
# ViLT notebooks

In this directory, you can find several notebooks that illustrate how to use NAVER AI Lab's [ViLT](https://arxiv.org/abs/2102.03334), both for fine-tuning on custom data and for inference. It currently includes the following notebooks:

- fine-tuning ViLT for visual question answering (VQA), based on the [VQAv2 dataset](https://visualqa.org/)
- performing inference with ViLT to illustrate visual question answering (VQA); see the sketch just below this list
- masked language modeling (MLM) with a pre-trained ViLT model; see the second sketch at the end of this README
- performing inference with ViLT for image-text retrieval
- performing inference with ViLT to illustrate natural language for visual reasoning, based on the [NLVRv2 dataset](https://lil.nlp.cornell.edu/nlvr/)
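To give a feel for the inference notebook, here is a minimal VQA sketch against the Hugging Face `transformers` API. It is a sketch, not the notebook's exact code: the `dandelin/vilt-b32-finetuned-vqa` checkpoint and the COCO image URL are illustrative choices. ViLT casts VQA as classification over a fixed answer vocabulary, so the prediction is an argmax over answer logits rather than generated text.

```python
import requests
from PIL import Image
from transformers import ViltProcessor, ViltForQuestionAnswering

# Illustrative inputs: a COCO validation image and a free-form question
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
question = "How many cats are there?"

# ViLT fine-tuned on VQAv2
processor = ViltProcessor.from_pretrained("dandelin/vilt-b32-finetuned-vqa")
model = ViltForQuestionAnswering.from_pretrained("dandelin/vilt-b32-finetuned-vqa")

# The processor handles image preprocessing and text tokenization in one call
encoding = processor(image, question, return_tensors="pt")
outputs = model(**encoding)

# VQA as classification: pick the highest-scoring answer label
idx = outputs.logits.argmax(-1).item()
print("Predicted answer:", model.config.id2label[idx])
```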
All models can be found on the [hub](https://huggingface.co/models?search=vilt).
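Likewise, a minimal sketch of masked language modeling with a pre-trained ViLT, assuming the `dandelin/vilt-b32-mlm` checkpoint. It fills each `[MASK]` independently and greedily (the notebook decodes more carefully), relying on text tokens occupying the start of ViLT's multimodal sequence so that `[MASK]` positions in `input_ids` index directly into the logits.

```python
import requests
import torch
from PIL import Image
from transformers import ViltProcessor, ViltForMaskedLM

# Illustrative inputs: a COCO validation image and a partially masked caption
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
text = "a bunch of [MASK] laying on a [MASK]."

processor = ViltProcessor.from_pretrained("dandelin/vilt-b32-mlm")
model = ViltForMaskedLM.from_pretrained("dandelin/vilt-b32-mlm")

encoding = processor(image, text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**encoding)

# Text tokens come first in the multimodal sequence, so the [MASK]
# positions found in input_ids index directly into the MLM logits
mask_positions = (encoding.input_ids == processor.tokenizer.mask_token_id).nonzero(as_tuple=True)
predicted_ids = outputs.logits[mask_positions].argmax(-1)
print("Greedy fill:", processor.tokenizer.decode(predicted_ids))
```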