Examples and tutorials

Nandan Thakur edited this page Jun 29, 2022 · 1 revision

🍻 Examples and Tutorials

To understand BEIR and get hands-on experience with it, we invite you to try out our tutorials 🚀 🚀

🍻 Google Colab

| Name | Link |
| --- | --- |
| How to evaluate pre-trained models on BEIR datasets | Open In Colab |

🍻 Lexical Retrieval (Evaluation)

| Name | Link |
| --- | --- |
| BM25 Retrieval with Elasticsearch | evaluate_bm25.py |
| Anserini-BM25 (Pyserini) Retrieval with Docker | evaluate_anserini_bm25.py |
| Multilingual BM25 Retrieval with Elasticsearch 🆕 | evaluate_multilingual_bm25.py |
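
The scripts above delegate BM25 scoring to Elasticsearch or Anserini. As a rough illustration of what such a lexical scorer computes, here is a minimal pure-Python sketch. It is not the implementation used by either engine: the function name `bm25_scores`, the whitespace tokenizer, and the `k1` and `b` values are assumptions chosen for illustration, and the IDF uses the non-negative `log(1 + ...)` form.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every document in `docs` against `query` with BM25.

    `docs` maps doc_id -> text. Tokenization is a plain lowercase
    whitespace split, far simpler than a real search-engine analyzer.
    """
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized.values()) / n_docs

    # Document frequency: number of documents containing each term.
    df = Counter()
    for toks in tokenized.values():
        df.update(set(toks))

    scores = {}
    for doc_id, toks in tokenized.items():
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: longer-than-average documents are penalized.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[doc_id] = score
    return scores


# Toy corpus (illustrative only, not a BEIR dataset).
docs = {
    "d1": "bm25 is a lexical ranking function",
    "d2": "dense retrievers embed queries and documents",
    "d3": "elasticsearch uses bm25 for lexical search",
}
scores = bm25_scores("bm25 lexical", docs)
ranking = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
print(ranking)
```

Documents that share no term with the query score zero, which is the main limitation the dense and sparse approaches below try to address.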

🍻 Dense Retrieval (Evaluation)

| Name | Link |
| --- | --- |
| Exact-search retrieval using (dense) Sentence-BERT | evaluate_sbert.py |
| Exact-search retrieval using (dense) ANCE | evaluate_ance.py |
| Exact-search retrieval using (dense) DPR | evaluate_dpr.py |
| Exact-search retrieval using (dense) USE-QA | evaluate_useqa.py |
| ANN and Exact-search using Faiss 🆕 | evaluate_faiss_dense.py |
| Retrieval using Binary Passage Retriever (BPR) 🆕 | evaluate_bpr.py |
| Dimension Reduction using PCA 🆕 | evaluate_dim_reduction.py |
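
"Exact search" here means comparing the query embedding against every document embedding, as opposed to approximate nearest-neighbor (ANN) search over an index. A minimal sketch of that idea follows. The vectors are hand-written toy values rather than model outputs, and `cosine` and `exact_search` are hypothetical helper names, not part of BEIR.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def exact_search(query_vec, doc_vecs, top_k=2):
    """Brute-force search: score every document, then keep the top_k."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional "embeddings" (assumed values for illustration).
doc_vecs = {
    "d1": [1.0, 0.0, 0.0],
    "d2": [0.0, 1.0, 0.0],
    "d3": [0.7, 0.7, 0.0],
}
hits = exact_search([1.0, 0.1, 0.0], doc_vecs, top_k=2)
print(hits)
```

The cost grows linearly with corpus size, which is why ANN indexes such as those provided by Faiss become attractive for large collections.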

🍻 Sparse Retrieval (Evaluation)

| Name | Link |
| --- | --- |
| Hybrid sparse retrieval using SPARTA | evaluate_sparta.py |
| Sparse retrieval using docT5query and Pyserini | evaluate_anserini_docT5query.py |
| Sparse retrieval using docT5query (MultiGPU) and Pyserini 🆕 | evaluate_anserini_docT5query_parallel.py |
| Sparse retrieval using DeepCT and Pyserini 🆕 | evaluate_deepct.py |

🍻 Reranking (Evaluation)

| Name | Link |
| --- | --- |
| Reranking top-100 BM25 results with SBERT CE | evaluate_bm25_ce_reranking.py |
| Reranking top-100 BM25 results with Dense Retriever | evaluate_bm25_sbert_reranking.py |
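
Both scripts follow a two-stage pattern: a cheap first stage (BM25) proposes candidates, and a more expensive model re-scores only the top few. The sketch below shows that control flow in isolation. The names `rerank` and `toy_overlap_scorer` are hypothetical, and the token-overlap scorer is only a stand-in for a real cross-encoder or dense model.

```python
def rerank(query, first_stage, corpus, score_pair, top_k=100):
    """Re-score the top_k first-stage hits for a single query.

    first_stage: dict doc_id -> first-stage score (for example BM25)
    score_pair:  callable (query_text, doc_text) -> float
    """
    # Keep only the top_k candidates by first-stage score.
    candidates = sorted(first_stage, key=first_stage.get, reverse=True)[:top_k]
    # Re-score each (query, document) pair with the second-stage scorer.
    rescored = {doc_id: score_pair(query, corpus[doc_id]) for doc_id in candidates}
    return sorted(rescored.items(), key=lambda kv: kv[1], reverse=True)


def toy_overlap_scorer(query, doc):
    # Stand-in scorer: fraction of query tokens that appear in the document.
    q_tokens = set(query.lower().split())
    d_tokens = set(doc.lower().split())
    return len(q_tokens & d_tokens) / len(q_tokens)


corpus = {
    "d1": "reranking with a cross-encoder",
    "d2": "bm25 retrieves candidate documents",
    "d3": "a cross-encoder scores query and document jointly",
}
bm25_hits = {"d1": 2.1, "d2": 3.4, "d3": 1.7}  # made-up first-stage scores
reranked = rerank("cross-encoder scores", bm25_hits, corpus, toy_overlap_scorer, top_k=2)
print(reranked)
```

Note that `d3` is never re-scored because it falls outside the first-stage top 2: a reranker can only reorder what the first stage retrieves.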

🍻 Dense Retrieval (Training)

| Name | Link |
| --- | --- |
| Train SBERT with Inbatch negatives | train_sbert.py |
| Train SBERT with BM25 hard negatives | train_sbert_BM25_hardnegs.py |
| Train MSMARCO SBERT with BM25 Negatives | train_msmarco_v2.py |
| Train (SOTA) MSMARCO SBERT with Mined Hard Negatives 🆕 | train_msmarco_v3.py |
| Train (SOTA) MSMARCO BPR with Mined Hard Negatives 🆕 | train_msmarco_v3_bpr.py |
| Train (SOTA) MSMARCO SBERT with Mined Hard Negatives (Margin-MSE) 🆕 | train_msmarco_v3_margin_MSE.py |
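
To make "in-batch negatives" concrete: for a batch of (query, positive passage) pairs, each query treats the positives of the other queries as its negatives, and the loss is a cross-entropy over the resulting similarity matrix. The following is a minimal sketch of a loss of that general form, not the code used by the training scripts. The function name and the toy score matrices are assumptions for illustration.

```python
import math


def in_batch_negatives_loss(sim):
    """Mean cross-entropy where sim[i][j] = score(query_i, passage_j).

    The positive for query i is passage i (the diagonal); every other
    passage in the batch acts as a negative.
    """
    total = 0.0
    for i, row in enumerate(sim):
        max_s = max(row)  # subtract the max for numerical stability
        log_denom = max_s + math.log(sum(math.exp(s - max_s) for s in row))
        total += log_denom - row[i]  # -log softmax(row)[i]
    return total / len(sim)


# Toy score matrices: positives score high on the diagonal in `good`,
# while `bad` ranks the wrong passage first for every query.
good = [[5.0, 0.0], [0.0, 5.0]]
bad = [[0.0, 5.0], [5.0, 0.0]]
loss_good = in_batch_negatives_loss(good)
loss_bad = in_batch_negatives_loss(bad)
print(loss_good, loss_bad)
```

Hard negatives (BM25-mined or model-mined, as in the later scripts) extend this idea by adding passages that are deliberately difficult to distinguish from the positive.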

🍻 Question Generation

| Name | Link |
| --- | --- |
| Synthetic Query Generation using T5-model | query_gen.py |
| (GenQ) Synthetic QG using T5-model + fine-tuning SBERT | query_gen_and_train.py |
| Synthetic Query Generation using Multiple GPU and T5 🆕 | query_gen_multi_gpu.py |

🍻 Benchmarking (Evaluation)

| Name | Link |
| --- | --- |
| Benchmark BM25 (Inference speed) | benchmark_bm25.py |
| Benchmark Cross-Encoder Reranking (Inference speed) | benchmark_bm25_ce_reranking.py |
| Benchmark Dense Retriever (Inference speed) | benchmark_sbert.py |