v0.10.0

@KennethEnevoldsen KennethEnevoldsen released this 26 Mar 12:40
· 1605 commits to main since this release

v0.10.0 (2024-03-26)

CI

  • ci: renamed test job and workflow (#282)

ci: Added tests (6675bb8)

Documentation

  • docs: typos in readme (#268) (aa9234c)

  • docs: add dataset schemas (#255)

  • docs: update AbsTaskClassification.py document schema for classification task

  • update AbsTaskBitextMining.py

  • update BornholmskBitextMining.py

  • update AbsTaskClustering.py and BlurbsClusteringP2P.py

  • update 8 files

  • update 9 files

  • update AbsTaskReranking.py

  • update BlurbsClusteringP2P.py

  • update CMTEBPairClassification.py

  • update GerDaLIRRetrieval.py

  • update 7 files

  • update AbsTaskBitextMining.py

  • update AbsTaskClassification.py (c3ce1ac)

  • docs: Add development installation instructions (#246)

  • docs: Add development installation instructions

  • removed unused requirements file

I don't believe this is necessary, since setup.py specifies the same dependencies

  • docs: Updated make file with new dependencies

  • ci: Update ci to use make commands

This ensures that the user runs exactly what the CI expects

  • ci: Avoid specifying tests folder as it causes issues with tests

  • ci: removed unnecessary args for test ci

  • Added dev install (0048878)

Feature

  • feat: update revision id of wikicitiesclustering task (fb90c02)

Fix

  • fix: dead link in readme (ecbb776)

  • fix: Added sizes to the metadata (#276)

  • restructuring the readme

  • added mmteb

  • removed unnecessary method

  • Added docstring to metadata

  • Updated outdated examples

  • formatting documents

  • fix: Updated form to be parsed correctly

  • fix: Added sizes to the metadata

This allows for automatic metadata generation

  • Updated based on feedback

  • Apply suggestions from code review

Co-authored-by: Niklas Muennighoff <[email protected]>

  • updated based on feedback

  • Added suggestion from review

  • added correction based on review

  • reformatted empty fields to None


Co-authored-by: Niklas Muennighoff <[email protected]> (cd4a012)

  • fix: remove debugging print statement (d292d93)

  • fix: pass parallel_retrieval kwarg to use DenseRetrievalParallelExactSearch (19b8f66)

  • fix: msmarco-v2 uses dev.tsv, not dev1.tsv (6908d21)

  • fix: add missing task-langs attribute (#152) (bc22909)

Refactor

  • refactor: add metadata basemodel (#260)

  • refactor: rename description to metadata dict

  • refactor: add TaskMetadata and first example

  • update 9 files

  • update TaskMetadata.py

  • update TaskMetadata.py

  • update TaskMetadata.py

  • update LICENSE, TaskMetadata.py and requirements.dev.txt

  • update 151 files

  • update 150 files

  • update 43 files and delete 1 file

  • update 106 files

  • update 45 files

  • update 6 files

  • update 14 files

  • Added model results to repo and updated CLI to create consistent folder structure. (#254)

  • Added model results to repo and updated CLI to create consistent folder structure.

  • ci: updated ci to use make install

  • Added missing pytest dependencies

  • Update README.md

Co-authored-by: Niklas Muennighoff <[email protected]>


Co-authored-by: Niklas Muennighoff <[email protected]>

  • Restructuring the readme (#262)

  • restructuring the readme

  • removed double specification of versions and moved all setup to pyproject.toml

  • correctly use flat-layout for the package

  • build(deps): update TaskMetadata.py and pyproject.toml

  • update 221 files

  • build(deps): update pyproject.toml

  • build(deps): update pyproject.toml

  • build(deps): update pyproject.toml


Co-authored-by: Kenneth Enevoldsen <[email protected]>
Co-authored-by: Niklas Muennighoff <[email protected]> (dd5d617)

Unknown

  • Ci-fix (#289)

  • added release pipeline

  • v1.3.0

  • ci: moved release to the correct folder (7f56c1a)

  • v1.3.0

  • added release pipeline

  • v1.3.0 (5e4d10e)

  • tests: speed up tests (#283)

update Makefile and test_all_abstasks.py (2155bf6)

  • update TaskMetadata.py (#281) (acfd7d4)

  • Merge branch 'main' of https://github.com/embeddings-benchmark/mteb (c9d1a03)

  • Enable ruff ci (#279)

  • restructuring the readme

  • added mmteb

  • removed unnecessary method

  • Added docstring to metadata

  • Updated outdated examples

  • formatting documents

  • fix: Updated form to be parsed correctly

  • fix: Added sizes to the metadata

This allows for automatic metadata generation

  • Updated based on feedback

  • Apply suggestions from code review

Co-authored-by: Niklas Muennighoff <[email protected]>

  • updated based on feedback

  • Added suggestion from review

  • added correction based on review

  • reformatted empty fields to None

  • CI: Enable linter


Co-authored-by: Niklas Muennighoff <[email protected]> (a16eb07)

  • Added MMTEB (#275)

  • restructuring the readme

  • added mmteb

  • removed unnecessary method

  • Added docstring to metadata

  • Updated outdated examples

  • formatting documents

  • fix: Updated form to be parsed correctly

  • Updated based on feedback

  • Apply suggestions from code review

Co-authored-by: Niklas Muennighoff <[email protected]>

  • updated based on feedback

  • Added suggestion from review

  • added correction based on review


Co-authored-by: Niklas Muennighoff <[email protected]> (c0dc49a)

  • dev: add ruff as suggested extension (#274) (b08913f)

  • dev: add isort (#271)

  • dev: add isort

  • dev: add isort (845099d)

  • dev: run tests on pull request towards any branch (13f759a)

  • Merge branch 'main' of https://github.com/embeddings-benchmark/mteb (b42abe4)

  • replaced linter with ruff (#265)

  • restructuring the readme

  • removed double specification of versions and moved all setup to pyproject.toml

  • correctly use flat-layout for the package

  • replaced linter with ruff

  • rerun tests

  • ci: Added in newer workflow

Some of them are disabled as they require other issues to be solved

  • Update Makefile

Co-authored-by: Niklas Muennighoff <[email protected]>


Co-authored-by: Niklas Muennighoff <[email protected]> (023e881)

  • Restructuring the readme (#262)

  • restructuring the readme

  • removed double specification of versions and moved all setup to pyproject.toml

  • correctly use flat-layout for the package (769157b)

  • restructuring the readme (364be7f)

  • Added model results to repo and updated CLI to create consistent folder structure. (#254)

  • Added model results to repo and updated CLI to create consistent folder structure.

  • ci: updated ci to use make install

  • Added missing pytest dependencies

  • Update README.md

Co-authored-by: Niklas Muennighoff <[email protected]>


Co-authored-by: Niklas Muennighoff <[email protected]> (8a758bc)

  • dev: add workspace defaults in VSCode (#253)

  • dev: add black as default formatter in vscode

  • Update .vscode/settings.json


Co-authored-by: Kenneth Enevoldsen <[email protected]> (30e5b9e)

  • Add Danish Discourse dataset (#247)

  • misc.

  • update ddisco.py

  • chore: delete ddisco.py, ddisco.test.tsv and ddisco.train.tsv

  • Update mteb/tasks/Classification/DdiscoCohesionClassification.py

Co-authored-by: Kenneth Enevoldsen <[email protected]>

  • Update mteb/tasks/Classification/DdiscoCohesionClassification.py

Co-authored-by: Kenneth Enevoldsen <[email protected]>

  • Update mteb/tasks/Classification/DdiscoCohesionClassification.py

Co-authored-by: Imene Kerboua <[email protected]>

  • Update mteb/tasks/Classification/DdiscoCohesionClassification.py

Co-authored-by: Imene Kerboua <[email protected]>

  • Update mteb/tasks/Classification/DdiscoCohesionClassification.py

Co-authored-by: Imene Kerboua <[email protected]>


Co-authored-by: Kenneth Enevoldsen <[email protected]>
Co-authored-by: Imene Kerboua <[email protected]> (d46d0f5)

  • Update structure of mteb/tasks to mteb/tasks/{type}/{language}  (#245)

  • Fix structure of mteb/tasks
    Fixes #243

  • fix: Added missing init files (b1c78c1)

  • tests: do not run tests on collection (#249)

test: update test_all_abstasks.py (236614a)

  • Update README.md with correct DRESModel location (399edf4)

  • Fix typo (9610378)

  • Set dev version (716f59c)

  • Release: 1.2.0 (9e9dca8)

  • Rmv superfluous file (d772fed)

  • Remove duplicate & outdated code (12bcb83)

  • Adapt scripts (36b9234)

  • Add example (273ff4a)

  • Simplify retrieval (#233)

  • Simplify retrieval

  • Simplify

  • Make call method

  • Add splits

  • Rmv outdated test

  • Fix name & \n

  • Add qrels

  • Add revisions

Co-authored-by: Imene Kerboua <[email protected]>

  • Add hf hub org

  • Add test

  • Add missing revision

  • Rename test

Co-authored-by: Imene Kerboua <[email protected]>

  • log dres compat

Co-authored-by: Imene Kerboua <[email protected]> (c9fccbc)

  • Fixed missing revision error on Norwegian Bitext Mining (#221)

  • Removed revision specification from Norwegian Bitext Mining task

  • Update to latest revision


Co-authored-by: Niklas Muennighoff <[email protected]> (b249c67)

  • Remove HAGRID from french benchmark (#235)

  • add Masakhane dataset config

  • add trigram lang code for datasets that use it

  • create french script eval

  • fix French word

  • add some documentation

  • add script to process and upload alloprof on HF

  • build script for HF

  • adding dataset processing for mteb

  • add script to process and upload alloprof on HF

  • build script for HF

  • adding dataset processing for mteb

  • refactor a few things

  • remove whitespaces

  • 4 pair classification (#10)

  • add Opusparcus dataset

  • multilingual usage

  • use eval_split of config files

  • change eval_split according to data


Co-authored-by: Gabriel Sequeira <[email protected]>

  • add script to process and upload alloprof on HF

  • build script for HF

  • adding dataset processing for mteb

  • refactor a few things

  • remove whitespaces

  • Clustering with HAL S2S dataset (#11)

HAL S2S dataset creation and evaluation on clustering task.

  • adding BSARD dataset

  • add BSARD to benchmark

  • adding Hagrid dataset

  • DiaBLa and Flores Bitext Mining evaluation (#12)

  • Add DiaBLa dataset for bitext mining

  • Add DiaBLa dataset for bitext mining

  • deduplicate bitext task

  • add Flores

  • format files

  • add flores to evaluation script

  • remove prints

  • add revision


Co-authored-by: Gabriel Sequeira <[email protected]>

  • add script to process and upload alloprof on HF

  • build script for HF

  • adding dataset processing for mteb

  • refactor a few things

  • remove whitespaces

  • adding dataset processing for mteb

  • adding BSARD dataset

  • add BSARD to benchmark

  • adding Hagrid dataset

  • fix change on langmapping

  • reset alphabetical order

  • add revision handling

  • Clustering: Add AlloProf dataset (#17)

AlloProf dataset for clustering task

  • handling of revision

  • change split + add revision handling

  • add script to process and upload alloprof on HF

  • build script for HF

  • adding dataset processing for mteb

  • refactor a few things

  • remove whitespaces

  • adding dataset processing for mteb

  • adding BSARD dataset

  • add BSARD to benchmark

  • adding Hagrid dataset

  • add script to process and upload alloprof on HF

  • adding dataset processing for mteb

  • refactor a few things

  • reset alphabetical order

  • add revision handling

  • handling of revision

  • change split + add revision handling

  • use eval variable

  • alphabetic order

  • Add MLSUM dataset for clustering task (#21)

  • Use Masakhane dataset for clustering task (#23)

  • 16 add datasets to readmemd (#18)

  • run task table

  • run task table

  • Add MLSUM dataset for clustering task (#21)

  • Use Masakhane dataset for clustering task (#23)

  • run task table

  • refresh readme

  • refresh readme

  • run task table

  • refresh readme


Co-authored-by: Gabriel Sequeira <[email protected]>
Co-authored-by: Marion Schaeffer <[email protected]>

  • load only test split (#25)

Co-authored-by: Gabriel Sequeira <[email protected]>

  • Update mteb/tasks/BitextMining/DiaBLaBitextMining.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update mteb/tasks/Clustering/HALClusteringS2S.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • renaming masakhane (#28)

Co-authored-by: Gabriel Sequeira <[email protected]>

  • Syntec dataset addition (#26)

  • add script to process & load to HF

  • add script to enable download of data from HF

  • add syntec dataset files to gitignore

  • add syntecretrieval

  • add syntec retrieval

  • build dataloading script

  • remove datasets

  • correct typo


Co-authored-by: Sequeira Gabriel <[email protected]>

  • 30 add syntec reranking (#31)

  • change name to specify retrieval

  • add reranking tasks

  • create script to upload dataset for reranking task

  • create reranking task

  • add reranking tasks

  • add model name in description

  • SummEval translated to french (#32)

  • 7 sts (#33)

  • take into account multilingual tasks

  • add stsbenchmark multilingual dataset

  • add STS tasks

  • take into account multilingual tasks

  • add stsbenchmark multilingual dataset

  • add STS tasks

  • add comma

  • Adding sick fr dataset to sts tasks (#34)

  • Adding sick fr dataset to sts tasks

  • modifying dataset in load function to have the right column names

  • Fix alloprof dataset (#36)

  • change revision to use

  • remove duplicate data

  • change main metric because dataset is hard (#37)

  • Fix alloprof dataset (#40)

  • change revision to use

  • remove duplicate data

  • change revision

  • handle queries train test split

  • change dataset creation method

  • change revision

  • handle queries train test split

  • change dataset creation method

  • Fix DiaBLa by inheriting CrossLingual class (#42)

  • Fix DiaBLa by inheriting CrossLingual class

  • remove remaining print

  • Fix DiaBLa integration

  • Update mteb/tasks/BitextMining/FloresBitextMining.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update README.md

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update README.md

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update mteb/tasks/Classification/MasakhaNEWSClassification.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update README.md

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update README.md

  • Update mteb/tasks/BitextMining/FloresBitextMining.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update mteb/evaluation/MTEB.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update mteb/abstasks/AbsTaskPairClassification.py

Co-authored-by: Imene Kerboua <[email protected]>

  • Update README.md

  • Update scripts/data/syntec/create_data_reranking.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update scripts/data/alloprof/create_data_reranking.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update scripts/run_mteb_french.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update scripts/run_mteb_french.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update mteb/evaluation/MTEB.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update mteb/evaluation/MTEB.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update mteb/tasks/Retrieval/HagridRetrieval.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update mteb/tasks/Clustering/MLSUMClusteringP2P.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update mteb/tasks/Clustering/MLSUMClusteringS2S.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update mteb/tasks/Clustering/MasakhaNEWSClusteringP2P.py

  • Update mteb/tasks/Clustering/MasakhaNEWSClusteringS2S.py

  • Update mteb/tasks/STS/SickFrSTS.py

  • Inherit OpusparcusPC init from MultilingualTask

  • remove unnecessary init

  • Remove train split from evaluation on MasakhaNEWSClassification (#52)

remove train split from evaluation

  • put script on HF dataset repos (#56)

  • put script on HF dataset repos

  • remove scripts

  • 49 fix dictionary in syntecretrieval (#54)

  • add trust remote code arg

  • leave corpus as dict

  • remove trust remote code

  • add Tatoeba & BUCC BitextMining tasks (#57)

add bucc and tatoeba bitextmining tasks

  • 46 add other languages to masakhanewsclusterings2s and p2p (#58)

  • add other language to clustering tasks

  • fix main score and S2S task

  • update run fr benchmark script

  • Update run_mteb_french.py

  • Update AbsTaskClustering.py

  • remove train and validation splits

  • remove Hagrid (#60)


Co-authored-by: Gabriel Sequeira <[email protected]>
Co-authored-by: Marion Schaeffer <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: Sequeira Gabriel <[email protected]>
Co-authored-by: Imene Kerboua <[email protected]>
Co-authored-by: Niklas Muennighoff <[email protected]>
Co-authored-by: wissam-sib <[email protected]>
Co-authored-by: Wissam Siblini <[email protected]> (d01d053)

  • Restore TRECCOVID import (9f8e897)

  • Extend MTEB with French datasets (#218)

  • add Masakhane dataset config

  • add trigram lang code for datasets that use it

  • create french script eval

  • fix French word

  • add some documentation

  • add script to process and upload alloprof on HF

  • build script for HF

  • adding dataset processing for mteb

  • add script to process and upload alloprof on HF

  • build script for HF

  • adding dataset processing for mteb

  • refactor a few things

  • remove whitespaces

  • 4 pair classification (#10)

  • add Opusparcus dataset

  • multilingual usage

  • use eval_split of config files

  • change eval_split according to data


Co-authored-by: Gabriel Sequeira <[email protected]>

  • add script to process and upload alloprof on HF

  • build script for HF

  • adding dataset processing for mteb

  • refactor a few things

  • remove whitespaces

  • Clustering with HAL S2S dataset (#11)

HAL S2S dataset creation and evaluation on clustering task.

  • adding BSARD dataset

  • add BSARD to benchmark

  • adding Hagrid dataset

  • DiaBLa and Flores Bitext Mining evaluation (#12)

  • Add DiaBLa dataset for bitext mining

  • Add DiaBLa dataset for bitext mining

  • deduplicate bitext task

  • add Flores

  • format files

  • add flores to evaluation script

  • remove prints

  • add revision


Co-authored-by: Gabriel Sequeira <[email protected]>

  • add script to process and upload alloprof on HF

  • build script for HF

  • adding dataset processing for mteb

  • refactor a few things

  • remove whitespaces

  • adding dataset processing for mteb

  • adding BSARD dataset

  • add BSARD to benchmark

  • adding Hagrid dataset

  • fix change on langmapping

  • reset alphabetical order

  • add revision handling

  • Clustering: Add AlloProf dataset (#17)

AlloProf dataset for clustering task

  • handling of revision

  • change split + add revision handling

  • add script to process and upload alloprof on HF

  • build script for HF

  • adding dataset processing for mteb

  • refactor a few things

  • remove whitespaces

  • adding dataset processing for mteb

  • adding BSARD dataset

  • add BSARD to benchmark

  • adding Hagrid dataset

  • add script to process and upload alloprof on HF

  • adding dataset processing for mteb

  • refactor a few things

  • reset alphabetical order

  • add revision handling

  • handling of revision

  • change split + add revision handling

  • use eval variable

  • alphabetic order

  • Add MLSUM dataset for clustering task (#21)

  • Use Masakhane dataset for clustering task (#23)

  • 16 add datasets to readmemd (#18)

  • run task table

  • run task table

  • Add MLSUM dataset for clustering task (#21)

  • Use Masakhane dataset for clustering task (#23)

  • run task table

  • refresh readme

  • refresh readme

  • run task table

  • refresh readme


Co-authored-by: Gabriel Sequeira <[email protected]>
Co-authored-by: Marion Schaeffer <[email protected]>

  • load only test split (#25)

Co-authored-by: Gabriel Sequeira <[email protected]>

  • Update mteb/tasks/BitextMining/DiaBLaBitextMining.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update mteb/tasks/Clustering/HALClusteringS2S.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • renaming masakhane (#28)

Co-authored-by: Gabriel Sequeira <[email protected]>

  • Syntec dataset addition (#26)

  • add script to process & load to HF

  • add script to enable download of data from HF

  • add syntec dataset files to gitignore

  • add syntecretrieval

  • add syntec retrieval

  • build dataloading script

  • remove datasets

  • correct typo


Co-authored-by: Sequeira Gabriel <[email protected]>

  • 30 add syntec reranking (#31)

  • change name to specify retrieval

  • add reranking tasks

  • create script to upload dataset for reranking task

  • create reranking task

  • add reranking tasks

  • add model name in description

  • SummEval translated to french (#32)

  • 7 sts (#33)

  • take into account multilingual tasks

  • add stsbenchmark multilingual dataset

  • add STS tasks

  • take into account multilingual tasks

  • add stsbenchmark multilingual dataset

  • add STS tasks

  • add comma

  • Adding sick fr dataset to sts tasks (#34)

  • Adding sick fr dataset to sts tasks

  • modifying dataset in load function to have the right column names

  • Fix alloprof dataset (#36)

  • change revision to use

  • remove duplicate data

  • change main metric because dataset is hard (#37)

  • Fix alloprof dataset (#40)

  • change revision to use

  • remove duplicate data

  • change revision

  • handle queries train test split

  • change dataset creation method

  • change revision

  • handle queries train test split

  • change dataset creation method

  • Fix DiaBLa by inheriting CrossLingual class (#42)

  • Fix DiaBLa by inheriting CrossLingual class

  • remove remaining print

  • Fix DiaBLa integration

  • Update mteb/tasks/BitextMining/FloresBitextMining.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update README.md

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update README.md

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update mteb/tasks/Classification/MasakhaNEWSClassification.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update README.md

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update README.md

  • Update mteb/tasks/BitextMining/FloresBitextMining.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update mteb/evaluation/MTEB.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update mteb/abstasks/AbsTaskPairClassification.py

Co-authored-by: Imene Kerboua <[email protected]>

  • Update README.md

  • Update scripts/data/syntec/create_data_reranking.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update scripts/data/alloprof/create_data_reranking.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update scripts/run_mteb_french.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update scripts/run_mteb_french.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update mteb/evaluation/MTEB.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update mteb/evaluation/MTEB.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update mteb/tasks/Retrieval/HagridRetrieval.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update mteb/tasks/Clustering/MLSUMClusteringP2P.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update mteb/tasks/Clustering/MLSUMClusteringS2S.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update mteb/tasks/Clustering/MasakhaNEWSClusteringP2P.py

  • Update mteb/tasks/Clustering/MasakhaNEWSClusteringS2S.py

  • Update mteb/tasks/STS/SickFrSTS.py

  • Inherit OpusparcusPC init from MultilingualTask

  • remove unnecessary init

  • Remove train split from evaluation on MasakhaNEWSClassification (#52)

remove train split from evaluation

  • put script on HF dataset repos (#56)

  • put script on HF dataset repos

  • remove scripts

  • 49 fix dictionary in syntecretrieval (#54)

  • add trust remote code arg

  • leave corpus as dict

  • remove trust remote code

  • add Tatoeba & BUCC BitextMining tasks (#57)

add bucc and tatoeba bitextmining tasks

  • 46 add other languages to masakhanewsclusterings2s and p2p (#58)

  • add other language to clustering tasks

  • fix main score and S2S task

  • update run fr benchmark script

  • Update run_mteb_french.py

  • Update AbsTaskClustering.py

  • remove train and validation splits


Co-authored-by: Gabriel Sequeira <[email protected]>
Co-authored-by: Marion Schaeffer <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: Imene Kerboua <[email protected]>
Co-authored-by: mciancone <[email protected]>
Co-authored-by: Niklas Muennighoff <[email protected]>
Co-authored-by: wissam-sib <[email protected]>
Co-authored-by: Wissam Siblini <[email protected]> (3d8b8ec)

  • dev (c16eddc)

  • Dev (08c7317)

  • Add tasks for Spanish Embedding Evaluation (#227)

  • feat: add xmarket es dataset

  • refactor: use multilingual dataset

  • fix: update revision id

  • refactor: add constant for language

  • feat: add two clustering datasets

Signed-off-by: jupyterjazz <[email protected]>

  • feat: import classes

Signed-off-by: jupyterjazz <[email protected]>

  • refactor: flores dataset

Signed-off-by: jupyterjazz <[email protected]>

  • feat: add miracl reranking task for spanish

  • feat: use hf repo with all reranking langs

  • feat: update revision hash

  • refactor: use description for language

  • feat: add stses task

  • fix: get scores from label column

  • refactor: add revision to data loading

  • Added spanish passage retrieval

  • feat: mintaka and xpqa retrieval tasks

Signed-off-by: jupyterjazz <[email protected]>

  • feat: import classes

Signed-off-by: jupyterjazz <[email protected]>

  • fix: typo in data loading

  • fix: id

Signed-off-by: jupyterjazz <[email protected]>

  • refactor: try out multilingual task

Signed-off-by: jupyterjazz <[email protected]>

  • refactor: multilingual task import

Signed-off-by: jupyterjazz <[email protected]>

  • refactor: cmon man

Signed-off-by: jupyterjazz <[email protected]>

  • refactor: go back to monolingual tasks

Signed-off-by: jupyterjazz <[email protected]>

  • refactor: remove unused import

Signed-off-by: jupyterjazz <[email protected]>

  • refactor: loading logic

Signed-off-by: jupyterjazz <[email protected]>

  • feat: add miracl as retrieval task

  • fix: nested corpus

  • refactor: get lang from description

  • Update mteb/tasks/Retrieval/MIRACLRetrieval.py

Co-authored-by: Michael Günther <[email protected]>

  • feat: allow multilingual reranking tasks

  • feat: make miraclreranking multilingual

  • refactor: rename miraclretrieval

Co-authored-by: Niklas Muennighoff <[email protected]>

  • style: add missing eof empty line

  • feat: make xmarket retrieval task multilingual

  • refactor: rename xmarket

  • refactor: turn spanish tasks multilingual (#11)

  • refactor: make xpqa retrieval multilingual

  • fix: formatting of xpqa dataset

  • refactor: make mintaka into multilingual task

  • refactor: make miracl retrieval multilingual

  • feat: add revision ids for hf datasets

  • refactor: remove patool

  • Update mteb/tasks/Reranking/__init__.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update mteb/tasks/STS/__init__.py

Co-authored-by: Niklas Muennighoff <[email protected]>


Signed-off-by: jupyterjazz <[email protected]>
Co-authored-by: guenthermi <[email protected]>
Co-authored-by: jupyterjazz <[email protected]>
Co-authored-by: Markus Krimmel <[email protected]>
Co-authored-by: Michael Günther <[email protected]>
Co-authored-by: Niklas Muennighoff <[email protected]> (52d5c9f)

  • Release: 1.1.2 (def3c91)

  • Add task list (#228)

  • Add task list

  • Update mteb/__init__.py

  • Update README.md (10bf6f8)

  • Update BeIRPLTask.py (#225)

  • Update BeIRPLTask.py

  • Update BeIRPLTask.py (a8922c1)

  • Allow multiple languages (2cc222e)

  • Add Korean Text Search Tasks to MTEB (#210)

  • add Ko-miracl, Ko-StrategyQA, Ko-mrtydi tasks

  • Update mteb/abstasks/AbsTaskRetrieval.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update AbsTaskRetrieval.py

  • Update mteb/abstasks/AbsTaskRetrieval.py

Co-authored-by: Niklas Muennighoff <[email protected]>

  • Update scripts/run_mteb_korean.py

Co-authored-by: Niklas Muennighoff <[email protected]>


Co-authored-by: Niklas Muennighoff <[email protected]> (dadf2da)

  • Add MultiLongDocRetrieval task to MTEB. (#224)

  • Update AbsTaskRetrieval.py.

  • Add Retrieval Task: MultiLongDocRetrieval

  • Update AbsTaskRetrieval.py and MLDR task

  • Update reference of MLDR (2f65179)

  • Fix name (2989f76)

  • only save top-k (#209)

  • Update AbsTaskRetrieval.py

  • Add json import; rename kwarg

  • Pass OF

  • Update mteb/abstasks/AbsTaskRetrieval.py

  • Update AbsTaskRetrieval.py

  • Update AbsTaskRetrieval.py

  • Update mteb/abstasks/AbsTaskRetrieval.py


Co-authored-by: Niklas Muennighoff <[email protected]> (f58888d)

  • Add tasks for German Embedding Evaluation (#214)

  • chore: solve merge conflict

  • fix: gerdalir dataset

  • fix: lang from en to de

  • chore: solve merge conflict

  • chore: add ir datasets to requirements

  • refactor: limit queries to 10k

  • refactor: update description of task with limit

  • revert style changes

  • feat: add german stsbenchmarksts task

  • feat: update revision id

  • refactor: update revision id after changes in scores

  • add XMarket dataset

  • add xmarket to init file

  • feat: add revision id

  • add paws x dataset

  • Add ir_datasets as dependency

  • add GermanDPR dataset

  • fix loading

  • Update mteb/tasks/Retrieval/GermanDPRRetrieval.py

Co-authored-by: Saba Sturua <[email protected]>

  • feat: add miracl reranking task for german

  • refactor: cleanup task

  • prevent duplicate pos docs

  • fix: use test split in MIRACL (#13)

Fixes mismatch between description and HuggingFace dataset

  • refactor: remove WikiCLIR

  • fix: double import; xmarket name

  • add German tasks to run_mteb_german script

  • update revisions and style

  • update MIRACL to work with latest version

  • revert adding ir_datasets

  • support multilingual pair classification

  • remove print statement

  • Apply suggestions from code review

Co-authored-by: Niklas Muennighoff <[email protected]>

  • fix monolingual pair classification

  • remove lang for monolingual tasks


Co-authored-by: Isabelle Mohr <[email protected]>
Co-authored-by: Markus Krimmel <[email protected]>
Co-authored-by: Saba Sturua <[email protected]>
Co-authored-by: Markus Krimmel <[email protected]>
Co-authored-by: Niklas Muennighoff <[email protected]> (9aba9ee)

  • Simplify (1cd07db)

  • Refer to other works (8f28bcb)

  • Update mteb/tasks/Retrieval/GermanQuADRetrieval.py

Co-authored-by: Niklas Muennighoff <[email protected]> (09a9cb0)

  • clean up (51c40fd)

  • WIP: implement requested changes (58baad2)

  • remove code for writing JSONL dataset (d23eac3)

  • add docstring, remove local qrels (af7ee50)

  • fix query id in qrel dataset, ready to merge (33c9dd4)

  • WIP: use HF dataset instead of local JSONL (db3fea1)

  • rename BeIRDETask (e56cf86)

  • Update scripts/run_mteb_german.py

Co-authored-by: Niklas Muennighoff <[email protected]> (4b18a7e)

  • Update mteb/tasks/Retrieval/GermanRetrieval.py

Co-authored-by: Niklas Muennighoff <[email protected]> (3fef61a)


Co-authored-by: Isabelle Mohr <[email protected]> (88beb46)

  • Do not enforce rich import (aa11fe7)

  • fix RerankingEvaluator's compute_metrics_individual (fd7bfac)

  • Fix SummEval import (859d38e)

  • Increment version (4d75ddf)

  • Release: 1.1.1 (d3aaf4f)

  • Merge branch 'main' into fixconversion (d292258)

  • Fix eval_lang (7836148)

  • Simplify code snippets (d434f52)

  • Simplify wording (3adb0b5)

  • Clarify multi-gpu usage (5a2da23)

  • Fix splits (93f6f85)

  • Improve Cust Model explanation (52c1fd8)

  • Add bs to Clustering test (4df0d2e)

  • Rely on auto-conversion to tensor in score function (d8512f7)

  • Rely on standard encode kwargs only (4c1660e)

  • Improve Cust Model explanation (23d758f)

  • Add bs to Clustering test (6e0c0d2)

  • Rely on auto-conversion to tensor in score function (7ec4c57)

  • Rely on standard encode kwargs only (2fad0f9)

  • Update README.md (d9aa70f)

  • Update README.md (2211f83)

  • Simplify assertion (f7fcbc1)

  • Default to false (d64f6c7)

  • Add multi gpu eval to readme (#140)

update readme (1b1c9d3)

  • Support Multi-node Evaluation (#132)

  • styling

  • USE_HF_DATASETS

  • Support DRPES

  • we use beir.datasets.data_loader_hf in case of non dist

  • distributed fixes

  • update run command

  • cleanup

  • .

  • sugg

  • ruff (0dd82a9)

  • Add Chinese tasks (C-MTEB) (#134)

  • add C_MTEB

  • add C_MTEB

  • rename MMarcoReranking

  • rename MMarcoReranking

  • Update mteb/tasks/Retrieval/CMTEBRetrieval.py

  • Update README.md

  • Allow custom encode functions


Co-authored-by: shitao <[email protected]>
Co-authored-by: Nouamane Tazi <[email protected]>
Co-authored-by: Niklas Muennighoff <[email protected]> (071974a)

  • Add Polish tasks (PL-MTEB) (#137)

  • Add Polish tasks (PL-MTEB)

  • Add Polish datasets to README

  • Add newline


Co-authored-by: rposwiata <[email protected]>
Co-authored-by: Niklas Muennighoff <[email protected]> (2779344)

  • Add BEIR-PL datasets to MTEB (#121)

  • Add BEIR-PL benchmark

  • Update README with BEIR-PL datasets

  • Update names

  • Add tasks to init to be visible during evaluation


Co-authored-by: Konrad Wojtasik <[email protected]>
Co-authored-by: Niklas Muennighoff <[email protected]> (5972c02)

  • Replaced prints with logging (#133)

  • Make sure that main score is added to bitext mining tasks

  • Added scandinavian languages: da, no, sv

  • merge upstream main

  • fix: Replaced prints with logging statements

  • chore: removed accidental commits (d7ca378)

  • add logging (6412a6a)

  • Merge pull request #131 from embeddings-benchmark/nouamane/quick-fixes

Code cleanup (4fb97d0)

  • Bump version ID and update PyPI after adding additional tasks. (4a4b54b)

  • Fix typo (33a3140)

  • Sort imports (ab2eef8)

  • Sort imports (3432374)

  • Raise error first (0b1bfd2)

  • Added support for Scandinavian Languages (#124)

  • Make sure that main score is added to bitext mining tasks

  • Added scandinavian languages: da, no, sv

  • Updated readme with scandinavian tasks

  • Changes n samples for the nordic lang CLF

  • Added scandinavian models to init

  • Added error logs to gitignore

  • fix import error

  • fix dataset columns

  • rename dataset columns

  • remove swefaq

  • fix: Added functionality to raise error

  • fix: Updated names

  • fix: Removed no as a language

  • Added missing data transformation

  • Fix spelling error (acb0f59)

  • Install beir (c50b8ab)

  • Update README.md (29ffedf)

  • ruff (6a58b5d)

  • Update README.md (5825536)

  • fix revision hash for TenKGnadClusteringP2P dataset

Co-authored-by: Niklas Muennighoff <[email protected]> (eb622f8)

  • change dataset order for BlurbsClustering in README

Co-authored-by: Niklas Muennighoff <[email protected]> (f6e49ba)

  • change dataset order for TenKGnadClustering in README

Co-authored-by: Niklas Muennighoff <[email protected]> (2a2c47f)

  • fix descriptions for German clustering datasets (30a966c)

  • add German clustering tasks to README (62457e3)

  • update reference & category for TenKGnad datasets (2174a47)

  • add German clustering tasks (ab469be)

  • Allow abs path (b56528c)

  • Add @property annotation to description method of AbsTask (98b0443)

  • fix typo (37a986b)

  • fix extend lang pairs (865dffc)

  • Fix clustering eval, black, isort (bc43665)

  • Add 'auto' to sklearn clustering, add test, fix warning (15ce352)

  • Update MSMARCORetrieval.py (d913f56)

  • Revert to old split (1f3ff6e)

  • Add wheel instruction (62fad9b)

  • Dev version (d988e48)

  • Release: 1.0.2 (e189bae)

  • Add comment

Co-authored-by: Nouamane Tazi <[email protected]> (3e72ee8)

  • Fix naming (33f2db9)

  • Cleaner logging & tqdm usage (542d871)

  • Add kwargs (e0b801d)

  • Produce embeddings in one go (e88bcf2)

  • Fix naming (6c62f18)

  • Make inputs always List[str] & call in one (bdeeedf)

  • Fix SummEval description (0c2b1be)

  • fix SummEval description

Unless I'm missing something, I think the SummEval description is incorrect: the dataset consists of summaries of news articles, not biomedical abstracts. (1ccc068)