
Releases: embeddings-benchmark/mteb

1.25.7

29 Dec 15:47

1.25.7 (2024-12-29)

Fix

  • fix: Correction of discrepancies for gte-Qwen model (#1637) (2de61b1)

Unknown

1.25.6

24 Dec 23:16

1.25.6 (2024-12-24)

Fix

  • fix: Update results_to_dataframe to use BenchmarkResults class (#1628) (02ae4fa)

Unknown

  • Feat: Add jasper (#1591)

  • init jasper

  • init jasper

  • add to overview

  • add to overview

  • remove some params

  • fix max length

  • return sdpa

  • add dtype

  • add dtype

  • fix convert_to_tensor

  • change to encode

  • return whitespace processing

  • explicitly add instructions

  • move seq length

  • try float

  • fix max_seq_length

  • add prompt validation to format instruction

  • don't use instructions only to s2p (ef5a068)

1.25.5

22 Dec 22:24

1.25.5 (2024-12-22)

Fix

  • fix: properly add mteb_model_meta to model object (#1623) (72a457e)

  • fix: GermanDPR Dataset Causes Cross-Encoder Failure Due to Unexpected dict (#1621)

Fixes #1609 (748033e)

Unknown

  • add MSMARCO eval split in MTEB English (classic) benchmark (#1620)

  • add MSMARCO eval split in MTEB English (classic) benchmark

Fixes #1608

  • Add co-author

Co-authored-by: aashka-trivedi (e1b74f2)

1.25.4

22 Dec 13:42

1.25.4 (2024-12-22)

Fix

  • fix: override existing results (#1617)

  • fix override existing results

  • lint

  • fix tests

  • add tests with overwrite

  • lint

  • update tests

  • lint

  • update

  • lint (272adb1)

1.25.3

20 Dec 19:38

1.25.3 (2024-12-20)

Fix

  • fix: set use_instructions to True in models using prompts (#1616)

feat: set use_instructions to True in models using prompts (0c44482)

1.25.2

20 Dec 15:49

1.25.2 (2024-12-20)

Fix

Unknown

  • Add IBM Granite Embedding Models (#1613)

  • add IBM granite embedding models

  • lint formatting

  • add adapted_from and superseded_by to ModelMeta (ad05983)

  • Feat: Evaluate missing languages (#1584)

  • init

  • fix tests

  • update mock retrieval

  • update tests

  • use subsets instead of langs

  • Apply suggestions from code review
    Co-authored-by: Isaac Chung

  • fix tests

  • add to readme

  • rename subset in readme


Co-authored-by: Isaac Chung (48cb97d)

  • Update tasks table (9de7f20)

  • Add NanoBEIR Datasets (#1588)

  • add NanoClimateFeverRetrieval task, still requires some debugging

  • move task to correct place in init file

  • add all Nano datasets and results

  • format code

  • Update mteb/tasks/Retrieval/eng/tempCodeRunnerFile.py
    Co-authored-by: Roman Solomatin

  • pin revision to commit and add datasets to benchmark.py

  • create new benchmark for NanoBEIR

  • add revision when loading datasets

  • lint


Co-authored-by: Roman Solomatin
Co-authored-by: isaac-chung (6731b94)

  • Feat: Use similarity scores if available (#1602)

  • Use similarity scores if available

  • lint (b81b584)

1.25.1

16 Dec 20:04

1.25.1 (2024-12-16)

Fix

  • fix: Leaderboard refinements (#1603)

  • Added explanation of aggregate measures

  • Added download button to result tables

  • Task info gets sorted by task name

  • Added custom, shareable links for each benchmark

  • Moved explanation of aggregate metrics to the summary tab (6ecc86f)

Unknown

  • Leaderboard: Refined plots (#1601)

  • Added embedding size guide to performance-size plot, removed shading on radar chart

  • Changed plot names to something more descriptive

  • Made plots failsafe (0c9e046)

  • Add new models nvidia, gte, linq (#1436)

  • Add new models nvidia, gte, linq

  • add warning for gte-Qwen and nvidia models re: instruction used in docs as well


Co-authored-by: isaac-chung (95d5ae5)

  • Feat: add support for scoring function (#1594)

  • add support for scoring function

  • lint

  • move similarity to wrapper

  • remove score function

  • lint

  • remove from InstructionRetrievalEvaluator

  • Update mteb/evaluation/evaluators/RetrievalEvaluator.py

Co-authored-by: Kenneth Enevoldsen

  • remove score function from README.md

Co-authored-by: Kenneth Enevoldsen (8e6ee46)

  • doc: colbert add score_function & doc section (#1592)

  • doc: colbert add score_function & doc section

  • doc: Update README.md

Co-authored-by: Kenneth Enevoldsen

  • doc: Update README.md

Co-authored-by: Isaac Chung


Co-authored-by: sam021313
Co-authored-by: Kenneth Enevoldsen
Co-authored-by: Isaac Chung (992b20b)

1.25.0

14 Dec 09:55

1.25.0 (2024-12-14)

Feature

  • feat: Add ColBert (#1563)

  • feat: add max_sim operator for IR tasks to support multi-vector models

  • docs: add doc for Model2VecWrapper.__init__(...)

  • feat: add ColBERTWrapper to models & add ColBERTv2

  • fix: resolve issues

  • fix: resolve issues

  • Update README.md

Co-authored-by: Roman Solomatin

  • Update README.md

Co-authored-by: Isaac Chung

  • Update README.md

Co-authored-by: Isaac Chung

  • Update mteb/evaluation/evaluators/RetrievalEvaluator.py

Co-authored-by: Isaac Chung

  • Update README.md

Co-authored-by: Isaac Chung

  • README.md: rm subset

  • doc: update example for Late Interaction

  • get colbert running without errors

  • fix: pass is_query to pylate

  • fix: max_sim add pad_sequence

  • feat: integrate Jinja templates for ColBERTv2 and add model prompt handling

  • feat: add revision & prompt_name

  • doc: pad_sequence

  • rm TODO jina colbert v2

  • doc: warning: higher resource usage for MaxSim


Co-authored-by: sam021313
Co-authored-by: Roman Solomatin
Co-authored-by: Isaac Chung (fdfdaef)
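
For readers unfamiliar with the max_sim operator mentioned in the 1.25.0 notes: late-interaction models such as ColBERT keep one embedding per token, and a query-document score is commonly computed as the sum, over query tokens, of the best similarity against any document token. The following is a minimal pure-Python sketch of that idea, not mteb's implementation. The names `dot` and `max_sim` and the toy vectors are illustrative assumptions, and the sketch omits the batching and padding (pad_sequence) that a tensor-based implementation would need.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length token embeddings."""
    return sum(x * y for x, y in zip(a, b))


def max_sim(query_tokens: Sequence[Vector], doc_tokens: Sequence[Vector]) -> float:
    """Late-interaction score (hypothetical helper, for illustration only).

    For each query token, take the highest similarity to any document
    token, then sum those maxima over all query tokens.
    """
    if not query_tokens or not doc_tokens:
        return 0.0
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


# Toy example: two query tokens, two candidate documents of different lengths.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.5, 0.5]]
doc_b = [[0.0, 1.0]]

scores = [max_sim(query, doc) for doc in (doc_a, doc_b)]
print(scores)  # doc_a scores higher than doc_b
```

Every query token is compared against every document token, so the cost grows with the product of the two sequence lengths, which is consistent with the release note warning about higher resource usage for MaxSim.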

1.24.2

13 Dec 10:55

1.24.2 (2024-12-13)

Fix

  • fix: Eval langs not correctly passed to monolingual tasks (#1587)

  • fix SouthAfricanLangClassification.py

  • add check for langs

  • lint (373db74)

1.24.1

11 Dec 23:26

1.24.1 (2024-12-11)

Fix

  • fix: Add namaa MrTydi reranking dataset (#1573)

  • Add dataset class and file requirements

  • pass tests

  • make lint changes

  • adjust meta data and remove load_data


Co-authored-by: Omar Elshehy (7b9b3c9)

Unknown