Releases: embeddings-benchmark/mteb
1.6.2
1.6.2 (2024-04-12)
Documentation
Added 6x2 points for guenthermi for datasets and 1 point to Muennighoff for review
I have not accounted for bonus points, as I am not sure what was available at the time.
- docs: added point for #197
Added 2 points for rasdani and 2 bonus points for the first German retrieval task (I believe). Added one point for each of the reviewers.
- docs: added points for #116
This includes 6 points for 3 datasets to slvnwhrl, plus 2 for the first German clustering task. Also added points for reviews.
- Added points for #134 cmteb
This includes 29 datasets (38 points) and 6x2 bonus points (12 points) for the 6 task × language combinations that were not previously included.
All the points are attributed to @staoxiao, though we can split them if needed.
We also added points for review.
- docs: Added points for #137 (Polish)
This includes points for 12 datasets (24) across 4 tasks (8). These points are given to rafalposwiata, plus one point for review.
- docs: Added points for #27 (Spanish)
These include 9 datasets (18 points) across 4 new tasks (8) for Spanish.
Points are given to violenil as the contributor, plus one point for reviewers. Points can be split up if needed.
- docs: Added points for #224
Added 2 points for the dataset. I may have missed some bonus points as well. Also added one point for review.
- docs: Added points for #210 (Korean)
This includes 3 datasets (6 points) across 1 new task (+2 bonus) for Korean. Also added 1 point for reviewers.
- Add contributor
Co-authored-by: Niklas Muennighoff <[email protected]> (9dbf500)
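The tallies in the entries above appear to follow a simple rule: 2 points per dataset, plus 2 bonus points for each new task × language combination (review points are counted separately). The following is a minimal sketch of that arithmetic. The function name and both weights are assumptions inferred from the Polish and Korean entries, not taken from the project's points guidelines.

```python
def contribution_points(n_datasets: int, n_new_task_language_pairs: int = 0) -> int:
    """Hypothetical tally: dataset points plus bonus points for new task x language pairs."""
    points_per_dataset = 2  # assumption, inferred from "12 datasets (24)"
    bonus_per_new_pair = 2  # assumption, inferred from "4 tasks (8)"
    return (
        n_datasets * points_per_dataset
        + n_new_task_language_pairs * bonus_per_new_pair
    )


# Polish entry (#137): 12 datasets across 4 tasks
print(contribution_points(12, 4))  # 32
# Korean entry (#210): 3 datasets across 1 new task
print(contribution_points(3, 1))  # 8
```

Not every entry fits this rule exactly (the #134 entry lists 38 points for 29 datasets), so it should be read as an approximation of how the totals were reached.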
Fix
- fix: Added Hindi discourse dataset (#346)
- Added news classification dataset.
- Fixes on suggestions
- Added new medical qa dataset
- Update model run files and model path
- Added points for dataset.
- Fixes
- Added hindi discourse dataset
- Added points
- Added avg char length
- Fixes
Co-authored-by: Kenneth Enevoldsen <[email protected]> (a55ae5f)
Unknown
- fix: removing bitextmining tasks from fr script (86ad02d)
1.6.1
1.6.0
1.6.0 (2024-04-10)
Documentation
Feature
- feat: Added new language code standard (#326)
- fix: Added initial language code suggestion
- docs: updated task metadata description
- fix: changed folder structure to iso 639-3 codes
- fix: Updated all language tags
- clean: removed accidental results commit
- fix: Add trusting of remote code to remove warning
- fix: Added formatting
- fix: trust remote code the flores dataset
- docs: Added point for language rewrite
- fix: reran linter after merge
- fix: Added corrections from review
- fix: Updated languages for newly added datasets
- docs: added points for new annotations (f0daece)
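The move to ISO 639-3 replaces shorter language tags with three-letter codes. A small illustrative mapping follows, assuming the earlier tags were two-letter ISO 639-1 codes. The dictionary and helper are hypothetical and not part of mteb; only the code pairs themselves come from the ISO standards.

```python
# Hypothetical lookup from ISO 639-1 (two-letter) to ISO 639-3 (three-letter) codes.
ISO_639_1_TO_3 = {
    "en": "eng",
    "de": "deu",
    "fr": "fra",
    "es": "spa",
    "pl": "pol",
    "ko": "kor",
    "hi": "hin",
    "zh": "zho",
}


def to_iso_639_3(code: str) -> str:
    """Return the three-letter code, leaving already-converted codes untouched."""
    if len(code) == 3:
        return code
    # Raises KeyError for codes that this sketch does not cover.
    return ISO_639_1_TO_3[code]
```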
1.5.6
1.5.6 (2024-04-10)
Documentation
- docs: add points and affiliation for MartinBernstorff (#335)
docs: update points.md (2903cb4)
Fix
- fix: Added medical qa dataset (#333)
- Added news classification dataset.
- Fixes on suggestions
- Added new medical qa dataset
- Update model run files and model path
- Added points for dataset.
- Fixes
Co-authored-by: Kenneth Enevoldsen <[email protected]> (80acc3e)
Unknown
- Update pull_request_template.md (84cffa2)
1.5.5
1.5.4
1.5.4 (2024-04-08)
Fix
- fix: Multiple dataset fixes (#328)
- fix: remove time of run (as it does not relate to the model itself). Time of run should be on the dataset results
- fix: fixes the PawsX datasets
- docs: Updated points
- fix: flores clustering
- fix: multiple dataset fixes
- docs: updated points
- fix: added missing dataset_transform to multitask task
- style: ran formatter
- fix: correctly fix pawsX (84408f7)
1.5.3
1.5.3 (2024-04-08)
Documentation
- docs: Added point for SEB (#318)
- docs: added points for seb
- docs: added points for seb (ca64fc7)
- docs: Small fixes in readme.md (#317)
Fix typos in readme.md (ede12c8)
Fix
- fix: Added English news classification dataset (#323)
- Fix typos in readme.md
- Added news classification dataset.
- Added news classification dataset.
- Fixes on suggestions
- Update docs/mmteb/points.md
Co-authored-by: Kenneth Enevoldsen <[email protected]>
Co-authored-by: Kenneth Enevoldsen <[email protected]> (4d21807)
Unknown
- Fix name (d69bf94)
- Add law datasets (#311)
- add command
- add datasets
- reformat dataset
- Rephrase description
- Update mteb/tasks/Retrieval/law/GerDaLIRRetrieval.py
- Update mteb/tasks/Retrieval/law/GerDaLIRRetrieval.py
- Update mteb/__init__.py
- Update scripts/run_mteb_law.py
- Update scripts/run_mteb_law.py
- Update mteb/__init__.py
- Update mteb/tasks/Retrieval/__init__.py
- Update mteb/tasks/Retrieval/law/GerDaLIRRetrieval.py
- Update mteb/tasks/Retrieval/law/GerDaLIRRetrieval.py
- Update mteb/tasks/Retrieval/law/LegalQuADRetrieval.py
- Update mteb/tasks/Retrieval/law/LegalQuADRetrieval.py
- Update scripts/run_mteb_law.py
- Update mteb/tasks/Retrieval/law/LegalSummarizationRetrieval.py
- Update mteb/tasks/Retrieval/law/LegalSummarizationRetrieval.py
- Update mteb/tasks/Retrieval/law/LeCaRDv2Retrieval.py
- Update mteb/tasks/Retrieval/law/LeCaRDv2Retrieval.py
- Rename GerDaLIRRetrieval.py to GerDaLIRSmallRetrieval.py
- Update mteb/tasks/Retrieval/__init__.py
- Update GerDaLIRSmallRetrieval.py
Add metadata
- Update GerDaLIRSmallRetrieval.py
Update metadata
- Update AILACasedocsRetrieval.py
Update AILACasedocsRetrieval metadata
- Update AILAStatutesRetrieval.py
Update AILAStatutesRetrieval metadata
- Update LeCaRDv2Retrieval.py
Update LeCaRDv2Retrieval metadata
- Update LegalBenchConsumerContractsQARetrieval.py
Update LegalBenchConsumerContractsQARetrieval metadata
- Update LegalBenchCorporateLobbyingRetrieval.py
Update LegalBenchCorporateLobbyingRetrieval metadata
- Update LegalQuADRetrieval.py
Update LegalQuADRetrieval metadata
- Update LegalSummarizationRetrieval.py
Update LegalSummarizationRetrieval metadata
- Update AILACasedocsRetrieval.py
Update AILACasedocsRetrieval
- Update AILACasedocsRetrieval.py
Update AILACasedocsRetrieval metadata
- Update AILAStatutesRetrieval.py
Update AILAStatutesRetrieval metadata
- Update GerDaLIRSmallRetrieval.py
Update GerDaLIRSmallRetrieval metadata
- Update LeCaRDv2Retrieval.py
Update LeCaRDv2Retrieval metadata
- Update LegalBenchConsumerContractsQARetrieval.py
- Update LegalBenchCorporateLobbyingRetrieval.py
- Update LegalQuADRetrieval.py
- Update LegalSummarizationRetrieval.py
- Update AILACasedocsRetrieval.py
- Update AILAStatutesRetrieval.py
- Update GerDaLIRSmallRetrieval.py
- Update LeCaRDv2Retrieval.py
- move dataset language folder
- update order
Co-authored-by: Niklas Muennighoff <[email protected]> (6e3f419)
1.5.2
1.5.1
1.5.1 (2024-04-03)
Fix
- fix: Added tests for checking datasets (#307)
- fix: Fixed hf_hub_name for WikiCitiesClustering
- Added points for this PR and 3 other minor dataset fixes
- feat: Added tests which validated that datasets are available
- fix: Updated hf references and revisions to multiple datasets
- Added points for submission
- fix: Added suggestions from the review
- Apply suggestions from code review
Co-authored-by: Imene Kerboua <[email protected]>
- fix: sped up async test for whether datasets exist
- fix: Updated revisions
- fix: reuploaded scandeval datasets
- fix: Applied formatter
Co-authored-by: Imene Kerboua <[email protected]> (8d804f4)
1.5.0
1.5.0 (2024-04-02)
Feature
- feat: Allow extending the load_dataset parameters in custom tasks inheriting AbsTask (#299)
- Allow extending the load_dataset parameters
- format
- Fix test
- remove duplicated logic from AbsTask, now handled in the metadata
- add tests
- remove comments, moved to PR
- format
- extend metadata dict from super class
- Remove additional load_data
- test: adding very high level test
- Remove hf_hub_name and add test
- Fix revision in output file
Co-authored-by: gbmarc1 <[email protected]> (953780d)
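The general idea behind this feature, keeping the loader's keyword arguments in task metadata so that a subclass can extend them without overriding `load_data`, can be sketched as follows. This is an illustration of the pattern only: `SketchTask`, `CustomTask`, `dataset_kwargs`, `fake_loader`, and the dataset names are hypothetical stand-ins, not mteb's actual classes or attributes.

```python
from typing import Any, Callable, Dict


class SketchTask:
    """Hypothetical stand-in for a task base class; not mteb's AbsTask."""

    # Keyword arguments forwarded unchanged to the dataset loader.
    dataset_kwargs: Dict[str, Any] = {"path": "org/base-dataset", "revision": "main"}

    def __init__(self, loader: Callable[..., Any]) -> None:
        self.loader = loader

    def load_data(self) -> Any:
        # The base class owns the loading logic; subclasses only change the kwargs.
        return self.loader(**self.dataset_kwargs)


class CustomTask(SketchTask):
    # Extend the parent's parameters instead of re-implementing load_data.
    dataset_kwargs = {
        **SketchTask.dataset_kwargs,
        "path": "org/custom-dataset",
        "name": "subset-a",
    }


def fake_loader(**kwargs: Any) -> Dict[str, Any]:
    """Stand-in for a real loader such as datasets.load_dataset; echoes its arguments."""
    return kwargs


print(CustomTask(fake_loader).load_data())
```

In real use the injected loader would be something like `datasets.load_dataset`, which accepts `path`, `name`, and `revision` among its parameters. Injecting it here keeps the sketch runnable without network access.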