TREC 2023 Tip-of-the-Tongue #235

Open
6 of 8 tasks
mam10eks opened this issue May 19, 2023 · 3 comments

mam10eks commented May 19, 2023

Dataset Information:

The training and dev data of the TREC 2023 Tip-of-the-Tongue track are now available: https://trec-tot.github.io/guidelines

Description from the website:

Tip of the tongue: The phenomenon of failing to retrieve something from memory, combined with partial recall and the feeling that retrieval is imminent
In terms of input and output, the movie identification task is relatively straightforward—given an input TOT request, output a ranked list of movies. Each movie must be identified by its Wikipedia page id and the correct movie should be ranked as high as possible. For each query, runs should return a ranked list of 1000 Wikipedia page ids. Runs will be evaluated using IR metrics that are appropriate for IR tasks with one relevant document, such as discounted cumulative gain, reciprocal rank, and success@k.
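
As a rough illustration of the metrics named above, here is a minimal sketch for the single-relevant-document case. The function names and the example page ids are hypothetical and not taken from the track's official evaluation tooling. With exactly one relevant document of gain 1, the ideal DCG is 1, so the DCG value below coincides with nDCG.

```python
import math
from typing import Optional, Sequence


def rank_of_relevant(ranking: Sequence[str], relevant_id: str) -> Optional[int]:
    """Return the 1-based rank of the relevant page id, or None if it is absent."""
    for position, page_id in enumerate(ranking, start=1):
        if page_id == relevant_id:
            return position
    return None


def reciprocal_rank(ranking: Sequence[str], relevant_id: str) -> float:
    """1 / rank of the relevant page, or 0.0 if it was not retrieved."""
    rank = rank_of_relevant(ranking, relevant_id)
    return 0.0 if rank is None else 1.0 / rank


def success_at_k(ranking: Sequence[str], relevant_id: str, k: int) -> float:
    """1.0 if the relevant page appears in the top k results, else 0.0."""
    return 1.0 if rank_of_relevant(ranking[:k], relevant_id) is not None else 0.0


def dcg(ranking: Sequence[str], relevant_id: str, k: int = 1000) -> float:
    """DCG with binary gain: 1 / log2(rank + 1) for the single relevant page."""
    rank = rank_of_relevant(ranking[:k], relevant_id)
    return 0.0 if rank is None else 1.0 / math.log2(rank + 1)


# Hypothetical ranking of Wikipedia page ids for one TOT query.
example_ranking = ["111", "222", "333"]
example_relevant = "222"
```

For this example the correct page sits at rank 2, so the reciprocal rank is 0.5, success@1 is 0, and success@3 is 1.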

Dataset ID(s) & supported entities:

  • tip-of-the-tongue/train
  • tip-of-the-tongue/dev
  • tip-of-the-tongue/test (not yet released)

Checklist

Mark each task once completed. All should be checked prior to merging a new dataset.

  • Dataset definition (in ir_datasets/datasets/[topid].py)
  • Tests (in tests/integration/[topid].py)
  • Metadata generated (using ir_datasets generate_metadata command, should appear in ir_datasets/etc/metadata.json)
  • Documentation (in ir_datasets/etc/[topid].yaml)
  • Downloadable content (in ir_datasets/etc/downloads.json)
    • Download verification action (in .github/workflows/verify_downloads.yml). Only one needed per topid.
    • Any small public files from NIST (or other potentially troublesome files) mirrored in https://github.com/seanmacavaney/irds-mirror/. Mirrored status properly reflected in downloads.json.
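
The file locations in the checklist above can be checked mechanically. The following is a small sketch, not part of ir_datasets itself: the helper names are hypothetical, and it assumes that `[topid]` is substituted verbatim into each file name. It only tests whether the files exist, not whether their contents are complete.

```python
from pathlib import Path
from typing import Dict, List


def checklist_paths(repo_root: Path, topid: str) -> Dict[str, Path]:
    """Map each checklist item to the path named in the checklist (assumed layout)."""
    etc = repo_root / "ir_datasets" / "etc"
    return {
        "dataset definition": repo_root / "ir_datasets" / "datasets" / f"{topid}.py",
        "tests": repo_root / "tests" / "integration" / f"{topid}.py",
        "metadata": etc / "metadata.json",
        "documentation": etc / f"{topid}.yaml",
        "downloadable content": etc / "downloads.json",
        "download verification": repo_root / ".github" / "workflows" / "verify_downloads.yml",
    }


def missing_items(repo_root: Path, topid: str) -> List[str]:
    """Return the checklist items whose file does not exist under repo_root."""
    return [
        name
        for name, path in checklist_paths(repo_root, topid).items()
        if not path.exists()
    ]
```

Calling `missing_items(Path("."), "example_topid")` from a checkout of the repository would list the items still lacking a file; `example_topid` is a placeholder, not a real dataset id.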

Additional comments/concerns/ideas/etc.

@mam10eks
Contributor Author

I would like to implement this ticket.

@mam10eks
Contributor Author

cc @samarthbhargav

mam10eks added a commit to mam10eks/ir_datasets that referenced this issue Jun 6, 2023
@mam10eks
Contributor Author

mam10eks commented Jun 6, 2023

Dear all, I have now had the time to implement this in this branch: https://github.com/mam10eks/ir_datasets/tree/trec-tip-of-the-tongue

Basically, everything is resolved, but I forgot how to do these two steps:

Otherwise, everything seems to be ready.

@seanmacavaney I forgot, was there some documentation on how to do those two steps?

mam10eks added a commit to mam10eks/ir_datasets that referenced this issue Jun 7, 2023
seanmacavaney added a commit that referenced this issue Jun 15, 2023
* Prepare addition of the TREC Tip-of-the-Tongue dataset #235

* Prepare addition of the TREC Tip-of-the-Tongue dataset #235

* a few tweaks

* mf

* title type

* documentation

* fix yaml error in other file

* typing

* rename trec-tip-of-the-tongue to trec-tot and added year

* rename trec-tip-of-the-tongue to trec-tot and added year

* rename trec-tip-of-the-tongue to trec-tot and added year

---------

Co-authored-by: Maik Fröbe <[email protected]>