[Refactor] BigBench #988

Closed
orendar opened this issue Nov 14, 2023 · 9 comments · Fixed by #1686
Labels
bug Something isn't working.

Comments

@orendar

orendar commented Nov 14, 2023

Hey, I saw that the current implementation of BigBench, with its difficult dependencies, is a placeholder until the Hugging Face dataset is ready. I am a fan of BigBench and would love to use it within the big-refactor branch.

Is there any additional work that I or the community can do to finish the HF-based integration, or should it already be ready given the state of the dataset?

Thanks!

@haileyschoelkopf
Collaborator

Hi! I will have the mirrored dataset finished uploading shortly. Apologies for the delay; I had to work around rate limits for uploading to HF!

@orendar
Author

orendar commented Nov 17, 2023

Thank you so much for all your work!! I really appreciate it, closing :)

@orendar orendar closed this as completed Nov 17, 2023
@haileyschoelkopf
Collaborator

Everything should be uploaded as of #1002!

@orendar orendar reopened this Nov 18, 2023
@orendar
Author

orendar commented Nov 18, 2023

@haileyschoelkopf Hey, sorry to bother you again, but I'm having trouble running bigbench_*_multiple_choice (including individual tasks, with and without few-shot, etc.). Do you know what the issue might be?
I get a similar stack trace for every subtask and configuration I've tried:

Traceback (most recent call last):
  File "/home/ec2-user/anaconda3/envs/eval/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/ec2-user/anaconda3/envs/eval/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/ec2-user/SageMaker/generative/lm-evaluation-harness/lm_eval/__main__.py", line 248, in <module>
    cli_evaluate()
  File "/home/ec2-user/SageMaker/generative/lm-evaluation-harness/lm_eval/__main__.py", line 199, in cli_evaluate
    results = evaluator.simple_evaluate(
  File "/home/ec2-user/SageMaker/generative/lm-evaluation-harness/lm_eval/utils.py", line 356, in _wrapper
    return fn(*args, **kwargs)
  File "/home/ec2-user/SageMaker/generative/lm-evaluation-harness/lm_eval/evaluator.py", line 111, in simple_evaluate
    task_dict = lm_eval.tasks.get_task_dict(tasks)
  File "/home/ec2-user/SageMaker/generative/lm-evaluation-harness/lm_eval/tasks/__init__.py", line 250, in get_task_dict
    task_name: get_task(task_name=task_element, config=config),
  File "/home/ec2-user/SageMaker/generative/lm-evaluation-harness/lm_eval/tasks/__init__.py", line 192, in get_task
    return TASK_REGISTRY[task_name](config=config)
  File "/home/ec2-user/SageMaker/generative/lm-evaluation-harness/lm_eval/api/task.py", line 682, in __init__
    test_target = self.doc_to_target(test_doc)
  File "/home/ec2-user/SageMaker/generative/lm-evaluation-harness/lm_eval/api/task.py", line 899, in doc_to_target
    target_string = utils.apply_template(doc_to_target, doc)
  File "/home/ec2-user/SageMaker/generative/lm-evaluation-harness/lm_eval/utils.py", line 489, in apply_template
    return rtemplate.render(**doc)
  File "/home/ec2-user/anaconda3/envs/eval/lib/python3.8/site-packages/jinja2/environment.py", line 1301, in render
    self.environment.handle_exception()
  File "/home/ec2-user/anaconda3/envs/eval/lib/python3.8/site-packages/jinja2/environment.py", line 936, in handle_exception
    raise rewrite_traceback_stack(source=source)
  File "<template>", line 1, in top-level template code
ValueError: 'thought' is not in list
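For anyone debugging this, the failure mode can be reproduced outside the harness. The sketch below assumes the task's doc_to_target Jinja template resolves the answer index via list.index() over the multiple_choice_targets column; the function and field names are illustrative, not the harness's actual template:

```python
# Hypothetical reproduction: when the gold target string is absent from
# multiple_choice_targets (e.g. the column is empty), .index() raises
# exactly the ValueError seen in the traceback above.
doc = {
    "multiple_choice_targets": [],  # empty column, as in the broken subsets
    "targets": ["thought"],
}

def doc_to_target(doc):
    # Mirrors the assumed template logic "multiple_choice_targets.index(target)".
    return doc["multiple_choice_targets"].index(doc["targets"][0])

try:
    doc_to_target(doc)
except ValueError as e:
    print(e)  # prints: 'thought' is not in list
```

Jinja2 rewrites the traceback so the error surfaces from `<template>`, but the underlying exception is just Python's list.index() ValueError.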

@haileyschoelkopf
Collaborator

Hey! Is there a specific subtask for which this occurs, so I can test it?

@orendar
Author

orendar commented Nov 19, 2023

Yes, I just verified and this specific stack trace comes from "bigbench_ascii_word_recognition_multiple_choice". Thank you!

@lintangsutawika
Contributor

@orendar @haileyschoelkopf is this fixed?

@bryanSwk

bryanSwk commented Feb 6, 2024

Sorry for bumping, but I noticed that bigbench_*_multiple_choice breaks for certain subsets that have an empty "multiple_choice_targets" column, e.g. https://huggingface.co/datasets/hails/bigbench/viewer/tense_zero_shot.

I got a similar error to the one above:

    raise rewrite_traceback_stack(source=source)
  File "<template>", line 1, in top-level template code
ValueError: 'She has applied for the job.' is not in list
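One plausible workaround, sketched here as a hypothetical filter (this helper is not part of lm-evaluation-harness; the field names mirror the hails/bigbench columns): drop docs whose multiple_choice_targets column is empty or does not contain the gold target, so the template's .index() call never sees a missing answer.

```python
# Hypothetical guard for filtering broken rows before template rendering.
def has_valid_choices(doc):
    choices = doc.get("multiple_choice_targets") or []
    targets = doc.get("targets") or []
    # A doc is usable only if its gold target appears among the choices.
    return bool(targets) and targets[0] in choices

docs = [
    {"multiple_choice_targets": ["A", "B"], "targets": ["A"]},
    {"multiple_choice_targets": [], "targets": ["She has applied for the job."]},
]
print([has_valid_choices(d) for d in docs])  # prints: [True, False]
```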

@nanyyyyyy

I got the same error when running bigbench_multiple_choice.

@lintangsutawika lintangsutawika linked a pull request Apr 9, 2024 that will close this issue
@lintangsutawika lintangsutawika self-assigned this Apr 9, 2024