xquad, mlqa and mlsum tasks are not correctly implemented #75

Open
KlaudiaTH opened this issue Mar 16, 2023 · 3 comments

@KlaudiaTH
Collaborator

KlaudiaTH commented Mar 16, 2023

@katrinklug

When running the tasks mlqa_en, mlsum_en, and xquad_en, I get the following error message:

Traceback (most recent call last):
  File "./tasks/eval_harness/evaluate.py", line 446, in <module>
    main()
  File "./tasks/eval_harness/evaluate.py", line 429, in main
    results = evaluator.evaluate(adaptor, {task_name: task}, False, 0, None, bootstrap_iters=args.bootstrap_iters)
  File "/lm-evaluation-harness/lm_eval/utils.py", line 162, in _wrapper
    return fn(*args, **kwargs)
  File "/lm-evaluation-harness/lm_eval/evaluator.py", line 253, in evaluate
    resps = getattr(lm, reqtype)([req.args for req in reqs])
  File "/lm-evaluation-harness/lm_eval/base.py", line 343, in greedy_until
    re_ord = utils.Reorderer(requests, _collate)
  File "/lm-evaluation-harness/lm_eval/utils.py", line 125, in __init__
    arr = group(arr, lambda x: fn(x[1]))
  File "/lm-evaluation-harness/lm_eval/utils.py", line 59, in group
    res[fn(ob)].append(ob)
  File "/lm-evaluation-harness/lm_eval/utils.py", line 125, in <lambda>
    arr = group(arr, lambda x: fn(x[1]))
  File "/lm-evaluation-harness/lm_eval/base.py", line 340, in _collate
    toks = self.tok_encode(x[0])
  File "/lm-evaluation-harness/lm_eval/models/gpt2.py", line 122, in tok_encode
    return self.tokenizer.encode(string, add_special_tokens=False)
AttributeError: '_GPT2BPETokenizer' object has no attribute 'encode'

The evaluation was performed on Taurus using scripts from the evaluation repository: apptainer/juwels_german-evalds.sbatch
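
For context, the error suggests that the _GPT2BPETokenizer used by Megatron-DeepSpeed does not expose the Hugging Face style encode() method that lm-evaluation-harness calls in tok_encode. Below is a minimal sketch of a possible workaround, not the repository's actual fix: it assumes the Megatron tokenizer follows the usual AbstractTokenizer interface (tokenize()/detokenize()), and PatchedEvalHarnessAdaptor is a hypothetical name.

```python
# Hypothetical patch sketch; the import path mirrors the traceback above.
from tasks.eval_harness.evaluate import EvalHarnessAdaptor


class PatchedEvalHarnessAdaptor(EvalHarnessAdaptor):
    def tok_encode(self, string):
        # Megatron tokenizers expose tokenize(), not the HF-style encode().
        return self.tokenizer.tokenize(string)

    def tok_decode(self, tokens):
        # detokenize() is the Megatron counterpart of decode().
        return self.tokenizer.detokenize(tokens)
```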

@janEbert

All perplexity tasks fail; the same goes for the German ones. The issue is that the EvalHarnessAdaptor in Megatron-DeepSpeed/tasks/eval_harness/evaluate.py does not implement the greedy_until method, which these tasks require.
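
For illustration, here is a rough sketch of what a greedy_until method on the adaptor could look like, built only on the BaseLM-style interface (tok_encode, tok_decode, _model_call, max_length, device, eot_token_id). This is an assumption-laden outline rather than the harness's or the adaptor's actual code; the (context, stop-sequences) request format is assumed from the lm-evaluation-harness version in the traceback.

```python
import torch


def greedy_until(self, requests, max_gen_toks=128):
    # requests are assumed to be (context, until) pairs, where `until`
    # is a list of stop strings produced by generation-style tasks.
    results = []
    for context, until in requests:
        input_ids = torch.tensor([self.tok_encode(context)], device=self.device)
        generated = []
        for _ in range(max_gen_toks):
            # Greedy decoding: run the model on the (truncated) sequence and
            # take the argmax of the final position's logits.
            logits = self._model_call(input_ids[:, -self.max_length:])
            next_token = int(logits[0, -1].argmax())
            if next_token == self.eot_token_id:
                break
            generated.append(next_token)
            input_ids = torch.cat(
                [input_ids, torch.tensor([[next_token]], device=self.device)],
                dim=1,
            )
            if any(stop in self.tok_decode(generated) for stop in until):
                break
        # Cut the output at the first stop sequence, if any appeared.
        text = self.tok_decode(generated)
        for stop in until:
            text = text.split(stop)[0]
        results.append(text)
    return results
```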

@katrinklug

This also happened for the germanquad and squad2 tasks.

@KlaudiaTH
Collaborator Author

New images:
Taurus: /projects/p025/p_gptx/apptainer_images/obmd-lmeval-21.12_100423-py3.sif
Juwels: /p/scratch/opengptx-elm/shared/apptainer_images/obmd-lmeval-21.12_100423-py3.sif
