Made litellm judge backend more robust. #485

JoelNiklaus · 2025-01-04T10:26:06Z

The metric computation fails when a result is None. This PR therefore retries the call and, if there is still no result, returns an error ModelResponse.
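
A minimal sketch of the retry-then-error pattern described above, for readers who want to see its shape. The names `JudgeResponse`, `call_with_retries`, and `max_retries` are hypothetical stand-ins and not the actual lighteval or litellm API; the real change is in src/lighteval/metrics/llm_as_judge.py.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class JudgeResponse:
    """Hypothetical stand-in for the real ModelResponse."""

    text: str
    error: Optional[str] = None


def call_with_retries(
    call: Callable[[], Optional[str]], max_retries: int = 3
) -> JudgeResponse:
    """Invoke `call` up to `max_retries` times until it returns a non-None result."""
    for _ in range(max_retries):
        result = call()
        if result is not None:
            return JudgeResponse(text=result)
    # Every attempt returned None: return an explicit error response so the
    # downstream metric computation never receives a bare None.
    return JudgeResponse(text="", error=f"no result after {max_retries} attempts")
```

The point of returning an error object rather than None is that the metric code can keep treating every result uniformly and decide for itself how to score a failed judgment.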

clefourrier

LGTM

Side note: We should add an error flag to ModelResponse if we also want to use it to report these kinds of problems. Otherwise, if we implement caching, we won't re-run these even though they are incorrect.
Can you also add a `failed` attribute to ModelResponse, False by default and True here?

You'll be able to use it in #480
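
To illustrate the caching concern, here is a rough sketch assuming a simple dict-based cache. `Response` and `needs_rerun` are hypothetical names for illustration only; the thread does not describe how lighteval's caching would actually work.

```python
from dataclasses import dataclass


@dataclass
class Response:
    """Hypothetical stand-in for ModelResponse with the requested flag."""

    text: str = ""
    failed: bool = False  # False by default, set to True for error responses


def needs_rerun(cache: dict, key: str) -> bool:
    """A cached entry is reusable only if it exists and did not fail."""
    cached = cache.get(key)
    return cached is None or cached.failed
```

With a flag like this, a cache lookup can tell a genuine empty answer apart from a failed call and schedule only the failed ones again.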

src/lighteval/metrics/llm_as_judge.py

clefourrier

Thanks!

HuggingFaceDocBuilderDev · 2025-01-07T07:39:50Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Made litellm judge backend more robust.

e770039

clefourrier approved these changes Jan 6, 2025

clefourrier and others added 3 commits January 6, 2025 09:20

Merge branch 'main' into fix-litellm-judge

a00af73

Added failed flag to ModelResponse.

01803d3

Merge branch 'main' into fix-litellm-judge

a700d2b

clefourrier approved these changes Jan 7, 2025

clefourrier merged commit fdb12f4 into huggingface:main Jan 7, 2025
3 checks passed

JoelNiklaus added a commit to JoelNiklaus/lighteval that referenced this pull request Jan 7, 2025

Revert "Made litellm judge backend more robust. (huggingface#485)"

7b7001b

This reverts commit fdb12f4.

JoelNiklaus mentioned this pull request Jan 7, 2025

Hotfix for litellm judge #490

Open
