Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use config dict in TaskConfigLogger for easier serialization #454

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

albertvillanova
Copy link
Member

Use config dict (instead of dataclass) in TaskConfigLogger for easier serialization (e.g. json.dump).

Note that dataclasses are not JSON serializable out of the box.

@HuggingFaceDocBuilderDev
Copy link
Collaborator

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Copy link
Member

@NathanHB NathanHB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks ! I thought we wrote a json serializer for dataclasses. Also, was this causing issues ?

@@ -35,7 +35,7 @@
from lighteval.metrics.stderr import get_stderr_function
from lighteval.models.abstract_model import ModelInfo
from lighteval.models.model_output import ModelResponse
from lighteval.tasks.lighteval_task import LightevalTask, LightevalTaskConfig
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we using the LightevalTaskConfig anywhere else ?

@clefourrier
Copy link
Member

I'm quite against going back from dataclass to dict

@albertvillanova
Copy link
Member Author

Let me try to explain my point:

  • the most common use case of the dataclasses.asdict utility function is serialization: convert a dataclass instance to a JSON-like dictionary for serialization, e.g., when sending data over a network or storing it in a database,...
  • the main purpose of our loggers is indeed to prepare all the evaluation data for serialization (done by EvaluationTracker)

In my opinion, converting dataclass instances to dicts is a common and recommended approach for logging and (later serializing) them.

@albertvillanova
Copy link
Member Author

albertvillanova commented Dec 20, 2024

Let me give another example: in my opinion, a user would expect to be able to serialize the results obtained from the Pipeline:

results = pipeline.get_results()
results_json = json.dumps(results)
  • However the code above raises an error:
    TypeError: Object of type LightevalTaskConfig is not JSON serializable
  • The user needs a custom serializer.

See this code in the leaderboard, which is no longer valid:
https://huggingface.co/spaces/demo-leaderboard-backend/backend/blob/5c763da56426909e1b3d01e88d9b7382b9287a8a/src/backend/run_eval_suite_lighteval.py#L96-L98

The PR addresses this issue.

@albertvillanova
Copy link
Member Author

albertvillanova commented Dec 24, 2024

I'm quite against going back from dataclass to dict

I've been considering the use of dataclasses inside the logged information and would like to understand more about the rationale behind this choice. It would be helpful to know the considerations or advantages that led to this decision. Could you please share some insights or context about the thought process involved? Thanks.

CC: @clefourrier

@clefourrier
Copy link
Member

Ofc, and thanks for your very detailed answers!

We went the dataclass way first to constrain keys used (without having to add a lot of post init logic as you would for a dict), get implicit typing about said keys, and to help us when coding with IDEs as we now get autocompletion.

We had added a dataclass serializer to json afterwards and changed all the dict classes we were using to dataclasses and it indeed made coding way faster.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants