Issue encountered

I need to evaluate Google Translate and other systems that do not fit any of the currently supported evaluation modes.
Solution/Feature
I am not sure what the best way to do this is, but I see the following general solution:
We add a ModelConfig in which the user specifies a function that is called on each batch of examples. As long as that function returns correctly formatted output, the user can evaluate any system they like; it does not need to fit into the model_configs supported so far.
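A minimal sketch of what such a config could look like. All names here (`CustomFunctionModelConfig`, `predict_fn`, `predict_batch`) are hypothetical placeholders for illustration, not the library's existing API:

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CustomFunctionModelConfig:
    """Hypothetical ModelConfig variant that delegates prediction to a
    user-supplied function invoked once per batch of examples."""

    # The function receives a batch of input examples (e.g. source
    # sentences) and must return exactly one output string per example.
    predict_fn: Callable[[List[str]], List[str]]

    def predict_batch(self, examples: List[str]) -> List[str]:
        outputs = self.predict_fn(examples)
        if len(outputs) != len(examples):
            raise ValueError(
                f"predict_fn returned {len(outputs)} outputs "
                f"for {len(examples)} examples"
            )
        return outputs


# Example: wrapping an external system such as Google Translate.
def google_translate_batch(sentences: List[str]) -> List[str]:
    # In practice this would call out to the translation API; here it
    # just echoes the input so the sketch runs end to end.
    return [f"<translated: {s}>" for s in sentences]


config = CustomFunctionModelConfig(predict_fn=google_translate_batch)
print(config.predict_batch(["Hello, world."]))
```

The evaluation loop would then only need to call `predict_batch` and score the returned strings, without caring how they were produced.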
Happy to propose a PR for this.