
Evaluation engine criteria causes errors when intermediate evaluation_steps generation returns poorly formatted JSON/dict #1578

Closed
griptapeOsipa opened this issue Jan 14, 2025 · 1 comment · Fixed by #1519
@griptapeOsipa

Description:

When "criteria" is used in the evaluation engine, during the automagic generation of evaluation_steps, the intermediate JSON/dict is sometimes improperly formatted, leading to runtime errors.

Steps to Reproduce:
Running this script as-is works, because it passes evaluation_steps directly. Commenting out "evaluation_steps" and uncommenting "criteria" reproduces the problem:

from griptape.structures import Pipeline
from griptape.engines import EvalEngine
from griptape.tasks import PromptTask
from griptape.rules import Rule

from dotenv import load_dotenv
load_dotenv() # Load the environment variables

rules = [
    "Answer with a json object, with no additional markup.",
    "Talk like a pirate.",
]

pipeline = Pipeline(
    tasks=[
        PromptTask(
            "Respond to this user: '{{ args[0] }}'"
            "{% if args[1] %}Use this feedback when answering.{{ args[1] }}{% endif %}"
        ),
    ],
    rules=[Rule(rule) for rule in rules],
)

engine = EvalEngine(
    # criteria=[
    evaluation_steps=[
        f"Determine whether the following rules have been met: {rules}",
    ]
)

pipeline.run("Hi there")
score, reason = engine.evaluate(
    input=pipeline.tasks[0].input.value,
    actual_output=pipeline.output.value,
)

Expected Behavior: The intermediate JSON/dict for evaluation_steps should always include a "steps" key and be properly formatted for use.

Environment:

Griptape version: 1.1.1
Python version: 3.10-3.12
OS: OSX

Debug text pulled to inspect the JSON directly:

('{"type": "object", "properties": {"steps": {"type": "array", "items": '
 '{"type": "string"}}}, "required": ["steps"], "additionalProperties": false, '
 '"$id": "Output Format", "$schema": '
 '"http://json-schema.org/draft-07/schema#"}')
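Parsing that captured string shows what went wrong: the intermediate result is the output *schema* itself (a draft-07 JSON Schema requiring a "steps" key), not an object that conforms to it. A minimal sketch, using only the debug text above:

```python
import json

# The raw debug string captured from the engine (verbatim from the issue).
debug_text = (
    '{"type": "object", "properties": {"steps": {"type": "array", "items": '
    '{"type": "string"}}}, "required": ["steps"], "additionalProperties": false, '
    '"$id": "Output Format", "$schema": '
    '"http://json-schema.org/draft-07/schema#"}'
)

parsed = json.loads(debug_text)
print("steps" in parsed)    # False -- the key the engine indexes is absent
print(sorted(parsed))       # only schema keywords: $id, $schema, properties, ...
```

So `parsed_result["steps"]` in `_generate_steps` raises KeyError whenever the model echoes the schema instead of filling it in.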

How the error reads:

Traceback (most recent call last):
  File ".../GitHub/griptape-TS-1_1/evalEngine.py", line 33, in <module>
    score, reason = engine.evaluate(
  File ".../GitHub/griptape-TS-1_1/.venv/lib/python3.10/site-packages/griptape/engines/eval/eval_engine.py", line 86, in evaluate
    self.evaluation_steps = self._generate_steps(evaluation_params)
  File ".../GitHub/griptape-TS-1_1/.venv/lib/python3.10/site-packages/griptape/engines/eval/eval_engine.py", line 115, in _generate_steps
    return parsed_result["steps"]
KeyError: 'steps'
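Until the fix in #1519 lands, one defensive sketch is to validate the parsed dict before indexing, so a malformed intermediate result fails with a clear error (or can be retried) instead of a bare KeyError. The helper name below is hypothetical, not part of griptape's API:

```python
def extract_steps(parsed_result: dict) -> list[str]:
    # Hypothetical guard around the lookup that raises KeyError in
    # eval_engine.py's _generate_steps: accept the result only if it is a
    # well-formed {"steps": [...]} object with string entries.
    steps = parsed_result.get("steps")
    if not isinstance(steps, list) or not all(isinstance(s, str) for s in steps):
        raise ValueError(
            "Expected a JSON object with a 'steps' list of strings; "
            f"got keys: {sorted(parsed_result)}"
        )
    return steps
```

A caller could catch the ValueError and re-prompt the model, which is friendlier than crashing mid-evaluation.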
@collindutter
Member

Fixed via #1519

@collindutter collindutter added this to the 1.2 milestone Jan 21, 2025