
[Bug]: LLMTextCompletionProgram fails with KeyError when prompt contains JSON example data #12390

Open
kkarski opened this issue Mar 28, 2024 · 3 comments
Labels
bug Something isn't working P2

Comments


kkarski commented Mar 28, 2024

Bug Description

To help the LLM output valid JSON objects, it's often helpful to provide a few JSON examples in addition to the Pydantic schema.

The issue is that if any JSON is present in the prompt_template_str, the first {"property" ... } of the JSON example is interpreted as a {variable} replacement field by str.format() when the message is prepared before invoking the LLM. Since it isn't actually a variable to be replaced, no value can be supplied for it, and str.format() fails with a KeyError.
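A minimal sketch of the failure using plain str.format, which is what prompt templates rely on under the hood (the prompt text here is illustrative, not from the report):

```python
prompt_template_str = (
    "Extract a person from the text below.\n"
    'Example output: {"name": "Alice", "age": 30}\n'
    "Text: {text}"
)

try:
    # str.format treats {"name"... as a replacement field named '"name"',
    # for which no keyword argument exists.
    prompt_template_str.format(text="Bob is 42.")
except KeyError as exc:
    print(f"KeyError: {exc}")
```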

Version

llama-index==0.10.15

Steps to Reproduce

Place any JSON content in the prompt_template_str before invoking LLMTextCompletionProgram.from_defaults()

Relevant Logs/Tracebacks

No response

@kkarski kkarski added bug Something isn't working triage Issue needs to be triaged/prioritized labels Mar 28, 2024

dosubot bot commented Mar 28, 2024

Hello @kkarski! 👋 I'm Dosu, a bot here to assist you with bug-fixing, answering questions, and guiding you through contributions while we wait for a human maintainer. I'm diving into your issue regarding the LLMTextCompletionProgram and the JSON data in prompt_template_str. Expect a full answer shortly!

@logan-markewich
Collaborator

This is very hard to fix with Python string formatting 😓

I suggest escaping your JSON with double braces
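Concretely, doubling the braces makes str.format emit them literally, so only single-brace fields remain template variables. A sketch with plain str.format (the prompt text is illustrative):

```python
prompt_template_str = (
    'Example output: {{"name": "Alice", "age": 30}}\n'  # literal JSON braces, escaped
    "Text: {text}"  # still a real template variable
)

print(prompt_template_str.format(text="Bob is 42."))
# Example output: {"name": "Alice", "age": 30}
# Text: Bob is 42.
```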


dosubot bot commented Mar 28, 2024

To address the KeyError issue you're encountering with LLMTextCompletionProgram when your prompt includes JSON data, a practical solution involves using the convert_to_handlebars function available in the LlamaIndex codebase. This function is specifically designed to convert Python format strings, which utilize single braces {} for variable substitution and double braces {{}} for escaping, into handlebars-style templates. In handlebars templates, double braces {{}} are used for variable substitution, making them suitable for JSON data.

Here's a concise guide on how to use the convert_to_handlebars function:

  1. Start with your JSON data as a Python string, ensuring dynamic values are represented using Python's string formatting syntax (i.e., {variable_name}).
  2. Apply the convert_to_handlebars function to transform your Python format string into a handlebars-style template. This conversion process automatically handles the escaping of JSON data, making it compatible with the LLMTextCompletionProgram.
  3. You can now safely use the converted string as a prompt in LLMTextCompletionProgram without triggering the KeyError due to JSON data.

For example:

json_data_with_python_format = '{"name": "{user_name}", "age": {user_age}}'
handlebars_template = convert_to_handlebars(json_data_with_python_format)

This approach sidesteps the problem of escaping JSON data within Python string formatting, letting you embed JSON examples in prompts for the LLMTextCompletionProgram.

The convert_to_handlebars function works by first replacing double braces with a temporary placeholder, then single braces with double braces, and finally, the temporary placeholder with single braces. This ensures that your JSON data is correctly formatted for use in the prompt without encountering the KeyError.
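The three-step swap described above can be sketched in a few lines. This is a minimal reconstruction from that description, not the actual library source (which lives in guidance_utils.py), and the sentinel strings are hypothetical:

```python
def convert_to_handlebars(text: str) -> str:
    """Sketch: convert a Python format string to a handlebars-style template."""
    left, right = "TEMP_LEFT_BRACE", "TEMP_RIGHT_BRACE"  # hypothetical placeholders
    # 1. Stash escaped (double) braces behind temporary placeholders.
    text = text.replace("{{", left).replace("}}", right)
    # 2. Promote single-brace format variables to handlebars double braces.
    text = text.replace("{", "{{").replace("}", "}}")
    # 3. Restore the stashed escapes as single (literal) braces.
    return text.replace(left, "{").replace(right, "}")

print(convert_to_handlebars("Hello {name}!"))   # Hello {{name}}!
print(convert_to_handlebars("literal {{x}}"))   # literal {x}
```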

You can find the convert_to_handlebars function in the llama-index-core/llama_index/core/prompts/guidance_utils.py file.

As for updates or newer versions of LlamaIndex that might address this issue directly, I couldn't find specific information in the repository. If you need further assistance or have additional questions, feel free to ask.


@logan-markewich logan-markewich added P2 and removed triage Issue needs to be triaged/prioritized labels Mar 29, 2024