
[Bug]: LLMTextCompletionProgram fails with KeyError when prompt contains JSON example data #12390

Open
kkarski opened this issue Mar 28, 2024 · 3 comments
Labels
bug Something isn't working P2

Comments


kkarski commented Mar 28, 2024

Bug Description

To help the LLM output valid JSON objects, it's often helpful to provide a few JSON examples in addition to the Pydantic schema.

The issue is that if any JSON is present in the prompt_template_str, the first {"property" ... } of the JSON example is interpreted as a {variable} replacement field by str.format() when the message is prepared before invoking the LLM. Since it isn't actually a variable to be replaced, no value can be supplied for it, and str.format() fails with a KeyError.
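A minimal sketch of the failure using plain str.format, which is what prompt templates rely on under the hood (the prompt text here is illustrative, not from the report):

```python
prompt_template_str = (
    "Extract a person from the text below.\n"
    'Example output: {"name": "Alice", "age": 30}\n'
    "Text: {text}"
)

try:
    # str.format treats {"name"... as a replacement field named '"name"',
    # for which no keyword argument exists.
    prompt_template_str.format(text="Bob is 42.")
except KeyError as exc:
    print(f"KeyError: {exc}")
```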

Version

llama-index==0.10.15

Steps to Reproduce

Place any JSON content in the prompt_template_str before invoking LLMTextCompletionProgram.from_defaults()

Relevant Logs/Tracebacks

No response

@kkarski kkarski added bug Something isn't working triage Issue needs to be triaged/prioritized labels Mar 28, 2024

dosubot bot commented Mar 28, 2024

Hello @kkarski! 👋 I'm Dosu, a bot here to assist you with bug-fixing, answering questions, and guiding you through contributions while we wait for a human maintainer. I'm diving into your issue regarding the LLMTextCompletionProgram and the JSON data in prompt_template_str. Expect a full answer shortly!

@logan-markewich
Collaborator

This is very hard to fix with Python string formatting 😓

I suggest escaping your JSON with double braces
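Concretely, doubling the braces makes str.format emit them literally, so only single-brace fields remain template variables. A sketch with plain str.format (the prompt text is illustrative):

```python
prompt_template_str = (
    'Example output: {{"name": "Alice", "age": 30}}\n'  # literal JSON braces, escaped
    "Text: {text}"  # still a real template variable
)

print(prompt_template_str.format(text="Bob is 42."))
# Example output: {"name": "Alice", "age": 30}
# Text: Bob is 42.
```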


dosubot bot commented Mar 28, 2024

To address the KeyError issue you're encountering with LLMTextCompletionProgram when your prompt includes JSON data, a practical solution involves using the convert_to_handlebars function available in the LlamaIndex codebase. This function is specifically designed to convert Python format strings, which utilize single braces {} for variable substitution and double braces {{}} for escaping, into handlebars-style templates. In handlebars templates, double braces {{}} are used for variable substitution, making them suitable for JSON data.

Here's a concise guide on how to use the convert_to_handlebars function:

  1. Start with your JSON data as a Python string, ensuring dynamic values are represented using Python's string formatting syntax (i.e., {variable_name}).
  2. Apply the convert_to_handlebars function to transform your Python format string into a handlebars-style template. This conversion process automatically handles the escaping of JSON data, making it compatible with the LLMTextCompletionProgram.
  3. You can now safely use the converted string as a prompt in LLMTextCompletionProgram without triggering the KeyError due to JSON data.

For example:

json_data_with_python_format = '{"name": "{user_name}", "age": {user_age}}'
handlebars_template = convert_to_handlebars(json_data_with_python_format)

This approach sidesteps the problem of escaping JSON data within Python string formatting, letting you embed JSON examples in prompts for the LLMTextCompletionProgram.

The convert_to_handlebars function works by first replacing double braces with a temporary placeholder, then single braces with double braces, and finally, the temporary placeholder with single braces. This ensures that your JSON data is correctly formatted for use in the prompt without encountering the KeyError.
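The three-step swap described above can be sketched in a few lines. This is a minimal reconstruction from that description, not the actual library source (which lives in guidance_utils.py), and the sentinel strings are hypothetical:

```python
def convert_to_handlebars(text: str) -> str:
    """Sketch: convert a Python format string to a handlebars-style template."""
    left, right = "TEMP_LEFT_BRACE", "TEMP_RIGHT_BRACE"  # hypothetical placeholders
    # 1. Stash escaped (double) braces behind temporary placeholders.
    text = text.replace("{{", left).replace("}}", right)
    # 2. Promote single-brace format variables to handlebars double braces.
    text = text.replace("{", "{{").replace("}", "}}")
    # 3. Restore the stashed escapes as single (literal) braces.
    return text.replace(left, "{").replace(right, "}")

print(convert_to_handlebars("Hello {name}!"))   # Hello {{name}}!
print(convert_to_handlebars("literal {{x}}"))   # literal {x}
```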

You can find the convert_to_handlebars function in the llama-index-core/llama_index/core/prompts/guidance_utils.py file.

As for updates or newer versions of LlamaIndex that might address this issue directly, I couldn't find specific information in the repository. If you need further assistance or have additional questions, feel free to ask.


@logan-markewich logan-markewich added P2 and removed triage Issue needs to be triaged/prioritized labels Mar 29, 2024