.Net: Bug: Semantic Kernel inflates token usage of input prompt by 20% due to Unicode escaping and URL encoding of the prompt #6964

Evyatar108 · 2024-06-26T12:03:15Z

Describe the bug
The Semantic Kernel package for C# inflates input token usage by around 20% for my prompts, because it escapes Unicode characters and URL-encodes the string in the `content` field of the `messages` array.

I tested it with a prompt that is 49k input tokens according to the usage metadata returned by the Azure OpenAI API. When the same prompt is sent through Semantic Kernel, the input token count rises to 63k.

The difference between the two prompts is very easy to see when the prompt contains JSON.
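To illustrate why JSON-heavy prompts are affected so visibly, here is a minimal, language-neutral sketch in Python. It is not the Semantic Kernel code path, and the sample prompt is hypothetical. It only compares character counts for the two encodings named in the report; token counts depend on the model's tokenizer, so the percentages are indicative at best.

```python
import json
from urllib.parse import quote


def inflation(original: str, encoded: str) -> float:
    """Relative growth in character count, e.g. 0.25 means 25% longer."""
    return (len(encoded) - len(original)) / len(original)


# Hypothetical prompt fragment containing JSON and non-ASCII text.
prompt = 'Reply as JSON: {"name": "Zoë", "city": "München"}'

# Unicode escaping: non-ASCII characters become \uXXXX and quotes gain a backslash.
escaped = json.dumps(prompt, ensure_ascii=True)[1:-1]

# Percent-encoding (URL encoding): quotes, braces, spaces, etc. become %XX.
url_encoded = quote(prompt, safe="")

# Keeping characters literal leaves only the quote escaping that JSON requires.
literal = json.dumps(prompt, ensure_ascii=False)[1:-1]

for label, text in [("escaped", escaped), ("url-encoded", url_encoded), ("literal", literal)]:
    print(f"{label:12} {len(text):4d} chars  (+{inflation(prompt, text):.0%})")
```

Every quote, brace, and non-ASCII character grows into a multi-character sequence, so a prompt that embeds a lot of JSON grows much faster than plain prose does.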

Platform

OS: Windows
IDE: Visual Studio
Language: C#
Source: NuGet package version 1.15

blinchi · 2024-06-27T07:40:46Z

And it's not just the token consumption: the encoded prompt also costs the LLM more to process. When the model receives encoded data, its responses come back encoded as well, which should not happen. In my case this is with Gemini.

Evyatar108 added the bug Something isn't working label Jun 26, 2024

markwallace-microsoft added .NET Issue or Pull requests regarding .NET code triage labels Jun 26, 2024

github-actions bot changed the title ~~Bug: Semantic Kernel inflates token usage of input prompt by 20% due to Unicode escaping and URL encoding of the prompt~~ .Net: Bug: Semantic Kernel inflates token usage of input prompt by 20% due to Unicode escaping and URL encoding of the prompt Jun 26, 2024

markwallace-microsoft self-assigned this Jul 1, 2024

markwallace-microsoft removed the triage label Jul 1, 2024
