
Experimentation with "internal" llm #505

Draft · wants to merge 1 commit into base: main
Conversation

lebaudantoine (Collaborator)
Please check my first commit. It's a draft.

Simplify the LLM's job: do not request JSON output with a single key. Instead,
make sure the LLM doesn't output any extra information.

By simplifying the LLM's job, we make sure its output can always be parsed.
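To illustrate the parsing problem this avoids (a minimal sketch; the helper names are hypothetical, not the project's actual code): a JSON-with-one-key contract breaks as soon as the model adds any surrounding chatter, while a plain-text contract only needs a strip.

```python
import json


def parse_json_answer(raw: str) -> str:
    """Old approach: expect a JSON object with a single key.

    Fails with JSONDecodeError if the model adds any extra text.
    """
    return json.loads(raw)["answer"]


def parse_plain_answer(raw: str) -> str:
    """New approach: the prompt forbids extra output, so just strip."""
    return raw.strip()


# A well-behaved reply works for both contracts:
assert parse_json_answer('{"answer": "Bonjour"}') == "Bonjour"
assert parse_plain_answer(" Bonjour\n") == "Bonjour"

# But small models often wrap JSON in chatter, which breaks json.loads:
chatty = 'Sure! Here is the JSON:\n{"answer": "Bonjour"}'
try:
    parse_json_answer(chatty)
except json.JSONDecodeError:
    pass  # the failure mode the simplified prompt avoids
```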

I did a quick test with the Translate prompt. Adding instructions to output
only the translated text seems to be enough, after a number of tests.

I did some quick prompt engineering, using ChatGPT and Claude to generate
a proper system prompt. It works quite well, but there is certainly room for
improvement.
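For discussion, a sketch of what such a prompt and message builder could look like (the wording and function names below are illustrative assumptions, not the prompt actually committed in this PR):

```python
# Hypothetical translation prompt in the spirit of this PR: instruct the
# model to return the translated text only, instead of a JSON object.
TRANSLATE_SYSTEM_PROMPT = (
    "You are a translation engine. Translate the user's text into {language}. "
    "Output only the translated text, with no explanation, no quotes, "
    "and no JSON wrapping."
)


def build_translate_messages(text: str, language: str) -> list[dict]:
    """Build chat messages for an OpenAI-compatible completion call."""
    return [
        {
            "role": "system",
            "content": TRANSLATE_SYSTEM_PROMPT.format(language=language),
        },
        {"role": "user", "content": text},
    ]
```

The system/user split keeps the instruction out of the translatable payload, which helps a small model avoid translating the instructions themselves.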

I haven't yet searched for open-source prompts we could reuse from a prompt
library. A perfect translation seems to be a difficult job for an 8B model.

Please note I haven't updated the other prompts yet; let's discuss it first.
I ran my experiment with our internal LLM, which is optimized for throughput
rather than latency (there is a trade-off). I'll try fine-tuning a few of its
parameters to see if I can reduce its latency.

For 880 tokens (based on ChatGPT's online token counter), it takes roughly
17 s, vs ~40 s for Albert CNRS 70B.
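As a back-of-the-envelope check, those timings translate to roughly the following generation throughputs (derived only from the numbers above):

```python
# 880 tokens in ~17 s on the internal LLM vs ~40 s on Albert CNRS 70B.
internal_tps = 880 / 17  # ~52 tokens/s
albert_tps = 880 / 40    # 22 tokens/s
```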

For 180 tokens it takes roughly 3 s. Without proper UX (e.g. a nicer loading
animation, or streaming tokens) it feels like a decade. However, asking
ChatGPT to do the same job takes about the same amount of time, from
submitting the request to the last token being generated.
AntoLC (Collaborator) left a comment:


We are also finalizing a PR (#479) where we completely stop using markdown with the translation actions.

Collaborator

I think it is a good idea to work directly in markdown.

These prompts should be improved as well; they use the same approach:
https://github.com/numerique-gouv/impress/blob/8b60bc57e6c980dfc5eed8932a3cf64244b2079b/src/backend/core/services/ai_services.py#L13-L36

lebaudantoine (Collaborator, Author)

Yes, of course. I mentioned it in my commit: "Please note I haven't updated the other prompts yet; let's discuss it first."
