
Experimentation with "internal" llm #505

Draft · wants to merge 1 commit into base: main
Conversation

lebaudantoine (Collaborator)
Please check my first commit. It's a draft.

Simplify the LLM's job: do not request JSON output with a single key. Instead,
make sure the LLM doesn't output any extra information.

By simplifying the LLM's job, we make sure its output can always be parsed.
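To illustrate the parsing problem this avoids (a minimal sketch; the helper names are hypothetical, not the project's actual code): a JSON-with-one-key contract breaks as soon as the model adds any surrounding chatter, while a plain-text contract only needs a strip.

```python
import json


def parse_json_answer(raw: str) -> str:
    """Old approach: expect a JSON object with a single key.

    Fails with JSONDecodeError if the model adds any extra text.
    """
    return json.loads(raw)["answer"]


def parse_plain_answer(raw: str) -> str:
    """New approach: the prompt forbids extra output, so just strip."""
    return raw.strip()


# A well-behaved reply works for both contracts:
assert parse_json_answer('{"answer": "Bonjour"}') == "Bonjour"
assert parse_plain_answer(" Bonjour\n") == "Bonjour"

# But small models often wrap JSON in chatter, which breaks json.loads:
chatty = 'Sure! Here is the JSON:\n{"answer": "Bonjour"}'
try:
    parse_json_answer(chatty)
except json.JSONDecodeError:
    pass  # the failure mode the simplified prompt avoids
```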

I did a quick test with the Translate prompt. Adding instructions to output
only the translated text seems to be enough, after a number of tests.

I did some quick prompt engineering, using ChatGPT and Claude to generate
a proper system prompt. It works quite well, but there is certainly room for
improvement.
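For discussion, a sketch of what such a prompt and message builder could look like (the wording and function names below are illustrative assumptions, not the prompt actually committed in this PR):

```python
# Hypothetical translation prompt in the spirit of this PR: instruct the
# model to return the translated text only, instead of a JSON object.
TRANSLATE_SYSTEM_PROMPT = (
    "You are a translation engine. Translate the user's text into {language}. "
    "Output only the translated text, with no explanation, no quotes, "
    "and no JSON wrapping."
)


def build_translate_messages(text: str, language: str) -> list[dict]:
    """Build chat messages for an OpenAI-compatible completion call."""
    return [
        {
            "role": "system",
            "content": TRANSLATE_SYSTEM_PROMPT.format(language=language),
        },
        {"role": "user", "content": text},
    ]
```

The system/user split keeps the instruction out of the translatable payload, which helps a small model avoid translating the instructions themselves.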

I haven't yet searched for open-source prompts we could reuse from a prompt
library. A perfect translation seems to be a difficult job for an 8B model.

Please note I haven't updated the other prompts yet; let's discuss it first.
I ran my experiment with our internal LLM, which is optimized for throughput
rather than latency (there is a trade-off). I'll try fine-tuning a few of its
parameters to see if I can reduce its latency.

For 880 tokens (based on ChatGPT's online token counter), it takes roughly
17 s, vs ~40 s for Albert CNRS 70B.
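As a back-of-the-envelope check, those timings translate to roughly the following generation throughputs (derived only from the numbers above):

```python
# 880 tokens in ~17 s on the internal LLM vs ~40 s on Albert CNRS 70B.
internal_tps = 880 / 17  # ~52 tokens/s
albert_tps = 880 / 40    # 22 tokens/s
```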

For 180 tokens it takes roughly 3 s. Without proper UX (e.g. a nicer loading
animation, or streaming tokens) it feels like a decade. However, asking
ChatGPT to do the same job takes about the same amount of time, from
submitting the request to the last token being generated.
AntoLC (Collaborator) left a comment:


We are also finalizing a PR (#479) where we completely stop using markdown with the translation actions.

Collaborator

I think it is a good idea to work directly in markdown.

These prompts should be improved as well; they use the same approach:
https://github.com/numerique-gouv/impress/blob/8b60bc57e6c980dfc5eed8932a3cf64244b2079b/src/backend/core/services/ai_services.py#L13-L36

lebaudantoine (Collaborator, Author)

Yes, of course. I mentioned it in my commit: "Please note I haven't updated the other prompts yet; let's discuss it first."
