wip: use our internal LLM + switch from JSON to markdown
Simplify the LLM's job: instead of requesting JSON output with a single key, make sure the LLM doesn't output any extra information.

By simplifying the LLM's job, we make sure its output can always be parsed.
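A minimal sketch of why the JSON wrapper was fragile (the reply text here is made up): when a model echoes multi-line markdown inside a JSON string, it tends to emit raw newlines, which `json.loads` rejects, whereas plain-text output needs no parsing at all.

```python
import json

# A typical model reply: multi-line markdown wrapped in JSON,
# with a raw (unescaped) newline inside the string value.
raw_reply = '{"answer": "# Titre\nBonjour le monde"}'

try:
    json.loads(raw_reply)
    parsed = True
except json.JSONDecodeError:
    # Raw control characters are invalid inside JSON strings,
    # which is why the old code needed regex-based sanitization.
    parsed = False

print(parsed)  # False

# With plain-text output there is nothing to parse:
plain_reply = "# Titre\nBonjour le monde"
answer = {"answer": plain_reply}
```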

I ran a quick test with the Translate prompt. After a bunch of runs, instructing the model to output only the translated text seems to be enough.

I did some light prompt engineering, using ChatGPT and Claude to generate a proper system prompt. It works reasonably well, BUT there is certainly room for improvement.

I haven't yet searched for open-source prompts we could reuse from a prompt library. A perfect translation seems to be a difficult task for an 8B model.

Please note I haven't updated the other prompts yet; let's discuss it before I do. I ran my experiment with our internal LLM, which is optimized for throughput rather than latency (there is a trade-off). I'll try tuning a few of its parameters to see if I can reduce latency.

For 880 tokens (based on ChatGPT's online token counter), it takes roughly 17s, versus ~40s for Albert CNRS 70B.
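Back-of-the-envelope throughput from those timings (assuming all 880 tokens are generated output, which may not be exactly how the counter was used):

```python
# Rough tokens/s comparison from the measured wall-clock times.
tokens = 880
internal_8b_seconds = 17   # internal LLM (8B)
albert_70b_seconds = 40    # Albert CNRS 70B

tps_8b = tokens / internal_8b_seconds    # ~51.8 tokens/s
tps_70b = tokens / albert_70b_seconds    # 22.0 tokens/s

print(round(tps_8b, 1), round(tps_70b, 1))  # 51.8 22.0
```

So the 8B deployment is a bit over twice as fast end-to-end on this sample, despite being tuned for throughput rather than latency.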

For 180 tokens it takes roughly 3s. Without proper UX (e.g. a nicer loading animation, streamed tokens) it feels like an eternity. However, asking ChatGPT to do the same job takes about the same amount of time, from submitting the request to the last token being generated.
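A sketch of the streaming idea (not part of this commit; with the OpenAI client it would mean passing `stream=True` and rendering chunks as they arrive). Here a plain generator stands in for the network stream so the rendering loop can be shown on its own:

```python
def fake_stream(text, chunk_size=8):
    """Stand-in for a streaming API response: yields small token chunks."""
    for i in range(0, len(text), chunk_size):
        yield text[i:i + chunk_size]

def render_streaming(chunks):
    """Print chunks as they arrive so the user sees progress immediately."""
    parts = []
    for chunk in chunks:
        parts.append(chunk)
        print(chunk, end="", flush=True)  # incremental UI update
    print()
    return "".join(parts)

result = render_streaming(fake_stream("Bonjour, voici la traduction."))
```

The perceived latency drops to the time-to-first-token, even though the total generation time is unchanged.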
lebaudantoine committed Dec 15, 2024
1 parent 65fdf11 commit 8b60bc5
Showing 2 changed files with 27 additions and 23 deletions.
47 changes: 24 additions & 23 deletions src/backend/core/services/ai_services.py
@@ -35,10 +35,29 @@
),
}


AI_TRANSLATE = (
"Translate the markdown text to {language:s}, preserving markdown formatting. "
'Return JSON: {{"answer": "your translated markdown text in {language:s}"}}. '
"Do not provide any other information."
"""
You are a professional translator for `{language:s}`.
### Guidelines:
1. **Preserve exactly as-is:**
- All formatting, markdown, symbols, tags
- Names, numbers, URLs, citations
- Code blocks and technical terms
2. **Translation Rules:**
- Use natural expressions in the target language
- Match the tone of the source text (default: professional)
- Maintain original meaning precisely
- Adapt idioms to suit the target culture
- Ensure grammatical correctness and stylistic coherence
3. **Do Not:**
- Add, remove, or explain any content
Output only the translated text, keeping all original formatting intact.
"""
)


@@ -59,32 +78,14 @@ def call_ai_api(self, system_content, text):
"""Helper method to call the OpenAI API and process the response."""
response = self.client.chat.completions.create(
model=settings.AI_MODEL,
response_format={"type": "json_object"},
messages=[
{"role": "system", "content": system_content},
{"role": "user", "content": json.dumps({"markdown_input": text})},
{"role": "user", "content": text},
],
)

content = response.choices[0].message.content

try:
sanitized_content = re.sub(r'\s*"answer"\s*:\s*', '"answer": ', content)
sanitized_content = re.sub(r"\s*\}", "}", sanitized_content)
sanitized_content = re.sub(r"(?<!\\)\n", "\\\\n", sanitized_content)
sanitized_content = re.sub(r"(?<!\\)\t", "\\\\t", sanitized_content)

json_response = json.loads(sanitized_content)
except (json.JSONDecodeError, IndexError):
try:
json_response = json.loads(content)
except json.JSONDecodeError as err:
raise RuntimeError("AI response is not valid JSON", content) from err

if "answer" not in json_response:
raise RuntimeError("AI response does not contain an answer")

return json_response
return {"answer": content}

def transform(self, text, action):
"""Transform text based on specified action."""
3 changes: 3 additions & 0 deletions src/helm/env.d/dev/values.impress.yaml.gotmpl
@@ -50,6 +50,9 @@ backend:
AWS_S3_SECRET_ACCESS_KEY: password
AWS_STORAGE_BUCKET_NAME: impress-media-storage
STORAGES_STATICFILES_BACKEND: django.contrib.staticfiles.storage.StaticFilesStorage
AI_API_KEY: **ask antoine**
AI_BASE_URL: https://albertine.beta.numerique.gouv.fr/v1/
AI_MODEL: meta-llama/Llama-3.1-8B-Instruct

migrate:
command:
