
LLM transliteration #6


Open
theRealProHacker opened this issue Jan 17, 2025 · 3 comments · May be fixed by #7
Assignees: theRealProHacker
Labels: enhancement (New feature or request)

Comments

@theRealProHacker
Owner

theRealProHacker commented Jan 17, 2025

Use https://github.com/mlc-ai/web-llm for web-based LLM inference (see #6 (comment))

Use https://github.com/linuxscout/yaziji to generate random phrases, transliterate and correct them to get a ton of data.
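
A rough sketch of what that data-generation loop could look like. The yaziji call and the `transliterate` helper below are assumptions for illustration only, not verified APIs; check https://github.com/linuxscout/yaziji for the real entry point.

import csv

import yaziji  # hypothetical import/API, see note above

def build_dataset(n: int = 1000, out_path: str = "pairs.csv") -> None:
    """Generate n random Arabic phrases and pair them with draft transliterations."""
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["arabic", "draft_transliteration"])
        for _ in range(n):
            phrase = yaziji.generate_phrase()   # hypothetical yaziji entry point
            draft = transliterate(phrase)       # placeholder for the current pipeline
            writer.writerow([phrase, draft])
    # The drafts are then corrected by hand to get gold training pairs.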

@theRealProHacker theRealProHacker self-assigned this Jan 17, 2025
@theRealProHacker theRealProHacker added the enhancement New feature or request label Jan 17, 2025
@theRealProHacker
Owner Author

Web-based LLM inference doesn't work on many browsers and takes too long to load.

What I tried:

import * as webllm from "https://esm.run/@mlc-ai/web-llm";

const initProgressCallback = (initProgress) => {
  console.log(initProgress);
};

// 8B parameters, the main reason loading takes so long.
const selectedModel = "Llama-3.1-8B-Instruct-q4f32_1-MLC";

const engine = await webllm.CreateMLCEngine(
  selectedModel,
  { initProgressCallback: initProgressCallback },
);

// `text` is the Arabic input to transliterate.
const messages = [
  { role: "system", content: "Transliterating according to IJMES ..." },
  { role: "user", content: text },
];

const reply = await engine.chat.completions.create({ messages });

Maybe a model with fewer parameters would load in an acceptable time, but it makes sense to hold off on web-based inference until WebGPU is fully adopted by all major browsers.

@theRealProHacker theRealProHacker linked a pull request Jan 22, 2025 that will close this issue
@theRealProHacker
Owner Author

#7 instead uses the HuggingFace inference API from the backend.

The following aspects are very important:

  1. Instruct the model to answer succinctly. In the tested prompt, this instruction is even stated twice.
  2. Transliteration requires no experimentation or creativity; there is really only one correct answer, so the temperature should be set to 0.
  3. Qwen 2.5 performs very well on Arabic, as the Open Arabic LLM Leaderboard shows, and 72B parameters almost guarantee solid performance in most cases.

from huggingface_hub import InferenceClient

client = InferenceClient(api_key="...")

# Illustrative wrapper; the actual function in #7 may be named differently.
def transliterate(text: str) -> str:
    messages = [
        {
            "role": "system",
            # Succinctness is requested twice on purpose (see point 1 above).
            "content": (
                "You are a transliterator that transliterates according to the "
                "IJMES standard. Your task is to transliterate as quickly and "
                "succinctly as possible. Don't explain anything, keep your "
                "answers as short as possible."
            ),
        },
        {
            "role": "user",
            "content": text,
        },
    ]

    completion = client.chat.completions.create(
        model="Qwen/Qwen2.5-72B-Instruct",
        messages=messages,
        max_tokens=500,
        temperature=0,  # only one correct answer (see point 2 above)
    )

    return completion.choices[0].message.content
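
Calling it from the backend is then a one-liner (the `transliterate` name above is only illustrative):

print(transliterate("مرحبا بالعالم"))  # prints the model's IJMES transliteration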

@theRealProHacker
Owner Author

For now, fine-tuning and gathering more data seem unnecessarily complicated. Or, to put it differently, the costs outweigh the benefits given the quite good performance of the current approach, at least for IJMES.
