# LLM transliteration #6
Web-based LLM inference doesn't work on many browsers and takes too long to load. What I tried:

```js
import * as webllm from "https://esm.run/@mlc-ai/web-llm";

// Log download/compile progress while the model loads.
const initProgressCallback = (initProgress) => {
  console.log(initProgress);
};

const selectedModel = "Llama-3.1-8B-Instruct-q4f32_1-MLC";
const engine = await webllm.CreateMLCEngine(
  selectedModel,
  { initProgressCallback },
);

const messages = [
  { role: "system", content: "Transliterating according to IJMES ..." },
  { role: "user", content: text }, // `text` is the input to transliterate
];
const reply = await engine.chat.completions.create({ messages });
```

Maybe a model with fewer parameters would load in acceptable time, but it makes more sense to hold off on web-based inference until WebGPU is fully adopted in all major browsers.
#7 instead uses the HuggingFace Inference API from the backend (sketched here as a `transliterate` helper). The important pieces are the strict system prompt and `temperature=0`:
```python
from huggingface_hub import InferenceClient

client = InferenceClient(api_key="...")

def transliterate(text: str) -> str:
    # Keep the prompt strict so the model returns only the transliteration.
    messages = [
        {
            "role": "system",
            "content": (
                "You are a transliterator that transliterates according to "
                "the IJMES standard. Your task is to transliterate as quickly "
                "and succinctly as possible. Don't explain anything, keep "
                "your answers as short as possible."
            ),
        },
        {"role": "user", "content": text},
    ]
    completion = client.chat.completions.create(
        model="Qwen/Qwen2.5-72B-Instruct",
        messages=messages,
        max_tokens=500,
        temperature=0,  # deterministic output
    )
    return completion.choices[0].message.content
```
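For illustration, a minimal call of the helper above (`transliterate` is just the wrapper name chosen here; the expected output assumes the model follows the prompt, since kitāb is the IJMES rendering of كتاب):

```python
# Expected to print something like "kitāb" if the model follows the
# system prompt; the actual output depends on the model.
print(transliterate("كتاب"))
```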
For now, fine-tuning and gathering more data seem unnecessarily complicated. Put differently, the costs outweigh the benefits given the quite good performance of the current approach, at least for IJMES.
Tasks:

- [ ] Qwen2.5-72B (see #6 (comment))
- [ ] Fine-tuning an LLM (e.g. Gemma or Llama) (see LLM transliteration #6 (comment))
- [ ] Gathering data (see LLM transliteration #6 (comment))
- [ ] Use https://github.com/mlc-ai/web-llm for web-based LLM inference (see #6 (comment))
- [ ] Use https://github.com/linuxscout/yaziji to generate random phrases, then transliterate and correct them to get a ton of data (see the sketch below).
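A rough sketch of the yaziji data-generation idea, reusing the `transliterate` helper from above; `generate_phrase` is a hypothetical placeholder, since yaziji's actual API should be checked against its repo before wiring this up:

```python
import json

# Hypothetical placeholder: replace with yaziji's real phrase generator
# (check https://github.com/linuxscout/yaziji for the actual API).
def generate_phrase() -> str:
    raise NotImplementedError("wire this up to yaziji")

def build_dataset(n: int, path: str = "ijmes_pairs.jsonl") -> None:
    """Generate n (Arabic phrase, draft IJMES transliteration) pairs."""
    with open(path, "w", encoding="utf-8") as f:
        for _ in range(n):
            phrase = generate_phrase()
            draft = transliterate(phrase)  # backend helper from above
            # Drafts still need human correction before they can be
            # trusted as training data.
            record = {"arabic": phrase, "ijmes": draft}
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
```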