This repository has been archived by the owner on Jun 24, 2024. It is now read-only.
GGUF is the new file format specification we've been designing to solve the problem of not being able to identify a model. The specification is here: ggerganov/ggml#302
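To make the identification problem concrete, here's a minimal sketch of parsing the fixed-size GGUF file header described in the spec (magic bytes `GGUF`, then a version, tensor count, and metadata key-value count). This assumes the current spec layout with little-endian, 64-bit counts; it is an illustration, not `llm`'s actual loader.

```rust
/// Parsed GGUF file header, per the layout in ggerganov/ggml#302.
#[derive(Debug, PartialEq)]
struct GgufHeader {
    version: u32,
    tensor_count: u64,
    metadata_kv_count: u64,
}

const GGUF_MAGIC: &[u8; 4] = b"GGUF";

/// Read the fixed-size GGUF header from the start of a buffer.
/// Returns None if the buffer is too short or the magic bytes don't match.
fn read_gguf_header(data: &[u8]) -> Option<GgufHeader> {
    // 4 (magic) + 4 (version) + 8 (tensor count) + 8 (kv count) = 24 bytes.
    if data.len() < 24 || &data[0..4] != GGUF_MAGIC {
        return None;
    }
    Some(GgufHeader {
        version: u32::from_le_bytes(data[4..8].try_into().ok()?),
        tensor_count: u64::from_le_bytes(data[8..16].try_into().ok()?),
        metadata_kv_count: u64::from_le_bytes(data[16..24].try_into().ok()?),
    })
}
```

Because the magic check fails fast, a loader can cheaply distinguish GGUF files from legacy GGML files before committing to a parsing path.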
`llm` should be able to do the following:

- continue supporting existing models (i.e. this change should be non-destructive)
- load GGUF models and automatically dispatch to the correct model architecture
  - `load_dynamic` already has an interface that should support this, but loading currently only begins after the model architecture is known
- use the new information stored in the metadata to improve the UX, including automatically using the HF tokenizer if available
- save GGUF models, especially when quantizing
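The automatic dispatch in the second bullet could key off the `general.architecture` metadata field that the spec defines. A hedged sketch, with a made-up `ModelArch` enum standing in for `llm`'s real model types:

```rust
use std::collections::HashMap;

/// Illustrative subset of architectures a loader might support.
#[derive(Debug, PartialEq)]
enum ModelArch {
    Llama,
    Gpt2,
    GptNeoX,
}

/// Pick a model architecture from parsed GGUF metadata. The spec stores
/// the architecture name under the `general.architecture` key; anything
/// unrecognized yields None so the caller can report a useful error.
fn dispatch_arch(metadata: &HashMap<String, String>) -> Option<ModelArch> {
    match metadata.get("general.architecture")?.as_str() {
        "llama" => Some(ModelArch::Llama),
        "gpt2" => Some(ModelArch::Gpt2),
        "gptneox" => Some(ModelArch::GptNeoX),
        _ => None,
    }
}
```

The point of the issue is that this decision can now happen *before* the tensors are read, whereas today loading only begins once the architecture is already known out-of-band.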
`llm` could do the following:

- convert old models to GGUF, prompting the user for any missing metadata
- implement the migration tool mentioned in the spec, which autonomously converts models for users based on file hashes
Hi - sorry about the lack of updates. I've been extremely busy for the last ~two months and haven't had much free time to work on llm. I'm hoping this will ease up soon and we can start catching up properly.