Future of candle-transformers / long-term plans #1186
Comments
Another I know that @philpax and I have slightly different ways of looking at
I would love to know the MPS timeline as well.
Hello,
One of the ideas for building candle is to drive the core libraries' development by the requirements of state-of-the-art models, and that's where `candle-transformers` comes in. One key aspect of candle is to provide flexibility compared to llama.cpp, for example: if someone wants to try quantizing any model, candle should make this easy, either by writing Rust code or even by first prototyping things in Python using the `candle-pyo3` layer.
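To make that a bit more concrete, here is a rough sketch of quantizing an arbitrary weight from Rust. It assumes the `QTensor`/`QMatMul` API of recent candle versions (`QTensor::quantize`, `GgmlDType`, `QMatMul::from_qtensor`); exact names and signatures have moved around between releases, so treat it as an illustration rather than a definitive recipe:

```rust
// Rough sketch: quantize a weight tensor to a 4-bit ggml layout and use it
// for a matmul. Assumes the quantized API of recent candle versions; names
// and signatures may differ in older releases.
use candle_core::quantized::{GgmlDType, QMatMul, QTensor};
use candle_core::{Device, Module, Result, Tensor};

fn main() -> Result<()> {
    let device = Device::Cpu;
    // Stand-in for a real model weight (e.g. one loaded from safetensors).
    let weight = Tensor::randn(0f32, 1.0, (4096, 4096), &device)?;

    // Quantize to Q4_0 and wrap the result in a quantized matmul.
    let qweight = QTensor::quantize(&weight, GgmlDType::Q4_0)?;
    let qmatmul = QMatMul::from_qtensor(qweight)?;

    // Use it like an ordinary linear layer.
    let xs = Tensor::randn(0f32, 1.0, (1, 4096), &device)?;
    let ys = qmatmul.forward(&xs)?;
    println!("{:?}", ys.shape());
    Ok(())
}
```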
That's a bit unclear. For the near future it will live in this repo, and we will continue pushing more models into it, hopefully adding examples of missing model types such as TTS (Tortoise? MusicGen?), multimodal (Fuyu?), etc.
We haven't done much of that yet. It's fairly tricky, as lots of models have various kinks that make it hard to define very generic traits. Here we aim to propose a collection of models rather than a framework for such models. This may evolve, but I feel that the amount of boilerplate needed to switch from one model to another is not that large currently (this can be seen in the examples), and I would be more in favor of making helper functions for the shared bits rather than providing a fully unified interface.
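To give a rough idea of what I mean by helper functions, here is a hedged sketch: `LogitsProcessor` is a real sampling helper that candle-transformers ships, while `MyModel` and `next_token` are hypothetical stand-ins for the small amount of per-model glue:

```rust
// Hedged sketch of "shared helpers instead of one unified trait": the model
// keeps its own forward() signature, and the shared bit (sampling) is a plain
// helper. MyModel and next_token are hypothetical; LogitsProcessor exists in
// candle-transformers.
use candle_core::{DType, Device, Result, Tensor};
use candle_transformers::generation::LogitsProcessor;

// Hypothetical model type with its own, non-generic API.
struct MyModel {
    vocab_size: usize,
}

impl MyModel {
    fn forward(&self, tokens: &Tensor, _pos: usize) -> Result<Tensor> {
        // Placeholder for the real, model-specific forward pass.
        Tensor::zeros(self.vocab_size, DType::F32, tokens.device())
    }
}

// The per-model glue stays small because the shared piece (sampling) is reused.
fn next_token(
    model: &MyModel,
    tokens: &Tensor,
    pos: usize,
    sampler: &mut LogitsProcessor,
) -> Result<u32> {
    let logits = model.forward(tokens, pos)?;
    sampler.sample(&logits)
}

fn main() -> Result<()> {
    let device = Device::Cpu;
    let model = MyModel { vocab_size: 32_000 };
    let tokens = Tensor::new(&[1u32, 2, 3], &device)?;
    let mut sampler = LogitsProcessor::new(42, None, None);
    println!("next token id: {}", next_token(&model, &tokens, 0, &mut sampler)?);
    Ok(())
}
```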
Not that much in
Yes, we do want to support Metal and there has been some progress on it. I'm not sure about the timeline either, but it's certainly something that we want to have.
My feeling on this is that it's better to have multiple crates trying different designs, aiming at different use cases, and experimenting with different tradeoffs overall. On the backend side, Metal and WebGPU are two obvious things that would help lots of candle users and are on the todo list. For high-level abstractions, this really depends on user demand and also on what the community builds. Having higher-level crates built on candle would be the best outcome, and hopefully they can be developed and maintained independently. That said, if none of these appear and we get lots of user demand for such abstractions, we should consider building them directly.
Thanks for the detailed response! It's clarified a lot of things, and helped me better understand the relationship between Candle and candle-transformers.
Yes, I'd agree with this. It would be nice to cover the additional territory that you do, but it's not a priority for us; our focus is on ensuring that LLM inference is easy and fast first and foremost. I'd be happy to see Candle as a backend some day :)
That makes sense; I'll be keeping an eye on things as well, sounds like an exciting frontier 🙂
That's understandable, and I can see where the challenges come from, especially as you're implementing a diverse suite of models. I've found a unified interface to be useful from a library design perspective, as it allows people to learn one API and use it for every model, as opposed to having to look out for differences in individual models. However, I don't think that's too much of an issue for the current implementations - they all share a pretty similar interface from what I can see. The only thing to watch out for in this regard would be ensuring they don't drift too far apart (principle of least surprise and whatnot).
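For what it's worth, the kind of unified interface I have in mind looks roughly like the trait below. This is purely hypothetical, neither `llm` nor `candle-transformers` defines this exact trait today, but it captures the "learn one API, reuse it for every model" idea:

```rust
// Hypothetical sketch of a unified model interface; not an existing API in
// either llm or candle-transformers.
use candle_core::{Result, Tensor};

/// A common surface over autoregressive language models.
trait LanguageModel {
    /// One forward pass over `tokens` starting at position `pos`, returning logits.
    fn forward(&mut self, tokens: &Tensor, pos: usize) -> Result<Tensor>;
    /// Clear cached state (e.g. the KV cache) between prompts.
    fn reset(&mut self);
}

/// Helpers can then be written once and reused for every implementor.
fn prefill<M: LanguageModel>(model: &mut M, prompt: &Tensor) -> Result<Tensor> {
    model.reset();
    model.forward(prompt, 0)
}
```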
Yup, gotcha! That makes sense, the experimental phase should be quite interesting 🙂
👍
Awesome, thank you for clarifying all of this. You're right that having multiple different takes on the problem might help us figure out what the best set of tradeoffs is, and that community demand and community-built crates will be crucial to the growth of these ecosystems. Based on what you've said, I'm now confident in continuing development on `llm`. All of my questions have been answered; I'm happy to close the issue, but other people might appreciate the information you've provided, so I'll leave it up to you. Thanks once again - your insight has been absolutely invaluable ❤️
Great, very happy that this was useful and that the effort on `llm` will continue.
Hi there!
Apologies for the vague issue title, but I was struggling to think of one that conveyed my sentiments.
I'm the primary maintainer of rustformers/llm, which implements several common LLM architectures atop GGML as an easy-to-use library. In some sense, it can be considered a robust, consistent and extensible Rust library-ification of `llama.cpp`, with support for other architectures.

Recently, I've been considering winding down development on it in favour of encouraging people to use `candle-transformers` instead, because Candle can evolve faster than we do, supports more models, and isn't held back by the free time I/our contributors have.

My initial plan was to get `llm` back up to speed with the latest in `llama.cpp` and elsewhere, and then add support for other backends, so that Candle could be a secondary backend and (likely) become the primary backend in future. However, chasing a moving target is quite difficult, and `candle-transformers` already covers much of the same ground.

With that in mind, people in that discussion have raised a few issues around switching to `candle-transformers` and think `llm` is still relevant. I think it'd be simpler for the ecosystem if there was a single place for LLMs, but the concerns raised have made it harder to provide a straightforward recommendation to switch to `candle-transformers`.
So, here are my questions:

- What are the long-term plans for `candle-transformers`? Vague question, I know, but will it live in this repo forever? Will it become an ecosystem unto itself like Python `transformers`?
- Do you think `llm` is still necessary? My gut feeling is that Candle (and `candle-transformers`) could grow to cover all of its territory, but there may still be some value in custom high-level abstractions and support for other backends. That would be obviated by Candle developing its own high-level abstractions and increasing its backend support, though.

Sorry about the wall of questions, but your input is hugely useful in figuring out our own direction. Knowing what you have planned will clarify some of the unknowns for us and let us figure out what to do next.
Thanks in advance! ❤️