
Feature Request: Add support for Pixtral and other Vision models (llama 3.2 11b/90b etc) #5

Closed
YorkieDev opened this issue Oct 8, 2024 · 17 comments
Labels
fixed-in-next-release The next release of LM Studio fixes this issue

Comments

@YorkieDev

Pixtral works great in mlx-vlm (Blaizzy/mlx-vlm#67); it would be great to see support land in LM Studio.

@youcefs21

youcefs21 commented Oct 11, 2024

The mlx-vlm version in LM Studio is 0.0.13, and Pixtral is supported as of 0.0.15. Don't we just need to upgrade the mlx-vlm version?

@YorkieDev
Author

That would probably work. Seems fairly straightforward to do @yagil @neilmehta24

@Blaizzy

Blaizzy commented Oct 11, 2024

Wait for v0.1.0 (Blaizzy/mlx-vlm#41)

It will be released later today or tomorrow.

@Blaizzy

Blaizzy commented Oct 11, 2024

It has some fixes and many new features.

@julien-blanchon

Awesome, v0.1.0 is now merged!

@mattjcly
Member

Pixtral is now supported thanks to @Blaizzy !

@julien-blanchon

I hope this will get bundled to LM Studio soon

@yagil
Member

yagil commented Oct 17, 2024

> I hope this will get bundled to LM Studio soon

Not there yet, but keep an eye on https://lmstudio.ai/beta-releases

@neilmehta24
Member

As of #22, mlx-engine has Pixtral and Llama 3.2 vision support. We expect to roll this out to LM Studio soon.

@orcinus

orcinus commented Dec 23, 2024

Any updates? Vision support in LM Studio is abysmal in general: the UI and UX feel more like an MVP than a usable product, and recent vision-enabled models like Qwen VL are extremely buggy, bordering on unusable. Combined with being months late to support major vision-enabled LLMs, this makes LM Studio a tough sell. (And yes, I know it's free; I'd gladly pay for it if that meant faster rollout of features and architecture support.)

@YorkieDev
Author

Marking this issue as solved, as 0.3.5 Build 9 has support for Pixtral and Llama 3.2 Vision in the MLX Engine.

https://lmstudio.ai/beta-releases

@orcinus

orcinus commented Dec 23, 2024

Still doesn't work.
I get `Unknown ArrayValue filter: trim` when trying to load MLX Llama 3.2 Vision Instruct.
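An error like this usually means the prompt-template engine doesn't implement Jinja's built-in `trim` filter, which Llama-style chat templates use to strip whitespace from message content. A minimal sketch with the `jinja2` package (the template here is a simplified stand-in, not the actual Llama 3.2 template) showing where the filter appears:

```python
from jinja2 import Environment

# Simplified stand-in for a Llama-style chat template that uses `| trim`.
template_src = (
    "{% for m in messages %}"
    "<|start_header_id|>{{ m.role }}<|end_header_id|>\n"
    "{{ m.content | trim }}<|eot_id|>"
    "{% endfor %}"
)

env = Environment()
prompt = env.from_string(template_src).render(
    messages=[{"role": "user", "content": "  Hello  "}]
)
print(prompt)  # content is rendered with surrounding whitespace stripped
```

An engine whose template parser lacks `trim` will reject the template at load time, which matches the error above.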

@orcinus

orcinus commented Dec 23, 2024

Never mind, I'm dumb. Apparently Llama 3.2 Vision does not have a system role (but regular Llama 3.2 does). Weird.

@orcinus

orcinus commented Dec 23, 2024

Unfortunately, it still doesn't work: it leaks RAM badly.
Deleting messages from the context doesn't reduce RAM usage either; it just keeps ballooning from the moment you start the first inference.
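One way to confirm a report like this is to sample peak process memory across repeated inferences. A minimal sketch using the stdlib `resource` module (`run_inference` is a hypothetical placeholder for an actual MLX engine call):

```python
import resource

def peak_rss() -> int:
    # Peak resident set size of this process.
    # Units: kilobytes on Linux, bytes on macOS.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

def run_inference() -> None:
    # Hypothetical placeholder; a real check would call the model here.
    pass

baseline = peak_rss()
for i in range(5):
    run_inference()
    print(f"iteration {i}: peak RSS {peak_rss()}")
# A healthy engine plateaus after warm-up; a leak shows peak RSS
# climbing on every iteration even after the context is cleared.
```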

@yagil
Member

yagil commented Dec 23, 2024

@orcinus it's worth opening a separate issue for performance; we'd track it independently. Also cc @Blaizzy

@orcinus

orcinus commented Dec 23, 2024

I'm too slow; someone already opened one: #63
I've added my own case to that one.

@Blaizzy

Blaizzy commented Dec 24, 2024

Thanks!

I'll be working on it starting tomorrow.
