[New Model]: Codestral Mamba #6479

K-Mistele · 2024-07-16T15:36:27Z

The model to consider.

Mamba Codestral: https://huggingface.co/mistralai/mamba-codestral-7B-v0.1

Highlights:

SOTA 7B code model
theoretically unlimited context length; tested up to 256k
inference is linear-complexity with respect to sequence length, whereas transformer attention is quadratic-complexity
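To make the linear-versus-quadratic point concrete, here is a minimal toy sketch in Python. It is not Mamba's selective-scan kernel and not vLLM code; the function names and the scalar recurrence are hypothetical, chosen only to show how the amount of work grows with sequence length under each scheme.

```python
from typing import List


def attention_pair_count(seq_len: int) -> int:
    """Count query-key scores computed by full causal self-attention.

    Token i attends to tokens 0..i, so it needs i + 1 scores.
    The total is seq_len * (seq_len + 1) / 2, which grows quadratically.
    """
    return sum(i + 1 for i in range(seq_len))


def recurrent_step_count(seq_len: int) -> int:
    """Count state updates for a recurrent (SSM-style) scan: one per token."""
    return seq_len


def linear_scan(xs: List[float], a: float = 0.9, b: float = 0.1) -> List[float]:
    """Toy scalar recurrence h_t = a * h_{t-1} + b * x_t.

    Each token triggers exactly one update of a fixed-size state, so the
    cost is linear in len(xs) and the state does not grow with context.
    The coefficients a and b are illustrative assumptions.
    """
    h = 0.0
    out = []
    for x in xs:
        h = a * h + b * x
        out.append(h)
    return out


if __name__ == "__main__":
    for n in (1_000, 2_000, 4_000):
        print(n, attention_pair_count(n), recurrent_step_count(n))
```

Doubling the sequence length roughly quadruples the attention score count but only doubles the number of recurrent state updates, which is the scaling difference the highlight refers to.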

The closest model vllm already supports.

Jamba seems to be the closest model, since it is Mamba-based: https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/models/jamba.py

What's your difficulty of supporting the model you want?

Mamba is a non-transformer architecture, but vLLM already supports a Mamba-based model (Jamba), so it's unclear how difficult adding this one would be.

simon-mo · 2024-07-16T16:38:15Z

cc @tlrmchlsmth who is working on it

digantamisra98 · 2024-10-10T09:47:24Z

Any updates on this?

tlrmchlsmth · 2024-10-10T21:08:43Z

@digantamisra98 I have a branch where I have this working -- planning to land #6484 soon (possibly today) and will follow up with Mamba2 support afterwards (which will include support for Codestral Mamba)

github-actions · 2025-01-09T02:14:17Z

This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!

K-Mistele added the "new model" label Jul 16, 2024

tlrmchlsmth mentioned this issue Jul 16, 2024

[Model] Support Mamba #6484

Merged

tlrmchlsmth linked a pull request Oct 12, 2024 that will close this issue

[Model] Support Mamba2 (Codestral Mamba) #9292

Open

5 tasks

github-actions bot added the stale label Jan 9, 2025
