
feat: exclude mamba blocks for jamba when load8bit #1578

Merged
Merged 1 commit on May 7, 2024

Conversation

NanoCode012
Collaborator

Description

Closes #1498


@NanoCode012
Collaborator Author

@creatorrr I made the PR. Please let me know if you can test it and confirm that it works.
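For context, the change keeps Jamba's mamba blocks out of 8-bit quantization. In Hugging Face transformers, this kind of exclusion is typically expressed via `BitsAndBytesConfig(llm_int8_skip_modules=...)`. As a rough, hypothetical sketch (not the PR's actual code), one way to collect the module names to skip:

```python
# Hypothetical sketch: pick out mamba-block module names so they can be
# passed to `llm_int8_skip_modules` and left unquantized under load_in_8bit.

def mamba_skip_modules(module_names):
    """Return the (sorted, de-duplicated) module names that belong to mamba blocks."""
    return sorted({name for name in module_names if "mamba" in name})

# Made-up Jamba-style module names for illustration only.
names = [
    "model.layers.0.mamba.in_proj",
    "model.layers.0.mamba.out_proj",
    "model.layers.1.self_attn.q_proj",
    "model.layers.2.mamba.x_proj",
]
print(mamba_skip_modules(names))
# → ['model.layers.0.mamba.in_proj', 'model.layers.0.mamba.out_proj',
#    'model.layers.2.mamba.x_proj']
```

The resulting list would then be handed to something like `BitsAndBytesConfig(load_in_8bit=True, llm_int8_skip_modules=...)`; the helper name and the module names above are assumptions for illustration.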

@NanoCode012 changed the title from "feat: exclude mamba blocks for jamba" to "feat: exclude mamba blocks for jamba when load8bit" on Apr 30, 2024
@creatorrr

I'd have to do a training run with and without the patch to compare, but the recommendation from the Jamba team is exactly this, so 💯 from me. And thanks a bunch for getting around to this!

@NanoCode012 NanoCode012 merged commit 8b9c15b into axolotl-ai-cloud:main May 7, 2024
7 checks passed
@NanoCode012 NanoCode012 deleted the feat/no-quant-mamba branch May 7, 2024 13:52
djsaunde pushed a commit that referenced this pull request Dec 17, 2024
Successfully merging this pull request may close these issues.

[jamba] Quantizing should exclude mamba layers