
Feature Request: Support for QwQ-32B Model Integration #12635

Open
CaptainRui1000 opened this issue Mar 17, 2025 · 2 comments
@CaptainRui1000
Background

QwQ-32B is a recently developed model that offers impressive performance while maintaining efficiency, making it a promising candidate for various AI tasks. As the model architecture evolves, integrating support for QwQ-32B would be valuable for those looking to use NeMo for training and deployment.

Request

I would like to request the addition of support for the QwQ-32B model within the NeMo framework. This would allow users to leverage NeMo's capabilities for training, fine-tuning, and deployment with QwQ-32B.

Motivation

Integrating QwQ-32B into NeMo would allow users to:

  • Utilize cutting-edge models without needing to switch frameworks.
  • Benefit from NeMo's optimizations and features (e.g., multi-GPU support, mixed precision).
  • Enable faster model training and inference by leveraging QwQ-32B’s efficiency.

Expected Outcome

It would be great to have QwQ-32B supported in the next release or as part of an experimental feature, allowing users to integrate it seamlessly into their workflows.

@euronymous-aithal

@akoumpa can we cover this in NeMo AutoModel?

@akoumpa akoumpa assigned akoumpa and unassigned okuchaiev Mar 20, 2025
@akoumpa
Member

akoumpa commented Mar 20, 2025

Hi @CaptainRui1000 ,

Thank you for your request. We recently introduced a HuggingFace-native workflow in NeMo that supports HuggingFace models and provides multi-GPU scaling via FSDP2. I was able to confirm that QwQ-32B is runnable on a single 8xH100 node.

I would recommend trying the PEFT or SFT notebooks with the QwQ-32B model.

Please feel free to let me know if you have any other questions. Thank you.
