
FSDP support #142

Open
ssmmnn11 opened this issue Feb 18, 2025 · 0 comments
Labels
enhancement New feature or request

Is your feature request related to a problem? Please describe.

Add support for Fully Sharded Data Parallel (FSDP) to enable training models with large parameter counts.

Describe the solution you'd like

We need to adapt the PyTorch Lightning FSDP strategy to implement our model and reader process groups:

https://github.com/Lightning-AI/pytorch-lightning/blob/master/src/lightning/fabric/strategies/fsdp.py

Potentially, we could also make use of the model-parallel strategy:

https://github.com/Lightning-AI/pytorch-lightning/blob/master/src/lightning/fabric/strategies/model_parallel.py
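As a rough illustration of the idea (not an implementation proposal), the ranks of the job would be partitioned into disjoint model-communication groups, so that FSDP shards parameters only within each group while reader groups feed data per group. The helper name `partition_ranks` below is hypothetical; in a real strategy subclass, each rank list would be passed to `torch.distributed.new_group()` during environment setup.

```python
# Illustrative sketch only: how ranks could be split into model
# communication groups before being handed to an adapted FSDP strategy.
# `partition_ranks` is a hypothetical helper, not an existing API.

def partition_ranks(world_size: int, model_group_size: int) -> list[list[int]]:
    """Split ranks [0, world_size) into consecutive model-comm groups."""
    assert world_size % model_group_size == 0, "world size must divide evenly"
    return [
        list(range(start, start + model_group_size))
        for start in range(0, world_size, model_group_size)
    ]

# Example: 8 GPUs with model groups of size 4 gives two groups; FSDP would
# shard parameters within each group of 4, mirroring how the existing
# model/reader group logic partitions the communicator.
groups = partition_ranks(8, 4)
print(groups)  # [[0, 1, 2, 3], [4, 5, 6, 7]]
```

The actual work would be overriding the process-group setup in Lightning's `FSDPStrategy` (linked above) so these groups are used in place of the default world group.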

Describe alternatives you've considered

No response

Additional context

No response

Organisation

ECMWF

@ssmmnn11 ssmmnn11 added the enhancement New feature or request label Feb 18, 2025