Which Parameter is used for Stage 3 ZeRO? #3605
Unanswered
fahadh4ilyas
asked this question in
Q&A
Replies: 0 comments
I've been reading the DeepSpeed Integration page in the Transformers documentation, and something there caught my interest. Quoted from here
However, I could not find this fact stated here. Is it really true? If it is, which parameters are used for ZeRO stage 3?