Plan for MoE #5472
Original question: Is there a plan for adding Mixture of Experts (MoE) for GPT-style models? I've found PR https://github.com/NVIDIA/NeMo/pull/5409/files, but that seems to be for T5-like models. Thanks!

Replies: 4 comments
-
@aklife97, can you please share what is coming for MoE?
-
Adding MoE for GPT is pretty straightforward. I'll create a PR for MoE for GPT soon after some testing, and I'll update this thread once it's added.
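For context, a dense transformer becomes an MoE model by replacing the feed-forward sublayer in some or all blocks with several expert FFNs plus a learned router. Below is a minimal illustrative sketch with top-1 routing in PyTorch; the class and parameter names (MoEFeedForward, d_model, d_ff, num_experts) are made up for illustration, and this is not NeMo's actual implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoEFeedForward(nn.Module):
    """Illustrative drop-in replacement for a transformer FFN:
    each token is routed to exactly one expert (top-1 routing)."""
    def __init__(self, d_model: int, d_ff: int, num_experts: int):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts)  # learned gate
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        tokens = x.reshape(-1, x.size(-1))             # [num_tokens, d_model]
        gate = F.softmax(self.router(tokens), dim=-1)  # [num_tokens, num_experts]
        top_prob, top_idx = gate.max(dim=-1)           # top-1 expert per token
        out = torch.zeros_like(tokens)
        for e, expert in enumerate(self.experts):
            mask = top_idx == e
            if mask.any():
                # scale by the gate probability so the router receives gradients
                out[mask] = top_prob[mask].unsqueeze(-1) * expert(tokens[mask])
        return out.reshape_as(x)

# usage: layer = MoEFeedForward(d_model=768, d_ff=3072, num_experts=8)
#        y = layer(torch.randn(2, 16, 768))
```

In a GPT-style decoder, such a layer would replace the dense FFN while the attention, embeddings, and the rest of the block stay unchanged, which is why extending MoE from T5-like to GPT-like models is relatively straightforward.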
-
@aklife97, I am wondering whether there is any plan to add expert parallelism to the MoE support. Any guidance on supporting expert parallelism would be super helpful. Thanks!
-
Hey @adamlin120, expert parallelism for MoE is a high priority. We have it working with Megatron-Core and plan to support it inside NeMo in a few months.
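At a high level, expert parallelism places different experts on different GPUs and exchanges routed tokens via all-to-all communication. The sketch below assumes top-1 routing, one expert per rank, a fixed per-expert capacity (overflow tokens are dropped), and an already-initialized torch.distributed process group; expert_parallel_ffn and its arguments are hypothetical names, and this is not the Megatron-Core implementation:

```python
import torch
import torch.distributed as dist

def expert_parallel_ffn(tokens, router, local_expert, capacity):
    """tokens: [T, d]. router maps d -> world_size routing scores;
    local_expert is this rank's FFN. Every rank calls this collectively."""
    world = dist.get_world_size()
    probs = torch.softmax(router(tokens), dim=-1)
    top_prob, top_idx = probs.max(dim=-1)      # top-1 expert (== rank) per token

    d = tokens.size(-1)
    # Pack tokens into fixed-capacity send buffers: slots
    # [e*capacity, (e+1)*capacity) hold the tokens this rank routes to
    # expert/rank e. Equal-sized buffers let us use a plain all-to-all
    # below; tokens past capacity are simply dropped in this sketch.
    send = tokens.new_zeros(world * capacity, d)
    slot_of_token, fill = {}, [0] * world
    for t in range(tokens.size(0)):
        e = int(top_idx[t])
        if fill[e] < capacity:
            slot = e * capacity + fill[e]
            send[slot] = tokens[t]
            slot_of_token[t] = slot
            fill[e] += 1

    # All-to-all #1: each rank receives the tokens every rank routed to it.
    recv = torch.empty_like(send)
    dist.all_to_all_single(recv, send)

    processed = local_expert(recv)             # run the single local expert

    # All-to-all #2: return each processed chunk to the rank it came from.
    back = torch.empty_like(processed)
    dist.all_to_all_single(back, processed)

    out = torch.zeros_like(tokens)
    for t, slot in slot_of_token.items():
        out[t] = top_prob[t] * back[slot]      # weight by gate probability
    return out
```

The fixed capacity is what makes the equal-split all-to-all possible; production implementations typically add a load-balancing auxiliary loss and handle overflow tokens more gracefully.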