Tensor Parallelism #1

timmytwoteeth · 2024-03-14T14:41:04Z

Hello,

Thank you for the great work.

I was wondering whether ScatterMoE supports tensor parallelism.

Thank you!

shawntan · 2024-03-15T07:33:29Z

We're thinking about it! I'll keep this issue open as a reminder.

timmytwoteeth · 2024-03-16T01:30:44Z

Appreciate the update.

yikangshen · 2024-03-25T17:04:08Z

If you need model parallelism for training, we suggest using pipeline parallelism for now. It works very well for MoE models because they usually have a narrow hidden state relative to their parameter count, so the activations passed between pipeline stages are small compared to the weights each stage holds.
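A rough back-of-the-envelope sketch of that point, in Python. All sizes below are hypothetical and chosen only for illustration; they are not taken from ScatterMoE or any particular model. It compares the activations sent across one pipeline-stage boundary with the expert weights held in one MoE MLP layer, assuming 2-byte elements and counting only the up and down projections.

```python
def pipeline_boundary_bytes(tokens: int, hidden: int, bytes_per_elem: int = 2) -> int:
    """Bytes of activations sent across one pipeline-stage boundary (forward pass)."""
    return tokens * hidden * bytes_per_elem


def moe_mlp_param_bytes(num_experts: int, hidden: int, ffn: int,
                        bytes_per_elem: int = 2) -> int:
    """Bytes of expert weights in one MoE MLP layer (up and down projections only)."""
    return num_experts * 2 * hidden * ffn * bytes_per_elem


# Hypothetical configuration (assumption, for illustration only).
tokens = 4 * 2048   # micro-batch of 4 sequences, 2048 tokens each
hidden = 1024       # hidden state width
ffn = 4096          # per-expert feed-forward width
num_experts = 8

act = pipeline_boundary_bytes(tokens, hidden)
params = moe_mlp_param_bytes(num_experts, hidden, ffn)

print(f"activations per boundary: {act / 2**20:.0f} MiB")
print(f"expert weights per layer: {params / 2**20:.0f} MiB")
```

Note that the boundary traffic depends on the token count and the hidden width but not on the number of experts, so adding experts grows the weights without growing what pipeline stages exchange.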
