refactor dbrx experts to use FusedMoe layer #186

divakar-amd · 2024-09-16T19:15:59Z

Re-use the FusedMoe layer for the dbrx model.

- Create a DbrxMoe class.
- The DbrxExperts class now uses FusedMoe as its base class, with a custom weight loader.

This will allow us to use the existing MoE-based optimizations from mixtral, such as VLLM_MOE_PADDING.
It will also expose "process_weights_after_loading" for this model.
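For readers unfamiliar with the pattern, here is a minimal, dependency-free sketch of "subclass a fused MoE layer and supply a custom weight loader". It is not the code from this PR: `FusedMoeSketch`, `DbrxExpertsSketch`, the `w13`/`w2` attribute names, and the assumed checkpoint layout (one flat `[num_experts * intermediate][hidden]` matrix per tensor name `w1`, `v1`, `w2`) are all hypothetical stand-ins chosen for illustration.

```python
class FusedMoeSketch:
    """Toy stand-in for a fused MoE base layer (hypothetical, not the vLLM class)."""

    def __init__(self, num_experts, hidden, intermediate):
        self.num_experts = num_experts
        self.hidden = hidden
        self.intermediate = intermediate
        self.processed = False
        # Per expert, w13 stacks the gate (w1) rows on top of the up (v1) rows:
        # shape [num_experts][2 * intermediate][hidden].
        self.w13 = [[[0.0] * hidden for _ in range(2 * intermediate)]
                    for _ in range(num_experts)]
        # Per expert, w2 is the down projection:
        # shape [num_experts][hidden][intermediate].
        self.w2 = [[[0.0] * intermediate for _ in range(hidden)]
                   for _ in range(num_experts)]

    def process_weights_after_loading(self):
        # Hook where post-load transforms (for example padding) would run.
        # Here it only records that it was called.
        self.processed = True


class DbrxExpertsSketch(FusedMoeSketch):
    """Subclass that only adds a loader for the assumed checkpoint layout."""

    def weight_loader(self, name, loaded):
        # ASSUMPTION: `loaded` is a flat list of rows, [num_experts * intermediate][hidden],
        # with the rows of expert 0 first, then expert 1, and so on.
        size = self.intermediate
        for e in range(self.num_experts):
            rows = loaded[e * size:(e + 1) * size]
            if name == "w1":
                self.w13[e][:size] = [list(r) for r in rows]
            elif name == "v1":
                self.w13[e][size:] = [list(r) for r in rows]
            elif name == "w2":
                # Transpose [intermediate][hidden] into [hidden][intermediate].
                self.w2[e] = [list(col) for col in zip(*rows)]
            else:
                raise ValueError(f"unexpected weight name: {name}")


layer = DbrxExpertsSketch(num_experts=2, hidden=2, intermediate=1)
layer.weight_loader("w1", [[1, 2], [3, 4]])
layer.weight_loader("v1", [[5, 6], [7, 8]])
layer.weight_loader("w2", [[9, 10], [11, 12]])
layer.process_weights_after_loading()
```

The point of the design is that the subclass owns only the checkpoint-to-parameter mapping, while anything implemented once on the base layer (such as the post-load hook) is inherited without model-specific code.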

charlifu · 2024-09-16T21:49:32Z

LGTM, thank you for making the change.

refactor dbrx experts to use FusedMoe layer

85e6084

divakar-amd requested review from rasmith, charlifu, ilia-cher, alexeykondrat, gshtras, Alexei-V-Ivanov-AMD, mawong-amd, maleksan85 and hegemanjw4amd September 16, 2024 19:15

divakar-amd self-assigned this Sep 16, 2024

yapf re-format

d1e3448

ROCm deleted a comment from github-actions bot Sep 16, 2024

divakar-amd marked this pull request as ready for review September 16, 2024 19:22

divakar-amd requested a review from shajrawi September 16, 2024 19:24

prepare for the fp8 input

a54b155

charlifu approved these changes Sep 16, 2024

gshtras and others added 2 commits September 16, 2024 18:31

Merge branch 'main' into dbrx_fusedmoe_refactor

93d0f31

Merge branch 'main' into dbrx_fusedmoe_refactor

c8b9222

divakar-amd merged commit 6bd99d2 into main Sep 17, 2024
16 checks passed
