Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add support granite and granitemoe models #1099

Merged
merged 3 commits into from
Jan 8, 2025

Conversation

eaidova
Copy link
Collaborator

@eaidova eaidova commented Jan 6, 2025

What does this PR do?

Fixes #1097

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@eaidova eaidova added the openvino-test Trigger OpenVINO slow tests label Jan 6, 2025
github-merge-queue bot pushed a commit to openvinotoolkit/openvino.genai that referenced this pull request Jan 7, 2025
related to huggingface/optimum-intel#1099
added opportunity to test these models via llm_bench

Co-authored-by: Ilya Lavrenov <[email protected]>
ilya-lavrenov added a commit to openvinotoolkit/openvino.genai that referenced this pull request Jan 7, 2025
related to huggingface/optimum-intel#1099
added opportunity to test these models via llm_bench

Co-authored-by: Ilya Lavrenov <[email protected]>
Copy link
Member

@IlyasMoutawwakil IlyasMoutawwakil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks for investigating the MoE tracing problem !

Copy link
Collaborator

@echarlaix echarlaix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

tests/openvino/utils_tests.py Show resolved Hide resolved
Comment on lines +3633 to +3638
# copied from https://github.com/huggingface/transformers/blob/v4.47.1/src/transformers/models/granitemoe/modeling_granitemoe.py#L281
def _granite_moe_parallel_experts_forward(self, inputs, expert_size):
output_list = []
# difference with original
# 1) expert_size is tensor instead of list of ints after gating patching, that does not allow use original inputs.split(expert_size)
# 2) use index_start:next_index for obtaining expert inputs splits one by one instead of precomputed splits once before cycle
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super helpful, thanks!

@SearchSavior
Copy link

SearchSavior commented Jan 7, 2025

Thank you eaidova and team for your work on this! Very instructive about how the IR format works.

I ran a checkout to open this branch in a fresh conda environment and inspected the changes locally. Yet I am still getting errors that the optimum exporters extension is missing when running via the CLI tool or through export=True in from.pretrained.

I still have a lot to learn about advanced package management with python but once the unrecognized export config error resolved I figured it might be useful to share here.

@IlyasMoutawwakil
Copy link
Member

IlyasMoutawwakil commented Jan 8, 2025

@SearchSavior in a clean env, after you checkout to this branch or even on main you should do pip install .[openvino]
Make sure you don't install with the editable mode flag -e so that optimum and optimum-intel are found by the interpreter in you're site-packages. Tell me if this works for you.

@IlyasMoutawwakil IlyasMoutawwakil merged commit 7d7de7c into huggingface:main Jan 8, 2025
28 checks passed
@SearchSavior
Copy link

@IlyasMoutawwakil Sorry for the delay!

I did end up getting that working. However, in my dev env for inference performance with Granite has been much worse than expected on GPU and CPU across the model family. It's possible my conversion was borked in some way. Multiple conversions are uploaded here;

https://huggingface.co/Echo9Zulu

Will try converting again soon now that support is merged and will open an issue to present something more concrete.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
openvino-test Trigger OpenVINO slow tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

IBM Granite MoE OpenVINO IR export config
6 participants