
Add ONNX and ORT support for Falcon #1391

Merged (7 commits, Oct 18, 2023)

Conversation

@fxmarty (Contributor) commented on Sep 16, 2023

This one was more painful than it should have been because:

Remaining issue: I think the ONNX export of repeat_interleave inserts a Loop node into the ONNX graph, which we may want to avoid. EDIT: fixed in PyTorch 2.1.
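For context, the Loop node typically appears because repeat_interleave with a runtime repeat count exports as data-dependent control flow. A common workaround (a sketch of the general technique, not the code from this PR; the function name is invented) is to rebuild the op from unsqueeze/expand/reshape, which export as static Unsqueeze/Expand/Reshape nodes:

```python
import torch

def repeat_kv_no_loop(x: torch.Tensor, n_rep: int) -> torch.Tensor:
    # Equivalent to x.repeat_interleave(n_rep, dim=1) for a 4D tensor,
    # but built from unsqueeze/expand/reshape so the ONNX export emits
    # static Unsqueeze/Expand/Reshape nodes instead of a Loop.
    batch, num_kv_heads, seq_len, head_dim = x.shape
    x = x[:, :, None, :, :].expand(batch, num_kv_heads, n_rep, seq_len, head_dim)
    return x.reshape(batch, num_kv_heads * n_rep, seq_len, head_dim)

kv = torch.arange(2 * 2 * 3 * 4, dtype=torch.float32).reshape(2, 2, 3, 4)
assert torch.equal(repeat_kv_no_loop(kv, 3), kv.repeat_interleave(3, dim=1))
```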

@fxmarty (Contributor Author) commented on Sep 16, 2023

Fixes #1172

Comment on lines +322 to +323
# we need to set output_attentions=True in the model input to avoid calling
# torch.nn.functional.scaled_dot_product_attention that is not supported by the ONNX export
Member:

nit: I would move this comment inside the method.
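As a rough illustration of what such a patch can look like (a hypothetical sketch, not optimum's actual model_patcher code; patch_sdpa_for_onnx_export and eager_attention are invented names): temporarily swap torch.nn.functional.scaled_dot_product_attention for an eager matmul+softmax equivalent while tracing, then restore it.

```python
import math
from contextlib import contextmanager

import torch
import torch.nn.functional as F

@contextmanager
def patch_sdpa_for_onnx_export():
    # Hypothetical sketch: replace the fused SDPA kernel, which the ONNX
    # exporter did not support at the time, with an eager equivalent.
    original = F.scaled_dot_product_attention

    def eager_attention(query, key, value, attn_mask=None, dropout_p=0.0, is_causal=False):
        # Plain attention: softmax(QK^T / sqrt(d)) V, additive mask.
        scores = query @ key.transpose(-2, -1) / math.sqrt(query.size(-1))
        if attn_mask is not None:
            scores = scores + attn_mask
        return torch.softmax(scores, dim=-1) @ value

    F.scaled_dot_product_attention = eager_attention
    try:
        yield
    finally:
        F.scaled_dot_product_attention = original

query = torch.randn(1, 2, 4, 8)
reference = F.scaled_dot_product_attention(query, query, query)
with patch_sdpa_for_onnx_export():
    patched = F.scaled_dot_product_attention(query, query, query)
assert torch.allclose(patched, reference, atol=1e-4)
```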

optimum/exporters/onnx/model_patcher.py (outdated, resolved)
optimum/exporters/onnx/model_patcher.py (outdated, resolved)
generation_config=generation_config,
**kwargs,
)
# self.num_kv_heads = config.num_kv_heads if (config.new_decoder_architecture or not config.multi_query) else 1
Member:

To remove?

Contributor Author:

Let's keep it for now

@@ -211,7 +211,7 @@ class NormalizedConfigManager:
     "blenderbot": BartLikeNormalizedTextConfig,
     "blenderbot_small": BartLikeNormalizedTextConfig,
     "bloom": NormalizedTextConfig.with_args(num_layers="n_layer"),
-    "falcon": NormalizedTextConfig.with_args(num_layers="num_hidden_layers", num_attention_heads="num_kv_heads"),
+    "falcon": NormalizedTextConfig,
Member:

Question: does NormalizedConfig have a NUM_KV_HEADS attribute to normalize it or not?

Contributor Author:

No
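For readers unfamiliar with the with_args pattern discussed above, here is a simplified, hypothetical sketch of how such attribute normalization can work (not optimum's actual implementation; FalconLikeConfig is invented): canonical names like num_layers are resolved through class-level attributes that with_args overrides per model.

```python
# Hypothetical sketch of the attribute-normalization pattern behind
# NormalizedTextConfig.with_args: canonical names resolve through
# class attributes that with_args remaps to model-specific fields.
class NormalizedTextConfig:
    NUM_LAYERS = "num_layers"
    NUM_ATTENTION_HEADS = "num_attention_heads"

    def __init__(self, config):
        self.config = config

    @classmethod
    def with_args(cls, **overrides):
        # Build a subclass whose class attributes point at renamed fields,
        # e.g. with_args(num_layers="num_hidden_layers").
        return type("NormalizedConfigWithArgs", (cls,),
                    {name.upper(): attr for name, attr in overrides.items()})

    def __getattr__(self, name):
        # Look up the model-specific attribute name, then read the config.
        attr_name = getattr(type(self), name.upper())
        return getattr(self.config, attr_name)

class FalconLikeConfig:  # invented stand-in for a transformers config
    num_hidden_layers = 32
    num_attention_heads = 71

Normalized = NormalizedTextConfig.with_args(num_layers="num_hidden_layers")
n = Normalized(FalconLikeConfig())
assert n.num_layers == 32
assert n.num_attention_heads == 71
```

Under this pattern, the question above amounts to whether a NUM_KV_HEADS class attribute exists to remap; the answer here was no.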

@fxmarty merged commit 1ae95a7 into huggingface:main on Oct 18, 2023 (64 of 68 checks passed)