Fix llama ONNX export #1432
Conversation
thanks for the fix
@@ -216,7 +216,48 @@ class OPTOnnxConfig(TextDecoderOnnxConfig):
    NORMALIZED_CONFIG_CLASS = NormalizedTextConfig


class LlamaDummyPastKeyValuesGenerator(DummyPastKeyValuesGenerator):
Would make sense to move it to optimum/utils/input_generators.py, next to the existing generators:

optimum/optimum/utils/input_generators.py (line 830 in 099cd73):
class GPTBigCodeDummyPastKeyValuesGenerator(DummyPastKeyValuesGenerator):
            random_sequence_length_range=random_sequence_length_range,
            **kwargs,
        )
        self.num_key_value_heads = normalized_config.num_key_value_heads
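For context, here is a minimal, self-contained sketch of what such a dummy past-key-values generator produces for Llama-style models. The function name and simplified interface are illustrative only, not optimum's actual API; the point is that the number of KV heads and the head dimension come from different config keys:

```python
import torch


def dummy_past_key_values(batch_size, past_seq_len, num_layers,
                          num_attention_heads, num_key_value_heads, hidden_size):
    """Build random past key/value tensors shaped the way Llama expects them."""
    # The head dimension is derived from num_attention_heads ...
    head_dim = hidden_size // num_attention_heads
    # ... while the number of KV heads comes from the optional num_key_value_heads key.
    shape = (batch_size, num_key_value_heads, past_seq_len, head_dim)
    return [
        (torch.rand(shape), torch.rand(shape))  # one (key, value) pair per decoder layer
        for _ in range(num_layers)
    ]


# Llama 2 70B-style dimensions: 64 attention heads but only 8 KV heads (GQA).
past = dummy_past_key_values(batch_size=2, past_seq_len=16, num_layers=80,
                             num_attention_heads=64, num_key_value_heads=8,
                             hidden_size=8192)
print(past[0][0].shape)  # torch.Size([2, 8, 16, 128])
```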
Should we also add a fix in prepare_inputs_for_merged (as done in #1425)?
yes thank you for taking care of it!
* Add ONNX export Mistral models support
* add test
* format
* fix format
* fix key _config
* tmp install transformers from source for tests
* change model id
* fix after #1432 merged
* fix
* format
* fix
Llama uses an optional configuration key num_key_value_heads for the number of key/value heads, and uses num_attention_heads to compute the head dimension. This was unfortunately not implemented in #975 (apart from Llama 2 70B, the Llama and Llama 2 series do not make use of this num_key_value_heads key), as the key probably did not exist at the time.

Fixes #1399
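To make the key handling described above concrete, here is a small sketch using values from the public Llama 2 70B config; this is an illustration of the fallback logic, not the code in this PR:

```python
from transformers import LlamaConfig

# Llama 2 70B-style configuration; most Llama checkpoints simply omit
# num_key_value_heads, in which case it should default to num_attention_heads.
config = LlamaConfig(hidden_size=8192, num_attention_heads=64, num_key_value_heads=8)

# Fall back to num_attention_heads when the optional key is absent (plain MHA).
num_kv_heads = getattr(config, "num_key_value_heads", config.num_attention_heads)

# The head dimension is always computed from num_attention_heads,
# not from num_key_value_heads, even for GQA models such as Llama 2 70B.
head_dim = config.hidden_size // config.num_attention_heads

# Each past key/value tensor of the exported decoder then has shape
# (batch_size, num_kv_heads, past_sequence_length, head_dim) = (b, 8, s, 128).
print(num_kv_heads, head_dim)  # 8 128
```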