[Usage] How to manually set calibration_function? #886

Open
donpromax opened this issue Nov 1, 2024 · 2 comments

donpromax commented Nov 1, 2024

I noticed that modifiers like SmoothQuantModifier have a calibration_function parameter that is used for the forward pass during calibration. However, it is not clear how this parameter should be set. For example:

from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.modifiers.smoothquant import SmoothQuantModifier

# model is an already-loaded transformers model
recipe = [
    SmoothQuantModifier(smoothing_strength=0.8, calibration_function=model.generate),
    GPTQModifier(targets="Linear", scheme="W8A8", ignore=["lm_head"]),
]
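
For context, a minimal sketch of how a recipe like this is typically applied via the oneshot entrypoint; the model name and calibration dataset below are placeholders, not details from this report:

from transformers import AutoModelForCausalLM
from llmcompressor.transformers import oneshot

# Placeholder model; the report does not say which model is being quantized.
model = AutoModelForCausalLM.from_pretrained("ORG/MODEL", torch_dtype="auto")

oneshot(
    model=model,
    dataset="open_platypus",      # placeholder calibration dataset
    recipe=recipe,                # the recipe list defined above
    max_seq_length=2048,
    num_calibration_samples=512,
)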

Additionally, I found that initializing SmoothQuantModifier with the mappings parameter set directly in Python, rather than via YAML, results in an error (see #105):

mappings = [
    [["re:.*wqkv"], "re:.*attention_norm"],
    [["re:.*w1", "re:.*w3"], "re:.*ffn_norm"]
]

recipe = [
    SmoothQuantModifier(smoothing_strength=0.8, mappings=mappings),
    GPTQModifier(targets="Linear", scheme="W8A8", ignore=["re:.*output", "re:vision_model.*", "re:mlp1.*"]),
]

This results in the following error:

  File "/llm-compressor/src/llmcompressor/recipe/recipe.py", line 601, in _load_json_or_yaml_string
    raise ValueError(f"Could not parse recipe from string {content}") from err
ValueError: Could not parse recipe from string DEFAULT_stage:
  DEFAULT_modifiers:
    SmoothQuantModifier:
      smoothing_strength: 0.8
      mappings:
      - !!python/tuple
        - - re:.*wqkv
        - re:.*attention_norm
      - !!python/tuple
        - - re:.*w1
          - re:.*w3
        - re:.*ffn_norm
    GPTQModifier:
      targets: Linear
      ignore:
      - lm_head
      scheme: W8A8
@HelloCard

The mappings error is an old bug that was fixed in #37, but strangely the fix still has not been merged into mainline.
As stated there, the workaround is to use string mappings, which seems to have no additional pitfalls.
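
For illustration, one reading of that string-based workaround, reconstructed from the YAML the parser printed in the traceback above (the DEFAULT_stage/DEFAULT_modifiers names come from that output; treating "string mappings" as "pass the whole recipe as a YAML string" is an assumption, not a confirmed fix):

# Sketch: pass the recipe as a YAML string so the mappings are serialized
# as plain YAML lists, with no !!python/tuple tags for the parser to reject.
recipe = """
DEFAULT_stage:
  DEFAULT_modifiers:
    SmoothQuantModifier:
      smoothing_strength: 0.8
      mappings:
        - [["re:.*wqkv"], "re:.*attention_norm"]
        - [["re:.*w1", "re:.*w3"], "re:.*ffn_norm"]
    GPTQModifier:
      targets: Linear
      scheme: W8A8
      ignore: ["lm_head"]
"""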

@robertgshaw2-neuralmagic (Collaborator)

Thanks for the report. We need to do a better job of updating the documentation and examples for SQ mappings.
