
GPTQ quantization on Jetson Orin Nano #36139

Open · 2 of 4 tasks
lukazso opened this issue Feb 11, 2025 · 4 comments

@lukazso commented Feb 11, 2025

I am trying to use GPTQ quantization support on my Jetson Orin Nano Super. The docs say that I can use either auto-gptq or gptqmodel for this. However, when I install only gptqmodel (since auto-gptq is supposed to be the deprecated option), I get the following error:

ImportError: Loading a GPTQ quantized model requires the auto-gptq library (`pip install auto-gptq`)

Am I doing something wrong, or are the docs not up to date?

Thanks!
Lukas


System Info:
Machine: aarch64
System: Linux
Distribution: Ubuntu 22.04 Jammy Jellyfish
Release: 5.15.148-tegra
Python: 3.10.12
CUDA: 12.6.68

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Minimal example to reproduce:

from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "facebook/opt-125m"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Quantize to 4 bits, calibrating on the c4 dataset
gptq_config = GPTQConfig(bits=4, dataset="c4", tokenizer=tokenizer)

# Fails here: validate_environment() raises the ImportError shown below
quantized_model = AutoModelForCausalLM.from_pretrained(model_id, device_map="cuda", quantization_config=gptq_config)

Here is the complete traceback:

ENV: Auto setting PYTORCH_CUDA_ALLOC_CONF='expandable_segments:True' for memory saving.
ENV: Auto setting CUDA_DEVICE_ORDER=PCI_BUS_ID for compatibililty.
Traceback (most recent call last):
  File "/home/lukas/Development/db-agent/test.py", line 39, in <module>
    quantized_model = AutoModelForCausalLM.from_pretrained(model_id, device_map="cuda", quantization_config=gptq_config)
  File "/home/lukas/Development/db-agent/venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
    return model_class.from_pretrained(
  File "/home/lukas/Development/db-agent/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3620, in from_pretrained
    hf_quantizer.validate_environment(
  File "/home/lukas/Development/db-agent/venv/lib/python3.10/site-packages/transformers/quantizers/quantizer_gptq.py", line 59, in validate_environment
    raise ImportError(
ImportError: Loading a GPTQ quantized model requires the auto-gptq library (`pip install auto-gptq`)
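
As a sanity check (a rough diagnostic, assuming the same venv as above), gptqmodel itself can be imported directly; a clean import would suggest the failure comes from transformers' environment validation rather than from a broken install:

# Confirm gptqmodel is importable in the active environment
python -c "import gptqmodel; print('gptqmodel import OK')"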

Expected behavior

No error

lukazso added the bug label Feb 11, 2025
@Rocketknight1 (Member) commented

cc @SunMarc @MekkCyber

@MekkCyber (Contributor) commented

Hi @lukazso, I think you just need to update your transformers version, because this change is very recent.
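
For reference, a quick way to check which version is actually installed in the active environment:

# Print installed package info, including the version
pip show transformers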

@lukazso (Author) commented Feb 13, 2025

It seems I already have the latest version (in the meantime I switched to Python 3.12):

transformers       4.48.3
gptqmodel          1.9.0
optimum            1.24.0

@MekkCyber (Contributor) commented Feb 13, 2025

Yes, you are right @lukazso, sorry, it's not part of the release yet. It will be in the next release, 4.49, in the coming days. For now you can just install the main branch; it's stable.
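
For reference, a standard source install (adjust for your own venv as needed):

# Install transformers from the current main branch on GitHub
pip install git+https://github.com/huggingface/transformers.git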
