
GPTQ quantization on Jetson Orin Nano #36139

Open · 2 of 4 tasks
lukazso opened this issue Feb 11, 2025 · 4 comments

@lukazso commented Feb 11, 2025

I am trying to use GPTQ quantization support on my Jetson Orin Nano Super. The docs say that I can use either auto-gptq or gptqmodel for this. However, when I install only gptqmodel (since auto-gptq is supposed to be the deprecated option), I get the following error:

ImportError: Loading a GPTQ quantized model requires the auto-gptq library (`pip install auto-gptq`)

Am I doing something wrong, or are the docs not up to date?

Thanks!
Lukas


System Info:
Machine: aarch64
System: Linux
Distribution: Ubuntu 22.04 Jammy Jellyfish
Release: 5.15.148-tegra
Python: 3.10.12
CUDA: 12.6.68

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Minimal example to reproduce:

from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "facebook/opt-125m"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Quantize to 4 bits, calibrating on the c4 dataset
gptq_config = GPTQConfig(bits=4, dataset="c4", tokenizer=tokenizer)

# Fails here: validate_environment() raises the ImportError shown below
quantized_model = AutoModelForCausalLM.from_pretrained(model_id, device_map="cuda", quantization_config=gptq_config)

Here is the complete traceback:

ENV: Auto setting PYTORCH_CUDA_ALLOC_CONF='expandable_segments:True' for memory saving.
ENV: Auto setting CUDA_DEVICE_ORDER=PCI_BUS_ID for compatibililty.
Traceback (most recent call last):
  File "/home/lukas/Development/db-agent/test.py", line 39, in <module>
    quantized_model = AutoModelForCausalLM.from_pretrained(model_id, device_map="cuda", quantization_config=gptq_config)
  File "/home/lukas/Development/db-agent/venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
    return model_class.from_pretrained(
  File "/home/lukas/Development/db-agent/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3620, in from_pretrained
    hf_quantizer.validate_environment(
  File "/home/lukas/Development/db-agent/venv/lib/python3.10/site-packages/transformers/quantizers/quantizer_gptq.py", line 59, in validate_environment
    raise ImportError(
ImportError: Loading a GPTQ quantized model requires the auto-gptq library (`pip install auto-gptq`)
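
As a sanity check (a rough diagnostic, assuming the same venv as above), gptqmodel itself can be imported directly; a clean import would suggest the failure comes from transformers' environment validation rather than from a broken install:

# Confirm gptqmodel is importable in the active environment
python -c "import gptqmodel; print('gptqmodel import OK')"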

Expected behavior

No error

lukazso added the bug label Feb 11, 2025
@Rocketknight1 (Member) commented

cc @SunMarc @MekkCyber

@MekkCyber (Contributor) commented

Hi @lukazso, I think you just need to update your transformers version, because this change is very recent.
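
For reference, a quick way to check which version is actually installed in the active environment:

# Print installed package info, including the version
pip show transformers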

@lukazso (Author) commented Feb 13, 2025

It seems I already have the latest version (in the meantime I switched to Python 3.12):

transformers       4.48.3
gptqmodel          1.9.0
optimum            1.24.0

@MekkCyber (Contributor) commented Feb 13, 2025

Yes, you are right @lukazso, sorry, it's not part of the release yet. It will be in the next release, 4.49, in the coming days. For now you can just install the main branch; it's stable.
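
For reference, a standard source install (adjust for your own venv as needed):

# Install transformers from the current main branch on GitHub
pip install git+https://github.com/huggingface/transformers.git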
