===== Compressing layer 23/40 =====
2024-08-15T15:22:59.526464+0000 | compress_module | INFO - Compressing model.layers.22.model.layers.22.self_attn.o_proj...
2024-08-15T15:23:00.110515+0000 | compress | INFO - time 0.51
2024-08-15T15:23:00.110713+0000 | compress | INFO - error 0.00
Expected behavior
Exit the compress() function early, but GPTQ will still run. We do need all the layers in the pipeline for the data to flow properly.
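A minimal sketch of the behavior described above, assuming the early exit is keyed on whether a layer has a quantization scheme at all (the issue does not say which condition should trigger it). `LayerWrapper`, `scheme`, and `run_pipeline` are hypothetical names for illustration only, not LLM Compressor's API.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class LayerWrapper:
    """Hypothetical stand-in for a per-layer compression wrapper."""
    name: str
    forward: Callable[[float], float]
    scheme: Optional[str] = None  # assumption: None means "nothing to compress"
    log: List[str] = field(default_factory=list)

    def compress(self) -> bool:
        # Exit early: no work is done and no log line is emitted.
        if self.scheme is None:
            return False
        self.log.append(f"Compressing {self.name}...")
        return True


def run_pipeline(layers: List[LayerWrapper], x: float) -> Tuple[float, List[str]]:
    """Run every layer so data keeps flowing, compressing only targeted ones."""
    compressed = []
    for layer in layers:
        x = layer.forward(x)  # the layer stays in the pipeline either way
        if layer.compress():
            compressed.append(layer.name)
    return x, compressed


layers = [
    LayerWrapper("layers.0.o_proj", lambda v: v + 1.0, scheme="int8"),
    LayerWrapper("layers.1.o_proj", lambda v: v * 2.0),  # not targeted
]
output, compressed_names = run_pipeline(layers, 1.0)
```

Here the second layer still transforms the activations, but its `compress()` call returns without logging anything.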
Environment
Include all relevant environment information:
OS [e.g. Ubuntu 20.04]:
Python version [e.g. 3.7]:
LLM Compressor version or commit hash [e.g. 0.1.0, f7245c8]:
ML framework version(s) [e.g. torch 2.3.1]:
Other Python package versions [e.g. vLLM, compressed-tensors, numpy, ONNX]:
Other relevant environment information [e.g. hardware, CUDA version]:
* add function to pack bits
* fix arg
* make 4bits the default
* update
* add support for int8 decompress; update function to take in name to scheme mapping
* update to test 8 bits; update kwargs
* fix print; update name
* update tests
* update arg
* update all other classes
Describe the bug
Cosmetic issue.
Running the code prints the compression log to stdout.
To Reproduce
Run examples/big_models_with_accelerate/multi_gpu_int8.py.
Errors
If applicable, add a full print-out of any errors or exceptions that are raised or include screenshots to help explain your problem.
Additional context
Add any other context about the problem here. Also include any relevant files.