Composability with sparse and quantization compressors #948

rahul-tuli · 2024-12-02T22:16:44Z

This PR accomplishes the following:

Increases the sparsity threshold to 50%
Allows sparse + quantized compression-decompression on the llm-compressor side
Adds a test for sparse+quantized compression-decompression
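To make the threshold item concrete, here is a minimal sketch of how a 50% sparsity gate could work. It is not taken from the PR's code. The names `SPARSITY_THRESHOLD`, `sparsity`, and `should_use_sparse_compressor` are hypothetical, and it assumes that the threshold applies to the fraction of zero-valued weights and that the comparison is inclusive.

```python
from typing import Sequence

# Assumption: the 50% figure from the PR description, expressed as a fraction.
SPARSITY_THRESHOLD = 0.5


def sparsity(weights: Sequence[float]) -> float:
    """Return the fraction of entries that are exactly zero."""
    if not weights:
        return 0.0
    zeros = sum(1 for w in weights if w == 0)
    return zeros / len(weights)


def should_use_sparse_compressor(
    weights: Sequence[float], threshold: float = SPARSITY_THRESHOLD
) -> bool:
    """Hypothetical gate: pick sparse compression at or above the threshold."""
    return sparsity(weights) >= threshold


if __name__ == "__main__":
    half_sparse = [0.0, 0.0, 1.5, -2.0]   # 2 of 4 entries are zero
    mostly_dense = [0.0, 1.0, 2.0, 3.0]   # 1 of 4 entries is zero
    print(should_use_sparse_compressor(half_sparse))
    print(should_use_sparse_compressor(mostly_dense))
```

Under these assumptions, a tensor with exactly half of its entries zeroed passes the gate, while one with a quarter zeroed does not.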

github-actions · 2024-12-02T22:16:55Z

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

dsikka · 2024-12-02T22:43:23Z

src/llmcompressor/transformers/compression/quantization_format.py

@@ -37,7 +36,8 @@ def infer_quantization_format(
    if save_compressed:
        weight_args, input_args = _get_unique_quant_args(model)
        is_24_structure = (
-            sparsity_config and sparsity_config.sparsity_structure == "2:4"
+            SparsityStructure(sparsity_structure).value
+            == SparsityStructure.TWO_FOUR.value


It seems like we've only enabled this to save using the marlin-24 compressor if the model follows 2:4 sparsity?

nit: can we compare the enums directly, without .value?

@kylesayrs accepted

@dsikka let's sync offline
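The enum nit in this thread can be illustrated with a small standalone sketch. The `SparsityStructure` class below is a stand-in defined only for this example, not the library's definition. The `TWO_FOUR` member and its assumed `"2:4"` value are inferred from the diff above, and the `UNSTRUCTURED` member is purely illustrative.

```python
from enum import Enum


class SparsityStructure(Enum):
    # Stand-in enum for illustration; only TWO_FOUR appears in the diff,
    # and its "2:4" value is an assumption inferred from the removed line.
    TWO_FOUR = "2:4"
    UNSTRUCTURED = "unstructured"  # illustrative member, not from the diff


def is_24_structure(sparsity_structure: str) -> bool:
    # Looking up by value returns the existing member, so the members can be
    # compared directly; going through .value on both sides is unnecessary.
    return SparsityStructure(sparsity_structure) == SparsityStructure.TWO_FOUR


if __name__ == "__main__":
    print(is_24_structure("2:4"))
    print(is_24_structure("unstructured"))
```

Because `Enum` members are singletons, comparing the members gives the same result as comparing their `.value` fields, which is what the "compare enum directly" commit message refers to.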


horheynm · 2024-12-03T16:42:47Z

Verified that decompression works for a sparse and quantized model.

dsikka reviewed Dec 2, 2024


rahul-tuli force-pushed the composability-v2 branch from 6b47ecd to ea8b8b5 Compare December 3, 2024 06:07

rahul-tuli added 3 commits December 3, 2024 06:16

Enable Sparse24 quantization for Weight + Activation quantization

ea7d5b5

Increase Sparsity Threshold Signed-off-by: Rahul Tuli <[email protected]>

Add composability test

0bd66c8

Signed-off-by: Rahul Tuli <[email protected]>

Review comments from @kylesayrs compare enum directly

afc0b5f

Signed-off-by: Rahul Tuli <[email protected]>

rahul-tuli force-pushed the composability-v2 branch from 9249158 to afc0b5f Compare December 3, 2024 06:16

rahul-tuli changed the title ~~[ DRAFT ] Composability with sparse and quantization compressors~~ Composability with sparse and quantization compressors Dec 3, 2024

Merge branch 'main' into composability-v2

6e658d0

horheynm approved these changes Dec 3, 2024
