Bump minimum TorchAO version to 0.7.0 #10293
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
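For context, the bump is effectively a floor on the installed torchao version. A minimal, hypothetical sketch of such a guard (diffusers has its own dependency-version helpers; the function name below is an assumption, not the actual implementation):

```python
import importlib.metadata

from packaging import version


def is_torchao_version_at_least(min_version: str = "0.7.0") -> bool:
    """Return True if torchao is installed and at least `min_version`."""
    try:
        installed = version.parse(importlib.metadata.version("torchao"))
    except importlib.metadata.PackageNotFoundError:
        return False
    return installed >= version.parse(min_version)
```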
@@ -276,7 +278,6 @@ def test_int4wo_quant_bfloat16_conversion(self):
        self.assertTrue(isinstance(weight, AffineQuantizedTensor))
        self.assertEqual(weight.quant_min, 0)
        self.assertEqual(weight.quant_max, 15)
        self.assertTrue(isinstance(weight.layout_type, TensorCoreTiledLayoutType))
`layout_type` has become an internal private attribute called `_layout` now, so it does not have to be tested and can be removed. The layout class is also now called `TensorCoreTiledLayout` instead of `TensorCoreTiledLayoutType`.
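As a rough illustration of the rename (assuming the torchao >= 0.7 public import path; the helper below is hypothetical and not part of the test suite):

```python
from torchao.dtypes import TensorCoreTiledLayout  # formerly TensorCoreTiledLayoutType


def uses_tensor_core_tiled_layout(weight) -> bool:
    # `_layout` is private in torchao >= 0.7, so tests should not rely on it;
    # this only shows where the old `layout_type` information now lives.
    return isinstance(getattr(weight, "_layout", None), TensorCoreTiledLayout)
```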
size_quantized_with_not_convert = get_model_size_in_bytes(quantized_model_with_not_convert)
size_quantized = get_model_size_in_bytes(quantized_model)

self.assertTrue(size_quantized < size_quantized_with_not_convert)
Not related to bumping the version, but it makes for a more meaningful test
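For readers following along, a hedged sketch of how the two models being compared might be built; the checkpoint, module name, and `modules_to_not_convert` usage are assumptions for illustration, not the exact test setup:

```python
import torch
from diffusers import FluxTransformer2DModel, TorchAoConfig

# Hypothetical setup: the same checkpoint quantized twice, once excluding a module.
# Verify `modules_to_not_convert` against the diffusers TorchAoConfig API you use.
ckpt = "black-forest-labs/FLUX.1-dev"
quantized_model = FluxTransformer2DModel.from_pretrained(
    ckpt, subfolder="transformer", torch_dtype=torch.bfloat16,
    quantization_config=TorchAoConfig("int8wo"),
)
quantized_model_with_not_convert = FluxTransformer2DModel.from_pretrained(
    ckpt, subfolder="transformer", torch_dtype=torch.bfloat16,
    quantization_config=TorchAoConfig("int8wo", modules_to_not_convert=["proj_out"]),
)
```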
for param in module.parameters():
    if param.__class__.__name__ == "AffineQuantizedTensor":
        data, scale, zero_point = param.layout_tensor.get_plain()
Same reason as above for removing this: `layout_tensor` is an internal private attribute, meaning we shouldn't access it because it could be changed without warning in the future.
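If a test still needs the plain values, a hedged alternative is to go through the public tensor-subclass API rather than the private attribute (a sketch, assuming `AffineQuantizedTensor.dequantize()` is available; the helper name is illustrative):

```python
def dequantized_params(module):
    # Yields high-precision copies of quantized parameters without touching
    # private attributes such as `layout_tensor` / `tensor_impl`.
    for param in module.parameters():
        if param.__class__.__name__ == "AffineQuantizedTensor":
            yield param.dequantize()
```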
self.assertTrue(total_int8wo < total_bf16 < total_int4wo_gs32)
# int4 with default group size quantized very few linear layers compared to a smaller group size of 32
self.assertTrue(quantized_int4wo < quantized_int4wo_gs32 and unquantized_int4wo > unquantized_int4wo_gs32)
total_int4wo = get_model_size_in_bytes(transformer_int4wo)
We use the torchao-provided utility instead now.
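Roughly, the utility replaces a hand-rolled byte count like the one below; the manual version is illustrative only and may not reflect packed quantized storage accurately:

```python
import torch
from torchao.utils import get_model_size_in_bytes


def manual_model_size_in_bytes(model: torch.nn.Module) -> int:
    # Naive count over parameters and buffers; shown only for contrast.
    tensors = list(model.parameters()) + list(model.buffers())
    return sum(t.numel() * t.element_size() for t in tensors)


# The test now relies on the torchao helper instead, e.g.:
# total_int4wo = get_model_size_in_bytes(transformer_int4wo)
```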
@@ -593,7 +589,7 @@ def get_dummy_inputs(self, device: torch.device, seed: int = 0):

    def _test_quant_type(self, quantization_config, expected_slice):
        components = self.get_dummy_components(quantization_config)
        pipe = FluxPipeline(**components).to(dtype=torch.bfloat16)
I think this was the incorrect thing to do here and it slipped past us in a previous PR. We should not call `.to(dtype)` on the pipeline directly if one of its models has been quantized.
The GGUF PR introduced a check in modeling_utils.py here that catches this behaviour.
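A hedged sketch of the corrected pattern, casting only the non-quantized components and leaving the quantized model untouched; the helper and the list of component names are illustrative, not the exact test code:

```python
import torch
from diffusers import FluxPipeline


def build_pipe(components: dict, dtype: torch.dtype = torch.bfloat16) -> FluxPipeline:
    # Avoid `pipe.to(dtype=...)` when a component is quantized; the check added in
    # modeling_utils.py flags dtype casts on quantized models.
    pipe = FluxPipeline(**components)
    for name in ("text_encoder", "text_encoder_2", "vae"):
        module = getattr(pipe, name, None)
        if module is not None:
            module.to(dtype=dtype)
    return pipe
```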
Gentle ping @DN6
* bump min torchao version to 0.7.0
* update
Context: https://huggingface.slack.com/archives/C065E480NN9/p1734425021147699
cc @yiyixuxu