
groupsize consistency #417

Open · wants to merge 1 commit into main
Conversation

HDCharles
Contributor

Summary:

Half of the APIs used groupsize and half used group_size; this PR swaps them all to groupsize.

Test Plan:

python eval.py -q int8wo --limit 1
wikitext: {'word_perplexity,none': 12.204889603121593, 'byte_perplexity,none': 1.5965674184201175, 'bits_per_byte,none': 0.6749734750293632, 'alias': 'wikitext'}

python generate.py --quantization int4wo-64
Average tokens/sec: 13.93
Average Bandwidth: 52.04 GB/s
Peak Memory Usage: 15.92 GB
Model Size: 3.74 GB

Reviewers:

Subscribers:

Tasks:

Tags:


pytorch-bot bot commented Jun 21, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/417

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 867ec9a with merge base ef1e745:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jun 21, 2024
@jerryzh168
Contributor

Can you change it to group_size? It will be more consistent with other args like inner_k_tiles.

@HDCharles
Contributor Author

I'm kind of expecting this to fail CI somewhere; I'll fix it after the issue is identified.

@HDCharles HDCharles closed this Jun 21, 2024
@HDCharles HDCharles reopened this Jun 21, 2024
@msaroufim
Member

Yeah, +1 on group_size.

@gau-nernst
Collaborator

How about block_size, used by

choose_qparams_affine
quantize_affine
dequantize_affine

Though it is a bit different, as it expects a tuple instead of an int (as group_size does).

@jerryzh168
Contributor

How about block_size, used by

choose_qparams_affine quantize_affine dequantize_affine

Though it is a bit different, as it expects a tuple instead of an int (as group_size does).

block_size is the most general form of these args, so it's used in our most fundamental quant_primitive ops like the ones you listed.

group_size is a special case of block_size, and I think it is what's supported in some of the kernels; that's why APIs like int4_weight_only take that argument instead of the general block_size argument.
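To make the relationship concrete, here is a minimal sketch of how the common granularities reduce to block_size tuples for a 2D weight; the tuple conventions below are inferred from this thread, not quoted from the torchao source.

```python
# Hedged sketch: quantization granularities expressed as block_size tuples
# for a 2D weight of shape (out_features, in_features). The conventions are
# assumptions based on this discussion, not the actual torchao definitions.
out_features, in_features = 256, 512

per_tensor  = (out_features, in_features)  # one scale for the whole tensor
per_channel = (1, in_features)             # one scale per output row
per_group   = (1, 64)                      # group_size=64 along the last dim
```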

@gau-nernst
Collaborator

@jerryzh168 I understand your point. But it is a bit confusing, since block_size and group_size refer to essentially the same concept yet behave slightly differently. Probably outside the scope of this PR.

@jerryzh168
Contributor

@jerryzh168 I understand your point. But it is a bit confusing, since block_size and group_size refer to essentially the same concept yet behave slightly differently. Probably outside the scope of this PR.

What do you mean by block_size and group_size meaning the same thing? I thought group_size is used just for groupwise quantization along a single dimension (mostly just the second dimension of a 2D tensor), while block_size can be used for any kind of quantization, including per-tensor, per-channel, per-group, or even groupwise quant over multiple dimensions.

@gau-nernst
Collaborator

@jerryzh168 yes, I understand that part. I think my confusion comes from the non-standardized use of the words "group" and "block". Do people always mean "group" as consecutive elements along a dimension (typically the last dim) and "block" as something more general (which can have a 2D structure, e.g. a (64, 64) tile)? (Per-channel quant can also be viewed as group-wise quant with group_size = channel length; per-tensor quant can also be viewed as group-wise quant by flattening the tensor and setting group_size = tensor size.)

For example, the 8-bit Adam paper from bnb (https://arxiv.org/pdf/2110.02861) uses the word "block_size" there, but I think it is actually "group_size" according to the usage in torchao? Happy to be corrected if my understanding is wrong.
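For reference, a minimal sketch of the bnb-style block-wise scheme described above: flatten the tensor, split it into fixed-size consecutive blocks, and keep one scale per block, which is exactly what this thread calls group-wise. Illustrative only, not bnb's actual implementation.

```python
import torch

def blockwise_absmax_scales(x: torch.Tensor, block_size: int = 256) -> torch.Tensor:
    # Flatten and split into consecutive fixed-size blocks (assumes the
    # element count is divisible by block_size), then compute one absmax
    # scale per block -- "group-wise" in torchao terminology.
    blocks = x.flatten().view(-1, block_size)
    return blocks.abs().amax(dim=1)

scales = blockwise_absmax_scales(torch.randn(1024, 256))
print(scales.shape)  # torch.Size([1024])
```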

@jerryzh168
Contributor

jerryzh168 commented Jun 27, 2024

@jerryzh168 yes, I understand that part. I think my confusion comes from the non-standardized use of the words "group" and "block". Do people always mean "group" as consecutive elements along a dimension (typically the last dim) and "block" as something more general (which can have a 2D structure, e.g. a (64, 64) tile)? (Per-channel quant can also be viewed as group-wise quant with group_size = channel length; per-tensor quant can also be viewed as group-wise quant by flattening the tensor and setting group_size = tensor size.)

For example, the 8-bit Adam paper from bnb (arxiv.org/pdf/2110.02861) uses the word "block_size" there, but I think it is actually "group_size" according to the usage in torchao? Happy to be corrected if my understanding is wrong.

Ah I see. I think in torchao we just use group_size to indicate the group size for the last dimension of a 2D tensor; this is where we were using it before: https://github.com/pytorch/ao/pull/321/files#diff-7c9b4c8c6d4ef9c47873263304a335d5cf56c3ac9f98ba10b994cd80dc9c2709L652-L654, so it is just a single number indicating how many elements we want in the same group.

block_size is a more general term, I think.
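As a tiny illustration of that mapping, a single group_size number can be expanded into the general block_size form; the helper below is hypothetical, not part of torchao.

```python
def group_size_to_block_size(shape: tuple, group_size: int) -> tuple:
    # Group along the last dimension only; every leading dimension gets
    # a block extent of 1 (one scale per index along that dimension).
    return (1,) * (len(shape) - 1) + (group_size,)

assert group_size_to_block_size((256, 512), 64) == (1, 64)        # 2D weight
assert group_size_to_block_size((8, 256, 512), 64) == (1, 1, 64)  # 3D case
```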
