
groupsize consistency #417

Open · wants to merge 1 commit into main
Conversation

HDCharles
Contributor

Summary:

Half of the APIs used groupsize and half used group_size; this PR swaps them all to groupsize.

Test Plan:

python eval.py -q int8wo --limit 1
wikitext: {'word_perplexity,none': 12.204889603121593, 'byte_perplexity,none': 1.5965674184201175, 'bits_per_byte,none': 0.6749734750293632, 'alias': 'wikitext'}

python generate.py --quantization int4wo-64
Average tokens/sec: 13.93
Average Bandwidth: 52.04 GB/s
Peak Memory Usage: 15.92 GB
Model Size: 3.74 GB

Reviewers:

Subscribers:

Tasks:

Tags:


pytorch-bot bot commented Jun 21, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/417

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 867ec9a with merge base ef1e745:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jun 21, 2024
@jerryzh168
Contributor

Can you change it to group_size? It will be more consistent with other args like inner_k_tiles.

@HDCharles
Contributor Author

I'm kind of expecting this to fail CI somewhere; I'll fix it after the issue is identified.

@HDCharles HDCharles closed this Jun 21, 2024
@HDCharles HDCharles reopened this Jun 21, 2024
@msaroufim
Member

Yeah, +1 on group_size.

@gau-nernst
Collaborator

How about block_size, used by

choose_qparams_affine
quantize_affine
dequantize_affine

Though it is a bit different, as it expects a tuple instead of an int (as group_size does).

@jerryzh168
Contributor

How about block_size, used by

choose_qparams_affine quantize_affine dequantize_affine

Though it is a bit different, as it expects a tuple instead of an int (as group_size does).

block_size is the most general form of these args, so it's used in our most fundamental quant_primitive ops like the ones you listed.

group_size is a special case of block_size, and I think it is what's supported in some of the kernels; that's why APIs like int4_weight_only take that argument instead of the general block_size argument.
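To make the relationship concrete, here is a minimal sketch of how the common granularities reduce to block_size tuples for a 2D weight; the tuple conventions below are inferred from this thread, not quoted from the torchao source.

```python
# Hedged sketch: quantization granularities expressed as block_size tuples
# for a 2D weight of shape (out_features, in_features). The conventions are
# assumptions based on this discussion, not the actual torchao definitions.
out_features, in_features = 256, 512

per_tensor  = (out_features, in_features)  # one scale for the whole tensor
per_channel = (1, in_features)             # one scale per output row
per_group   = (1, 64)                      # group_size=64 along the last dim
```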

@gau-nernst
Collaborator

@jerryzh168 I understand your point. But it is a bit confusing, since block_size and group_size refer to essentially the same concept yet behave slightly differently. Probably outside the scope of this PR.

@jerryzh168
Contributor

@jerryzh168 I understand your point. But it is a bit confusing, since block_size and group_size refer to essentially the same concept yet behave slightly differently. Probably outside the scope of this PR.

What do you mean by block_size and group_size meaning the same thing? I thought group_size is used just for groupwise quantization along a single dimension (mostly just the second dimension of a 2D tensor), while block_size can be used for any kind of quantization, including per-tensor, per-channel, per-group, or even groupwise quant over multiple dimensions.

@gau-nernst
Collaborator

@jerryzh168 yes, I understand that part. I think my confusion comes from the non-standardized use of the words "group" and "block". Do people always mean "group" as consecutive elements along a dimension (typically the last dim) and "block" as something more general (which can have a 2D structure, e.g. a (64, 64) tile)? (Per-channel quant can also be viewed as group-wise quant with group_size = channel length; per-tensor quant can also be viewed as group-wise quant by flattening the tensor and setting group_size = tensor size.)

For example, the 8-bit Adam paper from bnb (https://arxiv.org/pdf/2110.02861) uses the word "block_size" there, but I think it is actually "group_size" according to the usage in torchao? Happy to be corrected if my understanding is wrong.
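For reference, a minimal sketch of the bnb-style block-wise scheme described above: flatten the tensor, split it into fixed-size consecutive blocks, and keep one scale per block, which is exactly what this thread calls group-wise. Illustrative only, not bnb's actual implementation.

```python
import torch

def blockwise_absmax_scales(x: torch.Tensor, block_size: int = 256) -> torch.Tensor:
    # Flatten and split into consecutive fixed-size blocks (assumes the
    # element count is divisible by block_size), then compute one absmax
    # scale per block -- "group-wise" in torchao terminology.
    blocks = x.flatten().view(-1, block_size)
    return blocks.abs().amax(dim=1)

scales = blockwise_absmax_scales(torch.randn(1024, 256))
print(scales.shape)  # torch.Size([1024])
```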

@jerryzh168
Contributor

jerryzh168 commented Jun 27, 2024

@jerryzh168 yes, I understand that part. I think my confusion comes from the non-standardized use of the words "group" and "block". Do people always mean "group" as consecutive elements along a dimension (typically the last dim) and "block" as something more general (which can have a 2D structure, e.g. a (64, 64) tile)? (Per-channel quant can also be viewed as group-wise quant with group_size = channel length; per-tensor quant can also be viewed as group-wise quant by flattening the tensor and setting group_size = tensor size.)

For example, the 8-bit Adam paper from bnb (arxiv.org/pdf/2110.02861) uses the word "block_size" there, but I think it is actually "group_size" according to the usage in torchao? Happy to be corrected if my understanding is wrong.

Ah I see. I think in torchao we just use group_size to indicate the group size for the last dimension of a 2D tensor; this is where we were using it before: https://github.com/pytorch/ao/pull/321/files#diff-7c9b4c8c6d4ef9c47873263304a335d5cf56c3ac9f98ba10b994cd80dc9c2709L652-L654, so it is just a single number indicating how many elements we want in the same group.

block_size is a more general term, I think.
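As a tiny illustration of that mapping, a single group_size number can be expanded into the general block_size form; the helper below is hypothetical, not part of torchao.

```python
def group_size_to_block_size(shape: tuple, group_size: int) -> tuple:
    # Group along the last dimension only; every leading dimension gets
    # a block extent of 1 (one scale per index along that dimension).
    return (1,) * (len(shape) - 1) + (group_size,)

assert group_size_to_block_size((256, 512), 64) == (1, 64)        # 2D weight
assert group_size_to_block_size((8, 256, 512), 64) == (1, 1, 64)  # 3D case
```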
