
int4 gptq shape fix #142

Merged: 3 commits into gh/HDCharles/6/base on Mar 27, 2024
Conversation

HDCharles
Copy link
Contributor

@HDCharles HDCharles commented Mar 20, 2024

Stack from ghstack (oldest at bottom):

Summary: redoing 5bf70c1 in a way that doesn't get reverted. Note: a device issue also needed to be fixed.
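For context on why int4 quantization imposes shape constraints at all: two 4-bit values are stored per byte, so the packed dimension must divide evenly. A minimal pure-Python sketch (the helper name `pack_int4_pairs` is hypothetical, not the repository's actual code):

```python
def pack_int4_pairs(values):
    """Pack a flat list of 4-bit integers (0..15) into bytes, two per byte.

    The length must be even -- the same kind of divisibility constraint
    that packed int4 weight tensors must satisfy along the packed axis.
    """
    if len(values) % 2 != 0:
        raise ValueError("int4 packing needs an even number of elements")
    # high nibble from even indices, low nibble from odd indices
    return bytes((hi << 4) | lo for hi, lo in zip(values[::2], values[1::2]))

packed = pack_int4_pairs([1, 2, 3, 4])
assert len(packed) == 2  # packed axis is half the original length
```

A weight matrix whose inner dimension is odd (or, with grouped GPTQ quantization, not a multiple of the groupsize) cannot be packed this way without padding, which is the class of shape bug this PR addresses.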

Test Plan:

```sh
export MODEL_REPO=meta-llama/Llama-2-7b-chat-hf
python quantize.py --checkpoint_path checkpoints/$MODEL_REPO/model.pth --mode int4-gptq --calibration_tasks wikitext --calibration_limit 5
python eval.py --checkpoint_path checkpoints/$MODEL_REPO/model_int4-gptq.g32.cuda.pth --tasks wikitext --limit 5
```
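A shape fix of this kind typically pads dimensions up to the nearest multiple of the groupsize. gpt-fast has a `find_multiple` utility for this; the sketch below assumes its behavior from context and is not copied from the repository:

```python
def find_multiple(n: int, k: int) -> int:
    """Round n up to the nearest multiple of k.

    Returns n unchanged when it is already a multiple; otherwise pads up.
    Used to make weight dimensions satisfy int4/GPTQ groupsize constraints.
    """
    if n % k == 0:
        return n
    return n + k - (n % k)

# an already-aligned dimension is left alone
assert find_multiple(22016, 32) == 22016
# a misaligned dimension gets padded up to the next multiple
assert find_multiple(100, 32) == 128
```

With `g32` (groupsize 32, as in the checkpoint name above), any linear layer whose input dimension is not a multiple of 32 would be padded this way before packing.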


Redoing 5bf70c1 in a way that doesn't get reverted

Test Plan: sh run.sh


[ghstack-poisoned]
HDCharles added a commit that referenced this pull request Mar 20, 2024
ghstack-source-id: 4307b5c7904d59fdbd40a0455687ae90e84585d4
Pull Request resolved: #142
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Mar 20, 2024
model.py (review comment, now outdated and resolved)
Redoing 5bf70c1 in a way that doesn't get reverted

Test Plan: same quantize.py/eval.py commands as in the PR description
[ghstack-poisoned]
HDCharles added a commit that referenced this pull request Mar 26, 2024
ghstack-source-id: 1b4a8b43482ff27c8a300b571b2e3e81a13b29e4
Pull Request resolved: #142
Summary: redoing 5bf70c1 in a way that doesn't get reverted. Note: a device issue also needed to be fixed.

Test Plan: same quantize.py/eval.py commands as in the PR description
[ghstack-poisoned]
@HDCharles HDCharles changed the title from "Summary: redoing" to "int4 gptq shape fix" Mar 26, 2024
@HDCharles HDCharles mentioned this pull request Mar 26, 2024
@HDCharles HDCharles requested a review from cpuhrsch March 27, 2024 18:40
@HDCharles HDCharles merged commit 2e76737 into gh/HDCharles/6/base Mar 27, 2024
1 check passed
HDCharles added a commit that referenced this pull request Mar 27, 2024
ghstack-source-id: d3928b07e8be7e1e9e98b43584e125b6d60770d6
Pull Request resolved: #142
@HDCharles HDCharles deleted the gh/HDCharles/6/head branch March 27, 2024 18:49
griff4692 pushed a commit to AnswerDotAI/cold-compress that referenced this pull request May 20, 2024
Summary: redoing pytorch-labs/gpt-fast@5bf70c1 in a way that doesn't get reverted (same commit message as above)
ghstack-source-id: d3928b07e8be7e1e9e98b43584e125b6d60770d6
Pull Request resolved: pytorch-labs/gpt-fast#142
3 participants