Summary: redoing 5bf70c1 in a way that doesn't get reverted

Test Plan:

export MODEL_REPO=meta-llama/Llama-2-7b-chat-hf
python quantize.py --checkpoint_path checkpoints/$MODEL_REPO/model.pth --mode int4-gptq --calibration_tasks wikitext --calibration_limit 5
python eval.py --checkpoint_path checkpoints/$MODEL_REPO/model_int4-gptq.g32.cuda.pth --tasks wikitext --limit 5

Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: 1b4a8b43482ff27c8a300b571b2e3e81a13b29e4
Pull Request resolved: #142
HDCharles committed Mar 26, 2024
1 parent c955dac commit 0ad385c
Showing 5 changed files with 513 additions and 23 deletions.
4 changes: 2 additions & 2 deletions GPTQ.py
@@ -150,9 +150,9 @@ def __init__(
         }

         # trace model for one input
-        one_input = [multi.values[0] for multi in inputs]
+        one_input = tuple([multi.values[0].cpu() for multi in inputs])
         exported_model = torch._dynamo.export(
-            model, aten_graph=True, pre_dispatch=True, tracing_mode="fake"
+            model.cpu(), aten_graph=True, pre_dispatch=True, tracing_mode="fake"
         )(*one_input)
         super().__init__(exported_model.graph_module)
         self.new_state_dict = model.state_dict()
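
For context, a minimal standalone sketch of the updated tracing path. It assumes inputs is a list of MultiInput-style wrappers whose .values hold the tensors captured during calibration; the helper name trace_for_one_input is hypothetical, while the export call itself mirrors the diff above.

import torch

def trace_for_one_input(model, inputs):
    # Take the first captured example for each argument and move it to CPU so it
    # matches the CPU-resident model during fake-mode tracing (the change in this commit).
    one_input = tuple(multi.values[0].cpu() for multi in inputs)
    # torch._dynamo.export returns a callable; invoking it with the example
    # inputs yields an export result whose graph_module is the traced graph.
    exported = torch._dynamo.export(
        model.cpu(), aten_graph=True, pre_dispatch=True, tracing_mode="fake"
    )(*one_input)
    return exported.graph_module

The graph_module returned here is what the super().__init__ call in the diff consumes.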
