PoC: Block weight quantize tool for LLM [skip ci] #13758

Draft: wants to merge 1 commit into master from draft/weight_quant_llm

@hseok-oh (Contributor) commented Aug 26, 2024

  • Block quantization for LLM operators: FullyConnected, Gather
  • Select the quantization type with the circle-quantizer parameter --block_quantize_weights (Q4_0 or Q8_0)
  • Skip quantization with the circle-quantizer parameter --skipsize_block_quantize (default: 0)

Caution: This is a PoC for the circle format and for generating test models. It is not intended as a compiler implementation.
#13742 #13743
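
For readers unfamiliar with the Q4_0 and Q8_0 names, here is a minimal sketch of block quantization. It assumes the two types follow the ggml conventions of the same name (32 values per block, one fp16 scale per block). The PR description does not spell out the layout, so that is an assumption. The function names are hypothetical, and this is not the code in this PR.

```python
import struct

BLOCK_SIZE = 32  # assumption: block length used by ggml's Q4_0 / Q8_0


def to_fp16(x):
    """Round a float to the nearest half-precision value (scale storage type)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def round_half_away(x):
    """Round to nearest integer, ties away from zero (like C's roundf)."""
    return int(x + 0.5) if x >= 0 else -int(-x + 0.5)


def quantize_q8_0(values):
    """Return a list of (fp16 scale, 32 signed 8-bit ints) blocks."""
    if len(values) % BLOCK_SIZE:
        raise ValueError("length must be a multiple of the block size")
    blocks = []
    for start in range(0, len(values), BLOCK_SIZE):
        block = values[start:start + BLOCK_SIZE]
        scale = max(abs(v) for v in block) / 127.0
        inv = 1.0 / scale if scale else 0.0
        blocks.append((to_fp16(scale), [round_half_away(v * inv) for v in block]))
    return blocks


def dequantize_q8_0(blocks):
    return [scale * q for scale, qs in blocks for q in qs]


def quantize_q4_0(values):
    """Return a list of (fp16 scale, 16 packed bytes) blocks."""
    if len(values) % BLOCK_SIZE:
        raise ValueError("length must be a multiple of the block size")
    half = BLOCK_SIZE // 2
    blocks = []
    for start in range(0, len(values), BLOCK_SIZE):
        block = values[start:start + BLOCK_SIZE]
        # The value with the largest magnitude (sign kept) maps to code 0.
        signed_max = max(block, key=abs)
        scale = signed_max / -8.0
        inv = 1.0 / scale if scale else 0.0
        codes = [min(15, int(v * inv + 8.5)) for v in block]
        # Element j goes to the low nibble, element j + 16 to the high nibble.
        packed = bytes(codes[j] | (codes[j + half] << 4) for j in range(half))
        blocks.append((to_fp16(scale), packed))
    return blocks


def dequantize_q4_0(blocks):
    out = []
    for scale, packed in blocks:
        low = [(b & 0x0F) - 8 for b in packed]
        high = [(b >> 4) - 8 for b in packed]
        out.extend(scale * q for q in low + high)
    return out
```

Under these assumptions a Q8_0 block takes 34 bytes (2 for the scale, 32 for the values) and a Q4_0 block takes 18 bytes (2 for the scale, 16 for the packed nibbles), which is why the row length of a FullyConnected or Gather weight needs to be a multiple of the block size.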

@hseok-oh added the "PR/NO TEST" label (Tell CI to not run test) Aug 26, 2024
@hseok-oh force-pushed the draft/weight_quant_llm branch from 621f3f6 to 29b28a5 August 26, 2024 11:19
@hseok-oh changed the title from "PoC: Blckwise weight quantize tool for LLM [skip ci]" to "PoC: Chunk weight quantize tool for LLM [skip ci]" Aug 26, 2024
@hseok-oh force-pushed the draft/weight_quant_llm branch 3 times, most recently from b23a54b to 47eede8 August 27, 2024 05:33
@hseok-oh changed the title from "PoC: Chunk weight quantize tool for LLM [skip ci]" to "PoC: Block weight quantize tool for LLM [skip ci]" Aug 27, 2024
@hseok-oh force-pushed the draft/weight_quant_llm branch from 93887e0 to fb99730 August 27, 2024 08:16
@hseok-oh force-pushed the draft/weight_quant_llm branch from f26e7cc to 84cef0e September 6, 2024 02:03
@hseok-oh force-pushed the draft/weight_quant_llm branch 2 times, most recently from efac650 to 750278f October 11, 2024 08:11
- Blockwise quantization for LLM: FullyConnected, Gather
- Decide quantize type by circle-quantizer parameter: `--quantize_weights_chunk` (Q4_0, Q8_0)
- Skip quantization by circle-quantizer parameter: `--skip_chunkquant_size` (default: 0)

ONE-DCO-1.0-Signed-off-by: Hyeongseok Oh <[email protected]>
@hseok-oh force-pushed the draft/weight_quant_llm branch from 750278f to 2971542 October 11, 2024 08:28