Update ascend_en_get_started.md for kvcache quant
jinminxi104 authored Dec 4, 2024
1 parent 3a6518e commit 12b2305
Showing 1 changed file with 6 additions and 0 deletions.
6 changes: 6 additions & 0 deletions docs/en/get_started/ascend/get_started.md
@@ -136,3 +136,9 @@ lmdeploy lite auto_awq $HF_MODEL --work-dir $WORK_DIR --device npu
```

Please check [supported_models](../../supported_models/supported_models.md) before using this feature.

### int8 KV-cache Quantization

The Ascend backend supports offline int8 KV-cache quantization in eager mode.

Please refer to this [doc](https://github.com/DeepLink-org/dlinfer/blob/main/docs/quant/ascend_kv_quant.md) for details.
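
As a rough illustration of what offline int8 KV-cache quantization involves, the sketch below derives a scale from calibration values ahead of time, then uses it to store values as int8 and restore them on read. It is not the Ascend or dlinfer implementation: the function names, the symmetric per-tensor scheme, and the sample values are all assumptions made for illustration.

```python
def compute_scale(calibration_values):
    """Derive a symmetric int8 scale offline from calibration data (hypothetical helper)."""
    max_abs = max(abs(v) for v in calibration_values)
    # Map the largest observed magnitude to 127; fall back to 1.0 for all-zero input.
    return max_abs / 127.0 if max_abs > 0 else 1.0


def quantize(values, scale):
    """Round each value to the nearest int8 step and clamp to [-127, 127]."""
    return [max(-127, min(127, round(v / scale))) for v in values]


def dequantize(q_values, scale):
    """Recover approximate floating-point values from the int8 codes."""
    return [q * scale for q in q_values]


# Offline step: the scale is fixed before inference (assumed calibration values).
calibration = [-2.54, 0.3, 1.27, 0.0]
scale = compute_scale(calibration)

# Runtime step: key/value entries are stored as int8 and restored when read.
kv = [1.0, -0.5, 0.1, 5.0]  # 5.0 lies outside the calibrated range and is clamped
q = quantize(kv, scale)
restored = dequantize(q, scale)
```

Because the scale is fixed in advance, values outside the calibrated range saturate at the int8 limits, which is why representative calibration data matters for an offline scheme like this one.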
