v0.4.2: bug fix release
Highlights
- Fix the AutoAWQ exporting issue
- Remove bias exporting, when possible, in the AutoGPTQ format
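The bias-removal highlight can be illustrated with a minimal, hypothetical sketch (the function name `prune_zero_biases` and the threshold handling are ours, not auto-round's actual implementation): a layer's bias tensor is omitted from the exported state dict when it is effectively all zeros, since such a bias carries no information and only bloats the checkpoint.

```python
# Hypothetical sketch: drop removable (all-zero) bias tensors before export.
# Names and logic are illustrative, not auto-round's real export code.

def prune_zero_biases(state_dict, tol=0.0):
    """Return a copy of state_dict without bias tensors that are all zeros."""
    pruned = {}
    for name, tensor in state_dict.items():
        if name.endswith(".bias") and all(abs(v) <= tol for v in tensor):
            continue  # bias is effectively zero; skip exporting it
        pruned[name] = tensor
    return pruned

# Toy state dict standing in for a quantized model checkpoint.
layers = {
    "model.layers.0.self_attn.q_proj.weight": [0.5, -0.25],
    "model.layers.0.self_attn.q_proj.bias": [0.0, 0.0],  # removable
    "model.layers.0.mlp.up_proj.bias": [0.1, 0.0],       # must be kept
}
exported = prune_zero_biases(layers)
```

A real exporter would additionally record in the model config that the bias is absent, so the loader can reconstruct the layer without it.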
What's Changed
- Bump version to v0.4.1 by @XuehaoSun in #350
- Update Docker user and remove baseline UT by @XuehaoSun in #347
- Delete LLM example and refine README by @wenhuach21 in #354
- Simulated W4Afp8 Quantization by @wenhuach21 in #331
- add QWQ-32B, VLM, Qwen2.5, Llama3.1 int4 models by @wenhuach21 in #356
- Fix AWQ exporting by @wenhuach21 in #358
- Tensor reshape bugfix by @WeiweiZhang1 in #364
- Fix AWQ backend and fp_layers issue by @wenhuach21 in #363
- Fix AWQ exporting bugs by @wenhuach21 in #365
- Fix only_text_test check bug caused by a CPU inference issue by @n1ck-guo in #362
- Add GPU test by @wenhuach21 in #367
- Use multiple cards when device is set to "auto" by @n1ck-guo in #368
- quant_block_names enhancement by @WeiweiZhang1 in #369
- [HPU] Add lazy mode back by @yiliu30 in #371
- Remove bias exporting, when possible, in AutoGPTQ format by @wenhuach21 in #375
- Save processor automatically by @n1ck-guo in #372
- Add GPU UT by @wenhuach21 in #370
- Fix GPU UT by @n1ck-guo in #376
- Fix typos by @wenhuach21 in #377
Full Changelog: v0.4.1...v0.4.2