v0.4.2: bug fix release
Highlights
- Fix the AutoAWQ exporting issue
- Remove bias exporting, when possible, in the AutoGPTQ format
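The bias-removal highlight can be illustrated with a minimal, hypothetical sketch (the function name `prune_zero_biases` and the threshold handling are ours, not auto-round's actual implementation): a layer's bias tensor is omitted from the exported state dict when it is effectively all zeros, since such a bias carries no information and only bloats the checkpoint.

```python
# Hypothetical sketch: drop removable (all-zero) bias tensors before export.
# Names and logic are illustrative, not auto-round's real export code.

def prune_zero_biases(state_dict, tol=0.0):
    """Return a copy of state_dict without bias tensors that are all zeros."""
    pruned = {}
    for name, tensor in state_dict.items():
        if name.endswith(".bias") and all(abs(v) <= tol for v in tensor):
            continue  # bias is effectively zero; skip exporting it
        pruned[name] = tensor
    return pruned

# Toy state dict standing in for a quantized model checkpoint.
layers = {
    "model.layers.0.self_attn.q_proj.weight": [0.5, -0.25],
    "model.layers.0.self_attn.q_proj.bias": [0.0, 0.0],  # removable
    "model.layers.0.mlp.up_proj.bias": [0.1, 0.0],       # must be kept
}
exported = prune_zero_biases(layers)
```

A real exporter would additionally record in the model config that the bias is absent, so the loader can reconstruct the layer without it.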
What's Changed
- Bump version to v0.4.1 by @XuehaoSun in #350
- Update Docker user and remove baseline UT by @XuehaoSun in #347
- Delete LLM example and refine README by @wenhuach21 in #354
- Simulated W4Afp8 Quantization by @wenhuach21 in #331
- add QWQ-32B, VLM, Qwen2.5, Llama3.1 int4 models by @wenhuach21 in #356
- Fix AWQ exporting by @wenhuach21 in #358
- Tensor reshape bugfix by @WeiweiZhang1 in #364
- Fix AWQ backend and fp_layers issue by @wenhuach21 in #363
- Fix AWQ exporting bugs by @wenhuach21 in #365
- Fix only_text_test check bug caused by a CPU inference issue by @n1ck-guo in #362
- Add GPU test by @wenhuach21 in #367
- Use multiple cards when device is set to "auto" by @n1ck-guo in #368
- quant_block_names enhancement by @WeiweiZhang1 in #369
- [HPU] Add lazy mode back by @yiliu30 in #371
- Remove bias exporting, when possible, in AutoGPTQ format by @wenhuach21 in #375
- Save processor automatically by @n1ck-guo in #372
- Add GPU UT by @wenhuach21 in #370
- Fix GPU UT by @n1ck-guo in #376
- Fix typos by @wenhuach21 in #377
Full Changelog: v0.4.1...v0.4.2