What's Changed
- BLOOM Default Smoothquant Mappings by @kylesayrs in #906
- [SparseAutoModelForCausalLM Deprecation] Feature change by @horheynm in #881
- Correct "dyanmic" typo by @kylesayrs in #888
- Explicit defaults for QuantizationModifier targets by @kylesayrs in #889
- [SparseAutoModelForCausalLM Deprecation] Update examples by @horheynm in #880
- Support pack_quantized format for nonuniform mixed-precision by @mgoin in #913
- Actually make the `run_compressed` test useful by @dsikka in #920
- Fix for e2e tests by @horheynm in #927
- [Bugfix] Correct metrics calculations by @kylesayrs in #878
- Update kv_cache example by @dsikka in #921
- [1/2] Expand e2e testing to prepare for lm-eval by @dsikka in #922
- Update pytest command to capture results to file by @dbarbuzzi in #932
- [Bugfix] DisableKVCache Context by @kylesayrs in #834
- Add helpful info to the marlin-24 example by @dsikka in #946
- Remove requires_torch by @kylesayrs in #949
- Remove unused sparseml.export utilities by @kylesayrs in #950
- Implement HooksMixin by @kylesayrs in #917
- Add LM Eval Testing by @dsikka in #945
- update version by @dsikka in #969
Full Changelog: 0.3.0...0.3.1