All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Added utility to get device name from hl-smi (#232)
- Integrated Intel Neural Compressor for FP8 inference (#235)
- Updated to Intel Gaudi software Release 1.16.2 (#207)
- Updated to Intel Gaudi software Release 1.17.0 (#221)
- Modified torch device specification for FSDP on HPU (#222)
- Updated strategies to use fork as the default start method (#234)
- Updated HPU parallel strategy to serve as a base class (#237)
- Updated to Intel Gaudi software Release 1.18.0 (#245)
- Fixed device name retrieval when hl-smi is unavailable (#240)
- Deprecated support for Habana Quantization Toolkit. (#235)
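
For context on the device-name handling above (#232, #240), a minimal best-effort lookup could look like the sketch below. The hl-smi query flags and the `habana_frameworks.torch.hpu.get_device_name()` fallback are assumptions for illustration, not the plugin's actual implementation.

```python
import shutil
import subprocess


def hpu_device_name() -> str:
    """Best-effort HPU device-name lookup: try hl-smi, then the HPU runtime."""
    if shutil.which("hl-smi"):
        try:
            # Assumed hl-smi query syntax (modeled on nvidia-smi-style CSV queries).
            out = subprocess.run(
                ["hl-smi", "-Q", "name", "-f", "csv"],
                capture_output=True, text=True, check=True,
            )
            lines = [ln.strip() for ln in out.stdout.splitlines() if ln.strip()]
            if lines:
                return lines[-1]  # last non-empty line holds the value after the CSV header
        except (OSError, subprocess.CalledProcessError):
            pass
    # Fallback when hl-smi is unavailable: ask the HPU runtime directly (assumed API).
    import habana_frameworks.torch.hpu as hthpu

    return hthpu.get_device_name()
```
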
- Added support for additional dtypes (#194)
- Added more tests of FSDP with HPU (#197)
- Added FSDP strategy with fabric on HPU (#198)
- Updated to common `hpu_backend` interface for compile support (#183)
- Updated to Intel Gaudi software Release 1.16.0 (#191)
- Updated HQT APIs to be in accordance with Intel Gaudi software Release 1.16.0 (#192)
- Updated HPUPrecisionPlugin for fp8 based on Intel Gaudi software Release 1.16.0. (#195)
- Fixed DeepSpeed documentation & tests based on Synapse AI release 1.15.1 and the latest PTL Fabric (#184)
- Added a workaround to resolve the label name issue in HPUProfiler with torch.compile (#185)
- Fixed incompatibility issue for PyTorch>=2.3.0 (#193)
- Added support for the Intel Gaudi Profiler; deprecated the `HABANA_PROFILE` environment variable in HPUProfiler (#158)
- Added support for FP8 inference (#162)
- Added support for LightningCLI. (#173)
- Added experimental support for FSDP on HPU. (#174)
- Added support for FP8 inference with DeepSpeed. (#176)
- Updated the lightning version check for using FSDP. (#182)
- Changed HPUParallelStrategy to HPUDDPStrategy (#160)
- Changed HPU docker image based on Synapse AI release 1.15.0 (#166)
- Updated to Intel Gaudi software Release 1.15.1 (#171)
- Fixed "No profiler activity found" error with HPUProfiler. (#172)
- Decoupled the return strings of the firmware and Synapse version helpers (#137)
- Changed HPU docker image based on Synapse AI release 1.14.0 (#140)
- Fixed fabric imports for HPU strategies (#126)
- Enabled tests and examples of Fabric with HPU (#139)
- Fixed an API break due to non-strict loading in Trainer (#150)
- `aot_hpu_training_backend` will be deprecated. Use `hpu_backend` instead for torch.compile with HPU (#148)
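
As a quick illustration of the backend switch in the entry above, compiling a model for HPU with the common backend might look like this. It assumes a working Intel Gaudi software installation; only the `hpu_backend` name comes from the entry itself.

```python
import torch
import habana_frameworks.torch.core as htcore  # noqa: F401  # sets up the HPU integration

model = torch.nn.Linear(8, 4).to("hpu")

# Prefer the common "hpu_backend"; "aot_hpu_training_backend" is being deprecated (#148).
compiled_model = torch.compile(model, backend="hpu_backend")

out = compiled_model(torch.randn(2, 8, device="hpu"))
```
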
- Added support for DeepSpeed inference on HPU with tests and documentation (#110)
- Added tests, examples, and documentation for dynamic shapes with recipe caching (#107)
- Added preview of torch compile with tests and documentation (#119)
- Changed HPU docker image based on Synapse AI release 1.13.0 (#114)
- Added tests, examples and documentation for HPUPrecisionPlugin with autocast (#94)
- Added test to validate checkpoint resuming with HPUDeepSpeedStrategy (#95)
- Added support for lightning 2.1 (#100, #105)
- Changed HPU docker image based on Synapse AI release 1.12.0 (#90)
- Used standard APIs and removed the environment variable for getting the HPU distributed backend (#91)
- Changed HPU docker image based on Synapse AI release 1.12.1 and updated hooks (#106)
- Added documentation with examples for using DeepSpeed with HPU (#64)
- Added autocast support using the HPUPrecision plugin (#66, #75)
- Demonstrated HPU Graphs support (#67)
- Enhanced test coverage of the DeepSpeed strategy on HPU (#68)
- Added a version check helper to use the right release (#75, #76)
- Implemented reduce with the parallel plugin (#77)
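
As a rough sketch of what autocast on HPU (#66, #75) amounts to at the PyTorch level, see below. The HPUPrecision plugin wires an equivalent policy into the Trainer; the manual context manager and the bf16 dtype here are illustrative assumptions.

```python
import torch
import habana_frameworks.torch.core as htcore  # noqa: F401  # enables the "hpu" device

model = torch.nn.Linear(16, 16).to("hpu")
x = torch.randn(4, 16, device="hpu")

# bf16 autocast region on HPU; outside the context, ops run in the default dtype.
with torch.autocast(device_type="hpu", dtype=torch.bfloat16):
    y = model(x)
```
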
- Changed HPU docker image based on Synapse AI release 1.11.0 and upgraded the DeepSpeed plugin to version 0.9.4 (#61)
- Fixed optimizer priority based on the DeepSpeed specification (#36)
- Fixed missing extras in package (#70)
- Warn on HMP deprecation from the `HPUPrecision` plugin (#65)
- Enabled skipped tests based on registered strategy and accelerator (#46)
- Fixed AttributeError (#43)
- Fixed wrong imports (#44)
- Fixed graph breaks in test/val phases in lazy mode (#45)
- Added HPU support for fabric (#11)
- Added PyTorch HPU profiler support (#15)
- Added basic HPU infra support for DeepSpeed (#21)
- Added PyTorch HPU datamodule support (#16)
- Changed code hierarchy in compliance with base Lightning code for PyTorch (#12)
- Changed default HPU docker image based on HPU release 1.10.0 (#30)