Bump accelerate from 0.22.0 to 0.34.2 #40

dependabot · 2024-09-09T23:52:16Z

Bumps accelerate from 0.22.0 to 0.34.2.

Release notes

v0.34.1 Patchfix

Bug fixes

Fixes an issue where processed DataLoaders could no longer be pickled in #3074 thanks to @byi8220

Fixes an issue when using FSDP where default_transformers_cls_names_to_wrap would separate _no_split_modules by characters instead of keeping it as a list of layer names in #3075
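
The character-splitting bug described above is easy to reproduce in plain Python: iterating over a joined string yields single characters, while iterating over a list yields whole names. The sketch below only illustrates that class of bug; it is not the accelerate source, and `collect_names`, `BlockA`, and `BlockB` are hypothetical names made up for this example.

```python
from typing import Iterable, List


def collect_names(names: Iterable[str]) -> List[str]:
    """Collect every item produced by iterating over `names`."""
    return [name for name in names]


# Hypothetical layer names standing in for a model's _no_split_modules.
layer_names = ["BlockA", "BlockB"]

# Buggy pattern: the names are joined into one string, so iteration
# walks over individual characters instead of layer names.
joined = ",".join(layer_names)
buggy = collect_names(joined)

# Fixed pattern: keep the value as a list of layer names.
fixed = collect_names(layer_names)
```

With the joined string, `buggy` holds one entry per character (including the comma), whereas `fixed` holds exactly the two layer names.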

Full Changelog: huggingface/accelerate@v0.34.0...v0.34.1

v0.34.0: StatefulDataLoader Support, FP8 Improvements, and PyTorch Updates!

Dependency Changes

Updated Safetensors Requirement: The library now requires safetensors version 0.4.3.

Added support for NumPy 2.0: The library now fully supports numpy 2.0.0.

Core

New Script Behavior Changes

Process Group Management: PyTorch now requires users to destroy process groups after training. The accelerate library will handle this automatically with accelerator.end_training(), or you can do it manually using PartialState().destroy_process_group().

MLU Device Support: Added support for saving and loading RNG states on MLU devices by @huismiling

NPU Support: Corrected backend and distributed settings when using transfer_to_npu, ensuring better performance and compatibility.
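
A minimal sketch of the process-group cleanup mentioned under Process Group Management, assuming accelerate 0.34 or later is installed. Only `Accelerator`, `accelerator.end_training()`, and `PartialState().destroy_process_group()` come from the release notes; `run_training` and its placeholder loop are hypothetical.

```python
def run_training(num_steps: int = 3) -> int:
    """Run a placeholder loop and always clean up distributed state afterwards."""
    # Imported lazily so this sketch can be loaded without accelerate installed.
    from accelerate import Accelerator

    accelerator = Accelerator()
    completed = 0
    try:
        for _ in range(num_steps):
            completed += 1  # placeholder for a real training step
    finally:
        # Per the release notes, end_training() handles destroying the
        # process group. The manual alternative named there is
        # PartialState().destroy_process_group().
        accelerator.end_training()
    return completed
```

Putting the call in a `finally` block is a design choice of this sketch, so that cleanup also runs if a training step raises.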

DataLoader Enhancements

Stateful DataLoader: We are excited to announce that early support has been added for the StatefulDataLoader from torchdata, allowing better handling of data loading states. Enable it by passing use_stateful_dataloader=True to the DataLoaderConfiguration. When load_state() is called, the DataLoader automatically resumes from its last step, so there is no need to iterate through already-consumed batches.

Decoupled Data Loader Preparation: The prepare_data_loader() function is now independent of the Accelerator, giving you more flexibility in which API level you use.

XLA Compatibility: Added support for skipping initial batches when using XLA.

Improved State Management: Bug fixes and enhancements for saving/loading DataLoader states, ensuring smoother training sessions.

Epoch Setting: Introduced the set_epoch function for MpDeviceLoaderWrapper.
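
A minimal sketch of enabling the StatefulDataLoader support described above, assuming accelerate 0.34 or later and a torchdata version that provides StatefulDataLoader are installed. The `use_stateful_dataloader=True` flag and `DataLoaderConfiguration` come from the release notes; the helper name `build_stateful_accelerator` is hypothetical.

```python
def build_stateful_accelerator():
    """Return an Accelerator whose prepared DataLoaders track their own state."""
    # Imported lazily so this sketch can be loaded without accelerate installed.
    from accelerate import Accelerator
    from accelerate.utils import DataLoaderConfiguration

    # Opt in to the torchdata StatefulDataLoader backend.
    dataloader_config = DataLoaderConfiguration(use_stateful_dataloader=True)
    return Accelerator(dataloader_config=dataloader_config)
```

DataLoaders passed through `prepare()` on the returned accelerator should then be resumed by `load_state()` as the notes describe, though the exact behavior depends on the installed versions.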

FP8 Training Improvements

Enhanced FP8 Training: Fully Sharded Data Parallelism (FSDP) and DeepSpeed support now work seamlessly with TransformerEngine FP8 training, including better defaults for the quantized FP8 weights.

Integration baseline: We've added a new suite of examples and benchmarks to ensure that our TransformerEngine integration works exactly as intended. These scripts run one half using 🤗 Accelerate's integration and the other half with raw TransformerEngine, giving users an example of what accelerate does under the hood and a sanity check that nothing breaks over time. Find them here

Import Fixes: Resolved issues with the import checks for TransformerEngine that caused downstream problems.

FP8 Docker Images: We've added new docker images for TransformerEngine and accelerate as well. Use docker pull huggingface/accelerate:gpu-fp8-transformerengine to quickly get an environment going.

torchpippy no more, long live torch.distributed.pipelining

With the latest PyTorch release, torchpippy is now fully integrated into torch core, and as a result we are exclusively supporting the PyTorch implementation from now on.

There are breaking examples and changes that come from this shift. Namely:

Tracing of inputs is done with a shape each GPU will see, rather than the size of the total batch. So for 2 GPUs, one should pass in an input of [1, n, n] rather than [2, n, n] as before.

We no longer support Encoder/Decoder models. PyTorch tracing for pipelining no longer supports encoder/decoder models, so the t5 example has been removed.

Computer vision model support currently does not work: There are some tracing issues with ResNets that we are actively looking into.

If any of these changes is too disruptive, we recommend pinning your accelerate version. If the missing encoder/decoder model support is actively blocking your inference using pippy, please open an issue and let us know. We can look into restoring the old torchpippy support if needed.
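
The tracing-shape change listed above (for 2 GPUs, pass [1, n, n] rather than [2, n, n]) amounts to dividing the batch dimension by the number of GPUs. The helper below is a hypothetical sketch of that arithmetic, not part of accelerate, and it assumes the total batch splits evenly across GPUs.

```python
from typing import Tuple


def per_gpu_trace_shape(total_shape: Tuple[int, ...], num_gpus: int) -> Tuple[int, ...]:
    """Return the input shape a single GPU sees, given the total batch shape."""
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    batch, *rest = total_shape
    if batch % num_gpus != 0:
        # Assumption of this sketch: the batch divides evenly across GPUs.
        raise ValueError("batch size must be divisible by num_gpus")
    return (batch // num_gpus, *rest)


n = 8
old_style_shape = (2, n, n)  # total batch, as passed before the change
new_style_shape = per_gpu_trace_shape(old_style_shape, num_gpus=2)
```

Here `new_style_shape` is the per-GPU shape to use when building the example input for tracing.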

Fully Sharded Data Parallelism (FSDP)

Environment Flexibility: Environment variables are now fully optional for FSDP, simplifying configuration. You can now create a FullyShardedDataParallelPlugin yourself with no need for environment patching:

from accelerate import FullyShardedDataParallelPlugin
fsdp_plugin = FullyShardedDataParallelPlugin(...)

FSDP RAM efficient loading: Added a utility to enable RAM-efficient model loading (by setting the proper environment variable). This is generally needed if you are not using accelerate launch and need to ensure the environment variables are set up properly for model loading:

from accelerate.utils import enable_fsdp_ram_efficient_loading, disable_fsdp_ram_efficient_loading

... (truncated)

Commits

c61f41c Release: v0.34.2
beb4378 Release: v0.34.1
e13bef2 Allow DataLoaderAdapter subclasses to be pickled by implementing __reduce__...
73a1531 Fix FSDP auto_wrap using characters instead of full str for layers (#3075)
159c0dd Release: v0.34.0
8931e5e Remove skip_first_batches support for StatefulDataloader and fix all the te...
a848592 Speed up tests by shaving off subprocess when not needed (#3042)
758d624 add set_epoch for MpDeviceLoaderWrapper (#3053)
b07ad2a Fix typo in comment (#3045)
1d09a20 use duck-typing to ensure underlying optimizer supports schedulefree hooks (#...
Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

@dependabot rebase will rebase this PR
@dependabot recreate will recreate this PR, overwriting any edits that have been made to it
@dependabot merge will merge this PR after your CI passes on it
@dependabot squash and merge will squash and merge this PR after your CI passes on it
@dependabot cancel merge will cancel a previously requested merge and block automerging
@dependabot reopen will reopen this PR if it is closed
@dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
@dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
@dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
@dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
@dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [accelerate](https://github.com/huggingface/accelerate) from 0.22.0 to 0.34.2.
- [Release notes](https://github.com/huggingface/accelerate/releases)
- [Commits](huggingface/accelerate@v0.22.0...v0.34.2)

---
updated-dependencies:
- dependency-name: accelerate
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

dependabot bot added the dependencies label (Pull requests that update a dependency file) Sep 9, 2024

dependabot bot mentioned this pull request Sep 9, 2024

Bump accelerate from 0.22.0 to 0.34.0 #38

Closed
