[feature request] Native checkpointing to/from `s3://`

### 🚀 The feature, motivation and pitch

Sometimes it's beneficial to stream checkpoint data directly to cloud storage (rather than dump it locally and have some background process handle sync/upload/cleanup), or to load checkpoint weights directly from an `s3://` or `gs://` path. I wonder if this could also be related to the recent native HFStorageReader support: https://github.com/pytorch/pytorch/pull/154518

I also found that the torchsnapshot library (seems abandoned now; last commit 6 months ago) supports this: https://docs.pytorch.org/torchsnapshot/main/getting_started.html

---

There is also a fairly popular package, fsspec, which wraps several cloud storage libraries and provides caching functionality. There was some discussion in torchsnapshot on supporting it natively:
- https://github.com/pytorch/torchsnapshot/issues/102
- https://github.com/pytorch/torchsnapshot/pull/114

And some older (before HF) discussions:
- https://github.com/pytorch/pytorch/issues/91965
- https://github.com/pytorch/pytorch/issues/68320

I wonder if some HF utils for checkpointing, or the HF Hub blob caching structure, could be upstreamed to PyTorch; this would be useful e.g. for loading pretrained weights. And maybe some `hf://` management/caching could be made to plug into the fsspec interface? Then `hf://` could be used in all fsspec-using places.
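
To make the request a bit more concrete, here is a rough sketch of the streaming pattern I have in mind, going through fsspec. This is only an illustration, not a proposed API: `open_uri`, `save_checkpoint` and `load_checkpoint` are hypothetical names, and `pickle` stands in for `torch.save` / `torch.load` so the snippet runs without PyTorch (both pairs take the same `(obj, file)` / `(file)` arguments). The demo writes to a local temporary path; remote schemes additionally need fsspec plus a backend package installed.

```python
import os
import pickle
import tempfile

try:
    import fsspec  # optional: resolves URIs such as s3://, gs://, memory://
except ImportError:
    fsspec = None


def open_uri(uri, mode):
    """Hypothetical helper: return a context manager yielding a binary file object."""
    if fsspec is not None:
        # fsspec picks the filesystem implementation from the URI scheme.
        return fsspec.open(uri, mode)
    if "://" in uri:
        raise RuntimeError(f"fsspec is required to open {uri!r}")
    # Fallback so the sketch still works for plain local paths.
    return open(uri, mode)


def save_checkpoint(state, uri, serialize=pickle.dump):
    """Serialize `state` straight into the file object behind `uri`."""
    with open_uri(uri, "wb") as f:
        serialize(state, f)


def load_checkpoint(uri, deserialize=pickle.load):
    """Read a checkpoint back from `uri`."""
    with open_uri(uri, "rb") as f:
        return deserialize(f)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "ckpt.bin")
        save_checkpoint({"step": 10, "weights": [0.1, 0.2]}, path)
        print(load_checkpoint(path))
```

With PyTorch, the same helpers could presumably be called as `save_checkpoint(model.state_dict(), "s3://my-bucket/ckpt.pt", serialize=torch.save)`, where the bucket name is made up and the `s3://` scheme assumes the `s3fs` backend is installed. The feature request is essentially for PyTorch's checkpointing APIs to accept such URIs natively instead of requiring a wrapper like this.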

### Alternatives

_No response_

### Additional context

_No response_

cc @mruberry @mikaylagawarecki @LucasLLC @pradeepfn @MeetVadakkanchery @mhorowitz @ekr0

[feature request] Native checkpointing to/from `s3://` #155992
