[None][feat] Skip prefetching consolidated safetensors when appropriate (NVIDIA#7225)
* Why?
Some models (e.g. anything produced by Mistral) can have both sharded
safetensors and a consolidated safetensor in the same checkpoint
directory. In such cases, prefetching both into memory wastes both time
and memory.
* What?
This commit skips over consolidated safetensors when they are not the
only safetensor file present in the checkpoint directory.
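The skip logic described above can be sketched as follows. This is a hedged illustration, not the actual TensorRT-LLM implementation; the helper name `files_to_prefetch` and the `consolidated` filename prefix convention are assumptions for the example.

```python
from pathlib import Path


def files_to_prefetch(checkpoint_dir: str) -> list[str]:
    # Hypothetical helper sketching the commit's behavior: skip
    # consolidated safetensors when sharded shards are also present.
    safetensors = sorted(
        p.name for p in Path(checkpoint_dir).glob("*.safetensors"))
    consolidated = [f for f in safetensors if f.startswith("consolidated")]
    sharded = [f for f in safetensors if f not in consolidated]
    # Only skip the consolidated file(s) when they are not the sole
    # safetensors in the checkpoint directory.
    return sharded if sharded else consolidated
```

With both kinds of files present, only the sharded files are prefetched; with a consolidated file alone, it is still prefetched as before.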
Signed-off-by: William Zhang <[email protected]>
Signed-off-by: Wangshanshan <[email protected]>