Actions: microsoft/DeepSpeed

nv-torch-latest-v100

5,085 workflow runs

Fix checkpointable_layers Logic
nv-torch-latest-v100 #12721: Pull request #6881 synchronize by loadams
December 20, 2024 00:55 6h 30m 58s Quentin-Anthony:qanthony/fix-act-recomp
Stage3: Use new torch grad accumulation hooks API
nv-torch-latest-v100 #12720: Pull request #6773 synchronize by loadams
December 20, 2024 00:55 6h 12m 2s deepcharm:stage3-use-new-grad-acc-api
nv-torch-latest-v100
nv-torch-latest-v100 #12719: Scheduled
December 20, 2024 00:20 1h 49m 31s master
Fix error caused by all_reduce call in domino
nv-torch-latest-v100 #12718: Pull request #6880 synchronize by loadams
December 19, 2024 23:23 1h 29m 51s hongwei/fix_domino_allreduce
Fix checkpointable_layers Logic
nv-torch-latest-v100 #12717: Pull request #6881 synchronize by loadams
December 19, 2024 20:32 4h 21m 9s Quentin-Anthony:qanthony/fix-act-recomp
Stage3: Use new torch grad accumulation hooks API
nv-torch-latest-v100 #12716: Pull request #6773 synchronize by loadams
December 19, 2024 20:32 1h 30m 14s deepcharm:stage3-use-new-grad-acc-api
Fix: forbid repeated deepspeed.initialize on training objects
nv-torch-latest-v100 #12715: Pull request #6874 synchronize by traincheck-team
December 19, 2024 18:37 Action required traincheck-team:fix-6848-forbid-repeated-init
nv-torch-latest-v100
nv-torch-latest-v100 #12714: Manually run by loadams
December 19, 2024 18:15 1h 20m 12s loadams/test-transformers-inference
Add the missing view operations from sequence parallel(async).
nv-torch-latest-v100 #12713: Pull request #6750 synchronize by loadams
December 19, 2024 17:39 Action required inkcherry:ds_overlap_fix
Zero2: avoid graph breaks in torch.compile by using param_idx
nv-torch-latest-v100 #12712: Pull request #6803 synchronize by loadams
December 19, 2024 17:36 1h 35m 15s nelyahu:zero2_param_idx
Change compile for pipeline module torch.compile
nv-torch-latest-v100 #12711: Pull request #6478 synchronize by loadams
December 19, 2024 17:36 1h 29m 9s NirSonnenschein:torch_compile_micro_offset_fix
Cleanup ops/transformer/inference tests
nv-torch-latest-v100 #12710: Pull request #6830 synchronize by loadams
December 19, 2024 17:32 6h 0m 41s loadams/transformers-inference
Cleanup ops/transformer/inference tests
nv-torch-latest-v100 #12709: Pull request #6830 synchronize by loadams
December 19, 2024 17:27 5m 11s loadams/transformers-inference
Cleanup ops/transformer/inference tests
nv-torch-latest-v100 #12708: Pull request #6830 synchronize by loadams
December 19, 2024 17:25 2m 42s loadams/transformers-inference
hpu_accelerator: use torch.use_deterministic_algorithms
nv-torch-latest-v100 #12705: Pull request #6897 opened by nelyahu
December 19, 2024 07:23 1h 32m 15s nelyahu:patch-2
nv-torch-latest-v100
nv-torch-latest-v100 #12704: Scheduled
December 19, 2024 00:22 3h 4m 22s master
Allow to compile collective for PT > 2.3
nv-torch-latest-v100 #12703: Pull request #6674 reopened by loadams
December 18, 2024 21:53 4h 0m 8s nelyahu:compile_collectives
Allow to compile collective for PT > 2.3
nv-torch-latest-v100 #12702: Pull request #6674 synchronize by loadams
December 18, 2024 21:07 39m 25s nelyahu:compile_collectives
Copy #6674: Allow to compile collective for PT > 2.3
nv-torch-latest-v100 #12701: Pull request #6894 opened by loadams
December 18, 2024 21:01 8h 55m 21s loadams/test-compile-collectives
Fix checkpointable_layers Logic
nv-torch-latest-v100 #12700: Pull request #6881 synchronize by Quentin-Anthony
December 18, 2024 20:25 3h 19m 58s Quentin-Anthony:qanthony/fix-act-recomp
Fix checkpointable_layers Logic
nv-torch-latest-v100 #12699: Pull request #6881 synchronize by Quentin-Anthony
December 18, 2024 20:24 Action required Quentin-Anthony:qanthony/fix-act-recomp
Support latest transformers with DSChat
nv-torch-latest-v100 #12698: Pull request #6711 synchronize by loadams
December 18, 2024 20:24 1h 31m 8s loadams/fix-ds-chat-transformers
Training ops kernels: Speeding up the Llama-based MoE architectures
nv-torch-latest-v100 #12697: Pull request #6734 synchronize by loadams
December 18, 2024 19:27 Action required RezaYazdaniAminabadi:tops-kernels