Actions: microsoft/DeepSpeed

nv-torch-latest-v100

5,127 workflow runs

fix: RuntimeError for UCP large DP
nv-torch-latest-v100 #12760: Pull request #6918 opened by saforem2
December 29, 2024 18:23 In progress saforem2/ucp-bug
nv-torch-latest-v100
nv-torch-latest-v100 #12759: Scheduled
December 29, 2024 00:23 1h 32m 49s master
Use ds-specific module id to avoid conflicts
nv-torch-latest-v100 #12758: Pull request #6847 synchronize by tjruwase
December 28, 2024 19:44 1h 21m 5s olruwase/pr_6772
nv-torch-latest-v100
nv-torch-latest-v100 #12757: Scheduled
December 28, 2024 00:20 1h 34m 9s master
[BUG FIX]:fix get torch.version.cuda error when cuda is None in rocm
nv-torch-latest-v100 #12756: Pull request #6909 synchronize by hj-wei
December 27, 2024 03:06 Action required hj-wei:dev_hjwei
nv-torch-latest-v100
nv-torch-latest-v100 #12753: Scheduled
December 27, 2024 00:20 1h 32m 55s master
Stage3: Use new torch grad accumulation hooks API
nv-torch-latest-v100 #12752: Pull request #6773 synchronize by loadams
December 26, 2024 20:09 1h 47m 22s deepcharm:stage3-use-new-grad-acc-api
Change compile for pipeline module torch.compile
nv-torch-latest-v100 #12751: Pull request #6478 synchronize by loadams
December 26, 2024 20:08 1h 29m 39s NirSonnenschein:torch_compile_micro_offset_fix
Stage3: Use new torch grad accumulation hooks API
nv-torch-latest-v100 #12750: Pull request #6773 synchronize by loadams
December 26, 2024 17:40 1h 38m 33s deepcharm:stage3-use-new-grad-acc-api
[BUG FIX]:fix get torch.version.cuda error when cuda is None in rocm
nv-torch-latest-v100 #12749: Pull request #6909 synchronize by loadams
December 26, 2024 17:15 Action required hj-wei:dev_hjwei
Use ds-specific module id to avoid conflicts
nv-torch-latest-v100 #12748: Pull request #6847 synchronize by loadams
December 26, 2024 17:13 1h 31m 7s olruwase/pr_6772
Fix checkpointable_layers Logic
nv-torch-latest-v100 #12747: Pull request #6881 synchronize by loadams
December 26, 2024 17:12 1h 32m 5s Quentin-Anthony:qanthony/fix-act-recomp
Update Gaudi2 jobs to latest 1.19 build
nv-torch-latest-v100 #12746: Pull request #6905 synchronize by loadams
December 26, 2024 17:12 6h 0m 24s raza-sikander:master
Add fp8_gemm fallback for non-triton systems
nv-torch-latest-v100 #12744: Pull request #6916 opened by oelayan7
December 26, 2024 08:52 Action required oelayan7:fp8_gemm_no_triton
nv-torch-latest-v100
nv-torch-latest-v100 #12743: Scheduled
December 26, 2024 00:20 1h 24m 46s master
[BUG FIX]:fix get torch.version.cuda error when cuda is None in rocm
nv-torch-latest-v100 #12739: Pull request #6909 synchronize by hj-wei
December 25, 2024 02:18 Action required hj-wei:dev_hjwei
Add the missing view operations from sequence parallel(async).
nv-torch-latest-v100 #12738: Pull request #6750 synchronize by inkcherry
December 25, 2024 01:50 Action required inkcherry:ds_overlap_fix
nv-torch-latest-v100
nv-torch-latest-v100 #12737: Scheduled
December 25, 2024 00:20 1h 34m 19s master
[BUG FIX]:fix get torch.version.cuda error when cuda is None in rocm
nv-torch-latest-v100 #12736: Pull request #6909 opened by hj-wei
December 24, 2024 07:38 Action required hj-wei:dev_hjwei