Actions: microsoft/DeepSpeed

nv-torch-latest-v100

5,085 workflow runs

Add the missing view operations from sequence parallel(async).
nv-torch-latest-v100 #12696: Pull request #6750 synchronize by loadams
December 18, 2024 18:59 Action required inkcherry:ds_overlap_fix
Fix error caused by all_reduce call in domino
nv-torch-latest-v100 #12695: Pull request #6880 synchronize by hwchen2017
December 18, 2024 18:02 7h 28m 5s hongwei/fix_domino_allreduce
Stage3: Use new torch grad accumulation hooks API
nv-torch-latest-v100 #12694: Pull request #6773 synchronize by loadams
December 18, 2024 17:55 6h 0m 24s deepcharm:stage3-use-new-grad-acc-api
Zero2: avoid graph breaks in torch.compile by using param_idx
nv-torch-latest-v100 #12693: Pull request #6803 synchronize by loadams
December 18, 2024 17:55 2h 36m 56s nelyahu:zero2_param_idx
nv-torch-latest-v100
nv-torch-latest-v100 #12692: Manually run by loadams
December 18, 2024 17:55 29m 44s loadams/inference-ops-test-repro
Update version.txt after 0.16.2 release
nv-torch-latest-v100 #12691: Pull request #6893 opened by loadams
December 18, 2024 17:52 6h 0m 22s AutoPR/0.16.2
Inference ops unit test failures/fixes
nv-torch-latest-v100 #12688: Pull request #6879 synchronize by loadams
December 18, 2024 16:53 3m 40s loadams/inference-ops-test-repro
Stage3: Use new torch grad accumulation hooks API
nv-torch-latest-v100 #12687: Pull request #6773 synchronize by loadams
December 18, 2024 16:51 1h 4m 48s deepcharm:stage3-use-new-grad-acc-api
Zero2: avoid graph breaks in torch.compile by using param_idx
nv-torch-latest-v100 #12686: Pull request #6803 synchronize by loadams
December 18, 2024 16:51 1h 4m 59s nelyahu:zero2_param_idx
Update code owners
nv-torch-latest-v100 #12685: Pull request #6890 synchronize by loadams
December 18, 2024 16:30 1h 32m 36s olruwase/code_owners
Use ds-specific module id to avoid conflicts
nv-torch-latest-v100 #12683: Pull request #6847 synchronize by tjruwase
December 18, 2024 13:59 1h 19m 19s olruwase/pr_6772
Update code owners
nv-torch-latest-v100 #12682: Pull request #6890 opened by tjruwase
December 18, 2024 12:04 1h 37m 34s olruwase/code_owners
Fix error caused by all_reduce call in domino
nv-torch-latest-v100 #12681: Pull request #6880 synchronize by tjruwase
December 18, 2024 11:51 1h 35m 21s hongwei/fix_domino_allreduce
Stage3: Use new torch grad accumulation hooks API
nv-torch-latest-v100 #12680: Pull request #6773 synchronize by deepcharm
December 18, 2024 09:44 1h 32m 4s deepcharm:stage3-use-new-grad-acc-api
Add arctic model support by adding w2 to all_reduce
nv-torch-latest-v100 #12678: Pull request #6856 synchronize by loadams
December 18, 2024 01:31 4h 1m 23s pi314ever:arctic-enabling-upstream
nv-torch-latest-v100
nv-torch-latest-v100 #12676: Scheduled
December 18, 2024 00:21 7h 29m 13s master
Fix no-torch workflow and update real_accelerator
nv-torch-latest-v100 #12675: Pull request #6885 opened by loadams
December 17, 2024 22:25 6h 7m 2s loadams/fix-real-accelerator-no-torch
Adds ignore_index to sequence parallel cross entropy
nv-torch-latest-v100 #12674: Pull request #6882 synchronize by tjruwase
December 17, 2024 22:00 6h 0m 43s ronald-d-rogers:add-ignore-index-sp-loss
Zero2: avoid graph breaks in torch.compile by using param_idx
nv-torch-latest-v100 #12673: Pull request #6803 synchronize by loadams
December 17, 2024 20:22 6h 11m 49s nelyahu:zero2_param_idx
Add arctic model support by adding w2 to all_reduce
nv-torch-latest-v100 #12672: Pull request #6856 synchronize by loadams
December 17, 2024 19:58 5h 10m 31s pi314ever:arctic-enabling-upstream
Cleanup ops/transformer/inference tests
nv-torch-latest-v100 #12671: Pull request #6830 synchronize by loadams
December 17, 2024 19:55 6h 5m 22s loadams/transformers-inference
Inference ops unit test failures/fixes
nv-torch-latest-v100 #12670: Pull request #6879 synchronize by loadams
December 17, 2024 19:54 29m 1s loadams/inference-ops-test-repro
Update transformers ops unit tests to use required_torch_version
nv-torch-latest-v100 #12669: Pull request #6884 synchronize by loadams
December 17, 2024 18:22 1h 31m 13s loadams/fix-transformers-inference