【mthreads】【block】resnet50 training #246
Conversation
training/mthreads/README.md
Outdated
Moore Threads MTT S-series full-function GPUs support diverse workloads. Backed by the complete MUSA software stack, which covers deep learning, graphics rendering, video processing, and scientific computing, they provide general-purpose intelligent compute for AI training, AI inference, large models, AIGC, cloud gaming, cloud rendering, video cloud, and digital-twin scenarios, building a solid compute foundation for data centers, intelligent computing centers, and meta-computing centers, and helping diverse metaverse applications innovate and land.

The MUSA software stack achieves compatibility with the CUDA ecosystem through the musify CUDA code migration tool, compute/communication acceleration libraries, the mcc compiler, and the musa runtime and driver, helping users migrate code and applications quickly. Through the torch_musa plugin, MTT S-series GPUs can be hooked up to native PyTroch, so users can transparently run AI models on Moore Threads full-function GPUs.
typo Pytroch
fixed
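To illustrate the torch_musa integration the quoted README describes, here is a minimal sketch, assuming torch_musa is installed and registers a "musa" device with PyTorch; the model and tensor shapes are arbitrary, not from this PR:

```python
import torch
import torch_musa  # importing the plugin registers the "musa" backend with PyTorch

# Existing PyTorch code only needs the device string changed from "cuda"
# to "musa" to run on an MTT S-series GPU.
device = torch.device("musa")

model = torch.nn.Linear(1024, 1000).to(device)
x = torch.randn(32, 1024, device=device)

out = model(x)    # forward pass executes on the Moore Threads GPU
print(out.shape)  # torch.Size([32, 1000])
```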
@@ -0,0 +1,5 @@
lr = 0.1
train_batch_size = 32
eval_batch_size = 32
Question: why is the batch size reduced by a factor of 8 here? Does a local_batchsize of 32 give better performance on 8 cards?
This was copied from the nvidia config file; we have not run a full training yet.
Please prioritize supporting 2x8 where possible, i.e. 1x1 / 1x8 / 2x8.
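For reference on the scaling question above, the effective global batch size under data parallelism is the local batch size times GPUs per node times nodes; a quick sketch of the arithmetic for the three requested configs (variable names are illustrative, not from the repo):

```python
# Global batch size under data parallelism:
#   global_batch = local_batch * gpus_per_node * num_nodes
local_batch = 32

for num_nodes, gpus_per_node in [(1, 1), (1, 8), (2, 8)]:  # 1x1 / 1x8 / 2x8
    global_batch = local_batch * gpus_per_node * num_nodes
    print(f"{num_nodes}x{gpus_per_node}: global batch = {global_batch}")

# Prints 32, 256, and 512 respectively.
```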
Force-pushed from b29f5a1 to 61a64b2
Force-pushed from 4f23f28 to 9b4f983
Force-pushed from 9b4f983 to 952b243
training/mthreads/README.md
Outdated

## Environment configuration reference
- Hardware
  - Machine model: MCCX D800
  - Accelerator model: MTT S3000 32GB
In the submitted config, the accelerator is an S4000.
# We will run benchmarks in training/<vendor>
-VENDOR = "nvidia"
+VENDOR = "mthreads"
No change needed here.
This is just for testing convenience; it will be reverted later.
Please add a get_sys_info method in this file to collect basic machine information and record it to sys_info.log.
For reference, see: https://github.com/FlagOpen/FlagPerf/blob/main/training/iluvatar/iluvatar_monitor.py
Added.
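For illustration, a minimal sketch of what such a monitor helper might look like; the command list and log path are assumptions modeled on the linked iluvatar_monitor.py pattern, not the actual mthreads implementation (the commit log names the function get_system_info):

```python
import subprocess


def get_system_info(log_path="sys_info.log"):
    """Collect basic machine information and append it to sys_info.log.

    The commands below are illustrative; a vendor monitor would typically
    also query its own SMI-style tool for accelerator details.
    """
    cmds = [
        "hostname",
        "uname -a",   # kernel version and architecture
        "lscpu",      # CPU model and core counts
        "free -h",    # memory
        "df -h",      # disk usage
    ]
    with open(log_path, "a") as f:
        for cmd in cmds:
            f.write(f"\n$ {cmd}\n")
            try:
                out = subprocess.check_output(
                    cmd, shell=True, text=True, stderr=subprocess.STDOUT
                )
                f.write(out)
            except subprocess.CalledProcessError as e:
                f.write(f"command failed with code {e.returncode}\n")
```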
* [kunlunxin] fix tacotron2 running error and add 1x1 & 2x8 config (#346)
  * [kunlunxin] fix tacotron2 running error and add 1x1 & 2x8 config
  * [kunlunxin] modify tacotron2 test_config
  * [kunlunxin] update tacotron2 readme
  * [kunlunxin] modify tacotron2 torch.load()
* [iluvatar] swin_transformer-pytorch 1x1 2x8 (#340)
  * update iluvatar/swin_transformer-pytorch
  * update
  * update
  * update
  * fix batch size mistake in readme
  * correct val_loss to final acc1
  * add finnal_acc1 and mem in readme
  * correct readme mem
  Co-authored-by: 魏杰 <[email protected]>
  Co-authored-by: 杨智超 <[email protected]>
  Co-authored-by: clveryang <[email protected]>
* fix get_system_info for iluvatar_monitor (#351)
  Co-authored-by: zhouyu <[email protected]>
* update iluvatar mobilenetv2 config (#356)
  Co-authored-by: sen.li <[email protected]>
* Update README.md (#357)
  * Update README.md
  * Update README.md
* [iluvatar] bertlarge inference case (#353)
  * iluvatar bertlarge MLM inference case
  * update ixrt readme
  Co-authored-by: 杨智超 <[email protected]>
* [mthreads] bert_hf 1x8 (#350)
  * support bert_hf fp32/amp/bf16 training for mthreads
  * update readme
  * prevent overrun
  * 1x1/2x8 not support
* 【mthreads】【block】resnet50 training (#246)
  * support resnet50 training on mthreads
  * fix typo
  * support rn50 amp training on mthreads
  * add test config (should revert this commit)
  * update config & readme
  * add get_system_info fn
  * update
  * 1x1/2x8 not support
  Co-authored-by: Zhou Yu <[email protected]>
* fix llama, add TFLOPS log (#358)
  * fixllama
  * add t/tflops
* [mthreads] deepspeed llama2
* update readme for sdpa

Co-authored-by: jamesruio <[email protected]>
Co-authored-by: swish swish <[email protected]>
Co-authored-by: 魏杰 <[email protected]>
Co-authored-by: 杨智超 <[email protected]>
Co-authored-by: clveryang <[email protected]>
Co-authored-by: Zhou Yu <[email protected]>
Co-authored-by: zhouyu <[email protected]>
Co-authored-by: forestlee95 <[email protected]>
Co-authored-by: sen.li <[email protected]>
Co-authored-by: uuup <[email protected]>
Co-authored-by: clveryang <[email protected]>
Co-authored-by: mingyuanw-mt <[email protected]>
Co-authored-by: shh2000 <[email protected]>
No description provided.