
【mthreads】【block】resnet50 training #246

Merged
merged 9 commits into from
Dec 12, 2023

Conversation

mingyuanw-mt
Contributor

No description provided.


The Moore Threads MTT S-series full-featured GPUs provide versatile compute. Backed by the complete MUSA software stack, which covers deep learning, graphics rendering, video processing, and scientific computing, they deliver general-purpose intelligent compute for AI training, AI inference, large models, AIGC, cloud gaming, cloud rendering, video cloud, digital twins, and other scenarios, aiming to lay a solid compute foundation for data centers, intelligent computing centers, and metacomputing centers, and to support diverse application innovation in the metaverse.

The MUSA software stack is compatible with the CUDA ecosystem through the musify CUDA code-migration tool, compute/communication acceleration libraries, the mcc compiler, the MUSA runtime, and the driver, helping users migrate code and applications quickly. Through the torch_musa plugin, MTT S-series GPUs integrate with native PyTorch, so users can run AI models on Moore Threads full-featured GPUs transparently.
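The torch_musa integration described above can be sketched as a small device-selection helper. This is a minimal illustration, assuming (as the description states) that importing the torch_musa plugin registers the "musa" device type with PyTorch; the fallback-to-CPU logic is an assumption for portability, not part of the PR.

```python
def pick_device():
    """Return "musa" when torch_musa is importable, else fall back to CPU.

    Assumes importing torch_musa registers the "musa" backend with PyTorch,
    per the MUSA stack description above.
    """
    try:
        import torch  # noqa: F401
        import torch_musa  # noqa: F401  # registers the "musa" device type
        return "musa"
    except ImportError:
        return "cpu"


device = pick_device()
# A model would then be moved with e.g. model.to(device) before training.
```

On a machine without the MUSA stack installed this simply returns "cpu", so the same training script can run unmodified on either backend.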
Collaborator


typo Pytroch

Contributor Author


fixed

@@ -0,0 +1,5 @@
lr = 0.1
train_batch_size = 32
eval_batch_size = 32
Collaborator


Could you explain why the batch size was reduced by 8x here? Does a local batch size of 32 give better performance on eight cards?

Contributor Author


This was copied from the NVIDIA config file; we haven't run the full training yet.
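The question above is really about how the learning rate and local batch size relate across card counts. One common convention (the linear scaling rule; an assumption here, not something this PR states it uses) ties the learning rate to the global batch size, which is why copying `lr` and `train_batch_size` between vendors with different card counts needs care:

```python
def linear_scaled_lr(base_lr, base_global_batch, num_gpus, local_batch):
    """Linear LR scaling rule (a common heuristic, assumed for illustration):
    scale the learning rate in proportion to the global batch size."""
    global_batch = num_gpus * local_batch
    return base_lr * global_batch / base_global_batch


# With the config above (lr = 0.1, local batch 32) on a 1x8 machine,
# the global batch is 8 * 32 = 256; keeping lr = 0.1 implicitly assumes
# the base lr of 0.1 was tuned for a global batch of 256.
lr_1x8 = linear_scaled_lr(0.1, 256, 8, 32)
lr_2x8 = linear_scaled_lr(0.1, 256, 16, 32)  # twice the global batch
```

Under this rule a 2x8 run at the same local batch size would double the learning rate; whether that is appropriate for this config is exactly what a full training run would verify.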

Collaborator

@shh2000 shh2000 Sep 12, 2023


Please prioritize supporting 2x8 where possible, i.e. 1x1 / 1x8 / 2x8.

@yuzhou03 yuzhou03 self-assigned this Sep 14, 2023
@upvenly upvenly changed the title mthreads resnet50 training 【block】mthreads resnet50 training Oct 8, 2023
@shh2000 shh2000 changed the title 【block】mthreads resnet50 training 【mthreads】【block】mthreads resnet50 training Nov 3, 2023
@shh2000 shh2000 changed the title 【mthreads】【block】mthreads resnet50 training 【mthreads】【block】resnet50 training Nov 3, 2023
@mingyuanw-mt mingyuanw-mt force-pushed the add_musa_resnet branch 2 times, most recently from 4f23f28 to 9b4f983 Compare December 6, 2023 05:55
## Environment configuration reference
- Hardware
  - Machine model: MCCX D800
  - Accelerator model: MTT S3000 32GB
Contributor


In the submitted config, the accelerator is an S4000.

# We will run benchmarks in training/<vendor>
VENDOR = "nvidia"
VENDOR = "mthreads"
Contributor


No changes needed here.

Contributor Author


This is here for testing convenience; it will be reverted later.

Contributor


Please add a get_sys_info method in this file that collects basic machine information and records it to sys_info.log.
For reference: https://github.com/FlagOpen/FlagPerf/blob/main/training/iluvatar/iluvatar_monitor.py

Contributor Author


Added.

@upvenly upvenly merged commit 0b03ff5 into FlagOpen:main Dec 12, 2023
shh2000 added a commit that referenced this pull request Dec 21, 2023
* [kunlunxin] fix tacotron2 running error and add 1x1 & 2x8 config (#346)

* [kunlunxin] fix tacotron2 running error and add 1x1 & 2x8 config

* [kunlunxin] modify tacotron2 test_config

* [kunlunxin] update tacotron2 readme

* [kunlunxin] modify tacotron2 torch.load()

* [iluvatar] swin_transformer-pytorch 1x1 2x8 (#340)

* update iluvatar/swin_transformer-pytorch

* update

* update

* update

* fix batch size mistake in readme

* correct val_loss to final acc1

* add finnal_acc1 and mem in readme

* correct readme mem

---------

Co-authored-by: 魏杰 <[email protected]>
Co-authored-by: 杨智超 <[email protected]>
Co-authored-by: clveryang <[email protected]>

* fix get_system_info for iluvatar_monitor (#351)

Co-authored-by: zhouyu <[email protected]>

* update iluvatar mobilenetv2 config (#356)

Co-authored-by: sen.li <[email protected]>

* Update README.md (#357)

* Update README.md

* Update README.md

* [iluvatar] bertlarge inference case (#353)

* iluvatar bertlarge MLM inference case

* update ixrt readme

---------

Co-authored-by: 杨智超 <[email protected]>

* [mthreads] bert_hf 1x8 (#350)

* support bert_hf fp32/amp/bf16 training for mthreads

* update readme

* prevent overrun

* 1x1/2x8 not support

* 【mthreads】【block】resnet50 training (#246)

* support resnet50 training on mthreads

* fix typo

* support rn50 amp training on mthreads

* add test config (should revert this commit)

* update config & readme

* add get_system_info fn

* update

* 1x1/2x8 not support

---------

Co-authored-by: Zhou Yu <[email protected]>

* fix llama, add TFLOPS log (#358)

* fixllama

* add t/tflops

* [mthreads] deepspeed llama2

* update readme for sdpa

---------

Co-authored-by: jamesruio <[email protected]>
Co-authored-by: swish swish <[email protected]>
Co-authored-by: 魏杰 <[email protected]>
Co-authored-by: 杨智超 <[email protected]>
Co-authored-by: clveryang <[email protected]>
Co-authored-by: Zhou Yu <[email protected]>
Co-authored-by: zhouyu <[email protected]>
Co-authored-by: forestlee95 <[email protected]>
Co-authored-by: sen.li <[email protected]>
Co-authored-by: uuup <[email protected]>
Co-authored-by: clveryang <[email protected]>
Co-authored-by: mingyuanw-mt <[email protected]>
Co-authored-by: shh2000 <[email protected]>