[metax] add glm result #466
Conversation
Please add sample configurations for these 2 models in test_conf.py.
training_event = None

max_samples_termination = 1388270 * 4
target_accuracy = 0.8
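The reviewer's request above — sample case entries for the two models in test_conf.py — might look like the following. This is a hypothetical sketch: the `CASES` key format (`model:framework:hardware:nnodes:nproc_per_node:repeat`) and the dataset paths are assumptions for illustration, not taken from this PR.

```python
# test_conf.py -- hypothetical sample case entries for the two models.
# Key format and dataset paths are placeholders, not copied from the repository.
CASES = {
    # "model:framework:hardware:nnodes:nproc_per_node:repeat": "dataset dir"
    "glm:pytorch:C500:1:8:1": "/path/to/glm/dataset",
    "bert_hf:pytorch:C500:1:8:1": "/path/to/bert/dataset",
}
```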
Not needed.
Many parameter values are duplicated across these 3 config files. Suggestion: move the common parameters into config_common.py and reference them from there.
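The refactor suggested above could be sketched as follows. Only the parameter names and values visible in the diff (`training_event`, `max_samples_termination`, `target_accuracy`) are real; the file layout and the per-case override shown in comments are hypothetical.

```python
# config_common.py -- shared defaults referenced by all three per-case configs
# (a sketch; only the parameter names/values from the diff are real)
training_event = None
max_samples_termination = 1388270 * 4   # total samples before forced termination
target_accuracy = 0.8                   # stop early once this accuracy is reached

# ---- config_A100x1x8.py (hypothetical per-case file) ----
# from config_common import *          # pull in the shared values
# train_batch_size = 8                 # then override only topology-specific params
```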
training/metax/glm-pytorch/README.md (Outdated)
| ------------------- | --------- | --------------- | -------- | ------- | ------- | ------ | ----- | --------- | ----- |
| C500 single node, 8 GPUs (1x8) | fp32 | / | | | | | 0.802 | 54.5/64.0 | |
| C500 single node, 1 GPU (1x1) | fp32 | / | | | | | / | 50.4/64.0 | |
| C500 two nodes, 16 GPUs (2x8) | fp32 | / | | | | | / | 29.8.0/64.0 | |
Please confirm: should 29.8.0 here be 29.8, or some other value?
It is 29.8; fixed.
Commits:
* add bert_hf result
* Update README.md 1
* add glm result
* [metax] Update glm README.md
No description provided.