[metax] add glm result #466
Conversation
Please add sample configurations for these 2 models in test_conf.py.
training_event = None

max_samples_termination = 1388270 * 4
target_accuracy = 0.8
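The reviewer's request above — sample case entries for the two models in test_conf.py — might look like the following. This is a hypothetical sketch: the `CASES` key format (`model:framework:hardware:nnodes:nproc_per_node:repeat`) and the dataset paths are assumptions for illustration, not taken from this PR.

```python
# test_conf.py -- hypothetical sample case entries for the two models.
# Key format and dataset paths are placeholders, not copied from the repository.
CASES = {
    # "model:framework:hardware:nnodes:nproc_per_node:repeat": "dataset dir"
    "glm:pytorch:C500:1:8:1": "/path/to/glm/dataset",
    "bert_hf:pytorch:C500:1:8:1": "/path/to/bert/dataset",
}
```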
Not needed.
Many parameter values are duplicated across these 3 config files. Suggestion: move the common parameters into config_common.py and reference them from there.
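The refactor suggested above could be sketched as follows. Only the parameter names and values visible in the diff (`training_event`, `max_samples_termination`, `target_accuracy`) are real; the file layout and the per-case override shown in comments are hypothetical.

```python
# config_common.py -- shared defaults referenced by all three per-case configs
# (a sketch; only the parameter names/values from the diff are real)
training_event = None
max_samples_termination = 1388270 * 4   # total samples before forced termination
target_accuracy = 0.8                   # stop early once this accuracy is reached

# ---- config_A100x1x8.py (hypothetical per-case file) ----
# from config_common import *          # pull in the shared values
# train_batch_size = 8                 # then override only topology-specific params
```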
training/metax/glm-pytorch/README.md (Outdated)
| ------------------- | --------- | --------------- | -------- | ------- | ------- | ------ | ----- | --------- | ----- |
| C500 single node, 8 GPUs (1x8) | fp32 | / | | | | | 0.802 | 54.5/64.0 | |
| C500 single node, 1 GPU (1x1) | fp32 | / | | | | | / | 50.4/64.0 | |
| C500 two nodes, 16 GPUs (2x8) | fp32 | / | | | | | / | 29.8.0/64.0 | |
Please confirm: should 29.8.0 here be 29.8, or some other value?
It is 29.8; fixed.
Commits:
* add bert_hf result
* Update README.md 1
* add glm result
* [metax] Update glm README.md
No description provided.