add kunlun llama70b case #462

Closed · ZLkanyo009 wants to merge 2 commits

Conversation

ZLkanyo009 (Contributor)

No description provided.

@@ -132,7 +132,9 @@ LOGGING_ARGS="
 "
 
 source $VENDOR_SHELL
-cmd="torchrun $DISTRIBUTED_ARGS /workspace/FlagScale/pretrain_llama.py \
+CODE_PATH="/workspace/FlagScale/pretrain_llama.py"
Collaborator:

Please move `source $VENDOR_SHELL` after the `CODE_PATH` assignment here.
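A minimal sketch of the requested ordering, with the surrounding script lines elided (the trailing torchrun arguments are unchanged from the original script and not reproduced here):

```bash
# Define CODE_PATH first, then source the vendor hook, presumably so the
# vendor shell runs with CODE_PATH already set.
CODE_PATH="/workspace/FlagScale/pretrain_llama.py"
source $VENDOR_SHELL
cmd="torchrun $DISTRIBUTED_ARGS $CODE_PATH \
    ..."   # remaining torchrun arguments as in the original script (elided)
```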

| R300 ten-node 32-card (10x8) | amp | TP8PP4DP1 | / | / | / | / | / |
| R300 ten-node 32-card (10x8) | amp | TP4PP8DP1 | / | / | / | 21/32 | / |
| R300 ten-node 32-card (10x8) | amp | TP4PP8DP1 | GAS=1024 (GBS=1024 = 4M tokens) | / | doing | 21/32 | / |
Due to the lack of R300 machines, accuracy was preliminarily verified on a single R300 card and a single GPU.
@shh2000 (Collaborator) commented on Feb 28, 2024:


*doing: due to the lack of blabla, accuracy has for now been verified on a single R300 card and a single GPU by reducing the number of model layers; accuracy verification of the full 70B model is in progress.* Write it up along these lines, then change the "doing" entries in the table above to "doing*".
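For reference, the reduced-depth single-card check described above might look like the sketch below; the flags are Megatron-style assumptions (FlagScale's pretrain_llama.py is Megatron-based), the values are made up, and the required data/tokenizer arguments are omitted, so this is illustrative rather than the PR's actual command:

```bash
# Hypothetical single-card smoke run with the layer count cut down so the
# 70B configuration fits on one R300 / one GPU for an accuracy comparison.
torchrun --nproc_per_node=1 /workspace/FlagScale/pretrain_llama.py \
    --num-layers 4 \
    --hidden-size 8192 \
    --num-attention-heads 64 \
    --micro-batch-size 1
```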

batchsize = 1
accumulate_steps = 44
train_tokens = 100000000          # 1e8 tokens
theoryflops = 495000000000000.0   # 4.95e14 FLOPS = 495 TFLOPS
Collaborator:

Is the Kunlunxin theoretical peak compute really 495 TFLOPS?
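Since `theoryflops` presumably feeds the benchmark's efficiency figure, the value is worth pinning down. A back-of-envelope check under the common ~6·N FLOPs-per-token estimate for dense models (the throughput below is a made-up placeholder, not a measured number):

```bash
# Rough MFU sanity check; every number except the 70B parameter count is an
# assumption for illustration.
awk 'BEGIN {
    tokens_per_sec  = 100         # hypothetical per-chip training throughput
    flops_per_token = 6 * 70e9    # ~6 * n_params FLOPs/token (fwd + bwd), dense 70B
    theoryflops     = 495e12      # the 495 TFLOPS value questioned above
    printf "MFU ~ %.1f%%\n", 100 * tokens_per_sec * flops_per_token / theoryflops
}'
```

With these placeholder numbers the check prints `MFU ~ 8.5%`.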

ZLkanyo009 closed this by deleting the head repository on Feb 29, 2024.