Issues: Tencent/TencentPretrain
Issues list
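Is the maximum input sequence length supported by the Chinese models here 512 tokens? Will input longer than 512 tokens be truncated? Is it possible to increase the number of position embeddings during fine-tuning?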
#130
opened Jul 21, 2024 by
chengzi-big
Pretraining LLAMA-7B on a single machine with 2 GPUs fails with TypeError: an integer is required (got type NoneType)
#112
opened Nov 29, 2023 by
smallYellowCat
DeepSpeedZeRoOffload initialization failed (can't allocate memory)
#70
opened May 25, 2023 by
treya-lin
pretrain.py of llama-7b model, Exception: Current loss scale already at minimum
#57
opened Apr 26, 2023 by
liukaiyueyuo
When pretraining LLaMA-7B on a single machine with 8 A100-80G GPUs, with or without DeepSpeed ZeRO-3, the GPUs are not fully utilized
#56
opened Apr 25, 2023 by
ShadowTeamCN
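When using my own Chinese data, what format does the data need to be converted to for preprocess? Is there any documentation on this part? The goal is incremental pretraining of LLaMA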
#40
opened Apr 10, 2023 by
baketbek