Pull requests: NVIDIA/Megatron-LM
fix bugs of data preprocessing with multiple json keys
#1337
opened Dec 25, 2024 by
junjzhang
Fix: prevent double accumulation of load balancing loss and z-loss wi…
#1331
opened Dec 20, 2024 by
thuwzt
fix args.mock_data bug caused by func get_blend_and_blend_per_split
#1306
opened Nov 29, 2024 by
1195343015
Fix: Resolve multimodal model errors and update README usage instructions
#1286
opened Nov 13, 2024 by
singleheart
Fix a bug in optimizer's mix_lr/max_lr when args.override_opt_param_scheduler==True
#1284
opened Nov 12, 2024 by
lyuwen
Enable huggingface tokenizer
stale
No activity in 60 days on issue or PR
#1268
opened Oct 30, 2024 by
msiddaiah
fix: remove unnecessary trailing comma in statement
stale
#1265
opened Oct 29, 2024 by
singleheart
Enabling LR scaling for a specific layer (ex. down-projection...) during pretraining
#1262
opened Oct 28, 2024 by
dhia680
[ENHANCEMENT] Add support for Apex RMSNorm for use in qk-norm
#1261
opened Oct 28, 2024 by
wdevazelhes
Add support to process gzip files
stale
#1260
opened Oct 28, 2024 by
puneeshkhanna
[Wrong spelling] Update training.py
stale
#1229
opened Oct 21, 2024 by
zyqhnu
Typo fix in readme
stale
#1223
opened Oct 17, 2024 by
alexchen4ai
support qwen2 and siglip weight conversion script to enable training …
stale
#1221
opened Oct 16, 2024 by
tao-githup
readme spelling correction
stale
#1216
opened Oct 13, 2024 by
jonassteinberg1
[Functions] Support Packed_seq_params in Megatron-LM
stale
#1215
opened Oct 12, 2024 by
Baibaifan
Embedding
stale
#1209
opened Oct 10, 2024 by
rachitgarg91