Pull requests: NVIDIA/Megatron-LM

Pull requests list

fix bugs of data preprocessing with multiple json keys
#1337 opened Dec 25, 2024 by junjzhang
Create python-package.yml
#1332 opened Dec 21, 2024 by invisiblepancake
Add Mamba TRTLLM support
#1320 opened Dec 12, 2024 by meatybobby
update network interface env
#1319 opened Dec 12, 2024 by lizamd
[Update] Print training log in rank0
#1296 opened Nov 21, 2024 by shijungg
support qwen2 hf<->mcore ckpt converter
#1290 opened Nov 19, 2024 by wenyujin333
Set torch.multiprocessing start method as 'spawn'
#1285 opened Nov 12, 2024 by hxdtest
Huvu/update t5 attentionmasktype
#1273 opened Nov 4, 2024 by huvunvidia
Update t5_model.py
#1271 opened Nov 2, 2024 by huvunvidia
Enable huggingface tokenizer [stale: No activity in 60 days on issue or PR]
#1268 opened Oct 30, 2024 by msiddaiah
fix: remove unnecessary trailing comma in statement [stale]
#1265 opened Oct 29, 2024 by singleheart
Add support to process gzip files [stale]
#1260 opened Oct 28, 2024 by puneeshkhanna
[Wrong spelling] Update training.py [stale]
#1229 opened Oct 21, 2024 by zyqhnu
Typo fix in readme [stale]
#1223 opened Oct 17, 2024 by alexchen4ai
support qwen2 and siglip weight conversion script to enable training … [stale]
#1221 opened Oct 16, 2024 by tao-githup
readme spelling correction [stale]
#1216 opened Oct 13, 2024 by jonassteinberg1
[Functions] Support Packed_seq_params in Megatron-LM [stale]
#1215 opened Oct 12, 2024 by Baibaifan
Embedding [stale]
#1209 opened Oct 10, 2024 by rachitgarg91
Dev/optimizer offloading
#1205 opened Oct 10, 2024 by lostkevin