Issues: NVIDIA/Megatron-LM
Issues list
[BUG] The logic for calculating the last stage when averaging loss across microbatches.
#1379
opened Feb 6, 2025 by
LitLeo
[ENHANCEMENT] add options how to choose topk devices for device_limited_topk
#1378
opened Feb 6, 2025 by
bzantium
[QUESTION] Support for Heterogeneous Parallelism in Multimodal Training
#1375
opened Feb 4, 2025 by
swiftomkar
[QUESTION] Backend nccl does not support reduce_scatter_tensor_coalesced, how could I solve it
#1369
opened Jan 30, 2025 by
TeddLi
[BUG] BERT and GPT345 Model Checkpoints Returning 410 Gone HTTP Response
#1367
opened Jan 28, 2025 by
GangGreenTemperTatum
[QUESTION] The dataset cannot be found in multi-node multi-GPU training.
#1355
opened Jan 13, 2025 by
stay88
[BUG] When trying to convert llama2-7b model from HF format to megatron format
#1348
opened Jan 6, 2025 by
Sun2018421
[QUESTION] How to convert the weight file format of the MAMBA model from pt to safetensors format?
#1339
opened Dec 26, 2024 by
fxnie