Releases: InternLM/xtuner
v0.2.0rc0
What's Changed
- Support FSDP2 (see the sketch after this list)
- Support Contiguous Batching for RLHF
- Add MiniCPM support to the README by @LDLINGLINGLING in #869
- [Bug] fix dsv2 attn dispatch (softmax_scale) by @HIT-cwh in #873
- [Bug] fix openai_map_fn bugs by @HIT-cwh in #885
- support transformers >= 4.43 by @HIT-cwh in #878
- Add InternLM2.5 configs by @HIT-cwh in #872
- [Bugs] fix qlora convert bugs by @HIT-cwh in #930
- Add support for MiniCPM3 by @LDLINGLINGLING in #954
- Add functionality to download models from sources other than HuggingFace by @starmountain1997 in #946
- Add Ascend NPU as a backend by @Tonyztj in #983
- Support transformers 4.48 by @HIT-cwh in #985
- [Feature] Auto patch for different devices by @pppppM in #986
- [Fix]MLU Device Mesh by @pppppM in #987
- bump version to v0.2.0rc0 by @pppppM in #990
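The FSDP2 item above refers to PyTorch's per-module sharding API. The snippet below is a minimal standalone sketch of that API, not XTuner's actual integration; it assumes PyTorch >= 2.6 (where `fully_shard` is public) and uses a toy transformer in place of a real LLM.

```python
# Minimal FSDP2 sketch (illustrative, not XTuner's integration code).
# Launch with: torchrun --nproc-per-node=8 fsdp2_sketch.py
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import MixedPrecisionPolicy, fully_shard

dist.init_process_group("nccl")
torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))

model = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=1024, nhead=16, batch_first=True),
    num_layers=8,
).cuda()

# Shard each block first, then the root module; parameters are gathered in bf16
# for compute and gradients are reduced in fp32, a common mixed-precision setup.
mp = MixedPrecisionPolicy(param_dtype=torch.bfloat16, reduce_dtype=torch.float32)
for layer in model.layers:
    fully_shard(layer, mp_policy=mp)
fully_shard(model, mp_policy=mp)

x = torch.randn(2, 128, 1024, device="cuda")
model(x).sum().backward()
```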
New Contributors
- @starmountain1997 made their first contribution in #946
- @Tonyztj made their first contribution in #983
Full Changelog: v0.1.23...v0.2.0rc0
XTuner Release V0.1.23
What's Changed
- Support InternVL 1.5/2.0 finetune by @hhaAndroid in #737
- [Bug] fix preference_collate_fn attn_mask by @HIT-cwh in #859
- bump version to 0.1.23 by @HIT-cwh in #862
Full Changelog: v0.1.22...v0.1.23
XTuner Release V0.1.22
What's Changed
- [Refactor] fix internlm2 dispatch by @HIT-cwh in #779
- Fix zero3 compatibility issue for DPO by @Johnson-Wang in #781
- [Fix] Fix map_fn in custom_dataset/sft by @fanqiNO1 in #785
- [Fix] fix configs by @HIT-cwh in #783
- [Docs] DPO and Reward Model documents by @RangiLyu in #751
- Support internlm2.5 by @HIT-cwh in #803
- [Bugs] fix dispatch bugs when model not in LOWEST_TRANSFORMERS_VERSION by @HIT-cwh in #802
- [Docs] fix benchmark table by @HIT-cwh in #801
- [Feature] support output without loss in openai_map_fn by @HIT-cwh in #816 (see the sketch after this list)
- [Docs] fix typos in sp docs by @HIT-cwh in #821
- [Feature] Support the DatasetInfoHook of DPO training by @xu-song in #787
- [Enhance]: Fix sequence parallel memory bottleneck in DPO & ORPO by @RangiLyu in #830
- [Fix] Fix typo by @bychen7 in #795
- [Fix] fix initialization of ref_llm for full param dpo training with zero-3 by @xu-song in #778
- [Bugs] Fix attn mask by @HIT-cwh in #852
- fix lint by @HIT-cwh in #854
- [Bugs] Fix dispatch attn bug by @HIT-cwh in #829
- [Docs]: update readme and DPO en docs by @RangiLyu in #853
- Added MiniCPM config files to support SFT, QLoRA, LoRA, and DPO by @LDLINGLINGLING in #847
- fix lint by @HIT-cwh in #856
- bump version to 0.1.22 by @HIT-cwh in #855
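To illustrate the "output without loss" support in `openai_map_fn` (#816), here is a sample record in the OpenAI-style chat format. The per-message `loss` key shown here is an assumed field name used only to convey the idea of keeping a turn in the prompt while excluding it from supervision; check the XTuner dataset docs for the exact schema.

```python
# Illustrative OpenAI-style chat record for openai_map_fn. The per-message
# "loss" key is an ASSUMED field name: turns marked loss=False would stay in
# the context but not contribute to the training loss.
sample = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize sequence parallelism in one line."},
        {"role": "assistant", "content": "It shards long sequences across GPUs.", "loss": True},
        {"role": "user", "content": "Thanks!"},
        {"role": "assistant", "content": "You're welcome!", "loss": False},
    ]
}
```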
New Contributors
- @Johnson-Wang made their first contribution in #781
- @xu-song made their first contribution in #787
- @bychen7 made their first contribution in #795
- @LDLINGLINGLING made their first contribution in #847
Full Changelog: v0.1.21...v0.1.22
XTuner Release V0.1.21
What's Changed
- [Feature] Support DPO, ORPO and Reward Model by @RangiLyu in #743 (see the sketch after this list)
- [Bugs] fix dispatch bugs by @HIT-cwh in #775
- [Bugs] Fix HFCheckpointHook bugs when training deepseekv2 and mixtral withou… by @HIT-cwh in #774
- [Feature] Support the scenario where sp size is not divisible by attn head num by @HIT-cwh in #769
- bump version to 0.1.21 by @HIT-cwh in #776
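For orientation on the DPO support introduced in #743, the snippet below writes out the standard (textbook) DPO objective. It is shown only to make the feature concrete and is not claimed to match XTuner's exact implementation, which also covers ORPO and reward modeling.

```python
# Textbook DPO loss: widen the margin by which the policy prefers the chosen
# response over the rejected one, measured relative to a frozen reference model.
# Inputs are summed log-probabilities of each response under each model.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

loss = dpo_loss(torch.tensor([-12.3]), torch.tensor([-15.1]),
                torch.tensor([-13.0]), torch.tensor([-14.8]))
```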
Full Changelog: v0.1.20...v0.1.21
XTuner Release V0.1.20
What's Changed
- [Enhancement] Optimizing Memory Usage during ZeRO Checkpoint Convert by @pppppM in #582
- [Fix] ZeRO2 Checkpoint Convert Bug by @pppppM in #684
- [Feature] support auto saving tokenizer by @HIT-cwh in #696
- [Bug] fix internlm2 flash attn by @HIT-cwh in #693
- [Bug] The LoRA model will have meta-tensor during the `pth_to_hf` phase. by @pppppM in #697 (see the sketch after this list)
- [Bug] fix cfg check by @HIT-cwh in #729
- [Bugs] Fix bugs caused by sequence parallel when deepspeed is not used. by @HIT-cwh in #752
- [Fix] Avoid incorrect `torchrun` invocation with `--launcher slurm` by @LZHgrla in #728
- [fix] fix save eval result failed with multi-node pretrain by @HoBeedzc in #678
- [Improve] Support the export of various LLaVA formats with `pth_to_hf` by @LZHgrla in #708
- [Refactor] refactor dispatch_modules by @HIT-cwh in #731
- [Docs] Readthedocs ZH by @pppppM in #553
- [Feature] Support finetune Deepseek v2 by @HIT-cwh in #663
- bump version to 0.1.20 by @HIT-cwh in #766
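The `pth_to_hf` meta-tensor fix (#697, flagged above) is easier to follow with a small illustration of what a meta tensor is: parameters created on the `meta` device have shapes but no storage, so they cannot be serialized until real weights are loaded in. The snippet below is illustrative only and uses `accelerate` with a small HF model, not XTuner's conversion code.

```python
# Illustration only: parameters created under accelerate's init_empty_weights
# live on the "meta" device -- they have shapes but no data, so attempting to
# save them before real weights are loaded back in cannot work.
from accelerate import init_empty_weights
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("gpt2")
with init_empty_weights():
    model = AutoModelForCausalLM.from_config(config)

param = next(model.parameters())
print(param.is_meta, tuple(param.shape))  # True, shape known, but no storage yet
```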
New Contributors
- @HoBeedzc made their first contribution in #678
Full Changelog: v0.1.19...v0.1.20
XTuner Release V0.1.19
What's Changed
- [Fix] LLaVA-v1.5 official settings by @LZHgrla in #594
- [Feature] Release LLaVA-Llama-3-8B by @LZHgrla in #595
- [Improve] Add single-gpu configs for LLaVA-Llama-3-8B by @LZHgrla in #596
- [Docs] Add wisemodel badge by @LZHgrla in #597
- [Feature] Support load_json_file with json.load by @HIT-cwh in #610
- [Feature] Support Microsoft Phi-3 4K & 128K Instruct Models by @pppppM in #603
- [Fix] set `dataloader_num_workers=4` for llava training by @LZHgrla in #611
- [Fix] Do not set attn_implementation to flash_attention_2 or sdpa if users already set it in XTuner configs. by @HIT-cwh in #609
- [Release] LLaVA-Phi-3-mini by @LZHgrla in #615
- Update README.md by @eltociear in #608
- [Feature] Refine sp api by @HIT-cwh in #619
- [Feature] Add conversion scripts for LLaVA-Llama-3-8B by @LZHgrla in #618
- [Fix] Convert nan to 0 just for logging by @HIT-cwh in #625
- [Docs] Delete colab and add speed benchmark by @HIT-cwh in #617
- [Feature] Support DeepSpeed ZeRO-3 + QLoRA by @HIT-cwh in #600 (see the sketch after this list)
- [Feature] Add qwen1.5 110b cfgs by @HIT-cwh in #632
- check transformers version before dispatch by @HIT-cwh in #672
- [Fix] `convert_xtuner_weights_to_hf` with frozen ViT by @LZHgrla in #661
- [Fix] Fix batch-size setting of single-card LLaVA-Llama-3-8B configs by @LZHgrla in #598
- [Feature] add HFCheckpointHook to auto save hf model after the whole training phase by @HIT-cwh in #621
- Remove test info in DatasetInfoHook by @hhaAndroid in #622
- [Improve] Support `safe_serialization` saving by @LZHgrla in #648
- bump version to 0.1.19 by @HIT-cwh in #675
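As a companion to the DeepSpeed ZeRO-3 + QLoRA support (#600, flagged above), the snippet below sketches the QLoRA ingredients on the Hugging Face side: a 4-bit `BitsAndBytesConfig` for the frozen base weights plus a `LoraConfig` for the trainable adapters. It is a generic sketch, not an XTuner config; the model id and hyperparameters are placeholders, and the ZeRO-3 part is handled by the training launcher rather than this code.

```python
# Generic QLoRA setup sketch (NOT an XTuner config): quantize the frozen base
# model to 4-bit NF4 and attach small trainable LoRA adapters. Model id and
# hyperparameters are placeholders.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_cfg = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
base = AutoModelForCausalLM.from_pretrained(
    "internlm/internlm2-chat-7b",          # placeholder model id
    quantization_config=quant_cfg,
    trust_remote_code=True,
)
lora_cfg = LoraConfig(r=64, lora_alpha=16, lora_dropout=0.1,
                      bias="none", task_type="CAUSAL_LM")
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()         # only the LoRA weights require grad
```

Relatedly, the `safe_serialization` item above (#648) concerns exporting weights in the safetensors format, which on the Hugging Face side corresponds to `model.save_pretrained(out_dir, safe_serialization=True)`.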
New Contributors
- @eltociear made their first contribution in #608
Full Changelog: v0.1.18...v0.1.19
XTuner Release V0.1.18
What's Changed
- set dev version by @LZHgrla in #537
- [Fix] Fix typo by @KooSung in #547
- [Feature] support mixtral varlen attn by @HIT-cwh in #564
- [Feature] Support qwen sp and varlen attn by @HIT-cwh in #565
- [Fix] Fix attention mask in `default_collate_fn` by @pppppM in #567
- Accept pytorch==2.2 as the bugs in triton 2.2 are fixed by @HIT-cwh in #548
- [Feature] Refine Sequence Parallel API by @HIT-cwh in #555
- [Fix] Enhance `split_list` to support `value` at the beginning by @LZHgrla in #568
- [Feature] Support cohere by @HIT-cwh in #569
- [Fix] Fix rotary_seq_len in varlen attn in qwen by @HIT-cwh in #574
- [Docs] Add sequence parallel related to readme by @HIT-cwh in #578
- [Bug] SUPPORT_FLASH1 = digit_version(torch.__version__) >= digit_version('2… by @HIT-cwh in #587 (see the sketch after this list)
- [Feature] Support Llama 3 by @LZHgrla in #585
- [Docs] Add llama3 8B readme by @HIT-cwh in #588
- [Bugs] Check whether cuda is available when choosing torch_dtype in sft.py by @HIT-cwh in #577
- [Bugs] fix bugs in tokenize_ftdp_datasets by @HIT-cwh in #581
- [Feature] Support qwen moe by @HIT-cwh in #579
- [Docs] Add tokenizer to sft in Case 2 by @HIT-cwh in #583
- bump version to 0.1.18 by @HIT-cwh in #590
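Several of the fixes above, such as #587, hinge on version gating with `digit_version`. The sketch below shows that general pattern using `mmengine`'s helper; the threshold values are illustrative, not the exact ones used in XTuner.

```python
# Version-gating pattern (illustrative thresholds): only enable a fast attention
# path when the installed torch / transformers versions are known to support it.
import torch
import transformers
from mmengine.utils import digit_version

SUPPORT_FLASH_ATTN = digit_version(torch.__version__) >= digit_version("2.2.0")
SUPPORT_NEW_CACHE = digit_version(transformers.__version__) >= digit_version("4.36.0")

if SUPPORT_FLASH_ATTN and torch.cuda.is_available():
    attn_implementation = "flash_attention_2"
else:
    attn_implementation = "eager"
print(attn_implementation, SUPPORT_NEW_CACHE)
```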
Full Changelog: v0.1.17...v0.1.18
XTuner Release V0.1.17
What's Changed
- [Fix] Fix PyPI package by @LZHgrla in #540
- [Improve] Add LoRA fine-tuning configs for LLaVA-v1.5 by @LZHgrla in #536
- [Configs] Add sequence_parallel_size and SequenceParallelSampler to configs by @HIT-cwh in #538
- Check shape of attn_mask during attn forward by @HIT-cwh in #543
- bump version to v0.1.17 by @LZHgrla in #542
Full Changelog: v0.1.16...v0.1.17
XTuner Release V0.1.16
What's Changed
- set dev version by @LZHgrla in #487
- Fix type error when the visual encoder is not CLIP by @hhaAndroid in #496
- [Feature] Support Sequence parallel by @HIT-cwh in #456 (see the sketch after this list)
- [Bug] Fix bugs in flash_attn1_pytorch by @HIT-cwh in #513
- [Fix] delete cat in varlen attn by @HIT-cwh in #508
- bump version to 0.1.16 by @HIT-cwh in #520
- [Improve] Add `generation_kwargs` for `EvaluateChatHook` by @LZHgrla in #501
- [Bugs] Fix bugs when training in non-distributed env by @HIT-cwh in #522
- [Fix] Support transformers>=4.38 and require transformers>=4.36.0 by @HIT-cwh in #494
- [Fix] Fix throughput hook by @HIT-cwh in #527
- Update README.md by @JianxinDong in #528
- [Fix] dispatch internlm RoPE by @HIT-cwh in #530
- Limit transformers != 4.38 by @HIT-cwh in #531
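Regarding the sequence parallel support flagged above (#456), the toy function below conveys only the core idea: each rank in a sequence-parallel group keeps a contiguous slice of the token dimension, so activation memory scales down with the group size. It is a standalone illustration, not XTuner's implementation, which additionally exchanges data across the group so attention still sees the full sequence.

```python
# Toy illustration of the sequence-parallel idea (not XTuner's code): each rank
# keeps only its slice of the sequence dimension, cutting activation memory by
# roughly a factor of sp_size.
import torch

def split_for_sequence_parallel(hidden: torch.Tensor, sp_rank: int, sp_size: int) -> torch.Tensor:
    """hidden: (batch, seq_len, dim); seq_len must be divisible by sp_size."""
    seq_len = hidden.size(1)
    assert seq_len % sp_size == 0, "pad the sequence so sp_size divides seq_len"
    chunk = seq_len // sp_size
    return hidden[:, sp_rank * chunk:(sp_rank + 1) * chunk]

x = torch.randn(2, 8192, 1024)
local = split_for_sequence_parallel(x, sp_rank=1, sp_size=4)
print(local.shape)  # torch.Size([2, 2048, 1024])
```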
New Contributors
- @hhaAndroid made their first contribution in #496
- @JianxinDong made their first contribution in #528
Full Changelog: v0.1.15...v0.1.16
XTuner Release V0.1.15
What's Changed
- set dev version by @LZHgrla in #437
- [Bugs] Fix bugs when using EpochBasedRunner by @HIT-cwh in #439
- [Feature] Support processing ftdp dataset and custom dataset offline by @HIT-cwh in #410
- Update prompt_template.md by @aJupyter in #441
- [Doc] Split finetune_custom_dataset.md into 6 parts by @HIT-cwh in #445
- [Improve] Add notes for demo_data examples by @LZHgrla in #458
- [Fix] Gemma prompt_template by @LZHgrla in #454
- [Feature] Add LLaVA-InternLM2-1.8B by @LZHgrla in #449
- show more info about datasets by @amulil in #464
- [Fix] write text with `encoding='utf-8'` by @LZHgrla in #477
- support offline processing of llava data by @HIT-cwh in #448
- [Fix] `msagent_react_map_fn` error by @LZHgrla in #470
- [Improve] Reorg `xtuner/configs/llava/` configs by @LZHgrla in #483
- limit pytorch version <= 2.1.2 as there may be some bugs in triton2… by @HIT-cwh in #452
- [Fix] fix batch sampler bs by @HIT-cwh in #468
- bump version to v0.1.15 by @LZHgrla in #486
New Contributors
- @aJupyter made their first contribution in #441
- @amulil made their first contribution in #464
Full Changelog: v0.1.14...v0.1.15