XTuner Release V0.1.18
What's Changed
- Set dev version by @LZHgrla in #537
- [Fix] Fix typo by @KooSung in #547
- [Feature] Support Mixtral varlen attn by @HIT-cwh in #564
- [Feature] Support Qwen sequence parallel (sp) and varlen attn by @HIT-cwh in #565
- [Fix] Fix attention mask in `default_collate_fn` by @pppppM in #567 (see the collate sketch after this list)
- Accept `pytorch==2.2` as the bugs in triton 2.2 are fixed by @HIT-cwh in #548
- [Feature] Refine Sequence Parallel API by @HIT-cwh in #555 (config sketch after this list)
- [Fix] Enhance `split_list` to support `value` at the beginning by @LZHgrla in #568 (illustrative sketch after this list)
- [Feature] Support Cohere by @HIT-cwh in #569
- [Fix] Fix rotary_seq_len in varlen attn in qwen by @HIT-cwh in #574
- [Docs] Add sequence parallel docs to README by @HIT-cwh in #578
- [Bug] `SUPPORT_FLASH1 = digit_version(torch.__version__) >= digit_version('2…` by @HIT-cwh in #587 (version-gate sketch after this list)
- [Feature] Support Llama 3 by @LZHgrla in #585
- [Docs] Add llama3 8B readme by @HIT-cwh in #588
- [Bugs] Check whether CUDA is available when choosing torch_dtype in sft.py by @HIT-cwh in #577 (sketch after this list)
- [Bugs] Fix bugs in `tokenize_ftdp_datasets` by @HIT-cwh in #581
- [Feature] Support Qwen MoE by @HIT-cwh in #579
- [Docs] Add tokenizer to SFT in Case 2 by @HIT-cwh in #583
- Bump version to 0.1.18 by @HIT-cwh in #590
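Notes

For #567: a collate function that pads variable-length samples must also build an attention mask that matches the padding. A minimal sketch of the general pattern, not XTuner's actual `default_collate_fn`:

```python
import torch

def collate_with_attention_mask(batch, pad_token_id=0):
    """Right-pad variable-length `input_ids` and build the matching
    attention mask (1 = real token, 0 = padding)."""
    max_len = max(len(sample['input_ids']) for sample in batch)
    input_ids, attention_mask = [], []
    for sample in batch:
        ids = sample['input_ids']
        pad = max_len - len(ids)
        input_ids.append(ids + [pad_token_id] * pad)
        attention_mask.append([1] * len(ids) + [0] * pad)
    return {
        'input_ids': torch.tensor(input_ids),
        'attention_mask': torch.tensor(attention_mask),
    }
```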
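For #555 and #578: XTuner's sequence parallelism is driven from the training config (XTuner configs are Python files). A minimal sketch assuming the `sequence_parallel_size` and `use_varlen_attn` fields from the sequence-parallel docs; verify the field names against your XTuner version:

```python
# Excerpt from an XTuner training config (illustrative).

# Shard each sequence across 4 GPUs; the effective data-parallel size
# becomes world_size // sequence_parallel_size.
sequence_parallel_size = 4

# Pack variable-length samples and use varlen attention to avoid padding.
use_varlen_attn = True
```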
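For #568: `split_list` splits a list on a sentinel value. A hypothetical reimplementation sketching the fixed behavior, where a sentinel at the beginning yields an empty leading chunk instead of misbehaving; the real helper may differ in how it treats the sentinel itself:

```python
def split_list(lst, value):
    """Split `lst` on each occurrence of `value` (sentinel excluded)."""
    chunks, current = [], []
    for item in lst:
        if item == value:
            chunks.append(current)
            current = []
        else:
            current.append(item)
    chunks.append(current)
    return chunks

# A leading sentinel is handled: [[], [1, 2], [3]]
print(split_list([0, 1, 2, 0, 3], 0))
```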
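For #587: the fix gates flash-attention support on the parsed PyTorch version rather than a raw string compare. A sketch assuming `digit_version` from mmengine (an XTuner dependency); the threshold in the PR title is truncated, so `'2.0.0'` below is illustrative:

```python
import torch
from mmengine.utils import digit_version

# Comparing parsed version tuples avoids string-comparison pitfalls
# such as '2.10' < '2.2'.
SUPPORT_FLASH1 = digit_version(torch.__version__) >= digit_version('2.0.0')
```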
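For #577: selecting fp16/bf16 on a CPU-only host can fail, so dtype selection should check for CUDA first. A minimal sketch of the idea (hypothetical helper, not sft.py's actual logic):

```python
import torch

def choose_torch_dtype(requested='auto'):
    """Fall back to float32 on CPU-only hosts; otherwise honor the
    request, preferring bf16 on GPUs that support it."""
    if not torch.cuda.is_available():
        return torch.float32
    if requested == 'auto':
        return torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float16
    return getattr(torch, requested)  # e.g. 'float16' -> torch.float16
```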
Full Changelog: v0.1.17...v0.1.18