CodeFuse-MFTCoder: Multitask Fine-Tuned Code LLMs

[中文] [English]

News

🔥🔥 [2023/10/20] CodeFuse-QWen-14B has been released, achieving a pass@1 (greedy decoding) score of 48.8% on HumanEval, which gains 16% absolute improvement over the base model Qwen-14b

🔥🔥 [2023/09/27] CodeFuse-StarCoder-15B has been released, achieving a pass@1 (greedy decoding) score of 54.9% on HumanEval.

🔥🔥🔥 [2023/09/26]We are pleased to announce the release of the 4-bit quantized version of CodeFuse-CodeLlama-34B. Despite the quantization process, the model still achieves a remarkable 73.8% accuracy (greedy decoding) on the HumanEval pass@1 metric.

🔥🔥🔥 [2023/09/07]We released CodeFuse-CodeLlama-34B, which achieves the 74.4% Python Pass@1 (greedy decoding) and surpasses GPT4 (2023/03/15) and ChatGPT-3.5 on the HumanEval Benchmarks.

🔥🔥 [2023/08/26]We released MFTCoder which supports finetuning Code Llama, Llama, Llama2, StarCoder, ChatGLM2, CodeGeeX2, Qwen, and GPT-NeoX models with LoRA/QLoRA.

HumanEval Performance

Model	HumanEval(Pass@1)	Date
CodeFuse-CodeLlama-34B	74.4%	2023/09
CodeFuse-CodeLlama-34B-4bits	73.8%	2023/09
WizardCoder-Python-34B-V1.0	73.2%	2023/08
GPT-4(zero-shot)	67.0%	2023/03
PanGu-Coder2 15B	61.6%	2023/08
CodeFuse-StarCoder-15B	54.9%	2023/08
CodeLlama-34b-Python	53.7%	2023/08
CodeFuse-QWen-14B	48.8%	2023/10
CodeLlama-34b	48.8%	2023/08
GPT-3.5(zero-shot)	48.1%	2022/11
OctoCoder	46.2%	2023/08
StarCoder-15B	33.6%	2023/05
QWen-14B	32.3%	2023/10

Articles

TBA

Introduction

CodeFuse-MFTCoder is an open-source project of CodeFuse for multitasking Code-LLMs(large language model for code tasks), which includes models, datasets, training codebases and inference guides. In MFTCoder, we released two codebases for finetuning Large Language Models:

mft_peft_hf is based on the HuggingFace Accelerate and deepspeed framework.
mft_atorch is based on the ATorch frameworks, which is a fast distributed training framework of LLM.

The aim of this project is to foster collaboration and share advancements in large language models, particularly within the domain of code development.

Frameworks

Highlights

✅ Multi-task: Train models on multiple tasks while maintaining a balance between them. The models can even generalize to new, previously unseen tasks.

✅ Multi-model: It integrates state-of-the-art open-source models such as gpt-neox, llama, llama-2, baichuan, Qwen, chatglm2, and more. (These finetuned models will be released in the near future.)

✅ Multi-framework: It provides support for both HuggingFace Accelerate (with deepspeed) and ATorch.

✅ Efficient fine-tuning: It supports LoRA and QLoRA, enabling fine-tuning of large models with minimal resources. The training speed meets the demands of almost all fine-tuning scenarios.

The main components of this project include:

Support for both SFT (Supervised FineTuning) and MFT (Multi-task FineTuning). The current MFTCoder achieves data balance among multiple tasks, and future releases will achieve a balance between task difficulty and convergence speed during training.
Support for QLoRA instruction fine-tuning, as well as LoRA fine-tuning.
Support for most mainstream open-source large models, particularly those relevant to Code-LLMs, such as Code-LLaMA, Starcoder, Codegeex2, Qwen, GPT-Neox, and more.
Support for weight merging between the LoRA adaptor and base models, simplifying the inference process.
Release of 2 high-quality code-related instruction fine-tuning datasets: Evol-instruction-66k and CodeExercise-Python-27k.
Release of 2 models: CodeFuse-13B and CodeFuse-CodeLlama-34B.

Requirements

To begin, ensure that you have successfully installed CUDA (version >= 11.4, preferably 11.7) along with the necessary drivers. Additionally, make sure you have installed torch (version 2.0.1).

Next, we have provided an init_env.sh script to simplify the installation of required packages. Execute the following command to run the script:

sh init_env.sh

If you require flash attention, please refer to the following link for installation instructions: https://github.com/Dao-AILab/flash-attention

Training

🚀 Huggingface accelerate + deepspeed Codebase for MFT(Multi-task Finetuning)

🚀 Atorch Codebase for MFT(Multi-task Finetuning)

Models

We are excited to release the following two CodeLLMs trained by MFTCoder, now available on Hugging Face:

Model	Base Model	Num of examples trained	Batch Size	Seq Length
🔥🔥🔥 CodeFuse-CodeLlama-34B	CodeLlama-34b-Python	600k	80	4096
🔥🔥🔥 CodeFuse-CodeLlama-34B-4bits	CodeLlama-34b-Python			4096
🔥🔥🔥 CodeFuse-StarCoder-15B	Starcoder	600k	256	4096
🔥🔥🔥 CodeFuse-QWen-14B	Qwen-14b	1100k	256	4096
🔥 CodeFuse-13B	CodeFuse-13B	66k	64	4096

Datasets

We are also pleased to release two code-related instruction datasets, meticulously selected from a range of datasets to facilitate multitask training. Moving forward, we are committed to releasing additional instruction datasets covering various code-related tasks.

Dataset	Introduction
⭐ Evol-instruction-66k	Based on open-evol-instruction-80k, filter out low-quality, repeated, and similar instructions to HumanEval, thus get high-quality code instruction dataset.
⭐ CodeExercise-Python-27k	python code exercise instruction dataset generated by chatgpt

Name		Name	Last commit message	Last commit date
Latest commit History 43 Commits
assets		assets
mft_atorch		mft_atorch
mft_peft_hf		mft_peft_hf
.gitignore		.gitignore
LEGAL.md		LEGAL.md
LICENSE.md		LICENSE.md
README.md		README.md
README_cn.md		README_cn.md
init_env.sh		init_env.sh
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

CodeFuse-MFTCoder: Multitask Fine-Tuned Code LLMs

Contents

News

HumanEval Performance

Articles

Introduction

Frameworks

Highlights

Requirements

Training

Models

Datasets

Star-History

About

Releases

Packages

Languages

License

wj882018/MFTCoder

Folders and files

Latest commit

History

Repository files navigation

CodeFuse-MFTCoder: Multitask Fine-Tuned Code LLMs

Contents

News

HumanEval Performance

Articles

Introduction

Frameworks

Highlights

Requirements

Training

Models

Datasets

Star-History

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages