InternEvo-HFModels

This directory contains the model modeling and configuration files in the Hugging Face format, as well as the sample train.py file and configuration files for training these models with InternEvo.

Installation of InternEvo

Refer to the InternEvo Installation Documentation

Code Download

Download the files from InternEvo-HFModels to your local machine:

git clone https://github.com/InternLM/InternEvo-HFModels.git

Start Training

Run the specified train.sh to start training according to the model you need, for example:

bash examples/internlm/internlm_7b/train.sh

isp Parallel

For parallel training in ISP mode, the config.py file needs to be modified before starting the training to change the tensor parallel mode to ISP. The modification is as follows:

parallel = dict(
    zero1=dict(size=-1),
    tensor=dict(size=2, mode="isp"),
    pipeline=dict(size=1, interleaved_overlap=True),
    weight=dict(size=2, overlap=False, memory_pool=True),
)

Here, the size value in tensor is the size of sequence parallelism, and the size value in weight is the size of weight parallelism in ISP mode. Note: here overlap in weight parameter should be set to False.

The modeling file of the model needs to be modified to use the new_linear() function provided by InternEvo for the initialization of head, attention calculations, and mlp in the linear function. Taking the modeling file of the InternLM model as an example, the modification is as follows:

from internlm.model.modules.linear import new_linear

class InternLMMLP(nn.Module):
    def __init__(self, hidden_size: int, intermediate_size: int, hidden_act: str):
        super().__init__()
        self.gate_proj = new_linear("w1", hidden_size, intermediate_size, bias=False)
        self.down_proj = new_linear("w2", intermediate_size, hidden_size, bias=False)
        self.up_proj = new_linear("w3", hidden_size, intermediate_size, bias=False)
        self.act_fn = ACT2FN[hidden_act]

class InternLMAttention(nn.Module):
    def __init__(self, config: InternLMConfig):
        super().__init__()
        self.q_proj = new_linear("wq", self.hidden_size, self.num_heads * self.head_dim, bias=config.bias)
        self.k_proj = new_linear("wk", self.hidden_size, self.num_heads * self.head_dim, bias=config.bias)
        self.v_proj = new_linear("wv", self.hidden_size, self.num_heads * self.head_dim, bias=config.bias)
        self.o_proj = new_linear("wo", self.num_heads * self.head_dim, self.hidden_size, bias=config.bias)

class InternLMForCausalLM(InternLMPreTrainedModel):
    def __init__self, config):
        super().__init__(config)
        self.model = InternLMModel(config)

        self.lm_head = new_linear("head", config.hidden_size, config.vocab_size, bias=False)

The first parameter of the new_linear() function indicates the name of the parameter and can be one of the following: "head", "output", "wqkv", "wq", "wk", "wv", "wkv", "w1", "w3", "w13", "wo", "out_proj", "w2". Modify according to the actual situation.

Name		Name	Last commit message	Last commit date
Latest commit History 19 Commits
examples		examples
huggingface_model		huggingface_model
.gitignore		.gitignore
README-zh-Hans.md		README-zh-Hans.md
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

InternEvo-HFModels

Installation of InternEvo

Code Download

Start Training

isp Parallel

About

Uh oh!

Releases

Packages

Contributors 4

Uh oh!

Languages

blankde/InternEvo-HFModels

Folders and files

Latest commit

History

Repository files navigation

InternEvo-HFModels

Installation of InternEvo

Code Download

Start Training

isp Parallel

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 4

Uh oh!

Languages

Packages