Commit d23f858 (1 parent: 42be3cd)
Showing 5 changed files with 282 additions and 90 deletions.
## ISP parallelism
To train with ISP parallel mode enabled, edit the config.py file before launching training and set the tensor parallel mode to isp, as follows:
```python
parallel = dict(
    zero1=dict(size=-1),
    tensor=dict(size=2, mode="isp"),
    pipeline=dict(size=1, interleaved_overlap=True),
    weight=dict(size=2, overlap=False, memory_pool=True),
)
```
Here, `size` under `tensor` is the sequence-parallel size, and `size` under `weight` is the weight-parallel size used in isp mode.
Note: `overlap` in the `weight` parameter must be set to False here.

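As a minimal sketch of the two constraints stated above, the snippet below checks a `parallel` dict before training starts. `check_isp_config` is a hypothetical helper written for illustration and is not part of InternEvo; it only encodes what this section says (tensor mode is `"isp"` and `weight.overlap` is False).

```python
# Hypothetical helper for illustration only; InternEvo does not provide it.
def check_isp_config(parallel):
    """Return a list of problems found in a `parallel` config dict."""
    problems = []
    tensor = parallel.get("tensor", {})
    weight = parallel.get("weight", {})
    # The tensor parallel mode has to be switched to isp.
    if tensor.get("mode") != "isp":
        problems.append('tensor["mode"] must be "isp"')
    # This document requires weight overlap to be disabled.
    if weight.get("overlap", False):
        problems.append('weight["overlap"] must be False')
    return problems


parallel = dict(
    zero1=dict(size=-1),
    tensor=dict(size=2, mode="isp"),
    pipeline=dict(size=1, interleaved_overlap=True),
    weight=dict(size=2, overlap=False, memory_pool=True),
)

print(check_isp_config(parallel))  # prints []
```

A check like this could run at the top of a launch script so that a misconfigured `overlap` value is caught before any process group is created.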
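Next, modify the model's modeling file: replace the linear-layer initialization used in the head, the attention computation, and the MLP with the new_linear() function provided by InternEvo. Taking the internlm model's modeling file as an example, the changes are as follows: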
```python
# Excerpt: only the lines that change are shown; the rest of the modeling file is unchanged.
from internlm.model.modules.linear import new_linear


class InternLMMLP(nn.Module):
    # Constructor signature inferred from the names used below.
    def __init__(self, hidden_size, intermediate_size, hidden_act):
        super().__init__()
        self.gate_proj = new_linear("w1", hidden_size, intermediate_size, bias=False)
        self.down_proj = new_linear("w2", intermediate_size, hidden_size, bias=False)
        self.up_proj = new_linear("w3", hidden_size, intermediate_size, bias=False)
        self.act_fn = ACT2FN[hidden_act]


class InternLMAttention(nn.Module):
    def __init__(self, config):
        super().__init__()
        # ... existing setup of self.hidden_size, self.num_heads and self.head_dim is unchanged ...
        self.q_proj = new_linear("wq", self.hidden_size, self.num_heads * self.head_dim, bias=config.bias)
        self.k_proj = new_linear("wk", self.hidden_size, self.num_heads * self.head_dim, bias=config.bias)
        self.v_proj = new_linear("wv", self.hidden_size, self.num_heads * self.head_dim, bias=config.bias)
        self.o_proj = new_linear("wo", self.num_heads * self.head_dim, self.hidden_size, bias=config.bias)


class InternLMForCausalLM(InternLMPreTrainedModel):
    def __init__(self, config):
        super().__init__(config)
        self.model = InternLMModel(config)

        self.lm_head = new_linear("head", config.hidden_size, config.vocab_size, bias=False)
```
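The first argument of new_linear() identifies the name of the parameter. The accepted names are: "head", "output", "wqkv", "wq", "wk", "wv", "wkv", "w1", "w3", "w13", "wo", "out_proj", and "w2"; choose the one that matches each layer in your model.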
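When porting a modeling file, it can help to keep the attribute-to-name mapping in one place and check it against the accepted list. The sketch below does that in plain Python. `ACCEPTED_NAMES` is copied from the list above and the mapping mirrors the internlm example, while `HF_TO_NEW_LINEAR_NAME` and `linear_name_for` are hypothetical names introduced here, not InternEvo APIs.

```python
# Names accepted as the first argument of new_linear(), copied from the list above.
ACCEPTED_NAMES = {
    "head", "output", "wqkv", "wq", "wk", "wv", "wkv",
    "w1", "w3", "w13", "wo", "out_proj", "w2",
}

# Mapping used in the internlm example; other models may need different entries.
HF_TO_NEW_LINEAR_NAME = {
    "gate_proj": "w1",
    "down_proj": "w2",
    "up_proj": "w3",
    "q_proj": "wq",
    "k_proj": "wk",
    "v_proj": "wv",
    "o_proj": "wo",
    "lm_head": "head",
}


def linear_name_for(attr):
    """Look up the new_linear() name for a module attribute (hypothetical helper)."""
    name = HF_TO_NEW_LINEAR_NAME[attr]  # raises KeyError for unmapped attributes
    if name not in ACCEPTED_NAMES:
        raise ValueError(f"{name!r} is not an accepted new_linear() name")
    return name


print(linear_name_for("gate_proj"))  # prints w1
```

Keeping the table separate from the model code makes it easier to review which layers were converted and to spot a name that falls outside the accepted set.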