Merge pull request #4805 from FederatedAI/develop-1.11.1
Merge 1.11.1 into master
dylan-fan committed Apr 20, 2023
2 parents 5ac0567 + 50be383 commit 5fa5522
Showing 38 changed files with 33,044 additions and 112 deletions.
7 changes: 7 additions & 0 deletions RELEASE.md
@@ -1,3 +1,10 @@
## Release 1.11.1
### Major Features and Improvements
> FederatedML
* Support Homo Graph Neural Network
* PSI-DH protocol enhancement: use Oakley MODP modulus groups (see the sketch below)
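The Oakley MODP groups are the fixed prime-modulus Diffie-Hellman groups defined in RFC 2409 and RFC 3526. As a minimal sketch of the commutative-exponentiation idea behind DH-based PSI, here is an illustration with a toy prime standing in for a named MODP group; this is an assumption-laden illustration, not FATE's implementation:

```python
import hashlib
import secrets

# Illustrative prime only; the actual enhancement uses the fixed
# Oakley MODP modulus groups from RFC 2409 / RFC 3526.
P = 2**255 - 19

def hash_to_group(item: str) -> int:
    """Map an identifier into the multiplicative group mod P."""
    return int.from_bytes(hashlib.sha256(item.encode()).digest(), "big") % P

a = secrets.randbelow(P - 3) + 2  # party A's private exponent
b = secrets.randbelow(P - 3) + 2  # party B's private exponent

alice = ["id1", "id2", "id3"]
bob = ["id2", "id3", "id4"]

# Round 1: each party blinds its own identifiers with its secret.
alice_blind = {pow(hash_to_group(x), a, P) for x in alice}
bob_blind = {pow(hash_to_group(x), b, P) for x in bob}

# Round 2: each party re-blinds the other's set. H(x)^(a*b) mod P is
# identical regardless of exponentiation order, so only the overlap matches.
both_ab = {pow(v, b, P) for v in alice_blind}
both_ba = {pow(v, a, P) for v in bob_blind}
print(len(both_ab & both_ba))  # -> 2
```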


## Release 1.11.0
### Major Features and Improvements
> FederatedML
@@ -187,7 +187,7 @@ wget https://webank-ai-1251170195.cos.ap-guangzhou.myqcloud.com/fate/${version}/
scp *.tar.gz app@192.168.0.2:/data/projects/install
scp *.tar.gz app@192.168.0.3:/data/projects/install
```
-Note: This document requires FATE version >= 1.7.0; replace ${version} with the actual version number, e.g. 1.11.0, without the v character.
+Note: This document requires FATE version >= 1.7.0; replace ${version} with the actual version number, e.g. 1.11.1, without the v character.

### 5.2 Operating system parameter checking

@@ -183,7 +183,7 @@ wget https://webank-ai-1251170195.cos.ap-guangzhou.myqcloud.com/fate/${version}/
scp *.tar.gz app@192.168.0.2:/data/projects/install
scp *.tar.gz app@192.168.0.3:/data/projects/install
```
-Note: this guide requires FATE version >= 1.7.0; replace ${version} with e.g. 1.11.0, without the v character.
+Note: this guide requires FATE version >= 1.7.0; replace ${version} with e.g. 1.11.1, without the v character.
### 5.2 Operating system parameter check

**Run as the app user on the target servers (192.168.0.1, 192.168.0.2, 192.168.0.3)**
2 changes: 1 addition & 1 deletion deploy/standalone-deploy/README.md
@@ -41,7 +41,7 @@ export version={FATE version for this deployment}
Example:

```bash
-export version=1.11.0
+export version=1.11.1
```

### 2.2 Pulling images
4 changes: 2 additions & 2 deletions deploy/standalone-deploy/README.zh.md
@@ -35,13 +35,13 @@
Set the environment variables required for deployment (note: variables set this way are valid only for the current terminal session; if you open a new session, e.g. by logging in again or opening a new window, set them again)

```bash
-export version={FATE version for this deployment, e.g. 1.11.0}
+export version={FATE version for this deployment, e.g. 1.11.1}
```

Example:

```bash
-export version=1.11.0
+export version=1.11.1
```

### 2.2 Pulling images
2 changes: 2 additions & 0 deletions doc/federatedml_component/README.md
@@ -62,6 +62,8 @@ provide:
| [Hetero SSHE Logistic Regression](logistic_regression.md) | HeteroSSHELR | Build hetero logistic regression model without arbiter | Table, values are Instances | Table, values are Instances | | SSHE LR Model |
| [Hetero SSHE Linear Regression](linear_regression.md) | HeteroSSHELinR | Build hetero linear regression model without arbiter | Table, values are Instances | Table, values are Instances | | SSHE LinR Model |
| [Positive Unlabeled Learning](positive_unlabeled.md) | PositiveUnlabeled | Build positive unlabeled learning model | Table, values are Instances | Table, values are Instances | | |
| [FATE-LLM](fate_llm.md) | FATE_LLM | Federated Large Language Model | Torch DataSet | | PreTrained Large Language Model | FineTuned Large Language Model |


## Secure Protocol

1 change: 1 addition & 0 deletions doc/federatedml_component/README.zh.md
@@ -52,6 +52,7 @@ The Federatedml module includes federated implementations of many common machine learning algorithms. All mod…
| [Hetero SSHE Logistic Regression](logistic_regression.md) | HeteroSSHELR | Two-party hetero logistic regression (no trusted third party) | Table, values are Instances | Table, values are Instances | | SSHE LR Model |
| [Hetero SSHE Linear Regression](linear_regression.md) | HeteroSSHELinR | Two-party hetero linear regression (no trusted third party) | Table, values are Instances | Table, values are Instances | | SSHE LinR Model |
| [Positive Unlabeled Learning](positive_unlabeled.md) | PositiveUnlabeled | Build positive unlabeled (PU) learning model | Table, values are Instances | Table, values are Instances | | |
| [FATE-LLM](fate_llm.md) | FATE_LLM | Federated large language model | Torch DataSet | | PreTrained Large Language Model | FineTuned Large Language Model |


## Secure Protocol
42 changes: 42 additions & 0 deletions doc/federatedml_component/fate_llm.md
@@ -0,0 +1,42 @@
# FATE-LLM
FATE-LLM is a framework for federated training of large language models; it also provides multiple parameter-efficient fine-tuning strategies[1][2] for industrial applications.

## Features
The current version supports the following features:
* Integration of various large language models for federated learning, including BERT, ALBERT, RoBERTa, GPT-2, BART, DeBERTa, DistilBERT, etc.
These models are widely used in natural language understanding and generation tasks and can meet the needs of different application scenarios[3][4][5].
* Integration of multiple parameter-efficient tuning methods: Bottleneck Adapters (including the Houlsby, Pfeiffer, and Parallel schemes), Invertible Adapters, LoRA, IA3, and Compacter (see the sketch below).
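As a concrete illustration of one of these methods, here is a minimal LoRA sketch using the Hugging Face `peft` library; this is an assumption for illustration only (FATE-LLM integrates adapters through its own trainer), and the printed numbers are approximate:

```python
from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, get_peft_model

# Freeze the GPT-2 backbone and inject low-rank adapters into the
# attention projection; only the adapter weights remain trainable.
base = AutoModelForSequenceClassification.from_pretrained("gpt2", num_labels=2)
lora = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.1,
                  target_modules=["c_attn"], task_type="SEQ_CLS")
model = get_peft_model(base, lora)
model.print_trainable_parameters()
# e.g. roughly 0.3M trainable out of ~124M total parameters
```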

## Experiment Data

### Model Parameter Sizes
The current version of FATE-LLM supports a range of classic large language models, with parameter counts from tens of millions to 1.5 billion.
The following table lists the parameter counts of the commonly used versions of the models we support:
![llm model parameters](../images/llm_model_parameter_amount.png)

### Trainable Parameter Sizes of Parameter-Efficient Methods
To give users a more intuitive feel for how much parameter-efficient methods reduce federated training and transmission costs in FATE-LLM,
we take GPT-2 as an example and show the number of parameters involved in the federated training and transmission process (a rough sizing helper follows the figure).
![parameter_efficient](../images/parameter_efficient_of_gpt-2.png)
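To relate these counts to communication cost, here is a small hedged helper (the name `trainable_megabytes` is hypothetical, and a PyTorch model such as the peft-wrapped GPT-2 above is assumed):

```python
def trainable_megabytes(model) -> float:
    # Only trainable (adapter) parameters are aggregated and transmitted
    # each federated round; the frozen backbone never leaves the party.
    n = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return n * 4 / 2**20  # float32 parameters -> MiB

# e.g. print(trainable_megabytes(model)) for the LoRA-wrapped GPT-2 above
```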

### Training Time Improvement
We compare the training time of different adapter methods against fine-tuning the complete model in a homo (horizontal)
federated learning scenario, on a text sentiment classification task using the IMDB dataset:
- Scenario: homo (horizontal) federated learning
- Task type: text sentiment classification
- Participants: two client parties involved in model building and one server for aggregation
- Data & basic parameters: IMDB dataset, 25,000 samples, batch_size=64, padding_length=200 (see the tokenization sketch below)
- Environment: each modeling party uses 2x V100 32GB GPUs; the experiments are conducted in a local area network environment
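For reference, the padding_length=200 setting corresponds to tokenizing every review to a fixed length, roughly as follows; this is a hedged sketch with the Hugging Face tokenizer and made-up sample strings, and FATE-LLM's own data pipeline may differ:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token  # GPT-2 defines no pad token by default

# Fixed-length batches as in the experiment: truncate/pad to 200 tokens.
batch = tok(["an absolute delight to watch", "two hours I will never get back"],
            padding="max_length", truncation=True, max_length=200,
            return_tensors="pt")
print(batch["input_ids"].shape)  # torch.Size([2, 200])
```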

The table below compares the per-epoch training time (in seconds) of the various adapters against fine-tuning the complete model.
It can be observed that federating an adapter together with a frozen language model significantly reduces training time.

![GPT-2 Training Time Improvement](../images/gpt-2_training_time_improvement.png)


## References
[1] Cai D, Wu Y, Wang S, et al. AutoFedNLP: An efficient FedNLP framework[J]. arXiv preprint arXiv:2205.10162, 2022.
[2] Zhang Z, Yang Y, Dai Y, et al. When Federated Learning Meets Pre-trained Language Models' Parameter-Efficient Tuning Methods[J]. arXiv preprint arXiv:2212.10025, 2022.
[3] Zhou C, Li Q, Li C, et al. A comprehensive survey on pretrained foundation models: A history from BERT to ChatGPT[J].
[4] Devlin J, Chang M W, Lee K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[J]. arXiv preprint arXiv:1810.04805, 2018.
[5] Radford A, Wu J, Child R, et al. Language models are unsupervised multitask learners[J]. OpenAI blog, 2019, 1(8): 9.
Binary file added doc/images/gpt-2_training_time_improvement.png
Binary file added doc/images/llm_model_parameter_amount.png
Binary file added doc/images/parameter_efficient_of_gpt-2.png
