Merge pull request #4812 from FederatedAI/develop-1.11.1

Update documents

dylan-fan committed Apr 21, 2023
2 parents 5fa5522 + faa4da8 commit 9432615
Showing 21 changed files with 93 additions and 608 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -39,6 +39,7 @@ Deploying FATE to multiple nodes to achieve scalability, reliability and manageability
- [Train & Predict Hetero SecureBoost with FATE-Pipeline](./doc/tutorial/pipeline/pipeline_tutorial_hetero_sbt.ipynb)
- [Build & Customize NN models with FATE-Pipeline](./doc/tutorial/pipeline/nn_tutorial/README.md)
- [Run Job with DSL json conf](doc/tutorial/dsl_conf/dsl_conf_tutorial.md)
- [FATE-LLM Training Guides](doc/tutorial/fate_llm/README.md)
- [More Tutorials...](doc/tutorial)

## Related Repositories (Projects)
1 change: 1 addition & 0 deletions README_zh.md
@@ -36,6 +36,7 @@ FATE supports multiple deployment modes, and users can choose according to their own situation. [
- [Train and predict a vertical (hetero) SBT task with FATE-Pipeline](./doc/tutorial/pipeline/pipeline_tutorial_hetero_sbt.ipynb)
- [Build horizontal and vertical neural network models with FATE-Pipeline](doc/tutorial/pipeline/nn_tutorial/README.md)
- [Run jobs with DSL json conf](doc/tutorial/dsl_conf/dsl_conf_tutorial.md)
- [FATE-LLM training tutorials](doc/tutorial/fate_llm/README.md)
- [More tutorials](doc/tutorial)

## Related Repositories
16 changes: 2 additions & 14 deletions doc/federatedml_component/intersect.md
@@ -40,13 +40,6 @@ finding common even ids.
With RSA intersection, participants can get their intersection ids
securely and efficiently.
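The mechanism behind this is the blind RSA signature: the host signs the guest's blinded hashed ids, so signed values can be compared without ever exposing raw ids. Below is a toy sketch of that idea with deliberately insecure parameters — illustrative names only, not FATE's implementation:

```python
# Toy sketch of blind-RSA PSI; insecure key size, for illustration only.
import hashlib
import random

def h(x: str, n: int) -> int:
    """First hash: map an id into Z_n."""
    return int.from_bytes(hashlib.sha256(x.encode()).digest(), "big") % n

def h2(v: int) -> str:
    """Second hash: applied before comparing signed values."""
    return hashlib.sha256(str(v).encode()).hexdigest()

# Host's RSA key (toy primes; real deployments use >=2048-bit moduli)
p, q, e = 32749, 65521, 65537
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))

host_ids = {"alice", "bob", "carol"}
guest_ids = {"bob", "carol", "dave"}

# Host signs its own hashed ids and publishes second hashes of the signatures
host_table = {h2(pow(h(y, n), d, n)) for y in host_ids}

# Guest blinds each hashed id with a random factor r (assumes gcd(r, n) == 1)
blinds = {x: random.randrange(2, n) for x in guest_ids}
blinded = {x: (h(x, n) * pow(r, e, n)) % n for x, r in blinds.items()}

# Host signs the blinded values without learning the underlying ids
signed = {x: pow(c, d, n) for x, c in blinded.items()}

# Guest unblinds (leaving H(x)^d mod n) and matches against the host's table
matched = {x for x, s in signed.items()
           if h2((s * pow(blinds[x], -1, n)) % n) in host_table}
print(matched)  # {'bob', 'carol'}
```

Only blinded values cross the network, and publishing `h2` of the signatures rather than the signatures themselves keeps the host's table from leaking reusable signed values.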

## RAW Intersection

This mode implements the simple intersection method in which a
participant sends all its ids to another participant, and the other
participant finds their common ids. Finally, the joining role will send
the intersection ids to the sender.

## DH Intersection

This mode implements secure intersection based on symmetric encryption
@@ -88,7 +81,7 @@ Intersection supports cache.
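The underlying trick in the DH mode is commutative exponentiation: an id hashed and then raised to both parties' secret keys yields the same value regardless of order, so doubly-masked values match exactly when the ids do. A minimal sketch under toy parameters (not FATE's implementation, which uses proper hash-to-group mapping and stronger moduli):

```python
# Toy DH-PSI sketch: commutative exponentiation over a prime field.
import hashlib

P = 2**127 - 1                 # toy Mersenne-prime modulus, illustration only
a, b = 91234567, 87654321      # secret exponents of guest and host

def hash_to_group(x: str) -> int:
    return int.from_bytes(hashlib.sha256(x.encode()).digest(), "big") % P

guest_list = sorted({"bob", "carol", "dave"})
host_ids = {"alice", "bob", "carol"}

# Round 1: each side exponentiates its own hashed ids and sends them over.
guest_once = [pow(hash_to_group(x), a, P) for x in guest_list]  # guest -> host
host_once = [pow(hash_to_group(y), b, P) for y in host_ids]     # host -> guest

# Round 2: each side exponentiates the other's values with its own key.
guest_twice = [pow(v, b, P) for v in guest_once]  # host returns these in order
host_twice = {pow(v, a, P) for v in host_once}    # guest computes locally

# Guest matches doubly-masked values: H(id)^(a*b) coincide iff the ids do.
intersection = {x for x, v in zip(guest_list, guest_twice) if v in host_twice}
print(intersection)  # {'bob', 'carol'}
```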

## Multi-Host Intersection

RSA, RAW, and DH intersection support multi-host scenario. It means a
RSA and DH intersection support the multi-host scenario: a
guest can perform intersection with more than one host simultaneously
and get the common ids among all participants.
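Conceptually, the guest keeps only the ids that appear in every pairwise result; a tiny sketch with hypothetical per-host outputs:

```python
from functools import reduce

# Hypothetical pairwise PSI outputs: ids the guest shares with each host.
per_host = [
    {"id2", "id3", "id7"},   # guest ∩ host_0
    {"id3", "id7", "id9"},   # guest ∩ host_1
]
common_ids = reduce(set.intersection, per_host)
print(common_ids)  # {'id3', 'id7'}
```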

@@ -155,14 +148,13 @@ And for Host:

## Feature

Below lists features of each ECDH, RSA, DH, and RAW intersection methods.
Below are the features of the ECDH, RSA, and DH intersection methods.

| Intersect Methods | PSI | Match-ID Support | Multi-Host | Exact-Cardinality | Estimated Cardinality | Preprocessing | Cache |
|-------------------|-----|------------------|------------|-------------------|-----------------------|---------------|-------|
| ECDH | [✓](../../examples/pipeline/intersect/pipeline-intersect-ecdh.py) | ✓ | [✓](../../examples/pipeline/intersect/pipeline-intersect-ecdh-multi) | [✓](../../examples/dsl/v2/intersect/test_intersect_job_ecdh_exact_cardinality_conf.json) | ✗ | [✓](../../examples/pipeline/intersect/pipeline-intersect-ecdh-w-preprocess.py) | [✓](../../examples/pipeline/intersect/pipeline-intersect-ecdh-cache.py) |
| RSA | [✓](../../examples/pipeline/intersect/pipeline-intersect-rsa.py) | [✓](../../examples/pipeline/match_id_test/pipeline-hetero-lr.py) | [✓](../../examples/pipeline/intersect/pipeline-intersect-multi-rsa.py) | ✗ | [✓](../../examples/pipeline/intersect/pipeline-intersect-rsa-cardinality.py) | [✓](../../examples/pipeline/intersect/pipeline-intersect-dh-w-preprocess.py) | [✓](../../examples/pipeline/intersect/pipeline-intersect-rsa-cache.py) |
| DH | [✓](../../examples/pipeline/intersect/pipeline-intersect-dh.py) | ✓ | [✓](../../examples/pipeline/intersect/pipeline-intersect-dh-multi.py) | [✓](examples/dsl/v2/intersect/test_intersect_job_dh_exact_cardinality_conf.json) | ✗ | [✓](../../examples/pipeline/intersect/pipeline-intersect-rsa-w-preprocess.py) | [✓](../../examples/pipeline/intersect/pipeline-intersect-dh-cache.py) |
| RAW | ✓ | ✓ | ✓ | ✗ | ✗ | ✓ | ✗ |

All three methods support:

@@ -180,10 +172,6 @@ RSA, DH, ECDH intersection methods also support:

1. PSI with cache

RAW intersection supports the following extra feature:

1. base64 encoding may be used for all hashing methods.

Cardinality Computation:

1. Setting `cardinality_method` to `rsa` produces an estimated intersection cardinality;
5 changes: 5 additions & 0 deletions doc/tutorial/README.zh.md
@@ -8,6 +8,8 @@
- [`Hetero SecureBoost` training and prediction with `Pipeline`](pipeline/pipeline_tutorial_hetero_sbt.ipynb)
- [Building neural network models with `Pipeline`](pipeline/nn_tutorial/README.md)
- [`Hetero SecureBoost` training and prediction with `Match ID` using `Pipeline`](pipeline/pipeline_tutorial_match_id.ipynb)
- [Uploading data with `Meta` and training `Hetero SecureBoost`](pipeline/pipeline_tutorial_uploading_data_with_meta.ipynb)
- [Intersection on specified columns when multiple match-ID columns exist](pipeline/pipeline_tutorial_multiple_id_columns.ipynb)

Submitting jobs without `Pipeline` is also supported; users need to prepare job configuration files in `json` format:

@@ -22,3 +24,6 @@
Run multiple jobs with `FATE-Test`:

- [FATE-Test Tutorial](fate_test_tutorial.md)

Merge multi-party models and export them in sklearn/LightGBM or PMML format:
- [Model merge and export](./model_merge.md)
GPT2-example.ipynb
@@ -5,15 +5,15 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Federated GPT-2 Tuning with Parameter Efficient methods in FATE-1.11"
"# Federated GPT-2 Tuning with Parameter Efficient methods in FATE-LLM"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"In this tutorial, we will demonstrate how to efficiently train federated large language models using the FATE 1.11 framework. In FATE-1.11, we introduce the \"pellm\"(Parameter Efficient Large Language Model) module, specifically designed for federated learning with large language models. We enable the implementation of parameter-efficient methods in federated learning, reducing communication overhead while maintaining model performance. In this tutorial we particularlly focus on GPT-2, and we will also emphasize the use of the Adapter mechanism for fine-tuning GPT-2, which enables us to effectively reduce communication volume and improve overall efficiency.\n",
"In this tutorial, we will demonstrate how to efficiently train federated large language models using the FATE-LLM framework. In FATE-LLM, we introduce the \"pellm\"(Parameter Efficient Large Language Model) module, specifically designed for federated learning with large language models. We enable the implementation of parameter-efficient methods in federated learning, reducing communication overhead while maintaining model performance. In this tutorial we particularlly focus on GPT-2, and we will also emphasize the use of the Adapter mechanism for fine-tuning GPT-2, which enables us to effectively reduce communication volume and improve overall efficiency.\n",
"\n",
"By following this tutorial, you will learn how to leverage the FATE framework to rapidly fine-tune federated large language models, such as GPT-2, with ease and efficiency."
]
@@ -600,7 +600,7 @@
" padding_side=\"left\", return_input_ids=False, pad_token='<|endoftext|>')\n",
"# TrainerParam\n",
"trainer_param = TrainerParam(trainer_name='fedavg_trainer', epochs=1, batch_size=8, \n",
" data_loader_worker=8, secure_aggregate=False)\n",
" data_loader_worker=8, secure_aggregate=True)\n",
"\n",
"\n",
"nn_component = HomoNN(name='nn_0', model=model)\n",
@@ -660,7 +660,7 @@
"outputs": [],
"source": [
"trainer_param = TrainerParam(trainer_name='fedavg_trainer', epochs=1, batch_size=8, \n",
" data_loader_worker=8, secure_aggregate=False, cuda=0)"
" data_loader_worker=8, secure_aggregate=True, cuda=0)"
]
},
{
@@ -690,11 +690,11 @@
"outputs": [],
"source": [
"client_0_param = TrainerParam(trainer_name='fedavg_trainer', epochs=1, batch_size=8, \n",
" data_loader_worker=8, secure_aggregate=False, cuda=[0, 1, 2, 3])\n",
" data_loader_worker=8, secure_aggregate=True, cuda=[0, 1, 2, 3])\n",
"client_1_param = TrainerParam(trainer_name='fedavg_trainer', epochs=1, batch_size=8, \n",
" data_loader_worker=8, secure_aggregate=False, cuda=[0, 3, 4])\n",
" data_loader_worker=8, secure_aggregate=True, cuda=[0, 3, 4])\n",
"server_param = TrainerParam(trainer_name='fedavg_trainer', epochs=1, batch_size=8, \n",
" data_loader_worker=8, secure_aggregate=False)\n",
" data_loader_worker=8, secure_aggregate=True)\n",
"\n",
"# set parameter for client 1\n",
"nn_component.get_party_instance(role='guest', party_id=guest_0).component_param(\n",
GPT2-multi-task.ipynb
@@ -5,15 +5,15 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Multi-Task Federated Learning with GPT-2 using FATE-1.11"
"# Multi-Task Federated Learning with GPT-2 using FATE-LLM"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"In this tutorial, we will explore the implementation of multi-task federated learning with LM: GPT-2 using the FATE-1.11 framework. FATE-1.11 provides the \"pellm\" module for efficient federated learning. It is specifically designed for large language models in a federated setting.\n",
"In this tutorial, we will explore the implementation of multi-task federated learning with LM: GPT-2 using the FATE-LLM framework. FATE-LLM provides the \"pellm\" module for efficient federated learning. It is specifically designed for large language models in a federated setting.\n",
"\n",
"Multi-task learning involves training a model to perform multiple tasks simultaneously. In this tutorial, we will focus on two tasks - sentiment classification and named entity recognition (NER) - and show how they can be combined with GPT-2 in a federated learning setting. We will use the IMDB sentiment analysis dataset and the CoNLL-2003 NER dataset for our tasks.\n",
"\n",
Expand Down Expand Up @@ -699,7 +699,7 @@
"dataset_param = DatasetParam(dataset_name='multitask_ds', take_limits=50, tokenizer_name_or_path=model_path)\n",
"# TrainerParam\n",
"trainer_param = TrainerParam(trainer_name='multi_task_fedavg', epochs=1, batch_size=4, \n",
" data_loader_worker=8, secure_aggregate=False)\n",
" data_loader_worker=8, secure_aggregate=True)\n",
"loss = t.nn.CustLoss(loss_module_name='multi_task_loss', class_name='MultiTaskLoss', task_weights=[0.5, 0.5])\n",
"\n",
"\n",
5 changes: 5 additions & 0 deletions doc/tutorial/fate_llm/README.md
@@ -0,0 +1,5 @@
# Usage
Here we provide tutorials for FATE-LLM training:

- [FATE-LLM example with GPT-2](GPT2-example.ipynb)
- [FATE-LLM Multi-Task GPT-2: Classification and NER Tagging](GPT2-multi-task.ipynb)
5 changes: 2 additions & 3 deletions doc/tutorial/pipeline/nn_tutorial/README.md
Original file line number Diff line number Diff line change
@@ -66,10 +66,9 @@ In order to show you how to develop your own Trainer, here we try to develop a simple trainer

Here we offer some advanced examples of using the FATE-NN framework.

## Fed-PELLM(Parameter Efficient Large Language Model) Training
## FATE-LLM(Federated Large Language Models) Training

- [Federated PELLM example with GPT-2](./GPT2-example.ipynb)
- [Federated Multi-Task GPT-2: Classification and NER Tagging](./GPT2-multi-task.ipynb)
- [FATE-LLM Training Guides](../../fate_llm/README.md)

## Resnet classification (Homo-NN)

68 changes: 27 additions & 41 deletions examples/dsl/v2/intersect/README.md
@@ -4,103 +4,89 @@ This section introduces the dsl and conf for usage of different types of tasks.

#### Intersection Task.

1. RAW Intersection:
- dsl: test_intersect_job_dsl.json
- runtime_config : test_intersect_job_raw_conf.json

2. RAW Intersection with SM3 Hashing:
- dsl: test_intersect_job_dsl.json
- runtime_config : test_intersect_job_raw_sm3_conf.json

3. RSA Intersection:
1. RSA Intersection:
- dsl: test_intersect_job_dsl.json
- runtime_config : test_intersect_job_rsa_conf.json

4. RSA Intersection with Random Base Fraction set to 0.5:
2. RSA Intersection with Random Base Fraction set to 0.5:
- dsl: test_intersect_job_dsl.json
- runtime_config : test_intersect_job_rsa_fraction_conf.json

5. RSA Intersection with Calculation Split:
3. RSA Intersection with Calculation Split:
- dsl: test_intersect_job_dsl.json
- runtime_config : test_intersect_job_rsa_split_conf.json

6. RSA Multi-hosts Intersection:
4. RSA Multi-hosts Intersection:
- dsl: test_intersect_job_dsl.json
- runtime_config : test_intersect_job_rsa_multi_host_conf.json

This dsl is an example of a guest running intersection with two hosts using RSA intersection. It can also be used with more than two hosts.

7. RAW Multi-hosts Intersection:
- dsl: test_intersect_job_dsl.json
- runtime_config : test_intersect_job_raw_multi_host_conf.json

This dsl is an example of a guest running intersection with two hosts using raw intersection. It can also be used with more than two hosts.

8. DH Intersection:
5. DH Intersection:
- dsl: test_intersect_job_dsl.json
- runtime_config : test_intersect_job_dh_conf.json

9. DH Multi-host Intersection:
6. DH Multi-host Intersection:
- dsl: test_intersect_job_dsl.json
- runtime_config : test_intersect_job_dh_multi_conf.json

10. ECDH Intersection:
- dsl: test_intersect_job_dsl.json
- runtime_config : test_intersect_job_ecdh_conf.json
7. ECDH Intersection:
- dsl: test_intersect_job_dsl.json
- runtime_config : test_intersect_job_ecdh_conf.json

11. ECDH Intersection with Preprocessing:
- dsl: test_intersect_job_dsl.json
- runtime_config : test_intersect_job_ecdh_w_preprocess_conf.json
8. ECDH Intersection with Preprocessing:
- dsl: test_intersect_job_dsl.json
- runtime_config : test_intersect_job_ecdh_w_preprocess_conf.json

12. RSA Intersection with Cache:
- dsl: test_intersect_job_cache_dsl.json
- runtime_config : test_intersect_job_rsa_cache_conf.json
9. RSA Intersection with Cache:
- dsl: test_intersect_job_cache_dsl.json
- runtime_config : test_intersect_job_rsa_cache_conf.json

13. DH Intersection with Cache:
10. DH Intersection with Cache:
- dsl: test_intersect_job_cache_dsl.json
- runtime_config : test_intersect_job_dh_cache_conf.json

14. ECDH Intersection with Cache:
11. ECDH Intersection with Cache:
- dsl: test_intersect_job_cache_dsl.json
- runtime_config : test_intersect_job_ecdh_cache_conf.json

15. RSA Intersection with Cache Loader:
12. RSA Intersection with Cache Loader:
- dsl: test_intersect_job_cache_loader_dsl.json
- runtime_config : test_intersect_job_rsa_cache_loader_conf.json

16. Estimated Intersect Cardinality:
13. Estimated Intersect Cardinality:
- dsl: test_intersect_job_dsl.json
- runtime_config: "test_intersect_job_rsa_cardinality_conf.json

17. Exact Intersect Cardinality with ECDH:
14. Exact Intersect Cardinality with ECDH:
- dsl: test_intersect_job_dsl.json
- runtime_config: "test_intersect_job_ecdh_exact_cardinality_conf.json

18. Exact Intersect Cardinality with DH:
15. Exact Intersect Cardinality with DH:
- dsl: test_intersect_job_dsl.json
- runtime_config: "test_intersect_job_dh_exact_cardinality_conf.json

19. DH Intersection with Preprocessing:
16. DH Intersection with Preprocessing:
- dsl: test_intersect_job_dsl.json
- runtime_config : test_intersect_job_dh_w_preprocess_conf.json

20. RSA Intersection with Preprocessing:
17. RSA Intersection with Preprocessing:
- dsl: test_intersect_job_dsl.json
- runtime_config : test_intersect_job_rsa_w_preprocess_conf.json

21. ECDH Intersection with Cache Loader:
18. ECDH Intersection with Cache Loader:
- dsl: test_intersect_job_cache_loader_dsl.json
- runtime_config : test_intersect_job_ecdh_cache_loader_conf.json

22. Exact Multi-host Intersect Cardinality with ECDH:
19. Exact Multi-host Intersect Cardinality with ECDH:
- dsl: test_intersect_job_dsl.json
- runtime_config: "test_intersect_job_ecdh_multi_exact_cardinality_conf.json

23. Exact Multi-host Intersect Cardinality with DH:
20. Exact Multi-host Intersect Cardinality with DH:
- dsl: test_intersect_job_dsl.json
- runtime_config: "test_intersect_job_dh_multi_exact_cardinality_conf.json

24. Exact Multi-host Intersect with ECDH:
21. Exact Multi-host Intersect with ECDH:
- dsl: test_intersect_job_dsl.json
- runtime_config: "test_intersect_job_ecdh_multi_conf.json

Expand Down
12 changes: 0 additions & 12 deletions examples/dsl/v2/intersect/intersect_testsuite.json
@@ -26,14 +26,6 @@
}
],
"tasks": {
"raw_intersect": {
"conf": "./test_intersect_job_raw_conf.json",
"dsl": "./test_intersect_job_dsl.json"
},
"raw_intersect_sm3": {
"conf": "./test_intersect_job_raw_sm3_conf.json",
"dsl": "./test_intersect_job_dsl.json"
},
"rsa_intersect": {
"conf": "./test_intersect_job_rsa_conf.json",
"dsl": "./test_intersect_job_dsl.json"
@@ -54,10 +46,6 @@
"conf": "./test_intersect_job_rsa_w_preprocess_conf.json",
"dsl": "./test_intersect_job_dsl.json"
},
"raw_intersect_multi_host": {
"conf": "./test_intersect_job_raw_multi_host_conf.json",
"dsl": "./test_intersect_job_dsl.json"
},
"dh_intersect": {
"conf": "./test_intersect_job_dh_conf.json",
"dsl": "./test_intersect_job_dsl.json"