diff --git a/docs/images/automl/nni_tensorboard.jpg b/docs/images/automl/nni_tensorboard.jpg
new file mode 100644
index 000000000..bd6403ac7
Binary files /dev/null and b/docs/images/automl/nni_tensorboard.jpg differ
diff --git a/docs/source/automl/finetune_config.md b/docs/source/automl/finetune_config.md
new file mode 100644
index 000000000..b9968b6c2
--- /dev/null
+++ b/docs/source/automl/finetune_config.md
@@ -0,0 +1,122 @@
+## Finetune training (optional)
+
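+Recommendation workloads receive fresh data every day. If you train once on a batch of historical data and then finetune the model daily, you can start from the best result of the begin-stage tuning described above and finetune on the new data. If you retrain from scratch on every update, skip this step.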
+
+### Tuning tips
+
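+For example, with 40 days of historical data, first tune on 30 days using the steps above, then finetune on the remaining 10 days with the best parameters found.
+Rule of thumb: keep the best parameters from begin training and set learning_rate to the value it had reached when begin training ended.
+For example:
+With the begin-training learning_rate below and 8000 total steps of begin training, you can set initial_learning_rate=1e-6 or 1e-7 for finetune: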
+
+```
+learning_rate {
+        exponential_decay_learning_rate {
+          initial_learning_rate: 0.001
+          decay_steps: 1000
+          decay_factor: 0.1
+          min_learning_rate: 1e-07
+        }
+      }
+```
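+
+As a rough check of this rule of thumb, assume the schedule follows the usual exponential-decay form with a floor at min_learning_rate (an assumption about the schedule, not taken from the EasyRec source). With the config above, the learning rate at step 8000 is:
+
+$$
+\mathrm{lr}(s) = \max\left(\mathrm{lr}_0 \cdot \gamma^{s/T},\ \mathrm{lr}_{\min}\right), \qquad
+\mathrm{lr}(8000) = \max\left(10^{-3} \cdot 0.1^{8},\ 10^{-7}\right) = \max\left(10^{-11},\ 10^{-7}\right) = 10^{-7}
+$$
+
+Here $\mathrm{lr}_0$ is initial_learning_rate, $\gamma$ is decay_factor, and $T$ is decay_steps. Under this assumption the floor is already reached at step 4000, which is consistent with choosing 1e-7 as the finetune starting value.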
+
+You can edit the config by hand or with code. The result looks like this:
+![image.png](../../images/automl/modify_lr.jpg)
+
+#### Modify the config with code (optional)
+
+Modifying a local pipeline file is supported:
+
+```bash
+python modify_pipeline_config.py --pipeline_config_path=./samples/pipeline.config --save_path=./samples/pipeline_finetune.config --learning_rate=1e-6
+```
+
+Pipeline files on OSS can also be modified directly:
+
+```bash
+python modify_pipeline_config.py  --pipeline_config_path=oss://easyrec/pipeline889.config --save_path=oss://easyrec/pipeline889-f.config --learning_rate=1e-6 --oss_config=../config/.ossutilconfig
+```
+
+To check whether better parameters exist, start a tuning run as described in the next section.
+
+### Start tuning (optional)
+
+```bash
+nnictl create --config config_finetune.yml --port=8617
+```
+
+#### config_finetune.ini
+
+```
+[platform_config]
+name=MaxCompute
+{% set date_list = [20220616,20220617] %}
+{% set date_begin = 20220616 %}
+{% for bizdate in date_list %}
+{% set eval_ymd = bizdate +1 %}
+{% set predate = bizdate -1 %}
+{% if bizdate == date_begin %}
+cmd1_{{bizdate}}="PAI -name=easy_rec_ext
+    -project=algo_public
+    -Dscript='oss://automl-nni/easyrec/easy_rec_ext_615_res.tar.gz'
+    -Dtrain_tables='odps://pai_rec_dev/tables/rec_sv_rebuild_acc_rnk_rank_sample_embedding_modify/dt={{bizdate}}'
+    -Deval_tables='odps://pai_rec_dev/tables/rec_sv_rebuild_acc_rnk_rank_sample_embedding_modify/dt={{eval_ymd}}'
+    -Dcmd=train
+    -Deval_method=separate
+    -Dfine_tune_checkpoint="oss://automl-nni/easyrec/finetune/{{predate}}_finetune_model_nni_622"
+    -Dconfig='oss://automl-nni/easyrec/config/easyrec_model_${exp_id}_${trial_id}.config'
+    -Dmodel_dir='oss://automl-nni/easyrec/finetune/{{bizdate}}_finetune_model_nni_622/${exp_id}_${trial_id}'
+    -Dselected_cols='is_valid_play,ln_play_time,is_like,is_comment,features,content_features'
+    -Dbuckets='oss://automl-nni/'
+    -Darn='xxx'
+    -DossHost='oss-cn-beijing-internal.aliyuncs.com'
+    -Dcluster={"ps":{"count":1,"cpu":1600,"memory":40000 },"worker":{"count":12,"cpu":1600,"memory":40000}} "
+
+{% else %}
+cmd1_{{bizdate}}="PAI -name=easy_rec_ext
+    -project=algo_public
+    -Dscript='oss://automl-nni/easyrec/easy_rec_ext_615_res.tar.gz'
+    -Dtrain_tables='odps://pai_rec_dev/tables/rec_sv_rebuild_acc_rnk_rank_sample_embedding_modify/dt={{bizdate}}'
+    -Deval_tables='odps://pai_rec_dev/tables/rec_sv_rebuild_acc_rnk_rank_sample_embedding_modify/dt={{eval_ymd}}'
+    -Dcmd=train
+    -Deval_method=separate
+    -Dfine_tune_checkpoint="oss://automl-nni/easyrec/finetune/{{predate}}_finetune_model_nni_622/${exp_id}_${trial_id}"
+    -Dconfig='oss://automl-nni/easyrec/config/easyrec_model_${exp_id}_${trial_id}.config'
+    -Dmodel_dir='oss://automl-nni/easyrec/finetune/{{bizdate}}_finetune_model_nni_622/${exp_id}_${trial_id}'
+    -Dselected_cols='is_valid_play,ln_play_time,is_like,is_comment,features,content_features'
+    -Dbuckets='oss://automl-nni/'
+    -Darn='xxx'
+    -DossHost='oss-cn-beijing-internal.aliyuncs.com'
+    -Dcluster={"ps":{"count":1,"cpu":1600,"memory":40000 },"worker":{"count":12,"cpu":1600,"memory":40000}} "
+{% endif %}
+
+{% endfor %}
+
+
+[metric_config]
+# metric type is summary/table
+metric_type=summary
+{% set date_list = [20220616,20220617] %}
+{% for bizdate in date_list %}
+metric_source_{{bizdate}}=oss://automl-nni/easyrec/finetune/{{bizdate}}_finetune_model_nni_622/${exp_id}_${trial_id}/eval_val/
+{% endfor %}
+# best/final/avg,default=best
+final_mode=final
+source_list_final_mode=avg
+metric_dict={'auc_is_like':0.25, 'auc_is_valid_play':0.5, 'auc_is_comment':0.25}
+```
+
+Differences from begin training:
+
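+- Every config section supports jinja template rendering
+- Set the finetune dates with {% set date_list = \[20220616,20220617\] %}
+- Set the first finetune date with {% set date_begin = 20220616 %}; -Dfine_tune_checkpoint points to a different model path on the first date than on later dates
+- Assuming daily finetune:
+  - {bizdate} must be kept; the code replaces it with the current date
+  - {eval_ymd} must be kept; the code replaces it with the next day's date
+  - {predate} must be kept; the code replaces it with the previous day's date
+- metric_source also lists multiple paths: each day's result is the final value from that day's summary, and the result for the whole parameter set is the average over those days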
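+
+The averaging in the last bullet can be written out as follows. This is a sketch of the arithmetic under the example settings (final_mode=final, source_list_final_mode=avg, and the metric_dict weights shown above); the symbols are introduced here for illustration:
+
+$$
+m_d = 0.25\,v_d(\text{auc\_is\_like}) + 0.5\,v_d(\text{auc\_is\_valid\_play}) + 0.25\,v_d(\text{auc\_is\_comment}), \qquad
+m = \frac{1}{|D|} \sum_{d \in D} m_d
+$$
+
+Here $D$ is date_list, $v_d(k)$ is the final value of metric $k$ in the summary for day $d$, and $m$ is the value reported for the parameter set.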
+
+#### Configure the search space search_space.json
+
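+Reuse the parameters you searched in the begin-training stage. Because this is finetune training, do not search network-structure parameters; in practice, search the learning rate.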
diff --git a/docs/source/automl/hpo_config.md b/docs/source/automl/hpo_config.md
new file mode 100644
index 000000000..5f99e42e6
--- /dev/null
+++ b/docs/source/automl/hpo_config.md
@@ -0,0 +1,299 @@
+The HPO launch configuration consists of three parts: exp.yml, trial.ini, and search_space.json.
+
+# exp.yml
+
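+exp.yml is the NNI configuration file. It ties your code to the search space and runs your training code in the specified environment. It also carries settings such as concurrency, the tuning algorithm, the maximum number of trials, and the maximum duration. See https://nni.readthedocs.io/zh/stable/reference/experiment_config.html#experimentconfig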
+
+## Fields
+
+Fields follow the official NNI documentation. The difference is that, to integrate with PAI, the following fields must stay unchanged:
+
+```
+trialCommand: python3 -m hpo_tools.core.utils.run --config=./trial.ini
+trainingService:
+  platform: local
+assessor:
+  name: PAIAssessor
+```
+
+In addition, PAIAssessor is required in order to stop PAI jobs.
+
+## PAIAssessor
+
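+PAIAssessor compares a trial's results with the full history of trials in the same experiment. If a trial fails the comparison criterion (for example, it falls below the median), that set of hyperparameters is stopped early.
+As a result, when a maximum number of runs max_trial_num is set, the actual cost is well below max_trial_num, although the exact figure depends on the jobs and on the hyperparameters sampled.
+For example, with max_trial_num=50, the final cost may be equivalent to fewer than 25 full runs while roughly all 50 hyperparameter sets have still been explored.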
+
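+| PAIAssessor   | Description                                                                              | Value             |
+| ------------- | ---------------------------------------------------------------------------------------- | ----------------- |
+| optimize_mode | Optimization direction                                                                   | maximize/minimize |
+| start_step    | Step from which early-stopping checks begin                                              | 2                 |
+| moving_avg    | Use the moving average over all history as the early-stopping criterion                  | True              |
+| proportion    | Compare the trial's best value against the value at this proportion of the history      | 0.5               |
+| patience      | Stop after the metric declines this many consecutive times                               | 10                |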
+
+### Example
+
+```
+experimentWorkingDirectory: ../expdir
+searchSpaceFile: search_space.json
+trialCommand: python3 -m hpo_tools.core.utils.run --config=./trial.ini
+trialConcurrency: 1
+maxTrialNumber: 4
+tuner:
+  name: TPE
+  classArgs:
+    optimize_mode: maximize
+debug: true
+logLevel: debug
+trainingService:
+  platform: local
+assessor:
+  name: PAIAssessor
+  classArgs:
+    platform: MAXCOMPUTE
+    optimize_mode: maximize
+    start_step: 1
+    moving_avg: true
+    proportion: 0.5
+```
+
+# trial.ini
+
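+## Variable substitution rules
+
+### Value substitution
+
+The program replaces the following keys in trial.ini with their values by default. Supported forms are value, list, dict, JSON, and file substitution (params_config), as well as key substitution in nested dicts (see the combined-parameter example dlc_mnist_nested_search_space).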
+
+- cmd = cmd.replace('${exp_id}', experment_id.lower())
+
+- cmd = cmd.replace('${trial_id}', trial_id.lower())
+
+- cmd = cmd.replace('${NNI_OUTPUT_DIR}',os.environ.get('NNI_OUTPUT_DIR', './tmp'))
+
+- cmd = cmd.replace('${tuner_params_list}', tuner_params_list)
+
+- cmd = cmd.replace('${tuner_params_dict}', tuner_params_dict)
+
+- cmd = cmd.replace('${tuner_params_json}', json.dumps(tuner_params))
+
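+- cmd = cmd.replace('${params}', params): lets the parameters identify a path, for example lr0.001_batchsize64. The value may contain floating-point numbers, so check that it is valid as a data or table identifier.
+
+- cmd = cmd.replace(p, str(v)): replaces each searched parameter with its sampled value. Mark search parameters as ${batch_size}, ${lr}, and so on; they must match the keys in search_space.json.
+
+### jinja rendering
+
+Every config section supports jinja template rendering, which lets you define variables up front. See the example cross-validation/maxcompute-easyrec.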
+
+```
+[metric_config]
+# metric type is summary/table
+metric_type=summary
+{% set date_list = [20220616,20220617] %}
+{% for bizdate in date_list %}
+metric_source_{{bizdate}}=oss://automl-nni/easyrec/finetune/{{bizdate}}_finetune_model_nni_622/${exp_id}_${trial_id}/eval_val/
+{% endfor %}
+```
+
+### Config sections
+
+| Config section  | Description                                                                                                         | Required |
+| --------------- | ------------------------------------------------------------------------------------------------------------------- | -------- |
+| platform_config | Platform the job runs on and the commands to execute                                                                | Required |
+| metric_config   | Where metrics come from, the metric keys and weights, the metric type, and how the final metric is computed         | Required |
+| output_config   | For the service edition, configure output_config to retrieve the best model; summary_path sets the TensorBoard path | Optional |
+| schedule_config | Needed when jobs should be scheduled only within a specified time window                                            | Optional |
+| params_config   | Needed when parameters are stored in a file; gives the source and destination paths of the file to modify           | Optional |
+| oss_config      | Needed when the job uses OSS storage                                                                                | Optional |
+| odps_config     | Needed when the job runs on the MaxCompute platform                                                                 | Optional |
+| ts_config       | Needed when the job runs on the trainingservice platform                                                            | Optional |
+| paiflow_config  | Needed when the job runs a workflow task                                                                            | Optional |
+| dlc_config      | Needed when the job runs a dlc task                                                                                 | Optional |
+| monitor_config  | Failure alerts and notifications when the best metric is updated                                                    | Optional |
+
+## platform_config
+
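+| platform_config | Description                                                                                                                                                        | Value                                                          |
+| --------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------------------------------------------------------------- |
+| name            | Platform the job runs on                                                                                                                                           | DLC/MaxCompute/DataScience/LOCAL/PAI/PAIFLOW                   |
+| cmdxx           | Command to execute; the key must start with cmd                                                                                                                    | dlc submit pytorch --name=test_nni\_${exp_id}\_${trial_id} xxx |
+| resume          | 1 enables resume mode. Within one run, if for example the first command succeeded and the second failed for lack of resources, the rerun resumes from the second   | 0/1                                                            |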
+
+## metric_config
+
+| metric_config          | Description                                                                                                                                 | Value                                                                                                                                                                                                                          |
+| ---------------------- | ------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| metric_type            | Metric type                                                                                                                                 | summary/table/api/json/stdout                                                                                                                                                                                                  |
+| metric_source          | Metric source; there can be several keys starting with metric_source (see the maxcompute_crossvalidation example)                           | A concrete path or job                                                                                                                                                                                                         |
+| final_mode             | How the final metric is computed when a run reports many intermediate metrics                                                               | final/best/avg                                                                                                                                                                                                                 |
+| source_list_final_mode | Optional; defaults to final_mode. How the final metric is computed across multiple metric_source entries (see maxcompute_crossvalidation)   | final/best/avg                                                                                                                                                                                                                 |
+| metric_dict            | Keys to query and their weights; weights may be negative                                                                                    | metric_dict={'auc_is_like':0.25, 'auc_is_valid_play':0.5, 'auc_is_comment':0.25, 'loss_play_time':-0.25} metric=val('auc_is_valid_play')\*0.5+val('auc_is_like')\*0.25+val('auc_is_comment')\*0.25-val('loss_play_time')\*0.25 |
+
+- When metric_type=stdout, each key in metric_dict is a regular expression and its value is the weight
+
+```
+[metric_config]
+# metric type is summary/table
+metric_type=stdout
+metric_source=oss://test-nni/examples/search/pai/stdout/stdout_${exp_id}_${trial_id}
+# best or final,default=best
+final_mode=best
+metric_dict={'validation: accuracy=([0-9\\.]+)':1}
+```
+
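+- When metric_type=stdout, metric_source can name a job whose default log is used as the source
+  - stdoutmetric: name a specific job; for example, metric_source=cmd1 applies the regex to the log output of cmd1
+  - stdoutmetric: name a specific job and filter its log files; for example, metric_source=cmd,worker applies the regex to all worker logs of the cmd1 job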
+
+```
+[metric_config]
+# metric type is summary/table
+metric_type=stdout
+# default is cmd, cmd->platform job 1,we will get the job1 all default stdout
+# if the job is distributed, you can use [cmd,worker] to assign which log has metric or just use [cmd] to choose all stdout
+metric_source=cmd,worker
+# best or final,default=best
+final_mode=best
+metric_dict={'validation: accuracy=([0-9\\.]+)':1}
+optimize_mode=maximize
+```
+
+- When metric_type=summary, metric_source is the summary path
+
+```
+[metric_config]
+# metric type is summary/table
+metric_type=summary
+# the easyrec model_dir/eval_val/ have the events summary file
+metric_source=hdfs://123.57.44.211:9000/user/nni/datascience_easyrec/model_nni/${exp_id}_${trial_id}/eval_val/
+```
+
+- When metric_type=table, metric_source is a SQL statement
+
+```
+[metric_config]
+# metric type is summary/table
+metric_type=table
+metric_source=select * from ps_smart_classification_metrics where pt='${exp_id}_${trial_id}';
+```
+
+- When metric_type=api, metric_source can name a specific job; for example, metric_source=cmd1
+
+```
+[metric_config]
+# metric type is summary/table/api
+metric_type=api
+# default is cmd1,cmd1->platform job 1, we will get the default job1 metric
+# for a list, use metric_source_1=cmd1,metric_source_2=cmd2
+metric_source=cmd1
+```
+
+## output_config
+
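+| output_config | Description                                                               | Value |
+| ------------- | ------------------------------------------------------------------------- | ----- |
+| model_path    | For the service edition, set model_path to retrieve the best model        | Path  |
+| summary_path  | For the standalone edition, set summary_path to view TensorBoard locally  | Path  |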
+
+## schedule_config
+
+| schedule_config | Description                                           | Value            |
+| --------------- | ----------------------------------------------------- | ---------------- |
+| day             | Schedule AutoML jobs within the specified time range  | everyday/weekend |
+| start_time      | Scheduling start time                                 | 00:00-23:59      |
+| end_time        | Scheduling end time                                   | 00:00-23:59      |
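+
+A minimal sketch of a schedule_config section built from the fields above; the day and time values are illustrative only:
+
+```
+[schedule_config]
+# everyday/weekend
+day=everyday
+start_time=15:15
+end_time=21:59
+```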
+
+## params_config
+
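+If parameters are stored in a file, configure params_config.
+
+| params_config              | Description                                                                                                                                      | Value             |
+| -------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------ | ----------------- |
+| params_src_dst_filepath1xx | Source and destination paths of the file whose parameters are modified; there can be several keys, each starting with params_src_dst_filepath    | src_path,dst_path |
+| params_src_dst_filepath2xx | xx                                                                                                                                               | xx                |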
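+
+A minimal sketch of a params_config section following the src_path,dst_path format in the table. Both paths are hypothetical and only illustrate the shape of the value; using ${exp_id} and ${trial_id} in the destination assumes the value substitution described earlier also applies here:
+
+```
+[params_config]
+# src_path,dst_path (hypothetical paths, for illustration only)
+params_src_dst_filepath=./samples/pipeline.config,./samples/pipeline_${exp_id}_${trial_id}.config
+```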
+
+## oss_config
+
+| oss_config      | Description | Value                                                                      |
+| --------------- | ----------- | -------------------------------------------------------------------------- |
+| endpoint        | endpoint    | [http://oss-cn-shanghai.aliyuncs.com](http://oss-cn-shanghai.aliyuncs.com) |
+| accessKeyID     | ak          | ak                                                                         |
+| accessKeySecret | sk          | sk                                                                         |
+| role_arn        | role_arn    | acs:ram::xxx:role/aliyunserviceroleforpaiautoml                            |
+
+## odps_config
+
+| odps_config   | Description  | Value                                                                                              |
+| ------------- | ------------ | -------------------------------------------------------------------------------------------------- |
+| access_id     | ak           | ak                                                                                                 |
+| access_key    | sk           | sk                                                                                                 |
+| project_name  | project_name | xxx                                                                                                |
+| end_point     | end_point    | External: http://service.odps.aliyun.com/api  Internal: http://service-corp.odps.aliyun-inc.com/api |
+| log_view_host | logview host | External: http://logview.odps.aliyun.com Internal: http://logview.alibaba-inc.com                  |
+| role_arn      | role_arn     | acs:ram::xxx:role/aliyunserviceroleforpaiautoml                                                    |
+
+## dlc_config
+
+| dlc_config | Description | Value                                                                           |
+| ---------- | ----------- | ------------------------------------------------------------------------------- |
+| access_id  | ak          | ak                                                                              |
+| access_key | sk          | sk                                                                              |
+| end_point  | end_point   | External: pai-dlc.cn-shanghai.aliyuncs.com Internal: pai-dlc-share.aliyuncs.com |
+| region     | region      | cn-shanghai                                                                     |
+| protocol   | protocol    | http/https                                                                      |
+
+## ts_config
+
+| ts_config         | Description | Value                        |
+| ----------------- | ----------- | ---------------------------- |
+| access_key_id     | ak          | ak                           |
+| access_key_secret | sk          | sk                           |
+| region_id         | region      | xxx                          |
+| endpoint          | end_point   | pai.cn-hangzhou.aliyuncs.com |
+
+## paiflow_config
+
+| paiflow_config    | Description  | Value   |
+| ----------------- | ------------ | ------- |
+| access_key_id     | ak           | ak      |
+| access_key_secret | sk           | sk      |
+| region_id         | region       | xxx     |
+| workspace_id      | workspace_id | 2332411 |
+
+## monitor_config
+
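+- Follow the [AliDing robot guide](https://open.dingtalk.com/document/robots/custom-robot-access) to add a custom robot and obtain its url
+  - Click the AliDing avatar -> Robot management -> Custom robot -> Group: choose Work Notifications
+  - Click the AliDing avatar -> Robot management -> Custom robot -> Group: choose the target group
+
+| monitor_config | Description                                                                                                                          | Value                                                 |
+| -------------- | ------------------------------------------------------------------------------------------------------------------------------------ | ----------------------------------------------------- |
+| url            | Webhook address of the custom robot you created                                                                                      | https://oapi.dingtalk.com/robot/send?access_token=xxx |
+| keyword        | Custom keyword set when adding the custom robot                                                                                      | monitor                                               |
+| at_mobiles     | Phone numbers of the people to @ in the content; only group members can be @-mentioned, and phone numbers of non-members are masked  | \['11xx'\]                                            |
+| at_user_ids    | userid (employee ID) of the people to @                                                                                              | \[\]                                                  |
+| is_at_all      | Whether to @ everyone                                                                                                                | True/False                                            |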
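+
+A minimal sketch of a monitor_config section that uses only the keys and placeholder values from the table above; replace the access_token, keyword, and phone number with your own:
+
+```
+[monitor_config]
+url=https://oapi.dingtalk.com/robot/send?access_token=xxx
+keyword=monitor
+at_mobiles=['11xx']
+at_user_ids=[]
+is_at_all=False
+```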
+
+## search_space.json
+
+| search_space | Description                                                                                                                             | Value |
+| ------------ | --------------------------------------------------------------------------------------------------------------------------------------- | ----- |
+| key          | Search parameter variable configured in trial.ini                                                                                       |       |
+| type         | Search type defined by NNI; see the [NNI searchSpace reference](https://nni.readthedocs.io/en/v2.2/Tutorial/SearchSpaceSpec.html)      |       |
+| value        | Search values chosen from business knowledge and experience                                                                            |       |
+
+Common search types:
+
+- {"\_type": "choice", "\_value": options}: pick one value from options.
+- {"\_type": "randint", "\_value": \[lower, upper\]}: pick a random integer in \[lower, upper).
+- {"\_type": "uniform", "\_value": \[low, high\]}: sample uniformly between low and high.
+
+### Example
+
+```
+{
+    "${batch_size}": {"_type":"choice", "_value": [16, 32, 64, 128]},
+    "${lr}":{"_type":"choice","_value":[0.0001, 0.001, 0.01, 0.1]}
+}
+```
diff --git a/docs/source/automl/hpo_res.md b/docs/source/automl/hpo_res.md
new file mode 100644
index 000000000..ef3abe85d
--- /dev/null
+++ b/docs/source/automl/hpo_res.md
@@ -0,0 +1,131 @@
+## Tuning results
+
+After the experiment starts, the command line prints the Web UI address: \[Your IP\]:\[Your Port\]
+![image.png](../../images/automl/pai_nni_create.jpg)
+
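+### Overview page
+
+Click the Overview button to see experiment information such as the config file, search space, running time, and log path. NNI also lets you download this information and the parameters through the Experiment summary button.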
+![image.png](../../images/automl/pai_nni_overview.jpg)
+
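+### Trial details page
+
+Click the Trials detail button to see the result of every trial in the experiment.
+succeeded means the trial ran to completion; earlystop means the parameter set performed poorly and was stopped early. The stopping policy is in pai_nni/core/pai_assessor.PaiAssessor and can be adapted to your business needs.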
+![image.png](../../images/automl/pai_nni_detail.jpg)
+
+### Job log details
+
+Click a Trial No to see that trial's log and parameter details. Use the three buttons below to view errors and output.
+![image.png](../../images/automl/pai_nni_log.jpg)
+
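+### Manually stop a hyperparameter set
+
+If a parameter set is performing poorly, you can stop it manually.
+For example, stop the first parameter set.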
+![image.png](../../images/automl/nni_stop.png)
+
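+### Viewing the default metric with multiple objectives
+
+Suppose metric_config is set as follows. The Default metric column in the UI then shows three items, which make up the trial's final metric:
+
+- default=auc\*0.5+accuracy\*0.5
+- auc is the final auc value
+- accuracy is the final accuracy value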
+
+```
+auc=0.5
+accuracy=0.5
+```
+
+![image.png](../../images/automl/nni_metric.png)
+
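+### Best model and parameters
+
+Sort by metric to find the parameters with the best result, here experiment id kfv91xl5 and trial_id zuKwM. The code saves models under the path below by default, so the model can be found there: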
+-Dmodel_dir='oss://lcl-bj/eval_dist_test/model\_${exp_id}\_${trial_id}'
+![image.png](../../images/automl/best-model.png)
+
+### Comparing parameter sets
+
+Select trials by clicking their Trial No, then click Compare to view the details of the selected parameter sets.
+![image.png](../../images/automl/nni-compare.png)
+
+### Viewing multiple experiments
+
+Click All experiments, then click an experiment ID to open its details.
+![image.png](../../images/automl/exp-list.png)
+
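+### Custom parameters or retrying failures
+
+You can submit custom parameters, or use the same feature to restart a failed trial.
+Click the copy button to open the Customized trial dialog, then submit as is or after editing. This adds a new parameter set, so remember to increase MaxTrialNo.
+Note: this feature is currently broken in NNI 2.10; use nni\<=2.9.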
+![image.png](../../images/automl/retry_trial.jpg)
+
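+### Resuming after failure
+
+Add resume=1 to trial.ini, either by editing it while the experiment runs or when first creating it, to turn a rerun into a resume from the point of failure.
+When resuming, the run starts from the cmd at which that parameter set last failed.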
+
+```
+[platform_config]
+name=MaxCompute
+resume=1
+cmd=PAI -name xxx
+```
+
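+### Retry all failed trials at once
+
+When you have confirmed that trials failed for reasons such as lack of resources or intermittent algorithm errors and want to retry them, this API retries several failed trials together. Internally it raises NNI's maximum trial number while keeping the concurrency unchanged, and adds new trials whose parameters match those of the failed ones.
+
+Note: this feature is currently broken in NNI 2.10; use nni\<=2.9.
+
+- experiment_id: ID of the experiment to retry (required)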
+
+```
+python -m hpo_tools.core.utils.retry_multi_failed_trials --experiment_id=o968matg
+```
+
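+### Stop all running trials at once
+
+Once you have the model and parameters you want, note that stopping an NNI experiment only stops the local processes; jobs on platforms such as dlc/trainingservice keep running. This interface stops the experiment's running jobs and sets the maximum trial number to 1 (the smallest positive value) so that no new job starts at the moment of stopping.
+
+- experiment_id: ID of the experiment to stop (required)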
+
+```
+python -m hpo_tools.core.utils.kill_multi_running_trials --experiment_id=o968matg
+```
+
+### Stop the experiment
+
+Before stopping the experiment, first stop the running trials as described in the previous section.
+
+```
+nnictl stop exp_id
+```
+
+### Hyperparameter analysis
+
+Click Hyper-parameter and select the metric of interest to see roughly which parameter values perform best.
+![image.png](../../images/automl/hyper.jpg)
+
+### tensorboard
+
+Configure output_config so that the summary files can be fetched.
+See https://nni.readthedocs.io/zh/stable/experiment/web_portal/tensorboard.html
+
+NNI currently has a bug that needs the following workaround first:
+
+```
+nni_tensorboard_filepath=$(python3 -c "import nni;import os;print(os.path.join(os.path.dirname((os.path.dirname(nni.__file__))),'nni_node/extensions/nniTensorboardManager.js'))")
+echo "nni_tensorboard_filepath:"$nni_tensorboard_filepath
+sed -i -e "s/--bind_all/--host 0.0.0.0/g" $nni_tensorboard_filepath
+```
+
+```
+[output_config]
+summary_path=oss://lcl-bj/eval_dist_test/model_${exp_id}_${trial_id}_${params}
+```
+
+![image.png](../../images/automl/nni_tensorboard.jpg)
diff --git a/docs/source/automl/pai_nni_hpo.md b/docs/source/automl/pai_nni_hpo.md
index 4ff77de4e..7f08fc470 100644
--- a/docs/source/automl/pai_nni_hpo.md
+++ b/docs/source/automl/pai_nni_hpo.md
@@ -1,73 +1,60 @@
 # PAI-NNI-HPO
 
-## GetStarted
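+HPO is a tool that automatically searches and tunes model parameters and training hyperparameters (opt, lr, and so on) to find better settings and improve model quality. It saves algorithm engineers much of the time spent on manual tuning, letting them focus on modeling and the business. We integrate NNI with PAI products and algorithms, support tuning on multiple platforms with zero code changes, and add enhancements for acceleration, monitoring, scheduling, and resuming.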
 
-注意NNI仅支持python>=3.7,因此请配置python>=3.7的环境
+NNI reference: https://nni.readthedocs.io/en/stable/hpo/overview.html
 
-NNI is tested and supported on Ubuntu >= 18.04, Windows 10 >= 21H2, and macOS >= 11.
+# Installation
 
-### 下载安装easyrec
+System: Ubuntu >= 18.04, Windows 10 >= 21H2, macOS >= 11.
 
-```bash
-git clone https://github.com/alibaba/EasyRec.git
-cd EasyRec
-bash scripts/init.sh
-python setup.py install
-```
+Python: NNI supports only python>=3.7, so set up a python>=3.7 environment.
+
+Java: java8 is required to run PAI commands on MC (MaxCompute).
 
-### 下载安装hpo-tools
+## Download and install hpo-tools
 
-#### 安装python3.7+以上环境 (可选)
+The install command is:
 
 ```
-wget http://automl-nni.oss-cn-beijing.aliyuncs.com/nni/hpo_tools/Anaconda3-5.3.1-Linux-x86_64.sh
-bash Anaconda3-5.3.1-Linux-x86_64.sh
-source ~/.bashrc
-conda create -n test python=3.7
-source activate test
+source install_hpo_tools.sh $1 $2
 ```
 
-#### 安装java8及以上环境（可选）
-
-如果用户不需要提交maxcompute作业，或者环境已经满足，可以跳过。[odpscmd 参考](https://help.aliyun.com/document_detail/27971.html#section-dje-rvv-jp2)
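+- The first argument is the location for the downloaded examples; they are placed in an examples directory under the given path. If no directory is given, they are created under the root directory.
+- The second argument is one of aliyun/eflops/mac-dlc/mac-arm-dlc and selects which dlc version to install; if omitted, the aliyun version of dlc is installed.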
 
-此处给了一个mac安装java8的教程，一般linux服务器会自带java8
+### Linux
 
 ```
-# on mac
-brew install --cask homebrew/cask-versions/adoptopenjdk8
-
-# 接下来获取jdk安装路径，输入 /usr/libexec/java_home -V.红方框內就是jdk的安装路径，复制备用(/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home)
-/usr/libexec/java_home -V
-
-# 设置环境变量 vi /etc/profile
-export JAVA_HOME=/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home
-export PATH=$JAVA_HOME/bin:$PATH:.
-export CLASS_PATH=$JAVA_HOME/lib
-source /etc/profile
-java -version
+wget https://automl-nni.oss-cn-beijing.aliyuncs.com/nni/hpo_tools/scripts/install_hpo_tools.sh
+source install_hpo_tools.sh ./ aliyun
+source ~/.bashrc
 ```
 
-#### 安装hpo_tools
-
-第一个参数为下载examples的位置，默认下载在输入路径下面的examples下; 如果没写目录，默认生成在根目录下。
-
-安装2选1，install_hpo_tools.sh默认会安装最新版本，最新版本内的代码和案例都是匹配的，可以正常运行，但可能文档配置未更新。因此可以采用安装当前文档匹配的版本。
-
-#### 安装最新版本（可选）
+### MAC
 
 ```
+# On macOS, switch the shell from zsh to bash
+chsh -s /bin/bash
+
+# On macOS, choose one of aliyun/eflops/mac-dlc/mac-arm-dlc
 wget https://automl-nni.oss-cn-beijing.aliyuncs.com/nni/hpo_tools/scripts/install_hpo_tools.sh
-source install_hpo_tools.sh ./ aliyun
-cd ./examples/search/maxcompute_easyrec
+source install_hpo_tools.sh ./ mac-dlc
+
+source ~/.bashrc
 ```
 
-#### 安装当前版本（可选）
+### MAC ARM
 
 ```
-wget https://automl-nni.oss-cn-beijing.aliyuncs.com/nni/hpo_tools/scripts/install_hpo_tools_0.1.10.sh
-source install_hpo_tools_0.1.10.sh ./ aliyun
-cd ./examples/search/maxcompute_easyrec
+# On macOS, switch the shell from zsh to bash
+chsh -s /bin/bash
+
+# On macOS, choose one of aliyun/eflops/mac-dlc/mac-arm-dlc
+wget https://automl-nni.oss-cn-beijing.aliyuncs.com/nni/hpo_tools/scripts/install_hpo_tools.sh
+source install_hpo_tools.sh ./ mac-arm-dlc
+
+source ~/.bashrc
 ```
 
 - 注意如果有旧版本，会先卸载旧版本，升级新版本hpo-tools
@@ -76,124 +63,79 @@ cd ./examples/search/maxcompute_easyrec
 - 默认会安装dlc命令行工具，用于提交dlc作业
 - 默认会安装odpscmd命令行工具，用于提交maxcompute作业
 
-### 卸载hpo-tools（可选）
+## Docker images (optional)
 
-如果需要升级，则需要先卸载之前安装包,第一个参数为原始安装的位置；默认会卸载hpo_tools，删除examples/dlc/odpscmd
+Images are provided so that you can use the tools without installing them; local/dlc/mc/trainingservice/paiflow are supported.
 
-```
-bash uninstall_hpo_tools.sh ./
-```
+- External GPU image: registry.cn-shanghai.aliyuncs.com/mybigpai/nni:gpu-latest
+- External CPU image: registry.cn-shanghai.aliyuncs.com/mybigpai/nni:cpu-latest
+
+### Start the image
 
-## 配置
+```
+mkdir -p ./examples
+cd examples
+echo $(pwd)
 
-### 配置 config.ini
+# Mount the directory and get the container id
+container_id=`docker run -td --network host  -v $(pwd):/HpoTools/test registry.cn-shanghai.aliyuncs.com/mybigpai/nni:cpu-latest`
+echo $container_id
 
-#### config.ini中参数的自动替换：
+# Copy the examples from the container to the local directory
+docker cp $container_id:/HpoTools/examples/search $(pwd)
+# Path of the example configs
+ls $(pwd)/search
 
-程序会将config.ini 中以下这些key默认替换成对应的值。参数默认支持值替换、列表替换、字典替换、json替换、文件替换（params_config)
+# Enter the container
+docker exec -ti $container_id /bin/bash
+cd /HpoTools/test/search
 
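+### To edit the experiment config, work locally under $(pwd)/search (see chapter 2)
+### To start tuning, work in the container under /HpoTools/test/search (see chapter 3)
+### To view the tuning results, open the UI locally (see chapter 4)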
 ```
-def update_default_params(cmd, tuner_params={}, params_only=False):
-    """update params in cmd."""
-    trial_id = str(nni.get_trial_id())
-    experment_id = str(nni.get_experiment_id())
-
-    tuner_params_list = ''
-    tuner_params_dict = ''
-    for p, v in tuner_params.items():
-        cmd = cmd.replace(p, str(v))
-        tuner_params_list += p + ' ' + str(v) + ' '
-        tuner_params_dict += p + '=' + str(v) + ' '
-
-    # params_only used in replace ak,sk in test at the begining
-    if not params_only:
-        # lower for k8s meta.name
-        cmd = cmd.replace('${exp_id}', experment_id.lower())
-        cmd = cmd.replace('${trial_id}', trial_id.lower())
-        cmd = cmd.replace('${NNI_OUTPUT_DIR}',
-                          os.environ.get('NNI_OUTPUT_DIR', './tmp'))
-        cmd = cmd.replace('${tuner_params_list}', tuner_params_list)
-        cmd = cmd.replace('${tuner_params_dict}', tuner_params_dict)
-        cmd = cmd.replace('${tuner_params_json}', json.dumps(tuner_params))
-
-    return cmd
-```
-
-#### params_config(可选）
 
-- 如果用户的参数是保存在文件中，则需要配置params_config, 用于标记需要修改参数的源文件路径和目标路径;可以为多个params_src_dst_filepathxx=src_path,dst_path,注意以，分割；支持OSS/HDFS/NAS/LOCAL
-- 如果用户想要生成参数，也可以配置params_config,只需要配置目标路径即可
+# Configuration
 
-#### platform_config(必选）
+The HPO launch configuration consists of three parts: exp.yml, trial.ini, and search_space.json.
 
-用于标记任务执行的平台以及对应的执行命令
+The HPO launch command is:
 
 ```
-name=DLC/MaxCompute/DataScience/LOCAL/PAI
-cmdxx=xx （执行的命令行）
+nnictl create --config exp.yml
 ```
 
-#### metric_config（必选）
-
-用于标记任务metric的获取来源、metric类型、最终metric的方式、metric的key以及对应权重、
-其中
-
-- metric_type=summary/table/api/json/stdout(必选）
-- metric_source=xxx（必选，可以为多个以metric_source开头的，具体可以看finetune案例）
-  - metric_source=oss://lcl-bj/eval_dist_test/model\_${exp_id}\_${trial_id}/eval_val/ 为easyrec model_dir/eval_val/下
-- final_mode=final/best/avg（可选，默认值为best，可选值为final/best/avg）
-- optimize_mode=maximize/minimize （可选，默认值为maximize, 可选值为maximize/minimize)
-- source_list_final_mode=final/best/avg（可选，默认值为final_mode，可选值为final/best/avg,用于有多个metric_source时最终metric如何计算，具体可以看maxcompute_crossvalidation案例）
-- metric_dict示例：对应查询的key以及对应的权重
-  - 多目标示例：metric=val(’auc_is_valid_play’)\*0.5+val(’auc_is_like’)\*0.25+val(’auc_is_comment’)\*0.25
-    ```
-    metric_dict={'auc_is_like':0.25, 'auc_is_valid_play':0.5, 'auc_is_comment':0.25}
-    ```
-  - 多目标示例：metric=val(’auc_is_valid_play’)\*0.5+val(’auc_is_like’)\*0.25+val(’auc_is_comment’)\*0.25-val(’loss_play_time’)\*0.25
-    注意：如果config.yml中nni tuner、assessor的配置方式是按metric最大化方式去选择参数的，对于loss这种越小越好的metric，需要定义权重为负值。
-    ```
-    metric_dict={'auc_is_like':0.25, 'auc_is_valid_play':0.5, 'auc_is_comment':0.25, 'loss_play_time':-0.25}
-    ```
-  - 单目标示例：metric=val(’auc_is_valid_play’)\*1
-    ```
-    metric_dict={'auc_is_valid_play':1}
-    ```
-  - 如果metric_type=stdout类型，则metric_dict对应的key为正则表达式，value为对应的权重,可以查看dlc_mnist/config_local_stdout.ini示例
-    ```
-    [metric_config]
-    # metric type is summary/table
-    metric_type=stdout
-    metric_source=oss://test-nni/examples/search/pai/stdout/stdout_${exp_id}_${trial_id}
-    # best or final,default=best
-    final_mode=best
-    metric_dict={'validation: accuracy=([0-9\\.]+)':1}
-    optimize_mode=maximize
-    ```
-
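-#### oss_config (optional)
-
-If the task needs OSS storage, configure the OSS config by modifying the values under oss_config.
-
-#### odps_config (optional)
-
-If the task runs on the MaxCompute platform, configure the odps config by modifying the values under odps_config.
-
-#### schedule_config (optional)
-
-Supports scheduling AutoML tasks within a specified time range.
-
-- Day level, e.g. everyday/weekend
-- Minute level, e.g. 09:00~21:00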
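+- The entry point is exp.yml.
+- trialCommand: python3 -m hpo_tools.core.utils.run --config=./trial.ini links to the user's actual launch task.
+- The searchSpaceFile: search_space.json field links to search_space.json.
+
+Configuration examples can all be found under examples/search in the installation directory; for details see [HPO configuration](./hpo_config.md).
+
+## exp.yml example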
 
 ```
-[schedule_config]
-# everyday/weekend
-day=everyday
-start_time=15:15
-end_time=21:59
+experimentName: maxcompute_easyrec
+experimentWorkingDirectory: ../expdir
+searchSpaceFile: search_space.json
+trialCommand: python3 -m hpo_tools.core.utils.run --config=./trial.ini
+trialConcurrency: 1
+maxTrialNumber: 1
+tuner:
+  name: TPE
+  classArgs:
+    optimize_mode: maximize
+trainingService:
+  platform: local
+assessor:
+  name: PAIAssessor
+  classArgs:
+    optimize_mode: maximize
+    start_step: 1
 ```
 
-#### config.ini example
+## trial.ini example
+
+See examples/search/maxcompute_easyrec/trial.ini in the installation directory; it runs on the PAI MaxCompute platform.
 
 ```
 [oss_config]
@@ -239,15 +181,9 @@ metric_dict={'auc':1}
 
 ```
 
-##### easyrec command configuration
-
-For parameter descriptions, see the [MaxCompute Tutorial](../quick_start/mc_tutorial.md):
+## trial_local.ini example
 
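-Note that values in the pai command must be quoted, e.g. DossHost='oss-cn-beijing-internal.aliyuncs.com'
-
-#### config_local.ini example
-
-Here the command runs locally rather than on the PAI MaxCompute platform.
+See examples/local_easyrec/trial.ini in the installation directory; here the command runs locally rather than on the PAI MaxCompute platform.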
 
 ```
 [params_config]
@@ -267,7 +203,7 @@ final_mode=final
 metric_dict={'auc':1}
 ```
 
-##### CPU/GPU
+### CPU/GPU
 
 [NNI Local configuration reference](https://nni.readthedocs.io/zh/stable/reference/experiment_config.html#localconfig)
 
@@ -275,7 +211,7 @@ metric_dict={'auc':1}
 - To run the task on CPU, use config_local.yml
   ![image.png](../../images/automl/nni_local.jpg)
 
-### Configure the hyperparameter search space: search_space.json
+## Configure the hyperparameter search space: search_space.json
 
 - key is the parameter name in Dconfig; see the [EasyRecConfig reference](../reference.md)
 - type is the search type defined in nni; see the [NNI searchSpace reference](https://nni.readthedocs.io/en/v2.2/Tutorial/SearchSpaceSpec.html)
@@ -289,7 +225,7 @@ metric_dict={'auc':1}
 
 For common search spaces, see samples/hpo/search_space.json
 
-##### Notes on key configuration
+### Notes on key configuration
 
 ${initial_learning_rate} is a key in search_space.json and must be placed in the easyrec pipeline config in advance; new hyperparameters are introduced through variable substitution
 
@@ -312,7 +248,7 @@ train_config {
   }
 ```
 
-##### Notes on type configuration
+### Notes on type configuration
 
 [NNI searchSpace reference](https://nni.readthedocs.io/en/v2.2/Tutorial/SearchSpaceSpec.html)
 
@@ -320,9 +256,11 @@ train_config {
 - {"\_type": "randint", "\_value": \[lower, upper\]}: picks a random integer in \[lower, upper).
 - {"\_type": "uniform", "\_value": \[low, high\]}: samples uniformly between low and high.
 
-## Start tuning
+## Advanced
+
+For advanced finetune search usage, see [HPO finetune](./finetune_config.md)
 
-### Launch command
+# Start tuning
 
 ```bash
 nnictl create --config config.yml --port=8780
@@ -336,332 +274,18 @@ nnictl create --config config.yml --port=8780
 Screen shown on successful startup:
 ![image.png](../../images/automl/pai_nni_create.jpg)
 
-### config.yml parameters
+If startup fails, check the FAQ in section 6 first.
 
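-config.yml is the NNI configuration file: it ties the code to the search space and runs your training code in the specified environment; see the config.yml file below. You can also provide other settings here, such as concurrency, tuning algorithm, maximum trial number, and maximum duration.
-
-[NNI reference: config.yml](https://nni.readthedocs.io/zh/stable/reference/experiment_config.html#experimentconfig)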
-
-```
-experimentWorkingDirectory: ../expdir
-searchSpaceFile: search_space.json
-trialCommand: python3 -m hpo_tools.core.utils.run --config=./config.ini
-trialConcurrency: 1
-maxTrialNumber: 1
-tuner:
-  name: TPE
-  classArgs:
-    optimize_mode: maximize
-debug: true
-logLevel: debug
-trainingService:
-  platform: local
-assessor:
-  name: PAIAssessor
-  classArgs:
-    platform: MAXCOMPUTE
-    optimize_mode: maximize
-    start_step: 1
-```
+# HPO tuning results
 
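-### Concurrency, maximum trial number, and maximum duration can be adjusted at runtime:
-
-Recommendation: start with 1; once the code has been debugged successfully, first increase Max trial No., then increase Concurrency.
-![image.png](../../images/automl/pai_nni_modify.jpg)
-
-## Tuning results
-
-After the experiment starts, the Web UI address is shown in the command line: \[Your IP\]:\[Your Port\]
-![image.png](../../images/automl/pai_nni_create.jpg)
-
-### View the overview page
-
-Click the Overview button to see experiment information such as the configuration file, search space, running time, and log path. NNI also supports downloading this information and the parameters via the Experiment summary button.
+Click the generated URL, e.g. http://127.0.0.1:8780, to open the WebUI.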
 ![image.png](../../images/automl/pai_nni_overview.jpg)
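+Once the tuning results look correct, you can increase the maximum trial number (MaxTrialNo) and the concurrency (Concurrency).
+For more detailed tuning results, see [HPO tuning results](./hpo_res.md)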
 
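-### View the trial detail page
-
-Click the Trials detail button to see the result of every trial over the whole experiment.
-succeeded means the trial ran successfully; earlystop means the parameter set performed poorly and was stopped early. The stopping policy is in pai_nni/core/pai_assessor.PaiAssessor and can be modified to fit your business needs.
-![image.png](../../images/automl/pai_nni_detail.jpg)
-
-### View job log details
-
-Click a Trial No to see that trial's log and parameter details; use the three buttons below to view errors and output.
-![image.png](../../images/automl/pai_nni_log.jpg)
-
-### Manually stop a set of hyperparameters
-
-If some parameters perform poorly, you can stop them manually.
-For example, stop the first set of parameters.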
-![image.png](../../images/automl/nni_stop.png)
-
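-### Viewing the multi-objective default metric
-
-Suppose metric_config is configured as below; the Default metric in the UI then shows three items, which are the final metrics of that trial:
-
-- default=auc\*0.5+accuracy\*0.5
-- auc is the final auc value
-- accuracy is the final accuracy value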
-
-```
-auc=0.5
-accuracy=0.5
-```
-
-![image.png](../../images/automl/nni_metric.png)
-
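-### Best model and parameters
-
-Sort by metric to get the best-performing parameters. In this example the experiment id is kfv91xl5 and the trial_id is zuKwM. The code saves the model to the following path by default, so the model can be found there: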
--Dmodel_dir='oss://lcl-bj/eval_dist_test/model\_${exp_id}\_${trial_id}'
-![image.png](../../images/automl/best-model.png)
-
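-### Comparing multiple parameter sets
-
-Click Trial No to select trials, then click compare to view the corresponding parameter information.
-![image.png](../../images/automl/nni-compare.png)
-
-### Viewing multiple experiments
-
-To view multiple experiments, click All experiments, then click an experiment ID to open its details.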
-![image.png](../../images/automl/exp-list.png)
-
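-### Custom parameters or retrying a failure
-
-You can submit custom parameters, or use the same feature to restart a failed trial.
-Click the copy button; when the Customized trial dialog pops up, submit it directly or after editing. This adds a new parameter set, so remember to raise MaxTrialNo.
-Note: this feature currently has problems in 2.10; nni\<=2.9 is required.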
-![image.png](../../images/automl/retry_trial.jpg)
-
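-### Retry failed trials in one step
-
-When you have confirmed that the failures were caused by a lack of resources or an intermittent algorithm failure and want to retry, this API retries multiple failed trials together. Internally it raises NNI's maximum trial number while leaving concurrency unchanged, and adds new trials whose parameters match those of the failed trials.
-
-Note: this feature currently has problems in 2.10; nni\<=2.9 is required.
-
-- experiment_id: ID of the experiment to retry (required)
-- trial_begin_id: defaults to 0 (optional; retrying starts from trial 0)
-- trial_end_id: defaults to -1 (optional; retrying ends at the last trial)
-
-For example:
-Experiment exp ran 20 trials, 5 of which failed; the maximum trial number is 30.
-
-First retry with parameters (exp,0,-1): the maximum trial number is changed to 35, and 2 trials still fail.
-
-Second retry with parameters (exp,20,-1): the maximum trial number is changed to 37, and all trials now succeed, so no further retry is needed.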
-
-```
-python -m hpo_tools.core.utils.retry_multi_failed_trials --experiment_id=o968matg --trial_begin_id=0 --trial_end_id=-1
-python -m hpo_tools.core.utils.retry_multi_failed_trials --experiment_id=o968matg --trial_begin_id=20 --trial_end_id=-1
-```
-
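-### Stop running trials in one step
-
-When stopping an experiment, NNI only stops local processes and does not stop jobs on platforms such as dlc/trainingservice. Once you have the model and parameters you want, this interface stops the experiment's running jobs and sets the maximum trial number to 1 (the smallest positive integer) so that no new job starts at the moment of stopping.
-
-- experiment_id: ID of the experiment to stop (required)
-- trial_begin_id: defaults to 0 (optional; stopping starts from trial 0)
-- trial_end_id: defaults to -1 (optional; stopping ends at the last trial)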
-
-```
-python -m hpo_tools.core.utils.kill_multi_running_trials --experiment_id=o968matg --trial_begin_id=0 --trial_end_id=-1
-```
-
-### Stop the experiment
-
-Before stopping the experiment, first follow "Stop running trials in one step", then stop the experiment.
-
-```
-nnictl stop exp_id
-```
-
-### Hyperparameter analysis
-
-Click Hyper-parameter and select the metrics of interest to see roughly which parameters perform best and analyze them.
-![image.png](../../images/automl/hyper.jpg)
-
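-## finetune training (optional)
-
-Recommendation workloads receive fresh data every day. If you first train on a batch of historical data and then finetune the model daily, you can take the best result from the begin tuning above and finetune on new data. If you retrain from scratch on every update, skip this step.
-
-### Tuning tips
-
-For example: with 40 days of historical data, first tune on 30 days using the steps above, then finetune on the remaining 10 days with the best parameters found.
-Rule of thumb: starting from the best parameters of begin training, set learning_rate to the learning_rate at the end of begin training.
-For example:
-With the learning_rate below and 8000 total steps of begin training, you can set initial_learning_rate=1e-6 or 1e-7 for finetune: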
-
-```
-learning_rate {
-        exponential_decay_learning_rate {
-          initial_learning_rate: 0.001
-          decay_steps: 1000
-          decay_factor: 0.1
-          min_learning_rate: 1e-07
-        }
-      }
-```
-
-Manual edits and code-based edits are both supported; the result looks like this:
-![image.png](../../images/automl/modify_lr.jpg)
-
-#### Modify the config with code (optional)
-
-Modifying a local pipeline file is supported:
-
-```bash
-python modify_pipeline_config.py --pipeline_config_path=./samples/pipeline.config --save_path=./samples/pipeline_finetune.config --learning_rate=1e-6
-```
-
-Modifying a pipeline file on OSS directly is also supported:
-
-```bash
-python modify_pipeline_config.py  --pipeline_config_path=oss://easyrec/pipeline889.config --save_path=oss://easyrec/pipeline889-f.config --learning_rate=1e-6 --oss_config=../config/.ossutilconfig
-```
-
-To check whether better parameters exist, start tuning as described in the next subsection.
-
-### Start tuning (optional)
-
-```bash
-nnictl create --config config_finetune.yml --port=8617
-```
-
-#### config_finetune.ini
-
-```
-[platform_config]
-name=MaxCompute
-{% set date_list = [20220616,20220617] %}
-{% set date_begin = 20220616 %}
-{% for bizdate in date_list %}
-{% set eval_ymd = bizdate +1 %}
-{% set predate = bizdate -1 %}
-{% if bizdate == date_begin %}
-cmd1_{{bizdate}}="PAI -name=easy_rec_ext
-    -project=algo_public
-    -Dscript='oss://automl-nni/easyrec/easy_rec_ext_615_res.tar.gz'
-    -Dtrain_tables='odps://pai_rec_dev/tables/rec_sv_rebuild_acc_rnk_rank_sample_embedding_modify/dt={{bizdate}}'
-    -Deval_tables='odps://pai_rec_dev/tables/rec_sv_rebuild_acc_rnk_rank_sample_embedding_modify/dt={{eval_ymd}}'
-    -Dcmd=train
-    -Deval_method=separate
-    -Dfine_tune_checkpoint="oss://automl-nni/easyrec/finetune/{{predate}}_finetune_model_nni_622"
-    -Dconfig='oss://automl-nni/easyrec/config/easyrec_model_${exp_id}_${trial_id}.config'
-    -Dmodel_dir='oss://automl-nni/easyrec/finetune/{{bizdate}}_finetune_model_nni_622/${exp_id}_${trial_id}'
-    -Dselected_cols='is_valid_play,ln_play_time,is_like,is_comment,features,content_features'
-    -Dbuckets='oss://automl-nni/'
-    -Darn='xxx'
-    -DossHost='oss-cn-beijing-internal.aliyuncs.com'
-    -Dcluster={"ps":{"count":1,"cpu":1600,"memory":40000 },"worker":{"count":12,"cpu":1600,"memory":40000}} "
-
-{% else %}
-cmd1_{{bizdate}}="PAI -name=easy_rec_ext
-    -project=algo_public
-    -Dscript='oss://automl-nni/easyrec/easy_rec_ext_615_res.tar.gz'
-    -Dtrain_tables='odps://pai_rec_dev/tables/rec_sv_rebuild_acc_rnk_rank_sample_embedding_modify/dt={{bizdate}}'
-    -Deval_tables='odps://pai_rec_dev/tables/rec_sv_rebuild_acc_rnk_rank_sample_embedding_modify/dt={{eval_ymd}}'
-    -Dcmd=train
-    -Deval_method=separate
-    -Dfine_tune_checkpoint="oss://automl-nni/easyrec/finetune/{{predate}}_finetune_model_nni_622/${exp_id}_${trial_id}"
-    -Dconfig='oss://automl-nni/easyrec/config/easyrec_model_${exp_id}_${trial_id}.config'
-    -Dmodel_dir='oss://automl-nni/easyrec/finetune/{{bizdate}}_finetune_model_nni_622/${exp_id}_${trial_id}'
-    -Dselected_cols='is_valid_play,ln_play_time,is_like,is_comment,features,content_features'
-    -Dbuckets='oss://automl-nni/'
-    -Darn='xxx'
-    -DossHost='oss-cn-beijing-internal.aliyuncs.com'
-    -Dcluster={"ps":{"count":1,"cpu":1600,"memory":40000 },"worker":{"count":12,"cpu":1600,"memory":40000}} "
-{% endif %}
-
-{% endfor %}
-
-
-[metric_config]
-# metric type is summary/table
-metric_type=summary
-{% set date_list = [20220616,20220617] %}
-{% for bizdate in date_list %}
-metric_source_{{bizdate}}=oss://automl-nni/easyrec/finetune/{{bizdate}}_finetune_model_nni_622/${exp_id}_${trial_id}/eval_val/
-{% endfor %}
-# best/final/avg,default=best
-final_mode=final
-source_list_final_mode=avg
-metric_dict={'auc_is_like':0.25, 'auc_is_valid_play':0.5, 'auc_is_comment':0.25}
-```
-
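-Differences from begin training:
-
-- Every config section supports jinja template rendering
-- Set the finetune dates: {% set date_list = \[20220616,20220617\] %}
-- Set the finetune start date: {% set date_begin = 20220616 %}; Dfine_tune_checkpoint uses a different model path for the start date than for later dates
-- Assuming daily finetune:
-  - {bizdate} must be kept; the code replaces it with the current date
-  - {eval_ymd} must be kept; the code replaces it with the next day's date
-  - {predate} must be kept; the code replaces it with the previous day's date
-- metric_source also contains multiple paths; each day's result is the final summary value, and the finetune result for a parameter set is the average over those days
-
-#### Configure the hyperparameter search space: search_space.json
-
-Reuse the parameters you searched during begin training. Because this is finetune training, do not search network-structure parameters; in practice, search the LR.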
-
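-## EarlyStop algorithm
-
-### Overview
-
-Compares a trial's results with all history in the same experiment; if the trial fails the criterion (e.g. it is below the median), that hyperparameter set is stopped. With a maximum of max_trial_num trials, actual usage is significantly lower than max_trial_num, though the exact amount depends on the task and the sampled hyperparameters. For example, with max_trial_num=50 the total may end up being fewer than 25 runs, while nearly all 50 hyperparameter sets have been explored.
-In config.yml:
-
-- optimize_mode: optimization direction, maximize/minimize
-- start_step: the step from which early-stop checks begin (e.g. from step 2)
-- moving_avg: when checking for early stop, use the moving average of all history as the criterion
-- proportion: the best value of the current trial is compared with the proportion value of the history
-- patience: stop after the metric declines this many times in a row
-- platform: currently supports LOCAL/PAI/DLC/DATASCIENCE/MAXCOMPUTE/TRAININGSERVICE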
-
-```
-assessor:
-  name: PAIAssessor
-  classArgs:
-    platform: MAXCOMPUTE
-    optimize_mode: maximize
-    start_step: 1
-    moving_avg: true
-    proportion: 0.5
-```
-
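-### Customizing the algorithm
-
-To define a custom stopping policy (for example, stop if accuracy has not reached 0.9 after a given number of steps) and speed up the search, modify the code; see NNI CustomizeAssessor.
-Be sure to inherit from the assessor for your platform: hpo_tools/core/assessor/pai_assessor.PAIAssessor
-The trial_end function is called when a trial is stopped: it shuts down the job running on the platform and kills the thread that monitors the metric.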
-
-```
-def trial_end(self, trial_job_id, success):
-        logging.info('trial end')
-        # user_cancelled or early_stopped
-        if not success:
-            if self.platform == 'DATASCIENCE':
-                DataScienceTask(trial_id=trial_job_id).stop_job()
-            elif self.platform in ['LOCAL', 'PAI']:
-                logging.info(
-                    "the platform is local or pai, don't need to stop remote job"
-                )
-            elif self.platform == 'DLC':
-                DLCTask(trial_id=trial_job_id).stop_job()
-            elif self.platform == 'MAXCOMPUTE':
-                MaxComputeTask(trial_id=trial_job_id).stop_job()
-            elif self.platform == 'TRAININGSERVICE':
-                TrainingServiceTask(trial_id=trial_job_id).stop_job()
-            else:
-                raise TypeError(
-                    f"the self.platform {self.platform} not "
-                    f"in DATASCIENCE,DLC,MAXCOMPUTE,LOCAL,PAI,TRAININGSERVICE "
-                )
-            # remove json file
-            remove_filepath(trial_id=trial_job_id)
-```
+See the [NNI WebPortal documentation](https://nni.readthedocs.io/en/stable/experiment/web_portal/web_portal.html)
 
-## FAQ
+# FAQ
 
 - If you installed on a Mac and hit a permission problem when starting nni, you can resolve it manually
 
diff --git a/easy_rec/python/compat/early_stopping.py b/easy_rec/python/compat/early_stopping.py
index fe4c12132..fc850fb62 100644
--- a/easy_rec/python/compat/early_stopping.py
+++ b/easy_rec/python/compat/early_stopping.py
@@ -21,9 +21,9 @@
 import os
 import threading
 import time
-from distutils.version import LooseVersion
 
 import tensorflow as tf
+from distutils.version import LooseVersion
 from tensorflow.python.framework import dtypes
 from tensorflow.python.framework import ops
 from tensorflow.python.ops import init_ops
diff --git a/easy_rec/python/test/train_eval_test.py b/easy_rec/python/test/train_eval_test.py
index 2ae51751f..087c90cd8 100644
--- a/easy_rec/python/test/train_eval_test.py
+++ b/easy_rec/python/test/train_eval_test.py
@@ -7,11 +7,11 @@
 import threading
 import time
 import unittest
-from distutils.version import LooseVersion
 
 import numpy as np
 import six
 import tensorflow as tf
+from distutils.version import LooseVersion
 from tensorflow.python.platform import gfile
 
 from easy_rec.python.main import predict
diff --git a/setup.cfg b/setup.cfg
index 2303ef802..9f89eca98 100644
--- a/setup.cfg
+++ b/setup.cfg
@@ -10,7 +10,7 @@ multi_line_output = 7
 force_single_line = true
 known_standard_library = setuptools
 known_first_party = easy_rec
-known_third_party = absl,common_io,docutils,eas_prediction,future,google,graphlearn,kafka,matplotlib,numpy,oss2,pai,pandas,psutil,six,sklearn,sphinx_markdown_tables,sphinx_rtd_theme,tensorflow,yaml
+known_third_party = absl,common_io,distutils,docutils,eas_prediction,future,google,graphlearn,kafka,matplotlib,numpy,oss2,pai,pandas,psutil,six,sklearn,sphinx_markdown_tables,sphinx_rtd_theme,tensorflow,yaml
 no_lines_before = LOCALFOLDER
 default_section = THIRDPARTY
 skip = easy_rec/python/protos