Commit

Merge branch 'master' into feature/multi_task_add_dynamic_weight
chengaofei authored Aug 29, 2024
2 parents 53d9ba2 + dd64fd9 commit e1ca0ed
Showing 42 changed files with 935 additions and 83 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -63,7 +63,7 @@ Running Platform:
- [DSSM](docs/source/models/dssm.md) / [MIND](docs/source/models/mind.md) / [DropoutNet](docs/source/models/dropoutnet.md) / [CoMetricLearningI2I](docs/source/models/co_metric_learning_i2i.md) / [PDN](docs/source/models/pdn.md)
- [W&D](docs/source/models/wide_and_deep.md) / [DeepFM](docs/source/models/deepfm.md) / [MultiTower](docs/source/models/multi_tower.md) / [DCN](docs/source/models/dcn.md) / [FiBiNet](docs/source/models/fibinet.md) / [MaskNet](docs/source/models/masknet.md) / [PPNet](docs/source/models/ppnet.md) / [CDN](docs/source/models/cdn.md)
- [DIN](docs/source/models/din.md) / [BST](docs/source/models/bst.md) / [CL4SRec](docs/source/models/cl4srec.md)
- [MMoE](docs/source/models/mmoe.md) / [ESMM](docs/source/models/esmm.md) / [DBMTL](docs/source/models/dbmtl.md) / [PLE](docs/source/models/ple.md)
- [MMoE](docs/source/models/mmoe.md) / [ESMM](docs/source/models/esmm.md) / [DBMTL](docs/source/models/dbmtl.md) / [AITM](docs/source/models/aitm.md) / [PLE](docs/source/models/ple.md)
- [HighwayNetwork](docs/source/models/highway.md) / [CMBF](docs/source/models/cmbf.md) / [UNITER](docs/source/models/uniter.md)
- More models in development

Binary file added docs/images/models/aitm.jpg
1 change: 1 addition & 0 deletions docs/source/benchmark.md
@@ -9,6 +9,7 @@
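- This is the Taobao display-advertising click-through-rate prediction dataset, containing user features, ad features, and behavior logs. [Tianchi competition link](https://tianchi.aliyun.com/dataset/dataDetail?dataId=56)
- Training table: pai_online_project.easyrec_demo_taobao_train_data
- Test table: pai_online_project.easyrec_demo_taobao_test_data
- pai_online_project is a publicly readable MaxCompute project that holds several tables for testing; no access request is needed.
- The resources used for testing on PAI are 2 parameter servers and 9 workers, one of which performs evaluation: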
```json
{"ps":{"count":2,
15 changes: 8 additions & 7 deletions docs/source/component/backbone.md
@@ -1111,13 +1111,14 @@ MovieLens-1M dataset results:

## 2. Feature Interaction Components

| Class          | Function         | Description                    | Example                                                                                                                    |
| -------------- | ---------------- | ------------------------------ | -------------------------------------------------------------------------------------------------------------------------- |
| FM             | Second-order interaction | Component of the DeepFM model  | [Case 2](#deepfm)                                                                                                  |
| DotInteraction | Second-order inner-product interaction | Component of the DLRM model | [Case 4](#dlrm)                                                                                         |
| Cross          | Bit-wise interaction | Component of the DCN v2 model | [Case 3](#dcn)                                                                                                          |
| BiLinear       | Bilinear         | Component of the FiBiNet model | [fibinet_on_movielens.config](https://github.com/alibaba/EasyRec/tree/master/examples/configs/fibinet_on_movielens.config) |
| FiBiNet        | SENet & BiLinear | The FiBiNet model              | [fibinet_on_movielens.config](https://github.com/alibaba/EasyRec/tree/master/examples/configs/fibinet_on_movielens.config) |
| Class          | Function              | Description                     | Example                                                                                                                    |
| -------------- | --------------------- | ------------------------------- | -------------------------------------------------------------------------------------------------------------------------- |
| FM             | Second-order interaction | Component of the DeepFM model | [Case 2](#deepfm)                                                                                                        |
| DotInteraction | Second-order inner-product interaction | Component of the DLRM model | [Case 4](#dlrm)                                                                                              |
| Cross          | Bit-wise interaction  | Component of the DCN v2 model   | [Case 3](#dcn)                                                                                                             |
| BiLinear       | Bilinear              | Component of the FiBiNet model  | [fibinet_on_movielens.config](https://github.com/alibaba/EasyRec/tree/master/examples/configs/fibinet_on_movielens.config) |
| FiBiNet        | SENet & BiLinear      | The FiBiNet model               | [fibinet_on_movielens.config](https://github.com/alibaba/EasyRec/tree/master/examples/configs/fibinet_on_movielens.config) |
| Attention      | Dot-product attention | Component of Transformer models |                                                                                                                            |

## 3. Feature Importance Learning Components

27 changes: 27 additions & 0 deletions docs/source/component/component.md
@@ -79,6 +79,33 @@
| senet | SENet | | protobuf message |
| mlp | MLP | | protobuf message |

- Attention

Dot-product attention layer, a.k.a. Luong-style attention.

The calculation follows the steps:

1. Calculate attention scores using query and key with shape (batch_size, Tq, Tv).
1. Use scores to calculate a softmax distribution with shape (batch_size, Tq, Tv).
1. Use the softmax distribution to create a linear combination of value with shape (batch_size, Tq, dim).

| 参数 | 类型 | 默认值 | 说明 |
| ----------------------- | ------ | ----- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| use_scale | bool | False | If True, will create a scalar variable to scale the attention scores. |
| score_mode | string | dot | Function to use to compute attention scores, one of {"dot", "concat"}. "dot" refers to the dot product between the query and key vectors. "concat" refers to the hyperbolic tangent of the concatenation of the query and key vectors. |
| dropout | float | 0.0 | Float between 0 and 1. Fraction of the units to drop for the attention scores. |
| seed                    | int    | None  | A Python integer to use as the random seed in case of dropout.                                                                                                                                                                         |
| return_attention_scores | bool   | False | If True, returns the attention scores (after masking and softmax) as an additional output argument.                                                                                                                                    |
| use_causal_mask | bool | False | Set to True for decoder self-attention. Adds a mask such that position i cannot attend to positions j > i. This prevents the flow of information from the future towards the past. |

- inputs: List of the following tensors:
  - query: Query tensor of shape (batch_size, Tq, dim).
  - value: Value tensor of shape (batch_size, Tv, dim).
  - key: Optional key tensor of shape (batch_size, Tv, dim). If not given, will use value for both key and value, which is the most common case.
- output:
  - Attention outputs of shape (batch_size, Tq, dim).
  - (Optional) Attention scores after masking and softmax with shape (batch_size, Tq, Tv).
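
The three calculation steps above can be sketched in plain Python for a single example (no batch dimension). This is only an illustration under stated assumptions, not the EasyRec or Keras implementation: the names `softmax` and `dot_product_attention` are hypothetical, the `scale` and `causal` arguments are simplified stand-ins for `use_scale` and `use_causal_mask`, and dropout is omitted.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    top = max(row)
    exps = [math.exp(x - top) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def dot_product_attention(query, value, key=None, scale=None, causal=False):
    """query: Tq x dim, value: Tv x dim, key: Tv x dim (defaults to value).

    Returns (outputs, weights) with shapes Tq x dim and Tq x Tv.
    """
    if key is None:
        key = value  # the common case: value is used as both key and value
    dim = len(value[0])
    outputs, weights = [], []
    for i, q in enumerate(query):
        # Step 1: attention scores as query-key dot products.
        scores = [sum(a * b for a, b in zip(q, k)) for k in key]
        if scale is not None:
            # Assumed stand-in for the learned scalar created by use_scale.
            scores = [s * scale for s in scores]
        if causal:
            # Position i must not attend to positions j > i.
            scores = [s if j <= i else float("-inf") for j, s in enumerate(scores)]
        # Step 2: softmax distribution over the Tv positions.
        w = softmax(scores)
        weights.append(w)
        # Step 3: linear combination of the value vectors.
        outputs.append([sum(w_j * v[d] for w_j, v in zip(w, value)) for d in range(dim)])
    return outputs, weights
```

Returning the weights alongside the outputs mirrors the `return_attention_scores` option described in the table.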

## 3. Feature Importance Learning Components

- SENet
6 changes: 3 additions & 3 deletions docs/source/feature/data.md
@@ -2,15 +2,15 @@

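As the recommendation algorithm package of Alibaba Cloud PAI, EasyRec connects seamlessly to MaxCompute tables. It can also read large files from OSS, HDFS files in an E-MapReduce environment, and CSV files in a local environment.

To identify the fields in the input data, set the field names, field types, and default values so that EasyRec can read the data. Set the label field as the training target. To accommodate multi-target models, the label field can be set more than once
To identify the fields in the input data, set the field names, field types, and default values so that EasyRec can read the data. Set the label field as the training target. To fit multi-target models, multiple label fields can be set

There are also parameters such as prefetch_size, which TensorFlow needs when reading data.

## A minimal data config

This config contains only three fields: the user ID (uid), the item ID (item_id), and the label field (click).

OdpsInputV2 means that a MaxCompute table is read as the input data.
OdpsInputV2 means that a MaxCompute table is read as the input data. When training on a local machine, use the CSVInput type instead.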

```protobuf
data_config {
@@ -160,7 +160,7 @@ def remap_lbl(labels):
### prefetch_size

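- data prefetch, measured in batches; the default is 32
- Setting prefetch size speeds up data loading and prevents a data bottleneck
- Setting prefetch size speeds up data loading and prevents a data bottleneck. When the batch size is small, however, this value can be reduced accordingly.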

### shard && file_shard

11 changes: 6 additions & 5 deletions docs/source/feature/feature.rst
@@ -3,7 +3,7 @@

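The previous section described the supported input data, including MaxCompute tables, CSV files, HDFS files, and OSS files. Each column of a table or file corresponds to one feature.

The data can contain one or more label fields. Features are richer: the supported types include IdFeature, RawFeature, TagFeature, SequenceFeature, and ComboFeature.
The data can contain one or more label fields; multi-target models require multiple label fields. Features are richer: the supported types include IdFeature, RawFeature, TagFeature, SequenceFeature, and ComboFeature.

Fields shared by all feature types
----------------------------------------------------------------
@@ -71,12 +71,12 @@ IdFeature: discrete-valued / ID features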

.. math::
embedding\_dim=8+x^{0.25}
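- where x is the number of distinct values the feature takes
embedding\_dim=8+n^{0.25}
- where n is the number of unique values of the feature (for example, if the gender feature takes the values male and female, then n=2)

- hash\_bucket\_size: the size of the hash bucket. Suitable for category_id, user_id, and similar features

- For large-scale features such as user\_id, where hash collisions have little impact,
- For large-scale features such as user\_id, where hash collisions have little impact, the number of ids can be compressed by hashing when user behavior logs are not rich enough,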

.. math::
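@@ -91,7 +91,8 @@ IdFeature: discrete-valued / ID features
- num\_buckets: the number of buckets,
num\_buckets can be used only when the input is of integer type
num\_buckets can be used only when the input is of integer type.
When using fg features, however, do not transform integer features with num\_buckets; use hash\_bucket\_size instead.

- vocab\_list:
specifies a vocabulary; suitable for features with a small, enumerable set of values, such as day of the week, month, or zodiac sign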
4 changes: 4 additions & 0 deletions docs/source/feature/pai_rec_callback_conf.md
@@ -1,5 +1,9 @@
# PAI-REC Full Event-Tracking Configuration

## Callback service documentation for the PAI-Rec engine

- [Documentation](http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/pairec/docs/pairec/html/intro/callback_api.html)

## Template

```json
2 changes: 1 addition & 1 deletion docs/source/feature/rtp_fg.md
@@ -2,7 +2,7 @@

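- RTP FG: RealTime Predict Feature Generation, which covers the feature-engineering needs of real-time prediction. Feature engineering also takes up a fairly large share of the time in the recommendation pipeline.

- RTP FG can generate complex cross features, such as match features and lookup features, with fairly high efficiency, and it guarantees offline/online consistency by using the same C++ code.
- RTP FG can generate complex cross features, such as match features and lookup features, with fairly high efficiency. Offline training and online prediction use the same C++ code, which guarantees offline/online consistency.

- The features it generates can be fed into EasyRec for training, and the EasyRec config file (pipeline.config) can be generated from the RTP FG config (fg.json).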

2 changes: 1 addition & 1 deletion docs/source/feature/rtp_native.md
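@@ -1,6 +1,6 @@
# RTP Deployment

This document describes the process of deploying an EasyRec model to RTP.
This document describes the process of deploying an EasyRec model to RTP (Real Time Prediction, a real-time scoring service).

- RTP currently supports only checkpoint-format model deployment, so the EasyRec model must be exported in checkpoint format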

1 change: 1 addition & 0 deletions docs/source/intro.md
@@ -63,4 +63,5 @@ EasyRec implements state of the art machine learning models used in common recom

### Contact

- DingDing Group: 32260796. (EasyRec usage general discussion.)
- DingDing Group: 37930014162, click [this url](https://qr.dingtalk.com/action/joingroup?code=v1,k1,oHNqtNObbu+xUClHh77gCuKdGGH8AYoQ8AjKU23zTg4=&_dt_no_comment=1&origin=11) or scan QrCode to join![new_group.jpg](../images/qrcode/new_group.jpg)
118 changes: 118 additions & 0 deletions docs/source/models/aitm.md
@@ -0,0 +1,118 @@
# AITM

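### Introduction

In recommendation scenarios, a user's conversion path often has several intermediate steps (impression -> click -> conversion). AITM is a multi-task model framework that makes full use of the samples at every node of this path to improve the model's conversion-rate estimates for the later nodes.

![AITM](../../images/models/aitm.jpg)

1. (a) Expert-Bottom pattern, e.g. [MMoE](mmoe.md)
1. (b) Probability-Transfer pattern, e.g. [ESMM](esmm.md)
1. (c) Adaptive Information Transfer Multi-task (AITM) framework.

Two characteristics:

1. It uses an attention mechanism to fuse the feature representations of multiple targets;
1. It introduces an auxiliary loss function for behavior calibration.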
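
As a rough illustration of the first characteristic, the attention-based fusion can be sketched as follows. The sketch assumes the formulation of the AITM paper: the information transferred from the previous task and the current tower's representation are each projected three ways, and the fusion weights come from a softmax over scaled inner products. The names `linear`, `ait_fuse`, and the projection matrices are hypothetical, and the actual EasyRec layer may differ.

```python
import math


def linear(x, w):
    """Project vector x (length d_in) with matrix w (d_in rows, d_out columns)."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]


def ait_fuse(p_prev, q_cur, w_value, w_key, w_query):
    """Fuse the previous task's transferred info with the current tower output.

    Assumed formulation: weight_u = softmax(<key(u), query(u)> / sqrt(dim)),
    fused = sum_u weight_u * value(u), for u in {p_prev, q_cur}.
    """
    candidates = [p_prev, q_cur]
    values = [linear(u, w_value) for u in candidates]
    keys = [linear(u, w_key) for u in candidates]
    queries = [linear(u, w_query) for u in candidates]
    dim = len(values[0])
    # One scaled inner-product score per candidate representation.
    scores = [
        sum(k_i * q_i for k_i, q_i in zip(k, q)) / math.sqrt(dim)
        for k, q in zip(keys, queries)
    ]
    # Softmax over the two candidates.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    weights = [e / sum(exps) for e in exps]
    # Weighted sum of the projected values.
    fused = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return fused, weights
```

In a config, the projection size would correspond to `ait_project_dim`; here it is simply the column count of the hypothetical projection matrices.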

### Configuration

```protobuf
model_config {
  model_name: "AITM"
  model_class: "MultiTaskModel"
  feature_groups {
    group_name: "all"
    feature_names: "user_id"
    feature_names: "cms_segid"
    ...
    feature_names: "tag_brand_list"
    wide_deep: DEEP
  }
  backbone {
    blocks {
      name: "mlp"
      inputs {
        feature_group_name: "all"
      }
      keras_layer {
        class_name: 'MLP'
        mlp {
          hidden_units: [512, 256]
        }
      }
    }
  }
  model_params {
    task_towers {
      tower_name: "ctr"
      label_name: "clk"
      loss_type: CLASSIFICATION
      metrics_set: {
        auc {}
      }
      dnn {
        hidden_units: [256, 128]
      }
      use_ait_module: true
      weight: 1.0
    }
    task_towers {
      tower_name: "cvr"
      label_name: "buy"
      losses {
        loss_type: CLASSIFICATION
      }
      losses {
        loss_type: ORDER_CALIBRATE_LOSS
      }
      metrics_set: {
        auc {}
      }
      dnn {
        hidden_units: [256, 128]
      }
      relation_tower_names: ["ctr"]
      use_ait_module: true
      ait_project_dim: 128
      weight: 1.0
    }
    l2_regularization: 1e-6
  }
  embedding_regularization: 5e-6
}
```

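- model_name: any custom string; it serves only as a comment

- model_class: 'MultiTaskModel'; no need to change it. Every multi-target ranking model built from components uses this name

- feature_groups: configures a group of features.

- backbone: the backbone network built from components, see the [reference documentation](../component/backbone.md)

  - blocks: a directed acyclic graph (DAG) made up of multiple `component blocks`; the framework executes the code logic associated with each `component block` in the topological order of the DAG, building a subgraph of the TF Graph
  - name/inputs: each `block` has a unique name and one or more inputs and outputs
  - keras_layer: loads the custom or built-in keras layer specified by `class_name` and executes a piece of code logic; see the [reference documentation](../component/backbone.md#keraslayer)
  - mlp: parameters of the MLP model, see the [reference documentation](../component/component.md#id1)

- model_params: AITM-related parameters

  - task_towers: configure one task_towers entry per task
    - tower_name
    - dnn: parameter configuration of the deep part
      - hidden_units: the number of channels, i.e. the number of neurons, in each dnn layer
    - use_ait_module: if true, use the `AITM` model; otherwise, use the [DBMTL](dbmtl.md) model
    - ait_project_dim: the dimension of each tower's representation vector; usually set to the dimension of the last hidden layer
    - By default the task is binary classification: num_class defaults to 1, weight defaults to 1.0, loss_type defaults to CLASSIFICATION, and metrics_set is auc
    - loss_type: ORDER_CALIBRATE_LOSS is an auxiliary loss function that calibrates predictions using the dependency between targets; see the original paper
    - Note: label_fields must align one-to-one with task_towers.
  - embedding_regularization: applies regularization to the embedding part to prevent overfitting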
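
The idea behind ORDER_CALIBRATE_LOSS can be sketched as follows, assuming it matches the behavioral expectation calibrator of the AITM paper: a later step in the conversion path (e.g. buy) should not be predicted as more likely than the step it depends on (e.g. click), so any excess is penalized. The function name `order_calibrate_loss` is hypothetical, and EasyRec's implementation may differ in details such as weighting or reduction.

```python
def order_calibrate_loss(p_prev, p_cur):
    """Mean of max(p_cur - p_prev, 0) over a batch.

    p_prev: predicted probabilities of the earlier task (e.g. ctr).
    p_cur:  predicted probabilities of the later, dependent task (e.g. cvr).
    """
    penalties = [max(cur - prev, 0.0) for prev, cur in zip(p_prev, p_cur)]
    return sum(penalties) / len(penalties)
```

Samples whose later-stage probability already stays below the earlier-stage one contribute nothing, so the term acts purely as a consistency penalty on top of the classification loss.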

### Example Config

- [AITM_demo.config](https://github.com/alibaba/EasyRec/blob/master/samples/model_config/aitm_on_taobao.config)

### Reference Paper

[AITM: Modeling the Sequential Dependence among Audience Multi-step Conversions with Multi-task Learning in Targeted Display Advertising](https://arxiv.org/pdf/2105.08489.pdf)
6 changes: 4 additions & 2 deletions docs/source/models/loss.md
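@@ -19,6 +19,7 @@ EasyRec supports two ways to configure loss functions: 1) using a single loss function; 2
| PAIRWISE_LOGISTIC_LOSS | pair-level logistic loss; supports custom pair grouping |
| JRC_LOSS | binary classification + listwise ranking loss |
| F1_REWEIGHTED_LOSS | a loss function that can adjust the relative weight of recall and precision in binary classification; effective against imbalance between positive and negative samples |
| ORDER_CALIBRATE_LOSS | an auxiliary loss function that calibrates predictions using the dependency between targets; see the [AITM](aitm.md) model |

- Note: SOFTMAX_CROSS_ENTROPY_WITH_NEGATIVE_MINING
- supports parameter configuration, which upgrades it to [support vector guided softmax loss](https://128.84.21.199/abs/1812.11317)
@@ -71,9 +72,9 @@ EasyRec supports two ways to configure loss functions: 1) using a single loss function; 2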

- f1_beta_square: a value greater than 1 makes the model focus more on recall; a value less than 1 makes it focus more on precision
- The F1 score, also called the balanced F score, is defined as the harmonic mean of precision and recall.
- ![f1 score](../images/other/f1_score.svg)
- ![f1 score](../../images/other/f1_score.svg)
- More generally, we define the F_beta score as:
- ![f_beta score](../images/other/f_beta_score.svg)
- ![f_beta score](../../images/other/f_beta_score.svg)
- f1_beta_square is the square of the beta coefficient in the formula above.
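
The standard F_beta definition, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), can be written as a short Python helper to see how the parameter shifts the balance. `f_beta_score` is a hypothetical function for illustration, not part of EasyRec; its `beta_square` argument plays the role of f1_beta_square.

```python
def f_beta_score(precision, recall, beta_square=1.0):
    """Standard F_beta score; beta_square = 1 gives the usual F1 score."""
    denominator = beta_square * precision + recall
    if denominator == 0:
        return 0.0  # both precision and recall are zero
    return (1.0 + beta_square) * precision * recall / denominator


# With high precision but low recall, a larger beta_square lowers the score,
# because recall counts for more.
recall_focused = f_beta_score(0.8, 0.4, beta_square=4.0)
precision_focused = f_beta_score(0.8, 0.4, beta_square=0.25)
```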

- Parameter configuration of PAIRWISE_FOCAL_LOSS
@@ -159,3 +160,4 @@ EasyRec supports two ways to configure loss functions: 1) using a single loss function; 2

- Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics
- [Reasonable Effectiveness of Random Weighting: A Litmus Test for Multi-Task Learning](https://arxiv.org/abs/2111.10603)
- [AITM: Modeling the Sequential Dependence among Audience Multi-step Conversions with Multi-task Learning in Targeted Display Advertising](https://arxiv.org/pdf/2105.08489.pdf)
1 change: 1 addition & 0 deletions docs/source/models/multi_target.rst
@@ -7,5 +7,6 @@
esmm
mmoe
dbmtl
aitm
ple
simple_multi_task