diff --git a/README.md b/README.md index 1368c15b2..ad9963adb 100644 --- a/README.md +++ b/README.md @@ -41,6 +41,7 @@ Running Platform: ### Simple to config - Flexible feature config and simple model config +- [Build models by combining some components](docs/source/component/backbone.md) - Efficient and robust feature generation\[used in taobao\] - Nice web interface in development @@ -60,14 +61,16 @@ Running Platform: ### A variety of models - [DSSM](docs/source/models/dssm.md) / [MIND](docs/source/models/mind.md) / [DropoutNet](docs/source/models/dropoutnet.md) / [CoMetricLearningI2I](docs/source/models/co_metric_learning_i2i.md) -- [W&D](docs/source/models/wide_and_deep.md) / [DeepFM](docs/source/models/deepfm.md) / [MultiTower](docs/source/models/multi_tower.md) / [DCN](docs/source/models/dcn.md) / [DIN](docs/source/models/din.md) / [BST](docs/source/models/bst.md) +- [W&D](docs/source/models/wide_and_deep.md) / [DeepFM](docs/source/models/deepfm.md) / [MultiTower](docs/source/models/multi_tower.md) / [DCN](docs/source/models/dcn.md) / [FiBiNet](docs/source/models/fibinet.md) / [MaskNet](docs/source/models/masknet.md) +- [DIN](docs/source/models/din.md) / [BST](docs/source/models/bst.md) - [MMoE](docs/source/models/mmoe.md) / [ESMM](docs/source/models/esmm.md) / [DBMTL](docs/source/models/dbmtl.md) / [PLE](docs/source/models/ple.md) - [CMBF](docs/source/models/cmbf.md) / [UNITER](docs/source/models/uniter.md) - More models in development ### Easy to customize -- Easy to implement [customized models](docs/source/models/user_define.md) +- Support [component-based development](docs/source/component/backbone.md) +- Easy to implement [customized models](docs/source/models/user_define.md) and [components](docs/source/component/backbone.md#id12) - Not need to care about data pipelines ### Fast vector retrieve diff --git a/docs/images/component/backbone.jpg b/docs/images/component/backbone.jpg new file mode 100644 index 000000000..7dd5e0ecb Binary files /dev/null and 
b/docs/images/component/backbone.jpg differ diff --git a/docs/source/component/backbone.md b/docs/source/component/backbone.md new file mode 100644 index 000000000..0af37109b --- /dev/null +++ b/docs/source/component/backbone.md @@ -0,0 +1,1396 @@ +# 为何需要组件化 + +## 1. 依靠动态可插拔的公共组件,方便为现有模型添加新特性。 + +过去,把一个新开发的公共可选模块(比如`Dense Feature Embedding Layer`、`SENet`)添加到现有模型中,需要修改所有模型的代码才能用上新的特性,过程繁琐易出错。随着模型数量和公共模块数量的增加,为所有模型集成所有公共可选模块将产生组合爆炸的不可控局面。组件化实现了底层公共模块与上层模型的解耦。 + +## 2. 通过重组已有组件,实现“搭积木”式新模型开发。 + +很多模型之所以被称为一个新的模型,是因为引入了一个或多个特殊的子模块(组件),然而这些子模块并不仅仅只能用在该模型中,通过组合各个不同的子模块可以轻易组装一个新的模型。组件化EasyRec支持通过配置化的方式搭建新的模型。 + +## 3. 添加新的特性将变得更加容易。 + +现在我们只需要为新的特性开发一个Keras Layer类,并在指定package中添加import语句,框架就能自动识别并添加到组件库中,不需要额外操作。开发一个新的模型,只需要实现特殊的新模块,其余部分可以通过组件库中的已有组件拼装。新人不再需要熟悉EasyRec的方方面面就可以为框架添加功能,开发效率大大提高。 + +# 组件化的目标 + +```{hint} 目标 +不再需要实现新的模型,只需要实现新的组件! 模型通过组装组件完成。 +``` + +各个组件专注自身功能的实现,模块中代码高度聚合,只负责一项任务,也就是常说的单一职责原则。 + +# 主干网络 + +组件化EasyRec模型使用一个可配置的主干网络作为核心部件。主干网络是由多个`组件块`组成的一个有向无环图(DAG),框架负责按照DAG的拓扑排序执行各`组件块`关联的代码逻辑,构建TF Graph的一个子图。DAG的输出节点由`concat_blocks`配置项定义,各输出`组件块`的输出tensor拼接之后输入给一个可选的顶部MLP层,或者直接连接到最终的预测层。 + +![](../../images/component/backbone.jpg) + +## 案例1. 
Wide&Deep 模型 + +配置文件:[wide_and_deep_backbone_on_movielens.config](https://github.com/alibaba/EasyRec/tree/master/examples/configs/wide_and_deep_backbone_on_movielens.config) + +```protobuf +model_config: { + model_name: "WideAndDeep" + model_class: "RankModel" + feature_groups: { + group_name: 'wide' + feature_names: 'user_id' + feature_names: 'movie_id' + feature_names: 'job_id' + feature_names: 'age' + feature_names: 'gender' + feature_names: 'year' + feature_names: 'genres' + wide_deep: WIDE + } + feature_groups: { + group_name: 'deep' + feature_names: 'user_id' + feature_names: 'movie_id' + feature_names: 'job_id' + feature_names: 'age' + feature_names: 'gender' + feature_names: 'year' + feature_names: 'genres' + wide_deep: DEEP + } + backbone { + blocks { + name: 'wide' + inputs { + feature_group_name: 'wide' + } + input_layer { + only_output_feature_list: true + wide_output_dim: 1 + } + } + blocks { + name: 'deep_logit' + inputs { + feature_group_name: 'deep' + } + keras_layer { + class_name: 'MLP' + mlp { + hidden_units: [256, 256, 256, 1] + use_final_bn: false + final_activation: 'linear' + } + } + } + blocks { + name: 'final_logit' + inputs { + block_name: 'wide' + input_fn: 'lambda x: tf.add_n(x)' + } + inputs { + block_name: 'deep_logit' + } + merge_inputs_into_list: true + keras_layer { + class_name: 'Add' + } + } + concat_blocks: 'final_logit' + } + model_params { + l2_regularization: 1e-4 + } + embedding_regularization: 1e-4 +} +``` + +MovieLens-1M数据集效果对比: + +| Model | Epoch | AUC | +| ------------------- | ----- | ------ | +| Wide&Deep | 1 | 0.8558 | +| Wide&Deep(Backbone) | 1 | 0.8854 | + +备注:通过组件化的方式搭建的模型效果比内置的模型效果更好是因为`MLP`组件有更好的初始化方法。 + +通过protobuf message `backbone` 来定义主干网络,主干网络有多个积木块(`block`)组成,每个`block`代表一个可复用的组件。 + +- 每个`block`有一个唯一的名字(name),并且有一个或多个输入和输出。 +- 每个输入只能是某个`feature group`的name,或者另一个`block`的name,或者是一个`block package`的名字。当一个`block`有多个输入时,会自动执行merge操作(输入为list时自动合并,输入为tensor时自动concat)。 +- 
所有`block`根据输入与输出的关系组成一个有向无环图(DAG),框架自动解析出DAG的拓扑关系,按照拓扑排序执行块所关联的模块。 +- 当`block`有多个输出时,返回一个python元组(tuple),下游`block`可以配置`input_slice`通过python切片语法获取到输入元组的某个元素作为输入,或者通过自定义的`input_fn`配置一个lambda表达式函数获取元组的某个值。 +- 每个`block`关联的模块通常是一个keras layer对象,实现了一个可复用的子网络模块。框架支持加载自定义的keras layer,以及所有系统内置的keras layer。 +- 可以为`block`关联一个`input_layer`对输入的`feature group`配置的特征做一些额外的加工,比如执行`batch normalization`、`layer normalization`、`feature dropout`等操作,并且可以指定输出的tensor的格式(2d、3d、list等)。注意:**当`block`关联的模块是`input_layer`时,必须设定feature_group_name为某个`feature group`的名字**,当`block`关联的模块不是`input_layer`时,block的name不可与某个`feature group`重名。 +- 还有一些特殊的`block`关联了一个特殊的模块,包括`lambda layer`、`sequential layers`、`repeated layer`和`recurrent layer`。这些特殊layer分别实现了自定义表达式、顺序执行多个layer、重复执行某个layer、循环执行某个layer的功能。 +- DAG的输出节点名由`concat_blocks`配置项指定,配置了多个输出节点时自动执行tensor的concat操作。 +- 如果不配置`concat_blocks`,框架会自动拼接DAG的所有叶子节点并输出。 +- 可以为主干网络配置一个可选的`MLP`模块。 + +## 案例2:DeepFM 模型 + +配置文件:[deepfm_backbone_on_movielens.config](https://github.com/alibaba/EasyRec/tree/master/examples/configs/deepfm_backbone_on_movielens.config) + +这个Case重点关注下两个特殊的`block`,一个使用了`lambda`表达式配置了一个自定义函数;另一个的加载了一个内置的keras layer [`tf.keras.layers.Add`](https://keras.io/api/layers/merging_layers/add/)。 + +```protobuf +model_config: { + model_name: 'DeepFM' + model_class: 'RankModel' + feature_groups: { + group_name: 'wide' + feature_names: 'user_id' + feature_names: 'movie_id' + feature_names: 'job_id' + feature_names: 'age' + feature_names: 'gender' + feature_names: 'year' + feature_names: 'genres' + wide_deep: WIDE + } + feature_groups: { + group_name: 'features' + feature_names: 'user_id' + feature_names: 'movie_id' + feature_names: 'job_id' + feature_names: 'age' + feature_names: 'gender' + feature_names: 'year' + feature_names: 'genres' + feature_names: 'title' + wide_deep: DEEP + } + backbone { + blocks { + name: 'wide_logit' + inputs { + feature_group_name: 'wide' + } + input_layer { + wide_output_dim: 1 + } + } + blocks { + name: 'features' + inputs { + 
feature_group_name: 'features' + } + input_layer { + output_2d_tensor_and_feature_list: true + } + } + blocks { + name: 'fm' + inputs { + block_name: 'features' + input_slice: '[1]' + } + keras_layer { + class_name: 'FM' + } + } + blocks { + name: 'deep' + inputs { + block_name: 'features' + input_slice: '[0]' + } + keras_layer { + class_name: 'MLP' + mlp { + hidden_units: [256, 128, 64, 1] + use_final_bn: false + final_activation: 'linear' + } + } + } + blocks { + name: 'add' + inputs { + block_name: 'wide_logit' + input_fn: 'lambda x: tf.reduce_sum(x, axis=1, keepdims=True)' + } + inputs { + block_name: 'fm' + } + inputs { + block_name: 'deep' + } + merge_inputs_into_list: true + keras_layer { + class_name: 'Add' + } + } + concat_blocks: 'add' + } + model_params { + l2_regularization: 1e-4 + } + embedding_regularization: 1e-4 +} +``` + +MovieLens-1M数据集效果对比: + +| Model | Epoch | AUC | +| ---------------- | ----- | ------ | +| DeepFM | 1 | 0.8867 | +| DeepFM(Backbone) | 1 | 0.8872 | + +## 案例3:DCN 模型 + +配置文件:[dcn_backbone_on_movielens.config](https://github.com/alibaba/EasyRec/tree/master/examples/configs/dcn_backbone_on_movielens.config) + +这个Case重点关注一个特殊的 DCN `block`,用了`recurrent layer`实现了循环调用某个模块多次的效果。通过该Case还是在DAG之上添加了MLP模块。 + +```protobuf +model_config: { + model_name: 'DCN V2' + model_class: 'RankModel' + feature_groups: { + group_name: 'all' + feature_names: 'user_id' + feature_names: 'movie_id' + feature_names: 'job_id' + feature_names: 'age' + feature_names: 'gender' + feature_names: 'year' + feature_names: 'genres' + wide_deep: DEEP + } + backbone { + blocks { + name: "deep" + inputs { + feature_group_name: 'all' + } + keras_layer { + class_name: 'MLP' + mlp { + hidden_units: [256, 128, 64] + } + } + } + blocks { + name: "dcn" + inputs { + feature_group_name: 'all' + input_fn: 'lambda x: [x, x]' + } + recurrent { + num_steps: 3 + fixed_input_index: 0 + keras_layer { + class_name: 'Cross' + } + } + } + concat_blocks: ['deep', 'dcn'] + top_mlp { + 
hidden_units: [64, 32, 16] + } + } + model_params { + l2_regularization: 1e-4 + } + embedding_regularization: 1e-4 +} +``` + +上述配置对`Cross` Layer循环调用了3次,逻辑上等价于执行如下语句: + +```python +x1 = Cross()(x0, x0) +x2 = Cross()(x0, x1) +x3 = Cross()(x0, x2) +``` + +MovieLens-1M数据集效果对比: + +| Model | Epoch | AUC | +| ----------------- | ----- | ------ | +| DCN (内置) | 1 | 0.8576 | +| DCN_v2 (backbone) | 1 | 0.8770 | + +备注:新实现的`Cross`组件对应了参数量更多的v2版本的DCN,而内置的DCN模型对应了v1版本的DCN。 + +## 案例4:DLRM 模型 + +配置文件:[dlrm_backbone_on_criteo.config](https://github.com/alibaba/EasyRec/tree/master/examples/configs/dlrm_backbone_on_criteo.config) + +```protobuf +model_config: { + model_name: 'DLRM' + model_class: 'RankModel' + feature_groups: { + group_name: "dense" + feature_names: "F1" + feature_names: "F2" + ... + wide_deep:DEEP + } + feature_groups: { + group_name: "sparse" + feature_names: "C1" + feature_names: "C2" + feature_names: "C3" + ... + wide_deep:DEEP + } + backbone { + blocks { + name: 'bottom_mlp' + inputs { + feature_group_name: 'dense' + } + keras_layer { + class_name: 'MLP' + mlp { + hidden_units: [64, 32, 16] + } + } + } + blocks { + name: 'sparse' + inputs { + feature_group_name: 'sparse' + } + input_layer { + output_2d_tensor_and_feature_list: true + } + } + blocks { + name: 'dot' + inputs { + block_name: 'bottom_mlp' + } + inputs { + block_name: 'sparse' + input_slice: '[1]' + } + keras_layer { + class_name: 'DotInteraction' + } + } + blocks { + name: 'sparse_2d' + inputs { + block_name: 'sparse' + input_slice: '[0]' + } + } + concat_blocks: ['sparse_2d', 'dot'] + top_mlp { + hidden_units: [256, 128, 64] + } + } + model_params { + l2_regularization: 1e-5 + } + embedding_regularization: 1e-5 +} +``` + +Criteo数据集效果对比: + +| Model | Epoch | AUC | +| --------------- | ----- | ------- | +| DLRM | 1 | 0.79785 | +| DLRM (backbone) | 1 | 0.7993 | + +备注:`DotInteraction` 是新开发的特征两两交叉做内积运算的模块。 + +这个案例中'dot' block的第一个输入是一个tensor,第二个输入是一个list,这种情况下第一个输入会插入到list中,合并成一个更大的list,作为block的输入。 + +## 
案例5:为 DLRM 模型添加一个新的数值特征Embedding组件 + +配置文件:[dlrm_on_criteo_with_periodic.config](https://github.com/alibaba/EasyRec/tree/master/examples/configs/dlrm_on_criteo_with_periodic.config) + +与上一个案例相比,多了一个`PeriodicEmbedding` Layer,组件化编程的**灵活性与可扩展性**由此可见一斑。 + +重点关注一下`PeriodicEmbedding` Layer的参数配置方式,这里并没有使用自定义protobuf message的传参方式,而是采用了内置的`google.protobuf.Struct`对象作为自定义Layer的参数。实际上,该自定义Layer也支持通过自定义message传参。框架提供了一个通用的`Parameter` API 用通用的方式处理两种传参方式。 + +```protobuf +model_config: { + model_class: 'RankModel' + feature_groups: { + group_name: "dense" + feature_names: "F1" + feature_names: "F2" + ... + wide_deep:DEEP + } + feature_groups: { + group_name: "sparse" + feature_names: "C1" + feature_names: "C2" + ... + wide_deep:DEEP + } + backbone { + blocks { + name: 'num_emb' + inputs { + feature_group_name: 'dense' + } + keras_layer { + class_name: 'PeriodicEmbedding' + st_params { + fields { + key: "output_tensor_list" + value { bool_value: true } + } + fields { + key: "embedding_dim" + value { number_value: 16 } + } + fields { + key: "sigma" + value { number_value: 0.005 } + } + } + } + } + blocks { + name: 'sparse' + inputs { + feature_group_name: 'sparse' + } + input_layer { + output_2d_tensor_and_feature_list: true + } + } + blocks { + name: 'dot' + inputs { + block_name: 'num_emb' + input_slice: '[1]' + } + inputs { + block_name: 'sparse' + input_slice: '[1]' + } + keras_layer { + class_name: 'DotInteraction' + } + } + blocks { + name: 'sparse_2d' + inputs { + block_name: 'sparse' + input_slice: '[0]' + } + } + blocks { + name: 'num_emb_2d' + inputs { + block_name: 'num_emb' + input_slice: '[0]' + } + } + concat_blocks: ['num_emb_2d', 'dot', 'sparse_2d'] + top_mlp { + hidden_units: [256, 128, 64] + } + } + model_params { + l2_regularization: 1e-5 + } + embedding_regularization: 1e-5 +} +``` + +Criteo数据集效果对比: + +| Model | Epoch | AUC | +| --------------- | ----- | ------- | +| DLRM | 1 | 0.79785 | +| DLRM (backbone) | 1 | 0.7993 | +| DLRM (periodic) | 1 | 0.7998 | + +## 
案例6:使用内置的keras layer搭建DNN模型 + +配置文件:[mlp_on_movielens.config](https://github.com/alibaba/EasyRec/tree/master/examples/configs/mlp_on_movielens.config) + +该案例只是为了演示组件化EasyRec可以使用TF内置的原子粒度keras layer作为通用组件,实际上我们已经有了一个自定义的MLP组件,使用会更加方便。 + +该案例重点关注一个特殊的`sequential block`,这个组件块内可以定义多个串联在一起的layers,前一个layer的输出作为后一个layer的输入。相比定义多个普通`block`的方式,`sequential block`会更加方便。 + +备注:调用系统内置的keras layer,只能通过`google.protobuf.Struct`的格式传参。 + +```protobuf +model_config: { + model_class: "RankModel" + feature_groups: { + group_name: 'features' + feature_names: 'user_id' + feature_names: 'movie_id' + feature_names: 'job_id' + feature_names: 'age' + feature_names: 'gender' + feature_names: 'year' + feature_names: 'genres' + wide_deep: DEEP + } + backbone { + blocks { + name: 'mlp' + inputs { + feature_group_name: 'features' + } + layers { + keras_layer { + class_name: 'Dense' + st_params { + fields { + key: 'units' + value: { number_value: 256 } + } + fields { + key: 'activation' + value: { string_value: 'relu' } + } + } + } + } + layers { + keras_layer { + class_name: 'Dropout' + st_params { + fields { + key: 'rate' + value: { number_value: 0.5 } + } + } + } + } + layers { + keras_layer { + class_name: 'Dense' + st_params { + fields { + key: 'units' + value: { number_value: 256 } + } + fields { + key: 'activation' + value: { string_value: 'relu' } + } + } + } + } + layers { + keras_layer { + class_name: 'Dropout' + st_params { + fields { + key: 'rate' + value: { number_value: 0.5 } + } + } + } + } + layers { + keras_layer { + class_name: 'Dense' + st_params { + fields { + key: 'units' + value: { number_value: 1 } + } + } + } + } + } + concat_blocks: 'mlp' + } + model_params { + l2_regularization: 1e-4 + } + embedding_regularization: 1e-4 +} +``` + +MovieLens-1M数据集效果: + +| Model | Epoch | AUC | +| ----- | ----- | ------ | +| MLP | 1 | 0.8616 | + +## 案例7:使用组件包(Multi-Tower) + 
+配置文件:[multi_tower_on_movielens.config](https://github.com/alibaba/EasyRec/tree/master/examples/configs/multi_tower_on_movielens.config) + +该案例为了演示`block package`的使用,`block package`可以打包一组`block`,构成一个可被复用的子网络,即被打包的子网络可以以共享参数的方式在同一个模型中调用多次。与之相反,没有打包的`block`是不能被多次调用的(但是可以多次复用结果)。 + +`block package`主要为自监督学习、对比学习等场景设计。 + +```protobuf +model_config: { + model_name: "multi tower" + model_class: "RankModel" + feature_groups: { + group_name: 'user' + feature_names: 'user_id' + feature_names: 'job_id' + feature_names: 'age' + feature_names: 'gender' + wide_deep: DEEP + } + feature_groups: { + group_name: 'item' + feature_names: 'movie_id' + feature_names: 'year' + feature_names: 'genres' + wide_deep: DEEP + } + backbone { + packages { + name: 'user_tower' + blocks { + name: 'mlp' + inputs { + feature_group_name: 'user' + } + keras_layer { + class_name: 'MLP' + mlp { + hidden_units: [256, 128] + } + } + } + } + packages { + name: 'item_tower' + blocks { + name: 'mlp' + inputs { + feature_group_name: 'item' + } + keras_layer { + class_name: 'MLP' + mlp { + hidden_units: [256, 128] + } + } + } + } + blocks { + name: 'top_mlp' + inputs { + package_name: 'user_tower' + } + inputs { + package_name: 'item_tower' + } + layers { + keras_layer { + class_name: 'MLP' + mlp { + hidden_units: [128, 64] + } + } + } + } + } + model_params { + l2_regularization: 1e-4 + } + embedding_regularization: 1e-4 +} +``` + +注意该案例没有为package和backbone配置`concat_blocks`,框架会自动设置为DAG的所有叶子节点。 + +MovieLens-1M数据集效果: + +| Model | Epoch | AUC | +| ---------- | ----- | ------ | +| MultiTower | 1 | 0.8814 | + +## 案例8:多目标模型 MMoE + +多目标模型的model_class一般配置为"MultiTaskModel",并且需要在`model_params`里配置多个目标对应的Tower。`model_name`为任意自定义字符串,仅有注释作用。 + +```protobuf +model_config { + model_name: "MMoE" + model_class: "MultiTaskModel" + feature_groups { + group_name: "all" + feature_names: "user_id" + feature_names: "cms_segid" + ... 
+ feature_names: "tag_brand_list" + wide_deep: DEEP + } + backbone { + blocks { + name: 'all' + inputs { + feature_group_name: 'all' + } + input_layer { + only_output_feature_list: true + } + } + blocks { + name: "senet" + inputs { + block_name: "all" + } + keras_layer { + class_name: 'SENet' + senet { + reduction_ratio: 4 + } + } + } + blocks { + name: "mmoe" + inputs { + block_name: "senet" + } + keras_layer { + class_name: 'MMoE' + mmoe { + num_task: 2 + num_expert: 3 + expert_mlp { + hidden_units: [256, 128] + } + } + } + } + } + model_params { + task_towers { + tower_name: "ctr" + label_name: "clk" + dnn { + hidden_units: [128, 64] + } + num_class: 1 + weight: 1.0 + loss_type: CLASSIFICATION + metrics_set: { + auc {} + } + } + task_towers { + tower_name: "cvr" + label_name: "buy" + dnn { + hidden_units: [128, 64] + } + num_class: 1 + weight: 1.0 + loss_type: CLASSIFICATION + metrics_set: { + auc {} + } + } + l2_regularization: 1e-06 + } + embedding_regularization: 5e-05 +} +``` + +注意这个案例没有为backbone配置`concat_blocks`,框架会自动设置为DAG的叶子节点。 + +## 案例9:多目标模型 DBMTL + +多目标模型的model_class一般配置为"MultiTaskModel",并且需要在`model_params`里配置多个目标对应的Tower。`model_name`为任意自定义字符串,仅有注释作用。 + +```protobuf +model_config { + model_name: "DBMTL" + model_class: "MultiTaskModel" + feature_groups { + group_name: "all" + feature_names: "user_id" + feature_names: "cms_segid" + ... 
+ feature_names: "tag_brand_list" + wide_deep: DEEP + } + backbone { + blocks { + name: "mask_net" + inputs { + feature_group_name: "all" + } + keras_layer { + class_name: 'MaskNet' + masknet { + mask_blocks { + aggregation_size: 512 + output_size: 256 + } + mask_blocks { + aggregation_size: 512 + output_size: 256 + } + mask_blocks { + aggregation_size: 512 + output_size: 256 + } + mlp { + hidden_units: [512, 256] + } + } + } + } + } + model_params { + task_towers { + tower_name: "ctr" + label_name: "clk" + loss_type: CLASSIFICATION + metrics_set: { + auc {} + } + dnn { + hidden_units: [256, 128, 64] + } + relation_dnn { + hidden_units: [32] + } + weight: 1.0 + } + task_towers { + tower_name: "cvr" + label_name: "buy" + loss_type: CLASSIFICATION + metrics_set: { + auc {} + } + dnn { + hidden_units: [256, 128, 64] + } + relation_tower_names: ["ctr"] + relation_dnn { + hidden_units: [32] + } + weight: 1.0 + } + l2_regularization: 1e-6 + } + embedding_regularization: 5e-6 +} +``` + +DBMTL模型需要在`model_params`里为每个子任务的Tower配置`relation_dnn`,同时还需要通`relation_tower_names`配置任务间的依赖关系。 + +这个案例同样没有为backbone配置`concat_blocks`,框架会自动设置为DAG的叶子节点。 + +## 其他案例(FiBiNet & MaskNet) + +两个新的模型: + +- FiBiNet模型配置文件:[fibinet_on_movielens.config](https://github.com/alibaba/EasyRec/tree/master/examples/configs/fibinet_on_movielens.config) +- MaskNet模型配置文件:[masknet_on_movielens.config](https://github.com/alibaba/EasyRec/tree/master/examples/configs/masknet_on_movielens.config) + +MovieLens-1M数据集效果: + +| Model | Epoch | AUC | +| ------- | ----- | ------ | +| MaskNet | 1 | 0.8872 | +| FibiNet | 1 | 0.8893 | + +# 组件库介绍 + +## 1.基础组件 + +| 类名 | 功能 | 说明 | +| ----------------- | ------ | --------------------------------------- | +| MLP | 多层感知机 | 支持配置激活函数、初始化方法、Dropout、是否使用BN等 | +| Highway | 类似残差链接 | 可用来对预训练embedding做增量微调,来自Highway Network | +| Gate | 门控 | 多个输入的加权求和 | +| PeriodicEmbedding | 周期激活函数 | 数值特征Embedding | +| AutoDisEmbedding | 自动离散化 | 数值特征Embedding | + +## 2.特征交叉组件 + +| 类名 | 功能 | 说明 | +| 
-------------- | ---------------- | ------------ | +| FM | 二阶交叉 | DeepFM模型的组件 | +| DotInteraction | 二阶内积交叉 | DLRM模型的组件 | +| Cross | bit-wise交叉 | DCN v2模型的组件 | +| BiLinear | 双线性 | FiBiNet模型的组件 | +| FiBiNet | SENet & BiLinear | FiBiNet模型 | + +## 3.特征重要度学习组件 + +| 类名 | 功能 | 说明 | +| --------- | ----------------- | ------------ | +| SENet | 建模特征重要度 | FiBiNet模型的组件 | +| MaskBlock | 建模特征重要度 | MaskNet模型的组件 | +| MaskNet | 多个串行或并行的MaskBlock | MaskNet模型 | + +## 4. 序列特征编码组件 + +| 类名 | 功能 | 说明 | +| --- | ---------------- | -------- | +| DIN | target attention | DIN模型的组件 | +| BST | transformer | BST模型的组件 | + +## 5. 多目标学习组件 + +| 类名 | 功能 | 说明 | +| ---- | --------------------------- | --------- | +| MMoE | Multiple Mixture of Experts | MMoE模型的组件 | + +# 如何自定义组件 + +在 `easy_rec/python/layers/keras` 目录下新建一个`py`文件,也可直接添加到一个已有的文件中。我们建议目标类似的组件定义在同一个文件中,减少文件数量;比如特征交叉的组件都放在`interaction.py`里。 + +定义一个继承[`tf.keras.layers.Layer`](https://keras.io/api/layers/base_layer/)的组件类,至少实现两个方法:`__init__`、`call`。 + +```python +def __init__(self, params, name='xxx', **kwargs): + pass +def call(self, inputs, training=None, **kwargs): + pass +``` + +`__init__`方法的第一个参数`params`接受框架传递给当前组件的参数。支持两种参数配置的方式:`google.protobuf.Struct`、自定义的protobuf message对象。params对象封装了对这两种格式的参数的统一读取接口,如下: + +- 检查必传参数,缺失时报错退出: + `params.check_required(['embedding_dim', 'sigma'])` +- 用点操作符读取参数: + `sigma = params.sigma`;支持连续点操作符,如`params.a.b`: +- 注意数值型参数的类型,`Struct`只支持float类型,整型需要强制转换: + `embedding_dim = int(params.embedding_dim)` +- 数组类型也需要强制类型转换: `units = list(params.hidden_units)` +- 指定默认值读取,返回值会被强制转换为与默认值同类型:`activation = params.get_or_default('activation', 'relu')` +- 支持嵌套子结构的默认值读取:`params.field.get_or_default('key', def_val)` +- 判断某个参数是否存在:`params.has_field(key)` +- 【不建议,会限定传参方式】获取自定义的proto对象:`params.get_pb_config()` +- 读写`l2_regularizer`属性:`params.l2_regularizer`,传给Dense层或dense函数。 + +【可选】如需要自定义protobuf message参数,先在`easy_rec/python/protos/layer.proto`添加参数message的定义, 
+再把参数注册到定义在`easy_rec/python/protos/keras_layer.proto`的`KerasLayer.params`消息体中。 + +`call`方法用来实现主要的模块逻辑,其`inputs`参数可以是一个tensor,或者是一个tensor列表。可选的`training`参数用来标识当前是否是训练模式。 + +最后也是最重要的一点,新开发的Layer需要在`easy_rec.python.layers.keras.__init__.py`文件中导出才能被框架识别为组件库中的一员。例如要导出`blocks.py`文件中的`MLP`类,则需要添加:`from .blocks import MLP`。 + +FM layer的代码示例: + +```python +class FM(tf.keras.layers.Layer): + """Factorization Machine models pairwise (order-2) feature interactions without linear term and bias. + + References + - [Factorization Machines](https://www.csie.ntu.edu.tw/~b97053/paper/Rendle2010FM.pdf) + Input shape + - List of 2D tensor with shape: ``(batch_size, embedding_size)``. + - Or a 3D tensor with shape: ``(batch_size, field_size, embedding_size)`` + Output shape + - 2D tensor with shape: ``(batch_size, 1)``. + """ + + def __init__(self, params, name='fm', **kwargs): + super(FM, self).__init__(name=name, **kwargs) + self.use_variant = params.get_or_default('use_variant', False) + + def call(self, inputs, **kwargs): + if type(inputs) == list: + emb_dims = set(map(lambda x: int(x.shape[-1]), inputs)) + if len(emb_dims) != 1: + dims = ','.join([str(d) for d in emb_dims]) + raise ValueError('all embedding dim must be equal in FM layer:' + dims) + with tf.name_scope(self.name): + fea = tf.stack(inputs, axis=1) + else: + assert inputs.shape.ndims == 3, 'input of FM layer must be a 3D tensor or a list of 2D tensors' + fea = inputs + + with tf.name_scope(self.name): + square_of_sum = tf.square(tf.reduce_sum(fea, axis=1)) + sum_of_square = tf.reduce_sum(tf.square(fea), axis=1) + cross_term = tf.subtract(square_of_sum, sum_of_square) + if self.use_variant: + cross_term = 0.5 * cross_term + else: + cross_term = 0.5 * tf.reduce_sum(cross_term, axis=-1, keepdims=True) + return cross_term +``` + +# 如何搭建模型 + +`组件块`和`组件包`是搭建主干网络的核心部件,本小节将会介绍`组件块`的类型、功能和配置参数;同时还会介绍专门为参数共享子网络设计的`组件包`。 + +通过`组件块`和`组件包`搭建模型的配置方法请参考上文描述的各个 [案例](#wide-deep)。 + +`组件块`的protobuf定义如下: + +```protobuf +message Block { 
required string name = 1; + // the input names of feature groups or other blocks + repeated Input inputs = 2; + optional int32 input_concat_axis = 3 [default = -1]; + optional bool merge_inputs_into_list = 4; + optional string extra_input_fn = 5; + + // sequential layers + repeated Layer layers = 6; + // only take effect when there are no layers + oneof layer { + InputLayer input_layer = 101; + Lambda lambda = 102; + KerasLayer keras_layer = 103; + RecurrentLayer recurrent = 104; + RepeatLayer repeat = 105; + } +} +``` + +`组件块`会自动合并多个输入: + +1. 若多路输入中某一路的输入类型是`list`,则最终结果被Merge成一个大的list,保持顺序不变; +2. 若多路输入中的每一路输入都是tensor,默认按照最后一个维度对输入tensors执行拼接(concat),以下配置项可以改变默认行为: + +- `input_concat_axis` 用来指定输入tensors拼接的维度 +- `merge_inputs_into_list` 设为true,则把输入合并到一个列表里,不做concat操作 + +```protobuf +message Input { + oneof name { + string feature_group_name = 1; + string block_name = 2; + string package_name = 3; + } + optional string input_fn = 11; + optional string input_slice = 12; +} +``` + +- 每一路输入可以配置一个可选的`input_fn`,指定一个lambda函数对输入做一些简单的变换。比如配置`input_fn: 'lambda x: [x]'`可以把输入变成列表格式。 +- `input_slice`可以用来获取输入元组/列表的某个切片。比如,当某路输入是一个列表对象时,可以用`input_slice: '[1]'`配置项获取列表的第二个元素值作为这一路的输入。 +- `extra_input_fn` 是一个可选的配置项,用来对合并后的多路输入结果做一些额外的变换,需要配置成lambda函数的格式。 + +目前总共有7种类型的`组件块`,分别是`空组件块`、`输入组件块`、`Lambda组件块`、`KerasLayer组件块`、`循环组件块`、`重复组件块`、`序列组件块`。 + +## 1. 空组件块 + +当一个`block`不配置任何layer时就称之为`空组件块`,`空组件块`只执行多路输入的Merge操作。 + +## 2. 
输入组件块 + +`输入组件块`关联一个`input_layer`,获取、加工并返回原始的特征输入。 + +`输入组件块`比较特殊,它有且只有一路输入,并且只能用`feature_group_name`项配置输入为一个`feature_group`的`name`。 + +`输入组件块`有一个特权:它的名字可以与其输入的`feature_group`同名。其他`组件块`则无此殊荣。 + +配置示例: + +```protobuf +blocks { + name: 'all' + inputs { + feature_group_name: 'all' + } + input_layer { + only_output_feature_list: true + } +} +``` + +InputLayer可以通过配置获取不同格式的输入,并且可以执行一些如`dropout`之类的额外操作,其参数定义的protobuf如下: + +```protobuf +message InputLayer { + optional bool do_batch_norm = 1; + optional bool do_layer_norm = 2; + optional float dropout_rate = 3; + optional float feature_dropout_rate = 4; + optional bool only_output_feature_list = 5; + optional bool only_output_3d_tensor = 6; + optional bool output_2d_tensor_and_feature_list = 7; + optional bool output_seq_and_normal_feature = 8; +} +``` + +输入层的定义如上,各配置项说明如下: + +- `do_batch_norm` 是否对输入特征做`batch normalization` +- `do_layer_norm` 是否对输入特征做`layer normalization` +- `dropout_rate` 输入层执行dropout的概率,默认不执行dropout +- `feature_dropout_rate` 对特征整体执行dropout的概率,默认不执行 +- `only_output_feature_list` 输出list格式的各个特征 +- `only_output_3d_tensor` 输出`feature group`对应的一个3d tensor,在`embedding_dim`相同时可配置该项 +- `output_2d_tensor_and_feature_list` 是否同时输出2d tensor与特征list +- `output_seq_and_normal_feature` 是否输出(sequence特征, 常规特征)元组 + +## 3. Lambda组件块 + +`Lambda组件块`可以配置一个lambda函数,执行一些较简单的操作。示例如下: + +```protobuf +blocks { + name: 'wide_logit' + inputs { + feature_group_name: 'wide' + } + lambda { + expression: 'lambda x: tf.reduce_sum(x, axis=1, keepdims=True)' + } +} +``` + +## 4. KerasLayer组件块 + +`KerasLayer组件块`是最核心的组件块,负责加载、执行组件代码逻辑。 + +- `class_name`是要加载的Keras Layer的类名,支持加载自定义的类和系统内置的Layer类。 +- `st_params`是以`google.protobuf.Struct`对象格式配置的参数; +- 还可以用自定义的protobuf message的格式传递参数给加载的Layer对象。 + +配置示例: + +```protobuf +keras_layer { + class_name: 'MLP' + mlp { + hidden_units: [64, 32, 16] + } +} + +keras_layer { + class_name: 'Dropout' + st_params { + fields { + key: 'rate' + value: { number_value: 0.5 } + } + } +} +``` + +## 5. 
循环组件块 + +`循环组件块`可以实现类似RNN的循环调用结构,可以执行某个Layer多次,每次执行的输入包含了上一次执行的输出。在[DCN](https://github.com/alibaba/EasyRec/tree/master/examples/configs/dcn_backbone_on_movielens.config)网络中有循环组件块的示例,如下: + +```protobuf +recurrent { + num_steps: 3 + fixed_input_index: 0 + keras_layer { + class_name: 'Cross' + } +} +``` + +上述配置对`Cross` Layer循环调用了3次,逻辑上等价于执行如下语句: + +```python +x1 = Cross()(x0, x0) +x2 = Cross()(x0, x1) +x3 = Cross()(x0, x2) +``` + +- `num_steps` 配置循环执行的次数 +- `fixed_input_index` 配置每次执行的多路输入组成的列表中固定不变的元素;比如上述示例中的`x0` +- `keras_layer` 配置需要执行的组件 + +## 6. 重复组件块 + +`重复组件块` 可以使用相同的输入重复执行某个组件多次,实现`multi-head`的逻辑。示例如下: + +```protobuf +repeat { + num_repeat: 2 + keras_layer { + class_name: "MaskBlock" + mask_block { + output_size: 512 + aggregation_size: 2048 + input_layer_norm: false + } + } +} +``` + +- `num_repeat` 配置重复执行的次数 +- `output_concat_axis` 配置多次执行结果tensors的拼接维度,若不配置则输出多次执行结果的列表 +- `keras_layer` 配置需要执行的组件 + +## 7. 序列组件块 + +`序列组件块`可以依次执行配置的多个Layer,前一个Layer的输出是后一个Layer的输入。`序列组件块`相对于配置多个首尾相连的普通组件块要更加简单。示例如下: + +```protobuf +blocks { + name: 'mlp' + inputs { + feature_group_name: 'features' + } + layers { + keras_layer { + class_name: 'Dense' + st_params { + fields { + key: 'units' + value: { number_value: 256 } + } + + fields { + key: 'activation' + value: { string_value: 'relu' } + } + } + } + } + layers { + keras_layer { + class_name: 'Dropout' + st_params { + fields { + key: 'rate' + value: { number_value: 0.5 } + } + } + } + } + layers { + keras_layer { + class_name: 'Dense' + st_params { + fields { + key: 'units' + value: { number_value: 1 } + } + } + } + } +} +``` + +## 通过`组件包`实现参数共享的子网络 + +`组件包`封装了由多个`组件块`搭建的一个子网络DAG,作为整体可以被以参数共享的方式多次调用,通常用在 *自监督学习* 模型中。 + +`组件包`的protobuf消息定义如下: + +```protobuf +message BlockPackage { + // package name + required string name = 1; + // a few blocks generating a DAG + repeated Block blocks = 2; + // the names of output blocks + repeated string concat_blocks = 3; +} +``` + +`组件块`通过`package_name`参数配置一路输入来调用`组件包`。 + +一个使用`组件包`来实现 *对比学习* 
的案例如下: + +```protobuf +model_config { + model_class: "RankModel" + feature_groups { + group_name: "all" + feature_names: "adgroup_id" + feature_names: "user" + ... + feature_names: "pid" + wide_deep: DEEP + } + + backbone { + packages { + name: 'feature_encoder' + blocks { + name: "fea_dropout" + inputs { + feature_group_name: "all" + } + input_layer { + dropout_rate: 0.5 + only_output_3d_tensor: true + } + } + blocks { + name: "encode" + inputs { + block_name: "fea_dropout" + } + layers { + keras_layer { + class_name: 'BSTCTR' + bst { + hidden_size: 128 + num_attention_heads: 4 + num_hidden_layers: 3 + intermediate_size: 128 + hidden_act: 'gelu' + max_position_embeddings: 50 + hidden_dropout_prob: 0.1 + attention_probs_dropout_prob: 0 + } + } + } + layers { + keras_layer { + class_name: 'Dense' + st_params { + fields { + key: 'units' + value: { number_value: 128 } + } + fields { + key: 'kernel_initializer' + value: { string_value: 'zeros' } + } + } + } + } + } + } + blocks { + name: "all" + inputs { + name: "all" + } + input_layer { + only_output_3d_tensor: true + } + } + blocks { + name: "loss_ctr" + merge_inputs_into_list: true + inputs { + package_name: 'feature_encoder' + } + inputs { + package_name: 'feature_encoder' + } + inputs { + package_name: 'all' + } + keras_layer { + class_name: 'LOSSCTR' + st_params{ + fields { + key: 'cl_weight' + value: { number_value: 1 } + } + fields { + key: 'au_weight' + value: { number_value: 0.01 } + } + } + } + } + } + model_params { + l2_regularization: 1e-5 + } + embedding_regularization: 1e-5 +} +``` diff --git a/docs/source/component/component.md b/docs/source/component/component.md new file mode 100644 index 000000000..f502393eb --- /dev/null +++ b/docs/source/component/component.md @@ -0,0 +1,108 @@ +# 组件详细参数 + +## 1.基础组件 + +- MLP (多层感知机) + +| 参数 | 类型 | 默认值 | 说明 | +| ----------------------- | ---- | ---------- | --------------------------- | +| hidden_units | list | | 各隐层单元数 | +| dropout_ratio | list | | 各隐层dropout 
rate | +| activation | str | relu | 每层的激活函数 | +| use_bias | bool | true | 是否使用偏置项 | +| use_bn | bool | true | 是否使用batch normalization | +| use_final_bn | bool | true | 最后一层是否使用batch normalization | +| final_activation | str | relu | 最后一层的激活函数 | +| initializer | str | he_uniform | 权重初始化方法,参考keras Dense layer | +| use_bn_after_activation | bool | false | 是否在激活函数之后做batch norm | + +- HighWay + +| 参数 | 类型 | 默认值 | 说明 | +| ------------ | ------ | ---- | ------------ | +| emb_size | uint32 | | embedding维度 | +| activation | str | gelu | 激活函数 | +| dropout_rate | float | | dropout rate | + +- PeriodicEmbedding + +| 参数 | 类型 | 默认值 | 说明 | +| ------------------ | ------ | ----- | ------------------------------------------------- | +| embedding_dim | uint32 | | embedding维度 | +| sigma | float | | 初始化自定义参数时的标准差,**效果敏感、小心调参** | +| add_linear_layer | bool | true | 是否在embedding之后添加额外的层 | +| linear_activation | str | relu | 额外添加的层的激活函数 | +| output_tensor_list | bool | false | 是否同时输出embedding列表 | +| output_3d_tensor | bool | false | 是否同时输出3d tensor, `output_tensor_list=true`时该参数不生效 | + +- AutoDisEmbedding + +| 参数 | 类型 | 默认值 | 说明 | +| ------------------ | ------ | ----- | ------------------------------------------------- | +| embedding_dim | uint32 | | embedding维度 | +| num_bins | uint32 | | 虚拟分桶数量 | +| keep_prob | float | 0.8 | 残差链接的权重 | +| temperature | float | | softmax函数的温度系数 | +| output_tensor_list | bool | false | 是否同时输出embedding列表 | +| output_3d_tensor | bool | false | 是否同时输出3d tensor, `output_tensor_list=true`时该参数不生效 | + +## 2.特征交叉组件 + +- Bilinear + +| 参数 | 类型 | 默认值 | 说明 | +| ---------------- | ------ | ----------- | ---------- | +| type | string | interaction | 双线性类型 | +| use_plus | bool | true | 是否使用plus版本 | +| num_output_units | uint32 | | 输出size | + +- FiBiNet + +| 参数 | 类型 | 默认值 | 说明 | +| -------- | -------- | --- | ---------------- | +| bilinear | Bilinear | | protobuf message | +| senet | SENet | | protobuf message | +| mlp | MLP | | protobuf message | + +## 3.特征重要度学习组件 + +- 
SENet + +| 参数 | 类型 | 默认值 | 说明 | +| --------------------- | ------ | ---- | ------------------ | +| reduction_ratio | uint32 | 4 | 隐层单元数量缩减倍数 | +| num_squeeze_group | uint32 | 2 | 压缩分组数量 | +| use_skip_connection | bool | true | 是否使用残差连接 | +| use_output_layer_norm | bool | true | 是否在输出层使用layer norm | + +- MaskBlock + +| 参数 | 类型 | 默认值 | 说明 | +| ---------------- | ------ | ---- | ------------------------------- | +| output_size | uint32 | | 输出层单元数 | +| reduction_factor | float | | 隐层单元数缩减因子 | +| aggregation_size | uint32 | | 隐层单元数 | +| input_layer_norm | bool | true | 输入是否需要做layer norm | +| projection_dim | uint32 | | 用两个小矩阵相乘代替原来的输入-隐层权重矩阵,配置小矩阵的维数 | + +- MaskNet + +| 参数 | 类型 | 默认值 | 说明 | +| ------------ | ---- | ---- | ------------- | +| mask_blocks | list | | MaskBlock结构列表 | +| use_parallel | bool | true | 是否使用并行模式 | +| mlp | MLP | 可选 | 顶部mlp | + +## 4. 序列特征编码组件 + +请参考Protobuf Message的定义,文件路径:`easy_rec/python/protos/seq_encoder.proto` + +## 5. 多任务学习组件 + +- MMoE + +| 参数 | 类型 | 默认值 | 说明 | +| ---------- | ------ | --- | ------------ | +| num_task | uint32 | | 任务数 | +| num_expert | uint32 | 0 | expert数量 | +| expert_mlp | MLP | 可选 | expert的mlp参数 | diff --git a/docs/source/index.rst b/docs/source/index.rst index c7b6e94ed..9cef0a0a5 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -24,6 +24,13 @@ Welcome to easy_rec's documentation! feature/rtp_fg feature/rtp_native +.. toctree:: + :maxdepth: 3 + :caption: BACKBONE & COMPONENT + + component/backbone + component/component + .. toctree:: :maxdepth: 3 :caption: MODEL @@ -37,7 +44,6 @@ Welcome to easy_rec's documentation! 
:maxdepth: 2 :caption: TRAIN & EVAL & EXPORT - loss train incremental_train online_train diff --git a/docs/source/models/dbmtl.md b/docs/source/models/dbmtl.md index a3773e275..aa4015aa7 100644 --- a/docs/source/models/dbmtl.md +++ b/docs/source/models/dbmtl.md @@ -82,6 +82,101 @@ model_config { - 注:label_fields需与task_towers一一对齐。 - embedding_regularization: 对embedding部分加regularization,防止overfit +#### DBMTL Based On Backbone + +```protobuf +model_config { + model_name: "DBMTL" + model_class: "MultiTaskModel" + feature_groups { + group_name: "all" + feature_names: "user_id" + feature_names: "cms_segid" + ... + feature_names: "tag_brand_list" + wide_deep: DEEP + } + backbone { + blocks { + name: "mask_net" + inputs { + feature_group_name: "all" + } + keras_layer { + class_name: 'MaskNet' + masknet { + mask_blocks { + aggregation_size: 512 + output_size: 256 + } + mask_blocks { + aggregation_size: 512 + output_size: 256 + } + mask_blocks { + aggregation_size: 512 + output_size: 256 + } + mlp { + hidden_units: [512, 256] + } + } + } + } + } + model_params { + task_towers { + tower_name: "ctr" + label_name: "clk" + loss_type: CLASSIFICATION + metrics_set: { + auc {} + } + dnn { + hidden_units: [256, 128, 64, 32] + } + relation_dnn { + hidden_units: [32] + } + weight: 1.0 + } + task_towers { + tower_name: "cvr" + label_name: "buy" + loss_type: CLASSIFICATION + metrics_set: { + auc {} + } + dnn { + hidden_units: [256, 128, 64, 32] + } + relation_tower_names: ["ctr"] + relation_dnn { + hidden_units: [32] + } + weight: 1.0 + } + l2_regularization: 1e-6 + } + embedding_regularization: 5e-6 +} +``` + +该案例添加了一个额外的`MaskNet`层,为了展示以组件化方式搭建模型的灵活性。 + +- model_name: 任意自定义字符串,仅有注释作用 + +- model_class: 'MultiTaskModel', 不需要修改, 通过组件化方式搭建的多目标排序模型都叫这个名字 + +- backbone: 通过组件化的方式搭建的主干网络,[参考文档](../component/backbone.md) + + - blocks: 由多个`组件块`组成的一个有向无环图(DAG),框架负责按照DAG的拓扑排序执行个`组件块`关联的代码逻辑,构建TF Graph的一个子图 + - name/inputs: 每个`block`有一个唯一的名字(name),并且有一个或多个输入(inputs)和输出 + - keras_layer: 
加载由`class_name`指定的自定义或系统内置的keras layer,执行一段代码逻辑;[参考文档](../component/backbone.md#keraslayer) + - masknet: MaskNet模型的参数,详见[参考文档](../component/component.md#id4) + +- 其余与dbmtl一致 + #### DBMTL+MMOE ```protobuf @@ -374,6 +469,7 @@ model_config: { - [DBMTL_demo.config](https://easyrec.oss-cn-beijing.aliyuncs.com/config/dbmtl.config) - [DBMTL_MMOE_demo.config](https://easyrec.oss-cn-beijing.aliyuncs.com/config/dbmtl_mmoe.config) +- [DBMTL_Backbone_demo.config](https://github.com/alibaba/EasyRec/blob/master/samples/model_config/dbmtl_backbone_on_taobao.config) - [DBMTL_CMBF_demo.config](https://github.com/alibaba/EasyRec/blob/master/samples/model_config/dbmtl_cmbf_on_movielens.config) - [DBMTL_UNITER_demo.config](https://github.com/alibaba/EasyRec/blob/master/samples/model_config/dbmtl_uniter_on_movielens.config) diff --git a/docs/source/models/dcn.md b/docs/source/models/dcn.md index 1891c2f14..7b26cbe2f 100644 --- a/docs/source/models/dcn.md +++ b/docs/source/models/dcn.md @@ -4,9 +4,18 @@ Deep&Cross Network(DCN)是在DNN模型的基础上,引入了一种新型的交叉网络,该网络在学习某些特征交叉时效率更高。特别是,DCN显式地在每一层应用特征交叉,不需要人工特征工程,并且只增加了很小的额外复杂性。 -![deepfm.png](../../images/models/dcn.png) +![dcn.png](../../images/models/dcn.png) -### 配置说明 +DCN-V2相对于前一个版本的模型,主要的改进点在于: + +(1) Wide侧-Cross Network中用矩阵替代向量; + +(2) 提出2种模型结构,传统的Wide&Deep并行 + Wide&Deep串行。 + +![dcn_v2](../../images/models/dcn_v2.jpg) +![dcn_v2_cross](../../images/models/dcn_v2_cross.jpg) + +### DCN v1 配置说明 ```protobuf model_config: { @@ -74,10 +83,87 @@ model_config: { - embedding_regularization: 对embedding部分加regularization,防止overfit +### DCN v2 配置说明 + +```protobuf +model_config { + model_name: 'DCN v2' + model_class: 'RankModel' + feature_groups: { + group_name: 'all' + feature_names: 'user_id' + feature_names: 'movie_id' + feature_names: 'job_id' + feature_names: 'age' + feature_names: 'gender' + feature_names: 'year' + feature_names: 'genres' + wide_deep: DEEP + } + backbone { + blocks { + name: "deep" + inputs { + feature_group_name: 'all' + } + 
keras_layer { + class_name: 'MLP' + mlp { + hidden_units: [256, 128, 64] + } + } + } + blocks { + name: "dcn" + inputs { + feature_group_name: 'all' + input_fn: 'lambda x: [x, x]' + } + recurrent { + num_steps: 3 + fixed_input_index: 0 + keras_layer { + class_name: 'Cross' + } + } + } + concat_blocks: ['deep', 'dcn'] + top_mlp { + hidden_units: [64, 32, 16] + } + } + model_params { + l2_regularization: 1e-4 + } + embedding_regularization: 1e-4 +} +``` + +- model_name: 任意自定义字符串,仅有注释作用 +- model_class: 'RankModel', 不需要修改, 通过组件化方式搭建的单目标排序模型都叫这个名字 +- feature_groups: 配置一个名为'all'的feature_group。 +- backbone: 通过组件化的方式搭建的主干网络,[参考文档](../component/backbone.md) + - blocks: 由多个`组件块`组成的一个有向无环图(DAG),框架负责按照DAG的拓扑排序执行个`组件块`关联的代码逻辑,构建TF Graph的一个子图 + - name/inputs: 每个`block`有一个唯一的名字(name),并且有一个或多个输入(inputs)和输出 + - input_fn: 配置一个lambda函数对输入做一些简单的变换 + - input_layer: 对输入的`feature group`配置的特征做一些额外的加工,比如执行可选的`batch normalization`、`layer normalization`、`feature dropout`等操作,并且可以指定输出的tensor的格式(2d、3d、list等);[参考文档](../component/backbone.md#id15) + - keras_layer: 加载由`class_name`指定的自定义或系统内置的keras layer,执行一段代码逻辑;[参考文档](../component/backbone.md#keraslayer) + - recurrent: 循环调用指定的Keras Layer,参考 [循环组件块](../component/backbone.md#id16) + - num_steps 配置循环执行的次数 + - fixed_input_index 配置每次执行的多路输入组成的列表中固定不变的元素 + - keras_layer: 同上 + - concat_blocks: DAG的输出节点由`concat_blocks`配置项定义,如果不配置`concat_blocks`,框架会自动拼接DAG的所有叶子节点并输出。 + - top_mlp: 各输出`组件块`的输出tensor拼接之后输入给一个可选的顶部MLP层 +- model_params: + - l2_regularization: 对DNN参数的regularization, 减少overfit +- embedding_regularization: 对embedding部分加regularization, 减少overfit + ### 示例Config -[DCN_demo.config](https://easyrec.oss-cn-beijing.aliyuncs.com/config/dcn.config) +1. DCN V1: [DCN_demo.config](https://easyrec.oss-cn-beijing.aliyuncs.com/config/dcn.config) +1. DCN V2: [dcn_backbone_on_movielens.config](https://github.com/alibaba/EasyRec/tree/master/examples/configs/dcn_backbone_on_movielens.config) ### 参考论文 -[DCN](https://arxiv.org/abs/1708.05123) +1. 
[DCN v1](https://arxiv.org/abs/1708.05123) +1. [DCN v2](https://arxiv.org/abs/2008.13535) diff --git a/docs/source/models/deepfm.md b/docs/source/models/deepfm.md index 3b6e75c0a..f98e37870 100644 --- a/docs/source/models/deepfm.md +++ b/docs/source/models/deepfm.md @@ -8,6 +8,8 @@ DeepFM是在WideAndDeep基础上加入了FM模块的改进模型。FM模块和DN ### 配置说明 +#### 1. 内置模型 + ```protobuf model_config:{ model_class: "DeepFM" @@ -64,9 +66,126 @@ model_config:{ - embedding_regularization: 对embedding部分加regularization,防止overfit +#### 2. 组件化模型 + +```protobuf +model_config: { + model_name: 'DeepFM' + model_class: 'RankModel' + feature_groups: { + group_name: 'wide' + feature_names: 'user_id' + feature_names: 'movie_id' + feature_names: 'job_id' + feature_names: 'age' + feature_names: 'gender' + feature_names: 'year' + feature_names: 'genres' + wide_deep: WIDE + } + feature_groups: { + group_name: 'features' + feature_names: 'user_id' + feature_names: 'movie_id' + feature_names: 'job_id' + feature_names: 'age' + feature_names: 'gender' + feature_names: 'year' + feature_names: 'genres' + feature_names: 'title' + wide_deep: DEEP + } + backbone { + blocks { + name: 'wide' + inputs { + feature_group_name: 'wide' + } + input_layer { + wide_output_dim: 1 + } + } + blocks { + name: 'features' + inputs { + feature_group_name: 'features' + } + input_layer { + output_2d_tensor_and_feature_list: true + } + } + blocks { + name: 'fm' + inputs { + block_name: 'features' + input_slice: '[1]' + } + keras_layer { + class_name: 'FM' + } + } + blocks { + name: 'deep' + inputs { + block_name: 'features' + input_slice: '[0]' + } + keras_layer { + class_name: 'MLP' + mlp { + hidden_units: [256, 128, 64, 1] + use_final_bn: false + final_activation: 'linear' + } + } + } + blocks { + name: 'add' + inputs { + block_name: 'wide' + input_fn: 'lambda x: tf.reduce_sum(x, axis=1, keepdims=True)' + } + inputs { + block_name: 'fm' + } + inputs { + block_name: 'deep' + } + merge_inputs_into_list: true + keras_layer { + class_name: 
'Add' + } + } + concat_blocks: 'add' + } + model_params { + l2_regularization: 1e-4 + } + embedding_regularization: 1e-4 +} +``` + +- model_name: 任意自定义字符串,仅有注释作用 +- model_class: 'RankModel', 不需要修改, 通过组件化方式搭建的单目标排序模型都叫这个名字 +- feature_groups: 特征组 + - 包含两个feature_group: wide 和 features group +- backbone: 通过组件化的方式搭建的主干网络,[参考文档](../component/backbone.md) + - blocks: 由多个`组件块`组成的一个有向无环图(DAG),框架负责按照DAG的拓扑排序执行个`组件块`关联的代码逻辑,构建TF Graph的一个子图 + - name/inputs: 每个`block`有一个唯一的名字(name),并且有一个或多个输入(inputs)和输出 + - input_slice: 用来获取输入元组/列表的某个切片 + - input_fn: 配置一个lambda函数对输入做一些简单的变换 + - input_layer: 对输入的`feature group`配置的特征做一些额外的加工,比如执行可选的`batch normalization`、`layer normalization`、`feature dropout`等操作,并且可以指定输出的tensor的格式(2d、3d、list等);[参考文档](../component/backbone.md#id15) + - wide_output_dim: wide部分输出的tensor的维度 + - keras_layer: 加载由`class_name`指定的自定义或系统内置的keras layer,执行一段代码逻辑;[参考文档](../component/backbone.md#keraslayer) + - concat_blocks: DAG的输出节点由`concat_blocks`配置项定义,如果不配置`concat_blocks`,框架会自动拼接DAG的所有叶子节点并输出。 +- model_params: + - l2_regularization: 对DNN参数的regularization, 减少overfit +- embedding_regularization: 对embedding部分加regularization, 减少overfit + ### 示例Config -[DeepFM_demo.config](https://easyrec.oss-cn-beijing.aliyuncs.com/config/deepfm.config) +1. 内置模型:[DeepFM_demo.config](https://easyrec.oss-cn-beijing.aliyuncs.com/config/deepfm.config) +1. 组件化模型:[deepfm_backbone_on_movielens.config](https://github.com/alibaba/EasyRec/tree/master/examples/configs/deepfm_backbone_on_movielens.config) ### 参考论文 diff --git a/docs/source/models/dlrm.md b/docs/source/models/dlrm.md index a9d9a203f..66ad84e69 100644 --- a/docs/source/models/dlrm.md +++ b/docs/source/models/dlrm.md @@ -22,6 +22,8 @@ input: ### 配置说明 +#### 1. 内置模型 + ```protobuf model_config { model_class: 'DLRM' @@ -108,9 +110,114 @@ model_config { - embedding_regularization: 对embedding部分加regularization, 减少overfit +#### 2. 
组件化模型 + +``` +model_config: { + model_name: 'DLRM' + model_class: 'RankModel' + feature_groups { + group_name: 'dense' + feature_names: 'age_level' + feature_names: 'pvalue_level' + feature_names: 'shopping_level' + feature_names: 'new_user_class_level' + feature_names: 'price' + wide_deep: DEEP + } + feature_groups { + group_name: 'sparse' + feature_names: 'user_id' + feature_names: 'cms_segid' + feature_names: 'cms_group_id' + feature_names: 'occupation' + feature_names: 'adgroup_id' + feature_names: 'cate_id' + feature_names: 'campaign_id' + feature_names: 'customer' + feature_names: 'brand' + feature_names: 'pid' + feature_names: 'tag_category_list' + feature_names: 'tag_brand_list' + wide_deep: DEEP + } + backbone { + blocks { + name: 'bottom_mlp' + inputs { + feature_group_name: 'dense' + } + keras_layer { + class_name: 'MLP' + mlp { + hidden_units: [64, 32, 16] + } + } + } + blocks { + name: 'sparse' + inputs { + feature_group_name: 'sparse' + } + input_layer { + output_2d_tensor_and_feature_list: true + } + } + blocks { + name: 'dot' + inputs { + block_name: 'bottom_mlp' + input_fn: 'lambda x: [x]' + } + inputs { + block_name: 'sparse' + input_slice: '[1]' + } + keras_layer { + class_name: 'DotInteraction' + } + } + blocks { + name: 'sparse_2d' + inputs { + block_name: 'sparse' + input_slice: '[0]' + } + } + concat_blocks: ['bottom_mlp', 'sparse_2d', 'dot'] + top_mlp { + hidden_units: [256, 128, 64] + } + } + model_params { + l2_regularization: 1e-5 + } + embedding_regularization: 1e-5 +} +``` + +- model_name: 任意自定义字符串,仅有注释作用 +- model_class: 'RankModel', 不需要修改, 通过组件化方式搭建的单目标排序模型都叫这个名字 +- feature_groups: 特征组 + - 包含两个feature_group: dense 和sparse group + - wide_deep: dlrm模型使用的都是Deep features, 所以都设置成DEEP +- backbone: 通过组件化的方式搭建的主干网络,[参考文档](../component/backbone.md) + - blocks: 由多个`组件块`组成的一个有向无环图(DAG),框架负责按照DAG的拓扑排序执行个`组件块`关联的代码逻辑,构建TF Graph的一个子图 + - name/inputs: 每个`block`有一个唯一的名字(name),并且有一个或多个输入(inputs)和输出 + - input_fn: 配置一个lambda函数对输入做一些简单的变换 + - input_slice: 
用来获取输入元组/列表的某个切片 + - input_layer: 对输入的`feature group`配置的特征做一些额外的加工,比如执行可选的`batch normalization`、`layer normalization`、`feature dropout`等操作,并且可以指定输出的tensor的格式(2d、3d、list等);[参考文档](../component/backbone.md#id15) + - keras_layer: 加载由`class_name`指定的自定义或系统内置的keras layer,执行一段代码逻辑;[参考文档](../component/backbone.md#keraslayer) + - concat_blocks: DAG的输出节点由`concat_blocks`配置项定义 + - top_mlp: 各输出`组件块`的输出tensor拼接之后输入给一个可选的顶部MLP层 +- model_params: + - l2_regularization: 对DNN参数的regularization, 减少overfit +- embedding_regularization: 对embedding部分加regularization, 减少overfit + ### 示例Config -[DLRM_demo.config](https://easyrec.oss-cn-beijing.aliyuncs.com/config/dlrm_on_taobao.config) +1. 内置模型:[DLRM_demo.config](https://easyrec.oss-cn-beijing.aliyuncs.com/config/dlrm_on_taobao.config) +1. 组件化模型:[dlrm_backbone_on_criteo.config](https://github.com/alibaba/EasyRec/tree/master/examples/configs/dlrm_backbone_on_criteo.config) ### 参考论文 diff --git a/docs/source/models/fibinet.md b/docs/source/models/fibinet.md new file mode 100644 index 000000000..62b057c3c --- /dev/null +++ b/docs/source/models/fibinet.md @@ -0,0 +1,105 @@ +# FiBiNet + +### 简介 + +FiBiNet 模型包含两个核心模块, 分别是: + +- SENET(Squeeze-and-Excitation Network) +- Bilinear Feature Interaction + +其中 SENET 借鉴了计算机视觉中的同名网络结构, 可以动态地学习特征的重要性: 越重要的特征将学到越大的权重, 不重要的特征的权重则被减小; + +另外, 对于特征交叉问题, 经典方法主要采用 Inner Product 或 Hadamard Product 来构造交叉特征, 作者认为这些方法过于简单, 可能无法对交叉特征进行有效建模, 因此提出了 Bilinear Feature Interaction 方法, 结合 Inner Product 与 Hadamard Product 二者, 在两个要交叉的特征之间插入一个权重矩阵, 以动态学习特征间的组合关系. 
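上面描述的 Bilinear Feature Interaction, 即在两个特征embedding之间插入权重矩阵: p_ij = (v_i · W) ⊙ v_j。下面用numpy给出一个最小示意, 假设所有特征对共享同一个权重矩阵W(对应所有特征对共享参数的双线性类型); 其中函数名与随机数据均为示例假设, 并非EasyRec的实际实现:

```python
import numpy as np

def bilinear_interaction_all(fields, W):
    """对所有特征对 (i, j), i < j 计算双线性交叉 p_ij = (v_i @ W) * v_j。

    fields: [num_fields, dim] 的特征embedding矩阵
    W:      [dim, dim] 的共享权重矩阵(所有特征对共用)
    返回:   [num_pairs, dim], num_pairs = C(num_fields, 2)
    """
    num_fields, dim = fields.shape
    out = []
    for i in range(num_fields):
        vi_w = fields[i] @ W              # Inner Product 部分: v_i · W
        for j in range(i + 1, num_fields):
            out.append(vi_w * fields[j])  # Hadamard Product 部分: ⊙ v_j
    return np.stack(out)

# 示例: 3个特征域, embedding维度为4
rng = np.random.RandomState(0)
fields = rng.randn(3, 4)
W = rng.randn(4, 4)
p = bilinear_interaction_all(fields, W)
print(p.shape)  # (3, 4) — C(3,2)=3个特征对, 每对输出dim=4的向量
```

若每个特征域或每个特征对使用各自独立的W, 只需把共享的W换成按域/按对索引的一组权重矩阵, 即对应另外两种双线性类型。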
+ +![FiBiNet](../../images/models/fibinet.jpg) + +### 配置说明 + +```protobuf +model_config { + model_name: 'FiBiNet' + model_class: 'RankModel' + feature_groups: { + group_name: 'all' + feature_names: 'user_id' + feature_names: 'movie_id' + feature_names: 'job_id' + feature_names: 'age' + feature_names: 'gender' + feature_names: 'year' + feature_names: 'genres' + wide_deep: DEEP + } + backbone { + blocks { + name: "all" + inputs { + feature_group_name: "all" + } + input_layer { + do_batch_norm: true + only_output_feature_list: true + } + } + blocks { + name: "fibinet" + inputs { + block_name: "all" + } + keras_layer { + class_name: 'FiBiNet' + fibinet { + senet { + reduction_ratio: 4 + } + bilinear { + type: 'each' + num_output_units: 512 + } + mlp { + hidden_units: [512, 256] + } + } + } + } + concat_blocks: ['fibinet'] + } + model_params { + } + embedding_regularization: 1e-4 +} +``` + +- model_name: 任意自定义字符串,仅有注释作用 + +- model_class: 'RankModel', 不需要修改, 通过组件化方式搭建的单目标排序模型都叫这个名字 + +- feature_groups: 配置一个名为'all'的feature_group。 + +- backbone: 通过组件化的方式搭建的主干网络,[参考文档](../component/backbone.md) + + - blocks: 由多个`组件块`组成的一个有向无环图(DAG),框架负责按照DAG的拓扑排序执行个`组件块`关联的代码逻辑,构建TF Graph的一个子图 + - name/inputs: 每个`block`有一个唯一的名字(name),并且有一个或多个输入(inputs)和输出 + - input_layer: 对输入的`feature group`配置的特征做一些额外的加工,比如执行可选的`batch normalization`、`layer normalization`、`feature dropout`等操作,并且可以指定输出的tensor的格式(2d、3d、list等);[参考文档](../component/backbone.md#id15) + - keras_layer: 加载由`class_name`指定的自定义或系统内置的keras layer,执行一段代码逻辑;[参考文档](../component/backbone.md#keraslayer) + - fibinet: FiBiNet模型的参数,详见[参考文档](../component/component.md#id3) + - concat_blocks: DAG的输出节点由`concat_blocks`配置项定义,如果不配置`concat_blocks`,框架会自动拼接DAG的所有叶子节点并输出。 + +- model_params: + + - l2_regularization: (可选) 对DNN参数的regularization, 减少overfit + +- embedding_regularization: 对embedding部分加regularization, 减少overfit + +### 示例config + 
+[fibinet_on_movielens.config](https://github.com/alibaba/EasyRec/tree/master/examples/configs/fibinet_on_movielens.config) + +### 参考论文 + +1. [FiBiNET](https://arxiv.org/pdf/1905.09433.pdf) + Combining Feature Importance and Bilinear feature Interaction for Click-Through Rate Prediction +1. [FiBiNet++](https://arxiv.org/pdf/2209.05016.pdf) + Improving FiBiNet by Greatly Reducing Model Size for CTR Prediction diff --git a/docs/source/models/masknet.md b/docs/source/models/masknet.md new file mode 100644 index 000000000..c9de68866 --- /dev/null +++ b/docs/source/models/masknet.md @@ -0,0 +1,87 @@ +# MaskNet + +### 简介 + +MaskNet提出了一种instance-guided mask方法,该方法在DNN中的特征嵌入层和前馈层同时使用element-wise product。instance-guided mask包含全局上下文信息,动态地融入到特征嵌入层和前馈层,突出重要的特征。 + +![MaskNet](../../images/models/masknet.jpg) + +### 配置说明 + +```protobuf +model_config { + model_name: 'MaskNet' + model_class: 'RankModel' + feature_groups: { + group_name: 'all' + feature_names: 'user_id' + feature_names: 'movie_id' + feature_names: 'job_id' + feature_names: 'age' + feature_names: 'gender' + feature_names: 'year' + feature_names: 'genres' + wide_deep: DEEP + } + backbone { + blocks { + name: "mask_net" + inputs { + feature_group_name: "all" + } + keras_layer { + class_name: 'MaskNet' + masknet { + mask_blocks { + aggregation_size: 512 + output_size: 256 + } + mask_blocks { + aggregation_size: 512 + output_size: 256 + } + mask_blocks { + aggregation_size: 512 + output_size: 256 + } + mlp { + hidden_units: [512, 256] + } + } + } + } + concat_blocks: ['mask_net'] + } + model_params { + } + embedding_regularization: 1e-4 +} +``` + +- model_name: 任意自定义字符串,仅有注释作用 + +- model_class: 'RankModel', 不需要修改, 通过组件化方式搭建的单目标排序模型都叫这个名字 + +- feature_groups: 配置一个名为'all'的feature_group。 + +- backbone: 通过组件化的方式搭建的主干网络,[参考文档](../component/backbone.md) + + - blocks: 由多个`组件块`组成的一个有向无环图(DAG),框架负责按照DAG的拓扑排序执行各`组件块`关联的代码逻辑,构建TF Graph的一个子图 + - name/inputs: 每个`block`有一个唯一的名字(name),并且有一个或多个输入(inputs)和输出 + - keras_layer: 
加载由`class_name`指定的自定义或系统内置的keras layer,执行一段代码逻辑;[参考文档](../component/backbone.md#keraslayer) + - masknet: MaskNet模型的参数,详见[参考文档](../component/component.md#id4) + - concat_blocks: DAG的输出节点由`concat_blocks`配置项定义,如果不配置`concat_blocks`,框架会自动拼接DAG的所有叶子节点并输出。 + +- model_params: + + - l2_regularization: (可选) 对DNN参数的regularization, 减少overfit + +- embedding_regularization: 对embedding部分加regularization, 减少overfit + +### 示例Config + +[masknet_on_movielens.config](https://github.com/alibaba/EasyRec/tree/master/examples/configs/masknet_on_movielens.config) + +### 参考论文 + +[MaskNet](https://arxiv.org/pdf/2102.07619) diff --git a/docs/source/models/mmoe.md b/docs/source/models/mmoe.md index 3225e34bb..78668bf44 100644 --- a/docs/source/models/mmoe.md +++ b/docs/source/models/mmoe.md @@ -74,9 +74,111 @@ MMoE模型每个塔的输出名为:"logits\_" / "probs\_" / "y\_" + tower_name 其中,logits/probs/y对应: sigmoid之前的值/概率/回归模型的预测值 MMoE模型每个塔的指标为:指标名+ "\_" + tower_name +### 组件化主干网络为底座 + +```protobuf +model_config { + model_name: "MMoE" + model_class: "MultiTaskModel" + feature_groups { + group_name: "all" + feature_names: "user_id" + feature_names: "cms_segid" + ... 
+ wide_deep: DEEP + } + backbone { + blocks { + name: 'all' + inputs { + feature_group_name: 'all' + } + input_layer { + only_output_feature_list: true + } + } + blocks { + name: "senet" + inputs { + block_name: "all" + } + keras_layer { + class_name: 'SENet' + senet { + reduction_ratio: 4 + } + } + } + blocks { + name: "mmoe" + inputs { + block_name: "senet" + } + keras_layer { + class_name: 'MMoE' + mmoe { + num_task: 2 + num_expert: 3 + expert_mlp { + hidden_units: [256, 128] + } + } + } + } + } + model_params { + task_towers { + tower_name: "ctr" + label_name: "clk" + dnn { + hidden_units: [128, 64] + } + num_class: 1 + weight: 1.0 + loss_type: CLASSIFICATION + metrics_set: { + auc {} + } + } + task_towers { + tower_name: "cvr" + label_name: "buy" + dnn { + hidden_units: [128, 64] + } + num_class: 1 + weight: 1.0 + loss_type: CLASSIFICATION + metrics_set: { + auc {} + } + } + l2_regularization: 1e-06 + } + embedding_regularization: 5e-05 +} +``` + +该案例添加了一个额外的`SENET`层,为了展示以组件化方式搭建模型的灵活性。 + +- model_name: 任意自定义字符串,仅有注释作用 + +- model_class: 'MultiTaskModel', 不需要修改, 通过组件化方式搭建的多目标排序模型都叫这个名字 + +- backbone: 通过组件化的方式搭建的主干网络,[参考文档](../component/backbone.md) + + - blocks: 由多个`组件块`组成的一个有向无环图(DAG),框架负责按照DAG的拓扑排序执行个`组件块`关联的代码逻辑,构建TF Graph的一个子图 + - name/inputs: 每个`block`有一个唯一的名字(name),并且有一个或多个输入(inputs)和输出 + - input_layer: 对输入的`feature group`配置的特征做一些额外的加工,比如执行可选的`batch normalization`、`layer normalization`、`feature dropout`等操作,并且可以指定输出的tensor的格式(2d、3d、list等);[参考文档](../component/backbone.md#id15) + - keras_layer: 加载由`class_name`指定的自定义或系统内置的keras layer,执行一段代码逻辑;[参考文档](../component/backbone.md#keraslayer) + - senet: SENet模型的参数,详见[参考文档](../component/component.md#id3) + +- 其余与MMoE内置参数相同 + ### 示例Config -[MMoE_demo.config](https://easyrec.oss-cn-beijing.aliyuncs.com/config/mmoe.config) +- [MMoE_demo.config](https://easyrec.oss-cn-beijing.aliyuncs.com/config/mmoe.config) +- 
[MMoE_Backbone_demo.config](https://github.com/alibaba/EasyRec/blob/master/samples/model_config/mmoe_backbone_on_taobao.config) ### 参考论文 diff --git a/docs/source/models/multi_cls.md b/docs/source/models/multi_cls.md index e62212ee4..5160ab721 100644 --- a/docs/source/models/multi_cls.md +++ b/docs/source/models/multi_cls.md @@ -5,6 +5,8 @@ 如下图所示, 和CTR模型相比增加了: num_class: 2 +## 1. 内置模型 + ```protobuf model_config:{ model_class: "DeepFM" @@ -41,3 +43,84 @@ model_config:{ num_class: 2 } ``` + +## 2. 组件化模型 + +```protobuf +model_config: { + model_name: 'DeepFM' + model_class: 'RankModel' + feature_groups: { + group_name: 'wide' + feature_names: 'user_id' + feature_names: 'movie_id' + ... + wide_deep: WIDE + } + feature_groups: { + group_name: 'features' + feature_names: 'user_id' + feature_names: 'movie_id' + ... + wide_deep: DEEP + } + backbone { + blocks { + name: 'wide_logit' + inputs { + feature_group_name: 'wide' + } + lambda { + expression: 'lambda x: tf.reduce_sum(x, axis=1, keepdims=True)' + } + } + blocks { + name: 'features' + inputs { + feature_group_name: 'features' + } + input_layer { + output_2d_tensor_and_feature_list: true + } + } + blocks { + name: 'fm' + inputs { + block_name: 'features' + input_fn: 'lambda x: x[1]' + } + keras_layer { + class_name: 'FM' + fm { + use_variant: true + } + } + } + blocks { + name: 'deep' + inputs { + block_name: 'features' + input_fn: 'lambda x: x[0]' + } + keras_layer { + class_name: 'MLP' + mlp { + hidden_units: [256, 128, 64] + use_final_bn: false + final_activation: 'linear' + } + } + } + concat_blocks: ['wide_logit', 'fm', 'deep'] + top_mlp { + hidden_units: [128, 64] + } + } + model_params { + l2_regularization: 1e-5 + wide_output_dim: 16 + } + embedding_regularization: 1e-4 + num_class: 2 +} +``` diff --git a/docs/source/models/multi_tower.md b/docs/source/models/multi_tower.md index 3180369c8..51409e8cf 100644 --- a/docs/source/models/multi_tower.md +++ b/docs/source/models/multi_tower.md @@ -9,6 +9,8 @@ ### 模型配置 
+#### 1. 内置模型 + ```protobuf model_config: { model_class: 'MultiTower' @@ -97,9 +99,151 @@ model_config: { - l2_regularization: L2正则,防止overfit - embedding_regularization: embedding的L2正则 +#### 2. 组件化模型 + +```protobuf +model_config: { + model_name: 'MultiTower' + model_class: 'RankModel' + feature_groups: { + group_name: 'user' + feature_names: 'user_id' + feature_names: 'cms_segid' + feature_names: 'cms_group_id' + feature_names: 'age_level' + feature_names: 'pvalue_level' + feature_names: 'shopping_level' + feature_names: 'occupation' + feature_names: 'new_user_class_level' + wide_deep: DEEP + } + feature_groups: { + group_name: 'item' + feature_names: 'adgroup_id' + feature_names: 'cate_id' + feature_names: 'campaign_id' + feature_names: 'customer' + feature_names: 'brand' + feature_names: 'price' + wide_deep: DEEP + } + feature_groups: { + group_name: 'combo' + feature_names: 'pid' + feature_names: 'tag_category_list' + feature_names: 'tag_brand_list' + wide_deep: DEEP + } + losses { + loss_type: F1_REWEIGHTED_LOSS + weight: 1.0 + f1_reweighted_loss { + f1_beta_square: 2.25 + } + } + losses { + loss_type: PAIR_WISE_LOSS + weight: 1.0 + } + backbone { + packages { + name: "user_tower" + blocks { + name: "mlp" + inputs { + feature_group_name: "user" + } + keras_layer { + class_name: "MLP" + mlp { + hidden_units: [256, 128] + } + } + } + } + packages { + name: "item_tower" + blocks { + name: "mlp" + inputs { + feature_group_name: "item" + } + keras_layer { + class_name: "MLP" + mlp { + hidden_units: [256, 128] + } + } + } + } + packages { + name: "combo_tower" + blocks { + name: "mlp" + inputs { + feature_group_name: "combo" + } + keras_layer { + class_name: "MLP" + mlp { + hidden_units: [256, 128] + } + } + } + } + blocks { + name: "top_mlp" + inputs { + package_name: "user_tower" + } + inputs { + package_name: "item_tower" + } + inputs { + package_name: "combo_tower" + } + keras_layer { + class_name: "MLP" + mlp { + hidden_units: [256, 128, 64] + } + } + } + } + 
model_params { + l2_regularization: 1e-6 + } + embedding_regularization: 1e-4 +} +``` + +- model_name: 任意自定义字符串,仅有注释作用 + +- model_class: 'RankModel', 不需要修改, 通过组件化方式搭建的单目标排序模型都叫这个名字 + +- feature_groups: 特征组 + + - 可包含多个feature_group: 如 user、item、combo + - wide_deep: multi_tower模型使用的都是Deep features, 所以都设置成DEEP + +- backbone: 通过组件化的方式搭建的主干网络,[参考文档](../component/backbone.md) + + - packages: 可以打包一组block,构成一个可被复用的子网络,即被打包的子网络可以以共享参数的方式在同一个模型中调用多次。与之相反,没有打包的block是不能被多次调用的(但是可以多次复用结果). + - blocks: 由多个`组件块`组成的一个有向无环图(DAG),框架负责按照DAG的拓扑排序执行个`组件块`关联的代码逻辑,构建TF Graph的一个子图 + - name/inputs: 每个`block`有一个唯一的名字(name),并且有一个或多个输入(inputs)和输出 + - keras_layer: 加载由`class_name`指定的自定义或系统内置的keras layer,执行一段代码逻辑;[参考文档](../component/backbone.md#keraslayer) + - concat_blocks: DAG的输出节点由`concat_blocks`配置项定义,如果不配置`concat_blocks`,框架会自动拼接DAG的所有叶子节点并输出。 + +- model_params: + + - l2_regularization: 对DNN参数的regularization, 减少overfit + +- embedding_regularization: 对embedding部分加regularization, 减少overfit + ### 示例config -[multi_tower_demo.config](https://easyrec.oss-cn-beijing.aliyuncs.com/config/multi-tower.config) +1. 内置模型:[multi_tower_demo.config](https://easyrec.oss-cn-beijing.aliyuncs.com/config/multi-tower.config) +1. 组件化模型:[multi_tower_backbone_on_taobao.config](https://github.com/alibaba/EasyRec/tree/master/samples/model_config/multi_tower_backbone_on_taobao.config) ### 参考论文 diff --git a/docs/source/models/rank.rst b/docs/source/models/rank.rst index 41b2ff6ab..1315dca73 100644 --- a/docs/source/models/rank.rst +++ b/docs/source/models/rank.rst @@ -14,11 +14,13 @@ din bst rocket_launching + masknet + fibinet regression multi_cls 多模态排序模型 -======== +============== .. toctree:: :maxdepth: 2 diff --git a/docs/source/models/recall.rst b/docs/source/models/recall.rst index 2b0839471..2feb00ba8 100644 --- a/docs/source/models/recall.rst +++ b/docs/source/models/recall.rst @@ -10,7 +10,7 @@ co_metric_learning_i2i 冷启动召回模型 -======== +============== .. 
toctree:: :maxdepth: 2 diff --git a/docs/source/models/regression.md b/docs/source/models/regression.md index beb172b73..5b523f9b4 100644 --- a/docs/source/models/regression.md +++ b/docs/source/models/regression.md @@ -5,6 +5,8 @@ 如下图所示, 和CTR模型相比增加了: loss_type: L2_LOSS +## 1. 内置模型 + ```protobuf model_config:{ model_class: "DeepFM" @@ -41,3 +43,93 @@ model_config:{ loss_type: L2_LOSS } ``` + +## 2. 组件化模型 + +```protobuf +model_config: { + model_name: 'DeepFM' + model_class: 'RankModel' + feature_groups: { + group_name: 'wide' + feature_names: 'user_id' + feature_names: 'movie_id' + feature_names: 'job_id' + feature_names: 'age' + feature_names: 'gender' + feature_names: 'year' + feature_names: 'genres' + wide_deep: WIDE + } + feature_groups: { + group_name: 'features' + feature_names: 'user_id' + feature_names: 'movie_id' + feature_names: 'job_id' + feature_names: 'age' + feature_names: 'gender' + feature_names: 'year' + feature_names: 'genres' + feature_names: 'title' + wide_deep: DEEP + } + backbone { + blocks { + name: 'wide_logit' + inputs { + feature_group_name: 'wide' + } + lambda { + expression: 'lambda x: tf.reduce_sum(x, axis=1, keepdims=True)' + } + } + blocks { + name: 'features' + inputs { + feature_group_name: 'features' + } + input_layer { + output_2d_tensor_and_feature_list: true + } + } + blocks { + name: 'fm' + inputs { + block_name: 'features' + input_fn: 'lambda x: x[1]' + } + keras_layer { + class_name: 'FM' + fm { + use_variant: true + } + } + } + blocks { + name: 'deep' + inputs { + block_name: 'features' + input_fn: 'lambda x: x[0]' + } + keras_layer { + class_name: 'MLP' + mlp { + hidden_units: [256, 128, 64] + use_final_bn: false + final_activation: 'linear' + } + } + } + concat_blocks: ['wide_logit', 'fm', 'deep'] + top_mlp { + hidden_units: [128, 64] + } + } + model_params { + l2_regularization: 1e-5 + wide_output_dim: 16 + } + loss_type: L2_LOSS + embedding_regularization: 1e-4 +} +``` diff --git a/docs/source/models/simple_multi_task.md 
b/docs/source/models/simple_multi_task.md index 465a3ea87..0ff4403cc 100644 --- a/docs/source/models/simple_multi_task.md +++ b/docs/source/models/simple_multi_task.md @@ -8,6 +8,8 @@ ### Configuration +#### 1. Built-in model + ```protobuf model_config:{ model_class: "SimpleMultiTask" @@ -67,3 +69,68 @@ model_config:{ The output name of each tower of the SimpleMultiTask model is: "logits\_" / "probs\_" / "y\_" + tower_name, where logits / probs / y correspond to the pre-sigmoid value / the probability / the prediction of a regression model. The metric name of each tower of the SimpleMultiTask model is: metric name + "\_" + tower_name + +#### 2. Component-based model + +```protobuf +model_config { + model_name: "SimpleMultiTask" + model_class: "MultiTaskModel" + feature_groups { + group_name: "all" + feature_names: "user_id" + feature_names: "cms_segid" + ... + wide_deep: DEEP + } + backbone { + blocks { + name: "identity" + inputs { + feature_group_name: "all" + } + } + } + model_params { + task_towers { + tower_name: "ctr" + label_name: "clk" + dnn { + hidden_units: [256, 192, 128, 64] + } + num_class: 1 + weight: 1.0 + loss_type: CLASSIFICATION + metrics_set: { + auc {} + } + } + task_towers { + tower_name: "cvr" + label_name: "buy" + dnn { + hidden_units: [256, 192, 128, 64] + } + num_class: 1 + weight: 1.0 + loss_type: CLASSIFICATION + metrics_set: { + auc {} + } + } + l2_regularization: 1e-07 + } + embedding_regularization: 5e-06 +} +``` + +- model_name: an arbitrary string, used only as an annotation + +- model_class: 'MultiTaskModel'; no need to modify. All multi-task ranking models built in the component-based way share this class name + +- backbone: the backbone network assembled from components, see the [reference doc](../component/backbone.md) + + - blocks: a directed acyclic graph (DAG) composed of multiple `blocks`; the framework executes the logic associated with each `block` in the DAG's topological order, building a subgraph of the TF Graph + - name/inputs: each `block` has a unique name (name) and one or more inputs and outputs + +- The remaining parameters are the same as for the built-in model diff --git a/docs/source/models/user_define.md index 89685ef21..4272fcf60 100644 --- a/docs/source/models/user_define.md +++ b/docs/source/models/user_define.md @@ -1,5 +1,8 @@ # Custom models +**It is recommended to [build models](../component/backbone.md#id13) in a [component-based](../component/backbone.md) way; new features and models can be added via [custom components](../component/backbone.md#id12)**. +After the component-based upgrade of EasyRec, it is no longer necessary to develop new models in the way described below. + + ### Get the EasyRec source code ```bash diff --git a/docs/source/models/wide_and_deep.md index 7f166231d..7fc0276de 100644 --- a/docs/source/models/wide_and_deep.md +++ b/docs/source/models/wide_and_deep.md @@ -8,6 +8,8 @@ WideAndDeep consists of a Wide part and a Deep part; the Wide part is responsible for memorization, the Deep part ### Configuration +#### 1. Built-in model + ```protobuf model_config:{ model_class: "WideAndDeep" @@ -66,9 +68,102 @@ model_config:{ - input_type: when the job is submitted to a pai-tf cluster and reads a MaxCompute table as input data, data_config:input_type must be set to OdpsInputV2. +#### 2. Component-based model + +```protobuf +model_config: { + model_name: 'wide and deep' + model_class: "RankModel" + feature_groups: { + group_name: 'wide' + feature_names: 'user_id' + feature_names: 'movie_id' + feature_names: 'job_id' + feature_names: 'age' + feature_names: 'gender' + feature_names: 'year' + feature_names: 'genres' + wide_deep: WIDE + } + feature_groups: { + group_name: 'deep' + feature_names: 'user_id' + feature_names: 'movie_id' + feature_names: 'job_id' + feature_names: 'age' + feature_names: 'gender' + feature_names: 'year' + feature_names: 'genres' + wide_deep: DEEP + } + backbone { + blocks { + name: 'wide' + inputs { + feature_group_name: 'wide' + } + input_layer { + wide_output_dim: 1 + only_output_feature_list: true + } + } + blocks { + name: 'deep_logit' + inputs { + feature_group_name: 'deep' + } + keras_layer { + class_name: 'MLP' + mlp { + hidden_units: [256, 256, 256, 1] + use_final_bn: false + final_activation: 'linear' + } + } + } + blocks { + name: 'final_logit' + inputs { + block_name: 'wide' + input_fn: 'lambda x: tf.add_n(x)' + } + inputs { + block_name: 'deep_logit' + } + merge_inputs_into_list: true + keras_layer { + class_name: 'Add' + } + } + concat_blocks: 'final_logit' + } + model_params { + l2_regularization: 1e-4 + } + embedding_regularization: 1e-4 +} +``` + +- model_name: an arbitrary string, used only as an annotation +- model_class: 'RankModel'; no need to modify. All single-task ranking models built in the component-based way share this class name +- feature_groups: feature groups + - contains two feature groups: the wide group and the deep group +- backbone: the backbone network assembled from components, see the [reference doc](../component/backbone.md) + - blocks: a directed acyclic graph (DAG) composed of multiple `blocks`; the framework executes the logic associated with each `block` in the DAG's topological order, building a subgraph of the TF Graph + - name/inputs: each `block` has a unique name (name) and one or more inputs and outputs + - input_fn: configures a lambda function that applies a simple transformation to the input + - input_layer: applies extra processing to the features of the input `feature group`, such as optional `batch normalization`, `layer normalization`, `feature dropout`, etc., and can specify the format of the output tensor (2d, 3d, list, etc.); see the [reference doc](../component/backbone.md#id15) + - wide_output_dim: the dimension of the tensor output by the wide part + - keras_layer: loads the custom or built-in keras layer specified by `class_name` and executes its logic; see the [reference doc](../component/backbone.md#keraslayer) + - concat_blocks: the output nodes of the DAG are defined by the `concat_blocks` option; if `concat_blocks` is not configured, the framework automatically concatenates all leaf nodes of the DAG as the output. +- model_params: + - l2_regularization: regularization applied to the DNN parameters to reduce overfitting +- embedding_regularization: regularization applied to the embedding part to reduce overfitting + ### Example config -[WideAndDeep_demo.config](https://easyrec.oss-cn-beijing.aliyuncs.com/config/wide_and_deep.config) +1. Built-in model: [WideAndDeep_demo.config](https://easyrec.oss-cn-beijing.aliyuncs.com/config/wide_and_deep.config) +1. 
Component-based model: [wide_and_deep_backbone_on_movielens.config](https://github.com/alibaba/EasyRec/tree/master/examples/configs/wide_and_deep_backbone_on_movielens.config) ### References diff --git a/easy_rec/python/builders/loss_builder.py index 7459372a5..ec4ab57c8 100644 --- a/easy_rec/python/builders/loss_builder.py +++ b/easy_rec/python/builders/loss_builder.py @@ -41,12 +41,18 @@ def build(loss_type, return tf.losses.mean_squared_error( labels=label, predictions=pred, weights=loss_weight, **kwargs) elif loss_type == LossType.JRC_LOSS: - alpha = 0.5 if loss_param is None else loss_param.alpha - auto_weight = False if loss_param is None else not loss_param.HasField( 'alpha') session = kwargs.get('session_ids', None) + if loss_param is None: + return jrc_loss(label, pred, session, name=loss_name) return jrc_loss( - label, pred, session, alpha, auto_weight=auto_weight, name=loss_name) + label, + pred, + session, + loss_param.alpha, + loss_weight_strategy=loss_param.loss_weight_strategy, + sample_weights=loss_weight, + same_label_loss=loss_param.same_label_loss, + name=loss_name) elif loss_type == LossType.PAIR_WISE_LOSS: session = kwargs.get('session_ids', None) margin = 0 if loss_param is None else loss_param.margin diff --git a/easy_rec/python/compat/feature_column/feature_column_v2.py index e1e4d9304..e3f7e6015 100644 --- a/easy_rec/python/compat/feature_column/feature_column_v2.py +++ b/easy_rec/python/compat/feature_column/feature_column_v2.py @@ -5193,3 +5193,13 @@ def deserialize_feature_columns(configs, custom_objects=None): deserialize_feature_column(c, custom_objects, columns_by_name) for c in configs ] + + +def is_embedding_column(fc): + if isinstance(fc, EmbeddingColumn): + return True + if isinstance(fc, fc_old._SharedEmbeddingColumn): + return True + if isinstance(fc, SharedEmbeddingColumn): + return True + return False diff --git 
a/easy_rec/python/layers/backbone.py b/easy_rec/python/layers/backbone.py new file mode 100644 index 000000000..893734eba --- /dev/null +++ b/easy_rec/python/layers/backbone.py @@ -0,0 +1,369 @@ +# -*- encoding:utf-8 -*- +# Copyright (c) Alibaba, Inc. and its affiliates. +import logging + +import six +import tensorflow as tf +from google.protobuf import struct_pb2 + +from easy_rec.python.layers.common_layers import EnhancedInputLayer +from easy_rec.python.layers.keras import MLP +from easy_rec.python.layers.utils import Parameter +from easy_rec.python.protos import backbone_pb2 +from easy_rec.python.utils.dag import DAG +from easy_rec.python.utils.load_class import load_keras_layer + +if tf.__version__ >= '2.0': + tf = tf.compat.v1 + + +class Package(object): + """A sub DAG of tf ops for reuse.""" + __packages = {} + + def __init__(self, config, features, input_layer, l2_reg=None): + self._config = config + self._features = features + self._input_layer = input_layer + self._l2_reg = l2_reg + self._dag = DAG() + self._name_to_blocks = {} + self.loss_dict = {} + input_feature_groups = set() + for block in config.blocks: + if len(block.inputs) == 0: + raise ValueError('block takes at least one input: %s' % block.name) + self._dag.add_node(block.name) + self._name_to_blocks[block.name] = block + layer = block.WhichOneof('layer') + if layer == 'input_layer': + if len(block.inputs) != 1: + raise ValueError('input layer `%s` takes only one input' % block.name) + one_input = block.inputs[0] + name = one_input.WhichOneof('name') + if name != 'feature_group_name': + raise KeyError( + '`feature_group_name` should be set for input layer: ' + + block.name) + input_name = one_input.feature_group_name + if not input_layer.has_group(input_name): + raise KeyError('invalid feature group name: ' + input_name) + if input_name in input_feature_groups: + logging.warning('input `%s` already exists in other block' % + input_name) + input_feature_groups.add(input_name) + + num_groups = 
len(input_feature_groups) + num_blocks = len(self._name_to_blocks) - num_groups + assert num_blocks > 0, 'there must be at least one block in backbone' + + num_pkg_input = 0 + for block in config.blocks: + layer = block.WhichOneof('layer') + if layer == 'input_layer': + continue + if block.name in input_feature_groups: + raise KeyError('block name cannot be the name of a feature group: ' + + block.name) + for input_node in block.inputs: + input_type = input_node.WhichOneof('name') + if input_type == 'package_name': + num_pkg_input += 1 + continue + input_name = getattr(input_node, input_type) + if input_name in self._name_to_blocks: + assert input_name != block.name, 'input name cannot equal block name: ' + input_name + self._dag.add_edge(input_name, block.name) + elif input_name not in input_feature_groups: + if input_layer.has_group(input_name): + logging.info('adding an input_layer block: ' + input_name) + new_block = backbone_pb2.Block() + new_block.name = input_name + input_cfg = backbone_pb2.Input() + input_cfg.feature_group_name = input_name + new_block.inputs.append(input_cfg) + new_block.input_layer.CopyFrom(backbone_pb2.InputLayer()) + self._name_to_blocks[input_name] = new_block + self._dag.add_node(input_name) + self._dag.add_edge(input_name, block.name) + input_feature_groups.add(input_name) + else: + raise KeyError( + 'invalid input name `%s`, must be the name of either a feature group or another block' + % input_name) + num_groups = len(input_feature_groups) + assert num_pkg_input > 0 or num_groups > 0, 'there must be at least one input layer/feature group' + + if len(config.concat_blocks) == 0: + leaf = self._dag.all_leaves() + logging.warning( + '%s has no `concat_blocks`, try to use all leaf blocks: %s' % + (config.name, ','.join(leaf))) + self._config.concat_blocks.extend(leaf) + + Package.__packages[self._config.name] = self + + def block_input(self, config, block_outputs, training=None): + inputs = [] + for input_node in config.inputs: + 
input_type = input_node.WhichOneof('name') + input_name = getattr(input_node, input_type) + if input_type == 'package_name': + if input_name not in Package.__packages: + raise KeyError('package name `%s` does not exist' % input_name) + package = Package.__packages[input_name] + input_feature = package(training) + if len(package.loss_dict) > 0: + self.loss_dict.update(package.loss_dict) + elif input_name in block_outputs: + input_feature = block_outputs[input_name] + else: + raise KeyError('input name `%s` does not exist' % input_name) + + if input_node.HasField('input_slice'): + fn = eval('lambda x: x' + input_node.input_slice.strip()) + input_feature = fn(input_feature) + if input_node.HasField('input_fn'): + fn = eval(input_node.input_fn) + input_feature = fn(input_feature) + inputs.append(input_feature) + + if config.merge_inputs_into_list: + output = inputs + else: + output = merge_inputs(inputs, config.input_concat_axis, config.name) + + if config.HasField('extra_input_fn'): + fn = eval(config.extra_input_fn) + output = fn(output) + return output + + def __call__(self, is_training, **kwargs): + with tf.variable_scope(self._config.name, reuse=tf.AUTO_REUSE): + return self.call(is_training) + + def call(self, is_training): + block_outputs = {} + blocks = self._dag.topological_sort() + logging.info(self._config.name + ' topological order: ' + ','.join(blocks)) + for block in blocks: + config = self._name_to_blocks[block] + if config.layers: # sequential layers + logging.info('call sequential %d layers' % len(config.layers)) + output = self.block_input(config, block_outputs, is_training) + for layer in config.layers: + output = self.call_layer(output, layer, block, is_training) + block_outputs[block] = output + continue + # just one of layer + layer = config.WhichOneof('layer') + if layer is None: # identity layer + block_outputs[block] = self.block_input(config, block_outputs, + is_training) 
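The `input_slice` / `input_fn` hooks in `block_input` above are plain strings from the config that are turned into callables with `eval`. A minimal framework-free sketch of that mechanism (the sample values here are illustrative, not from an EasyRec config):

```python
# Sketch of how config-supplied lambda strings become callables
# (assumption: simplified, no TensorFlow involved).

# input_slice is appended to 'lambda x: x', so '[0]' selects the first input
input_slice = '[0]'
slice_fn = eval('lambda x: x' + input_slice.strip())
assert slice_fn(['wide', 'deep']) == 'wide'

# input_fn is a full lambda expression; real configs typically use tf ops,
# e.g. 'lambda x: tf.add_n(x)'
input_fn = 'lambda x: sum(x)'
transform_fn = eval(input_fn)
assert transform_fn([1, 2, 3]) == 6
```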
+ elif layer == 'input_layer': + conf = config.input_layer + input_fn = EnhancedInputLayer(conf, self._input_layer, self._features) + feature_group = config.inputs[0].feature_group_name + output = input_fn(feature_group, is_training) + block_outputs[block] = output + else: + inputs = self.block_input(config, block_outputs, is_training) + output = self.call_layer(inputs, config, block, is_training) + block_outputs[block] = output + + outputs = [] + for output in self._config.concat_blocks: + if output in block_outputs: + temp = block_outputs[output] + if type(temp) in (tuple, list): + outputs.extend(temp) + else: + outputs.append(temp) + else: + raise ValueError('No output `%s` of backbone to be concat' % output) + output = merge_inputs(outputs, msg='backbone') + return output + + def call_keras_layer(self, layer_conf, inputs, name, training): + layer_cls, customize = load_keras_layer(layer_conf.class_name) + if layer_cls is None: + raise ValueError('Invalid keras layer class name: ' + + layer_conf.class_name) + + param_type = layer_conf.WhichOneof('params') + if customize: + if param_type is None or param_type == 'st_params': + params = Parameter(layer_conf.st_params, True, l2_reg=self._l2_reg) + else: + pb_params = getattr(layer_conf, param_type) + params = Parameter(pb_params, False, l2_reg=self._l2_reg) + layer = layer_cls(params, name=name) + kwargs = {'loss_dict': self.loss_dict} + return layer(inputs, training=training, **kwargs) + else: # internal keras layer + if param_type is None: + layer = layer_cls(name=name) + else: + assert param_type == 'st_params', 'internal keras layer only support st_params' + try: + kwargs = convert_to_dict(layer_conf.st_params) + logging.info('call %s layer with params %r' % + (layer_conf.class_name, kwargs)) + layer = layer_cls(name=name, **kwargs) + except TypeError as e: + logging.warning(e) + args = map(format_value, layer_conf.st_params.values()) + logging.info('try to call %s layer with params %r' % + 
(layer_conf.class_name, args)) + layer = layer_cls(*args, name=name) + try: + return layer(inputs, training=training) + except TypeError: + return layer(inputs) + + def call_layer(self, inputs, config, name, training): + layer_name = config.WhichOneof('layer') + if layer_name == 'keras_layer': + return self.call_keras_layer(config.keras_layer, inputs, name, training) + if layer_name == 'lambda': + conf = getattr(config, 'lambda') + fn = eval(conf.expression) + return fn(inputs) + if layer_name == 'repeat': + conf = config.repeat + n_loop = conf.num_repeat + outputs = [] + for i in range(n_loop): + name_i = '%s_%d' % (name, i) + output = self.call_keras_layer(conf.keras_layer, inputs, name_i, + training) + outputs.append(output) + if len(outputs) == 1: + return outputs[0] + if conf.HasField('output_concat_axis'): + return tf.concat(outputs, conf.output_concat_axis) + return outputs + if layer_name == 'recurrent': + conf = config.recurrent + fixed_input_index = -1 + if conf.HasField('fixed_input_index'): + fixed_input_index = conf.fixed_input_index + if fixed_input_index >= 0: + assert type(inputs) in (tuple, list), '%s inputs must be a list' + output = inputs + for i in range(conf.num_steps): + name_i = '%s_%d' % (name, i) + layer = conf.keras_layer + output_i = self.call_keras_layer(layer, output, name_i, training) + if fixed_input_index >= 0: + j = 0 + for idx in range(len(output)): + if idx == fixed_input_index: + continue + if type(output_i) in (tuple, list): + output[idx] = output_i[j] + else: + output[idx] = output_i + j += 1 + else: + output = output_i + if fixed_input_index >= 0: + del output[fixed_input_index] + if len(output) == 1: + return output[0] + return output + return output + + raise NotImplementedError('Unsupported backbone layer:' + layer_name) + + +class Backbone(object): + """Configurable Backbone Network.""" + + def __init__(self, config, features, input_layer, l2_reg=None): + self._config = config + self._l2_reg = l2_reg + self.loss_dict = {} 
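The `Package`/`Backbone` classes above execute blocks in the topological order of the block DAG. A minimal, framework-free sketch of that ordering step (Kahn's algorithm; this is not EasyRec's `DAG` API, just an illustration of the idea):

```python
# Sketch: order backbone blocks so every block runs after all of its inputs.
from collections import defaultdict, deque

def topological_sort(edges, nodes):
    indegree = {n: 0 for n in nodes}
    adjacency = defaultdict(list)
    for src, dst in edges:
        adjacency[src].append(dst)
        indegree[dst] += 1
    queue = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in adjacency[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    if len(order) != len(nodes):
        raise ValueError('cycle detected in backbone DAG')
    return order

# 'wide' and 'deep_logit' both feed 'final_logit', as in the wide&deep config
order = topological_sort(
    [('wide', 'final_logit'), ('deep_logit', 'final_logit')],
    ['wide', 'deep_logit', 'final_logit'])
assert order == ['wide', 'deep_logit', 'final_logit']
```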
+ for pkg in config.packages: + Package(pkg, features, input_layer, l2_reg) + + main_pkg = backbone_pb2.BlockPackage() + main_pkg.name = 'backbone' + main_pkg.blocks.MergeFrom(config.blocks) + main_pkg.concat_blocks.extend(config.concat_blocks) + self._main_pkg = Package(main_pkg, features, input_layer, l2_reg) + + def __call__(self, is_training, **kwargs): + output = self._main_pkg(is_training, **kwargs) + if len(self._main_pkg.loss_dict) > 0: + self.loss_dict = self._main_pkg.loss_dict + + if self._config.HasField('top_mlp'): + params = Parameter.make_from_pb(self._config.top_mlp) + params.l2_regularizer = self._l2_reg + final_mlp = MLP(params, name='backbone_top_mlp') + output = final_mlp(output, training=is_training) + return output + + @classmethod + def wide_embed_dim(cls, config): + wide_embed_dim = None + for pkg in config.packages: + wide_embed_dim = get_wide_embed_dim(pkg.blocks, wide_embed_dim) + return get_wide_embed_dim(config.blocks, wide_embed_dim) + + +def get_wide_embed_dim(blocks, wide_embed_dim=None): + for block in blocks: + layer = block.WhichOneof('layer') + if layer == 'input_layer': + if block.input_layer.HasField('wide_output_dim'): + wide_dim = block.input_layer.wide_output_dim + if wide_embed_dim: + assert wide_embed_dim == wide_dim, 'wide_output_dim must be consistent' + else: + wide_embed_dim = wide_dim + return wide_embed_dim + + +def merge_inputs(inputs, axis=-1, msg=''): + if len(inputs) == 0: + raise ValueError('no inputs to be concat:' + msg) + if len(inputs) == 1: + return inputs[0] + + from functools import reduce + if all(map(lambda x: type(x) == list, inputs)): + # merge multiple lists into a list + return reduce(lambda x, y: x + y, inputs) + + if any(map(lambda x: type(x) == list, inputs)): + logging.warning('%s: try to merge inputs into list' % msg) + return reduce(lambda x, y: x + y, + [e if type(e) == list else [e] for e in inputs]) + + if axis != -1: + logging.info('concat inputs %s axis=%d' % (msg, axis)) + return 
tf.concat(inputs, axis=axis) + + +def format_value(value): + value_type = type(value) + if value_type == six.text_type: + return str(value) + if value_type == float: + int_v = int(value) + return int_v if int_v == value else value + if value_type == struct_pb2.ListValue: + # materialize: in python 3, map returns a lazy iterator + return list(map(format_value, value)) + if value_type == struct_pb2.Struct: + return convert_to_dict(value) + return value + + +def convert_to_dict(struct): + kwargs = {} + for key, value in struct.items(): + kwargs[str(key)] = format_value(value) + return kwargs diff --git a/easy_rec/python/layers/cmbf.py index b633bac2b..e5f1caeb2 100644 --- a/easy_rec/python/layers/cmbf.py +++ b/easy_rec/python/layers/cmbf.py @@ -33,7 +33,8 @@ def __init__(self, model_config, feature_configs, features, cmbf_config, has_feature = True self._txt_seq_features = None if input_layer.has_group('text'): - self._txt_seq_features = input_layer(features, 'text', is_combine=False) + self._txt_seq_features, _, _ = input_layer( + features, 'text', is_combine=False) has_feature = True self._other_features = None if input_layer.has_group('other'): # e.g. statistical feature diff --git a/easy_rec/python/layers/common_layers.py index 165fce5e1..fae4fe3fc 100644 --- a/easy_rec/python/layers/common_layers.py +++ b/easy_rec/python/layers/common_layers.py @@ -1,8 +1,12 @@ # -*- encoding: utf-8 -*- # Copyright (c) Alibaba, Inc. and its affiliates. 
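`merge_inputs` in backbone.py above merges block outputs: lists are flattened together, mixed lists and tensors are coerced into one list, and plain tensors fall through to `tf.concat`. The list-handling rules can be sketched without TensorFlow as (a simplified stand-in, not the actual function):

```python
# Pure-Python sketch of merge_inputs' list-handling rules; the tensor
# branch (tf.concat along an axis) is omitted here.
from functools import reduce

def merge_lists(inputs):
    if len(inputs) == 0:
        raise ValueError('no inputs to concat')
    if len(inputs) == 1:
        return inputs[0]
    if all(isinstance(x, list) for x in inputs):
        # merge multiple lists into one list
        return reduce(lambda x, y: x + y, inputs)
    if any(isinstance(x, list) for x in inputs):
        # wrap non-list elements so everything merges into one list
        return reduce(lambda x, y: x + y,
                      [e if isinstance(e, list) else [e] for e in inputs])
    raise NotImplementedError('non-list inputs would go through tf.concat')

assert merge_lists([[1, 2], [3]]) == [1, 2, 3]
assert merge_lists([[1, 2], 3]) == [1, 2, 3]
```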
+import six import tensorflow as tf +from easy_rec.python.compat.layers import layer_norm as tf_layer_norm +from easy_rec.python.utils.activation import get_activation + if tf.__version__ >= '2.0': tf = tf.compat.v1 @@ -14,6 +18,8 @@ def highway(x, scope='highway', dropout=0.0, reuse=None): + if isinstance(activation, six.string_types): + activation = get_activation(activation) with tf.variable_scope(scope, reuse): if size is None: size = x.shape.as_list()[-1] @@ -61,3 +67,80 @@ def text_cnn(x, pool_flat = tf.concat( pooled_outputs, 1) # shape: (batch_size, num_filters * len(filter_sizes)) return pool_flat + + +def layer_norm(input_tensor, name=None, reuse=None): + """Run layer normalization on the last dimension of the tensor.""" + return tf_layer_norm( + inputs=input_tensor, + begin_norm_axis=-1, + begin_params_axis=-1, + reuse=reuse, + scope=name) + + +class EnhancedInputLayer(object): + """Enhance the raw input layer.""" + + def __init__(self, config, input_layer, feature_dict): + if config.do_batch_norm and config.do_layer_norm: + raise ValueError( + 'can not do batch norm and layer norm for input layer at the same time' + ) + self._config = config + self._input_layer = input_layer + self._feature_dict = feature_dict + + def __call__(self, group, is_training, **kwargs): + with tf.name_scope('input_' + group): + return self.call(group, is_training) + + def call(self, group, is_training): + if self._config.output_seq_and_normal_feature: + seq_features, target_feature, target_features = self._input_layer( + self._feature_dict, group, is_combine=False) + return seq_features, target_features + + features, feature_list = self._input_layer(self._feature_dict, group) + num_features = len(feature_list) + + do_ln = self._config.do_layer_norm + do_bn = self._config.do_batch_norm + do_feature_dropout = is_training and 0.0 < self._config.feature_dropout_rate < 1.0 + if do_feature_dropout: + keep_prob = 1.0 - self._config.feature_dropout_rate + bern = 
tf.distributions.Bernoulli(probs=keep_prob, dtype=tf.float32) + mask = bern.sample(num_features) + elif do_bn: + features = tf.layers.batch_normalization(features, training=is_training) + elif do_ln: + features = layer_norm(features) + + do_dropout = 0.0 < self._config.dropout_rate < 1.0 + if do_feature_dropout or do_ln or do_bn or do_dropout: + for i in range(num_features): + fea = feature_list[i] + if self._config.do_batch_norm: + fea = tf.layers.batch_normalization(fea, training=is_training) + elif self._config.do_layer_norm: + fea = layer_norm(fea) + if do_dropout: + fea = tf.layers.dropout( + fea, self._config.dropout_rate, training=is_training) + if do_feature_dropout: + fea = tf.div(fea, keep_prob) * mask[i] + feature_list[i] = fea + if do_feature_dropout: + features = tf.concat(feature_list, axis=-1) + + if do_dropout and not do_feature_dropout: + features = tf.layers.dropout( + features, self._config.dropout_rate, training=is_training) + + if self._config.only_output_feature_list: + return feature_list + if self._config.only_output_3d_tensor: + return tf.stack(feature_list, axis=1) + if self._config.output_2d_tensor_and_feature_list: + return features, feature_list + return features diff --git a/easy_rec/python/layers/input_layer.py b/easy_rec/python/layers/input_layer.py index 318650317..914f977fa 100644 --- a/easy_rec/python/layers/input_layer.py +++ b/easy_rec/python/layers/input_layer.py @@ -19,10 +19,7 @@ from easy_rec.python.utils import conditional from easy_rec.python.utils import shape_utils -from easy_rec.python.compat.feature_column.feature_column_v2 import EmbeddingColumn # NOQA -from easy_rec.python.compat.feature_column.feature_column_v2 import SharedEmbeddingColumn # NOQA - -from easy_rec.python.compat.feature_column.feature_column import _SharedEmbeddingColumn # NOQA +from easy_rec.python.compat.feature_column.feature_column_v2 import is_embedding_column # NOQA class InputLayer(object): @@ -72,6 +69,119 @@ def __init__(self, def 
has_group(self, group_name): return group_name in self._feature_groups + def get_combined_feature(self, features, group_name, is_dict=False): + """Get combined features by group_name. + + Args: + features: input tensor dict + group_name: feature_group name + is_dict: whether to return group_features in dict + + Return: + features: all features concatenate together + group_features: list of features + feature_name_to_output_tensors: dict, feature_name to feature_value, only present when is_dict is True + """ + feature_name_to_output_tensors = {} + negative_sampler = self._feature_groups[group_name]._config.negative_sampler + + place_on_cpu = os.getenv('place_embedding_on_cpu') + place_on_cpu = eval(place_on_cpu) if place_on_cpu else False + with conditional(self._is_predicting and place_on_cpu, + ops.device('/CPU:0')): + concat_features, group_features = self.single_call_input_layer( + features, group_name, feature_name_to_output_tensors) + if group_name in self._group_name_to_seq_features: + # for target attention + group_seq_arr = self._group_name_to_seq_features[group_name] + concat_features, all_seq_fea = self.sequence_feature_layer( + features, + concat_features, + group_seq_arr, + feature_name_to_output_tensors, + negative_sampler=negative_sampler, + scope_name=group_name) + group_features.extend(all_seq_fea) + for col, fea in zip(group_seq_arr, all_seq_fea): + feature_name_to_output_tensors['seq_fea/' + col.group_name] = fea + all_seq_fea = array_ops.concat(all_seq_fea, axis=-1) + concat_features = array_ops.concat([concat_features, all_seq_fea], + axis=-1) + if is_dict: + return concat_features, group_features, feature_name_to_output_tensors + else: + return concat_features, group_features + + def get_plain_feature(self, features, group_name): + """Get plain features by group_name. Exclude sequence features. 
+ + Args: + features: input tensor dict + group_name: feature_group name + + Return: + features: all features concatenate together + group_features: list of features + """ + assert group_name in self._feature_groups, 'invalid group_name[%s], list: %s' % ( + group_name, ','.join([x for x in self._feature_groups])) + + feature_group = self._feature_groups[group_name] + group_columns, _ = feature_group.select_columns(self._fc_parser) + if not group_columns: + return None, [] + + cols_to_output_tensors = OrderedDict() + output_features = feature_column.input_layer( + features, group_columns, cols_to_output_tensors=cols_to_output_tensors) + group_features = [cols_to_output_tensors[x] for x in group_columns] + + embedding_reg_lst = [] + for col, val in cols_to_output_tensors.items(): + if is_embedding_column(col): + embedding_reg_lst.append(val) + regularizers.apply_regularization( + self._embedding_regularizer, weights_list=embedding_reg_lst) + return output_features, group_features + + def get_sequence_feature(self, features, group_name): + """Get sequence features by group_name. Exclude plain features. + + Args: + features: input tensor dict + group_name: feature_group name + + Return: + seq_features: list of sequence features, each element is a tuple: + 3d embedding tensor (batch_size, max_seq_len, embedding_dimension), + 1d sequence length tensor. 
+ """ + assert group_name in self._feature_groups, 'invalid group_name[%s], list: %s' % ( + group_name, ','.join([x for x in self._feature_groups])) + + if self._variational_dropout_config is not None: + raise ValueError( + 'variational dropout is not supported in not combined mode now.') + + feature_group = self._feature_groups[group_name] + _, group_seq_columns = feature_group.select_columns(self._fc_parser) + + embedding_reg_lst = [] + builder = feature_column._LazyBuilder(features) + seq_features = [] + for fc in group_seq_columns: + with variable_scope.variable_scope('input_layer/' + + fc.categorical_column.name): + tmp_embedding, tmp_seq_len = fc._get_sequence_dense_tensor(builder) + if fc.max_seq_length > 0: + tmp_embedding, tmp_seq_len = shape_utils.truncate_sequence( + tmp_embedding, tmp_seq_len, fc.max_seq_length) + seq_features.append((tmp_embedding, tmp_seq_len)) + embedding_reg_lst.append(tmp_embedding) + regularizers.apply_regularization( + self._embedding_regularizer, weights_list=embedding_reg_lst) + return seq_features + def __call__(self, features, group_name, is_combine=True, is_dict=False): """Get features by group_name. 
@@ -94,62 +204,18 @@ def __call__(self, features, group_name, is_combine=True, is_dict=False): """ assert group_name in self._feature_groups, 'invalid group_name[%s], list: %s' % ( group_name, ','.join([x for x in self._feature_groups])) - feature_name_to_output_tensors = {} - negative_sampler = self._feature_groups[group_name]._config.negative_sampler if is_combine: - place_on_cpu = os.getenv('place_embedding_on_cpu') - place_on_cpu = eval(place_on_cpu) if place_on_cpu else False - with conditional(self._is_predicting and place_on_cpu, - ops.device('/CPU:0')): - concat_features, group_features = self.single_call_input_layer( - features, group_name, feature_name_to_output_tensors) - if group_name in self._group_name_to_seq_features: - # for target attention - group_seq_arr = self._group_name_to_seq_features[group_name] - concat_features, all_seq_fea = self.sequence_feature_layer( - features, - concat_features, - group_seq_arr, - feature_name_to_output_tensors, - negative_sampler=negative_sampler, - scope_name=group_name) - group_features.extend(all_seq_fea) - for col, fea in zip(group_seq_arr, all_seq_fea): - feature_name_to_output_tensors['seq_fea/' + col.group_name] = fea - all_seq_fea = array_ops.concat(all_seq_fea, axis=-1) - concat_features = array_ops.concat([concat_features, all_seq_fea], - axis=-1) - if is_dict: - return concat_features, group_features, feature_name_to_output_tensors - else: - return concat_features, group_features - else: # return sequence feature in raw format instead of combine them - if self._variational_dropout_config is not None: - raise ValueError( - 'variational dropout is not supported in not combined mode now.') - - feature_group = self._feature_groups[group_name] - group_columns, group_seq_columns = feature_group.select_columns( - self._fc_parser) - - assert len(group_columns) == 0, \ - 'there are none sequence columns: %s' % str(group_columns) - - builder = feature_column._LazyBuilder(features) - seq_features = [] - 
embedding_reg_lst = [] - for fc in group_seq_columns: - with variable_scope.variable_scope('input_layer/' + - fc.categorical_column.name): - tmp_embedding, tmp_seq_len = fc._get_sequence_dense_tensor(builder) - if fc.max_seq_length > 0: - tmp_embedding, tmp_seq_len = shape_utils.truncate_sequence( - tmp_embedding, tmp_seq_len, fc.max_seq_length) - seq_features.append((tmp_embedding, tmp_seq_len)) - embedding_reg_lst.append(tmp_embedding) - regularizers.apply_regularization( - self._embedding_regularizer, weights_list=embedding_reg_lst) - return seq_features + return self.get_combined_feature(features, group_name, is_dict) + + # return sequence feature in raw format instead of combine them + place_on_cpu = os.getenv('place_embedding_on_cpu') + place_on_cpu = eval(place_on_cpu) if place_on_cpu else False + with conditional(self._is_predicting and place_on_cpu, + ops.device('/CPU:0')): + seq_features = self.get_sequence_feature(features, group_name) + plain_features, feature_list = self.get_plain_feature( + features, group_name) + return seq_features, plain_features, feature_list def single_call_input_layer(self, features, @@ -178,12 +244,8 @@ def single_call_input_layer(self, group_columns, cols_to_output_tensors=cols_to_output_tensors, feature_name_to_output_tensors=feature_name_to_output_tensors) - # embedding_reg_lst = [output_features] + embedding_reg_lst = [] - for col, val in cols_to_output_tensors.items(): - if isinstance(col, EmbeddingColumn) or isinstance(col, - SharedEmbeddingColumn): - embedding_reg_lst.append(val) builder = feature_column._LazyBuilder(features) seq_features = [] for column in sorted(group_seq_columns, key=lambda x: x.name): @@ -243,9 +305,12 @@ def single_call_input_layer(self, group_features = [cols_to_output_tensors[x] for x in group_columns] + \ [cols_to_output_tensors[x] for x in group_seq_columns] - if embedding_reg_lst: - regularizers.apply_regularization( - self._embedding_regularizer, weights_list=embedding_reg_lst) + for fc, val 
in cols_to_output_tensors.items(): + if is_embedding_column(fc): + embedding_reg_lst.append(val) + if embedding_reg_lst: + regularizers.apply_regularization( + self._embedding_regularizer, weights_list=embedding_reg_lst) return concat_features, group_features def get_wide_deep_dict(self): diff --git a/easy_rec/python/layers/keras/__init__.py b/easy_rec/python/layers/keras/__init__.py new file mode 100644 index 000000000..cd1c5bff3 --- /dev/null +++ b/easy_rec/python/layers/keras/__init__.py @@ -0,0 +1,16 @@ +from .blocks import MLP +from .blocks import Gate +from .blocks import Highway +from .bst import BST +from .din import DIN +from .fibinet import BiLinear +from .fibinet import FiBiNet +from .fibinet import SENet +from .interaction import FM +from .interaction import Cross +from .interaction import DotInteraction +from .mask_net import MaskBlock +from .mask_net import MaskNet +from .multi_task import MMoE +from .numerical_embedding import AutoDisEmbedding +from .numerical_embedding import PeriodicEmbedding diff --git a/easy_rec/python/layers/keras/blocks.py b/easy_rec/python/layers/keras/blocks.py new file mode 100644 index 000000000..928329d16 --- /dev/null +++ b/easy_rec/python/layers/keras/blocks.py @@ -0,0 +1,164 @@ +# -*- encoding:utf-8 -*- +# Copyright (c) Alibaba, Inc. and its affiliates. +"""Convenience blocks for building models.""" +import logging + +import tensorflow as tf + +from easy_rec.python.utils.activation import get_activation + +if tf.__version__ >= '2.0': + tf = tf.compat.v1 + + +class MLP(tf.keras.layers.Layer): + """Sequential multi-layer perceptron (MLP) block. + + Attributes: + units: Sequential list of layer sizes. + use_bias: Whether to include a bias term. + activation: Type of activation to use on all except the last layer. + final_activation: Type of activation to use on last layer. + **kwargs: Extra args passed to the Keras Layer base class. 
+ """ + + def __init__(self, params, name='mlp', **kwargs): + super(MLP, self).__init__(name=name, **kwargs) + params.check_required('hidden_units') + use_bn = params.get_or_default('use_bn', True) + use_final_bn = params.get_or_default('use_final_bn', True) + use_bias = params.get_or_default('use_bias', True) + dropout_rate = list(params.get_or_default('dropout_ratio', [])) + activation = params.get_or_default('activation', 'relu') + initializer = params.get_or_default('initializer', 'he_uniform') + final_activation = params.get_or_default('final_activation', None) + use_bn_after_act = params.get_or_default('use_bn_after_activation', False) + units = list(params.hidden_units) + logging.info( + 'MLP(%s) units: %s, dropout: %r, activate=%s, use_bn=%r, final_bn=%r,' + ' final_activate=%s, bias=%r, initializer=%s, bn_after_activation=%r' % + (name, units, dropout_rate, activation, use_bn, use_final_bn, + final_activation, use_bias, initializer, use_bn_after_act)) + assert len(units) > 0, 'MLP(%s) takes at least one hidden units' % name + + num_dropout = len(dropout_rate) + self._sub_layers = [] + for i, num_units in enumerate(units[:-1]): + name = 'dnn_%d' % i + drop_rate = dropout_rate[i] if i < num_dropout else 0.0 + self.add_rich_layer(num_units, use_bn, drop_rate, activation, initializer, + use_bias, use_bn_after_act, name, + params.l2_regularizer) + + n = len(units) - 1 + drop_rate = dropout_rate[n] if num_dropout > n else 0.0 + name = 'dnn_%d' % n + self.add_rich_layer(units[-1], use_final_bn, drop_rate, final_activation, + initializer, use_bias, use_bn_after_act, name, + params.l2_regularizer) + + def add_rich_layer(self, + num_units, + use_bn, + dropout_rate, + activation, + initializer, + use_bias=True, + use_bn_after_activation=False, + name='mlp', + l2_reg=None): + + def batch_norm(x, training): + return tf.layers.batch_normalization( + x, training=training, name='%s/%s/bn' % (self.name, name)) + + act_fn = get_activation(activation) + if use_bn and not 
use_bn_after_activation: + dense = tf.keras.layers.Dense( + units=num_units, + use_bias=use_bias, + kernel_initializer=initializer, + kernel_regularizer=l2_reg, + name=name) + self._sub_layers.append(dense) + + # bn = tf.keras.layers.BatchNormalization(name='%s/bn' % name) + # the Keras BN layer has a staleness issue on some versions of TF + self._sub_layers.append(batch_norm) + act = tf.keras.layers.Activation(act_fn, name='%s/act' % name) + self._sub_layers.append(act) + else: + dense = tf.keras.layers.Dense( + num_units, + activation=act_fn, + use_bias=use_bias, + kernel_initializer=initializer, + kernel_regularizer=l2_reg, + name=name) + self._sub_layers.append(dense) + if use_bn and use_bn_after_activation: + self._sub_layers.append(batch_norm) + + if 0.0 < dropout_rate < 1.0: + dropout = tf.keras.layers.Dropout(dropout_rate, name='%s/dropout' % name) + self._sub_layers.append(dropout) + elif dropout_rate >= 1.0: + raise ValueError('invalid dropout_ratio: %.3f' % dropout_rate) + + def call(self, x, training=None, **kwargs): + """Performs the forward computation of the block.""" + from inspect import isfunction + for layer in self._sub_layers: + if isfunction(layer): + x = layer(x, training=training) + else: + cls = layer.__class__.__name__ + if cls in ('Dropout', 'BatchNormalization'): + x = layer(x, training=training) + else: + x = layer(x) + return x + + +class Highway(tf.keras.layers.Layer): + + def __init__(self, params, name='highway', **kwargs): + super(Highway, self).__init__(name=name, **kwargs) + params.check_required('emb_size') + self.emb_size = params.emb_size + self.num_layers = params.get_or_default('num_layers', 1) + self.activation = params.get_or_default('activation', 'gelu') + self.dropout_rate = params.get_or_default('dropout_rate', 0.0) + + def call(self, inputs, training=None, **kwargs): + from easy_rec.python.layers.common_layers import highway + return highway( + inputs, + self.emb_size, + activation=self.activation, + num_layers=self.num_layers, +
dropout=self.dropout_rate if training else 0.0) + + +class Gate(tf.keras.layers.Layer): + """Weighted sum gate.""" + + def __init__(self, params, name='gate', **kwargs): + super(Gate, self).__init__(name, **kwargs) + self.weight_index = params.get_or_default('weight_index', 0) + + def call(self, inputs, **kwargs): + assert len( + inputs + ) > 1, 'input of Gate layer must be a list containing at least 2 elements' + weights = inputs[self.weight_index] + j = 0 + for i, x in enumerate(inputs): + if i == self.weight_index: + continue + if j == 0: + output = weights[:, j, None] * x + else: + output += weights[:, j, None] * x + j += 1 + return output diff --git a/easy_rec/python/layers/keras/bst.py b/easy_rec/python/layers/keras/bst.py new file mode 100644 index 000000000..020a06d59 --- /dev/null +++ b/easy_rec/python/layers/keras/bst.py @@ -0,0 +1,97 @@ +# -*- encoding: utf-8 -*- +# Copyright (c) Alibaba, Inc. and its affiliates. +import tensorflow as tf +from tensorflow.python.keras.layers import Layer + +from easy_rec.python.layers import multihead_cross_attention +from easy_rec.python.utils.activation import get_activation +from easy_rec.python.utils.shape_utils import get_shape_list + + +class BST(Layer): + + def __init__(self, params, name='bst', l2_reg=None, **kwargs): + super(BST, self).__init__(name=name, **kwargs) + self.l2_reg = l2_reg + self.config = params.get_pb_config() + + def encode(self, seq_input, max_position): + seq_fea = multihead_cross_attention.embedding_postprocessor( + seq_input, + position_embedding_name=self.name + '/position_embeddings', + max_position_embeddings=max_position, + reuse_position_embedding=tf.AUTO_REUSE) + + n = tf.count_nonzero(seq_input, axis=-1) + seq_mask = tf.cast(n > 0, tf.int32) + + attention_mask = multihead_cross_attention.create_attention_mask_from_input_mask( + from_tensor=seq_fea, to_mask=seq_mask) + + hidden_act = get_activation(self.config.hidden_act) + attention_fea = multihead_cross_attention.transformer_encoder( 
+ seq_fea, + hidden_size=self.config.hidden_size, + num_hidden_layers=self.config.num_hidden_layers, + num_attention_heads=self.config.num_attention_heads, + attention_mask=attention_mask, + intermediate_size=self.config.intermediate_size, + intermediate_act_fn=hidden_act, + hidden_dropout_prob=self.config.hidden_dropout_prob, + attention_probs_dropout_prob=self.config.attention_probs_dropout_prob, + initializer_range=self.config.initializer_range, + name=self.name + '/transformer', + reuse=tf.AUTO_REUSE) + # attention_fea shape: [batch_size, seq_length, hidden_size] + out_fea = attention_fea[:, 0, :] # target feature + print('bst output shape:', out_fea.shape) + return out_fea + + def call(self, inputs, training=None, **kwargs): + seq_features, target_features = inputs + assert len(seq_features) > 0, '[%s] sequence feature is empty' % self.name + if not training: + self.config.hidden_dropout_prob = 0.0 + self.config.attention_probs_dropout_prob = 0.0 + + seq_embeds = [seq_fea for seq_fea, _ in seq_features] + + max_position = self.config.max_position_embeddings + # max_seq_len: the max sequence length in current mini-batch, all sequences are padded to this length + batch_size, max_seq_len, _ = get_shape_list(seq_features[0][0], 3) + valid_len = tf.assert_less_equal( + max_seq_len, + max_position, + message='sequence length is greater than `max_position_embeddings`:' + + str(max_position) + ' in feature group:' + self.name) + with tf.control_dependencies([valid_len]): + # seq_input: [batch_size, seq_len, embed_size] + seq_input = tf.concat(seq_embeds, axis=-1) + if len(target_features) > 0: + max_position += 1 + + seq_embed_size = seq_input.shape.as_list()[-1] + if seq_embed_size != self.config.hidden_size: + seq_input = tf.layers.dense( + seq_input, + self.config.hidden_size, + activation=tf.nn.relu, + kernel_regularizer=self.l2_reg) + + if len(target_features) > 0: + target_feature = tf.concat(target_features, axis=-1) + target_size = 
target_feature.shape.as_list()[-1] + assert seq_embed_size == target_size, 'the embedding size of sequence and target item is not equal' \ + ' in feature group:' + self.name + if target_size != self.config.hidden_size: + target_feature = tf.layers.dense( + target_feature, + self.config.hidden_size, + activation=tf.nn.relu, + kernel_regularizer=self.l2_reg) + # target_feature: [batch_size, 1, embed_size] + target_feature = tf.expand_dims(target_feature, 1) + # seq_input: [batch_size, seq_len+1, embed_size] + seq_input = tf.concat([target_feature, seq_input], axis=1) + + return self.encode(seq_input, max_position) diff --git a/easy_rec/python/layers/keras/din.py b/easy_rec/python/layers/keras/din.py new file mode 100644 index 000000000..cee57ac90 --- /dev/null +++ b/easy_rec/python/layers/keras/din.py @@ -0,0 +1,73 @@ +# -*- encoding: utf-8 -*- +# Copyright (c) Alibaba, Inc. and its affiliates. +import logging + +import tensorflow as tf +from tensorflow.python.keras.layers import Layer + +from easy_rec.python.layers import dnn +from easy_rec.python.utils.shape_utils import get_shape_list + + +class DIN(Layer): + + def __init__(self, params, name='din', l2_reg=None, **kwargs): + super(DIN, self).__init__(name=name, **kwargs) + self.l2_reg = l2_reg + self.config = params.get_pb_config() + + def call(self, inputs, training=None, **kwargs): + seq_features, target_features = inputs + assert len(seq_features) > 0, '[%s] sequence feature is empty' % self.name + assert len(target_features) > 0, '[%s] target feature is empty' % self.name + + query = tf.concat(target_features, axis=-1) + seq_input = [seq_fea for seq_fea, _ in seq_features] + keys = tf.concat(seq_input, axis=-1) + + query_emb_size = int(query.shape[-1]) + seq_emb_size = keys.shape.as_list()[-1] + if query_emb_size != seq_emb_size: + logging.info( + ' the embedding size of sequence [%d] and target item [%d] is not equal' + ' in feature group: %s', seq_emb_size, query_emb_size, self.name) + if query_emb_size < 
seq_emb_size: + query = tf.pad(query, [[0, 0], [0, seq_emb_size - query_emb_size]]) + else: + assert False, 'the embedding size of target item is larger than the one of sequence' + + batch_size, max_seq_len, _ = get_shape_list(keys, 3) + queries = tf.tile(tf.expand_dims(query, 1), [1, max_seq_len, 1]) + din_all = tf.concat([queries, keys, queries - keys, queries * keys], + axis=-1) + din_layer = dnn.DNN( + self.config.attention_dnn, + self.l2_reg, + self.name + '/din_attention', + training, + last_layer_no_activation=True, + last_layer_no_batch_norm=True) + output = din_layer(din_all) # [B, L, 1] + scores = tf.transpose(output, [0, 2, 1]) # [B, 1, L] + + seq_len = seq_features[0][1] + seq_mask = tf.sequence_mask(seq_len, max_seq_len, dtype=tf.bool) + seq_mask = tf.expand_dims(seq_mask, 1) + paddings = tf.ones_like(scores) * (-2**32 + 1) + scores = tf.where(seq_mask, scores, paddings) # [B, 1, L] + if self.config.attention_normalizer == 'softmax': + scores = tf.nn.softmax(scores) # (B, 1, L) + elif self.config.attention_normalizer == 'sigmoid': + scores = scores / (seq_emb_size**0.5) + scores = tf.nn.sigmoid(scores) + else: + raise ValueError('unsupported attention normalizer: ' + + self.config.attention_normalizer) + + if query_emb_size < seq_emb_size: + keys = keys[:, :, :query_emb_size] # [B, L, E] + output = tf.squeeze(tf.matmul(scores, keys), axis=[1]) + if self.config.need_target_feature: + output = tf.concat([output, query], axis=-1) + print('din output shape:', output.shape) + return output diff --git a/easy_rec/python/layers/keras/fibinet.py b/easy_rec/python/layers/keras/fibinet.py new file mode 100644 index 000000000..98cdb3179 --- /dev/null +++ b/easy_rec/python/layers/keras/fibinet.py @@ -0,0 +1,245 @@ +# -*- encoding:utf-8 -*- +# Copyright (c) Alibaba, Inc. and its affiliates. 
+import itertools +import logging + +import tensorflow as tf + +from easy_rec.python.layers.common_layers import layer_norm +from easy_rec.python.layers.keras.blocks import MLP +from easy_rec.python.layers.utils import Parameter + +if tf.__version__ >= '2.0': + tf = tf.compat.v1 + + +class SENet(tf.keras.layers.Layer): + """SENET Layer used in FiBiNET. + + Input shape + - A list of 2D tensors with shape: ``(batch_size,embedding_size)``. + The ``embedding_size`` of each field can have a different value. + + Output shape + - A 2D tensor with shape: ``(batch_size,sum_of_embedding_size)``. + + References: + 1. [FiBiNET](https://arxiv.org/pdf/1905.09433.pdf) + Combining Feature Importance and Bilinear feature Interaction for Click-Through Rate Prediction + 2. [FiBiNet++](https://arxiv.org/pdf/2209.05016.pdf) + Improving FiBiNet by Greatly Reducing Model Size for CTR Prediction + """ + + def __init__(self, params, name='SENet', **kwargs): + super(SENet, self).__init__(name=name, **kwargs) + self.config = params.get_pb_config() + + def call(self, inputs, **kwargs): + g = self.config.num_squeeze_group + for emb in inputs: + assert emb.shape.ndims == 2, 'field embeddings must be rank 2 tensors' + dim = int(emb.shape[-1]) + assert dim >= g and dim % g == 0, 'field embedding dimension %d must be divisible by %d' % ( + dim, g) + + field_size = len(inputs) + feature_size_list = [emb.shape.as_list()[-1] for emb in inputs] + + # Squeeze + # the embedding dimension must be divisible by g + group_embs = [ + tf.reshape(emb, [-1, g, int(emb.shape[-1]) // g]) for emb in inputs + ] + + squeezed = [] + for emb in group_embs: + squeezed.append(tf.reduce_max(emb, axis=-1)) # [B, g] + squeezed.append(tf.reduce_mean(emb, axis=-1)) # [B, g] + z = tf.concat(squeezed, axis=1) # [bs, field_size * num_groups * 2] + + # Excitation + r = self.config.reduction_ratio + reduction_size = max(1, field_size * g * 2 // r) + + initializer = tf.glorot_normal_initializer() + a1 = tf.layers.dense( + z, + reduction_size, +
kernel_initializer=initializer, + activation=tf.nn.relu, + name='%s/W1' % self.name) + weights = tf.layers.dense( + a1, + sum(feature_size_list), + kernel_initializer=initializer, + name='%s/W2' % self.name) + + # Re-weight + inputs = tf.concat(inputs, axis=-1) + output = inputs * weights + + # Fuse, add skip-connection + if self.config.use_skip_connection: + output += inputs + + # Layer Normalization + if self.config.use_output_layer_norm: + output = layer_norm(output) + return output + + +def _full_interaction(v_i, v_j): + # [bs, 1, dim] x [bs, dim, 1] = [bs, 1] + interaction = tf.matmul( + tf.expand_dims(v_i, axis=1), tf.expand_dims(v_j, axis=-1)) + return tf.squeeze(interaction, axis=1) + + +class BiLinear(tf.keras.layers.Layer): + """BilinearInteraction Layer used in FiBiNET. + + Input shape + - A list of 2D tensors with shape: ``(batch_size,embedding_size)``. + Its length is ``field_size``. + The ``embedding_size`` of each field can have a different value. + + Output shape + - 2D tensor with shape: ``(batch_size,output_size)``. + + Attributes: + num_output_units: the number of output units + type: ['all', 'each', 'interaction'], types of bilinear functions used in this layer + use_plus: whether to use bi-linear+ + + References: + 1. [FiBiNET](https://arxiv.org/pdf/1905.09433.pdf) + Combining Feature Importance and Bilinear feature Interaction for Click-Through Rate Prediction + 2.
[FiBiNet++](https://arxiv.org/pdf/2209.05016.pdf) + Improving FiBiNet by Greatly Reducing Model Size for CTR Prediction + """ + + def __init__(self, params, name='bilinear', **kwargs): + super(BiLinear, self).__init__(name=name, **kwargs) + params.check_required(['num_output_units']) + bilinear_plus = params.get_or_default('use_plus', True) + self.bilinear_type = params.get_or_default('type', 'interaction').lower() + self.output_size = params.num_output_units + + if self.bilinear_type not in ['all', 'each', 'interaction']: + raise NotImplementedError( + "bilinear_type only supports: ['all', 'each', 'interaction']") + + if bilinear_plus: + self.func = _full_interaction + else: + self.func = tf.multiply + + def call(self, inputs, **kwargs): + embeddings = inputs + logging.info('Bilinear Layer with %d inputs' % len(embeddings)) + if len(embeddings) > 200: + logging.warning('There are too many inputs for bilinear layer: %d' % + len(embeddings)) + equal_dim = True + _dim = embeddings[0].shape[-1] + for emb in embeddings: + assert emb.shape.ndims == 2, 'field embeddings must be rank 2 tensors' + if emb.shape[-1] != _dim: + equal_dim = False + if not equal_dim and self.bilinear_type != 'interaction': + raise ValueError( + 'all embedding dimensions must be equal unless bilinear type is interaction' + ) + dim = int(_dim) + + field_size = len(embeddings) + initializer = tf.glorot_normal_initializer() + + # bi-linear+: p has shape [bs, f*(f-1)/2] + # bi-linear: + # when equal_dim=True, p has shape [bs, f*(f-1)/2*k], where k is the embedding size + # when equal_dim=False, p has shape [bs, (k_2+k_3+...+k_f)+...+(k_i+k_{i+1}+...+k_f)+...+k_f], + # where k_i is the embedding size of the i-th field + if self.bilinear_type == 'all': + v_dot = [ + tf.layers.dense( + v_i, + dim, + kernel_initializer=initializer, + name='%s/all' % self.name, + reuse=tf.AUTO_REUSE) for v_i in embeddings[:-1] + ] + p = [ + self.func(v_dot[i], embeddings[j]) + for i, j in itertools.combinations(range(field_size), 2) + ] + elif self.bilinear_type == 'each': + v_dot =
[ + tf.layers.dense( + v_i, + dim, + kernel_initializer=initializer, + name='%s/each_%d' % (self.name, i), + reuse=tf.AUTO_REUSE) for i, v_i in enumerate(embeddings[:-1]) + ] + p = [ + self.func(v_dot[i], embeddings[j]) + for i, j in itertools.combinations(range(field_size), 2) + ] + else: # interaction + p = [ + self.func( + tf.layers.dense( + embeddings[i], + embeddings[j].shape.as_list()[-1], + kernel_initializer=initializer, + name='%s/interaction_%d_%d' % (self.name, i, j), + reuse=tf.AUTO_REUSE), embeddings[j]) + for i, j in itertools.combinations(range(field_size), 2) + ] + + output = tf.layers.dense( + tf.concat(p, axis=-1), self.output_size, kernel_initializer=initializer) + return output + + +class FiBiNet(tf.keras.layers.Layer): + """FiBiNet++:Improving FiBiNet by Greatly Reducing Model Size for CTR Prediction. + + References: + - [FiBiNet++](https://arxiv.org/pdf/2209.05016.pdf) + Improving FiBiNet by Greatly Reducing Model Size for CTR Prediction + """ + + def __init__(self, params, name='fibinet', **kwargs): + super(FiBiNet, self).__init__(name, **kwargs) + self._config = params.get_pb_config() + if self._config.HasField('mlp'): + p = Parameter.make_from_pb(self._config.mlp) + p.l2_regularizer = params.l2_regularizer + self.final_mlp = MLP(p, name=name) + else: + self.final_mlp = None + + def call(self, inputs, training=None, **kwargs): + feature_list = [] + + params = Parameter.make_from_pb(self._config.senet) + senet = SENet(params, name='%s/senet' % self.name) + senet_output = senet(inputs) + feature_list.append(senet_output) + + if self._config.HasField('bilinear'): + params = Parameter.make_from_pb(self._config.bilinear) + bilinear = BiLinear(params, name='%s/bilinear' % self.name) + bilinear_output = bilinear(inputs) + feature_list.append(bilinear_output) + + if len(feature_list) > 1: + feature = tf.concat(feature_list, axis=-1) + else: + feature = feature_list[0] + + if self.final_mlp is not None: + feature = self.final_mlp(feature, 
training=training) + return feature diff --git a/easy_rec/python/layers/keras/interaction.py b/easy_rec/python/layers/keras/interaction.py new file mode 100644 index 000000000..55f56f7a1 --- /dev/null +++ b/easy_rec/python/layers/keras/interaction.py @@ -0,0 +1,312 @@ +# -*- encoding:utf-8 -*- +# Copyright (c) Alibaba, Inc. and its affiliates. +import tensorflow as tf + +from easy_rec.python.utils.activation import get_activation + + +class FM(tf.keras.layers.Layer): + """Factorization Machine models pairwise (order-2) feature interactions without linear term and bias. + + References + - [Factorization Machines](https://www.csie.ntu.edu.tw/~b97053/paper/Rendle2010FM.pdf) + Input shape. + - List of 2D tensor with shape: ``(batch_size,embedding_size)``. + - Or a 3D tensor with shape: ``(batch_size,field_size,embedding_size)`` + Output shape + - 2D tensor with shape: ``(batch_size, 1)``. + """ + + def __init__(self, params, name='fm', **kwargs): + super(FM, self).__init__(name, **kwargs) + self.use_variant = params.get_or_default('use_variant', False) + + def call(self, inputs, **kwargs): + if type(inputs) == list: + emb_dims = set(map(lambda x: int(x.shape[-1]), inputs)) + if len(emb_dims) != 1: + dims = ','.join([str(d) for d in emb_dims]) + raise ValueError('all embedding dim must be equal in FM layer:' + dims) + with tf.name_scope(self.name): + fea = tf.stack(inputs, axis=1) + else: + assert inputs.shape.ndims == 3, 'input of FM layer must be a 3D tensor or a list of 2D tensors' + fea = inputs + + with tf.name_scope(self.name): + square_of_sum = tf.square(tf.reduce_sum(fea, axis=1)) + sum_of_square = tf.reduce_sum(tf.square(fea), axis=1) + cross_term = tf.subtract(square_of_sum, sum_of_square) + if self.use_variant: + cross_term = 0.5 * cross_term + else: + cross_term = 0.5 * tf.reduce_sum(cross_term, axis=-1, keepdims=True) + return cross_term + + +class DotInteraction(tf.keras.layers.Layer): + """Dot interaction layer of DLRM model.. 
+ + See theory in the DLRM paper: https://arxiv.org/pdf/1906.00091.pdf, + section 2.1.3. Sparse activations and dense activations are combined. + Dot interaction is applied to a batch of input Tensors [e1,...,e_k] of the + same dimension and the output is a batch of Tensors with all distinct pairwise + dot products of the form dot(e_i, e_j) for i <= j if `self_interaction` is + True, otherwise for i < j. + + Attributes: + self_interaction: Boolean indicating if features should self-interact. + If it is True, then the diagonal entries of the interaction matrix are + also taken. + skip_gather: An optimization flag. If it's set then the upper triangle part + of the dot interaction matrix dot(e_i, e_j) is set to 0. The resulting + activations will be of dimension [num_features * num_features] from which + half will be zeros. Otherwise activations will be only the lower triangle part + of the interaction matrix. The latter saves space but is much slower. + name: String name of the layer. + """ + + def __init__(self, params, name=None, **kwargs): + self._self_interaction = params.get_or_default('self_interaction', False) + self._skip_gather = params.get_or_default('skip_gather', False) + super(DotInteraction, self).__init__(name=name, **kwargs) + + def call(self, inputs, **kwargs): + """Performs the interaction operation on the tensors in the list. + + The tensors represent the transformed dense features and embedded categorical + features. + Pre-condition: The tensors should all have the same shape. + + Args: + inputs: List of features with shapes [batch_size, feature_dim]. + + Returns: + activations: Tensor representing interacted features. It has a dimension + `num_features * num_features` if skip_gather is True, otherwise + `num_features * (num_features + 1) / 2` if self_interaction is True and + `num_features * (num_features - 1) / 2` if self_interaction is False.
+ """ + if isinstance(inputs, (list, tuple)): + # concat_features shape: batch_size, num_features, feature_dim + try: + concat_features = tf.stack(inputs, axis=1) + except (ValueError, tf.errors.InvalidArgumentError) as e: + raise ValueError('Input tensors` dimensions must be equal, original' + 'error message: {}'.format(e)) + else: + assert inputs.shape.ndims == 3, 'input of dot func must be a 3D tensor or a list of 2D tensors' + concat_features = inputs + + batch_size = tf.shape(concat_features)[0] + + # Interact features, select lower-triangular portion, and re-shape. + xactions = tf.matmul(concat_features, concat_features, transpose_b=True) + num_features = xactions.shape[-1] + ones = tf.ones_like(xactions) + if self._self_interaction: + # Selecting lower-triangular portion including the diagonal. + lower_tri_mask = tf.linalg.band_part(ones, -1, 0) + upper_tri_mask = ones - lower_tri_mask + out_dim = num_features * (num_features + 1) // 2 + else: + # Selecting lower-triangular portion not included the diagonal. + upper_tri_mask = tf.linalg.band_part(ones, 0, -1) + lower_tri_mask = ones - upper_tri_mask + out_dim = num_features * (num_features - 1) // 2 + + if self._skip_gather: + # Setting upper triangle part of the interaction matrix to zeros. + activations = tf.where( + condition=tf.cast(upper_tri_mask, tf.bool), + x=tf.zeros_like(xactions), + y=xactions) + out_dim = num_features * num_features + else: + activations = tf.boolean_mask(xactions, lower_tri_mask) + activations = tf.reshape(activations, (batch_size, out_dim)) + return activations + + +class Cross(tf.keras.layers.Layer): + """Cross Layer in Deep & Cross Network to learn explicit feature interactions. + + A layer that creates explicit and bounded-degree feature interactions + efficiently. The `call` method accepts `inputs` as a tuple of size 2 + tensors. 
The first input `x0` is the base layer that contains the original + features (usually the embedding layer); the second input `xi` is the output + of the previous `Cross` layer in the stack, i.e., the i-th `Cross` + layer. For the first `Cross` layer in the stack, x0 = xi. + + The output is x_{i+1} = x0 .* (W * xi + bias + diag_scale * xi) + xi, + where .* designates elementwise multiplication, W could be a full-rank + matrix, or a low-rank matrix U*V to reduce the computational cost, and + diag_scale increases the diagonal of W to improve training stability ( + especially for the low-rank case). + + References: + 1. [R. Wang et al.](https://arxiv.org/pdf/2008.13535.pdf) + See Eq. (1) for full-rank and Eq. (2) for low-rank version. + 2. [R. Wang et al.](https://arxiv.org/pdf/1708.05123.pdf) + + Example: + + ```python + # after embedding layer in a functional model: + input = tf.keras.Input(shape=(None,), name='index', dtype=tf.int64) + x0 = tf.keras.layers.Embedding(input_dim=32, output_dim=6) + x1 = Cross()(x0, x0) + x2 = Cross()(x0, x1) + logits = tf.keras.layers.Dense(units=10)(x2) + model = tf.keras.Model(input, logits) + ``` + + Args: + projection_dim: project dimension to reduce the computational cost. + Default is `None` such that a full (`input_dim` by `input_dim`) matrix + W is used. If enabled, a low-rank matrix W = U*V will be used, where U + is of size `input_dim` by `projection_dim` and V is of size + `projection_dim` by `input_dim`. `projection_dim` need to be smaller + than `input_dim`/2 to improve the model efficiency. In practice, we've + observed that `projection_dim` = d/4 consistently preserved the + accuracy of a full-rank version. + diag_scale: a non-negative float used to increase the diagonal of the + kernel W by `diag_scale`, that is, W + diag_scale * I, where I is an + identity matrix. + use_bias: whether to add a bias term for this layer. If set to False, + no bias term will be used. 
+ preactivation: Activation applied to output matrix of the layer, before + multiplication with the input. Can be used to control the scale of the + layer's outputs and improve stability. + kernel_initializer: Initializer to use on the kernel matrix. + bias_initializer: Initializer to use on the bias vector. + kernel_regularizer: Regularizer to use on the kernel matrix. + bias_regularizer: Regularizer to use on bias vector. + + Input shape: A tuple of 2 (batch_size, `input_dim`) dimensional inputs. + Output shape: A single (batch_size, `input_dim`) dimensional output. + """ + + def __init__(self, params, **kwargs): + super(Cross, self).__init__(**kwargs) + self._projection_dim = params.get_or_default('projection_dim', None) + self._diag_scale = params.get_or_default('diag_scale', 0.0) + self._use_bias = params.get_or_default('use_bias', True) + preactivation = params.get_or_default('preactivation', None) + preact = get_activation(preactivation) + self._preactivation = tf.keras.activations.get(preact) + kernel_initializer = params.get_or_default('kernel_initializer', + 'truncated_normal') + self._kernel_initializer = tf.keras.initializers.get(kernel_initializer) + bias_initializer = params.get_or_default('bias_initializer', 'zeros') + self._bias_initializer = tf.keras.initializers.get(bias_initializer) + kernel_regularizer = params.get_or_default('kernel_regularizer', None) + self._kernel_regularizer = tf.keras.regularizers.get(kernel_regularizer) + bias_regularizer = params.get_or_default('bias_regularizer', None) + self._bias_regularizer = tf.keras.regularizers.get(bias_regularizer) + self._input_dim = None + self._supports_masking = True + + if self._diag_scale < 0: # pytype: disable=unsupported-operands + raise ValueError( + '`diag_scale` should be non-negative. 
Got `diag_scale` = {}'.format( + self._diag_scale)) + + def build(self, input_shape): + last_dim = input_shape[0][-1] + + if self._projection_dim is None: + self._dense = tf.keras.layers.Dense( + last_dim, + kernel_initializer=_clone_initializer(self._kernel_initializer), + bias_initializer=self._bias_initializer, + kernel_regularizer=self._kernel_regularizer, + bias_regularizer=self._bias_regularizer, + use_bias=self._use_bias, + dtype=self.dtype, + activation=self._preactivation, + ) + else: + self._dense_u = tf.keras.layers.Dense( + self._projection_dim, + kernel_initializer=_clone_initializer(self._kernel_initializer), + kernel_regularizer=self._kernel_regularizer, + use_bias=False, + dtype=self.dtype, + ) + self._dense_v = tf.keras.layers.Dense( + last_dim, + kernel_initializer=_clone_initializer(self._kernel_initializer), + bias_initializer=self._bias_initializer, + kernel_regularizer=self._kernel_regularizer, + bias_regularizer=self._bias_regularizer, + use_bias=self._use_bias, + dtype=self.dtype, + activation=self._preactivation, + ) + self.built = True + + def call(self, inputs, **kwargs): + """Computes the feature cross. + + Args: + inputs: The input tensor(x0, x) + - x0: The input tensor + - x: Optional second input tensor. If provided, the layer will compute + crosses between x0 and x; if not provided, the layer will compute + crosses between x0 and itself. + + Returns: + Tensor of crosses. + """ + if isinstance(inputs, (list, tuple)): + x0, x = inputs + else: + x0, x = inputs, inputs + + if not self.built: + self.build(x0.shape) + + if x0.shape[-1] != x.shape[-1]: + raise ValueError( + '`x0` and `x` dimension mismatch! Got `x0` dimension {}, and x ' + 'dimension {}. 
This case is not supported yet.'.format( + x0.shape[-1], x.shape[-1])) + + if self._projection_dim is None: + prod_output = self._dense(x) + else: + prod_output = self._dense_v(self._dense_u(x)) + + # prod_output = tf.cast(prod_output, self.compute_dtype) + + if self._diag_scale: + prod_output = prod_output + self._diag_scale * x + + return x0 * prod_output + x + + def get_config(self): + config = { + 'projection_dim': + self._projection_dim, + 'diag_scale': + self._diag_scale, + 'use_bias': + self._use_bias, + 'preactivation': + tf.keras.activations.serialize(self._preactivation), + 'kernel_initializer': + tf.keras.initializers.serialize(self._kernel_initializer), + 'bias_initializer': + tf.keras.initializers.serialize(self._bias_initializer), + 'kernel_regularizer': + tf.keras.regularizers.serialize(self._kernel_regularizer), + 'bias_regularizer': + tf.keras.regularizers.serialize(self._bias_regularizer), + } + base_config = super(Cross, self).get_config() + return dict(list(base_config.items()) + list(config.items())) + + +def _clone_initializer(initializer): + return initializer.__class__.from_config(initializer.get_config()) diff --git a/easy_rec/python/layers/keras/mask_net.py b/easy_rec/python/layers/keras/mask_net.py new file mode 100644 index 000000000..6ef740b47 --- /dev/null +++ b/easy_rec/python/layers/keras/mask_net.py @@ -0,0 +1,138 @@ +# -*- encoding:utf-8 -*- +# Copyright (c) Alibaba, Inc. and its affiliates. +import tensorflow as tf + +from easy_rec.python.layers.common_layers import layer_norm +from easy_rec.python.layers.keras.blocks import MLP +from easy_rec.python.layers.utils import Parameter + +if tf.__version__ >= '2.0': + tf = tf.compat.v1 + + +class MaskBlock(tf.keras.layers.Layer): + """MaskBlock use in MaskNet. + + Args: + projection_dim: project dimension to reduce the computational cost. + Default is `None` such that a full (`input_dim` by `aggregation_size`) matrix + W is used. 
If enabled, a low-rank matrix W = U*V will be used, where U + is of size `input_dim` by `projection_dim` and V is of size + `projection_dim` by `aggregation_size`. `projection_dim` needs to be smaller + than `aggregation_size`/2 to improve the model efficiency. In practice, we've + observed that `projection_dim` = d/4 consistently preserved the + accuracy of a full-rank version. + """ + + def __init__(self, params, name='mask_block', reuse=None, **kwargs): + super(MaskBlock, self).__init__(name=name, **kwargs) + self.config = params.get_pb_config() + self.l2_reg = params.l2_regularizer + self._projection_dim = params.get_or_default('projection_dim', None) + self.reuse = reuse + + def call(self, inputs, **kwargs): + net, mask_input = inputs + mask_input_dim = int(mask_input.shape[-1]) + if self.config.HasField('reduction_factor'): + aggregation_size = int(mask_input_dim * self.config.reduction_factor) + elif self.config.HasField('aggregation_size'): + aggregation_size = self.config.aggregation_size + else: + raise ValueError( + 'Need one of reduction factor or aggregation size for MaskBlock.') + + if self.config.input_layer_norm: + input_name = net.name.replace(':', '_') + net = layer_norm(net, reuse=tf.AUTO_REUSE, name='ln_' + input_name) + + # initializer = tf.initializers.variance_scaling() + initializer = tf.glorot_uniform_initializer() + + if self._projection_dim is None: + mask = tf.layers.dense( + mask_input, + aggregation_size, + activation=tf.nn.relu, + kernel_initializer=initializer, + kernel_regularizer=self.l2_reg, + name='%s/hidden' % self.name, + reuse=self.reuse) + else: + u = tf.layers.dense( + mask_input, + self._projection_dim, + kernel_initializer=initializer, + kernel_regularizer=self.l2_reg, + use_bias=False, + name='%s/prj_u' % self.name, + reuse=self.reuse) + mask = tf.layers.dense( + u, + aggregation_size, + activation=tf.nn.relu, + kernel_initializer=initializer, + kernel_regularizer=self.l2_reg, + name='%s/prj_v' % self.name, +
reuse=self.reuse) + mask = tf.layers.dense( + mask, net.shape[-1], name='%s/mask' % self.name, reuse=self.reuse) + masked_net = net * mask + + output_size = self.config.output_size + hidden = tf.layers.dense( + masked_net, + output_size, + use_bias=False, + name='%s/output' % self.name, + reuse=self.reuse) + ln_hidden = layer_norm( + hidden, name='%s/ln_output' % self.name, reuse=self.reuse) + return tf.nn.relu(ln_hidden) + + +class MaskNet(tf.keras.layers.Layer): + """MaskNet: Introducing Feature-Wise Multiplication to CTR Ranking Models by Instance-Guided Mask. + + Refer: https://arxiv.org/pdf/2102.07619.pdf + """ + + def __init__(self, params, name='mask_net', **kwargs): + super(MaskNet, self).__init__(name, **kwargs) + self.params = params + self.config = params.get_pb_config() + if self.config.HasField('mlp'): + p = Parameter.make_from_pb(self.config.mlp) + p.l2_regularizer = params.l2_regularizer + self.mlp = MLP(p, name='%s/mlp' % name) + else: + self.mlp = None + + def call(self, inputs, training=None, **kwargs): + if self.config.use_parallel: + mask_outputs = [] + for i, block_conf in enumerate(self.config.mask_blocks): + params = Parameter.make_from_pb(block_conf) + params.l2_regularizer = self.params.l2_regularizer + mask_layer = MaskBlock(params, name='%s/block_%d' % (self.name, i)) + mask_outputs.append(mask_layer((inputs, inputs))) + all_mask_outputs = tf.concat(mask_outputs, axis=1) + + if self.mlp is not None: + output = self.mlp(all_mask_outputs) + else: + output = all_mask_outputs + return output + else: + net = inputs + for i, block_conf in enumerate(self.config.mask_blocks): + params = Parameter.make_from_pb(block_conf) + params.l2_regularizer = self.params.l2_regularizer + mask_layer = MaskBlock(params, name='%s/block_%d' % (self.name, i)) + net = mask_layer((net, inputs)) + + if self.mlp is not None: + output = self.mlp(net) + else: + output = net + return output diff --git a/easy_rec/python/layers/keras/multi_task.py 
b/easy_rec/python/layers/keras/multi_task.py new file mode 100644 index 000000000..ca865e5a7 --- /dev/null +++ b/easy_rec/python/layers/keras/multi_task.py @@ -0,0 +1,50 @@ +# -*- encoding:utf-8 -*- +# Copyright (c) Alibaba, Inc. and its affiliates. +import logging + +import tensorflow as tf + +from easy_rec.python.layers.keras.blocks import MLP + + +def gate_fn(inputs, units, name, l2_reg): + dense = tf.keras.layers.Dense( + units, kernel_regularizer=l2_reg, name='%s/dense' % name) + weights = dense(inputs) + return tf.nn.softmax(weights, axis=1) + + +class MMoE(tf.keras.layers.Layer): + """Multi-gate Mixture-of-Experts model.""" + + def __init__(self, params, name='MMoE', **kwargs): + super(MMoE, self).__init__(name, **kwargs) + params.check_required(['num_expert', 'num_task', 'expert_mlp']) + self._num_expert = params.num_expert + self._num_task = params.num_task + expert_params = params.expert_mlp + self._experts = [ + MLP(expert_params, 'expert_%d' % i) for i in range(self._num_expert) + ] + self._l2_reg = params.l2_regularizer + + def __call__(self, inputs, **kwargs): + if self._num_expert == 0: + logging.warning('num_expert of MMoE layer `%s` is 0' % self.name) + return inputs + + expert_fea_list = [expert(inputs) for expert in self._experts] + experts_fea = tf.stack(expert_fea_list, axis=1) + + task_input_list = [] + for task_id in range(self._num_task): + gate = gate_fn( + inputs, + self._num_expert, + name='gate_%d' % task_id, + l2_reg=self._l2_reg) + gate = tf.expand_dims(gate, -1) + task_input = tf.multiply(experts_fea, gate) + task_input = tf.reduce_sum(task_input, axis=1) + task_input_list.append(task_input) + return task_input_list diff --git a/easy_rec/python/layers/keras/numerical_embedding.py b/easy_rec/python/layers/keras/numerical_embedding.py new file mode 100644 index 000000000..e83c63a7e --- /dev/null +++ b/easy_rec/python/layers/keras/numerical_embedding.py @@ -0,0 +1,198 @@ +# -*- encoding:utf-8 -*- +# Copyright (c) Alibaba, Inc. 
and its affiliates. +import math + +import tensorflow as tf + +from easy_rec.python.utils.activation import get_activation + +if tf.__version__ >= '2.0': + tf = tf.compat.v1 + + +class NLinear(object): + """N linear layers for N token (feature) embeddings. + + To understand this module, let's review `tf.layers.dense`. When `tf.layers.dense` is + applied to three-dimensional inputs of the shape + ``(batch_size, n_tokens, d_embedding)``, then the same linear transformation is + applied to each of ``n_tokens`` token (feature) embeddings. + + By contrast, `NLinear` allocates one linear layer per token (``n_tokens`` layers in total). + One such layer can be represented as ``tf.layers.dense(d_in, d_out)``. + So, the i-th linear transformation is applied to the i-th token embedding, as + illustrated in the following pseudocode:: + + layers = [tf.layers.dense(d_in, d_out) for _ in range(n_tokens)] + x = tf.random.normal([batch_size, n_tokens, d_in]) + result = tf.stack([layers[i](x[:, i]) for i in range(n_tokens)], 1) + + Examples: + .. testcode:: + + batch_size = 2 + n_features = 3 + d_embedding_in = 4 + d_embedding_out = 5 + x = tf.random.normal([batch_size, n_features, d_embedding_in]) + m = NLinear(n_features, d_embedding_in, d_embedding_out) + assert m(x).shape == (batch_size, n_features, d_embedding_out) + """ + + def __init__(self, n_tokens, d_in, d_out, bias=True, scope='nd_linear'): + """Init with input shapes.
+ + Args: + n_tokens: the number of tokens (features) + d_in: the input dimension + d_out: the output dimension + bias: indicates if the underlying linear layers have biases + scope: variable scope name + """ + with tf.variable_scope(scope): + self.weight = tf.get_variable( + 'weights', [1, n_tokens, d_in, d_out], dtype=tf.float32) + if bias: + initializer = tf.constant_initializer(0.0) + self.bias = tf.get_variable( + 'bias', [1, n_tokens, d_out], + dtype=tf.float32, + initializer=initializer) + else: + self.bias = None + + def __call__(self, x, *args, **kwargs): + if x.shape.ndims != 3: + raise ValueError( + 'The input must have three dimensions (batch_size, n_tokens, d_embedding)' + ) + if x.shape[2] != self.weight.shape[2]: + raise ValueError('invalid input embedding dimension %d, expect %d' % + (int(x.shape[2]), int(self.weight.shape[2]))) + + x = x[..., None] * self.weight # [B, N, D, D_out] + x = tf.reduce_sum(x, axis=-2) # [B, N, D_out] + if self.bias is not None: + x = x + self.bias + return x + + +class PeriodicEmbedding(tf.keras.layers.Layer): + """Periodic embeddings for numerical features described in [1]. + + References: + * [1] Yury Gorishniy, Ivan Rubachev, Artem Babenko, + "On Embeddings for Numerical Features in Tabular Deep Learning", 2022 + https://arxiv.org/pdf/2203.05556.pdf + + Attributes: + embedding_dim: the embedding size, must be an even positive integer. + sigma: the scale of the weight initialization. + **This is a super important parameter which significantly affects performance**. + Its optimal value can be dramatically different for different datasets, so + no "default value" can exist for this parameter, and it must be tuned for + each dataset. In the original paper, during hyperparameter tuning, this + parameter was sampled from the distribution ``LogUniform[1e-2, 1e2]``. + A similar grid would be ``[1e-2, 1e-1, 1e0, 1e1, 1e2]``. + If possible, add more intermediate values to this grid. 
+ output_3d_tensor: whether to output a 3d tensor + output_tensor_list: whether to output the list of embedding + """ + + def __init__(self, params, name='periodic_embedding', **kwargs): + super(PeriodicEmbedding, self).__init__(name, **kwargs) + params.check_required(['embedding_dim', 'sigma']) + self.embedding_dim = int(params.embedding_dim) + if self.embedding_dim % 2: + raise ValueError('embedding_dim must be even') + sigma = params.sigma + self.initializer = tf.random_normal_initializer(stddev=sigma) + self.add_linear_layer = params.get_or_default('add_linear_layer', True) + self.linear_activation = params.get_or_default('linear_activation', 'relu') + self.output_tensor_list = params.get_or_default('output_tensor_list', False) + self.output_3d_tensor = params.get_or_default('output_3d_tensor', False) + + def call(self, inputs, **kwargs): + if inputs.shape.ndims != 2: + raise ValueError('inputs of PeriodicEmbedding must have 2 dimensions.') + + num_features = int(inputs.shape[-1]) + emb_dim = self.embedding_dim // 2 + with tf.variable_scope(self.name): + c = tf.get_variable( + 'coefficients', + shape=[1, num_features, emb_dim], + initializer=self.initializer) + + features = inputs[..., None] # [B, N, 1] + v = 2 * math.pi * c * features # [B, N, E] + emb = tf.concat([tf.sin(v), tf.cos(v)], axis=-1) # [B, N, 2E] + + dim = self.embedding_dim + if self.add_linear_layer: + linear = NLinear(num_features, dim, dim) + emb = linear(emb) + act = get_activation(self.linear_activation) + if callable(act): + emb = act(emb) + output = tf.reshape(emb, [-1, num_features * dim]) + + if self.output_tensor_list: + return output, tf.unstack(emb, axis=1) + if self.output_3d_tensor: + return output, emb + return output + + +class AutoDisEmbedding(tf.keras.layers.Layer): + """An Embedding Learning Framework for Numerical Features in CTR Prediction. 
+ + Refer: https://arxiv.org/pdf/2012.08986v2.pdf + """ + + def __init__(self, params, name='auto_dis_embedding', **kwargs): + super(AutoDisEmbedding, self).__init__(name, **kwargs) + params.check_required(['embedding_dim', 'num_bins', 'temperature']) + self.emb_dim = int(params.embedding_dim) + self.num_bins = int(params.num_bins) + self.temperature = params.temperature + self.keep_prob = params.get_or_default('keep_prob', 0.8) + self.output_tensor_list = params.get_or_default('output_tensor_list', False) + self.output_3d_tensor = params.get_or_default('output_3d_tensor', False) + + def call(self, inputs, **kwargs): + if inputs.shape.ndims != 2: + raise ValueError('inputs of AutoDisEmbedding must have 2 dimensions.') + + num_features = int(inputs.shape[-1]) + with tf.variable_scope(self.name): + meta_emb = tf.get_variable( + 'meta_embedding', shape=[num_features, self.num_bins, self.emb_dim]) + w = tf.get_variable('project_w', shape=[1, num_features, self.num_bins]) + mat = tf.get_variable( + 'project_mat', shape=[num_features, self.num_bins, self.num_bins]) + + x = tf.expand_dims(inputs, axis=-1) # [B, N, 1] + hidden = tf.nn.leaky_relu(w * x) # [B, N, num_bin] + # matmul in older TF versions (e.g. 1.12) does not support broadcasting, so use einsum instead + # y = tf.matmul(mat, hidden[..., None]) # [B, N, num_bin, 1] + # y = tf.squeeze(y, axis=3) # [B, N, num_bin] + y = tf.einsum('nik,bnk->bni', mat, hidden) # [B, N, num_bin] + + # keep_prob (float): coefficient of the skip connection that keeps part of the raw hidden output + alpha = self.keep_prob + x_bar = y + alpha * hidden # [B, N, num_bin] + x_hat = tf.nn.softmax(x_bar / self.temperature) # [B, N, num_bin] + + # emb = tf.matmul(x_hat[:, :, None, :], meta_emb) # [B, N, 1, D] + # emb = tf.squeeze(emb, axis=2) # [B, N, D] + emb = tf.einsum('bnk,nkd->bnd', x_hat, meta_emb) + + output = tf.reshape(emb, [-1, self.emb_dim * num_features]) # [B, N*D] + + if self.output_tensor_list: + return output, tf.unstack(emb, axis=1) + + if self.output_3d_tensor: + return output, emb + return output
diff --git a/easy_rec/python/layers/uniter.py b/easy_rec/python/layers/uniter.py index fa5c6a3ca..3018bad61 100644 --- a/easy_rec/python/layers/uniter.py +++ b/easy_rec/python/layers/uniter.py @@ -32,7 +32,8 @@ def __init__(self, model_config, feature_configs, features, uniter_config, tower_num += 1 self._txt_seq_features = None if input_layer.has_group('text'): - self._txt_seq_features = input_layer(features, 'text', is_combine=False) + self._txt_seq_features, _, _ = input_layer( + features, 'text', is_combine=False) tower_num += 1 self._use_token_type = True if tower_num > 1 else False self._other_features = None diff --git a/easy_rec/python/layers/utils.py b/easy_rec/python/layers/utils.py index 43204241c..850f6d7a2 100644 --- a/easy_rec/python/layers/utils.py +++ b/easy_rec/python/layers/utils.py @@ -19,6 +19,8 @@ import json +from google.protobuf import struct_pb2 +from google.protobuf.descriptor import FieldDescriptor from tensorflow.python.framework import ops from tensorflow.python.framework import sparse_tensor from tensorflow.python.ops import variables @@ -158,3 +160,82 @@ def mark_input_src(name, src_desc): 'name': name, 'src': src_desc })) + + +def is_proto_message(pb_obj, field): + field_type = pb_obj.DESCRIPTOR.fields_by_name[field].type + return field_type == FieldDescriptor.TYPE_MESSAGE + + +class Parameter(object): + + def __init__(self, params, is_struct, l2_reg=None): + self.params = params + self.is_struct = is_struct + self._l2_reg = l2_reg + + @staticmethod + def make_from_pb(config): + return Parameter(config, False) + + def get_pb_config(self): + assert not self.is_struct, 'Struct parameter can not convert to pb config' + return self.params + + @property + def l2_regularizer(self): + return self._l2_reg + + @l2_regularizer.setter + def l2_regularizer(self, value): + self._l2_reg = value + + def __getattr__(self, key): + if self.is_struct: + value = self.params[key] + if type(value) == struct_pb2.Struct: + return Parameter(value, True, 
self._l2_reg) + else: + return value + + value = getattr(self.params, key) + if is_proto_message(self.params, key): + return Parameter(value, False, self._l2_reg) + return value + + def __getitem__(self, key): + return self.__getattr__(key) + + def get_or_default(self, key, def_val): + if self.is_struct: + if key in self.params: + if def_val is None: + return self.params[key] + value = self.params[key] + if type(value) == float: + return type(def_val)(value) + return value + return def_val + else: # pb message + value = getattr(self.params, key) + if hasattr(value, '__len__'): + if len(value) > 0: + return value + elif self.params.HasField(key): + return value + return def_val + + def check_required(self, keys): + if not self.is_struct: + return + if not isinstance(keys, (list, tuple)): + keys = [keys] + for key in keys: + if key not in self.params: + raise KeyError('%s must be set in params' % key) + + def has_field(self, key): + if self.is_struct: + return key in self.params + else: + return self.params.HasField(key) diff --git a/easy_rec/python/loss/jrc_loss.py b/easy_rec/python/loss/jrc_loss.py index fc8266b2c..30c019a77 100644 --- a/easy_rec/python/loss/jrc_loss.py +++ b/easy_rec/python/loss/jrc_loss.py @@ -12,7 +12,9 @@ def jrc_loss(labels, logits, session_ids, alpha=0.5, - auto_weight=False, + loss_weight_strategy='fixed', + sample_weights=1.0, + same_label_loss=True, name=''): """Joint Optimization of Ranking and Calibration with Contextualized Hybrid Model. @@ -23,14 +25,18 @@ def jrc_loss(labels, logits: a `Tensor` with shape [batch_size, 2]. e.g. the value of last neuron before activation. session_ids: a `Tensor` with shape [batch_size]. Session ids of each sample, used to max GAUC metric. e.g.
user_id alpha: the weight to balance ranking loss and calibration loss - auto_weight: bool, whether to learn loss weight between ranking loss and calibration loss + loss_weight_strategy: str, the strategy for balancing between ce_loss and ge_loss + sample_weights: Coefficients for the loss. This must be scalar or broadcastable to + `labels` (i.e. same rank and each dimension is either 1 or the same). + same_label_loss: whether to enable ge_loss for samples with the same label in a session. name: the name of loss """ loss_name = name if name else 'jrc_loss' - logging.info('[{}] alpha: {}, auto_weight: {}'.format(loss_name, alpha, - auto_weight)) + logging.info('[{}] alpha: {}, loss_weight_strategy: {}'.format( + loss_name, alpha, loss_weight_strategy)) - ce_loss = tf.losses.sparse_softmax_cross_entropy(labels, logits) + ce_loss = tf.losses.sparse_softmax_cross_entropy( + labels, logits, weights=sample_weights) labels = tf.expand_dims(labels, 1) # [B, 1] labels = tf.concat([1 - labels, labels], axis=1) # [B, 2] @@ -54,13 +60,58 @@ def jrc_loss(labels, y_neg, y_pos = y[:, :, 0], y[:, :, 1] l_neg, l_pos = logits[:, :, 0], logits[:, :, 1] + if tf.is_numeric_tensor(sample_weights): + logging.info('[%s] use sample weight' % loss_name) + weights = tf.expand_dims(tf.cast(sample_weights, tf.float32), 0) + pairwise_weights = tf.tile(weights, tf.stack([batch_size, 1])) + y_pos *= pairwise_weights + y_neg *= pairwise_weights + else: + assert sample_weights == 1.0, 'invalid sample_weight %s' % sample_weights + # Compute list-wise generative loss -log p(x|y, z) - loss_pos = -tf.reduce_sum(y_pos * tf.nn.log_softmax(l_pos, axis=0), axis=0) - loss_neg = -tf.reduce_sum(y_neg * tf.nn.log_softmax(l_neg, axis=0), axis=0) - ge_loss = tf.reduce_mean((loss_pos + loss_neg) / tf.reduce_sum(mask, axis=0)) + if same_label_loss: + logging.info('[%s] enable same_label_loss' % loss_name) + loss_pos = -tf.reduce_sum(y_pos * tf.nn.log_softmax(l_pos, axis=0), axis=0) + loss_neg =
-tf.reduce_sum(y_neg * tf.nn.log_softmax(l_neg, axis=0), axis=0) + ge_loss = tf.reduce_mean( + (loss_pos + loss_neg) / tf.reduce_sum(mask, axis=0)) + else: + logging.info('[%s] disable same_label_loss' % loss_name) + diag = tf.one_hot(tf.range(batch_size), batch_size) + l_pos = l_pos + (1 - diag) * y_pos * -1e9 + l_neg = l_neg + (1 - diag) * y_neg * -1e9 + loss_pos = -tf.linalg.diag_part(y_pos * tf.nn.log_softmax(l_pos, axis=0)) + loss_neg = -tf.linalg.diag_part(y_neg * tf.nn.log_softmax(l_neg, axis=0)) + ge_loss = tf.reduce_mean(loss_pos + loss_neg) + + tf.summary.scalar('loss/%s_ce' % loss_name, ce_loss) + tf.summary.scalar('loss/%s_ge' % loss_name, ge_loss) # The final JRC model - if auto_weight: + if loss_weight_strategy == 'fixed': + loss = alpha * ce_loss + (1 - alpha) * ge_loss + elif loss_weight_strategy == 'random_uniform': + weight = tf.random_uniform([]) + loss = weight * ce_loss + (1 - weight) * ge_loss + tf.summary.scalar('loss/%s_ce_weight' % loss_name, weight) + tf.summary.scalar('loss/%s_ge_weight' % loss_name, 1 - weight) + elif loss_weight_strategy == 'random_normal': + weights = tf.random_normal([2]) + loss_weight = tf.nn.softmax(weights) + loss = loss_weight[0] * ce_loss + loss_weight[1] * ge_loss + tf.summary.scalar('loss/%s_ce_weight' % loss_name, loss_weight[0]) + tf.summary.scalar('loss/%s_ge_weight' % loss_name, loss_weight[1]) + elif loss_weight_strategy == 'random_bernoulli': + bern = tf.distributions.Bernoulli(probs=0.5, dtype=tf.float32) + weights = bern.sample(2) + loss_weight = tf.cond( + tf.equal(tf.reduce_sum(weights), 1), lambda: weights, + lambda: tf.convert_to_tensor([0.5, 0.5])) + loss = loss_weight[0] * ce_loss + loss_weight[1] * ge_loss + tf.summary.scalar('loss/%s_ce_weight' % loss_name, loss_weight[0]) + tf.summary.scalar('loss/%s_ge_weight' % loss_name, loss_weight[1]) + elif loss_weight_strategy == 'uncertainty': uncertainty1 = tf.Variable( 0, name='%s_ranking_loss_weight' % loss_name, dtype=tf.float32) 
tf.summary.scalar('loss/%s_ranking_uncertainty' % loss_name, uncertainty1) @@ -71,5 +122,6 @@ def jrc_loss(labels, loss = tf.exp(-uncertainty1) * ce_loss + 0.5 * uncertainty1 loss += tf.exp(-uncertainty2) * ge_loss + 0.5 * uncertainty2 else: - loss = alpha * ce_loss + (1 - alpha) * ge_loss + raise ValueError('Unsupported loss weight strategy `%s` for jrc loss' % + loss_weight_strategy) return loss diff --git a/easy_rec/python/model/collaborative_metric_learning.py b/easy_rec/python/model/collaborative_metric_learning.py index d785e7141..b19537239 100644 --- a/easy_rec/python/model/collaborative_metric_learning.py +++ b/easy_rec/python/model/collaborative_metric_learning.py @@ -48,21 +48,22 @@ def __init__( raise ValueError('unsupported loss type: %s' % LossType.Name(self._loss_type)) - self._highway_features = {} - self._highway_num = len(self._model_config.highway) - for _id in range(self._highway_num): - highway_cfg = self._model_config.highway[_id] - highway_feature, _ = self._input_layer(self._feature_dict, - highway_cfg.input) - self._highway_features[highway_cfg.input] = highway_feature - - self.input_features = [] - if self._model_config.HasField('input'): - input_feature, _ = self._input_layer(self._feature_dict, - self._model_config.input) - self.input_features.append(input_feature) - - self.dnn = copy_obj(self._model_config.dnn) + if not self.has_backbone: + self._highway_features = {} + self._highway_num = len(self._model_config.highway) + for _id in range(self._highway_num): + highway_cfg = self._model_config.highway[_id] + highway_feature, _ = self._input_layer(self._feature_dict, + highway_cfg.input) + self._highway_features[highway_cfg.input] = highway_feature + + self.input_features = [] + if self._model_config.HasField('input'): + input_feature, _ = self._input_layer(self._feature_dict, + self._model_config.input) + self.input_features.append(input_feature) + + self.dnn = copy_obj(self._model_config.dnn) if self._labels is not None: if 
self._model_config.HasField('session_id'): @@ -79,32 +80,35 @@ def __init__( self.sample_id = None def build_predict_graph(self): - for _id in range(self._highway_num): - highway_cfg = self._model_config.highway[_id] - highway_fea = tf.layers.batch_normalization( - self._highway_features[highway_cfg.input], - training=self._is_training, - trainable=True, - name='highway_%s_bn' % highway_cfg.input) - highway_fea = highway( - highway_fea, - highway_cfg.emb_size, - activation=gelu, - scope='highway_%s' % _id) - print('highway_fea: ', highway_fea) - self.input_features.append(highway_fea) - - feature = tf.concat(self.input_features, axis=1) - - num_dnn_layer = len(self.dnn.hidden_units) - last_hidden = self.dnn.hidden_units.pop() - dnn_net = dnn.DNN(self.dnn, self._l2_reg, 'dnn', self._is_training) - net_output = dnn_net(feature) - tower_emb = tf.layers.dense( - inputs=net_output, - units=last_hidden, - kernel_regularizer=self._l2_reg, - name='dnn/dnn_%d' % (num_dnn_layer - 1)) + if self.has_backbone: + tower_emb = self.backbone + else: + for _id in range(self._highway_num): + highway_cfg = self._model_config.highway[_id] + highway_fea = tf.layers.batch_normalization( + self._highway_features[highway_cfg.input], + training=self._is_training, + trainable=True, + name='highway_%s_bn' % highway_cfg.input) + highway_fea = highway( + highway_fea, + highway_cfg.emb_size, + activation=gelu, + scope='highway_%s' % _id) + print('highway_fea: ', highway_fea) + self.input_features.append(highway_fea) + + feature = tf.concat(self.input_features, axis=1) + + num_dnn_layer = len(self.dnn.hidden_units) + last_hidden = self.dnn.hidden_units.pop() + dnn_net = dnn.DNN(self.dnn, self._l2_reg, 'dnn', self._is_training) + net_output = dnn_net(feature) + tower_emb = tf.layers.dense( + inputs=net_output, + units=last_hidden, + kernel_regularizer=self._l2_reg, + name='dnn/dnn_%d' % (num_dnn_layer - 1)) if self._model_config.output_l2_normalized_emb: norm_emb = tf.nn.l2_normalize(tower_emb, 
axis=-1) diff --git a/easy_rec/python/model/dbmtl.py b/easy_rec/python/model/dbmtl.py index 913793474..6c69d33ca 100644 --- a/easy_rec/python/model/dbmtl.py +++ b/easy_rec/python/model/dbmtl.py @@ -37,24 +37,29 @@ def __init__(self, features, self._model_config.bottom_uniter, self._input_layer) + elif not self.has_backbone: + self._features, self._feature_list = self._input_layer( + self._feature_dict, 'all') else: - self._features, _ = self._input_layer(self._feature_dict, 'all') + assert False, 'invalid code branch' self._init_towers(self._model_config.task_towers) def build_predict_graph(self): - if self._model_config.HasField('bottom_cmbf'): - bottom_fea = self._cmbf_layer(self._is_training, l2_reg=self._l2_reg) - elif self._model_config.HasField('bottom_uniter'): - bottom_fea = self._uniter_layer(self._is_training, l2_reg=self._l2_reg) - elif self._model_config.HasField('bottom_dnn'): - bottom_dnn = dnn.DNN( - self._model_config.bottom_dnn, - self._l2_reg, - name='bottom_dnn', - is_training=self._is_training) - bottom_fea = bottom_dnn(self._features) - else: - bottom_fea = self._features + bottom_fea = self.backbone + if bottom_fea is None: + if self._model_config.HasField('bottom_cmbf'): + bottom_fea = self._cmbf_layer(self._is_training, l2_reg=self._l2_reg) + elif self._model_config.HasField('bottom_uniter'): + bottom_fea = self._uniter_layer(self._is_training, l2_reg=self._l2_reg) + elif self._model_config.HasField('bottom_dnn'): + bottom_dnn = dnn.DNN( + self._model_config.bottom_dnn, + self._l2_reg, + name='bottom_dnn', + is_training=self._is_training) + bottom_fea = bottom_dnn(self._features) + else: + bottom_fea = self._features # MMOE block if self._model_config.HasField('expert_dnn'): diff --git a/easy_rec/python/model/easy_rec_model.py b/easy_rec/python/model/easy_rec_model.py index 325cdc257..6fb8fa60a 100644 --- a/easy_rec/python/model/easy_rec_model.py +++ b/easy_rec/python/model/easy_rec_model.py @@ -12,6 +12,7 @@ from easy_rec.python.compat 
import regularizers from easy_rec.python.layers import input_layer +from easy_rec.python.layers.backbone import Backbone from easy_rec.python.utils import constant from easy_rec.python.utils import estimator_utils from easy_rec.python.utils import restore_filter @@ -48,6 +49,11 @@ def __init__(self, self._l2_reg = regularizers.l2_regularizer(self.l2_regularization) # only used by model with wide feature groups, e.g. WideAndDeep self._wide_output_dim = -1 + if self.has_backbone: + wide_dim = Backbone.wide_embed_dim(model_config.backbone) + if wide_dim: + self._wide_output_dim = wide_dim + logging.info('set `wide_output_dim` to %d' % wide_dim) self._feature_configs = feature_configs self.build_input_layer(model_config, feature_configs) @@ -61,6 +67,33 @@ def __init__(self, if constant.SAMPLE_WEIGHT in features: self._sample_weight = features[constant.SAMPLE_WEIGHT] + self._backbone_output = None + self._backbone_net = self.build_backbone_network() + + def build_backbone_network(self): + if self.has_backbone: + return Backbone( + self._base_model_config.backbone, + self._feature_dict, + input_layer=self._input_layer, + l2_reg=self._l2_reg) + return None + + @property + def has_backbone(self): + return self._base_model_config.HasField('backbone') + + @property + def backbone(self): + if self._backbone_output is not None: + return self._backbone_output + if self._backbone_net: + self._backbone_output = self._backbone_net(self._is_training) + loss_dict = self._backbone_net.loss_dict + self._loss_dict.update(loss_dict) + return self._backbone_output + return None + @property def embedding_regularization(self): return self._base_model_config.embedding_regularization diff --git a/easy_rec/python/model/esmm.py b/easy_rec/python/model/esmm.py index c6eaad483..50567ae63 100644 --- a/easy_rec/python/model/esmm.py +++ b/easy_rec/python/model/esmm.py @@ -31,7 +31,9 @@ def __init__(self, self._group_num = len(self._model_config.groups) self._group_features = [] - if self._group_num > 0: + if
self.has_backbone: + logging.info('use bottom backbone network') + elif self._group_num > 0: logging.info('group_num: {0}'.format(self._group_num)) for group_id in range(self._group_num): group = self._model_config.groups[group_id] @@ -173,7 +175,9 @@ def build_predict_graph(self): Returns: self._prediction_dict: Prediction result of two tasks. """ - if self._group_num > 0: + if self.has_backbone: + all_fea = self.backbone + elif self._group_num > 0: group_fea_arr = [] # Both towers share the underlying network. for group_id in range(self._group_num): diff --git a/easy_rec/python/model/mind.py b/easy_rec/python/model/mind.py index c414703d2..270060297 100644 --- a/easy_rec/python/model/mind.py +++ b/easy_rec/python/model/mind.py @@ -32,7 +32,7 @@ def __init__(self, 'invalid model config: %s' % self._model_config.WhichOneof('model') self._model_config = self._model_config.mind - self._hist_seq_features = self._input_layer( + self._hist_seq_features, _, _ = self._input_layer( self._feature_dict, 'hist', is_combine=False) self._user_features, _ = self._input_layer(self._feature_dict, 'user') self._item_features, _ = self._input_layer(self._feature_dict, 'item') diff --git a/easy_rec/python/model/mmoe.py b/easy_rec/python/model/mmoe.py index acf1d6d59..3cc644f6d 100644 --- a/easy_rec/python/model/mmoe.py +++ b/easy_rec/python/model/mmoe.py @@ -26,7 +26,10 @@ def __init__(self, self._model_config = self._model_config.mmoe assert isinstance(self._model_config, MMoEConfig) - self._features, _ = self._input_layer(self._feature_dict, 'all') + if self.has_backbone: + self._features = self.backbone + else: + self._features, _ = self._input_layer(self._feature_dict, 'all') self._init_towers(self._model_config.task_towers) def build_predict_graph(self): diff --git a/easy_rec/python/model/multi_task_model.py b/easy_rec/python/model/multi_task_model.py index 43e5663ce..c683702ae 100644 --- a/easy_rec/python/model/multi_task_model.py +++ b/easy_rec/python/model/multi_task_model.py 
@@ -1,10 +1,12 @@ # -*- encoding:utf-8 -*- # Copyright (c) Alibaba, Inc. and its affiliates. import logging +from collections import OrderedDict import tensorflow as tf from easy_rec.python.builders import loss_builder +from easy_rec.python.layers.dnn import DNN from easy_rec.python.model.rank_model import RankModel from easy_rec.python.protos import tower_pb2 from easy_rec.python.protos.loss_pb2 import LossType @@ -27,6 +29,71 @@ def __init__(self, self._task_num = None self._label_name_dict = {} + def build_predict_graph(self): + if not self.has_backbone: + raise NotImplementedError( + 'method `build_predict_graph` must be implemented when the backbone network does not exist' + ) + model = self._model_config.WhichOneof('model') + assert model == 'model_params', '`model_params` must be configured' + config = self._model_config.model_params + + self._init_towers(config.task_towers) + + backbone = self.backbone + if type(backbone) in (list, tuple): + if len(backbone) != len(config.task_towers): + raise ValueError( + 'The number of backbone outputs and task towers must be equal') + task_input_list = backbone + else: + task_input_list = [backbone] * len(config.task_towers) + + tower_features = {} + for i, task_tower_cfg in enumerate(config.task_towers): + tower_name = task_tower_cfg.tower_name + if task_tower_cfg.HasField('dnn'): + tower_dnn = DNN( + task_tower_cfg.dnn, + self._l2_reg, + name=tower_name, + is_training=self._is_training) + tower_output = tower_dnn(task_input_list[i]) + else: + tower_output = task_input_list[i] + tower_features[tower_name] = tower_output + + tower_outputs = {} + relation_features = {} + # bayes network + for task_tower_cfg in config.task_towers: + tower_name = task_tower_cfg.tower_name + if task_tower_cfg.HasField('relation_dnn'): + relation_dnn = DNN( + task_tower_cfg.relation_dnn, + self._l2_reg, + name=tower_name + '/relation_dnn', + is_training=self._is_training) + tower_inputs = [tower_features[tower_name]] + for relation_tower_name in
task_tower_cfg.relation_tower_names: + tower_inputs.append(relation_features[relation_tower_name]) + relation_input = tf.concat( + tower_inputs, axis=-1, name=tower_name + '/relation_input') + relation_fea = relation_dnn(relation_input) + relation_features[tower_name] = relation_fea + else: + relation_fea = tower_features[tower_name] + + output_logits = tf.layers.dense( + relation_fea, + task_tower_cfg.num_class, + kernel_regularizer=self._l2_reg, + name=tower_name + '/output') + tower_outputs[tower_name] = output_logits + + self._add_to_prediction_dict(tower_outputs) + return self._prediction_dict + def _init_towers(self, task_tower_configs): """Init task towers.""" self._task_towers = task_tower_configs @@ -86,8 +153,47 @@ def build_metric_graph(self, eval_config): suffix='_%s' % tower_name)) return metric_dict + def build_loss_weight(self): + loss_weights = OrderedDict() + num_loss = 0 + for task_tower_cfg in self._task_towers: + tower_name = task_tower_cfg.tower_name + losses = task_tower_cfg.losses + n = len(losses) + if n > 0: + loss_weights[tower_name] = [loss.weight for loss in losses] + num_loss += n + else: + loss_weights[tower_name] = [1.0] + num_loss += 1 + + strategy = self._base_model_config.loss_weight_strategy + if strategy == self._base_model_config.Random: + weights = tf.random_normal([num_loss]) + weights = tf.nn.softmax(weights) + i = 0 + for k, v in loss_weights.items(): + n = len(v) + loss_weights[k] = weights[i:i + n] + i += n + return loss_weights + + def get_learnt_loss(self, loss_type, name, value): + strategy = self._base_model_config.loss_weight_strategy + if strategy == self._base_model_config.Uncertainty: + uncertainty = tf.Variable( + 0, name='%s_loss_weight' % name, dtype=tf.float32) + tf.summary.scalar('loss/%s_uncertainty' % name, uncertainty) + if loss_type in {LossType.L2_LOSS, LossType.SIGMOID_L2_LOSS}: + return 0.5 * tf.exp(-uncertainty) * value + 0.5 * uncertainty + else: + return tf.exp(-uncertainty) * value + 0.5 * 
uncertainty + else: + raise ValueError('Unsupported loss weight strategy: ' + strategy.Name) + def build_loss_graph(self): """Build loss graph for multi task model.""" + task_loss_weights = self.build_loss_weight() for task_tower_cfg in self._task_towers: tower_name = task_tower_cfg.tower_name loss_weight = task_tower_cfg.weight @@ -102,6 +208,7 @@ def build_loss_graph(self): task_tower_cfg.in_task_space_weight * in_task_space + task_tower_cfg.out_task_space_weight * (1 - in_task_space)) + task_loss_weight = task_loss_weights[tower_name] loss_dict = {} losses = task_tower_cfg.losses if len(losses) == 0: @@ -111,6 +218,8 @@ def build_loss_graph(self): loss_weight=loss_weight, num_class=task_tower_cfg.num_class, suffix='_%s' % tower_name) + for loss_name in loss_dict.keys(): + loss_dict[loss_name] = loss_dict[loss_name] * task_loss_weight[0] else: for loss in losses: loss_param = loss.WhichOneof('loss_param') @@ -124,20 +233,13 @@ def build_loss_graph(self): suffix='_%s' % tower_name, loss_name=loss.loss_name, loss_param=loss_param) - for loss_name, loss_value in loss_ops.items(): + for i, loss_name in enumerate(loss_ops): + loss_value = loss_ops[loss_name] if loss.learn_loss_weight: - uncertainty = tf.Variable( - 0, name='%s_loss_weight' % loss_name, dtype=tf.float32) - tf.summary.scalar('loss/%s_uncertainty' % loss_name, uncertainty) - if loss.loss_type in {LossType.L2_LOSS, LossType.SIGMOID_L2_LOSS}: - loss_dict[loss_name] = 0.5 * tf.exp( - -uncertainty) * loss_value + 0.5 * uncertainty - else: - loss_dict[loss_name] = tf.exp( - -uncertainty) * loss_value + 0.5 * uncertainty + loss_dict[loss_name] = self.get_learnt_loss( + loss.loss_type, loss_name, loss_value) else: - loss_dict[loss_name] = loss_value * loss.weight - + loss_dict[loss_name] = loss_value * task_loss_weight[i] self._loss_dict.update(loss_dict) kd_loss_dict = loss_builder.build_kd_loss(self.kd, self._prediction_dict, diff --git a/easy_rec/python/model/ple.py b/easy_rec/python/model/ple.py index 
f3ad71215..e04781bcd 100644 --- a/easy_rec/python/model/ple.py +++ b/easy_rec/python/model/ple.py @@ -27,7 +27,10 @@ def __init__(self, self._layer_nums = len(self._model_config.extraction_networks) self._task_nums = len(self._model_config.task_towers) - self._features, _ = self._input_layer(self._feature_dict, 'all') + if self.has_backbone: + self._features = self.backbone + else: + self._features, _ = self._input_layer(self._feature_dict, 'all') self._init_towers(self._model_config.task_towers) def gate(self, selector_fea, vec_feas, name): diff --git a/easy_rec/python/model/rank_model.py b/easy_rec/python/model/rank_model.py index 25eff23ea..f8c7f10c3 100644 --- a/easy_rec/python/model/rank_model.py +++ b/easy_rec/python/model/rank_model.py @@ -29,6 +29,18 @@ def __init__(self, if self._labels is not None: self._label_name = list(self._labels.keys())[0] + def build_predict_graph(self): + if not self.has_backbone: + raise NotImplementedError( + 'method `build_predict_graph` must be implemented when backbone network do not exits' + ) + output = self.backbone + if int(output.shape[-1]) != self._num_class: + logging.info('add head logits layer for rank model') + output = tf.layers.dense(output, self._num_class, name='output') + self._add_to_prediction_dict(output) + return self._prediction_dict + def _output_to_prediction_impl(self, output, loss_type, @@ -193,7 +205,12 @@ def build_loss_graph(self): loss_weight=self._sample_weight, num_class=self._num_class) else: - for loss in self._losses: + strategy = self._base_model_config.loss_weight_strategy + loss_weight = [1.0] + if strategy == self._base_model_config.Random and len(self._losses) > 1: + weights = tf.random_normal([len(self._losses)]) + loss_weight = tf.nn.softmax(weights) + for i, loss in enumerate(self._losses): loss_param = loss.WhichOneof('loss_param') if loss_param is not None: loss_param = getattr(loss, loss_param) @@ -205,18 +222,26 @@ def build_loss_graph(self): loss_name=loss.loss_name, 
             loss_param=loss_param)
         for loss_name, loss_value in loss_ops.items():
-          if loss.learn_loss_weight:
-            uncertainty = tf.Variable(
-                0, name='%s_loss_weight' % loss_name, dtype=tf.float32)
-            tf.summary.scalar('loss/%s_uncertainty' % loss_name, uncertainty)
-            if loss.loss_type in {LossType.L2_LOSS, LossType.SIGMOID_L2_LOSS}:
-              loss_dict[loss_name] = 0.5 * tf.exp(
-                  -uncertainty) * loss_value + 0.5 * uncertainty
+          if strategy == self._base_model_config.Fixed:
+            loss_dict[loss_name] = loss_value * loss.weight
+          elif strategy == self._base_model_config.Uncertainty:
+            if loss.learn_loss_weight:
+              uncertainty = tf.Variable(
+                  0, name='%s_loss_weight' % loss_name, dtype=tf.float32)
+              tf.summary.scalar('loss/%s_uncertainty' % loss_name, uncertainty)
+              if loss.loss_type in {LossType.L2_LOSS, LossType.SIGMOID_L2_LOSS}:
+                loss_dict[loss_name] = 0.5 * tf.exp(
+                    -uncertainty) * loss_value + 0.5 * uncertainty
+              else:
+                loss_dict[loss_name] = tf.exp(
+                    -uncertainty) * loss_value + 0.5 * uncertainty
             else:
-            loss_dict[loss_name] = tf.exp(
-                -uncertainty) * loss_value + 0.5 * uncertainty
+              loss_dict[loss_name] = loss_value * loss.weight
+          elif strategy == self._base_model_config.Random:
+            loss_dict[loss_name] = loss_value * loss_weight[i]
           else:
-            loss_dict[loss_name] = loss_value * loss.weight
+            raise ValueError('Unsupported loss weight strategy: ' +
+                             str(strategy))

     self._loss_dict.update(loss_dict)
diff --git a/easy_rec/python/model/simple_multi_task.py b/easy_rec/python/model/simple_multi_task.py
index b4c0613bc..05dd7a773 100644
--- a/easy_rec/python/model/simple_multi_task.py
+++ b/easy_rec/python/model/simple_multi_task.py
@@ -27,7 +27,10 @@ def __init__(self,
     self._model_config = self._model_config.simple_multi_task
     assert isinstance(self._model_config, SimpleMultiTaskConfig)

-    self._features, _ = self._input_layer(self._feature_dict, 'all')
+    if self.has_backbone:
+      self._features = self.backbone
+    else:
+      self._features, _ = self._input_layer(self._feature_dict, 'all')
     self._init_towers(self._model_config.task_towers)

   def build_predict_graph(self):
diff --git a/easy_rec/python/protos/backbone.proto b/easy_rec/python/protos/backbone.proto
new file mode 100644
index 000000000..c93fbf0df
--- /dev/null
+++ b/easy_rec/python/protos/backbone.proto
@@ -0,0 +1,96 @@
+syntax = "proto2";
+package protos;
+
+import "easy_rec/python/protos/dnn.proto";
+import "easy_rec/python/protos/keras_layer.proto";
+
+message InputLayer {
+  optional bool do_batch_norm = 1;
+  optional bool do_layer_norm = 2;
+  optional float dropout_rate = 3;
+  optional float feature_dropout_rate = 4;
+  optional bool only_output_feature_list = 5;
+  optional bool only_output_3d_tensor = 6;
+  optional bool output_2d_tensor_and_feature_list = 7;
+  optional bool output_seq_and_normal_feature = 8;
+  optional uint32 wide_output_dim = 9;
+}
+
+message Lambda {
+  required string expression = 1;
+}
+
+message Input {
+  oneof name {
+    string feature_group_name = 1;
+    string block_name = 2;
+    string package_name = 3;
+  }
+  optional string input_fn = 11;
+  optional string input_slice = 12;
+}
+
+message RecurrentLayer {
+  required uint32 num_steps = 1 [default = 1];
+  optional uint32 fixed_input_index = 2;
+  required KerasLayer keras_layer = 3;
+}
+
+message RepeatLayer {
+  required uint32 num_repeat = 1 [default = 1];
+  // default output the list of multiple outputs
+  optional int32 output_concat_axis = 2;
+  required KerasLayer keras_layer = 3;
+}
+
+message Layer {
+  oneof layer {
+    Lambda lambda = 1;
+    KerasLayer keras_layer = 2;
+    RecurrentLayer recurrent = 3;
+    RepeatLayer repeat = 4;
+    InputLayer input_layer = 5;
+  }
+}
+
+message Block {
+  required string name = 1;
+  // the input names of feature groups or other blocks
+  repeated Input inputs = 2;
+  optional int32 input_concat_axis = 3 [default = -1];
+  optional bool merge_inputs_into_list = 4;
+  optional string extra_input_fn = 5;
+
+  // sequential layers
+  repeated Layer layers = 6;
+
+  // only take effect when there are no layers
+  oneof layer {
+    InputLayer input_layer = 101;
+    Lambda lambda = 102;
+    KerasLayer keras_layer = 103;
+    RecurrentLayer recurrent = 104;
+    RepeatLayer repeat = 105;
+  }
+}
+
+// a package of blocks for reuse; e.g. call in a contrastive learning manner
+message BlockPackage {
+  // package name
+  required string name = 1;
+  // a few blocks generating a DAG
+  repeated Block blocks = 2;
+  // the names of output blocks
+  repeated string concat_blocks = 3;
+}
+
+message BackboneTower {
+  // a few sub DAGs
+  repeated BlockPackage packages = 1;
+  // a few blocks generating a DAG
+  repeated Block blocks = 2;
+  // the names of output blocks
+  repeated string concat_blocks = 3;
+  // optional top mlp layer
+  optional MLP top_mlp = 4;
+}
diff --git a/easy_rec/python/protos/cmbf.proto b/easy_rec/python/protos/cmbf.proto
index 598bf1ecf..34e082115 100644
--- a/easy_rec/python/protos/cmbf.proto
+++ b/easy_rec/python/protos/cmbf.proto
@@ -1,9 +1,50 @@
 syntax = "proto2";
 package protos;

-import "easy_rec/python/protos/layer.proto";
 import "easy_rec/python/protos/dnn.proto";

+message CMBFTower {
+  // The number of heads of cross modal fusion layer
+  required uint32 multi_head_num = 1 [default = 1];
+  // The number of heads of image feature learning layer
+  required uint32 image_multi_head_num = 101 [default = 1];
+  // The number of heads of text feature learning layer
+  required uint32 text_multi_head_num = 102 [default = 1];
+  // The dimension of text heads
+  required uint32 text_head_size = 2;
+  // The dimension of image heads
+  required uint32 image_head_size = 3 [default = 64];
+  // The number of patches of image feature, take effect when there is only one image feature
+  required uint32 image_feature_patch_num = 4 [default = 1];
+  // Do dimension reduce to this size for image feature before single modal learning module
+  required uint32 image_feature_dim = 5 [default = 0];
+  // The number of self attention layers for image features
+  required uint32 image_self_attention_layer_num = 6 [default = 0];
+  // The number of self attention layers for text features
+  required uint32 text_self_attention_layer_num = 7 [default = 1];
+  // The number of cross modal layers
+  required uint32 cross_modal_layer_num = 8 [default = 1];
+  // The dimension of image cross modal heads
+  required uint32 image_cross_head_size = 9;
+  // The dimension of text cross modal heads
+  required uint32 text_cross_head_size = 10;
+  // Dropout probability for hidden layers
+  required float hidden_dropout_prob = 11 [default = 0.0];
+  // Dropout probability of the attention probabilities
+  required float attention_probs_dropout_prob = 12 [default = 0.0];
+
+  // Whether to add embeddings for different text sequence features
+  required bool use_token_type = 13 [default = false];
+  // Whether to add position embeddings for the position of each token in the text sequence
+  required bool use_position_embeddings = 14 [default = true];
+  // Maximum sequence length that might ever be used with this model
+  required uint32 max_position_embeddings = 15 [default = 0];
+  // Dropout probability for text sequence embeddings
+  required float text_seq_emb_dropout_prob = 16 [default = 0.1];
+  // dnn layers for other features
+  optional DNN other_feature_dnn = 17;
+}
+
 message CMBF {
   required CMBFTower config = 1;
diff --git a/easy_rec/python/protos/dbmtl.proto b/easy_rec/python/protos/dbmtl.proto
index 841b8adec..9adff1f62 100644
--- a/easy_rec/python/protos/dbmtl.proto
+++ b/easy_rec/python/protos/dbmtl.proto
@@ -3,7 +3,8 @@ package protos;

 import "easy_rec/python/protos/dnn.proto";
 import "easy_rec/python/protos/tower.proto";
-import "easy_rec/python/protos/layer.proto";
+import "easy_rec/python/protos/cmbf.proto";
+import "easy_rec/python/protos/uniter.proto";

 message DBMTL {
   // shared bottom cmbf layer
diff --git a/easy_rec/python/protos/dnn.proto b/easy_rec/python/protos/dnn.proto
index 021d34dbb..ff40f0fe4 100644
--- a/easy_rec/python/protos/dnn.proto
+++ b/easy_rec/python/protos/dnn.proto
@@ -12,3 +12,20 @@ message DNN {
   // use batch normalization
   optional bool use_bn = 4 [default = true];
 }
+
+message MLP {
+  // hidden units for each layer
+  repeated uint32 hidden_units = 1;
+  // ratio of dropout
+  repeated float dropout_ratio = 2;
+  // activation function
+  optional string activation = 3 [default = 'relu'];
+  // use batch normalization
+  optional bool use_bn = 4 [default = true];
+  optional bool use_final_bn = 5 [default = true];
+  optional string final_activation = 6 [default = 'relu'];
+  optional bool use_bias = 7 [default = true];
+  // kernel_initializer
+  optional string initializer = 8 [default = 'he_uniform'];
+  optional bool use_bn_after_activation = 9;
+}
diff --git a/easy_rec/python/protos/easy_rec_model.proto b/easy_rec/python/protos/easy_rec_model.proto
index 27dcefadc..76506d710 100644
--- a/easy_rec/python/protos/easy_rec_model.proto
+++ b/easy_rec/python/protos/easy_rec_model.proto
@@ -1,6 +1,7 @@
 syntax = "proto2";
 package protos;

+import "easy_rec/python/protos/backbone.proto";
 import "easy_rec/python/protos/fm.proto";
 import "easy_rec/python/protos/deepfm.proto";
 import "easy_rec/python/protos/wide_and_deep.proto";
@@ -24,9 +25,17 @@ import "easy_rec/python/protos/loss.proto";
 import "easy_rec/python/protos/rocket_launching.proto";
 import "easy_rec/python/protos/variational_dropout.proto";
 import "easy_rec/python/protos/multi_tower_recall.proto";
+import "easy_rec/python/protos/tower.proto";
+

 // for input performance test
 message DummyModel {
+}
+
+// configure backbone network common parameters
+message ModelParams {
+  optional float l2_regularization = 1;
+  repeated BayesTaskTower task_towers = 3;
 }

 // for knowledge distillation
@@ -44,17 +53,19 @@ message KD {
   optional float loss_weight = 4 [default=1.0];
   // only for loss_type == CROSS_ENTROPY_LOSS
   optional float temperature = 5 [default=1.0];
-
 }

 message EasyRecModel {
   required string model_class = 1;
+  // just a name for backbone config
+  optional string model_name = 99;

   // actually input layers, each layer produce a group of feature
   repeated FeatureGroupConfig feature_groups = 2;

   // model parameters
   oneof model {
+    ModelParams model_params = 100;
     DummyModel dummy = 101;
     WideAndDeep wide_and_deep = 102;
     DeepFM deepfm = 103;
@@ -102,4 +113,12 @@ message EasyRecModel {
   repeated Loss losses = 15;

+  enum LossWeightStrategy {
+    Fixed = 0;
+    Uncertainty = 1;
+    Random = 2;
+  }
+  required LossWeightStrategy loss_weight_strategy = 16 [default = Fixed];
+
+  optional BackboneTower backbone = 17;
 }
diff --git a/easy_rec/python/protos/fm.proto b/easy_rec/python/protos/fm.proto
index c90af8cab..31d8f27d7 100644
--- a/easy_rec/python/protos/fm.proto
+++ b/easy_rec/python/protos/fm.proto
@@ -2,5 +2,6 @@ syntax = "proto2";
 package protos;

 message FM {
+  optional bool use_variant = 1;
   optional float l2_regularization = 5 [default = 1e-4];
 }
diff --git a/easy_rec/python/protos/keras_layer.proto b/easy_rec/python/protos/keras_layer.proto
new file mode 100644
index 000000000..2798260d3
--- /dev/null
+++ b/easy_rec/python/protos/keras_layer.proto
@@ -0,0 +1,27 @@
+syntax = "proto2";
+package protos;
+
+import "google/protobuf/struct.proto";
+import "easy_rec/python/protos/layer.proto";
+import "easy_rec/python/protos/dnn.proto";
+import "easy_rec/python/protos/fm.proto";
+import "easy_rec/python/protos/seq_encoder.proto";
+
+message KerasLayer {
+  required string class_name = 1;
+  oneof params {
+    google.protobuf.Struct st_params = 2;
+    PeriodicEmbedding periodic_embedding = 3;
+    AutoDisEmbedding auto_dis_embedding = 4;
+    FM fm = 5;
+    MaskBlock mask_block = 6;
+    MaskNet masknet = 7;
+    SENet senet = 8;
+    Bilinear bilinear = 9;
+    FiBiNet fibinet = 10;
+    MLP mlp = 11;
+    DINEncoder din = 12;
+    BSTEncoder bst = 13;
+    MMoELayer mmoe = 14;
+  }
+}
diff --git a/easy_rec/python/protos/layer.proto b/easy_rec/python/protos/layer.proto
index 6cea6d3bd..52a1cbf30 100644
--- a/easy_rec/python/protos/layer.proto
+++ b/easy_rec/python/protos/layer.proto
@@ -4,73 +4,68 @@ package protos;
 import "easy_rec/python/protos/dnn.proto";

 message HighWayTower {
-  required string input = 1;
+  optional string input = 1;
   required uint32 emb_size = 2;
+  required string activation = 3 [default = 'gelu'];
+  optional float dropout_rate = 4;
 }

-message CMBFTower {
-  // The number of heads of cross modal fusion layer
-  required uint32 multi_head_num = 1 [default = 1];
-  // The number of heads of image feature learning layer
-  required uint32 image_multi_head_num = 101 [default = 1];
-  // The number of heads of text feature learning layer
-  required uint32 text_multi_head_num = 102 [default = 1];
-  // The dimension of text heads
-  required uint32 text_head_size = 2;
-  // The dimension of image heads
-  required uint32 image_head_size = 3 [default = 64];
-  // The number of patches of image feature, take effect when there is only one image feature
-  required uint32 image_feature_patch_num = 4 [default = 1];
-  // Do dimension reduce to this size for image feature before single modal learning module
-  required uint32 image_feature_dim = 5 [default = 0];
-  // The number of self attention layers for image features
-  required uint32 image_self_attention_layer_num = 6 [default = 0];
-  // The number of self attention layers for text features
-  required uint32 text_self_attention_layer_num = 7 [default = 1];
-  // The number of cross modal layers
-  required uint32 cross_modal_layer_num = 8 [default = 1];
-  // The dimension of image cross modal heads
-  required uint32 image_cross_head_size = 9;
-  // The dimension of text cross modal heads
-  required uint32 text_cross_head_size = 10;
-  // Dropout probability for hidden layers
-  required float hidden_dropout_prob = 11 [default = 0.0];
-  // Dropout probability of the attention probabilities
-  required float attention_probs_dropout_prob = 12 [default = 0.0];
+message PeriodicEmbedding {
+  required uint32 embedding_dim = 1;
+  required float sigma = 2;
+  optional bool add_linear_layer = 3 [default = true];
+  optional string linear_activation = 4 [default = 'relu'];
+  optional bool output_3d_tensor = 5;
+  optional bool output_tensor_list = 6;
+}
+
+message AutoDisEmbedding {
+  required uint32 embedding_dim = 1;
+  required uint32 num_bins = 2;
+  required float keep_prob = 3 [default = 0.8];
+  required float temperature = 4;
+  optional bool output_3d_tensor = 5;
+  optional bool output_tensor_list = 6;
+}
+
+message SENet {
+  required uint32 reduction_ratio = 1 [default = 4];
+  optional uint32 num_squeeze_group = 2 [default = 2];
+  optional bool use_skip_connection = 3 [default = true];
+  optional bool use_output_layer_norm = 4 [default = true];
+}
+
+message Bilinear {
+  required string type = 1 [default = 'interaction'];
+  required bool use_plus = 2 [default = true];
+  required uint32 num_output_units = 3;
+}
+
+message FiBiNet {
+  optional Bilinear bilinear = 1;
+  required SENet senet = 2;
+  optional MLP mlp = 8;
+}
+
+message MaskBlock {
+  optional float reduction_factor = 1;
+  required uint32 output_size = 2;
+  optional uint32 aggregation_size = 3;
+  optional bool input_layer_norm = 4 [default = true];
+  optional uint32 projection_dim = 5;
+}

-  // Whether to add embeddings for different text sequence features
-  required bool use_token_type = 13 [default = false];
-  // Whether to add position embeddings for the position of each token in the text sequence
-  required bool use_position_embeddings = 14 [default = true];
-  // Maximum sequence length that might ever be used with this model
-  required uint32 max_position_embeddings = 15 [default = 0];
-  // Dropout probability for text sequence embeddings
-  required float text_seq_emb_dropout_prob = 16 [default = 0.1];
-  // dnn layers for other features
-  optional DNN other_feature_dnn = 17;
+message MaskNet {
+  repeated MaskBlock mask_blocks = 1;
+  required bool use_parallel = 2 [default = true];
+  optional MLP mlp = 3;
 }

-message UniterTower {
-  // Size of the encoder layers and the pooler layer
-  required uint32 hidden_size = 1;
-  // Number of hidden layers in the Transformer encoder
-  required uint32 num_hidden_layers = 2;
-  // Number of attention heads for each attention layer in the Transformer encoder
-  required uint32 num_attention_heads = 3;
-  // The size of the "intermediate" (i.e. feed-forward) layer in the Transformer encoder
-  required uint32 intermediate_size = 4;
-  // The non-linear activation function (function or string) in the encoder and pooler.
-  required string hidden_act = 5 [default = 'gelu'];  // "gelu", "relu", "tanh" and "swish" are supported.
-  // The dropout probability for all fully connected layers in the embeddings, encoder, and pooler
-  required float hidden_dropout_prob = 6 [default = 0.1];
-  // The dropout ratio for the attention probabilities
-  required float attention_probs_dropout_prob = 7 [default = 0.1];
-  // The maximum sequence length that this model might ever be used with
-  required uint32 max_position_embeddings = 8 [default = 512];
-  // Whether to add position embeddings for the position of each token in the text sequence
-  required bool use_position_embeddings = 9 [default = true];
-  // The stddev of the truncated_normal_initializer for initializing all weight matrices
-  required float initializer_range = 10 [default = 0.02];
-  // dnn layers for other features
-  optional DNN other_feature_dnn = 11;
+message MMoELayer {
+  // number of tasks
+  required uint32 num_task = 1;
+  // mmoe expert mlp layer definition
+  optional MLP expert_mlp = 2;
+  // number of mmoe experts
+  optional uint32 num_expert = 3;
 }
diff --git a/easy_rec/python/protos/loss.proto b/easy_rec/python/protos/loss.proto
index c5b74f47d..5c913bf6e 100644
--- a/easy_rec/python/protos/loss.proto
+++ b/easy_rec/python/protos/loss.proto
@@ -93,4 +93,6 @@ message PairwiseLogisticLoss {
 message JRCLoss {
   required string session_name = 1;
   optional float alpha = 2 [default = 0.5];
+  optional bool same_label_loss = 3 [default = true];
+  required string loss_weight_strategy = 4 [default = 'fixed'];
 }
diff --git a/easy_rec/python/protos/seq_encoder.proto b/easy_rec/python/protos/seq_encoder.proto
new file mode 100644
index 000000000..2b845a429
--- /dev/null
+++ b/easy_rec/python/protos/seq_encoder.proto
@@ -0,0 +1,37 @@
+syntax = "proto2";
+package protos;
+
+import "easy_rec/python/protos/dnn.proto";
+
+
+message BSTEncoder {
+  // Size of the encoder layers and the pooler layer
+  required uint32 hidden_size = 1;
+  // Number of hidden layers in the Transformer encoder
+  required uint32 num_hidden_layers = 2;
+  // Number of attention heads for each attention layer in the Transformer encoder
+  required uint32 num_attention_heads = 3;
+  // The size of the "intermediate" (i.e. feed-forward) layer in the Transformer encoder
+  required uint32 intermediate_size = 4;
+  // The non-linear activation function (function or string) in the encoder and pooler.
+  required string hidden_act = 5 [default = 'gelu'];  // "gelu", "relu", "tanh" and "swish" are supported.
+ // The dropout probability for all fully connected layers in the embeddings, encoder, and pooler + required float hidden_dropout_prob = 6 [default = 0.1]; + // The dropout ratio for the attention probabilities + required float attention_probs_dropout_prob = 7 [default = 0.1]; + // The maximum sequence length that this model might ever be used with + required uint32 max_position_embeddings = 8 [default = 512]; + // Whether to add position embeddings for the position of each token in the text sequence + required bool use_position_embeddings = 9 [default = true]; + // The stddev of the truncated_normal_initializer for initializing all weight matrices + required float initializer_range = 10 [default = 0.02]; +} + +message DINEncoder { + // din attention layer + required DNN attention_dnn = 1; + // whether to keep target item feature + required bool need_target_feature = 2 [default = true]; + // option: softmax, sigmoid + required string attention_normalizer = 3 [default = 'softmax']; +} diff --git a/easy_rec/python/protos/uniter.proto b/easy_rec/python/protos/uniter.proto index 7e78ad23e..9efc1dc9e 100644 --- a/easy_rec/python/protos/uniter.proto +++ b/easy_rec/python/protos/uniter.proto @@ -1,9 +1,33 @@ syntax = "proto2"; package protos; -import "easy_rec/python/protos/layer.proto"; import "easy_rec/python/protos/dnn.proto"; +message UniterTower { + // Size of the encoder layers and the pooler layer + required uint32 hidden_size = 1; + // Number of hidden layers in the Transformer encoder + required uint32 num_hidden_layers = 2; + // Number of attention heads for each attention layer in the Transformer encoder + required uint32 num_attention_heads = 3; + // The size of the "intermediate" (i.e. feed-forward) layer in the Transformer encoder + required uint32 intermediate_size = 4; + // The non-linear activation function (function or string) in the encoder and pooler. 
+ required string hidden_act = 5 [default = 'gelu']; // "gelu", "relu", "tanh" and "swish" are supported. + // The dropout probability for all fully connected layers in the embeddings, encoder, and pooler + required float hidden_dropout_prob = 6 [default = 0.1]; + // The dropout ratio for the attention probabilities + required float attention_probs_dropout_prob = 7 [default = 0.1]; + // The maximum sequence length that this model might ever be used with + required uint32 max_position_embeddings = 8 [default = 512]; + // Whether to add position embeddings for the position of each token in the text sequence + required bool use_position_embeddings = 9 [default = true]; + // The stddev of the truncated_normal_initializer for initializing all weight matrices + required float initializer_range = 10 [default = 0.02]; + // dnn layers for other features + optional DNN other_feature_dnn = 11; +} + message Uniter { required UniterTower config = 1; diff --git a/easy_rec/python/test/train_eval_test.py b/easy_rec/python/test/train_eval_test.py index f66e5ead3..2ae51751f 100644 --- a/easy_rec/python/test/train_eval_test.py +++ b/easy_rec/python/test/train_eval_test.py @@ -97,10 +97,20 @@ def test_wide_and_deep(self): self._test_dir) self.assertTrue(self._success) + def test_wide_and_deep_backbone(self): + self._success = test_utils.test_single_train_eval( + 'samples/model_config/wide_and_deep_backbone_on_avazau.config', + self._test_dir) + self.assertTrue(self._success) + def test_dlrm(self): self._success = test_utils.test_single_train_eval( 'samples/model_config/dlrm_on_taobao.config', self._test_dir) + def test_dlrm_backbone(self): + self._success = test_utils.test_single_train_eval( + 'samples/model_config/dlrm_backbone_on_taobao.config', self._test_dir) + def test_adamw_optimizer(self): self._success = test_utils.test_single_train_eval( 'samples/model_config/deepfm_combo_on_avazu_adamw_ctr.config', @@ -131,6 +141,12 @@ def test_multi_tower(self): 
'samples/model_config/multi_tower_on_taobao.config', self._test_dir) self.assertTrue(self._success) + def test_multi_tower_backbone(self): + self._success = test_utils.test_single_train_eval( + 'samples/model_config/multi_tower_backbone_on_taobao.config', + self._test_dir) + self.assertTrue(self._success) + def test_multi_tower_gauc(self): self._success = test_utils.test_single_train_eval( 'samples/model_config/multi_tower_on_taobao_gauc.config', @@ -336,6 +352,21 @@ def test_dcn(self): 'samples/model_config/dcn_on_taobao.config', self._test_dir) self.assertTrue(self._success) + def test_fibinet(self): + self._success = test_utils.test_single_train_eval( + 'samples/model_config/fibinet_on_taobao.config', self._test_dir) + self.assertTrue(self._success) + + def test_masknet(self): + self._success = test_utils.test_single_train_eval( + 'samples/model_config/masknet_on_taobao.config', self._test_dir) + self.assertTrue(self._success) + + def test_dcn_backbone(self): + self._success = test_utils.test_single_train_eval( + 'samples/model_config/dcn_backbone_on_taobao.config', self._test_dir) + self.assertTrue(self._success) + def test_dcn_with_f1(self): self._success = test_utils.test_single_train_eval( 'samples/model_config/dcn_f1_on_taobao.config', self._test_dir) @@ -521,11 +552,6 @@ def test_deepfm_with_sigmoid_l2_loss(self): self._test_dir) self.assertTrue(self._success) - # def test_deepfm_with_sequence_attention(self): - # self._success = test_utils.test_single_train_eval( - # 'samples/model_config/deppfm_seq_attn_on_taobao.config', self._test_dir) - # self.assertTrue(self._success) - def test_deepfm_with_embedding_learning_rate(self): self._success = test_utils.test_single_train_eval( 'samples/model_config/deepfm_combo_on_avazu_emblr_ctr.config', @@ -549,6 +575,11 @@ def test_mmoe(self): 'samples/model_config/mmoe_on_taobao.config', self._test_dir) self.assertTrue(self._success) + def test_mmoe_backbone(self): + self._success = test_utils.test_single_train_eval( + 
'samples/model_config/mmoe_backbone_on_taobao.config', self._test_dir) + self.assertTrue(self._success) + def test_mmoe_with_multi_loss(self): self._success = test_utils.test_single_train_eval( 'samples/model_config/mmoe_on_taobao_with_multi_loss.config', @@ -566,6 +597,12 @@ def test_simple_multi_task(self): self._test_dir) self.assertTrue(self._success) + def test_simple_multi_task_backbone(self): + self._success = test_utils.test_single_train_eval( + 'samples/model_config/simple_multi_task_backbone_on_taobao.config', + self._test_dir) + self.assertTrue(self._success) + def test_esmm(self): self._success = test_utils.test_single_train_eval( 'samples/model_config/esmm_on_taobao.config', self._test_dir) @@ -581,6 +618,11 @@ def test_dbmtl(self): 'samples/model_config/dbmtl_on_taobao.config', self._test_dir) self.assertTrue(self._success) + def test_dbmtl_backbone(self): + self._success = test_utils.test_single_train_eval( + 'samples/model_config/dbmtl_backbone_on_taobao.config', self._test_dir) + self.assertTrue(self._success) + def test_dbmtl_cmbf(self): self._success = test_utils.test_single_train_eval( 'samples/model_config/dbmtl_cmbf_on_movielens.config', self._test_dir) @@ -770,6 +812,18 @@ def test_batch_tfrecord_input(self): self._test_dir) self.assertTrue(self._success) + def test_autodis_embedding(self): + self._success = test_utils.test_distributed_train_eval( + 'samples/model_config/deepfm_on_criteo_with_autodis.config', + self._test_dir) + self.assertTrue(self._success) + + def test_periodic_embedding(self): + self._success = test_utils.test_distributed_train_eval( + 'samples/model_config/deepfm_on_criteo_with_periodic.config', + self._test_dir) + self.assertTrue(self._success) + def test_sample_weight(self): self._success = test_utils.test_single_train_eval( 'samples/model_config/deepfm_with_sample_weight.config', self._test_dir) diff --git a/easy_rec/python/test/util_test.py b/easy_rec/python/test/util_test.py index 233a00772..c14524488 100644 --- 
a/easy_rec/python/test/util_test.py +++ b/easy_rec/python/test/util_test.py @@ -4,6 +4,7 @@ import tensorflow as tf from easy_rec.python.utils import estimator_utils +from easy_rec.python.utils.dag import DAG from easy_rec.python.utils.expr_util import get_expression if tf.__version__ >= '2.0': @@ -57,6 +58,28 @@ def test_get_expression_or(self): ['age_level', 'item_age_level']) assert result == "tf.greater(parsed_dict['age_level'], 3) | tf.less(parsed_dict['item_age_level'], 1)" + def test_dag(self): + dag = DAG() + dag.add_node('a') + dag.add_node('b') + dag.add_node('c') + dag.add_node('d') + dag.add_edge('a', 'b') + dag.add_edge('a', 'd') + dag.add_edge('b', 'c') + order = dag.topological_sort() + idx_a = order.index('a') + idx_b = order.index('b') + idx_c = order.index('c') + idx_d = order.index('d') + assert idx_a < idx_b + assert idx_a < idx_d + assert idx_b < idx_c + c = dag.all_downstreams('b') + assert c == ['c'] + leaf = dag.all_leaves() + assert leaf == ['c', 'd'] + if __name__ == '__main__': tf.test.main() diff --git a/easy_rec/python/train_eval.py b/easy_rec/python/train_eval.py index a96d0c58e..a7ebca5db 100644 --- a/easy_rec/python/train_eval.py +++ b/easy_rec/python/train_eval.py @@ -106,8 +106,12 @@ help='is use check mode') parser.add_argument( '--selected_cols', type=str, default=None, help='select input columns') + parser.add_argument('--gpu', type=str, default=None, help='gpu id') args, extra_args = parser.parse_known_args() + if args.gpu is not None: + os.environ['CUDA_VISIBLE_DEVICES'] = args.gpu + edit_config_json = {} if args.edit_config_json: edit_config_json = json.loads(args.edit_config_json) diff --git a/easy_rec/python/utils/dag.py b/easy_rec/python/utils/dag.py new file mode 100644 index 000000000..b72ae072b --- /dev/null +++ b/easy_rec/python/utils/dag.py @@ -0,0 +1,192 @@ +import logging +from collections import OrderedDict +from collections import defaultdict +from copy import copy +from copy import deepcopy + + +class 
DAG(object): + """Directed acyclic graph implementation.""" + + def __init__(self): + """Construct a new DAG with no nodes or edges.""" + self.reset_graph() + + def add_node(self, node_name, graph=None): + """Add a node if it does not exist yet, or error out.""" + if not graph: + graph = self.graph + if node_name in graph: + raise KeyError('node %s already exists' % node_name) + graph[node_name] = set() + + def add_node_if_not_exists(self, node_name, graph=None): + try: + self.add_node(node_name, graph=graph) + except KeyError: + logging.info('node %s already exists' % node_name) + + def delete_node(self, node_name, graph=None): + """Delete this node and all edges referencing it.""" + if not graph: + graph = self.graph + if node_name not in graph: + raise KeyError('node %s does not exist' % node_name) + graph.pop(node_name) + + for node, edges in graph.items(): + if node_name in edges: + edges.remove(node_name) + + def delete_node_if_exists(self, node_name, graph=None): + try: + self.delete_node(node_name, graph=graph) + except KeyError: + logging.info('node %s does not exist' % node_name) + + def add_edge(self, ind_node, dep_node, graph=None): + """Add an edge (dependency) between the specified nodes.""" + if not graph: + graph = self.graph + if ind_node not in graph or dep_node not in graph: + raise KeyError('one or more nodes do not exist in graph') + test_graph = deepcopy(graph) + test_graph[ind_node].add(dep_node) + is_valid, message = self.validate(test_graph) + if is_valid: + graph[ind_node].add(dep_node) + else: + raise ValueError('add edge (%s -> %s) failed: %s' % (ind_node, dep_node, message)) + + def delete_edge(self, ind_node, dep_node, graph=None): + """Delete an edge from the graph.""" + if not graph: + graph = self.graph + if dep_node not in graph.get(ind_node, []): + raise KeyError('this edge does not exist in graph') + graph[ind_node].remove(dep_node) + + def rename_edges(self, old_task_name, new_task_name, graph=None): + """Change references to a task in existing edges.""" + if not graph: + graph = self.graph +
for node, edges in list(graph.items()): + + if node == old_task_name: + graph[new_task_name] = copy(edges) + del graph[old_task_name] + + else: + if old_task_name in edges: + edges.remove(old_task_name) + edges.add(new_task_name) + + def predecessors(self, node, graph=None): + """Returns a list of all predecessors of the given node.""" + if graph is None: + graph = self.graph + return [key for key in graph if node in graph[key]] + + def downstream(self, node, graph=None): + """Returns a list of all nodes this node has edges towards.""" + if graph is None: + graph = self.graph + if node not in graph: + raise KeyError('node %s is not in graph' % node) + return list(graph[node]) + + def all_downstreams(self, node, graph=None): + """Returns all nodes ultimately downstream of the given node, in topological order.""" + if graph is None: + graph = self.graph + nodes = [node] + nodes_seen = set() + i = 0 + while i < len(nodes): + downstreams = self.downstream(nodes[i], graph) + for downstream_node in downstreams: + if downstream_node not in nodes_seen: + nodes_seen.add(downstream_node) + nodes.append(downstream_node) + i += 1 + return list( + filter(lambda n: n in nodes_seen, + self.topological_sort(graph=graph))) + + def all_leaves(self, graph=None): + """Return a list of all leaves (nodes with no downstreams).""" + if graph is None: + graph = self.graph + return [key for key in graph if not graph[key]] + + def from_dict(self, graph_dict): + """Reset the graph and build it from the passed dictionary.
+ + The dictionary takes the form of {node_name: [directed edges]} + """ + self.reset_graph() + for new_node in graph_dict.keys(): + self.add_node(new_node) + for ind_node, dep_nodes in graph_dict.items(): + if not isinstance(dep_nodes, list): + raise TypeError('dict values must be lists') + for dep_node in dep_nodes: + self.add_edge(ind_node, dep_node) + + def reset_graph(self): + """Restore the graph to an empty state.""" + self.graph = OrderedDict() + + def independent_nodes(self, graph=None): + """Returns a list of all nodes in the graph with no dependencies.""" + if graph is None: + graph = self.graph + + dependent_nodes = set( + node for dependents in graph.values() for node in dependents) + return [node for node in graph.keys() if node not in dependent_nodes] + + def validate(self, graph=None): + """Returns (Boolean, message) of whether DAG is valid.""" + graph = graph if graph is not None else self.graph + if len(self.independent_nodes(graph)) == 0: + return False, 'no independent nodes detected' + try: + self.topological_sort(graph) + except ValueError: + return False, 'failed topological sort' + return True, 'valid' + + def topological_sort(self, graph=None): + """Returns a topological ordering of the DAG. + + Raises an error if this is not possible (graph is not valid). 
+ """ + if graph is None: + graph = self.graph + result = [] + in_degree = defaultdict(int) + + for u in graph: + for v in graph[u]: + in_degree[v] += 1 + ready = [node for node in graph if not in_degree[node]] + + while ready: + u = ready.pop() + result.append(u) + for v in graph[u]: + in_degree[v] -= 1 + if in_degree[v] == 0: + ready.append(v) + + if len(result) == len(graph): + return result + else: + raise ValueError('graph is not acyclic') + + def size(self): + return len(self.graph) diff --git a/easy_rec/python/utils/load_class.py b/easy_rec/python/utils/load_class.py index 2da1e4e41..9ac749c76 100644 --- a/easy_rec/python/utils/load_class.py +++ b/easy_rec/python/utils/load_class.py @@ -220,3 +220,30 @@ def create_class(cls, name): return newclass return RegisterABCMeta + + +def load_keras_layer(name): + """Load keras layer class. + + Args: + name: keras layer name + + Return: + (layer_class, is_customize) + """ + name = name.strip() if name else '' + if not name: + return None, False + + path = 'easy_rec.python.layers.keras.' + name + try: + cls = pydoc.locate(path) + if cls is not None: + return cls, True + path = 'tensorflow.keras.layers.' + name + return pydoc.locate(path), False + except pydoc.ErrorDuringImport: + logging.error('load keras layer %s failed: %s' % + (name, traceback.format_exc())) + return None, False diff --git a/easy_rec/version.py b/easy_rec/version.py index 08d341768..5d3a16322 100644 --- a/easy_rec/version.py +++ b/easy_rec/version.py @@ -1,3 +1,3 @@ # -*- encoding:utf-8 -*- # Copyright (c) Alibaba, Inc. and its affiliates.
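The new `DAG.topological_sort` above is Kahn's algorithm over a node-to-downstream-set dict. A minimal standalone sketch of the same procedure (reimplemented here so it runs without EasyRec; names are illustrative, not the library's API):

```python
from collections import defaultdict


def topological_sort(graph):
  """Kahn's algorithm; graph maps each node to the set of its downstream nodes."""
  in_degree = defaultdict(int)
  for u in graph:
    for v in graph[u]:
      in_degree[v] += 1
  # Start from nodes with no incoming edges (the independent nodes).
  ready = [n for n in graph if in_degree[n] == 0]
  order = []
  while ready:
    u = ready.pop()
    order.append(u)
    for v in graph[u]:
      in_degree[v] -= 1
      if in_degree[v] == 0:
        ready.append(v)
  if len(order) != len(graph):
    # A cycle leaves some nodes with nonzero in-degree, so they never get emitted.
    raise ValueError('graph is not acyclic')
  return order


# Same graph as test_dag in util_test.py: edges a->b, a->d, b->c.
order = topological_sort({'a': {'b', 'd'}, 'b': {'c'}, 'c': set(), 'd': set()})
```

Any valid ordering places `a` before `b` and `d`, and `b` before `c`, which is exactly what the new `test_dag` asserts.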
-__version__ = '0.6.4' +__version__ = '0.7.0' diff --git a/examples/configs/dcn_backbone_on_movielens.config b/examples/configs/dcn_backbone_on_movielens.config new file mode 100644 index 000000000..2717a5c4c --- /dev/null +++ b/examples/configs/dcn_backbone_on_movielens.config @@ -0,0 +1,203 @@ +train_input_path: "examples/data/movielens_1m/movies_train_data" +eval_input_path: "examples/data/movielens_1m/movies_test_data" +model_dir: "examples/ckpt/dcn_on_movieslen" + +train_config { + log_step_count_steps: 100 + optimizer_config: { + adam_optimizer: { + learning_rate: { + exponential_decay_learning_rate { + initial_learning_rate: 0.001 + decay_steps: 1000 + decay_factor: 0.5 + min_learning_rate: 0.00001 + } + } + } + use_moving_average: false + } + save_checkpoints_steps: 2000 + sync_replicas: false +} + +eval_config { + metrics_set: { + auc {} + } + metrics_set: { + gauc { + uid_field: 'user_id' + } + } + metrics_set: { + max_f1 {} + } +} + +data_config { + input_fields { + input_name:'label' + input_type: INT32 + } + input_fields { + input_name:'user_id' + input_type: INT32 + } + input_fields { + input_name: 'movie_id' + input_type: INT32 + } + input_fields { + input_name:'rating' + input_type: INT32 + } + input_fields { + input_name: 'gender' + input_type: INT32 + } + input_fields { + input_name: 'age' + input_type: INT32 + } + input_fields { + input_name: 'job_id' + input_type: INT32 + } + input_fields { + input_name: 'zip_id' + input_type: STRING + } + input_fields { + input_name: 'title' + input_type: STRING + } + input_fields { + input_name: 'genres' + input_type: STRING + } + input_fields { + input_name: 'year' + input_type: INT32 + } + + label_fields: 'label' + batch_size: 1024 + num_epochs: 1 + prefetch_size: 32 + input_type: CSVInput + separator: '\t' +} + +feature_config: { + features: { + input_names: 'user_id' + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 12000 + } + features: { + input_names: 'movie_id' + feature_type: IdFeature 
+ embedding_dim: 16 + hash_bucket_size: 6000 + } + features: { + input_names: 'gender' + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 2 + } + features: { + input_names: 'job_id' + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 21 + } + features: { + input_names: 'age' + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 7 + } + features: { + input_names: 'genres' + feature_type: TagFeature + separator: '|' + embedding_dim: 16 + hash_bucket_size: 100 + } + features: { + input_names: 'title' + feature_type: SequenceFeature + separator: ' ' + embedding_dim: 16 + hash_bucket_size: 10000 + sequence_combiner: { + text_cnn: { + filter_sizes: [2, 3, 4] + num_filters: [16, 8, 8] + } + } + } + features: { + input_names: 'year' + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 36 + } +} +model_config: { + model_name: 'DCN v2' + model_class: 'RankModel' + feature_groups: { + group_name: 'all' + feature_names: 'user_id' + feature_names: 'movie_id' + feature_names: 'job_id' + feature_names: 'age' + feature_names: 'gender' + feature_names: 'year' + feature_names: 'genres' + wide_deep: DEEP + } + backbone { + blocks { + name: "deep" + inputs { + feature_group_name: 'all' + } + keras_layer { + class_name: 'MLP' + mlp { + hidden_units: [256, 128, 64] + } + } + } + blocks { + name: "dcn" + inputs { + feature_group_name: 'all' + input_fn: 'lambda x: [x, x]' + } + recurrent { + num_steps: 3 + fixed_input_index: 0 + keras_layer { + class_name: 'Cross' + } + } + } + concat_blocks: ['deep', 'dcn'] + top_mlp { + hidden_units: [64, 32, 16] + } + } + model_params { + l2_regularization: 1e-4 + } + embedding_regularization: 1e-4 +} +export_config { + multi_placeholder: false +} diff --git a/examples/configs/deepfm_backbone_on_criteo.config b/examples/configs/deepfm_backbone_on_criteo.config new file mode 100644 index 000000000..25fc5cfc6 --- /dev/null +++ b/examples/configs/deepfm_backbone_on_criteo.config @@ -0,0 +1,643 @@ +train_input_path: 
"examples/data/criteo/criteo_train_data" +eval_input_path: "examples/data/criteo/criteo_test_data" +model_dir: "examples/ckpt/deepfm_backbone_criteo" + +train_config { + log_step_count_steps: 500 + optimizer_config: { + adam_optimizer: { + learning_rate: { + exponential_decay_learning_rate { + initial_learning_rate: 0.001 + decay_steps: 1000 + decay_factor: 0.5 + min_learning_rate: 0.00001 + } + } + } + use_moving_average: false + } + save_checkpoints_steps: 20000 + sync_replicas: True +} + +eval_config { + metrics_set: { + auc {} + } +} + +data_config { + separator: "\t" + input_fields: { + input_name: "label" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F1" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F2" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F3" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F4" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F5" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F6" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F7" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F8" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F9" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F10" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F11" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F12" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F13" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "C1" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C2" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C3" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C4" + input_type: STRING + 
default_val:"" + } + input_fields: { + input_name: "C5" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C6" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C7" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C8" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C9" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C10" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C11" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C12" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C13" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C14" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C15" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C16" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C17" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C18" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C19" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C20" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C21" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C22" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C23" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C24" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C25" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C26" + input_type: STRING + default_val:"" + } + label_fields: "label" + + batch_size: 4096 + num_epochs: 1 + prefetch_size: 32 + input_type: CSVInput +} + +feature_config: { + features: { + input_names: "F1" + embedding_dim: 16 + feature_type: RawFeature + min_val:0.0 + 
max_val: 5775.0 + } + features: { + input_names: "F2" + embedding_dim: 16 + feature_type: RawFeature + min_val: -3.0 + max_val: 257675.0 + } + features: { + input_names: "F3" + embedding_dim: 16 + feature_type: RawFeature + min_val: 0.0 + max_val: 65535.0 + } + features: { + input_names: "F4" + embedding_dim: 16 + feature_type: RawFeature + min_val: 0.0 + max_val: 969.0 + } + features: { + input_names: "F5" + embedding_dim: 16 + feature_type: RawFeature + min_val: 0.0 + max_val: 23159456.0 + } + features: { + input_names: "F6" + embedding_dim: 16 + feature_type: RawFeature + min_val: 0.0 + max_val: 431037.0 + } + features: { + input_names: "F7" + embedding_dim: 16 + feature_type: RawFeature + min_val: 0.0 + max_val: 56311.0 + } + features: { + input_names: "F8" + embedding_dim: 16 + feature_type: RawFeature + min_val: 0.0 + max_val: 6047.0 + } + features: { + input_names: "F9" + embedding_dim: 16 + feature_type: RawFeature + min_val: 0.0 + max_val: 29019.0 + } + features: { + input_names: "F10" + embedding_dim: 16 + feature_type: RawFeature + min_val: 0.0 + max_val: 46.0 + } + features: { + input_names: "F11" + embedding_dim: 16 + feature_type: RawFeature + min_val: 0.0 + max_val: 231.0 + } + features: { + input_names: "F12" + embedding_dim: 16 + feature_type: RawFeature + min_val: 0.0 + max_val: 4008.0 + } + features: { + input_names: "F13" + embedding_dim: 16 + feature_type: RawFeature + min_val: 0.0 + max_val: 7393.0 + } + features: { + input_names: "C1" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C2" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C3" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C4" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C5" + hash_bucket_size: 1000000 + feature_type: IdFeature + 
embedding_dim: 16 + } + features: { + input_names: "C6" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C7" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C8" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C9" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C10" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C11" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C12" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C13" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C14" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C15" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C16" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C17" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C18" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C19" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C20" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C21" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C22" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C23" + hash_bucket_size: 
1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C24" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C25" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C26" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } +} +model_config: { + model_name: 'DeepFM' + model_class: 'RankModel' + feature_groups: { + group_name: "deep_features" + feature_names: "F1" + feature_names: "F2" + feature_names: "F3" + feature_names: "F4" + feature_names: "F5" + feature_names: "F6" + feature_names: "F7" + feature_names: "F8" + feature_names: "F9" + feature_names: "F10" + feature_names: "F11" + feature_names: "F12" + feature_names: "F13" + feature_names: "C1" + feature_names: "C2" + feature_names: "C3" + feature_names: "C4" + feature_names: "C5" + feature_names: "C6" + feature_names: "C7" + feature_names: "C8" + feature_names: "C9" + feature_names: "C10" + feature_names: "C11" + feature_names: "C12" + feature_names: "C13" + feature_names: "C14" + feature_names: "C15" + feature_names: "C16" + feature_names: "C17" + feature_names: "C18" + feature_names: "C19" + feature_names: "C20" + feature_names: "C21" + feature_names: "C22" + feature_names: "C23" + feature_names: "C24" + feature_names: "C25" + feature_names: "C26" + wide_deep:DEEP + } + feature_groups: { + group_name: "wide_features" + feature_names: "F1" + feature_names: "F2" + feature_names: "F3" + feature_names: "F4" + feature_names: "F5" + feature_names: "F6" + feature_names: "F7" + feature_names: "F8" + feature_names: "F9" + feature_names: "F10" + feature_names: "F11" +
feature_names: "C12" + feature_names: "C13" + feature_names: "C14" + feature_names: "C15" + feature_names: "C16" + feature_names: "C17" + feature_names: "C18" + feature_names: "C19" + feature_names: "C20" + feature_names: "C21" + feature_names: "C22" + feature_names: "C23" + feature_names: "C24" + feature_names: "C25" + feature_names: "C26" + wide_deep:WIDE + } + backbone { + blocks { + name: 'wide_features' + inputs { + feature_group_name: 'wide_features' + } + input_layer { + wide_output_dim: 1 + } + } + blocks { + name: 'wide_logit' + inputs { + block_name: 'wide_features' + } + lambda { + expression: 'lambda x: tf.reduce_sum(x, axis=1, keepdims=True)' + } + } + blocks { + name: 'deep_features' + inputs { + feature_group_name: 'deep_features' + } + input_layer { + output_2d_tensor_and_feature_list: true + } + } + blocks { + name: 'fm' + inputs { + block_name: 'deep_features' + input_slice: '[1]' + } + keras_layer { + class_name: 'FM' + st_params { + fields { + key: 'use_variant' + value { bool_value: true } + } + } + } + } + blocks { + name: 'deep' + inputs { + block_name: 'deep_features' + input_slice: '[0]' + } + keras_layer { + class_name: 'MLP' + mlp { + hidden_units: [256, 128, 64] + } + } + } + concat_blocks: ['wide_logit', 'fm', 'deep'] + top_mlp { + hidden_units: [256, 128, 64] + } + } + model_params { + l2_regularization: 1e-5 + } + embedding_regularization: 1e-5 +} diff --git a/examples/configs/deepfm_backbone_on_criteo_with_autodis.config b/examples/configs/deepfm_backbone_on_criteo_with_autodis.config new file mode 100644 index 000000000..e0c6ccb43 --- /dev/null +++ b/examples/configs/deepfm_backbone_on_criteo_with_autodis.config @@ -0,0 +1,759 @@ +train_input_path: "examples/data/criteo/criteo_train_data" +eval_input_path: "examples/data/criteo/criteo_test_data" +model_dir: "examples/ckpt/deepfm_autodis_criteo" + +train_config { + log_step_count_steps: 500 + optimizer_config: { + adam_optimizer: { + learning_rate: { + exponential_decay_learning_rate 
{ + initial_learning_rate: 0.001 + decay_steps: 1000 + decay_factor: 0.5 + min_learning_rate: 0.00001 + } + } + } + use_moving_average: false + } + save_checkpoints_steps: 20000 + sync_replicas: True +} + +eval_config { + metrics_set: { + auc {} + } +} + +data_config { + separator: "\t" + input_fields: { + input_name: "label" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F1" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F2" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F3" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F4" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F5" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F6" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F7" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F8" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F9" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F10" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F11" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F12" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F13" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "C1" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C2" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C3" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C4" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C5" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C6" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C7" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C8" + 
input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C9" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C10" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C11" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C12" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C13" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C14" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C15" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C16" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C17" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C18" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C19" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C20" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C21" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C22" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C23" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C24" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C25" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C26" + input_type: STRING + default_val:"" + } + label_fields: "label" + + batch_size: 4096 + num_epochs: 1 + prefetch_size: 32 + input_type: CSVInput +} + +feature_config: { + features: { + input_names: "F1" + feature_type: RawFeature + min_val:0.0 + max_val: 5775.0 + } + features: { + input_names: "F2" + feature_type: RawFeature + min_val: -3.0 + max_val: 257675.0 + } + features: { + input_names: "F3" + feature_type: RawFeature + min_val: 0.0 + max_val: 65535.0 + } + features: { + input_names: "F4" + feature_type: RawFeature + min_val: 0.0 + max_val: 969.0 + 
} + features: { + input_names: "F5" + feature_type: RawFeature + min_val: 0.0 + max_val: 23159456.0 + } + features: { + input_names: "F6" + feature_type: RawFeature + min_val: 0.0 + max_val: 431037.0 + } + features: { + input_names: "F7" + feature_type: RawFeature + min_val: 0.0 + max_val: 56311.0 + } + features: { + input_names: "F8" + feature_type: RawFeature + min_val: 0.0 + max_val: 6047.0 + } + features: { + input_names: "F9" + feature_type: RawFeature + min_val: 0.0 + max_val: 29019.0 + } + features: { + input_names: "F10" + feature_type: RawFeature + min_val: 0.0 + max_val: 46.0 + } + features: { + input_names: "F11" + feature_type: RawFeature + min_val: 0.0 + max_val: 231.0 + } + features: { + input_names: "F12" + feature_type: RawFeature + min_val: 0.0 + max_val: 4008.0 + } + features: { + input_names: "F13" + feature_type: RawFeature + min_val: 0.0 + max_val: 7393.0 + } + features: { + input_names: "C1" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C2" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C3" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C4" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C5" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C6" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C7" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C8" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C9" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C10" + hash_bucket_size: 1000000 + feature_type: IdFeature + 
embedding_dim: 16 + } + features: { + input_names: "C11" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C12" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C13" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C14" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C15" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C16" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C17" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C18" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C19" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C20" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C21" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C22" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C23" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C24" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C25" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C26" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + feature_name: "D1" + input_names: "F1" + embedding_dim:16 + feature_type: RawFeature + min_val:0.0 + max_val: 5775.0 + } + features: { + feature_name: "D2"
+ input_names: "F2" + embedding_dim:16 + feature_type: RawFeature + min_val: -3.0 + max_val: 257675.0 + } + features: { + feature_name: "D3" + input_names: "F3" + embedding_dim:16 + feature_type: RawFeature + min_val: 0.0 + max_val: 65535.0 + } + features: { + feature_name: "D4" + input_names: "F4" + embedding_dim:16 + feature_type: RawFeature + min_val: 0.0 + max_val: 969.0 + } + features: { + feature_name: "D5" + input_names: "F5" + embedding_dim:16 + feature_type: RawFeature + min_val: 0.0 + max_val: 23159456.0 + } + features: { + feature_name: "D6" + input_names: "F6" + embedding_dim:16 + feature_type: RawFeature + min_val: 0.0 + max_val: 431037.0 + } + features: { + feature_name: "D7" + input_names: "F7" + embedding_dim:16 + feature_type: RawFeature + min_val: 0.0 + max_val: 56311.0 + } + features: { + feature_name: "D8" + input_names: "F8" + embedding_dim:16 + feature_type: RawFeature + min_val: 0.0 + max_val: 6047.0 + } + features: { + feature_name: "D9" + input_names: "F9" + embedding_dim:16 + feature_type: RawFeature + min_val: 0.0 + max_val: 29019.0 + } + features: { + feature_name: "D10" + input_names: "F10" + embedding_dim:16 + feature_type: RawFeature + min_val: 0.0 + max_val: 46.0 + } + features: { + feature_name: "D11" + input_names: "F11" + embedding_dim:16 + feature_type: RawFeature + min_val: 0.0 + max_val: 231.0 + } + features: { + feature_name: "D12" + input_names: "F12" + embedding_dim:16 + feature_type: RawFeature + min_val: 0.0 + max_val: 4008.0 + } + features: { + feature_name: "D13" + input_names: "F13" + embedding_dim:16 + feature_type: RawFeature + min_val: 0.0 + max_val: 7393.0 + } +} +model_config: { + model_name: 'DeepFM with AutoDis' + model_class: 'RankModel' + feature_groups: { + group_name: "numerical_features" + feature_names: "F1" + feature_names: "F2" + feature_names: "F3" + feature_names: "F4" + feature_names: "F5" + feature_names: "F6" + feature_names: "F7" + feature_names: "F8" + feature_names: "F9" + feature_names: "F10" + 
feature_names: "F11" + feature_names: "F12" + feature_names: "F13" + wide_deep:DEEP + } + feature_groups: { + group_name: "categorical_features" + feature_names: "C1" + feature_names: "C2" + feature_names: "C3" + feature_names: "C4" + feature_names: "C5" + feature_names: "C6" + feature_names: "C7" + feature_names: "C8" + feature_names: "C9" + feature_names: "C10" + feature_names: "C11" + feature_names: "C12" + feature_names: "C13" + feature_names: "C14" + feature_names: "C15" + feature_names: "C16" + feature_names: "C17" + feature_names: "C18" + feature_names: "C19" + feature_names: "C20" + feature_names: "C21" + feature_names: "C22" + feature_names: "C23" + feature_names: "C24" + feature_names: "C25" + feature_names: "C26" + wide_deep:DEEP + } + feature_groups: { + group_name: "wide_features" + feature_names: "D1" + feature_names: "D2" + feature_names: "D3" + feature_names: "D4" + feature_names: "D5" + feature_names: "D6" + feature_names: "D7" + feature_names: "D8" + feature_names: "D9" + feature_names: "D10" + feature_names: "D11" + feature_names: "D12" + feature_names: "D13" + feature_names: "C1" + feature_names: "C2" + feature_names: "C3" + feature_names: "C4" + feature_names: "C5" + feature_names: "C6" + feature_names: "C7" + feature_names: "C8" + feature_names: "C9" + feature_names: "C10" + feature_names: "C11" + feature_names: "C12" + feature_names: "C13" + feature_names: "C14" + feature_names: "C15" + feature_names: "C16" + feature_names: "C17" + feature_names: "C18" + feature_names: "C19" + feature_names: "C20" + feature_names: "C21" + feature_names: "C22" + feature_names: "C23" + feature_names: "C24" + feature_names: "C25" + feature_names: "C26" + wide_deep:WIDE + } + backbone { + blocks { + name: 'wide_features' + inputs { + feature_group_name: 'wide_features' + } + input_layer { + wide_output_dim: 1 + } + } + blocks { + name: 'wide_logit' + inputs { + block_name: 'wide_features' + } + lambda { + expression: 'lambda x: tf.reduce_sum(x, axis=1, 
keepdims=True)' + } + } + blocks { + name: 'num_emb' + inputs { + feature_group_name: 'numerical_features' + } + keras_layer { + class_name: 'AutoDisEmbedding' + auto_dis_embedding { + embedding_dim: 16 + num_bins: 20 + temperature: 0.815 + output_tensor_list: true + } + } + } + blocks { + name: 'categorical_features' + inputs { + feature_group_name: 'categorical_features' + } + input_layer { + output_2d_tensor_and_feature_list: true + } + } + blocks { + name: 'fm' + inputs { + block_name: 'categorical_features' + input_slice: '[1]' + } + inputs { + block_name: 'num_emb' + input_slice: '[1]' + } + keras_layer { + class_name: 'FM' + fm { + use_variant: true + } + } + } + blocks { + name: 'deep' + inputs { + block_name: 'categorical_features' + input_slice: '[0]' + } + inputs { + block_name: 'num_emb' + input_slice: '[0]' + } + keras_layer { + class_name: 'MLP' + mlp { + hidden_units: [256, 128, 64] + } + } + } + # omitting wide_logit from concat_blocks may yield better performance + concat_blocks: ['wide_logit', 'fm', 'deep'] + top_mlp { + hidden_units: [256, 128, 64] + } + } + model_params { + l2_regularization: 1e-5 + } + embedding_regularization: 1e-5 +} diff --git a/examples/configs/deepfm_backbone_on_criteo_with_periodic.config b/examples/configs/deepfm_backbone_on_criteo_with_periodic.config new file mode 100644 index 000000000..06753ad2c --- /dev/null +++ b/examples/configs/deepfm_backbone_on_criteo_with_periodic.config @@ -0,0 +1,757 @@ +train_input_path: "examples/data/criteo/criteo_train_data" +eval_input_path: "examples/data/criteo/criteo_test_data" +model_dir: "examples/ckpt/deepfm_periodic_criteo" + +train_config { + log_step_count_steps: 500 + optimizer_config: { + adam_optimizer: { + learning_rate: { + exponential_decay_learning_rate { + initial_learning_rate: 0.001 + decay_steps: 1000 + decay_factor: 0.5 + min_learning_rate: 0.00001 + } + } + } + use_moving_average: false + } + save_checkpoints_steps: 20000 + sync_replicas: True +} + +eval_config { + metrics_set: { + auc {} + 
} +} + +data_config { + separator: "\t" + input_fields: { + input_name: "label" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F1" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F2" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F3" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F4" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F5" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F6" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F7" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F8" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F9" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F10" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F11" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F12" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F13" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "C1" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C2" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C3" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C4" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C5" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C6" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C7" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C8" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C9" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C10" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C11" + input_type: 
STRING + default_val:"" + } + input_fields: { + input_name: "C12" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C13" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C14" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C15" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C16" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C17" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C18" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C19" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C20" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C21" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C22" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C23" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C24" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C25" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C26" + input_type: STRING + default_val:"" + } + label_fields: "label" + + batch_size: 4096 + num_epochs: 1 + prefetch_size: 32 + input_type: CSVInput +} + +feature_config: { + features: { + input_names: "F1" + feature_type: RawFeature + min_val:0.0 + max_val: 5775.0 + } + features: { + input_names: "F2" + feature_type: RawFeature + min_val: -3.0 + max_val: 257675.0 + } + features: { + input_names: "F3" + feature_type: RawFeature + min_val: 0.0 + max_val: 65535.0 + } + features: { + input_names: "F4" + feature_type: RawFeature + min_val: 0.0 + max_val: 969.0 + } + features: { + input_names: "F5" + feature_type: RawFeature + min_val: 0.0 + max_val: 23159456.0 + } + features: { + input_names: "F6" + feature_type: RawFeature + min_val: 0.0 + max_val: 431037.0 + } + features: { + input_names: "F7" + 
feature_type: RawFeature + min_val: 0.0 + max_val: 56311.0 + } + features: { + input_names: "F8" + feature_type: RawFeature + min_val: 0.0 + max_val: 6047.0 + } + features: { + input_names: "F9" + feature_type: RawFeature + min_val: 0.0 + max_val: 29019.0 + } + features: { + input_names: "F10" + feature_type: RawFeature + min_val: 0.0 + max_val: 46.0 + } + features: { + input_names: "F11" + feature_type: RawFeature + min_val: 0.0 + max_val: 231.0 + } + features: { + input_names: "F12" + feature_type: RawFeature + min_val: 0.0 + max_val: 4008.0 + } + features: { + input_names: "F13" + feature_type: RawFeature + min_val: 0.0 + max_val: 7393.0 + } + features: { + input_names: "C1" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C2" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C3" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C4" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C5" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C6" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C7" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C8" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C9" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C10" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C11" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C12" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } 
+ features: { + input_names: "C13" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C14" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C15" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C16" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C17" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C18" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C19" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C20" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C21" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C22" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C23" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C24" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C25" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C26" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + feature_name: "D1" + input_names: "F1" + embedding_dim: 16 + feature_type: RawFeature + min_val: 0.0 + max_val: 5775.0 + } + features: { + feature_name: "D2" + input_names: "F2" + embedding_dim: 16 + feature_type: RawFeature + min_val: -3.0 + max_val: 257675.0 + } + features: { + feature_name: "D3" + input_names: "F3" + embedding_dim: 16 + feature_type: RawFeature + min_val: 0.0 + max_val: 65535.0 + } + 
features: { + feature_name: "D4" + input_names: "F4" + embedding_dim:16 + feature_type: RawFeature + min_val: 0.0 + max_val: 969.0 + } + features: { + feature_name: "D5" + input_names: "F5" + embedding_dim:16 + feature_type: RawFeature + min_val: 0.0 + max_val: 23159456.0 + } + features: { + feature_name: "D6" + input_names: "F6" + embedding_dim:16 + feature_type: RawFeature + min_val: 0.0 + max_val: 431037.0 + } + features: { + feature_name: "D7" + input_names: "F7" + embedding_dim:16 + feature_type: RawFeature + min_val: 0.0 + max_val: 56311.0 + } + features: { + feature_name: "D8" + input_names: "F8" + embedding_dim:16 + feature_type: RawFeature + min_val: 0.0 + max_val: 6047.0 + } + features: { + feature_name: "D9" + input_names: "F9" + embedding_dim:16 + feature_type: RawFeature + min_val: 0.0 + max_val: 29019.0 + } + features: { + feature_name: "D10" + input_names: "F10" + embedding_dim:16 + feature_type: RawFeature + min_val: 0.0 + max_val: 46.0 + } + features: { + feature_name: "D11" + input_names: "F11" + embedding_dim:16 + feature_type: RawFeature + min_val: 0.0 + max_val: 231.0 + } + features: { + feature_name: "D12" + input_names: "F12" + embedding_dim:16 + feature_type: RawFeature + min_val: 0.0 + max_val: 4008.0 + } + features: { + feature_name: "D13" + input_names: "F13" + embedding_dim:16 + feature_type: RawFeature + min_val: 0.0 + max_val: 7393.0 + } +} +model_config: { + model_name: 'DeepFM with Periodic' + model_class: 'RankModel' + feature_groups: { + group_name: "numerical_features" + feature_names: "F1" + feature_names: "F2" + feature_names: "F3" + feature_names: "F4" + feature_names: "F5" + feature_names: "F6" + feature_names: "F7" + feature_names: "F8" + feature_names: "F9" + feature_names: "F10" + feature_names: "F11" + feature_names: "F12" + feature_names: "F13" + wide_deep:DEEP + } + feature_groups: { + group_name: "categorical_features" + feature_names: "C1" + feature_names: "C2" + feature_names: "C3" + feature_names: "C4" + 
feature_names: "C5" + feature_names: "C6" + feature_names: "C7" + feature_names: "C8" + feature_names: "C9" + feature_names: "C10" + feature_names: "C11" + feature_names: "C12" + feature_names: "C13" + feature_names: "C14" + feature_names: "C15" + feature_names: "C16" + feature_names: "C17" + feature_names: "C18" + feature_names: "C19" + feature_names: "C20" + feature_names: "C21" + feature_names: "C22" + feature_names: "C23" + feature_names: "C24" + feature_names: "C25" + feature_names: "C26" + wide_deep:DEEP + } + feature_groups: { + group_name: "wide_features" + feature_names: "D1" + feature_names: "D2" + feature_names: "D3" + feature_names: "D4" + feature_names: "D5" + feature_names: "D6" + feature_names: "D7" + feature_names: "D8" + feature_names: "D9" + feature_names: "D10" + feature_names: "D11" + feature_names: "D12" + feature_names: "D13" + feature_names: "C1" + feature_names: "C2" + feature_names: "C3" + feature_names: "C4" + feature_names: "C5" + feature_names: "C6" + feature_names: "C7" + feature_names: "C8" + feature_names: "C9" + feature_names: "C10" + feature_names: "C11" + feature_names: "C12" + feature_names: "C13" + feature_names: "C14" + feature_names: "C15" + feature_names: "C16" + feature_names: "C17" + feature_names: "C18" + feature_names: "C19" + feature_names: "C20" + feature_names: "C21" + feature_names: "C22" + feature_names: "C23" + feature_names: "C24" + feature_names: "C25" + feature_names: "C26" + wide_deep:WIDE + } + backbone { + blocks { + name: 'wide_features' + inputs { + feature_group_name: 'wide_features' + } + input_layer { + wide_output_dim: 1 + } + } + blocks { + name: 'wide_logit' + inputs { + block_name: 'wide_features' + } + lambda { + expression: 'lambda x: tf.reduce_sum(x, axis=1, keepdims=True)' + } + } + blocks { + name: 'num_emb' + inputs { + feature_group_name: 'numerical_features' + } + keras_layer { + class_name: 'PeriodicEmbedding' + periodic_embedding { + embedding_dim: 16 + sigma: 0.005 + output_tensor_list: true 
+ } + } + } + blocks { + name: 'categorical_features' + inputs { + feature_group_name: 'categorical_features' + } + input_layer { + output_2d_tensor_and_feature_list: true + } + } + blocks { + name: 'fm' + inputs { + block_name: 'categorical_features' + input_slice: '[1]' + } + inputs { + block_name: 'num_emb' + input_slice: '[1]' + } + keras_layer { + class_name: 'FM' + fm { + use_variant: true + } + } + } + blocks { + name: 'deep' + inputs { + block_name: 'categorical_features' + input_slice: '[0]' + } + inputs { + block_name: 'num_emb' + input_slice: '[0]' + } + keras_layer { + class_name: 'MLP' + mlp { + hidden_units: [256, 128, 64] + } + } + } + concat_blocks: ['wide_logit', 'fm', 'deep'] + top_mlp { + hidden_units: [256, 128, 64] + } + } + model_params { + l2_regularization: 1e-5 + } + embedding_regularization: 1e-5 +} diff --git a/examples/configs/deepfm_backbone_on_movielens.config b/examples/configs/deepfm_backbone_on_movielens.config new file mode 100644 index 000000000..56f210b10 --- /dev/null +++ b/examples/configs/deepfm_backbone_on_movielens.config @@ -0,0 +1,246 @@ +train_input_path: "examples/data/movielens_1m/movies_train_data" +eval_input_path: "examples/data/movielens_1m/movies_test_data" +model_dir: "examples/ckpt/deepfm_backbone_movieslen" + +train_config { + log_step_count_steps: 100 + optimizer_config: { + adam_optimizer: { + learning_rate: { + exponential_decay_learning_rate { + initial_learning_rate: 0.001 + decay_steps: 1000 + decay_factor: 0.5 + min_learning_rate: 0.00001 + } + } + } + use_moving_average: false + } + save_checkpoints_steps: 2000 + sync_replicas: True +} + +eval_config { + metrics_set: { + auc {} + } + metrics_set: { + gauc { + uid_field: 'user_id' + } + } + metrics_set: { + max_f1 {} + } +} + +data_config { + input_fields { + input_name:'label' + input_type: INT32 + } + input_fields { + input_name:'user_id' + input_type: INT32 + } + input_fields { + input_name: 'movie_id' + input_type: INT32 + } + input_fields { + 
input_name:'rating' + input_type: INT32 + } + input_fields { + input_name: 'gender' + input_type: INT32 + } + input_fields { + input_name: 'age' + input_type: INT32 + } + input_fields { + input_name: 'job_id' + input_type: INT32 + } + input_fields { + input_name: 'zip_id' + input_type: STRING + } + input_fields { + input_name: 'title' + input_type: STRING + } + input_fields { + input_name: 'genres' + input_type: STRING + } + input_fields { + input_name: 'year' + input_type: INT32 + } + + label_fields: 'label' + batch_size: 1024 + num_epochs: 1 + prefetch_size: 32 + input_type: CSVInput + separator: '\t' +} + +feature_config: { + features: { + input_names: 'user_id' + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 12000 + } + features: { + input_names: 'movie_id' + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 6000 + } + features: { + input_names: 'gender' + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 2 + } + features: { + input_names: 'job_id' + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 21 + } + features: { + input_names: 'age' + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 7 + } + features: { + input_names: 'genres' + feature_type: TagFeature + separator: '|' + embedding_dim: 16 + hash_bucket_size: 100 + } + features: { + input_names: 'title' + feature_type: SequenceFeature + separator: ' ' + embedding_dim: 16 + hash_bucket_size: 10000 + sequence_combiner: { + text_cnn: { + filter_sizes: [2, 3, 4] + num_filters: [8, 4, 4] + } + } + } + features: { + input_names: 'year' + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 36 + } +} +model_config: { + model_name: 'DeepFM' + model_class: 'RankModel' + feature_groups: { + group_name: 'wide' + feature_names: 'user_id' + feature_names: 'movie_id' + feature_names: 'job_id' + feature_names: 'age' + feature_names: 'gender' + feature_names: 'year' + feature_names: 'genres' + wide_deep: WIDE + } + feature_groups: { + group_name: 
'features' + feature_names: 'user_id' + feature_names: 'movie_id' + feature_names: 'job_id' + feature_names: 'age' + feature_names: 'gender' + feature_names: 'year' + feature_names: 'genres' + feature_names: 'title' + wide_deep: DEEP + } + backbone { + blocks { + name: 'wide' + inputs { + feature_group_name: 'wide' + } + input_layer { + wide_output_dim: 1 + } + } + blocks { + name: 'features' + inputs { + feature_group_name: 'features' + } + input_layer { + output_2d_tensor_and_feature_list: true + } + } + blocks { + name: 'fm' + inputs { + block_name: 'features' + input_slice: '[1]' + } + keras_layer { + class_name: 'FM' + } + } + blocks { + name: 'deep' + inputs { + block_name: 'features' + input_slice: '[0]' + } + keras_layer { + class_name: 'MLP' + mlp { + hidden_units: [256, 128, 64, 1] + use_final_bn: false + final_activation: 'linear' + } + } + } + blocks { + name: 'add' + inputs { + block_name: 'wide' + input_fn: 'lambda x: tf.reduce_sum(x, axis=1, keepdims=True)' + } + inputs { + block_name: 'fm' + } + inputs { + block_name: 'deep' + } + merge_inputs_into_list: true + keras_layer { + class_name: 'Add' + } + } + concat_blocks: 'add' + } + model_params { + l2_regularization: 1e-4 + } + embedding_regularization: 1e-4 +} +export_config { + multi_placeholder: false +} diff --git a/examples/configs/dlrm_backbone_on_criteo.config b/examples/configs/dlrm_backbone_on_criteo.config new file mode 100644 index 000000000..bb7b2a92f --- /dev/null +++ b/examples/configs/dlrm_backbone_on_criteo.config @@ -0,0 +1,578 @@ +# align with raw dlrm model +train_input_path: "examples/data/criteo/criteo_train_data" +eval_input_path: "examples/data/criteo/criteo_test_data" +model_dir: "examples/ckpt/dlrm_backbone_criteo" + +train_config { + log_step_count_steps: 500 + optimizer_config: { + adam_optimizer: { + learning_rate: { + exponential_decay_learning_rate { + initial_learning_rate: 0.001 + decay_steps: 1000 + decay_factor: 0.5 + min_learning_rate: 0.00001 + } + } + } + 
use_moving_average: false + } + save_checkpoints_steps: 20000 + sync_replicas: True +} + +eval_config { + metrics_set: { + auc {} + } +} + +data_config { + separator: "\t" + input_fields: { + input_name: "label" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F1" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F2" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F3" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F4" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F5" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F6" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F7" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F8" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F9" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F10" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F11" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F12" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F13" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "C1" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C2" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C3" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C4" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C5" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C6" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C7" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C8" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C9" + input_type: STRING + default_val:"" + } + 
input_fields: { + input_name: "C10" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C11" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C12" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C13" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C14" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C15" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C16" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C17" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C18" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C19" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C20" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C21" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C22" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C23" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C24" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C25" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C26" + input_type: STRING + default_val:"" + } + label_fields: "label" + + batch_size: 4096 + num_epochs: 1 + prefetch_size: 32 + input_type: CSVInput +} + +feature_config: { + features: { + input_names: "F1" + feature_type: RawFeature + min_val:0.0 + max_val: 5775.0 + } + features: { + input_names: "F2" + feature_type: RawFeature + min_val: -3.0 + max_val: 257675.0 + } + features: { + input_names: "F3" + feature_type: RawFeature + min_val: 0.0 + max_val: 65535.0 + } + features: { + input_names: "F4" + feature_type: RawFeature + min_val: 0.0 + max_val: 969.0 + } + features: { + input_names: "F5" + feature_type: RawFeature + min_val: 0.0 + max_val: 23159456.0 + } + features: { + 
input_names: "F6" + feature_type: RawFeature + min_val: 0.0 + max_val: 431037.0 + } + features: { + input_names: "F7" + feature_type: RawFeature + min_val: 0.0 + max_val: 56311.0 + } + features: { + input_names: "F8" + feature_type: RawFeature + min_val: 0.0 + max_val: 6047.0 + } + features: { + input_names: "F9" + feature_type: RawFeature + min_val: 0.0 + max_val: 29019.0 + } + features: { + input_names: "F10" + feature_type: RawFeature + min_val: 0.0 + max_val: 46.0 + } + features: { + input_names: "F11" + feature_type: RawFeature + min_val: 0.0 + max_val: 231.0 + } + features: { + input_names: "F12" + feature_type: RawFeature + min_val: 0.0 + max_val: 4008.0 + } + features: { + input_names: "F13" + feature_type: RawFeature + min_val: 0.0 + max_val: 7393.0 + } + features: { + input_names: "C1" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C2" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C3" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C4" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C5" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C6" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C7" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C8" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C9" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C10" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C11" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 
16 + } + features: { + input_names: "C12" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C13" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C14" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C15" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C16" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C17" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C18" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C19" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C20" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C21" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C22" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C23" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C24" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C25" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C26" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } +} +model_config: { + model_name: 'DLRM' + model_class: 'RankModel' + feature_groups: { + group_name: "dense" + feature_names: "F1" + feature_names: "F2" + feature_names: "F3" + feature_names: "F4" + feature_names: "F5" + feature_names: "F6" + feature_names: "F7" + feature_names: "F8" + 
feature_names: "F9" + feature_names: "F10" + feature_names: "F11" + feature_names: "F12" + feature_names: "F13" + wide_deep:DEEP + } + feature_groups: { + group_name: "sparse" + feature_names: "C1" + feature_names: "C2" + feature_names: "C3" + feature_names: "C4" + feature_names: "C5" + feature_names: "C6" + feature_names: "C7" + feature_names: "C8" + feature_names: "C9" + feature_names: "C10" + feature_names: "C11" + feature_names: "C12" + feature_names: "C13" + feature_names: "C14" + feature_names: "C15" + feature_names: "C16" + feature_names: "C17" + feature_names: "C18" + feature_names: "C19" + feature_names: "C20" + feature_names: "C21" + feature_names: "C22" + feature_names: "C23" + feature_names: "C24" + feature_names: "C25" + feature_names: "C26" + wide_deep:DEEP + } + backbone { + blocks { + name: 'bottom_mlp' + inputs { + feature_group_name: 'dense' + } + keras_layer { + class_name: 'MLP' + mlp { + hidden_units: [64, 32, 16] + } + } + } + blocks { + name: 'sparse' + inputs { + feature_group_name: 'sparse' + } + input_layer { + output_2d_tensor_and_feature_list: true + } + } + blocks { + name: 'dot' + inputs { + block_name: 'bottom_mlp' + input_fn: 'lambda x: [x]' + } + inputs { + block_name: 'sparse' + input_fn: 'lambda x: x[1]' + } + keras_layer { + class_name: 'DotInteraction' + } + } + blocks { + name: 'sparse_2d' + inputs { + block_name: 'sparse' + input_fn: 'lambda x: x[0]' + } + } + concat_blocks: ['sparse_2d', 'dot'] + top_mlp { + hidden_units: [256, 128, 64] + } + } + model_params { + l2_regularization: 1e-5 + } + embedding_regularization: 1e-5 +} diff --git a/examples/configs/dlrm_on_criteo.config b/examples/configs/dlrm_on_criteo.config new file mode 100644 index 000000000..e6c45d574 --- /dev/null +++ b/examples/configs/dlrm_on_criteo.config @@ -0,0 +1,534 @@ +train_input_path: "examples/data/criteo/criteo_train_data" +eval_input_path: "examples/data/criteo/criteo_test_data" +model_dir: "examples/ckpt/dlrm_criteo_ckpt" + +train_config { + 
log_step_count_steps: 500 + optimizer_config: { + adam_optimizer: { + learning_rate: { + exponential_decay_learning_rate { + initial_learning_rate: 0.001 + decay_steps: 1000 + decay_factor: 0.5 + min_learning_rate: 0.00001 + } + } + } + use_moving_average: false + } + save_checkpoints_steps: 20000 + sync_replicas: True +} + +eval_config { + metrics_set: { + auc {} + } +} + +data_config { + separator: "\t" + input_fields: { + input_name: "label" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F1" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F2" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F3" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F4" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F5" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F6" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F7" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F8" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F9" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F10" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F11" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F12" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F13" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "C1" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C2" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C3" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C4" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C5" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C6" + input_type: STRING + default_val:"" + } + 
input_fields: { + input_name: "C7" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C8" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C9" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C10" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C11" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C12" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C13" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C14" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C15" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C16" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C17" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C18" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C19" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C20" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C21" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C22" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C23" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C24" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C25" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C26" + input_type: STRING + default_val:"" + } + label_fields: "label" + + batch_size: 4096 + num_epochs: 1 + prefetch_size: 32 + input_type: CSVInput +} + +feature_config: { + features: { + input_names: "F1" + feature_type: RawFeature + min_val:0.0 + max_val: 5775.0 + } + features: { + input_names: "F2" + feature_type: RawFeature + min_val: -3.0 + max_val: 257675.0 + } + features: { + input_names: "F3" + feature_type: RawFeature + min_val: 0.0 + 
max_val: 65535.0 + } + features: { + input_names: "F4" + feature_type: RawFeature + min_val: 0.0 + max_val: 969.0 + } + features: { + input_names: "F5" + feature_type: RawFeature + min_val: 0.0 + max_val: 23159456.0 + } + features: { + input_names: "F6" + feature_type: RawFeature + min_val: 0.0 + max_val: 431037.0 + } + features: { + input_names: "F7" + feature_type: RawFeature + min_val: 0.0 + max_val: 56311.0 + } + features: { + input_names: "F8" + feature_type: RawFeature + min_val: 0.0 + max_val: 6047.0 + } + features: { + input_names: "F9" + feature_type: RawFeature + min_val: 0.0 + max_val: 29019.0 + } + features: { + input_names: "F10" + feature_type: RawFeature + min_val: 0.0 + max_val: 46.0 + } + features: { + input_names: "F11" + feature_type: RawFeature + min_val: 0.0 + max_val: 231.0 + } + features: { + input_names: "F12" + feature_type: RawFeature + min_val: 0.0 + max_val: 4008.0 + } + features: { + input_names: "F13" + feature_type: RawFeature + min_val: 0.0 + max_val: 7393.0 + } + features: { + input_names: "C1" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C2" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C3" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C4" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C5" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C6" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C7" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C8" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C9" + hash_bucket_size: 1000000 + feature_type: IdFeature + 
embedding_dim: 16 + } + features: { + input_names: "C10" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C11" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C12" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C13" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C14" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C15" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C16" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C17" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C18" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C19" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C20" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C21" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C22" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C23" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C24" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C25" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C26" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } +} +model_config: { + model_class: 'DLRM' + feature_groups: 
{ + group_name: "dense" + feature_names: "F1" + feature_names: "F2" + feature_names: "F3" + feature_names: "F4" + feature_names: "F5" + feature_names: "F6" + feature_names: "F7" + feature_names: "F8" + feature_names: "F9" + feature_names: "F10" + feature_names: "F11" + feature_names: "F12" + feature_names: "F13" + wide_deep:DEEP + } + feature_groups: { + group_name: "sparse" + feature_names: "C1" + feature_names: "C2" + feature_names: "C3" + feature_names: "C4" + feature_names: "C5" + feature_names: "C6" + feature_names: "C7" + feature_names: "C8" + feature_names: "C9" + feature_names: "C10" + feature_names: "C11" + feature_names: "C12" + feature_names: "C13" + feature_names: "C14" + feature_names: "C15" + feature_names: "C16" + feature_names: "C17" + feature_names: "C18" + feature_names: "C19" + feature_names: "C20" + feature_names: "C21" + feature_names: "C22" + feature_names: "C23" + feature_names: "C24" + feature_names: "C25" + feature_names: "C26" + wide_deep:DEEP + } + dlrm { + bot_dnn { + hidden_units: [64, 32, 16] + } + top_dnn { + hidden_units: [256, 128, 64] + } + l2_regularization: 1e-5 + } + embedding_regularization: 1e-5 +} diff --git a/examples/configs/dlrm_on_criteo_with_autodis.config b/examples/configs/dlrm_on_criteo_with_autodis.config new file mode 100644 index 000000000..53de6a279 --- /dev/null +++ b/examples/configs/dlrm_on_criteo_with_autodis.config @@ -0,0 +1,587 @@ +train_input_path: "examples/data/criteo/criteo_train_data" +eval_input_path: "examples/data/criteo/criteo_test_data" +model_dir: "examples/ckpt/dlrm_autodis_criteo" + +train_config { + log_step_count_steps: 500 + optimizer_config: { + adam_optimizer: { + learning_rate: { + exponential_decay_learning_rate { + initial_learning_rate: 0.001 + decay_steps: 1000 + decay_factor: 0.5 + min_learning_rate: 0.00001 + } + } + } + use_moving_average: false + } + save_checkpoints_steps: 20000 + sync_replicas: True +} + +eval_config { + metrics_set: { + auc {} + } +} + +data_config { + 
separator: "\t" + input_fields: { + input_name: "label" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F1" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F2" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F3" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F4" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F5" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F6" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F7" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F8" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F9" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F10" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F11" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F12" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F13" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "C1" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C2" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C3" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C4" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C5" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C6" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C7" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C8" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C9" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C10" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C11" + input_type: STRING + default_val:"" + } 
+ input_fields: { + input_name: "C12" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C13" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C14" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C15" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C16" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C17" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C18" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C19" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C20" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C21" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C22" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C23" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C24" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C25" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C26" + input_type: STRING + default_val:"" + } + label_fields: "label" + + batch_size: 4096 + num_epochs: 1 + prefetch_size: 32 + input_type: CSVInput +} + +feature_config: { + features: { + input_names: "F1" + feature_type: RawFeature + min_val:0.0 + max_val: 5775.0 + } + features: { + input_names: "F2" + feature_type: RawFeature + min_val: -3.0 + max_val: 257675.0 + } + features: { + input_names: "F3" + feature_type: RawFeature + min_val: 0.0 + max_val: 65535.0 + } + features: { + input_names: "F4" + feature_type: RawFeature + min_val: 0.0 + max_val: 969.0 + } + features: { + input_names: "F5" + feature_type: RawFeature + min_val: 0.0 + max_val: 23159456.0 + } + features: { + input_names: "F6" + feature_type: RawFeature + min_val: 0.0 + max_val: 431037.0 + } + features: { + input_names: "F7" + feature_type: RawFeature + min_val: 0.0 
+ max_val: 56311.0 + } + features: { + input_names: "F8" + feature_type: RawFeature + min_val: 0.0 + max_val: 6047.0 + } + features: { + input_names: "F9" + feature_type: RawFeature + min_val: 0.0 + max_val: 29019.0 + } + features: { + input_names: "F10" + feature_type: RawFeature + min_val: 0.0 + max_val: 46.0 + } + features: { + input_names: "F11" + feature_type: RawFeature + min_val: 0.0 + max_val: 231.0 + } + features: { + input_names: "F12" + feature_type: RawFeature + min_val: 0.0 + max_val: 4008.0 + } + features: { + input_names: "F13" + feature_type: RawFeature + min_val: 0.0 + max_val: 7393.0 + } + features: { + input_names: "C1" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C2" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C3" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C4" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C5" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C6" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C7" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C8" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C9" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C10" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C11" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C12" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C13" + 
hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C14" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C15" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C16" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C17" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C18" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C19" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C20" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C21" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C22" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C23" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C24" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C25" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C26" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } +} +model_config: { + model_name: 'DLRM with autodis' + model_class: 'RankModel' + feature_groups: { + group_name: "dense" + feature_names: "F1" + feature_names: "F2" + feature_names: "F3" + feature_names: "F4" + feature_names: "F5" + feature_names: "F6" + feature_names: "F7" + feature_names: "F8" + feature_names: "F9" + feature_names: "F10" + feature_names: "F11" + feature_names: "F12" + feature_names: "F13" + wide_deep:DEEP + } + feature_groups: { + 
group_name: "sparse" + feature_names: "C1" + feature_names: "C2" + feature_names: "C3" + feature_names: "C4" + feature_names: "C5" + feature_names: "C6" + feature_names: "C7" + feature_names: "C8" + feature_names: "C9" + feature_names: "C10" + feature_names: "C11" + feature_names: "C12" + feature_names: "C13" + feature_names: "C14" + feature_names: "C15" + feature_names: "C16" + feature_names: "C17" + feature_names: "C18" + feature_names: "C19" + feature_names: "C20" + feature_names: "C21" + feature_names: "C22" + feature_names: "C23" + feature_names: "C24" + feature_names: "C25" + feature_names: "C26" + wide_deep:DEEP + } + backbone { + blocks { + name: 'num_emb' + inputs { + feature_group_name: 'dense' + } + keras_layer { + class_name: 'AutoDisEmbedding' + auto_dis_embedding { + embedding_dim: 16 + num_bins: 40 + temperature: 0.815 + output_tensor_list: true + } + } + } + blocks { + name: 'sparse' + inputs { + feature_group_name: 'sparse' + } + input_layer { + output_2d_tensor_and_feature_list: true + } + } + blocks { + name: 'dot' + inputs { + block_name: 'num_emb' + input_slice: '[1]' + } + inputs { + block_name: 'sparse' + input_slice: '[1]' + } + keras_layer { + class_name: 'DotInteraction' + } + } + blocks { + name: 'sparse_2d' + inputs { + block_name: 'sparse' + input_slice: '[0]' + } + } + blocks { + name: 'num_emb_2d' + inputs { + block_name: 'num_emb' + input_slice: '[0]' + } + } + concat_blocks: ['num_emb_2d', 'dot', 'sparse_2d'] + top_mlp { + hidden_units: [256, 128, 64] + } + } + model_params { + l2_regularization: 1e-5 + } + embedding_regularization: 1e-5 +} diff --git a/examples/configs/dlrm_on_criteo_with_periodic.config b/examples/configs/dlrm_on_criteo_with_periodic.config new file mode 100644 index 000000000..36c120e95 --- /dev/null +++ b/examples/configs/dlrm_on_criteo_with_periodic.config @@ -0,0 +1,595 @@ +train_input_path: "examples/data/criteo/criteo_train_data" +eval_input_path: "examples/data/criteo/criteo_test_data" +model_dir: 
"examples/ckpt/dlrm_periodic_criteo" + +train_config { + log_step_count_steps: 500 + optimizer_config: { + adam_optimizer: { + learning_rate: { + exponential_decay_learning_rate { + initial_learning_rate: 0.001 + decay_steps: 1000 + decay_factor: 0.5 + min_learning_rate: 0.00001 + } + } + } + use_moving_average: false + } + save_checkpoints_steps: 20000 + sync_replicas: True +} + +eval_config { + metrics_set: { + auc {} + } +} + +data_config { + separator: "\t" + input_fields: { + input_name: "label" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F1" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F2" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F3" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F4" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F5" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F6" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F7" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F8" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F9" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F10" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F11" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F12" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F13" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "C1" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C2" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C3" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C4" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C5" + input_type: STRING + default_val:"" + } + input_fields: { + 
input_name: "C6" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C7" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C8" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C9" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C10" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C11" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C12" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C13" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C14" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C15" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C16" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C17" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C18" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C19" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C20" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C21" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C22" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C23" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C24" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C25" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C26" + input_type: STRING + default_val:"" + } + label_fields: "label" + + batch_size: 4096 + num_epochs: 1 + prefetch_size: 32 + input_type: CSVInput +} + +feature_config: { + features: { + input_names: "F1" + feature_type: RawFeature + min_val:0.0 + max_val: 5775.0 + } + features: { + input_names: "F2" + feature_type: RawFeature + min_val: -3.0 + max_val: 257675.0 + } + features: { + 
input_names: "F3" + feature_type: RawFeature + min_val: 0.0 + max_val: 65535.0 + } + features: { + input_names: "F4" + feature_type: RawFeature + min_val: 0.0 + max_val: 969.0 + } + features: { + input_names: "F5" + feature_type: RawFeature + min_val: 0.0 + max_val: 23159456.0 + } + features: { + input_names: "F6" + feature_type: RawFeature + min_val: 0.0 + max_val: 431037.0 + } + features: { + input_names: "F7" + feature_type: RawFeature + min_val: 0.0 + max_val: 56311.0 + } + features: { + input_names: "F8" + feature_type: RawFeature + min_val: 0.0 + max_val: 6047.0 + } + features: { + input_names: "F9" + feature_type: RawFeature + min_val: 0.0 + max_val: 29019.0 + } + features: { + input_names: "F10" + feature_type: RawFeature + min_val: 0.0 + max_val: 46.0 + } + features: { + input_names: "F11" + feature_type: RawFeature + min_val: 0.0 + max_val: 231.0 + } + features: { + input_names: "F12" + feature_type: RawFeature + min_val: 0.0 + max_val: 4008.0 + } + features: { + input_names: "F13" + feature_type: RawFeature + min_val: 0.0 + max_val: 7393.0 + } + features: { + input_names: "C1" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C2" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C3" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C4" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C5" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C6" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C7" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C8" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: 
"C9" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C10" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C11" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C12" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C13" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C14" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C15" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C16" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C17" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C18" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C19" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C20" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C21" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C22" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C23" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C24" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C25" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C26" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + 
} +} +model_config: { + model_name: 'dlrm with periodic' + model_class: 'RankModel' + feature_groups: { + group_name: "dense" + feature_names: "F1" + feature_names: "F2" + feature_names: "F3" + feature_names: "F4" + feature_names: "F5" + feature_names: "F6" + feature_names: "F7" + feature_names: "F8" + feature_names: "F9" + feature_names: "F10" + feature_names: "F11" + feature_names: "F12" + feature_names: "F13" + wide_deep:DEEP + } + feature_groups: { + group_name: "sparse" + feature_names: "C1" + feature_names: "C2" + feature_names: "C3" + feature_names: "C4" + feature_names: "C5" + feature_names: "C6" + feature_names: "C7" + feature_names: "C8" + feature_names: "C9" + feature_names: "C10" + feature_names: "C11" + feature_names: "C12" + feature_names: "C13" + feature_names: "C14" + feature_names: "C15" + feature_names: "C16" + feature_names: "C17" + feature_names: "C18" + feature_names: "C19" + feature_names: "C20" + feature_names: "C21" + feature_names: "C22" + feature_names: "C23" + feature_names: "C24" + feature_names: "C25" + feature_names: "C26" + wide_deep:DEEP + } + backbone { + blocks { + name: 'num_emb' + inputs { + feature_group_name: 'dense' + } + keras_layer { + class_name: 'PeriodicEmbedding' + st_params { + fields { + key: "output_tensor_list" + value { bool_value: true } + } + fields { + key: "embedding_dim" + value { number_value: 16 } + } + fields { + key: "sigma" + value { number_value: 0.005 } + } + } + } + } + blocks { + name: 'sparse' + inputs { + feature_group_name: 'sparse' + } + input_layer { + output_2d_tensor_and_feature_list: true + } + } + blocks { + name: 'dot' + inputs { + block_name: 'num_emb' + input_slice: '[1]' + } + inputs { + block_name: 'sparse' + input_fn: 'lambda x: x[1]' + } + keras_layer { + class_name: 'DotInteraction' + } + } + blocks { + name: 'sparse_2d' + inputs { + block_name: 'sparse' + input_slice: '[0]' + } + } + blocks { + name: 'num_emb_2d' + inputs { + block_name: 'num_emb' + input_fn: 'lambda x: x[0]' + } + } 
+ concat_blocks: ['num_emb_2d', 'dot', 'sparse_2d'] + top_mlp { + hidden_units: [256, 128, 64] + } + } + model_params { + l2_regularization: 1e-5 + } + embedding_regularization: 1e-5 +} diff --git a/examples/configs/dlrm_standard_on_criteo.config b/examples/configs/dlrm_standard_on_criteo.config new file mode 100644 index 000000000..720560693 --- /dev/null +++ b/examples/configs/dlrm_standard_on_criteo.config @@ -0,0 +1,568 @@ +train_input_path: "examples/data/criteo/criteo_train_data" +eval_input_path: "examples/data/criteo/criteo_test_data" +model_dir: "examples/ckpt/dlrm_standard_criteo" + +train_config { + log_step_count_steps: 500 + optimizer_config: { + adam_optimizer: { + learning_rate: { + exponential_decay_learning_rate { + initial_learning_rate: 0.001 + decay_steps: 1000 + decay_factor: 0.5 + min_learning_rate: 0.00001 + } + } + } + use_moving_average: false + } + save_checkpoints_steps: 20000 + sync_replicas: True +} + +eval_config { + metrics_set: { + auc {} + } +} + +data_config { + separator: "\t" + input_fields: { + input_name: "label" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F1" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F2" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F3" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F4" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F5" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F6" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F7" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F8" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F9" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F10" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F11" + input_type: FLOAT + default_val:"0" + } + input_fields: { + 
input_name: "F12" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "F13" + input_type: FLOAT + default_val:"0" + } + input_fields: { + input_name: "C1" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C2" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C3" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C4" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C5" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C6" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C7" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C8" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C9" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C10" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C11" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C12" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C13" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C14" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C15" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C16" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C17" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C18" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C19" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C20" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C21" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C22" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C23" + input_type: STRING + default_val:"" + } + input_fields: { + 
input_name: "C24" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C25" + input_type: STRING + default_val:"" + } + input_fields: { + input_name: "C26" + input_type: STRING + default_val:"" + } + label_fields: "label" + + batch_size: 4096 + num_epochs: 1 + prefetch_size: 32 + input_type: CSVInput +} + +feature_config: { + features: { + input_names: "F1" + feature_type: RawFeature + min_val:0.0 + max_val: 5775.0 + } + features: { + input_names: "F2" + feature_type: RawFeature + min_val: -3.0 + max_val: 257675.0 + } + features: { + input_names: "F3" + feature_type: RawFeature + min_val: 0.0 + max_val: 65535.0 + } + features: { + input_names: "F4" + feature_type: RawFeature + min_val: 0.0 + max_val: 969.0 + } + features: { + input_names: "F5" + feature_type: RawFeature + min_val: 0.0 + max_val: 23159456.0 + } + features: { + input_names: "F6" + feature_type: RawFeature + min_val: 0.0 + max_val: 431037.0 + } + features: { + input_names: "F7" + feature_type: RawFeature + min_val: 0.0 + max_val: 56311.0 + } + features: { + input_names: "F8" + feature_type: RawFeature + min_val: 0.0 + max_val: 6047.0 + } + features: { + input_names: "F9" + feature_type: RawFeature + min_val: 0.0 + max_val: 29019.0 + } + features: { + input_names: "F10" + feature_type: RawFeature + min_val: 0.0 + max_val: 46.0 + } + features: { + input_names: "F11" + feature_type: RawFeature + min_val: 0.0 + max_val: 231.0 + } + features: { + input_names: "F12" + feature_type: RawFeature + min_val: 0.0 + max_val: 4008.0 + } + features: { + input_names: "F13" + feature_type: RawFeature + min_val: 0.0 + max_val: 7393.0 + } + features: { + input_names: "C1" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C2" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C3" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: 
"C4" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C5" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C6" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C7" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C8" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C9" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C10" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C11" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C12" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C13" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C14" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C15" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C16" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C17" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C18" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C19" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C20" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C21" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } 
+ features: { + input_names: "C22" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C23" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C24" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C25" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } + features: { + input_names: "C26" + hash_bucket_size: 1000000 + feature_type: IdFeature + embedding_dim: 16 + } +} +model_config: { + model_name: 'Standard DLRM' + model_class: 'RankModel' + feature_groups: { + group_name: "dense" + feature_names: "F1" + feature_names: "F2" + feature_names: "F3" + feature_names: "F4" + feature_names: "F5" + feature_names: "F6" + feature_names: "F7" + feature_names: "F8" + feature_names: "F9" + feature_names: "F10" + feature_names: "F11" + feature_names: "F12" + feature_names: "F13" + wide_deep: DEEP + } + feature_groups: { + group_name: "sparse" + feature_names: "C1" + feature_names: "C2" + feature_names: "C3" + feature_names: "C4" + feature_names: "C5" + feature_names: "C6" + feature_names: "C7" + feature_names: "C8" + feature_names: "C9" + feature_names: "C10" + feature_names: "C11" + feature_names: "C12" + feature_names: "C13" + feature_names: "C14" + feature_names: "C15" + feature_names: "C16" + feature_names: "C17" + feature_names: "C18" + feature_names: "C19" + feature_names: "C20" + feature_names: "C21" + feature_names: "C22" + feature_names: "C23" + feature_names: "C24" + feature_names: "C25" + feature_names: "C26" + wide_deep: DEEP + } + backbone { + blocks { + name: 'bottom_mlp' + inputs { + feature_group_name: 'dense' + } + keras_layer { + class_name: 'MLP' + mlp { + hidden_units: [64, 32, 16] + } + } + } + blocks { + name: 'sparse' + inputs { + feature_group_name: 'sparse' + } + input_layer { + only_output_feature_list: true + } + } + blocks { + name: 'dot' + inputs { +
block_name: 'bottom_mlp' + } + inputs { + block_name: 'sparse' + } + keras_layer { + class_name: 'DotInteraction' + } + } + concat_blocks: ['bottom_mlp', 'dot'] + top_mlp { + hidden_units: [256, 128, 64] + } + } + model_params { + l2_regularization: 1e-5 + } + embedding_regularization: 1e-5 +} diff --git a/examples/configs/fibinet_on_movielens.config b/examples/configs/fibinet_on_movielens.config new file mode 100644 index 000000000..9c583354b --- /dev/null +++ b/examples/configs/fibinet_on_movielens.config @@ -0,0 +1,204 @@ +train_input_path: "examples/data/movielens_1m/movies_train_data" +eval_input_path: "examples/data/movielens_1m/movies_test_data" +model_dir: "examples/ckpt/fibinet_on_movieslen_ckpt" + +train_config { + log_step_count_steps: 100 + optimizer_config: { + adam_optimizer: { + learning_rate: { + exponential_decay_learning_rate { + initial_learning_rate: 0.001 + decay_steps: 1000 + decay_factor: 0.5 + min_learning_rate: 0.00001 + } + } + } + use_moving_average: false + } + save_checkpoints_steps: 2000 + sync_replicas: False +} + +eval_config { + metrics_set: { + auc {} + } + metrics_set: { + gauc { + uid_field: 'user_id' + } + } + metrics_set: { + max_f1 {} + } +} + +data_config { + input_fields { + input_name:'label' + input_type: INT32 + } + input_fields { + input_name:'user_id' + input_type: INT32 + } + input_fields { + input_name: 'movie_id' + input_type: INT32 + } + input_fields { + input_name:'rating' + input_type: INT32 + } + input_fields { + input_name: 'gender' + input_type: INT32 + } + input_fields { + input_name: 'age' + input_type: INT32 + } + input_fields { + input_name: 'job_id' + input_type: INT32 + } + input_fields { + input_name: 'zip_id' + input_type: STRING + } + input_fields { + input_name: 'title' + input_type: STRING + } + input_fields { + input_name: 'genres' + input_type: STRING + } + input_fields { + input_name: 'year' + input_type: INT32 + } + + label_fields: 'label' + batch_size: 1024 + num_epochs: 1 + prefetch_size: 32 + 
input_type: CSVInput + separator: '\t' +} + +feature_config: { + features: { + input_names: 'user_id' + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 12000 + } + features: { + input_names: 'movie_id' + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 6000 + } + features: { + input_names: 'gender' + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 2 + } + features: { + input_names: 'job_id' + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 21 + } + features: { + input_names: 'age' + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 7 + } + features: { + input_names: 'genres' + feature_type: TagFeature + separator: '|' + embedding_dim: 16 + hash_bucket_size: 100 + } + features: { + input_names: 'title' + feature_type: SequenceFeature + separator: ' ' + embedding_dim: 16 + hash_bucket_size: 10000 + sequence_combiner: { + text_cnn: { + filter_sizes: [2, 3, 4] + num_filters: [16, 8, 8] + } + } + } + features: { + input_names: 'year' + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 36 + } +} +model_config: { + model_name: 'FiBiNet' + model_class: 'RankModel' + feature_groups: { + group_name: 'all' + feature_names: 'user_id' + feature_names: 'movie_id' + feature_names: 'job_id' + feature_names: 'age' + feature_names: 'gender' + feature_names: 'year' + feature_names: 'genres' + wide_deep: DEEP + } + backbone { + blocks { + name: "all" + inputs { + feature_group_name: "all" + } + input_layer { + do_batch_norm: true + only_output_feature_list: true + } + } + blocks { + name: "fibinet" + inputs { + block_name: "all" + } + keras_layer { + class_name: 'FiBiNet' + fibinet { + senet { + reduction_ratio: 4 + } + bilinear { + type: 'each' + num_output_units: 512 + } + mlp { + hidden_units: [512, 256] + } + } + } + } + concat_blocks: ['fibinet'] + } + model_params { + } + embedding_regularization: 1e-4 +} +export_config { + multi_placeholder: false +} diff --git a/examples/configs/masknet_on_movielens.config 
b/examples/configs/masknet_on_movielens.config new file mode 100644 index 000000000..2d3c0da21 --- /dev/null +++ b/examples/configs/masknet_on_movielens.config @@ -0,0 +1,200 @@ +train_input_path: "examples/data/movielens_1m/movies_train_data" +eval_input_path: "examples/data/movielens_1m/movies_test_data" +model_dir: "examples/ckpt/masknet_on_movieslen_ckpt" + +train_config { + log_step_count_steps: 100 + optimizer_config: { + adam_optimizer: { + learning_rate: { + exponential_decay_learning_rate { + initial_learning_rate: 0.001 + decay_steps: 1000 + decay_factor: 0.5 + min_learning_rate: 0.00001 + } + } + } + use_moving_average: false + } + save_checkpoints_steps: 2000 + sync_replicas: True +} + +eval_config { + metrics_set: { + auc {} + } + metrics_set: { + gauc { + uid_field: 'user_id' + } + } + metrics_set: { + max_f1 {} + } +} + +data_config { + input_fields { + input_name:'label' + input_type: INT32 + } + input_fields { + input_name:'user_id' + input_type: INT32 + } + input_fields { + input_name: 'movie_id' + input_type: INT32 + } + input_fields { + input_name:'rating' + input_type: INT32 + } + input_fields { + input_name: 'gender' + input_type: INT32 + } + input_fields { + input_name: 'age' + input_type: INT32 + } + input_fields { + input_name: 'job_id' + input_type: INT32 + } + input_fields { + input_name: 'zip_id' + input_type: STRING + } + input_fields { + input_name: 'title' + input_type: STRING + } + input_fields { + input_name: 'genres' + input_type: STRING + } + input_fields { + input_name: 'year' + input_type: INT32 + } + + label_fields: 'label' + batch_size: 1024 + num_epochs: 1 + prefetch_size: 32 + input_type: CSVInput + separator: '\t' +} + +feature_config: { + features: { + input_names: 'user_id' + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 12000 + } + features: { + input_names: 'movie_id' + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 6000 + } + features: { + input_names: 'gender' + feature_type: 
IdFeature + embedding_dim: 16 + num_buckets: 2 + } + features: { + input_names: 'job_id' + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 21 + } + features: { + input_names: 'age' + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 7 + } + features: { + input_names: 'genres' + feature_type: TagFeature + separator: '|' + embedding_dim: 16 + hash_bucket_size: 100 + } + features: { + input_names: 'title' + feature_type: SequenceFeature + separator: ' ' + embedding_dim: 16 + hash_bucket_size: 10000 + sequence_combiner: { + text_cnn: { + filter_sizes: [2, 3, 4] + num_filters: [16, 8, 8] + } + } + } + features: { + input_names: 'year' + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 36 + } +} +model_config: { + model_name: 'MaskNet' + model_class: 'RankModel' + feature_groups: { + group_name: 'all' + feature_names: 'user_id' + feature_names: 'movie_id' + feature_names: 'job_id' + feature_names: 'age' + feature_names: 'gender' + feature_names: 'year' + feature_names: 'genres' + wide_deep: DEEP + } + backbone { + blocks { + name: "mask_net" + inputs { + feature_group_name: "all" + } + keras_layer { + class_name: 'MaskNet' + masknet { + mask_blocks { + aggregation_size: 512 + output_size: 256 + } + mask_blocks { + aggregation_size: 512 + output_size: 256 + } + mask_blocks { + aggregation_size: 512 + output_size: 256 + } + mlp { + hidden_units: [512, 256] + } + } + } + } + concat_blocks: ['mask_net'] + } + model_params { + l2_regularization: 1e-5 + } + embedding_regularization: 1e-4 +} +export_config { + multi_placeholder: false +} diff --git a/examples/configs/mlp_on_movielens.config b/examples/configs/mlp_on_movielens.config new file mode 100644 index 000000000..4660125a3 --- /dev/null +++ b/examples/configs/mlp_on_movielens.config @@ -0,0 +1,239 @@ +train_input_path: "examples/data/movielens_1m/movies_train_data" +eval_input_path: "examples/data/movielens_1m/movies_test_data" +model_dir: "examples/ckpt/mlp_movieslen" + +train_config { + 
log_step_count_steps: 100 + optimizer_config: { + adam_optimizer: { + learning_rate: { + exponential_decay_learning_rate { + initial_learning_rate: 0.001 + decay_steps: 1000 + decay_factor: 0.5 + min_learning_rate: 0.00001 + } + } + } + use_moving_average: false + } + save_checkpoints_steps: 2000 + sync_replicas: True +} + +eval_config { + metrics_set: { + auc {} + } + metrics_set: { + gauc { + uid_field: 'user_id' + } + } + metrics_set: { + max_f1 {} + } +} + +data_config { + input_fields { + input_name:'label' + input_type: INT32 + } + input_fields { + input_name:'user_id' + input_type: INT32 + } + input_fields { + input_name: 'movie_id' + input_type: INT32 + } + input_fields { + input_name:'rating' + input_type: INT32 + } + input_fields { + input_name: 'gender' + input_type: INT32 + } + input_fields { + input_name: 'age' + input_type: INT32 + } + input_fields { + input_name: 'job_id' + input_type: INT32 + } + input_fields { + input_name: 'zip_id' + input_type: STRING + } + input_fields { + input_name: 'title' + input_type: STRING + } + input_fields { + input_name: 'genres' + input_type: STRING + } + input_fields { + input_name: 'year' + input_type: INT32 + } + + label_fields: 'label' + batch_size: 1024 + num_epochs: 1 + prefetch_size: 32 + input_type: CSVInput + separator: '\t' +} + +feature_config: { + features: { + input_names: 'user_id' + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 12000 + } + features: { + input_names: 'movie_id' + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 6000 + } + features: { + input_names: 'gender' + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 2 + } + features: { + input_names: 'job_id' + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 21 + } + features: { + input_names: 'age' + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 7 + } + features: { + input_names: 'genres' + feature_type: TagFeature + separator: '|' + embedding_dim: 16 + hash_bucket_size: 100 + } 
+ features: { + input_names: 'title' + feature_type: SequenceFeature + separator: ' ' + embedding_dim: 16 + hash_bucket_size: 10000 + sequence_combiner: { + text_cnn: { + filter_sizes: [2, 3, 4] + num_filters: [16, 8, 8] + } + } + } + features: { + input_names: 'year' + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 36 + } +} +model_config: { + model_class: "RankModel" + feature_groups: { + group_name: 'features' + feature_names: 'user_id' + feature_names: 'movie_id' + feature_names: 'job_id' + feature_names: 'age' + feature_names: 'gender' + feature_names: 'year' + feature_names: 'genres' + wide_deep: DEEP + } + backbone { + blocks { + name: 'mlp' + inputs { + feature_group_name: 'features' + } + layers { + keras_layer { + class_name: 'Dense' + st_params { + fields { + key: 'units' + value: { number_value: 256 } + } + fields { + key: 'activation' + value: { string_value: 'relu' } + } + } + } + } + layers { + keras_layer { + class_name: 'Dropout' + st_params { + fields { + key: 'rate' + value: { number_value: 0.5 } + } + } + } + } + layers { + keras_layer { + class_name: 'Dense' + st_params { + fields { + key: 'units' + value: { number_value: 256 } + } + fields { + key: 'activation' + value: { string_value: 'relu' } + } + } + } + } + layers { + keras_layer { + class_name: 'Dropout' + st_params { + fields { + key: 'rate' + value: { number_value: 0.5 } + } + } + } + } + layers { + keras_layer { + class_name: 'Dense' + st_params { + fields { + key: 'units' + value: { number_value: 1 } + } + } + } + } + } + concat_blocks: 'mlp' + } + model_params { + l2_regularization: 1e-4 + } + embedding_regularization: 1e-4 +} diff --git a/examples/configs/multi_tower_on_movielens.config b/examples/configs/multi_tower_on_movielens.config new file mode 100644 index 000000000..25cad1309 --- /dev/null +++ b/examples/configs/multi_tower_on_movielens.config @@ -0,0 +1,221 @@ +train_input_path: "examples/data/movielens_1m/movies_train_data" +eval_input_path: 
"examples/data/movielens_1m/movies_test_data" +model_dir: "examples/ckpt/multi_tower_movieslen" + +train_config { + log_step_count_steps: 100 + optimizer_config: { + adam_optimizer: { + learning_rate: { + exponential_decay_learning_rate { + initial_learning_rate: 0.001 + decay_steps: 1000 + decay_factor: 0.5 + min_learning_rate: 0.00001 + } + } + } + use_moving_average: false + } + save_checkpoints_steps: 2000 + sync_replicas: True +} + +eval_config { + metrics_set: { + auc {} + } + metrics_set: { + gauc { + uid_field: 'user_id' + } + } + metrics_set: { + max_f1 {} + } +} + +data_config { + input_fields { + input_name:'label' + input_type: INT32 + } + input_fields { + input_name:'user_id' + input_type: INT32 + } + input_fields { + input_name: 'movie_id' + input_type: INT32 + } + input_fields { + input_name:'rating' + input_type: INT32 + } + input_fields { + input_name: 'gender' + input_type: INT32 + } + input_fields { + input_name: 'age' + input_type: INT32 + } + input_fields { + input_name: 'job_id' + input_type: INT32 + } + input_fields { + input_name: 'zip_id' + input_type: STRING + } + input_fields { + input_name: 'title' + input_type: STRING + } + input_fields { + input_name: 'genres' + input_type: STRING + } + input_fields { + input_name: 'year' + input_type: INT32 + } + + label_fields: 'label' + batch_size: 1024 + num_epochs: 1 + prefetch_size: 32 + input_type: CSVInput + separator: '\t' +} + +feature_config: { + features: { + input_names: 'user_id' + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 12000 + } + features: { + input_names: 'movie_id' + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 6000 + } + features: { + input_names: 'gender' + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 2 + } + features: { + input_names: 'job_id' + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 21 + } + features: { + input_names: 'age' + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 7 + } + features: { 
+ input_names: 'genres' + feature_type: TagFeature + separator: '|' + embedding_dim: 16 + hash_bucket_size: 100 + } + features: { + input_names: 'title' + feature_type: SequenceFeature + separator: ' ' + embedding_dim: 16 + hash_bucket_size: 10000 + sequence_combiner: { + text_cnn: { + filter_sizes: [2, 3, 4] + num_filters: [16, 8, 8] + } + } + } + features: { + input_names: 'year' + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 36 + } +} +model_config: { + model_name: "multi tower" + model_class: "RankModel" + feature_groups: { + group_name: 'user' + feature_names: 'user_id' + feature_names: 'job_id' + feature_names: 'age' + feature_names: 'gender' + wide_deep: DEEP + } + feature_groups: { + group_name: 'item' + feature_names: 'movie_id' + feature_names: 'year' + feature_names: 'genres' + wide_deep: DEEP + } + backbone { + packages { + name: 'user_tower' + blocks { + name: 'mlp' + inputs { + feature_group_name: 'user' + } + keras_layer { + class_name: 'MLP' + mlp { + hidden_units: [256, 128] + } + } + } + } + packages { + name: 'item_tower' + blocks { + name: 'mlp' + inputs { + feature_group_name: 'item' + } + keras_layer { + class_name: 'MLP' + mlp { + hidden_units: [256, 128] + } + } + } + } + blocks { + name: 'top_mlp' + inputs { + package_name: 'user_tower' + } + inputs { + package_name: 'item_tower' + } + layers { + keras_layer { + class_name: 'MLP' + mlp { + hidden_units: [128, 64] + } + } + } + } + } + model_params { + l2_regularization: 1e-4 + } + embedding_regularization: 1e-4 +} diff --git a/examples/configs/wide_and_deep_backbone_on_movielens.config b/examples/configs/wide_and_deep_backbone_on_movielens.config new file mode 100644 index 000000000..d3c069611 --- /dev/null +++ b/examples/configs/wide_and_deep_backbone_on_movielens.config @@ -0,0 +1,219 @@ +train_input_path: "examples/data/movielens_1m/movies_train_data" +eval_input_path: "examples/data/movielens_1m/movies_test_data" +model_dir: "examples/ckpt/wide_and_deep_movieslen" + 
+train_config { + log_step_count_steps: 100 + optimizer_config: { + adam_optimizer: { + learning_rate: { + exponential_decay_learning_rate { + initial_learning_rate: 0.001 + decay_steps: 1000 + decay_factor: 0.5 + min_learning_rate: 0.00001 + } + } + } + use_moving_average: false + } + save_checkpoints_steps: 2000 + sync_replicas: True +} + +eval_config { + metrics_set: { + auc {} + } + metrics_set: { + gauc { + uid_field: 'user_id' + } + } + metrics_set: { + max_f1 {} + } +} + +data_config { + input_fields { + input_name:'label' + input_type: INT32 + } + input_fields { + input_name:'user_id' + input_type: INT32 + } + input_fields { + input_name: 'movie_id' + input_type: INT32 + } + input_fields { + input_name:'rating' + input_type: INT32 + } + input_fields { + input_name: 'gender' + input_type: INT32 + } + input_fields { + input_name: 'age' + input_type: INT32 + } + input_fields { + input_name: 'job_id' + input_type: INT32 + } + input_fields { + input_name: 'zip_id' + input_type: STRING + } + input_fields { + input_name: 'title' + input_type: STRING + } + input_fields { + input_name: 'genres' + input_type: STRING + } + input_fields { + input_name: 'year' + input_type: INT32 + } + + label_fields: 'label' + batch_size: 1024 + num_epochs: 1 + prefetch_size: 32 + input_type: CSVInput + separator: '\t' +} + +feature_config: { + features: { + input_names: 'user_id' + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 12000 + } + features: { + input_names: 'movie_id' + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 6000 + } + features: { + input_names: 'gender' + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 2 + } + features: { + input_names: 'job_id' + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 21 + } + features: { + input_names: 'age' + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 7 + } + features: { + input_names: 'genres' + feature_type: TagFeature + separator: '|' + embedding_dim: 16 + 
hash_bucket_size: 100 + } + features: { + input_names: 'title' + feature_type: SequenceFeature + separator: ' ' + embedding_dim: 16 + hash_bucket_size: 10000 + sequence_combiner: { + text_cnn: { + filter_sizes: [2, 3, 4] + num_filters: [16, 8, 8] + } + } + } + features: { + input_names: 'year' + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 36 + } +} +model_config: { + model_class: "RankModel" + feature_groups: { + group_name: 'wide' + feature_names: 'user_id' + feature_names: 'movie_id' + feature_names: 'job_id' + feature_names: 'age' + feature_names: 'gender' + feature_names: 'year' + feature_names: 'genres' + wide_deep: WIDE + } + feature_groups: { + group_name: 'deep' + feature_names: 'user_id' + feature_names: 'movie_id' + feature_names: 'job_id' + feature_names: 'age' + feature_names: 'gender' + feature_names: 'year' + feature_names: 'genres' + wide_deep: DEEP + } + backbone { + blocks { + name: 'wide' + inputs { + feature_group_name: 'wide' + } + input_layer { + wide_output_dim: 1 + only_output_feature_list: true + } + } + blocks { + name: 'deep_logit' + inputs { + feature_group_name: 'deep' + } + keras_layer { + class_name: 'MLP' + mlp { + hidden_units: [256, 256, 256, 1] + use_final_bn: false + final_activation: 'linear' + } + } + } + blocks { + name: 'final_logit' + inputs { + block_name: 'wide' + input_fn: 'lambda x: tf.add_n(x)' + } + inputs { + block_name: 'deep_logit' + } + merge_inputs_into_list: true + keras_layer { + class_name: 'Add' + } + } + concat_blocks: 'final_logit' + } + model_params { + l2_regularization: 1e-4 + } + embedding_regularization: 1e-4 +} diff --git a/examples/data/criteo/process_criteo_kaggle.py b/examples/data/criteo/process_criteo_kaggle.py index 60b7d9776..5b9cb4f34 100644 --- a/examples/data/criteo/process_criteo_kaggle.py +++ b/examples/data/criteo/process_criteo_kaggle.py @@ -11,8 +11,9 @@ samples_num = data_train.shape[0] print('samples_num:', samples_num, round(samples_num * 0.9)) 
-data_train[:round(samples_num * 0.9)].to_csv( +train_num = int(round(samples_num * 0.9)) +data_train[:train_num].to_csv( r'criteo_train_data', index=False, sep='\t', mode='a', header=False) -data_train[round(samples_num * 0.9):].to_csv( +data_train[train_num:].to_csv( r'criteo_test_data', index=False, sep='\t', mode='a', header=False) print('Done.') diff --git a/examples/rank_model/readme.md b/examples/rank_model/readme.md index 15d3f4dca..f6a2ba791 100644 --- a/examples/rank_model/readme.md +++ b/examples/rank_model/readme.md @@ -32,10 +32,12 @@ | MovieLens-1M | DeepFM | 0.8688 | | MovieLens-1M | DCN | 0.8576 | | MovieLens-1M | AutoInt | 0.8513 | +| MovieLens-1M | MaskNet | 0.8872 | +| MovieLens-1M | FiBiNet | 0.8879 | # Criteo Research Kaggle Dataset -On the MovieLens-1M dataset, we provide demos of 2 models. +On the `Criteo Research Kaggle` dataset, we provide demos of 2 models. [FM](fm.md) / [DeepFM](deepfm.md) diff --git a/examples/readme.md b/examples/readme.md index 9688ba29d..71117b5e0 100644 --- a/examples/readme.md +++ b/examples/readme.md @@ -100,14 +100,22 @@ EasyRec model training and evaluation are both driven by config files; the config file - [deepfm_on_movielens.config](configs/deepfm_on_movielens.config) +- [deepfm_backbone_on_movielens.config](configs/deepfm_backbone_on_movielens.config) + - [dcn_on_movielens.config](configs/dcn_on_movielens.config) - [autoint_on_movielens.config](configs/autoint_on_movielens.config) +- [masknet_on_movielens.config](configs/masknet_on_movielens.config) + +- [fibinet_on_movielens.config](configs/fibinet_on_movielens.config) + - [fm_on_criteo.config](configs/fm_on_criteo.config) - [deepfm_on_criteo.config](configs/deepfm_on_criteo.config) +- [deepfm_backbone_on_criteo.config](configs/deepfm_backbone_on_criteo.config) + **Recall tasks** - [dssm_on_books.config](configs/dssm_on_books.config) @@ -228,19 +236,36 @@ python -m easy_rec.python.train_eval --pipeline_config_path examples/configs/dee - MovieLens-1M - | Model | Epoch | AUC | - | --------- | ----- | ------ | - | Wide&Deep | 1 | 0.8558 | - | DeepFM | 1 |
0.8688 | - | DCN | 1 | 0.8576 | - | AutoInt | 1 | 0.8513 | + | Model | Epoch | AUC | + | -------------------- | ----- | ------ | + | MLP | 1 | 0.8616 | + | Wide&Deep | 1 | 0.8558 | + | Wide&Deep(Backbone) | 1 | 0.8854 | + | MultiTower(Backbone) | 1 | 0.8814 | + | DeepFM | 1 | 0.8867 | + | DeepFM(Backbone) | 1 | 0.8872 | + | DCN | 1 | 0.8576 | + | DCN_v2 | 1 | 0.8770 | + | AutoInt | 1 | 0.8513 | + | MaskNet | 1 | 0.8872 | + | FiBiNet | 1 | 0.8893 | + + Note: the `MovieLens-1M` dataset is small, so the evaluation metrics have high variance; the results above are for reference only. - Criteo-Research - | Model | Epoch | AUC | - | ------ | ----- | ------ | - | FM | 1 | 0.7577 | - | DeepFM | 1 | 0.7967 | + | Model | Epoch | AUC | + | ----------------- | ----- | ------- | + | FM | 1 | 0.7577 | + | DeepFM | 1 | 0.7970 | + | DeepFM (backbone) | 1 | 0.7970 | + | DeepFM (periodic) | 1 | 0.7979 | + | DeepFM (autodis) | 1 | 0.7982 | + | DLRM | 1 | 0.79785 | + | DLRM (backbone) | 1 | 0.7983 | + | DLRM (standard) | 1 | 0.7949 | + | DLRM (autodis) | 1 | 0.7989 | + | DLRM (periodic) | 1 | 0.7998 | ### Recall models diff --git a/samples/model_config/dbmtl_backbone_on_taobao.config b/samples/model_config/dbmtl_backbone_on_taobao.config new file mode 100644 index 000000000..aafe5a9ef --- /dev/null +++ b/samples/model_config/dbmtl_backbone_on_taobao.config @@ -0,0 +1,316 @@ +train_input_path: "data/test/tb_data/taobao_train_data" +eval_input_path: "data/test/tb_data/taobao_test_data" +model_dir: "experiments/dbmtl_backbone_taobao_ckpt" + +train_config { + optimizer_config { + adam_optimizer { + learning_rate { + exponential_decay_learning_rate { + initial_learning_rate: 0.001 + decay_steps: 1000 + decay_factor: 0.5 + min_learning_rate: 1e-07 + } + } + } + use_moving_average: false + } + num_steps: 200 + sync_replicas: true + save_checkpoints_steps: 100 + log_step_count_steps: 100 +} +eval_config { + metrics_set { + auc { + } + } +} +data_config { + batch_size: 4096 + label_fields: "clk" + label_fields: "buy" + prefetch_size: 32 + input_type: CSVInput + input_fields { +
input_name: "clk" + input_type: INT32 + } + input_fields { + input_name: "buy" + input_type: INT32 + } + input_fields { + input_name: "pid" + input_type: STRING + } + input_fields { + input_name: "adgroup_id" + input_type: STRING + } + input_fields { + input_name: "cate_id" + input_type: STRING + } + input_fields { + input_name: "campaign_id" + input_type: STRING + } + input_fields { + input_name: "customer" + input_type: STRING + } + input_fields { + input_name: "brand" + input_type: STRING + } + input_fields { + input_name: "user_id" + input_type: STRING + } + input_fields { + input_name: "cms_segid" + input_type: STRING + } + input_fields { + input_name: "cms_group_id" + input_type: STRING + } + input_fields { + input_name: "final_gender_code" + input_type: STRING + } + input_fields { + input_name: "age_level" + input_type: STRING + } + input_fields { + input_name: "pvalue_level" + input_type: STRING + } + input_fields { + input_name: "shopping_level" + input_type: STRING + } + input_fields { + input_name: "occupation" + input_type: STRING + } + input_fields { + input_name: "new_user_class_level" + input_type: STRING + } + input_fields { + input_name: "tag_category_list" + input_type: STRING + } + input_fields { + input_name: "tag_brand_list" + input_type: STRING + } + input_fields { + input_name: "price" + input_type: INT32 + } +} +feature_config: { + features { + input_names: "pid" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 10 + } + features { + input_names: "adgroup_id" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 100000 + } + features { + input_names: "cate_id" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 10000 + } + features { + input_names: "campaign_id" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 100000 + } + features { + input_names: "customer" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 100000 + } + features { + input_names: "brand" + 
feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 100000 + } + features { + input_names: "user_id" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 100000 + } + features { + input_names: "cms_segid" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 100 + } + features { + input_names: "cms_group_id" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 100 + } + features { + input_names: "final_gender_code" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 10 + } + features { + input_names: "age_level" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 10 + } + features { + input_names: "pvalue_level" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 10 + } + features { + input_names: "shopping_level" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 10 + } + features { + input_names: "occupation" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 10 + } + features { + input_names: "new_user_class_level" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 10 + } + features { + input_names: "tag_category_list" + feature_type: TagFeature + embedding_dim: 16 + hash_bucket_size: 100000 + separator: "|" + } + features { + input_names: "tag_brand_list" + feature_type: TagFeature + embedding_dim: 16 + hash_bucket_size: 100000 + separator: "|" + } + features { + input_names: "price" + feature_type: IdFeature + embedding_dim: 16 + num_buckets: 50 + } +} +model_config { + model_name: "DBMTL" + model_class: "MultiTaskModel" + feature_groups { + group_name: "all" + feature_names: "user_id" + feature_names: "cms_segid" + feature_names: "cms_group_id" + feature_names: "age_level" + feature_names: "pvalue_level" + feature_names: "shopping_level" + feature_names: "occupation" + feature_names: "new_user_class_level" + feature_names: "adgroup_id" + feature_names: "cate_id" + feature_names: "campaign_id" + feature_names: "customer" + 
+    feature_names: "brand"
+    feature_names: "price"
+    feature_names: "pid"
+    feature_names: "tag_category_list"
+    feature_names: "tag_brand_list"
+    wide_deep: DEEP
+  }
+  backbone {
+    blocks {
+      name: "mask_net"
+      inputs {
+        feature_group_name: "all"
+      }
+      keras_layer {
+        class_name: 'MaskNet'
+        masknet {
+          mask_blocks {
+            aggregation_size: 512
+            output_size: 256
+          }
+          mask_blocks {
+            aggregation_size: 512
+            output_size: 256
+          }
+          mask_blocks {
+            aggregation_size: 512
+            output_size: 256
+          }
+          mlp {
+            hidden_units: [512, 256]
+          }
+        }
+      }
+    }
+  }
+  model_params {
+    task_towers {
+      tower_name: "ctr"
+      label_name: "clk"
+      loss_type: CLASSIFICATION
+      metrics_set: {
+        auc {}
+      }
+      dnn {
+        hidden_units: [256, 128, 64, 32]
+      }
+      relation_dnn {
+        hidden_units: [32]
+      }
+      weight: 1.0
+    }
+    task_towers {
+      tower_name: "cvr"
+      label_name: "buy"
+      loss_type: CLASSIFICATION
+      metrics_set: {
+        auc {}
+      }
+      dnn {
+        hidden_units: [256, 128, 64, 32]
+      }
+      relation_tower_names: ["ctr"]
+      relation_dnn {
+        hidden_units: [32]
+      }
+      weight: 1.0
+    }
+    l2_regularization: 1e-6
+  }
+  embedding_regularization: 5e-6
+}
diff --git a/samples/model_config/dcn_backbone_on_taobao.config b/samples/model_config/dcn_backbone_on_taobao.config
new file mode 100644
index 000000000..86f4e2462
--- /dev/null
+++ b/samples/model_config/dcn_backbone_on_taobao.config
@@ -0,0 +1,291 @@
+train_input_path: "data/test/tb_data/taobao_train_data"
+eval_input_path: "data/test/tb_data/taobao_test_data"
+model_dir: "experiments/dcn_backbone_taobao_ckpt"
+
+train_config {
+  log_step_count_steps: 100
+  optimizer_config: {
+    adam_optimizer: {
+      learning_rate: {
+        exponential_decay_learning_rate {
+          initial_learning_rate: 0.001
+          decay_steps: 1000
+          decay_factor: 0.5
+          min_learning_rate: 0.00001
+        }
+      }
+    }
+    use_moving_average: false
+  }
+  save_checkpoints_steps: 100
+  sync_replicas: True
+  num_steps: 100
+}
+
+eval_config {
+  metrics_set: {
+    auc {}
+  }
+}
+
+data_config {
+  input_fields {
+    input_name:'clk'
+    input_type: INT32
+  }
+  input_fields {
+    input_name:'buy'
+    input_type: INT32
+  }
+  input_fields {
+    input_name: 'pid'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'adgroup_id'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'cate_id'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'campaign_id'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'customer'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'brand'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'user_id'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'cms_segid'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'cms_group_id'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'final_gender_code'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'age_level'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'pvalue_level'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'shopping_level'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'occupation'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'new_user_class_level'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'tag_category_list'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'tag_brand_list'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'price'
+    input_type: INT32
+  }
+
+  label_fields: 'clk'
+  batch_size: 4096
+  num_epochs: 10000
+  prefetch_size: 32
+  input_type: CSVInput
+}
+
+feature_config: {
+  features: {
+    input_names: 'pid'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'adgroup_id'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features: {
+    input_names: 'cate_id'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10000
+  }
+  features: {
+    input_names: 'campaign_id'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features: {
+    input_names: 'customer'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features: {
+    input_names: 'brand'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features: {
+    input_names: 'user_id'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features: {
+    input_names: 'cms_segid'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100
+  }
+  features: {
+    input_names: 'cms_group_id'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100
+  }
+  features: {
+    input_names: 'final_gender_code'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'age_level'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'pvalue_level'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'shopping_level'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'occupation'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'new_user_class_level'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'tag_category_list'
+    feature_type: TagFeature
+    separator: '|'
+    hash_bucket_size: 100000
+    embedding_dim: 16
+  }
+  features: {
+    input_names: 'tag_brand_list'
+    feature_type: TagFeature
+    separator: '|'
+    hash_bucket_size: 100000
+    embedding_dim: 16
+  }
+  features: {
+    input_names: 'price'
+    feature_type: IdFeature
+    embedding_dim: 16
+    num_buckets: 50
+  }
+}
+model_config: {
+  model_class: 'RankModel'
+  feature_groups: {
+    group_name: 'all'
+    feature_names: 'user_id'
+    feature_names: 'cms_segid'
+    feature_names: 'cms_group_id'
+    feature_names: 'age_level'
+    feature_names: 'pvalue_level'
+    feature_names: 'shopping_level'
+    feature_names: 'occupation'
+    feature_names: 'new_user_class_level'
+    feature_names: 'adgroup_id'
+    feature_names: 'cate_id'
+    feature_names: 'campaign_id'
+    feature_names: 'customer'
+    feature_names: 'brand'
+    feature_names: 'price'
+    feature_names: 'pid'
+    feature_names: 'tag_category_list'
+    feature_names: 'tag_brand_list'
+    wide_deep: DEEP
+  }
+  backbone {
+    blocks {
+      name: "deep"
+      inputs {
+        feature_group_name: "all"
+      }
+      keras_layer {
+        class_name: "MLP"
+        mlp {
+          hidden_units: [256, 128, 64]
+        }
+      }
+    }
+    blocks {
+      name: "cross"
+      inputs {
+        feature_group_name: "all"
+        input_fn: "lambda x: [x, x]"
+      }
+      recurrent {
+        num_steps: 5
+        fixed_input_index: 0
+        keras_layer {
+          class_name: "Cross"
+        }
+      }
+    }
+    concat_blocks: ['deep', 'cross']
+    top_mlp {
+      hidden_units: [64, 32, 16]
+    }
+  }
+  model_params {
+    l2_regularization: 1e-6
+  }
+  embedding_regularization: 1e-4
+}
diff --git a/samples/model_config/deepfm_on_criteo_with_autodis.config b/samples/model_config/deepfm_on_criteo_with_autodis.config
new file mode 100755
index 000000000..41975c090
--- /dev/null
+++ b/samples/model_config/deepfm_on_criteo_with_autodis.config
@@ -0,0 +1,786 @@
+train_input_path: "data/test/criteo_sample.tfrecord"
+eval_input_path: "data/test/criteo_sample.tfrecord"
+
+model_dir: "experiments/deepfm_with_autodis"
+
+train_config {
+  log_step_count_steps: 20
+  # fine_tune_checkpoint: ""
+  optimizer_config: {
+    adam_optimizer: {
+      learning_rate: {
+        exponential_decay_learning_rate {
+          initial_learning_rate: 0.0001
+          decay_steps: 10000
+          decay_factor: 0.5
+          min_learning_rate: 0.0000001
+        }
+      }
+    }
+    use_moving_average: false
+  }
+
+  num_steps: 100
+}
+
+eval_config {
+  metrics_set: {
+    auc {}
+  }
+}
+
+data_config {
+  separator: "\t"
+  input_fields: {
+    input_name: "label"
+    input_type: FLOAT
+    default_val:"0"
+  }
+  input_fields: {
+    input_name: "F1"
+    input_type: FLOAT
+    default_val:"0"
+  }
+  input_fields: {
+    input_name: "F2"
+    input_type: FLOAT
+    default_val:"0"
+  }
+  input_fields: {
+    input_name: "F3"
+    input_type: FLOAT
+    default_val:"0"
+  }
+  input_fields: {
+    input_name: "F4"
+    input_type: FLOAT
+    default_val:"0"
+  }
+  input_fields: {
+    input_name: "F5"
+    input_type: FLOAT
+    default_val:"0"
+  }
+  input_fields: {
+    input_name: "F6"
+    input_type: FLOAT
+    default_val:"0"
+  }
+  input_fields: {
+    input_name: "F7"
+    input_type: FLOAT
+    default_val:"0"
+  }
+  input_fields: {
+    input_name: "F8"
+    input_type: FLOAT
+    default_val:"0"
+  }
+  input_fields: {
+    input_name: "F9"
+    input_type: FLOAT
+    default_val:"0"
+  }
+  input_fields: {
+    input_name: "F10"
+    input_type: FLOAT
+    default_val:"0"
+  }
+  input_fields: {
+    input_name: "F11"
+    input_type: FLOAT
+    default_val:"0"
+  }
+  input_fields: {
+    input_name: "F12"
+    input_type: FLOAT
+    default_val:"0"
+  }
+  input_fields: {
+    input_name: "F13"
+    input_type: FLOAT
+    default_val:"0"
+  }
+  input_fields: {
+    input_name: "C1"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C2"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C3"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C4"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C5"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C6"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C7"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C8"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C9"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C10"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C11"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C12"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C13"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C14"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C15"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C16"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C17"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C18"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C19"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C20"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C21"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C22"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C23"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C24"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C25"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C26"
+    input_type: INT64
+    default_val:""
+  }
+  label_fields: "label"
+
+  batch_size: 8096
+  num_epochs: 10000
+  prefetch_size: 32
+  input_type: TFRecordInput
+}
+
+feature_config: {
+  features: {
+    input_names: "F1"
+    feature_type: RawFeature
+    min_val:0.0
+    max_val: 5775.0
+  }
+  features: {
+    input_names: "F2"
+    feature_type: RawFeature
+    min_val: -3.0
+    max_val: 257675.0
+  }
+  features: {
+    input_names: "F3"
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 65535.0
+  }
+  features: {
+    input_names: "F4"
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 969.0
+  }
+  features: {
+    input_names: "F5"
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 23159456.0
+  }
+  features: {
+    input_names: "F6"
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 431037.0
+  }
+  features: {
+    input_names: "F7"
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 56311.0
+  }
+  features: {
+    input_names: "F8"
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 6047.0
+  }
+  features: {
+    input_names: "F9"
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 29019.0
+  }
+  features: {
+    input_names: "F10"
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 46.0
+  }
+  features: {
+    input_names: "F11"
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 231.0
+  }
+  features: {
+    input_names: "F12"
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 4008.0
+  }
+  features: {
+    input_names: "F13"
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 7393.0
+  }
+  features: {
+    input_names: "C1"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C2"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C3"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C4"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C5"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C6"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C7"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C8"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C9"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C10"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C11"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C12"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C13"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C14"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C15"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C16"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C17"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C18"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C19"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C20"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C21"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C22"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C23"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C24"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C25"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C26"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    feature_name: "D1"
+    input_names: "F1"
+    embedding_dim:10
+    feature_type: RawFeature
+    min_val:0.0
+    max_val: 5775.0
+  }
+  features: {
+    feature_name: "D2"
+    input_names: "F2"
+    embedding_dim:10
+    feature_type: RawFeature
+    min_val: -3.0
+    max_val: 257675.0
+  }
+  features: {
+    feature_name: "D3"
+    input_names: "F3"
+    embedding_dim:10
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 65535.0
+  }
+  features: {
+    feature_name: "D4"
+    input_names: "F4"
+    embedding_dim:10
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 969.0
+  }
+  features: {
+    feature_name: "D5"
+    input_names: "F5"
+    embedding_dim:10
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 23159456.0
+  }
+  features: {
+    feature_name: "D6"
+    input_names: "F6"
+    embedding_dim:10
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 431037.0
+  }
+  features: {
+    feature_name: "D7"
+    input_names: "F7"
+    embedding_dim:10
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 56311.0
+  }
+  features: {
+    feature_name: "D8"
+    input_names: "F8"
+    embedding_dim:10
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 6047.0
+  }
+  features: {
+    feature_name: "D9"
+    input_names: "F9"
+    embedding_dim:10
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 29019.0
+  }
+  features: {
+    feature_name: "D10"
+    input_names: "F10"
+    embedding_dim:10
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 46.0
+  }
+  features: {
+    feature_name: "D11"
+    input_names: "F11"
+    embedding_dim:10
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 231.0
+  }
+  features: {
+    feature_name: "D12"
+    input_names: "F12"
+    embedding_dim:10
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 4008.0
+  }
+  features: {
+    feature_name: "D13"
+    input_names: "F13"
+    embedding_dim:10
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 7393.0
+  }
+}
+model_config:{
+  model_class: 'RankModel'
+  feature_groups: {
+    group_name: "numerical_features"
+    feature_names: "F1"
+    feature_names: "F2"
+    feature_names: "F3"
+    feature_names: "F4"
+    feature_names: "F5"
+    feature_names: "F6"
+    feature_names: "F7"
+    feature_names: "F8"
+    feature_names: "F9"
+    feature_names: "F10"
+    feature_names: "F11"
+    feature_names: "F12"
+    feature_names: "F13"
+    wide_deep:DEEP
+  }
+  feature_groups: {
+    group_name: "categorical_features"
+    feature_names: "C1"
+    feature_names: "C2"
+    feature_names: "C3"
+    feature_names: "C4"
+    feature_names: "C5"
+    feature_names: "C6"
+    feature_names: "C7"
+    feature_names: "C8"
+    feature_names: "C9"
+    feature_names: "C10"
+    feature_names: "C11"
+    feature_names: "C12"
+    feature_names: "C13"
+    feature_names: "C14"
+    feature_names: "C15"
+    feature_names: "C16"
+    feature_names: "C17"
+    feature_names: "C18"
+    feature_names: "C19"
+    feature_names: "C20"
+    feature_names: "C21"
+    feature_names: "C22"
+    feature_names: "C23"
+    feature_names: "C24"
+    feature_names: "C25"
+    feature_names: "C26"
+    wide_deep:DEEP
+  }
+  feature_groups: {
+    group_name: "wide_features"
+    feature_names: "D1"
+    feature_names: "D2"
+    feature_names: "D3"
+    feature_names: "D4"
+    feature_names: "D5"
+    feature_names: "D6"
+    feature_names: "D7"
+    feature_names: "D8"
+    feature_names: "D9"
+    feature_names: "D10"
+    feature_names: "D11"
+    feature_names: "D12"
+    feature_names: "D13"
+    feature_names: "C1"
+    feature_names: "C2"
+    feature_names: "C3"
+    feature_names: "C4"
+    feature_names: "C5"
+    feature_names: "C6"
+    feature_names: "C7"
+    feature_names: "C8"
+    feature_names: "C9"
+    feature_names: "C10"
+    feature_names: "C11"
+    feature_names: "C12"
+    feature_names: "C13"
+    feature_names: "C14"
+    feature_names: "C15"
+    feature_names: "C16"
+    feature_names: "C17"
+    feature_names: "C18"
+    feature_names: "C19"
+    feature_names: "C20"
+    feature_names: "C21"
+    feature_names: "C22"
+    feature_names: "C23"
+    feature_names: "C24"
+    feature_names: "C25"
+    feature_names: "C26"
+    wide_deep:WIDE
+  }
+  backbone {
+    blocks {
+      name: 'wide_features'
+      inputs {
+        feature_group_name: 'wide_features'
+      }
+      input_layer {
+        wide_output_dim: 1
+      }
+    }
+    blocks {
+      name: 'wide_logit'
+      inputs {
+        block_name: 'wide_features'
+      }
+      lambda {
+        expression: 'lambda x: tf.reduce_sum(x, axis=1, keepdims=True)'
+      }
+    }
+    blocks {
+      name: 'num_emb'
+      inputs {
+        feature_group_name: 'numerical_features'
+      }
+      keras_layer {
+        class_name: 'AutoDisEmbedding'
+        auto_dis_embedding {
+          embedding_dim: 10
+          num_bins: 20
+          temperature: 0.815
+          output_tensor_list: true
+        }
+      }
+    }
+    blocks {
+      name: 'categorical_features'
+      inputs {
+        feature_group_name: 'categorical_features'
+      }
+      input_layer {
+        output_2d_tensor_and_feature_list: true
+      }
+    }
+    blocks {
+      name: 'fm'
+      inputs {
+        block_name: 'categorical_features'
+        input_fn: 'lambda x: x[1]'
+      }
+      inputs {
+        block_name: 'num_emb'
+        input_fn: 'lambda x: x[1]'
+      }
+      keras_layer {
+        class_name: 'FM'
+        fm {
+          use_variant: true
+        }
+      }
+    }
+    blocks {
+      name: 'deep'
+      inputs {
+        block_name: 'categorical_features'
+        input_fn: 'lambda x: x[0]'
+      }
+      inputs {
+        block_name: 'num_emb'
+        input_fn: 'lambda x: x[0]'
+      }
+      keras_layer {
+        class_name: 'MLP'
+        mlp {
+          hidden_units: [256, 128, 64]
+        }
+      }
+    }
+    concat_blocks: ['wide_logit', 'fm', 'deep']
+    top_mlp {
+      hidden_units: [256, 128, 64]
+    }
+  }
+  model_params {
+    l2_regularization: 1e-5
+  }
+  embedding_regularization: 1e-5
+}
diff --git a/samples/model_config/deepfm_on_criteo_with_periodic.config b/samples/model_config/deepfm_on_criteo_with_periodic.config
new file mode 100755
index 000000000..081fbf2cf
--- /dev/null
+++ b/samples/model_config/deepfm_on_criteo_with_periodic.config
@@ -0,0 +1,785 @@
+train_input_path: "data/test/criteo_sample.tfrecord"
+eval_input_path: "data/test/criteo_sample.tfrecord"
+
+model_dir: "experiments/deepfm_with_periodic"
+
+train_config {
+  log_step_count_steps: 20
+  # fine_tune_checkpoint: ""
+  optimizer_config: {
+    adam_optimizer: {
+      learning_rate: {
+        exponential_decay_learning_rate {
+          initial_learning_rate: 0.0001
+          decay_steps: 10000
+          decay_factor: 0.5
+          min_learning_rate: 0.0000001
+        }
+      }
+    }
+    use_moving_average: false
+  }
+
+  num_steps: 100
+}
+
+eval_config {
+  metrics_set: {
+    auc {}
+  }
+}
+
+data_config {
+  separator: "\t"
+  input_fields: {
+    input_name: "label"
+    input_type: FLOAT
+    default_val:"0"
+  }
+  input_fields: {
+    input_name: "F1"
+    input_type: FLOAT
+    default_val:"0"
+  }
+  input_fields: {
+    input_name: "F2"
+    input_type: FLOAT
+    default_val:"0"
+  }
+  input_fields: {
+    input_name: "F3"
+    input_type: FLOAT
+    default_val:"0"
+  }
+  input_fields: {
+    input_name: "F4"
+    input_type: FLOAT
+    default_val:"0"
+  }
+  input_fields: {
+    input_name: "F5"
+    input_type: FLOAT
+    default_val:"0"
+  }
+  input_fields: {
+    input_name: "F6"
+    input_type: FLOAT
+    default_val:"0"
+  }
+  input_fields: {
+    input_name: "F7"
+    input_type: FLOAT
+    default_val:"0"
+  }
+  input_fields: {
+    input_name: "F8"
+    input_type: FLOAT
+    default_val:"0"
+  }
+  input_fields: {
+    input_name: "F9"
+    input_type: FLOAT
+    default_val:"0"
+  }
+  input_fields: {
+    input_name: "F10"
+    input_type: FLOAT
+    default_val:"0"
+  }
+  input_fields: {
+    input_name: "F11"
+    input_type: FLOAT
+    default_val:"0"
+  }
+  input_fields: {
+    input_name: "F12"
+    input_type: FLOAT
+    default_val:"0"
+  }
+  input_fields: {
+    input_name: "F13"
+    input_type: FLOAT
+    default_val:"0"
+  }
+  input_fields: {
+    input_name: "C1"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C2"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C3"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C4"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C5"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C6"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C7"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C8"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C9"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C10"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C11"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C12"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C13"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C14"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C15"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C16"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C17"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C18"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C19"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C20"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C21"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C22"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C23"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C24"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C25"
+    input_type: INT64
+    default_val:""
+  }
+  input_fields: {
+    input_name: "C26"
+    input_type: INT64
+    default_val:""
+  }
+  label_fields: "label"
+
+  batch_size: 8096
+  num_epochs: 10000
+  prefetch_size: 32
+  input_type: TFRecordInput
+}
+
+feature_config: {
+  features: {
+    input_names: "F1"
+    feature_type: RawFeature
+    min_val:0.0
+    max_val: 5775.0
+  }
+  features: {
+    input_names: "F2"
+    feature_type: RawFeature
+    min_val: -3.0
+    max_val: 257675.0
+  }
+  features: {
+    input_names: "F3"
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 65535.0
+  }
+  features: {
+    input_names: "F4"
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 969.0
+  }
+  features: {
+    input_names: "F5"
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 23159456.0
+  }
+  features: {
+    input_names: "F6"
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 431037.0
+  }
+  features: {
+    input_names: "F7"
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 56311.0
+  }
+  features: {
+    input_names: "F8"
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 6047.0
+  }
+  features: {
+    input_names: "F9"
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 29019.0
+  }
+  features: {
+    input_names: "F10"
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 46.0
+  }
+  features: {
+    input_names: "F11"
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 231.0
+  }
+  features: {
+    input_names: "F12"
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 4008.0
+  }
+  features: {
+    input_names: "F13"
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 7393.0
+  }
+  features: {
+    input_names: "C1"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C2"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C3"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C4"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C5"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C6"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C7"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C8"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C9"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C10"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C11"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C12"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C13"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C14"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C15"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C16"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C17"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C18"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C19"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C20"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C21"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C22"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C23"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C24"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C25"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    input_names: "C26"
+    hash_bucket_size: 1000000
+    feature_type: IdFeature
+    embedding_dim: 10
+    embedding_name: "vocab_embed"
+  }
+  features: {
+    feature_name: "D1"
+    input_names: "F1"
+    embedding_dim:10
+    feature_type: RawFeature
+    min_val:0.0
+    max_val: 5775.0
+  }
+  features: {
+    feature_name: "D2"
+    input_names: "F2"
+    embedding_dim:10
+    feature_type: RawFeature
+    min_val: -3.0
+    max_val: 257675.0
+  }
+  features: {
+    feature_name: "D3"
+    input_names: "F3"
+    embedding_dim:10
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 65535.0
+  }
+  features: {
+    feature_name: "D4"
+    input_names: "F4"
+    embedding_dim:10
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 969.0
+  }
+  features: {
+    feature_name: "D5"
+    input_names: "F5"
+    embedding_dim:10
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 23159456.0
+  }
+  features: {
+    feature_name: "D6"
+    input_names: "F6"
+    embedding_dim:10
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 431037.0
+  }
+  features: {
+    feature_name: "D7"
+    input_names: "F7"
+    embedding_dim:10
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 56311.0
+  }
+  features: {
+    feature_name: "D8"
+    input_names: "F8"
+    embedding_dim:10
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 6047.0
+  }
+  features: {
+    feature_name: "D9"
+    input_names: "F9"
+    embedding_dim:10
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 29019.0
+  }
+  features: {
+    feature_name: "D10"
+    input_names: "F10"
+    embedding_dim:10
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 46.0
+  }
+  features: {
+    feature_name: "D11"
+    input_names: "F11"
+    embedding_dim:10
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 231.0
+  }
+  features: {
+    feature_name: "D12"
+    input_names: "F12"
+    embedding_dim:10
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 4008.0
+  }
+  features: {
+    feature_name: "D13"
+    input_names: "F13"
+    embedding_dim:10
+    feature_type: RawFeature
+    min_val: 0.0
+    max_val: 7393.0
+  }
+}
+model_config:{
+  model_class: 'RankModel'
+  feature_groups: {
+    group_name: "numerical_features"
+    feature_names: "F1"
+    feature_names: "F2"
+    feature_names: "F3"
+    feature_names: "F4"
+    feature_names: "F5"
+    feature_names: "F6"
+    feature_names: "F7"
+    feature_names: "F8"
+    feature_names: "F9"
+    feature_names: "F10"
+    feature_names: "F11"
+    feature_names: "F12"
+    feature_names: "F13"
+    wide_deep:DEEP
+  }
+  feature_groups: {
+    group_name: "categorical_features"
+    feature_names: "C1"
+    feature_names: "C2"
+    feature_names: "C3"
+    feature_names: "C4"
+    feature_names: "C5"
+    feature_names: "C6"
+    feature_names: "C7"
+    feature_names: "C8"
+    feature_names: "C9"
+    feature_names: "C10"
+    feature_names: "C11"
+    feature_names: "C12"
+    feature_names: "C13"
+    feature_names: "C14"
+    feature_names: "C15"
+    feature_names: "C16"
+    feature_names: "C17"
+    feature_names: "C18"
+    feature_names: "C19"
+    feature_names: "C20"
+    feature_names: "C21"
+    feature_names: "C22"
+    feature_names: "C23"
+    feature_names: "C24"
+    feature_names: "C25"
+    feature_names: "C26"
+    wide_deep:DEEP
+  }
+  feature_groups: {
+    group_name: "wide_features"
+    feature_names: "D1"
+    feature_names: "D2"
+    feature_names: "D3"
+    feature_names: "D4"
+    feature_names: "D5"
+    feature_names: "D6"
+    feature_names: "D7"
+    feature_names: "D8"
+    feature_names: "D9"
+    feature_names: "D10"
+    feature_names: "D11"
+    feature_names: "D12"
+    feature_names: "D13"
+    feature_names: "C1"
+    feature_names: "C2"
+    feature_names: "C3"
+    feature_names: "C4"
+    feature_names: "C5"
+    feature_names: "C6"
+    feature_names: "C7"
+    feature_names: "C8"
+    feature_names: "C9"
+    feature_names: "C10"
+    feature_names: "C11"
+    feature_names: "C12"
+    feature_names: "C13"
+    feature_names: "C14"
+    feature_names: "C15"
+    feature_names: "C16"
+    feature_names: "C17"
+    feature_names: "C18"
+    feature_names: "C19"
+    feature_names: "C20"
+    feature_names: "C21"
+    feature_names: "C22"
+    feature_names: "C23"
+    feature_names: "C24"
+    feature_names: "C25"
+    feature_names: "C26"
+    wide_deep:WIDE
+  }
+  backbone {
+    blocks {
+      name: 'wide_features'
+      inputs {
+        feature_group_name: 'wide_features'
+      }
+      input_layer {
+        wide_output_dim: 1
+      }
+    }
+    blocks {
+      name: 'wide_logit'
+      inputs {
+        block_name: 'wide_features'
+      }
+      lambda {
+        expression: 'lambda x: tf.reduce_sum(x, axis=1, keepdims=True)'
+      }
+    }
+    blocks {
+      name: 'num_emb'
+      inputs {
+        feature_group_name: 'numerical_features'
+      }
+      keras_layer {
+        class_name: 'PeriodicEmbedding'
+        periodic_embedding {
+          embedding_dim: 10
+          sigma: 0.005
+          output_tensor_list: true
+        }
+      }
+    }
+    blocks {
+      name: 'categorical_features'
+      inputs {
+        feature_group_name: 'categorical_features'
+      }
+      input_layer {
+        output_2d_tensor_and_feature_list: true
+      }
+    }
+    blocks {
+      name: 'fm'
+      inputs {
+        block_name: 'categorical_features'
+        input_fn: 'lambda x: x[1]'
+      }
+      inputs {
+        block_name: 'num_emb'
+        input_fn: 'lambda x: x[1]'
+      }
+      keras_layer {
+        class_name: 'FM'
+        fm {
+          use_variant: true
+        }
+      }
+    }
+    blocks {
+      name: 'deep'
+      inputs {
+        block_name: 'categorical_features'
+        input_fn: 'lambda x: x[0]'
+      }
+      inputs {
+        block_name: 'num_emb'
+        input_fn: 'lambda x: x[0]'
+      }
+      keras_layer {
+        class_name: 'MLP'
+        mlp {
+          hidden_units: [256, 128, 64]
+        }
+      }
+    }
+    concat_blocks: ['wide_logit', 'fm', 'deep']
+    top_mlp {
+      hidden_units: [256, 128, 64]
+    }
+  }
+  model_params {
+    l2_regularization: 1e-5
+  }
+  embedding_regularization: 1e-5
+}
diff --git a/samples/model_config/dlrm_backbone_on_taobao.config b/samples/model_config/dlrm_backbone_on_taobao.config
new file mode 100644
index 000000000..a66f1d190
--- /dev/null
+++ b/samples/model_config/dlrm_backbone_on_taobao.config
@@ -0,0 +1,299 @@
+train_input_path: "data/test/tb_data/taobao_train_data"
+eval_input_path: "data/test/tb_data/taobao_test_data"
+model_dir: "experiments/dlrm_backbone_taobao_ckpt"
+
+train_config {
+  optimizer_config: {
+    adam_optimizer: {
+      learning_rate: {
+        exponential_decay_learning_rate {
+          initial_learning_rate: 0.001
+          decay_steps: 1000
+          decay_factor: 0.5
+          min_learning_rate: 0.00001
+        }
+      }
+    }
+    use_moving_average: false
+  }
+  save_checkpoints_steps: 100
+  log_step_count_steps: 10
+  sync_replicas: true
+  num_steps: 100
+}
+
+eval_config {
+  metrics_set: {
+    auc {}
+  }
+}
+
+data_config {
+  input_fields {
+    input_name: 'clk'
+    input_type: INT32
+  }
+  input_fields {
+    input_name: 'buy'
+    input_type: INT32
+  }
+  input_fields {
+    input_name: 'pid'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'adgroup_id'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'cate_id'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'campaign_id'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'customer'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'brand'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'user_id'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'cms_segid'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'cms_group_id'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'final_gender_code'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'age_level'
+    input_type: DOUBLE
+  }
+  input_fields {
+    input_name: 'pvalue_level'
+    input_type: DOUBLE
+  }
+  input_fields {
+    input_name: 'shopping_level'
+    input_type: DOUBLE
+  }
+  input_fields {
+    input_name: 'occupation'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'new_user_class_level'
+    input_type: DOUBLE
+  }
+  input_fields {
+    input_name: 'tag_category_list'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'tag_brand_list'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'price'
+    input_type: DOUBLE
+  }
+
+  label_fields: 'clk'
+  batch_size: 4096
+  num_epochs: 10000
+  prefetch_size: 32
+  input_type: CSVInput
+}
+
+feature_config: {
+  features: {
+    input_names: 'pid'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'adgroup_id'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features: {
+    input_names: 'cate_id'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10000
+  }
+  features: {
+    input_names: 'campaign_id'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features: {
+    input_names: 'customer'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features: {
+    input_names: 'brand'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features: {
+    input_names: 'user_id'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features: {
+    input_names: 'cms_segid'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100
+  }
+  features: {
+    input_names: 'cms_group_id'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100
+  }
+  features: {
+    input_names: 'final_gender_code'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'age_level'
+    feature_type: RawFeature
+  }
+  features: {
+    input_names: 'pvalue_level'
+    feature_type: RawFeature
+  }
+  features: {
+    input_names: 'shopping_level'
+    feature_type: RawFeature
+  }
+  features: {
+    input_names: 'occupation'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'new_user_class_level'
+    feature_type: RawFeature
+  }
+  features: {
+    input_names: 'tag_category_list'
+    feature_type: TagFeature
+    separator: '|'
+    hash_bucket_size: 10000
+    embedding_dim: 16
+  }
+  features: {
+    input_names: 'tag_brand_list'
+    feature_type: TagFeature
+    separator: '|'
+    hash_bucket_size: 100000
+    embedding_dim: 16
+  }
+  features: {
+    input_names: 'price'
+    feature_type: RawFeature
+  }
+}
+model_config {
+  model_class: 'RankModel'
+
+  feature_groups {
+    group_name: 'dense'
+    feature_names: 'age_level'
+    feature_names: 'pvalue_level'
+    feature_names: 'shopping_level'
+    feature_names: 'new_user_class_level'
+    feature_names: 'price'
+
+    wide_deep: DEEP
+  }
+
+  feature_groups {
+    group_name: 'sparse'
+    feature_names: 'user_id'
+    feature_names: 'cms_segid'
+    feature_names: 'cms_group_id'
+    feature_names: 'occupation'
+    feature_names: 'adgroup_id'
+    feature_names: 'cate_id'
+    feature_names: 'campaign_id'
+    feature_names: 'customer'
+    feature_names: 'brand'
+    feature_names: 'pid'
+    feature_names: 'tag_category_list'
+    feature_names: 'tag_brand_list'
+
+    wide_deep: DEEP
+  }
+  backbone {
+    blocks {
+      name: 'bottom_mlp'
+      inputs {
+        feature_group_name: 'dense'
+      }
+      keras_layer {
+        class_name: 'MLP'
+        mlp {
+          hidden_units: [64, 32, 16]
+        }
+      }
+    }
+    blocks {
+      name: 'sparse'
+      inputs {
+        feature_group_name: 'sparse'
+      }
+      input_layer {
+        only_output_feature_list: true
+      }
+    }
+    blocks {
+      name: 'dot'
+      inputs {
+        block_name: 'bottom_mlp'
+        input_fn: 'lambda x: [x]'
+      }
+      inputs {
+        block_name: 'sparse'
+      }
+      keras_layer {
+        class_name: 'DotInteraction'
+      }
+    }
+    concat_blocks: ['bottom_mlp', 'dot']
+    top_mlp {
+      hidden_units: [256, 128, 64]
+    }
+  }
+  model_params {
+  }
+  embedding_regularization: 1e-5
+}
+
+export_config {
+}
diff --git a/samples/model_config/fibinet_on_taobao.config b/samples/model_config/fibinet_on_taobao.config
new file mode 100644
index 000000000..05736d118
--- /dev/null
+++ b/samples/model_config/fibinet_on_taobao.config
@@ -0,0 +1,293 @@
+train_input_path: "data/test/tb_data/taobao_train_data"
+eval_input_path: "data/test/tb_data/taobao_test_data"
+model_dir: "experiments/fibinet_taobao_ckpt"
+
+train_config {
+  log_step_count_steps: 100
+  optimizer_config: {
+    adam_optimizer: {
+      learning_rate: {
+        exponential_decay_learning_rate {
+          initial_learning_rate: 0.001
+          decay_steps: 1000
+          decay_factor: 0.5
+          min_learning_rate: 0.00001
+        }
+      }
+    }
+    use_moving_average: false
+  }
+  save_checkpoints_steps: 100
+  sync_replicas: True
+  num_steps: 100
+}
+
+eval_config {
+  metrics_set: {
+    auc {}
+  }
+}
+
+data_config {
+  input_fields {
+    input_name: 'clk'
+    input_type: INT32
+  }
+  input_fields {
+    input_name: 'buy'
+    input_type: INT32
+  }
+  input_fields {
+    input_name: 'pid'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'adgroup_id'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'cate_id'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'campaign_id'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'customer'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'brand'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'user_id'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'cms_segid'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'cms_group_id'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'final_gender_code'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'age_level'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'pvalue_level'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'shopping_level'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'occupation'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'new_user_class_level'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'tag_category_list'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'tag_brand_list'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'price'
+    input_type: INT32
+  }
+
+  label_fields: 'clk'
+  batch_size: 4096
+  num_epochs: 10000
+  prefetch_size: 32
+  input_type: CSVInput
+}
+
+feature_config: {
+  features: {
+    input_names: 'pid'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'adgroup_id'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features: {
+    input_names: 'cate_id'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10000
+  }
+  features: {
+    input_names: 'campaign_id'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features: {
+    input_names: 'customer'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features: {
+    input_names: 'brand'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features: {
+    input_names: 'user_id'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features: {
+    input_names: 'cms_segid'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100
+  }
+  features: {
+    input_names: 'cms_group_id'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100
+  }
+  features: {
+    input_names: 'final_gender_code'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'age_level'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'pvalue_level'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'shopping_level'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'occupation'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'new_user_class_level'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'tag_category_list'
+    feature_type: TagFeature
+    separator: '|'
+    hash_bucket_size: 100000
+    embedding_dim: 16
+  }
+  features: {
+    input_names: 'tag_brand_list'
+    feature_type: TagFeature
+    separator: '|'
+    hash_bucket_size: 100000
+    embedding_dim: 16
+  }
+  features: {
+    input_names: 'price'
+    feature_type: IdFeature
+    embedding_dim: 16
+    num_buckets: 50
+  }
+}
+model_config: {
+  model_class: 'RankModel'
+  feature_groups: {
+    group_name: 'all'
+    feature_names: 'user_id'
+    feature_names: 'cms_segid'
+    feature_names: 'cms_group_id'
+    feature_names: 'age_level'
+    feature_names: 'pvalue_level'
+    feature_names: 'shopping_level'
+    feature_names: 'occupation'
+    feature_names: 'new_user_class_level'
+    feature_names: 'adgroup_id'
+    feature_names: 'cate_id'
+    feature_names: 'campaign_id'
+    feature_names: 'customer'
+    feature_names: 'brand'
+    feature_names: 'price'
+    feature_names: 'pid'
+    feature_names: 'tag_category_list'
+    feature_names: 'tag_brand_list'
+    wide_deep: DEEP
+  }
+  backbone {
+    blocks {
+      name: "all"
+      inputs {
+        feature_group_name: "all"
+      }
+      input_layer {
+        do_batch_norm: true
+        only_output_feature_list: true
+      }
+    }
+    blocks {
+      name: "fibinet"
+      inputs {
+        block_name: "all"
+      }
+      keras_layer {
+        class_name: 'FiBiNet'
+        fibinet {
+          senet {
+            reduction_ratio: 4
+          }
+          bilinear {
+            type: 'each'
+            num_output_units: 512
+          }
+          mlp {
+            hidden_units: [512, 256]
+          }
+        }
+      }
+    }
+    concat_blocks: ['fibinet']
+  }
+  model_params {
+    l2_regularization: 1e-6
+  }
+  embedding_regularization: 1e-4
+}
diff --git a/samples/model_config/masknet_on_taobao.config b/samples/model_config/masknet_on_taobao.config
new file mode 100644
index 000000000..26b8c8262
--- /dev/null
+++ b/samples/model_config/masknet_on_taobao.config
@@ -0,0 +1,288 @@
+train_input_path: "data/test/tb_data/taobao_train_data"
+eval_input_path: "data/test/tb_data/taobao_test_data"
+model_dir: "experiments/masknet_taobao_ckpt"
+
+train_config {
+  log_step_count_steps: 100
+  optimizer_config: {
+    adam_optimizer: {
+      learning_rate: {
+        exponential_decay_learning_rate {
+          initial_learning_rate: 0.001
+          decay_steps: 1000
+          decay_factor: 0.5
+          min_learning_rate: 0.00001
+        }
+      }
+    }
+    use_moving_average: false
+  }
+  save_checkpoints_steps: 100
+  sync_replicas: True
+  num_steps: 100
+}
+
+eval_config {
+  metrics_set: {
+    auc {}
+  }
+}
+
+data_config {
+  input_fields {
+    input_name: 'clk'
+    input_type: INT32
+  }
+  input_fields {
+    input_name: 'buy'
+    input_type: INT32
+  }
+  input_fields {
+    input_name:
'pid'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'adgroup_id'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'cate_id'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'campaign_id'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'customer'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'brand'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'user_id'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'cms_segid'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'cms_group_id'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'final_gender_code'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'age_level'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'pvalue_level'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'shopping_level'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'occupation'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'new_user_class_level'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'tag_category_list'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'tag_brand_list'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'price'
+    input_type: INT32
+  }
+
+  label_fields: 'clk'
+  batch_size: 4096
+  num_epochs: 10000
+  prefetch_size: 32
+  input_type: CSVInput
+}
+
+feature_config: {
+  features: {
+    input_names: 'pid'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'adgroup_id'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features: {
+    input_names: 'cate_id'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10000
+  }
+  features: {
+    input_names: 'campaign_id'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features: {
+    input_names: 'customer'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features: {
+    input_names: 'brand'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features: {
+    input_names: 'user_id'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features: {
+    input_names: 'cms_segid'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100
+  }
+  features: {
+    input_names: 'cms_group_id'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100
+  }
+  features: {
+    input_names: 'final_gender_code'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'age_level'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'pvalue_level'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'shopping_level'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'occupation'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'new_user_class_level'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'tag_category_list'
+    feature_type: TagFeature
+    separator: '|'
+    hash_bucket_size: 100000
+    embedding_dim: 16
+  }
+  features: {
+    input_names: 'tag_brand_list'
+    feature_type: TagFeature
+    separator: '|'
+    hash_bucket_size: 100000
+    embedding_dim: 16
+  }
+  features: {
+    input_names: 'price'
+    feature_type: IdFeature
+    embedding_dim: 16
+    num_buckets: 50
+  }
+}
+model_config: {
+  model_class: 'RankModel'
+  feature_groups: {
+    group_name: 'all'
+    feature_names: 'user_id'
+    feature_names: 'cms_segid'
+    feature_names: 'cms_group_id'
+    feature_names: 'age_level'
+    feature_names: 'pvalue_level'
+    feature_names: 'shopping_level'
+    feature_names: 'occupation'
+    feature_names: 'new_user_class_level'
+    feature_names: 'adgroup_id'
+    feature_names: 'cate_id'
+    feature_names: 'campaign_id'
+    feature_names: 'customer'
+    feature_names: 'brand'
+    feature_names: 'price'
+    feature_names: 'pid'
+    feature_names: 'tag_category_list'
+    feature_names: 'tag_brand_list'
+    wide_deep: DEEP
+  }
+  backbone {
+    blocks {
+      name: "mask_net"
+      inputs {
+        feature_group_name: "all"
+      }
+      keras_layer {
+        class_name: 'MaskNet'
+        masknet {
+          mask_blocks {
+            aggregation_size: 512
+            output_size: 256
+          }
+          mask_blocks {
+            aggregation_size: 512
+            output_size: 256
+          }
+          mask_blocks {
+            aggregation_size: 512
+            output_size: 256
+          }
+          mlp {
+            hidden_units: [512, 256]
+          }
+        }
+      }
+    }
+    concat_blocks: ['mask_net']
+  }
+  model_params {
+    l2_regularization: 1e-6
+  }
+  embedding_regularization: 1e-4
+}
diff --git a/samples/model_config/mmoe_backbone_on_taobao.config b/samples/model_config/mmoe_backbone_on_taobao.config
new file mode 100644
index 000000000..39018342c
--- /dev/null
+++ b/samples/model_config/mmoe_backbone_on_taobao.config
@@ -0,0 +1,316 @@
+train_input_path: "data/test/tb_data/taobao_train_data"
+eval_input_path: "data/test/tb_data/taobao_test_data"
+model_dir: "experiments/mmoe_backbone_taobao_ckpt"
+
+train_config {
+  optimizer_config {
+    adam_optimizer {
+      learning_rate {
+        exponential_decay_learning_rate {
+          initial_learning_rate: 0.001
+          decay_steps: 1000
+          decay_factor: 0.5
+          min_learning_rate: 1e-07
+        }
+      }
+    }
+    use_moving_average: false
+  }
+  num_steps: 200
+  sync_replicas: true
+  save_checkpoints_steps: 100
+  log_step_count_steps: 100
+}
+data_config {
+  batch_size: 4096
+  label_fields: "clk"
+  label_fields: "buy"
+  prefetch_size: 32
+  input_type: CSVInput
+  input_fields {
+    input_name: "clk"
+    input_type: INT32
+  }
+  input_fields {
+    input_name: "buy"
+    input_type: INT32
+  }
+  input_fields {
+    input_name: "pid"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "adgroup_id"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "cate_id"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "campaign_id"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "customer"
+    input_type: STRING
+  }
+  input_fields {
input_name: "brand"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "user_id"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "cms_segid"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "cms_group_id"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "final_gender_code"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "age_level"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "pvalue_level"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "shopping_level"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "occupation"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "new_user_class_level"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "tag_category_list"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "tag_brand_list"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "price"
+    input_type: INT32
+  }
+}
+feature_config: {
+  features {
+    input_names: "pid"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features {
+    input_names: "adgroup_id"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features {
+    input_names: "cate_id"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10000
+  }
+  features {
+    input_names: "campaign_id"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features {
+    input_names: "customer"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features {
+    input_names: "brand"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features {
+    input_names: "user_id"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features {
+    input_names: "cms_segid"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100
+  }
+  features {
+    input_names: "cms_group_id"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100
+  }
+  features {
+    input_names: "final_gender_code"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features {
+    input_names: "age_level"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features {
+    input_names: "pvalue_level"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features {
+    input_names: "shopping_level"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features {
+    input_names: "occupation"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features {
+    input_names: "new_user_class_level"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features {
+    input_names: "tag_category_list"
+    feature_type: TagFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+    separator: "|"
+  }
+  features {
+    input_names: "tag_brand_list"
+    feature_type: TagFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+    separator: "|"
+  }
+  features {
+    input_names: "price"
+    feature_type: IdFeature
+    embedding_dim: 16
+    num_buckets: 50
+  }
+}
+model_config {
+  model_name: "MMoE"
+  model_class: "MultiTaskModel"
+  feature_groups {
+    group_name: "all"
+    feature_names: "user_id"
+    feature_names: "cms_segid"
+    feature_names: "cms_group_id"
+    feature_names: "age_level"
+    feature_names: "pvalue_level"
+    feature_names: "shopping_level"
+    feature_names: "occupation"
+    feature_names: "new_user_class_level"
+    feature_names: "adgroup_id"
+    feature_names: "cate_id"
+    feature_names: "campaign_id"
+    feature_names: "customer"
+    feature_names: "brand"
+    feature_names: "price"
+    feature_names: "pid"
+    feature_names: "tag_category_list"
+    feature_names: "tag_brand_list"
+    wide_deep: DEEP
+  }
+  backbone {
+    blocks {
+      name: 'all'
+      inputs {
+        feature_group_name: 'all'
+      }
+      input_layer {
+        only_output_feature_list: true
+      }
+    }
+    blocks {
+      name: "senet"
+      inputs {
+        block_name: "all"
+      }
+      keras_layer {
+        class_name: 'SENet'
+        senet {
+          reduction_ratio: 4
+        }
+      }
+    }
+    blocks {
+      name: "mmoe"
+      inputs {
+        block_name: "senet"
+      }
+      keras_layer {
+        class_name: 'MMoE'
+        mmoe {
+          num_task: 2
+          num_expert: 3
+          expert_mlp {
+            hidden_units: [256, 128]
+          }
+        }
+      }
+    }
+  }
+  model_params {
+    task_towers {
+      tower_name: "ctr"
+      label_name: "clk"
+      dnn {
+        hidden_units: [128, 64]
+      }
+      num_class: 1
+      weight: 1.0
+      loss_type: CLASSIFICATION
+      metrics_set: {
+        auc {}
+      }
+    }
+    task_towers {
+      tower_name: "cvr"
+      label_name: "buy"
+      dnn {
+        hidden_units: [128, 64]
+      }
+      num_class: 1
+      weight: 1.0
+      loss_type: CLASSIFICATION
+      metrics_set: {
+        auc {}
+      }
+    }
+    l2_regularization: 1e-06
+  }
+  embedding_regularization: 5e-05
+}
diff --git a/samples/model_config/multi_tower_backbone_on_taobao.config b/samples/model_config/multi_tower_backbone_on_taobao.config
new file mode 100644
index 000000000..93ec357e4
--- /dev/null
+++ b/samples/model_config/multi_tower_backbone_on_taobao.config
@@ -0,0 +1,348 @@
+train_input_path: "data/test/tb_data/taobao_train_data"
+eval_input_path: "data/test/tb_data/taobao_test_data"
+model_dir: "experiments/multi_tower_backbone_taobao_ckpt"
+
+train_config {
+  log_step_count_steps: 100
+  optimizer_config: {
+    adam_optimizer: {
+      learning_rate: {
+        exponential_decay_learning_rate {
+          initial_learning_rate: 0.001
+          decay_steps: 1000
+          decay_factor: 0.5
+          min_learning_rate: 0.00001
+        }
+      }
+    }
+    use_moving_average: false
+  }
+  save_checkpoints_steps: 100
+  sync_replicas: True
+  num_steps: 200
+}
+
+eval_config {
+  metrics_set: {
+    auc {}
+  }
+}
+
+data_config {
+  input_fields {
+    input_name: 'clk'
+    input_type: INT32
+  }
+  input_fields {
+    input_name: 'buy'
+    input_type: INT32
+  }
+  input_fields {
+    input_name: 'pid'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'adgroup_id'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'cate_id'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'campaign_id'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'customer'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'brand'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'user_id'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'cms_segid'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'cms_group_id'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'final_gender_code'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'age_level'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'pvalue_level'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'shopping_level'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'occupation'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'new_user_class_level'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'tag_category_list'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'tag_brand_list'
+    input_type: STRING
+  }
+  input_fields {
+    input_name: 'price'
+    input_type: INT32
+  }
+
+  label_fields: 'clk'
+  batch_size: 4096
+  num_epochs: 10000
+  prefetch_size: 32
+  input_type: CSVInput
+}
+
+feature_config: {
+  features: {
+    input_names: 'pid'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'adgroup_id'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features: {
+    input_names: 'cate_id'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10000
+  }
+  features: {
+    input_names: 'campaign_id'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features: {
+    input_names: 'customer'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features: {
+    input_names: 'brand'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features: {
+    input_names: 'user_id'
+    feature_type: IdFeature
+    embedding_dim: 32
+    hash_bucket_size: 100000
+  }
+  features: {
+    input_names: 'cms_segid'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100
+  }
+  features: {
+    input_names: 'cms_group_id'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100
+  }
+  features: {
+    input_names: 'final_gender_code'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'age_level'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'pvalue_level'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'shopping_level'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'occupation'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'new_user_class_level'
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features: {
+    input_names: 'tag_category_list'
+    feature_type: TagFeature
+    separator: '|'
+    hash_bucket_size: 100000
+    embedding_dim: 16
+  }
+  features: {
+    input_names: 'tag_brand_list'
+    feature_type: TagFeature
+    separator: '|'
+    hash_bucket_size: 100000
+    embedding_dim: 16
+  }
+  features: {
+    input_names: 'price'
+    feature_type: IdFeature
+    embedding_dim: 16
+    num_buckets: 50
+  }
+}
+model_config: {
+  model_name: 'MultiTower'
+  model_class: 'RankModel'
+  feature_groups: {
+    group_name: 'user'
+    feature_names: 'user_id'
+    feature_names: 'cms_segid'
+    feature_names: 'cms_group_id'
+    feature_names: 'age_level'
+    feature_names: 'pvalue_level'
+    feature_names: 'shopping_level'
+    feature_names: 'occupation'
+    feature_names: 'new_user_class_level'
+    wide_deep: DEEP
+  }
+  feature_groups: {
+    group_name: 'item'
+    feature_names: 'adgroup_id'
+    feature_names: 'cate_id'
+    feature_names: 'campaign_id'
+    feature_names: 'customer'
+    feature_names: 'brand'
+    feature_names: 'price'
+    wide_deep: DEEP
+  }
+  feature_groups: {
+    group_name: 'combo'
+    feature_names: 'pid'
+    feature_names: 'tag_category_list'
+    feature_names: 'tag_brand_list'
+    wide_deep: DEEP
+  }
+  losses {
+    loss_type: F1_REWEIGHTED_LOSS
+    weight: 1.0
+    f1_reweighted_loss {
+      f1_beta_square: 2.25
+    }
+  }
+  losses {
+    loss_type: PAIR_WISE_LOSS
+    weight: 1.0
+  }
+  backbone {
+    packages {
+      name: "user_tower"
+      blocks {
+        name: "mlp"
+        inputs {
+          feature_group_name: "user"
+        }
+        keras_layer {
+          class_name: "MLP"
+          mlp {
+            hidden_units: [256, 128]
+          }
+        }
+      }
+    }
+    packages {
+      name: "item_tower"
+      blocks {
+        name: "mlp"
+        inputs {
+          feature_group_name: "item"
+        }
+        keras_layer {
+          class_name: "MLP"
+          mlp {
+            hidden_units: [256, 128]
+          }
+        }
+      }
+    }
+    packages {
+      name: "combo_tower"
+      blocks {
+        name: "mlp"
+        inputs {
+          feature_group_name: "combo"
+        }
+        keras_layer {
+          class_name: "MLP"
+          mlp {
+            hidden_units: [256, 128]
+          }
+        }
+      }
+    }
+    blocks {
+      name: "top_mlp"
+      inputs {
+        package_name: "user_tower"
+      }
+      inputs {
+        package_name: "item_tower"
+      }
+      inputs {
+        package_name: "combo_tower"
+      }
+      keras_layer {
+        class_name: "MLP"
+        mlp {
+          hidden_units: [256, 128, 64]
+        }
+      }
+    }
+  }
+  model_params {
+    l2_regularization: 1e-6
+  }
+  embedding_regularization: 1e-4
+}
+
+export_config {
+  multi_placeholder: false
+}
diff --git a/samples/model_config/simple_multi_task_backbone_on_taobao.config b/samples/model_config/simple_multi_task_backbone_on_taobao.config
new file mode 100644
index 000000000..9737e8193
--- /dev/null
+++ b/samples/model_config/simple_multi_task_backbone_on_taobao.config
@@ -0,0 +1,291 @@
+train_input_path: "data/test/tb_data/taobao_train_data"
+eval_input_path: "data/test/tb_data/taobao_test_data"
+model_dir: "experiments/simple_multi_task_backbone_taobao_ckpt"
+
+train_config {
+  optimizer_config {
+    adam_optimizer {
+      learning_rate {
+        exponential_decay_learning_rate {
+          initial_learning_rate: 0.001
+          decay_steps: 1000
+          decay_factor: 0.5
+          min_learning_rate: 1e-07
+        }
+      }
+    }
+    use_moving_average: false
+  }
+  num_steps: 200
+  sync_replicas: true
+  save_checkpoints_steps: 100
+  log_step_count_steps: 100
+}
+eval_config {
+  metrics_set {
+    auc {
+    }
+  }
+}
+data_config {
+  batch_size:
4096
+  label_fields: "clk"
+  label_fields: "buy"
+  prefetch_size: 32
+  input_type: CSVInput
+  input_fields {
+    input_name: "clk"
+    input_type: INT32
+  }
+  input_fields {
+    input_name: "buy"
+    input_type: INT32
+  }
+  input_fields {
+    input_name: "pid"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "adgroup_id"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "cate_id"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "campaign_id"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "customer"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "brand"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "user_id"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "cms_segid"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "cms_group_id"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "final_gender_code"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "age_level"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "pvalue_level"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "shopping_level"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "occupation"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "new_user_class_level"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "tag_category_list"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "tag_brand_list"
+    input_type: STRING
+  }
+  input_fields {
+    input_name: "price"
+    input_type: INT32
+  }
+}
+feature_config: {
+  features {
+    input_names: "pid"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features {
+    input_names: "adgroup_id"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features {
+    input_names: "cate_id"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10000
+  }
+  features {
+    input_names: "campaign_id"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features {
+    input_names: "customer"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features {
+    input_names: "brand"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features {
+    input_names: "user_id"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+  }
+  features {
+    input_names: "cms_segid"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100
+  }
+  features {
+    input_names: "cms_group_id"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 100
+  }
+  features {
+    input_names: "final_gender_code"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features {
+    input_names: "age_level"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features {
+    input_names: "pvalue_level"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features {
+    input_names: "shopping_level"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features {
+    input_names: "occupation"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features {
+    input_names: "new_user_class_level"
+    feature_type: IdFeature
+    embedding_dim: 16
+    hash_bucket_size: 10
+  }
+  features {
+    input_names: "tag_category_list"
+    feature_type: TagFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+    separator: "|"
+  }
+  features {
+    input_names: "tag_brand_list"
+    feature_type: TagFeature
+    embedding_dim: 16
+    hash_bucket_size: 100000
+    separator: "|"
+  }
+  features {
+    input_names: "price"
+    feature_type: IdFeature
+    embedding_dim: 16
+    num_buckets: 50
+  }
+}
+model_config {
+  model_name: "SimpleMultiTask"
+  model_class: "MultiTaskModel"
+  feature_groups {
+    group_name: "all"
+    feature_names: "user_id"
+    feature_names: "cms_segid"
+    feature_names: "cms_group_id"
+    feature_names: "age_level"
+    feature_names: "pvalue_level"
+    feature_names: "shopping_level"
+    feature_names: "occupation"
+    feature_names: "new_user_class_level"
+    feature_names: "adgroup_id"
+    feature_names: "cate_id"
+    feature_names: "campaign_id"
+    feature_names: "customer"
+    feature_names: "brand"
+    feature_names: "price"
+    feature_names: "pid"
+    feature_names: "tag_category_list"
+    feature_names: "tag_brand_list"
+    wide_deep: DEEP
+  }
+  backbone {
+    blocks {
+      name: "identity"
+      inputs {
+        feature_group_name: "all"
+      }
+    }
+  }
+  model_params {
+    task_towers {
+      tower_name: "ctr"
+      label_name: "clk"
+      dnn {
+        hidden_units: [256, 192, 128, 64]
+      }
+      num_class: 1
+      weight: 1.0
+      loss_type: CLASSIFICATION
+      metrics_set: {
+        auc {}
+      }
+    }
+    task_towers {
+      tower_name: "cvr"
+      label_name: "buy"
+      dnn {
+        hidden_units: [256, 192, 128, 64]
+      }
+      num_class: 1
+      weight: 1.0
+      loss_type: CLASSIFICATION
+      metrics_set: {
+        auc {}
+      }
+    }
+    l2_regularization: 1e-07
+  }
+  embedding_regularization: 5e-06
+}
diff --git a/samples/model_config/wide_and_deep_backbone_on_avazau.config b/samples/model_config/wide_and_deep_backbone_on_avazau.config
new file mode 100755
index 000000000..59de34076
--- /dev/null
+++ b/samples/model_config/wide_and_deep_backbone_on_avazau.config
@@ -0,0 +1,391 @@
+train_input_path: "data/test/dwd_avazu_ctr_deepmodel_10w.csv"
+eval_input_path: "data/test/dwd_avazu_ctr_deepmodel_10w.csv"
+model_dir: "experiments/wide_and_deep_backbone_on_avazu"
+
+train_config {
+  log_step_count_steps: 200
+  # fine_tune_checkpoint: ""
+  optimizer_config: {
+    adam_optimizer: {
+      learning_rate: {
+        exponential_decay_learning_rate {
+          initial_learning_rate: 0.0001
+          decay_steps: 10000
+          decay_factor: 0.5
+          min_learning_rate: 0.0000001
+        }
+      }
+    }
+    use_moving_average: false
+  }
+
+  sync_replicas: true
+  save_checkpoints_steps: 500
+  num_steps: 100
+}
+
+eval_config {
+  metrics_set: {
+    auc {}
+  }
+}
+
+data_config {
+  separator: ","
+  input_fields: {
+    input_name: "label"
+    input_type: INT64
+    default_val: "0"
+  }
+  input_fields: {
+    input_name: "hour"
+    input_type: INT64
+    default_val: "0"
+  }
+  input_fields: {
+    input_name: "c1"
input_type: INT64 + default_val:"0" + } + input_fields: { + input_name: "banner_pos" + input_type: INT64 + default_val:"0" + } + input_fields: { + input_name: "site_id" + input_type: STRING + default_val:"0" + } + input_fields: { + input_name: "site_domain" + input_type: STRING + default_val:"0" + } + input_fields: { + input_name: "site_category" + input_type: STRING + default_val:"0" + } + input_fields: { + input_name: "app_id" + input_type: STRING + default_val:"0" + } + input_fields: { + input_name: "app_domain" + input_type: STRING + default_val:"0" + } + input_fields: { + input_name: "app_category" + input_type: STRING + default_val:"0" + } + input_fields: { + input_name: "device_id" + input_type: STRING + default_val:"0" + } + input_fields: { + input_name: "device_ip" + input_type: STRING + default_val:"0" + } + input_fields: { + input_name: "device_model" + input_type: STRING + default_val:"0" + } + input_fields: { + input_name: "device_type" + input_type: STRING + default_val:"0" + } + input_fields: { + input_name: "device_conn_type" + input_type: STRING + default_val:"0" + } + input_fields: { + input_name: "c14" + input_type: STRING + default_val:"0" + } + input_fields: { + input_name: "c15" + input_type: STRING + default_val:"0" + } + input_fields: { + input_name: "c16" + input_type: STRING + default_val:"0" + } + input_fields: { + input_name: "c17" + input_type: STRING + default_val:"0" + } + input_fields: { + input_name: "c18" + input_type: STRING + default_val:"0" + } + input_fields: { + input_name: "c19" + input_type: INT64 + default_val:"0" + } + input_fields: { + input_name: "c20" + input_type: INT64 + default_val:"0" + } + input_fields: { + input_name: "c21" + input_type: INT64 + default_val:"0" + } + label_fields: "label" + + batch_size: 1024 + num_epochs: 10000 + prefetch_size: 32 + input_type: CSVInput +} + +feature_config: { + features: { + input_names: "hour" + feature_type: RawFeature + boundaries: 
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23] + embedding_dim: 16 + } + features: { + input_names: "c1" + feature_type: RawFeature + boundaries: [1000.0,1001.0,1002.0,1003.0,1004.0,1005.0,1006.0,1007.0,1008.0,1009.0,1010.0,1011.0,1012.0,1013.0,1014.0,1015.0] + embedding_dim: 16 + } + features: { + input_names: "banner_pos" + feature_type: RawFeature + boundaries: [1,2,3,4,5,6] + embedding_dim: 16 + } + features: { + input_names: "site_id" + feature_type: IdFeature + embedding_dim: 32 + hash_bucket_size: 10000 + } + features: { + input_names: "site_domain" + feature_type: IdFeature + embedding_dim: 20 + hash_bucket_size: 100 + } + features: { + input_names: "site_category" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 100 + } + features: { + input_names: "app_id" + feature_type: IdFeature + embedding_dim: 32 + hash_bucket_size: 10000 + } + features: { + input_names: "app_domain" + feature_type: IdFeature + embedding_dim: 20 + hash_bucket_size: 1000 + } + features: { + input_names: "app_category" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 100 + } + features: { + input_names: "device_id" + feature_type: IdFeature + embedding_dim: 64 + hash_bucket_size: 100000 + } + features: { + input_names: "device_ip" + feature_type: IdFeature + embedding_dim: 64 + hash_bucket_size: 100000 + } + features: { + input_names: "device_model" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 10000 + } + features: { + input_names: "device_type" + feature_type: IdFeature + embedding_dim: 16 + hash_bucket_size: 10 + } + features: { + input_names: "device_conn_type" + feature_type: IdFeature + embedding_dim: 32 + hash_bucket_size: 10 + } + features: { + input_names: "c14" + feature_type: IdFeature + embedding_dim: 20 + hash_bucket_size: 500 + } + features: { + input_names: "c15" + feature_type: IdFeature + embedding_dim: 20 + hash_bucket_size: 500 + } + features: { + input_names: "c16" + feature_type: IdFeature + 
embedding_dim: 20 + hash_bucket_size: 500 + } + features: { + input_names: "c17" + feature_type: IdFeature + embedding_dim: 20 + hash_bucket_size: 500 + } + features: { + input_names: "c18" + feature_type: IdFeature + embedding_dim: 20 + hash_bucket_size: 500 + } + features: { + input_names: "c19" + feature_type: RawFeature + boundaries: [10,20,30,40,50,60,70,80,90,100,110,120,130,140,150,160,170,180,190] + embedding_dim: 16 + } + features: { + input_names: "c20" + feature_type: RawFeature + boundaries: [100.0,200.0,300.0,400.0,500.0,600.0,700.0,800.0, 900.0, 1000.0,1100.0,1200.0, 1300.0,1400.0] + embedding_dim: 16 + } + features: { + input_names: "c21" + feature_type: RawFeature + boundaries: [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25] + embedding_dim: 16 + } +} +model_config { + model_class: "RankModel" + feature_groups: { + group_name: "deep" + feature_names: "hour" + feature_names: "c1" + feature_names: "banner_pos" + feature_names: "site_id" + feature_names: "site_domain" + feature_names: "site_category" + feature_names: "app_id" + feature_names: "app_domain" + feature_names: "app_category" + feature_names: "device_id" + feature_names: "device_ip" + feature_names: "device_model" + feature_names: "device_type" + feature_names: "device_conn_type" + feature_names: "c14" + feature_names: "c15" + feature_names: "c16" + feature_names: "c17" + feature_names: "c18" + feature_names: "c19" + feature_names: "c20" + feature_names: "c21" + wide_deep:DEEP + } + feature_groups: { + group_name: "wide" + feature_names: "hour" + feature_names: "c1" + feature_names: "banner_pos" + feature_names: "site_id" + feature_names: "site_domain" + feature_names: "site_category" + feature_names: "app_id" + feature_names: "app_domain" + feature_names: "app_category" + feature_names: "device_id" + feature_names: "device_ip" + feature_names: "device_model" + feature_names: "device_type" + feature_names: "device_conn_type" + feature_names: "c14" + feature_names: "c15" + 
feature_names: "c16" + feature_names: "c17" + feature_names: "c18" + feature_names: "c19" + feature_names: "c20" + feature_names: "c21" + wide_deep:WIDE + } + backbone { + blocks { + name: 'wide' + inputs { + feature_group_name: 'wide' + } + input_layer { + only_output_feature_list: true + wide_output_dim: 1 + } + } + blocks { + name: 'deep_logit' + inputs { + feature_group_name: 'deep' + } + keras_layer { + class_name: 'MLP' + mlp { + hidden_units: [256, 256, 256, 1] + use_final_bn: false + final_activation: 'linear' + } + } + } + blocks { + name: 'final_logit' + inputs { + block_name: 'wide' + input_fn: 'lambda x: tf.add_n(x)' + } + inputs { + block_name: 'deep_logit' + } + merge_inputs_into_list: true + keras_layer { + class_name: 'Add' + } + } + concat_blocks: 'final_logit' + } + model_params { + l2_regularization: 1e-4 + } + embedding_regularization: 1e-7 +}