GitHub - Sein-Kim/TERACON

Task Relation-aware Continual User Representation Learning

This repository contains the revised source code for the paper "Task Relation-aware Continual User Representation Learning", accepted at KDD 2023.
The revised version reorganizes the code and adds annotations and detailed descriptions for readability; it is also more efficient than the previous version.
For the old version of the TERACON code, please use the following URL.

Abstract

User modeling, which learns to represent users into a low-dimensional representation space based on their past behaviors, has gained a surge of interest from the industry for providing personalized services to users. Previous efforts in user modeling mainly focus on learning a task-specific user representation that is designed for a single task. However, since learning task-specific user representations for every task is infeasible, recent studies introduce the concept of universal user representation, which is a more generalized representation of a user that is relevant to a variety of tasks. Despite their effectiveness, existing approaches for learning universal user representations are impractical in real-world applications due to the data requirement, catastrophic forgetting and the limited learning capability for continually added tasks. In this paper, we propose a novel continual user representation learning method, called TERACON, whose learning capability is not limited as the number of learned tasks increases while capturing the relationship between the tasks. The main idea is to introduce an embedding for each task, i.e., task embedding, which is utilized to generate task-specific soft masks that not only allow the entire model parameters to be updated until the end of training sequence, but also facilitate the relationship between the tasks to be captured. Moreover, we introduce a novel knowledge retention module with pseudo-labeling strategy that successfully alleviates the long-standing problem of continual learning, i.e., catastrophic forgetting. Extensive experiments on public and proprietary real-world datasets demonstrate the superiority and practicality of TERACON.

Dataset

You can download the datasets (TTL and MovieLens) from the following links (taken from CONURE):
- TTL: https://drive.google.com/file/d/1imhHUsivh6oMEtEW-RwVc4OsDqn-xOaP/view
- MovieLens: https://grouplens.org/datasets/movielens/25m/
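For your own custom dataset, format each line as follows:
- Format: Input Sequence,,Targets (both parts are comma-separated item IDs, with a double comma between the input sequence and the targets)
- For example: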
```
0,0,0,0,0,0,1,2,3,4,,2,3,4
0,0,0,0,0,1,2,5,7,3,,8
0,0,0,5,6,7,2,2,4,5,,10
0,0,0,0,0,8,9,3,4,4,,20
```
- Please refer to the example datasets in the example folder.
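
The record format above can be read with a few lines of Python. This is a minimal sketch, not the repository's `data_loader.py`: the function name `parse_line` is hypothetical, and treating the leading zeros as padding is an assumption based only on the example rows.

```python
from typing import List, Tuple


def parse_line(line: str) -> Tuple[List[int], List[int]]:
    """Split one 'sequence,,targets' record into two lists of item IDs."""
    # The first double comma separates the input sequence from the targets.
    sequence_part, sep, target_part = line.strip().partition(",,")
    if not sep:
        raise ValueError(f"missing ',,' separator: {line!r}")
    sequence = [int(tok) for tok in sequence_part.split(",") if tok]
    targets = [int(tok) for tok in target_part.split(",") if tok]
    return sequence, targets


records = [
    "0,0,0,0,0,0,1,2,3,4,,2,3,4",
    "0,0,0,0,0,1,2,5,7,3,,8",
]

for record in records:
    sequence, targets = parse_line(record)
    # Assumption: 0 is a padding ID, so it is dropped to show the real history.
    history = [item for item in sequence if item != 0]
    print(history, targets)
```

A record without the double comma raises a `ValueError`, which makes malformed lines in a custom dataset easy to spot before training.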

Requirements

PyTorch version: 1.7.1
Numpy version: 1.19.2

How to Run

git clone https://github.com/Sein-Kim/TERACON.git
cd TERACON
mkdir -p saved_models Data/Session ColdRec

Download the TTL data (see the link in the Dataset section) and place it in the ColdRec folder.
To train the model on Task 1, run train_task1.py as follows:
```
python train_task1.py --epochs 10 --lr 0.001
```
Then, run subsequent tasks by executing the following script:
```
sh ttl_train.sh
```
To perform inference on past tasks using the current model, run inference_past_tasks.py.
- For example, if you have trained on Task 1 through Task 5 using the TTL dataset, run the following script to perform inference on Task 1 through Task 5:
```
sh ttl_inference.sh
```

Arguments

--datapath: Path of the dataset.
- usage example: --datapath ./ColdRec/original_desen_finetune_click_nosuerID.csv
--paths: Path of the model trained on a previous task.
- usage example: --paths ./saved_models/task1.t7
--savepath: Path to which the current model is saved.
- usage example: --savepath ./saved_models/task2
--n_tasks: Total number of the tasks.
- usage example: --n_tasks 2
--datapath_index: Path of the item index dictionary (i.e., Data/Session/index.csv).
- usage example: --datapath_index Data/Session/index.csv
- Note that the file index.csv is automatically generated when running Task 1. Specifically, when training the model on Task 1, data_loader generates the index.csv file, which contains the index information for all items in Task 1.
--lr: Learning rate.
- usage example: --lr 0.0001
--alpha: A hyperparameter that controls the contribution of the knowledge retention.
- usage example: --alpha 0.7
--smax: Positive scaling hyperparameter.
- usage example: --smax 50
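
To give a feel for what `--smax` and `--alpha` control, here is a small pure-Python sketch. It is illustrative only and is not taken from `teracon.py`: it assumes a sigmoid gate over a scaled task embedding (the formulation used by hard-attention-to-the-task style methods) and assumes the knowledge retention term is added to the task loss with weight alpha. The function names `soft_mask` and `combined_loss` are hypothetical.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def soft_mask(task_embedding: List[float], s: float) -> List[float]:
    """Assumed gate: sigmoid(s * e) per dimension; a larger s gives a sharper mask."""
    return [sigmoid(s * e) for e in task_embedding]


def combined_loss(task_loss: float, retention_loss: float, alpha: float) -> float:
    """Assumed objective: current-task loss plus alpha times the retention loss."""
    return task_loss + alpha * retention_loss


embedding = [-1.0, 0.0, 1.0]
print(soft_mask(embedding, 1.0))   # gentle gate, values spread between 0 and 1
print(soft_mask(embedding, 50.0))  # near-binary gate, as with --smax 50
print(combined_loss(1.0, 2.0, 0.7))
```

With a small scale the mask stays soft, so every parameter still receives gradient; with a large scale it approaches a binary mask. Under this reading, a larger alpha puts more weight on preserving past-task behavior.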

Source code of the backbone network

The source code of the backbone network is referenced from:
- https://github.com/yuangh-x/2022-NIPS-Tenrec
- https://github.com/syiswell/NextItNet-Pytorch

Cite (Bibtex)

Please cite the following paper if you find TERACON useful in your research:
- Kim, Sein and Lee, Namkyeong and Kim, Donghyun and Yang, Minchul and Park, Chanyoung. “Task Relation-aware Continual User Representation Learning.” KDD 2023.
- Bibtex

@article{kim2023task,
  title={Task Relation-aware Continual User Representation Learning},
  author={Kim, Sein and Lee, Namkyeong and Kim, Donghyun and Yang, Minchul and Park, Chanyoung},
  journal={arXiv preprint arXiv:2306.01792},
  year={2023}
}

