A deep learning (Item2vec embedding + MLP) based recommendation system that combines feature engineering, training, and prediction in one package and can run on a small server or an edge device.
Progress:
- Simple 2 layer MLP test on MovieLens
- Dropout and L2 regularization
- Batch Normalization
Progress:
- YouTube DNN test on MovieLens
- Dropout and L2 regularization
- Batch Normalization
Progress:
- DIN test on MovieLens
- Euclidean Distance based attention
- Dropout and L2 regularization
- Batch Normalization
You can run the MovieLens training and prediction demo with:
# download and unzip the SQLite DB file
wget https://github.com/auxten/edgeRec/files/9895974/movielens.db.zip && \
unzip movielens.db.zip
# compile the edgeRec and put it in the current directory
GOBIN=`pwd` go install github.com/auxten/edgeRec@latest && \
./edgeRec
Wait until the message "Listening and serving HTTP on :8080" appears.
Then test the API in another terminal:
curl --header "Content-Type: application/json" \
--request POST \
--data '{"userId":108,"itemIdList":[1,2,39]}' \
http://localhost:8080/api/v1/recommend
You should get a response like this:
{"itemScoreList":[
{"itemId":1,"score":0.7517360474797006},
{"itemId":2,"score":0.5240565619788571},
{"itemId":39,"score":0.38496231172036016}
]}
A higher score means a stronger predicted preference, so user #108 is likely to prefer movie #1 over #2 and #39.
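The same request can be sent from Go. The sketch below assumes the demo server from the steps above is listening on localhost:8080. The JSON field names are taken from the curl example, while the struct and function names (`recommendRequest`, `parseRanked`, `recommend`) are hypothetical helpers, not part of edgeRec.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"sort"
	"time"
)

// Request and response shapes mirror the curl example above.
type recommendRequest struct {
	UserID     int   `json:"userId"`
	ItemIDList []int `json:"itemIdList"`
}

type itemScore struct {
	ItemID int     `json:"itemId"`
	Score  float64 `json:"score"`
}

type recommendResponse struct {
	ItemScoreList []itemScore `json:"itemScoreList"`
}

// parseRanked decodes a response body and sorts items by descending score.
func parseRanked(body []byte) ([]itemScore, error) {
	var resp recommendResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	sort.SliceStable(resp.ItemScoreList, func(i, j int) bool {
		return resp.ItemScoreList[i].Score > resp.ItemScoreList[j].Score
	})
	return resp.ItemScoreList, nil
}

// recommend posts the request to the endpoint and returns the ranked items.
func recommend(url string, userID int, items []int) ([]itemScore, error) {
	payload, err := json.Marshal(recommendRequest{UserID: userID, ItemIDList: items})
	if err != nil {
		return nil, err
	}
	client := &http.Client{Timeout: 5 * time.Second}
	httpResp, err := client.Post(url, "application/json", bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	defer httpResp.Body.Close()
	if httpResp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", httpResp.Status)
	}
	body, err := io.ReadAll(httpResp.Body)
	if err != nil {
		return nil, err
	}
	return parseRanked(body)
}

func main() {
	ranked, err := recommend("http://localhost:8080/api/v1/recommend", 108, []int{1, 2, 39})
	if err != nil {
		fmt.Println("request failed (is ./edgeRec running?):", err)
		return
	}
	for _, it := range ranked {
		fmt.Printf("item %d: %.4f\n", it.ItemID, it.Score)
	}
}
```

Sorting on the client side is a convenience here; nothing in the text says whether the server returns items in score order.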
To create a deep learning based recommendation system, follow the steps below. If you prefer a "show me the code" approach, go straight to the MovieLens Example.
- Implement the recommend.RecSys interface, which includes the functions below:
  GetUserFeature(context.Context, int) (Tensor, error)
  GetItemFeature(context.Context, int) (Tensor, error)
  SampleGenerator(context.Context) (<-chan Sample, error)
- Call the functions Train and StartHttpApi:
  model, _ = recommend.Train(recSys)
  recommend.StartHttpApi(model, "/api/v1/recommend", ":8080")
- If you want better AUC with item embedding, you can implement the recommend.ItemEmbedding interface, which includes the function below:
  // ItemEmbedding is an interface used to generate item embedding with the item2vec model
  // by just providing a behavior based item sequence.
  // Example: user liked items sequence, user bought items sequence,
  // user viewed items sequence
  type ItemEmbedding interface {
      ItemSeqGenerator() (<-chan string, error)
  }
All you need to do is implement the functions in the gray part of the diagram.
- Pure Golang implementation, batteries included
- Parameter Server based Online Learning
- Training & Inference all in one binary, powered by Golang
- Database support
  - MySQL support
  - SQLite support
  - Database Aggregation accelerated Feature Normalization
- Feature Engineering
  - Item2vec embedding
  - Rule based FE config
  - Deep learning based Auto Feature Engineering
- Demo
  - MovieLens Demo
  - Android demo
  - iOS demo
- Apple M1 Max
- Database: SQLite3
- Model: SkipGram, Optimizer: HierarchicalSoftmax
- WindowSize: 5
- Data: MovieLens 10m
read 9520886 words 12.169282375s
trained 9519544 words 17.155356791s
Search Embedding of:
59784 "Kung Fu Panda (2008)" Action|Animation|Children|Comedy
RANK | WORD | SIMILARITY | TITLE & GENRES
-------+-------+-------------+-------------
1 | 60072 | 0.974392 | Wanted (2008) Action|Thriller
2 | 60040 | 0.974080 | Incredible Hulk, The (2008) Action|Fantasy|Sci-Fi
3 | 60069 | 0.973728 | WALL·E (2008) Adventure|Animation|Children|Comedy|Romance|Sci-Fi
4 | 60074 | 0.970396 | Hancock (2008) Action|Comedy|Drama|Fantasy
5 | 63859 | 0.969845 | Bolt (2008) Action|Adventure|Animation|Children|Comedy
6 | 57640 | 0.969305 | Hellboy II: The Golden Army (2008) Action|Adventure|Comedy|Fantasy|Sci-Fi
7 | 58299 | 0.967733 | Horton Hears a Who! (2008) Adventure|Animation|Children|Comedy
8 | 59037 | 0.966410 | Speed Racer (2008) Action|Adventure|Children
9 | 59315 | 0.964556 | Iron Man (2008) Action|Adventure|Sci-Fi
10 | 58105 | 0.963332 | Spiderwick Chronicles, The (2008) Adventure|Children|Drama|Fantasy
- Dataset: MovieLens 100k, randomly split 80%/20% by userId
- Code: example/movielens
- Training time: 28s
- AUC: 0.782
- Q: What model do you use?
  A: Just a 2-layer neural network with item2vec embedding.
- Q: Where can I use this?
  A: In a simple system backed by a database. With about 100 lines of Golang, you get a better-than-nothing recommendation system.
- Q: Where wouldn't I use this?
  A: On large (100+ million) datasets that need SOTA models.
To make this project work, quite a lot of code was copied and modified from the following libraries:
- Neural Network & Parameter Server:
- Feature Engineering:
- FastAPI like framework:
- Gopher logo with GIMP:
- Thanks to JetBrains for providing a free license for this project.
- YouTube DNN
- Deep Interest Network for Click-Through Rate Prediction
- Document Embedding with Paragraph Vectors
- EdgeRec: Recommender System on Edge in Mobile Taobao (not an identical implementation)