---
date: '2024-11-15'
title: 'Byte pair encoding for Knowledge Graph Embeddings'
type: 'Bachelor'
supervisor: dice:CaglarDemir
contact: dice:CaglarDemir
---

# Topic
A knowledge graph embedding (KGE) model assigns a unique embedding row to each unique entity/node and relation/edge.
As the number of unique entities or relations grows, the memory usage of the KGE model increases.
Therefore, the memory required to train a KGE model or to deploy a trained one is bounded by the size of the data.

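To make the memory argument concrete, here is a minimal sketch (not part of the topic description; the sizes are hypothetical and PyTorch is assumed) of how the parameter count of a conventional KGE embedding table grows linearly with the number of unique entities and relations.

```python
import torch

# Hypothetical sizes for illustration only.
num_entities, num_relations, dim = 1_000_000, 500, 128

# One embedding row per unique entity and per unique relation.
entity_emb = torch.nn.Embedding(num_entities, dim)
relation_emb = torch.nn.Embedding(num_relations, dim)

# float32 parameters take 4 bytes each, so memory scales with the vocabulary size.
num_params = entity_emb.weight.numel() + relation_emb.weight.numel()
print(f"{num_params:,} parameters, ~{num_params * 4 / 1e9:.2f} GB as float32")
```
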
LLMs use byte pair encoding to learn to represent sequences of characters with subword units.
Therefore, LLM embeddings correspond to subword units instead of unique words.
Recently, we showed that the byte pair encoding scheme developed for LLMs can also be used for KGEs (see [Inference over Unseen Entities, Relations and Literals on Knowledge Graphs](https://arxiv.org/pdf/2410.06742)).
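
The sketch below is a self-contained toy version of the byte pair encoding idea, not the tokenizer used in the referenced paper: the most frequent adjacent symbol pair in a set of entity labels is merged repeatedly, so common character sequences become subword units shared across labels.

```python
from collections import Counter

def learn_bpe_merges(labels, num_merges):
    """Learn BPE-style merges over a toy set of entity labels."""
    corpus = [list(label) for label in labels]  # start from character symbols
    merges = []
    for _ in range(num_merges):
        # Count all adjacent symbol pairs across the corpus.
        pairs = Counter()
        for symbols in corpus:
            pairs.update(zip(symbols, symbols[1:]))
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Replace every occurrence of the most frequent pair with a merged symbol.
        new_corpus = []
        for symbols in corpus:
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_corpus.append(merged)
        corpus = new_corpus
    return merges, corpus

labels = ["barack_obama", "michelle_obama", "obama_family"]
merges, tokenized = learn_bpe_merges(labels, num_merges=8)
print(tokenized)  # each label is now a sequence of learned subword units
```
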
In this thesis, the student will design a byte pair encoding scheme based on a given knowledge graph.
The student will work closely with [dice-embeddings](https://github.com/dice-group/dice-embeddings).
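
As one possible illustration (a hypothetical sketch, not the dice-embeddings API and not a prescribed design), an entity embedding can then be aggregated from the embeddings of its subword units, so the parameter count depends on the subword vocabulary rather than on the number of unique entities.

```python
import torch

# Toy subword vocabulary; in practice it would come from the learned BPE merges.
subword_vocab = {"barack": 0, "obama": 1, "_": 2, "family": 3}
dim = 8
subword_emb = torch.nn.Embedding(len(subword_vocab), dim)

def embed_entity(subword_units):
    """Aggregate subword-unit embeddings (here: mean pooling) into one entity vector."""
    ids = torch.tensor([subword_vocab[u] for u in subword_units])
    return subword_emb(ids).mean(dim=0)

# An entity unseen during training can still be embedded,
# as long as its label decomposes into known subword units.
print(embed_entity(["obama", "_", "family"]).shape)  # torch.Size([8])
```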

#### Question & Answer Session

In case you have further questions, feel free to contact [Caglar Demir](https://dice-research.org/CaglarDemir).