Merge pull request #248 from dice-group/develop

Prep for the new release

Demirrr authored Jun 26, 2024
2 parents dae330e + 3eebbac commit 4bc42ae
Showing 16 changed files with 400 additions and 1,508 deletions.
52 changes: 22 additions & 30 deletions README.md
@@ -35,7 +35,7 @@ Deploy a pre-trained embedding model without writing a single line of code.
### Installation from Source
``` bash
git clone https://github.com/dice-group/dice-embeddings.git
-conda create -n dice python=3.10.13 --no-default-packages && conda activate dice && cd dice-embeddings &&
+conda create -n dice python=3.10.13 --no-default-packages && conda activate dice
pip3 install -e .
```
or
@@ -48,7 +48,7 @@ wget https://files.dice-research.org/datasets/dice-embeddings/KGs.zip --no-check
```
To test the installation
```bash
-python -m pytest -p no:warnings -x # Runs >114 tests leading to > 15 mins
+python -m pytest -p no:warnings -x # Runs >119 tests leading to > 15 mins
python -m pytest -p no:warnings --lf # run only the last failed test
python -m pytest -p no:warnings --ff # to run the failures first and then the rest of the tests.
```
@@ -95,45 +95,26 @@ A KGE model can also be trained from the command line
```bash
dicee --dataset_dir "KGs/UMLS" --model Keci --eval_model "train_val_test"
```
-dicee automaticaly detects available GPUs and trains a model with distributed data parallels technique. Under the hood, dicee uses lighning as a default trainer.
+dicee automatically detects available GPUs and trains a model with the distributed data parallel technique.
```bash
# Train a model using only GPU 0
CUDA_VISIBLE_DEVICES=0 dicee --dataset_dir "KGs/UMLS" --model Keci --eval_model "train_val_test"
# Train a model using only GPU 1
CUDA_VISIBLE_DEVICES=1 dicee --dataset_dir "KGs/UMLS" --model Keci --eval_model "train_val_test"
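# Train a model on GPUs 0 and 1 with the lightning (PL) trainer, disabling NCCL peer-to-peer transport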
NCCL_P2P_DISABLE=1 CUDA_VISIBLE_DEVICES=0,1 python dicee/scripts/run.py --trainer PL --dataset_dir "KGs/UMLS" --model Keci --eval_model "train_val_test"
# Train a model by using all available GPUs
dicee --dataset_dir "KGs/UMLS" --model Keci --eval_model "train_val_test"
```
-Under the hood, dicee executes run.py script and uses lighning as a default trainer
+Under the hood, dicee executes the run.py script and uses [lightning](https://lightning.ai/) as the default trainer.
```bash
# Two equivalent executions
# (1)
dicee --dataset_dir "KGs/UMLS" --model Keci --eval_model "train_val_test"
# Evaluate Keci on Train set
# {'H@1': 0.9518788343558282, 'H@3': 0.9988496932515337, 'H@10': 1.0, 'MRR': 0.9753123402351737}
# Evaluate Keci on Validation set
# {'H@1': 0.6932515337423313, 'H@3': 0.9041411042944786, 'H@10': 0.9754601226993865, 'MRR': 0.8072362996241839}
# Evaluate Keci on Test set
# {'H@1': 0.6951588502269289, 'H@3': 0.9039334341906202, 'H@10': 0.9750378214826021, 'MRR': 0.8064032293278861}

# (2)
CUDA_VISIBLE_DEVICES=0,1 python dicee/scripts/run.py --trainer PL --dataset_dir "KGs/UMLS" --model Keci --eval_model "train_val_test"
# Evaluate Keci on Train set
# {'H@1': 0.9518788343558282, 'H@3': 0.9988496932515337, 'H@10': 1.0, 'MRR': 0.9753123402351737}
# Evaluate Keci on Validation set
# {'H@1': 0.6932515337423313, 'H@3': 0.9041411042944786, 'H@10': 0.9754601226993865, 'MRR': 0.8072362996241839}
# Evaluate Keci on Test set
# {'H@1': 0.6951588502269289, 'H@3': 0.9039334341906202, 'H@10': 0.9750378214826021, 'MRR': 0.8064032293278861}
```
Similarly, models can be easily trained with torchrun
```bash
torchrun --standalone --nnodes=1 --nproc_per_node=gpu dicee/scripts/run.py --trainer torchDDP --dataset_dir "KGs/UMLS" --model Keci --eval_model "train_val_test"
# Evaluate Keci on Train set
# {'H@1': 0.9518788343558282, 'H@3': 0.9988496932515337, 'H@10': 1.0, 'MRR': 0.9753123402351737}
# Evaluate Keci on Validation set
# {'H@1': 0.6932515337423313, 'H@3': 0.9041411042944786, 'H@10': 0.9754601226993865, 'MRR': 0.8072499937521418}
# Evaluate Keci on Test set
# {'H@1': 0.6951588502269289, 'H@3': 0.9039334341906202, 'H@10': 0.9750378214826021, 'MRR': 0.8064032293278861}
```
You can also train a model in a multi-node, multi-GPU setting.
```bash
@@ -143,7 +124,7 @@ torchrun --nnodes 2 --nproc_per_node=gpu --node_rank 1 --rdzv_id 455 --rdzv_bac
Train a KGE model by providing the path of a single file and store all parameters under a newly created directory called `KeciFamilyRun`.
```bash
-dicee --path_single_kg "KGs/Family/family-benchmark_rich_background.owl" --model Keci --path_to_store_single_run KeciFamilyRun --backend rdflib
+dicee --path_single_kg "KGs/Family/family-benchmark_rich_background.owl" --model Keci --path_to_store_single_run KeciFamilyRun --backend rdflib --eval_model None
```
where the data is in the following form
```bash
@@ -152,6 +133,11 @@ _:1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07
<http://www.benchmark.org/family#hasChild> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#ObjectProperty> .
<http://www.benchmark.org/family#hasParent> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#ObjectProperty> .
```
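As a quick sanity check, the input file can be parsed before training (a minimal sketch, assuming `rdflib` is installed, which the `--backend rdflib` option requires; `Graph.parse` infers the RDF/XML serialization from the `.owl` extension):
```python
from rdflib import Graph

# Parse the same file that --path_single_kg points to above.
g = Graph().parse("KGs/Family/family-benchmark_rich_background.owl")
print(len(g), "triples parsed")
```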
**Continual Training:** The training of a pretrained model can be resumed.
```bash
dicee --continual_learning KeciFamilyRun --path_single_kg "KGs/Family/family-benchmark_rich_background.owl" --model Keci --path_to_store_single_run KeciFamilyRun --backend rdflib --eval_model None
```
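After the resumed run finishes, the updated model can be reloaded from the run directory and queried (a minimal sketch; `KGE(path=...)` mirrors the pretrained-model snippets later in this README, and the entity/relation IRIs are illustrative placeholders):
```python
from dicee import KGE

# Reload the run directory created/updated above.
pre_trained_kge = KGE(path="KeciFamilyRun")
# Illustrative query: head and relation must be IRIs that occur in the input KG.
pre_trained_kge.predict_topk(h=["http://www.benchmark.org/family#F9M167"],
                             r=["http://www.benchmark.org/family#hasChild"],
                             topk=3)
```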

**Apart from n-triples or standard link prediction dataset formats, we support ["owl", "nt", "turtle", "rdf/xml", "n3"].**
Moreover, a KGE model can also be trained by providing **an endpoint of a triple store**.
```bash
@@ -285,16 +271,22 @@ pre_trained_kge.predict_topk(r=[".."],t=[".."],topk=10)

## Downloading Pretrained Models

We provide plenty of pretrained knowledge graph embedding models at [dice-research.org/projects/DiceEmbeddings/](https://files.dice-research.org/projects/DiceEmbeddings/).
<details> <summary> To see a code snippet </summary>

```python
from dicee import KGE
# (1) Load a pretrained Keci on KINSHIP
model = KGE(url="https://files.dice-research.org/projects/DiceEmbeddings/KINSHIP-Keci-dim128-epoch256-KvsAll")
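# (2) Load pretrained MuRE, QuatE, and Keci on YAGO3-10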
mure = KGE(url="https://files.dice-research.org/projects/DiceEmbeddings/YAGO3-10-Pykeen_MuRE-dim128-epoch256-KvsAll")
quate = KGE(url="https://files.dice-research.org/projects/DiceEmbeddings/YAGO3-10-Pykeen_QuatE-dim128-epoch256-KvsAll")
keci = KGE(url="https://files.dice-research.org/projects/DiceEmbeddings/YAGO3-10-Keci-dim128-epoch256-KvsAll")
quate.predict_topk(h=["Mongolia"],r=["isLocatedIn"],topk=3)
# [('Asia', 0.9894362688064575), ('Europe', 0.01575559377670288), ('Tadanari_Lee', 0.012544365599751472)]
keci.predict_topk(h=["Mongolia"],r=["isLocatedIn"],topk=3)
# [('Asia', 0.6522021293640137), ('Chinggis_Khaan_International_Airport', 0.36563414335250854), ('Democratic_Party_(Mongolia)', 0.19600993394851685)]
mure.predict_topk(h=["Mongolia"],r=["isLocatedIn"],topk=3)
# [('Asia', 0.9996906518936157), ('Ulan_Bator', 0.0009907372295856476), ('Philippines', 0.0003116439620498568)]
```

- For more, please look at [dice-research.org/projects/DiceEmbeddings/](https://files.dice-research.org/projects/DiceEmbeddings/)

</details>

## How to Deploy
2 changes: 2 additions & 0 deletions dicee/config.py
@@ -133,6 +133,8 @@ def __init__(self, **kwargs):
self.block_size: int = None
"block size of LLM"

+self.continual_learning = None
+"Path of a pretrained model directory"

def __iter__(self):
# Iterate
2 changes: 1 addition & 1 deletion dicee/evaluator.py
@@ -456,7 +456,7 @@ def dummy_eval(self, trained_model, form_of_labelling: str):
valid_set=valid_set,
test_set=test_set,
trained_model=trained_model)
-elif self.args.scoring_technique in ['KvsAll', 'KvsSample', '1vsAll', 'PvsAll', 'CCvsAll']:
+elif self.args.scoring_technique in ["AllvsAll", 'KvsAll', 'KvsSample', '1vsAll']:
self.eval_with_vs_all(train_set=train_set,
valid_set=valid_set,
test_set=test_set,
31 changes: 16 additions & 15 deletions dicee/executer.py
@@ -234,31 +234,32 @@ class ContinuousExecute(Execute):
(1) Loading & Preprocessing & Serializing input data.
(2) Training & Validation & Testing
(3) Storing all necessary info
During continual learning, only the *** num_epochs *** parameter can be modified.
The trained model is stored in the same folder as the seed model and is tagged with the current time.
"""

def __init__(self, args):
-    assert os.path.exists(args.path_experiment_folder)
-    assert os.path.isfile(args.path_experiment_folder + '/configuration.json')
-    # (1) Load Previous input configuration
-    previous_args = load_json(args.path_experiment_folder + '/configuration.json')
-    dargs = vars(args)
-    del args
-    for k in list(dargs.keys()):
-        if dargs[k] is None:
-            del dargs[k]
-    # (2) Update (1) with new input
-    previous_args.update(dargs)
+    # (1) Check the current input configuration.
+    assert os.path.exists(args.continual_learning)
+    assert os.path.isfile(args.continual_learning + '/configuration.json')
+    # (2) Load the previous input configuration.
+    previous_args = load_json(args.continual_learning + '/configuration.json')
+    args = vars(args)
+    # (3) Carry over the new number of epochs and the pretrained-model path.
+    previous_args["num_epochs"] = args["num_epochs"]
+    previous_args["continual_learning"] = args["continual_learning"]
+    print("Updated configuration:", previous_args)
    try:
-        report = load_json(dargs['path_experiment_folder'] + '/report.json')
+        report = load_json(args['continual_learning'] + '/report.json')
        previous_args['num_entities'] = report['num_entities']
        previous_args['num_relations'] = report['num_relations']
    except AssertionError:
        print("Couldn't find report.json.")
    previous_args = SimpleNamespace(**previous_args)
-    previous_args.full_storage_path = previous_args.path_experiment_folder
    print('ContinuousExecute starting...')
    print(previous_args)
+    # TODO: can we remove continuous_training from Execute?
    super().__init__(previous_args, continuous_training=True)

def continual_start(self) -> dict:
@@ -279,7 +280,7 @@ def continual_start(self) -> dict:
"""
# (1)
self.trainer = DICE_Trainer(args=self.args, is_continual_training=True,
-                            storage_path=self.args.path_experiment_folder)
+                            storage_path=self.args.continual_learning)
# (2)
self.trained_model, form_of_labelling = self.trainer.continual_start()

1 change: 1 addition & 0 deletions dicee/models/__init__.py
@@ -6,3 +6,4 @@
from .clifford import Keci, KeciBase, CMult, DeCaL # noqa
from .pykeen_models import * # noqa
from .function_space import * # noqa
+from .dualE import DualE
2 changes: 2 additions & 0 deletions dicee/models/base_model.py
@@ -431,6 +431,8 @@ class IdentityClass(torch.nn.Module):
def __init__(self, args=None):
    super().__init__()
    self.args = args

+def __call__(self, x):
+    return x

@staticmethod
def forward(x):
88 changes: 60 additions & 28 deletions dicee/models/clifford.py
@@ -764,7 +764,7 @@ def forward_triples(self, x: torch.Tensor) -> torch.FloatTensor:
Parameter
---------
-x: torch.LongTensor with (n,3) shape
+x: torch.LongTensor with (n, ) shape
Returns
-------
@@ -844,9 +844,9 @@ def forward_triples(self, x: torch.Tensor) -> torch.FloatTensor:
sigma_qr = 0
return h0r0t0 + score_p + score_q + score_r + sigma_pp + sigma_qq + sigma_rr + sigma_pq + sigma_qr + sigma_pr

-def cl_pqr(self, a):
+def cl_pqr(self, a: torch.tensor) -> torch.tensor:

-''' Input: tensor(batch_size, emb_dim) ----> output: tensor with 1+p+q+r components with size (batch_size, emb_dim/(1+p+q+r)) each.
+''' Input: tensor(batch_size, emb_dim) ---> output: tensor with 1+p+q+r components with size (batch_size, emb_dim/(1+p+q+r)) each.
1) takes a tensor of size (batch_size, emb_dim) and splits it into 1 + p + q + r components; hence 1 + p + q + r must be a divisor of emb_dim.
@@ -861,17 +861,25 @@ def compute_sigmas_single(self, list_h_emb, list_r_emb, list_t_emb):
def compute_sigmas_single(self, list_h_emb, list_r_emb, list_t_emb):

'''here we compute all the sums with no interaction with other base vectors, taken with the scalar product with t, that is,

-1) s0 = h_0r_0t_0
-2) s1 = \sum_{i=1}^{p}h_ir_it_0
-3) s2 = \sum_{j=p+1}^{p+q}h_jr_jt_0
-4) s3 = \sum_{i=1}^{q}(h_0r_it_i + h_ir_0t_i)
-5) s4 = \sum_{i=p+1}^{p+q}(h_0r_it_i + h_ir_0t_i)
-5) s5 = \sum_{i=p+q+1}^{p+q+r}(h_0r_it_i + h_ir_0t_i)
+.. math::
+
+    s_0 = h_0r_0t_0
+    s_1 = \sum_{i=1}^{p}h_ir_it_0
+    s_2 = \sum_{j=p+1}^{p+q}h_jr_jt_0
+    s_3 = \sum_{i=1}^{q}(h_0r_it_i + h_ir_0t_i)
+    s_4 = \sum_{i=p+1}^{p+q}(h_0r_it_i + h_ir_0t_i)
+    s_5 = \sum_{i=p+q+1}^{p+q+r}(h_0r_it_i + h_ir_0t_i)

and return:

-*) sigma_0t = \sigma_0 \cdot t_0 = s0 + s1 -s2
-*) s3, s4 and s5
+.. math::
+
+    \sigma_{0t} = \sigma_0 \cdot t_0 = s_0 + s_1 - s_2
+
+together with s_3, s_4 and s_5.
'''

p = self.p
q = self.q
@@ -906,15 +914,19 @@ def compute_sigmas_multivect(self, list_h_emb, list_r_emb):
For same-basis vector interactions we have

-1) \sigma_pp = \sum_{i=1}^{p-1}\sum_{i'=i+1}^{p}(h_ir_{i'}-h_{i'}r_i) (models the interactions between e_i and e_i' for 1 <= i, i' <= p)
-2) \sigma_qq = \sum_{j=p+1}^{p+q-1}\sum_{j'=j+1}^{p+q}(h_jr_{j'}-h_{j'} (models the interactions between e_j and e_j' for p+1 <= j, j' <= p+q)
-3) \sigma_rr = \sum_{k=p+q+1}^{p+q+r-1}\sum_{k'=k+1}^{p}(h_kr_{k'}-h_{k'}r_k) (models the interactions between e_k and e_k' for p+q+1 <= k, k' <= p+q+r)
+.. math::
+
+    \sigma_{pp} = \sum_{i=1}^{p-1}\sum_{i'=i+1}^{p}(h_ir_{i'}-h_{i'}r_i) (models the interactions between e_i and e_i' for 1 <= i, i' <= p)
+    \sigma_{qq} = \sum_{j=p+1}^{p+q-1}\sum_{j'=j+1}^{p+q}(h_jr_{j'}-h_{j'}r_j) (models the interactions between e_j and e_j' for p+1 <= j, j' <= p+q)
+    \sigma_{rr} = \sum_{k=p+q+1}^{p+q+r-1}\sum_{k'=k+1}^{p+q+r}(h_kr_{k'}-h_{k'}r_k) (models the interactions between e_k and e_k' for p+q+1 <= k, k' <= p+q+r)

For different-basis vector interactions, we have

-4) \sigma_pq = \sum_{i=1}^{p}\sum_{j=p+1}^{p+q}(h_ir_j - h_jr_i) (interactionsn between e_i and e_j for 1<=i <=p and p+1<= j <= p+q)
-5) \sigma_pr = \sum_{i=1}^{p}\sum_{k=p+q+1}^{p+q+r}(h_ir_k - h_kr_i) (interactionsn between e_i and e_k for 1<=i <=p and p+q+1<= k <= p+q+r)
-6) \sigma_qr = \sum_{j=p+1}^{p+q}\sum_{j=p+q+1}^{p+q+r}(h_jr_k - h_kr_j) (interactionsn between e_j and e_k for p+1 <= j <=p+q and p+q+1<= j <= p+q+r)
+.. math::
+
+    \sigma_{pq} = \sum_{i=1}^{p}\sum_{j=p+1}^{p+q}(h_ir_j - h_jr_i) (interactions between e_i and e_j for 1 <= i <= p and p+1 <= j <= p+q)
+    \sigma_{pr} = \sum_{i=1}^{p}\sum_{k=p+q+1}^{p+q+r}(h_ir_k - h_kr_i) (interactions between e_i and e_k for 1 <= i <= p and p+q+1 <= k <= p+q+r)
+    \sigma_{qr} = \sum_{j=p+1}^{p+q}\sum_{k=p+q+1}^{p+q+r}(h_jr_k - h_kr_j) (interactions between e_j and e_k for p+1 <= j <= p+q and p+q+1 <= k <= p+q+r)
'''

@@ -958,15 +970,15 @@ def forward_k_vs_all(self, x: torch.Tensor) -> torch.FloatTensor:
"""
KvsAll training

-(1) Retrieve real-valued embedding vectors for heads and relations \mathbb{R}^d .
-(2) Construct head entity and relation embeddings according to Cl_{p,q}(\mathbb{R}^d) .
+(1) Retrieve real-valued embedding vectors for heads and relations.
+(2) Construct head entity and relation embeddings according to Cl_{p,q,r}(\mathbb{R}^d).
(3) Perform Cl multiplication
(4) Inner product of (3) and all entity embeddings

forward_k_vs_with_explicit and this function are identical.

Parameter
---------
-x: torch.LongTensor with (n,2) shape
+x: torch.LongTensor with (n, ) shape
Returns
-------
torch.FloatTensor with (n, |E|) shape
@@ -1097,9 +1109,12 @@ def construct_cl_multivector(self, x: torch.FloatTensor, re: int, p: int, q: int

def compute_sigma_pp(self, hp, rp):
"""
-\sigma_{p,p}^* = \sum_{i=1}^{p-1}\sum_{i'=i+1}^{p}(x_iy_{i'}-x_{i'}y_i)
+Compute
+
+.. math::
+
+    \sigma_{p,p}^* = \sum_{i=1}^{p-1}\sum_{i'=i+1}^{p}(x_iy_{i'}-x_{i'}y_i)

-sigma_{pp} captures the interactions between along p bases
+\sigma_{pp} captures the interactions among the p bases.
For instance, for p = 3 with bases e_1, e_2, e_3, we compute the interactions between e_1e_2, e_1e_3, and e_2e_3.
This can be implemented with two nested for loops
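Equivalently, the two loops can be vectorized. A minimal illustrative sketch (not the repository's actual implementation; it assumes the p components sit in the last tensor dimension):
```python
import torch

def sigma_pp_sketch(hp: torch.Tensor, rp: torch.Tensor) -> torch.Tensor:
    """Pairwise antisymmetric products (x_i y_{i'} - x_{i'} y_i) for all i < i'."""
    p = hp.shape[-1]
    # Index pairs (i, i') from the upper triangle of the p x p interaction matrix.
    i, i_prime = torch.triu_indices(p, p, offset=1)
    return hp[..., i] * rp[..., i_prime] - hp[..., i_prime] * rp[..., i]
```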
@@ -1125,7 +1140,12 @@ def compute_sigma_qq(self, hq, rq):

def compute_sigma_qq(self, hq, rq):
"""
-Compute \sigma_{q,q}^* = \sum_{j=p+1}^{p+q-1}\sum_{j'=j+1}^{p+q}(x_jy_{j'}-x_{j'}y_j) Eq. 16
+Compute (Eq. 16)
+
+.. math::
+
+    \sigma_{q,q}^* = \sum_{j=p+1}^{p+q-1}\sum_{j'=j+1}^{p+q}(x_jy_{j'}-x_{j'}y_j)

\sigma_{qq} captures the interactions among the q bases.
For instance, for q = 3 with bases e_1, e_2, e_3, we compute the interactions between e_1e_2, e_1e_3, and e_2e_3.
This can be implemented with two nested for loops
@@ -1157,7 +1177,9 @@ def compute_sigma_qq(self, hq, rq):

def compute_sigma_rr(self, hk, rk):
"""
-\sigma_{r,r}^* = \sum_{k=p+q+1}^{p+q+r-1}\sum_{k'=k+1}^{p}(x_ky_{k'}-x_{k'}y_k)
+.. math::
+
+    \sigma_{r,r}^* = \sum_{k=p+q+1}^{p+q+r-1}\sum_{k'=k+1}^{p+q+r}(x_ky_{k'}-x_{k'}y_k)
"""
# Compute indexes for the upper triangle of p by p matrix
Expand All @@ -1173,7 +1195,11 @@ def compute_sigma_rr(self, hk, rk):

def compute_sigma_pq(self, *, hp, hq, rp, rq):
"""
-\sum_{i=1}^{p} \sum_{j=p+1}^{p+q} (h_i r_j - h_j r_i) e_i e_j
+Compute
+
+.. math::
+
+    \sum_{i=1}^{p} \sum_{j=p+1}^{p+q} (h_i r_j - h_j r_i) e_i e_j
results = []
sigma_pq = torch.zeros(b, r, p, q)
@@ -1189,7 +1215,11 @@ def compute_sigma_pr(self, *, hp, hk, rp, rk):

def compute_sigma_pr(self, *, hp, hk, rp, rk):
"""
-\sum_{i=1}^{p} \sum_{j=p+1}^{p+q} (h_i r_j - h_j r_i) e_i e_j
+Compute
+
+.. math::
+
+    \sum_{i=1}^{p} \sum_{k=p+q+1}^{p+q+r} (h_i r_k - h_k r_i) e_i e_k
results = []
sigma_pq = torch.zeros(b, r, p, q)
@@ -1205,7 +1235,9 @@ def compute_sigma_qr(self, *, hq, hk, rq, rk):

def compute_sigma_qr(self, *, hq, hk, rq, rk):
"""
-\sum_{i=1}^{p} \sum_{j=p+1}^{p+q} (h_i r_j - h_j r_i) e_i e_j
+.. math::
+
+    \sum_{j=p+1}^{p+q} \sum_{k=p+q+1}^{p+q+r} (h_j r_k - h_k r_j) e_j e_k
results = []
sigma_pq = torch.zeros(b, r, p, q)