
RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu) #251

Open
ADIthaker opened this issue Mar 16, 2023 · 5 comments

@ADIthaker

  • TorchKGE version: 0.17.5
  • Python version: 3.9.16
  • Operating System: Ubuntu on Colab

Description

I was trying to run the 'Simplest training' example available on the torchkge site.
For some reason it keeps raising an error saying that all my tensors should be on the same device, even though I simply copy-pasted the example and only swapped in my own dataset.

What I Did

The code works only if I change the use_cuda parameter in dataloader = DataLoader(train, batch_size=batch_size, use_cuda="all") to None, i.e. if I keep the DataLoader's batches on the CPU, which slows down training.
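
For reference, here is a condensed version of what I am running: the 'Simplest training' example from the docs, with load_fb15k standing in for my own dataset (mine fails the same way), stopped at the first batch since that is where it raises:

from torch import cuda
from torch.optim import Adam

from torchkge.models import TransEModel
from torchkge.sampling import BernoulliNegativeSampler
from torchkge.utils import MarginLoss, DataLoader
from torchkge.utils.datasets import load_fb15k

kg_train, _, _ = load_fb15k()

model = TransEModel(100, kg_train.n_ent, kg_train.n_rel, dissimilarity_type='L2')
criterion = MarginLoss(0.5)

# Move everything to CUDA if available
if cuda.is_available():
    cuda.empty_cache()
    model.cuda()
    criterion.cuda()

optimizer = Adam(model.parameters(), lr=0.0004, weight_decay=1e-5)
sampler = BernoulliNegativeSampler(kg_train)
dataloader = DataLoader(kg_train, batch_size=32768, use_cuda='all')

for i, batch in enumerate(dataloader):
    h, t, r = batch[0], batch[1], batch[2]
    n_h, n_t = sampler.corrupt_batch(h, t, r)  # <-- raises the RuntimeError below
    break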

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-27-e141f90ae057> in <module>
     18     for i, batch in enumerate(dataloader):
     19         h, t, r = batch[0], batch[1], batch[2]
---> 20         n_h, n_t = sampler.corrupt_batch(h, t, r)
     21         optimizer.zero_grad()
     22 

/usr/local/lib/python3.9/dist-packages/torchkge/sampling.py in corrupt_batch(self, heads, tails, relations, n_neg)
    315         # Randomly choose which samples will have head/tail corrupted
    316         mask = bernoulli(self.bern_probs[relations].repeat(n_neg)).double()
--> 317         n_h_cor = int(mask.sum().item())
    318         neg_heads[mask == 1] = randint(1, self.n_ent,
    319                                        (n_h_cor,),

RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)
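
My guess at what is going on (not verified against the library internals): sampler.bern_probs seems to stay on the CPU while the DataLoader returns the batches on CUDA, and indexing a CPU tensor with CUDA indices is exactly what PyTorch rejects with this message. A minimal PyTorch-only sketch of the mismatch:

import torch

probs = torch.rand(10)              # CPU tensor, like sampler.bern_probs
idx = torch.randint(0, 10, (4,))

if torch.cuda.is_available():
    idx = idx.cuda()                # indices on CUDA, like the batch's relations
    probs[idx]                      # RuntimeError: indices should be either on cpu ...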
@armand33
Member

I think this issue has been fixed by PR #246. Can you confirm? The changes will be included in an upcoming patch release.

@armand33 armand33 self-assigned this Mar 20, 2023
@armand33 armand33 added the bug Something isn't working label Mar 20, 2023
@dimou-gk

dimou-gk commented Apr 9, 2023

@armand33 I have encountered the same error while working on a project of my own. The code is mostly the same as the Simplest Training example, with the addition of an evaluation at the end, following the Model Evaluation example. In the code below I am using the uniform negative sampler to test whether the same problem occurs with other negative samplers.
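
For context, this is the evaluation step I appended (condensed; model, kg_val, kg_test and the parameters dict are defined earlier in my train_test_kge, and training itself completes fine):

import torchkge

# Testing
evaluator = torchkge.evaluation.TripletClassificationEvaluator(model, kg_val, kg_test)
evaluator.evaluate(parameters['batch_size'])
print('Accuracy on test set: {}'.format(evaluator.accuracy(parameters['batch_size'])))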

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-7-f85c0a8624f9> in <cell line: 14>()
     13 
     14 if __name__=='__main__':
---> 15     main()

4 frames
<ipython-input-7-f85c0a8624f9> in main()
     10     }
     11 
---> 12     train_test_kge(hyperparameters)
     13 
     14 if __name__=='__main__':

<ipython-input-6-41a929de9fd0> in train_test_kge(parameters)
     47     #Testing
     48     evaluator = torchkge.evaluation.TripletClassificationEvaluator(model, kg_val, kg_test)
---> 49     evaluator.evaluate(parameters['batch_size'])
     50     print('Accuracy on test set: {}'.format(evaluator.accuracy(parameters['batch_size'])))

/usr/local/lib/python3.9/dist-packages/torchkge/evaluation.py in evaluate(self, b_size)
    525         r_idx = self.kg_val.relations
    526 
--> 527         neg_heads, neg_tails = self.sampler.corrupt_kg(b_size, self.is_cuda,
    528                                                        which='main')
    529         neg_scores = self.get_scores(neg_heads, neg_tails, r_idx, b_size)

/usr/local/lib/python3.9/dist-packages/torchkge/sampling.py in corrupt_kg(self, batch_size, use_cuda, which)
    127         for i, batch in enumerate(dataloader):
    128             heads, tails, rels = batch[0], batch[1], batch[2]
--> 129             neg_heads, neg_tails = self.corrupt_batch(heads, tails, rels,
    130                                                       n_neg=1)
    131 

/usr/local/lib/python3.9/dist-packages/torchkge/sampling.py in corrupt_batch(self, heads, tails, relations, n_neg)
    460 
    461         # Randomly choose which samples will have head/tail corrupted
--> 462         mask = bernoulli(self.bern_probs[relations]).double()
    463         n_heads_corrupted = int(mask.sum().item())
    464 

RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)

@armand33
Member

Hi @GeorgeKonstantinosDimou, thanks for the message. Have you tried updating your version of TorchKGE? The patch from PR #246 has been included in the latest release.

@dimou-gk

Greetings @armand33, yes, version 0.17.7 unfortunately still has the same problem.

@shreyash-Pandey-Katni
Contributor

Hi @GeorgeKonstantinosDimou and @ADIthaker, I looked into your issue and successfully reproduced it. It seems you are using the example code from the website. Changing use_cuda to None alone will not work, because the example also contains the following snippet, which moves the whole model to CUDA while the DataLoader keeps its data on the CPU. So if you want to use the CPU, comment out the following snippet:

# Move everything to CUDA if available
# (in the example, `cuda` comes from `from torch import cuda`)
if cuda.is_available():
    cuda.empty_cache()
    model.cuda()       # model parameters go to the GPU
    criterion.cuda()   # so does the loss; the DataLoader's batches do not

Hope this clears things up.
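
If you do want CUDA, here is a sketch of one way to keep the model, criterion and DataLoader consistent by deciding on the device once (my own suggestion, not from the docs; variable names as in the example):

from torch import cuda
from torchkge.utils import DataLoader

# None keeps everything on the CPU; 'all' keeps model and batches on CUDA
use_cuda = 'all' if cuda.is_available() else None

dataloader = DataLoader(kg_train, batch_size=b_size, use_cuda=use_cuda)

if use_cuda == 'all':
    cuda.empty_cache()
    model.cuda()
    criterion.cuda()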
