Bug when running `quicker_learning_experiment.py` #57

gngdb · 2015-03-10T20:23:57Z

Trying to run quicker_learning_experiment.py on stonesoup, I get the following error:

Error when tring to find the memory information on the GPU: unspecified launch failure                            
Error freeing device pointer 0x1308f00000 (unspecified launch failure). Driver report 0 bytes free and 0 bytes total
CudaNdarray_uninit: error freeing self->devdata. (self=0x7f59efdfd2f0, self->devata=0x1308f00000)                 
Traceback (most recent call last):                                                                                
  File "train.py", line 129, in <module>                                                                          
    main(args.run_settings,verbose=args.v,force=args.f)                                                           
  File "train.py", line 34, in main                                                                               
    train_pylearn2(run_settings, verbose=verbose, force=force)                                                    
  File "train.py", line 110, in train_pylearn2                                                                    
    train.main_loop()                                                                                             
  File "/afs/inf.ed.ac.uk/user/s08/s0805516/repos/pylearn2/pylearn2/train.py", line 207, in main_loop             
    rval = self.algorithm.train(dataset=self.dataset)                                                             
  File "/afs/inf.ed.ac.uk/user/s08/s0805516/repos/pylearn2/pylearn2/training_algorithms/sgd.py", line 455, in train
    self.sgd_update(*batch)                                                                                       
  File "/afs/inf.ed.ac.uk/user/s08/s0805516/repos/Theano/theano/compile/function_module.py", line 606, in __call__
    storage_map=self.fn.storage_map)                                                                              
  File "/afs/inf.ed.ac.uk/user/s08/s0805516/repos/Theano/theano/compile/function_module.py", line 595, in __call__
    outputs = self.fn()                                                                                           
RuntimeError: Cuda error: GpuElemwise node_cc53582e3475c50a9d3e2365029d7cfa_0 Add: unspecified launch failure.    
    n_blocks=30 threads_per_block=256                                                                             
   Call: kernel_Add_node_cc53582e3475c50a9d3e2365029d7cfa_0_Ccontiguous<<<n_blocks, threads_per_block>>>(numEls, local_dims[0], local_dims[1], i0_data, local_str[0][0], local_str[0][1], i1_data, local_str[1][0], local_str[1][1], o0_data, local_ostr[0][0], local_ostr[0][1])

Apply node that caused the error: GpuElemwise{Add}[(0, 0)](GpuCorrMM_gradInputs{valid, (1, 1)}.0, GpuDimShuffle{x,0,1,2}.0)
Inputs types: [CudaNdarrayType(float32, 4D), CudaNdarrayType(float32, (True, False, False, False))]               
Inputs shapes: [(128, 48, 52, 52), (1, 48, 52, 52)]                                                               
Inputs strides: [(129792, 2704, 52, 1), (0, 2704, 52, 1)]                                                         
Inputs values: ['not shown', 'not shown']                                                                         

HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.

Haven't looked into it in more depth.
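
As a possible next step, the two HINT lines can be followed by re-running with the suggested flags set through the `THEANO_FLAGS` environment variable. The sketch below only builds that environment. The helper name `theano_debug_env` is made up for illustration, and setting `CUDA_LAUNCH_BLOCKING=1` is my own addition (it is a standard CUDA runtime variable that makes kernel launches synchronous, which may help pin the failure to the right kernel); neither appears in the error output.

```python
import os


def theano_debug_env(base_env=None):
    """Return a copy of the environment with the flags from the HINT lines.

    Hypothetical helper, not part of Theano or pylearn2.
    """
    env = dict(os.environ if base_env is None else base_env)

    # Flags suggested by the two HINT lines in the traceback.
    overrides = {"optimizer": "fast_compile", "exception_verbosity": "high"}

    # Keep any existing flags (e.g. device=gpu), dropping the ones we override.
    flags = [f for f in env.get("THEANO_FLAGS", "").split(",") if f]
    flags = [f for f in flags if f.split("=", 1)[0] not in overrides]
    flags += ["%s=%s" % item for item in sorted(overrides.items())]
    env["THEANO_FLAGS"] = ",".join(flags)

    # Assumption (not from the traceback): synchronous kernel launches so the
    # error is reported at the launch that actually failed.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


if __name__ == "__main__":
    print(theano_debug_env()["THEANO_FLAGS"])
```

The returned dictionary could then be passed as `env=` to `subprocess.call` when launching `train.py` with the usual arguments. If `optimizer=fast_compile` still gives no useful back-trace, the HINT suggests trying `optimizer=None` instead.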

gngdb added this to the GPU Training deadline milestone Mar 10, 2015

gngdb added the bug label Mar 10, 2015

gngdb modified the milestones: Pylearn2 models, GPU Training deadline Mar 10, 2015

gngdb added the ready label Mar 10, 2015

gngdb changed the title ~~Bug when running quicker_learning_experiment.py~~ Bug when running quicker_learning_experiment.py Mar 10, 2015

gngdb changed the title ~~Bug when running quicker_learning_experiment.py~~ Bug when running quicker_learning_experiment.py Mar 11, 2015
