Training takes a long time #21

mjohn123 · 2016-04-04T13:58:42Z

Hello all, I am running fcnTrain on my PC (Core i7, 16 GB RAM, GPU). However, each epoch takes a long time to run (0.5 to 1 day). If anyone has met this issue, could you share a solution? Thanks, all.

HLinn · 2016-04-07T01:52:49Z

I have met the same problem. Did you find a solution later? @mjohn123

mjohn123 · 2016-04-07T03:07:55Z

I am still looking for a solution. I have not solved it yet.

brisker · 2016-04-12T05:56:28Z

Is your GPU frequency very low, like 1 Hz, or normal? My GPU properties and training log:
CUDADevice with properties:

                  Name: 'GeForce GTX TITAN X'
                 Index: 1
     ComputeCapability: '5.2'
        SupportsDouble: 1
         DriverVersion: 7.5000
        ToolkitVersion: 6.5000
    MaxThreadsPerBlock: 1024
      MaxShmemPerBlock: 49152
    MaxThreadBlockSize: [1024 1024 64]
           MaxGridSize: [2.1475e+09 65535 65535]
             SIMDWidth: 32
           TotalMemory: 1.2885e+10
       AvailableMemory: 1.2609e+10
   MultiprocessorCount: 24
          ClockRateKHz: 1076000
           ComputeMode: 'Default'
  GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
      CanMapHostMemory: 1
       DeviceSupported: 1
        DeviceSelected: 1

train: epoch 01: 1/565: 0.9 Hz accuracy: 0.677 0.048 0.032 objective: 3.044
train: epoch 01: 2/565: 1.0 Hz accuracy: 0.691 0.048 0.033 objective: 3.035
train: epoch 01: 3/565: 1.1 Hz accuracy: 0.698 0.048 0.033 objective: 2.994

mjohn123 · 2016-04-12T08:28:35Z

Hi brisker. This is my GPU. Note that I used only CUDA (not cuDNN) for my run. I ran MatConvNet in CUDA mode successfully.

 CUDADevice with properties:

                      Name: 'GeForce GTX 750 Ti'
                     Index: 1
         ComputeCapability: '5.0'
            SupportsDouble: 1
             DriverVersion: 7.5000
            ToolkitVersion: 7
        MaxThreadsPerBlock: 1024
          MaxShmemPerBlock: 49152
        MaxThreadBlockSize: [1024 1024 64]
               MaxGridSize: [2.1475e+09 65535 65535]
                 SIMDWidth: 32
               TotalMemory: 2.1475e+09
           AvailableMemory: 1.6844e+09
       MultiprocessorCount: 5
              ClockRateKHz: 1202000
               ComputeMode: 'Default'
      GPUOverlapsTransfers: 1
    KernelExecutionTimeout: 1
          CanMapHostMemory: 1
           DeviceSupported: 1
            DeviceSelected: 1

train: epoch 30: 259/565: 0.3 Hz accuracy: 0.905 0.776 0.681 objective: 0.258
train: epoch 30: 260/565: 0.3 Hz accuracy: 0.905 0.776 0.682 objective: 0.258
train: epoch 30: 261/565: 0.3 Hz accuracy: 0.905 0.776 0.682 objective: 0.258
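The rates in these logs can be turned into a rough time per epoch. The sketch below is only an estimate under two assumptions that are not stated in this thread: that "Hz" means images processed per second, and that each of the 565 iterations is a batch of 20 images (the variable batchSize is an assumed value; check it in your own fcnTrain.m).

```matlab
% Rough epoch-time estimate from the "Hz" column of the training log.
% ASSUMPTIONS: Hz = images per second; every iteration is a batch of 20 images.
numBatches = 565;                 % from the log lines "259/565"
batchSize  = 20;                  % assumed, not stated in the thread
numImages  = numBatches * batchSize;

% Rates quoted in this thread: GTX 750 Ti log, TITAN X log, later TITAN X reports.
rateHz = [0.3 1.0 5.5];

% Seconds per epoch = images / (images per second); convert to hours.
epochHours = numImages ./ rateHz / 3600;

% fprintf walks the matrix column by column, so each line pairs a rate with its time.
fprintf('%.1f Hz -> %.1f hours per epoch\n', [rateHz; epochHours]);
```

Under these assumptions, 0.3 Hz works out to roughly ten hours per epoch, which is in line with the 0.5 to 1 day reported at the top of the thread.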

brisker · 2016-04-12T11:58:47Z

What do you think might be causing this? Do you have any ideas? What is your MATLAB version?

mjohn123 · 2016-04-12T12:45:06Z

Hello, I am using Matlab 8.6.0.267246 (R2015b)

jingyanw · 2016-04-14T18:40:37Z

I was able to get ~0.5 Hz with a CPU and >5 Hz with a GPU (GeForce GTX TITAN X).
Could you check if you've passed the gpus field to cnn_train_dag correctly? For example, you can check your GPU memory usage during training.
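One way to check GPU memory usage from MATLAB is to query the device object, as sketched below. This assumes the Parallel Computing Toolbox and a CUDA-capable GPU are available; the variable names are illustrative. It could be run once before training and again from inside the training loop, and if the used memory barely changes, the network is probably not running on the GPU.

```matlab
% Minimal sketch: report how much memory is in use on the selected GPU.
% Requires the Parallel Computing Toolbox and a supported CUDA device.
g = gpuDevice();                                   % currently selected device

totalGB = g.TotalMemory / 2^30;                    % bytes -> GiB
usedGB  = (g.TotalMemory - g.AvailableMemory) / 2^30;

fprintf('%s: %.2f of %.2f GB in use\n', g.Name, usedGB, totalGB);
```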

mjohn123 · 2016-04-15T01:30:23Z

Thanks, jingyanw. You are right. My GPU was not being used during training (see the attached file). That is why it takes so long (only the CPU is used). How can I solve this (or activate the GPU) for fcnTrain?

I am using Windows 10 with MATLAB R2015b. I installed MatConvNet using the command vl_compilenn('enableGpu', true) and ran the tests with vl_testnn('gpu', true); there were no errors, as shown in the figure. In fcnTrain.m, I used opts.train.gpus = [] ; Is that correct? My current code is:

% Setup data fetching options
bopts.useGpu = numel(opts.train.gpus) > 0 ;

% Launch SGD
info = cnn_train_dag(net, imdb, getBatchWrapper(bopts), ...
                     opts.train, ...
                     'train', train, ...
                     'val', val, ...
                     opts.train) ;

If I set opts.train.gpus = 1; then I get this error:

Error using vl_nnconv
Out of memory on device. To view more detail about available memory on the GPU, use 'gpuDevice()'. If the problem persists, reset the GPU
by calling 'gpuDevice(1)'.

Error in dagnn.Conv/backward (line 20)
      [derInputs{1}, derParams{1}, derParams{2}] = vl_nnconv(...

Error in dagnn.Layer/backwardAdvanced (line 118)
      [derInputs, derParams] = obj.backward ...

Error in dagnn.DagNN/eval (line 107)
  obj.layers(l).block.backwardAdvanced(obj.layers(l)) ;

Error in cnn_train_dag>process_epoch (line 194)
      net.eval(inputs, opts.derOutputs) ;

Error in cnn_train_dag (line 89)
    stats.train(epoch) = process_epoch(net, state, opts, 'train') ;

Error in fcnTrain (line 98)
info = cnn_train_dag(net, imdb, getBatchWrapper(bopts), ...

jingyanw · 2016-04-15T13:49:48Z

The error indicates that you do not have enough GPU memory. Running FCN takes ~5GB GPU memory on my machine.
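As a rough illustration, the check below compares the AvailableMemory values from the two gpuDevice printouts in this thread against the ~5 GB figure. The 5 GB requirement is only the number quoted above for one machine, and the variable names are hypothetical; measure the footprint of your own network before relying on it.

```matlab
% Sketch: decide whether a GPU has enough free memory for training.
% ASSUMPTION: the network needs about 5 GB, as reported on one machine in this thread.
requiredBytes = 5 * 2^30;

% AvailableMemory values copied from the gpuDevice printouts in this thread.
availGtx750Ti = 1.6844e+09;
availTitanX   = 1.2609e+10;

% find returns the indices of the entries that pass the test, or empty if none do.
gpusGtx750Ti = find(availGtx750Ti >= requiredBytes);   % empty: fall back to the CPU
gpusTitanX   = find(availTitanX   >= requiredBytes);   % 1: the device is large enough

% On a live machine the same test could be applied to the selected device, e.g.
%   g = gpuDevice();
%   if g.AvailableMemory >= requiredBytes, opts.train.gpus = g.Index;
%   else, opts.train.gpus = []; end
```

An empty gpus value corresponds to the CPU-only setting opts.train.gpus = [] used earlier in this thread.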

mjohn123 · 2016-04-15T15:28:59Z

I see. My GPU has only 2 GB. I also have an integrated GPU on the CPU chip, but I do not think it is useful. Thank you for your help. I think I can only use the CPU.

brisker · 2016-04-18T05:59:39Z

@jingyanw Hello, I also get around 5.5 Hz with a Titan X. But why do classification examples like MNIST run at more than 1000 Hz while fcnTrain is so slow?

daofeng2007 · 2016-08-13T22:13:27Z

@brisker I also get around 5.5 Hz using a Pascal Titan X. My problem is that when it finished epoch 01 and started epoch 02, the following error showed up:

train: epoch 02: 1/ 56:Error using gpuArray
The GPU failed to allocate memory. To continue, reset the GPU by running 'gpuDevice(1)'. If this problem persists,
partition your computations into smaller pieces.

Error in getBatch (line 91)
ims = gpuArray(ims) ;

Error in fcnTrain>@(imdb,batch)getBatch(imdb,batch,opts,'prefetch',nargout==0) (line 108)
fn = @(imdb,batch) getBatch(imdb,batch,opts,'prefetch',nargout==0) ;

Error in cnn_train_dag>process_epoch (line 197)
inputs = state.getBatch(state.imdb, batch) ;

Error in cnn_train_dag (line 83)
[stats.train(epoch),prof] = process_epoch(net, state, opts, 'train') ;

Error in fcnTrain (line 99)
info = cnn_train_dag(net, imdb, getBatchWrapper(bopts), ...

HosnaCSE · 2017-01-03T03:50:22Z

Hello,

I have 12 GB of RAM on the GPU (Titan) and 8 GB of RAM on the CPU. My net hits the error "Out of memory on device", although it requires only 4 GB. Has anyone had a similar experience? Any suggestions?
(I can run it with 128x128 images, which require 2 GB, but I need bigger images, such as 256x256, for semantic segmentation.)

Thanks.
Hosna
