The num_of_steps setting for Inception_v2 #5

Open
wesley-stone opened this issue May 30, 2018 · 3 comments

@wesley-stone

First of all, thank you very much. I noticed that 'num_steps' is not specified in the 'faster_rcnn_inception_resnet_v2_atrous_kitti.config' file. Does this mean it would train indefinitely? If so, could you share your experience of how many steps are enough to get a stable loss?

@sshleifer
Owner

sshleifer commented May 30, 2018 via email

@wesley-stone
Author

wesley-stone commented May 31, 2018

I have trained it for about 21 hours on one TITAN X GPU at about 1.2 steps/second, but my loss still fluctuates between 0 and 1. Did you change any parameters in 'faster_rcnn_inception_resnet_v2_atrous_kitti.config', such as the learning rate? It seems that from step 0 to step 900k the learning rate is a constant 0.0003.

I found that training slows down significantly when eval.sh runs at the same time, so I am not running eval for now. Will this affect the result?

Thanks.

This is my current training loss:

INFO:tensorflow:global step 95931: loss = 0.4842 (0.827 sec/step)
INFO:tensorflow:global step 95932: loss = 0.2304 (0.831 sec/step)
INFO:tensorflow:global step 95933: loss = 0.6756 (0.824 sec/step)
INFO:tensorflow:global step 95934: loss = 0.5103 (0.829 sec/step)
INFO:tensorflow:global step 95935: loss = 0.3497 (0.820 sec/step)
INFO:tensorflow:global step 95936: loss = 0.3261 (0.829 sec/step)
INFO:tensorflow:global step 95937: loss = 0.3748 (0.823 sec/step)
INFO:tensorflow:global step 95938: loss = 0.1620 (0.826 sec/step)
INFO:tensorflow:global step 95939: loss = 0.3487 (0.828 sec/step)
INFO:tensorflow:global step 95940: loss = 0.3864 (0.823 sec/step)
INFO:tensorflow:global step 95941: loss = 0.1237 (0.827 sec/step)
INFO:tensorflow:global step 95942: loss = 0.4237 (0.827 sec/step)
INFO:tensorflow:global step 95943: loss = 0.2671 (0.841 sec/step)
INFO:tensorflow:global step 95944: loss = 0.5672 (0.873 sec/step)
INFO:tensorflow:global step 95945: loss = 0.2411 (0.889 sec/step)
INFO:tensorflow:global step 95946: loss = 0.3034 (0.876 sec/step)
INFO:tensorflow:global step 95947: loss = 0.0378 (0.883 sec/step)
INFO:tensorflow:global step 95948: loss = 0.2312 (0.876 sec/step)
INFO:tensorflow:global step 95949: loss = 0.1306 (0.855 sec/step)
INFO:tensorflow:global step 95950: loss = 0.3180 (0.818 sec/step)
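
Per-step losses like the ones above are hard to read as a trend, so a smoothed value may be easier to judge. The following is a minimal sketch, assuming the log format shown above; `parse_losses` and `moving_average` are hypothetical helper names, not part of TensorFlow.

```python
import re

# Matches lines such as:
# INFO:tensorflow:global step 95931: loss = 0.4842 (0.827 sec/step)
LOG_LINE = re.compile(r"global step (\d+): loss = ([0-9.]+) \(([0-9.]+) sec/step\)")


def parse_losses(lines):
    """Return (step, loss, sec_per_step) tuples for every matching log line."""
    records = []
    for line in lines:
        match = LOG_LINE.search(line)
        if match:
            step, loss, sec = match.groups()
            records.append((int(step), float(loss), float(sec)))
    return records


def moving_average(values, window):
    """Trailing moving average; early entries average whatever is available."""
    averaged = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            # Drop the value that just left the window.
            running -= values[i - window]
        averaged.append(running / min(i + 1, window))
    return averaged


if __name__ == "__main__":
    sample = [
        "INFO:tensorflow:global step 95931: loss = 0.4842 (0.827 sec/step)",
        "INFO:tensorflow:global step 95932: loss = 0.2304 (0.831 sec/step)",
        "INFO:tensorflow:global step 95933: loss = 0.6756 (0.824 sec/step)",
        "INFO:tensorflow:global step 95934: loss = 0.5103 (0.829 sec/step)",
    ]
    records = parse_losses(sample)
    losses = [loss for _, loss, _ in records]
    for (step, _, _), smooth in zip(records, moving_average(losses, window=3)):
        print(step, round(smooth, 4))
```

In practice the window would be much larger than 3 (hundreds or thousands of steps), since each step here uses a single image (batch_size: 1).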

The default config in 'faster_rcnn_inception_resnet_v2_atrous_kitti.config' is:

train_config: {
  batch_size: 1
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        manual_step_learning_rate {
          initial_learning_rate: 0.0003
          schedule {
            step: 0
            learning_rate: .0003
          }
          schedule {
            step: 900000
            learning_rate: .00003
          }
          schedule {
            step: 1200000
            learning_rate: .000003
          }
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
  gradient_clipping_by_norm: 10.0
  fine_tune_checkpoint: "faster_rcnn_inception_resnet_v2_atrous_coco_11_06_2017/model.ckpt"
  from_detection_checkpoint: true
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
}
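
To check the reading that the learning rate stays at 0.0003 until step 900k, the schedule above can be evaluated by hand. This is a minimal sketch, assuming the schedule is piecewise constant (each entry takes effect from its `step` onward); that assumption has not been checked against the library source, and `learning_rate_at` and `hours_to_reach` are hypothetical helper names.

```python
# Values copied from the config above: (step, learning_rate) pairs.
INITIAL_LEARNING_RATE = 0.0003
SCHEDULE = [(0, 0.0003), (900000, 0.00003), (1200000, 0.000003)]


def learning_rate_at(step, initial=INITIAL_LEARNING_RATE, schedule=SCHEDULE):
    """Rate of the last schedule entry whose boundary is <= step (assumed semantics)."""
    rate = initial
    for boundary, value in sorted(schedule):
        if step >= boundary:
            rate = value
    return rate


def hours_to_reach(target_step, current_step, sec_per_step):
    """Rough wall-clock estimate, assuming the step time stays constant."""
    return max(target_step - current_step, 0) * sec_per_step / 3600.0


if __name__ == "__main__":
    print(learning_rate_at(95950))
    print(round(hours_to_reach(900000, 95950, 0.83), 1))
```

Under these assumptions, step 95950 is still on the initial rate, and at roughly 0.83 sec/step the first decay at step 900000 would be on the order of 185 hours away.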

@sshleifer
Owner

sshleifer commented May 31, 2018 via email
