
Localization of rings in synthetic images using a convolutional neural network (CNN). By localization we mean predicting the center of the ring in normalized coordinates relative to the center of the image.



Objective: Localization of rings in synthetic images using a convolutional neural network (CNN). By localization we mean predicting the center of the ring in normalized coordinates relative to the center of the image.
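To make "normalized coordinates relative to the center of the image" concrete, here is a minimal sketch of one possible convention. It assumes a 300 x 200 pixel image (width x height, inferred from the input layer shape) and assumes that offsets are divided by half the image size, so that the corners map to ±1. The function names `to_normalized` and `to_pixels` are hypothetical and the repository may scale its labels differently.

```python
def to_normalized(px, py, width=300, height=200):
    """Map a pixel position to coordinates relative to the image center.

    Assumption: offsets are divided by half the image size, so the image
    center maps to (0, 0) and the corners map to (+-1, +-1).
    """
    return (px - width / 2) / (width / 2), (py - height / 2) / (height / 2)


def to_pixels(nx, ny, width=300, height=200):
    """Inverse of to_normalized: map normalized coordinates back to pixels."""
    return nx * (width / 2) + width / 2, ny * (height / 2) + height / 2


if __name__ == "__main__":
    print(to_normalized(150, 100))  # the image center
    print(to_normalized(225, 50))   # right of and above the center
    print(to_pixels(0.5, -0.5))     # round trip back to pixel space
```

With such a convention the regression targets are independent of the image resolution, and a prediction can be converted back to pixels for drawing.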

The images show the synthetically generated input to the CNN, the prediction (purple rectangle) and the ground truth (green rectangle) for the center of the red ellipse.


We are using PIL (Python Imaging Library) to generate and modify the images and TensorFlow to construct and train the neural network.

Structure of the neural net that performs regression:

  • input_layer = tf.reshape(features["x"], [-1, 200, 300, 3])
  • conv1 = tf.layers.conv2d(inputs=input_layer, filters=4, kernel_size=[5, 5], padding="same", activation=tf.nn.relu)
  • pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2)
  • conv2 = tf.layers.conv2d(inputs=pool1, filters=8, kernel_size=[5, 5], padding="same", activation=tf.nn.relu)
  • pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2)
  • pool2_flat = tf.reshape(pool2, [-1, 75 * 50 * 8])
  • dropout = tf.layers.dropout(inputs=pool2_flat, rate=0.4, training=mode == tf.estimator.ModeKeys.TRAIN)
  • dense1 = tf.layers.dense(inputs=dropout, units=200, activation=tf.nn.relu)
  • dense2 = tf.layers.dense(inputs=dense1, units=20, activation=tf.nn.relu)
  • dense3 = tf.layers.dense(inputs=dense2, units=2, activation=None)
  • predictions = {"predict_results": tf.identity(dense3, name="final_layer")}
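The layer list above implies a flattened size of 75 * 50 * 8 after the second pooling layer. The following sketch re-derives that number and estimates the parameter count with plain arithmetic. It does not use TensorFlow, and the helper names are hypothetical. It assumes "same" padding for the convolutions and the default "valid" padding for the pooling layers.

```python
def conv_same(shape, filters):
    """A convolution with padding="same" keeps height and width."""
    height, width, _ = shape
    return (height, width, filters)


def max_pool(shape, size=2, stride=2):
    """Max pooling with "valid" padding."""
    height, width, channels = shape
    return ((height - size) // stride + 1, (width - size) // stride + 1, channels)


def conv_params(in_channels, out_channels, kernel=5):
    """Weights plus one bias per output filter."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return inputs * units + units


def summarize():
    shape = (200, 300, 3)                    # input_layer
    shape = max_pool(conv_same(shape, 4))    # conv1 + pool1 -> (100, 150, 4)
    shape = max_pool(conv_same(shape, 8))    # conv2 + pool2 -> (50, 75, 8)
    flat = shape[0] * shape[1] * shape[2]    # pool2_flat
    params = (conv_params(3, 4) + conv_params(4, 8)
              + dense_params(flat, 200) + dense_params(200, 20)
              + dense_params(20, 2))
    return {"pool2_shape": shape, "flat_size": flat, "parameters": params}


if __name__ == "__main__":
    print(summarize())
```

Almost all parameters sit in dense1, because it connects the 30000 flattened features to 200 units.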

Computation time: I ran this code on a laptop with a 2014 quad-core CPU, a mid-range mobile GPU (2 GB VRAM) and 16 GB RAM. 100 training steps take 51.750 seconds. Visually acceptable predictions are reached after 2000 steps and good predictions after 6000 steps. A mean_squared_error of 0.01 is reachable.
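As a rough worked example of these numbers, the sketch below extrapolates the training time and shows what a mean squared error of 0.01 means per coordinate. It assumes that the time per step stays constant and that the error is averaged over all predicted coordinates. The function names are hypothetical and not taken from the repository.

```python
import math

SECONDS_PER_100_STEPS = 51.750  # measured value quoted above


def training_time_seconds(steps):
    """Linear extrapolation; assumes a constant time per training step."""
    return steps / 100 * SECONDS_PER_100_STEPS


def mean_squared_error(predictions, labels):
    """Average squared difference over all coordinates of all samples."""
    squared = [(p - t) ** 2
               for pred, label in zip(predictions, labels)
               for p, t in zip(pred, label)]
    return sum(squared) / len(squared)


if __name__ == "__main__":
    for steps in (2000, 6000):
        minutes = training_time_seconds(steps) / 60
        print(f"{steps} steps: about {minutes:.0f} minutes")
    # The root of the MSE gives a typical error per coordinate,
    # in the same normalized units as the labels.
    print("RMSE for MSE 0.01:", math.sqrt(0.01))
```

Under these assumptions, 2000 steps take about 17 minutes and 6000 steps about 52 minutes, and an MSE of 0.01 corresponds to a typical error of roughly 0.1 per coordinate in normalized units.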

Results: The current model converges well and predicts accurately. Still, there is a lot of room for improvement, for example by switching to a more advanced localization algorithm such as YOLOv3. A plain convolutional neural network that regresses the coordinates directly is not able to achieve outstanding accuracy.


There are two independent functionalities:

  1. Generate labeled data in the form of images that contain a single ellipse on a noisy background. For this, run the file

  2. Train a CNN that predicts the center of the ellipse. For this, run
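The data generation step can be sketched as follows. This is not the repository's generator: it is a dependency-free illustration in plain Python, whereas the project itself uses PIL (where `ImageDraw.Draw(img).ellipse(...)` would draw the outline). The function name `make_sample`, the size ranges, the noise level, the ring thickness and the label scaling are all assumptions.

```python
import random

WIDTH, HEIGHT = 300, 200  # assumed image size, matching the input layer


def make_sample(rng, width=WIDTH, height=HEIGHT, noise=60):
    """Return (image, label) for one red ellipse outline on gray noise.

    image is a list of rows of (r, g, b) tuples. label is the ellipse
    center, normalized by half the image size (an assumed convention).
    """
    a = rng.randint(20, 60)              # horizontal semi-axis in pixels
    b = rng.randint(20, 60)              # vertical semi-axis in pixels
    cx = rng.randint(a, width - a - 1)   # keep the ellipse inside the image
    cy = rng.randint(b, height - b - 1)
    image = []
    for y in range(height):
        row = []
        for x in range(width):
            # The ellipse equation equals 1 exactly on the outline.
            d = ((x - cx) / a) ** 2 + ((y - cy) / b) ** 2
            if abs(d - 1.0) < 0.15:
                row.append((255, 0, 0))  # red ring pixel
            else:
                gray = rng.randint(0, noise)
                row.append((gray, gray, gray))  # noisy background
        image.append(row)
    label = ((cx - width / 2) / (width / 2), (cy - height / 2) / (height / 2))
    return image, label


if __name__ == "__main__":
    image, label = make_sample(random.Random(0))
    print(len(image), len(image[0]), label)
```

Passing a seeded `random.Random` instance keeps the generated samples reproducible, which helps when comparing training runs.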

After a defined number of training steps, an evaluation with unseen test data follows. The predictions for the test data are visualized and saved to the test_output/ folder. The weights of the neural network are also saved to a defined folder and used as initialization for the next training, if available.

Author: Gerrit Schoettler Contact: gerrit.schoettler[at]


