Skip to content

Latest commit

 

History

History
109 lines (77 loc) · 4.15 KB

09-dropout.md

File metadata and controls

109 lines (77 loc) · 4.15 KB

8.9 Regularization and dropout

Slides

Dropout is a technique that prevents overfitting in neural networks by randomly dropping nodes of a layer during training. As a result, the trained model works as an ensemble model consisting of multiple neural networks.

From perivous experiments we got the best value of learning rate 0.01 and layer size of 100. We'll use these values for the next experiment along with different values of dropout rates:

# Function to define model by adding new dense layer and dropout
def make_model(learning_rate=0.01, size_inner=100, droprate=0.5):
    base_model = Xception(weights='imagenet',
                          include_top=False,
                          input_shape=(150,150,3))

    base_model.trainable = False
    
    #########################################
    
    inputs = keras.Input(shape=(150,150,3))
    base = base_model(inputs, training=False)
    vectors = keras.layers.GlobalAveragePooling2D()(base)
    inner = keras.layers.Dense(size_inner, activation='relu')(vectors)
    drop = keras.layers.Dropout(droprate)(inner) # add dropout layer
    outputs = keras.layers.Dense(10)(drop)
    model = keras.Model(inputs, outputs)
    
    #########################################
    
    optimizer = keras.optimizers.Adam(learning_rate=learning_rate)
    loss = keras.losses.CategoricalCrossentropy(from_logits=True)

    # Compile the model
    model.compile(optimizer=optimizer,
                  loss=loss,
                  metrics=['accuracy'])
    
    return model


# Create checkpoint to save best model for version 3
filepath = './xception_v3_{epoch:02d}_{val_accuracy:.3f}.h5'
checkpoint = keras.callbacks.ModelCheckpoint(filepath=filepath,
                                             save_best_only=True,
                                             monitor='val_accuracy',
                                             mode='max')

# Set the best values of learning rate and inner layer size based on previous experiments
learning_rate = 0.001
size = 100

# Dict to store results
scores = {}

# List of dropout rates
droprates = [0.0, 0.2, 0.5, 0.8]

for droprate in droprates:
    print(droprate)
    
    model = make_model(learning_rate=learning_rate,
                       size_inner=size,
                       droprate=droprate)
    
    # Train for longer (epochs=30) cause of dropout regularization
    history = model.fit(train_ds, epochs=30, validation_data=val_ds, callbacks=[checkpoint])
    scores[droprate] = history.history
    
    print()
    print()

Note: Because we introduce dropout in the neural networks, we will need to train our model for longer, hence, number of epochs is set to 30.

Classes, functions, attributes:

  • tf.keras.layers.Dropout(): dropout layer to randomly sets input units (i.e, nodes) to 0 with a frequency of rate at each epoch during training
  • rate: argument to set the fraction of the input units to drop, it is a value of float between 0 and 1

Notes

Add notes from the video (PRs are welcome)

  • A neural network might learn false patterns, i.e. if it repeatedly recognizes a certain logo on a t-shirt it might learn that the logo defines the t-shirt which is wrong since the logo might also be seen on a hoodie.
  • hiding parts of the images (freeze) from being seen by the learning neural network
  • dropout = randomly freezing parts of the image
  • comparing different performance parameters while changing dropout rate and regularization
⚠️ The notes are written by the community.
If you see an error here, please create a PR with a fix.

Navigation