Negative loss & logits_variance_loss #12

Open
GKalliatakis opened this issue Jan 20, 2020 · 4 comments

Comments

@GKalliatakis

Hi, I have created a Bayesian CNN classifier as described in this repo, but my model's overall loss is always negative, as is the logits_variance_loss (see screenshot below). Any idea why that is happening?

[Screenshot from 2020-01-20 14-49-19]

@pranavpandey2511

@GKalliatakis Hi, can you please share the code for the loss function you wrote, along with the training loop code?

@GKalliatakis
Author

The loss function is exactly the one described in this repo:

# Bayesian categorical cross entropy.
# N data points, C classes, T monte carlo simulations
# true - true values. Shape: (N, C)
# pred_var - predicted logit values and variance. Shape: (N, C + 1)
# returns - loss (N,)
def bayesian_categorical_crossentropy(T, num_classes):
  def bayesian_categorical_crossentropy_internal(true, pred_var):
    # shape: (N,)
    std = K.sqrt(pred_var[:, num_classes:])
    # shape: (N,)
    variance = pred_var[:, num_classes]
    variance_depressor = K.exp(variance) - K.ones_like(variance)
    # shape: (N, C)
    pred = pred_var[:, 0:num_classes]
    # shape: (N,)
    undistorted_loss = K.categorical_crossentropy(pred, true, from_logits=True)
    # shape: (T,)
    iterable = K.variable(np.ones(T))
    dist = distributions.Normal(loc=K.zeros_like(std), scale=std)
    monte_carlo_results = K.map_fn(gaussian_categorical_crossentropy(true, pred, dist, undistorted_loss, num_classes), iterable, name='monte_carlo_results')
    
    variance_loss = K.mean(monte_carlo_results, axis=0) * undistorted_loss
    
    return variance_loss + undistorted_loss + variance_depressor
  
  return bayesian_categorical_crossentropy_internal

# for a single monte carlo simulation, 
#   calculate categorical_crossentropy of 
#   predicted logit values plus gaussian 
#   noise vs true values.
# true - true values. Shape: (N, C)
# pred - predicted logit values. Shape: (N, C)
# dist - normal distribution to sample from. Shape: (N, C)
# undistorted_loss - the crossentropy loss without variance distortion. Shape: (N,)
# num_classes - the number of classes. C
# returns - total differences for all classes (N,)
def gaussian_categorical_crossentropy(true, pred, dist, undistorted_loss, num_classes):
  def map_fn(i):
    std_samples = K.transpose(dist.sample(num_classes))
    distorted_loss = K.categorical_crossentropy(pred + std_samples, true, from_logits=True)
    diff = undistorted_loss - distorted_loss
    return -K.elu(diff)
  return map_fn
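
For completeness, the imports this snippet assumes (my best guess at the TF-1.x-era modules it was written against; the exact paths depend on your TensorFlow/Keras versions):

import numpy as np
from keras import backend as K
# 'distributions' is the module providing Normal; depending on the TF version
# it is either tensorflow_probability.distributions or the older
# tf.distributions module.
import tensorflow_probability as tfp
distributions = tfp.distributions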

Then the model was compiled with the following settings (again as described in this repo):

        # Compile the model using two losses, one is the aleatoric uncertainty loss function
        # and the other is the standard categorical cross entropy function.
        self.model.compile(
            optimizer=Adam(lr=1e-3, decay=0.001),
            # optimizer=SGD(lr=1e-5, momentum=0.9),
            loss={'logits_variance': bayesian_categorical_crossentropy(self.monte_carlo_simulations, self.classes),  # aleatoric uncertainty loss function
                  'softmax_output': 'categorical_crossentropy'  # standard categorical cross entropy function
                  # 'softmax_output': standard_categorical_cross_entropy  # standard categorical cross entropy function
                  },
            metrics={'softmax_output': metrics.categorical_accuracy},
            # the aleatoric uncertainty loss function is weighted less than the categorical cross entropy loss
            # because the aleatoric uncertainty loss includes the categorical cross entropy loss as one of its terms.
            loss_weights={'logits_variance': .2, 'softmax_output': 1.}
        )
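
For reference (my reading of the Keras loss_weights semantics, not something from this repo): the total loss Keras reports is the weighted sum of the per-output losses, so a sufficiently negative logits_variance loss drags the displayed overall loss negative even though the softmax_output cross entropy itself stays non-negative:

# tiny arithmetic sketch with made-up example values
logits_variance_loss = -2.0   # example negative value for the aleatoric loss
softmax_output_loss = 0.3     # ordinary categorical cross entropy, >= 0
total = 0.2 * logits_variance_loss + 1.0 * softmax_output_loss
print(total)                  # -0.1 -> overall loss is reported as negative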

The only thing I am concerned about, and that differs from the implementation described here, is the way raw images are fed in during training: I am dealing with a multi-output model, whereas the author of this repo works with a smaller dataset that lets him call model.fit directly.
In my case I have created a custom generator:

def multiple_outputs(generator, image_dir, batch_size, image_size, subset):
    gen = generator.flow_from_directory(
        image_dir,
        target_size=(image_size, image_size),
        batch_size=batch_size,
        class_mode='categorical',
        subset=subset)

    while True:
        gnext = gen.next()
        # yield the image batch and two copies of the labels, one per model output
        yield gnext[0], [gnext[1], gnext[1]]

which is used as follows:

datagen = ImageDataGenerator(rescale=1. / 255, validation_split=0.20)
custom_train_generator = multiple_outputs(generator = datagen,
                                             image_dir = base_dir,
                                             batch_size = train_batch_size,
                                             image_size = img_width,
                                             subset = 'training')
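
(A matching validation generator would presumably be built the same way from the 'validation' subset; this is only a hypothetical sketch, since that part of my code is not shown here:)

custom_valid_generator = multiple_outputs(generator = datagen,
                                             image_dir = base_dir,
                                             batch_size = train_batch_size,
                                             image_size = img_width,
                                             subset = 'validation')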

and then the custom generator is passed during fit:

history = self.model.fit_generator(custom_train_generator,
                                          epochs=nb_of_epochs,
                                          steps_per_epoch=steps_per_epoch,
                                          validation_data=validation_data,
                                          validation_steps=validation_steps,
                                          callbacks=callbacks_list)
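
For what it's worth, here is a hypothetical variant of the generator above (not what I am currently running) that yields the labels as a dict keyed by the output-layer names used in compile(), so the label-to-output pairing is explicit rather than positional:

def multiple_outputs_named(generator, image_dir, batch_size, image_size, subset):
    gen = generator.flow_from_directory(
        image_dir,
        target_size=(image_size, image_size),
        batch_size=batch_size,
        class_mode='categorical',
        subset=subset)

    while True:
        x, y = next(gen)
        # route the same labels to both heads, keyed by output name
        yield x, {'logits_variance': y, 'softmax_output': y}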

Any thoughts?

@kenrickfernandes

Hello @GKalliatakis, were you able to solve this issue?

@sborquez

Hello, I think the order of the arguments of the K.categorical_crossentropy calls is wrong 🤔. In the Keras documentation, the arguments appear with y_true as the first argument and y_pred as the second.

undistorted_loss = K.categorical_crossentropy(true, pred, from_logits=True)

distorted_loss = K.categorical_crossentropy(true, pred + std_samples, from_logits=True)

Should the pred and true arguments be swapped?
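
A quick numpy sketch (not the repo's code) of why the swapped order can go negative: categorical cross entropy computes -sum(target * log_softmax(output)), so if the raw logits (which can be negative) end up in the target slot, nothing keeps the result non-negative anymore:

import numpy as np

def log_softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=-1, keepdims=True))

def xent(target, output_logits):
    # -sum over classes of target * log_softmax(output_logits)
    return -(target * log_softmax(output_logits)).sum(axis=-1)

true = np.array([[0., 1., 0.]])          # one-hot label
logits = np.array([[-3.0, 4.0, -1.0]])   # raw logits, some negative

print(xent(true, logits))   # correct order: always >= 0
print(xent(logits, true))   # swapped order: can be negative (about -4.0 here)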
