Repo for the 'Artificial Neural Networks and Deep Learning' course (2019/2020)

This repository contains the final Python notebooks used in the 3 Kaggle challenges proposed during the course. We exploited Colab and Kaggle servers to train our models. Since those platforms keep their own file history, we sometimes forgot to update this repository, so some intermediate modifications we made to the files may be missing here. The datasets used in each of the challenges are contained in a separate repo, which is imported as a submodule.

Artificial Neural Networks have shown impressive results in a broad range of application domains. The challenges are nothing more than a set of problems taken from image processing, presented in an order that progressively increases the complexity of the tasks.

The repo is organized as follows:

DL-CompetitionsDatasets, the submodule that contains the datasets;

dataSetStatistics.py, a script used to evaluate some characteristics of the datasets;

image_classification.ipynb, Python notebook for the first challenge;

image_segmentation.ipynb, Python notebook for the second challenge;

question_answering.ipynb, Python notebook for the third challenge;

resize_on_disk.ipynb, Python notebook used to downscale the dataset of the third challenge.

The challenges

  1. Image classification

    The first competition consists of a classification problem: given an image, the goal is to predict the correct class to which the image belongs. The task asks to categorize 307 images into 20 different classes.

    In this challenge we have used: Convolutional Neural Networks, basic data augmentation techniques (zoom, rotation, horizontal and vertical flip), transfer learning with and without fine-tuning (ResNet, DenseNet201, InceptionV3, InceptionResNetV2, Xception), and ensembles with K-folding.
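
    To give a rough idea of how the transfer-learning part can be set up, here is a minimal sketch using tf.keras with DenseNet201 as a frozen backbone; the image size, augmentation values and classification head are placeholder assumptions, not the exact configuration used in the notebook.

    ```python
    from tensorflow.keras import layers, models
    from tensorflow.keras.applications import DenseNet201
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    NUM_CLASSES = 20        # 20 target classes
    IMG_SIZE = (256, 256)   # placeholder input resolution

    # Basic augmentation: zoom, rotation, horizontal and vertical flip
    train_datagen = ImageDataGenerator(
        rescale=1.0 / 255,
        zoom_range=0.2,
        rotation_range=20,
        horizontal_flip=True,
        vertical_flip=True,
    )

    # Pretrained backbone: keep it frozen for plain transfer learning,
    # set base.trainable = True (possibly only for the last blocks) to fine-tune
    base = DenseNet201(weights="imagenet", include_top=False, input_shape=IMG_SIZE + (3,))
    base.trainable = False

    model = models.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dense(256, activation="relu"),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    ```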

    For more information on the competition or on the techniques applied, take a look at the two links below.

    Link to the Kaggle competition; notebook.

  2. Aerial image segmentation

    In this second challenge we were asked to segment an image. Image segmentation can be seen as a classification problem applied to each pixel of the input image. The dataset, which was very likely a subset of the Inria dataset, contains aerial orthorectified color images (you can see an example below). The challenge consists in determining which pixels belong to a building.

    In this challenge we have used: U-Net models, transfer learning with pretrained networks such as DenseUNet and ResUNet, data augmentation (horizontal/vertical flips and zoom), and preprocessing and postprocessing techniques taken from image analysis and computer vision (histogram equalization, Gaussian filters, morphological transformations provided by OpenCV). We also tried to increase the number of input channels by adding what can be obtained through a Laplacian filter, and we experimented with a custom data augmentation aimed at enriching the dataset by creating synthetic aerial images.
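
    As an illustration of the kind of preprocessing/postprocessing mentioned above, here is a small OpenCV sketch; the kernel sizes and the extra Laplacian channel are illustrative assumptions, not the exact pipeline from the notebook.

    ```python
    import cv2
    import numpy as np

    def preprocess(bgr_image: np.ndarray) -> np.ndarray:
        # Histogram equalization on the luminance channel
        ycrcb = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2YCrCb)
        ycrcb[:, :, 0] = cv2.equalizeHist(ycrcb[:, :, 0])
        equalized = cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2BGR)

        # Light Gaussian smoothing to reduce noise
        smoothed = cv2.GaussianBlur(equalized, (3, 3), 0)

        # Laplacian of the grayscale image appended as an extra input channel
        gray = cv2.cvtColor(smoothed, cv2.COLOR_BGR2GRAY)
        laplacian = cv2.Laplacian(gray, cv2.CV_8U)
        return np.dstack([smoothed, laplacian])

    def postprocess(mask: np.ndarray) -> np.ndarray:
        # Morphological opening removes small spurious blobs from the predicted mask
        kernel = np.ones((3, 3), np.uint8)
        return cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    ```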

    For more information on the competition or on the techniques applied, take a look at the two links below.

    Example output / input / ground truth:

    Link to the Kaggle competition; notebook.

  3. Visual question answering

    This was the most difficult challenge we faced. In this task the network takes two inputs: i) a synthetic scene in which several objects with different geometric shapes and/or finishes (colour, material) are present, and ii) a question about the existence of something in the scene (e.g., 'Is there a yellow thing?') or about counting (e.g., 'How many big objects are there?'). The network has to produce a suitable answer by choosing from a set of predefined answers: yes, no, 0, 1, ..., 9. So, in a certain sense, it can be seen as a classification problem.

    An example:
    Q: What number of other matte objects are the same shape as the small rubber object?
    A: 1

    Even though the challenge was based on a subset of CLEVR, the dataset was huge: more than 12 GB. As a consequence, the first thing we did was to speed up the training procedure, since a batch of 64 elements took 2 seconds to be processed. After reading A simple neural network module for relational reasoning, it became clear that the task could be solved using images at a lower resolution. In this way, we reduced the time needed to process a batch by around 8 times, which also allowed us to exploit more efficient caching mechanisms.
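
    The downscaling itself was done once, on disk (see resize_on_disk.ipynb for the actual code). A minimal sketch of that step, assuming Pillow and placeholder paths/resolution, could look like this:

    ```python
    from pathlib import Path
    from PIL import Image

    SRC = Path("vqa_images")        # placeholder: folder with the original images
    DST = Path("vqa_images_small")  # placeholder: folder for the downscaled copies
    TARGET_SIZE = (160, 120)        # placeholder lower resolution

    DST.mkdir(exist_ok=True)
    for src_path in SRC.glob("*.png"):
        with Image.open(src_path) as img:
            # Downscale and save once, so training reads much smaller files
            img.resize(TARGET_SIZE, Image.BILINEAR).save(DST / src_path.name)
    ```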

    The basic architecture that we used was a combination of three NNs: a CNN processed the image, while an embedding layer followed by an LSTM examined the question. The two outputs were then combined and transformed by a dense layer to output a one-hot encoded answer.
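
    A minimal sketch of this combination with the tf.keras functional API follows; the sizes, vocabulary and convolutional stack are placeholder assumptions, not the configuration from the notebook.

    ```python
    from tensorflow.keras import layers, Model

    NUM_ANSWERS = 12           # yes, no, 0..9 as listed above
    VOCAB_SIZE = 1000          # placeholder vocabulary size
    MAX_Q_LEN = 30             # placeholder maximum question length
    IMG_SHAPE = (120, 160, 3)  # placeholder (downscaled) image shape

    # CNN branch: processes the image
    img_in = layers.Input(shape=IMG_SHAPE)
    x = layers.Conv2D(32, 3, activation="relu")(img_in)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(64, 3, activation="relu")(x)
    x = layers.MaxPooling2D()(x)
    x = layers.GlobalAveragePooling2D()(x)

    # Embedding + LSTM branch: processes the question
    q_in = layers.Input(shape=(MAX_Q_LEN,))
    q = layers.Embedding(VOCAB_SIZE, 64)(q_in)
    q = layers.LSTM(128)(q)

    # Merge the two representations and classify into the predefined answers
    merged = layers.Concatenate()([x, q])
    h = layers.Dense(256, activation="relu")(merged)
    out = layers.Dense(NUM_ANSWERS, activation="softmax")(h)

    model = Model(inputs=[img_in, q_in], outputs=out)
    model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    ```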

    We have tried several approaches: tackling counting and boolean questions with different networks, GRUs, different pretrained feature extractors, pretrained word embeddings, and attention mechanisms; we also designed a custom data generator to provide evenly distributed batches.
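
    The idea behind the custom data generator was to draw roughly the same number of samples per answer class in every batch. A hypothetical sketch (the names, the load_sample helper and the sizes are assumptions):

    ```python
    import random
    import numpy as np

    def balanced_batches(samples_by_answer, batch_size, load_sample):
        """Yield batches evenly distributed over the answer classes.

        samples_by_answer maps each answer class to the list of its sample ids
        (each class is assumed to have at least batch_size // num_classes samples);
        load_sample turns one id into (image_array, question_array, one_hot_answer).
        """
        classes = list(samples_by_answer.keys())
        per_class = max(1, batch_size // len(classes))
        while True:
            batch_ids = []
            for cls in classes:
                # Sample the same number of elements from every answer class
                batch_ids.extend(random.sample(samples_by_answer[cls], per_class))
            random.shuffle(batch_ids)
            images, questions, answers = zip(*(load_sample(i) for i in batch_ids))
            yield [np.stack(images), np.stack(questions)], np.stack(answers)
    ```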

    For more information on the competition or on the techniques applied, take a look at the two links below.

    Link to the Kaggle competition; notebook.
