OASIS brain data set using VQVAE - Daniel Miller 45810536 #458

Open · wants to merge 30 commits into base: topic-recognition

Commits (30)
c2702e2
Trying something meaningful
Oct 11, 2022
15ca543
Created required files
dapmiller Oct 11, 2022
032ac04
Provided brief description for each file of what is to be included
dapmiller Oct 14, 2022
446c8ba
Added 2 new functions into datasey.py which one downloads the oasis d…
dapmiller Oct 14, 2022
13144c6
Add comments to function for explanation and understanding in dataset…
dapmiller Oct 15, 2022
6e334b3
Added functions to dataset.py to load labels and also process them wi…
dapmiller Oct 15, 2022
4c758ce
Added calls to load validation and test labels in train.py from datas…
dapmiller Oct 15, 2022
38430c7
Added brief summary to Read.me file about model
dapmiller Oct 15, 2022
d4f1ec4
Safety commit of working loading and processing of data before implem…
dapmiller Oct 15, 2022
b1bb74b
Created encoder, decoder and overall vq-vae functions in modules.py h…
dapmiller Oct 15, 2022
fa73b09
Finished building model in modules.py. It compiles and prints out sum…
dapmiller Oct 18, 2022
b2c7b49
Finished building model in modules.py and it successfully compiles in…
dapmiller Oct 18, 2022
df70801
Testing commit
dapmiller Oct 18, 2022
dee848c
Commited the module.py file and removed unnecessary code
dapmiller Oct 18, 2022
9ee75d3
created a function to batch data before training in dataset.py
dapmiller Oct 19, 2022
605cb17
Successfully trained data. Safety Check Commit. Removed Batching func…
dapmiller Oct 20, 2022
1e99576
Created functions in train.py that compute structural similiarity. It…
dapmiller Oct 20, 2022
d361777
Edited the ssim in train.py. Also added references to dataset.py anhd…
dapmiller Oct 21, 2022
235585b
Fixed bug that was stopping loss reduce in model.fit. Currently imple…
dapmiller Oct 21, 2022
b0d498b
Create Images
dapmiller Oct 21, 2022
9b90b9f
Delete Images
dapmiller Oct 21, 2022
f396991
Edited train.py bug in training. It now runs and the losses appear to…
dapmiller Oct 21, 2022
3aeadda
Fixed Bug in model.fit that caused loss. It was due to variance being…
dapmiller Oct 21, 2022
a8f82d0
Fixed Bug in model.fit that caused loss. It was due to variance being…
dapmiller Oct 21, 2022
92073ce
Merge branch 'topic-recognition' of https://github.com/dapmiller/Patt…
dapmiller Oct 21, 2022
ab702fe
Fixed Bug in model.fit that caused loss. It was due to variance being…
dapmiller Oct 21, 2022
6899684
Trying to add images
dapmiller Oct 21, 2022
63a9a18
Added Graph
dapmiller Oct 21, 2022
f5db768
Cleaned up Files and finsihed off REadme file. Commented out the pixe…
dapmiller Oct 21, 2022
09450f8
Fixed ssim to read 0.74 not 74
dapmiller Oct 21, 2022
53 changes: 53 additions & 0 deletions recognition/Miller/README.MD
@@ -0,0 +1,53 @@
# Vector Quantized Variational Auto-Encoder (VQ-VAE Model)

In this report, a generative model, the Vector Quantized Variational AutoEncoder (VQ-VAE), is used to generate reconstructed images of the OASIS brain data set that are "reasonably clear" and achieve a Structural Similarity (SSIM) of over 0.6. The VQ-VAE was implemented using TensorFlow Keras.

#### Description of the VQ-VAE Algorithm
![](https://miro.medium.com/max/1400/1*yRdNe3xi4f3KV6ULW7yArA.png)
>Figure 1: Graphical representation of a VQ-VAE network.

A standard VAE (encoder -> decoder) uses a continuous latent space sampled from a Gaussian distribution, which makes the distribution hard to learn with gradient descent. In contrast, the VQ-VAE uses a discrete latent space and consists of three parts, as seen above:

1. Encoder:
* Convolutional network that downsamples the features of an image
2. Latent Space:
* A codebook consisting of n latent embedding vectors, each of dimension D
* For each encoded output vector, the quantizer computes the Euclidean distance to every embedding in the codebook
* The closest codebook vector replaces the encoder output and is fed as input to the decoder (a minimal sketch of this lookup follows the list)
3. Decoder:
* Convolutional network that upsamples and generates the reconstructed samples.
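
To make the lookup in step 2 concrete, here is a minimal NumPy sketch (illustrative only; the array names and sizes are assumptions, not the project's code, which implements the same idea in TensorFlow in `modules.py`):

```python
import numpy as np

# Hypothetical codebook of n = 128 embeddings, each of dimension D = 16
codebook = np.random.randn(128, 16)
# Hypothetical batch of flattened encoder outputs, shape (n*h*w, D)
z_e = np.random.randn(1024, 16)

# Squared Euclidean distance from every encoder output to every embedding
distances = (
    (z_e ** 2).sum(axis=1, keepdims=True)
    - 2 * z_e @ codebook.T
    + (codebook ** 2).sum(axis=1)
)
nearest = distances.argmin(axis=1)   # index of the closest embedding per vector
z_q = codebook[nearest]              # quantized vectors fed to the decoder
```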

#### OASIS Brain Data Set
![](https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcRl7czOsj3uzWRQ6NT2ofed7QBsKiqrUq6Bsw&usqp=CAU)
>Figure 2: Comparison of an image stored in the train vs test data sets

The OASIS MRI dataset contains 9,664 training images, 544 test images and 1,120 validation images. An example of the train and test data is shown above. The images are preloaded into a file location, from which they are extracted and processed for use.

##### Data Pre-Processing

Before use, the data was normalised through residual extraction (subtracting the mean and dividing by the standard deviation) and min-max rescaling. This makes it easier to compare distributions with different means and scales while maintaining the shape of each distribution.
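
This corresponds to the `process_training` function in `dataset.py`; a minimal sketch of the same two steps:

```python
import numpy as np

def normalise(data_set: np.ndarray) -> np.ndarray:
    # Residual extraction: subtract the mean and divide by the standard deviation
    data_set = (data_set - np.mean(data_set)) / np.std(data_set)
    # Min-max rescaling: squeeze and shift all values into [0, 1]
    return (data_set - np.amin(data_set)) / (np.amax(data_set) - np.amin(data_set))
```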

## Training

The three data groups (train, test, and validation) are split approximately 0.85/0.05/0.10. The training set contains the most images so that the model has enough information to learn from to produce accurate reconstructions later. The test set is used to evaluate these reconstructions. The validation set is not required, as the model is judged by the quality of the reconstructions on the test set. The model is trained with ... epochs on a batch size of 128, as sketched below.
*insert image
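
Training is driven by the `VQVAETRAINER` class in `modules.py`. A minimal sketch of how it might be invoked (the optimiser, epoch count and data path here are assumptions, not the project's exact settings):

```python
import numpy as np
import tensorflow as tf
from dataset import load_training, process_training
from modules import VQVAETRAINER

train_set = process_training(load_training("path/to/train"))  # hypothetical path

trainer = VQVAETRAINER(np.var(train_set), latent_dimension=32, embeddings_num=128)
trainer.compile(optimizer=tf.keras.optimizers.Adam())  # assumed optimiser
trainer.fit(train_set, epochs=10, batch_size=128)      # epoch count illustrative
```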

## Results

The reconstructed images achieved a mean Structural Similarity of ...
*Insert image

## Dependencies
* Python 3.7
* TensorFlow 2.6.0
* NumPy 1.19.5
* Matplotlib 3.2.2
* Pillow 7.1.2
* os (Python standard library)
* Pre-processed OASIS MRI dataset (accessible at https://cloudstor.aarnet.edu.au/plus/s/n5aZ4XX1WBKp6HZ/download).

## References
[1] A. van den Oord, O. Vinyals, and K. Kavukcuoglu, 2018. Neural Discrete Representation Learning. [Online]. Available at: https://arxiv.org/pdf/1711.00937.pdf.

[2] Paul, S., 2021. Keras documentation: Vector-Quantized Variational Autoencoders. [Online]. Keras.io. Available at: https://keras.io/examples/generative/vq_vae/.

[3] shakes76, PatternFlow. [Online]. Available at: https://github.com/shakes76/PatternFlow/tree/master/recognition/MySolution.
113 changes: 113 additions & 0 deletions recognition/Miller/dataset.py
@@ -0,0 +1,113 @@
"""
dataset.py" containing the data loader for loading and preprocessing your data

This was file utilises and modifies the fucntions found in https://github.com/shakes76/PatternFlow/tree/master/recognition/MySolution
"""

import tensorflow as tf
import glob
import numpy as np
from matplotlib import image
import os
from PIL import Image


# Download the OASIS data as an archive. It may need to be extracted manually afterwards
def download_oasis ():

    dataset_url = "https://cloudstor.aarnet.edu.au/plus/s/n5aZ4XX1WBKp6HZ/download"

    # Download file from URL: origin=source URL, fname=local file name, untar=True extracts the archive
    tf.keras.utils.get_file(origin=dataset_url,fname='oa-sis' ,untar=True)
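
# Example usage (illustrative only; by default tf.keras.utils.get_file caches the
# download under ~/.keras/datasets, from where it can be extracted):
#   download_oasis()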

# Loads the training images (non-segmented) from the given path and returns a numpy array of image arrays
def load_training (path):

    image_list = []
    # Iterate over every '.png' file in the given path
    for filename in glob.glob(path + '/*.png'):
        # Read the image at the given filename into an array
        im = image.imread (filename)
        # Append array to list
        image_list.append(im)

    print('train_X shape:', np.array(image_list).shape)

    # Create a numpy array holding all the image arrays
    train_set = np.array(image_list, dtype=np.float32)

    return train_set

# Normalises training images and adds a 4th dimension
def process_training (data_set):

    """ Residual Extraction -> useful for comparing distributions with different means but similar shapes"""
    # Calculate the residuals of the data - each value becomes its distance from the distribution mean, which is now zero
    data_set = (data_set - np.mean(data_set)) / np.std(data_set)
    """ Min-Max Rescaling -> useful for comparing distributions with different scales or different shapes"""
    # Rescale data - ratio of each value's distance from the minimum to the range of values -> each value now lies in (0,1)
    # Forces datasets onto the same scale while preserving the shape of the distribution -> "squeezed and shifted to fit between 0 and 1"
    data_set= (data_set - np.amin(data_set)) / np.amax(data_set - np.amin(data_set))
    # Add 4th (channel) dimension
    data_set = data_set [:,:,:,np.newaxis]

    return data_set

# Loads label images from the given path, maps pixel values to class indices and converts the image data type to uint8
def load_labels (path):
    image_list =[]

    # Iterate over every '.png' file in the given path
    for filename in glob.glob(path+'/*.png'):
        # Read the image at the given filename into an array
        im=image.imread (filename)
        # Create an 'im.shape[0] x im.shape[1]' array of zeros
        one_hot = np.zeros((im.shape[0], im.shape[1]))
        # Iterate over the sorted unique pixel values of the image
        for i, unique_value in enumerate(np.unique(im)):
            # Map each unique pixel value to its class index -> transforms categorical pixel values into numerical class labels
            one_hot[:, :][im == unique_value] = i
        # Append array to list
        image_list.append(one_hot)

    print('train_y shape:',np.array(image_list).shape)

    # Create a numpy array holding all the label arrays
    labels = np.array(image_list, dtype=np.uint8)

    #pyplot.imshow(labels[2])
    #pyplot.show()

    return labels

# One-hot encodes label data and converts it to a numpy array
def process_labels (seg_data):
    onehot_Y = []

    # Iterate over all label images (first dimension of the array)
    for n in range(seg_data.shape[0]):

        # Get the label image at position n
        im = seg_data[n]

        # There are 4 classes
        n_classes = 4

        # Create an 'im.shape[0] x im.shape[1] x n_classes' array of zeros with type uint8
        one_hot = np.zeros((im.shape[0], im.shape[1], n_classes),dtype=np.uint8)

        # Iterate over the sorted unique class indices of the image
        for i, unique_value in enumerate(np.unique(im)):
            # Set channel i to 1 wherever the image equals that class index -> one-hot encoding per pixel
            one_hot[:, :, i][im == unique_value] = 1
        # Append array to list
        onehot_Y.append(one_hot)

    # Create a numpy array holding all the one-hot encoded label images
    onehot_Y =np.array(onehot_Y)
    #print (onehot_Y.dtype)
    #print (np.unique(onehot_validate_Y))
    #print (onehot_Y.shape)

    return onehot_Y
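
# Example pipeline (illustrative only; the paths are hypothetical):
#   train_X = process_training(load_training("keras_png_slices_data/train"))
#   train_Y = process_labels(load_labels("keras_png_slices_data/seg_train"))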
197 changes: 197 additions & 0 deletions recognition/Miller/modules.py
@@ -0,0 +1,197 @@
"""
“modules.py" containing the source code of the components of your model. Each component must be
implementated as a class or a function

Based on Neural Discrete Representation Learning by van der Oord et al https://arxiv.org/pdf/1711.00937.pdf
and the given example on https://keras.io/examples/generative/vq_vae/
"""
import tensorflow as tf

"""CREATE STRUCTURE OF VQ-VAR MODEL"""

"""
Class Representation of the Vector Quantization laye

Structure is:
1. Reshape into (n,h,w,d)
2. Calculate L2-normalized distance between the inputs and the embeddings. -> (n*h*w, d)
3. Argmin -> find minimum distance between indices for each n*w*h vector
4. Index from dictionary: index the closest vector from the dictionary for each of n*h*w vectors
5. Reshape into original shape (n, h, w, d)
6. Copy gradients from q -> x
"""
class VectorQ_layer(tf.keras.layers.Layer):
def __init__(self, embedding_num, latent_dimension, beta=0.25, **kwargs):
super().__init__(**kwargs)
self.embedding_num = embedding_num
self.latent_dimension = latent_dimension
self.beta = beta

# Initialize the embeddings which we will quantize.
w_init = tf.random_uniform_initializer()
self.embeddings = tf.Variable(initial_value=w_init(shape=(self.latent_dimension, self.embedding_num), dtype="float32"),trainable=True,name="embeddings_vqvae",)

# Forward Pass behaviour. Takes Tensor as input
def call(self, x):
# Calculate the input shape and store for later -> Shape of (n,h,w,d)
input_shape = tf.shape(x)

# Flatten the inputs to keep the embedding dimension intact.
# Combine all dimensions into last one 'd' -> (n*h*w, d)
flatten = tf.reshape(x, [-1, self.latent_dimension])

        # Get code indices
        # Calculate the squared L2 distance between the inputs and the embeddings.
        # For each of the n*h*w vectors, we calculate the distance to each of the k vectors of the embedding dictionary to obtain a matrix of shape (n*h*w, k)
        similarity = tf.matmul(flatten, self.embeddings)
        distances = (tf.reduce_sum(flatten ** 2, axis=1, keepdims=True) + tf.reduce_sum(self.embeddings ** 2, axis=0) - 2 * similarity)

        # For each of the n*h*w vectors, find the index of the closest of the k dictionary vectors (minimum distance)
        encoded_indices = tf.argmin(distances, axis=1)

        # Turn the indices into one-hot encoded vectors, then index the closest vector from the dictionary for each of the n*h*w vectors
        encodings = tf.one_hot(encoded_indices, self.embedding_num)
        quantized = tf.matmul(encodings, self.embeddings, transpose_b=True)

# Reshape the quantized values back to its original input shape -> (n,h,w,d)
quantized = tf.reshape(quantized, input_shape)

""" LOSS CALCULATIONS """
"""
COMMITMENT LOSS
Since volume of embedding spcae is dimensionless, it may grow arbitarily if embedding ei does not
train as fast as encoder parameters. Thus add a commitment loss to make sure encoder commits to an embedding
CODE BOOK LOSS
Gradients bypass embedding, so we use a dictionary learningn algorithm which uses l2 error to
move embedding vectors ei towards encoder output

tf.stop_gradient -> no gradient flows through
"""
commitment_loss = tf.reduce_mean((tf.stop_gradient(quantized) - x) ** 2)
codebook_loss = tf.reduce_mean((quantized - tf.stop_gradient(x)) ** 2)
self.add_loss(self.beta * commitment_loss + codebook_loss)
        # Straight-through estimator.
        # Unable to backpropagate as the gradient won't flow through argmin. Hence copy the gradient from quantized to x.
        # During backpropagation, (quantized - x) won't be included in the computation and the gradient obtained for quantized is copied to the inputs
        quantized = x + tf.stop_gradient(quantized - x)

return quantized

# Represents the VAE Structure
class VAE:
def __init__(self, embedding_num, latent_dimension, beta=0.25):
self.embedding_num = embedding_num
self.latent_dimension = latent_dimension
self.beta=beta
"""
Returns layered model for encoder architecture built from convolutional layers.

activations: ReLU advised as other activations are not optimal for encoder/decoder quantization architecture.
e.g. Leaky ReLU activated models are difficult to train -> cause sporadic loss spikes that model struggles to recover from
"""
    # Encoder Component
    def encoder_component(self):
        # 2D convolutional layers:
        #   filters -> dimension of the output space
        #   kernel_size -> convolution window size
        #   activation -> activation function used
        #   strides -> spaces the convolution window moves vertically and horizontally
        #   padding -> "same" pads with zeros so the output size matches the input size
inputs = tf.keras.Input(shape=(256, 256, 1))

layer = tf.keras.layers.Conv2D(32, 3, activation="relu", strides=2, padding="same")(inputs)
layer = tf.keras.layers.Conv2D(64, 3, activation="relu", strides=2, padding="same")(layer)

outputs = tf.keras.layers.Conv2D(self.latent_dimension, 1, padding="same")(layer)
return tf.keras.Model(inputs, outputs, name="encoder")

    # Returns the VQ layer
def vq_layer(self):
return VectorQ_layer(self.embedding_num, self.latent_dimension, self.beta, name="vector_quantizer")

"""
Returns the model for decoder architecture built from tranposed convolutional layers.

activations: ReLU advised as other activations are not optimal for encoder/decoder quantization architecture.
e.g. Leaky ReLU activated models are difficult to train -> cause sporadic loss spikes that model struggles to recover from
"""
    # Decoder Component
    def decoder_component(self):
        inputs = tf.keras.Input(shape=self.encoder_component().output.shape[1:])
        # 2D transposed convolutional layers:
        #   filters -> dimension of the output space
        #   kernel_size -> convolution window size
        #   activation -> activation function used
        #   strides -> spaces the convolution window moves vertically and horizontally
        #   padding -> "same" pads with zeros so the output size matches the input size
layer = tf.keras.layers.Conv2DTranspose(64, 3, activation="relu", strides=2, padding="same")(inputs)
layer = tf.keras.layers.Conv2DTranspose(32, 3, activation="relu", strides=2, padding="same")(layer)
outputs = tf.keras.layers.Conv2DTranspose(1, 3, padding="same")(layer)
return tf.keras.Model(inputs, outputs, name="decoder")

# Build Model
def build_model(self):
vq_layer = self.vq_layer()
encoder = self.encoder_component()
decoder = self.decoder_component()

inputs = tf.keras.Input(shape=(256, 256, 1))
encoder_outputs = encoder(inputs)
quantized_latents = vq_layer(encoder_outputs)
reconstructions = decoder(quantized_latents)
model = tf.keras.Model(inputs, reconstructions, name="vq_vae")
model.summary()
return model
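
# Example usage (illustrative only):
#   vae = VAE(embedding_num=128, latent_dimension=32)
#   vqvae_model = vae.build_model()  # prints a summary of encoder -> quantizer -> decoder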

# Creates a model instance and sets training parameters
class VQVAETRAINER(tf.keras.models.Model):
def __init__(self, variance, latent_dimension=32, embeddings_num=128, **kwargs):

super(VQVAETRAINER, self).__init__(**kwargs)
self.latent_dimension = latent_dimension
self.embeddings_num = embeddings_num
self.variance = variance

VAE_model = VAE(self.embeddings_num, self.latent_dimension)
self.vqvae_model = VAE_model.build_model()


self.total_loss_tracker = tf.keras.metrics.Mean(name="total_loss")
self.reconstruction_loss_tracker = tf.keras.metrics.Mean(name="reconstruction_loss")
self.vq_loss_tracker = tf.keras.metrics.Mean(name="vq_loss")

@property
def metrics(self):
# Model metrics -> returns losses (total loss, reconstruction loss and the vq_loss)
return [self.total_loss_tracker, self.reconstruction_loss_tracker, self.vq_loss_tracker]

def train_step(self, x):
with tf.GradientTape() as tape:
# Outputs from the VQ-VAE.
reconstructions = self.vqvae_model(x)

# Calculate the losses.
reconstruction_loss = (tf.reduce_mean((x - reconstructions) ** 2) / self.variance)
total_loss = reconstruction_loss + sum(self.vqvae_model.losses)

# Backpropagation.
grads = tape.gradient(total_loss, self.vqvae_model.trainable_variables)
self.optimizer.apply_gradients(zip(grads, self.vqvae_model.trainable_variables))

# Loss tracking.
"""CODEBOOK LOSS + COMMITMENT LOSS -> euclidean loss + encoder loss"""
self.total_loss_tracker.update_state(total_loss)
"""RECONSTRUCTION ERROR (MSE) -> between input and reconstruction"""
self.reconstruction_loss_tracker.update_state(reconstruction_loss)
self.vq_loss_tracker.update_state(sum(self.vqvae_model.losses))

# Log results.
return {
"loss": self.total_loss_tracker.result(),
"reconstruction_loss": self.reconstruction_loss_tracker.result(),
"vqvae_loss": self.vq_loss_tracker.result(),
}

