# Pixel-level classification using a U-Net architecture for Biomedical Image Segmentation of Mitochondria
## Dataset

For the `create_patches.py` script to run as expected, make sure you download the sub-volumes (including the ground-truth sub-volumes) for both training and testing; these form the dataset. You can download the dataset from EPFL.
## Installation

- Clone the repository

  ```bash
  git clone https://github.com/KushGabani/Biomedical-Image-Segmentation.git
  ```

- Navigate to the directory

  ```bash
  cd Biomedical-Image-Segmentation
  ```

- Create a virtual environment using virtualenv

  ```bash
  # Not required if virtualenv is already installed
  pip install virtualenv   # (For Windows/Linux)
  pip3 install virtualenv  # (For Mac)

  # Required
  virtualenv venv
  ```

- Activate the virtual environment

  ```bash
  source venv/bin/activate  # (For Mac/Linux)
  venv\Scripts\activate     # (For Windows)
  ```

- Install the required dependencies

  ```bash
  pip install -r requirements.txt
  ```
## Creating Patches

A single image in the dataset is 768 × 1024, so we split each image and its corresponding mask into small patches. The patches are easier to process and also increase the number of images available for training.
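`create_patches.py` implements this step; purely as an illustration of the idea, here is a NumPy sketch of non-overlapping patch extraction (not the script's actual code):

```python
import numpy as np

def split_into_patches(image, patch_size=256):
    """Split a 2-D image (e.g., 768 x 1024) into non-overlapping square patches."""
    patches = []
    h, w = image.shape[:2]
    for top in range(0, h - patch_size + 1, patch_size):
        for left in range(0, w - patch_size + 1, patch_size):
            patches.append(image[top:top + patch_size, left:left + patch_size])
    return patches

# A 768 x 1024 image yields (768/256) * (1024/256) = 3 * 4 = 12 patches of 256 x 256.
image = np.zeros((768, 1024), dtype=np.uint8)
print(len(split_into_patches(image)))  # 12
```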
🚨 Make sure you create a new directory named `data` in the root of the project before executing the script.
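For example, from the project root:

```bash
mkdir data
```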
Execute this command to run the `create_patches.py` script:

```bash
# For Windows
python ./create_patches.py <dataset_root_directory> <patch_size>

# For Mac/Linux
python3 ./create_patches.py <dataset_root_directory> <patch_size>
```

A patch size of 256 × 256 is used if not specified.
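For example, assuming the downloaded sub-volumes were placed in a `./dataset` directory (a hypothetical path; substitute wherever you stored them), the following creates 256 × 256 patches:

```bash
python3 ./create_patches.py ./dataset 256
```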
## Data Generator

`data_generator.py` implements a custom data generator built on `tf.keras.utils.Sequence`. Use it instead of `data_preprocessor.py` if your local system cannot afford to process the entire dataset at once: the data generator reads and preprocesses images in batches at runtime during training.
Instantiate a `DataGenerator` object for both training and validation:

```python
from data_generator import DataGenerator

train_generator = DataGenerator(batch_size=16, data_dir="./data/", shuffle=True, phase='train', test_size=0.1)
validation_generator = DataGenerator(batch_size=16, data_dir="./data/", shuffle=False, phase='test', test_size=0.1)
```
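Under the hood, a `tf.keras.utils.Sequence` subclass only needs `__len__` (batches per epoch) and `__getitem__` (one batch). The following is a minimal, simplified sketch of that pattern, not the repository's actual `DataGenerator`; the file paths and loading logic are placeholders:

```python
import math
import numpy as np
import tensorflow as tf

class MinimalSequence(tf.keras.utils.Sequence):
    """Simplified batch loader; the real DataGenerator adds shuffling,
    train/test splitting, and the project's actual preprocessing."""

    def __init__(self, image_paths, mask_paths, batch_size=16):
        self.image_paths = image_paths
        self.mask_paths = mask_paths
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch.
        return math.ceil(len(self.image_paths) / self.batch_size)

    def __getitem__(self, idx):
        # Load and normalize one batch lazily, at training time.
        lo = idx * self.batch_size
        hi = lo + self.batch_size
        images = [self._load(p) for p in self.image_paths[lo:hi]]
        masks = [self._load(p) for p in self.mask_paths[lo:hi]]
        return np.stack(images), np.stack(masks)

    def _load(self, path):
        # Placeholder loader: decode a grayscale PNG and scale to [0, 1].
        data = tf.io.read_file(path)
        img = tf.io.decode_png(data, channels=1)
        return tf.cast(img, tf.float32).numpy() / 255.0
```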
The preprocessed data is too large to be uploaded to the GitHub repository, so you will have to preprocess it locally. If you don't want to preprocess the data beforehand, you can use `data_generator.py`, which provides a `DataGenerator` for `model.fit()`.
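Because a `Sequence` can be passed straight to Keras, training with the generators then looks like this (a sketch: `model` stands for your compiled U-Net, and the epoch count is an arbitrary placeholder):

```python
# model is a compiled tf.keras.Model (see unet.py); epochs is a placeholder value.
model.fit(
    train_generator,
    validation_data=validation_generator,
    epochs=50,
)
```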
## Data Preprocessing

Once the dataset is downloaded and the patches are created, you can execute the `data_preprocessor.py` script to preprocess the data and save it in NumPy's compressed file format.
```bash
# For Windows
python ./data_preprocessor.py

# For Mac/Linux
python3 ./data_preprocessor.py
```
The resulting `.npz` file can be found in the root directory of the project with the filename `preprocessed_data.npz`.
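To sanity-check the output, the archive can be opened with `numpy.load`. The array key names are whatever `data_preprocessor.py` used when saving, so the names below are assumptions:

```python
import numpy as np

data = np.load("preprocessed_data.npz")
print(data.files)  # prints the actual keys stored in the archive
# images = data["images"]  # hypothetical key name
# masks = data["masks"]    # hypothetical key name
```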
## Visualizing Samples

You can now visualize the preprocessed data by executing the `plot_samples.py` script:
```bash
# For Windows
python ./plot_samples.py

# For Mac/Linux
python3 ./plot_samples.py
```
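As a rough idea of what such a visualization involves (a generic matplotlib sketch, not the actual contents of `plot_samples.py`), one image/mask pair from the archive can be shown like this:

```python
import matplotlib.pyplot as plt
import numpy as np

data = np.load("preprocessed_data.npz")
# Take the first sample from the first two arrays in the archive
# (assumed to be the image patches and their masks).
image = data[data.files[0]][0]
mask = data[data.files[1]][0]

fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.imshow(image.squeeze(), cmap="gray")
ax1.set_title("Patch")
ax2.imshow(mask.squeeze(), cmap="gray")
ax2.set_title("Mask")
plt.show()
```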
## Model Architecture

The U-Net architecture is used for segmentation. `unet.py` contains the model architecture, implemented in TensorFlow 2.x using the Keras API.
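The exact layer configuration lives in `unet.py`; as an illustration of the general U-Net pattern (contracting path, bottleneck, expanding path with skip connections), here is a compact Keras sketch with made-up filter counts:

```python
import tensorflow as tf
from tensorflow.keras import layers

def conv_block(x, filters):
    """Two 3x3 convolutions, the basic U-Net building block."""
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x

def build_unet(input_shape=(256, 256, 1)):
    inputs = tf.keras.Input(shape=input_shape)

    # Contracting path: convolve, remember the feature map, then downsample.
    c1 = conv_block(inputs, 16)
    p1 = layers.MaxPooling2D()(c1)
    c2 = conv_block(p1, 32)
    p2 = layers.MaxPooling2D()(c2)

    # Bottleneck.
    b = conv_block(p2, 64)

    # Expanding path: upsample and concatenate the matching skip connection.
    u2 = layers.Conv2DTranspose(32, 2, strides=2, padding="same")(b)
    c3 = conv_block(layers.Concatenate()([u2, c2]), 32)
    u1 = layers.Conv2DTranspose(16, 2, strides=2, padding="same")(c3)
    c4 = conv_block(layers.Concatenate()([u1, c1]), 16)

    # 1x1 convolution + sigmoid gives a per-pixel foreground probability.
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(c4)
    return tf.keras.Model(inputs, outputs)

model = build_unet()
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```

The sigmoid output reflects the pixel-level binary classification described above: each output value is the probability that the corresponding pixel belongs to a mitochondrion.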