Why Deep? Because of deep-learning.
Why FISH? Because of Fluorescence in situ Hybridization.
This repository is intended to share data and code for solving some problems encountered in cytogenetics imaging, such as overlapping chromosomes.
In cytogenetics, experiments typically start from chromosomal preparations fixed on glass slides. Occasionally a chromosome falls on another one, yielding overlapping chromosomes in the image. Before computers and digital image processing, in the era of photography, chromosomes were cut out of a paper print and then classified (at least two prints were required when chromosomes overlapped). Automatic segmentation methods were developed to overcome this problem; however, these methods rely on a geometric analysis of the chromosome contour and require some human intervention when partial overlap occurs.
QFISH on metaphase was classified by Vera and Blasco as a low-throughput method for quantitative analysis of telomere length. One of the bottlenecks of the method is resolving the overlapping chromosomes. Modern deep-learning techniques have the potential to provide a more reliable, fully automated solution.
- A Geometric Approach To Fully Automatic Chromosome Segmentation
- Automated Discrimination of Dicentric and Monocentric Chromosomes by Machine Learning-based Image Processing
- An Efficient Segmentation Method for Overlapping Chromosome Images
- A Review of Cytogenetics and its Automation
This notebook is run from Jupyter with a Python 2 kernel on Ubuntu 16.04, inside a virtual environment that uses the Python packages available on the system. Several image processing libraries are used:
So far, the repository contains only Python notebooks, whose purpose is to produce a dataset large enough to train a supervised learning algorithm (semantic segmentation) capable of segmenting overlapping chromosomes. The generated overlaps involve only two chromosomes (this is a start). They are obtained by varying the relative positions and orientations of the two chromosomes.
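The generation step described above can be sketched as follows. This is a minimal illustration, not the code of the notebooks: the function names (`place`, `make_overlap`, `sample_pairs`), the canvas size, the label encoding (0 background, 1 first chromosome only, 2 second chromosome only, 3 overlap) and the choice of summing the two grey-level images are all assumptions made for the example. It relies only on `numpy` and `scipy.ndimage`.

```python
import numpy as np
from scipy import ndimage


def place(chrom, canvas_shape, angle_deg, shift_rc):
    """Paste a single-chromosome image at the canvas centre, then rotate and shift it."""
    canvas = np.zeros(canvas_shape, dtype=float)
    r0 = (canvas_shape[0] - chrom.shape[0]) // 2
    c0 = (canvas_shape[1] - chrom.shape[1]) // 2
    canvas[r0:r0 + chrom.shape[0], c0:c0 + chrom.shape[1]] = chrom
    # Rotate about the canvas centre without changing the canvas size.
    canvas = ndimage.rotate(canvas, angle_deg, reshape=False, order=1,
                            mode="constant", cval=0.0)
    # Translate by (rows, columns); pixels leaving the canvas are lost.
    canvas = ndimage.shift(canvas, shift_rc, order=1, mode="constant", cval=0.0)
    return canvas


def make_overlap(chrom_a, chrom_b, canvas_shape=(96, 96),
                 angle_deg=0.0, shift_rc=(0, 0), threshold=0.0):
    """Return a (grey, label) pair for two chromosomes.

    The first chromosome stays fixed at the centre; the second one is
    rotated by angle_deg and shifted by shift_rc.
    Hypothetical label encoding: 0 background, 1 A only, 2 B only, 3 overlap.
    """
    a = place(chrom_a, canvas_shape, 0.0, (0, 0))
    b = place(chrom_b, canvas_shape, angle_deg, shift_rc)
    mask_a = a > threshold
    mask_b = b > threshold
    grey = a + b  # assumption: intensities add where chromosomes overlap
    label = np.zeros(canvas_shape, dtype=np.uint8)
    label[mask_a] = 1
    label[mask_b] = 2
    label[mask_a & mask_b] = 3
    return grey, label


def sample_pairs(chrom_a, chrom_b, n, max_shift=10, seed=0):
    """Yield n (grey, label) pairs with random orientations and positions."""
    rng = np.random.RandomState(seed)
    for _ in range(n):
        angle = rng.uniform(0.0, 360.0)
        shift = rng.randint(-max_shift, max_shift + 1, size=2)
        yield make_overlap(chrom_a, chrom_b, angle_deg=angle, shift_rc=shift)
```

For instance, `list(sample_pairs(chrom_a, chrom_b, 1000))` would give 1000 pairs from two segmented single-chromosome images. The canvas must be large enough to hold each chromosome at any orientation, and the background of the input images is assumed to be zero (or below `threshold`).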
The first stage would be to submit one dataset to a semantic segmentation algorithm such as SegNet. Different implementations of SegNet are available in the current deep-learning frameworks:
The latter deep-learning framework is supposed to be more efficient than the SegNet implementation:
- Raw Images: a set of 15 multispectral raw images (DAPI, Cy3, Cy5) of human lymphocyte chromosomes (12-bit TIFF).
- DAPI.tif and Cy3.tif: 12-bit images of metaphase chromosomes. The telomeres marking the ends of the chromosomes are visible in the Cy3.tif image. This metaphase (metaphase 3) does not contain overlapping chromosomes.
- LowRes train directory: more than 100 000 pairs (grey + ground-truth label) of low-resolution images of overlapping chromosomes.
- LowRes validation directory: more than 100 000 pairs (grey + ground-truth label) of low-resolution images of overlapping chromosomes.
- FullRes train: given more computing power, why not try to train a neural network on full-resolution images (~100 000 pairs of grey + ground-truth images)?
- FullRes validation: ~50 000 pairs of grey + ground-truth images to check the training.