DeepInsightTab2Image

DeepInsight Tab2Image converter: improved performance and easier integration. It provides a simple way of converting tabular data to images for convolutional neural networks (CNNs). New functions, such as the blurring technique, can increase the classification accuracy of the CNN model used. A single function call converts tabular data to image samples:

[XTrain,model] = deepinsightTab2Img(data,options);

For details, type help deepinsightTab2Img in the Matlab console.

To convert test or validation data, use X = deepinsightConv(data,model), where model is the output of the deepinsightTab2Img function.

New updates in this package are:

1. The package is redesigned to simplify usage.
2. Several projection methods are included: tsne, umap, kpca, pca and lda (a supervised method).
3. A new blurring technique is included. It populates the pixels near characteristic pixels and has been shown to improve the classification performance of a CNN model significantly.
4. Conversion of a d x n matrix or a d x n x layers (3D) matrix is possible. Multi-omics or other multi-layered data can be converted to colored images.
5. For multi-layered data (d x n x layers), the projection can use a particular layer (e.g. layer-1). Pixel locations are then found from layer-1, and the elements of the other layers (e.g. layer-2 and layer-3) are mapped to these pixel locations.
6. Continuing from the previous item, all layers can also be used simultaneously to find the pixel locations. The elements of all layers are then mapped to the common pixel locations.
7. Data augmentation is possible.
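
The pixel-location idea behind the multi-layer conversion can be illustrated with a small sketch. It is not the package's Cart2Pixel or ConvPixel code: plain PCA via `svd` stands in for the projection method, the toy sizes are hypothetical, and averaging colliding features while leaving empty pixels at 0 are assumptions made only for this illustration.

```matlab
% Illustrative sketch only: not the package's Cart2Pixel/ConvPixel code.
rng(0);
d = 40; n = 12; layers = 3; px = 16;     % toy sizes (hypothetical)
data = rand(d, n, layers);

% Treat each feature (row) of layer 1 as a point and project it to 2-D.
% Plain PCA via SVD stands in for tsne/umap/kpca here.
F  = data(:,:,1);
Fc = F - mean(F, 1);                     % centre the feature points
[U, S, ~] = svd(Fc, 'econ');
xy = U(:,1:2) * S(1:2,1:2);              % d x 2 coordinates

% Scale the coordinates to integer pixel locations in 1..px.
mn  = min(xy, [], 1);
mx  = max(xy, [], 1);
loc = round((xy - mn) ./ (mx - mn) * (px - 1)) + 1;
xp  = loc(:,1);
yp  = loc(:,2);

% Map every layer of one sample onto the shared pixel locations.
% Colliding features are averaged; empty pixels stay 0 (assumption).
sample = 1;
img = zeros(px, px, layers);
for k = 1:layers
    img(:,:,k) = accumarray([xp yp], data(:,sample,k), [px px], @mean);
end
```

To base the projection on all layers at once, one possibility (again an assumption, not necessarily what the package does) is to replace `F` by `reshape(data, d, [])`, which places the layers side by side before projecting.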

DeepInsightTab2Image tested on:

OS: Linux Ubuntu 20.04; Matlab version: 2022a; GPU A100 (2 parallel);

Download and Install

Download the Matlab package DeepInsightTab2Image from the link above. Store it in your working directory and run a quick check that the code works properly:

>> data=rand(5,10);
>> [XTrain,model] = deepinsightTab2Img(data);
%following message will be displayed
NORM-2
Layer-1 data used for Cart2Pixel
tSNE with exact algorithm is used
Distance: euclidean
Pixels: 224 x 224

>> Xtest = deepinsightConv(rand(5,1),model);
%following message will be shown
NORM-2

The test is successful if the above two functions run without reporting errors.

Clear the Workspace and Load the example dataset omics.mat (data size is 8.2M):

>>  clear all;
>>  load omics.mat
% data = 5062 x 230 x 3 single
% Labels = 230 x 1 categorical

The omics.mat data above is multi-layered, with 5062 dimensions, 230 samples and 3 layers: d=5062; n=230; layers=3.

Set aside some samples

>>  rng('default');
>>  inx = randperm(size(data,2));
>>  inx = inx(1:10);
>>  dataXts = data(:,inx,:); % 5062 x 10 x 3 single
>>  data(:,inx,:) = [];      % 5062 x 220 x 3 single
>>  LabelsXts = Labels(inx); % 10 x 1 categorical
>>  Labels(inx) = [];        % 220 x 1 categorical

Execute Tab2Img function on data

>>  [XTrain, model] = deepinsightTab2Img(data);
% This will convert data to images (XTrain) using default parameters
% XTrain = 224 x 224 x 3 x 220 (4-D uint8)

model defines all the parameters used. Since the options were not changed in the above case, model.Parm shows the default parameters.

>>  model.Parm
% struct with fields
%     Method: 'tsne'
%       Dist: 'euclidean'
%     Labels: []
%  PixelSize: 224
%       Norm: 2
%    Augment: 'no'
% AugSamples: 500 (note: only used when 'Augment' = 'yes')
% FeatureMap: 1 (i.e. Layer-1 is used for projection and other layers are used for mapping).
%   Blurring: 'no'
%   SnowFall: 'no'
%       Step: 4
%    MPS_Fix: 1

model also defines the normalization parameters (model.Min and model.Max), the pixel locations (model.xp and model.yp), the image sizes and the Labels (if used).
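
Storing the normalization parameters in model matters because test data should be scaled with the training statistics rather than its own. The sketch below shows this with a generic per-feature min-max scaling; the scaling rule and the clipping to [0,1] are assumptions for illustration, and the package's Norm 1 and Norm 2 may work differently.

```matlab
% Generic per-feature min-max scaling; an assumption, not the package's Norm code.
Xtr = [1 2 3; 10 20 40];     % 2 features x 3 training samples
Xts = [2.5; 50];             % 2 features x 1 test sample

mn = min(Xtr, [], 2);        % per-feature minimum from training data only
mx = max(Xtr, [], 2);        % per-feature maximum from training data only

% Scale with the training statistics and clip out-of-range values to [0,1].
scale = @(X) min(max((X - mn) ./ (mx - mn), 0), 1);

XtrN = scale(Xtr);
XtsN = scale(Xts);           % the test sample reuses mn and mx
```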

Plot the converted sample
```
>>  figure; imshow(XTrain(:,:,:,1));
```
DeepInsightTab2Img- Method: tsne with Distance: euclidean

Type help deepinsightTab2Img for various options. Apply blurring technique

>>  [XTrain, model] = deepinsightTab2Img(data,'Blurring','yes');
>>  figure; imshow(XTrain(:,:,:,1));
% nearby pixels of characteristic pixels will be populated

DeepInsightTab2Img- Blurring technique

Convert a validation set or a test set using the model obtained from deepinsightTab2Img above.

>>  XTest = deepinsightConv(dataXts,model);
% XTest = 4-D uint8 of size 224 x 224 x 3 x 10

Plot the XTest images

>>  P = imtile(XTest(:,:,:,1:9)); % these XTest samples belong to different class Labels
>>  P = rescale(P);
>>  figure; imshow(P);

deepinsightConv- plotting Test images
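
Once the data is in image form, it can be passed to a CNN. The following is a minimal sketch using the Deep Learning Toolbox, not the authors' network: the layer stack and training settings are arbitrary choices, and small random arrays stand in for XTrain and Labels so that the block runs on its own.

```matlab
% Minimal sketch (requires Deep Learning Toolbox); not the authors' network.
rng(0);
X = uint8(randi([0 255], 32, 32, 3, 40));      % placeholder for XTrain (H x W x C x N)
Y = categorical([ones(20,1); 2*ones(20,1)]);   % placeholder for Labels (N x 1)

layers = [
    imageInputLayer([32 32 3])
    convolution2dLayer(3, 8, 'Padding', 'same')
    batchNormalizationLayer
    reluLayer
    maxPooling2dLayer(2, 'Stride', 2)
    fullyConnectedLayer(numel(categories(Y)))
    softmaxLayer
    classificationLayer];

opts = trainingOptions('adam', 'MaxEpochs', 2, 'MiniBatchSize', 8, 'Verbose', false);
net  = trainNetwork(X, Y, layers, opts);

YPred = classify(net, X);                      % predicted class labels
```

With the package output, X and Y would be replaced by XTrain and Labels, and the input layer size by [224 224 3] for the default PixelSize.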

Change the tsne distance to cosine and apply the same procedure

>>  [XTrain, model] = deepinsightTab2Img(data,'Dist','cosine','Blurring','yes'); % distance cosine with Blurring technique
>>  figure; imshow(XTrain(:,:,:,1)); title('Dist cosine');

DeepInsightTab2Img- distance: cosine

Many options can be changed according to the requirements. Details of options for deepinsightTab2Img are given below.

    'Method': 'tsne' | 'kpca' | 'umap' | 'pca'| 'lda' (supervised method therefore Labels are required)
      'Dist': 'euclidean' (default) | 'seuclidean' | 'cityblock' | 'chebychev' | 'minkowski' | 'mahalanobis' | 
            'cosine' | 'correlation' | 'spearman' | 'hamming' | 
            'jaccard' | function handle.   
            %(Dist variable is applicable only for `tsne` Method option).
    'Labels': Labels (categorical values applicable for Method 'lda' or data augmentation 'Augment','yes')
 'PixelSize': k (default k = 224, will give 224 x 224 image size) | set 'PixelSize',[] to determine pixel frame size automatically 
     'Norm' : 1 | 2
   'Augment': 'no' (default) | 'yes'
'AugSamples': m (m samples per class, default m is 500)
'FeatureMap': 0 (all layers used for projection ) | 1 (default) layer-1 projection | 2 (layer-2 projection) | 3 (layer-3 projection)
  'Blurring': 'no' (default) | 'yes'
  'SnowFall': 'no' (default) | 'yes'
      'Step': s (default s=4), s in the range 1 to 5
  'MPS_Fix' : 1 (default) | 0 %(Pixel size will be determined automatically, managed internally)
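
Several of these options can be collected in a cell array and expanded into the call. The sketch below assumes that the listed name-value pairs can be combined freely, which the source does not state explicitly; the chosen values are examples only. The guard makes the block a no-op when the package or the omics.mat variable data is not available.

```matlab
% Collect documented name-value options in a cell array (values are examples).
opts = {'Method','tsne', 'Dist','cosine', 'PixelSize',224, 'Blurring','yes', 'Step',2};

% Run only when the package is on the path and `data` is in the workspace.
if exist('deepinsightTab2Img','file') == 2 && exist('data','var') == 1
    [XTrain, model] = deepinsightTab2Img(data, opts{:});
    disp(model.Parm);        % confirm which settings were applied
end
```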

Options for deepinsightConv are given below

>>  [ImgData, model] = deepinsightConv(data,model,options);
%options can be
%'AugSamples': k (defined k>0 to be augmented per class)
%'Labels': provide labels for validation data (labels_validation)

>>  [ImgData, model] = deepinsightConv(data,model,'AugSamples',50,'Labels',labels_validation);
% model.Validation = 
%  struct with field
%  AugSamples:50
%      Labels: px1 categorical 
%   orgLabels: qx1 categorical

Apply the 'umap' projection method. Note that the 'Dist' option is not required for 'umap'. Also note that 'umap' uses Python or R code, so first install the necessary Python/R packages. The Python code uses numpy, sys and umap.

Moreover, change the default PixelSize to 50

>>  [XTrain, model] = deepinsightTab2Img(data,'Method','umap','PixelSize',50);
% NORM-2
% Layer-1 data used for Cart2Pixel
% umap is used
% Pixels: 50 x 50

Plot images of class-1 and class-2

>>  numObservations = [1:8,213:220]; % Label positions 1:8 belong to class-1 and 213:220 belong to class-2
>>  P1 = imtile(XTrain(:,:,:,numObservations),'Frames',1:8,'GridSize',[2 4]);
>>  P2 = imtile(XTrain(:,:,:,numObservations),'Frames',9:16,'GridSize',[2 4]);
>>  figure; subplot(2,1,1);imshow(P1); title(['class ',num2str(double(Labels(numObservations(1))))]);
>>  subplot(2,1,2);imshow(P2); title(['class ',num2str(double(Labels(numObservations(9))))]);

Uniform Manifold Approximation and Projection (umap) with PixelSize 50 x 50.

Using lda for projection: since lda is a supervised method, Labels must be provided

>>  [XTrain, model] = deepinsightTab2Img(data, 'Method', 'lda','Labels',Labels);
% NORM-2
% Layer-1 data used for Cart2Pixel
% lda is used
% t cluster for LDA 292
% Pixels: 224 x 224

To augment image data apply Augment as

>>  [XTrain, model] = deepinsightTab2Img(data, 'Method','lda','Augment','yes','Labels',Labels);
% This will augment around m= 500 samples per class. To change this number apply ...'AugSamples',m ...
% size of XTrain is 224 x 224 x 3 x 1222
>>  P = imtile(XTrain(:,:,:,[1,length(Labels)+1]));
>>  figure; imshow(P); title('original and augmented sample');

deepinsightTab2Img- projection method lda with augmented samples

Effect of using blurring technique with an illustration

Generate an artificial image

>>  M = ones(15,15);
>>  row = [4,9]; col = [4, 7]; % define row and columns for characteristic pixel locations
>>  M(sub2ind(size(M),row,col)) = 0.4; % define characteristic pixel values
>>  figure; subplot(2,3,1); imagesc(M); title('original image')
>>  for step=1:5
>>      MB = BlurTech(M,row,col,step);
>>      subplot(2,3,step+1); imagesc(MB); title(['Blurring step ',num2str(step)]);
>>  end

blurring technique with step = 1 .. 5.
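
BlurTech itself is distributed as a .p file, so its exact rule is not visible. As a rough, hypothetical stand-in for the idea of populating nearby pixels, the sketch below copies each characteristic pixel's value into the background pixels within `step` rows and columns of it. The square neighbourhood and the constant fill value are assumptions; the real function may, for example, attenuate values with distance.

```matlab
% Hypothetical stand-in for illustration; BlurTech's actual rule may differ.
M = ones(15,15);                          % background value 1, as in the example above
row = [4,9]; col = [4,7];                 % characteristic pixel locations
M(sub2ind(size(M),row,col)) = 0.4;        % characteristic pixel values

step = 2;                                 % neighbourhood radius in pixels
MB = M;
for i = 1:numel(row)
    r = max(row(i)-step,1):min(row(i)+step,size(M,1));
    c = max(col(i)-step,1):min(col(i)+step,size(M,2));
    block = MB(r,c);
    block(block == 1) = M(row(i),col(i)); % fill only background pixels
    MB(r,c) = block;
end
```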

Augmenting Validation data

If the validation data also needs to be augmented, deepinsightConv can be used as

>>  [XValidation, model] = deepinsightConv(data_validation,model,'AugSamples',k,'Labels',labels_validation);
% size of XValidation would be R x C x layers x r, and model is generated from deepinsightTab2Img
%    where r = number of original validation samples + augmented validation samples

The details about validation can be accessed by

>>  model.Validation
% struct with fields
% AugSamples: k
%     Labels: r (p, original + q, augmented samples)
%  orgLabels: number of validation samples (before augmentation)

Reference

Sharma A*, Lysenko A*, Boroevich K, Tsunoda T*, DeepInsight-3D for precision oncology: an improved anti-cancer drug response prediction from high-dimensional multi-omics data with convolutional neural networks, bioRxiv, 2022 https://doi.org/10.1101/2022.07.14.500140
Sharma et al., DeepFeature: feature selection in nonimage data using convolutional neural network, Briefings in Bioinformatics, 22(6), 2021. https://academic.oup.com/bib/article/22/6/bbab297/6343526
Sharma et al., DeepInsight: a methodology to transform a non-image data to an image for convolution neural network architecture, 9:11399, Scientific Reports, 2019. https://www.nature.com/articles/s41598-019-47765-6
Castillo-Cara M et al., A Deep Learning Approach Using Blurring Image Techniques for Bluetooth-Based Indoor Localisation, 2022, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4180099
Kalkan H et al., Prediction of Alzheimer’s Disease by a Novel Image-Based Representation of Gene Expression, 13(8), Genes, 2022.

Related materials

DeepInsight YouTube

A YouTube video about the original DeepInsight method is available here. A Matlab page on DeepInsight can be viewed from here.

GitHub weblink of DeepInsight (Python and Matlab)

Overall weblink here

Winning Kaggle competition by Mark Peng

a) Competition details: Mechanisms of Actions (MoA) Predictions https://www.kaggle.com/competitions/lish-moa

b) Peng et al., 1st Place Winning Solution: Hungry for Gold. Laboratory for Innovation Science at Harvard, Mechanisms of Action (MoA) Prediction Competition 2020. here

c) Organizers: MIT and Harvard University (Connectivity Map here)

d) DeepInsight EfficientNet-B3 Noisy Student (PyTorch) here
