How to generate embedding

An additional script, visualizer.py, is provided to generate a 2D embedding of either the scRNA-seq data or the latent features from DAWN. The steps to create these visualizations are listed below.

For scRNA-seq data

The scRNA-seq data should be in a Cell x Gene matrix, where Cells are the rows and Genes are the columns.
The matrix values can be raw counts or any one of the four RNA-seq expression units (RPM, TPM, FPKM, or RPKM).
Note: Log normalized values should not be used.
Perform any necessary filtering of cells based on your quality criteria.
Remove all row and column labels so that only the numerical matrix remains, and save it as a comma-separated values (CSV) file.
Generate the embedding: python visualizer.py <path to data csv file>
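The label-removal step above can be sketched in Python as follows. This is a minimal illustration and not part of DUSC: the function name strip_labels, the layout of the labeled input (gene names in the first row, cell IDs in the first column), and the min_total threshold are all assumptions made for the example. Substitute your own quality criteria for the filter.

```python
import csv


def strip_labels(labeled_csv, output_csv, min_total=0.0):
    """Write a label-free Cell x Gene matrix and return the number of cells kept.

    Assumptions (hypothetical, adjust to your data): the first row of the
    input holds gene names and the first column holds cell IDs. Cells whose
    summed expression is below min_total are dropped; this threshold is only
    a placeholder for your own quality filter.
    """
    kept = 0
    with open(labeled_csv, newline="") as src, \
            open(output_csv, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        next(reader, None)  # skip the header row of gene names
        for row in reader:
            values = row[1:]  # drop the cell ID column
            if not values:
                continue  # ignore blank lines
            if sum(float(v) for v in values) >= min_total:
                writer.writerow(values)  # numbers only, no labels
                kept += 1
    return kept
```

The resulting file contains only numbers, with one cell per row, and can be passed to visualizer.py as the data CSV file.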

For DAWN features

Unlike the scRNA-seq data, the DAWN latent features need no additional preparation.
Generate the embedding: python visualizer.py <path to latent features csv file>

Outputs for the embedding

Two files are created after the visualizer completes.

A CSV file containing the (X, Y) coordinates of the samples. Its name is based on the input file name, with the suffix 2d_coord.
A TIF image containing the plot of the embedding. Its name is also based on the input file name, with the suffix 2d_viz.
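If you want to work with the coordinates in your own scripts, the CSV output can be read back with a few lines of Python. The sketch below is not part of DUSC, and the exact layout of the coordinate file is an assumption: it takes the first two columns of each row as X and Y and skips any row that is not numeric, such as a header. Check your own output file and adjust the column positions if they differ.

```python
import csv


def load_coordinates(coord_csv):
    """Return a list of (x, y) float pairs read from a coordinate CSV.

    Assumption: X and Y are the first two columns of each data row.
    Rows that cannot be parsed as numbers (a header, a blank line)
    are skipped.
    """
    points = []
    with open(coord_csv, newline="") as handle:
        for row in csv.reader(handle):
            try:
                points.append((float(row[0]), float(row[1])))
            except (ValueError, IndexError):
                continue  # not a numeric data row
    return points
```

The returned list has one (x, y) pair per sample, in the same order as the rows of the file.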

Using an embedding to generate EM clustering

Typically, a 2D embedding shows several well-separated cell clusters. The number of clusters visible in the embedding can be used as numClusters for EM clustering.
