GuanLab/Cell-Linage-Reconstruction

Rule-based Hierarchical Clustering for Cell Lineage Reconstruction

Hanrui Zhang ([email protected]) and Yuanfang Guan ([email protected])

This is team Guan&Zhang's submission to the Allen Institute Cell Lineage Reconstruction DREAM Challenge.

Overview

Our method in subchallenge 1 was based on:

  1. distance transformation,
  2. rule-based hierarchical clustering on minimal distance, as described in the following figure.

Figure 1. Workflow of Guanlab’s method in Subchallenge 1.

First, we summarized the frequency of the different editing states of the 10 barcodes across the training set (right panel in Figure 1). More frequent editing states are given less importance, so we assign them larger distances. The editing states of barcodes 1-10 are listed in the Barcode distance table (Figure 1).
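The frequency summary described above can be sketched as follows. This is a minimal illustration, not the code in train.py: it assumes each cell is given as a string with one state symbol per barcode, and the function name `state_frequencies` is hypothetical. How the frequencies are mapped to the values in the Barcode distance table is not specified here, so that step is left out.

```python
from collections import Counter


def state_frequencies(barcodes):
    """Return, for each barcode position, the relative frequency of each state.

    Assumption: `barcodes` is a non-empty list of equal-length strings,
    one character per barcode position (10 positions in the challenge data).
    """
    n_sites = len(barcodes[0])
    total = len(barcodes)
    tables = []
    for i in range(n_sites):
        # Count how often each state symbol occurs at position i.
        counts = Counter(bc[i] for bc in barcodes)
        tables.append({state: n / total for state, n in counts.items()})
    return tables


# Toy input with two positions instead of ten.
freqs = state_frequencies(["12", "10", "12", "22"])
print(freqs[0])  # {'1': 0.75, '2': 0.25}
```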

Next, we reconstruct the lineage tree for every training dataset. For every group of cells whose lineage tree we need to reconstruct, we first transform their editing status according to the Barcode distance table (middle panel in Figure 1).

The transformed editing statuses are then passed to hierarchical clustering: the two cells with the minimal distance are clustered together, and the parent cell of the two is deduced from the irreversible editing rules (“Constructed new nodes from leaves” in Figure 1). Clustering continues until only one node remains for the whole cell set, so that no cells are left to cluster.
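The clustering loop can be sketched roughly as below. This is an illustration under stated assumptions rather than the submitted implementation: the unedited state is assumed to be the symbol "1", the parent rule (a position keeps an edited state only when both children share it, otherwise it is taken as unedited) is one plausible reading of the irreversible editing rules, and a plain Hamming distance stands in for the Barcode distance table. The names `infer_parent`, `hamming` and `reconstruct` are hypothetical.

```python
from itertools import combinations

UNEDITED = "1"  # assumption: symbol used for an unedited barcode position


def infer_parent(a, b, unedited=UNEDITED):
    """Deduce a parent barcode: keep a state only where both children agree."""
    return "".join(x if x == y else unedited for x, y in zip(a, b))


def hamming(a, b):
    """Stand-in distance: number of positions where the two barcodes differ."""
    return sum(x != y for x, y in zip(a, b))


def reconstruct(cells, dist=hamming):
    """Greedy agglomerative clustering; returns the tree as a Newick string.

    `cells` maps a cell name to its barcode string.
    """
    # Each node is (newick_subtree, barcode).
    nodes = list(cells.items())
    while len(nodes) > 1:
        # Pick the pair of nodes with the minimal distance.
        i, j = min(
            combinations(range(len(nodes)), 2),
            key=lambda p: dist(nodes[p[0]][1], nodes[p[1]][1]),
        )
        (name_a, bc_a), (name_b, bc_b) = nodes[i], nodes[j]
        merged = (f"({name_a},{name_b})", infer_parent(bc_a, bc_b))
        # Replace the two merged nodes with their deduced parent.
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0][0] + ";"


tree = reconstruct({"a": "1221", "b": "1221", "c": "0221", "d": "1111"})
print(tree)  # (d,(c,(a,b)));
```

The pairwise search makes each merge quadratic in the number of remaining nodes, which is acceptable for small cell sets; a weighted distance can be passed through the `dist` parameter.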

Dependencies

Before you start, make sure the following dependencies have been installed:

Usage

To reproduce Guan&Zhang's submission for Subchallenge 1:

python train.py [INPUTPATH]

[INPUTPATH] is the path of the directory containing all the input recordings.

This program generates a file prediction.txt and a new folder ./output/.

  • prediction.txt: a data table with two columns. The first column, dreamID, is the tree ID. The second column, nw, is the reconstructed tree in Newick format.
  • ./output/: the reconstructed trees in Newick format.
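To inspect the results, prediction.txt can be loaded into a dictionary keyed by tree ID. The sketch below assumes the table is tab-separated with a header row naming the dreamID and nw columns; the delimiter is a guess, so adjust `delimiter` if the file differs. The helper name `read_predictions` is hypothetical.

```python
import csv
import io


def read_predictions(handle, delimiter="\t"):
    """Map each dreamID to its Newick string.

    Assumption: the first row is a header containing `dreamID` and `nw`.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    return {row["dreamID"]: row["nw"] for row in reader}


# In-memory stand-in for prediction.txt, used here so the example runs anywhere.
sample = io.StringIO("dreamID\tnw\n1\t(a,(b,c));\n2\t((a,b),c);\n")
trees = read_predictions(sample)
print(trees["1"])  # (a,(b,c));
```

For the real file, pass an open file handle instead, for example `open("prediction.txt", newline="")`.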

About

2019 DREAM Lineage Reconstruction Challenge
