necsi-gene-network-clustering

Requirements

Python 3 (3.7)
pandas
NumPy
Matplotlib
SciPy
scikit-learn (imported as sklearn)
(Optional) igraph (only required for analysis-network)

Preparation

In the data folder, unzip both zipped data files. They are compressed because GitHub does not allow uploading very large files.
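
The unzip step can also be scripted. The following is a minimal sketch, assuming the archives are ordinary .zip files sitting directly in the data folder; the helper name unzip_all is hypothetical and is not part of this repository.

```python
import zipfile
from pathlib import Path


def unzip_all(data_dir="data"):
    """Extract every .zip archive found in data_dir into that same folder.

    Returns the names of all extracted members.
    """
    extracted = []
    for archive in sorted(Path(data_dir).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(archive.parent)
            extracted.extend(zf.namelist())
    return extracted
```

Calling unzip_all() from the repository root would extract the archives in place, next to the zipped originals.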

Order of operations

1. Run the analysis notebook to generate the lists of gene products whose levels differ statistically significantly between stage 1 and stage 2 tumors, stage 1 and stage 3 tumors, and stage 2 and stage 3 tumors. This notebook also applies deep learning to build a model that classifies tumors into stages by gene expression level.
2. Run the vizualization-gephi-simplified notebook to generate the graph (GML file) for one of the sets of nodes identified in step 1.
3. Open the GML file in Gephi to produce the results, using the CircularPack layout to group nodes into clusters by modularity and to size nodes by PageRank. The .gephi files in the out folder contain all the settings we used.
4. (Optional) Run analysis-network to demonstrate that the network is scale-free.
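
To illustrate the pairwise stage comparison in the first step, here is a small standard-library sketch. This README does not say which statistical test the analysis notebook uses, so the permutation test below is a stand-in, and the function names, the 0.01 threshold and the toy data are all assumptions made for illustration.

```python
import random
from itertools import combinations
from statistics import mean


def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(x) - mean(y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(x)]) - mean(pooled[len(x):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


def significant_genes(expression, alpha=0.01, n_perm=2000):
    """expression maps gene -> {stage: [levels]}.

    Returns {(stage_a, stage_b): [genes with p < alpha]} for every stage pair.
    """
    stages = sorted({s for by_stage in expression.values() for s in by_stage})
    result = {}
    for s1, s2 in combinations(stages, 2):
        result[(s1, s2)] = [
            gene for gene, by_stage in expression.items()
            if permutation_pvalue(by_stage[s1], by_stage[s2], n_perm) < alpha
        ]
    return result


# Hypothetical toy data: two genes, three tumor stages, six samples per stage.
low = [1.0, 1.1, 0.9, 1.2, 1.0, 1.1]
high = [3.0, 3.1, 2.9, 3.2, 3.0, 3.1]
toy = {
    "GENE_A": {1: low, 2: high, 3: high},  # changes between stage 1 and later stages
    "GENE_B": {1: low, 2: low, 3: low},    # no change across stages
}
pairs = significant_genes(toy)
```

With SciPy installed, scipy.stats.ttest_ind(x, y, equal_var=False) (Welch's t-test) would be a common alternative to the permutation test. With thousands of genes, a multiple-testing correction would normally be applied as well.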

Notes:

We experimented with Graphviz and igraph plotting; neither worked as well as Gephi. Old notebooks are marked with the old_ prefix.

"Simplified" means we plot only a subset of the network, the statistically significant gene products, rather than the whole network.

Simplified networks do not show connections between nodes that are connected only indirectly, through nodes that are not statistically significant.

For example, assume a, c, h and i are significant and the rest are not.

Full Network:

|-------------|
|---|         |
a-b-c-d-e-f-g-h-i
|_____|

Simplified Network:

a-c
|
h_i

The limitation here is that although a and c are each indirectly connected to i, the simplified network does not represent those connections. In practice we believe this is not significant, because the shortest pathway is not affected, and it is the shortest pathway that characterizes behavior.
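
The simplification can be sketched as taking the induced subgraph on the significant nodes. The code below is an illustration only, not the notebook's implementation: the function names are hypothetical, and the edge list is read from the diagram above as the chain a to i plus the a-c and a-h links. The bracket drawn below the chain is left out because its endpoints are not clear from the diagram.

```python
from collections import deque


def induced_subgraph(edges, keep):
    """Keep only the edges whose two endpoints are both in `keep`."""
    keep = set(keep)
    return [(u, v) for u, v in edges if u in keep and v in keep]


def shortest_path_length(edges, source, target):
    """Breadth-first search; returns the hop count, or None if unreachable."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


# Full network: the chain a-b-...-i plus the two direct links a-c and a-h.
chain = "abcdefghi"
full = list(zip(chain, chain[1:])) + [("a", "c"), ("a", "h")]

# Simplified network: only edges between the significant nodes a, c, h, i.
simplified = induced_subgraph(full, "achi")

# Compare shortest-path lengths from a and c to i in both networks.
for src in ("a", "c"):
    print(src, "->", "i",
          shortest_path_length(full, src, "i"),
          shortest_path_length(simplified, src, "i"))
```

In this toy example the simplified network keeps the edges a-c, a-h and h-i, and the shortest paths from a and from c to i have the same length in both networks. This illustrates the claim for this one example; it does not show that it holds in general.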

wasbridge/necsi-gene-network-clustering
