Tutorial
-Load the reference and query datasets:
-diff --git a/doc/.ipynb_checkpoints/Makefile-checkpoint b/doc/.ipynb_checkpoints/Makefile-checkpoint new file mode 100644 index 0000000..d0c3cbf --- /dev/null +++ b/doc/.ipynb_checkpoints/Makefile-checkpoint @@ -0,0 +1,20 @@ +# Minimal makefile for Sphinx documentation +# + +# You can set these variables from the command line, and also +# from the environment for the first two. +SPHINXOPTS ?= +SPHINXBUILD ?= sphinx-build +SOURCEDIR = source +BUILDDIR = build + +# Put it first so that "make" without argument is like "make help". +help: + @$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) + +.PHONY: help Makefile + +# Catch-all target: route all unknown targets to Sphinx using the new +# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS). +%: Makefile + @$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) diff --git a/doc/.ipynb_checkpoints/make-checkpoint.bat b/doc/.ipynb_checkpoints/make-checkpoint.bat new file mode 100644 index 0000000..6247f7e --- /dev/null +++ b/doc/.ipynb_checkpoints/make-checkpoint.bat @@ -0,0 +1,35 @@ +@ECHO OFF + +pushd %~dp0 + +REM Command file for Sphinx documentation + +if "%SPHINXBUILD%" == "" ( + set SPHINXBUILD=sphinx-build +) +set SOURCEDIR=source +set BUILDDIR=build + +if "%1" == "" goto help + +%SPHINXBUILD% >NUL 2>NUL +if errorlevel 9009 ( + echo. + echo.The 'sphinx-build' command was not found. Make sure you have Sphinx + echo.installed, then set the SPHINXBUILD environment variable to point + echo.to the full path of the 'sphinx-build' executable. Alternatively you + echo.may add the Sphinx directory to PATH. + echo. 
+ echo.If you don't have Sphinx installed, grab it from + echo.http://sphinx-doc.org/ + exit /b 1 +) + +%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O% +goto end + +:help +%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O% + +:end +popd diff --git a/doc/build/doctrees/.ipynb_checkpoints/index-checkpoint.doctree b/doc/build/doctrees/.ipynb_checkpoints/index-checkpoint.doctree new file mode 100644 index 0000000..799491f Binary files /dev/null and b/doc/build/doctrees/.ipynb_checkpoints/index-checkpoint.doctree differ diff --git a/doc/build/doctrees/G2Gtutorial.doctree b/doc/build/doctrees/G2Gtutorial.doctree deleted file mode 100644 index 9dc4180..0000000 Binary files a/doc/build/doctrees/G2Gtutorial.doctree and /dev/null differ diff --git a/doc/build/doctrees/environment.pickle b/doc/build/doctrees/environment.pickle index abd2614..4090d48 100644 Binary files a/doc/build/doctrees/environment.pickle and b/doc/build/doctrees/environment.pickle differ diff --git a/doc/build/doctrees/index.doctree b/doc/build/doctrees/index.doctree index 6da29c5..9cd00fa 100644 Binary files a/doc/build/doctrees/index.doctree and b/doc/build/doctrees/index.doctree differ diff --git a/doc/build/doctrees/notebooks/G2G_Tutorial.doctree b/doc/build/doctrees/notebooks/G2G_Tutorial.doctree deleted file mode 100644 index f79aa55..0000000 Binary files a/doc/build/doctrees/notebooks/G2G_Tutorial.doctree and /dev/null differ diff --git a/doc/build/html/.buildinfo b/doc/build/html/.buildinfo index 9bb5531..ba68229 100644 --- a/doc/build/html/.buildinfo +++ b/doc/build/html/.buildinfo @@ -1,4 +1,4 @@ # Sphinx build info version 1 # This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done. 
-config: 6e59a27ad7a662760f493dfdc52ce191 +config: 2d92bab5530a9992afd79242fd06b61f tags: 645f666f9bcd5a90fca523b33c5a78b7 diff --git a/doc/build/html/.ipynb_checkpoints/index-checkpoint.html b/doc/build/html/.ipynb_checkpoints/index-checkpoint.html index c28c032..a5a5e47 100644 --- a/doc/build/html/.ipynb_checkpoints/index-checkpoint.html +++ b/doc/build/html/.ipynb_checkpoints/index-checkpoint.html @@ -4,24 +4,23 @@ -
Genes2Genes (G2G) is a new Python framework for aligning single-cell pseudotime trajectories of gene expression between any reference and query for a pairwise comparison such as:
@@ -99,14 +99,14 @@A single-cell trajectory describes the transcriptomic state of cells along some axis of progression (such as time) as they undergo some dynamic process (e.g. cell differentiation, treatment response, or disease infection). Given an scRNA-seq profile, there are various tools available today to infer such a trajectory by estimating a pseudo-ordering of the cells along an axis, commonly referred to as ‘pseudotime’. The pseudotime axis of a trajectory can be discretized to represent it as a sequence of discrete time points. Given two such discrete pseudotime sequences of two trajectories, a pairwise alignment between them defines a non-linear mapping between their time points. This mapping can have 1-to-1 matches as well as 1-to-many/many-to-1 matches (a.k.a. warps) between the time points, while leaving unmapped the time points that have significantly different transcriptomic states. Below is an example visualization of two cell differentiation trajectories.
- +For two trajectories representing single lineages as above, G2G generates an optimal pairwise trajectory alignment that captures the matches and mismatches between their time points in sequential order, allowing a user to quantify the degree of similarity between them.
- +G2G defines 5 different states of alignment between any two reference (R) and query (Q) time points, corresponding to all possible match and mismatch states. They are: 1-to-1 match (M), 1-to-many match (V), many-to-1 match (W), insertion (I) and deletion (D). Here, I and D refer to a mismatched time point in Q and R, respectively. These states jointly cover the alignment states defined in classical dynamic time warping and biological sequence alignment.
Given reference and query scRNA-seq datasets with their pseudotime estimates and a specified set of genes (e.g. all transcription factors, highly variable genes, biological/signaling pathway genes), G2G generates a fully descriptive alignment for each gene (i.e. gene-level alignment), as well as an average (aggregate) alignment (i.e. cell-level alignment) across all genes.
Below is an example gene-level alignment of the gene JUNB in T cell differentiation between a pan-fetal reference and an artificial thymic organoid system:
- +The user can estimate pseudotime of the cells in their datasets using any suitable method available (such as Diffusion pseudotime, Palantir, GPLVM, Monocle etc.). For better visualisation and interpretation of the alignment results, we recommend the data to be annotated with their cell types (manually and/or using an automatic annotation tool such as CellTypist).
-Please refer to our Tutorial for an example analysis between a reference and query dataset from literature.
+Please refer to our Tutorial for an example analysis between a reference and query dataset from literature.
Our manuscript is currently available as a preprint at bioRxiv:
-Sumanaweera, D., Suo, C., Cujba, A.M., Muraro, D., Dann, E., Polanski, K., Steemers, A.S., Lee, W., Oliver, A.J., Park, J.E. and Meyer, K.B., 2023. Gene-level alignment of single cell trajectories informs the progression of in vitro T cell differentiation. bioRxiv, pp.2023-03.
+Sumanaweera, D., Suo, C., Cujba, A.M., Muraro, D., Dann, E., Polanski, K., Steemers, A.S., Lee, W., Oliver, A.J., Park, J.E. and Meyer, K.B., 2023. Gene-level alignment of single cell trajectories. bioRxiv, pp.2023-03.
+This publication is part of the Human Cell Atlas
+Marie Skłodowska-Curie grant agreement No: 101026506 (Marie Curie Individual Fellowship) under the European Union’s Horizon 2020 research and innovation programme; Wellcome Trust Ph.D. Fellowship for Clinicians; Wellcome Trust (WT206194); ERC Consolidator Grant (646794); Wellcome Sanger Institute’s Translation Committee Fund.
Load the reference and query datasets:
import anndata
-import numpy as np
-import seaborn as sb
-import matplotlib.pyplot as plt
-import platform
-from optbinning import ContinuousOptimalBinning
-
-import Main
-import VisualUtils
-import ClusterUtils
-import TimeSeriesPreprocessor
-import PathwayAnalyserV2
-
-print(platform.python_version())
-
---------------------------------------------------------------------------
-ModuleNotFoundError Traceback (most recent call last)
-Input In [1], in <cell line: 6>()
- 4 import matplotlib.pyplot as plt
- 5 import platform
-----> 6 from optbinning import ContinuousOptimalBinning
- 8 import Main
- 9 import VisualUtils
-
-ModuleNotFoundError: No module named 'optbinning'
-
Load the reference and query anndata objects
-input_dir = 'data/'
-adata_ref = anndata.read_h5ad(input_dir + 'adata_pam_local.h5ad') # PAM dataset
-adata_query = anndata.read_h5ad(input_dir +'adata_lps_local.h5ad') # LPS dataset
-
print(adata_ref)
-print(adata_query)
-
AnnData object with n_obs × n_vars = 179 × 89
- obs: 'time'
-AnnData object with n_obs × n_vars = 290 × 89
- obs: 'time'
-
# check the current range
-print(min(adata_ref.obs['time']), max(adata_ref.obs['time']))
-print(min(adata_query.obs['time']), max(adata_query.obs['time']))
-
-# if it does not follow [0,1] range, run below
-adata_ref.obs['time'] = TimeSeriesPreprocessor.Utils.minmax_normalise(np.asarray(adata_ref.obs['time']))
-adata_query.obs['time'] = TimeSeriesPreprocessor.Utils.minmax_normalise(np.asarray(adata_query.obs['time']))
-
0.0 1.0
-0.0 1.0
-
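The rescaling step above can be illustrated with a small, dependency-free sketch. This is not the implementation of `TimeSeriesPreprocessor.Utils.minmax_normalise`; the function below is a hypothetical stand-in that assumes plain linear min-max scaling, and the input values are made up for illustration.

```python
def minmax_normalise(values):
    """Rescale a sequence of numbers linearly onto the [0, 1] range.

    Hypothetical stand-in for the library helper; assumes linear scaling.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


pseudotime = [2.0, 4.0, 6.0, 10.0]  # made-up raw pseudotime values
scaled = minmax_normalise(pseudotime)
print(min(scaled), max(scaled))
```

After scaling, the smallest value maps to 0 and the largest to 1, which is the range the check above expects for both datasets.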
# Visualize the pseudotime distributions
-sb.kdeplot(adata_ref.obs['time'], fill=True, label='PAM', color='forestgreen')
-sb.kdeplot(adata_query.obs['time'], fill=True, label='LPS', color='midnightblue');
-plt.xlabel('pseudotime'); plt.legend(); plt.show()
-
x = np.asarray(adata_ref.obs.time)
-optb = ContinuousOptimalBinning(name='pseudotime', dtype="numerical")
-optb.fit(x, x)
-print(len(optb.splits))
-
-x = np.asarray(adata_query.obs.time)
-optb = ContinuousOptimalBinning(name='pseudotime', dtype="numerical")
-optb.fit(x, x)
-print(len(optb.splits))
-
14
-14
-
OptBinning estimates 14 as the optimal number of splits for both the reference and query pseudotime distributions. We therefore use the same number of interpolation points for both.
-For this dataset, we use the author-given time annotations (1h,2h,4h,6h) as the cell-type annotations.
-adata_ref.obs['annotation'] = [x.split('_')[1] for x in adata_ref.obs_names]
-adata_query.obs['annotation'] = [x.split('_')[1] for x in adata_query.obs_names]
-
Next we define a colormap of our choice for these annotations, and call the below function.
-col = np.array(sb.color_palette('colorblind'))[range(4)]
-joint_cmap={'1h':col[0], '2h':col[1] , '4h':col[2] , '6h':col[3]}
-vs = VisualUtils.VisualUtils.get_celltype_composition_across_time(adata_ref, adata_query, n_points=14,
- ANNOTATION_COLNAME='annotation', optimal_binning=False, ref_cmap=joint_cmap, query_cmap=joint_cmap)
-
# trying max n points for optimal binning = 14
-====================================================
-Optimal equal number of bins for R and Q = 14
-
We can now run Gene-level alignment for all 89 genes in the PAM and LPS datasets
-gene_list = adata_ref.var_names
-print(len(gene_list))
-
89
-
This is done by first creating an aligner object and setting all relevant parameters. -Next we align all gene pairs. (This step parallelizes independent gene alignments to make the process time-efficient; however, the computational time for an individual alignment will increase as the number of cells and/or the number of interpolation time points increases.)
-aligner = Main.RefQueryAligner(adata_ref, adata_query, gene_list, len(vs.optimal_bining_S))
-aligner.WEIGHT_BY_CELL_DENSITY = True
-aligner.WINDOW_SIZE=0.1
-aligner.state_params = [0.99,0.1,0.7]
-aligner.optimal_binning = True
-aligner.opt_binning_S = vs.optimal_bining_S
-aligner.opt_binning_T = vs.optimal_bining_T
-aligner.align_all_pairs()
-
WINDOW_SIZE= 0.1
-
Now we can check the aggregate (average) alignment across all genes:
-aligner.get_aggregate_alignment()
-
Average Alignment: IDDDMMMMMMMMMIIIDID
-
We can also visualize this alignment in terms of cell-type composition
-vs.visualize_gene_alignment("IDDDMMMMMMMMMIIIDID", cmap=joint_cmap)
-
We can also visualize the alignment of an individual gene (e.g. TNF), displaying its alignment statistics
-VisualUtils.show_gene_alignment('TNF', aligner, vs, joint_cmap)
-
DDDIDIDIDDDMMMMMIIIIIID
-Optimal alignment cost: 57.99 nits
-Alignment similarity percentage: 21.74 %
-
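The similarity percentage printed above appears consistent with the fraction of match states in the 5-state string (5 matches out of 23 states for TNF). The sketch below assumes that definition; `alignment_similarity` is a hypothetical helper written for illustration, not a function of the G2G package.

```python
MATCH_STATES = {"M", "V", "W"}  # 1-to-1, 1-to-many and many-to-1 matches


def alignment_similarity(alignment):
    """Fraction of states in a 5-state alignment string that are matches.

    Assumption: similarity = (number of M/V/W states) / (string length).
    """
    if not alignment:
        raise ValueError("alignment string must not be empty")
    matched = sum(1 for state in alignment if state in MATCH_STATES)
    return matched / len(alignment)


tnf = "DDDIDIDIDDDMMMMMIIIIIID"  # TNF alignment string from the output above
print(f"{100 * alignment_similarity(tnf):.2f} %")
```

Under this assumption the TNF string reproduces the 21.74 % reported by `show_gene_alignment`.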
To check only the cell plots of a gene alignment (e.g. SERTAD2)
-VisualUtils.plotTimeSeries('SERTAD2', aligner, plot_cells=True)
-
The below attributes and functions can be used to examine any gene-alignment object
-GENE = 'TNF'
-gene_obj = aligner.results_map[GENE]
-
-al = gene_obj.alignment_str
-print(al)
-print(VisualUtils.color_al_str(al))
-
DDDIDIDIDDDMMMMMIIIIIID
-DDDIDIDIDDDMMMMMIIIIIID
-
print(gene_obj.al_visual)
-
01234567890123456789012 Alignment index
-012 3 4 56789012 3 Reference index
-***-*-*-********------*
----*-*-*---***********-
- 0 1 2 34567890123 Query index
-DDDIDIDIDDDMMMMMIIIIIID 5-state string
-
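The index rows in the printout above can be cross-checked by counting how many reference and query time points an alignment string consumes. The sketch below assumes that M advances both sequences, D advances only the reference, and I advances only the query, as the printout shows. It further assumes that V advances only the query and W only the reference, which is inferred from the state definitions rather than confirmed by the package. `consumed_time_points` is a hypothetical helper.

```python
def consumed_time_points(alignment):
    """Return (n_reference, n_query) time points covered by a 5-state string."""
    advances_ref = {"M", "W", "D"}    # assumption: W (many-to-1) advances R only
    advances_query = {"M", "V", "I"}  # assumption: V (1-to-many) advances Q only
    n_ref = sum(1 for state in alignment if state in advances_ref)
    n_query = sum(1 for state in alignment if state in advances_query)
    return n_ref, n_query


print(consumed_time_points("DDDIDIDIDDDMMMMMIIIIIID"))
```

For the TNF string both counts equal 14, matching the 14 interpolation points chosen earlier for the reference and the query.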
gene_obj.landscape_obj.plot_alignment_landscape()
-
We can use the alignment similarity percentage statistic of genes to rank genes from highly distant to highly similar
-df = aligner.get_stat_df()
-
mean matched percentage:
-50.39 %
-
VisualUtils.plot_alignmentSim_vs_l2fc(df)
-
df
-
- | Gene | -alignment_similarity_percentage | -opt_alignment_cost | -l2fc | -color | -abs_l2fc | -
---|---|---|---|---|---|---|
63 | -CCRL2 | -0.2174 | -55.943685 | --0.487688 | -red | -0.487688 | -
77 | -NFKBIA | -0.2174 | -54.673471 | --0.091748 | -red | -0.091748 | -
68 | -NLRP3 | -0.2174 | -57.177548 | -0.069058 | -red | -0.069058 | -
3 | -TNF | -0.2174 | -57.990078 | --0.006439 | -red | -0.006439 | -
45 | -C5AR1 | -0.2727 | -57.858236 | -0.8711 | -red | -0.8711 | -
... | -... | -... | -... | -... | -... | -... | -
34 | -NUP54 | -0.75 | -30.744063 | -0.012993 | -green | -0.012993 | -
15 | -CD44 | -0.7647 | -28.366715 | --0.021366 | -green | -0.021366 | -
19 | -PLAGL2 | -0.8235 | -31.807956 | --0.051268 | -green | -0.051268 | -
51 | -ZSWIM4 | -0.8235 | -30.214575 | -0.030379 | -green | -0.030379 | -
26 | -SGMS2 | -0.8667 | -37.626682 | --0.020399 | -green | -0.020399 | -
89 rows × 6 columns
-Let us use 30% alignment similarity (=0.3) as a threshold in this case
-topDEgenes = df.loc[df['alignment_similarity_percentage'] <= 0.3, 'Gene']
-topDEgenes
-
63 CCRL2
-77 NFKBIA
-68 NLRP3
-3 TNF
-45 C5AR1
-84 SPATA13
-33 CXCL2
-12 RALGDS
-31 INSIG1
-6 MALT1
-Name: Gene, dtype: object
-
pathway_df = PathwayAnalyserV2.run_overrepresentation_analysis(topDEgenes) # wrapper function call for the GSEApy Enrichr interface
-pathway_df[pathway_df['Adjusted P-value']<0.001]
-
- | Gene_set | -Term | -Overlap | -P-value | -Adjusted P-value | -Old P-value | -Old Adjusted P-value | -Odds Ratio | -Combined Score | -Genes | --log10 Adjusted P-value | --log10 FDR q-val | -
---|---|---|---|---|---|---|---|---|---|---|---|---|
13 | -KEGG_2021_Human | -C-type lectin receptor signaling pathway | -4/104 | -1.414253e-07 | -0.000007 | -0 | -0 | -132.600000 | -2091.300110 | -NFKBIA;NLRP3;TNF;MALT1 | -5.182020 | -5.182020 | -
14 | -KEGG_2021_Human | -NF-kappa B signaling pathway | -4/104 | -1.414253e-07 | -0.000007 | -0 | -0 | -132.600000 | -2091.300110 | -NFKBIA;TNF;CXCL2;MALT1 | -5.182020 | -5.182020 | -
0 | -MSigDB_Hallmark_2020 | -TNF-alpha Signaling via NF-kB | -4/200 | -1.944057e-06 | -0.000013 | -0 | -0 | -67.326531 | -885.393278 | -NFKBIA;CCRL2;TNF;CXCL2 | -4.898378 | -4.898378 | -
1 | -MSigDB_Hallmark_2020 | -Inflammatory Response | -4/200 | -1.944057e-06 | -0.000013 | -0 | -0 | -67.326531 | -885.393278 | -NFKBIA;C5AR1;CCRL2;NLRP3 | -4.898378 | -4.898378 | -
15 | -KEGG_2021_Human | -NOD-like receptor signaling pathway | -4/181 | -1.305897e-06 | -0.000040 | -0 | -0 | -74.625235 | -1011.068962 | -NFKBIA;NLRP3;TNF;CXCL2 | -4.392729 | -4.392729 | -
16 | -KEGG_2021_Human | -Lipid and atherosclerosis | -4/215 | -2.592326e-06 | -0.000048 | -0 | -0 | -62.492891 | -803.843248 | -NFKBIA;NLRP3;TNF;CXCL2 | -4.316101 | -4.316101 | -
17 | -KEGG_2021_Human | -Legionellosis | -3/57 | -2.596486e-06 | -0.000048 | -0 | -0 | -158.222222 | -2034.951617 | -NFKBIA;TNF;CXCL2 | -4.316101 | -4.316101 | -
18 | -KEGG_2021_Human | -Coronavirus disease | -4/232 | -3.507534e-06 | -0.000054 | -0 | -0 | -57.783626 | -725.796849 | -NFKBIA;C5AR1;NLRP3;TNF | -4.264666 | -4.264666 | -
19 | -KEGG_2021_Human | -Shigellosis | -4/246 | -4.425543e-06 | -0.000059 | -0 | -0 | -54.402204 | -670.676767 | -NFKBIA;NLRP3;TNF;MALT1 | -4.230649 | -4.230649 | -
20 | -KEGG_2021_Human | -IL-17 signaling pathway | -3/94 | -1.177987e-05 | -0.000137 | -0 | -0 | -93.715856 | -1063.592345 | -NFKBIA;TNF;CXCL2 | -3.863467 | -3.863467 | -
21 | -KEGG_2021_Human | -T cell receptor signaling pathway | -3/104 | -1.596141e-05 | -0.000165 | -0 | -0 | -84.394625 | -932.167056 | -NFKBIA;TNF;MALT1 | -3.782688 | -3.782688 | -
22 | -KEGG_2021_Human | -TNF signaling pathway | -3/112 | -1.993521e-05 | -0.000185 | -0 | -0 | -78.169069 | -846.025656 | -NFKBIA;TNF;CXCL2 | -3.731896 | -3.731896 | -
23 | -KEGG_2021_Human | -Yersinia infection | -3/137 | -3.642707e-05 | -0.000308 | -0 | -0 | -63.505330 | -649.037070 | -NFKBIA;NLRP3;TNF | -3.511485 | -3.511485 | -
24 | -KEGG_2021_Human | -Influenza A | -3/172 | -7.174680e-05 | -0.000556 | -0 | -0 | -50.264582 | -479.643099 | -NFKBIA;NLRP3;TNF | -3.254896 | -3.254896 | -
25 | -KEGG_2021_Human | -Pathogenic Escherichia coli infection | -3/197 | -1.073310e-04 | -0.000768 | -0 | -0 | -43.731959 | -399.692315 | -NFKBIA;NLRP3;TNF | -3.114735 | -3.114735 | -
We first run cluster diagnostics to decide on a distance threshold with a good trade-off between the number of clusters and the quality of the clustering structure. We use the Levenshtein distance metric.
-# Running experiment to determine the distance threshold with a good trade-off
-df = ClusterUtils.run_clustering(aligner, metric='levenshtein', experiment_mode=True)
-
compute distance matrix
-using levenshtein distance metric
-
68%|██████▊ | 67/99 [00:00<00:00, 248.94it/s]
-
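For intuition about the metric used above, here is a minimal pure-Python sketch of the Levenshtein (edit) distance between two alignment strings. It is not the code used by `ClusterUtils`. The normalisation by the longer string length is an assumption, made only because the distance thresholds reported in this tutorial lie between 0 and 1.

```python
def levenshtein(a, b):
    """Minimum number of single-character edits needed to turn a into b."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (0 if char_a == char_b else 1)
            current.append(min(previous[j] + 1,      # deletion
                               current[j - 1] + 1,   # insertion
                               substitution))
        previous = current
    return previous[-1]


def normalised_levenshtein(a, b):
    """Edit distance scaled to [0, 1] by the longer length (assumed scaling)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


print(levenshtein("MMMD", "MMID"), normalised_levenshtein("MMMD", "MMID"))
```

Two genes with identical alignment strings have distance 0, so they always fall in the same cluster regardless of the threshold.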
We can inspect the distance thresholds that give local optima of the mean silhouette score.
-df[df['Mean Silhouette Score'] > 0.37].sort_values('Distance threshold', ascending=False)
-
- | Distance threshold | -Mean Silhouette Score | -Number of clusters | -
---|---|---|---|
66 | -0.67 | -0.371279 | -2.0 | -
65 | -0.66 | -0.371279 | -2.0 | -
64 | -0.65 | -0.371279 | -2.0 | -
63 | -0.64 | -0.371279 | -2.0 | -
62 | -0.63 | -0.371279 | -2.0 | -
61 | -0.62 | -0.371279 | -2.0 | -
60 | -0.61 | -0.371279 | -2.0 | -
59 | -0.60 | -0.395886 | -3.0 | -
58 | -0.59 | -0.395886 | -3.0 | -
57 | -0.58 | -0.395886 | -3.0 | -
56 | -0.57 | -0.395886 | -3.0 | -
55 | -0.56 | -0.391956 | -4.0 | -
54 | -0.55 | -0.391956 | -4.0 | -
53 | -0.54 | -0.391956 | -4.0 | -
52 | -0.53 | -0.391956 | -4.0 | -
51 | -0.52 | -0.391956 | -4.0 | -
50 | -0.51 | -0.391956 | -4.0 | -
49 | -0.50 | -0.391956 | -4.0 | -
48 | -0.49 | -0.391956 | -4.0 | -
47 | -0.48 | -0.391956 | -4.0 | -
46 | -0.47 | -0.391956 | -4.0 | -
45 | -0.46 | -0.391956 | -4.0 | -
44 | -0.45 | -0.391956 | -4.0 | -
43 | -0.44 | -0.391956 | -4.0 | -
39 | -0.40 | -0.374176 | -6.0 | -
38 | -0.39 | -0.374176 | -6.0 | -
37 | -0.38 | -0.374176 | -6.0 | -
36 | -0.37 | -0.374944 | -7.0 | -
9 | -0.10 | -0.384710 | -57.0 | -
8 | -0.09 | -0.398382 | -65.0 | -
7 | -0.08 | -0.415730 | -68.0 | -
6 | -0.07 | -0.415730 | -68.0 | -
5 | -0.06 | -0.415730 | -68.0 | -
4 | -0.05 | -0.415730 | -68.0 | -
3 | -0.04 | -0.415730 | -68.0 | -
2 | -0.03 | -0.415730 | -68.0 | -
1 | -0.02 | -0.415730 | -68.0 | -
0 | -0.01 | -0.415730 | -68.0 | -
Suppose we select the distance threshold 0.37, which gives a locally optimal mean silhouette score of 0.3749 with 7 clusters:
-ClusterUtils.run_clustering(aligner, metric='levenshtein', DIST_THRESHOLD=0.37)
-
compute distance matrix
-using levenshtein distance metric
-run agglomerative clustering | 0.37
-silhouette_score: 0.37494418907841714
-
The call below visualizes all alignment paths in each cluster, along with the number of genes in that cluster
-ClusterUtils.visualise_clusters(aligner,n_cols = 4, figsize= (10,6))
-
Below is the Levenshtein distance heat map of all genes ordered based on the above clustering structure
-VisualUtils.plot_distmap_with_clusters(aligner)
-
Print the cluster-specific aggregate (average) alignments for each cluster along with its number of genes
-ClusterUtils.print_cluster_average_alignments(aligner)
-
cluster: 0 IDDDMMMMMMMMIIIIIDDD 18
-cluster: 1 IIIDIDIDIDIDDDMMMMMMM 3
-cluster: 2 IDDDMMMMMMMMMMIIDI 36
-cluster: 3 IDDDMMMMMIMVVVVVIDDDDD 12
-cluster: 4 IIIDDDDDDDMMMMMMIIIIID 12
-cluster: 5 IIIDIDIDDDDDDMMMMVMVVM 2
-cluster: 6 DDMMMMMMMMIIMMMM 6
-
We can obtain the aggregate (average) alignment for any given gene subset in the aligner
-GENE_SUBSET = gene_list[40:60]
-aligner.get_aggregate_alignment_for_subset(GENE_SUBSET )
-
Average Alignment: IDDDMMMMMMMIIDIIIIDDD
-
We can check the aggregate alignment and statistics of a pathway gene set using the wrapper function calls below, which retrieve pathway gene sets
-IGS = PathwayAnalyserV2.InterestingGeneSets(MSIGDB_PATH='../MSIGDB/msigdb7.5.1/')
-IGS.add_new_set_from_msigdb('hallmark', 'HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION', aligner.gene_list, 'EMT')
-
PathwayAnalyserV2.get_pathway_alignment_stat(aligner, IGS.SETS['EMT'], 'EMT', cluster=True, FIGSIZE=(6,6))
-
mean matched percentage: 51.04 %
-mean matched percentage wrt ref: 64.29 %
-mean matched percentage wrt query: 67.14 %
-Average Alignment: IDDDMMMMMMMMMMIMI
-Z-normalised Interpolated mean trends
-
Show all alignments
-aligner.show_ordered_alignments()
-
Gene Alignment
--------- ------------------------
-PTAFR MDDMMMMMMMMMMIIDI
-OSBPL3 MDMMMMMMVVVVVVDIDDDDD
-RFFL MDDDMMMMMMMMMMIII
-TNFAIP2 MDDMMMMMMMIIDIIIIDDD
-SGMS2 DMMMMMMMMMMMMIM
-SLC16A10 DMMMMMMMMMMMIIIDD
-FPR1 DMMMMMMMMMIIDIIIDDD
-FAM20C DMMMMMMMMDDDMMIIII
-CLEC4D IMMDIDMMMMMMMIIIDDD
-TSHZ1 DDMMMMMMMVVVVVVIDDDDD
-IL1F9 DDMMMMMMMVVVVVVIDDDDD
-PSTPIP2 DDMMMMMMMVVVVVIDIDDDD
-RELA DDMMMMMMMMMMMMII
-NUP54 DDMMMMMMMMMMMMII
-DDHD1 DDMMMMMMMMMMMMII
-NRP2 DDMMMMMMMMMMMIIID
-TREM1 DDMMMMMMMMMIIDIIIDD
-GRAMD1B DDMMMMMMMMMIIMMM
-TOP1 DDMMMMMMMMIDIIIIIDDD
-ICOSL DDMMMMMMMMIIMMMM
-DUSP16 DDMMMMMMMMIIMMMM
-PTPRE DDMMMMMMMMIIIIIIDDDD
-LDLR DDMMMMMMMMIIDIIIIDDD
-TNIP1 DDMMMMMMMMIIIIIIDDDD
-PLAGL2 DDDMMMMMMMMMMMVVV
-ZSWIM4 DDDMMMMMMMMMMMVVV
-ZC3H12C DDDMMMMMMVVVVVVVIDDDDD
-AK150559 DDDMMMMMVVVVVVVVIDDDDDD
-F10 IDDMMMMVVVVVVVVIDDDDDDDD
-FAM108C DDDMMMMMMMMMMMIII
-RBM7 DDDMMMMMMMMMMMIII
-RASA2 DDDMMMMMMMMMMMIII
-SLC25A37 DDDMMMMMMMMMMMIII
-IRAK-2 DDDMMMMMMMMMMIIIID
-PLEKHO2 DDDMMMMMMMMMMIIIID
-LCP2 DDDMMMMMMMMMIIIIIDD
-TRIM13 DDDMMMMMMMMMIIIMM
-PTX3 DDDMMMMMMIIIMMMMM
-SPATA13 IDDMMMMMMIDIIIIIIDDDDD
-BCL2L11 IDDMMMDMMMMMMMMII
-CD44 DDIDMMMMMMMMMMMVV
-AK163103 DDIDMMMMMMVVVVVVIDDDDD
-LZTFL1 DDIDMMMMMMVVVVVVIDDDDD
-IRAK3 DDIDMMMMMVVVVVVIIDDDDDD
-ARG2 DDIDMMMMMMMMMMMII
-ZEB2 DDIDMMMMMMMMMMIIID
-TLR2 DDIDMMMMMMMMMMIIID
-MCOLN2 DDIDMMMMMMMMMMIIID
-CPD IDIDMMMMMMMMMIIIDDD
-RCAN1 DDDDMMMMMMMIIIMMMI
-PILRA DDIDMMMMMMMIDIIIIIDDD
-ARHGEF3 DDIDMMMMMMMIIMMMM
-C5AR1 IIIDMMMMMMIIIIIDDDDDDD
-SLC39A14 DDIDMMMMMMDDMMMIIII
-CLCN7 DDDIDMMMMMMVVVVVVIDDDD
-BC031781 DDDIDMMMMMMMMMIIIID
-NUPR1 DDDIDMMMMMMMMMIIIID
-CDC42EP4 DDDIDMMMMMMMMMIIIID
-NFKBIE DDDIDMMMMMMMMIIIMM
-PLSCR1 DDDIDMMMMMMMMIIIIIDD
-NCK1 DDDIDMMMMMMMMIIIMM
-ADORA2B DDDIDMMMMMMMMIIIIIDD
-ORAI2 DDDIDMMMMMMMIIIIIIDDD
-KLF7 DDDIDMMMMMMMIIIIIIDDD
-NIACR1 DDDIDDMMMMVVVVVVVVIDDDDD
-PDE4B DDDIDDMMMMMMMMMIIII
-SERTAD2 DDDIDDMMMMMMMMIIIIID
-CXCL1 DDDIDDMMMMMMMIIIIIIDD
-MPP5 DDDIDDMMMMMIIMMMIIID
-TGM2 DDDIDIDMMMMMMMMMIII
-PIP5K1A IDDIDIDMMMMMMMMMIID
-FLRT3 DDDIDDDMMMMMMMMIIIII
-SOCS3 DDDIDDDDMMMVVVVVIMMMM
-TNFAIP3 DDDIDDDDMMMMMMMIIIIII
-RASGEF1B DDDIDDDDMMMMMMMIIIIII
-SLC25A25 DDDIDDDDMMMMMMIIIIIIM
-INSIG1 DDDIDDDDMMMMMMIIIIIIID
-CXCL2 DDDIDIDDMMMMMMIIIIIIDD
-MALT1 DDDIDIDDDMMMMMMIIIIIID
-RALGDS DDDIDIDDDMMMMMMIIIIIID
-H1F0 DDDIDIDIDMMMMMIIIMMM
-IL1A DDDIDIDIDDMMMMMMMIIII
-NLRP3 DDDIDIDIDDMMMMMIIIIIIDD
-TNF DDDIDIDIDDDMMMMMIIIIIID
-NFKBIA DDDIDIDIDDDMMMMMIIIIIID
-PLK2 IIIDIDIDIDDDMMMMMMMID
-NFKBIZ DDDIDIDDDDDIVVVVVVVDMMMM
-NFKBID IIIDIDIDIDDDDMMMMMMMI
-CCRL2 IIIDIDIDIDIIIDDDDDMMMMM
-
30 PTAFR
-40 OSBPL3
-74 RFFL
-80 TNFAIP2
-26 SGMS2
- ...
-77 NFKBIA
-81 PLK2
-88 NFKBIZ
-46 NFKBID
-63 CCRL2
-Name: genes, Length: 89, dtype: object
-
| | Gene | alignment_similarity_percentage | opt_alignment_cost | l2fc | color | abs_l2fc |
|---|---|---|---|---|---|---|
| 63 | CCRL2 | 0.2174 | 55.943685 | -0.487688 | red | 0.487688 |
| 77 | NFKBIA | 0.2174 | 54.673471 | -0.091748 | red | 0.091748 |
| 68 | NLRP3 | 0.2174 | 57.177548 | 0.069058 | red | 0.069058 |
| 3 | TNF | 0.2174 | 57.990078 | -0.006439 | red | 0.006439 |
| 45 | C5AR1 | 0.2727 | 57.858236 | 0.8711 | red | 0.8711 |
| ... | ... | ... | ... | ... | ... | ... |
| 34 | NUP54 | 0.75 | 30.744063 | 0.012993 | green | 0.012993 |
| 15 | CD44 | 0.7647 | 28.366715 | -0.021366 | green | 0.021366 |
| 19 | PLAGL2 | 0.8235 | 31.807956 | -0.051268 | green | 0.051268 |
| 51 | ZSWIM4 | 0.8235 | 30.214575 | 0.030379 | green | 0.030379 |
| 26 | SGMS2 | 0.8667 | 37.626682 | -0.020399 | green | 0.020399 |
89 rows × 6 columns
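The `alignment_similarity_percentage` values in this table appear to equal the fraction of matched positions in each gene's alignment string listed earlier. For example, SGMS2 (`DMMMMMMMMMMMMIM`) has 13 matched positions out of 15, which gives 0.8667. The following is a minimal sketch of that calculation, assuming that `M` and `V` (and `W`, which does not occur in this output) count as matched states while `I` and `D` count as mismatched. The function name is hypothetical and is not taken from the package.

```python
def alignment_similarity(alignment: str, matched_states: str = "MVW") -> float:
    """Fraction of positions in a five-state alignment string that are matched.

    Assumption: states listed in `matched_states` are matches; every other
    state (here I and D) is treated as a mismatch.
    """
    if not alignment:
        raise ValueError("alignment string must not be empty")
    matched = sum(1 for state in alignment if state in matched_states)
    return round(matched / len(alignment), 4)


# Alignment strings copied from the per-gene listing above.
examples = {
    "SGMS2": "DMMMMMMMMMMMMIM",
    "PLAGL2": "DDDMMMMMMMMMMMVVV",
    "CCRL2": "IIIDIDIDIDIIIDDDDDMMMMM",
}

for gene, alignment in examples.items():
    print(gene, alignment_similarity(alignment))
```

The printed values can be compared against the `alignment_similarity_percentage` column to check whether the assumed definition of a matched state holds for a given gene.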
| | Gene_set | Term | Overlap | P-value | Adjusted P-value | Old P-value | Old Adjusted P-value | Odds Ratio | Combined Score | Genes | -log10 Adjusted P-value | -log10 FDR q-val |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 13 | KEGG_2021_Human | C-type lectin receptor signaling pathway | 4/104 | 1.414253e-07 | 0.000007 | 0 | 0 | 132.600000 | 2091.300110 | NFKBIA;NLRP3;TNF;MALT1 | 5.182020 | 5.182020 |
| 14 | KEGG_2021_Human | NF-kappa B signaling pathway | 4/104 | 1.414253e-07 | 0.000007 | 0 | 0 | 132.600000 | 2091.300110 | NFKBIA;TNF;CXCL2;MALT1 | 5.182020 | 5.182020 |
| 0 | MSigDB_Hallmark_2020 | TNF-alpha Signaling via NF-kB | 4/200 | 1.944057e-06 | 0.000013 | 0 | 0 | 67.326531 | 885.393278 | NFKBIA;CCRL2;TNF;CXCL2 | 4.898378 | 4.898378 |
| 1 | MSigDB_Hallmark_2020 | Inflammatory Response | 4/200 | 1.944057e-06 | 0.000013 | 0 | 0 | 67.326531 | 885.393278 | NFKBIA;C5AR1;CCRL2;NLRP3 | 4.898378 | 4.898378 |
| 15 | KEGG_2021_Human | NOD-like receptor signaling pathway | 4/181 | 1.305897e-06 | 0.000040 | 0 | 0 | 74.625235 | 1011.068962 | NFKBIA;NLRP3;TNF;CXCL2 | 4.392729 | 4.392729 |
| 16 | KEGG_2021_Human | Lipid and atherosclerosis | 4/215 | 2.592326e-06 | 0.000048 | 0 | 0 | 62.492891 | 803.843248 | NFKBIA;NLRP3;TNF;CXCL2 | 4.316101 | 4.316101 |
| 17 | KEGG_2021_Human | Legionellosis | 3/57 | 2.596486e-06 | 0.000048 | 0 | 0 | 158.222222 | 2034.951617 | NFKBIA;TNF;CXCL2 | 4.316101 | 4.316101 |
| 18 | KEGG_2021_Human | Coronavirus disease | 4/232 | 3.507534e-06 | 0.000054 | 0 | 0 | 57.783626 | 725.796849 | NFKBIA;C5AR1;NLRP3;TNF | 4.264666 | 4.264666 |
| 19 | KEGG_2021_Human | Shigellosis | 4/246 | 4.425543e-06 | 0.000059 | 0 | 0 | 54.402204 | 670.676767 | NFKBIA;NLRP3;TNF;MALT1 | 4.230649 | 4.230649 |
| 20 | KEGG_2021_Human | IL-17 signaling pathway | 3/94 | 1.177987e-05 | 0.000137 | 0 | 0 | 93.715856 | 1063.592345 | NFKBIA;TNF;CXCL2 | 3.863467 | 3.863467 |
| 21 | KEGG_2021_Human | T cell receptor signaling pathway | 3/104 | 1.596141e-05 | 0.000165 | 0 | 0 | 84.394625 | 932.167056 | NFKBIA;TNF;MALT1 | 3.782688 | 3.782688 |
| 22 | KEGG_2021_Human | TNF signaling pathway | 3/112 | 1.993521e-05 | 0.000185 | 0 | 0 | 78.169069 | 846.025656 | NFKBIA;TNF;CXCL2 | 3.731896 | 3.731896 |
| 23 | KEGG_2021_Human | Yersinia infection | 3/137 | 3.642707e-05 | 0.000308 | 0 | 0 | 63.505330 | 649.037070 | NFKBIA;NLRP3;TNF | 3.511485 | 3.511485 |
| 24 | KEGG_2021_Human | Influenza A | 3/172 | 7.174680e-05 | 0.000556 | 0 | 0 | 50.264582 | 479.643099 | NFKBIA;NLRP3;TNF | 3.254896 | 3.254896 |
| 25 | KEGG_2021_Human | Pathogenic Escherichia coli infection | 3/197 | 1.073310e-04 | 0.000768 | 0 | 0 | 43.731959 | 399.692315 | NFKBIA;NLRP3;TNF | 3.114735 | 3.114735 |
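The last two columns of the enrichment table hold the negative base-10 logarithm of the adjusted P-value, so that more significant terms get larger values when plotted. The sketch below shows that transform using only the standard library. The function name and the floor used to guard against a P-value of exactly zero are illustrative assumptions, not part of the tutorial code.

```python
import math


def neg_log10(p_value: float, floor: float = 1e-300) -> float:
    """Return -log10(p), clamping p to a small floor so p == 0 does not fail."""
    if p_value < 0 or p_value > 1:
        raise ValueError("p-value must lie in [0, 1]")
    return -math.log10(max(p_value, floor))


# Adjusted P-values as printed in the table. They are rounded to six
# decimals there, so the results only approximate the stored column.
printed = [
    ("NF-kappa B signaling pathway", 0.000007),
    ("TNF signaling pathway", 0.000185),
]

for term, p_adj in printed:
    print(f"{term}: {neg_log10(p_adj):.2f}")
```

Because the printed adjusted P-values are rounded, small differences from the table's `-log10 Adjusted P-value` column are expected, most visibly for the smallest values.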
| | Distance threshold | Mean Silhouette Score | Number of clusters |
|---|---|---|---|
| 66 | 0.67 | 0.371279 | 2.0 |
| 65 | 0.66 | 0.371279 | 2.0 |
| 64 | 0.65 | 0.371279 | 2.0 |
| 63 | 0.64 | 0.371279 | 2.0 |
| 62 | 0.63 | 0.371279 | 2.0 |
| 61 | 0.62 | 0.371279 | 2.0 |
| 60 | 0.61 | 0.371279 | 2.0 |
| 59 | 0.60 | 0.395886 | 3.0 |
| 58 | 0.59 | 0.395886 | 3.0 |
| 57 | 0.58 | 0.395886 | 3.0 |
| 56 | 0.57 | 0.395886 | 3.0 |
| 55 | 0.56 | 0.391956 | 4.0 |
| 54 | 0.55 | 0.391956 | 4.0 |
| 53 | 0.54 | 0.391956 | 4.0 |
| 52 | 0.53 | 0.391956 | 4.0 |
| 51 | 0.52 | 0.391956 | 4.0 |
| 50 | 0.51 | 0.391956 | 4.0 |
| 49 | 0.50 | 0.391956 | 4.0 |
| 48 | 0.49 | 0.391956 | 4.0 |
| 47 | 0.48 | 0.391956 | 4.0 |
| 46 | 0.47 | 0.391956 | 4.0 |
| 45 | 0.46 | 0.391956 | 4.0 |
| 44 | 0.45 | 0.391956 | 4.0 |
| 43 | 0.44 | 0.391956 | 4.0 |
| 39 | 0.40 | 0.374176 | 6.0 |
| 38 | 0.39 | 0.374176 | 6.0 |
| 37 | 0.38 | 0.374176 | 6.0 |
| 36 | 0.37 | 0.374944 | 7.0 |
| 9 | 0.10 | 0.384710 | 57.0 |
| 8 | 0.09 | 0.398382 | 65.0 |
| 7 | 0.08 | 0.415730 | 68.0 |
| 6 | 0.07 | 0.415730 | 68.0 |
| 5 | 0.06 | 0.415730 | 68.0 |
| 4 | 0.05 | 0.415730 | 68.0 |
| 3 | 0.04 | 0.415730 | 68.0 |
| 2 | 0.03 | 0.415730 | 68.0 |
| 1 | 0.02 | 0.415730 | 68.0 |
| 0 | 0.01 | 0.415730 | 68.0 |
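The table suggests that taking the highest mean silhouette score on its own is a poor selection rule here: the maximum (0.415730) occurs at 68 clusters for 89 genes, which means most clusters hold a single gene. One possible way to choose a distance threshold, sketched below, is to cap the number of clusters before maximising the score. The cap of 10 and the function name are hypothetical choices for illustration and are not the tutorial's procedure. The rows are copied from the table, one per plateau of identical scores.

```python
# (distance threshold, mean silhouette score, number of clusters)
rows = [
    (0.67, 0.371279, 2),
    (0.60, 0.395886, 3),
    (0.56, 0.391956, 4),
    (0.40, 0.374176, 6),
    (0.37, 0.374944, 7),
    (0.10, 0.384710, 57),
    (0.09, 0.398382, 65),
    (0.08, 0.415730, 68),
]


def pick_threshold(rows, max_clusters):
    """Return the row with the highest silhouette score within the cluster cap."""
    candidates = [row for row in rows if row[2] <= max_clusters]
    if not candidates:
        raise ValueError("no threshold satisfies the cluster cap")
    return max(candidates, key=lambda row: row[1])


print(pick_threshold(rows, max_clusters=10))  # choice under the assumed cap
print(pick_threshold(rows, max_clusters=89))  # unconstrained maximum
```

With the assumed cap, the selection falls on the three-cluster plateau (thresholds 0.57 to 0.60); without it, the near-singleton solution wins.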