TSUMUGI (Trait-driven Surveillance for Mutation-based Gene module Identification) is a tool that uses phenotype data from knockout mice provided by the International Mouse Phenotyping Consortium (IMPC).
It identifies and visualizes gene modules based on phenotypic similarity.
TSUMUGI is available as a web tool that anyone can access here 👇️
🔗 https://larc-tsukuba.github.io/tsumugi/
The name TSUMUGI is derived from the Japanese word 紡ぎ, symbolizing the concept of twisting various elements (genes) together to create a unified entity (phenotype).
TSUMUGI accepts the following three types of input:
When you input a phenotype of interest, TSUMUGI searches for a group of genes whose knockout (KO) mice not only exhibit that phenotype, but also share similar overall phenotype profiles.
Phenotype names are based on the Mammalian Phenotype Ontology (MPO).
You can view the list of phenotypes currently searchable with TSUMUGI here:
👉 Phenotype List
When you specify a particular gene, TSUMUGI searches for other genes whose KO mice share similar phenotype profiles with the KO mouse of the specified gene.
Gene symbols follow the official nomenclature from MGI.
You can view the list of gene symbols currently searchable with TSUMUGI here:
👉 Gene Symbol List
You can input multiple genes (one per line).
TSUMUGI calculates phenotypic similarity between genes within the list only.
Note
Unlike a single gene search, the Gene List input compares phenotypes within the listed genes only.
Caution
If no similar phenotype pairs are found, the message "No similar phenotypes were found among the entered genes." will appear.
If more than 200 gene pairs are detected, the alert "Too many genes submitted. Please limit the number to 200 or fewer." will appear to prevent browser overload.
You can download raw data of phenotype similarities between gene pairs (CSV format, gzip-compressed).
The file includes:
- Gene pair (Gene1, Gene2)
- Phenotypic similarity score (Jaccard Similarity)
- Number of shared phenotypes
- List of shared phenotypes
Caution
The file size is approximately 100MB. Downloading may take some time.
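Because the download is a large gzip-compressed CSV, it can be convenient to stream it row by row instead of decompressing it first. The following is a minimal sketch using only the Python standard library. The function name `read_similarity_rows` and the file name in the usage comment are hypothetical, and the actual column headers should be checked in the downloaded file, since they are not listed here.

```python
import csv
import gzip
from typing import Dict, Iterator


def read_similarity_rows(path: str) -> Iterator[Dict[str, str]]:
    """Yield one dict per CSV row from a gzip-compressed file.

    Rows are streamed, so the whole file is never held in memory.
    Keys are whatever header names the file itself provides.
    """
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        yield from csv.DictReader(handle)


# Usage (hypothetical file name; adjust to the file you downloaded):
# for row in read_similarity_rows("tsumugi_similarity.csv.gz"):
#     print(row)
#     break  # inspect the first row to learn the real column names
```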
After input is submitted, the page automatically transitions to a network view.
Only gene pairs that share three or more abnormal phenotypes are visualized.
- Each node represents a single gene.
- Clicking a node displays the list of abnormal phenotypes observed in its KO mouse.
- Nodes can be repositioned freely by dragging.
- Clicking an edge reveals detailed information about the shared phenotypes between the two connected genes.
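The rule that only gene pairs sharing three or more abnormal phenotypes are drawn can be illustrated with a short sketch. This is not TSUMUGI's own code: the function name `edges_to_draw`, the gene names, and the input layout (a dict mapping each gene to its set of phenotypes) are assumptions made for illustration.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

MIN_SHARED_PHENOTYPES = 3  # threshold stated in the text


def edges_to_draw(
    phenotypes_by_gene: Dict[str, Set[str]],
    min_shared: int = MIN_SHARED_PHENOTYPES,
) -> List[Tuple[str, str, List[str]]]:
    """Return (gene1, gene2, shared phenotypes) for pairs meeting the threshold."""
    edges = []
    for gene1, gene2 in combinations(sorted(phenotypes_by_gene), 2):
        shared = phenotypes_by_gene[gene1] & phenotypes_by_gene[gene2]
        if len(shared) >= min_shared:
            edges.append((gene1, gene2, sorted(shared)))
    return edges


# Hypothetical example data: only GeneA and GeneB share three phenotypes.
example = {
    "GeneA": {"p1", "p2", "p3", "p4"},
    "GeneB": {"p1", "p2", "p3"},
    "GeneC": {"p1", "p9"},
}
print(edges_to_draw(example))
```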
The left-hand control panel allows you to adjust the appearance and content of the network.
You can choose from the following layout options:
- Cose (default)
- Circle
- Random
- Grid
- Concentric
The Phenotypes similarity slider lets you filter gene pairs based on their phenotypic similarity (Jaccard index).
The similarity scores are scaled to a range of 1 to 10, allowing filtering in 10 steps.
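How the scores are mapped onto the 1 to 10 range is not specified here. One common choice is linear min-max scaling, sketched below purely as an assumption; the function name `scale_to_slider` is hypothetical and TSUMUGI may use a different mapping.

```python
from typing import List, Sequence


def scale_to_slider(
    values: Sequence[float], low: float = 1.0, high: float = 10.0
) -> List[float]:
    """Linearly rescale values so the minimum maps to `low` and the maximum to `high`.

    Assumption: min-max scaling. If all values are equal, every value maps to `high`.
    """
    if not values:
        return []
    smallest, largest = min(values), max(values)
    if largest == smallest:
        return [high for _ in values]
    span = largest - smallest
    return [low + (v - smallest) * (high - low) / span for v in values]


print(scale_to_slider([0.0, 0.5, 1.0]))
```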
The Phenotypes severity slider allows you to filter nodes based on phenotype severity in KO mice.
Higher severity scores indicate stronger phenotypic effects.
This score is also scaled from 1 to 10 for 10-step filtering.
Note
The severity slider is not available if the phenotype data from IMPC is binary (e.g., abnormal embryo development) or if the input consists of a single gene.
The following visual properties can be adjusted:
- Font size
- Edge thickness
- Node repulsion (distance between nodes; applicable to Cose layout only)
You can search for specific gene names within the network.
You can restrict results to phenotypes observed in KO mice with a specific genotype:
- Homo: Homozygous
- Hetero: Heterozygous
- Hemi: Hemizygous
You can extract phenotypes specific to sex:
- Female
- Male
You can export the current network as PNG and CSV files.
The CSV file includes cluster (connected component) information and a list of phenotypes observed for each gene's KO mouse.
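A cluster in the exported CSV corresponds to a connected component of the network, that is, a set of genes that can all reach one another through edges. The sketch below shows one standard way to compute connected components with a breadth-first search. The function name and the gene names are hypothetical, and this is not necessarily how TSUMUGI computes them.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, List, Set, Tuple


def connected_components(edges: Iterable[Tuple[str, str]]) -> List[List[str]]:
    """Group genes into connected components using breadth-first search."""
    adjacency: Dict[str, Set[str]] = defaultdict(set)
    for gene1, gene2 in edges:
        adjacency[gene1].add(gene2)
        adjacency[gene2].add(gene1)

    seen: Set[str] = set()
    components: List[List[str]] = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            gene = queue.popleft()
            component.append(gene)
            for neighbor in adjacency[gene]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(sorted(component))
    return components


# Hypothetical edges: two separate clusters.
print(connected_components([("GeneA", "GeneB"), ("GeneB", "GeneC"), ("GeneD", "GeneE")]))
```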
The dataset used is IMPC Release-22.1 (2024-12-11), specifically the statistical-results-ALL.csv.gz file.
Detailed information about the data fields can be found here:
👉 Data fields
Gene–phenotype pairs were extracted where the KO mouse exhibited a phenotype with a p-value ≤ 0.0001 (p_value, female_ko_effect_p_value, or male_ko_effect_p_value).
- Genotype-specific phenotypes are annotated with homo, hetero, or hemi.
- Sex-specific phenotypes are annotated with female or male.
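The p-value filter described above can be sketched as a small predicate over one row of the statistical results. The sketch assumes that a row qualifies when any one of the three p-value columns is at or below the threshold, and that missing values appear as empty or non-numeric strings. Both are assumptions, and the helper names are hypothetical.

```python
from typing import Mapping, Optional

P_VALUE_COLUMNS = ("p_value", "female_ko_effect_p_value", "male_ko_effect_p_value")
THRESHOLD = 0.0001  # threshold stated in the text


def _to_float(text: Optional[str]) -> Optional[float]:
    """Parse a number, returning None for missing or non-numeric values."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None


def is_significant(row: Mapping[str, str], threshold: float = THRESHOLD) -> bool:
    """Return True if any of the three p-value columns is <= threshold."""
    for column in P_VALUE_COLUMNS:
        value = _to_float(row.get(column))
        if value is not None and value <= threshold:
            return True
    return False


# Hypothetical rows
print(is_significant({"p_value": "0.00005"}))
print(is_significant({"p_value": "0.01", "female_ko_effect_p_value": ""}))
```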
Phenotypic similarity is quantified using the Jaccard index,
which measures the proportion of shared phenotypes between two gene knockouts as a value between 0 and 1.
Jaccard(A, B) = |A ∩ B| / |A ∪ B|
For example, suppose KO mice of gene A and gene B exhibit the following abnormal phenotypes:
A: {abnormal embryo development, abnormal heart morphology, abnormal kidney morphology}
B: {abnormal embryo development, abnormal heart morphology, abnormal lung morphology}
In this case, there are 2 shared phenotypes and 4 unique phenotypes in total.
The Jaccard index is calculated as follows:
Jaccard(A, B) = 2 / 4 = 0.5
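The same calculation can be reproduced with Python sets. This is a minimal sketch of the Jaccard index as defined above, not TSUMUGI's own implementation. Returning 0.0 when both sets are empty is an assumption made here to avoid division by zero.

```python
from typing import Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard index: size of the intersection divided by size of the union."""
    union = a | b
    if not union:
        return 0.0  # assumption: two empty sets are treated as having no similarity
    return len(a & b) / len(union)


gene_a = {
    "abnormal embryo development",
    "abnormal heart morphology",
    "abnormal kidney morphology",
}
gene_b = {
    "abnormal embryo development",
    "abnormal heart morphology",
    "abnormal lung morphology",
}
print(jaccard(gene_a, gene_b))  # 0.5
```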
If you have any questions or requests, feel free to contact us using one of the methods below (inquiries in Japanese are welcome):
- Google Form
  👉 Contact Form
- If you have a GitHub account
  👉 GitHub Issue