ASCT+B Cell-Type Label Mapper

asctb_ct_label_mapper is a package that enforces a controlled vocabulary for cell-type annotations of scRNA-seq datasets. The goal is to enable cross-dataset and cross-experiment comparison by aligning annotations to a standard reference point.

Given an annotated scRNA-seq dataset for a specific organ (.h5ad/.rds), you can create a translation file that maps its raw labels to the ASCT+B naming convention.
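As a rough illustration of what such a translation file could look like, here is a minimal sketch that writes raw-label to ASCT+B-label pairs as CSV. The function name `write_translation_file`, the column names, and the example pairs are assumptions made for illustration; the package's actual file layout may differ.

```python
import csv
import io

def write_translation_file(mapping, handle):
    """Write raw-label -> ASCT+B-label pairs as CSV rows.

    The column names are illustrative assumptions, not the package's schema.
    """
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["raw_label", "asctb_label"])
    # Sort by raw label so the output is deterministic.
    for raw_label, asctb_label in sorted(mapping.items()):
        writer.writerow([raw_label, asctb_label])

# Illustrative pairs only; real pairs would come from the mapper plus expert review.
example_mapping = {
    "T cells": "T cell",
    "B-cells": "B cell",
}

buffer = io.StringIO()
write_translation_file(example_mapping, buffer)
translation_csv = buffer.getvalue()
```

Writing to a file handle rather than a path keeps the sketch easy to test; passing an `open(..., "w", newline="")` handle would produce a file on disk.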

General flow:

1. Create the reference embeddings by fetching the latest version of the corresponding ASCT+B organ table:
   - Fetch the ASCT+B dataset from the ASCT+B Master Tables.
   - Parse the data into three wrangled columns: CT-ID, CT-Name, CT-Label.
   - Fetch the description of each unique CT-ID from Cell Ontology.
   - Apply NLP preprocessing best practices to the text fields.
   - Use a Sentence-Transformer model hosted on Hugging Face to create embeddings of shape c x 768, where c is the number of unique CTs in the ASCT+B Master Table.
2. For each input raw Cell-Type annotation/cluster label, create its embedding and compare it against the reference embeddings generated in step 1.
3. Identify the best-matching ASCT+B label for the input raw label.
4. You can also visualize the agreement of cross-dataset annotations before and after using ASCTB CT Label Mapper.
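The wrangling and text-preprocessing parts of step 1 can be sketched as follows. This is a minimal sketch under stated assumptions: the helper names `preprocess` and `wrangle`, the specific normalization rules, and the placeholder rows (including the made-up `CT:0000001` style IDs) are all hypothetical, and the package may apply different preprocessing.

```python
import re

def preprocess(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace.

    '+' is kept because it can be meaningful in cell-type names (assumption).
    """
    text = text.lower()
    # Replace any run of characters other than a-z, 0-9 and '+' with a space.
    text = re.sub(r"[^a-z0-9+]+", " ", text)
    return " ".join(text.split())

def wrangle(rows):
    """Keep one (CT-ID, CT-Name, CT-Label) record per unique CT-ID.

    The first occurrence of each CT-ID wins; later duplicates are ignored.
    """
    unique = {}
    for row in rows:
        ct_id = row["CT-ID"].strip()
        if ct_id and ct_id not in unique:
            unique[ct_id] = (
                ct_id,
                preprocess(row["CT-Name"]),
                preprocess(row["CT-Label"]),
            )
    return list(unique.values())

# Placeholder rows with made-up IDs, standing in for a parsed ASCT+B table.
rows = [
    {"CT-ID": "CT:0000001", "CT-Name": "Example  Cell-Type A", "CT-Label": "Example cell type A"},
    {"CT-ID": "CT:0000001", "CT-Name": "example cell type a (duplicate)", "CT-Label": "duplicate"},
    {"CT-ID": "CT:0000002", "CT-Name": "Example_Cell Type B", "CT-Label": "B"},
]
reference = wrangle(rows)
```

The resulting list has one entry per unique CT-ID, which is the `c` in the c x 768 embedding shape once each entry is passed through the embedding model.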

A walkthrough is available on Google Colab here.

Architecture:

Step 1: Create Reference Embeddings

Step 2: Map input Cell-Type labels to these Reference Embeddings

Output: the top-2 matches from ASCT+B, offered as suggestions for each query Cell-Type annotation label.

An expert then provides feedback to finalize the translation from each query annotation label to an ASCT+B annotation label.
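Selecting the top-2 suggestions for one query label can be sketched as below, assuming the similarity scores against every reference label have already been computed. The function name `top_k_matches` and the placeholder labels and scores are hypothetical and not taken from the package.

```python
import heapq

def top_k_matches(scores, k=2):
    """Return the k (label, score) pairs with the highest scores, best first."""
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])

# Placeholder similarity scores for a single query label.
scores = {"label A": 0.91, "label B": 0.42, "label C": 0.87}
suggestions = top_k_matches(scores)
```

Returning the scores alongside the labels gives the reviewing expert a sense of how close the two candidates were.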

Cosine Similarity
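The heading suggests that query and reference embeddings are compared by cosine similarity, the dot product of two vectors divided by the product of their norms. A minimal pure-Python sketch follows; the function name and the choice to return 0.0 for zero vectors are assumptions, and the package itself may rely on a library routine instead.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Assumption: treat a zero vector as having no similarity to anything.
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

Values range from -1 (opposite direction) to 1 (same direction), so a higher score means a closer match between a raw label and an ASCT+B label.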
