How to load datasets

Jayaram Kancherla edited this page Jul 21, 2023 · 1 revision

Kana supports single-cell datasets stored in various formats for your convenience:

  • ExperimentHub: Import a dataset from Bioconductor’s ExperimentHub.
  • Matrix Market (MTX): If you have a 10x dataset stored in MTX format, choose this option. Import an MTX file along with its barcode file and feature (gene) file.
  • 10X HDF5 Matrix: For 10x datasets stored as H5 files in the V3 format.
  • ZIP File: Have a dataset stored as an ArtifactDB-formatted zip file? No problem!
  • RDS File: And yes, if you have an RDS file from R, we'll work our magic on that too!
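
For readers unfamiliar with the Matrix Market layout, the following Python sketch shows how the three files fit together: the MTX file holds the non-zero counts, the feature file names the rows, and the barcode file names the columns. This is not Kana's loader; the `parse_mtx` helper, the gene names, and the barcodes are made up for illustration, and real 10x files are usually gzip-compressed and far larger.

```python
import io

def parse_mtx(handle):
    """Parse a Matrix Market coordinate file into (n_rows, n_cols, entries).

    entries maps (row, col) -> value using 0-based indices.
    """
    header = handle.readline()
    if not header.startswith("%%MatrixMarket matrix coordinate"):
        raise ValueError("expected a Matrix Market coordinate header")
    line = handle.readline()
    while line.startswith("%"):  # skip comment lines
        line = handle.readline()
    n_rows, n_cols, n_entries = (int(x) for x in line.split())
    entries = {}
    for _ in range(n_entries):
        row, col, value = handle.readline().split()
        entries[(int(row) - 1, int(col) - 1)] = int(value)  # file is 1-based
    return n_rows, n_cols, entries

# Hypothetical toy inputs: 3 genes (rows) by 2 cells (columns).
mtx_text = """%%MatrixMarket matrix coordinate integer general
%
3 2 3
1 1 5
3 1 2
2 2 7
"""
features = ["ENSG0001\tGeneA", "ENSG0002\tGeneB", "ENSG0003\tGeneC"]
barcodes = ["AAAC-1", "TTTG-1"]

n_genes, n_cells, counts = parse_mtx(io.StringIO(mtx_text))

# The matrix dimensions must agree with the two annotation files.
assert n_genes == len(features) and n_cells == len(barcodes)

# Attach gene names and barcodes to each non-zero count.
gene_names = [f.split("\t")[1] for f in features]
count_by_name = {(gene_names[r], barcodes[c]): v for (r, c), v in counts.items()}
```

The dimension check mirrors why all three files are needed together: without matching feature and barcode files, the rows and columns of the matrix have no labels.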

Kana supports multi-modal analysis of single-cell datasets. Supported modalities currently include RNA-seq, CITE-seq, Perturb-seq, and similar CRISPR-based modalities.

Load a single dataset

When you load a single dataset, Kana automatically identifies default dataset parameters and determines the analysis type (single-modal vs. multi-modal).

image

Additionally, you have the power to:

Perform batch correction

If batch information is present, you can specify the annotation column containing it. We offer several batch correction options, including no correction, linear regression, and mutual nearest neighbor (MNN) correction.

  • The default MNN correction is well suited to situations where cell type composition differs across samples.
  • If batch effects are consistent and you prefer a simpler approach, go for linear regression.
  • You can also choose not to correct if the sample-to-sample differences are themselves of interest.
image
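
To make the linear regression option more concrete, here is a conceptual Python sketch for a single feature. It is an illustration under simplifying assumptions, not Kana's implementation: the `regress_out_batch` helper and the values are hypothetical, and batch is treated as the only predictor.

```python
from collections import defaultdict

def regress_out_batch(values, batches):
    """Remove a per-batch shift from one feature by centering each batch.

    With batch as the only categorical predictor, the least-squares fit for
    each cell is its batch mean, so the residual is value minus batch mean.
    """
    totals = defaultdict(float)
    sizes = defaultdict(int)
    for value, batch in zip(values, batches):
        totals[batch] += value
        sizes[batch] += 1
    means = {b: totals[b] / sizes[b] for b in totals}
    return [value - means[batch] for value, batch in zip(values, batches)]

# Hypothetical values for one gene; batch "B" is shifted up by 10.
values = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
batches = ["A", "A", "A", "B", "B", "B"]
corrected = regress_out_batch(values, batches)
```

Because each batch is centered on its own mean, this approach only behaves well when the batches contain a similar mix of cell types. If composition differs, genuine biological differences between batches are removed along with the batch effect, which is the situation MNN correction is designed to handle.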

or

Perform analysis on a subset of cells

To focus on specific cells, use this option to perform analysis only on a subset. Filter cells based on desired annotations before running any analysis steps. For example, you can keep cells whose annotation for “level1class” is “astrocytes” or “microglia” using the interface.

image
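
The filter described in this section amounts to keeping cells whose annotation value falls in a chosen set. A minimal Python sketch, assuming hypothetical cell records and a hypothetical `subset_cells` helper (Kana itself does this through the interface):

```python
# Hypothetical per-cell annotations; "level1class" is the column named in the text.
cells = [
    {"barcode": "cell1", "level1class": "astrocytes"},
    {"barcode": "cell2", "level1class": "interneurons"},
    {"barcode": "cell3", "level1class": "microglia"},
    {"barcode": "cell4", "level1class": "oligodendrocytes"},
]

def subset_cells(cells, column, keep):
    """Return only the cells whose annotation in `column` is in `keep`."""
    keep = set(keep)
    return [cell for cell in cells if cell.get(column) in keep]

selected = subset_cells(cells, "level1class", ["astrocytes", "microglia"])
selected_barcodes = [cell["barcode"] for cell in selected]
```

Only the selected cells would then be passed on to the downstream analysis steps.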

Load multiple datasets

You can import more than one dataset into Kana. In this scenario, each dataset is treated as a batch, and Kana performs data integration for each modality.

You are free to mix and match input data formats. In the screenshot below, I’ve added multiple datasets: one from ExperimentHub and another from a local H5AD file.

image
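
The idea of treating each dataset as a batch can be sketched in Python as follows. This is a conceptual illustration only: the dataset names, counts, and the `combine_as_batches` helper are hypothetical, and keeping only the genes shared by every dataset is an assumption made for this sketch.

```python
# Hypothetical datasets: name -> {gene: [counts per cell]}.
datasets = {
    "experimenthub": {"GeneA": [1, 0], "GeneB": [3, 4], "GeneC": [0, 2]},
    "local_file": {"GeneB": [5, 1, 0], "GeneC": [2, 2, 9], "GeneD": [7, 0, 1]},
}

def combine_as_batches(datasets):
    """Concatenate cells across datasets and record each cell's batch.

    Assumption for this sketch: only genes present in every dataset are kept.
    """
    shared = set.intersection(*(set(genes) for genes in datasets.values()))
    combined = {gene: [] for gene in sorted(shared)}
    batch_labels = []
    for name, genes in datasets.items():
        n_cells = len(next(iter(genes.values())))
        batch_labels.extend([name] * n_cells)  # one label per cell
        for gene in combined:
            combined[gene].extend(genes[gene])
    return combined, batch_labels

combined, batch_labels = combine_as_batches(datasets)
```

Here `batch_labels` plays the same role as the batch annotation column in the single-dataset case: it tells the correction step which cells came from which dataset.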