change from code review
wee-snufkin authored Dec 18, 2023
1 parent 7b870fd commit ab79c8a
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion topics/single-cell/tutorials/scrna-data-ingest/tutorial.md
@@ -62,7 +62,7 @@ To start with, here are the most common formats and datatypes that you might com
- **Tabular** - simply using TSV, CSV or TXT formats to store expression matrix as well as cell and gene metadata.
- **MTX** - it's just a sparse matrix format with genes on the rows and cells on the columns as output by Cell Ranger.
- **HDF5** - Hierarchical Data Format - can store datasets and groups. A dataset is a multidimensional array of data elements, together with supporting metadata. A group is a structure for organizing objects in an HDF5 file. This format allows for storing both the count matrices and all metadata in a single file rather than having separate features, barcodes and matrix files.
- - **AnnData objects** - [anndata](link https://anndata.readthedocs.io/en/latest/) is a Python package for handling annotated data matrices. In Galaxy, you'll see AnnData objects in **h5ad** format, which is based on the standard HDF5 (h5) format. There are lots of Python tools that work with this format, such as Scanpy, MUON, Cell Oracle, SquidPy, etc.
+ - **AnnData objects** - [anndata](https://anndata.readthedocs.io/en/latest/) is a Python package for handling annotated data matrices. In Galaxy, you'll see AnnData objects in **h5ad** format, which is based on the standard HDF5 (h5) format. There are lots of Python tools that work with this format, such as Scanpy, MUON, Cell Oracle, SquidPy, etc.
- **Loom** - it is simply an HDF5 file that contains specific groups containing the main matrix as well as row and column attributes and can be read by any language supporting HDF5. [Loompy](https://linnarssonlab.org/loompy/) has been released as a Python API to interact with loom files, and [loomR](https://github.com/mojaveazure/loomR) is its implementation in R.
- **Zarr** - a Python package providing an implementation of compressed, chunked, N-dimensional arrays, designed for use in parallel computing. The Zarr file format offers powerful compression options, supports multiple data store backends, and can read/write your NumPy arrays.
- **Seurat objects** - a representation of single-cell expression data for R, in Galaxy you might see them in **rdata** format.
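
To make the **MTX** entry in the list above more concrete, here is a minimal sketch of reading a Matrix Market coordinate file using only the Python standard library. It assumes the usual layout (a `%%MatrixMarket` header, optional `%` comment lines, a size line, then 1-based `row column value` triples) and integer counts. The function name `read_mtx` and the tiny example matrix are hypothetical and made up for illustration; in practice a dedicated library reader would normally be used instead.

```python
import io


def read_mtx(handle):
    """Parse a Matrix Market coordinate file into ((n_genes, n_cells), counts).

    `counts` maps 0-based (gene_index, cell_index) pairs to integer values;
    entries absent from the file are implicit zeros.
    """
    header = handle.readline()
    if not header.startswith("%%MatrixMarket"):
        raise ValueError("missing %%MatrixMarket header")

    # Skip comment lines; the first non-comment line holds the dimensions.
    line = handle.readline()
    while line.startswith("%"):
        line = handle.readline()
    n_genes, n_cells, n_entries = (int(x) for x in line.split())

    counts = {}
    for line in handle:
        if not line.strip():
            continue
        gene, cell, value = line.split()
        # File indices are 1-based; convert to 0-based.
        counts[(int(gene) - 1, int(cell) - 1)] = int(value)

    if len(counts) != n_entries:
        raise ValueError("entry count does not match the size line")
    return (n_genes, n_cells), counts


# Hypothetical 3-gene x 2-cell matrix with three non-zero entries.
example = """%%MatrixMarket matrix coordinate integer general
%
3 2 3
1 1 5
3 1 2
2 2 7
"""

shape, counts = read_mtx(io.StringIO(example))
```

Only the non-zero entries are stored, which is why the format suits sparse single-cell count matrices; gene and cell names live in the separate features and barcodes files mentioned under HDF5.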