Nicole Repina edited this page Apr 5, 2023 · 27 revisions

Welcome to the scMultipleX Wiki!

Overview

scMultipleX is a software package for feature extraction from microscopy imaging data. It provides workflows for feature extraction of segmented objects (e.g. organoids) and single cells, and for linking objects and cells across multiplexing rounds. It supports 2D and 3D imaging data, and single-round or multiplexed experiments. scMultipleX uses Prefect (v1.4) for parallelized processing, and assumes that input data have been pre-processed with Drogon.

The workflow consists of the following tasks:

  • Task 0 Build Experiment: Initialize output data storage structure with FAIM-HCS (v0.1.1)
  • Task 1 Feature Extraction: Perform 2D object-level and 3D single-cell-level feature extraction and nuclear to membrane linking
  • Task 2 Organoid Multiplex: Link objects across multiplexing rounds
  • Task 3 Nuclear Multiplex: Link nuclei within objects across multiplexing rounds
  • Task 4 Aggregate Features: Output measured features for each round and object type (e.g. organoids, nuclei, membranes)
  • Task 5 Combine Nuclear and Membrane Features: Output combined nuclear and membrane features based on nuclear to membrane linking
  • Task 6 Aggregate Organoid Multiplex: Output measured object features across multiplexing rounds
  • Task 7 Aggregate Nuclear Multiplex: Output measured nuclear features across multiplexing rounds

Installation

Create a new conda environment with Python 3.8+:

conda create -n scmpx python=3.8

Activate the conda environment:

conda activate scmpx

Install scMultipleX:

pip install git+https://github.com/fmi-basel/gliberal-scMultipleX.git

Demo Run of scMultipleX

  1. Connect to vcl1060.fmi.ch via SSH or Remote Desktop. For SSH you can use:

ssh USERNAME@vcl1060.fmi.ch

Note to Windows users: use PowerShell or PuTTY. Don't forget to start a tmux or GNU screen session before starting a long analysis.

  1. Create an output directory (e.g. username/scMultiplex-demo-test):

mkdir /tungstenfs/scratch/gliberal/Users/MY_SAVE_DIR

  1. Copy the demo config file (demo.ini) to your own user folder:

cp -t /tungstenfs/scratch/gliberal/Users/MY_SAVE_DIR /tungstenfs/scratch/gliberal/Code/Common/Repositories/gliberal-scMultipleX/resources/scMultipleX_testdata/demo.ini

  1. Check that demo.ini was copied over:

ls /tungstenfs/scratch/gliberal/Users/MY_SAVE_DIR

You should see the demo.ini file listed in the directory.

  1. Edit this config file:
  • You can edit the file on Tungsten via your local Tungsten mapped drive (W: by default on Windows) or via Remote Desktop (recommended for VPN users) to an FMI workstation
  • Navigate to demo.ini and open it in your favorite text editor (e.g. Notepad++)
  • Change base_dir_save to the path of MY_SAVE_DIR, then save the file.
  1. Back on vcl1060, create a symbolic link to scMultipleX in the bin folder of your home directory. The link is persistent, so you only need to create it the first time you run scMultipleX on a given machine:
  • cd $HOME
  • mkdir -p bin
  • ln -s -t bin /tungstenfs/scratch/gliberal/Code/Common/Repositories/gliberal-scMultipleX/run_scmultiplex
  • ls -l bin
  1. Run scMultipleX on the test dataset. Start by listing the available arguments:

run_scmultiplex --help

Let's run each task one by one:

run_scmultiplex --cpus 10 --config /tungstenfs/scratch/gliberal/Users/MY_SAVE_DIR/demo.ini --tasks 0

run_scmultiplex --cpus 10 --config /tungstenfs/scratch/gliberal/Users/MY_SAVE_DIR/demo.ini --tasks 1

etc...

Or we can run multiple tasks at once:

run_scmultiplex --cpus 10 --config /tungstenfs/scratch/gliberal/Users/MY_SAVE_DIR/demo.ini --tasks 0 1 2 3 4 5 6 7

  1. Check the output folder!
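
The run_scmultiplex invocations used in the demo can also be assembled programmatically, for example when scripting several runs. The following is a minimal sketch that uses only the --cpus, --config and --tasks flags shown above; the helper name build_command is hypothetical and is not part of scMultipleX.

```python
import shlex
from pathlib import Path


def build_command(config_path, tasks, cpus=None):
    """Assemble the run_scmultiplex argument list (hypothetical helper)."""
    tasks = list(tasks)
    for task in tasks:
        # The wiki lists tasks 0 to 7 as the available integers.
        if not 0 <= task <= 7:
            raise ValueError(f"task must be in 0-7, got {task}")
    cmd = ["run_scmultiplex"]
    if cpus is not None:
        cmd += ["--cpus", str(cpus)]
    cmd += ["--config", str(config_path)]
    cmd += ["--tasks", *[str(t) for t in tasks]]
    return cmd


config = Path("/tungstenfs/scratch/gliberal/Users/MY_SAVE_DIR/demo.ini")
cmd = build_command(config, tasks=range(8), cpus=10)

# Print the shell-quoted command; to execute it instead, pass `cmd`
# to subprocess.run(cmd, check=True).
print(shlex.join(cmd))
```

Keeping the command as a list rather than a single string avoids quoting problems if the config path ever contains spaces.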

Configuring scMultipleX

[00BuildExperiment]

General parameters for initializing FAIM-HCS experiment structure

well_pattern = Regex pattern for recognizing well ID

raw_ch_pattern = Regex pattern for recognizing channel ID in raw image files

mask_ending = Suffix of organoid segmentation image

base_dir_raw = Path to raw data directory (folder contains rounds)

base_dir_save = Path to save directory

spacing = Z,Y,X pixel spacing of region-extracted data in um/pix, comma-separated

overview_spacing = Y,X pixel spacing of well overview images, comma-separated

round_names = Names of multiplexing rounds, comma-separated
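
To make the regex and comma-separated parameters above more concrete, here is a small sketch of how such values might be interpreted. The file name, both regex patterns and the spacing values are hypothetical examples chosen for illustration; the patterns that fit your data, and the way scMultipleX applies them internally, may differ.

```python
import re

# Hypothetical patterns; adapt them to your own file naming scheme.
well_pattern = re.compile(r"_([A-P]\d{2})_")
raw_ch_pattern = re.compile(r"(C\d{2})\.tif$")

# Hypothetical raw image file name.
fname = "plate1_A01_T0001F001L01A01Z01C01.tif"

well = well_pattern.search(fname).group(1)
channel = raw_ch_pattern.search(fname).group(1)


def parse_floats(value):
    """Split a comma-separated config value into a tuple of floats."""
    return tuple(float(v) for v in value.split(","))


spacing = parse_floats("0.6,0.216,0.216")   # Z,Y,X in um/pix
overview_spacing = parse_floats("0.216,0.216")  # Y,X

print(well, channel, spacing, overview_spacing)
```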

[00BuildExperiment.round_R0]

Round-specific parameters for initializing FAIM-HCS experiment structure. Include this subsection once for each round and update its name accordingly, e.g. round_R1

name = Round name

nuc_ending = Suffix of nuclear segmentation image

mem_ending = Suffix of membrane segmentation image

root_dir = Path to raw data directory for this round

fname_barcode_index = Number of underscores in Yokogawa barcode, integer

organoid_seg_channel = Image channel used for organoid segmentation, e.g. C01

nuclear_seg_channel = Image channel used for nuclear segmentation, e.g. C01

membrane_seg_channel = Image channel used for membrane segmentation, e.g. C04
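
As an illustration of the section layout described above, the sketch below reads a small INI fragment with Python's standard configparser and checks that every round listed in round_names has a matching round subsection. All values are placeholders, and the assumption that the subsection suffix equals the round name (round_R0 for R0) is inferred from the example names on this page rather than confirmed against the scMultipleX source.

```python
import configparser

# Placeholder config fragment; only a few of the documented keys are shown.
CONFIG_TEXT = """
[00BuildExperiment]
base_dir_save = /path/to/save
round_names = R0,R1

[00BuildExperiment.round_R0]
name = R0
nuclear_seg_channel = C01
membrane_seg_channel = C04

[00BuildExperiment.round_R1]
name = R1
nuclear_seg_channel = C01
membrane_seg_channel = C04
"""

# interpolation=None keeps '%' characters in regex values literal.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(CONFIG_TEXT)

round_names = [
    name.strip()
    for name in parser["00BuildExperiment"]["round_names"].split(",")
]

# Assumed naming convention: one subsection per round, suffixed with its name.
missing = [
    name
    for name in round_names
    if not parser.has_section(f"00BuildExperiment.round_{name}")
]

print("rounds:", round_names)
print("rounds without a subsection:", missing)
```

A check like this can catch a forgotten round subsection before starting Task 0.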

[01FeatureExtraction]

Parameters used during feature extraction

excluded_plates = Folder names of plates to exclude from analysis (e.g. day2,day3), comma-separated

excluded_wells = Well IDs to exclude from analysis (e.g. A01,C06), comma-separated

ovr_channel = Image channel used for organoid segmentation, e.g. C01

name_ovr = Naming of regionprops file; always keep as regionprops_ovr_

iop_cutoff = Float value 0 to 1 for cutoff threshold for calling a nucleus inside a membrane. Recommended value is 0.6
            iop = number of pixels in intersection of membrane and nuclear label / number of pixels in nuclear label
            Closer to 1 means better match
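
The iop definition above can be written out in a few lines of NumPy. This is a minimal sketch on toy 2D masks, not the scMultipleX implementation; the function name and the use of >= for the cutoff comparison are assumptions.

```python
import numpy as np


def iop(nuc_mask, mem_mask):
    """Pixels shared by nucleus and membrane / pixels in the nucleus."""
    nuc = np.asarray(nuc_mask, dtype=bool)
    mem = np.asarray(mem_mask, dtype=bool)
    nuc_pixels = nuc.sum()
    if nuc_pixels == 0:
        return 0.0  # an empty nucleus cannot lie inside anything
    return float(np.logical_and(nuc, mem).sum() / nuc_pixels)


# Toy example: a 4-pixel nucleus with 3 pixels inside the membrane label.
nuc = np.zeros((5, 5), dtype=bool)
nuc[1:3, 1:3] = True
mem = np.zeros((5, 5), dtype=bool)
mem[0:3, 0:3] = True
mem[2, 2] = False

score = iop(nuc, mem)
iop_cutoff = 0.6  # recommended value from this page
print(score, score >= iop_cutoff)
```

Because the denominator is the nuclear label only, a small nucleus fully enclosed by a much larger membrane label still scores 1.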

[02OrganoidLinking]

Parameters used during organoid linking

iou_cutoff = Float value 0 to 1 for cutoff threshold for matching RX to R0 object. Recommended value is 0.2
            iou = number of pixels in intersection of R0 and RX object label / number of pixels in union of R0 and RX object label
            Closer to 1 means better match
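
The iou criterion can be sketched in the same way. The example below computes iou between one R0 object mask and each label of an RX label image, and keeps the best-scoring label that passes the cutoff. It assumes the two rounds are already aligned on the same pixel grid; the helper names and the best-match selection rule are illustrative assumptions, not the documented scMultipleX behavior.

```python
import numpy as np


def iou(mask_a, mask_b):
    """Pixels in the intersection / pixels in the union of two masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(a, b).sum() / union)


def best_match(r0_mask, rx_labels, iou_cutoff=0.2):
    """Return (label, iou) of the best RX label above the cutoff, or None."""
    best = None
    for label in np.unique(rx_labels):
        if label == 0:  # 0 is treated as background
            continue
        score = iou(r0_mask, rx_labels == label)
        if score >= iou_cutoff and (best is None or score > best[1]):
            best = (int(label), score)
    return best


r0_mask = np.zeros((8, 8), dtype=bool)
r0_mask[0:4, 0:4] = True                # 16 pixels

rx_labels = np.zeros((8, 8), dtype=int)
rx_labels[1:5, 1:5] = 1                 # overlaps R0 in a 3x3 patch
rx_labels[6:8, 6:8] = 2                 # no overlap with R0

match = best_match(r0_mask, rx_labels)
print(match)
```

Here label 1 shares 9 pixels with the R0 object out of a union of 23, which is above the recommended cutoff of 0.2, while label 2 does not overlap at all.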

Running scMultipleX

On Linux machines

scMultipleX is installed on tungsten at: scratch/gliberal/Code/Common/Repositories/gliberal-scMultipleX

and can be run on any Linux machine with this conda environment. See Demo Run section for more details.

Use run_scmultiplex --help for details on arguments.

To run:

run_scmultiplex --cpus [NUM CORES, INT] --config [PATH TO .INI CONFIG] --tasks [TASKS TO RUN]

Note:

  • --cpus default is number of cores available for the process on the machine
  • --tasks available are integers 0 - 7
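
To see what "number of cores available for the process" means on a given machine, the count can be queried from Python as sketched below. This shows one common way to obtain that number on Linux; it is not confirmed to be the call scMultipleX uses for its --cpus default.

```python
import os


def available_cpus():
    """Number of CPU cores the current process is allowed to run on."""
    try:
        # Linux only: respects CPU affinity masks, e.g. set by a scheduler.
        return len(os.sched_getaffinity(0))
    except AttributeError:
        # Other platforms: fall back to the total number of cores.
        return os.cpu_count() or 1


print(available_cpus())
```

On a shared workstation, consider passing a smaller value to --cpus than the number reported here.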

On CPU cluster

To run on the CPU cluster, you need to create a submission script. A basic example that can be used as a starting point is located at:

/tungstenfs/scratch/gliberal/Code/Common/Repositories/gliberal-scMultipleX/clusterme.sh

  1. Copy clusterme.sh to your own folder, for example MY_SAVE_DIR
    cp -t /tungstenfs/scratch/gliberal/Users/MY_SAVE_DIR /tungstenfs/scratch/gliberal/Code/Common/Repositories/gliberal-scMultipleX/clusterme.sh
  2. Edit clusterme.sh with your favorite editor and change the config_path and --job-name options
  3. SSH to vcl1043
    ssh USERNAME@vcl1043.fmi.ch
  4. Navigate to the folder containing clusterme.sh using the cd command
    cd /tungstenfs/scratch/gliberal/Users/MY_SAVE_DIR
  5. Run sbatch clusterme.sh

scMultipleX Output

Feature extraction output:

  • z_pos_scaled: z-centroid scaled by the z-spacing anisotropy
  • z_pos_img: z-centroid without scaling, matches spacing of raw and label image
  • volume_pix: volume of object, taking into account z-spacing anisotropy
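
The relationship between these three features can be illustrated on a toy 3D mask. The sketch below assumes that the anisotropy factor is the ratio of z-spacing to in-plane spacing and that it scales both the z-centroid and the voxel count; the exact formulas used by scMultipleX are not stated on this page, so treat this as an interpretation rather than a reference implementation.

```python
import numpy as np


def z_features(mask, spacing):
    """Compute z-centroid and volume features for one 3D object mask.

    spacing is (Z, Y, X) in um/pix; Y and X spacing are assumed equal.
    """
    z_spacing, _y_spacing, x_spacing = spacing
    anisotropy = z_spacing / x_spacing  # assumed definition
    z_coords = np.nonzero(mask)[0]      # z index of every object voxel
    z_pos_img = float(z_coords.mean())
    return {
        "z_pos_img": z_pos_img,
        "z_pos_scaled": z_pos_img * anisotropy,
        "volume_pix": float(mask.sum() * anisotropy),
    }


# Toy object: 2 z-planes of 3x3 pixels, with z-spacing twice the xy-spacing.
mask = np.zeros((4, 5, 5), dtype=bool)
mask[1:3, 1:4, 1:4] = True

features = z_features(mask, spacing=(1.0, 0.5, 0.5))
print(features)
```

With these toy values the object has 18 voxels and an unscaled z-centroid of 1.5; scaling by the anisotropy factor of 2 expresses both quantities in units of in-plane pixels.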