Signature Analysis - PSMB5 and generic resistance signatures #58

Open
gwaybio opened this issue Mar 17, 2020 · 5 comments
Labels: Discussion and Notes (documenting ideas/discussions); Experiments (tracking experimental questions, results, or analysis)

Comments


gwaybio commented Mar 17, 2020

In #57 I add a signature analysis. Its purpose is to identify morphology features that differ significantly between wild-type and resistant clones. The next step is to apply the signatures to other profiles to 1) validate the approach and 2) predict the resistance status of different samples.

This analysis was prompted by #49.

I will describe the approach, results, and conclusions in this issue.

gwaybio added the "Discussion and Notes" and "Experiments" labels on Mar 17, 2020

gwaybio commented Mar 17, 2020

Hypothesis

We can identify morphology features that distinguish resistant from wild-type clones

Data

We use two different (aggregated) datasets

  1. Batch 1 (20X) and Batch 2
    • Clone A and E (confirmed PSMB5 mutations)
  2. Batch 5, 6, 7
    • "Four Clone" (we have four resistant and four wild-type clones)
    • DMSO and 0.7 Bortezomib (I think this is the lowest dose?)

Approach

  • Only select untreated samples
  • Build a linear model to explore which metadata features contribute to observed variation
  • Metadata: Batch (plate), Cell Line, Clone Type
    • Should we include well?
  • Select morphology features that have a limited contribution to certain variables
  • Remove features that have high batch and inter-individual contributions
  • Isolate the effect of Clone Type
  • Select features that have significant Clone Type differences by Tukey’s HSD
  • Repurpose single-sample gene set enrichment (ssGSEA) methods
    • Gives an enrichment score per profile
    • Expect certain clones to have high scores, others to not
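
The feature-selection steps above can be sketched roughly as follows. This is a simplified stand-in, not the analysis from #57: instead of a joint linear model it computes a marginal one-way effect size (eta squared) for each metadata factor separately, and the Tukey's HSD step is not shown. The function names, the `batch` and `clone_type` keys, and both thresholds are hypothetical. Cell line is left out of the filter because each clone belongs to exactly one clone type, so separating the two contributions needs a joint model.

```python
from collections import defaultdict
from statistics import mean


def eta_squared(values, groups):
    """Fraction of the total variance in `values` explained by the labels in `groups`."""
    grand_mean = mean(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        return 0.0
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)
    ss_between = sum(
        len(members) * (mean(members) - grand_mean) ** 2
        for members in by_group.values()
    )
    return ss_between / ss_total


def select_features(profiles, metadata, max_batch=0.3, min_clone_type=0.3):
    """Keep features with a small batch effect and a large clone type effect.

    profiles: {feature_name: [value per profile]}
    metadata: {"batch": [...], "clone_type": [...]} aligned with the profiles
    Both thresholds are illustrative assumptions, not values from the analysis.
    """
    selected = []
    for feature, values in profiles.items():
        batch_effect = eta_squared(values, metadata["batch"])
        clone_effect = eta_squared(values, metadata["clone_type"])
        if batch_effect <= max_batch and clone_effect >= min_clone_type:
            selected.append(feature)
    return selected
```

In practice a joint model with all metadata terms would be preferable, since marginal effect sizes can overlap when factors are correlated.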

Limitations

  • Small amount of data for generating the PSMB5 signature
    • Currently only using untreated samples
    • Perhaps we can increase the amount of data by including low dose in building signatures
  • Clones A and E only represent PSMB5 mutations
    • Core resistance machinery vs. specific to PSMB5?


gwaybio commented Mar 17, 2020

Signature Results

Linear Model

Clone A/E

[Figure: cloneAE_anova_effect_term_distributions_cutoff]

[Figure: cloneAE_tukey_volcano]

[Figure: cloneAE_signature_feature_interpret]

Four Clone

[Figure: fourclone_anova_effect_term_distributions]

[Figure: fourclone_tukey_volcano]

[Figure: fourclone_signature_feature_interpret]


gwaybio commented Mar 17, 2020

Signature Results

Applying Signatures

Here, we repurpose singscore, which is a rank-based ssGSEA-like method that generates a composite score of high and low features.
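
To make the idea of a rank-based composite score concrete, here is a minimal sketch in the spirit of singscore. It is not the singscore package itself and does not reproduce its exact normalization or tie handling; the function names and the way a profile is represented (a dict of feature values) are assumptions for illustration.

```python
from statistics import mean


def rank_fractions(profile, descending=False):
    """Map each feature to its rank within the profile, divided by the feature count.

    With descending=False the lowest value gets rank 1; with descending=True
    the highest value gets rank 1. Ties are broken by insertion order.
    """
    ordered = sorted(profile, key=profile.get, reverse=descending)
    n = len(ordered)
    return {feature: (i + 1) / n for i, feature in enumerate(ordered)}


def signature_score(profile, up_features, down_features):
    """Composite score: high when up_features rank high and down_features rank low."""
    up_ranks = rank_fractions(profile)
    down_ranks = rank_fractions(profile, descending=True)
    # Center each part around zero so that unrelated feature sets score near 0.
    up_score = mean(up_ranks[f] for f in up_features) - 0.5
    down_score = mean(down_ranks[f] for f in down_features) - 0.5
    return up_score + down_score
```

Because the score depends only on ranks within a single profile, it can be computed for one sample at a time, which is what makes it usable on profiles that were not part of signature building.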

Apply PSMB5 Signature to Four Clone

[Figure: psmb5_signature_apply_fourclone]

Apply "Generic" Signature to Clone A/E

[Figure: generic_resistance_signature_apply_cloneAE]


gwaybio commented Mar 24, 2020

In general, the resistance signature score appears to decrease with increasing dose. However, the size of this decrease depends on clone type (resistant vs. wild-type).

Alternative Plotting Strategy

[Figure: generic_resistance_signature_apply_cloneAE_xaxis_dosage]


shntnu commented Mar 24, 2020

Here are some snippets that may be relevant when we ponder the question of stratifying our pseudo-bulk analysis:

From https://osca.bioconductor.org/multi-sample-comparisons.html#differential-expression-between-conditions

  • Collapsing cells into samples reflects the fact that our biological replication occurs at the sample level (Lun and Marioni 2017). Each sample is represented no more than once for each condition, avoiding problems from unmodelled correlations between samples. Supplying the per-cell counts directly to a DE analysis pipeline would imply that each cell is an independent biological replicate, which is not true from an experimental perspective. (A mixed effects model can handle this variance structure but involves extra statistical and computational complexity for little benefit, see (???).)
  • Variance between cells within each sample is masked, provided it does not affect variance across (replicate) samples. This avoids penalizing DEGs that are not uniformly up- or down-regulated for all cells in all samples of one condition. Masking is generally desirable as DEGs - unlike marker genes - do not need to have low within-sample variance to be interesting, e.g., if the treatment effect is consistent across replicate populations but heterogeneous on a per-cell basis. (Of course, high per-cell variability will still result in weaker DE if it affects the variability across populations, while homogeneous per-cell responses will result in stronger DE due to a larger population-level log-fold change. These effects are also largely desirable.)
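
As a concrete picture of "collapsing cells into samples", the sketch below aggregates per-cell measurements into one profile per sample. Using the median is an assumption here (the quoted text concerns expression counts, which are typically summed), and the `sample` key and function name are hypothetical.

```python
from collections import defaultdict
from statistics import median


def aggregate_cells(cells, sample_key="sample"):
    """Collapse per-cell records into one median profile per sample.

    cells: list of dicts, each holding a sample identifier under `sample_key`
    plus numeric feature values under any other keys.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for cell in cells:
        sample = cell[sample_key]
        for name, value in cell.items():
            if name != sample_key:
                grouped[sample][name].append(value)
    # One value per feature per sample, so each sample counts once downstream.
    return {
        sample: {name: median(values) for name, values in features.items()}
        for sample, features in grouped.items()
    }
```

After this step every sample contributes a single profile, so downstream tests treat samples, not cells, as the replicates.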
