plot_diagnostics: Too significant -log10(p-values) become inf and cause error in hclust(dist()) #21

Open
bednarsky opened this issue Nov 18, 2024 · 3 comments · May be fixed by #22
Labels
bug Something isn't working

Comments

@bednarsky

bednarsky commented Nov 18, 2024

  • In my case, cell type appears to be a very strong confounder:
> p_values
                        PC1\n(35.9%) PC2\n(19.1%)   PC3\n(4.6%)   PC4\n(3.5%)
age                     2.245190e-03 7.340917e-01  6.734626e-01  9.367418e-01
n_cells_pseudobulked    5.389489e-34 1.300771e-06  3.321595e-17  4.874364e-07
cell_type               0.000000e+00 0.000000e+00 1.571575e-314 1.635357e-321
  • The smallest p-values underflow to exactly 0 in R's double-precision arithmetic, so -log10() returns Inf and hclust(dist()) fails
  • Suggested fix: clamp all p-values smaller than 1e-100 to 1e-100
# Combine p-values
p_values <- rbind(p_values_numeric, p_values_categorical)
+ p_values[p_values < 1e-100] <- 1e-100

# Adjust p-values for multiple testing
p_values_adjusted <- p.adjust(as.vector(p_values), method = "BH")
p_values_adjusted <- matrix(p_values_adjusted, nrow = nrow(p_values), ncol = ncol(p_values))
rownames(p_values_adjusted) <- rownames(p_values)
colnames(p_values_adjusted) <- colnames(p_values)

# Transform p-values to -log10(p-values)
log_p_values <- -log10(p_values_adjusted)
  • Bonus: this also removes these extreme values from the color scale, so other significant confounders no longer look unimportant
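The failure and the proposed clamp can be reproduced with a small sketch. The matrix below uses made-up toy values loosely modeled on the table above (not the real data), and it skips the p.adjust() step for brevity; the variable names `log_p`, `d`, `res`, `p_clamped`, and `hc` are illustrative only.

```r
# Toy p-value matrix (hypothetical values, NOT the real data from this issue)
p_values <- matrix(
  c(2e-03, 7e-01, 7e-01,  9e-01,
    5e-34, 1e-06, 3e-17,  5e-07,
    0,     0,     1e-300, 1e-305),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("age", "n_cells_pseudobulked", "cell_type"),
                  paste0("PC", 1:4))
)

# -log10(0) is Inf, so the zero p-values become infinite
log_p <- -log10(p_values)
any(is.infinite(log_p))

# The infinite entries propagate into the pairwise distances
d <- dist(log_p)
any(!is.finite(d))

# hclust() rejects non-finite distances; capture the error message
# instead of stopping the script
res <- tryCatch(hclust(d), error = function(e) conditionMessage(e))

# Proposed fix from this issue: clamp everything below 1e-100
p_clamped <- p_values
p_clamped[p_clamped < 1e-100] <- 1e-100

# All transformed values are now finite and clustering succeeds
hc <- hclust(dist(-log10(p_clamped)))
```

With the clamp in place, the largest possible value before multiple-testing adjustment is 100, which is also why the color scale becomes less dominated by the extreme entries.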
@sreichl
Collaborator

sreichl commented Nov 18, 2024

Thanks for the suggested fix. 1e-100 is arbitrary; can we use something less arbitrary, i.e. either the real minimum or the minimum within the data excluding 0 (that's what I do when plotting 0 p-values in volcano plots)? We should definitely not fix 100 in the color scale of the heatmap, as suggested by @lattaai12, because the scale should stay data-driven with its middle at 0.
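A minimal sketch of the data-driven alternative, replacing zeros with the smallest non-zero p-value in the matrix. The helper name `replace_zero_p` is hypothetical and not part of the package, and the fallback to `.Machine$double.xmin` when no positive value exists is an assumption about what "the real min" could mean here.

```r
# Hypothetical helper: replace exact zeros with the smallest non-zero
# p-value found in the data, keeping dimensions and dimnames intact.
replace_zero_p <- function(p) {
  positive <- p[!is.na(p) & p > 0]
  # Assumption: if there is no positive value at all, fall back to the
  # smallest normalized double that R can represent
  floor_value <- if (length(positive) > 0) min(positive) else .Machine$double.xmin
  p[which(p == 0)] <- floor_value
  p
}

# Toy example (made-up values, not the data from this issue)
p_toy <- matrix(
  c(2e-03, 0,
    7e-01, 0,
    6e-01, 1e-300,
    9e-01, 1e-305),
  nrow = 2,
  dimnames = list(c("age", "cell_type"), paste0("PC", 1:4))
)

p_fixed <- replace_zero_p(p_toy)
log_p_fixed <- -log10(p_fixed)
all(is.finite(log_p_fixed))
```

Because the floor comes from the data itself, the heatmap scale stays data-driven; the replacement would need to happen before p.adjust(), in the same position as the 1e-100 line in the original suggestion.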

@sreichl sreichl added the bug Something isn't working label Nov 18, 2024
@bednarsky
Author

Yes, sounds good. I will open a PR after testing.

bednarsky added a commit to bednarsky/spilterlize_integrate that referenced this issue Nov 18, 2024