Fix bug with categorical coloring in plot_cells_3d #335

Open
wants to merge 8 commits into
base: master


@t-carroll t-carroll commented May 1, 2020

When passing a categorical column of colData(cds) as the color_cells_by argument to plot_cells_3d, the color scheme defaults to NA, resulting in no colored points being plotted (just a blank trajectory shows up). Warning messages appear as follows:
In colScale(as.character(x)) : Some values were outside the colour scale and will be treated as NA

Setting names appropriately to the color palette in the relevant section of code seems to fix this problem, which I suggest in this pull request. See attached images of the problem before and after this change. Note that for some reason the warning messages persist, but the problem nevertheless seems to be corrected (perhaps this link could shed some more light on the problem? Although I'll probably stop delving into this now that I've found a workable solution for my use case).
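
To illustrate why the names matter, here is a small base R sketch. It is only an analogy: it assumes colors are looked up by category name, and the `status` values and hex codes are made up for the example, not taken from the monocle3 or plotly source.

```r
# Hypothetical category labels, standing in for a colData column such as "Status".
status <- c("control", "treated", "control", "treated")

# An unnamed palette: looking colors up by category name finds nothing.
pal <- c("#66C2A5", "#FC8D62")
unnamed_lookup <- unname(pal[status])   # every entry is NA

# Naming the palette after the categories makes the same lookup succeed.
names(pal) <- sort(unique(status))
named_lookup <- unname(pal[status])

print(unnamed_lookup)
print(named_lookup)
```

With the unnamed palette every cell ends up with an NA color, which matches the symptom of a blank trajectory with no colored points.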

Before fix: [image: Problem1]
After fix: [image: problem_fix]

Both generated with the same commands, e.g.
x = plot_cells_3d(cds, color_cells_by = "Status", reduction_method = "UMAP")
htmlwidgets::saveWidget(x, "test.html")

All the best,
Tom

@t-carroll changed the title from "FIx bug with categorical coloring in plot_cells_3d" to "Fix bug with categorical coloring in plot_cells_3d" on May 1, 2020
@jietian-327

I had the same problem, with the warning "Some values were outside the colour scale and will be treated as NA". I used color_cells_by = "cell.type" and it only returned a blank trajectory.

@t-carroll
Author

> I had the same problem, with the warning "Some values were outside the colour scale and will be treated as NA". I used color_cells_by = "cell.type" and it only returned a blank trajectory.

@jietian-327 if you need a quick fix to generate the 3D plot, you could reinstall using my monocle3 fork:
devtools::install_github("t-carroll/monocle3")

Just make sure to use a different virtual environment, or reinstall your current monocle3 version after you're done making 3D plots with this fork. My fork is not up-to-date with the latest version of monocle3.

@koldam

koldam commented Sep 9, 2021

@t-carroll, I am so glad I found your fix and your recent comments directed to @jietian-327. Using your monocle3 fork, the workaround works like a charm. However, I run into a problem when I want to show more than 8 categorical groups. I believe this is due to the size limit of the color palette: if I found the correct info, most RColorBrewer palettes have about 12 colors at most. Visualizing up to 8 categorical groups works. With more than 8 groups there is an error:

"n too large, allowed maximum for palette Set2 is 8"

Do you think a workable solution for my case is possible, so that RColorBrewer could automatically use more than one set and/or merge them? If so, would you be able to update your monocle3 fork with such a fix?

I would be very grateful for any feedback. Thank you in advance.

Changing default behaviour for color_palette argument of plot_cells_3d to work with more than 8 colors (without explicitly specifying palette).
@t-carroll
Author

t-carroll commented Sep 9, 2021

Hey there @koldam, glad it helped! I made another quick tweak in another branch so it should work for more than 8 colors now by default. I haven't tested it myself yet, can you try reinstalling from this new branch and see if it works for you? I also now changed it so it's up-to-date with the latest version of monocle3. Here's how:
devtools::install_github("t-carroll/monocle3", ref = "tcarroll-plotdevel")

Also, there is a color_palette argument (in all versions, default NULL) where you can specify your own palette. This might be helpful if you have a lot of categories, as the solution I used (colorRampPalette to expand out the brewer.pal-generated palette) may generate colors that are hard to distinguish from adjacent categories with large palettes. So you could pass your own color palette to this argument, either by concatenating two RColorBrewer palettes like you suggest or just manually providing a list of colors. See here for some more ideas.
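
As a rough sketch of the two options mentioned above (expanding a palette with colorRampPalette, or concatenating two palettes), something like the following should work in base R. The Set2 and Dark2 hex codes are written out by hand so the snippet runs without RColorBrewer, and `n_groups` is a made-up number of categories; this is not the exact code used in the fork.

```r
# The eight colors of RColorBrewer's Set2, written out so this runs without the package.
set2 <- c("#66C2A5", "#FC8D62", "#8DA0CB", "#E78AC3",
          "#A6D854", "#FFD92F", "#E5C494", "#B3B3B3")

n_groups <- 20  # hypothetical number of categories

# Option 1: colorRampPalette() returns a function that interpolates between
# the given colors, so it can produce any number of colors.
expand_palette <- grDevices::colorRampPalette(set2)
pal <- expand_palette(n_groups)

# Option 2: concatenate two qualitative palettes (here Set2 plus Dark2) to get
# 16 colors that are not interpolated.
dark2 <- c("#1B9E77", "#D95F02", "#7570B3", "#E7298A",
           "#66A61E", "#E6AB02", "#A6761D", "#666666")
combined <- c(set2, dark2)

print(pal)
print(combined)
```

Interpolated colors sit close to their neighbours, so with many groups the concatenated or hand-picked palette may be easier to read.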

If you use the color_palette argument I think you will need to name your custom palette with the same names as your category variable. Quick example with base R rainbow palette, assuming you have 20 groups named A through T:

pal = rainbow(20)
names(pal) = LETTERS[1:20]
plot_cells_3d(..., color_palette = pal)
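
Before plotting, it may be worth checking that every category actually has a matching palette name. A minimal sketch, where `missing_palette_names` is a hypothetical helper written for this example (not part of monocle3) and `groups` stands in for the colData column:

```r
# Hypothetical helper: returns the categories that have no matching palette name.
missing_palette_names <- function(categories, palette) {
  setdiff(unique(as.character(categories)), names(palette))
}

groups <- LETTERS[1:20]   # stand-in for the 20 groups A through T
pal <- rainbow(20)
names(pal) <- LETTERS[1:20]

missing_palette_names(groups, pal)           # character(0): every group has a color
missing_palette_names(c(groups, "U"), pal)   # "U": this group has no color assigned
```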

Hope that helps, and please let me know if that fork works!

@t-carroll
Author

t-carroll commented Sep 9, 2021

also @koldam @jietian-327 were the colData columns you passed to color_cells_by character vectors or factors? i.e. if you passed cell.type, does class(cds@colData$cell.type) return character or factor? Just narrowing this down a bit more

Updating plot_cells_3d so it handles character vectors and factors similarly (by coercing to factor). This prevents generation of a superfluous NA category in the category legend, while preserving user-provided factor levels in the legend (if they have these in their colData variables).
Updated master branch to match monocle3's latest version. Also included further upgrades to plot_cells_3d functionality: allow more than 8 colors in default behaviour (some versions of RColorBrewer throw an error when more than 8 are given), prevent a superfluous NA from being added to the legend if color_cells_by is a character vector, and preserve factor order if color_cells_by is a factor.
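
For anyone wondering what coercing to factor does here, a small base R sketch with made-up values (this only shows how factor() behaves, not the actual plot_cells_3d code):

```r
# A character column: factor() sorts the levels alphabetically.
chr_col <- c("late", "early", "mid", "early")
chr_levels <- levels(factor(chr_col))

# A factor column with a user-chosen order: factor() keeps that order.
fct_col <- factor(chr_col, levels = c("early", "mid", "late"))
fct_levels <- levels(factor(fct_col))

# Missing values stay NA and do not become a level of their own.
na_levels <- levels(factor(c("early", NA, "late")))

print(chr_levels)
print(fct_levels)
print(na_levels)
```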
@koldam

koldam commented Sep 10, 2021

@t-carroll, thank you for the fast response :). Sorry that I left my previous message without any details about the type of data or a reproducible example. To answer your question, class(cds@colData$tumor_type) returns factor.

Your trick works for me out of the box, this is super cool! In the meantime, I was playing around with the previous fork and noticed that the color_palette argument is indeed useful. It worked with your previous fork and I can reproduce it in the newest one, so in case somebody gets an error when visualizing a large number of groups, you can try this:

Define your palette before calling plot_cells_3d, that is:

cell_type_color <- c("BLCA" = "#F8766D",
                     "BRCA" = "#F27D53",
                     "CESC" = "#EA8331")

Here BLCA, BRCA and CESC are the names of the categorical groups (i.e. the values in the relevant column of cds@colData).
Of course this is just an example; in my case I had more than 20 groups, each of which needed a specific color. Add as many colors as you have categorical groups in that column.
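
If typing out 20+ hex codes by hand is tedious, the named palette can probably be generated from the factor levels instead. A minimal sketch in base R, where `tumor_type` is a made-up stand-in for the colData column and the hue settings are just one common ggplot2-like recipe:

```r
# Hypothetical stand-in for a colData column such as cds@colData$tumor_type.
tumor_type <- factor(c("BLCA", "BRCA", "CESC", "BRCA", "BLCA"))

groups <- levels(tumor_type)
n <- length(groups)

# Evenly spaced hues around the color wheel, one per group.
hues <- seq(15, 375, length.out = n + 1)[seq_len(n)]

# Name each color after its group so it can be passed as color_palette.
cell_type_color <- setNames(grDevices::hcl(h = hues, c = 100, l = 65), groups)

print(cell_type_color)
```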

Then, if you apply:

plot_cells_3d(
  cds_sub,
  dims = c(1, 2, 3),
  color_cells_by = "tumor_type",
  cell_size = 100,
  color_palette = cell_type_color,
  reduction_method = c("UMAP"))

This should bring up the proper plot. Note that I made the 3D plot for only a specific subset of my whole cell data set (cds_sub), but this should work the same for the full cds.

@t-carroll, while I have the chance, can I ask one off-topic question? I was looking for a way to try the plot_markers_cluster function of older Monocle releases, which was replaced by plot_genes_by_group (as mentioned in here). IIRC, this was a thing in early Monocle3, i.e. alpha, section "Visualizing marker expression across cell clusters". Do you think it would be possible to merge this function into your fork? Personally, I would like to try a heatmap instead of what comes from plot_markers_by_group. I know Cole Trapnell mentioned in the above links that we can try this, but I would still like to play with the previous function a bit. Also, do you think it would be possible to visualize markers across categorical variables and not clusters? I think this was possible in plot_markers_by_group. Thank you once again Tom.

@t-carroll
Author

t-carroll commented Sep 10, 2021

Great, glad that's helped @koldam! have merged those changes into my master branch now, so plain old devtools::install_github("t-carroll/monocle3") should work for anyone else having these issues.

For the gene-cluster heatmap, I unfortunately don't know enough about how that works (just dug into the monocle3 code for the 3D plot because I was having issues on my own dataset). From past experience trying to bring in old functions from monocle2/3a (RIP plot_genes_branched_pseudotime), it's pretty difficult and I usually just end up using the older version or another package. If I wanted that sort of heatmap I'd probably just create a Seurat object with my monocle data and use DoHeatmap, which has a similar goal. Here's how that would work:

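library(monocle3)  # for normalized_counts(); cds is a monocle3 cell_data_set
library(Seurat)
# Build a Seurat object from the monocle3 counts and cell metadata
seu = CreateSeuratObject(counts = assay(cds),
                         meta.data = as.data.frame(colData(cds)))
# Store monocle3's normalized counts in the data slot (Seurat v3/v4 assay layout)
seu@assays$RNA@data = normalized_counts(cds)
seu = ScaleData(seu)
## make sure you don't have any NAs in your grouping variable, or subset the Seurat object you pass to DoHeatmap to remove them
## markers: a character vector of gene names to plot, defined beforehand
DoHeatmap(seu, group.by = "cell.type", size = 4,
          features = markers) +
  viridis::scale_fill_viridis(option = "A")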

This should preserve all of the counts/normalized counts (data) and metadata from your monocle object. Scaling is often done prior to heatmap visualization, which I do here on monocle's normalized counts using Seurat's ScaleData. That may not be technically correct (disclaimer, not associated with either package), but you could also play around with other scaling functions to manually make the scale.data slot. Or if you want to plot monocle's normalized count data directly instead, just add slot = "data" to the DoHeatmap call. Also I like using viridis color schemes here (try options A-H), but you can get the default purple-yellow color scheme from Seurat by deleting the plus sign and everything after it. This code makes this sort of plot (from Monocle vignette data):

image

For other special requests maybe raise an issue over on my branch or shoot me a message just to avoid cluttering this PR! Good luck :)

@t-carroll
Author

t-carroll commented Sep 10, 2021

Hi @ctrapnell @brgew @hpliner-
If helpful, I have made some minor tweaks to plot_cells_3d to address an issue a few of us were having with passing factors from colData to color_cells_by, where no cells would show up (see top). In case it's helpful, I'm attaching some minimal code that can reproduce this problem on the Packer & Zhu dataset from the vignette, and can show that this PR fixes it. This also would fix an NA that would show up at the bottom of the legend for character vectors, and expands out the color scheme beyond 8 if needed by default. Any q's please let me know :)

monocle3_PR.R.txt

Adding default color scheme tweak (expanding to more than 8 colors by default) to all plot_cells_3d options
@koldam

koldam commented Sep 10, 2021

@t-carroll, my pleasure :). Regarding my off-topic question, thank you for the hint about Seurat. I think this will probably meet my needs, but I need to play around a bit to get the desired visualization. This code will definitely help, thank you once again. Yes, I will stop cluttering here now; in case of any problems, I will contact you via your branch or by message. Stay safe and healthy!
