Updated analysis: Unified color palette for plots #510
Comments
Tagging @sjspielman to weigh in about the color palette.
A strategy like this was suggested to me for the color palette: use an existing smaller palette that is colorblind-friendly and reasonably easy to distinguish, and then use a different gradient within each color. This would work out well if we have some kind of higher-level grouping for subtypes? https://stackoverflow.com/questions/50163072/different-colors-with-gradient-for-subgroups-on-a-treemap-ggplot2-r/50164882#50164882
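A rough sketch of the "small base palette plus per-group gradients" idea, using only base R. The group names, subtype counts, and the choice of three Okabe-Ito hex codes are assumptions for illustration, not colors anyone in this thread has agreed on:

```r
# Base hues: three Okabe-Ito colors (a colorblind-friendly set).
# Group names are hypothetical placeholders, not real histology labels.
base_colors <- c(
  "Group A" = "#E69F00",
  "Group B" = "#56B4E9",
  "Group C" = "#009E73"
)

# Hypothetical number of subtypes within each higher-level group
n_subtypes <- c("Group A" = 3, "Group B" = 2, "Group C" = 4)

# Build a light-to-full-strength gradient for one base color
make_gradient <- function(base_hex, n) {
  ramp <- grDevices::colorRampPalette(c("white", base_hex))(n + 1)
  ramp[-1] # drop the pure white end so every subtype is visible
}

# One gradient per group, named by group
subtype_colors <- Map(make_gradient, base_colors, n_subtypes[names(base_colors)])
```

Each element of `subtype_colors` is then a vector of hex codes that could be assigned to the subtypes within that group.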
Wow, that S.O. post makes me want to consider reworking our treemaps in
Okay, this is not exactly what we want, but it is something I did in the past (and mentioned before in person) to generate a lot of colors that are pretty distinguishable. This example has 49 colors, which is definitely pushing it. It could be a place to start: `colorscheme = hsv(h = 1:49/49 * .85, v = c(.8,1,1), s = c(1,1, .6))`
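For anyone who wants to try the `hsv()` approach, here is the same call written out as a small runnable sketch. The variable `n_colors` and the preview step are additions for illustration; the hue, saturation, and value settings are the ones from the comment above. The shorter `s` and `v` vectors are recycled across the 49 hues, so neighboring hues alternate in brightness and saturation:

```r
n_colors <- 49 # number of colors from the example above

colorscheme <- hsv(
  h = seq_len(n_colors) / n_colors * 0.85, # stop at 0.85 so hues do not wrap back to red
  s = c(1, 1, 0.6),                        # recycled: every third color is less saturated
  v = c(0.8, 1, 1)                         # recycled: every third color is slightly darker
)

# Quick visual check of the palette (only when run interactively)
if (interactive()) {
  barplot(rep(1, n_colors), col = colorscheme, border = NA, axes = FALSE)
}
```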
Ooh, I just found http://phrogz.net/css/distinct-colors.html, which allowed me to generate this set. Not perfect, but not bad... Dropping some of the dark colors would help, I expect.
@dvenprasad and I chatted a bit about color palettes. Here are the colorsets I believe we need:
@dvenprasad also found these two tools that we can use to poke around:
Next steps:
With #622 merged, we are ready to update figures to the unified color palette (See the README in
* Make sample distribution plot publication ready
  - save treemap plot
  - rerun module
  - add `figures/pngs` and `figures/scripts` directories to hold pub ready plots and scripts
* Create and incorporate treemap into figure1
* Install `treemapify` package on docker
* add series of `dplyr::` as needed
* Add figure generation script to .circleCI
  - convert final png figure to pdf
  - add figure generation script to `run-figures.sh`
  - use color palette generated in PR #510 for figure color scheme
  - format treemap to have less redundant values (treemap as is in final figure 1 panel does not show redundant values and represents the `short_histology` and `integrated_diagnosis` values which I believe should be fine in this case)
  - redundant text does show up on the treemap plot in `analyses/sample-distribution-analysis/plots` directory (still looking into this)
* update branch and rerun plots
* Fix represented proportions on treemap figures
  - attempt 1 to fix overlapping text
* Add `dplyr::` to distinct and update comment
* Use `broad_histology` instead of `short_histology` in treemap
  - change around the placement of labels and sizes to try to eliminate text overlapping in plot
  - rerun plot
* remove old figure shell script
* add `scale_fill_identity` argument and rerun plot
* Decrease text size to try getting rid of overlapping text
* add subplot labels per @jashapiro suggestion

  Co-Authored-By: jashapiro <[email protected]>
* rerun plot with subplot label added

Co-authored-by: jashapiro <[email protected]>
In #622, colors are defined for I am also unsure of the difference between As a minor clarification, I propose that the What I might like to see is something like the following for
Note: I will be filing a separate data issue about the
Yes, I wasn't sure how
Looks like NA is exclusively "non-tumor", which makes sense. But "Other" is a mix of unrelated things, as discussed in #647.
The oncoprint landscape plots in the This color palette contains hex codes for unique categories of SNVs, CNVs, and fusion data. It is being implemented in the PR getting the oncoprint landscape figure publication ready (WIP PR #666), and as @cansavvy noted in a review comment, it should probably be adjusted (made uniform) and incorporated into the color palette strategy.
We now have unified color palettes and their usage is documented here: https://github.com/AlexsLemonade/OpenPBTA-analysis/tree/master/figures#color-palette-usage
The majority of figures in
What analysis module should be updated and why?
All modules that have plots with colors.
We should probably prioritize plots that will be in the main document? But we probably want the unified color palette to also extend to non-main figures.
What changes need to be made? Please provide enough detail for another participant to make the update.
We should have a unified color palette. This helps interpretability and aesthetics.
The `simplecolors` R package has some helpful tools and a nice vignette: https://cran.r-project.org/web/packages/simplecolors/vignettes/intro.html
For `ggplot2` plots, colors can be designated using `scale_fill_manual` and `scale_color_manual`.
Which colors do we generally want to default to?
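As a minimal sketch of the `scale_fill_manual` / `scale_color_manual` approach mentioned above: a single named vector of hex codes is defined once and passed to both scales, so every plot that uses the same grouping variable gets the same colors. The group names, counts, and hex codes here are placeholders, not a proposed palette:

```r
library(ggplot2)

# Hypothetical group-to-color mapping; names must match the values in the data
group_colors <- c(
  "Group A" = "#E69F00",
  "Group B" = "#56B4E9",
  "Group C" = "#009E73"
)

# Toy data for illustration only
toy_df <- data.frame(
  group = names(group_colors),
  n = c(12, 7, 3)
)

p <- ggplot(toy_df, aes(x = group, y = n, fill = group, color = group)) +
  geom_col() +
  scale_fill_manual(values = group_colors) +   # fill colors looked up by name
  scale_color_manual(values = group_colors)    # outline colors looked up by name
```

Because the vector is named, the mapping does not depend on factor level order, which makes it easier to reuse across modules.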
`short_histology`, so having the colors for each group in particular would help readers follow along better. We can use a numeric approach to try to get ~36 colors as different as possible. I started implementing this
`colorblindr`'s palette selection for guidance on some variable color choices.
Here's an example of what I mean, but I haven't yet tested these colors:
`colorRamp` for these instances, but we should decide what hex codes/colors should be used (I'm not necessarily suggesting the ones I have below).
Modules with plots that will need to be color palette unified:
I've tagged myself on the modules whose palettes I will be responsible for updating; others can add themselves for other modules.
- `breaks_cdf_plot.png`, 3 heatmaps, tumor-type plots
- `cosmic/` and `nature/` plots, individual and grouped barplots
- `all_participants_` png plots)
- `survival_curve_gender.pdf`
When do you expect the revised analysis will be completed?
? We should also better refine which plots are the priority before we can make this call.