From 882f9d7b27f86ca0b74b05152ac5aab56f75a4f1 Mon Sep 17 00:00:00 2001 From: Jo Lynne Rokita Date: Fri, 28 Apr 2023 17:08:01 -0400 Subject: [PATCH] update figure commits to final --- content/03.results.md | 10 +++++----- content/08.supplemental.md | 24 ++++++++++++------------ 2 files changed, 17 insertions(+), 17 deletions(-) diff --git a/content/03.results.md b/content/03.results.md index cc493bf4..f2b4a02f 100644 --- a/content/03.results.md +++ b/content/03.results.md @@ -19,7 +19,7 @@ We used the continuous integration (CI) service CircleCI® to run analytical We followed a similar process in our Manubot-powered [@doi:10.1371/journal.pcbi.1007128] repository for proposed manuscript additions (**Figure {@fig:Fig1}C**); peer reviewers ensured clarity and scientific accuracy, and Manubot performed spell-checking. -![**Overview of the OpenPBTA Project.** A, CBTN and PNOC collected tumors from 943 patients. 22 tumor cell lines were created, and over 2000 specimens were sequenced (N = 1035 RNA-Seq, N = 940 WGS, and N = 32 WXS or targeted panel). The Kids First Data Resource Center Data harmonized the data using Amazon S3 through CAVATICA. Panel created with [BioRender.com](biorender.com). B, Number of biospecimens across phases of therapy, with one broad histology per panel. Each bar denotes a cancer group. (Abbreviations: GNG = ganglioglioma, Other LGG = other low-grade glioma, PA = pilocytic astrocytoma, PXA = pleomorphic xanthoastrocytoma, SEGA = subependymal giant cell astrocytoma, DIPG = diffuse intrinsic pontine glioma, DMG = diffuse midline glioma, Other HGG = other high-grade glioma, ATRT = atypical teratoid rhabdoid tumor, MB = medulloblastoma, Other ET = other embryonal tumor, EPN = ependymoma, PNF = plexiform neurofibroma, DNET = dysembryoplastic neuroepithelial tumor, CRANIO = craniopharyngioma, EWS = Ewing sarcoma, CPP = choroid plexus papilloma). C, Overview of the open analysis and manuscript contribution models. 
Contributors proposed analyses, implemented it in their fork, and filed a pull request (PR) with proposed changes. PRs underwent review for scientific rigor and accuracy. Container and continuous integration technologies ensured that all software dependencies were included and code was not sensitive to underlying data changes. Finally, a contributor filed a PR documenting their methods and results to the Manubot-powered manuscript repository for review. D, A potential path for an analytical PR. Arrows indicate revisions.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/37ec62fdc2fd9ff157f2f2c10b69e9bb36673363/figures/pngs/figure1.png?sanitize=true){#fig:Fig1 width="7in"} +![**Overview of the OpenPBTA Project.** A, CBTN and PNOC collected tumors from 943 patients. 22 tumor cell lines were created, and over 2000 specimens were sequenced (N = 1035 RNA-Seq, N = 940 WGS, and N = 32 WXS or targeted panel). The Kids First Data Resource Center Data harmonized the data using Amazon S3 through CAVATICA. Panel created with [BioRender.com](biorender.com). B, Number of biospecimens across phases of therapy, with one broad histology per panel. Each bar denotes a cancer group. (Abbreviations: GNG = ganglioglioma, Other LGG = other low-grade glioma, PA = pilocytic astrocytoma, PXA = pleomorphic xanthoastrocytoma, SEGA = subependymal giant cell astrocytoma, DIPG = diffuse intrinsic pontine glioma, DMG = diffuse midline glioma, Other HGG = other high-grade glioma, ATRT = atypical teratoid rhabdoid tumor, MB = medulloblastoma, Other ET = other embryonal tumor, EPN = ependymoma, PNF = plexiform neurofibroma, DNET = dysembryoplastic neuroepithelial tumor, CRANIO = craniopharyngioma, EWS = Ewing sarcoma, CPP = choroid plexus papilloma). C, Overview of the open analysis and manuscript contribution models. Contributors proposed analyses, implemented it in their fork, and filed a pull request (PR) with proposed changes. 
PRs underwent review for scientific rigor and accuracy. Container and continuous integration technologies ensured that all software dependencies were included and code was not sensitive to underlying data changes. Finally, a contributor filed a PR documenting their methods and results to the Manubot-powered manuscript repository for review. D, A potential path for an analytical PR. Arrows indicate revisions.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/c8d07b36d0a2b4b36008312eca50604a47903cf9/figures/pngs/figure1.png?sanitize=true){#fig:Fig1 width="7in"} ### Molecular Subtyping of OpenPBTA CNS Tumors @@ -145,7 +145,7 @@ We observed that 25% (15/60) of ependymomas were _C11orf95::RELA_ (now, _ZFTA::R We observed somatic mutations or fusions in _NF2_ in 41% (7/17) of meningiomas, 5% (3/60) of ependymomas, and 25% (3/12) of schwannomas, as well as rare fusions in _ERBB4_, _YAP1_, and/or _QKI_ in 10% (6/60) of ependymomas. DNETs harbored alterations in MAPK/PI3K pathway genes, as was previously reported [@doi:10.1093/jnen/nlz101], including _FGFR1_ (21%, 4/19), _PDGFRA_ (10%, 2/19), and _BRAF_ (5%, 1/19). -![**Mutational landscape of PBTA tumors.** Frequencies of canonical somatic gene mutations, CNVs, fusions, and TMB (top bar plot) for the top mutated genes across primary tumors within the OpenPBTA dataset. A, LGGs (N = 226): pilocytic astrocytoma (N = 104), other LGG (N = 68), ganglioglioma (N = 35), pleomorphic xanthoastrocytoma (N = 9), subependymal giant cell astrocytoma (N = 10). B, Embryonal tumors (N = 129): medulloblastoma (N = 95), atypical teratoid rhabdoid tumor (N = 24), other embryonal tumor (N = 10). C, HGGs (N = 63): diffuse midline glioma (N = 36) and other HGG (N = 27). D, Other CNS tumors (N = 153): ependymoma (N = 60), craniopharyngioma (N = 31), meningioma (N = 17), dysembryoplastic neuroepithelial tumor (N = 19), Ewing sarcoma (N = 7), schwannoma (N = 12), and neurofibroma plexiform (N = 7). 
Rare CNS tumors are displayed in **Figure {@fig:S3}B**. Histology (`Cancer Group`) and sex annotations are displayed under each plot. Only tumors with mutations in the listed genes are shown. Multiple CNVs are denoted as a complex event. N denotes the number of unique tumors (one tumor per patient).](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/37ec62fdc2fd9ff157f2f2c10b69e9bb36673363/figures/pngs/figure2.png?sanitize=true){#fig:Fig2 width="9in"} +![**Mutational landscape of PBTA tumors.** Frequencies of canonical somatic gene mutations, CNVs, fusions, and TMB (top bar plot) for the top mutated genes across primary tumors within the OpenPBTA dataset. A, LGGs (N = 226): pilocytic astrocytoma (N = 104), other LGG (N = 68), ganglioglioma (N = 35), pleomorphic xanthoastrocytoma (N = 9), subependymal giant cell astrocytoma (N = 10). B, Embryonal tumors (N = 129): medulloblastoma (N = 95), atypical teratoid rhabdoid tumor (N = 24), other embryonal tumor (N = 10). C, HGGs (N = 63): diffuse midline glioma (N = 36) and other HGG (N = 27). D, Other CNS tumors (N = 153): ependymoma (N = 60), craniopharyngioma (N = 31), meningioma (N = 17), dysembryoplastic neuroepithelial tumor (N = 19), Ewing sarcoma (N = 7), schwannoma (N = 12), and neurofibroma plexiform (N = 7). Rare CNS tumors are displayed in **Figure {@fig:S3}B**. Histology (`Cancer Group`) and sex annotations are displayed under each plot. Only tumors with mutations in the listed genes are shown. Multiple CNVs are denoted as a complex event. N denotes the number of unique tumors (one tumor per patient).](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/c8d07b36d0a2b4b36008312eca50604a47903cf9/figures/pngs/figure2.png?sanitize=true){#fig:Fig2 width="9in"} @@ -175,7 +175,7 @@ Finally, signatures 3, 8, 18, and MMR2 were prevalent in HGGs, including DMGs. 
-![**Mutational co-occurrence and signatures highlight key oncogenic drivers.** A, Nonsynonymous mutations for 50 most commonly-mutated genes across all histologies. "Other" denotes a histology with <10 tumors. B, Co-occurrence and mutual exclusivity of mutated genes. The co-occurrence score is defined as $I(-\log_{10}(P))$ where $P$ is Fisher's exact test and $I$ is 1 when mutations co-occur more often than expected or -1 when exclusivity is more common. C, Number of SV and CNV breaks are significantly correlated (Adjusted R = 0.443, p = 1.05e-38). D, Chromothripsis frequency across cancer groups with N >= 3 tumors. E, Sina plots of RefSig signature weights for signatures 1, 11, 18, 19, 3, 8, N6, MMR2, and Other across cancer groups. Boxplot represents 5% (lower whisker), 25% (lower box), 50% (median), 75% (upper box), and 95% (upper whisker) quantiles.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/37ec62fdc2fd9ff157f2f2c10b69e9bb36673363/figures/pngs/figure3.png?sanitize=true){#fig:Fig3 width="7in"} +![**Mutational co-occurrence and signatures highlight key oncogenic drivers.** A, Nonsynonymous mutations for 50 most commonly-mutated genes across all histologies. "Other" denotes a histology with <10 tumors. B, Co-occurrence and mutual exclusivity of mutated genes. The co-occurrence score is defined as $I(-\log_{10}(P))$ where $P$ is Fisher's exact test and $I$ is 1 when mutations co-occur more often than expected or -1 when exclusivity is more common. C, Number of SV and CNV breaks are significantly correlated (Adjusted R = 0.443, p = 1.05e-38). D, Chromothripsis frequency across cancer groups with N >= 3 tumors. E, Sina plots of RefSig signature weights for signatures 1, 11, 18, 19, 3, 8, N6, MMR2, and Other across cancer groups. 
Boxplot represents 5% (lower whisker), 25% (lower box), 50% (median), 75% (upper box), and 95% (upper whisker) quantiles.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/c8d07b36d0a2b4b36008312eca50604a47903cf9/figures/pngs/figure3.png?sanitize=true){#fig:Fig3 width="7in"} ### Transcriptomic Landscape of Pediatric Brain Tumors @@ -250,7 +250,7 @@ Higher _TP53_ scores were associated with significant survival risks (**Table S4 Given this result, we next assessed whether different HGG molecular subtypes carry different survival risks if stratified by _TP53_ status. We found that DMG H3 K28 tumors with _TP53_ loss had significantly worse prognosis (HR = 2.8, CI = 1.4-5.6, p = 0.003) than those with wildtype _TP53_ (**Figure {@fig:Fig4}G** and **Figure {@fig:Fig4}H**), recapitulating results from two recent retrospective analyses of DIPG tumors [@doi:10.1158/1078-0432.CCR-22-0803; @doi:10.1007/s11060-021-03890-9]. -![**_TP53_ and telomerase activity** A, Receiver Operating Characteristic for _TP53_ classifier run on stranded FPKM RNA-Seq. B, Violin and strip plots of _TP53_ scores plotted by _TP53_ alteration type (Nactivated = 11, Nlost = 100, Nother = 866). C, Violin and strip plots of _TP53_ RNA expression plotted by _TP53_ activation status (Nactivated = 11, Nlost = 100, Nother = 866). D, Boxplots of _TP53_ and telomerase (EXTEND) scores across cancer groups. TMB status is highlighted in orange (hypermutant) or red (ultra-hypermutant). E, Heatmap of RefSig mutational signatures for patients with at least one hypermutant tumor or cell line. F, Forest plot depicting prognostic effects of _TP53_ and telomerase scores on overall survival (OS), controlling for extent of tumor resection, LGG group, and HGG group. G, Forest plot depicting the effect of molecular subtype on HGG OS. Hazard ratios (HR) with 95% confidence intervals and p-values (multivariate Cox) are given in F and G.
Black diamonds denote significant p-values, and gray diamonds denote reference groups. H, Kaplan-Meier curve of HGGs by molecular subtype. Boxplot represents 5% (lower whisker), 25% (lower box), 50% (median), 75% (upper box), and 95% (upper whisker) quantiles.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/37ec62fdc2fd9ff157f2f2c10b69e9bb36673363/figures/pngs/figure4.png?sanitize=true){#fig:Fig4 width="7in"} +![**_TP53_ and telomerase activity** A, Receiver Operating Characteristic for _TP53_ classifier run on stranded FPKM RNA-Seq. B, Violin and strip plots of _TP53_ scores plotted by _TP53_ alteration type (Nactivated = 11, Nlost = 100, Nother = 866). C, Violin and strip plots of _TP53_ RNA expression plotted by _TP53_ activation status (Nactivated = 11, Nlost = 100, Nother = 866). D, Boxplots of _TP53_ and telomerase (EXTEND) scores across cancer groups. TMB status is highlighted in orange (hypermutant) or red (ultra-hypermutant). E, Heatmap of RefSig mutational signatures for patients with at least one hypermutant tumor or cell line. F, Forest plot depicting prognostic effects of _TP53_ and telomerase scores on overall survival (OS), controlling for extent of tumor resection, LGG group, and HGG group. G, Forest plot depicting the effect of molecular subtype on HGG OS. Hazard ratios (HR) with 95% confidence intervals and p-values (multivariate Cox) are given in F and G. Black diamonds denote significant p-values, and gray diamonds denote reference groups. H, Kaplan-Meier curve of HGGs by molecular subtype. 
Boxplot represents 5% (lower whisker), 25% (lower box), 50% (median), 75% (upper box), and 95% (upper whisker) quantiles.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/c8d07b36d0a2b4b36008312eca50604a47903cf9/figures/pngs/figure4.png?sanitize=true){#fig:Fig4 width="7in"} #### Histologic and oncogenic pathway clustering @@ -288,4 +288,4 @@ While adamantinomatous craniopharyngiomas and Group 3 and Group 4 medulloblastom Finally, we explored the potential influence of tumor purity by repeating selected transcriptomic analyses restricted to only samples with high tumor purity (see **STAR Methods**). Results from these analyses were broadly consistent (**Figure {@fig:S7}D-I**) with results derived from all stranded RNA-Seq samples. -![**Transcriptomic and immune landscape of pediatric brain tumors** A, First two dimensions of transcriptome data UMAP, with points colored by broad histology. B, Heatmap of GSVA scores for Hallmark gene sets with tumors ordered by cancer group. C, Boxplots of quanTIseq estimates of immune cell proportions in cancer groups with N > 15 tumors. Note: other HGGs and other LGGs have immune cell proportions similar to DMG and pilocytic astrocytoma, respectively, and are not shown. D, Forest plot depicting additive effects of _CD274_ expression, immune cell proportion, and extent of tumor resection on OS of medulloblastoma patients. HRs with 95% confidence intervals and p-values (multivariate Cox) are listed. Black diamonds denote significant p-values, and gray diamonds denote reference groups. Note: the Macrophage M1 HR was 0 (coefficient = -9.90e+4) with infinite upper and lower CIs, and thus was not included in the figure. E, Boxplot of _CD274_ expression (log2 FPKM) for medulloblastomas grouped by subtype. Bonferroni-corrected p-values from Wilcoxon tests are shown. Boxplot represents 5% (lower whisker), 25% (lower box), 50% (median), 75% (upper box), and 95% (upper whisker) quantiles. 
Only stranded RNA-Seq data is plotted.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/37ec62fdc2fd9ff157f2f2c10b69e9bb36673363/figures/pngs/figure5.png?sanitize=true){#fig:Fig5 width="7in"} +![**Transcriptomic and immune landscape of pediatric brain tumors** A, First two dimensions of transcriptome data UMAP, with points colored by broad histology. B, Heatmap of GSVA scores for Hallmark gene sets with tumors ordered by cancer group. C, Boxplots of quanTIseq estimates of immune cell proportions in cancer groups with N > 15 tumors. Note: other HGGs and other LGGs have immune cell proportions similar to DMG and pilocytic astrocytoma, respectively, and are not shown. D, Forest plot depicting additive effects of _CD274_ expression, immune cell proportion, and extent of tumor resection on OS of medulloblastoma patients. HRs with 95% confidence intervals and p-values (multivariate Cox) are listed. Black diamonds denote significant p-values, and gray diamonds denote reference groups. Note: the Macrophage M1 HR was 0 (coefficient = -9.90e+4) with infinite upper and lower CIs, and thus was not included in the figure. E, Boxplot of _CD274_ expression (log2 FPKM) for medulloblastomas grouped by subtype. Bonferroni-corrected p-values from Wilcoxon tests are shown. Boxplot represents 5% (lower whisker), 25% (lower box), 50% (median), 75% (upper box), and 95% (upper whisker) quantiles. Only stranded RNA-Seq data is plotted.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/c8d07b36d0a2b4b36008312eca50604a47903cf9/figures/pngs/figure5.png?sanitize=true){#fig:Fig5 width="7in"} diff --git a/content/08.supplemental.md b/content/08.supplemental.md index a1a2df32..e78512e8 100644 --- a/content/08.supplemental.md +++ b/content/08.supplemental.md @@ -1,32 +1,32 @@ ## Supplemental Information Titles and Legends -![**OpenPBTA Project Workflow, Related to Figure 1.** Biospecimens and data were collected by CBTN and PNOC. 
Genomic sequencing and harmonization (orange boxes) were performed by the Kids First Data Resource Center (KFDRC). Analyses in the green boxes were performed by contributors of the OpenPBTA project. Output files are denoted in blue. Figure created with [BioRender.com](biorender.com).](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/37ec62fdc2fd9ff157f2f2c10b69e9bb36673363/figures/pngs/figureS1.png?sanitize=true){#fig:S1 tag="S1" width="7in"} +![**OpenPBTA Project Workflow, Related to Figure 1.** Biospecimens and data were collected by CBTN and PNOC. Genomic sequencing and harmonization (orange boxes) were performed by the Kids First Data Resource Center (KFDRC). Analyses in the green boxes were performed by contributors of the OpenPBTA project. Output files are denoted in blue. Figure created with [BioRender.com](biorender.com).](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/c8d07b36d0a2b4b36008312eca50604a47903cf9/figures/pngs/figureS1.png?sanitize=true){#fig:S1 tag="S1" width="7in"} -![**Validation of Consensus SNV calls and Tumor Mutation Burden, Related to Figures 2 and 3.** Correlation (A) and violin (B) plots of mutation variant allele frequencies (VAFs) comparing the variant callers (Lancet, Strelka2, Mutect2, and VarDict) used for PBTA samples. UpSet plot (C) showing overlap of variant calls. Correlation (D) and violin (E) plots of mutation variant allele frequencies (VAFs) comparing the variant callers (Lancet, Strelka2, and Mutect2) used for TCGA samples. UpSet plot (F) showing overlap of variant calls. Violin plots (G) showing VAFs for Lancet calls performed on WGS and WXS from the same tumor (N = 52 samples from 13 patients). 
Cumulative distribution TMB plots for PBTA (H) and TCGA (I) tumors using consensus SNV calls.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/37ec62fdc2fd9ff157f2f2c10b69e9bb36673363/figures/pngs/figureS2.png?sanitize=true){#fig:S2 tag="S2" width="7in"} +![**Validation of Consensus SNV calls and Tumor Mutation Burden, Related to Figures 2 and 3.** Correlation (A) and violin (B) plots of mutation variant allele frequencies (VAFs) comparing the variant callers (Lancet, Strelka2, Mutect2, and VarDict) used for PBTA samples. UpSet plot (C) showing overlap of variant calls. Correlation (D) and violin (E) plots of mutation variant allele frequencies (VAFs) comparing the variant callers (Lancet, Strelka2, and Mutect2) used for TCGA samples. UpSet plot (F) showing overlap of variant calls. Violin plots (G) showing VAFs for Lancet calls performed on WGS and WXS from the same tumor (N = 52 samples from 13 patients). Cumulative distribution TMB plots for PBTA (H) and TCGA (I) tumors using consensus SNV calls.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/c8d07b36d0a2b4b36008312eca50604a47903cf9/figures/pngs/figureS2.png?sanitize=true){#fig:S2 tag="S2" width="7in"} -![**Genomic instability of pediatric brain tumors, Related to Figures 2 and 3.** (A) Violin plots of tumor purity by cancer group. Dots represent the group median. (B) Oncoprint of canonical somatic gene mutations, CNVs, fusions, and TMB (top bar plot) for the top mutated genes across rare CNS tumors: desmoplastic infantile astrocytoma and ganglioglioma (N = 2), germinoma (N = 4), glial-neuronal NOS (N = 8), metastatic secondary tumors (N = 2), neurocytoma (N = 2), pineoblastoma (N = 4), Rosai-Dorfman disease (N = 2), and sarcomas (N = 4). Patient sex (`Germline sex estimate`) and tumor histology (`Cancer Group`) are displayed as annotations at the bottom of each plot. Multiple CNVs are denoted as a complex event. 
N denotes the number of unique tumors with one tumor per patient used. (C) Genome-wide plot of CNV alterations by broad histology. Each row represents one sample. Box and whisker plots of number of CNV breaks (D) or SV breaks (E) by number of chromothripsis regions. Box plot represents 5% (lower whisker), 25% (lower box), 50% (median), 75% (upper box), and 95% (upper whisker) quantiles.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/37ec62fdc2fd9ff157f2f2c10b69e9bb36673363/figures/pngs/figureS3.png?sanitize=true){#fig:S3 tag="S3" width="7in"} +![**Genomic instability of pediatric brain tumors, Related to Figures 2 and 3.** (A) Violin plots of tumor purity by cancer group. Dots represent the group median. (B) Oncoprint of canonical somatic gene mutations, CNVs, fusions, and TMB (top bar plot) for the top mutated genes across rare CNS tumors: desmoplastic infantile astrocytoma and ganglioglioma (N = 2), germinoma (N = 4), glial-neuronal NOS (N = 8), metastatic secondary tumors (N = 2), neurocytoma (N = 2), pineoblastoma (N = 4), Rosai-Dorfman disease (N = 2), and sarcomas (N = 4). Patient sex (`Germline sex estimate`) and tumor histology (`Cancer Group`) are displayed as annotations at the bottom of each plot. Multiple CNVs are denoted as a complex event. N denotes the number of unique tumors with one tumor per patient used. (C) Genome-wide plot of CNV alterations by broad histology. Each row represents one sample. Box and whisker plots of number of CNV breaks (D) or SV breaks (E) by number of chromothripsis regions. 
Box plot represents 5% (lower whisker), 25% (lower box), 50% (median), 75% (upper box), and 95% (upper whisker) quantiles.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/c8d07b36d0a2b4b36008312eca50604a47903cf9/figures/pngs/figureS3.png?sanitize=true){#fig:S3 tag="S3" width="7in"} -![**Mutational signatures in pediatric brain tumors, Related to Figure 3.** (A) Sample-specific RefSig signature weights across cancer groups ordered by decreasing Signature 1 exposure. (B) Proportion of Signature 1 plotted by phase of therapy for each cancer group.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/37ec62fdc2fd9ff157f2f2c10b69e9bb36673363/figures/pngs/figureS4.png?sanitize=true){#fig:S4 tag="S4" width="7in"} +![**Mutational signatures in pediatric brain tumors, Related to Figure 3.** (A) Sample-specific RefSig signature weights across cancer groups ordered by decreasing Signature 1 exposure. (B) Proportion of Signature 1 plotted by phase of therapy for each cancer group.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/c8d07b36d0a2b4b36008312eca50604a47903cf9/figures/pngs/figureS4.png?sanitize=true){#fig:S4 tag="S4" width="7in"} -![**Quality control metrics for _TP53_ and EXTEND scores, Related to Figure 4**. (A) Receiver Operating Characteristic for _TP53_ classifier run on FPKM of poly-A RNA-Seq samples. Correlation plots for telomerase scores (EXTEND) with RNA expression of _TERT_ (B) and _TERC_ (C). Red dots in B and C denote samples with known _TERT_ promoter (TERTp) mutations.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/37ec62fdc2fd9ff157f2f2c10b69e9bb36673363/figures/pngs/figureS5.png?sanitize=true){#fig:S5 tag="S5" width="7in"} +![**Quality control metrics for _TP53_ and EXTEND scores, Related to Figure 4**. (A) Receiver Operating Characteristic for _TP53_ classifier run on FPKM of poly-A RNA-Seq samples. 
Correlation plots for telomerase scores (EXTEND) with RNA expression of _TERT_ (B) and _TERC_ (C). Red dots in B and C denote samples with known _TERT_ promoter (TERTp) mutations.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/c8d07b36d0a2b4b36008312eca50604a47903cf9/figures/pngs/figureS5.png?sanitize=true){#fig:S5 tag="S5" width="7in"} -![**Subtype-specific clustering and immune cell fractions, Related to Figure 5**. First two dimensions from UMAP of sample transcriptome data with points colored by `molecular_subtype` for medulloblastoma (A), ependymoma (B), low-grade glioma (C), and high-grade glioma (D). (E) Box plots of quanTIseq estimates of immune cell fractions in histologies with more than one molecular subtype with N >=3. (F) Box plots of the ratio of immune cell fractions of CD8+ to CD4+ T cells in histologies with more than one molecular subtype with N >=3. Box plot represents 5% (lower whisker), 25% (lower box), 50% (median), 75% (upper box), and 95% (upper whisker) quantiles.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/37ec62fdc2fd9ff157f2f2c10b69e9bb36673363/figures/pngs/figureS6.png?sanitize=true){#fig:S6 tag="S6" width="7in"} +![**Subtype-specific clustering and immune cell fractions, Related to Figure 5**. First two dimensions from UMAP of sample transcriptome data with points colored by `molecular_subtype` for medulloblastoma (A), ependymoma (B), low-grade glioma (C), and high-grade glioma (D). (E) Box plots of quanTIseq estimates of immune cell fractions in histologies with more than one molecular subtype with N >=3. (F) Box plots of the ratio of immune cell fractions of CD8+ to CD4+ T cells in histologies with more than one molecular subtype with N >=3. 
Box plot represents 5% (lower whisker), 25% (lower box), 50% (median), 75% (upper box), and 95% (upper whisker) quantiles.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/c8d07b36d0a2b4b36008312eca50604a47903cf9/figures/pngs/figureS6.png?sanitize=true){#fig:S6 tag="S6" width="7in"} -![**RNA batch and tumor purity assessment, Related to Figures 4 and 5**. Bar plot (A) and UMAP (B) of RNA-Seq samples by cancer group and library preparation method. (C) UMAP of RNA-Seq samples by cancer group and sequencing center. For (D-I), RNA-Seq samples were thresholded by median cancer group tumor purity and transcriptomic analyses in **Figure {@fig:Fig4}A-D** (D-G) and **Figure {@fig:Fig5}A,C** (H-I) were repeated.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/37ec62fdc2fd9ff157f2f2c10b69e9bb36673363/figures/pngs/figureS7.png?sanitize=true){#fig:S7 tag="S7" width="7in"} +![**RNA batch and tumor purity assessment, Related to Figures 4 and 5**. Bar plot (A) and UMAP (B) of RNA-Seq samples by cancer group and library preparation method. (C) UMAP of RNA-Seq samples by cancer group and sequencing center. For (D-I), RNA-Seq samples were thresholded by median cancer group tumor purity and transcriptomic analyses in **Figure {@fig:Fig4}A-D** (D-G) and **Figure {@fig:Fig5}A,C** (H-I) were repeated.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/c8d07b36d0a2b4b36008312eca50604a47903cf9/figures/pngs/figureS7.png?sanitize=true){#fig:S7 tag="S7" width="7in"} -[**Table S1. Related to Figure 1.**](https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/37ec62fdc2fd9ff157f2f2c10b69e9bb36673363/tables/results/TableS1-histologies.xlsx) +[**Table S1. Related to Figure 1.**](https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/c8d07b36d0a2b4b36008312eca50604a47903cf9/tables/results/TableS1-histologies.xlsx) Table of specimens and associated metadata, clinical data, and histological data utilized in the OpenPBTA project. -[**Table S2. 
Related to Figures 2 and 3.**](https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/37ec62fdc2fd9ff157f2f2c10b69e9bb36673363/tables/results/TableS2-DNA-results-table.xlsx) +[**Table S2. Related to Figures 2 and 3.**](https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/c8d07b36d0a2b4b36008312eca50604a47903cf9/tables/results/TableS2-DNA-results-table.xlsx) Excel file with four sheets, where the first three represent tables of TMB, eight CNS mutational signatures, and chromothripsis events per sample, respectively, and the fourth sheet shows summarized genomic alterations across cancer groups. -[**Table S3. Related to Figures 4 and 5.**](https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/37ec62fdc2fd9ff157f2f2c10b69e9bb36673363/tables/results/TableS3-RNA-results-table.xlsx) +[**Table S3. Related to Figures 4 and 5.**](https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/c8d07b36d0a2b4b36008312eca50604a47903cf9/tables/results/TableS3-RNA-results-table.xlsx) Excel file with three sheets representing tables of _TP53_ scores, telomerase EXTEND scores, and quanTIseq immune scores, respectively. -[**Table S4. Related to Figures 4 and 5.**](https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/37ec62fdc2fd9ff157f2f2c10b69e9bb36673363/tables/results/TableS4-survival-results-table.xlsx) +[**Table S4. Related to Figures 4 and 5.**](https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/c8d07b36d0a2b4b36008312eca50604a47903cf9/tables/results/TableS4-survival-results-table.xlsx) Excel file with six sheets representing the survival analyses performed for this manuscript. See **Star Methods** for details. -[**Table S5. Related to Figure 1.**](https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/37ec62fdc2fd9ff157f2f2c10b69e9bb36673363/tables/results/TableS5-Key-Resources-table.xlsx) +[**Table S5. 
Related to Figure 1.**](https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/c8d07b36d0a2b4b36008312eca50604a47903cf9/tables/results/TableS5-Key-Resources-table.xlsx) Excel file with four sheets representing all software and their respective versions used for the OpenPBTA project, including the R packages in the OpenPBTA Docker image, Python packages in the OpenPBTA Docker image, other command line tools in the OpenPBTA Docker image, and all software used in the OpenPBTA workflows, respectively. Note that all software in the OpenPBTA Docker image was utilized within the analysis repository, but not all software was used for the final manuscript.