From 2752cd95390e7bbead5b2d29230f464623e0a1c3 Mon Sep 17 00:00:00 2001 From: Stephanie Date: Tue, 4 Apr 2023 16:08:06 -0400 Subject: [PATCH] trimmmmm --- content/03.results.md | 150 ++++++++++++++++++++---------------------- 1 file changed, 72 insertions(+), 78 deletions(-) diff --git a/content/03.results.md b/content/03.results.md index 1b9f5bb8..1a1c7623 100644 --- a/content/03.results.md +++ b/content/03.results.md @@ -3,9 +3,9 @@ ### Crowd-sourced Somatic Analyses to Create an Open Pediatric Brain Tumor Atlas We previously performed whole genome sequencing (WGS), whole exome sequencing (WXS), and RNA sequencing (RNA-Seq) on matched tumor/normal tissues and selected cell lines [@doi:10.1093/neuonc/noz192] from 943 patients from the Pediatric Brain Tumor Atlas (PBTA), consisting of 911 patients from the [CBTN](https://CBTN.org) [@doi:10.1016/j.neo.2022.100846] and 32 patients from [PNOC](https://pnoc.us/) [@doi:10.1002/ijc.32258; @doi:10.1158/1078-0432.CCR-22-0803] (**Figure {@fig:Fig1}A**) across various histologies phrases of therapy (**Figure {@fig:Fig1}B**). -We harnessed, and built upon, the benchmarking efforts of the [Gabriella Miller Kids First Data Resource Center](https://kidsfirstdrc.org/) to develop robust and reproducible data analysis workflows within the [CAVATICA platform](https://www.cavatica.org/) to perform comprehensive somatic analyses (**Figure {@fig:S1}**) and **STAR Methods**) of the PBTA. +We harnessed and extended the benchmarking efforts of the [Gabriella Miller Kids First Data Resource Center](https://kidsfirstdrc.org/) to develop robust and reproducible data analysis workflows within the [CAVATICA platform](https://www.cavatica.org/) for comprehensive somatic analyses (**Figure {@fig:S1}** and **STAR Methods**) of the PBTA. -A key innovative feature of OpenPBTA is the contribution framework used for analyses (e.g., analytical code) and manuscript writing.
+A key innovative feature of OpenPBTA is its open contribution framework used for analytical code and manuscript writing. We created a public Github analysis repository ([https://github.com/AlexsLemonade/OpenPBTA-analysis](https://github.com/AlexsLemonade/OpenPBTA-analysis)) to hold all analysis code downstream of Kids First workflows and a GitHub manuscript repository ([https://github.com/AlexsLemonade/OpenPBTA-manuscript](https://github.com/AlexsLemonade/OpenPBTA-manuscript)) with Manubot [@doi:10.1371/journal.pcbi.1007128] integration to enable real-time manuscript creation. As all analyses and manuscript writing were conducted in public repositories, any researcher in the world could contribute to OpenPBTA following the process outlined in **Figure {@fig:Fig1}C**. First, a potential contributor proposed an analysis by filing an issue in the GitHub analysis repository. @@ -13,9 +13,9 @@ Next, project organizers or other contributors with expertise provided feedback The contributor formally requested to include their analytical code and results – written in their own copy (fork) of repository – in the OpenPBTA analysis repository by filing a GitHub pull request (PR). All PRs underwent peer review to ensure scientific accuracy, maintainability, and readability of code and documentation (**Figure {@fig:Fig1}C-D**). -Beyond peer review to ensure reproducibility, we established additional checks to ensure consistent results for all collaborators over time (**Figure {@fig:Fig1}D**). -We leveraged Docker® [@https://dl.acm.org/doi/10.5555/2600239.2600241] and the Rocker project [@https://doi.org/10.48550/arXiv.1710.03675] to maintain a consistent software development environment, creating a monolithic image with all OpenPBTA dependencies.
-To ensure that new code executed in the development environment, we used the continuous integration (CI) service CircleCI® to run analytical code in PRs on a test dataset before formal code review, allowing us to detect code bugs or sensitivity to data release changes. +Beyond peer review, we implemented additional checks to ensure consistent results for all collaborators over time (**Figure {@fig:Fig1}D**). +To provide a consistent software development environment, we created a monolithic image with all OpenPBTA dependencies using Docker® [@https://dl.acm.org/doi/10.5555/2600239.2600241] and the Rocker project [@https://doi.org/10.48550/arXiv.1710.03675]. +We used the continuous integration (CI) service CircleCI® to run analytical code in PRs on a test dataset before formal code review, allowing us to detect code bugs or sensitivity to data release changes. We followed a similar process in our Manubot-powered [@doi:10.1371/journal.pcbi.1007128] repository for proposed manuscript additions (**Figure {@fig:Fig1}C**); peer reviewers ensured clarity and scientific accuracy, and Manubot performed spell-checking. @@ -23,16 +23,16 @@ We followed a similar process in our Manubot-powered [@doi:10.1371/journal.pcbi. ### Molecular Subtyping of OpenPBTA CNS Tumors -Over the past two decades, neuro-oncology experts and the WHO have collaborated to iteratively redefine central nervous system (CNS) tumor classifications [@pubmed:11895036; @doi:10.1007/s00401-007-0243-4]. +Since 2000, neuro-oncology experts and the WHO have collaborated to iteratively redefine central nervous system (CNS) tumor classifications [@pubmed:11895036; @doi:10.1007/s00401-007-0243-4]. In 2016 [@doi:10.1007/s00401-016-1545-1], molecular subtypes driven by genetic alterations were integrated into these classifications. 
-Since CBTN specimen collection began in 2011 before molecular data were integrated into classifications, the majority of tumors lacked molecular subtype information at the time of tissue collection. +Since CBTN specimen collection began in 2011 before molecular data were integrated into classifications, most tumors lacked molecular subtype information when tissue was collected. Moreover, PBTA does not yet feature methylation arrays which are increasingly used to inform molecular subtyping and cancer diagnosis. Therefore, we created analysis modules to systematically consider key genomic features of tumors described by the WHO in 2016 or Ryall and colleagues [@doi:10.1016/j.ccell.2020.03.011]. -Coupled with clinician and pathologist review, we generated research-grade integrated diagnoses for 60% (644/1074) of tumors with high confidence (**Table S1**) without methylation data, representing a major innovation of this project. -This allowed us to align OpenPBTA specimen diagnoses with WHO classifications (e.g., tumors formerly ascribed primitive neuro-ectodermal tumor [PNET] diagnoses), discover rarer tumor entities (e.g., H3-mutant ependymoma, meningioma with _YAP1::FAM118B_ fusion), as well as identify and correct data entry errors (e.g., an embryonal tumor with multilayer rosettes (ETMR) incorrectly entered as a medulloblastoma) and histologically mis-identified specimens (e.g., Ewing sarcoma sample labeled as a craniopharyngioma). +Coupled with clinician and pathologist review, we generated research-grade, high-confidence integrated diagnoses for 60% (644/1074) of tumors (**Table S1**) without methylation data, a major innovation of this project.
+We could then align OpenPBTA specimen diagnoses with WHO classifications (e.g., tumors formerly ascribed primitive neuro-ectodermal tumor [PNET] diagnoses), discover rarer tumor entities (e.g., H3-mutant ependymoma, meningioma with _YAP1::FAM118B_ fusion), as well as identify and correct data entry errors (e.g., an embryonal tumor with multilayer rosettes (ETMR) incorrectly entered as a medulloblastoma) and histologically mis-identified specimens (e.g., Ewing sarcoma sample labeled as a craniopharyngioma). Uniquely, we used transcriptomic classification to subtype 122 medulloblastomas into SHH, WNT, Group 3, or Group 4 with `MedulloClassifier` [@doi:10.1371/journal.pcbi.1008263] and `MM2S` [@doi:10.1186/s13029-016-0053-y], with 95% (41/43) and 91% (39/43) accuracy, respectively. -**Table {@tbl:Table1}** lists the number of tumors subtyped within OpenPBTA, comprising low-grade gliomas (LGGs) (N = 290), HGGs (N = 141), embryonal tumors (N = 126), ependymomas (N = 33), tumors of sellar region (N = 27), mesenchymal non-meningothelial tumors (N = 11), glialneuronal tumors (N = 10), and chordomas (N = 6), where Ns represent unique tumors. +In total, we subtyped low-grade gliomas (LGGs) (N = 290), HGGs (N = 141), embryonal tumors (N = 126), ependymomas (N = 33), tumors of sellar region (N = 27), mesenchymal non-meningothelial tumors (N = 11), glialneuronal tumors (N = 10), and chordomas (N = 6), where Ns represent unique tumors (**Table {@tbl:Table1}**). For detailed methods, see **STAR Methods** and **Figure {@fig:S1}**. 
| Broad histology group | OpenPBTA molecular subtype | Patients | Tumors | @@ -109,37 +109,32 @@ Table: **Molecular subtypes generated through the OpenPBTA project.** Listed are ### Somatic Mutational Landscape of Pediatric Brain Tumors -We performed a comprehensive genomic analysis of somatic SNVs, CNVs, SVs, and fusions across 1,074 tumors (N = 1,019 RNA-Seq, N = 918 WGS, N = 32 WXS/Panel) and 22 cell lines (N = 16 RNA-Seq, N = 22 WGS), from 943 patients, 833 with paired normal specimens (N = 801 WGS, N = 32 WXS/Panel). +We performed a comprehensive genomic analysis of somatic SNVs, CNVs, SVs, and fusions across all 1,074 PBTA tumors (N = 1,019 RNA-Seq, N = 918 WGS, N = 32 WXS/Panel) and 22 cell lines (N = 16 RNA-Seq, N = 22 WGS), from 943 patients, 833 with paired normal specimens (N = 801 WGS, N = 32 WXS/Panel). Tumor purity across PBTA samples was high (median 76%), though we observed some cancer groups with lower purity, including SEGA, PXA, and teratoma (**Figure {@fig:S3}A**). Unless otherwise noted, each analysis was performed for diagnostic tumors using one tumor per patient. -SNV consensus calling (**Figure {@fig:S1}** and **Figure {@fig:S2}A-G**) revealed, as expected, lower tumor mutation burden (TMB) (**Figure {@fig:S2}H**) in pediatric tumors compared to adult brain tumors from The Cancer Genome Atlas (TCGA) (**Figure {@fig:S2}I**), with hypermutant (> 10 Mut/Mb) and ultra-hypermutant (> 100 Mut/Mb) tumors [@doi:10.1016/j.cell.2017.09.048] only found within HGGs and one embryonal tumor. -**Figure {@fig:Fig2}** and **Figure {@fig:S3}A** depict oncoprints recapitulating known histology-specific driver genes in primary tumors across OpenPBTA histologies, and **Table S2** summarizes all detected alterations across cancer groups. 
+SNV consensus calling (**Figure {@fig:S1}** and **Figure {@fig:S2}A-G**) revealed, as expected, lower tumor mutation burden (TMB) (**Figure {@fig:S2}H**) in pediatric tumors compared to adult brain tumors from The Cancer Genome Atlas (TCGA) (**Figure {@fig:S2}I**), with hypermutant (> 10 Mut/Mb) and ultra-hypermutant (> 100 Mut/Mb) tumors [@doi:10.1016/j.cell.2017.09.048] only found within HGGs and embryonal tumors. +**Figure {@fig:Fig2}** and **Figure {@fig:S3}B** depict oncoprints recapitulating known histology-specific driver genes in primary tumors across OpenPBTA histologies, and **Table S2** summarizes all detected alterations across cancer groups. -#### Low-grade gliomas As expected, most (62%, 140/226) LGGs harbored a somatic alteration in _BRAF_, with canonical _BRAF::KIAA1549_ fusions as the major oncogenic driver [@doi:10.1186/s40478-020-00902-z] (**Figure {@fig:Fig2}A**). -We observed additional mutations in _FGFR1_ (2%), _PIK3CA_ (2%), _KRAS_ (2%), _TP53_ (1%), and _ATRX_ (1%) and fusions in _NTRK2_ (2%), _RAF1_ (2%), _MYB_ (1%), _QKI_ (1%), _ROS1_ (1%), and _FGFR2_ (1%), concordant with previous studies reporting near-universal upregulation of the RAS/MAPK pathway in these tumors [@doi:10.1186/s40478-020-00902-z; @doi:10.1016/j.ccell.2020.03.011]. -Indeed, we observed significant upregulation (ANOVA Bonferroni-corrected p < 0.01) of the KRAS signaling pathway in LGGs (**Figure {@fig:Fig5}B**) using gene set variant analysis (GSVA). +We observed additional mutations in _FGFR1_ (2%), _PIK3CA_ (2%), _KRAS_ (2%), _TP53_ (1%), and _ATRX_ (1%) and fusions in _NTRK2_ (2%), _RAF1_ (2%), _MYB_ (1%), _QKI_ (1%), _ROS1_ (1%), and _FGFR2_ (1%), concordant with previous studies reporting near-universal upregulation of the RAS/MAPK pathway in LGGs [@doi:10.1186/s40478-020-00902-z; @doi:10.1016/j.ccell.2020.03.011]. 
+Indeed, gene set variation analysis (GSVA) revealed significant upregulation (ANOVA Bonferroni-corrected p < 0.01) of the KRAS signaling pathway in LGGs (**Figure {@fig:Fig5}B**). -#### Embryonal tumors Most (N = 95) embryonal tumors were medulloblastomas from four characterized molecular subtypes (WNT, SHH, Group3, and Group 4; see **Molecular Subtyping of CNS Tumors**), as identified by subtype-specific canonical mutations (**Figure {@fig:Fig2}B**). We detected canonical _SMARCB1/SMARCA4_ deletions or inactivating mutations in atypical teratoid rhabdoid tumors (ATRTs; **Table S2**) and C19MC amplification in ETMRs (displayed within "Other embryonal tumors" in **Figure {@fig:Fig2}B**) [@doi:10.1007/s00401-020-02182-2; @doi:10.1093/neuonc/noab178; @doi:10.1186/s40478-020-00984-9; @doi:10.1038/nature22973]. -#### High-grade gliomas Across HGGs, _TP53_ (57%, 36/63) and _H3F3A_ (54%, 34/63) were both most mutated and co-occurring genes (**Figure {@fig:Fig2}A and C**), followed by frequent mutations in _ATRX_ (29%, 18/63) which is commonly mutated in gliomas [@doi:10.1080/14728222.2018.1487953]. We observed recurrent amplifications and fusions in _EGFR_, _MET_, _PDGFRA_, and _KIT_, highlighting that these tumors leverage multiple oncogenic mechanisms to activate tyrosine kinases, as previously reported [@doi:10.1002/ijc.32258; @doi:10.1016/j.ccell.2017.08.017; @doi:10.1186/s40478-020-00905-w]. GSVA showed upregulation (ANOVA Bonferroni-corrected p < 0.01) of DNA repair, G2M checkpoint, and MYC pathways as well as downregulation of the TP53 pathway (**Figure {@fig:Fig5}B**). -The two tumors with ultra-high TMB (> 100 Mutations/Mb) were from patients with known mismatch repair deficiency syndrome [@doi:10.1093/neuonc/noz192]. +The two ultra-hypermutated tumors (> 100 Mutations/Mb) were from patients with mismatch repair deficiency syndrome [@doi:10.1093/neuonc/noz192].
-#### Other CNS tumors -We observed that 25% (15/60) of ependymomas were _C11orf95::RELA_ (now, _ZFTA::RELA_) fusion-positive ependymomas [@doi:10.1038/nature13109] and that 68% (21/31) of craniopharyngiomas contained mutations in _CTNNB1_ (**Figure {@fig:Fig2}D**). +Considering other CNS tumors, 25% (15/60) of ependymomas were _C11orf95::RELA_ (now, _ZFTA::RELA_) fusion-positive [@doi:10.1038/nature13109], and 68% (21/31) of craniopharyngiomas contained _CTNNB1_ mutations (**Figure {@fig:Fig2}D**). We observed somatic mutations or fusions in _NF2_ in 41% (7/17) of meningiomas, 5% (3/60) of ependymomas, and 25% (3/12) of schwannomas, as well as rare fusions in _ERBB4_, _YAP1_, and/or _QKI_ in 10% (6/60) of ependymomas. DNETs harbored alterations in MAPK/PI3K pathway genes, as was previously reported [@doi:10.1093/jnen/nlz101], including _FGFR1_ (21%, 4/19), _PDGFRA_ (10%, 2/19), and _BRAF_ (5%, 1/19). -**Figure {@fig:S3}A** depicts frequent mutations in additional rare brain tumor histologies. ![**Mutational landscape of PBTA tumors.** Frequencies of canonical somatic gene mutations, CNVs, fusions, and TMB (top bar plot) for the top mutated genes across primary tumors within the OpenPBTA dataset. A, LGGs (N = 226): pilocytic astrocytoma (N = 104), other LGG (N = 68), ganglioglioma (N = 35), pleomorphic xanthoastrocytoma (N = 9), subependymal giant cell astrocytoma (N = 10). B, Embryonal tumors (N = 129): medulloblastoma (N = 95), atypical teratoid rhabdoid tumor (N = 24), other embryonal tumor (N = 10). C, HGGs (N = 63): diffuse midline glioma (N = 36) and other HGG (N = 27). D, Other CNS tumors (N = 153): ependymoma (N = 60), craniopharyngioma (N = 31), meningioma (N = 17), dysembryoplastic neuroepithelial tumor (N = 19), Ewing sarcoma (N = 7), schwannoma (N = 12), and neurofibroma plexiform (N = 7). Rare CNS tumors are displayed in **Figure {@fig:S3}B**. Histology (`Cancer Group`) and sex (`Germline sex estimate`) annotations are displayed under each plot.
Only tumors with mutations in the listed genes are shown. Multiple CNVs are denoted as a complex event. N denotes the number of unique tumors (one tumor per patient).](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/37ec62fdc2fd9ff157f2f2c10b69e9bb36673363/figures/pngs/figure2.png?sanitize=true){#fig:Fig2 width="9in"} @@ -147,7 +142,7 @@ DNETs harbored alterations in MAPK/PI3K pathway genes, as was previously reporte ### Mutational co-occurrence, CNV, and signatures highlight key oncogenic drivers -We analyzed mutational co-occurrence across the OpenPBTA, using a single tumor from each patient with available WGS (N = 668 patients). +We analyzed mutational co-occurrence across the OpenPBTA, using a single tumor from each patient (N = 668) with WGS. The top 50 mutated genes (see **STAR Methods** for details) in primary tumors are shown in **Figure {@fig:Fig3}** by tumor type (**A**, bar plots), with co-occurrence scores illustrated in the heatmap (**B**). As expected, _TP53_ was the most frequently mutated gene across the OpenPBTA (8.7%, 58/668), significantly co-occurring with _H3F3A_ (OR = 30.05, 95% CI: 14.5 - 62.3, q = 2.34e-16), _ATRX_ (OR = 23.3, 95% CI: 9.6 - 56.3, q = 8.72e-9), _NF1_ (OR = 8.26, 95% CI: 3.5 - 19.4, q = 7.40e-5), and _EGFR_ (OR = 17.5, 95% CI: 4.8 - 63.9, q = 2e-4), with all of these driven by HGGs and consistent with previous reports [@doi:10.1016/j.ccell.2017.08.017; @doi:10.1093/neuonc/noaa251; @doi:10.1038/ng.2938]. @@ -155,19 +150,19 @@ In embryonal tumors, _CTNNB1_ mutations significantly co-occurred with _TP53_ mu _FGFR1_ and _PIK3CA_ mutations significantly co-occurred in LGGs (OR = 77.25, 95% CI: 10.0 - 596.8, q = 3.12e-3), consistent with previous findings [@doi:10.1200/JCO.2010.31.1670; @doi:10.1186/s40478-020-01027-z].
Of HGG tumors with _TP53_ or _PPM1D_ mutations, 53/55 (96.3%) had mutations in only one of these genes (OR = 0.17, 95% CI: 0.04 - 0.89, q = 0.056), recapitulating previous observations that these mutations are usually mutually exclusive in HGGs [@https://doi.org/10.1038/ng.2938]. -From CNV and SV analyses, we observed that HGG, DMG, and medulloblastoma genomes were the most unstable genomes, while craniopharyngiomas and schwannomas generally lacked somatic CNV (**Figure {@fig:S3}C**). -Together, these CNV patterns largely aligned with our TMB estimates (**Figure {@fig:S2}H**). -SV and CNV breakpoint densities were significantly correlated (linear regression p = 1.05e-38; **Figure {@fig:Fig3}C**) and as expected, the number of chromothripsis regions called increased with breakpoint density (**Figure {@fig:S3}D-E**). -We identified chromothripsis events in 31% (N = 12/39) of DMGs and in 44% (N = 21/48) of other HGGs (**Figure {@fig:Fig3}D**). -We also found evidence of chromothripsis in over 15% of sarcomas, PXAs, metastatic secondary tumors, chordomas, glial-neuronal tumors, germinomas, meningiomas, ependymomas, medulloblastomas, ATRTs, and other embryonal tumors, highlighting the genomic instability and complexity of these pediatric brain tumors. +CNV and SV analyses revealed that HGG, DMG, and medulloblastoma tumors had the most unstable genomes, while craniopharyngiomas and schwannomas generally lacked somatic CNV (**Figure {@fig:S3}C**). +These CNV patterns largely aligned with our TMB estimates (**Figure {@fig:S2}H**). +SV and CNV breakpoint densities were significantly correlated (linear regression p = 1.05e-38; **Figure {@fig:Fig3}C**), and as expected, the number of chromothripsis regions called increased with breakpoint density (**Figure {@fig:S3}D-E**).
+We identified chromothripsis events in 31% (N = 12/39) of DMGs and in 44% (N = 21/48) of other HGGs (**Figure {@fig:Fig3}D**), and we found evidence of chromothripsis in over 15% of sarcomas, PXAs, metastatic secondary tumors, chordomas, glial-neuronal tumors, germinomas, meningiomas, ependymomas, medulloblastomas, ATRTs, and other embryonal tumors. -We next assessed the contributions of eight previously identified adult CNS-specific mutational signatures from the RefSig database [@doi:10.1038/s43018-020-0027-5] across tumors (**Figure {@fig:Fig3}E** and **Figure {@fig:S4}A**). +We assessed the contributions of eight adult CNS-specific mutational signatures from the RefSig database [@doi:10.1038/s43018-020-0027-5] across tumors (**Figure {@fig:Fig3}E** and **Figure {@fig:S4}A**). Signature 1, which reflects normal spontaneous deamination of 5-methylcytosine, predominated in stage 0 and/or 1 tumors characterized by low TMBs (**Figure {@fig:S2}H**) such as pilocytic astrocytomas, gangliogliomas, other LGGs, and craniopharyngiomas (**Figure {@fig:S4}A**). -Signature N6 is a CNS-specific signature which we observed almost universally across tumors. -Drivers of Signature 18, _TP53_, _APC_, _NOTCH1_ (found at https://signal.mutationalsignatures.com/explore/referenceCancerSignature/31/drivers), are also canonical drivers of medulloblastoma, and indeed, we Signature 18 had the highest signature weight in medulloblastoma tumors. -Signatures 3, 8, 18, and MMR2 were prevalent in HGGs, including DMGs. -Finally, we found that Signature 1 weights were generally higher in tumors sampled at diagnosis (pre-treatment) compared to tumors from later phases of therapy (progression, recurrence, post-mortem, secondary malignancy; **Figure {@fig:S4}B**). 
-This trend may have resulted from therapy-induced mutations that produced additional signatures (e.g., temozolomide treatment has been suggested to drive Signature 11 [@doi:10.1053/j.gastro.2014.07.052]), subclonal expansion, and/or acquisition of additional driver mutations during tumor progression, leading to higher overall TMBs and additional signatures. +Signature 1 weights were generally higher in tumors sampled at diagnosis (pre-treatment) compared to tumors from later phases of therapy (**Figure {@fig:S4}B**). +This trend may have emerged from therapy-induced mutations that produced additional signatures (e.g., temozolomide treatment has been suggested to drive Signature 11 [@doi:10.1053/j.gastro.2014.07.052]), subclonal expansion, and/or acquisition of additional driver mutations during tumor progression, leading to detection of additional signatures. +We observed the CNS-specific signature N6 in nearly all tumors. +Signature 18 drivers (_TP53_, _APC_, _NOTCH1_; found at https://signal.mutationalsignatures.com/explore/referenceCancerSignature/31/drivers) are also canonical medulloblastoma drivers, and indeed, Signature 18 had the highest signature weight in medulloblastomas. +Finally, signatures 3, 8, 18, and MMR2 were prevalent in HGGs, including DMGs. + @@ -176,37 +171,34 @@ This trend may have resulted from therapy-induced mutations that produced additi ### Transcriptomic Landscape of Pediatric Brain Tumors Most RNA-Seq samples in the PBTA were prepared with ribosomal RNA depletion followed by stranded sequencing (N = 977), while remaining samples were prepared with poly-A selection (N = 58). -Since batch correction was not feasible (see **Limitations of the Study** and **Figure {@fig:S7}A**), the following analyses were performed using stranded samples only. +Since batch correction was not feasible (see **Limitations of the Study** and **Figure {@fig:S7}A**), the following transcriptomic analyses considered only stranded samples. 
#### Prediction of _TP53_ oncogenicity and telomerase activity -To understand each tumor's _TP53_ phenotype, we applied TCGA-trained classifier [@doi:10.1016/j.celrep.2018.03.076] to calculate a _TP53_ score and infer _TP53_ inactivation status. -We identified "true positive" _TP53_ alterations derived using high-confidence SNVs, CNVs, SVs, and fusions in _TP53_. -We annotated tumors as "activated" if they harbored one of p.R273C or p.R248W gain-of-function mutations [@doi:10.1038/ng0593-42], or "lost" if 1) the given patient had a Li Fraumeni Syndrome (LFS) predisposition diagnosis, 2) the tumor harbored a known hotspot mutation, or 3) the tumor contained two hits (e.g. both SNV and CNV), suggesting both alleles were affected. +We applied a TCGA-trained classifier [@doi:10.1016/j.celrep.2018.03.076] to calculate a _TP53_ score, a proxy for _TP53_ gene or pathway dysregulation, and subsequently infer tumor _TP53_ inactivation status. +We identified "true positive" _TP53_ alterations from high-confidence SNVs, CNVs, SVs, and fusions in _TP53_, annotating tumors as "activated" if they harbored one of the p.R273C or p.R248W gain-of-function mutations [@doi:10.1038/ng0593-42], or "lost" if 1) the given patient had a Li Fraumeni Syndrome (LFS) predisposition diagnosis, 2) the tumor harbored a known hotspot mutation, or 3) the tumor contained two hits (e.g., both SNV and CNV), suggesting both alleles were affected. If the _TP53_ mutation did not reside within the DNA-binding domain or no alterations in _TP53_ were detected, we annotated the tumor as "other," indicating an unknown _TP53_ alteration status. -The classifier achieved a high accuracy (AUROC = 0.86) for rRNA-depleted, stranded tumors compared to randomly shuffled _TP53_ scores (**Figure {@fig:Fig4}A**).
-By contrast, while this classifier has previously shown strong performance on poly-A data from both adult [@doi:10.1016/j.celrep.2018.03.076] tumors and pediatric patient-derived xenografts [@doi:10.1016/j.celrep.2019.09.071], it did not perform as well on the poly-A tumors in this cohort (AUROC = 0.62; **Figure {@fig:S5}A**). +The classifier achieved a high accuracy (AUROC = 0.86; **Figure {@fig:Fig4}A**) for rRNA-depleted, stranded tumors, but it did not perform as well on the poly-A tumors in this cohort (AUROC = 0.62; **Figure {@fig:S5}A**). -While we expected that "lost" tumors would have higher _TP53_ scores than would "activated" tumors, we observed that these groups had similar _TP53_ scores (**Figure {@fig:Fig4}B**, Wilcoxon p = 0.92). -This result suggests that the classifier actually detects an oncogenic, or altered, _TP53_ phenotype (scores > 0.5) rather than solely _TP53_ inactivation, as interpreted previously [@doi:10.1016/j.celrep.2018.03.076]. +We observed that "activated" and "lost" tumors had similar _TP53_ scores (**Figure {@fig:Fig4}B**, Wilcoxon p = 0.92), contrary to our expectation that "lost" tumors would have higher _TP53_ scores. +This result suggests that classifier scores > 0.5 may actually represent an oncogenic, or altered, _TP53_ phenotype rather than solely _TP53_ inactivation, as interpreted previously [@doi:10.1016/j.celrep.2018.03.076]. However, "activated" tumors showed higher _TP53_ expression compared to those with _TP53_ "loss" mutations (Wilcoxon p = 0.006, **Figure {@fig:Fig4}C**). -Tumor types with the highest median _TP53_ scores included DMGs, medulloblastomas, HGGs, DNETs, ependymomas, and craniopharyngiomas (**Figure {@fig:Fig4}D**), all known to harbor _TP53_ mutations. +DMGs, medulloblastomas, HGGs, DNETs, ependymomas, and craniopharyngiomas, all known to harbor _TP53_ mutations, had the highest median _TP53_ scores (**Figure {@fig:Fig4}D**).
By contrast, gangliogliomas, LGGs, meningiomas, and schwannomas had the lowest median scores. -We hypothesized that tumors from patients with LFS (N = 8) would have higher _TP53_ scores. -Indeed, we observed higher scores in 8/10 tumors from LFS patients (**Table S3**). -Although two tumors from LFS patients had low _TP53_ scores (`BS_DEHJF4C7` at 0.09 and `BS_ZD5HN296` at 0.28), we confirmed from pathology reports that both patients were diagnosed with LFS and had a pathogenic germline variant in _TP53_. -These two LFS tumors also had low tumor purity (16% and 37%, respectively), suggesting the classifier may require a certain level of tumor content for accurate performance, as _TP53_ should be intact in normal cells. -These transcriptomic scores could be utilized to infer _TP53_ function in the absence of a predicted oncogenic _TP53_ alteration or DNA sequencing in general. +We hypothesized that tumors from patients with LFS (N = 8) would have higher _TP53_ scores, which we indeed observed for 8/10 tumors from LFS patients (**Table S3**). +Although two tumors from LFS patients had low _TP53_ scores (`BS_DEHJF4C7` at 0.09 and `BS_ZD5HN296` at 0.28), pathology reports confirmed that both patients were diagnosed with LFS with a _TP53_ pathogenic germline variant. +These two LFS tumors also had low tumor purity (16% and 37%, respectively), suggesting that accurate classification may require a certain level of tumor content. +We suggest that this classifier could be generally applied to infer _TP53_ function in the absence of a predicted oncogenic _TP53_ alteration or DNA sequencing. -We used gene expression data to predict telomerase activity using EXpression-based Telomerase ENzymatic activity Detection (`EXTEND`) [@doi:10.1038/s41467-020-20474-9] as a surrogate measure of malignant potential [@doi:10.1038/s41467-020-20474-9; @doi:10.1093/carcin/bgp268], such that higher `EXTEND` scores indicate higher telomerase activity. 
+We used gene expression data to predict telomerase activity using EXpression-based Telomerase ENzymatic activity Detection (`EXTEND`) [@doi:10.1038/s41467-020-20474-9] as a surrogate measure of malignant potential [@doi:10.1038/s41467-020-20474-9; @doi:10.1093/carcin/bgp268], where higher `EXTEND` scores indicate higher telomerase activity. +Aggressive tumors such as DMGs, other HGGs, and MB had high `EXTEND` scores (**Figure {@fig:Fig4}D**), and low-grade lesions such as schwannomas, GNGs, DNETs, and other LGGs had among the lowest scores (**Table S3**), supporting previous reports that aggressive tumor phenotypes have higher telomerase activity [@doi:10.1007/s13277-016-5045-7; @doi:10.1038/labinvest.3700710; @doi:10.1007/s12032-016-0736-x; @doi:10.1111/j.1750-3639.2010.00372.x]. While `EXTEND` scores were not significantly higher in tumors with _TERT_ promoter (TERTp) mutations (N = 6; Wilcoxon p-value = 0.1196), scores were significantly correlated with _TERC_ (R = 0.619, p < 0.01) and _TERT_ (R = 0.491, p < 0.01) log2 FPKM expression values (**Figure {@fig:S5}B-C**). -Since catalytically-active telomerase requires a combination of full-length _TERT_, _TERC_, as well as accessory proteins [@url:https://pubmed.ncbi.nlm.nih.gov/9751630], we expect that `EXTEND` scores may not be exclusively correlated with _TERT_ alterations and expression. -While aggressive tumors such as DMGs, other HGGs, and MB had high `EXTEND` scores (**Figure {@fig:Fig4}D**), low-grade lesions such as schwannomas, GNGs, DNETs, and other LGGs had among the lowest scores (**Table S3**), supporting previous reports that more aggressive tumor phenotypes have higher telomerase activity [@doi:10.1007/s13277-016-5045-7; @doi:10.1038/labinvest.3700710; @doi:10.1007/s12032-016-0736-x; @doi:10.1111/j.1750-3639.2010.00372.x]. 
+Since catalytically-active telomerase requires full-length _TERT_, _TERC_, and certain accessory proteins [@url:https://pubmed.ncbi.nlm.nih.gov/9751630], we expect that `EXTEND` scores may not be exclusively correlated with _TERT_ alterations and expression. #### Hypermutant tumors share mutational signatures and have dysregulated **_TP53_** -We further investigated the mutational signature profiles of hypermutant (TMB > 10 Mut/Mb; N = 3) and ultra-hypermutant (TMB > 100 Mut/Mb; N = 4) tumors and/or derived cell lines from six patients in OpenPBTA (**Figure {@fig:Fig4}E**). +We investigated the mutational signature profiles of hypermutant (TMB > 10 Mut/Mb; N = 3) and ultra-hypermutant (TMB > 100 Mut/Mb; N = 4) tumors and/or derived cell lines from six patients in OpenPBTA (**Figure {@fig:Fig4}E**). Five tumors were HGGs and one was a brain metastasis of a MYCN non-amplified neuroblastoma tumor. Signature 11, which is associated with exposure to temozolomide plus _MGMT_ promoter and/or mismatch repair deficiency [@doi:10.1038/s41588-019-0525-5], was indeed present in tumors with previous exposure to the drug (**Table {@tbl:Table2}**). We detected the MMR2 signature in tumors of four patients (PT_0SPKM4S8, PT_3CHB9PK5, PT_JNEV57VK, and PT_VTM2STE3) diagnosed with either constitutional mismatch repair deficiency (CMMRD) or Lynch syndrome (**Table {@tbl:Table2}**), genetic predisposition syndromes caused by a variant in a mismatch repair gene such as _PMS2_, _MLH1_, _MSH2_, _MSH6_, or others [@doi:10.1136/jmedgenet-2020-107627]. @@ -214,14 +206,15 @@ Three of these patients harbored pathogenic germline variants in one of the afor While we did not detect a _known_ pathogenic variant in the germline of PT_VTM2STE3, this patient's pathology report contained a self-reported _PMS2_ variant, and we indeed found 19 intronic variants of unknown significance (VUS) in their _PMS2_. 
This is not surprising since an estimated 49% of germline _PMS2_ variants in patients with CMMRD and/or Lynch syndrome are VUS [@doi:10.1136/jmedgenet-2020-107627]. Interestingly, while the cell line derived from patient PT_VTM2STE3's tumor at progression was not hypermutated (TMB = 5.7 Mut/Mb), it only contained the MMR2 signature, suggesting selective pressure to maintain a mismatch repair (MMR) phenotype _in vitro_. -From patient PT_JNEV57VK, only one of the two cell lines derived from the progressive tumor was hypermutated (TMB = 35.9 Mut/Mb). -This hypermutated cell line was strongly weighted towards signature 11, while this patient's non-hypermutated cell line showed several lesser signature weights (1, 11, 18, 19, MMR2; **Table S2**), highlighting the plasticity of mutational processes and the need to carefully genomically characterize and select models for preclinical studies based on research objectives. +Only one of the two cell lines derived from patient PT_JNEV57VK's progressive tumor was hypermutated (TMB = 35.9 Mut/Mb). +Their hypermutated cell line was strongly weighted towards signature 11, while their non-hypermutated cell line showed several lesser signature weights (1, 11, 18, 19, MMR2; **Table S2**). +This mutational process plasticity highlights the importance of careful genomic characterization and model selection for preclinical studies. -Signature 18, which has been associated with high genomic instability and can lead to a hypermutator phenotype [@doi:10.1038/s43018-020-0027-5], was uniformly represented among hypermutant solid tumors. -Additionally, we found that all of the HGG tumors or cell lines had dysfunctional _TP53_ (**Table {@tbl:Table2}**), consistent with a previous report showing _TP53_ dysregulation is a dependency in tumors with high genomic instability [@doi:10.1038/s43018-020-0027-5]. 
+Signature 18, which has been associated with high genomic instability and can induce a hypermutator phenotype [@doi:10.1038/s43018-020-0027-5], was uniformly represented among hypermutant solid tumors. +Additionally, all hypermutant HGG tumors or cell lines had dysfunctional _TP53_ (**Table {@tbl:Table2}**), consistent with previous findings that tumors with high genomic instability depend on _TP53_ dysregulation [@doi:10.1038/s43018-020-0027-5]. With one exception, hypermutant and ultra-hypermutant tumors had high _TP53_ scores (> 0.5) and telomerase activity. Interestingly, none of the hypermutant tumors showed evidence of signature 3 (present in homologous recombination deficient tumors), signature 8 (arises from double nucleotide substitutions/unknown etiology), or signature N6 (a universal CNS tumor signature). -The mutual exclusivity of signatures 3 and MMR2 corroborates a previous report suggesting tumors do not tend to feature both deficient homologous repair and mismatch repair [@doi:10.1016/j.celrep.2018.03.076]. +The mutual exclusivity of signatures 3 and MMR2 corroborates previous suggestions that tumors do not generally feature both deficient homologous repair and mismatch repair [@doi:10.1016/j.celrep.2018.03.076]. | Kids First Participant ID | Kids First Biospecimen ID | CBTN ID | Phase of therapy | Composition | Therapy post-biopsy | Cancer predisposition | Pathogenic germline variant | TMB | OpenPBTA molecular subtype | @@ -241,13 +234,12 @@ The mutual exclusivity of signatures 3 and MMR2 corroborates a previous report s Table: **Patients with hypermutant tumors.** Listed are patients with at least one hypermutant or ultra-hypermutant tumor or cell line. Pathogenic (P) or likely pathogenic (LP) germline variants, coding region TMB, phase of therapy, therapeutic interventions, cancer predisposition (CMMRD = Constitutional mismatch repair deficiency), and molecular subtypes are included. 
{#tbl:Table2} -Next, we asked whether transcriptomic classification of _TP53_ dysregulation and/or telomerase activity recapitulate the known prognostic influence of these oncogenic biomarkers. -We identified several expected trends, including a significant overall survival benefit if the tumor had been fully resected (HR = 0.35, 95% CI = 0.2 - 0.62, p < 0.001) or if the tumor belonged to the LGG group (HR = 0.046, 95% CI = 0.0062 - 0.34, p = 0.003) as well as a significant risk if the tumor belonged to the HGG group (HR = 6.2, 95% CI = 4.0 - 9.5, p < 0.001) (**Figure {@fig:Fig4}F**; **STAR Methods**). +Next, we asked whether transcriptomic classification of _TP53_ dysregulation and/or telomerase activity recapitulates these oncogenic biomarkers' known prognostic influence. +We identified several expected trends, including a significant overall survival benefit following full tumor resection (HR = 0.35, 95% CI = 0.2 - 0.62, p < 0.001) or if the tumor was an LGG (HR = 0.046, 95% CI = 0.0062 - 0.34, p = 0.003), and a significant risk if the tumor was an HGG (HR = 6.2, 95% CI = 4.0 - 9.5, p < 0.001) (**Figure {@fig:Fig4}F**; **STAR Methods**). High telomerase scores were associated with poor prognosis across brain tumor histologies (HR = 20, 95% CI = 6.4 - 62, p < 0.001), demonstrating that `EXTEND` scores calculated from RNA-Seq are an effective rapid surrogate measure for telomerase activity. -Although higher _TP53_ scores, which predict _TP53_ gene or pathway dysregulation, were not a significant predictor of risk across the entire OpenPBTA cohort (**Table S4**), we did find a significant survival risk associated with higher _TP53_ scores within DMGs (HR = 6436, 95% CI = 2.67 - 1.55e7, p = 0.03) and ependymomas (HR = 2003, 95% CI = 9.9 - 4.05e5, p = 0.005). -Since we observed the negative prognostic effect of _TP53_ scores for HGGs, we assessed the effect of molecular subtypes within HGGs on survival risk.
-We found that DMG H3 K28 tumors with _TP53_ loss had significantly worse prognosis (HR = 2.8, CI = 1.4-5.6, p = 0.003) than did DMG H3 K28 tumors with wildtype _TP53_ (**Figure {@fig:Fig4}G** and **Figure {@fig:Fig4}H**). -This finding was also recently reported in two recent restrospective analyses of DIPG tumors [@doi:10.1158/1078-0432.CCR-22-0803; @doi:10.1007/s11060-021-03890-9]. +Higher _TP53_ scores were associated with significant survival risks (**Table S4**) within DMGs (HR = 6436, 95% CI = 2.67 - 1.55e7, p = 0.03) and ependymomas (HR = 2003, 95% CI = 9.9 - 4.05e5, p = 0.005). +Given this result, we next assessed whether different HGG molecular subtypes carry different survival risks. +We found that DMG H3 K28 tumors with _TP53_ loss had significantly worse prognosis (HR = 2.8, CI = 1.4-5.6, p = 0.003) than did DMG H3 K28 tumors with wildtype _TP53_ (**Figure {@fig:Fig4}G** and **Figure {@fig:Fig4}H**), reflecting results from two recent retrospective analyses of DIPG tumors [@doi:10.1158/1078-0432.CCR-22-0803; @doi:10.1007/s11060-021-03890-9]. ![**_TP53_ and telomerase activity** A, Receiver Operating Characteristic for _TP53_ classifier run on stranded FPKM RNA-Seq. B, Violin and strip plots of _TP53_ scores plotted by _TP53_ alteration type (Nactivated = 11, Nlost = 100, Nother = 866). C, Violin and strip plots of _TP53_ RNA expression plotted by _TP53_ activation status (Nactivated = 11, Nlost = 100, Nother = 866). D, Box plots of _TP53_ and telomerase (EXTEND) scores across cancer groups. TMB status is highlighted in orange (hypermutant) or red (ultra-hypermutant). E, Heatmap of RefSig mutational signatures for patients with at least one hypermutant tumor or cell line. F, Forest plot depicting prognostic effects of _TP53_ and telomerase scores on overall survival (OS), controlling for extent of tumor resection, LGG group, and HGG group. G, Forest plot depicting the effect of molecular subtype on HGG OS.
For F and G, hazard ratios (HR) with 95% confidence intervals and p-values (multivariate Cox) are listed. Significant p-values are denoted with black diamonds. Reference groups are denoted by grey diamonds. H, Kaplan-Meier curve of HGGs by molecular subtype. Box plot represents 5% (lower whisker), 25% (lower box), 50% (median), 75% (upper box), and 95% (upper whisker) quantiles.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/37ec62fdc2fd9ff157f2f2c10b69e9bb36673363/figures/pngs/figure4.png?sanitize=true){#fig:Fig4 width="7in"} @@ -255,34 +247,36 @@ This finding was also recently reported in two recent restrospective analyses of UMAP visualization of gene expression variation across brain tumors (**Figure {@fig:Fig5}A**) showed expected histological clustering of brain tumors. We further observed that, except for three outliers, _C11orf95::RELA_ (_ZFTA::RELA_) fusion-positive ependymomas fell within distinct clusters (**Figure {@fig:S6}A**). -Medulloblastoma (MB) tumors cluster by molecular subtype, with WNT and SHH in distinct clusters and Groups 3 and 4 showing some overlap (**Figure {@fig:S6}B**), as expected. -Of note, two MB tumors annotated as the SHH subtype did not cluster with the other MB tumors, and one clustered with Group 3 and 4 tumors, suggesting potential subtype misclassification or different underlying biology of these two tumors. -_BRAF_-driven LGGs (**Figure {@fig:S6}C**) were present in three separate clusters, suggesting that there might be additional shared biology within each cluster. +Medulloblastoma (MB) tumors clustered by molecular subtype, with WNT and SHH in distinct clusters and Groups 3 and 4 showing some expected overlap (**Figure {@fig:S6}B**). +Notably, two MB tumors annotated as SHH did not cluster with the other MB tumors and one clustered with Group 3/4 tumors, suggesting potential subtype misclassification or different underlying biology of these two tumors. 
+_BRAF_-driven LGGs (**Figure {@fig:S6}C**) fell into three separate clusters, suggesting additional shared biology within each cluster. Histone H3 G35-mutant HGGs generally clustered together and away from K28-mutant tumors (**Figure {@fig:S6}D**). -Interestingly, although H3 K28-mutant tumors have different biological drivers than do H3 wildtype tumors [@doi:10.1126/science.1232245], they did not form distinct clusters. -This pattern suggests these subtypes may be driven by common transcriptional programs, have other much stronger biological drivers than their known distinct epigenetic drivers, or our sample size is too small to detect transcriptional differences. +Interestingly, although H3 K28-mutant and H3 wildtype tumors have different biological drivers [@doi:10.1126/science.1232245], they did not form distinct clusters. +This pattern suggests these subtypes may be driven by common transcriptional programs, have other much stronger biological drivers than their known distinct epigenetic drivers, or we lack power to detect transcriptional differences. We performed GSVA for Hallmark cancer gene sets (**Figure {@fig:Fig5}B**) and quantified immune cell fractions using quanTIseq (**Figure {@fig:Fig5}C** and **Figure {@fig:S6}E**), results from which recapitulated previously-described tumor biology. For example, HGG, DMG, MB, and ATRT tumors are known to upregulate _MYC_ [@doi:10.3390/genes8040107] which in turn activates _E2F_ and S phase [@pubmed:11511364]. Indeed, we detected significant (Bonferroni-corrected p < 0.05) upregulation of _MYC_ and _E2F_ targets, as well as G2M (cell cycle phase following S phase) in MBs, ATRTs, and HGGs compared to several other cancer groups. In contrast, LGGs showed significant downregulation (Bonferroni-corrected p < 0.05, multiple cancer group comparisons) of these pathways. 
-Schwannomas and neurofibromas, which have a documented inflammatory immune microenvironment of T and B lymphocytes as well as tumor-associated macrophages (TAMs), are driven by upregulation of cytokines such as IFN$\gamma$, IL-1, and IL-6, and TNF$\alpha$ [@doi:10.1093/noajnl/vdaa023]. -Indeed, we observed significant upregulation of these cytokines in GSVA hallmark pathways (Bonferroni-corrected p < 0.05, multiple cancer group comparisons) (**Figure {@fig:Fig5}B**) and found immune cell types dominated by monocytes in these tumors (**Figure {@fig:Fig5}C**). +Schwannomas and neurofibromas, which have an inflammatory immune microenvironment of T and B lymphocytes and tumor-associated macrophages (TAMs), are driven by upregulation of cytokines such as IFN$\gamma$, IL-1, IL-6, and TNF$\alpha$ [@doi:10.1093/noajnl/vdaa023]. +GSVA revealed significant upregulation of these cytokines in hallmark pathways (Bonferroni-corrected p < 0.05, multiple cancer group comparisons) (**Figure {@fig:Fig5}B**), and monocytes dominated these tumors' immune cell repertoire (**Figure {@fig:Fig5}C**). We also observed significant upregulation of pro-inflammatory cytokines IFN$\alpha$ and IFN$\gamma$ in both LGGs and craniopharyngiomas when compared to either medulloblastoma or ependymomas (Bonferroni-corrected p < 0.05) (**Figure {@fig:Fig5}B**). Together, these results support previous proteogenomic findings that aggressive medulloblastomas and ependymomas have lower immune infiltration compared to _BRAF_-driven LGGs and craniopharyngiomas [@doi:10.1016/j.cell.2020.10.044].
-Although CD8+ T-cell infiltration across all cancer groups was quite low (**Figure {@fig:Fig5}C**), we observed signal in specific cancer molecular subtypes (Groups 3 and 4 medulloblastoma) as well as outlier tumors (BRAF-driven LGG, BRAF-driven and wildtype ganglioglioma, and CNS embryonal NOS; **Figure {@fig:S6}E**) -Surprisingly, the classically immunologically-cold HGGs and DMGs [@doi:10.1186/s40478-018-0553-x; @doi:10.1093/brain/awab155] contained higher overall fractions of immune cells, where monocytes, dendritic cells, and NK cells were the most prevalent (**Figure {@fig:Fig5}C**). -Thus, we suspect that quanTIseq might actually have captured microglia within these immune cell fractions. +Although CD8+ T-cell infiltration across all cancer groups was minimal (**Figure {@fig:Fig5}C**), we observed signal in specific cancer molecular subtypes (Groups 3 and 4 medulloblastoma) as well as outlier tumors (_BRAF_-driven LGG, _BRAF_-driven and wildtype ganglioglioma, and CNS embryonal NOS; **Figure {@fig:S6}E**). +Surprisingly, the classically immunologically-cold HGGs and DMGs [@doi:10.1186/s40478-018-0553-x; @doi:10.1093/brain/awab155] contained higher overall fractions of immune cells, primarily monocytes, dendritic cells, and NK cells (**Figure {@fig:Fig5}C**). +Thus, quanTIseq might actually have captured microglia within these immune cell fractions. While we did not detect notable prognostic effects of immune cell infiltration on overall survival in HGGs or DMGs, we found that high levels of macrophage M1 and monocytes were associated with poorer overall survival (monocyte HR = 2.1e18, 95% CI = 3.80e5 - 1.2e31, p = 0.005, multivariate Cox) in medulloblastomas (**Figure {@fig:Fig5}D**). We further reproduced previous findings (**Figure {@fig:Fig5}E**) that medulloblastomas typically have low expression of _CD274_ (PD-L1) [@doi:10.18632/oncotarget.24951].
-However, we also found that higher expression of _CD274_ was significantly associated with improved overall prognosis for medulloblastoma tumors, although with a marginal effect size (HR = 0.0012, 95% CI = 7.5e−06 - 0.18, p = 0.008, multivariate Cox) (**Figure {@fig:Fig5}D**). +We also found that higher expression of _CD274_ was significantly associated with improved overall prognosis for medulloblastoma tumors, although only marginally (HR = 0.0012, 95% CI = 7.5e−06 - 0.18, p = 0.008, multivariate Cox) (**Figure {@fig:Fig5}D**). This result may be explained by the higher expression of _CD274_ observed in WNT subtype tumors by us and others [@doi:10.1080/2162402X.2018.1462430], as this diagnosis carries the best prognosis of all medulloblastoma subgroups (**Figure {@fig:Fig5}E**). -Finally, we asked whether any subtypes might have a high ratio CD8+ to CD4+ T cells, a metric which has been associated with better immunotherapy response and prognosis following PD-L1 inhibition in non-small cell lung cancer or adoptive T cell therapy in multiple stage III or IV cancers [@doi:10.1136/jitc-2021-004012; @doi:10.4236/jct.2013.48164]. -While adamantinomatous craniopharyngiomas and Group 3 and Group 4 medulloblastomas had the highest CD8+ to CD4+ T cell ratios (**Figure {@fig:S6}F**), very few tumors had ratios greater than 1, highlighting an urgent need to identify novel therapeutics for pediatric brain tumors with poor prognosis. -To explore the potential influence of tumor purity, selected transcriptomic analyses were repeated using samples with tumor purities at or above the median tumor purity of their cancer group (see **STAR Methods**). -The analyses using all stranded samples were broadly consistent (**Figure {@fig:S7}D-I**) with those using samples with high tumor purity. +We additionally explored the ratio of CD8+ to CD4+ T cells across tumor subtypes.
+This ratio has been associated with better immunotherapy response and prognosis following PD-L1 inhibition in non-small cell lung cancer or adoptive T cell therapy in multiple stage III or IV cancers [@doi:10.1136/jitc-2021-004012; @doi:10.4236/jct.2013.48164]. +While adamantinomatous craniopharyngiomas and Group 3 and Group 4 medulloblastomas had the highest ratios (**Figure {@fig:S6}F**), very few tumors had ratios greater than 1, highlighting an urgent need to identify novel therapeutics for pediatric brain tumors with poor prognosis. + +Finally, we explored the potential influence of tumor purity by repeating selected transcriptomic analyses restricted to samples with high tumor purities within their cancer group (see **STAR Methods**). +These new analyses were broadly consistent (**Figure {@fig:S7}D-I**) with results derived from all stranded RNA-Seq samples. ![**Transcriptomic and immune landscape of pediatric brain tumors** A, First two dimensions from UMAP of transcriptome data. Points colored by broad histology. B, Heatmap of with significant GSVA scores for Hallmark gene sets with tumors ordered by cancer group. C, Box plots of quanTIseq estimates of immune cell proportions in select cancer groups with N > 15 tumors. Note: other HGGs and other LGGs have immune cell proportions similar to DMG and pilocytic astrocytoma, respectively, and are not shown. D, Forest plot depicting additive effects of _CD274_ expression, immune cell proportion, and extent of tumor resection on OS of medulloblastoma patients. HRs with 95% confidence intervals and p-values (multivariate Cox) are listed. Significant p-values are denoted with black diamonds. Reference groups are denoted by grey diamonds. Note: the Macrophage M1 HR was 0 (coefficient = -9.90e+4) with infinite upper and lower CIs, and thus was not included in the figure. E, Box plot of _CD274_ expression (log2 FPKM) for medulloblastomas grouped by subtype. Bonferroni-corrected p-values from Wilcoxon tests are shown.
Box plot represents 5% (lower whisker), 25% (lower box), 50% (median), 75% (upper box), and 95% (upper whisker) quantiles. Only stranded RNA-Seq data is plotted.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/37ec62fdc2fd9ff157f2f2c10b69e9bb36673363/figures/pngs/figure5.png?sanitize=true){#fig:Fig5 width="7in"}