diff --git a/content/03.results.md b/content/03.results.md index 90d5dffa..a34ad5d4 100644 --- a/content/03.results.md +++ b/content/03.results.md @@ -19,7 +19,8 @@ We used the continuous integration (CI) service CircleCI® to run analytical We followed a similar process in our Manubot-powered [@doi:10.1371/journal.pcbi.1007128] repository for proposed manuscript additions (**Figure {@fig:Fig1}C**); peer reviewers ensured clarity and scientific accuracy, and Manubot performed spell-checking. -![**Overview of the OpenPBTA Project.** A, CBTN and PNOC collected tumors from 943 patients. 22 tumor cell lines were created, and over 2000 specimens were sequenced (N = 1035 RNA-Seq, N = 940 WGS, and N = 32 WXS or targeted panel). Data was harmonized by the Kids First Data Resource Center using Amazon S3 through CAVATICA. B, Stacked bar plot of the number of biospecimens per phase of therapy. Each panel denotes a broad histology and each bar denotes a cancer group. (Abbreviations: GNG = ganglioglioma, Other LGG = other low-grade glioma, PA = pilocytic astrocytoma, PXA = pleomorphic xanthoastrocytoma, SEGA = subependymal giant cell astrocytoma, DIPG = diffuse intrinsic pontine glioma, DMG = diffuse midline glioma, Other HGG = other high-grade glioma, ATRT = atypical teratoid rhabdoid tumor, MB = medulloblastoma, Other ET = other embryonal tumor, EPN = ependymoma, PNF = plexiform neurofibroma, DNET = dysembryoplastic neuroepithelial tumor, CRANIO = craniopharyngioma, EWS = Ewing sarcoma, CPP = choroid plexus papilloma). C, Overview of the open analysis and manuscript contribution model. A contributor proposed an analysis, implemented it in their fork, and filed a pull request (PR) to add changes to the analysis repository. PRs underwent review for scientific rigor and implementation correctness. Using container and continuous integration technologies, PRs were checked to ensure all software dependencies were included and code was not sensitive to underlying data changes. 
Finally, a contributor filed a PR documenting their methods and results to the Manubot-powered manuscript repository for review. D, A potential path for an analytical PR. Arrows indicate revisions. Panel A created with BioRender.com.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/37ec62fdc2fd9ff157f2f2c10b69e9bb36673363/figures/pngs/figure1.png?sanitize=true){#fig:Fig1 width="7in"} +![**Overview of the OpenPBTA Project.** A, CBTN and PNOC collected tumors from 943 patients. 22 tumor cell lines were created, and over 2000 specimens were sequenced (N = 1035 RNA-Seq, N = 940 WGS, and N = 32 WXS or targeted panel). The Kids First Data Resource Center harmonized the data using Amazon S3 through CAVATICA. Panel created with [BioRender.com](https://biorender.com). B, Number of biospecimens across phases of therapy, with one broad histology per panel. Each bar denotes a cancer group. (Abbreviations: GNG = ganglioglioma, Other LGG = other low-grade glioma, PA = pilocytic astrocytoma, PXA = pleomorphic xanthoastrocytoma, SEGA = subependymal giant cell astrocytoma, DIPG = diffuse intrinsic pontine glioma, DMG = diffuse midline glioma, Other HGG = other high-grade glioma, ATRT = atypical teratoid rhabdoid tumor, MB = medulloblastoma, Other ET = other embryonal tumor, EPN = ependymoma, PNF = plexiform neurofibroma, DNET = dysembryoplastic neuroepithelial tumor, CRANIO = craniopharyngioma, EWS = Ewing sarcoma, CPP = choroid plexus papilloma). C, Overview of the open analysis and manuscript contribution models. A contributor proposed an analysis, implemented it in their fork, and filed a pull request (PR) with proposed changes. PRs underwent review for scientific rigor and accuracy. Container and continuous integration technologies ensured that all software dependencies were included and code was not sensitive to underlying data changes. Finally, a contributor filed a PR documenting their methods and results to the Manubot-powered manuscript repository for review. 
D, A potential path for an analytical PR. Arrows indicate revisions. +](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/37ec62fdc2fd9ff157f2f2c10b69e9bb36673363/figures/pngs/figure1.png?sanitize=true){#fig:Fig1 width="7in"} ### Molecular Subtyping of OpenPBTA CNS Tumors @@ -105,7 +106,7 @@ For detailed methods, see **STAR Methods** and **Figure {@fig:S1}**. | Tumor of sellar region | CRANIO, ADAM | 27 | 27 | | | Total | 577 | 644 | -Table: **Molecular subtypes generated through the OpenPBTA project.** Listed are broad tumor histologies, molecular subtypes generated, and number of patients and tumors subtyped within OpenPBTA. {#tbl:Table1} +Table: **Molecular subtypes generated through the OpenPBTA project.** Broad tumor histologies, molecular subtypes generated, and number of patients and tumors subtyped within OpenPBTA. {#tbl:Table1} ### Somatic Mutational Landscape of Pediatric Brain Tumors @@ -175,7 +176,7 @@ Finally, signatures 3, 8, 18, and MMR2 were prevalent in HGGs, including DMGs. -![**Mutational co-occurrence and signatures highlight key oncogenic drivers.** A, Bar plot of nonsynonymous mutations for 50 most commonly-mutated genes across all histologies. "Other" denotes a histology with <10 tumors. B, Co-occurrence and mutual exclusivity of mutated genes. The co-occurrence score is defined as $I(-\log_{10}(P))$ where $P$ is Fisher's exact test and $I$ is 1 when mutations co-occur more often than expected or -1 when exclusivity is more common. C, Number of SV breaks significantly correlate with CNV breaks (Adjusted R = 0.443, p = 1.05e-38). D, Chromothripsis frequency across cancer groups with N >= 3 tumors. E, Sina plots of RefSig signature weights for signatures 1, 11, 18, 19, 3, 8, N6, MMR2, and Other across cancer groups. 
Box plot represents 5% (lower whisker), 25% (lower box), 50% (median), 75% (upper box), and 95% (upper whisker) quantiles.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/37ec62fdc2fd9ff157f2f2c10b69e9bb36673363/figures/pngs/figure3.png?sanitize=true){#fig:Fig3 width="7in"} +![**Mutational co-occurrence and signatures highlight key oncogenic drivers.** A, Nonsynonymous mutations for the 50 most commonly mutated genes across all histologies. "Other" denotes a histology with <10 tumors. B, Co-occurrence and mutual exclusivity of mutated genes. The co-occurrence score is defined as $I(-\log_{10}(P))$ where $P$ is the p-value from Fisher's exact test and $I$ is 1 when mutations co-occur more often than expected or -1 when exclusivity is more common. C, Numbers of SV and CNV breaks are significantly correlated (Adjusted R = 0.443, p = 1.05e-38). D, Chromothripsis frequency across cancer groups with N >= 3 tumors. E, Sina plots of RefSig signature weights for signatures 1, 11, 18, 19, 3, 8, N6, MMR2, and Other across cancer groups. Boxplot represents 5% (lower whisker), 25% (lower box), 50% (median), 75% (upper box), and 95% (upper whisker) quantiles.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/37ec62fdc2fd9ff157f2f2c10b69e9bb36673363/figures/pngs/figure3.png?sanitize=true){#fig:Fig3 width="7in"} ### Transcriptomic Landscape of Pediatric Brain Tumors @@ -241,7 +242,7 @@ The mutual exclusivity of signatures 3 and MMR2 corroborates previous suggestion | PT_VTM2STE3 | BS_02YBZSBY | 7316-2189 | Progressive | Solid Tissue | Unknown | Lynch Syndrome | None detected | 274.5 | HGG, H3 wildtype, TP53 activated | -Table: **Patients with hypermutant tumors.** Listed are patients with at least one hypermutant or ultra-hypermutant tumor or cell line. 
Pathogenic (P) or likely pathogenic (LP) germline variants, coding region TMB, phase of therapy, therapeutic interventions, cancer predisposition (CMMRD = Constitutional mismatch repair deficiency), and molecular subtypes are included. {#tbl:Table2} +Table: **Patients with hypermutant tumors.** Patients with at least one hypermutant or ultra-hypermutant tumor or cell line. Pathogenic (P) or likely pathogenic (LP) germline variants, coding region TMB, phase of therapy, therapeutic interventions, cancer predisposition (CMMRD = Constitutional mismatch repair deficiency), and molecular subtypes are included. {#tbl:Table2} Next, we asked whether transcriptomic classification of _TP53_ dysregulation and/or telomerase activity recapitulates these oncogenic biomarkers' known prognostic influence. We identified several expected trends, including a significant overall survival benefit following full tumor resection (HR = 0.35, 95% CI = 0.2 - 0.62, p < 0.001) or if the tumor was an LGG (HR = 0.046, 95% CI = 0.0062 - 0.34, p = 0.003), and a significant risk if the tumor was an HGG (HR = 6.2, 95% CI = 4.0 - 9.5, p < 0.001) (**Figure {@fig:Fig4}F**; **STAR Methods**). @@ -250,7 +251,7 @@ Higher _TP53_ scores were associated with significant survival risks (**Table S4 Given this result, we next assessed whether different HGG molecular subtypes carry different survival risks if stratified by _TP53_ status. We found that DMG H3 K28 tumors with _TP53_ loss had significantly worse prognosis (HR = 2.8, CI = 1.4-5.6, p = 0.003) than those with wildtype _TP53_ (**Figure {@fig:Fig4}G** and **Figure {@fig:Fig4}H**), recapitulating results from two recent retrospective analyses of DIPG tumors [@doi:10.1158/1078-0432.CCR-22-0803; @doi:10.1007/s11060-021-03890-9]. -![**_TP53_ and telomerase activity** A, Receiver Operating Characteristic for _TP53_ classifier run on stranded FPKM RNA-Seq. 
B, Violin and strip plots of _TP53_ scores plotted by _TP53_ alteration type (Nactivated = 11, Nlost = 100, Nother = 866). C, Violin and strip plots of _TP53_ RNA expression plotted by _TP53_ activation status (Nactivated = 11, Nlost = 100, Nother = 866). D, Box plots of _TP53_ and telomerase (EXTEND) scores across cancer groups. TMB status is highlighted in orange (hypermutant) or red (ultra-hypermutant). E, Heatmap of RefSig mutational signatures for patients with at least one hypermutant tumor or cell line. F, Forest plot depicting prognostic effects of _TP53_ and telomerase scores on overall survival (OS), controlling for extent of tumor resection, LGG group, and HGG group. G, Forest plot depicting the effect of molecular subtype on HGG OS. For F and G, hazard ratios (HR) with 95% confidence intervals and p-values (multivariate Cox) are listed. Significant p-values are denoted with black diamonds. Reference groups are denoted by grey diamonds. H, Kaplan-Meier curve of HGGs by molecular subtype. Box plot represents 5% (lower whisker), 25% (lower box), 50% (median), 75% (upper box), and 95% (upper whisker) quantiles.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/37ec62fdc2fd9ff157f2f2c10b69e9bb36673363/figures/pngs/figure4.png?sanitize=true){#fig:Fig4 width="7in"} +![**_TP53_ and telomerase activity** A, Receiver Operating Characteristic for _TP53_ classifier run on stranded FPKM RNA-Seq. B, Violin and strip plots of _TP53_ scores plotted by _TP53_ alteration type (Nactivated = 11, Nlost = 100, Nother = 866). C, Violin and strip plots of _TP53_ RNA expression plotted by _TP53_ activation status (Nactivated = 11, Nlost = 100, Nother = 866). D, Boxplots of _TP53_ and telomerase (EXTEND) scores across cancer groups. TMB status is highlighted in orange (hypermutant) or red (ultra-hypermutant). E, Heatmap of RefSig mutational signatures for patients with at least one hypermutant tumor or cell line. 
F, Forest plot depicting prognostic effects of _TP53_ and telomerase scores on overall survival (OS), controlling for extent of tumor resection, LGG group, and HGG group. G, Forest plot depicting the effect of molecular subtype on HGG OS. Hazard ratios (HR) with 95% confidence intervals and p-values (multivariate Cox) are given in F and G. Black diamonds denote significant p-values, and gray diamonds denote reference groups. H, Kaplan-Meier curve of HGGs by molecular subtype. Boxplot represents 5% (lower whisker), 25% (lower box), 50% (median), 75% (upper box), and 95% (upper whisker) quantiles.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/37ec62fdc2fd9ff157f2f2c10b69e9bb36673363/figures/pngs/figure4.png?sanitize=true){#fig:Fig4 width="7in"} #### Histologic and oncogenic pathway clustering @@ -288,4 +289,4 @@ While adamantinomatous craniopharyngiomas and Group 3 and Group 4 medulloblastom Finally, we explored the potential influence of tumor purity by repeating selected transcriptomic analyses restricted to only samples with high tumor purity (see **STAR Methods**). Results from these analyses were broadly consistent (**Figure {@fig:S7}D-I**) with results derived from all stranded RNA-Seq samples. -![**Transcriptomic and immune landscape of pediatric brain tumors** A, First two dimensions from UMAP of transcriptome data. Points colored by broad histology. B, Heatmap of with significant GSVA scores for Hallmark gene sets with tumors ordered by cancer group. C, Box plots of quanTIseq estimates of immune cell proportions in select cancer groups with N > 15 tumors. Note: other HGGs and other LGGs have immune cell proportions similar to DMG and pilocytic astrocytoma, respectively, and are not shown. D, Forest plot depicting additive effects of _CD274_ expression, immune cell proportion, and extent of tumor resection on OS of medulloblastoma patients. HRs with 95% confidence intervals and p-values (multivariate Cox) are listed. 
Significant p-values are denoted with black diamonds. Reference groups are denoted by grey diamonds. Note: the Macrophage M1 HR was 0 (coefficient = -9.90e+4) with infinite upper and lower CIs, and thus was not included in the figure. E, Box plot of _CD274_ expression (log2 FPKM) for medulloblastomas grouped by subtype. Bonferroni-corrected p-values from Wilcoxon tests are shown. Box plot represents 5% (lower whisker), 25% (lower box), 50% (median), 75% (upper box), and 95% (upper whisker) quantiles. Only stranded RNA-Seq data is plotted.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/37ec62fdc2fd9ff157f2f2c10b69e9bb36673363/figures/pngs/figure5.png?sanitize=true){#fig:Fig5 width="7in"} +![**Transcriptomic and immune landscape of pediatric brain tumors** A, First two dimensions of transcriptome data UMAP, with points colored by broad histology. B, Heatmap of GSVA scores for Hallmark gene sets with tumors ordered by cancer group. C, Boxplots of quanTIseq estimates of immune cell proportions in cancer groups with N > 15 tumors. Note: other HGGs and other LGGs have immune cell proportions similar to DMG and pilocytic astrocytoma, respectively, and are not shown. D, Forest plot depicting additive effects of _CD274_ expression, immune cell proportion, and extent of tumor resection on OS of medulloblastoma patients. HRs with 95% confidence intervals and p-values (multivariate Cox) are listed. Black diamonds denote significant p-values, and gray diamonds denote reference groups. Note: the Macrophage M1 HR was 0 (coefficient = -9.90e+4) with infinite upper and lower CIs, and thus was not included in the figure. E, Boxplot of _CD274_ expression (log2 FPKM) for medulloblastomas grouped by subtype. Bonferroni-corrected p-values from Wilcoxon tests are shown. Boxplot represents 5% (lower whisker), 25% (lower box), 50% (median), 75% (upper box), and 95% (upper whisker) quantiles. 
Only stranded RNA-Seq data is plotted.](https://raw.githubusercontent.com/AlexsLemonade/OpenPBTA-analysis/37ec62fdc2fd9ff157f2f2c10b69e9bb36673363/figures/pngs/figure5.png?sanitize=true){#fig:Fig5 width="7in"}