Landscape and multiomics analysis of 5mC regulators in BLCA
5mC is a dynamic and reversible process mediated by several distinct regulators that plays critical roles in various biological processes in cancers (Fig. S1A). As shown in Fig. S1B, the 21 5mC regulators were significantly correlated with each other. In the TCGA-BLCA cohort, most of the 5mC regulators were significantly highly expressed in bladder cancer tissues compared with normal tissues (Fig. S1C). Furthermore, this imbalance in the expression of 5mC regulators between bladder cancer tissues and normal tissues was also observed in the Xiangya cohort (Fig. S1D). More importantly, we confirmed the cancer-specific overexpression patterns of 5mC regulators from the aspect of the single-cell RNA sequence. In the Xiangya scRNA set, after quality control, 6798 qualified single cells were visualized with 2D tSNE and classified into seven cell types, including cancer cell, T cell, iCAF, mCAF, myloid, B cell, and endothelial (Fig. S2). A majority of 5mC regulators were expressed explicitly on cancer cells, such as DNMT3A, MBD1, MBD3, UNG, NEIL1, ZBTB33, NTHL1, SMUG1, TDG, UHRF1, TET1, TET2, and TET3 (Fig. S3). Six 5mC regulators, including DNMT1, MECP2, MBD2, ZBTB38, MBD4, and UHRF1, were expressed in both cancer and nonmalignant cells. Nonetheless, the proportion of 5mC regulators positively expressed cells was the highest in the type of cancer cells. Similar results were observed in the GSE145137. For instance, several 5mC regulators, including MBD1, MBD2, MDB3, UNG, ZBTB2, UHRF1, UHRF2, and TET3, were only expressed on cancer cells (Fig. S4). For several 5mC regulators which are expressed in both cancer and nonmalignant cells, the proportion of positively expressed cells was the highest in the type of cancer cells. Meanwhile, the majority of 5mC regulators were adverse prognostic factors in the E-MTAB-4321 cohort (Fig. S1E). However, CNV alterations and mutations of 5mC regulators were not frequent in BLCA (Fig. S1F-G). In summary, these results indicated that 5mC regulators may be potential diagnostic and prognostic predictors in BLCA.
Depicting the 5mC clusters
The workflow of developing 5mC clusters and the 5mC score is shown in Fig. S5A. Figure 1A displays the comprehensive landscapes of 5mC regulators with respect to the prognostic value, correlations, and groups in the TCGA-BLCA cohort. As shown in Fig. S5B-I, the TCGA-BLCA cohort was classified into several clusters based on the mRNA expression of 21 5mC regulators. Notably, only when the cohort was divided into two clusters was the clustering algorithm superior, and different clusters had significant prognostic value. Therefore, 135 patients were classified into 5mC cluster 1, and the rest were classified into 5mC cluster 2. Figure 1B shows the correlations between 5mC clusters and clinicopathological characteristics. Obviously, the 5mC regulators were differentially expressed between the two 5mC clusters. Patients in 5mC cluster 2 had a poorer prognosis than patients in 5mC cluster 1 (Fig. 1C). Further multivariable Cox analysis indicated that the 5mC cluster was an independent prognostic factor after adjusting for stage, grade, age, and LVI (Fig. 1D).
Developing the 5mC gene signature, 5mC score, and their functional analyses
Figure S6A displays the 5mC score algorithm. First, we identified 401 DEGs (5mC gene signature) between two 5mC clusters (Fig. S6B, Table S2). Then, we highlighted DEGs with the most significant expression differences (|log2FC| greater than 2.5) in the volcano map (Fig. S6C). Interestingly, a majority of these highlighted DEGs were BLCA molecular subtype-specific markers. KRT6A, KRT6B, KRT6C, KRT5, KRT14, SERPINB3, SERPINB13, SERPINB4, and DSG3 were basal subtype-specific markers. In contrast, UPK1A, UPK2, UPK3A, KRT20, and SNX31 were luminal subtype-specific markers (Fig. S6C) [60]. This indicated that 5mC clusters may be similar to classical BLCA molecular subtypes. The functions of the 5mC gene signature were obviously enriched in immune-related processes. For example, biological processes (BPs) mainly included leukocyte chemotaxis and myeloid leukocyte migration (Fig. S6D, Table S3A). The molecular functions (MFs) mainly included chemokine/cytokine receptor binding and chemokine/cytokine activity (Fig. S6D, Table S3A). The cellular components (CC) mainly included the MHC protein complex and keratin filament (Fig. S6D, Table S3A). Furthermore, the most significant KEGG pathways of the 5mC gene signature were cytokine-cytokine receptor interactions and chemokine signaling pathways (Fig. S6E, Table S3B). These data suggested that 5mC clusters (5mC gene signature) may play critical roles in modulating TME immunity and regulating the anticancer immune response in BLCA.
Within the 5mC gene signature, 88 DEGs had prognostic value (Table S2). Then, we performed PCA based on these prognostic DEGs to calculate the 5mC score. Eventually, the TCGA-BLCA cohort was divided into a high 5mC score group (n = 197) and a low 5mC score group (n = 203) based on an optimal cutoff value of the 5mC score. Patients in the high score group had a better prognosis than patients in the low score group (Fig. S6F). As expected, the 5mC score could effectively quantify the 5mC clusters. 5mC cluster 1 belonged to the high 5mC score group, and the low 5mC score group belonged to 5mC cluster 2 (Fig. S6G-H).
Next, we compared the differences in the mutation profiles, hallmark pathways, oncogenic pathways, and KEGG pathways between the 5mC score groups. A majority of hallmark pathways were differentially enriched between the two 5mC score groups (Fig. S7A). Most of the metabolic hallmark signatures were enriched in the high 5mC score group. In contrast, most proliferation-related hallmark signatures, DNA damage-related pathways, development-related pathways, and cellular component-related pathways were significantly enriched in the low 5mC score group. Meanwhile, all immune-related hallmark pathways, such as interferon-alpha/gamma response, were significantly enriched in the low 5mC score group. The results from the PURE-01 study demonstrated that the activation of the interferon-alpha/gamma response predicted a higher ICB response rate in BLCA [61]. Similarly, most of the KEGG pathways were differentially enriched between the two 5mC score groups (Fig. S8). In particular, the immune-related KEGG pathways were significantly enriched in the low 5mC score group. This indicated that the low 5mC score group (5mC cluster 2) may be more sensitive to ICB. A previous study suggested that TP53 and RB1 mutations induced genomic instability and promoted the pathogenesis of BLCA [62]. In this study, the mutation rates of TP53 (55% vs. 40%) and RB1 (27% vs. 9%) were significantly higher in the low 5mC score group than in the high 5mC score group (Fig. S7B-C, Fig. 2D, E). Meanwhile, a majority of oncogenic pathways were significantly enriched in the low 5mC score group (Fig. S7D). These results may explain why the low 5mC score group (5mC cluster 2) had a poorer prognosis.
Collectively, the 5mC gene signature and 5mC score comprehensively reflect the biological characteristics of BLCA, including TME immunity and prognosis.
5mC clusters are effective alternatives to classical molecular subtypes and accurately predict therapeutic opportunities in BLCA
Figure 2A, B displays the correlations between 5mC clusters, 5mC score groups, and seven classical molecular subtype classifications. The high 5mC score group and 5mC cluster 1 represented the luminal subtype, which was characterized by luminal differentiation, urothelial differentiation, and the Ta pathway. Conversely, the low 5mC score group (5mC cluster 2) indicated the basal subtype, which was characterized by basal differentiation, EMT differentiation, immune differentiation, interferon response, and keratinization.
As shown in Fig. S9A, a majority of BLCA samples belonged to the basal or luminal subtypes regardless of the molecular systems, although some other subgroups (such as the stromal subtype and NE subtype) had a very low proportion. For instance, two molecular systems (Baylor and UNC) only included basal and luminal subtypes. In the TCGA molecular system, only 4% of samples were NE subtype. Similarly, in the consensus molecular system, only 10% of samples were classified into other subtypes (2% NE and 8% stromal subtypes). These results suggested that the basal and luminal subtypes may reflect the molecular characteristics of most BLCA patients.
The ROC curves showed that the 5mC score predicted classical molecular subtypes with high accuracy ranging from 0.91 to 1 (Fig. 2C). These results were successfully validated in several independent external cohorts (Fig. S10A-D). Therefore, we believed that the binary 5mC cluster systems could also reflect the molecular characteristics of most BLCA patients. Certainly, there was an inescapable limitation for binary cluster systems to reflect the molecular characteristics of other infrequent subtypes, such as NE and stromal subtypes. To further explore the role of the 5mC score in quantificationally distinguishing the different rare subtypes, we compared the difference in 5mC score between basal subtype, luminal subtype, and other subtypes. In line with the results from Fig. 2A, B, basal subtypes had the lowest 5mC score, while luminal subtypes had the highest 5mC score. Interestingly, other rare subtypes (stromal and NE subtypes) had an intermediate score (Fig. S9B). This phenomenon was observed in all molecular subtype systems. Therefore, the 5mC score could make up for the shortcomings of the binary 5mC cluster system to quantitatively reflect the biological characteristics of other rare subtypes.
Several studies demonstrated that tumor purity may influence the molecular subtypes [32, 63]. Fortunately, the purity adjusted 5mC clusters were highly consistent with the original 5mC clusters in this study (Fig. S11A). Only one patient was reclassified into another subtype (Fig. S11B, Table S3C). This highlighted the robustness of our 5mC cluster systems, which the tumor purity may not influence. There were two possible explanations for this result. The first one was that 5mC regulators were specifically overexpressed in cancer cells (Figs. S1C-D, S3, S4). The second one was that the overall tumor purity of the TCGA-BLCA samples was satisfied and acceptable when most samples’ tumor purity (CPE) (84.67%) was higher than 60% (Table S1H). However, the 5mC score was positively related to the purity in five algorithms (Fig. S11C). Patients with a higher 5mC score had higher tumor purity, which indicated lower immune and stromal infiltration in the TME. Consistently, samples with higher 5mC scores represented a luminal subtype with lower immune infiltration and stromal signature enrichment scores (Figs. 2A, 3, 4). The closed correlation between the 5mC score and purity may be due to the fact that the 5mC gene signature contained many immune-related genes. Overall, the tumor purity was more inclined to be regarded as a TME internal character, reflecting the stromal and immune-related features.
The overall mutation rate of neoadjuvant chemotherapy-related genes was significantly higher in the low 5mC score group than in the high 5mC group (48.02% vs. 32.49%) (Fig. 2D, E). This indicated that patients in the low 5mC score group may be more sensitive to neoadjuvant chemotherapy. Meanwhile, patients in the low 5mC score group may be more sensitive to EGFR targeted therapy and radiotherapy (Fig. 2F). In contrast, several immunosuppressive oncogenic pathways were significantly enriched in the high 5mC score group, including the WNT-β-catenin network, PPARG network, FGFR3 network, IDH1, KDM6B, and VEGFA. Therefore, targeting these oncogenic pathways may offer promising therapeutic opportunities for BLCA patients in the high 5mC score group. All of these observations were revalidated in several external BLCA cohorts (Fig. S12A-B). Furthermore, we successfully confirmed the above results in the DrugBank database (Fig. 2G). Patients in the low 5mC score group (5mC cluster 2) were more sensitive to chemotherapy drugs (cisplatin, docetaxel, and gemcitabine), ERBB therapy (cetuximab), and immunotherapy (atezolizumab). However, patients in the high 5mC score group (5mC cluster 1) may be more sensitive to antiangiogenic therapy drugs (sorafenib and bevacizumab).
Collectively, 5mC clusters may be economical and simpler alternatives to classical molecular subtypes. Meanwhile, 5mC clusters and 5mC scores could predict the response to several treatments in BLCA.
The distinct methylation patterns between 5mC clusters
We firstly screened 49904 DMPs (adj P < 0.01) between 5mC clusters (Table S3D). Among these, 142 5mC cluster-specific DMPs were defined (Table S3E). Interestingly, almost all of these DMPs (141/142) were 5mC cluster 2 specific. Only one DMP (cg23507945) was 5mC cluster 1 specific. Meanwhile, the 5mC score was negatively related to the methylation levels of these cluster-specific DMPs (Table S3F). Overall, there was a significantly distinct methylation pattern between 5mC clusters (Fig. S13A). In addition, we explored the associations between the 5mC score and the promoter methylation levels of certain critical cancer-associated genes, such as oncogenes and driver genes . Similarly, the 5mC score was negatively related to most promoter methylation levels of those genes (Table S3G-K). For example, among 1268 promoter methylation probes of oncogenes, 667 probes were negatively related to the 5mC score, while only 94 probes were positively related to the 5mC score. These data further confirmed a higher methylation status in 5mC cluster 2 compared to 5mC cluster 1. Based on these 5mC cluster-specific DMPs, we identified 130 5mC cluster-specific DMGs (Table S3E). Results of GO and KEGG analyses based on 130 5mC cluster-specific DMGs were shown in Table S3L. Among these enriched pathways, 11 GO pathways and 1 KEGG pathway were immune-related (Fig. S13B-C), which suggested that 5mC cluster 2 may be an immune infiltrated phenotype.
Robertson et al. identified several methylation subtypes based on the BLCA-specific hypermethylated or hypomethylated probes [8]. Though these methylation subtypes were related to different clinicopathological features, there was no significant difference in prognosis between these subtypes. Here, we identified 592 BLCA-specific hypermethylated probes and 465 BLCA-specific hypomethylated probes using more stringent criteria (see “Methods” part) (Table S3M-N). We then performed unsupervised clustering based on both BLCA-specific hypermethylated and hypomethylated probes. Unfortunately, the BLCA-specific DMPs clusters were not related to prognosis (Fig. S14). Moreover, there was no association between the BLCA-specific DMPs clusters and the 5mC clusters (Fig. S14). Finally, we performed unsupervised clustering based on 5mC cluster-specific DMPs (Fig. S13D-I). Interestingly, the binary 5mC-specific DMPs clusters were related to prognosis (Fig. S13D). Meanwhile, there was a closed matching relation between the 5mC-specific DMPs clusters and the 5mC clusters (Fig. S13E). 5mC-specific DMPs cluster 1 indicated 5mC cluster 2.
5mC clusters (5mC score) predicted immune phenotypes and clinical response of ICB in BLCA
A majority of immunomodulators were downregulated in 5mC cluster 1 (Fig. S15A). Because the activities of cancer immunity cycles are directly determined by the comprehensive performance of immunomodulators, the activities of most cancer immunity cycles were downregulated in 5mC cluster 1, such as the release of cancer cell antigens (Step 1), trafficking of immune cells to tumors (Step 4) (CD8 T cell recruitment, CD4 T cell recruitment, macrophage recruitment, Th1 cell recruitment, NK cell recruitment, DC recruitment), and killing of cancer cells (Step 7) (Figs. 3A, S15C). Consequently, the downregulated activities of these cycles resulted in decreased infiltration levels of corresponding TIICs (including CD8 T cells, CD4 T cells, NK cells, Th1 cells, macrophages, and DCs) in the BLCA TME (Fig. 3B, C, S15D-E). These findings suggested that 5mC cluster 1 may be a noninflamed phenotype. We further analyzed the correlations between 5mC clusters and ICB response predictors. First, most of the immune checkpoints, such as PD-L1, PD-1, and CTLA-4, were downregulated in 5mC cluster 1 (Fig. 4A). Second, the enrichment scores of positive ICB response-related signatures and the TIS were significantly lower in 5mC cluster 1 than in 5mC cluster 2 (Fig. 4B, C). Therefore, 5mC cluster 1 may not be sensitive to ICB.
The 5mC score was negatively related to anticancer immunity in the BLCA TME. Most immunomodulators were downregulated in the high 5mC score group (Fig. S15B). Consistently, the 5mC score negatively correlated with the activities of most cancer immunity cycles (Fig. 3D, Table S4A). As a result, the 5mC score negatively correlated with many anticancer TIICs (including CD8 T cells, CD4 T cells, NK cells, Th1 cells, macrophages, and DCs) and their effector genes, which were cross validated in seven independent algorithms (Fig. 3D, E, Table S4B-C). Furthermore, there were significant adverse correlations between the 5mC score and TIS, enrichment scores of positive ICB response-related signatures, and immune checkpoints (Fig. 4D, E, Table S5).
An inflamed TME was infiltrated by more immune cells and stromal cells. Consistently, the enrichment scores of four stromal signatures, including EMT1, EMT2, EMT3, and F-TBRS, were significantly higher in the 5mC cluster 2 (low 5mC score group) (Fig. S16A-B). In addition, the enrichment score of proliferation was also higher in the low 5mC score group (Fig. S16C).
In summary, 5mC cluster 1 and a high 5mC score predicted a noninflamed phenotype and lower ICB response in BLCA, which was successfully confirmed in several external cohorts (Figs. S17, S18). Moreover, the incidence of ICB-associated hyperprogression may be higher in the high 5mC score group. The mRNA expression and copy number amplification rates of genes positively correlated with ICB-associated hyperprogression, including MDM2, MDM4, DNMT3A, CCND1, FGF3, FGF4, and FGF19, were significantly higher in the high 5mC score group (Fig. 4F, G). In contrast, genes negatively correlated with hyperprogression, such as CDKN2A and CDKN2B, were significantly downregulated in the high 5mC score group.
A distinct gene fusion patterns and regulon expression profiles between 5mC clusters
In TCGA-BLCA cohort, the most common gene fusions included 10 FGFR3-TACC3 fusions, 9 ITGB6-LOC100505984 fusions, 5 AFF1-PTPN13 fusions, 4 PPARG-SYN2 fusions, 4 GPR110-TNFRSF21 fusions, and 4 TSEN2-PPARG fusions (Fig. S16D). Notably, the FGFR3-TACC3 fusions and AFF1-PTPN13 fusions mainly occurred in the high score group, while ITGB6-LOC100505984 fusions mainly occurred in the low score group. In addition, 7 of 8 PPARG associated fusions occurred in the high score group. In addition, we observed a distinct regulon expression pattern across 5mC clusters. As for 11 luminal subtype-specific regulons, such as RARG, FGFR3, and ERBB2, they were highly expressed in the high 5mC score group (Fig. S16E). This result was similar to the hypothesis that GATA3, FOXA1, and PPARG lead to luminal cell biology for BLCA [64]. In contrast, the expression of 12 basal subtype-specific regulons was significantly higher in the low 5mC score group. Collectively, the distinct gene fusion patterns and regulon expression profiles between 5mC clusters may drive the differences in biological phenotypes between 5mC clusters.
Validating the role of the 5mC score in stratifying immune phenotypes and clinical response to ICB in a BLCA immunotherapy cohort (IMvigor210)
In the IMvigor210 cohort, patients with higher 5mC scores had better prognoses (Fig. S19A). Patients were divided into several subgroups based on PD-L1 expression on immune cells (IC0, IC1, and IC2+ subgroups) or tumor cells (TC0, TC1, and TC2+ subgroups) and the infiltration status of CD8 T cells in the TME (deserted, excluded, and inflamed subgroups) [54]. Obviously, the 5mC score was the highest in the IC0 (immune cells with the lowest PD-L1 expression) and TC0 (tumor cells with the lowest PD-L1 expression) subgroups and deserted phenotypes (Fig. S19B-D). Additionally, the 5mC score was negatively related to TIS and most of the immune checkpoints, such as PD-L1, PD-1, CTLA-4, and TIM-3 (Fig. S19E-F). Meanwhile, the effector genes of several anticancer TIICs were significantly downregulated in the high 5mC score group (Fig. S19G). These results confirmed that the high 5mC score group represented a noninflamed phenotype.
Next, we analyzed the correlations between the 5mC score and ICB response in three different immune phenotype subgroups. As expected, in the deserted phenotype subgroup, the ICB response rate in the high 5mC score group was significantly lower than that in the low 5mC score group (Fig. S19H). This result indicated that the high 5mC score group represented a noninflamed phenotype. Naturally, the prognosis of patients in the high 5mC score group was poorer due to a lower ICB response rate (Fig. S19I). Interestingly, we observed opposite results in the excluded and inflamed phenotype subgroups. In these two subgroups, the ICB response rates in the high 5mC score group were higher than those in the low 5mC score group (Fig. S19J, L). Certainly, the prognosis of patients in the high 5mC score group in these subgroups was better due to higher ICB response rates (Fig. S19K, M). Such opposite results could be explained by the comprehensive cross-talk between the 5mC score and other ICB response determinants, such as the panfibroblast TGFβ response signature (F-TBRS). F-TBRS attenuated the clinical response to PD-L1 blockade by contributing to T cell exclusion in BLCA [54]. Previous results from the IMvigor210 cohort indicated that the enrichment score of F-TBRS was the lowest in the deserted phenotype subgroup compared with that in the excluded or inflamed phenotype subgroup. In our study, the 5mC score was the highest in the deserted phenotype subgroup (Fig. S19D). Therefore, the ICB response in the deserted phenotype subgroup may be mainly determined by the 5mC score rather than F-TBRS. Conversely, the 5mC score was obviously lower in the excluded and inflamed phenotype subgroups, but the enrichment score of F-TBRS was significantly higher. Thus, the ICB response in these two subgroups may be determined by other factors, such as F-TBRS, instead of the 5mC score. Of course, further research is needed to demonstrate the importance of interactions between the 5mC score and F-TBRS in determining the clinical response to ICB.
Validating the roles of the 5mC score in the Xiangya cohort
In our own cohort (Xiangya cohort), we found that the 5mC score could accurately predict classical molecular subtypes (Fig. 5A). The AUC ranged from 0.99 to 1, except in the Baylor subtype system (AUC = 0.9) (Fig. 5B). In addition, the 5mC score was negatively correlated with the activities of many anticancer immunity steps (Fig. 5C, Table S6A). Subsequently, the 5mC score was also negatively related to the infiltration levels of CD8 T cells, NK cells, Th1 cells, DCs, and macrophages in seven independent algorithms (Fig. 5F, Table S6B). Meanwhile, there were significantly negative correlations between the 5mC score and immune checkpoints, TIS, and enrichment scores of positive ICB response-related signatures (Fig. 5D, E, G, Table S6C). These data demonstrated that the 5mC score could effectively stratify the immune phenotypes of BLCA. In addition, the 5mC score was able to predict the clinical response to other treatments, including EGFR targeted therapy, radiotherapy, and several therapies targeting immune-inhibited oncogenic pathways (Fig. 5H).
Pancancer analyses of the 5mC score
We further evaluated the role of the 5mC score across cancers. Notably, the 5mC score was related to prognosis in many cancers, such as thymoma, lower-grade glioma, and kidney renal clear cell carcinoma (Fig. S20A, Table S7B). In addition, the 5mC score was negatively correlated with the expression of four critical immune checkpoints, PD-L1, PD-1, CTLA-4, and LAG-3, in most cancers (Fig. S20B-E, Table S7C-F). Aberrant DNA methylation may influence cancer immunogenicity, such as TMB and MSI [65]. Here, we revealed that the 5mC score was related to the TMB and MSI in many cancers (Fig. S20F-G, Table S7G-H). Moreover, the 5mC score was significantly related to the stemness indices of many cancers, such as testicular germ cell tumors and lung squamous cell carcinoma (Fig. S21, Table S7I-N). Therefore, the 5mC score reflected many biological characteristics of the TME, such as anticancer immunity, immunogenicity, and cancer stemness, in pancancer analyses. It may be a generalizable predictor of prognosis and ICB response across cancers.
The 5mC score was a valuable predictor of the response to immunotherapy in multiple immunotherapy cohorts
Here, we explored the role of the 5mC score in predicting the ICB response in other cancers (including melanoma, non-small cell lung cancer, and gastric cancer) from nine immunotherapy-related cohorts (eight ICB cohorts and one adoptive T cell therapy cohort). First, we found that the 5mC score was negatively correlated with most immune checkpoints in eight ICB cohorts (Figs. S22, S23, S24A). In line with this, the ICB response rates were obviously lower in the high 5mC score group than in the low 5mC score group (Figs. 6A–G, S24B). The prognosis of the high 5mC score group was also poorer due to lower ICB response rates (Fig. 6A–G). Similar results were observed in the adoptive T cell therapy cohort (Fig. 6H). This evidence reconfirmed that the 5mC score was a valuable predictor of immunotherapy response across cancers.