Common DNA methylation changes in biliary tract cancers identify subtypes with different immune characteristics and clinical outcomes

Background DNA methylation-associated studies on biliary tract cancer (BTC), including cholangiocarcinoma (CCA) and gallbladder cancer (GBC), may improve the BTC classification scheme. We proposed to identify the shared methylation changes of BTCs and investigate their associations with genomic aberrations, immune characteristics, and survival outcomes. Methods Multi-dimensional data concerning mutation, DNA methylation, immune-related features, and clinical data of 57 CCAs and 48 GBCs from Eastern Hepatobiliary Surgery Hospital (EHSH) and 36 CCAs in the TCGA-CHOL cohort were analyzed. Results In our cohort including 24 intrahepatic CCAs (iCCAs), 20 perihilar CCAs (pCCAs), 13 distal CCAs (dCCAs), and 48 GBCs, 3369 common differentially methylated regions (DMRs) were identified by comparing tumor and non-tumor samples. A lower level of methylation changes of these common DMRs was associated with fewer copy number variations, fewer mutational burden, and remarkably longer overall survival (OS, hazard ratio [HR] = 0.07, 95% confidence interval [CI] 0.01–0.65, P = 0.017). Additionally, a 12-marker model was developed and validated for prognostication after curative surgery (HR = 0.21, 95% CI 0.10–0.43, P < 0.001), which exhibited undifferentiated prognostic effects in subgroups defined by anatomic location (iCCAs, d/pCCAs, GBCs), TNM stage, and tumor purity. Its prognostic utility remained significant in multivariable analysis (HR = 0.26, 95% CI 0.11–0.59, P = 0.001). Moreover, the BTCs with minimal methylation changes exhibited higher immune-related signatures, infiltration of CD8+ lymphocytes, and programmed death-ligand 1 (PD-L1) expression, indicating an inflamed tumor immune microenvironment (TIME) with PD-L1 expression elicited by immune attack, potentially suggesting better immunotherapy efficacy. Conclusions In BTCs, DNA methylation is a powerful tool for molecular classification, serving as a robust indicator of genomic aberrations, survival outcomes, and tumor immune microenvironment. Our integrative analysis provides insights into the prognostication after curative surgery and patient selection for immunotherapy. Supplementary Information The online version contains supplementary material available at 10.1186/s12916-021-02197-w.

Conclusions: In BTCs, DNA methylation is a powerful tool for molecular classification, serving as a robust indicator of genomic aberrations, survival outcomes, and tumor immune microenvironment. Our integrative analysis provides insights into the prognostication after curative surgery and patient selection for immunotherapy.
Keywords: Biliary tract cancer, DNA methylation, Prognostication, Immune characteristic Background Biliary tract cancers (BTCs), including cholangiocarcinoma (CCA) and gallbladder cancer (GBC), are rare but aggressive [1]. Multi-omics analysis may improve the previous classification framework based on anatomic location and pathological features and provide insight into the mechanism of tumorigenesis and potential targets for precision medicine.
BTCs at different anatomic locations may display similar genomic/epigenomic features. Previous comparative exome sequencing studies have found commonalities among BTCs in different locations, such as TP53, KRAS, KMT2C, and SMAD4 mutations, with a second tier of less frequently mutated genes including ARID1A, CDKN2A, and PIK3CA [2][3][4][5][6][7][8][9][10]. Mutational differences between CCAs and GBCs have tended to be in the frequency of mutations in certain genes, rather than different pathways of genes being mutated [2,5]. As for DNA methylation, the DNA methylation pattern of CCA and its association with prognosis have been reported in several studies [8][9][10][11]. As for GBC, the gradual methylation changes in the sequence of gallstone disease, dysplasia, and gallbladder cancer have been described [12], indicating the importance of DNA methylation in GBC carcinogenesis. There are several studies illustrating the associations of survival with the methylation of specific genes in GBCs, e.g., WIF1 and WISPIN [13][14][15], while the prognostic effect of DNA methylation changes has not been not fully addressed. In addition, no comparative study to date has identified the similarities among BTC methylomes and further explores their associations with prognosis and potential benefit from precision medicine.
Here, we identified 3369 common differentially methylated regions (DMRs) between CCAs and GBCs and uncovered their associations with genomic aberration, survival outcome, and immune characteristics. Of note, the BTCs with minimal methylation changes exhibited an inflamed tumor immune microenvironment (TIME) with programmed death-ligand 1 (PD-L1) expression elicited by immune attack, potentially suggesting better immunotherapy efficacy.

Patients
The included samples of the Eastern Hepatobiliary Surgery Hospital (EHSH) cohort consist of three parts: cancerous, adjacent, and precancerous tissues. The 105 tumor samples (24 intrahepatic CCAs [iCCAs], 20 perihilar CCAs [pCCAs], 13 distal CCAs, and 48 GBCs) and 50 adjacent non-tumor samples (gallbladder [n = 28] and bile duct [n = 22]) were obtained immediately following surgery at EHSH from October 2017 to September 2019. The diagnosis and tumor purity were confirmed by two independent pathologists based on the resected samples. Characteristics of the cancerous and adjacent normal tissues are shown in Additional file 1: Table S1-2, respectively. All participants had not received anti-tumor treatment before surgery. Participants with carcinoma or benign disease of other organs were excluded from this study. In addition, eight samples of precancerous disease (e.g., gallbladder polyps) were collected for exploratory analysis (Additional file 1: Table S3).
The Cancer Genome Atlas-cholangiocarcinoma (TCGA-CHOL) cohort of 36 patients with epigenomic, transcriptomic, mutational, immune-related, and survival (OS and progression-free interval [PFI]) data were analyzed in the present study [10,16]. The immune characteristics of this cohort (e.g., signatures and immune cell sorting) were retrieved from the study of Thorsson et al. [16].
Patients or the public were not involved in the design, conduct, reporting, or dissemination plans of our research. All collection and usage of human samples and clinical data were in accordance with the principles of the Declaration of Helsinki and approved by the Ethics Committee of Eastern Hepatobiliary Surgery Hospital (EHBHKY2018-02-014). The written consents were received from all the participated patients. This report follows the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guideline.

DNA methylation profiling and identification of differentially methylated region (DMR)
All sequencing experiments were implemented in a College of American Pathologists (CAP)-and Clinical Laboratory Improvement Amendments (CLIA)-certified laboratory (Burning Rock Biotech, Guangzhou, China) before May 2020. The procedure for DNA extraction was as previously described [17]. In brief, DNA was extracted with a QIAamp DNA formalin-fixed paraffinembedded (FFPE) tissue kit according to the manufacturer's instructions. DNA concentration was measured by the Qubit double-stranded DNA assay (Life Technologies, Carlsbad, CA, USA).
As for methylation sequencing, a capture-based method, SeqCap Epi CpGiant Probes (Roche Sequencing Solutions, Madison, WI, USA) was performed to detect > 5.5 million CpG sites (capture size, 80.5 Mb) [18]. We generated a bisulfite sequencing library with the brELSA™ method (Burning Rock Biotech, Guangzhou, China) [19]. The target libraries were quantified by realtime PCR and sequenced on NovaSeq 6000 with 50× target depth on average.
Since differentially methylated regions consisting of multiple CpG sites played more important roles than a single CpG site in cancer detection as reported [20], we defined 319,133 methylation regions of CpG sites with close genomic distance and highly correlation in methylation level [19], which were analyzed in the present study.

NGS testing and analysis of mutation and copy number variation
As for mutation sequencing, a capture-based targeted deep sequencing was performed using a 520-gene panel, spanning 1.64 Mb of the human genome (included genes are shown in Additional file 1: Table S4). Detailed descriptions of sequencing and capturing single nucleotide variant and copy number variation are shown in Additional file 2: Method S1.
Mutated genes included in our analysis were restricted to non-silent mutations consisting of non-sense mutation, missense mutation, frameshift mutation, inframe mutation, splice site mutation, translation start site mutation, and non-stop mutation. Truncating mutations of oncogene were excluded because most of these are passenger mutations with limited cancer-promoting function.
The signaling pathways and their members we analyzed are shown in Additional file 1: Table S5. The definition of pathways drew upon previous genomic studies [3,5].

Assessment of programmed death-ligand 1 (PD-L1) protein expression
For each tumor FFPE block, a 5-μm section was cut and stained with the Dako 22C3 mouse monoclonal antibody with Dako Autostainer Link-48 platform according to the manufacturer's instructions. Cores showing a neoplastic component ≥ 30% were considered as adequate. PD-L1 positivity was determined by any expression in tumor cells or immune cells evaluated independently by two pathologists.

Statistical analysis
To assess the between-group difference, we performed the (i) Fisher exact test, chi-square test, and Cochran-Armitage test for trend for categorical variables; (ii) Mann-Whitney test for continuous variables; and (iii) Kaplan-Meier (KM) method, log-rank method, and Cox regression (hazard ratio [HR] and 95% confidence interval [CI]) for survival variables (OS and PFI). The covariates with P value below 0.05 in the univariable analysis were included in the following multivariable model. The differentiated methylation regions (DMRs) between tumors and adjacent tissues were selected by the Mann-Whitney test for further clustering. Detailed description of single sample gene set enrichment analysis is shown in Additional file 2: Method S2.
The non-supervised clustering of methylation data in both the EHSH and TCGA cohorts was performed by the K-means method (R package, ConsensusCluster-Plus). Gene Ontology (GO) enrichment analysis was performed by R (R package, clusterProfiler) [21][22][23]. The 12-DMR prognostic model was built via least absolute shrinkage and selection operator (LASSO).
All statistical analyses mentioned above were performed using IBM SPSS Statistics 22 and R 3.4.2, and the graphs were drawn by GraphPad Prism 9 and R 3.4.2. The nominal level of significance was set as 5%, and all 95% CIs were 2-sided.

Identification of common DMRs among BTCs
By comparing the methylomes of cancerous and normal adjacent tissues separately in CCAs (57 treatment-naïve CCAs vs. 22 adjacent bile ducts) and GBCs (48 treatment-naïve GBCs vs. 28 adjacent gallbladders), 5279 significant DMRs with β value difference above 0.15 in both CCAs and GBCs were identified (Fig. 1A). Moreover, we assessed the consistency of these 5279 DMRs among iCCAs, pCCAs, and dCCAs, and 3369 DMRs had minimal β value differences of 0.15 in all three subgroups (Fig. 1A). Of these, 835 DMRs showed lower β values in tumor samples compared to adjacent nontumor samples (referred to as hypomethylation DMRs), and the other 2534 DMRs exhibiting higher β values were referred to as hypermethylation DMRs. All of the 3369 DMRs showed great significance by comparing BTCs and adjacent non-tumor samples (maximal P value = 5.66 × 10 −8 , FDR P value< 0.05, Fig. 1B). In addition, we found significant positive correlations of the β value differences of the 835 hypomethylation DMRs among iCCAs, pCCAs, dCCAs, and GBCs (Fig. 1C). Similarly, significant positive correlations were found in the 2534 hypermethylation DMRs (Fig. 1D). These correlations further indicate the consistency of these DMRs among the BTCs at different anatomic locations.
The hypermethylation DMRs were enriched in the regions of CpG islands, promoters, and exons of coding sequence, but not CpG shores, CpG shelves, open sea, introns, and intergenic regions ( Fig. 1E and Additional file 1: Table S6). The top 50 hypermethylation and hypomethylation DMRs are displayed in Additional file 1: Table S7 and S8, respectively. GO analysis revealed that hypomethylation mainly affected the genes related to Rho GTPase binding, transmembrane transporter activity, and lyase activity (Fig. 1F), and hypermethylation targeted DNA binding and transcription (Fig. 1G).

Prognostic correlates of DNA methylation-based clusters
Using the top 1000 most variable DMRs of the 3369 DMRs identified above, 163 tissues (105 treatment-naïve BTC samples, 50 adjacent non-precancerous bile duct or gallbladder tissues, and 8 precancerous lesions) were Fig. 1 Identification of the common DMRs among BTCs in the EHSH cohort. A Diagram of the identification process. B Volcano plot illustrating the P value and β value difference of the 3369 DMRs. C The β value difference of the hypomethylation DMRs and their inner-correlations among iCCAs, pCCAs, dCCAs, and GBCs. D The β value difference of the hypermethylation DMRs and their inner-correlations among iCCAs, pCCAs, dCCAs, and GBCs. E Sankey plot of the hypomethylation and the hypermethylation DMRs. F, G Gene Ontology enrichment analyses of the hypomethylation-(F) and the hypermethylation-associated genes (G). BTC, biliary tract cancer; i/p/dCCA, intrahepatic/perihilar/distal cholangiocarcinoma; DMR, differentially methylated region; GBC, gallbladder cancer; GO, Gene Ontology classified into 6 groups by non-supervised consensus clustering (Fig. S1). Baseline characteristics of the abovementioned samples and the comparison among iCCAs, d/pCCAs, and GBCs are shown in Additional file 1: Table S1-3. As delineated in the heatmap ( Fig. 2A), ranging from the methylation cluster 1 to cluster 6, the degree of global methylation changes gradually decreased, and the proportion of non-precancerous adjacent tissues gradually increased. Despite that the GBCs in our cohort exhibited poorer histological grade and later TNM stage compared to iCCAs and d/pCCAs (Additional file 1: Table S1), similar proportions of GBCs, dCCAs, pCCAs, small-duct iCCAs, and large-duct iCCAs were observed in different methylation clusters (P = 0.72), indicating that the methylation clusters based on the 3369 common DMRs were independent of anatomic sites and potentially reflected a shared characteristic among the cancers along biliary tract.
As non-precancerous adjacent tissues were enriched in the methylation clusters 4-6 with minimal methylation changes ( Fig. 2A), we speculated that the BTCs in clusters 4-6 might exhibit lower invasiveness, and therefore, clusters 4-6 might be associated with longer survival. Among the 80 BTC patients with OS data (characteristics are shown in Additional file 1: Table S1), clusters 4-6 showed better prognosis compared to clusters 1-3 (HR = 0.07, 95% CI 0.01-0.65, P = 0.017, Fig. 2B). Similar trends were observed in the subgroups classified by anatomic location, i.e., iCCAs, d/pCCAs, and GBCs (  2E). We hereby defined the cluster-based risk according to our methylation clusters (high cluster-based risk, clusters 1-3; low cluster-based risk, clusters 4-6). We further use the top 500 and 250 most variable DMRs to perform non-supervised clustering. Compared to the one using top 1000 DMRs as described above, 95.7% of the samples had the same results of the cluster-based risk (Additional file 1: Table S9), indicating the robustness of the classification by common methylation DMRs among BTCs.
Tumor purity (also termed as tumor cellularity) might somewhat influence the sequencing data of methylome and therefore the clustering outcome. Similar to the result of Goeppert et al.'s study in iCCAs, we observed slightly lower tumor cellularity in the samples with lower methylation changes (average, 40.8% in clusters 4-6 compared to 64.1%, 56.5%, and 47.7% in clusters 1, 2, and 3, respectively, Fig. 2F). Lower tumor purity might indicate a higher proportion of stromal area and/or higher density of tumor-infiltrating immune cells, which was partially suggested by the lower CD8A methylation in clusters 4-6 which might reflect higher infiltration of CD8 + T cells. Despite the slight difference of tumor purity in different clusters, tumor purity was not linked with OS (continuous variable: HR = 1.00, 95% CI 0.99-1.02, P = 0.933; categorical variable [≥ 50% vs. < 50%]: HR = 1.00, 95% 0.99-1.02, P = 0.672). The cluster-based risk showed an undifferentiated prognostic effect in the patients with higher tumor purity (≥ 50%, P = 0.077) and lower tumor purity (< 50%, P = 0.022, Fig. 2G). These results rule out the influence of tumor purity on the prognostic effect of the cluster-based risk.
To investigate whether the prognostic effect of the cluster-based risk is independent of other variables, we firstly analyzed the association between clusters and key clinicopathological variables, including sex, age, smoking history, anatomic location, TNM stage, histological grade, resection margin, carcinoembryonic antigen (CEA), carbohydrate antigen 19-9 (CA19-9), alphafetoprotein (AFP), and multiple immunohistochemical staining markers (e.g., Hep-1, Mucin 1 [MUC-1], P63, and S100). No significant association was revealed except the lower frequency of CEA abnormality in clusters 4-6 (P = 0.007, Additional file 1: Table S10). Moreover, univariable analyses discovered that anatomic location, TNM stage, histological grade, resection margin, and CEA abnormality were significantly associated with OS in addition to the cluster-based risk (Table 1). In the multivariable model, OS was significantly associated with the cluster-based risk (HR = 0.13, 95% CI 0.02-0.96, P = 0.045), rather than anatomic site (Table 1). These results suggest that the methylation-based clusters might be an independent biomarker predicting prognosis in BTCs.

Prognostic correlates of the methylation level of individual DMRs.
To specifically discover the prognosis-related DMRs, 80 BTC patients with OS data were randomly assigned to the training and validation sets with a 2:1 ratio (Fig. 3A), and univariable analysis revealed 54 prognosis-related DMRs in the training set. Based on these 54 DMRs, LASSO regression was performed and constructed a LASSO score based on 12 DMRs (optimal lambda selection and LASSO coefficient profiles are shown in Fig. S2 and Additional file 1: Table S11, respectively). Using this score and the cutoff (median value) derived from the training set, we observed a consistent prognostic effect in both the training set (P < 0.001, Fig. 3B) and the validation set (P = 0.047, Fig. 3C). In the total 80 BTC patients (methylation data are shown in Fig. 3D), a lower LASSO-based risk was associated with better OS (HR = 0.21, 95% CI 0.10-0.43, P < 0.001, Table 1). Combining the LASSO-based risk and the cluster-based risk could differ the BTC patients into subgroups with distinct survivals (P < 0.001, Fig. 3E). The prognostic effect of the LASSO-based risk was undifferentiated in the subgroups classified by anatomic location (GBC: P = 0.009; CCA: P = 0.041, Fig. 3F), TNM stage (stages I-II: P = 0.070;  (Table 1). These results indicate the good robustness of LASSO-based risk that may be effective regardless of other key variables including TNM stage and anatomic location.

Genomic correlates of DNA methylation-based clusters
Of the 105 tumor samples, 99 received targeted deep sequencing of mutational events of the coding sequence and splice sites, large genomic rearrangement, copy number variation (CNV), fusion, and microsatellite (Fig.  4A). Tumor mutational burden (TMB) was slightly higher in cluster 1 (Fig. 4B), and CNV events were enriched in clusters 1-2 (Fig. 4C), indicating the association between methylation changes and chromatin instability.

Immune correlates of DNA methylation-based clusters
Due to the limited publicly available datasets of GBCs with all epigenomic, transcriptomic, and survival data, we sought to firstly investigate the association between methylation and immune characteristics in a cholangiocarcinoma dataset from TCGA (TCGA-CHOL). Thirtysix cancerous tissues and nine matched normal tissues were classified into six groups via non-supervised consensus clustering (Fig. S4). Similar to the result in the EHSH cohort, gradual changes of methylation were observed ranging from cluster 1 to cluster 6 (Fig. 5A). All normal samples were in cluster 6. Clusters 1-2 and clusters 3-5 were defined as the methyl-risk high and the methyl-risk low groups, respectively. Compared to the   Table  S13-14). Fraction altered (percentage of copy number altered chromosome regions out of measured regions) was higher in the methyl-risk high group, indicating chromatin instability (Fig. 5D). Thorsson et al. presented immunogenomics analyses of more than 10,000 tumors in TCGA, identifying 6 immune subtypes (C1: wound healing, C2: IFN-γ dominant, C3: inflammatory, C4: lymphocyte depleted, C5: immunologically quiet, C6: TGF-β dominant) that encompass 33 cancer types based on 6 key signatures [16]. Higher frequencies of C1 and C2 and lower frequencies of C4 and C6 were observed in the methyl-risk low group (P = 0.045, Fig. 5E), and a lower methyl-risk was associated with higher scores of macrophage and lymphocyte signatures (Fig. 5F). As described by Thorsson et al., both C1 and C2 subtypes had low Th1/Th2 ratio, high proliferation rate, and high intratumoral heterogeneity, and C2 had the highest M1/M2 macrophage polarization, a strong CD8 signal, and the greatest TCR diversity. On the contrary, C4 displayed a more prominent macrophage signature with suppressed Th1 and high M2 response, and C6 had the highest TGF-β signature, demonstrating their immunosuppressive tumor microenvironments [16].
We further assess the association between the clusters and immune features. The samples with a lower methylrisk exhibited higher (i) richness and Shannon index of B/T cell receptor (Fig. 5G); (ii) expression of CD274 (PD-L1) and cytotoxic T-lymphocyte antigen 4 (CTLA-4, Fig. 5H); (iii) fraction of infiltrated leukocytes (Fig. 5I), including naïve B cell, plasma cell, monocyte, macrophage, CD8 + T cell, follicular helper T cell, and regulatory T cell (Fig. 5J); and (iv) multiple immune-related and angiogenesis signatures (Fig. 5K, L). Previous studies have demonstrated the pivotal role of vascular endothelial growth factor (VEGF) in interfering (i) the migration of antigen-specific T cells from the vessel into tumor and (ii) the recognition of cancer cells by cytotoxic T cells [26][27][28]. Generally, the angiogenesis score is negatively correlated with immune infiltration; however, we observed an overlap of greater infiltration of CD8 + T cell and higher angiogenesis signature in the methyl-risk low group (Fig. S5). We further calculated the signatures concerning the comparisons between naïve, effector, and exhausted CD8 + T cells (Fig. S6) and observed that the scores of (i) naïve vs. effector (P = 0.089) and (2) naïve vs. exhausted (P = 0.089) trended higher in the methylrisk high subgroup, and the effector vs. exhausted score was similar in the methyl-risk high and the methyl-risk low subgroups (P = 0.58). These results indicate that although the samples with lower methyl-risk had more effector and exhausted CD8 + T cells compared to naïve CD8 + T cells, the quantities of effector and exhausted CD8 + T cells were comparable, reflecting an adaptive resistance to immune attack, which might be partially due to the higher level of angiogenesis and potentially benefit from the regimens inhibiting VEGF and immune checkpoints.
Given the above findings from the TCGA-CHOL cohort, we sought to partially verify them in the EHSH cohort in terms of PD-L1 protein expression and tumorinfiltrating CD8 + T cells. A lower methyl-risk was associated with PD-L1 positivity (P = 0.040, Fig. 6A). The quantity of tumor-infiltrating CD8 + T cells could be reflected by CD8A methylation, as significant correlations were revealed between higher CD8A methylation, lower CD8A mRNA expression, and fewer tumorinfiltrating CD8 + T cells in the TCGA-CHOL cohort (Fig. 6B). In the EHSH cohort, we observed higher methylation levels of the five DMRs related to the CD8A gene in clusters 1/2/3 (P < 0.001, Fig. 6C), indicating poorer infiltration of CD8 + T cells in the clusters with larger methylation changes. Similar trends of the above results were shown in both the CCAs and the GBCs in the EHSH cohort, indicating the associations of methylation changes with PD-L1 expression and tumorinfiltrating CD8 + T cells might be a common feature in all BTCs. Given all the above immune-related results from two independent cohorts, fewer methylation changes might potentially be a novel predictor of better response to immunotherapy.
Distinguished by the density of tumor-infiltrating lymphocytes (TILs), there are two causes of PD-L1 expression: (i) prior attack by immunity (adaptive resistance) and (ii) activation of the oncogenic pathway (intrinsic induction) [29] respectively predicting better and worse benefit from immune checkpoint inhibitors (ICIs) in non-small cell lung cancer [30]. In our cohort, all PD-L1 + samples in clusters 1/2 had a high level of CD8A methylation (above median), and all PD-L1 + samples in clusters 3-6 had a low level of CD8A methylation (below median, P < 0.001, Fig. 6D). In addition, a large difference of OS with borderline significance was uncovered between the PD-L1 + /CD8A methyl-high and the PD-L1 + / Getting back to the prognostic effect of methylation changes, in order to determine whether the effect of methylation on OS is mediated by immune characteristics, we performed multivariable analysis to adjust for CD8A methylation and PD-L1 positivity. The association of the cluster-based risk and the LASSO-based risk with OS remained significant (Fig. 6G), indicating that the prognostic effect of DNA methylation might be partially but not completely on account of the infiltration of CD8 + T cells and the immune escape via PD-L1.

Discussion
The common DNA methylation changes among BTCs may reflect their similar oncogenic mechanisms. Using these common DMRs, BTCs could be stratified into subgroups with distinct genomic aberrations, immune-related features, and survival outcomes (summarized in Fig. 6H).
Previous studies have identified the critical methylation changes during the tumorigenesis and development of CCAs and GBCs, respectively. However, no comparative study to date has identified the shared methylation features among BTCs. We identified 3369 common DMRs, and GO analysis revealed that hypomethylation mainly affected the genes related to Rho GTPase binding, transmembrane transporter activity and lyase activity, and hypermethylation targeted DNA binding and transcription. The mutations of IDH1/2 in our EHSH cohort were rare, and none of the two mutations (IDH1: p. V152L and p. K4N) was at the hotspot (codon 132), which made us impossible to assess the contribution of IDH1/2 to epigenetic regulation. It is presumable that by using these common DMRs in the following analyses, we may arrive at the results generally available for all BTCs, but not limited to a specific BTC subtype at a certain anatomic location.
As for prognostication, a higher level of DNA methylation changes was independently associated with a poorer prognosis in the present study. Two previous studies mainly focusing on iCCAs reached similar results [8,9], and we further expand its application to d/pCCAs and GBCs by using the common 3369 DMRs. Furthermore, we identified 12 significant DMRs associated with OS by LASSO regression, and the LASSO score effectively predicted OS in both training and validation sets and was identified as an independent prognostic risk factor. Compared to other prognostic risk factors, the discrimination potentials of the cluster-based risk and the LASSO-based risk were found to be superior. Of note, abundant studies have pointed out the different prognoses in GBCs, d/pCCAs, and iCCAs, which we also observed in the univariable analysis (P < 0.001). However, the prognostic effect of anatomic location was remarkably decreased after adjusting for TNM stage, resection margin, histological grade, CEA abnormality, and the cluster-based risk or the LASSO-based risk based on methylation changes (P = 0.70 and 0.98, respectively, Table 1). These results indicate that the different survivals in GBCs, d/pCCAs, and iCCAs may be largely on account of other key parameters. Moreover, the significant prognostic effects of the cluster-based risk and the LASSO-based risk in the multivariable models suggest their robustness. Combining methylation biomarkers (See figure on previous page.) Fig. 4 Association between methylation cluster and mutational events in the EHSH cohort. A Clinicopathological and mutational data of the 99 BTC samples. B, C Associations of the methylation-based clusters with TMB (B) and CNV (C). D Association between anatomic location and mutational rate. E Associations of mutation with the methylation-based clusters and overall survival. BTC, biliary tract cancer; CCA, cholangiocarcinoma; CNV, copy number variation; DMR, differentially methylated region; GBC, gallbladder cancer; LASSO, least absolute shrinkage and selection operator; TMB, tumor mutational burden with clinical features may further improve the prognostic estimation, which helps identify patients who would need more aggressive treatment and surveillance. However, our study was limited by sample size (n = 105), and further investigations to adequately assess the reliability of methylation biomarkers in clinical decision-making for BTC patients are still needed.
Of note, tumor cellularity, a confounding feature in DNA methylome studies in general, is detrimental in the study of BTCs which display desmoplasia [31]. We observed lower tumor purity in clusters 4-6 with minimal methylation changes compared to clusters 1-3 with larger methylation changes (37.5% vs. 60.0%, difference = − 22.5%) in our EHSH cohort. However, in the TCGA cohort, we also observed a higher leukocyte fraction in the clusters with minimal methylation changes to a similar extent (31.0% vs. 9.3%, difference = 21.7%). These results indicate that the difference in the tumor cellularity across different clusters may be on account of immune cell infiltration instead of desmoplasia. In addition, it is the cluster-and LASSO-based risks rather than tumor purity that were associated with OS in our analysis, and the subgroup analyses based on tumor purity and the multivariable analyses both support that the prognostic utilities of the cluster-and LASSO-based risks are not influenced by tumor purity.
In terms of precision medicine, the clusters with minimal methylation changes exhibited fewer opportunities for targeted therapy (IDH1/2 mutation 0%, FGFR2 fusion 0%, BRCA1/2 mutation 6%, ERBB2 amplification 0%, BRAF mutation 0%), but a hot and inflamed tumor immune microenvironment (TIME), indicating favorable benefit from immunotherapy [32][33][34][35][36][37][38][39][40][41]. More importantly, the PD-L1 expression in the methyl-risk high BTCs may be mainly induced by intrinsic activation of oncogenic pathways rather than prior immune attack. We further put forward possible candidates for this "oncogenic pathway," including BRAF, NOTCH pathway, Hippo pathway, BAF complex, and two DNA damage repair pathways (MMR and Fanconi). On the contrary, all PD-L1-positive samples with lower methyl-risk exhibited lower CD8A methylation, suggesting immune attackinduced PD-L1 expression. The PD-L1 + /CD8A methyl-low and the PD-L1 + /CD8A methyl-high BTCs had different prognoses and may potentially respond to ICI treatment in different depths owing to the immune-desert properties of the PD-L1 + /CD8A methyl-high BTCs with no priming. In the CD8A methyl-low BTCs, we also observed higher scores of angiogenesis signature probably suggesting stromal interactions in TIME that inhibits further immune cell infiltration, suggesting a possibility that antiangiogenic agents might further enhance the density of TILs and boost immunotherapy efficacy in the BTCs with higher CD8 + TILs.
Recently, Huang et al. have described the larger TIME component of Epstein-Barr virus (EBV)-associated iCCAs (EBVaICC) compared to non-EBVaICC [42]. Although the DNA methylome was not assessed in their study, plenty of evidence has pointed out the associations of EBV infection with the abnormal DNA methylation changes in solid tumors such as gastric and nasopharyngeal carcinomas [43]. Despite the rarity of EBV infection in BTCs [42], it is interesting and valuable to investigate whether the EBV-associated BTCs exhibit distinct genetic and epigenetic alterations and benefit from ICI treatment.
As for the methyl-risk high patients, immunotherapy efficacy might be improved by combining demethylation agents [44], which can induce T cell attraction and reactivation by synergistically upregulating tumor antigen presentation [45][46][47] and downregulating immune suppressive signals in solid tumors [48][49][50]. At present, all the clinical trials are in phase I/II, assessing the tolerance and efficacy of ICI plus demethylation agent [44]. More clinical and basic research probing into whether larger methylation changes can predict favorable benefits from this combination therapy is warranted.
As for limitation, first, the TNM stage was different among iCCAs, d/pCCAs, and GBCs in the present study, which may limit the exploration of the associations of anatomic location with methylation subtypes and prognosis. Given this, we performed subgroup analysis and multivariable analysis to rule out the impact from covariates including TNM stage and anatomic location on the prognostic effect of DNA methylation. Second, the sample size of the present study is not large and most samples were not obtained from advanced BTC patients, making it difficult to provide reliable explanations for (See figure on previous page.) Fig. 6 Association between global methylation changes and immune characteristics in the EHSH cohort. A Association between the methylationbased clusters and PD-L1 positivity (CPS score). B Correlations between CD8A methylation, CD8A mRNA, and the infiltration of CD8 + T cells in the TCGA-CHOL cohort. C Heatmap and table illustrating the methylation data and features of the five CD8A DMRs. D Associations of the methyl-risk with PD-L1 positivity and CD8A methylation. E Kaplan-Meier curves illustrating the OS of the subgroups classified by PD-L1 positivity and CD8A methylation. F Prognostic effect of the cluster-based and the LASSO-based risks after adjusting for CD8A methylation and/or PD-L1 positivity. G Representative features of the methylation-based clusters in terms of methylation changes, mutational events, immune-related characteristics, prognosis, and potential usefulness for predicting the benefits from targeted therapy and immunotherapy. CCA, cholangiocarcinoma; CPS, combined positive score; GBC, gallbladder cancer; LASSO, least absolute shrinkage and selection operator; PD-L1, programmed death-ligand 1; TMB, tumor mutational burden the different responses to ICIs of the tumors in different anatomic locations. Third, whole-genome sequencing and gene expression data that could contribute to comprehensive molecular subtyping are missing, and this study also lacks the functional validation of specific genes in cell lines to further clarify the relation between DNA methylation and mutational events. ARID2 mutations were enriched in clusters 1-2 with greater methylation changes, and a previous study in hepatocellular carcinoma indicates that ARID2 could recruit DNMT1 to the promoter, which increased promoter methylation [51]. Future basic studies may focus on the effect of ARID2 on the methylation in BTCs. Despite these, only based on the DNA methylation data, we successfully stratified BTC patients into subgroups with distinct prognosis and immune-related features in two independent cohorts. Importantly, consistent stratification utilities were observed in iCCAs, d/pCCAs, and GBCs, demonstrating the robustness of our findings.

Conclusions
To our knowledge, this is the first study that identifies common DNA methylation changes of CCAs and GBCs. By leveraging the common 3369 DMRs, we subtyped the BTC patients into subgroups with distinct genomic aberrations, immune characteristics, and survival outcomes. Additionally, the 12-marker prognostic model may be used for estimating survival outcomes. Our integrative analysis based on the common DMRs provides insights into BTC pathogenesis, prognostication after curative surgery, and patient selection for immunotherapy. Conceivably, by stratification of BTCs according to molecular profiling, subtype-specific treatment modalities may be achieved in the future, which in the long term might improve the survival of BTC patients.
Additional file 1: Table S1. Baseline characteristics of included 105 BTC patients. Table S2. Characteristics of patients with adjacent tissue. Table  S3. Characteristics of patients with precancerous tissue. Table S4. List of the genes in the 520-gene panel. Table S5. Members of the analyzed signaling pathways. Table S6. Characteristics of the common differential methylated regions in BTCs. Table S7. Top hypermethylation regions between BTCs and adjacent/benign samples. Table S8. Top hypomethylation regions between BTCs and adjacent/benign samples. Table S9. Comparison of the clustering results using top 1,000, 500, and 250 most variable methylation DMRs. Table S10. Clinical correlates of methylationbased clustering in patients with survival data. Table S11. Detailed information of the included methylation sites in the LASSO model. Table  S12. Association between mutational event and OS in the EHSH cohort. Table S13. Univariable and multivariable analyses of OS in the TCGA cohort. Table S14. Univariable and multivariable analyses of PFI in the TCGA cohort.
Additional file 3: Fig. S1. Consensus clustering of the EHSH cohort. Consensus matrix of non-supervised clustering of methylation signatures by K-means method (K=2-8) and delta plot assessing change in consensus cumulative distribution function area. Fig. S2. Feature selection using the LASSO algorithm for a prognostic model. A. The optimal tuning parameter (Lambda) in the LASSO model was selected using 3-fold crossvalidation. B. LASSO coefficient profiles of the 12 features. Fig. S3. Association between mutational rate and TNM stage. Fig. S4. Consensus clustering of the TCGA-CHOL cohort. Consensus matrix of non-supervised clustering of methylation signatures by K-means method (K=2-8) and delta plot assessing change in consensus cumulative distribution function area. Fig. S5. Overlap of greater infiltration of CD8+ T cell and higher angiogenesis signature clustering in the TCGA-CHOL cohort. Scatter plot illustrating the infiltration of CD8 T cell and the score of angiogenesis signature in the methyl-risk high and the methyl-risk low groups. Fig. S6. Associations of the methyl-risk with the signatures of naïve, effector, and exhausted CD8 T cells in the TCGA-CHOL cohort. Comparisons of the signatures concerning naïve vs. effector, naïve vs. exhausted, and effector vs. exhausted between the two subgroups defined by the methyl-risk in the TCGA-CHOL cohort.