Risk factors and risk prediction models for colorectal cancer metastasis and recurrence: an umbrella review of systematic reviews and meta-analyses of observational studies

Background: There is a clear need for systematic appraisal of models/factors predicting colorectal cancer (CRC) metastasis and recurrence because clinical decisions about adjuvant treatment are taken on the basis of such variables.
Methods: We conducted an umbrella review of all systematic reviews of observational studies (with/without meta-analysis) that evaluated risk factors of CRC metastasis and recurrence. We also generated an updated synthesis of risk prediction models for CRC metastasis and recurrence. We cross-assessed individual risk factors and risk prediction models.
Results: Thirty-four risk factors for CRC metastasis and 17 for recurrence were investigated. Twelve of 34 and 4/17 risk factors with p < 0.05 were estimated to change the odds of the outcome at least 3-fold. Only one risk factor (vascular invasion for lymph node metastasis [LNM] in pT1 CRC) presented convincing evidence. We identified 24 CRC risk prediction models. Across 12 metastasis models, six out of 27 unique predictors were assessed in the umbrella review and four of them changed the odds of the outcome at least 3-fold. Across 12 recurrence models, five out of 25 unique predictors were assessed in the umbrella review and only one changed the odds of the outcome at least 3-fold.
Conclusions: This study provides an in-depth evaluation and cross-assessment of 51 risk factors and 24 prediction models. Our findings suggest that a minority of influential risk factors are employed in prediction models, which indicates the need for a more rigorous and systematic model construction process following evidence-based methods.


Background
Around 20-25% of patients with colorectal cancer (CRC) present with metastasis at initial diagnosis, while patients who are apparently cancer-free on investigation at diagnosis may subsequently develop locoregional recurrence (18%), distant recurrence (78%), or both (4%) [1]. Metastasis occurs when cancer cells from the original tumor are able to proliferate in local, regional, or distant tissues, lymph nodes, or organs via lymphatic, blood, or even transcoelomic spread [2]. CRC recurrence is defined as local, regional, or distant metastatic recurrence after a disease-free period [3]. Local recurrence refers to CRC relapse that occurs at the site of original surgical resection [4], while regional recurrence occurs at draining lymph nodes and/or lateral pelvic lymph nodes [3]. Distant metastatic recurrence involves the liver (accounting for 40-50% of metastases), the lung (accounting for 10-20% of metastases), the peritoneum, the ovaries, the adrenal glands, the bone, and the brain [1,5]. Estimated 5-year survival rates are around 90%, 70%, and 10% for localized, regional, and distant metastatic CRC, respectively [6].
Validating individual risk factors, and even more so multivariable prediction models that combine multiple risk factors, for local, regional, or distant metastasis and recurrence is crucially important, as these could guide management of the primary tumor and provide prognostic information for patients and their cancer clinicians. Prediction models may be more successful if they consider the most informative factors. This knowledge may eventually prove useful in managing CRC treatment with better-informed patient choices. Understanding the underlying validity and predictive performance of risk factors for locoregional recurrence is particularly relevant, given progressive moves towards organ-preserving approaches such as endoscopic mucosal resection (EMR), transanal endoscopic microsurgery (TEMS), and neo-adjuvant chemoradiotherapy for rectal cancer [1], since organ preservation may be at the expense of elevated recurrence rates. The corollary also applies, since the risk-benefit ratio of extensive locoregional surgery and/or radiotherapy may be detrimentally impacted by future distant metastases.
A number of systematic reviews (with/without meta-analyses) have investigated existing risk factors for CRC metastasis and recurrence [7][8][9][10]. However, there is a need for a comprehensive evaluation of the available epidemiological evidence. Here, we conducted an umbrella review to identify and evaluate associations between risk factors and risk of CRC metastasis and recurrence. We also systematically collected and evaluated predictive models on CRC prognostic outcomes. We then conducted a comparative cross-assessment between the identified risk factors and the predictors employed in risk prediction models to examine to what extent predictive models include the most influential factors.

Protocol
The study protocol was developed in accordance with the reporting guidance in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Protocols (PRISMA-P) statement [11].
Umbrella review of systematic reviews (with/without meta-analyses) of risk factors

Literature search and eligibility criteria
A systematic search was performed in PubMed, Cochrane Library (Wiley), Web of Science (Thomson Reuters), and EMBASE (Ovid) from inception to 7 October 2019, to identify systematic reviews of observational studies with or without a meta-analysis that evaluated the associations between risk factors and risk of metastasis and recurrence in CRC (Additional file 1: Table S1). We further hand-searched reference lists of the retrieved eligible publications to identify additional relevant studies. All identified publications went through a parallel review of the title, abstract, and full text (performed by WX and YM independently) based on pre-defined inclusion and exclusion criteria following "PICOS." In particular, we included human participants from observational studies with no restriction on setting. Conversely, animal, in vitro, and in vivo experiments were excluded. For study outcomes, we included CRC metastasis (local, regional, or distant metastasis in tissues, lymph nodes, or organs at diagnosis) and CRC recurrence (local, regional, or distant metastatic recurrence in tissues, lymph nodes, or organs after a disease-free period). For study design, we included systematic reviews of observational studies with or without meta-analysis. Conversely, literature reviews, individual observational studies, and systematic reviews and meta-analyses that investigated the efficacy of pharmaceutical drugs and therapeutic procedures were excluded. We included only publications in peer-reviewed journals; gray literature, comments, conference abstracts, and interviews were therefore excluded.

Data extraction
Data were extracted by one investigator (WX) and checked by a second investigator (YH). For each included meta-analysis, the following items were extracted: study citation details, number of studies included, study design, study population, number of events and size of total population, risk factors, outcomes examined, reported summary meta-analytic estimates (e.g., risk ratio [RR], odds ratio [OR], hazard ratio [HR], the corresponding 95% confidence interval [95% CI], p value, and heterogeneity measures), instrument applied for quality and risk of bias assessment of component studies, and quality assessment result. The following items were further extracted from the individual component studies: study citation details, study design, study population, risk factors, outcomes examined, number of events and size of total population in exposed and unexposed groups, effect size, and 95% CI.

Evidence synthesis and evaluation
First, when two or more meta-analyses examining associations between the same risk factor and the same outcome were identified, the most recent meta-analysis of prospective cohort studies with the largest event number was prioritized and retained for further analysis. We also compared whether the results reported in overlapping meta-analyses were concordant in terms of direction, statistical significance, and association magnitude.
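The selection rule for overlapping meta-analyses can be sketched in code. This is a minimal illustration, not the authors' implementation: the `MetaAnalysis` record, its field names, and the tie-break order (prospective-only first, then publication year, then event count) are assumptions introduced here, since the text does not specify how the three criteria were ranked against each other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetaAnalysis:
    # Hypothetical record; field names are illustrative assumptions.
    risk_factor: str
    outcome: str
    year: int
    n_events: int
    prospective_only: bool


def pick_preferred(candidates):
    """Keep one meta-analysis per (risk factor, outcome) pair.

    Assumed ranking: prospective-only designs first, then the most
    recent year, then the largest number of events.
    """
    best = {}
    for ma in candidates:
        key = (ma.risk_factor, ma.outcome)
        rank = (ma.prospective_only, ma.year, ma.n_events)
        if key not in best or rank > best[key][0]:
            best[key] = (rank, ma)
    return [ma for _, ma in best.values()]
```

Grouping by the (risk factor, outcome) pair reflects the definition of "overlapping" used in the text; any other ordering of the three criteria would only require changing the `rank` tuple.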
Second, we estimated the following metrics for each unique meta-analysis: (1) The summary effect size along with its 95% CI was estimated with random-effects models, using the DerSimonian and Laird (DL) method when the number of component studies was five or more and the Hartung-Knapp-Sidik-Jonkman (HKSJ) method when the number of component studies was less than five [12,13]. (2) Heterogeneity was assessed by the I² statistic [14]. (3) The 95% prediction interval was estimated. (4) The small study effect was estimated by Egger's regression asymmetry test [15]. (5) Excess significance was assessed by a chi-square test [16]. Based on these metrics and by applying a set of pre-defined criteria (Additional file 1: Table S8), we evaluated the credibility of the evidence for each risk factor and categorized the evidence as convincing, highly suggestive, suggestive, or weak [17,18].
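As an illustration of metrics (1) and (2), the sketch below computes a DerSimonian and Laird random-effects summary and the I² statistic from study-level log effect sizes and standard errors. It is a textbook version written for this example, not the Stata or R code used in the study; the function name and return keys are assumptions, and the HKSJ adjustment, prediction interval, Egger's test, and excess significance test are deliberately left out.

```python
import math
from statistics import NormalDist


def dersimonian_laird(log_effects, std_errors, level=0.95):
    """DL random-effects pooling on the log scale (e.g., log odds ratios)."""
    k = len(log_effects)
    if k < 2 or k != len(std_errors):
        raise ValueError("need at least two studies with matching inputs")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_effects)) / sum_w

    # Cochran's Q and the method-of-moments between-study variance.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_effects))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights, pooled estimate, and standard error.
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_effects)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))

    # I^2: share of total variability attributed to heterogeneity (%).
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return {
        "pooled": pooled,
        "se": se_pooled,
        "ci": (pooled - z * se_pooled, pooled + z * se_pooled),
        "tau2": tau2,
        "q": q,
        "i2": i2,
    }
```

Exponentiating `pooled` and the bounds in `ci` returns the summary on the odds ratio scale. The normal-quantile interval shown here is the standard DL interval; HKSJ would instead use a t-distribution with an adjusted variance.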
Lastly, for all meta-analyses that statistically represented at least 3-fold changes in the odds of the outcome, we evaluated the methodological quality and risk of bias based on the Assessment of Multiple Systematic Reviews 2.0 (AMSTAR 2.0) checklist [19]. We used an odds ratio of 3.0 as a threshold for what is a substantially large effect. There is no consensus on what an optimal threshold might be, but values between 2 and 5 are proposed typically [20].
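The 3-fold threshold can be expressed as a simple check on the log odds ratio. The sketch below assumes a symmetric reading of "at least 3-fold change", so that an odds ratio of 1/3 or lower counts in the same way as an odds ratio of 3 or higher; that symmetric treatment and the function name are assumptions of this example, not something the text states explicitly.

```python
import math


def is_large_effect(odds_ratio, threshold=3.0):
    """Return True when the odds change by at least `threshold`-fold.

    Assumes a symmetric rule on the log scale: OR >= threshold or
    OR <= 1 / threshold.
    """
    if odds_ratio <= 0:
        raise ValueError("odds ratio must be positive")
    return abs(math.log(odds_ratio)) >= math.log(threshold)
```

Working on the log scale keeps harmful and protective associations comparable; a different threshold within the 2 to 5 range mentioned above can be passed through the `threshold` argument.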

Sensitivity analysis
We re-ran all meta-analyses after reviewing the outcome definition of each individual component study and reclassifying the outcomes into (i) CRC metastasis at presentation, (ii) CRC local recurrence after a disease-free period, and (iii) CRC distant recurrence after a disease-free period.

Comparative cross-assessment of risk factors and risk prediction models
We performed a comparative cross-assessment between risk factors evaluated in the umbrella review and risk predictors included in existing prediction models. A recently published systematic review conducted by our team [21] investigated a total of 15 models for prediction of metastasis and recurrence in CRC patients with surgical resection (metastasis: N = 6; recurrence: N = 9). We updated the original search to identify studies developing and/or validating risk prediction models to predict metastasis and recurrence in all CRCs, with no restriction on whether the tumor was resected. We performed a systematic search in PubMed from inception to 7 October 2019 to identify eligible studies. We extracted data relevant to study design, study population, prediction outcome, prediction time horizon, predictors, model performance, and model presentation from each included study. We created a catalog of all variables that had been included across CRC metastasis prognostic models and separately across CRC recurrence prognostic models (presented in the same order as in the respective tables). We then assessed whether the included risk predictors were evaluated or not in the umbrella review described above. If yes, we also recorded the magnitude of the summary relative risk (typically odds ratio) and noted how many of those represented at least 3-fold changes in the odds of the outcome and how many had convincing or highly suggestive evidence in our assessment.
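The cross-assessment amounts to a set comparison between the catalog of model predictors and the risk factors covered by the umbrella review. The sketch below shows one way to organize it; the function name, input structures, and the symmetric 3-fold rule are assumptions of this example, and the predictor names and odds ratios in the usage lines are made-up placeholders rather than values from the study.

```python
import math


def cross_assess(model_predictors, umbrella_odds_ratios, threshold=3.0):
    """Compare model predictors with risk factors from an umbrella review.

    model_predictors: dict mapping a model name to a set of predictor names.
    umbrella_odds_ratios: dict mapping a risk factor name to its summary OR.
    """
    def large(odds_ratio):
        # Assumed symmetric rule: OR >= threshold or OR <= 1 / threshold.
        return abs(math.log(odds_ratio)) >= math.log(threshold)

    used = set().union(*model_predictors.values())
    reviewed = set(umbrella_odds_ratios)
    evaluated = used & reviewed   # predictors also assessed in the review
    unused = reviewed - used      # reviewed factors absent from all models

    return {
        "unique_predictors": sorted(used),
        "evaluated": sorted(evaluated),
        "evaluated_large": sorted(f for f in evaluated if large(umbrella_odds_ratios[f])),
        "unused": sorted(unused),
        "unused_large": sorted(f for f in unused if large(umbrella_odds_ratios[f])),
    }


# Placeholder inputs for illustration only (not data from the study).
models = {
    "model_1": {"stage", "factor_a", "factor_b"},
    "model_2": {"stage", "factor_c"},
}
odds_ratios = {"factor_a": 4.0, "factor_b": 2.0, "factor_d": 5.0, "factor_e": 0.5}
summary = cross_assess(models, odds_ratios)
```

In practice, predictor names would need to be harmonized before the comparison (for example, mapping "tumor grade" and "differentiation" to one label), a step this sketch does not attempt.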
All statistical analyses were conducted in Stata, version 14.0 (StataCorp), and R, version 3.3.0 (R Foundation for Statistical Computing).

Literature review
A total of 2033 publications were retrieved from the systematic search in four databases. Eventually, 43 publications met all inclusion criteria (Fig. 1, Additional file 1: Table S2), which together included 9 systematic reviews (metastasis: N = 7; recurrence: N = 2) and 81 meta-analyses (metastasis: N = 61; recurrence: N = 20; Additional file 1: Table S3 and Table S4) of observational studies. A total of 18 overlapping meta-analyses that examined associations between the same risk factor and the same outcome were identified (Additional file 1: Table S5). The most recent meta-analysis with the largest event number was prioritized. Within the remaining 63 unique meta-analyses, 12 meta-analyses from four publications did not report detailed OR, RR, or HR in forest plots. Finally, 51 unique meta-analyses were retained for analysis, which reported 34 unique risk factors for CRC metastasis and 17 risk factors for recurrence (Additional file 1: Table S6 and Table S7).

Meta-analyses of risk factors for CRC metastasis
Overall, 61 eligible meta-analyses of observational studies investigating risk factors for CRC metastasis were identified (Additional file 1: Table S3). More than one meta-analysis was conducted for seven risk factors (Additional file 1: Table S5). The direction of the summary effect size and the presence of nominal statistical significance (p < 0.05) of the reported associations in overlapping meta-analyses were concordant for six (86%) risk factors (Additional file 1: Table S5).
A total of 34 unique meta-analyses with available data were retained for further analysis (Additional file 1: Table S6). The median number of included component studies was five (range 2-41), the median number of the total population was 983 (range 76-10,128), and the median number of events was 138 (range 16-1808). The meta-analyses reported a wide range of risk factors (Additional file 1: Table S6): 17 histopathological risk factors (50%), 13 biomarkers (38%), three genetic risk factors (9%), and one demographic risk factor (3%). Overall, 21 (62%) of 34 unique meta-analyses reported effect sizes at p < 0.05 (Table 1). Based on the predefined credibility criteria, only one (3%) histopathological risk factor (vascular invasion for LNM in pT1 CRC) presented convincing evidence (see Additional file 1: Table S9 for the credibility assessment of all identified risk factors). Furthermore, 12 of 21 probed risk factors with p < 0.05 had an effect size suggesting ≥ 3-fold change in the odds of the outcome, while this was also seen for the point estimates in four of 13 probed risk factors where the meta-analysis had p ≥ 0.05 (Table 1).

Meta-analyses of risk factors for CRC recurrence
Overall, 20 eligible meta-analyses of observational studies investigating risk factors for CRC recurrence were identified (Additional file 1: Table S4). More than one meta-analysis was conducted for three risk factors (Additional file 1: Table S5). The direction of the summary effect size and the presence of nominal statistical significance (p < 0.05) of the reported associations between the same risk factor and the same outcome in overlapping meta-analyses were concordant for two (67%) risk factors (Additional file 1: Table S5).
A total of 17 unique meta-analyses with available data were retained for further analysis (Additional file 1: Table S7). The median number of included component studies was six (range 2-26). The meta-analyses reported a range of risk factors (Additional file 1: Table S7): five histopathological risk factors (29%), two biomarkers (12%), one genetic risk factor (6%), five clinical risk factors (29%), one comorbidity (6%), and three anthropometric indices (18%). Overall, 11 (65%) of 17 unique meta-analyses reported effect sizes at p < 0.05 (Table 2). No risk factor presented convincing evidence (Additional file 1: Table S10). In addition, four of 11 probed risk factors with p < 0.05 had an effect size suggesting ≥ 3-fold change in the odds of the outcome (Table 2).

Sensitivity analysis of redefining the disease outcome groups
We performed a sensitivity analysis restricted to individual component studies investigating risk factors for metastasis at presentation and re-evaluated the credibility of the evidence (Additional file 1: Table S11). A total of 16 unique meta-analyses including 67 (27%) component studies were retained and investigated. The remaining 185 (73%) studies did not specify when metastasis was present (i.e., at diagnosis or after a disease-free period) and therefore could not be included in this sensitivity analysis. Based on the pre-defined criteria, no risk factor presented convincing evidence.
Similarly, a sensitivity analysis was performed to include individual component studies investigating risk factors for local or distant recurrence (Additional file 1: Table S12). A total of 13 unique meta-analyses composed of 81 (58%) component studies (including five meta-analyses investigating distant metastasis after a period of being disease-free) were retained and investigated. The remaining 59 (42%) studies did not separate local or distant recurrence and therefore could not be included in our sensitivity analysis. Furthermore, no risk factor presented convincing evidence (Additional file 1: Table S12).
Comparative cross-assessment between risk factors evaluated in the umbrella review and risk predictors applied in existing prediction models

Prediction models for CRC metastasis
Twelve prognostic models have been developed for prediction of CRC metastasis [22][23][24][25][26][27][28][29][30][31][32][33] (Table 3). The median number of included predictors was four (range 3-9), and 27 unique predictors were included in at least one model. Cancer stage (N = 9, 75%) was the most commonly used predictor variable in the 12 prognostic models. Other common predictors included histopathological risk factors such as positive lymph nodes (N = 3, 25%), tumor grade or differentiation (N = 2, 17%), and tumor histological type (N = 3, 25%); the biomarker carcinoembryonic antigen (CEA) (N = 3, 25%); age (N = 3, 25%); gender (N = 2, 17%); and clinical treatment such as surgery, chemotherapy, and radiotherapy (N = 3, 25%). Five models (42%) performed internal validation, and four models (33%) were validated in external datasets. We conducted a cross-assessment between these predictors and the 34 risk factors that were evaluated in our umbrella review. Six of 27 unique predictors (tumor budding, tumor differentiation, tumor size, vascular invasion, submucosal invasion, and sex) were evaluated in the umbrella review (Table 5). The associated ORs for these six risk factors varied from 2.23 to 6.76, and four of them (67%) corresponded to ≥ 3-fold change in the odds of the outcome. Of the remaining 28 risk factors that were not employed in prediction models, ORs varied from 0.45 to 6.78, and 13 (46%) represented ≥ 3-fold change in the odds of the outcome.
In addition, we compared the overlapping outcomes to investigate whether prediction models had included influential risk factors (those that presented convincing evidence or a ≥ 3-fold change in the odds of the outcome) when they predicted the same outcomes as those evaluated in the umbrella review (Table 6). In total, four overlapping outcomes were found in this cross-assessment (LNM in pT1 CRC, LNM in CRC, hepatic metastasis in CRC, and distant metastasis in CRC). For only one outcome (LNM in pT1 CRC), two prognostic models [22,28] included four risk predictors that were also evaluated in the umbrella review, two of which corresponded to ≥ 3-fold change in the odds of the outcome (tumor budding, tumor differentiation).
In relation to overlapping outcomes for recurrence, only one outcome (overall recurrence in CRC) was identified (Table 6). However, the prognostic model [36] included risk predictors that were not evaluated in the umbrella review.

Discussion
We initially synthesized and evaluated the evidence of risk factors for CRC metastasis and recurrence. Our study comprised 51 unique meta-analyses of observational studies investigating 34 risk factors for CRC metastasis and 17 risk factors for recurrence. We also conducted a sensitivity analysis of 29 unique meta-analyses of risk factors for CRC metastasis at presentation (n = 16), CRC local recurrence (n = 5), and CRC distant recurrence (n = 8) using a standardized categorization of the component studies. Furthermore, we updated the synthesis of risk prediction models for CRC metastasis (n = 12) and recurrence (n = 12) and then conducted a cross-assessment of individual risk factors evaluated in the umbrella review and risk predictors included in existing prediction models, which allowed us to examine to what extent predictive models include the most influential factors.

Meta-analyses for CRC metastasis
According to our pre-defined criteria for assessing the credibility of the evidence, only one risk factor was classified as convincing (vascular invasion for LNM in pT1 CRC), reflecting strong statistical significance and no hints of bias. Many studies have demonstrated that the invasion of blood vessels leading to tumor cell dissemination and metastasis is a strong risk factor for disease prognosis, which is in line with our umbrella review [44,45]. Based on our findings, a large proportion of studies (17/25, 68%) investigated lymphatic and vascular invasion as separate risk factors, while 32% of studies categorized them jointly as lymphovascular invasion. However, it has been shown that the predictive ability of lymphovascular invasion is lower than that of vascular invasion [46]. Twelve (35%) of 34 probed risk factors for metastasis had an effect size suggesting ≥ 3-fold change in the odds of the outcome with p < 0.05. Four of these risk factors (lymphatic invasion for LNM in pT1 CRC; tumor budding for LNM in pT1 CRC; tumor budding for LNM in all stage CRC; tumor size > 1 cm for LNM in rectal cancer) were classified as highly suggestive. As discussed above, lymphatic invasion could be an indicator of tumor cells metastasizing to lymph nodes. This finding agrees with three recently published studies showing that lymphatic invasion is causally associated with the risk of LNM in CRC [47][48][49]. Tumor budding is recognized as a negative prognostic risk factor for LNM in CRC, and our findings are concordant with previous studies [50][51][52]. Individual component studies vary in their definitions of tumor budding (e.g., how many cancer cells comprise a tumor bud, and how many buds signify tumor budding) and in the pathologic staining methods used to detect tumor budding (e.g., hematoxylin and eosin [H&E], immunohistochemistry [IHC]).
Furthermore, a systematic review summarized pathologic methods to detect tumor budding and found that all studies, even when utilizing different methods, showed that tumor budding increases the risk of CRC metastasis [53]. Notably, substantial between-study heterogeneity (I² > 50%) was found in the meta-analysis investigating tumor budding for LNM in all CRC stages, indicating that this association needs to be interpreted with caution. The observed heterogeneity may be influenced by the inclusion of different tumor stages. Finally, tumor size > 1 cm is associated with an increased risk of LNM in rectal cancer. This largely agrees with the European Society for Medical Oncology (ESMO) clinical practice guideline, which states that a rectal lesion less than 1 cm has a lower risk of metastasis and therefore suggests local excision (TEM) [54].

Meta-analyses for CRC recurrence
Of the 17 probed risk factors for CRC recurrence, four (24%) had an effect size suggesting ≥ 3-fold change in the odds of the outcome with p < 0.05. None of them presented convincing evidence. Three (tumor budding for overall recurrence in CRC; perineural invasion [PNI] for local recurrence in rectal cancer; MRI-detected extramural vascular invasion [mrEMVI] for distant metastatic recurrence in rectal cancer) were classified as highly suggestive. Our findings suggest that tumor budding is a common highly suggestive risk factor for both CRC LNM and overall recurrence. However, there is a need for standardization of the histopathological definition of tumor budding [46]. Another histopathological risk factor, PNI, which is a common pathological feature in rectal cancer, strongly signifies local recurrence. PNI occurs more frequently in rectal cancer than in colon cancer, since the rectum is surrounded by a dense cluster of neural plexuses in the pelvis [55]. The National Comprehensive Cancer Network (NCCN) guidelines also suggest that PNI-positive patients are at higher risk of local recurrence [56]. However, there is no consensus on the definition of PNI positivity, with two of the most frequently used definitions being SS-PNI (when tumor cells surround at least 33% of the nerve) and TS-PNI (when tumor cells surround any of the three layers of the nerve) [57][58][59][60]. Finally, we found that mrEMVI increases the risk of distant metastatic recurrence. EMVI is venous invasion beyond the muscularis propria, which has long been recognized as a risk factor for distant recurrence [61][62][63]. The 5-point MRI-detected EMVI scoring system is precise for detecting this invasion, and it is recommended as a postoperative follow-up strategy in clinical settings [64]. In addition, a recently published meta-analysis is also in line with our findings, reporting that around 90% of patients with liver metastases are mrEMVI positive [65].

Sensitivity analysis
To apply consistent definitions of metastasis and recurrence, we re-categorized all the component studies into three distinct disease outcomes: metastasis at presentation, local recurrence, and distant recurrence. This could generate insight into metastasis and recurrence patterns and provide investigators and clinicians with a more comprehensive summary of risk factors for these clinically significant CRC prognostic outcomes [66]. Our sensitivity analyses found no convincing evidence for any risk factor. However, a total of 244 (62%) individual component studies were excluded from our sensitivity analyses due to missing information in relation to outcome definition.

Cross-assessment between risk factors evaluated in the umbrella review and risk predictors applied in existing prediction models
We identified 24 CRC prognostic models for metastasis (n = 12) and recurrence (n = 12). The majority of risk prediction models applied an average of four to five predictor variables. The most commonly used predictors were clinico-histopathological (cancer stage, lymph node status) and demographic (gender, age) parameters. Seven models were validated internally and eight in external datasets, but none of the identified models was evaluated in an impact study. As for model presentation, the majority of models were nomograms (graphical prediction models), and the remaining models were presented as formulae, risk scores, and calculators.
In our cross-assessment, we investigated whether the identified prediction models had employed influential risk factors (those that presented convincing evidence or a ≥ 3-fold change in the odds of the outcome) when they predicted the same outcomes as those that were evaluated in the umbrella review. Across 12 CRC metastasis risk prediction models, five models [22,23,25,28,29] were on the same outcomes (LNM in pT1 CRC, LNM in CRC, hepatic metastasis in CRC, and distant metastasis in CRC), with only two [22,28] of these models (on LNM in pT1 CRC) including predictors also evaluated in the umbrella review. However, the models' calibration was poorly reported, which made it difficult to assess the models' predictive accuracy. Furthermore, one model [28] was externally validated to ensure the model's applicability and generalizability, while the other [22] did not undergo adequate validation to address its potential overfitting. In addition, the remaining three models [23,25,29] predicting LNM and distant metastasis (DM) in CRC applied other risk predictors such as cancer stage, CEA, and alkaline phosphatase (ALP) that were not evaluated in the umbrella review. We suggest that risk factors with strong associations with CRC prognosis, such as circulating tumor cells and microsatellite instability, should be employed following evidence-based methods.
Across the 12 CRC recurrence risk prediction models, only one model [36] was on an outcome that was also evaluated in the umbrella review (overall recurrence in CRC). Unfortunately, we did not find overlapping risk factors/predictors. We recommend that tumor budding and absence of peritoneal free tumor cells post-resection (≥ 3-fold change in the odds of the outcome) be considered as predictors.
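The 62% figure appears consistent with pooling the component-study counts reported for the two sensitivity analyses in the Results (67 retained and 185 excluded for metastasis at presentation; 81 retained and 59 excluded for local or distant recurrence). The short check below reproduces the arithmetic; treating the overall figure as the pooled total of those two analyses is an assumption of this example.

```python
def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to the nearest integer."""
    return round(100 * part / whole)


# Component-study counts as reported in the sensitivity analysis results.
metastasis_retained, metastasis_excluded = 67, 185
recurrence_retained, recurrence_excluded = 81, 59

metastasis_total = metastasis_retained + metastasis_excluded  # 252
recurrence_total = recurrence_retained + recurrence_excluded  # 140

excluded_total = metastasis_excluded + recurrence_excluded    # 244
overall_total = metastasis_total + recurrence_total           # 392

print(percent(metastasis_retained, metastasis_total))  # share retained, metastasis
print(percent(recurrence_retained, recurrence_total))  # share retained, recurrence
print(percent(excluded_total, overall_total))          # share excluded overall
```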

Clinical implications and future research
Identifying and evaluating risk factors with substantial predictive value is of great clinical importance. Major clinical decisions are made taking into account expectations and formal or informal predictions about major outcomes. Accurate and valid risk prediction could assist with clinical decision-making in relation to the extent and mode of surgery and therapy. Ideally, adjuvant treatment would be targeted with precision to those most likely to benefit; those most at risk of CRC metastasis/recurrence may also have a higher absolute probability of benefit. The majority of patients do not benefit from additional therapy aimed at preventing locoregional or distant relapse before or after surgical resection, and yet they may be exposed to the attendant morbidity, cost, and false expectation of such therapy. In summary, this umbrella review provides an evidence classification that could help clinicians to judge the relative priority of risk factors/predictors for CRC prognosis and to base clinical decisions on more accurate and valid risk prediction.
Our findings suggest that efforts to address the limitations of the available evidence could be beneficial. Large-scale prospective studies are needed to generate evidence that is less prone to bias and that allows better predictive model building and validation. Standardizing the outcome definitions of CRC metastasis and recurrence could improve reporting of outcomes that have direct clinical relevance. Future risk prediction model research is encouraged to apply rigorous model construction processes and to integrate the most influential risk factors based on evidence-based methods.

Strengths and limitations
The main strength of this study is that it provides a rigorous critical assessment of the published epidemiological evidence on risk factors of CRC metastasis and recurrence, based on pre-defined criteria in a transparent and systematic way [17,18]. In addition, we updated the synthesis of CRC prognostic prediction models, and to the best of our knowledge, this is the first cross-assessment between individual risk factors and risk predictors applied in existing prediction models to investigate whether influential risk factors are employed as predictors. Our findings provide a comprehensive evaluation of available evidence that can inform future research on risk factors for CRC prognostic outcomes and risk prediction models.
However, the following potential limitations should be considered. First, an umbrella review comprises a synthesis of evidence from existing systematic reviews and meta-analyses [67]. Therefore, risk factors and risk predictors that were not systematically reviewed in the pre-existing literature are not included in this umbrella review. These may include some factors that are commonly used in predictive models, which highlights the need to perform systematic reviews of the evidence for factors that might be routinely or frequently measured. Second, meta-analyses have common defects such as limited coverage of the literature search and low quality of the included studies [68,69]. Third, this study only collected and evaluated evidence from systematic reviews and meta-analyses of observational studies published in peer-reviewed journals. This could limit the breadth of our results if research in gray literature, conference abstracts, and comments investigated risk factors that were not included in this umbrella review. Furthermore, 77% of meta-analyses included only retrospective studies.
Moreover, this study did not evaluate the quality of all individual component studies included in each meta-analysis because this is beyond the scope of an umbrella review. Instead, we performed a credibility evaluation and risk of bias assessment for meta-analyses that represented at least 3-fold changes in the odds of the outcome. Criteria for assessing the evidence from meta-analyses of observational studies applied in our umbrella review were based on pre-defined metrics whose limitations have been summarized [70][71][72]. For the outcomes that we studied, the main interest is usually whether the considered risk factors confer substantial predictive value, rather than whether they are causally related to the outcomes. We pre-specified a threshold for the magnitude of what might be a relatively large effect size (3-fold change in odds), but this is not absolute. The predictive value may also depend on how frequent a given factor is in the evaluated population. However, with one exception, all the factors evaluated concurrently in both risk factor meta-analyses and predictive models were fairly common, with prevalence ranging from 16 to 82%.
We should also acknowledge that, although we performed a sensitivity analysis to classify CRC metastasis at presentation and local or distant recurrence, a large proportion (62%) of individual component studies did not report sufficient information, such as the timing of metastasis in relation to initial diagnosis (i.e., synchronous or metachronous) or local and distant recurrence separately from overall recurrence. Finally, we did not evaluate risk factors relevant to clinical interventions such as surgery type, chemotherapy, radiotherapy, and transfusion. We also could not perform a complete comparison between risk factors evaluated in the umbrella review and risk predictors applied in existing prediction models because only 11 overlapping risk factors/predictors were identified.

Conclusions
In this umbrella review, we synthesized and evaluated risk factors and risk prediction models of CRC metastasis and recurrence. A total of 51 unique risk factors were investigated. Convincing evidence exists only for the association between vascular invasion and LNM, and even that is restricted to pT1 tumors. We also conducted a cross-assessment of individual risk factors and risk prediction models. Our findings emphasize the need for a more rigorous and systematic model construction process that integrates influential risk factors following evidence-based methods.
Additional file 1: Table S1. Search strategy. Table S2. A list of publications included in the umbrella review. Table S3. Quantitative synthesis of all 61 eligible meta-analyses of observational studies investigating the associations between risk factors and colorectal cancer metastasis. Table S4. Quantitative synthesis of all 20 eligible meta-analyses of observational studies investigating the associations between risk factors and colorectal cancer recurrence. Table S5. Overlapping meta-analyses of observational studies investigating the associations between the same risk factor and the same outcome. Table S6. Quantitative synthesis of 34 unique meta-analyses of observational studies investigating the associations between risk factors and colorectal cancer metastasis. Table S7. Quantitative synthesis of 17 unique meta-analyses of observational studies investigating the associations between risk factors and colorectal cancer recurrence. Table S8. Criteria for assessing the credibility of the evidence from meta-analyses of observational studies. Table S9. Summary of evidence credibility assessment of 34 unique meta-analyses of observational studies investigating the associations between risk factors and colorectal cancer metastasis. Table S10. Summary of evidence credibility assessment of 17 unique meta-analyses of observational studies investigating the associations between risk factors and colorectal cancer recurrence. Table S11. Sensitivity analysis of 16 unique meta-analyses of observational studies investigating the associations between risk factors and colorectal cancer metastasis (at presentation) and evidence credibility assessment. Table S12. Sensitivity analysis of 13 unique meta-analyses of observational studies investigating the associations between risk factors and colorectal cancer recurrence (local/distant) and evidence credibility assessment. Table S13. Quality and risk of bias assessment (AMSTAR 2.0) for the evidence that represented at least 3-fold changes in the odds of the outcome.