The prognostic value of lymph node yield in the earliest stage of colorectal cancer: a multicenter cohort study

Background In patients with stage II colorectal cancer (CRC) the number of surgically retrieved lymph nodes (LNs) is associated with prognosis, resulting in a minimum of 10–12 retrieved LNs being recommended for this stage. Current guidelines do not provide a recommendation regarding LN yield in T1 CRC. Studies evaluating LN yield in T1 CRC suggest that such high LN yields are not feasible in this early stage, and a lower LN yield might be appropriate. We aimed to validate the cut-off of 10 retrieved LNs on risk for recurrent cancer and detection of LN metastasis (LNM) in T1 CRC, and explored whether this number is feasible in clinical practice. Methods Patients diagnosed with T1 CRC and treated with surgical resection between 2000 and 2014 in thirteen participating hospitals were selected from the Netherlands Cancer Registry. Medical records were reviewed to collect additional information. The association between LN yield and recurrence and LNM respectively were analyzed using 10 LNs as cut-off. Propensity score analysis using inverse probability weighting (IPW) was performed to adjust for clinical and histological confounding factors (i.e., age, sex, tumor location, size and morphology, presence of LNM, lymphovascular invasion, depth of submucosal invasion, and grade of differentiation). Results In total, 1017 patients with a median follow-up time of 49.0 months (IQR 19.6–81.5) were included. Four-hundred five patients (39.8%) had a LN yield ≥ 10. Forty-one patients (4.0%) developed recurrence. LN yield ≥ 10 was independently associated with a decreased risk for recurrence (IPW-adjusted HR 0.20; 95% CI 0.06–0.67; P = 0.009). LNM were detected in 84 patients (8.3%). LN yield ≥ 10 was independently associated with increased detection of LNM (IPW-adjusted OR 2.27; 95% CI 1.39–3.69; P = 0.001). Conclusions In this retrospective observational study, retrieving < 10 LNs was associated with an increased risk of CRC recurrence, advocating the importance to perform an appropriate oncologic resection of the draining LNs and diligent LN search when patients with T1 CRC at high-risk for LNM are referred for surgical resection. Given that both gastroenterologists, surgeons and pathologists will encounter T1 CRCs with increasing frequency due to the introduction of national screening programs, awareness on the consequences of an inadequate LN retrieval is of utmost importance. Electronic supplementary material The online version of this article (doi:10.1186/s12916-017-0892-7) contains supplementary material, which is available to authorized users.


Background
With the introduction of colorectal cancer (CRC) screening programs, there has been a shift towards an increased detection of T1 CRC as compared to more advanced cancers [1,2]. Endoscopic resection is considered curative for low-risk T1 CRC. However, surgical resection is recommended for patients with T1 CRC with a considerable risk of lymph node metastasis (LNM), as determined by histological high-risk factors [3,4]. The prognosis of these patients depends to a large extent on the lymph node (LN) status, with a CRC-related 5-year survival of ≥ 95% in the absence of LNM (American Joint Committee on Cancer (AJCC) stage I T1 CRC), decreasing to 68-90% in the presence of LNM (AJCC stage III T1 CRC) [5,6].
In contrast to AJCC stage II CRC (T3-4 CRC), in which a higher LN yield has been associated with improved survival, little is known on the association between LN yield and recurrence in T1 CRC [7,8]. Therefore, international guidelines restrict their recommendation for a minimum yield of 10-12 LNs to AJCC stage II CRC [9][10][11][12]. It is questionable whether the recommended cut-off in advanced CRC can be extrapolated to T1 CRC, given that more advanced tumor depth has been associated with an increased LN yield [13][14][15]. Accordingly, reported LN yields in T1 CRC are much lower, with mean and median LN yields between 4 and 7 in studies performed between 1988 and 2003 [13,[16][17][18]. One might hypothesize that the additional value of retrieving more LNs might be less relevant for T1 CRCs as LNM are reported in only 8-12% of patients and recurrence after surgical resection is reported in only 2-5% of patients [3,4,19]. Two small single-center studies argued that a minimum of 4 and 8 retrieved LNs, respectively, should be appropriate for staging T1 CRC [16,20], whereas La Torre et al. [21] concluded that a limited resection does not affect oncological outcome in this early stage. However, these studies were either small or not informed about long-term recurrence rates.
In this longitudinal multicenter retrospective cohort study consisting of patients who underwent surgical resection of T1 CRC, we aimed to explore the association between LN yield and the risk for recurrence, and to assess the association between LN yield and the detection of LNM. Moreover, we aimed to explore whether a minimum of 10 LNs is feasible in a routine clinical setting.

Patients and study design
This is a multicenter retrospective cohort study. Patients diagnosed with T1 CRC in 13 participating hospitals (1 academic and 12 non-academic hospitals) between 1 January 2000 and 31 December 2014 were selected from the Netherlands Cancer Registry. The electronic medical records of all patients were reviewed. Only cases in which the local pathologist clearly confirmed the diagnosis T1 CRC in the pathology report were selected, which was defined as tumors with invasion through the muscularis mucosa and into, but not beyond, the submucosa [22]. Patients were included if they were treated with surgical resection of T1 CRC. Transanal endoscopic microsurgery was considered an endoscopic treatment, as no lymphadenectomy is performed. Exclusion criteria were hereditary predisposition for CRC, inflammatory bowel disease, synchronous CRC (defined as CRC in the previous 5 years before detection of T1 CRC, or elsewhere in the colorectum at the time of detection of T1 CRC), non-CRC-related death within 1 year, nonadenocarcinoma, neo-adjuvant radiotherapy, unknown number of retrieved LNs, and AJCC stage IV T1 CRC at diagnosis. This study was approved by the Medical Ethics Review Committee of the University Medical Center Utrecht (reference number 15-487/C) and was carried out in accordance with the Helsinki Declaration. The study conforms to the STROBE guideline for cohort studies [23].

Endpoints
In each participating center, study variables were collected from the electronic medical records, and the corresponding endoscopy, surgery, pathology, and radiology reports. Primary endpoint was incidence of recurrent cancer, either local or distant. Local recurrence was defined as malignant tissue at the site of the anastomosis. Distant recurrence was defined as metastasis to extra-colonic organs, bone or peritoneum confirmed with imaging or histology. A new primary CRC elsewhere in the colon or rectum was defined as a metachronous lesion, not as recurrence. Secondary endpoint was prevalence of LNM at time of surgery, defined as positive LNs in the resection specimen as reported in the pathology report.

Determinant and confounding factors
The determinant of interest was number of retrieved LNs, as reported in the pathology reports. We dichotomized LN yield using 10 retrieved LNs as cut-off, as this is the lowest recommended minimum for AJCC stage II CRC in current (inter)national guidelines [9,11,12].
Established risk factors (based on previous literature) for recurrent cancer and LNM were considered potential confounding factors [3,4,7,24]. Data on potential clinical confounders were collected from the medical records and endoscopy reports, and included age, sex, tumor location (right colon vs. left colon vs. rectum), tumor size and tumor morphology (pedunculated vs. non-pedunculated) [3,7,24]. Right colon was defined as caecum, ascending and transverse colon including the splenic curve. Left colon was defined as the descending and sigmoid colon. Tumor morphology was defined as pedunculated if the presence of a stalk or Paris 0-Ip was reported in the endoscopy report [25]. Flat and sessile tumors were defined as non-pedunculated T1 CRCs [26]. Potential histological confounders were collected from the pathology reports, and included lymphovascular invasion (absent vs. present), depth of submucosal invasion (superficial vs. deep invasion), and grade of differentiation (good vs. moderate vs. poor) [4]. Deep submucosal invasion was defined as invasion depth ≥ 1 mm or sm2/3 for nonpedunculated T1 CRC, and Haggitt 4 for pedunculated T1 CRC [4]. For the endpoint recurrent cancer, the list of potential confounders was extended with LNM as a confounding factor [7].
Additional patient characteristics (not causally related to recurrent cancer or LNM and therefore not considered potential confounders) and follow-up characteristics were collected from medical records. Patient characteristics were body mass index (calculated as weight in kilograms divided by height in meters squared) and comorbidity according to the American Society of Anesthesiologists Physical Status classification [27]. Follow-up was performed according to routine clinical care. Follow-up started at the date of diagnosis and ended at the date of detection of recurrence, death, or last follow-up.

Statistical analysis
Categorical data were expressed as frequencies and percentages; continuous variables as means with standard deviation (SD) or medians with interquartile range (IQR).
Primary aim of this study was to evaluate the association between number of retrieved LNs and recurrent cancer. Despite our large multicenter cohort of patients spanning an inclusion period of many years, the absolute number of recurrences was expected to be small, since recurrent cancer after surgical resection of T1 CRC is a rare event. We therefore used propensity scores allowing for the adjustment for more confounders than would have been feasible using standard multivariable adjustment approaches [28][29][30]. Inverse probability weighting (IPW) was used to account for baseline differences in predictive characteristics between patients with LN yield < 10 vs. ≥ 10. A propensity score was derived by fitting a logistic regression model with the dichotomized number of retrieved LNs as the dependent variable and potential confounders as predictors. We adjusted for confounders in a two-step approach. For the primary analysis, we adjusted for clinical confounding factors (i.e., age, sex, tumor location, tumor size, tumor morphology, and presence of LNM). Secondary supporting analyses were performed to additionally correct for histological confounding factors (i.e., lymphovascular invasion, depth of submucosal invasion, grade of differentiation), which were not included in the primary analysis as they were missing in a considerable number of cases (Table 1). Continuous variables (i.e., age and tumor size) were fitted in the propensity models by restricted cubic spline functions, dummies were used for categorical variables (no interaction terms). Then, the inverse of the (propensity score-derived) predicted probability for actual LN yield was used to weigh patients in a Cox regression model relating the dichotomized LN yield as the sole determinant of recurrent cancer, yielding hazard ratios (HR) for a high (≥10) vs. low (< 10) retrieval of LNs. Patients who did not develop recurrent cancer were censored at the last follow-up moment or date of death. No violation of the proportionality of the hazard assumption was observed following inspection of the scaled Schoenfeld residuals. Bootstrapping was performed to obtain two-sided P values and 95% confidence intervals (CI) for IPW-derived estimates.
Assuming that a low LN yield risks understaging and these understaged patients would fail to receive chemotherapy that could prevent recurrence [31], we additionally repeated our main analysis restricted to patients in whom no LNM were observed [32]. Analyses were performed in the same manner with IPW analysis as described above, adjusting for the same potential confounders.
Furthermore, to assess the sensitivity of our results for the way we dichotomized LN yield in our analyses, the number of retrieved LNs was also analyzed continuously in univariable regression analyses. Moreover, sensitivity analyses were performed using 12 retrieved LNs as cutoff (using similar analysis as used for 10 retrieved LNs as cut-off ), as this cut-off is often used as a quality measure for adequate staging in AJCC stage II CRC [9].
Several checks were performed to evaluate the appropriateness of the IPW analysis, namely (1) baseline characteristics before and after IPW adjustment for confounders were compared; (2) the c-index of the propensity models was assessed using the propensity model's own IPWs (a c-index of 0.5 indicates a successfully obtained balance of confounders after adjusting; a c-index of 1.0 indicates extreme remaining imbalance); and (3) the maximum weight of a single patient used in the IPW-adjusted analysis to obtain balance in potential confounders was assessed as a quality instrument to assess whether a single or a few cases influence the risk estimate excessively [33,34].
The secondary aim was to explore the association between LN yield and detection of LNM. Like recurrence, a low number of LNM were observed. Therefore, analyses were performed in the same manner with IPW analysis as described above with the presence of LNM as the endpoint, correcting for the same confounders except for the presence of LNM itself. Inverse probability weighted logistic regression analysis was used yielding odds ratios (OR) for a high (≥ 10) vs. low (< 10) retrieval of LNs on the outcome LNM. Information on LN yield and the presence of LNM was available for all patients; however, several clinicopathological confounding variables had missing data ( Table 1). As simply excluding patients with missing data is inefficient and increases the risk of selection bias, we used multiple imputation before data analysis [35]. Missing data was assumed to be missing at random. Multivariate imputation by chained equations (10 imputation datasets, 25 iterations, healthy convergence) was performed [36]. Rubin's rules were used to pool results across imputation datasets [37]. Percentage of complete cases (i.e., percentage of patients with no imputed values) were reported for each analysis, together with the percentage of observed data points. For the primary outcome, a sensitivity analysis excluding patients with missing values was performed (complete-case analysis) to determine whether this agreed with the imputed results (Additional file 1: Table S1).
GraphPad Prism version 6.02 (GraphPad software Inc., San Diego, CA, USA) was used to draw figures. IBM SPSS Statistics version 21 (SPSS Inc., Chicago, IL, USA) and R version 3.2.2 (RStudio Inc., Boston, MA, USA) were used for statistical analysis. A two-sided P value < 0.05 was considered significant.

Patient characteristics
A total of 2253 patients with T1 CRC diagnosed between 2000 and 2014 were identified in the participating hospitals. A total of 1017 patients treated for pT1 CRC with surgical resection remained eligible for analysis (Fig. 1). Mean age of the cohort was 69.1 years (SD 9.6) and 54.6% of patients were male. A total of 405 patients (39.8%) had a LN yield ≥ 10. An overview of the baseline characteristics of patients with a LN yield < 10 vs. ≥ 10 before and after IPW adjustment for the confounders considered for the primary analysis (adjustment for clinical factors) and the secondary supporting analysis (adjustment for clinical and histological factors) is provided in Table 1. Before adjustment by IPW, patients with a LN yield ≥ 10 had T1 CRCs that were more often located in the right colon (37.5% vs. 15.2%, P < 0.001), had more often a non-pedunculated morphology (75.9% vs. 60.7%, P < 0.001), had a larger tumor size (2.9 vs. 2.6 cm, P = 0.003), more often had LNM (11.6% vs. 6.0%, P = 0.002), and less often showed lymphovascular invasion (20.2% vs. 29.9%, P = 0.04). Following adjustment, baseline characteristics between groups were comparable.
To assess the sensitivity of our results for the way we dichotomized LN yield, sensitivity analyses were performed with the number of retrieved LNs without thresholds and using 12 LNs as cut-off. Without thresholds, LN yield remained associated with a decreased risk for recurrence (per retrieved LN: HR 0.90; 95% CI 0.84-0.97; P = 0.007). Of the 749 patients in whom < 12 LNs were retrieved, 37 patients (4.9%) developed recurrences. Of the 268 patients in whom ≥ 12 LNs were retrieved, 4 patients (1.5%) developed recurrences. Similar to a LN yield ≥ 10, the HR of a LN yield ≥ 12 pointed towards a decreased risk for recurrence, albeit non-significant (adjusted HR 0.41; 95% CI 0.12-1.34; P = 0.14) (Additional file 1: Table S2).

LN yield and LNM
LNM were detected in 84 patients, corresponding to a prevalence of 8.3% (84/1017) ( Table 2); 1 positive LN was detected in 54 patients, 2 in 18 patients, 3 in 4 patients, and ≥ 4 in 8 patients, with a maximum of 8 positive LNs per patient. In 6 patients with LNM, recurrence occurred during follow-up. In univariable analysis, the number of retrieved LNs (without thresholds) was associated with increased detection of LNM (per retrieved LN: OR 1.04; 95% CI 1.01-1.07; P = 0.006).

Discussion
To the best of our knowledge, this is the first large-scale study evaluating the association between LN yield and long-term recurrence rates in patients with T1 CRC when adjusting for multiple confounding factors. A LN yield < 10 was associated with an increased risk for recurrence after surgical resection of T1 CRC, even when adjusting for clinical and histological characteristics. Furthermore, LN yield ≥ 10 was independently associated with an increased detection of LNM. These findings underline the importance of performing an appropriate oncologic resection of the draining LNs and diligent LN search by the pathologist when patients with T1 CRC at high-risk for LNM are referred for surgical resection and   The maximum weight of a single patient used in the IPW adjusted analysis to obtain balance in potential confounders. This is a quality instrument to assess whether a single or a few cases influence the risk estimate excessively. As a rule of thumb this should be lower than 10% of the analyzed dataset (i.e., smaller than 100 and 90 for the total cohort and LN-negative patients, respectively) d This is an estimate of the balance of confounders after adjusting by inverse probability weighting (0.50 complete balance, 1.00 complete unbalance) e Percentage of complete cases (i.e., cases with no imputed values for any of the evaluated variables of that analysis). Note: analysis was performed on the imputed dataset concerning all cases f Percentage of available data points before imputation. Note: analysis was performed on the imputed dataset concerning all cases In contrast to earlier smaller studies, we validated a cut-off (≥ 10) that was chosen based on current recommendations for AJCC stage II CRC, instead of evaluating multiple study-specific cut-off points and selecting the most suitable one [16,20]. This approach is less prone to the introduction of overestimation of the difference in outcomes between groups, and enhances the generalizability and external validity of the results [38]. To assess the sensitivity of our results for the manner in which the LN yield was dichotomized, we performed sensitivity analyses with the number of retrieved LNs analyzed continuously and additionally used 12 retrieved LNs as the cut-off. In all analyses, the HR indicated a decreased risk for recurrence.
Our findings build on a study conducted with data from the Surveillance, Epidemiology, and End Results cancer registry in patients with AJCC stage I CRC, comprising both T1N0 and T2N0 CRC [31]. An increased LN yield was associated with improved overall survival. However, this study used a population-based registry, and could therefore only evaluate the association between LN yield and mortality due to all causes. As the vast majority (≥ 95%) of patients with T1 CRC without LNM do not die as a result of CRC, recurrence rates rather than survival rates are informative when evaluating factors associated with prognosis in this early stage [5,6].
Several explanations can be hypothesized for the observed association with recurrence. The first is understaging, with positive nodes missed when an insufficient number of LNs is retrieved. Consequently, chemotherapy will not be considered in these patients and the missed residual cancer cells may metastasize to distant organs. Evidence on the benefit of chemotherapy in patients with T1 CRC with LNM is limited. However, it is currently advised in these patients [39]. Our finding that ≥ 10 retrieved LNs resulted in a significantly higher percentage of nodal positive patients supports this explanation, as was also demonstrated in previous work for advanced CRC stages [31,40]. However, a population-based retrospective observational study questioned this mechanism as the main cause of the observed association, as patients with higher LN yields were only slightly more likely to have LNM, suggesting that some other unmeasured factors resulted in better patient outcomes [8]. It has been hypothesized that the number of retrieved LNs might reflect tumor biology [41]. The tumor microenvironment and the host's immune response have been shown to be of major importance in tumor progression [42]. Thus, a higher LN yield may reflect a stronger immune response, reducing the risk for recurrence. Some limitations should be mentioned. Although the present cohort is one of the largest on T1 CRCs to date with long-term follow-up enabling the evaluation of the association between LN yield and recurrence when adjusting for multiple confounding factors, it is an observational study on a retrospective cohort. Inherent to the design, we had to deal with missing data. Information on histological potential confounders was missing in a considerable number of patients, resulting in one or more variables having to be imputed in more than 82% of patients in the secondary supporting analysis. Nevertheless, the estimates resulting from the secondary analyses with and without imputation were similar, suggesting that selection bias due to missingnessif presentwas rather limited (results without imputation can be found in Additional file 1: Table S1). Furthermore, due to the retrospective design, no data was available on the effort of the surgeon for adequate lymphadenectomy, such as the extent of the mesentery excised, and no standardized pathologic evaluation of the resection specimen was performed. This could have shed some light on the role of the surgeon and pathologist The maximum weight of a single patient used in the IPW adjusted analysis to obtain balance in potential confounders. This is a quality instrument to assess whether a single or a few cases influence the risk estimate excessively. As a rule of thumb this should be lower than 10% of the analyzed dataset (i.e., smaller than 100) d This is an estimate of the balance of confounders after adjusting by inverse probability weighting (0.50 complete balance, 1.00 complete unbalance) e Percentage of complete cases (i.e., cases with no imputed values for any of the evaluated variables of that analysis). Note: analysis was performed on the imputed dataset concerning all cases f Percentage of available data points before imputation. Note: analysis was performed on the imputed dataset concerning all cases on the LN yield. However, the primary aim of this study was to explore the association between LN yield and recurrence in T1 CRC, not to explore the influence of the pathologist and surgeon, as has been done in previous studies [17,[43][44][45][46]. Taking the results of these studies into account, we believe LN yield is a shared responsibility of both surgeon and pathologist. Secondly, although we corrected as efficiently as possible for the measured confounders, it remains unclear how well the unmeasured confounders have been adjusted for. New predictive markers in CRC have been identified in the past two decades, including tumor budding as prognostic marker for LNM and macroscopic pathological assessment of the quality of the circumferential resection margin (CRM) as a prognostic marker for recurrence in rectal cancer [4,47,48]. Standardized pathologic reporting of the quality of the CRM was only introduced in Dutch guidelines in recent years, and consensus on the standardized assessment of tumor budding has only recently been achieved (International Tumor Budding Consensus Conference 2016) [49]. Therefore, this hampered the evaluation of these factors in our analysis. To explore the magnitude of potential bias introduced by lack of information on the quality of the CRM, we repeated the main analysis when excluding patients with rectal T1 cancer, yielding similar results, suggesting that the introduced potential bias is limited (Additional file 1: Table S3).
Thirdly, we were not informed on tumor biology such as genetic mutations, microsatellite instable status, or consensus molecular subtype classification [50,51]. However, similar to data from the Surveillance, Epidemiology, and End Results cancer registry, we observed an increased LN yield over the years, suggesting that the observed survival benefit cannot be completely attributed to tumor biology [8,9]. It is not unlikely that the Dutch Surgical Colorectal Audit, introduced in 2008, and the implementation of LN yield as a quality indicator in 2009 have contributed to the observed increment in LN yield from 2010 onwards. This finding shows that retrieving higher number of LNs is achievable in daily clinical practice, and should be aimed for. Moreover, a previous study observed that LN retrieval differed between different types of hospitals, suggesting that other quality issues may influence LN retrieval [52].
Finally, it should be emphasized that, despite our multicenter cohort spanning an inclusion period of many years, the absolute number of patients with recurrent cancer or LNM was still low. Although we resorted to statistical analysis techniques suited for such data and performed several checks to evaluate the appropriateness of our analysis, the low number of events unavoidably resulted in relatively broad CIs.

Conclusion
In conclusion, in this observation cohort study, the retrieval of less than 10 LNs was associated with an increased risk for recurrence and decreased detection of LNM in T1 CRC. Given that gastroenterologists, surgeons and pathologists will all encounter T1 CRCs with increasing frequency due to the introduction of CRC screening programs, awareness on the consequences of an inadequate LN retrieval is of utmost importance.

Additional file
Additional file 1: Table S1. Sensitivity analysis comparing the complete case hazard ratio and the multiple imputation hazard ratio for the association between lymph node (LN) yield ≥ 10 versus < 10 and the primary outcome recurrent cancer. Table S2. Unadjusted and adjusted association between LN yield ≥ 12 (N = 268 patients, N = 4 recurrences) versus LN yield < 12 (N = 749 patients, N = 37 recurrences) and recurrent cancer after surgical resection of T1 colorectal cancer. Table S3. Main analysis restricted to patients with T1 colon cancer (843 T1 colon cancers with 29 recurrences) in order to explore the magnitude of potential bias introduced by lack of information on the quality of the circumferential resection margin in rectal T1 cancer. (DOCX 18 kb) Abbreviations AJCC: American Joint Committee on Cancer; CI: confidence interval; CRC: colorectal cancer; CRM: circumferential resection margin; HR: hazard ratio; IPW: inverse probability weighting; IQR: interquartile range; LN: lymph node; LNM: lymph node metastasis; OR: odds ratio; SD: standard deviation