Table 4 Model selection and discrimination in the derivation and validation cohorts

From: Development and validation of a lifestyle-based model for colorectal cancer risk prediction: the LiFeCRC score

  Colorectal cancer Colon cancer Rectal cancer
Selected predictors Both sexes Men Women Both sexes Men Women Both sexes Men Women
Age at recruitment, per 10 years
Waist circumference, per 10 cm   
Height, per 10 cm       
Daily alcohol consumption, high       
Ever smoker, yes  
Physically active, yes         
Vegetables, per 100 g/day  
Fruits, per 100 g/day          
Dark bread, per 50 g/day         
Dairy products, per 100 g/day       
Red meat, per 50 g/day          
Poultry, per 50 g/day          
Processed meat, per 50 g/day        
Fish, per 50 g/day          
Sugar and confectionery, per 50 g/day
Soft drinks, per 100 g/day         
Harrell’s C-index
                        Colorectal cancer                   Colon cancer                        Rectal cancer
                        Both sexes  Men         Women       Both sexes  Men         Women       Both sexes  Men         Women
Full model
  Derivation cohort     0.710       0.700       0.702       0.718       0.708       0.718       0.705       0.705       0.677
  Optimism corrected*   0.708       0.697       0.700       0.716       0.707       0.715       0.704       0.703       0.668
  Validation cohort     0.715       0.707       0.700       0.708       0.727       0.700       0.730       0.689       0.693
Reduced model
  Derivation cohort     0.710       0.699       0.700       0.717       0.705       0.717       0.703       0.700       0.668
  Optimism corrected*   0.709       0.698       0.699       0.716       0.704       0.715       0.701       0.698       0.667
  Validation cohort     0.714       0.708       0.699       0.708       0.727       0.698       0.728       0.687       0.696
*Harrell’s C-index for the derivation cohort was corrected for optimism by bootstrapping with 1000 replications. For each bootstrap sample, a new model was fitted and its C-index was calculated in both the bootstrap sample and the original derivation cohort. The difference between these two C-indices was averaged over all bootstrap replications and then subtracted from the original C-index.
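The optimism correction described in the footnote can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `harrell_c`, `fit_toy`, `optimism_corrected_c`, and `make_toy_cohort` are hypothetical, the data are synthetic, and `fit_toy` is a deliberately simple stand-in that picks the best of four fixed linear scores rather than fitting the regression model used in the article. Pairs with tied observed times are skipped for simplicity.

```python
import math
import random


def harrell_c(times, events, risks):
    """Harrell's C: among usable pairs (the subject with the shorter observed
    time had an event), the share where that subject has the higher risk.
    Tied risks count as half; pairs with tied times are skipped (assumption)."""
    concordant, usable = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # the earlier subject of a usable pair must have an event
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


def c_index_of(model, rows):
    """C-index of a risk function `model` on rows of (time, event, covariates)."""
    return harrell_c([r[0] for r in rows], [r[1] for r in rows],
                     [model(r[2]) for r in rows])


# Hypothetical candidate scores; a real analysis would fit a survival model.
CANDIDATE_WEIGHTS = [(1, 0), (0, 1), (1, 1), (1, -1)]


def fit_toy(rows):
    """Toy 'model fitting': keep the candidate score with the best C-index."""
    def make(w):
        return lambda x: w[0] * x[0] + w[1] * x[1]
    best = max(CANDIDATE_WEIGHTS, key=lambda w: c_index_of(make(w), rows))
    return make(best)


def optimism_corrected_c(rows, fit, n_boot=1000, seed=0):
    """Return (apparent C, optimism-corrected C) via bootstrap resampling."""
    rng = random.Random(seed)
    apparent = c_index_of(fit(rows), rows)
    total_optimism = 0.0
    for _ in range(n_boot):
        sample = [rng.choice(rows) for _ in rows]  # resample with replacement
        model = fit(sample)                        # refit on the bootstrap sample
        # Optimism: performance in the sample minus performance in the original.
        total_optimism += c_index_of(model, sample) - c_index_of(model, rows)
    return apparent, apparent - total_optimism / n_boot


def make_toy_cohort(n, seed):
    """Synthetic cohort: the hazard rises with the first covariate only."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = (rng.gauss(0, 1), rng.gauss(0, 1))
        t_event = rng.expovariate(math.exp(0.8 * x[0]))
        t_censor = rng.expovariate(0.5)
        rows.append((min(t_event, t_censor), t_event <= t_censor, x))
    return rows


cohort = make_toy_cohort(n=40, seed=1)
apparent, corrected = optimism_corrected_c(cohort, fit_toy, n_boot=200)
print(f"apparent C = {apparent:.3f}, optimism-corrected C = {corrected:.3f}")
```

The usage lines run only 200 replications on 40 synthetic subjects to keep the pure-Python example fast; the footnote specifies 1000 replications, which is the default of `n_boot` here.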