
Table 4 Model selection and discrimination in the derivation and validation cohorts

From: Development and validation of a lifestyle-based model for colorectal cancer risk prediction: the LiFeCRC score

 

Selected predictors

Age at recruitment, per 10 years
Waist circumference, per 10 cm
Height, per 10 cm
Daily alcohol consumption, high
Ever smoker, yes
Physically active, yes
Vegetables, per 100 g/day
Fruits, per 100 g/day
Dark bread, per 50 g/day
Dairy products, per 100 g/day
Red meat, per 50 g/day
Poultry, per 50 g/day
Processed meat, per 50 g/day
Fish, per 50 g/day
Sugar and confectionary, per 50 g/day
Soft drinks, per 100 g/day

Harrell's C-index

| Model | Cohort | Colorectal cancer, both sexes | Colorectal cancer, men | Colorectal cancer, women | Colon cancer, both sexes | Colon cancer, men | Colon cancer, women | Rectal cancer, both sexes | Rectal cancer, men | Rectal cancer, women |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Full model | Derivation cohort | 0.710 | 0.700 | 0.702 | 0.718 | 0.708 | 0.718 | 0.705 | 0.705 | 0.677 |
| Full model | Optimism corrected* | 0.708 | 0.697 | 0.700 | 0.716 | 0.707 | 0.715 | 0.704 | 0.703 | 0.668 |
| Full model | Validation cohort | 0.715 | 0.707 | 0.700 | 0.708 | 0.727 | 0.700 | 0.730 | 0.689 | 0.693 |
| Reduced model | Derivation cohort | 0.710 | 0.699 | 0.700 | 0.717 | 0.705 | 0.717 | 0.703 | 0.700 | 0.668 |
| Reduced model | Optimism corrected* | 0.709 | 0.698 | 0.699 | 0.716 | 0.704 | 0.715 | 0.701 | 0.698 | 0.667 |
| Reduced model | Validation cohort | 0.714 | 0.708 | 0.699 | 0.708 | 0.727 | 0.698 | 0.728 | 0.687 | 0.696 |

*Harrell's C-index for the derivation cohort, corrected for optimism by bootstrapping with 1000 replications. For each bootstrap sample, a new model is fitted and its C-index is calculated in both the bootstrap sample and the original derivation cohort. The difference between these two C-indices is averaged over all bootstrap replications and subtracted from the original C-index.
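The optimism correction described in the footnote can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: `harrell_c`, `optimism_corrected_c`, and `fit_toy` are hypothetical names, the data are synthetic, and `fit_toy` is a deliberately trivial stand-in for the survival model that would be refitted in each bootstrap sample.

```python
import random


def harrell_c(times, events, risk):
    """Harrell's C-index: fraction of usable pairs ranked concordantly."""
    concordant, usable = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is anchored on a subject with an observed event
        for j in range(n):
            if times[i] < times[j]:  # i failed before j was last seen
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied scores count as half
    # Sketch convention (assumption): no usable pairs -> uninformative 0.5.
    return concordant / usable if usable else 0.5


def optimism_corrected_c(rows, fit, n_boot=1000, seed=0):
    """rows: list of (time, event, x); fit(rows) returns a function x -> risk."""
    rng = random.Random(seed)

    def c_of(model, data):
        return harrell_c([r[0] for r in data],
                         [r[1] for r in data],
                         [model(r[2]) for r in data])

    apparent = c_of(fit(rows), rows)  # original C-index in the derivation data
    total_optimism = 0.0
    for _ in range(n_boot):
        boot = [rng.choice(rows) for _ in rows]  # resample with replacement
        model = fit(boot)                        # refit on the bootstrap sample
        # Optimism: C in the bootstrap sample minus C in the original cohort.
        total_optimism += c_of(model, boot) - c_of(model, rows)
    return apparent, apparent - total_optimism / n_boot


def fit_toy(rows):
    """Hypothetical stand-in model: learns only the direction of x."""
    ev = [r[2] for r in rows if r[1]]
    ce = [r[2] for r in rows if not r[1]]
    if not ev or not ce:
        return lambda x: x
    sign = 1.0 if sum(ev) / len(ev) >= sum(ce) / len(ce) else -1.0
    return lambda x: sign * x


# Synthetic cohort (assumption): higher x means a higher event rate;
# follow-up is censored at time 1.0.
_rng = random.Random(1)
cohort = []
for _ in range(60):
    x = _rng.random()
    t = _rng.expovariate(0.2 + 2.0 * x)
    cohort.append((min(t, 1.0), t <= 1.0, x))

apparent, corrected = optimism_corrected_c(cohort, fit_toy, n_boot=200)
print(f"apparent C = {apparent:.3f}, optimism-corrected C = {corrected:.3f}")
```

In practice `fit` would wrap whatever model is being validated, including any predictor selection step, so that the selection is repeated inside every bootstrap replication.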