
Table 2 Area under the receiver operating characteristic curve prediction results using predictors at varied stages of pregnancy

From: Development and validation of prediction models for gestational diabetes treatment modality using supervised machine learning: a population-based cohort study

All values are AUC (95% CI).

| Predictor levels^a | Dataset | CART | LASSO regression | Simple super learner^b | Complex super learner^c |
|---|---|---|---|---|---|
| 1 | Discovery set | 0.613 (0.603–0.622) | 0.670 (0.663–0.676) | 0.673 (0.667–0.679) | 0.683 (0.676–0.689) |
| 1 | Validation set | 0.592 (0.567–0.616) | 0.634 (0.615–0.653) | 0.635 (0.615–0.654) | 0.634 (0.615–0.653) |
| 1, 2 | Discovery set | 0.618 (0.609–0.628) | 0.685 (0.678–0.691) | 0.688 (0.682–0.695) | 0.761 (0.756–0.767) |
| 1, 2 | Validation set | 0.588 (0.563–0.613) | 0.647 (0.628–0.666) | 0.645 (0.626–0.664) | 0.648 (0.630–0.667) |
| 1, 2, 3 | Discovery set | 0.740 (0.732–0.748) | 0.785 (0.780–0.791) | 0.790 (0.785–0.796) | 0.869 (0.865–0.873) |
| 1, 2, 3 | Validation set | 0.703 (0.682–0.724) | 0.750 (0.733–0.767) | 0.749 (0.733–0.766) | 0.754 (0.739–0.772) |
| 1, 2, 3, 4 | Discovery set | 0.785 (0.777–0.792) | 0.849 (0.845–0.854) | 0.852 (0.848–0.857) | 0.934 (0.931–0.936) |
| 1, 2, 3, 4 | Validation set | 0.745 (0.722–0.767) | 0.809 (0.794–0.823) | 0.808 (0.794–0.823) | 0.815 (0.800–0.829) |
  1. AUC, area under the receiver operating characteristic curve; CART, classification and regression tree; LASSO, least absolute shrinkage and selection operator
  2. ^a Level 1: 1-year preconception to last menstrual period; level 2: last menstrual period to before diagnosis of gestational diabetes; level 3: at the time of diagnosis of gestational diabetes; level 4: 1 week after diagnosis of gestational diabetes
  3. ^b Candidate algorithms in simple super learner included response-mean, LASSO regression, and CART
  4. ^c Candidate algorithms in complex super learner included response-mean, LASSO regression, CART, random forest, and extreme gradient boosting
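
For readers who want to see how an entry of the form "AUC (95% CI)" can be produced, the following is a minimal sketch and not the authors' procedure. The table does not state how its confidence intervals were computed, so the percentile bootstrap used here is an assumption, and the function names `auc` and `bootstrap_ci` are hypothetical. The AUC itself is computed as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case.

```python
import random


def auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method, see lead-in)."""
    if len(set(labels)) < 2:
        raise ValueError("need both classes to bootstrap the AUC")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Toy data invented for illustration only; not taken from the study.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)
toy_ci = bootstrap_ci(toy_labels, toy_scores, n_boot=200)
```

The pairwise AUC is quadratic in the number of cases, which is fine for illustration but slow for a population-based cohort; a rank-based implementation from a statistics library would be the practical choice there.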