Table 3 Random Forest Classifiers for patient clusters and clinical subpopulations.

From: Collectives of diagnostic biomarkers identify high-risk subpopulations of hematuria patients: exploiting heterogeneity in large-scale biomarker data

| Variable description | Subpopulations | Biomarkers | Classification error (SD) | AUROC (SD) |
|---|---|---|---|---|
| All 157 hematuria patients | controls n = 77; UC n = 80 | CRP, EGF, IL-6, IL-1α, MMP9NGAL, osmolarity, CEA | 0.203 (0.017) | 0.766 (0.152) |
| Patient clusters^a | blue n = 57 (28) | TNFα, EGF, NSE, NGAL, MMP9NGAL, TM, FAS | 0.155 (0.029) | 0.800 (0.258) |
| | green n = 49 (18) | TNFα, EGF, IL-6, IL-1α, MMP9NGAL, TM, CEA | 0.204 (0.037) | 0.825 (0.264) |
| | gold n = 23 (15) | CRP, sTNFR1, vWF, IL-1α, MMP9NGAL, creatinine, BTA | 0.245 (0.049) | 0.700 (0.349) |
| Clinical subpopulations | | | | |
| Smoking | smokers n = 101 (60) | CRP, EGF, MMP9, IL-1α, IL-4, TM, IL-2 | 0.276 (0.027) | 0.770 (0.117) |
| | non-smokers n = 56 (20) | TNFα, sTNFR1, IL-6, IL-1α, MMP9NGAL, creatinine, CEA | 0.156 (0.027) | 0.783 (0.159) |
| Gender | males n = 120 (65) | CRP, EGF, CK18, IL-1β, IL-8, creatinine, IL-2 | 0.272 (0.030) | 0.753 (0.117) |
| | females n = 37 (15) | CRP, EGF, IL-6, dDimer, MMP9NGAL, osmolarity, CEA | 0.181 (0.054) | 0.830 (0.146) |
| Hx stone disease | yes n = 30 (14) | CRP, sTNFR1, CK18, IL-1α, IL-8, creatinine, VEGF | 0.322 (0.062) | 0.738 (0.194) |
| | no n = 127 (66) | CRP, EGF, IL-6, IL-1α, MMP9NGAL, creatinine, CEA | 0.186 (0.015) | 0.817 (0.117) |
| Hx BPE | yes n = 30 (14) | CRP, EGF, IL-6, IL-1α, MMP9NGAL, TM, CEA | 0.192 (0.018) | 0.826 (0.148) |
| | no n = 127 (66) | CRP, EGF, CK18, NGAL, MMP9NGAL, creatinine, BTA | 0.266 (0.061) | 0.788 (0.169) |
| Anti-hypertensive medication | on medication n = 73 (51) | TNFα, EGF, IL-6, protein, MMP9NGAL, creatinine, CEA | 0.211 (0.025) | 0.731 (0.161) |
| | no medication n = 83 (28) | TNFα, sTNFR1, IL-6, NGAL, IL-8, TM, CEA | 0.145 (0.028) | 0.810 (0.132) |
| Anti-platelet medication | on medication n = 37 (25) | TNFα, EGF, IL-6, protein, IL-8, osmolarity, CEA | 0.215 (0.019) | 0.780 (0.141) |
| | no medication n = 118 (53) | CRP, EGF, MCP-1, protein, MMP9NGAL, TM, FPSA | 0.160 (0.046) | 0.843 (0.153) |
| Anti-ulcer medication | on medication n = 33 (17) | CRP, EGF, IL-6, IL-1α, IL-8, TM, CEA | 0.220 (0.018) | 0.827 (0.118) |
| | no medication n = 123 (62) | CRP, EGF, vWF, IL-1β, MMP9NGAL, TM, HA | 0.259 (0.072) | 0.812 (0.168) |
Using the clusters of biomarkers as a feature set, we determined the classification error and the area under the receiver operating characteristic curve (AUROC) of urothelial cancer (UC) diagnostic classifiers for all possible biomarker combinations for all 157 hematuria patients; for three of the five patient clusters; and for 14 subpopulations split on the basis of smoking, gender, history of stone disease, history of benign prostate enlargement (BPE), or anti-hypertensive, anti-platelet or anti-ulcer medications. Each classifier therefore contained one biomarker from each of the seven clusters illustrated in the biomarker dendrogram (Figure 3). The classification errors in the clinically split populations (0.145 to 0.322) were very similar to those obtained for the patient clusters (0.155 to 0.245).

^a Only two of the natural patient subpopulations, those shown in blue and green in Figure 1, contained sufficient numbers to train a Random Forest Classifier (RFC). For comparison, we also trained an RFC for the gold cluster. Four of the seven biomarkers were the same in the diagnostic classifiers for the blue and green patient clusters, suggesting biological similarities. The numbers in brackets in the second column indicate the number of patients with UC.

Abbreviations: BTA, bladder tumor antigen; CEA, carcino-embryonic antigen; CK18, cytokeratin 18; CRP, C-reactive protein; EGF, epidermal growth factor; FPSA, free prostate specific antigen; IL, interleukin; HA, hyaluronidase; MMP9, matrix metalloproteinase 9; NGAL, neutrophil gelatinase-associated lipocalin; NSE, neuron specific enolase; SD, standard deviation; sTNFR1, soluble tumor necrosis factor receptor 1; TM, thrombomodulin; TNFα, tumor necrosis factor α; VEGF, vascular endothelial growth factor; vWF, von Willebrand factor.
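
The two structural claims in the note can be checked or illustrated with a few lines of Python. The sketch below first recomputes the overlap between the blue and green panels exactly as listed in the table, then shows how "one biomarker from each cluster" panels might be enumerated. It does not train a Random Forest or reproduce the authors' pipeline. The names `toy_clusters`, `candidate_panels` and the `marker_*` placeholders are hypothetical, and only three toy clusters are used; the actual membership of the seven biomarker clusters is given in Figure 3, not in this table.

```python
from itertools import product

# Seven-biomarker panels as listed in Table 3 for the blue and green patient clusters.
blue_panel = ["TNFα", "EGF", "NSE", "NGAL", "MMP9NGAL", "TM", "FAS"]
green_panel = ["TNFα", "EGF", "IL-6", "IL-1α", "MMP9NGAL", "TM", "CEA"]

# Biomarkers common to both panels (the note reports four of the seven).
shared = set(blue_panel) & set(green_panel)
print(len(shared), sorted(shared))

# HYPOTHETICAL cluster membership with placeholder names, for illustration only.
# The real clusters come from the biomarker dendrogram (Figure 3).
toy_clusters = [
    ["marker_A1", "marker_A2"],
    ["marker_B1", "marker_B2"],
    ["marker_C1", "marker_C2"],
]


def candidate_panels(clusters):
    """Return every panel that takes exactly one biomarker from each cluster."""
    return [list(combo) for combo in product(*clusters)]


panels = candidate_panels(toy_clusters)
print(len(panels))  # 2 * 2 * 2 = 8 candidate panels for the toy clusters
```

In a fuller analysis, each candidate panel would be passed to a classifier and scored by classification error and AUROC; that step is omitted here because the underlying patient data are not part of this table.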