
Table 3 Diagnostic performance of DLM-2

From: Deep learning models of ultrasonography significantly improved the differential diagnosis performance for superficial soft-tissue masses: a retrospective multicenter study

 

| Mass type and cohort | AUC | ACC (%) | Sensitivity (%) | Specificity (%) | PPV (%) | NPV (%) | F1-score |
|---|---|---|---|---|---|---|---|
| Lipomyoma | | | | | | | |
| Training cohort | 0.98 [0.954, 1.0] | 94.3 [92.6, 95.9] | 90.8 [86.1, 95.1] | 95.3 [93.4, 96.8] | 83.2 [77.3, 88.5] | 97.6 [96.3, 98.7] | 0.869 [0.825, 0.904] |
| Validation cohort | 0.986 [0.973, 0.997] | 95.3 [92.6, 98.0] | 83.9 [71.9, 93.8] | 98.3 [96.4, 100.0] | 92.9 [84.2, 100.0] | 95.8 [92.7, 98.4] | 0.881 [0.8, 0.947] |
| Test cohort A | 0.98 [0.954, 1.0] | 84.1 [78.8, 88.7] | 95.0 [88.9, 100.0] | 80.2 [73.7, 86.0] | 63.3 [53.1, 73.8] | 97.8 [95.2, 100.0] | 0.76 [0.674, 0.832] |
| Test cohort B | 0.905 [0.847, 0.956] | 78.6 [72.3, 84.8] | 96.4 [89.5, 100.0] | 72.6 [64.9, 80.2] | 54.0 [42.6, 65.9] | 98.4 [95.2, 100.0] | 0.692 [0.59, 0.786] |
| Hemangioma | | | | | | | |
| Training cohort | 0.976 [0.966, 0.985] | 90.6 [88.5, 92.6] | 92.5 [88.2, 96.3] | 90.1 [87.8, 92.2] | 70.7 [64.5, 76.7] | 97.9 [96.7, 99.0] | 0.801 [0.756, 0.844] |
| Validation cohort | 0.993 [0.986, 0.999] | 93.9 [90.5, 96.6] | 70.0 [55.9, 82.9] | 100.0 [100.0, 100.0] | 100.0 [100.0, 100.0] | 92.9 [89.1, 96.2] | 0.824 [0.717, 0.906] |
| Test cohort A | 0.909 [0.851, 0.957] | 79.5 [74.2, 84.8] | 87.5 [71.4, 100.0] | 78.5 [72.7, 84.6] | 32.6 [21.2, 45.2] | 98.1 [95.6, 100.0] | 0.475 [0.333, 0.606] |
| Test cohort B | 0.827 [0.735, 0.909] | 72.3 [65.2, 80.4] | 73.7 [57.1, 90.0] | 72.0 [64.3, 80.4] | 35.0 [22.9, 48.4] | 93.1 [87.7, 97.5] | 0.475 [0.333, 0.604] |
| Neurinoma | | | | | | | |
| Training cohort | 0.94 [0.92, 0.96] | 85.1 [82.7, 87.7] | 92.5 [88.3, 96.3] | 83.2 [80.3, 86.2] | 58.7 [53.1, 64.9] | 97.7 [96.4, 99.0] | 0.718 [0.669, 0.766] |
| Validation cohort | 0.944 [0.902, 0.978] | 89.1 [85.0, 93.2] | 86.7 [76.0, 96.4] | 89.7 [85.0, 94.0] | 68.4 [55.2, 80.0] | 96.3 [93.4, 99.1] | 0.765 [0.656, 0.847] |
| Test cohort A | 0.885 [0.816, 0.94] | 82.8 [77.5, 87.4] | 84.0 [70.8, 95.8] | 82.5 [76.4, 88.2] | 48.8 [35.7, 61.0] | 96.3 [93.1, 99.1] | 0.618 [0.491, 0.716] |
| Test cohort B | 0.811 [0.716, 0.895] | 65.2 [58.0, 72.3] | 77.8 [60.0, 94.1] | 62.8 [54.2, 70.3] | 28.6 [18.0, 38.5] | 93.7 [88.1, 98.4] | 0.418 [0.286, 0.527] |
| Epidermal cyst | | | | | | | |
| Training cohort | 0.997 [0.993, 1.0] | 98.5 [97.6, 99.1] | 97.5 [95.2, 99.4] | 98.8 [97.9, 99.5] | 96.9 [94.4, 98.8] | 99.1 [98.3, 99.8] | 0.972 [0.956, 0.986] |
| Validation cohort | 0.973 [0.928, 0.998] | 95.2 [92.5, 98.0] | 87.5 [78.0, 95.1] | 98.1 [95.8, 100.0] | 94.6 [88.1, 100.0] | 95.5 [91.7, 98.2] | 0.909 [0.842, 0.958] |
| Test cohort A | 0.942 [0.911, 0.969] | 84.8 [79.5, 89.4] | 51.1 [38.3, 63.8] | 99.1 [97.2, 100.0] | 95.8 [88.5, 100.0] | 82.7 [76.8, 87.7] | 0.667 [0.54, 0.767] |
| Test cohort B | 0.898 [0.839, 0.942] | 82.1 [75.9, 87.5] | 33.3 [17.9, 48.1] | 97.6 [94.7, 100.0] | 81.8 [58.3, 100.0] | 82.2 [75.8, 88.1] | 0.474 [0.286, 0.622] |
| Calcifying epithelioma | | | | | | | |
| Training cohort | 0.974 [0.958, 0.986] | 91.3 [89.2, 93.2] | 90.6 [84.2, 96.0] | 91.3 [89.2, 93.5] | 56.3 [48.3, 64.2] | 98.8 [97.9, 99.6] | 0.695 [0.625, 0.759] |
| Validation cohort | 0.903 [0.816, 0.967] | 85.8 [81.8, 90.5] | 82.4 [66.7, 95.7] | 86.3 [81.5, 91.0] | 43.8 [29.3, 57.7] | 97.4 [94.8, 99.2] | 0.571 [0.417, 0.692] |
| Test cohort A | 0.943 [0.909, 0.973] | 90.1 [86.1, 94.0] | 60.0 [42.9, 76.9] | 96.0 [93.1, 98.4] | 75.0 [58.8, 89.5] | 92.4 [88.7, 96.1] | 0.667 [0.514, 0.787] |
| Test cohort B | 0.898 [0.848, 0.942] | 87.5 [82.1, 92.0] | 50.0 [32.0, 69.6] | 95.7 [91.7, 98.9] | 71.4 [50.0, 91.7] | 89.8 [84.5, 94.2] | 0.588 [0.4, 0.744] |

  1. Values in brackets are 95% confidence intervals
  2. Abbreviations: AUC area under the receiver operating characteristic curve, ACC accuracy, PPV positive predictive value, NPV negative predictive value, DLM deep learning model, training cohort (n = 584 individuals), validation cohort (n = 148 individuals), test cohort A (n = 151 individuals), test cohort B (n = 112 individuals)
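
For readers less familiar with the metrics abbreviated above, the following is a minimal sketch of how per-class (one-vs-rest) values of this kind are conventionally computed from confusion-matrix counts. It is not the authors' code: the function names and the example counts are hypothetical, and the table does not state how the confidence intervals were obtained, so none are computed here. The only figures taken from the table are the hemangioma test cohort A sensitivity (87.5%) and PPV (32.6%), used to check the standard F1 formula against the reported F1-score.

```python
import math


def _safe_div(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def one_vs_rest_metrics(tp, fp, fn, tn):
    """Standard binary metrics for one class against all others.

    tp, fp, fn, tn are confusion-matrix counts for the class of interest
    (hypothetical inputs; the table does not report raw counts).
    """
    sensitivity = _safe_div(tp, tp + fn)  # true positive rate
    specificity = _safe_div(tn, tn + fp)  # true negative rate
    ppv = _safe_div(tp, tp + fp)          # positive predictive value
    npv = _safe_div(tn, tn + fn)          # negative predictive value
    accuracy = _safe_div(tp + tn, tp + fp + fn + tn)
    f1 = _safe_div(2 * tp, 2 * tp + fp + fn)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


def f1_from_ppv_sensitivity(ppv, sensitivity):
    """F1 as the harmonic mean of PPV (precision) and sensitivity (recall)."""
    return _safe_div(2 * ppv * sensitivity, ppv + sensitivity)


if __name__ == "__main__":
    # Illustrative counts only; they do not come from the study.
    print(one_vs_rest_metrics(tp=40, fp=10, fn=10, tn=140))

    # Hemangioma, test cohort A: PPV 32.6%, sensitivity 87.5%.
    # The harmonic mean rounds to 0.475, matching the F1-score in the table.
    print(round(f1_from_ppv_sensitivity(0.326, 0.875), 3))
```

The hemangioma check shows why a class can have high sensitivity and NPV yet a low F1-score: F1 depends only on sensitivity and PPV, so a low PPV pulls it down regardless of specificity or NPV.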