Deep learning models of ultrasonography significantly improved the differential diagnosis performance for superficial soft-tissue masses: a retrospective multicenter study

Table 2 Diagnostic performance of DLM-1

Metric	Training cohort	Validation cohort	Test cohort A	Test cohort B
AUC	0.915 [0.871, 0.95]	0.992 [0.98, 1.0]	0.979 [0.952, 1.0]	0.898 [0.827, 0.959]
ACC (%)	87.5 [84.7, 90.0]	98.7 [95.4, 99.8]	97.4 [93.6, 99.3]	91.0 [84.4, 95.4]
Sensitivity (%)	84.4 [67.2, 94.7]	100.0 [63.1, 100.0]	40.0 [5.3, 85.3]	20.0 [2.5, 55.6]
Specificity (%)	87.7 [84.8, 90.2]	98.6 [95.2, 99.8]	99.3 [96.4, 100.0]	97.3 [92.4, 99.4]
PPV (%)	27.3 [22.4, 32.8]	80.0 [50.2, 94.1]	66.7 [17.7, 94.9]	40.0 [11.2, 78.0]
NPV (%)	99.0 [97.9, 99.6]	100.0 [100.0, 100.0]	98.0 [96.1, 99.0]	93.2 [91.0, 94.9]
F1-score	0.412 [0.318, 0.5]	0.889 [0.706, 1.0]	0.5 [0, 0.857]	0.267 [0, 0.5]

Values in brackets are 95% confidence intervals
Abbreviations: AUC, area under the receiver operating characteristic curve; ACC, accuracy; PPV, positive predictive value; NPV, negative predictive value; DLM, deep learning model. Cohort sizes: training cohort (n = 617 individuals), validation cohort (n = 155 individuals), test cohort A (n = 156 individuals), test cohort B (n = 122 individuals)
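
The F1-score row can be cross-checked against the sensitivity and PPV rows, since F1 is the harmonic mean of precision (PPV) and recall (sensitivity). The sketch below is not taken from the study: the function name `f1_from_rates`, the dictionary `TABLE2`, and the 0.002 tolerance are assumptions made for illustration. The inputs are the rounded point estimates copied from Table 2, so small rounding differences from the reported F1 values are expected.

```python
def f1_from_rates(sensitivity_pct, ppv_pct):
    """Return F1 as the harmonic mean of recall (sensitivity) and precision (PPV).

    Both arguments are percentages, as reported in Table 2.
    """
    recall = sensitivity_pct / 100.0
    precision = ppv_pct / 100.0
    if precision + recall == 0:
        return 0.0  # convention when there are no true positives
    return 2 * precision * recall / (precision + recall)


# (sensitivity %, PPV %, reported F1) copied from Table 2
TABLE2 = {
    "Training cohort": (84.4, 27.3, 0.412),
    "Validation cohort": (100.0, 80.0, 0.889),
    "Test cohort A": (40.0, 66.7, 0.5),
    "Test cohort B": (20.0, 40.0, 0.267),
}

# Assumed tolerance: absorbs rounding of the published percentages.
TOLERANCE = 0.002

for cohort, (sens, ppv, reported) in TABLE2.items():
    computed = f1_from_rates(sens, ppv)
    consistent = abs(computed - reported) < TOLERANCE
    print(f"{cohort}: computed F1 = {computed:.3f}, reported = {reported}, consistent = {consistent}")
```

Because F1 ignores true negatives, it can be low even when accuracy and specificity are high, as in test cohorts A and B.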

ISSN: 1741-7015