Table 3 Comparison of diagnostic performance between the groups of radiologists at different levels

From: Deep learning radiomics of dual-modality ultrasound images for hierarchical diagnosis of unexplained cervical lymphadenopathy

| Radiologist group | Metric | Internal testing cohort (n = 171): without → with AI (%) | P1 | P2 | External testing cohort 1 (n = 105): without → with AI (%) | P1 | P2 | External testing cohort 2 (n = 92): without → with AI (%) | P1 | P2 |
|---|---|---|---|---|---|---|---|---|---|---|
| Senior | Accuracy | 83.2 → 86.3 ↑ | 0.033 | / | 81.7 → 84.1 ↑ | 0.132 | / | 81.5 → 82.6 ↑ | 0.291 | / |
| Senior | Sensitivity | 66.4 → 72.5 ↑ | 0.032 | / | 63.3 → 68.1 ↑ | 0.132 | / | 63.0 → 65.2 ↑ | 0.314 | / |
| Senior | Specificity | 88.8 → 90.8 ↑ | 0.040 | / | 87.8 → 89.4 ↑ | 0.116 | / | 87.7 → 88.4 ↑ | 0.292 | / |
| Middle | Accuracy | 81.0 → 85.5 ↑ | 0.007 | 0.534 | 77.9 → 80.7 ↑ | 0.088 | 0.616 | 78.5 → 81.5 ↑ | 0.096 | 0.628 |
| Middle | Sensitivity | 62.0 → 71.1 ↑ | 0.005 | 0.551 | 55.7 → 61.4 ↑ | 0.107 | 0.614 | 57.1 → 63.1 ↑ | 0.087 | 0.647 |
| Middle | Specificity | 87.3 → 90.4 ↑ | 0.004 | 0.537 | 85.2 → 87.1 ↑ | 0.093 | 0.589 | 85.7 → 87.7 ↑ | 0.102 | 0.644 |
| Junior | Accuracy | 77.8 → 81.0 ↑ | 0.033 | 0.420 | 77.1 → 82.1 ↑ | 0.013 | 0.670 | 75.6 → 78.8 ↑ | 0.075 | 0.783 |
| Junior | Sensitivity | 58.8 → 62.0 ↑ | 0.037 | 0.440 | 54.3 → 64.3 ↑ | 0.009 | 0.702 | 51.1 → 57.6 ↑ | 0.091 | 0.790 |
| Junior | Specificity | 85.2 → 87.3 ↑ | 0.034 | 0.430 | 84.8 → 88.1 ↑ | 0.016 | 0.671 | 83.7 → 85.9 ↑ | 0.080 | 0.759 |
  1. P1 values indicate a comparison between the AI model and each level of radiologist group without AI assistance. P2 values indicate a comparison between the junior and middle-level radiologist groups with AI assistance and the senior radiologist group without AI assistance. The upward arrow (↑) marks indicators that improved with AI assistance.
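
The "without → with AI" entries can be read as absolute gains in percentage points. The following minimal Python sketch computes those gains for the accuracy values of the internal testing cohort, copied from the table. The names `accuracy_internal` and `gain_in_points` are hypothetical and chosen only for illustration. The sketch does simple subtraction and does not reproduce the P1 or P2 values, which come from the authors' own statistical analysis.

```python
# Accuracy (%) without and with AI assistance for the internal testing
# cohort (n = 171), copied from Table 3. Names are illustrative only.
accuracy_internal = {
    "Senior": (83.2, 86.3),
    "Middle": (81.0, 85.5),
    "Junior": (77.8, 81.0),
}


def gain_in_points(without_ai, with_ai):
    """Return the absolute change in percentage points, rounded to one decimal."""
    return round(with_ai - without_ai, 1)


# Gain per radiologist level, in percentage points.
gains = {
    level: gain_in_points(without_ai, with_ai)
    for level, (without_ai, with_ai) in accuracy_internal.items()
}

for level, gain in gains.items():
    print(f"{level}: +{gain} percentage points")
```

The same helper applies unchanged to the sensitivity and specificity rows, or to either external cohort, by swapping in the corresponding pairs of values.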