Fig. 3 | BMC Medicine

Fig. 3

From: Sampling inequalities affect generalization of neuroimaging-based diagnostic classifiers in psychiatry


Sampling bias and sampling inequalities in these trained ML models. A Scatter plot of the association between GDP and sample size for 32 countries/regions across the globe. B Scatter plot of the association between GDP and sample size for 20 provinces within China. C Scatter plot of the association between GDP and sample size for 25 states within the USA. D Gini sampling coefficients for the top 10% of countries by sample size used to train ML models in existing studies; a higher Gini value indicates greater sampling inequality. LEDC (less economically developed countries) and MEDC (more economically developed countries) were categorized according to the World Bank (WB) and International Monetary Fund (IMF) classifications. E Sampling bias and Gini coefficients for each continent. The left panel shows the total sample size used to train ML models in existing studies as a proportion of the total sample population for each continent. The right panel shows the Gini coefficient for each continent.
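The caption does not state how the Gini sampling coefficients were computed. As a minimal sketch, assuming the standard Gini coefficient applied to a list of sample sizes, the calculation could look like the following. The function name `gini` and the input values are hypothetical and are not taken from the article.

```python
def gini(values):
    """Standard Gini coefficient of non-negative values.

    Returns 0.0 for perfect equality; approaches 1.0 as the total
    concentrates in a single element. Assumption: this is the textbook
    formula, not necessarily the exact procedure used in the article.
    """
    x = sorted(float(v) for v in values)
    n = len(x)
    total = sum(x)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the ascending-sorted values (ranks start at 1).
    weighted = sum(rank * v for rank, v in enumerate(x, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


# Hypothetical sample sizes, for illustration only.
equal_sites = [5, 5, 5, 5]      # every site contributes the same sample
one_site_only = [0, 0, 0, 10]   # all samples come from a single site

print(gini(equal_sites))    # 0.0
print(gini(one_site_only))  # 0.75
```

With n sites, the maximum value under this formula is (n - 1) / n, which is why the fully concentrated four-site example yields 0.75 rather than 1.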
