  • Research article
  • Open access

Mapping age- and sex-specific HIV prevalence in adults in sub-Saharan Africa, 2000–2018



Human immunodeficiency virus and acquired immune deficiency syndrome (HIV/AIDS) is still among the leading causes of disease burden and mortality in sub-Saharan Africa (SSA), and the world is not on track to meet targets set for ending the epidemic by the Joint United Nations Programme on HIV/AIDS (UNAIDS) and the United Nations Sustainable Development Goals (SDGs). Precise HIV burden information is critical for effective geographic and epidemiological targeting of prevention and treatment interventions. Age- and sex-specific HIV prevalence estimates are widely available at the national level, and region-wide local estimates were recently published for adults overall. We add further dimensionality to previous analyses by estimating HIV prevalence at local scales, stratified into sex-specific 5-year age groups for adults ages 15–59 years across SSA.


We analyzed data from 91 seroprevalence surveys and sentinel surveillance among antenatal care clinic (ANC) attendees using model-based geostatistical methods to produce estimates of HIV prevalence across 43 countries in SSA, from 2000 to 2018, at a 5 × 5-km resolution, and present them aggregated to second administrative level units (typically districts or counties).


We found substantial variation in HIV prevalence across localities, ages, and sexes that has been masked in earlier analyses. Within-country variation in prevalence in 2018 was a median 3.5 times greater across ages and sexes than for all adults combined. We note large within-district prevalence differences between age groups: for men, 50% of districts displayed at least a 14-fold difference between the age groups with the highest and lowest prevalence, and for women, at least a 9-fold difference. Prevalence trends also varied over time; between 2000 and 2018, 70% of all districts saw a reduction in prevalence greater than five percentage points in at least one sex and age group. Meanwhile, over 30% of all districts saw at least a five percentage point prevalence increase in one or more sex and age groups.


As the HIV epidemic persists and evolves in SSA, geographic and demographic shifts in prevention and treatment efforts are necessary. These estimates offer epidemiologically informative detail to better guide more targeted interventions, vital for combating HIV in SSA.



Four decades after its discovery, human immunodeficiency virus (HIV) continues to impact millions of people worldwide, remains one of the leading causes of morbidity and mortality globally [1, 2] and incurs billions of dollars annually in direct health care costs and indirect socioeconomic costs [3]. In sub-Saharan Africa (SSA) in 2019, an estimated 26 million people were living with HIV [2]. In recent years, international bodies have set goals to end the HIV epidemic: in 2014, the Joint United Nations Programme on HIV/AIDS (UNAIDS) introduced the “95-95-95” targets—that by 2030, 95% of people living with HIV globally would know their status, 95% of all people with diagnosed HIV infection would receive sustained antiretroviral therapy, and 95% of people living with HIV receiving antiretroviral therapy (ART) would be virally suppressed [4, 5]. The United Nations Sustainable Development Goals also call for an end to the AIDS epidemic by 2030 [6]. Unfortunately, despite a significant increase in ART coverage over the last 20 years and major progress in terms of reductions in HIV incidence and mortality [1], the latest estimates and projections indicate that the world is not on track to meet these goals [2, 7, 8], and progress may stall further as a consequence of the COVID-19 pandemic [9].

Differences in HIV prevalence both within and between nations in SSA have been well-documented [10,11,12,13,14], as have differences between sexes [2, 12,13,14] and age groups [2]. These differences have also changed over time [1, 10], impacted in part by the onset, duration, location, and demographic targeting of different prevention and treatment interventions [15,16,17]. Epidemiologically targeted interventions are understood to be more effective compared to homogeneous interventions [18] and are increasingly important at a time when the future of funding for HIV prevention and treatment is both uncertain and highly variable [19, 20], particularly in the wake of disruptions related to the COVID-19 pandemic [21]. Evidence suggests that interventions are most effective when tailored to account for differences in the intensity of the epidemic by geographic location [14, 22], sex [23], and age [24]. Locally and demographically precise HIV prevalence information, however, is necessary in order to maximize the benefit of such methods; at present, such information in SSA is lacking.

HIV prevalence estimates stratified by age and sex are available at the national level through the Global Burden of Disease (GBD) [2] and from UNAIDS [25]. Both sources also provide subnational estimates at the first administrative level (e.g., province, state) in select countries. Recently, Dwyer-Lindgren et al. [10] presented aggregated adult HIV prevalence estimates for the years 2000–2017 at local scales in SSA, generalizing estimates for males and females combined, and across ages 15–49 years. Some studies have gone further to present subnational prevalence estimates separated by sex [26,27,28,29] or age [30]; however, these studies focused on single countries, and/or presented estimates for only one point in time, without describing any temporal trajectories in prevalence. To our knowledge, no previous studies have presented age- and sex-specific HIV prevalence estimates across SSA at local scales over time.

We built upon the HIV prevalence model from Dwyer-Lindgren et al. [10] to produce HIV prevalence estimates for 43 countries in SSA for males and females ages 15–59 years, stratified into nine 5-year age groups, for the years spanning 2000 to 2018. Countries, age groups, and time period were selected according to data availability. We expanded upon existing Bayesian spatiotemporal methods to model these estimates at a 5 × 5-km resolution and present them here aggregated to the second administrative level (which varies by country but is typically equivalent to districts or municipalities), which is the level typically considered most relevant to policymakers and stakeholders. Prevalence estimates for all demographic groups at all levels of geographic aggregation, as well as number of people living with HIV (count estimates), are publicly available from the Global Health Data Exchange and through a user-friendly data visualization tool.



This ecological study follows the Guidelines for Accurate and Transparent Health Estimates Reporting (GATHER) [31] (Additional file 1: Section 1). This analysis relies on secondary data sources to provide estimates of HIV prevalence on a 5 × 5-km grid in 43 countries in SSA for males and females ages 15–59 years residing at each location, stratified into five-year age bins (i.e., ages 15–19, 20–24, 25–29, 30–34, 35–39, 40–44, 45–49, 50–54, 55–59), with annual resolution from year 2000 to 2018 inclusive, calibrated to national estimates from the GBD [2]. The period of 2000–2018 and the age range of 15–59 years were selected to optimize the contemporaneousness of the estimates and to account for data availability: there were relatively few large-scale seroprevalence surveys conducted before 2000, and most seroprevalence surveys focus on adults, with little reporting outside the 15–59 years age range. We produced estimates for sex rather than gender binaries because sex is more predominantly reported in the available data sources. Due to data availability limitations, we were unable to produce prevalence estimates for sex minority individuals outside the male/female binary. The 43 countries analyzed were also selected according to data availability; Mauritania was excluded as there were no HIV prevalence data available. We included six countries (Djibouti, Guinea-Bissau, Madagascar, Somalia, South Sudan, and Sudan) where no seroprevalence survey data were available, but where sentinel surveillance data collected from antenatal care clinic (ANC) attendees (described below) were available. The implications of these and other limitations are expanded upon in the “Methodological advantages and limitations” section in the “Discussion” section.

The methodology used here largely parallels that previously used to map adult HIV prevalence in SSA [10], with the incorporation of modifications necessary to model by age and sex, and improvements related to the inclusion of spatially aggregated data and ANC data (Fig. 1). We used a 5 × 5-km grid for consistency with this previous analysis; to align with the resolution available for pre-existing covariates incorporated in this analysis; and for flexibility in aggregating these estimates to other levels of interest (e.g., first- and second-level administrative subdivisions, such as states or districts, respectively, or more aggregated age groups such as reproductive ages [commonly 15–49]) using grid-cell-level estimates of age- and sex-specific population from WorldPop [32]. These population estimates were also used to estimate the number of people living with HIV in each demographic group. All analyses were conducted in R version 3.6.1 [33]. Figure 2 provides an overview of the analytic process, described in more depth below. Additional details are available in Additional file 1.

Fig. 1

HIV prevalence data by region and country. a HIV seroprevalence survey data and b ANC sentinel surveillance data used in this analysis, by region and country. Color indicates the data source. AIS, AIDS Indicator Survey; DHS, Demographic and Health Survey; MICS, Multiple Indicator Cluster Survey; PHIA, Population-based HIV Impact Assessment Survey. Shape type indicates whether a data source is age-specific and has point (GPS) or polygon location information. Size indicates the relative effective sample size for each source. A full list of data sources with additional details about data type (such as survey microdata and survey reports) and geographical details are provided in Additional file 2: Tables S1-S5

Fig. 2

Analytical process overview. The process used to produce age- and sex-specific HIV prevalence estimates in sub-Saharan Africa involved three main parts. In the data-processing steps (green), data were identified, extracted, and prepared for use in the HIV prevalence model and in covariate models. In the modeling phase (orange), we used these data and covariates in a stacked generalization ensemble model and spatiotemporal Gaussian process model. In the post-processing phase (blue), we calibrated the prevalence estimation to match GBD 2019 estimates at the national level, aggregated prevalence estimates to the first- and second-level administrative subdivisions in each country, and calculated the number of people living with HIV (PLHIV)


HIV data

We compiled a geolocated dataset of 304,672 observations from 91 seroprevalence surveys from 37 countries and 10,351 observations from sentinel surveillance among antenatal care clinic attendees (ANC data) in 43 countries (Additional file 2: Tables S1-S2; Fig. 1). Data from seroprevalence surveys were originally in the form of survey microdata (that is, individual-level survey responses) or survey reports (Additional file 2: Table S1). For surveys with available microdata, we extracted variables related to age, sex, HIV blood test result, location, and year, as well as survey weights, where available. We excluded rows with missing information on any of these variables, and subset the data to ages 15–59 years. For data coded by gender rather than sex, we treated these data as if they were sex-specific rather than gender-specific. We recognize that sex and gender are not interchangeable: sex is a biological variable, while gender is a fluid social construct. In the absence of quality data, however, we could not disaggregate estimates by gender at this time. After subsetting by age, we collapsed the age-specific data into 5-year age bins (hereafter referred to as “ages”) by sex. We did this by calculating the weighted age- and sex-specific HIV prevalence at the finest spatial resolution available. Ideally, this was at the level of global positioning system (GPS) coordinates that represent the location of a survey cluster. In most surveys, GPS coordinates are randomly displaced (typically by 2–5 km depending on the setting and the survey series [34]) in order to protect respondents’ confidentiality. In instances where GPS coordinates were not available, the smallest areal unit (termed a “polygon”) possible was used instead. These typically represented an administrative subdivision.
For surveys without microdata but for which estimates with some subnational resolution were provided in a report, we extracted these estimates with information about the sample size and location. GPS coordinates were not available for these reports, so these data were exclusively matched to polygons. In most reports, age ranges larger than 5 years were reported. Among these, we retained data reported for age ranges that corresponded exactly to one or more of the 5-year age bins used in this model; for example, we included surveys covering age ranges 15–49 years, or 15–24 years, but excluded those covering age ranges such as 18–24 years. For age-aggregated data, we retained information regarding the age range covered, to be used in our modeling process as described below. We also only included sex-specific data. For more information on excluded surveys see Additional file 2: Table S3.

Data that were spatially aggregated (i.e., polygon data) and/or age-aggregated required additional processing. Although we ultimately modeled HIV prevalence at the level of the observation, be it point or polygon, age-specific or age-aggregated, our modeling process initially specified HIV prevalence at the point-, time-, age-, and sex-specific level. Because of this, it was necessary that we disaggregate the age-aggregated and polygon survey data to be location- and age-specific. We did this by distributing polygon data to pixels proportional to population. Specifically, for each polygon, we generated points at the centroid of each 5 × 5-km pixel falling within that polygon and replicated that observation’s HIV prevalence and sample size at the location of each of those centroids. Age-aggregated point data were similarly disaggregated by replicating the HIV prevalence and sample size once for each year-age group covered in the overall age range. In the cases of age-aggregated polygon data, these two processes were combined. Next, each of the disaggregated, location- and age-specific rows of data associated with a given aggregated observation were assigned weights proportional to the age- and sex-specific population residing at that location for the given year, derived from WorldPop [32]. Weights per observation all summed to one. This process substantially increased the size of the dataset. To reduce the associated computational burden when fitting the model, in cases where at least one row within an observation was given a weight of less than half of one divided by the number of locations and/or ages in that observation, we successively dropped the lowest-weighted locations and/or ages until reaching a maximum of 1% of the observation’s weight dropped. Remaining locations and/or ages within that observation were then reweighted to maintain a total weight of one. 
Data that were not aggregated (i.e., age-specific point observations) were each assigned a weight of one.
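The weight-trimming rule described above can be sketched in a few lines of Python. This is only an illustrative reading of the text, not the authors' R code: the function name `trim_weights`, the list-based data layout, and the exact stopping rule (drop rows from the lightest upward while the cumulative dropped weight stays at or below 1%) are assumptions.

```python
def trim_weights(weights, max_dropped=0.01):
    """Drop the lowest-weighted rows of one aggregated observation, then renormalize.

    weights: population-proportional weights for the disaggregated rows of a
             single observation; assumed to sum to one.
    Returns (original_index, new_weight) pairs for the rows that are kept.
    """
    n = len(weights)
    threshold = 0.5 / n  # "half of one divided by the number of rows"
    indexed = sorted(enumerate(weights), key=lambda pair: pair[1])

    kept = indexed
    if indexed[0][1] < threshold:  # trimming is triggered only by a very light row
        dropped = 0.0
        cut = 0
        # Drop rows from the lightest upward while the cumulative dropped
        # weight stays within the 1% budget (always keep at least one row).
        while cut < n - 1 and dropped + indexed[cut][1] <= max_dropped:
            dropped += indexed[cut][1]
            cut += 1
        kept = indexed[cut:]

    total = sum(w for _, w in kept)
    return sorted((i, w / total) for i, w in kept)


# Hypothetical observation spread over five pixels.
trimmed = trim_weights([0.004, 0.005, 0.006, 0.485, 0.5])
untouched = trim_weights([0.25, 0.25, 0.25, 0.25])
```

In the hypothetical example, the two lightest pixels are dropped (0.9% of the weight) and the remaining three are rescaled to sum to one; the evenly weighted observation is left unchanged because no row falls below the trigger threshold.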

ANC data were primarily derived from national HIV estimate files developed by national teams and compiled and shared via UNAIDS [35] and supplemented with data derived from sentinel surveillance country reports (Additional file 2: Table S2). We extracted information from these sources on HIV prevalence and sample size by site and year. Sites were geolocated to specific GPS coordinates where possible and otherwise to a polygon that represents an administrative subdivision. The ANC data available for this analysis were not age-specific. Because ANC data included only pregnant females, we assumed the age range of these data to be that of females with non-zero fertility rates in SSA according to GBD 2019 [36], that is, females ages 15–54 years. We disaggregated ANC data to the age and location level as we did for age-aggregated or polygon survey data. However, specific locations and ages were weighted by number of births rather than population size. The number of births for a given age and location was estimated as the product of the location-, age-, and sex-specific population, again derived from WorldPop [32], and the national fertility rate, derived from GBD 2019 estimates [36].
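As a minimal sketch of the birth-based weighting just described, the following Python function multiplies a female population by a fertility rate for each location-age cell and normalizes the result. The function name and the input values are hypothetical; the real inputs would be WorldPop populations and GBD 2019 fertility rates.

```python
def birth_weights(female_pop, fertility_rate):
    """Weights proportional to expected births in each location-age cell.

    female_pop:     female population in each cell
    fertility_rate: fertility rate applied to each cell (same order)
    """
    births = [p * f for p, f in zip(female_pop, fertility_rate)]
    total = sum(births)
    return [b / total for b in births]


# Hypothetical ANC site disaggregated into three location-age cells.
weights = birth_weights([1000, 1000, 500], [0.10, 0.20, 0.05])
```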


This analysis included the same covariates as the previous analysis [10]. This included five pre-existing covariates: (1) travel time to the nearest settlement of more than 50,000 inhabitants; (2) total population; (3) night-time lights; (4) urbanicity; and (5) malaria incidence (Additional file 2: Table S4). In addition, eight covariates were constructed explicitly for this analysis owing to their known association with HIV prevalence and data availability: (1) prevalence of male circumcision (all forms); (2) prevalence of self-reported sexually transmitted infection (STI) symptoms; (3) prevalence of marriage or living with a partner as married; (4) prevalence of one’s current partner living elsewhere among females; (5) prevalence of condom use at last sexual encounter; (6) prevalence of reporting ever having had intercourse among young females; and (7) and (8) prevalence of multiple partners in the past year for males and for females, respectively. We updated the covariates constructed for this analysis to incorporate newly available data but utilized the original statistical methods (Additional file 1: Section 3.2; Additional file 2: Table S5; Additional file 3: Figs. S1-S8).

Model and estimation

Covariate stacking

An ensemble covariate modeling approach (“stacking”) was implemented to capture possible nonlinear interactions among the covariates across space and time [37]. In this approach, three sub-models were fitted to the HIV survey data with the covariates as explanatory predictors: generalized additive models [38], boosted regression trees [39], and lasso regression [40]. Each sub-model was fitted using fivefold cross-validation to avoid overfitting, and the out-of-sample predictions from across the five folds were compiled into a single set of predictions that were used to fit the geostatistical model described below. In addition, each sub-model was also fitted to the full dataset to generate a complete set of in-sample predictions that were subsequently used when generating predictions from the geostatistical model (Additional file 3: Figs. S9-S11). Because the covariates used here were neither age-specific nor (for most) sex-specific, we fit these sub-models at the same age- and sex-aggregated level as the HIV-specific covariates, modeling HIV prevalence data aggregated across ages 15–49 and males and females. The age range 15–49 years was used in this case because of its more common usage in seroprevalence surveys compared to the 15–59 years range, allowing us to retain more data for the stacking model. Polygon data were excluded from stacking models due to their incongruity with the configurations needed for the different sub-models. The ANC data were also excluded due to known sampling biases, which are described in Additional file 1: Section 4.2.
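The out-of-sample step of stacking can be illustrated with a short Python sketch. It is not the authors' implementation: a one-covariate least-squares line stands in for the actual sub-models (generalized additive models, boosted regression trees, and lasso), folds are assigned by index modulo rather than at random, and all function names are hypothetical.

```python
def out_of_fold_predictions(x, y, fit, predict, k=5):
    """Return one held-out prediction per row using k-fold cross-validation."""
    n = len(x)
    preds = [None] * n
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        # Fit on the other folds only, so each prediction is out of sample.
        model = fit([x[i] for i in train_idx], [y[i] for i in train_idx])
        for i in test_idx:
            preds[i] = predict(model, x[i])
    return preds


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x (stand-in for a sub-model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((v - mx) ** 2 for v in xs)
    b = sum((xv - mx) * (yv - my) for xv, yv in zip(xs, ys)) / sxx
    return (my - b * mx, b)


def predict_line(model, xv):
    a, b = model
    return a + b * xv


# Hypothetical covariate and outcome with an exact linear relationship.
x = [float(v) for v in range(20)]
y = [2.0 * v + 1.0 for v in x]
oof = out_of_fold_predictions(x, y, fit_line, predict_line)
```

The compiled `oof` vector plays the role of the out-of-sample predictions that feed the geostatistical model; a second fit on the full dataset would supply the in-sample predictions used at prediction time.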

Geostatistical model

This model was fit in Template Model Builder (TMB) [41]. Owing to computational constraints, and to allow for regional differences in the relationships between covariates and HIV prevalence, as well as differences in the temporal, spatial, and demographic autocorrelation in HIV prevalence, separate models were fitted for four regions (Additional file 3: Fig. S12). We modeled HIV prevalence stratified by space, time, age, and sex using a generalized linear mixed-effects model. To simultaneously model point- and polygon-level observations, as well as both age-specific and age-aggregated observations, we specified the data likelihood at the observation level (i), which accommodated all of these. We modeled the number of HIV-positive individuals (Yi) among a sample (Ni) for a given observation as a binomial variable:

$${Y}_i\sim \textrm{Binomial}\left({N}_i,{p}_i\right)$$

Logit-transformed prevalence was, however, first specified at the space-, time-, age-, and sex-disaggregated level (j):

$${\displaystyle \begin{array}{c}\textrm{logit}\left({p}_j\right)={\beta}_0+{\boldsymbol{\beta}}_1{\boldsymbol{X}}_j+{Z}_{1,j}+{Z}_{2,j}+{Z}_{3,c\left[j\right]}\\ {}{Z}_{1,j}\sim \textrm{GP}\left(0,{\varSigma}_{1, space}\otimes {\varSigma}_{1, time}\right)\\ {}\begin{array}{c}{Z}_{2,j}\sim \textrm{GMRF}\left(0,{\varSigma}_{2, time}\otimes {\varSigma}_{2, age}\otimes {\varSigma}_{2, sex}\right)\\ {}{Z}_{3,c\left[j\right]}\sim \textrm{GMRF}\left(0,{\varSigma}_{3,c}\right)\end{array}\end{array}}$$

We specified logit-transformed prevalence at the disaggregated level (pj) as a linear combination of:

  • A regional intercept (β0);

  • Covariates and associated regression parameters (β1Xj);

  • Random effects correlated across space and time, (Z1, j);

  • Random effects correlated across time, age, and sex, (Z2, j);

  • Country-specific (c) random effects correlated across age, (Z3, c[j]).

The random effects capturing correlations between space, time, age, and sex included:

  • Z1, j: a Gaussian process with mean 0 and a covariance matrix given by the Kronecker product of a spatial Matérn covariance function [42] (Σ1, space) and a temporal first-order autoregressive covariance function (Σ1, time);

  • Z2, j: a Gaussian Markov Random Field with mean 0 and a covariance matrix given by the Kronecker product of first-order autoregressive covariance functions for time (Σ2, time), age (Σ2, age), and sex (Σ2, sex);

  • Z3, c[j]: a Gaussian Markov Random Field with mean 0 and a covariance matrix given by country-specific first-order autoregressive covariance functions for age (Σ3, c).
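To make the separable covariance structure of Z2 concrete, the sketch below builds first-order autoregressive covariance matrices and combines them with a Kronecker product in plain Python. The dimensions (three years, two ages, two sexes), the correlation values, and the unit marginal variance are illustrative assumptions, not fitted parameters.

```python
def ar1_cov(n, rho, sigma2=1.0):
    """Stationary AR(1) covariance: sigma2 * rho ** |i - j|."""
    return [[sigma2 * rho ** abs(i - j) for j in range(n)] for i in range(n)]


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a_ik * b_jl for a_ik in row_a for b_jl in row_b]
            for row_a in a for row_b in b]


# Hypothetical correlations for time, age, and sex.
sigma_time = ar1_cov(3, 0.9)
sigma_age = ar1_cov(2, 0.8)
sigma_sex = ar1_cov(2, 0.5)

# Rows and columns are ordered with time varying slowest and sex fastest.
sigma_z2 = kron(kron(sigma_time, sigma_age), sigma_sex)
```

With this ordering, the covariance between two cells is simply the product of their time, age, and sex correlations; for example, the first and last cells differ by two years, one age group, and sex, giving 0.9² × 0.8 × 0.5 = 0.324.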

We used the stochastic partial differential equation [43] approach to approximate the continuous spatiotemporal Gaussian random field (Z1, j). Sensitivity analyses were carried out to compare this model configuration to others with differing pj specification configurations, as well as to several other model and data specifications, and are described in detail in Additional file 1: Section 4.3, Additional file 3: Figs. S13-S15, and the “Discussion” section. We then specified observation-level (i) prevalence:

$${p}_i={\textrm{logit}}^{-1}\left(\textrm{logit}\left(\sum \left({p}_{transformed,j}\cdot {w}_j\right)\right)+\left({\beta}_2+{U}_{s\left[i\right]}\right)\cdot {I}_{ANC}+{\epsilon}_i\right)$$

pi was calculated as the sum of disaggregated prevalence (ptransformed, j) estimates multiplied by their respective population (or in the case of ANC data, birth) weights (wj), plus the incorporation of additional ANC-related transformations and bias corrections (β2, Us[i], and IANC described below), and an observation-level uncorrelated error term (ϵi):

$${\epsilon}_i\sim \textrm{Normal}\left(0,{\sigma}_i^2\right)$$

In cases where data were already disaggregated spatially and by age, wj = 1.
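The observation-level equation can be read as the following Python sketch: pool the disaggregated prevalences with their weights, move to the logit scale, add the ANC terms only when the observation comes from ANC surveillance, add the error term, and transform back. The function and argument names are hypothetical, and the input values are invented for illustration.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def observation_prevalence(p_transformed, weights, is_anc=False,
                           beta2=0.0, site_effect=0.0, epsilon=0.0):
    """p_i = inv_logit(logit(sum_j p_j * w_j) + (beta2 + U_s) * I_ANC + eps_i)."""
    pooled = sum(p * w for p, w in zip(p_transformed, weights))
    anc_shift = (beta2 + site_effect) if is_anc else 0.0
    return inv_logit(logit(pooled) + anc_shift + epsilon)


# Survey observation covering two equally weighted cells.
p_survey = observation_prevalence([0.1, 0.3], [0.5, 0.5])

# Already disaggregated ANC observation (weight of one) with a positive bias term.
p_anc = observation_prevalence([0.2], [1.0], is_anc=True, beta2=0.5)
```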

HIV prevalence as measured by sentinel surveillance of ANC clinic attendees is known to be biased as a measure of HIV prevalence in the general adult female population [44], because it only covers pregnant females who attend ANC, compared to all adult females [45, 46]. Additionally, fertility rates differ between HIV+ and HIV- females, with the exact relationship varying by age [47], thereby impacting age-specific ANC clinic visitation rates. To address this, for ANC data we transformed prevalence among pregnant females based on the underlying prevalence among all females and the age-specific fertility-rate ratio (HIV+ fertility/HIV- fertility). For ANC data,

$${p}_{transformed,j}=\frac{\left({p}_j\cdot {FRR}_j\right)}{\left({p}_j\cdot {FRR}_j\right)+1-{p}_j}$$
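As a small numerical illustration of this transformation, the Python function below converts prevalence among all females into the implied prevalence among pregnant females for a given fertility-rate ratio. The function name and the example values are assumptions made for illustration.

```python
def anc_transform(p, frr):
    """Prevalence among pregnant females implied by prevalence p among all
    females and the HIV+/HIV- fertility-rate ratio frr."""
    return (p * frr) / (p * frr + 1.0 - p)


# With equal fertility (frr = 1) nothing changes; with lower fertility among
# HIV+ females (frr < 1) prevalence among pregnant females is lower.
same = anc_transform(0.2, 1.0)
lower = anc_transform(0.2, 0.5)
```

The expression is a weighted share: HIV+ females contribute births in proportion to p × FRR and HIV- females in proportion to 1 − p, so a ratio below one pulls ANC prevalence below the prevalence in the general female population.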

Fertility rate ratios (FRRj) were derived from GBD 2019 fertility estimates [36], taken at the national level except in cases where subnational estimates were available (in Ethiopia, Nigeria, and South Africa). For survey data,

$${p}_{transformed,j}={p}_j$$

To allow for additional ANC-related bias at the observation level (i), in instances where data in our model were derived from ANC sentinel surveillance (where IANC = 1 for ANC data, and IANC = 0 for all other data) our model incorporated a fixed term (β2) that captured overall mean bias in the ANC data, and a random effect (Us[i]) for a given ANC site s that captured spatial differences in the extent of this bias:

$${U}_{s\left[i\right]}\sim \textrm{Normal}\left(0,{\sigma}_{site\left[i\right]}^2\right)$$

Fitted model parameters are detailed in Additional file 2: Table S6. From each fitted model, we generated 1000 draws from the approximated joint posterior distribution of all model parameters and used these to construct 1000 draws of pj, setting IANC to 0. Fivefold cross-validation was used to assess model performance and to compare a number of alternative models (Additional file 3: Figs. S13-S15). We also compared the re-aggregated adult-level estimates from our final model to those from the results of an age- and sex-aggregated counterpart (Additional file 3: Fig. S16).


To take advantage of the more structured modeling approach and additional national-level data used by GBD 2019 [2], we performed post hoc calibration of our estimates to the corresponding national-level GBD estimates. For each country, year, age bin, and sex in our analysis, we defined a “raking factor” equal to the ratio of the GBD estimate for this country-year-age-sex to the population-weighted posterior mean HIV prevalence in all corresponding grid cells (Additional file 3: Figs. S17-S18). These raking factors were then used to scale each draw of HIV prevalence for each grid cell within that GBD geography, year, age, and sex. Point estimates for each grid cell were calculated as the mean of the scaled draws, and 95% uncertainty intervals were calculated as the 2.5th and 97.5th percentiles of the scaled draws. Grid cells that crossed international borders within modeling regions were fractionally allocated to multiple countries in proportion to the covered area during this process. In cases where subnational (i.e., first administrative level) estimates were available from the GBD, that is, for Ethiopia, Nigeria and South Africa, we calibrated to those estimates rather than those at the national level. Uncertainty in GBD estimates was not accounted for in this calibration.
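The raking step can be sketched as follows, assuming the simple multiplicative scaling described in the paragraph above. The data layout (`draws[cell][draw]`), the function names, and all numbers are hypothetical.

```python
def raking_factor(gbd_prevalence, cell_means, cell_pops):
    """Ratio of the GBD estimate to the population-weighted modeled mean."""
    modeled = sum(m * n for m, n in zip(cell_means, cell_pops)) / sum(cell_pops)
    return gbd_prevalence / modeled


def rake_draws(draws, factor):
    """Scale every draw in every grid cell by the same raking factor."""
    return [[d * factor for d in cell] for cell in draws]


# Two grid cells, two posterior draws each, for one country-year-age-sex.
draws = [[0.08, 0.12], [0.18, 0.22]]
pops = [100, 300]
means = [sum(cell) / len(cell) for cell in draws]  # posterior mean per cell

factor = raking_factor(0.21, means, pops)
raked = rake_draws(draws, factor)
raked_means = [sum(cell) / len(cell) for cell in raked]
raked_total = sum(m * n for m, n in zip(raked_means, pops)) / sum(pops)
```

After scaling, the population-weighted mean of the raked cell estimates matches the target estimate (0.21 in this invented example), while the relative spatial pattern across cells is preserved.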

In addition to estimates of HIV prevalence on a 5 × 5-km grid, we constructed estimates of HIV prevalence for first- and second-level administrative subdivisions. We did this by calculating age- and sex-specific population-weighted averages of prevalence for all grid cells within a given area. This process was carried out for each of the 1000 posterior draws (after calibration to GBD), with final point estimates derived from the mean of these draws and uncertainty intervals from the 2.5th and 97.5th percentiles. Additionally, estimates of the number of people living with HIV for a given age and sex in each grid cell were derived by multiplying estimated prevalence in each grid cell by the corresponding population estimate from WorldPop [32], which was also calibrated to match GBD 2019 [36] (Additional file 1: Section 4.4). Complete estimates of people living with HIV are available along with all prevalence estimates.
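A minimal Python sketch of this aggregation is given below, under the assumption that draws are stored per grid cell and that percentiles are computed by linear interpolation (the source does not state the percentile convention). Function names and numbers are hypothetical, and only three draws are used instead of 1000.

```python
def aggregate_draws(draws, pops):
    """Population-weighted mean prevalence across grid cells, draw by draw."""
    total = sum(pops)
    n_draws = len(draws[0])
    return [sum(cell[k] * n for cell, n in zip(draws, pops)) / total
            for k in range(n_draws)]


def percentile(values, q):
    """Percentile with linear interpolation; q is between 0 and 100."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def summarize(admin_draws):
    """Point estimate (mean) and 95% uncertainty interval from draws."""
    mean = sum(admin_draws) / len(admin_draws)
    return mean, percentile(admin_draws, 2.5), percentile(admin_draws, 97.5)


# Two grid cells in one administrative unit, three draws each.
draws = [[0.1, 0.2, 0.3], [0.3, 0.4, 0.5]]
pops = [100, 100]

admin_draws = aggregate_draws(draws, pops)
mean, lower, upper = summarize(admin_draws)
plhiv = mean * sum(pops)  # people living with HIV = prevalence x population
```

Aggregating within each draw before summarizing, rather than averaging cell-level intervals, is what lets the administrative-level interval reflect the joint uncertainty across cells.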

Although the model makes predictions for all locations covered by available covariates, all final model outputs for which land cover was classified as barren or sparsely vegetated according to European Space Agency Climate Change Initiative satellite data [48] and for which total population density was less than 10 individuals per 1 × 1-km in 2015 were masked for improved clarity when communicating with data specialists and policymakers. Maps were generated in R using the ggplot2 [49] package version 3.3.0.


Geographic variation

We found large differences in the spatial and demographic distribution of estimated HIV prevalence in SSA that were masked in demographically aggregated estimates (Figs. 3 and 4; Additional file 3: Figs. S19-S34). This was particularly striking among middle and older age groups. For example, in the year 2018, the maximum estimated HIV prevalence in any second-level administrative unit for adults ages 15–59 years was 35.4% (95% uncertainty interval [UI], 22.3–46.3%) in Umgungundlovu in the KwaZulu-Natal province, South Africa. However, estimated prevalence reached up to 59.4% [46.5–71.2%], almost 1.7 times higher, for females ages 35–39 years within that same location. Across all second-level administrative units, age groups, and sexes, females ages 35–39 in Nkilongo in Lubombo, Eswatini, had the highest estimated HIV prevalence in the year 2018, at 62.5% [50.1–74.5%].

Fig. 3

HIV prevalence in sub-Saharan Africa in 2018 at the second administrative level for a subset of modeled demographic groups from the lower, middle, and upper age ranges: a all adults, ages 15–59 years; b males and c females ages 15–19 years; d males and e females ages 35–39 years; and f males and g females ages 55–59 years. Maps reflect national boundaries, land cover, lakes, and population; areas with fewer than ten people per 1 × 1 km, and classified as barren or sparsely vegetated, are colored light gray. Countries colored in dark gray were not included in the analysis

Fig. 4

Relative uncertainty in HIV prevalence, 2018. Overlapping population-weighted quartiles of HIV prevalence (constructed separately for each demographic group) and relative 95% uncertainty in 2018 at the 5 × 5-km grid cell level for select demographic groups: a all adults, ages 15–59 years; b males and c females ages 15–19 years; d males and e females ages 35–39 years; and f males and g females ages 55–59 years. Relative uncertainty is defined as the ratio of the width of the 95% uncertainty interval to the mean estimate. Maps reflect national boundaries, land cover, lakes, and population; areas with fewer than ten people per 1 × 1 km, and classified as barren or sparsely vegetated, are colored light gray. Countries colored in dark gray were not included in the analysis

Geographic variation within countries was also more dramatic in our demographically disaggregated results. Across SSA countries, the median absolute difference between second-level administrative units with the lowest and highest estimated prevalence within a given country in 2018 was 3.5 times greater when considered across ages and sexes, than when estimated for all adults combined (11.2 percentage points versus 3.2 percentage points). This difference in within-country prevalence range between demographically aggregated versus disaggregated estimates varied greatly between countries. For example, in Mozambique, this range across second-level administrative units was 30.1 percentage points [16.7–46.3] for combined adults and 56.9 percentage points [37.4–78.2] (or 1.9 times larger) for estimates across ages and sexes. In Lesotho, on the other hand, this range was 8.2 times larger for estimates across ages and sexes compared to adults combined (51.6 percentage points [40.1–63.5] versus 6.3 percentage points [1.4–11.5]). Overall, countries in Eastern SSA tended to see greater such discrepancies compared to other regions; here, the median absolute difference between second-level administrative units was 4.4 times greater when considered across ages and sexes than for all adults combined (14.0 versus 3.2 percentage points). For complete geographic variation comparisons within each country, including uncertainty estimates, see Additional file 4.
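The "times greater" figures quoted above follow directly from the reported ranges in percentage points. The short check below reproduces them in Python; the dictionary labels are ours, and the inputs are the point estimates from the paragraph (uncertainty intervals are ignored).

```python
# (range across ages and sexes, range for all adults combined), in percentage points
ranges = {
    "All SSA (median)": (11.2, 3.2),
    "Mozambique": (56.9, 30.1),
    "Lesotho": (51.6, 6.3),
    "Eastern SSA (median)": (14.0, 3.2),
}

ratios = {name: round(disaggregated / combined, 1)
          for name, (disaggregated, combined) in ranges.items()}
```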

Variation between males and females

Across SSA and across the years 2000–2018, estimated HIV prevalence was generally higher among females than males (Fig. 5). In 2018, for prevalence aggregated across ages 15–59 years, estimated prevalence was not higher among males than among females in any second-level administrative unit. The absolute difference in estimated prevalence in 2018 between females and males reached a maximum of 15.0 percentage points (in Umkhanyakude, in KwaZulu-Natal, South Africa, with 36.3% [24.7–46.8%] estimated prevalence in females compared to 21.3% [13.1–28.7%] estimated prevalence in males), for a female to male prevalence ratio of 1.7 [1.5–1.9]. Countries in Central SSA, where overall prevalence was lower than in other SSA regions, tended to see the largest disparity between females and males in terms of relative differences. Estimated prevalence among females in Central SSA ranged up to a maximum of 2.7 [1.84–4.2] times greater than estimated prevalence in males in 2018 (in San Antonio de Palé, in Annobón, Equatorial Guinea, with 8.3% [2.1–21.4%] prevalence in females compared to 3.1% [0.8–8.1%] prevalence in males). Across Central SSA second-level administrative units, the median ratio between female and male estimated prevalence was 2.2, compared to the all-SSA median ratio of 1.6. The greatest absolute differences were seen in Eastern SSA, where the median absolute difference between female and male estimated prevalence was 1.9 percentage points in 2018, compared to the all-SSA median absolute difference of 0.9 percentage points. These differences between female and male prevalence in 2018 were less than those observed in the year 2000, when the median ratio between female and male estimated prevalence was 1.5, and the median absolute difference was 1.5 percentage points. We did not note substantial differences in within-country variations in prevalence between females and males in either 2000 or 2018 in any region. For complete comparisons between sexes by second-level administrative unit, including uncertainty estimates, see Additional file 4.
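The two disparity measures used above, the female-to-male ratio and the absolute difference, can be checked against the reported point estimates with a short Python sketch. The function name is hypothetical. Note that this uses only the mean estimates quoted in the text; the reported ratios and their uncertainty intervals were presumably derived from the full set of model draws, so agreement at one decimal place is a consistency check and not a reproduction of the method.

```python
def sex_disparity(female_prev, male_prev):
    """Return (female-to-male ratio, absolute difference in percentage points)."""
    if male_prev <= 0:
        raise ValueError("male prevalence must be positive to form a ratio")
    return female_prev / male_prev, female_prev - male_prev


# Point estimates (%) quoted in the text for 2018.
reported = {
    "Umkhanyakude": (36.3, 21.3),
    "San Antonio de Palé": (8.3, 3.1),
}

for unit, (female, male) in reported.items():
    ratio, difference = sex_disparity(female, male)
    print(f"{unit}: ratio {ratio:.1f}, difference {difference:.1f} percentage points")
```

The example also shows why the two measures rank locations differently: the unit with the larger absolute gap has the smaller ratio, because its overall prevalence is much higher.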

Fig. 5

Differences in estimated prevalence between males and females ages 15–59 years at the second administrative level in 2018, calculated as a the ratio of estimated prevalence among females to prevalence among males and b the absolute difference in estimated prevalence between females and males. Maps reflect national boundaries, land cover, lakes, and population; areas with fewer than ten people per 1 × 1 km, and classified as barren or sparsely vegetated, are colored light gray. Countries colored in dark gray were not included in the analysis

Variation between age groups

Prevalence within second-level administrative units was also highly variable across age groups (Fig. 6), and relative variation in prevalence between age groups in 2018 tended to be higher in males. Comparing estimated prevalence across age groups within a given second-level administrative unit in 2018, the ratio between highest and lowest prevalence among age groups tended to be larger among males compared to females (median ratio across all SSA second-level administrative units of 14.4 for males, and 9.3 for females). For males, this ratio between highest and lowest estimated prevalence among age groups was smaller in Central SSA compared to other regions (median ratio of 8.3) and was largest in Western SSA (median ratio of 21.7). There was little regional difference for females. The sexes also differed in changes in this ratio between years, where it decreased over time for males (with a median ratio in 2000 of 52.7) but increased over time for females (median ratio in 2000 of 5.6). For complete age variation comparisons by second-level administrative unit, including uncertainty estimates, see Additional file 4.
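A minimal Python sketch of the age-group comparison described above: for each unit, take the ratio of the highest to the lowest prevalence across age groups, then summarize across units with the median. All unit names and prevalence values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import statistics


def age_ratio(prevalence_by_age):
    """Ratio of highest to lowest prevalence across age groups in one unit."""
    values = prevalence_by_age.values()
    # A zero minimum would make the ratio undefined; the values here are positive.
    return max(values) / min(values)


def peak_age_group(prevalence_by_age):
    """Age group with the highest prevalence in one unit."""
    return max(prevalence_by_age, key=prevalence_by_age.get)


# Hypothetical male prevalence (%) by age group in three units.
units = {
    "unit_a": {"15-19": 0.5, "35-39": 6.0, "45-49": 7.0, "55-59": 3.5},
    "unit_b": {"15-19": 1.0, "35-39": 9.0, "45-49": 8.0, "55-59": 4.0},
    "unit_c": {"15-19": 0.25, "35-39": 4.0, "45-49": 5.0, "55-59": 2.0},
}

ratios = {name: age_ratio(prev) for name, prev in units.items()}
median_ratio = statistics.median(ratios.values())
peaks = {name: peak_age_group(prev) for name, prev in units.items()}

print(ratios, median_ratio, peaks)
```

Because the ratio is driven by the lowest-prevalence age group, small absolute changes among the youngest ages can move it substantially.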

Fig. 6

Differences in prevalence between age groups in the year 2018 at the second administrative level, calculated as the ratio of estimated prevalence between the age groups with highest and lowest prevalence, for a males and b females; and the age groups with highest prevalence for c males and d females in 2018. Maps reflect national boundaries, land cover, lakes, and population; areas with fewer than ten people per 1 × 1 km, and classified as barren or sparsely vegetated, are colored light gray. Countries colored in dark gray were not included in the analysis

Across SSA, the age group with the highest estimated prevalence in any given second-level administrative unit in 2018 was always between ages 35 and 54 years for males and between 30 and 49 years for females (Fig. 6). In 2018, males ages 45–49 years most commonly had the highest estimated prevalence across all age groups in a given second-level administrative unit, at 46.8% of second-level administrative units (1894 of 4043) from within 23 of 43 countries. Females ages 40–44 years had the highest estimated prevalence across age groups in 63.8% of second-level administrative units (2581 of 4043) in 31 of 43 countries. For both males and females, the age group with the highest estimated prevalence tended to vary more across Eastern SSA compared to other regions.

Within-country variation between second-level administrative units was relatively consistent across age groups. The ratio of maximum to minimum estimated prevalence among districts within each country was lowest for ages 35–39 years (median ratio of 4.3 across countries) and highest for ages 15–19 years (median ratio of 4.8 across countries) in 2018. Slightly larger differences were seen between age groups in Eastern and Southern SSA, with lower variation in middle-age groups and greater within-country variation in younger age groups. The maximum-to-minimum within-country prevalence ratio in Eastern SSA was lowest for adults ages 40–44 years (median ratio of 5.4 across Eastern SSA countries) and highest for adults ages 15–19 years (median ratio of 6.7 across Eastern SSA countries). These same age groups also represented the highest and lowest ratios in Southern SSA countries, with median values of 2.0 in adults ages 40–44 years and 2.8 in adults ages 15–19 years.

Variation over time

Estimated change in prevalence over time among all adults masked broad differences between specific age and sex groups (Fig. 7; Additional file 3: Figs. S35-S40). Large temporal changes were much more common when considering sexes and age groups, compared to all adults combined. Between the years 2000 and 2018, among all adults ages 15–59 years, estimated HIV prevalence increased by more than 5.0 percentage points in only 3.7% (151 out of 4043) of second-level administrative units across SSA and decreased by more than 5.0 percentage points in 7.9% (321 of 4043) of second-level administrative units. On the other hand, 37.7% (1523 of 4043) of second-level administrative units experienced an increase in estimated HIV prevalence greater than 5.0 percentage points in that timeframe in at least one sex and age group, and 70.9% (2867 of 4043) of second-level administrative units saw a decrease greater than 5.0 percentage points in at least one sex and age group.
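The "at least one sex and age group" counts above can be sketched in Python as follows. The function name, unit names, and change values are hypothetical; only the 5.0 percentage point threshold comes from the text.

```python
def share_with_large_change(changes, threshold=5.0):
    """Share of units with any increase, and any decrease, beyond the threshold.

    `changes` maps each unit to the change in prevalence (percentage points,
    2018 minus 2000) for every sex and age group in that unit.
    """
    n_units = len(changes)
    increased = sum(
        any(delta > threshold for delta in groups.values())
        for groups in changes.values()
    )
    decreased = sum(
        any(delta < -threshold for delta in groups.values())
        for groups in changes.values()
    )
    return increased / n_units, decreased / n_units


# Hypothetical changes for four units and three sex and age groups.
changes = {
    "unit_a": {"F 25-29": -7.2, "F 50-54": 6.1, "M 35-39": -1.0},
    "unit_b": {"F 25-29": -5.5, "F 50-54": 2.0, "M 35-39": 0.4},
    "unit_c": {"F 25-29": -0.8, "F 50-54": 1.1, "M 35-39": 0.2},
    "unit_d": {"F 25-29": -6.3, "F 50-54": 5.4, "M 35-39": -2.2},
}

up_share, down_share = share_with_large_change(changes)
print(f"any increase: {up_share:.0%}, any decrease: {down_share:.0%}")
```

A single unit can count toward both shares, as unit_a and unit_d do here, which is why the two percentages reported in the text need not sum to 100% or less.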

Fig. 7

Change in HIV prevalence at the second administrative level between 2000 and 2018 for a subset of modeled demographic groups from the lower, middle, and upper age ranges: a all adults, ages 15–59 years; b males and c females ages 15–19 years; d males and e females ages 35–39 years; and f males and g females ages 55–59 years. Maps reflect national boundaries, land cover, lakes, and population; areas with fewer than ten people per 1 × 1 km, and classified as barren or sparsely vegetated, are colored light gray. Countries colored in dark gray were not included in the analysis

The distribution of districts with large increases or decreases in prevalence over time also varied greatly by region. All regions saw a decrease of greater than 5.0 percentage points in estimated prevalence for at least one sex and age group in a majority of second-level administrative units between 2000 and 2018: 61.2% (393 out of 642) for Central SSA, 70.9% (1160 out of 1635) for Western SSA, 71.0% (1032 out of 1452) for Eastern SSA, and 90.1% (283 out of 314) for Southern SSA. However, Southern SSA also had a very high proportion of second-level administrative units seeing an increase of greater than 5.0 percentage points in that same time frame, at 92.0% (289 of 314), while only a minority of second-level administrative units saw similar increases in the other regions.

We found diverging overall trends between age groups over time, with greater decreases over time among younger age groups, and greater increases among older age groups. For example, for females ages 25–29 years, we found that estimated prevalence decreased by at least 1.0 percentage point between 2000 and 2018 in 73.3% of second-level administrative units in SSA (2965 of 4043) and increased by at least 1.0 percentage point in only 2.4% (99 of 4043) of all second-level administrative units. Conversely, among females ages 50–54 years, estimated prevalence decreased between 2000 and 2018 by at least 1.0 percentage point in just 11.8% (477 of 4043) of second-level administrative units but increased by at least 1.0 percentage point in 40.1% (1622 of 4043) of second-level administrative units. We found this trend to be similar across regions. For complete comparisons of prevalence over time for each second-level administrative unit, age, and sex, including uncertainty estimates, see Additional file 4.

Discussion

The results of this study, the first to present age- and sex-specific HIV prevalence estimates across sub-Saharan Africa at local scales, emphasize the interactions of geographic and demographic differences in HIV prevalence, going beyond previous research focused on either aspect individually. Just as previous work demonstrated how much geographic variability is masked in national prevalence estimates [10], we show here that demographically aggregated estimates mask important variation in the age and sex distributions of HIV prevalence at a local level, which in turn provide much clearer insights into the evolution of the HIV epidemic in SSA.

Many intervention methods are commonly used in the fight against the HIV epidemic, and variation in their efficacy and implementation has likely contributed to the prevalence trends presented here. Cost-efficiency is a consistent priority and is generally maximized by using targeted, integrated interventions [50]. For example, HIV prevention via behavioral and biomedical interventions based on local prevalence rates, HIV testing, and treatment initiation may be priorities for some age groups [51], while long-term ART retention and comorbidity care may require more emphasis for others [52]. Barriers to accessing care often differ between geographic and demographic groups: in some cases barriers may be logistical (e.g., geographic isolation and programmatic fragmentation [53]) and in others social (e.g., lack of information, stigmatization, homophobia [54]), and these require different intervention methods. Males and females are also often targeted using different points of contact. For example, HIV testing has been recommended for all females attending antenatal care clinics [55], whereas for males the provision of self-, home-based, and mobile testing compared to facility-based testing may be more useful for testing and subsequent uptake of care [56,57,58]. Effective targeting of these interventions requires local, demographically specific HIV burden information, such as that provided in the estimates presented here. Countries may similarly use this burden information to prioritize subnational and demographically specific treatment needs. This resource may also be useful in program evaluation efforts and thus aid the development of more successfully tailored interventions.

Variation in the social determinants driving HIV incidence and mortality, and thus HIV prevalence, is also an important consideration when assessing inequalities in HIV prevalence between locations and demographic groups. For example, while prevalence among females is consistently higher than prevalence among males, these differences can be attributed to different exposure to risk factors (such as marital status and age at first sex between males and females) in different countries [59]. In addition to understanding local patterns in HIV prevalence, effective interventions also need to consider, if not focus directly on, locally important risk factors and determinants of HIV infection and mortality [60, 61].

Our estimates point to many local shifts in HIV prevalence over time. A multitude of factors can affect HIV prevalence trends at the local level over time, from local changes in prevention interventions to shifts in the overall demographics of an area, but one particularly important factor is local scale-up of ART [62, 63]. Increases in ART coverage and reduced treatment costs have repeatedly been associated with large demographic shifts among people living with HIV [64]: because ART has been successful in reducing HIV mortality, the number of people living with HIV over the age of 50 years has greatly increased, and our results reflect this trend. Given evidence pointing to differences between younger and older ART patients in rates of CD4 cell count decline [65], immune reconstitution rates [66], and risk of associated non-communicable diseases [67, 68], among other health metrics [69], it is necessary that treatment plans for older patients be specifically tailored for their age group. Our results highlight those locations with large existing populations of people living with HIV for ages 50–59 years, and those seeing rapid growth of HIV prevalence in that demographic group. At the same time, the minimal change in estimated prevalence over time among the youngest age groups suggests that continued, and even expanded, HIV prevention efforts for adolescents and young adults need to remain a priority across the continent.

Despite the significant progress made through this analysis in describing HIV burden in SSA, prevalence estimates mask complex and varied relationships between HIV incidence and mortality, as well as migration and seasonal mobility. It is difficult to determine, for example, if a dramatic decrease in HIV prevalence in an area is due to reduced incidence, increased mortality, or differences in the immigration and emigration rates of HIV+ and HIV- individuals. Primary data for all three of these metrics are not widely available for SSA, which complicates the interpretation of our estimates. Importantly, no estimates of these indicators are consistently available at local scales for specific demographic groups. Furthermore, local data related to diagnosis, treatment, and viral suppression rates are also limited, despite these metrics lying at the heart of the UNAIDS 95-95-95 goals [4]. Although HIV prevalence estimates are very informative, difficulties can still arise when intervention decisions are built around them alone, without an understanding of their underlying drivers. Improved surveillance of HIV prevalence, incidence, and mortality, combined with reliable population and migration estimates and information on local programs, is necessary to fully understand the complexities of the region’s HIV epidemic. Clearly, even with the development of more comprehensive burden information, any modeled estimates should only be used for intervention purposes in conjunction with local program knowledge.

Methodological advantages and limitations

The methods used in this analysis build upon those previously used by Dwyer-Lindgren et al. to model adult HIV prevalence [10]. While this analysis does improve upon and have advantages over the previous methods in some ways, it faces some of the same limitations as well as some new ones. As with the previous study, and as with all modeling studies, the quality of our estimates is highly dependent on the quality and coverage of our input data. Despite constructing a large database of HIV prevalence data, coverage gaps and small sample sizes in some locations can be associated with imprecision and/or large uncertainty intervals in some of our prevalence estimates (Additional file 3: Figs. S27-S34). Additionally, the location information associated with the data compiled for this analysis is subject to some error. In order to protect respondent confidentiality, most surveys that collect GPS coordinates perform some type of random displacement on those coordinates prior to releasing data for secondary analysis: for example, GPS coordinates for Demographic and Health Surveys (DHS) are displaced by up to 2 km for urban clusters, up to 5 km for most rural clusters, and up to 10 km in a random 1% of rural clusters [34]. Past research has found that displacement can degrade the predictive power of a geostatistical model; however, this effect was found to be modest, and researchers concluded that relatively accurate mapping can be undertaken at a 5 × 5-km resolution even with GPS displacement [70].
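To make the displacement rule concrete, the following Python sketch simulates it for a single cluster. Only the distance limits (2 km urban, 5 km rural, 10 km for a random 1% of rural clusters) come from the text. The function names, the uniform draw of distance and direction, the flat-earth conversion from kilometres to degrees, and the example coordinates are all assumptions for illustration, and the sketch ignores any boundary constraints the real procedure may apply.

```python
import math
import random

KM_PER_DEGREE_LAT = 111.32  # approximate length of one degree of latitude


def max_displacement_km(is_urban, rng):
    """Upper limit on displacement, following the limits quoted in the text."""
    if is_urban:
        return 2.0
    # A random 1% of rural clusters may be moved up to 10 km.
    return 10.0 if rng.random() < 0.01 else 5.0


def displace(lat, lon, is_urban, rng):
    """Return (new_lat, new_lon, distance_km) for one simulated displacement."""
    limit = max_displacement_km(is_urban, rng)
    distance = rng.uniform(0.0, limit)        # assumed uniform distance
    angle = rng.uniform(0.0, 2.0 * math.pi)   # assumed uniform direction
    dlat = distance * math.cos(angle) / KM_PER_DEGREE_LAT
    # Longitude degrees shrink with latitude; adequate away from the poles.
    dlon = distance * math.sin(angle) / (
        KM_PER_DEGREE_LAT * math.cos(math.radians(lat))
    )
    return lat + dlat, lon + dlon, distance


rng = random.Random(2018)
print(displace(-1.0, 34.5, is_urban=False, rng=rng))
```

Under these limits most displaced points remain within roughly one 5 × 5-km grid cell of their true location, which is consistent with the modest effect on mapping accuracy described above.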

The approximate integration method we use in this analysis better handles uncertainty estimation and easily accommodates not only polygon data but age-aggregated data as well, compared to the polygon resampling method that has been used elsewhere [10, 71, 72]. At the same time, given the large number of dimensions being modeled, as well as the high data input count produced by our data disaggregation technique, we found that current matrix packages, as well as our computational facilities, could not accommodate a Gaussian process that accounted for the covariance of a complete space-time-age-sex Kronecker product. We therefore focused on the interactions between space, time, age, and sex that we believed would be most relevant in terms of capturing important variability in these dimensions, within our computational abilities. Our modeling strategy also assumed no difference in the probability that an HIV+ versus an HIV- pregnant woman would access antenatal care and therefore be included in ANC surveillance.
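A rough sense of why a complete space-time-age-sex Kronecker product becomes unwieldy can be had from a back-of-the-envelope count in Python. The numbers of years, age groups, and sexes follow from the text; the number of spatial nodes is a purely hypothetical placeholder, and dense storage is used only as an illustrative upper bound, since practical implementations typically rely on sparse matrices.

```python
n_years = 2018 - 2000 + 1       # 19 annual time points, 2000-2018
n_age_groups = (60 - 15) // 5   # 9 five-year age groups covering ages 15-59
n_sexes = 2
n_spatial_nodes = 10_000        # hypothetical size of the spatial basis

# A Kronecker product of square matrices has dimension equal to the
# product of the individual dimensions.
n_latent = n_spatial_nodes * n_years * n_age_groups * n_sexes

# Dense storage of an n_latent x n_latent matrix of 8-byte floats.
dense_bytes = n_latent ** 2 * 8

print(f"{n_latent:,} latent values; dense covariance of about {dense_bytes / 1e12:.1f} TB")
```

Each added dimension multiplies the size of the latent field, which is one way to read the decision above to model only selected interactions between space, time, age, and sex.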

Due to limited data availability, we delineated estimates in this analysis using a male/female binary. We recognize that this approach does not allow for investigation of HIV prevalence among gender and sex diverse people, despite the disproportionate burden of HIV commonly seen among these populations [73]. Further, we recognize that many data sources do not provide the option to select a sex other than “male” or “female” or gender options beyond “man” or “woman,” and often conflate gender with sex. In the future, we hope that high-quality data on HIV prevalence for gender and sex diverse people will be more widely available, so we can produce estimates beyond females and males.

We note that our results include unprecedentedly high prevalence estimates for certain population subsets. In most cases, we do not believe these estimates are implausible. For example, we estimated prevalence among middle- and older-aged females to be up to 59.2% [45.9–73.0%] in Umgungundlovu in KwaZulu-Natal, South Africa in 2018. Previous research has estimated prevalence for female adults of all ages combined in Umgungundlovu in 2017 to be 46.6% [43.8–49.5%] [74]. As we have shown that prevalence in middle- and older-aged females tended to be higher than all-ages prevalence, we believe our estimates for middle- and older-aged females during this time period in this location to be reasonable, especially with uncertainty intervals taken into consideration. In rare cases, however, our methods yielded estimates which we were unable to support through the literature. For example, for males ages 35–39 and 40–44 years in Nyatike in Migori, Kenya, we estimated prevalence in the year 2000 to be 77.8% [50.2–100.0%] and 78.7% [50.0–100.0%], respectively. It is unlikely that true prevalence in that area and year was this high (though given the large uncertainty intervals associated with these values, it is probable that true prevalence does fall within those ranges). We note, however, that the high estimates in this area and surrounding second-level administrative units were predominantly associated with the earlier years in our time series; we believe the more recent estimates in Nyatike to be more realistic [75]. In these locations, decreases in prevalence over time may therefore also be overestimated. These instances were rare.

A combination of data limitations and model complexity ultimately led to large uncertainty intervals around our estimates. Our 95% coverage estimates in model validation were consistently higher than expected (Additional file 3: Figs. S14-S16), which indicates that these uncertainty intervals may be larger than appropriate. Wide uncertainty can limit the utility of our estimates in terms of informing HIV policies, and reducing this uncertainty through improved data coverage will be an important consideration in future iterations of this model. We were also unable to account for all sources of uncertainty, such as uncertainty in the WorldPop estimates used in many stages of our modeling and estimation processes and uncertainty in covariates.
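The coverage check referred to above can be sketched in Python as the share of held-out observations that fall inside their predicted 95% intervals. The function name and all values are hypothetical; this shows the general idea of empirical coverage and is not the authors' validation code.

```python
def empirical_coverage(observed, lower, upper):
    """Share of observations lying within their predicted interval."""
    if not (len(observed) == len(lower) == len(upper)):
        raise ValueError("inputs must have the same length")
    inside = sum(lo <= y <= hi for y, lo, hi in zip(observed, lower, upper))
    return inside / len(observed)


# Hypothetical held-out prevalence observations and predicted 95% intervals.
observed = [0.10, 0.20, 0.30, 0.40]
lower = [0.05, 0.10, 0.35, 0.30]
upper = [0.15, 0.30, 0.45, 0.50]

coverage = empirical_coverage(observed, lower, upper)
print(f"empirical coverage: {coverage:.2f} (nominal 0.95)")
```

Coverage well above the nominal 95% in a large validation set suggests intervals that are wider than needed, while coverage below it suggests intervals that are too narrow.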

Conclusions

HIV continues to impose enormous human and financial costs [3] on SSA, decades since its emergence. Financial and logistical disruptions and discontinuities due to the impacts of COVID-19, as well as changes in ART adherence, are likely to present new barriers [21, 76] to the UNAIDS 95-95-95 goals [4]. This analysis provides important insight into the nuances of HIV burden in SSA, offering information that is critical to the development of targeted interventions.

Availability of data and materials

The findings of this study are supported by data available in public online repositories and data publicly available upon request of the data provider. Details regarding the data sources used and their availability can be found in Additional file 2: Supplemental Tables 1-5 and online via the Global Health Data Exchange. Estimates can also be further explored through the Global Health Data Exchange, as well as via our online visualization tool. Administrative boundaries were modified from the Database for Global Administrative Areas (GADM) dataset [77]. Populations were retrieved from WorldPop [32]. This study complies with the Guidelines for Accurate and Transparent Health Estimates Reporting (GATHER) recommendations [31]. All maps and figures presented in this study are generated by the authors; no permissions are required for publication. All computer code is available online.



Abbreviations

AIDS: Acquired immune deficiency syndrome
ANC: Antenatal care
ART: Antiretroviral therapy
GATHER: Guidelines for Accurate and Transparent Health Estimates Reporting
GBD: Global Burden of Disease
GPS: Global positioning system
HIV: Human immunodeficiency virus
SDGs: Sustainable Development Goals
SSA: Sub-Saharan Africa
TMB: Template Model Builder
UNAIDS: Joint United Nations Programme on HIV/AIDS

References

  1. Frank TD, Carter A, Jahagirdar D, Biehl MH, Douwes-Schultz D, Larson SL, et al. Global, regional, and national incidence, prevalence, and mortality of HIV, 1980–2017, and forecasts to 2030, for 195 countries and territories: a systematic analysis for the Global Burden of Diseases, Injuries, and Risk Factors Study 2017. Lancet HIV. 2019;6:e831–59.

  2. GBD 2019 Diseases and Injuries Collaborators. Global burden of 369 diseases and injuries in 204 countries and territories, 1990–2019: a systematic analysis for the Global Burden of Disease Study 2019. Lancet. 2020;396:1204–22.

  3. Micah AE, Su Y, Bachmeier SD, Chapin A, Cogswell IE, Crosby SW, et al. Health sector spending and spending on HIV/AIDS, tuberculosis, and malaria, and development assistance for health: progress towards Sustainable Development Goal 3. Lancet. 2020;396:693–724.

  4. UNAIDS. Joint United Nations Programme on HIV/AIDS. 2020.

  5. UNAIDS. Understanding Fast Track: accelerating action to end the AIDS epidemic by 2030. 2015.

  6. United Nations. Transforming our world: the 2030 agenda for sustainable development. New York: United Nations; 2015.

  7. Bekker L-G, Alleyne G, Baral S, Cepeda J, Daskalakis D, Dowdy D, et al. Advancing global health and strengthening the HIV response in the era of the Sustainable Development Goals: the International AIDS Society–Lancet Commission. Lancet. 2018;392:312–58.

  8. Jones J, Sullivan PS, Curran JW. Progress in the HIV epidemic: identifying goals and measuring success. PLoS Med. 2019;16:e1002729.

  9. Hogan AB, Jewell BL, Sherrard-Smith E, Vesga JF, Watson OJ, Whittaker C, et al. Potential impact of the COVID-19 pandemic on HIV, tuberculosis, and malaria in low-income and middle-income countries: a modelling study. Lancet Glob Health. 2020;8:e1132–41.

  10. Dwyer-Lindgren L, Cork MA, Sligar A, Steuben KM, Wilson KF, Provost NR, et al. Mapping HIV prevalence in sub-Saharan Africa between 2000 and 2017. Nature. 2019;570:189–93.

  11. Larmarange J, Bendaud V. HIV estimates at second subnational level from national population-based surveys. AIDS. 2014;28(Suppl 4):S469–76.

  12. Meyer-Rath G, McGillen JB, Cuadros DF, Hallett TB, Bhatt S, Wabiri N, et al. Targeting the right interventions to the right people and places: the role of geospatial analysis in HIV program planning. AIDS. 2018;32:957–63.

  13. Cuadros DF, Li J, Branscum AJ, Akullian A, Jia P, Mziray EN, et al. Mapping the spatial variability of HIV infection in sub-Saharan Africa: effective information for localized HIV prevention and control. Sci Rep. 2017;7:9093.

  14. Coburn BJ, Okano JT, Blower S. Using geospatial mapping to design HIV elimination strategies for sub-Saharan Africa. Sci Transl Med. 2017;9:eaag0019.

  15. Akullian A, Vandormael A, Miller JC, Bershteyn A, Wenger E, Cuadros D, et al. Large age shifts in HIV-1 incidence patterns in KwaZulu-Natal, South Africa. Proc Natl Acad Sci U S A. 2021;118:e2013164118.

  16. Khalifa A, Stover J, Mahy M, Idele P, Porth T, Lwamba C. Demographic change and HIV epidemic projections to 2050 for adolescents and young people aged 15-24. Glob Health Action. 2019;12:1662685.

  17. Faust L, Yaya S. The effect of HIV educational interventions on HIV-related knowledge, condom use, and HIV incidence in sub-Saharan Africa: a systematic review and meta-analysis. BMC Public Health. 2018;18:1254.

  18. Anderson S-J, Cherutich P, Kilonzo N, Cremin I, Fecht D, Kimanga D, et al. Maximising the effect of combination HIV prevention through prioritisation of the people and places in greatest need: a modelling study. Lancet. 2014;384:249–56.

  19. Schneider MT, Birger M, Haakenstad A, Singh L, Hamavid H, Chapin A, et al. Tracking development assistance for HIV/AIDS: the international response to a global epidemic. AIDS. 2016;30:1475–9.

  20. Olakunde BO, Adeyinka DA, Ozigbu CE, Ogundipe T, Menson WNA, Olawepo JO, et al. Revisiting aid dependency for HIV programs in sub-Saharan Africa. Public Health. 2019;170:57–60.

  21. Jewell BL, Mudimu E, Stover J, ten Brink D, Phillips AN, Smith JA, et al. Potential effects of disruption to HIV programmes in sub-Saharan Africa caused by COVID-19: results from multiple mathematical models. Lancet HIV. 2020;7:e629–40.

  22. Nagelkerke NJD, Jha P, de Vlas SJ, Korenromp EL, Moses S, Blanchard JF, et al. Modelling HIV/AIDS epidemics in Botswana and India: impact of interventions to prevent transmission. Bull World Health Organ. 2002;80:89–96.

  23. Long EF, Stavert RR. Portfolios of biomedical HIV interventions in South Africa: a cost-effectiveness analysis. J Gen Intern Med. 2013;28:1294–301.

  24. Bershteyn A, Klein DJ, Eckhoff PA. Age-targeted HIV treatment and primary prevention as a “ring fence” to efficiently interrupt the age patterns of transmission in generalized epidemic settings in South Africa. Int Health. 2016;8:277–85.

  25. Joint United Nations Programme on HIV/AIDS. AIDSinfo. UNAIDS; 2018. Accessed 18 June 2020.

  26. Okano JT, Blower S. Sex-specific maps of HIV epidemics in sub-Saharan Africa. Lancet Infect Dis. 2016;16:1320–2.

  27. Messina JP, Emch M, Muwonga J, Mwandagalirwa K, Edidi SB, Mama N, et al. Spatial and socio-behavioral patterns of HIV prevalence in the Democratic Republic of Congo. Soc Sci Med. 2010;71:1428–35.

  28. Palk L, Blower S. Geographic variation in sexual behavior can explain geospatial heterogeneity in the severity of the HIV epidemic in Malawi. BMC Med. 2018;16:22.

  29. Tanser F, Bärnighausen T, Cooke GS, Newell M-L. Localized spatial clustering of HIV infections in a widely disseminated rural South African epidemic. Int J Epidemiol. 2009;38:1008–16.

  30. Bulstra CA, Hontelez JAC, Giardina F, Steen R, Nagelkerke NJD, Bärnighausen T, et al. Mapping and characterising areas with high levels of HIV transmission in sub-Saharan Africa: a geospatial analysis of national survey data. PLoS Med. 2020;17:e1003042.

  31. Stevens GA, Alkema L, Black RE, Boerma JT, Collins GS, Ezzati M, et al. Guidelines for accurate and transparent health estimates reporting: the GATHER statement. Lancet. 2016;388:e19–23.

  32. Tatem AJ. WorldPop, open data for spatial demography. Sci Data. 2017;4:170004.

  33. R Core Team. R: the R project for statistical computing. 2019. Accessed 8 July 2020.

  34. Burgert C, Colston J, Roy T, Zachary B. Geographic displacement procedure and georeferenced data release policy for the Demographic and Health Surveys; 2013.

  35. UNAIDS. National HIV estimates file. 2019.

  36. GBD 2019 Demographics Collaborators. Global age-sex-specific fertility, mortality, healthy life expectancy (HALE), and population estimates in 204 countries and territories, 1950–2019: a comprehensive demographic analysis for the Global Burden of Disease Study 2019. Lancet. 2020;396:1160–203.

  37. Bhatt S, Cameron E, Flaxman SR, Weiss DJ, Smith DL, Gething PW. Improved prediction accuracy for disease risk mapping using Gaussian process stacked generalization. J Royal Soc Interface. 2017;14:20170520.

  38. Hastie T, Tibshirani RJ. Generalized additive models. London: Chapman & Hall; 1990.

  39. Friedman JH. Greedy function approximation: a gradient boosting machine. Ann Stat. 2001;29:1189–232.

  40. Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat Soc B Methodol. 1996;58:267–88.

  41. Kristensen K, Nielsen A, Berg CW, Skaug H, Bell BM. TMB: automatic differentiation and Laplace approximation. J Stat Softw. 2016;70:1–21.

  42. Stein ML. Interpolation of spatial data: some theory for Kriging. New York: Springer-Verlag; 1999.

  43. Lindgren F, Rue H, Lindström J. An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. J R Stat Soc Series B Stat Methodol. 2011;73:423–98.

  44. Zaba BW, Carpenter LM, Boerma JT, Gregson S, Nakiyingi J, Urassa M. Adjusting ante-natal clinic data for improved estimates of HIV prevalence among women in sub-Saharan Africa. AIDS. 2000;14:2741–50.

  45. Gouws E, Mishra V, Fowler TB. Comparison of adult HIV prevalence from national population-based surveys and antenatal clinic surveillance in countries with generalised epidemics: implications for calibrating surveillance data. Sex Transm Infect. 2008;84(Suppl 1):i17–23.

  46. Marsh K, Mahy M, Salomon JA, Hogan DR. Assessing and adjusting for differences between HIV prevalence estimates derived from national population-based surveys and antenatal care surveillance, with applications for Spectrum 2013. AIDS. 2014;28:S497–505.

  47. Marston M, Zaba B, Eaton JW. The relationship between HIV and fertility in the era of antiretroviral therapy in sub-Saharan Africa: evidence from 49 Demographic and Health Surveys. Trop Med Int Health. 2017;22:1542–50.

  48. ESA-CCI Project. Land cover classification gridded maps from 1992 to present derived from satellite observations. 2020. Accessed 30 Apr 2020.

  49. Wickham H. ggplot2: elegant graphics for data analysis. New York: Springer-Verlag; 2016.

  50. McGillen JB, Anderson S-J, Dybul MR, Hallett TB. Optimum resource allocation to reduce HIV incidence across sub-Saharan Africa: a mathematical modelling study. Lancet HIV. 2016;3:e441–8.

  51. Hosek S, Pettifor A. HIV prevention interventions for adolescents. Curr HIV/AIDS Rep. 2019;16:120–8.

  52. Schatz E, Seeley J, Negin J, Weiss HA, Tumwekwase G, Kabunga E, et al. “For us here, we remind ourselves”: strategies and barriers to ART access and adherence among older Ugandans. BMC Public Health. 2019;19:131.

  53. De Neve J-W, Garrison-Desany H, Andrews KG, Sharara N, Boudreaux C, Gill R, et al. Harmonization of community health worker programs for HIV: A four-country qualitative study in Southern Africa. PLoS Med. 2017;14:e1002374.

  54. Mbonu NC, van den Borne B, De Vries NK. Stigma of people with HIV/AIDS in Sub-Saharan Africa: a literature review. J Trop Med. 2009;2009:145891.

  55. World Health Organization. Consolidated guidelines on the use of antiretroviral drugs for treating and preventing HIV infection: recommendations for a public health approach. 2013. Accessed 20 Mar 2021.

  56. Sabapathy K, den Bergh RV, Fidler S, Hayes R, Ford N. Uptake of home-based voluntary HIV testing in sub-Saharan Africa: a systematic review and meta-analysis. PLoS Med. 2012;9:e1001351.

  57. Sharma M, Ying R, Tarr G, Barnabas R. Systematic review and meta-analysis of community and facility-based HIV testing to address linkage to care gaps in sub-Saharan Africa. Nature. 2015;528:S77–85.

  58. Hatzold K, Gudukeya S, Mutseta MN, Chilongosi R, Nalubamba M, Nkhoma C, et al. HIV self-testing: breaking the barriers to uptake of testing among men and adolescents in sub-Saharan Africa, experiences from STAR demonstration projects in Malawi, Zambia and Zimbabwe. J Int AIDS Soc. 2019;22:e25244.

  59. Sia D, Onadja Y, Hajizadeh M, Heymann SJ, Brewer TF, Nandi A. What explains gender inequalities in HIV/AIDS prevalence in sub-Saharan Africa? Evidence from the demographic and health surveys. BMC Public Health. 2016;16:1136.

  60. Dean HD, Fenton KA. Addressing social determinants of health in the prevention and control of HIV/AIDS, viral hepatitis, sexually transmitted infections, and tuberculosis. Public Health Rep. 2010;125(4_suppl):1–5.

  61. Heestermans T, Browne JL, Aitken SC, Vervoort SC, Klipstein-Grobusch K. Determinants of adherence to antiretroviral therapy among HIV-positive adults in sub-Saharan Africa: a systematic review. BMJ Glob Health. 2016;1:e000125.

  62. Tanser F, Bärnighausen T, Grapsa E, Zaidi J, Newell M-L. High coverage of ART associated with decline in risk of HIV acquisition in rural KwaZulu-Natal, South Africa. Science. 2013;339:966–71.

  63. Vandormael A, Akullian A, Siedner M, de Oliveira T, Bärnighausen T, Tanser F. Declines in HIV incidence among men and women in a South African population-based cohort. Nat Commun. 2019;10:5482.

  64. Hontelez JAC, de Vlas SJ, Baltussen R, Newell M-L, Bakker R, Tanser F, et al. The impact of antiretroviral treatment on the age composition of the HIV epidemic in sub-Saharan Africa. AIDS. 2012;26(Suppl 1):S19–30.

  65. CASCADE Collaboration. Differences in CD4 cell counts at seroconversion and decline among 5739 HIV-1-infected individuals with well-estimated dates of seroconversion. J Acquir Immune Defic Syndr. 2003;34:76–83.

  66. Goetz MB, Boscardin WJ, Wiley D, Alkasspooles S. Decreased recovery of CD4 lymphocytes in older HIV-infected patients beginning highly active antiretroviral therapy. AIDS. 2001;15:1576–9.

  67. Mills EJ, Bärnighausen T, Negin J. HIV and aging--preparing for the challenges ahead. N Engl J Med. 2012;366:1270–3.

  68. Coetzee L, Bogler L, De Neve J-W, Bärnighausen T, Geldsetzer P, Vollmer S. HIV, antiretroviral therapy and non-communicable diseases in sub-Saharan Africa: empirical evidence from 44 countries over the period 2000 to 2016. J Int AIDS Soc. 2019;22:e25364.

  69. Parikh SM, Obuku EA, Walker SA, Semeere AS, Auerbach BJ, Hakim JG, et al. Clinical differences between younger and older adults with HIV/AIDS starting antiretroviral therapy in Uganda and Zimbabwe: a secondary analysis of the DART trial. PLoS One. 2013;8:e76158.

  70. Gething P, Tatem A, Bird T, Burgert-Brucker CR. Creating spatial interpolation surfaces with DHS data. Rockville: ICF International; 2015.

  71. Golding N, Burstein R, Longbottom J, Browne AJ, Fullman N, Osgood-Zimmerman A, et al. Mapping under-5 and neonatal mortality in Africa, 2000–15: a baseline analysis for the Sustainable Development Goals. Lancet. 2017;390:2171–82.

  72. Graetz N, Friedman J, Osgood-Zimmerman A, Burstein R, Biehl MH, Shields C, et al. Mapping local variation in educational attainment across Africa. Nature. 2018;555:48–53.

  73. Blondeel K, Say L, Chou D, Toskin I, Khosla R, Scolaro E, et al. Evidence and knowledge gaps on the disease burden in sexual and gender minorities: a review of systematic reviews. Int J Equity Health. 2016;15:16.

  74. Woldesenbet S, Kufa T, Lombard C, Manda S, Ayalew K, Cheyip M, et al. The 2017 national antenatal sentinel HIV survey key findings, South Africa. 2019.

  75. Ministry of Health, National AIDS Control Council. Kenya AIDS response progress report 2016. 2016.

  76. Jiang H, Zhou Y, Tang W. Maintaining HIV care during the COVID-19 pandemic. Lancet HIV. 2020;7:e308–9.

  77. Global Administrative Areas. GADM maps and data. v.3.6. 2019.


LBD sub-Saharan Africa HIV Prevalence Collaborators

S Afzal acknowledges support of the Pakistan Society of Medical Infectious Diseases and King Edward Medical University to access the relevant data of HIV from various sources. T W Bärnighausen was supported by the Alexander von Humboldt Foundation through the Alexander von Humboldt Professor award, funded by the German Federal Ministry of Education and Research. F Carvalho and E Fernandes acknowledge support from Fundação para a Ciência e a Tecnologia (FCT), I.P., in the scope of the project UIDP/04378/2020 and UIDB/04378/2020 of the Research Unit on Applied Molecular Biosciences - UCIBIO and the project LA/P/0140/2020 of the Associate Laboratory Institute for Health and Bioeconomy - i4HB; FCT/MCTES (Ministério da Ciência, Tecnologia e Ensino Superior) through the project UIDB/50006/2020. K Deribe acknowledges support by the Wellcome Trust [grant number 201900/Z/16/Z] as part of his International Intermediate Fellowship. C Herteliu and A Pana are partially supported by a grant of the Romanian National Authority for Scientific Research and Innovation, CNDS-UEFISCDI, project number PN-III-P4-ID-PCCF-2016-0084. C Herteliu is also partially supported by a grant of the Romanian Ministry of Research Innovation and Digitalization, MCID, project number ID-585-CTR-42-PFE-2021. Y J Kim acknowledges support by the Research Management Centre, Xiamen University Malaysia [No. XMUMRF/2020-C6/ITCM/0004]. S L Koulmane Laxminarayana acknowledges institutional support by the Manipal Academy of Higher Education. K Krishan acknowledges non-financial support from UGC Centre of Advanced Study, CAS II, Department of Anthropology, Panjab University, Chandigarh, India. M Kumar would like to acknowledge NIH/FIC K43 TW010716-04. I Landires is a member of the Sistema Nacional de Investigación (SNI), supported by the Secretaría Nacional de Ciencia, Tecnología e Innovación (SENACYT), Panama.
V Nuñez-Samudio is a member of the Sistema Nacional de Investigación (SNI), which is supported by Panama’s Secretaría Nacional de Ciencia, Tecnología e Innovación (SENACYT). O O Odukoya was supported by the Fogarty International Center of the National Institutes of Health under the Award Number K43TW010704. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Z Quazi Syed acknowledges support from JNMC, Datta Meghe Institute of Medical Sciences. A I Ribeiro was supported by National Funds through FCT, under the ‘Stimulus of Scientific Employment – Individual Support’ program within the contract CEECIND/02386/2018. A M Samy acknowledges the support from a fellowship of the Egyptian Fulbright Mission program and Ain Shams University. R Shrestha acknowledges support from NIDA K01 Award: K01DA051346. N Taveira acknowledges support from FCT and Aga Khan Development Network (AKDN) - Portugal Collaborative Research Network in Portuguese-speaking countries in Africa (project reference: 332821690), and by the European & Developing Countries Clinical Trials Partnership (EDCTP), EU (project reference: RIA2016MC-1615). B Unnikrishnan acknowledges support from Kasturba Medical College, Mangalore, Manipal Academy of Higher Education, Manipal.


This work was primarily supported by grant OPP1132415 from the Bill & Melinda Gates Foundation. The funder of the study had no role in study design, data collection, data analysis, data interpretation, writing of the report, or decision to publish. The corresponding authors had full access to all the data in the study and had final responsibility for the decision to submit for publication.

Author information

Authors and Affiliations