The impact of HIV infection on tuberculosis transmission in a country with low tuberculosis incidence: a national retrospective study using molecular epidemiology

Background HIV is known to increase the likelihood of reactivation of latent tuberculosis to active tuberculosis disease; however, its impact on tuberculosis infectiousness and consequent transmission is unclear, particularly in low-incidence settings. Methods National surveillance data from England, Wales and Northern Ireland on tuberculosis cases in adults from 2010 to 2014, strain typed using 24-locus mycobacterial-interspersed-repetitive-units–variable-number-tandem-repeats, were used retrospectively to identify clusters of tuberculosis cases, subdivided into ‘first’ and ‘subsequent’ cases. Firstly, we used zero-inflated Poisson regression models to examine the association between HIV status and the number of subsequent clustered cases (a surrogate for tuberculosis infectiousness) in a strain type cluster. Secondly, we used logistic regression to examine the association between HIV status and the likelihood of being a subsequent case in a cluster (a surrogate for recent acquisition of tuberculosis infection) compared to the first case or a non-clustered case (a surrogate for reactivation of latent infection). Results We included 18,864 strain-typed cases, of which 2238 were the first cases of clusters and 8471 were subsequent cases. Seven hundred and fifty-nine (4%) were HIV-positive. Outcome 1: HIV-positive pulmonary tuberculosis cases who were the first in a cluster had fewer subsequent cases associated with them (mean 0.6, multivariable incidence rate ratio [IRR] 0.75 [0.65–0.86]) than those HIV-negative (mean 1.1). Extra-pulmonary tuberculosis (EPTB) cases with HIV were less likely to be the first case in a cluster compared to HIV-negative EPTB cases. HIV-positive EPTB cases who were the first case had a higher mean number of subsequent cases (mean 2.5, IRR 3.62 [3.12–4.19]) than those HIV-negative (mean 0.6).
Outcome 2: tuberculosis cases with HIV co-infection were less likely to be a subsequent case in a cluster (odds ratio 0.82 [0.69–0.98]), compared to being the first or a non-clustered case. Conclusions Outcome 1: pulmonary tuberculosis-HIV patients were less infectious than those without HIV. EPTB patients with HIV who were the first case in a cluster had a higher number of subsequent cases and thus may be markers of other undetected cases, discoverable by contact investigations. Outcome 2: tuberculosis in HIV-positive individuals was more likely due to reactivation than recent infection, compared to those who were HIV-negative. Supplementary Information The online version contains supplementary material available at 10.1186/s12916-020-01849-7.


Background
HIV infection increases susceptibility to tuberculosis (TB) disease by increasing the rate of progression from latent TB infection (LTBI) to active disease [1,2]. However, there is also evidence that overall, TB may be less infectious in patients who also have HIV; contact studies have shown lower prevalence of tuberculin skin test (TST) positivity and lower TST conversion rates among contacts of HIV-positive index patients than HIV-negative index patients [3][4][5], particularly when index patients with HIV were immunocompromised [6]. This may be mediated through a shorter duration of infectiousness due to accelerated TB disease progression resulting in earlier diagnosis [2,7], earlier TB treatment [6], lower rates of cavitary [4,6] or sputum smear-positive [4,5] TB, or a shorter duration of cough [4] among HIV-positive index patients.
Molecular strain typing data can help identify cases which may be part of the same chain of transmission [8].
Since 2010, all culture-positive Mycobacterium tuberculosis complex (MTBC) isolates in England, Wales and Northern Ireland have been prospectively strain typed using 24-locus mycobacterial interspersed repetitive units-variable number tandem repeats (MIRU-VNTR) typing. 58.4% of TB cases in England were part of a strain type cluster with at least one other case between 2010 and 2015 [9,10].
Several studies in low-incidence settings which examined whether HIV was a risk factor for being part of a strain type cluster found no association [11][12][13], including one meta-analysis [14], but other more recent studies have reported both positive [15] and negative [16,17] associations. Weak evidence from studies in low-burden settings (with few HIV-positive TB cases) suggests that HIV positivity among the first cases of a cluster may be associated with increased numbers of secondary cases in clusters (possibly because contacts of HIV-infected TB patients may be more likely to have HIV themselves, and therefore may be more susceptible to TB infection) and that patients with TB arising from recent infection are more likely to be HIV-positive than patients whose TB derives from reactivation of LTBI [18][19][20]. Larger cluster sizes in these studies were also associated with social risk factors such as illicit/intravenous drug use and homelessness, both of which are commonly associated with HIV co-infection.
Most risk factors for TB transmission have the same direction of effect on both susceptibility to infection and likelihood of onward transmission. In contrast, HIV may increase susceptibility to infection and is known to increase progression to active TB disease, but may lower infectiousness of TB. The overall impact of HIV on onward transmission of TB is therefore unclear, particularly in low-incidence settings. We utilised a comprehensive national dataset of TB notifications over 5 years, combined with molecular strain typing data and linked to national HIV surveillance data, to examine two outcomes. Firstly, we examined whether the HIV status of a TB case determined the number of subsequent clustered cases. Secondly, we assessed whether TB is more often due to reactivation of LTBI or recent infection in patients with and without HIV.

Study population
This was a retrospective study of culture-confirmed patients with MTBC disease in adults (aged ≥ 15 years) in England, Wales and Northern Ireland, notified to Public Health England (PHE)'s Enhanced TB Surveillance System (ETS) between 2010 and 2014. We included all notified TB patients whose MTBC isolates were strain typed at ≥ 23 loci, using 24-loci MIRU-VNTR genotyping [8]. Recurrent TB cases were identified by record linkage and excluded if the strain type of recurrent notifications was indistinguishable from that of the first (i.e. plausible instances of relapse of active TB disease).
Defining strain type clusters
PHE defines a strain type cluster as two or more persons with TB caused by indistinguishable MIRU-VNTR strain types [8,21]. TB cases with unique strain types were considered 'not clustered'.
The earliest date of evidence of TB disease for each patient (including symptom onset date, date of presentation to healthcare, earliest specimen date, diagnosis date, treatment start date and case notification date) was used to define the order of cases within clusters. We defined the earliest patient in each cluster as the first case and all later cases as subsequent cases.
Cases of TB in children (aged < 15 years) were included in the dataset when determining the order of TB cases within a cluster. However, as HIV status could only be determined for adults, we excluded children from our subsequent analyses. As TB is rare in the UK, clusters were not limited by geographical area within England, Wales and Northern Ireland.
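The cluster definition and ordering rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study (which was analysed in Stata): the record layout (`id`, `strain`, `dates`), the function names and the four example cases are all hypothetical, and ties on the earliest date are broken arbitrarily by input order.

```python
from collections import defaultdict
from datetime import date


def earliest_evidence(case):
    """Earliest non-missing date among the recorded evidence dates."""
    return min(d for d in case["dates"] if d is not None)


def label_cases(cases):
    """Return {case_id: (role, n_subsequent)}.

    role is 'first', 'subsequent' or 'not clustered'. n_subsequent is the
    cluster size minus one for first cases, 0 for unique strain types and
    None for subsequent cases.
    """
    by_strain = defaultdict(list)
    for case in cases:
        by_strain[case["strain"]].append(case)

    labels = {}
    for members in by_strain.values():
        if len(members) == 1:
            # A unique strain type: not clustered, zero subsequent cases.
            labels[members[0]["id"]] = ("not clustered", 0)
            continue
        ordered = sorted(members, key=earliest_evidence)
        labels[ordered[0]["id"]] = ("first", len(ordered) - 1)
        for later in ordered[1:]:
            labels[later["id"]] = ("subsequent", None)
    return labels


# Hypothetical records, for illustration only.
cases = [
    {"id": "A", "strain": "S1", "dates": [date(2011, 3, 1), date(2011, 4, 2)]},
    {"id": "B", "strain": "S1", "dates": [None, date(2010, 12, 5)]},
    {"id": "C", "strain": "S1", "dates": [date(2012, 1, 9)]},
    {"id": "D", "strain": "S2", "dates": [date(2013, 6, 30)]},
]
labels = label_cases(cases)
```

In this toy example, case B has the earliest evidence date in strain type S1 and is therefore the first case, with two subsequent cases; case D has a unique strain type and is not clustered.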

Statistical analysis
Data were analysed in Stata version 13.1. Descriptive analyses of the cohort were undertaken, examining the proportion of cases belonging to a strain type cluster and, among clustered cases, the proportion that were first cases rather than subsequent cases, stratified by HIV status. We also examined the number of subsequent cases following the first case of pulmonary TB in a cluster, stratified by HIV status of the first case in the cluster.
To investigate whether HIV was a risk factor for potential transmission of TB, we conducted two analyses, described in detail below.
Outcome 1: Likelihood of transmitting TB, and the number of subsequent TB cases
This analysis aimed to assess whether the HIV status of a TB case affected transmission, determined by the number of subsequent clustered cases. We compared the likelihood of transmission from TB cases with unique strain types versus those who were the first case in a cluster. The number of subsequent cases for the first case of a cluster was calculated as the number of patients in the cluster, minus one. TB cases with unique strain types were classed as having zero subsequent cases.
To investigate the impact of HIV on the onward transmission of TB, multivariable zero-inflated Poisson regression [22] was used to examine whether the HIV status of the first case of a cluster determined the number of subsequent clustered cases.
Zero-inflated Poisson regression is useful for modelling count data with an excess of zeroes, when the underlying theory suggests that the excess zeroes occur due to a separate process, and can therefore be modelled separately. In this study, we suggest that TB patients fall into two groups; those who are not infectious (and therefore cannot transmit TB to anyone else), modelled by a logistic model, and those who are infectious (and may therefore transmit TB to none, one, or more people), modelled by a Poisson model. Zero-inflated Poisson regression models undertake both of these processes and therefore give an output in two parts: an odds ratio (for the odds of transmitting infection to any subsequent patients) and a rate ratio (for the number of subsequent clustered cases, given that there has been transmission of infection). The model was offset by the time since the earliest date of evidence of TB to the end of the study period (31 December 2014). This analysis was subdivided by the site of TB disease of the first case in the cluster (pulmonary disease with or without extrapulmonary disease, compared to extra-pulmonary disease only), as it is generally accepted that patients with only extra-pulmonary TB (EPTB) are not infectious, and adjusted for other confounding variables [23].
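The two-part structure of the zero-inflated Poisson model can be made concrete with a short Python sketch. It is illustrative only and does not reproduce the fitted model: the function names are our own, the intercept and the structural-zero probability are invented, and the HIV coefficient is simply set so that its exponential equals the reported pulmonary IRR of 0.75. The offset enters as the logarithm of the observation time, so the Poisson mean scales with the time each first case was observed.

```python
import math


def zip_pmf(k, pi_zero, mu):
    """P(Y = k) under a zero-inflated Poisson distribution.

    pi_zero: probability of a structural zero (the logistic part).
    mu: Poisson mean for cases that can have subsequent cases.
    """
    poisson = math.exp(-mu) * mu ** k / math.factorial(k)
    if k == 0:
        # Zeroes arise from both the structural and the Poisson process.
        return pi_zero + (1.0 - pi_zero) * poisson
    return (1.0 - pi_zero) * poisson


def poisson_mean(beta0, beta_hiv, hiv, years_observed):
    """Poisson mean with log(years_observed) acting as the offset."""
    return years_observed * math.exp(beta0 + beta_hiv * hiv)


# Hypothetical intercept; beta_hiv chosen so exp(beta_hiv) = 0.75.
beta0, beta_hiv = -0.5, math.log(0.75)
mu_neg = poisson_mean(beta0, beta_hiv, hiv=0, years_observed=3.0)
mu_pos = poisson_mean(beta0, beta_hiv, hiv=1, years_observed=3.0)
rate_ratio = mu_pos / mu_neg  # equals exp(beta_hiv), whatever the offset

# Hypothetical structural-zero probability of 0.3 with a Poisson mean of 2.
total_probability = sum(zip_pmf(k, 0.3, 2.0) for k in range(60))
expected_count = sum(k * zip_pmf(k, 0.3, 2.0) for k in range(60))
```

Because the offset is identical for the two groups being compared, it cancels in the ratio, which is why the exponentiated coefficient can be read directly as a rate ratio. The expected count under the model is (1 − `pi_zero`) × `mu`.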
As the first identified case of the cluster may not be responsible for transmission within the cluster, we conducted a sensitivity analysis in which we examined the number of subsequent cases for the first pulmonary case in each cluster, regardless of whether the first pulmonary case was the first case in the cluster.
Outcome 2: Likelihood of being a subsequent case in a cluster (a surrogate for recent TB infection)
This analysis investigated whether HIV status influenced whether a patient's TB was more likely to be the result of recent infection or reactivation of LTBI. We used multivariable logistic regression to assess the odds ratio for being a subsequent case in a cluster (a proxy for recent acquisition of TB infection), compared to being the first case or a non-clustered case (representing reactivation cases) in HIV-positive and negative individuals. All TB cases with strain typing data were included in this analysis.
As per outcome 1, we also conducted a sensitivity analysis in which we assumed that transmission originated from the first pulmonary case in the cluster, rather than the first case temporally irrespective of the site of disease.
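For readers less familiar with odds ratios, the following Python sketch computes an unadjusted odds ratio with a Wald 95% confidence interval from a 2×2 table. With a single binary exposure and no covariates, this is the quantity a logistic regression would estimate; the study's estimates were additionally adjusted for confounders, which this sketch does not attempt. The function name and the counts are hypothetical and are not taken from the study data.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: exposed with the outcome      b: exposed without the outcome
    c: unexposed with the outcome    d: unexposed without the outcome
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: exposure = HIV-positive, outcome = subsequent case.
odds_ratio, lower, upper = odds_ratio_wald(a=40, b=60, c=500, d=500)
```

An odds ratio below 1 in this layout would mean that the exposed group has lower odds of being a subsequent case than the unexposed group.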

Exposure variables
Our primary exposure variable was HIV status, which was determined through linkage [24,25] of ETS to the national HIV and AIDS Reporting System [26,27]. Potential confounders for the relationship between HIV status and the outcomes were identified prospectively [23,28] and are shown in Table 1. All potential confounders were included in the multivariable models.

Descriptive analysis
A flow chart of the cases included is shown in Fig. 1. Seven hundred and fifty-nine TB cases were co-infected with HIV (4.0%); 410/759 (54.0%) were clustered and 99/410 (24.2%) were the first case in a cluster.
Of the 8471 subsequent cases in clusters, 3.7% were HIV-positive. Of these subsequent cases, 572 (6.8%) had an HIV-positive first case, 7775 (91.8%) had an HIV-negative first case, and the HIV status of the first case was unknown for 124 (1.5%) patients from clusters in which the first case was a child. Other demographic, socioeconomic and clinical factors are shown in Table 1. The HIV status of the first case of a cluster was positively associated with the HIV status of subsequent cases (χ² test P < 0.001). The prevalence of HIV among subsequent cases was higher in clusters with an HIV-positive first case (10.7%) than in clusters with an HIV-negative first case (3.2%). 6.4% of HIV-negative subsequent cases had an HIV-positive first case, compared to 19.9% of HIV-positive subsequent cases. 1998/2284 (87.5%) of clusters consisted of only HIV-negative TB patients, 11 clusters (0.5%) consisted of only HIV-positive TB patients and 275 (12.0%) clusters were mixed.
The mean cluster size in the cohort was 5 (median 3, inter-quartile range 2-4, range 2-198), 5 for clusters where the first patient was HIV-negative and 7 for clusters with an HIV-positive first case.

Outcome 1: The impact of HIV on the likelihood of transmitting TB, and the number of subsequent TB cases
The number of subsequent cases following the first TB case in a cluster differed substantially by HIV status, site of disease and smear status (Table 2).
The zero-inflated Poisson model showed that among pulmonary TB cases (with or without extra-pulmonary disease), there was no evidence for an association between HIV co-infection and being the first case of a strain type cluster (compared to not being part of a strain type cluster) in the logistic part of the model (multivariable odds ratio [OR] 1.10 [0.79-1.53], Table 3). However, HIV co-infection was associated with a decreased number of subsequent clustered cases in the Poisson part of the model (multivariable incidence rate ratio [IRR] 0.75 [0.65-0.86], Table 3). This shows that where pulmonary TB cases with HIV were the first case of a cluster, the overall cluster size was smaller.
Extra-pulmonary (with no pulmonary disease) TB cases with HIV co-infection were less likely to be the first case of a cluster than those without HIV (multivariable OR for having a unique strain type 1.93 [1.12-3.33], Table 4). However, where an EPTB case was the first case in a cluster, HIV co-infection was associated with an increased number of subsequent cases (multivariable IRR 3.62 [3.12-4.19]).
In a sensitivity analysis, we examined the number of subsequent cases following the first pulmonary case in each cluster, rather than stratifying the analysis by the site of TB disease of the first patient in the cluster (Additional file 1: Table S1).
For outcome 2, TB cases with HIV co-infection were less likely to be a subsequent case in a cluster than to be the first case or a non-clustered case (odds ratio 0.82 [0.69-0.98], Table 5), indicating that reactivation of LTBI was more likely to have been the source of disease for these individuals. A sensitivity analysis in which we assumed non-clustered cases and the first pulmonary case of each cluster (rather than the first case of the cluster irrespective of disease site) were the result of reactivation of LTBI and that all other clustered cases were the result of recent transmission showed consistent results (Additional file 1: Table S2).

Discussion
In this retrospective cohort study undertaken in England, Wales and Northern Ireland, we found that pulmonary TB patients with HIV seemed to transmit disease less than individuals without this co-infection, i.e. they had fewer subsequent clustered cases than those without HIV. This is consistent with the results of contact studies across high- and low-burden settings, which have found lower risks of LTBI and TB disease among the contacts of HIV-positive patients than HIV-negative TB patients [3][4][5][6]. This adds weight to the suggestion that patients with pulmonary TB and HIV may be less infectious than individuals without HIV co-infection. Among EPTB cases, we found a strong association between HIV co-infection and not being the first case of a cluster, again suggesting that patients with HIV are substantially less infectious. However, where HIV-positive EPTB patients were the first case of a cluster, they had substantially more subsequent clustered cases than HIV-negative EPTB patients. As it is generally accepted that patients with only EPTB disease are not infectious, it is unlikely these patients are driving transmission within these larger clusters. Transmission may have occurred from undiagnosed patients or patients without a known strain type, with the HIV-positive EPTB case appearing to be the first case due to more rapid disease progression or earlier presentation to clinical services. Increased cluster size may also be the result of transmission chains within clusters. HIV prevalence was higher among subsequent cases in clusters with an HIV-positive first case than clusters with HIV-negative first cases; it is therefore likely that the increased cluster size is because HIV infection is concentrated within some communities, and so the contacts of the HIV-positive infectious case are more likely to be susceptible to infection and progression to active disease.
There may also be other social factors influencing transmission which differ between clusters with respect to HIV status, for example, living conditions, social mixing patterns and health-seeking behaviours, which we were not able to account for in this study.
Regardless of whether these HIV-positive cases are the 'true' first case in a cluster or merely the first case in a cluster to develop symptoms or present to care, the first observable patient is still a point at which interventions to diagnose patients earlier or investigate clusters can be targeted. National Institute for Health and Care Excellence guidelines currently suggest contact tracing is unnecessary for EPTB cases, and this is supported by a recent cost-effectiveness study [31]. However, our findings demonstrate that whilst EPTB cases may not drive transmission, EPTB cases with HIV can be the first observable case of a substantially larger cluster, which is important for directing cluster investigations. Furthermore, as around 50% of co-infected patients are only diagnosed with HIV at the time of their TB diagnosis [32], targeting HIV screening and LTBI treatment to the contacts of TB patients with HIV could result in earlier diagnosis of HIV infections, providing the opportunity to initiate anti-retroviral therapy and prevent TB disease from occurring [33].
We found a negative association between HIV co-infection and being a subsequent case in a cluster, compared to being the first case or a non-clustered case. This suggests that TB in patients with HIV is more often the result of reactivation of remotely-acquired LTBI than recent infection. These TB cases may be preventable if people living with HIV (PLHIV), particularly those born abroad, could be tested and treated for LTBI. This finding contrasts with that of a meta-analysis of the association between HIV and clustering of TB cases in HIV-endemic populations [34], and more recent studies using whole-genome sequencing (WGS) [35,36], which concluded that HIV-associated TB was more often the result of recent infection than reactivation of LTBI. This difference is likely the result of the different settings; the higher incidence of TB in the general population in countries where HIV is endemic will lead to a greater force of infection which may differentially affect immunocompromised PLHIV. In contrast, in the UK (and other low-burden settings), the majority of TB cases are in foreign-born patients and transmission is generally considered to be low [9]. As there is generally less exposure to TB, HIV contributes more to reactivation of LTBI than to new TB infections.
Our study benefits from a large sample of all culture-positive TB cases strain typed at ≥ 23 loci in England, Wales and Northern Ireland over a 5-year period and represents over 80% of culture-confirmed TB cases and over 50% of all TB cases in the country during this time. This coverage was comparable to national studies of a similar size in the Netherlands [18,37] and considerably higher than the 31% coverage in a previous study in England which did not include data on HIV co-infection [10,38]. Studies in Norway and Denmark have achieved higher rates of coverage nationally (67-69% of all TB cases); however, these studies had limited or no information on HIV status and much smaller overall sample sizes [39,40]. The cases included in the analysis did not substantially differ in terms of age, sex, ethnicity, place of birth (UK or abroad), year of TB diagnosis or presence of social risk factors from those not included (data not shown).
24-loci MIRU-VNTR is a highly discriminative, high-throughput method of genotyping MTBC [41,42] and has been widely used in TB cluster investigations. However, analyses using whole-genome sequencing (WGS) have demonstrated that 24-loci MIRU-VNTR typing does not always have sufficiently high resolution to distinguish between closely related, but distinct, lineages, so cases with indistinguishable profiles are not necessarily part of the same transmission chain [17,43].
As of 2014, over 95% of adults (18-64 years) diagnosed with TB, who previously did not know their HIV status, were tested for HIV [44]. It is possible that a small number of individuals with undiagnosed HIV were mistakenly classified as HIV-negative. We would expect any such misclassification to either be non-differential or for HIV-positive people to be more likely to be tested. Any misclassification would therefore have biased our results towards the null, making the true effect of HIV infection greater than stated, and so we do not consider this a major limitation of our study.
We classed clustered TB cases as being the first case or a subsequent case in clusters according to their earliest date of evidence of TB. Consequently, we may have misclassified the order of patients within clusters, as patients may not develop symptoms or present to care in the order in which they were infected. In particular, TB patients diagnosed with HIV may be diagnosed sooner. If this is the case, we would expect differential misclassification of TB patients with HIV as the first case in a cluster, when in fact they may just be the first patient in that cluster who developed symptoms or presented to care. However, we found that HIV-positive cases typically had fewer subsequent cases and were less likely to be subsequent cases in clusters, and so any misclassification to this effect would have biased our results towards the null and caused underestimation of the impact of HIV. Furthermore, under 50% of TB patients are aware of their HIV infection when diagnosed with TB [32]; therefore, this would not have influenced the time it took them to present to care, although their disease may have progressed more quickly. We also, where possible (Additional file 1: Table S3), used symptom onset date to determine the order of patients in clusters, as much onward transmission will occur before a TB patient is diagnosed.
Table footnote: IRR: incidence rate ratio (Poisson part) for an increased number of subsequent clustered cases. OR: odds ratio (zero-inflated part) for the odds of being a non-clustered case, compared to being the first extra-pulmonary case of a cluster. IMD: index of multiple deprivation score. IMD score deciles represent relative levels of deprivation of income, employment, health, education, housing and services, crime and living environment for small areas in England and Wales, where 1 = most deprived and 10 = least deprived [29,30]. ≠ Adjusted for all variables shown in the table. The multivariable model included 3576 extra-pulmonary TB cases after 633 were excluded due to missing data on one or more of sex (n = 3), ethnicity (n = 106), time since entry to the UK (n = 505), IMD score (n = 99) or TB lineage (n = 1).
Shared strain types may not represent recent transmission, particularly in patients born abroad who may have been infected with common endemic strain types before entering the UK [9]. This could have caused us to overestimate the proportion of TB attributable to recent transmission. Conversely, cases which appeared to have a unique strain type could be the result of recent infection acquired outside of England, Wales and Northern Ireland. Whilst our sample size was large, we were only able to include approximately 50% of TB cases nationally in our analysis as strain typing relies on culture of mycobacterial samples. Low sampling fractions result in underestimation of the extent of clustering [45,46], as cases can be misclassified as not-clustered if the case they cluster with has not been strain typed. However, it has been shown that a low sampling fraction does not bias estimations of risk factors associated with clustering [45,46].
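The effect of the sampling fraction on apparent clustering can be illustrated with a simple probability calculation. The Python sketch below makes a strong simplifying assumption that is ours and not the study's: every case in a true cluster is strain typed independently with the same probability. Under that assumption, a strain-typed case from a true cluster of size n is observed as clustered only if at least one of the other n − 1 cases is also typed. The function name is hypothetical.

```python
def prob_observed_clustered(cluster_size, sampling_fraction):
    """Probability that a strain-typed case from a true cluster of the
    given size is observed as clustered, assuming each other case is
    typed independently with probability `sampling_fraction`.
    """
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    if not 0.0 <= sampling_fraction <= 1.0:
        raise ValueError("sampling_fraction must be between 0 and 1")
    # The case is missed as clustered only if none of the others is typed.
    return 1.0 - (1.0 - sampling_fraction) ** (cluster_size - 1)


pair_at_half = prob_observed_clustered(2, 0.5)
triple_at_half = prob_observed_clustered(3, 0.5)
```

With a sampling fraction of 0.5, a case from a true cluster of two has only an even chance of being observed as clustered, which shows how incomplete strain typing leads to clustering being underestimated, most markedly for small clusters.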
We chose not to include data on the CD4 count of HIV-positive individuals. Due to the retrospective nature of our study, which used routinely collected data, it was not possible to determine when TB transmission occurred. We therefore were unable to determine the CD4 count of HIV-positive individuals at the time of transmission and so were unable to explore any possible association between CD4 count and propensity to transmit TB. We were also unable to include data on other factors that may have been relevant, such as socioeconomic status and diabetes, as these data were not routinely recorded.
Data on HIV status were not available for children, and therefore children could not be included in this analysis. Children are also less likely to have sputum samples taken and therefore less likely to be strain-typed. To limit bias, we included children when determining whether TB cases were clustered and whether a case was the first or a subsequent case in a cluster and then excluded patients aged < 15 years from the risk factor analysis. TB in children living with HIV is relatively rare in the UK [47], and children with TB are considered unlikely to transmit TB; therefore, the impact of HIV on TB transmission from children is likely to be minimal.
Table footnote: Cases missing data were considered not to have these social risk factors.

Conclusions
In conclusion, we report that pulmonary TB patients with HIV had fewer subsequent clustered cases than patients without HIV. However, when patients with HIV and EPTB were the first case of a cluster, they had a higher number of subsequent cases. HIV prevalence was higher among the subsequent cases of HIV-positive first cases than the subsequent cases of HIV-negative first cases, suggesting that the higher number of subsequent cases for EPTB patients with HIV could be because their contacts are more susceptible to infection and progression of disease. Similarly, EPTB patients with HIV may be a sentinel marker for other factors driving recent transmission, and contact tracing should not be discounted for these cases. Our findings suggest that screening the contacts of TB patients with HIV for both HIV and LTBI could be considered. Furthermore, TB cases with HIV were less likely to be a subsequent case within a cluster, which suggests that HIV-associated TB is more often due to reactivation of LTBI rather than recent infection. More widespread testing for LTBI and preventive therapy among people living with HIV could decrease the incidence of HIV-associated TB.
Additional file 1: Table S1: Sensitivity analysis for a multivariable zero-inflated Poisson regression of factors associated with the number of subsequent clustered cases for the first pulmonary TB case in each cluster in England, Wales and Northern Ireland, 2010-2014. Table S2: Sensitivity analysis for a multivariable logistic regression of factors associated with being a subsequent TB case in a cluster (a surrogate for recent infection) compared to being the first pulmonary case or a non-clustered case, in England, Wales and Northern Ireland from 2010 to 2014.