In this retrospective cohort study undertaken in England, Wales and Northern Ireland, we found that pulmonary TB patients with HIV seemed to transmit disease less than individuals without this co-infection, i.e. they had fewer subsequent clustered cases than those without HIV. This is consistent with the results of contact studies across high- and low-burden settings, which have found lower risks of LTBI and TB disease among the contacts of HIV-positive TB patients than among those of HIV-negative TB patients [3,4,5,6]. This adds weight to the suggestion that patients with pulmonary TB and HIV may be less infectious than individuals without HIV co-infection. Among EPTB cases, we found a strong association between HIV co-infection and not being the first case of a cluster, again suggesting that patients with HIV are substantially less infectious. However, where HIV-positive EPTB patients were the first case of a cluster, they had substantially more subsequent clustered cases than HIV-negative EPTB patients. As it is generally accepted that patients with only EPTB disease are not infectious, it is unlikely these patients are driving transmission within these larger clusters. Transmission may have occurred from undiagnosed patients or patients without a known strain type, with the HIV-positive EPTB case appearing to be the first case due to more rapid disease progression or earlier presentation to clinical services. Increased cluster size may also be the result of transmission chains within clusters. HIV prevalence was higher among subsequent cases in clusters with an HIV-positive first case than in clusters with an HIV-negative first case; it is therefore likely that the increased cluster size arises because HIV infection is concentrated within some communities, and so the contacts of the HIV-positive first case are more likely to be susceptible to infection and progression to active disease.
There may also be other social factors influencing transmission which differ between clusters with respect to HIV status, for example, living conditions, social mixing patterns and health-seeking behaviours, which we were not able to account for in this study.
Regardless of whether these HIV-positive cases are the ‘true’ first case in a cluster or merely the first case in a cluster to develop symptoms or present to care, the first observable patient is still a point at which interventions to diagnose patients earlier or investigate clusters can be targeted. National Institute for Health and Care Excellence guidelines currently suggest contact tracing is unnecessary for EPTB cases, and this is supported by a recent cost-effectiveness study [31]. However, our findings demonstrate that whilst EPTB cases may not drive transmission, EPTB cases with HIV can be the first observable case of a substantially larger cluster, which is important for directing cluster investigations. Furthermore, as around 50% of co-infected patients are only diagnosed with HIV at the time of their TB diagnosis [32], targeting HIV screening and LTBI treatment to the contacts of TB patients with HIV could result in earlier diagnosis of HIV infections, providing the opportunity to initiate anti-retroviral therapy and prevent TB disease from occurring [33].
We found a negative association between HIV co-infection and being a subsequent case in a cluster, compared to being the first case or a non-clustered case. This suggests that TB in patients with HIV is more often the result of reactivation of remotely-acquired LTBI than recent infection. These TB cases may be preventable if PLHIV, particularly those born abroad, could be tested and treated for LTBI. This finding contrasts with that of a meta-analysis of the association between HIV and clustering of TB cases in HIV-endemic populations [34], and more recent studies using WGS [35, 36], which concluded that HIV-associated TB was more often the result of recent infection than reactivation of LTBI. This difference is likely the result of the different settings; the higher incidence of TB in the general population in countries where HIV is endemic will lead to a greater force of infection which may differentially affect immunocompromised PLHIV. In contrast, in the UK (and other low-burden settings), the majority of TB cases are in foreign-born patients and transmission is generally considered to be low [9]. As there is generally less exposure to TB, HIV contributes more to reactivation of LTBI than to new TB infections.
Our study benefits from a large sample of all culture-positive TB cases strain typed at ≥ 23 loci in England, Wales and Northern Ireland over a 5-year period and represents over 80% of culture-confirmed TB cases and over 50% of all TB cases in the country during this time. This coverage was comparable to national studies of a similar size in the Netherlands [18, 37] and considerably higher than the 31% coverage in a previous study in England which did not include data on HIV co-infection [10, 38]. Studies in Norway and Denmark have achieved higher rates of coverage nationally (67–69% of all TB cases); however, these studies had limited or no information on HIV status and much smaller overall sample sizes [39, 40]. The cases included in the analysis did not substantially differ in terms of age, sex, ethnicity, place of birth (UK or abroad), year of TB diagnosis or presence of social risk factors from those not included (data not shown).
24-loci MIRU-VNTR is a highly discriminative, high-throughput method of genotyping MTBC [41, 42] and has been widely used in TB cluster investigations. However, analyses using whole-genome sequencing (WGS) have demonstrated that 24-loci MIRU-VNTR typing does not always have sufficiently high resolution to distinguish between closely related, but distinct, lineages [17, 43].
As of 2014, over 95% of adults (18–64 years) diagnosed with TB, who previously did not know their HIV status, were tested for HIV [44]. It is possible that a small number of individuals with undiagnosed HIV were mistakenly classified as HIV-negative. We would expect either that any such misclassification was non-differential or that HIV-positive people were more likely to be tested. Any misclassification would therefore have biased our results towards the null, making the true effect of HIV infection greater than stated, and so we do not consider this a major limitation of our study.
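The direction of this bias can be illustrated with a small worked example. The sketch below uses entirely hypothetical counts (not study data) and assumes that the same fraction of HIV-positive patients is wrongly recorded as HIV-negative regardless of outcome, with no HIV-negative patients recorded as positive. The function names and the 25% missed fraction are illustrative assumptions.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a: exposed with outcome, b: exposed without outcome,
    c: unexposed with outcome, d: unexposed without outcome.
    """
    return (a * d) / (b * c)


def misclassify_exposed(a, b, c, d, missed):
    """Move a fraction `missed` of exposed patients into the unexposed row.

    The same fraction is applied in both outcome groups, i.e. the
    misclassification is non-differential (an assumption of this sketch).
    """
    return (a * (1 - missed), b * (1 - missed), c + a * missed, d + b * missed)


# Hypothetical counts: exposure = HIV-positive, outcome = subsequent case in a cluster.
true_table = (20, 80, 300, 600)
observed_table = misclassify_exposed(*true_table, missed=0.25)

or_true = odds_ratio(*true_table)      # 0.5
or_obs = odds_ratio(*observed_table)   # about 0.51, closer to 1 than the true value

print(f"true OR = {or_true:.3f}, observed OR = {or_obs:.3f}")
```

Under these assumptions the observed odds ratio lies between the true value and 1, so the observed association understates the true one.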
We classed clustered TB cases as being the first case or a subsequent case in clusters according to their earliest date of evidence of TB. Consequently, we may have misclassified the order of patients within clusters, as patients may not develop symptoms or present to care in the order in which they were infected. In particular, TB patients diagnosed with HIV may be diagnosed sooner. If this is the case, we would expect differential misclassification of TB patients with HIV as the first case in a cluster, when in fact they may just be the first patient in that cluster who developed symptoms or presented to care. However, we found that HIV-positive cases typically had fewer subsequent cases and were less likely to be subsequent cases in clusters, and so any misclassification to this effect would have biased our results towards the null and caused underestimation of the impact of HIV. Furthermore, under 50% of TB patients with HIV are aware of their HIV infection when diagnosed with TB [32]; therefore, for most co-infected patients, knowledge of their HIV status would not have influenced the time it took them to present to care, although their disease may have progressed more quickly. We also, where possible (Additional file 1: Table S3), used symptom onset date to determine the order of patients in clusters, as much onward transmission will occur before a TB patient is diagnosed.
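The ordering rule described above can be sketched in a few lines. This is a minimal illustration rather than the study's actual procedure: it assumes a cluster is any group of two or more cases sharing an indistinguishable strain type, that each case already carries a single earliest date of evidence of TB, and that ties are broken by case identifier. The field names (`case_id`, `strain_type`, `earliest_date`) are hypothetical.

```python
from collections import defaultdict
from datetime import date


def label_cluster_order(cases):
    """Label each case as 'non-clustered', 'first' or 'subsequent'.

    `cases` is a list of dicts with hypothetical keys 'case_id',
    'strain_type' and 'earliest_date' (a datetime.date).
    """
    by_strain = defaultdict(list)
    for case in cases:
        by_strain[case["strain_type"]].append(case)

    labels = {}
    for members in by_strain.values():
        if len(members) == 1:
            labels[members[0]["case_id"]] = "non-clustered"
            continue
        # Order by earliest date of evidence; ties broken by case_id (assumption).
        ordered = sorted(members, key=lambda c: (c["earliest_date"], c["case_id"]))
        labels[ordered[0]["case_id"]] = "first"
        for case in ordered[1:]:
            labels[case["case_id"]] = "subsequent"
    return labels


example_cases = [
    {"case_id": "A", "strain_type": "s1", "earliest_date": date(2012, 3, 1)},
    {"case_id": "B", "strain_type": "s1", "earliest_date": date(2011, 11, 15)},
    {"case_id": "C", "strain_type": "s2", "earliest_date": date(2012, 6, 9)},
]
example_labels = label_cluster_order(example_cases)
print(example_labels)
```

Because the label depends only on the recorded date, a patient who progresses or presents faster than the person who infected them would be labelled "first" under this rule, which is the misclassification discussed above.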
Shared strain types may not represent recent transmission, particularly in patients born abroad who may have been infected with common endemic strain types before entering the UK [9]. This could have caused us to overestimate the proportion of TB attributable to recent transmission. Conversely, cases which appeared to have a unique strain type could be the result of recent infection acquired outside of England, Wales and Northern Ireland. Whilst our sample size was large, we were only able to include approximately 50% of TB cases nationally in our analysis as strain typing relies on culture of mycobacterial samples. Low sampling fractions result in underestimation of the extent of clustering [45, 46], as cases can be misclassified as non-clustered if the case they cluster with has not been strain typed. However, it has been shown that a low sampling fraction does not bias estimates of risk factors associated with clustering [45, 46].
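The effect of the sampling fraction on apparent clustering can be made concrete with a simple calculation. The sketch below assumes that each case is strain typed independently with the same probability `f`, which is a simplification and not a description of how typing coverage arose in this study. Under that assumption, a typed case from a true cluster of size n appears clustered only if at least one of the other n − 1 members was also typed, which happens with probability 1 − (1 − f)^(n − 1). The function name and the cluster sizes are hypothetical.

```python
def expected_clustered_fraction(cluster_sizes, f):
    """Probability that a typed case appears clustered, averaged over all cases.

    cluster_sizes: true sizes of each strain-type group (1 = unique strain).
    f: probability that any given case is strain typed (assumed independent).
    """
    total_cases = sum(cluster_sizes)
    # A typed case in a group of size n appears clustered if at least one
    # of the other n - 1 members was also typed.
    apparent = sum(n * (1 - (1 - f) ** (n - 1)) for n in cluster_sizes)
    return apparent / total_cases


sizes = [1, 1, 2, 5]  # hypothetical true cluster sizes
full = expected_clustered_fraction(sizes, 1.0)  # complete sampling
half = expected_clustered_fraction(sizes, 0.5)  # roughly half of cases typed

print(f"clustered with full sampling: {full:.3f}")
print(f"clustered with f = 0.5:       {half:.3f}")
```

With these hypothetical sizes the apparent clustered proportion falls when only half of cases are typed, and small clusters are affected most, since a pair is split whenever either member is missing.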
We chose not to include data on the CD4 count of HIV-positive individuals. Due to the retrospective nature of our study, which used routinely collected data, it was not possible to determine when TB transmission occurred. We therefore were unable to determine the CD4 count of HIV-positive individuals at the time of transmission and so were unable to explore any possible association between CD4 count and propensity to transmit TB. We were also unable to include data on other factors that may have been relevant, such as socioeconomic status and diabetes, as these data were not routinely recorded.
Data on HIV status were not available for children, and therefore children could not be included in this analysis. Children are also less likely to have sputum samples taken and therefore less likely to be strain-typed. To limit bias, we included children when determining whether TB cases were clustered and whether a case was the first or a subsequent case in a cluster and then excluded patients aged < 15 years from the risk factor analysis. TB in children living with HIV is relatively rare in the UK [47], and children with TB are considered unlikely to transmit TB; therefore, the impact of HIV on TB transmission from children is likely to be minimal.