Risk of tuberculosis in patients with diabetes: population based cohort study using the UK Clinical Practice Research Datalink

Background Previous cohort studies demonstrate diabetes as a risk factor for tuberculosis (TB) disease. Public Health England has identified improved TB control as a priority area and has proposed a primary care-based screening program for latent TB. We investigated the association between diabetes and risk of tuberculosis in a UK General Practice cohort in order to identify potential high-risk groups appropriate for latent TB screening. Methods Using data from the UK Clinical Practice Research Datalink we constructed a cohort of patients with incident diabetes. We included 222,731 patients with diabetes diagnosed from 1990–2013 and 1,218,616 controls without diabetes at index date who were matched for age, sex and general practice. The effect of diabetes was explored using a Poisson analysis adjusted for age, ethnicity, body mass index, socioeconomic status, alcohol intake and smoking. We explored the effects of age, diabetes duration and severity. The effects of diabetes on risk of incident TB were explored across strata of chronic disease care defined by cholesterol and blood pressure measurement and influenza vaccination rates. Results During just under 7 million person-years of follow-up, 969 cases of TB were identified. The incidence of TB was higher amongst patients with diabetes compared with the unexposed group: 16.2 and 13.5 cases per 100,000 person-years, respectively. After adjustment for potential confounders the association between diabetes and TB remained (adjusted RR 1.30, 95 % CI 1.01 to 1.67, P = 0.04). There was no evidence that age, time since diagnosis and severity of diabetes affected the association between diabetes and TB. Diabetes patients with the lowest and highest rates of chronic disease management had a higher risk of TB (P <0.001 for all comparisons). Conclusions Diabetes as an independent risk factor is associated with only a modest overall increased risk of TB in our UK General Practice cohort and is unlikely to be sufficient cause to screen for latent TB. Across different consulting patterns, diabetes patients accessing the least amount of chronic disease care are at highest risk for TB. Electronic supplementary material The online version of this article (doi:10.1186/s12916-015-0381-9) contains supplementary material, which is available to authorized users.


Background
In the UK, rising rates of tuberculosis (TB), particularly in socially disadvantaged groups including migrants, are of great public health concern. London has the highest TB incidence of any western European capital [1]. There is evidence that reactivation of latent disease accounts for a large number of new TB diagnoses in the UK providing an argument for screening and treating latent disease in high-risk groups to help reverse the UK's increasing incidence of TB [1,2]. An effective TB control program that targets latent disease within a UK population will need to define high-risk groups and strategies for accessing and screening them. It could be that the role of diabetes as an independent risk factor for TB disease, or identification of subgroups of diabetic patients at particularly high risk, might inform any future UK screening and treatment policy.
Previous cohort studies have demonstrated an increased risk for TB disease in people with diabetes. The most recent meta-analysis that included three cohort studies showed a rate ratio of 3.11 (95 % confidence interval (CI) 2.27 to 4.26) for pulmonary TB associated with diabetes [3]. None of these studies were based in a UK population and two of the studies were in patients with renal failure, which is an independent risk factor. There has been one previous cohort study in a UK population [4] using two linked hospital based datasets from Oxford which found rate ratios of 1.83 (CI 1.26 to 2.60) and 3.11 (CI 1.17 to 7.03) for active TB disease comparing patients with diabetes with a reference hospital cohort. There has been no previous large cohort study in the general UK population exploring the association between diabetes and TB risk, which has also been able to include the effects of potentially important confounders.
We used a large UK representative general practice database to assess the overall risk of TB comparing people with and without diabetes and to explore whether factors relating to the patient, diabetes condition or accessing primary health care further defined a high risk group for TB, whilst adjusting for important confounding risk factors.

Methods
We undertook a matched cohort study to investigate the incidence of TB in patients with and without diabetes.

Clinical Practice Research Datalink
The UK Clinical Practice Research Datalink (CPRD) contains anonymised data from clinical general practice records of more than 15 million patients. Data on demographic, diagnostic, prescription, referral, clinical test results and pertinent lifestyle measures are included in the records. Data collection started in 1987 and currently the database contains information from the medical records of 680 UK participating general practices where there are 7 million patients currently registered. The information contained on the database undergoes regular rigorous quality checks.

Hospital episode statistics
Hospital Episode Statistics (HES) data include information on English NHS Trust hospital admissions from 1989 and outpatient clinic attendances from 2003. Key diagnostic (using international classification of diseases codes, ICD-10) and demographic data are contained in this database and hospital data for patients residing within the English region of the CPRD are eligible to be linked with their CPRD general practice record. We used HES ethnicity data.

Study participants
We identified a cohort of patients within the CPRD with incident diabetes (types 1 and 2 included), ≥5-years old, who had their first recorded diagnosis for diabetes in the study period 1 January 1990 to 31 December 2012. Patients were identified using National Health Service (NHS) Read codes, which are a coded thesaurus of clinical terms and diagnoses [5]. A diagnosis of diabetes was considered incident if the Read code was first recorded on a date at least 12 months after the patient's current practice registration date, to avoid including prevalent cases recorded when patients register [6]. The incident date was taken as the date of the first diabetes diagnostic Read code. Patients were excluded if they had a secondary, gestational or genetic cause of diabetes recorded or had codes indicating the earlier diabetes Read code was incorrect (example "ceased" or "not" codes). Once this initial cohort of patients with incident diabetes was created, further data quality checks were made excluding those who did not have a feasible temporal sequence for dates of birth, registration and death (Fig. 1).
The cohort of patients identified with incident diabetes was initially divided into type 1 and 2 diabetes containing definite, probable and possible categories based on their diagnostic Read code and number of contradictory codes according to methods previously published by De Lusignan et al. [7] The cohorts were then further refined using algorithms including patient age at diagnosis, prescriptions for insulin and other anti-diabetic drugs (OAD), body mass index (BMI) and ethnicity data, where this was available, based on methods first described in the Royal College of General Practitioners' report of coding and classifying diabetes [8] (Additional files 1 and 2).
An unexposed matched cohort was created, composed of those who did not have a prevalent diagnosis of diabetes on the matched index date, which also had to fall 12 months on or after their current registration date. Up to six unexposed patients were randomly selected and matched for age +/− 5 years, gender and General Practice with every exposed patient. Patients in the unexposed group could become diabetic during the study period and would then join the exposed cohort for the remainder of their follow-up. This was done so that our exposed cohort was not compared against an unusually healthy unexposed group who were never at risk of diabetes. This design helps control for unmeasured confounding lifestyle factors and allows for time dependent analyses. Patients included in the unexposed cohort had to have at least one face-to-face consultation or acute prescription recorded within the 12 months before or after their matched index date to ensure they were active in their follow-up. The unexposed cohort was subject to the same data quality checks and exclusions as the exposed cohort (Fig. 2).

Tuberculosis outcome
A list of Read codes for all forms of tuberculosis was developed (available on request). Prescriptions for antituberculosis drugs was not used in developing or later validating cases of TB identified by diagnostic Read codes as these medications singly prescribed are not specific to active TB disease which is treated by secondary care specialists in the UK. Both exposed and unexposed patients were allowed to have had TB recorded before their index or matched index date but there was a lag period of 2 years after the last coding of TB before a subsequent coding of TB would be considered as a new diagnosis. Any part of this lag period which occurred after the (matched) index date was not included as follow-up time in order to avoid time-related bias [9] (see Fig. 3).

Confounders and effect modifiers
We considered potential confounders including age, gender, BMI, alcohol intake, smoking status, socioeconomic status and ethnicity. Chronic renal failure was not considered as a confounder as it is on the causal pathway. Exposure to steroid medication was also not included as secondary diabetes Read codes were excluded. As age and gender were used in the matching process, we only included age in our final models to adjust for any residual confounding by this factor. BMI, alcohol intake and smoking status were taken from data recorded closest to index or matched index date. Ethnicity data was generated using both CPRD and HES data for English patients based on algorithms produced by one of the authors (RM) described previously [10]. See Additional file 3 for further methodology details on variable construction.
To explore any possible ascertainment bias from likely increased consultation rates in patients with diabetes, we  studied the effect of diabetes within strata of yearly consultation rate and total consultation number. As consultation rates capture the severity of disease we did not adjust for this, to avoid adjusting for diabetes exposure. Yearly consultation rate was generated using the same face to face consultation codes used to define active patients and the denominator of follow-up time adjusted for any possible time-related bias from previous TB diagnoses. Similarly, each patient's total consultation number was constructed for the same period.
We explored the effect of health care utilisation to investigate effects of access to care to reduce ascertainment bias for diabetes patients more likely to have regular primary care follow-up. Health care utilisation was captured by studying rates of blood pressure and cholesterol measurement and influenza vaccination. These three measures are incentivised in UK General Practice and patients with diabetes should receive these interventions at least once per year. These specific factors were chosen to capture health care utilisation without the same inherent limitation of consultation rate, which is dependent on diabetes severity.

Statistical analysis
Follow-up time started from the index or matched-index or at the end of any potential immortal time period (Fig. 3). Follow-up ended at the earliest occurrence of first TB diagnosis post index date, transfer out of the practice, last practice data collection, death from any cause or the end of the study, 31 December 2012.
Initial univariate analysis of diabetes exposure and the outcome of TB were explored by dividing the exposed cohort into type 1 and type 2 diabetes categories to ascertain  We used a Poisson regression model for the adjusted rate ratio of first diagnosis of TB comparing patients with diabetes with their matched unexposed cohort. We assumed that UK incidence rates of TB do not vary rapidly over time. An unmatched analysis was used as this does not introduce bias when analysing matched cohort studies [11]. The analysis model was developed guided by the postulated causal relationships between variables (Fig. 4). Variables were kept in the model if they changed the point estimate for the rate ratio of TB for our exposed cohort   and with the addition of each variable we assessed for possible collinearity by studying changes in standard errors. A priori effect modification of the relationship between diabetes and TB by age, ethnicity, duration of diabetes and within strata of consultation pattern was explored.
To explore the effect of diabetes exposure on absolute tuberculosis incidence we used the Poisson model developed previously to predict event rates amongst patients at most and least risk of TB according to our baseline bivariate analyses.
A pre-specified exploration of the effect of severity of diabetes was undertaken by comparing the adjusted rate ratios of TB incidence for patients with type 2 diabetes in different insulin treatment categories, comparing those requiring insulin and those without insulin (treated with other antidiabetic drugs or diet alone) with a matched non-exposed cohort. A time-updated analysis was used (see Additional file 3 for a full description of the methodology).
The effect of glycaemic control was studied using a post-index mean value for HbA1c, calculated for all patients with type 2 diabetes and TB rates were calculated within strata of glycaemic control. Algorithms were developed to identify and convert HbA1c units from DCCT (Diabetes Control and Complications Trial) percent units to the more recent IFCC (International Federation of Clinical Chemistry) mmols/mol units.
All primary analyses were performed on a completecase basis. Further post hoc and sensitivity analyses are described in Additional file 3. All data analyses were carried out using STATA 13 MP.

Protocol and ethical approval
The study received ethical approval under protocol number 13_014 by the Independent Scientific Advisory Committee for the Medicines and Healthcare products Regulatory Agency for patient-level NHS database research and the LSHTM Research Ethics Committee (reference 6352).

Study population and baseline characteristics
From 530,994 patients identified with a diabetes Read code recorded within the study period, 222,731 patients with incident diabetes were included in this study, comprising 6,186 patients with type 1 and 216,545 patients with type 2 diabetes (see Additional files 1 and 2). These were matched to a total of 1,218,616 patients who were unexposed to diabetes at their matched index date, 82,208 (6.74 %) of whom went on to have a diabetes Read code and hence joined the exposed cohort for the remainder of their follow-up. On initial matching 98.59 % of the exposed cohort had 6 possible matched controls with only 76 patients having no matched controls. After excluding these and controls that failed quality checks and were not active in their follow-up, 92.77 % of the exposed cohort had ≥5 matched controls.
The median follow-up time was 4.4 years (interquartile range 1.9 to 7.8 years) for patients with incident diabetes and 3.8 years (interquartile range 1.6 to 7.0 years) for matched control patients. In total, the diabetes cases and their matched controls contributed a little less than seven million person-years of follow-up. Amongst the cohort with diabetes, 66,005 (29.6 %) had no anti-diabetic medication recorded in their therapy files. In those patients with anti-diabetic prescriptions, 133,503 (85.2 %) had prescriptions for metformin singly or combined and 35,203 (22.4 %) had insulin prescriptions.
Baseline characteristics for patients with incident diabetes and the matched unexposed cohort are shown in Table 1. There were considerable missing data for ethnicity (45.3 %) and index of multiple deprivation (IMD) (39.5 %) for the total cohort, although there were fewer missing data for patients with diabetes. In total, 97,861 (43.9 %) patients with incident diabetes and 461,792 (37.9 %) of the matched unexposed cohort had complete data on all variables and were used in the complete case analysis. Patients with diabetes consulted more frequently than their matched controls (P <0.001).

Effect of diabetes on tuberculosis incidence
A total of 969 TB outcomes were recorded during followup and 2,376 (1.07 %) of the exposed and 13,867 (1.14 %) of the unexposed had previous TB recorded. The incidence of TB was higher amongst patients with diabetes compared with a matched unexposed group, 16.2 and 13.5 cases per 100,000 person-years, respectively (Table 2). After adjustment for potential confounders the association between diabetes and TB incidence remained (adjusted rate ratio 1.30, 95 % CI 1.01 to 1.67, P = 0.04).
The associations between diabetes and the potential confounders with TB that were included in the adjusted model are shown in Table 3.
There was only one TB outcome event in the follow-up period of patients classified with type 1 diabetes; thus, in all further analyses we used a cohort with both types of diabetes combined or where stated only those with type 2 diabetes and their matched unexposed cohort.
The estimated absolute increase in TB incidence comparing patients with and without diabetes in those with the highest baseline risk of TB (age >70 years, South Asian ethnicity, non-drinkers, current-smokers, BMI <20 and in the 5 th IMD quintile) was an extra 206 cases per 100,000 person-years ( Table 4).

Effect of age, duration of diabetes diagnosis and consultation patterns
The median age for TB diagnosis was nearly five years younger for patients with diabetes (67.5 years, IQR 58.5- Table 1 Characteristics of patients with incident diabetes and a matched a non-exposed cohort  (Table 5). There was no evidence for effect of duration of diabetes diagnosis on risk of tuberculosis (P = 0.375). To study possible interaction with ethnicity the categories of black and mixed/other were combined to provide sufficient TB events in each strata. There was no evidence for a role of ethnicity in modifying the effects of diabetes for tuberculosis risk (P = 0.894). For consulting patterns, there was a suggestion that diabetes had a greater effect on risk of TB for those who consulted at the lowest rate and the effect of diabetes diminished with increasing consultation rates, but this pattern was not statistically significant. Our results were not changed by excluding those with follow-up shorter than six months who might have unstable consultation rates. A similar but statistically significant pattern was seen when looking within strata of total consultation number with the effect of diabetes conferring a greater risk for TB in the group with the least number of consultations in their follow-up (P = 0.003).
Across health care utilisation patterns there was strong evidence (P <0.001) of differences in TB risk amongst diabetes patients, with patients accessing the lowest and highest amounts of chronic disease health care being at higher risk of TB, particularly in those with lower influenza vaccination rates (Table 6).

Comparing insulin and non-insulin users within type 2 diabetes
There was an association between increasing severity of type 2 diabetes and increasing TB risk as indicated by time-dependent analyses comparing insulin users versus non-users in crude analyses with age adjustment (P = 0.020), but this relationship did not remain after full adjustment ( Table 7).

Effect of glycaemic control in type 2 diabetes
In the subset of type 2 diabetes patients with Hba1c measurements (198,227) compared with their matched unexposed controls, there was no evidence for increasing TB risk with worsening glycaemic control (Table 8).
Post hoc and sensitivity analyses are described in Additional file 4.

Key findings
In this large UK population based cohort of patients with incident diabetes we found only an overall modest 1.3 fold increased risk of tuberculosis. We found no evidence for higher relative increases in TB rates amongst diabetes patients of different age groups or ethnicities, longer duration of disease, those using insulin or with worse glycaemic control. There was strong evidence for differences amongst diabetes patients with different health care utilisation patterns. The highest risk of TB disease was amongst the group least accessing chronic disease health care.

Comparison with other studies and explanation of findings
Our study is the largest cohort study to date exploring the association between diabetes and TB with individual-level adjustment for important demographic and lifestyle factors. The finding of an overall increased risk of TB in those with diabetes is in agreement with previous published studies and reviews [12-29, 4, 30, 3, 31]. However, in contrast we find only an overall small relative effect from diabetes in our UK population. The most recent systematic review and meta-analysis [3] included eight casecontrol studies with odds ratios ranging from 1.16 to 7.83 and a random effects analysis of the three included cohort studies showed a three-fold increased risk of TB with diabetes (relative risk 3.11, 95 % CI 2.27 to 4.26). All three cohort studies included in the meta-analysis were conducted in high TB incidence countries and two used cohorts of renal transplant patients.
Since the most recent systematic review, we are aware of seven more published analytical studies in humans looking at the question of TB risk associated with diabetes, including two case-control [26,27] and five cohort studies [25,28,29,4,32] summarised in Additional file 5. Our finding of only a modest increase in risk of TB with diabetes is in agreement with these more recent studies. As study size increases there is a decrease in the estimate for the association between diabetes and TB seen for both cohort and case-control designs, even in higher incidence countries. These differences could be due to publication bias in earlier studies and/or the adjustment for more confounding in later studies. Leung et al. [25] used a cohort from a Hong Kong community based health program for ≥65 year olds and were able to adjust for demographic and lifestyle factors giving an overall adjusted hazard ratio of 1.77 (1.41 to 2.24). Dobler et al. [29] used a whole population cohort for Australian citizens and adjusted for demographic and indigenous status and TB incidence in country of birth using census aggregate data for the unexposed general population cohort. They found a 1.4 fold increased risk of TB in those with diabetes (RR 1.48, 1.04 to 2.10). The largest matched case-control study by Leegaard et al. [27] in Denmark found no overall association between diabetes and TB after adjusting for a range of chronic disease and demographic indicators (OR 1.18, 0.96 to 1.45).
We report a median length of follow-up of 4.4 years which is similar to that in previous studies and on the whole, most of the patients included had reasonably well-controlled diabetes. The largest cohort studies previously reported, Kim et al. [12] and Dobler et al. [29], which were unable to adjust for individual confounding, had study periods of two and six years, respectively. It is possible that the risk of TB associated with wellcontrolled diabetes only becomes manifest over much The relatively small effect estimate for TB risk from diabetes found in our study could be due to the ability to adequately control for important individual level confounding from lifestyle and demographic risk factors using the CPRD. Our findings might also be a reflection of a successful primary care service with good chronic disease management and, therefore, reduction in attendant infection complications from diabetes. In support of this hypothesis, we see increased risks for TB disease from diabetes in those with the highest and lowest chronic disease management. Diabetes patients most frequently accessing chronic disease care could represent a group with more uncontrolled disease where general practice teams are seeking to improve diabetes management. The group with the least health care utilisation from that incentivised in UK General Practice for chronic disease management, may have more uncontrolled diabetes and have limited access to primary health care. The latter group might form part of a wider hard to reach group who are at increased risk of TB not only from diabetes but from multiple social risk factors [33]. Of note, our findings show diabetes patients receiving standard rates of chronic disease care, which in part will reflect good diabetes control, are at no increased risk of TB compared against patients without diabetes.
We did not find evidence for an effect of age, duration of diabetes or ethnicity for TB risk in diabetes patients although this could be due to type 2 error as we only had very small numbers of TB cases in some diabetes subgroups and had missing data on important confounders. Although some studies have found evidence for increasing TB risk in younger diabetes patients [12,15], other authors have not [27,29]. No previous studies have explored the effect of duration of diabetes diagnosis on risk of TB disease but if diabetes acts to increase the risk of TB infection or increases reactivation of latent disease, it would seem probable that cumulative exposure to diabetes would potentiate these risks.
The previous literature, using a variety of different markers, show mixed results for the effect of diabetes severity on the risk for TB. Leung et al. [25] stratified by glycaemic status and found those with Hba1c <7 % had no increased risk of TB compared with those without diabetes, in contrast to the subjects with Hba1c ≥7 % who were at 2.5 fold risk (HR 2.56, 1.95 to 3.35). Baker et al. [28] used the number of diabetes complications to explore the effect of diabetes severity and found that those with treated diabetes and ≥2 complications compared against a group without diabetes had a greater risk of TB (RR 3.45, 1.59 to 7.90). Dobler et al. [29] explored the effects of insulin use as a marker of severity and found that those using insulin had 2.3-fold risk compared against a general population cohort (RR 2.27, 1.41 to 3.66). In contrast, and coherent with our own data, Leegaard et al. [27] found no evidence for an association between TB risk and dysglycaemia. Again, this might reflect that diabetes patients managed in UK Primary Care have very well controlled disease, not completely captured by mean Hba1c measurements and, therefore, reduced attendant risks from infection. We aimed to compare the different types of diabetes for risk of TB. The underlying hypothesis being that type 1 diabetes represents a more severe form of diabetes and thus we might expect that it poses a greater risk for TB infection if the relationship between diabetes and TB risk is causal. Only the previous study by Leegaard et al [27] defined and explored the risk of TB for a group with type 1 diabetes. They classified patients <30-years old using insulin monotherapy and never using oral antidiabetes medications as having type 1 diabetes. Our classification differed in that we defined our type 1 cohort using incentivised diagnostic codes and additional demographic factors to age and insulin prescriptions. Leegaard et al had very small numbers of patients classified with type 1 diabetes, only three amongst their TB cases and the adjusted TB risk estimate reflected the imprecision (OR 2.59, 0. 44-15.29). Similarly, we found only one case of TB amongst our group of patients classified with type 1 diabetes and, thus, we were unable to explore the effects of type 1 diabetes further. Under-ascertainment of TB in type 1 diabetes patients within CPRD is a possible cause of our finding only one case of TB in this group. This might be due to those with type 1 diabetes mainly receiving their care in hospital out-patients clinics and notification of TB diagnoses not being returned to general practice. If a large number of cases of TB in the UK are due to reactivation of latent disease from those born in high TB burden countries [2,1], it might be that  Consultation number defined as total number of face to face consultations during patient follow-up; tertiles in order of increasing number of consultations BMI body mass index, CI confidence interval, IMD index of multiple deprivation, TB tuberculosis incidence of type 1 diabetes in these populations is low, as supported by global incidence studies [34] or that these patients suffer competing risks before possible reactivation of TB infection. Current UK guidelines advise considering treatment for latent TB infection in certain groups of adults where active disease has been ruled out but they show signs of TB infection with Mantoux positivity (≥6 mm) and without prior Bacillus Calmette-Guérin (BCG) vaccination, or strong Mantoux positivity (≥15 mm) or interferongamma release assay (IGRA) positive and with prior BCG vaccination [35]. There is no specific guidance for patients with diabetes at present.

Strengths and limitations
To our knowledge, this is the largest cohort study to date exploring the association between diabetes and TB in a general population which was able to adjust for important individual level confounding demographic, socioeconomic and lifestyle factors. By using time-updated exposure status, where previous unexposed patients could later develop incident diabetes and join the exposed cohort, we could study time-related phenomena but our design also allowed comparison between more similar groups ensuring we could explore the role of diabetes with reduced confounding from unmeasured social and healthrelated risk factors. Previous literature describe excluding  people with prior diagnoses of tuberculosis but we included this group with a suitable time-lapse as they will be amongst the highest risk groups for TB disease in the UK so our findings are more applicable to the population of interest.
Our study explores effects of important patient characteristics such as age and ethnicity and aspects of the risk factor of diabetes such as duration and severity, which have not been previously explored within one cohort. As far as we are aware this is the only study to look at how TB risk for diabetes patients varies with consultation patterns including receipt of chronic disease health care and shows how UK General Practice systems are able to identify different risk groups. The study is based within the General Practice population using routinely collected clinical data so reducing the likelihood of significant selection bias. In the UK the majority of diabetes care occurs in the primary care setting making our cohort of diabetes patients very inclusive. Where patients receive specialist diabetes care in the face of more challenging disease control, their primary care record is equally important and contemporaneous as primary care co-ordinates ongoing management, such as retinal screening and immunisation. Our study is pertinent to any UK policy seeking to identify high-risk groups in primary care suitable for latent TB infection screening using such routine electronic health data.
There is the potential for misclassification of diabetes and TB within our study as we used routinely collected clinical data without validation from consultation free text or hospital correspondence. We expect misclassification to be minimal as diabetes is an indicator condition within UK Primary Care; thus, practices are incentivised to maintain accurate diabetes patient registers. Seventy percent of our cohort with diabetes had confirmatory prescription data for anti-diabetic medications and the 30 % having no specific therapy but other recorded indicators is in agreement with previous studies [36]. We generated a specific TB Read code list to try to avoid the inclusion of non-mycobacterial disease. Although TB can present nonspecifically initially we would expect patients to seek medical attention in a setting with free access to health care. Diagnoses of TB made in secondary care are highly likely to be communicated to the General Practitioner due to the public health risk of this communicable disease, the risk of serious side-effects from antituberculous therapy and possible treatment interactions with medications prescribed in primary care. We found an incidence of TB in our unexposed cohort that was equal to that reported as the UK TB incidence for 2012 (13.9 cases per 100,000 population) [2], therefore supporting that we reliably identified cases of TB. Any misclassification of our outcome is likely to be non-differential as unexposed patients could join the exposed group during follow-up and "inactive" controls were excluded who are more likely to be misclassified as unexposed. Furthermore, there is no current UK guideline advising screening of patients with diabetes for TB and broader awareness for the association between the two conditions was only recently raised [37], thus any bias will tend towards underestimating associations.
There is a risk of reverse causality with diagnoses made close together and with a condition such as TB which may initially present with non-specific symptoms and can itself cause a reactive hyperglycaemia associated with acute infection [38]. TB diagnosis and treatment delays can be as long as three months in high-income settings [39,40] but potential delays in diagnosis of diabetes far exceed this with estimates between four to seven years [41]. Thus, it is far more likely in the situation where a diagnosis of TB closely follows a diagnosis of diabetes that the condition of diabetes or chronic hyperglycaemia has been ongoing for some time previous. Using similar considerations, we did not introduce a minimum follow-up time after diabetes diagnosis. There is no current evidence-base to guide us in considering a mechanistically plausible minimum length of follow-up within which TB risk increases and introducing an arbitrary follow-up time will only produce survivor bias. Information on country of origin is not routinely recorded within the primary care clinical record and may confound the association between diabetes and TB. Lifestyle factors predisposing a person to diabetes associated with the country of origin are likely superseded by those associated with residence in the UK and remaining lifestyle factors or genetic predisposition could be crudely captured by ethnicity descriptors. However, there is likely residual confounding especially as ethnicity was only broadly described within our study. We did not identify and adjust for HIV status within our study as this is recognised to be under-reported in General Practice records [42]. HIV and its treatment have been associated with metabolic derangement including diabetes [43] and HIV is a strong risk factor for TB disease [44,45]. We did not identify and exclude patients with recorded HIV as this may have differentially affected TB patients with diabetes as we expected these patients might be more likely to share a diagnosis of HIV with their General Practitioner due to the risk of multiple drug interactions between their HIV and diabetes medications. We did not adjust for BCG vaccination status. The UK immunisation program recommends childhood BCG immunisation for those at higher risk of TB [46]. The protection offered by BCG immunisation wanes with time [47], and is likely to offer little protection to the age groups predominantly at risk of TB associated with diabetes.
We performed multiple imputation for missing data on important covariates but there is likely residual confounding from non-differential misclassification of these data. Our main findings were robust to the inclusion of imputed missing data. There were fewer missing data for patients with diabetes, which is to be expected as these represent a group that are better characterised within routine primary care data. Missing data produce loss of study power, despite multiple imputation methods, which calculate standard errors reflecting the increased uncertainty. The study also had insufficient power, risking type 2 error, for the investigation of some interactions as we had small numbers of outcome data in some of our subgroup analyses. Thus, larger well-characterised cohorts in higher TB incidence settings would be needed to study these possible interactions.

Conclusions and public health implications
We find evidence for an increased risk of TB amongst those with diabetes but in contrast with many other studies mostly from high TB burden countries, the risk is only modest in our UK General Practice population. Our findings may reflect good management of diabetes in UK primary care and, therefore, reduced attendant infection risks from this chronic condition. Diabetes alone is not a sufficient indication for screening for latent TB infection in the UK. Importantly we also identified a group of patients with diabetes and reduced primary health care utilisation who are at increased risk of TB. It appears these might form a higher risk hard to reach group in terms of accessing them through primary care. Our findings can inform the development of such a primary care latent TB infection screening program hoping to reach high-risk groups and how effective such an approach might be for improving UK TB control.