
Predicting COPD 1-year mortality using prognostic predictors routinely measured in primary care



Chronic obstructive pulmonary disease (COPD) is a major cause of mortality. Patients with advanced disease often have a poor quality of life, such that guidelines recommend providing palliative care in their last year of life. Uptake and use of palliative care in advanced COPD is low; difficulty in predicting 1-year mortality is thought to be a major contributing factor.


We identified two primary care COPD cohorts using UK electronic healthcare records (Clinical Practice Research Datalink). The first cohort was randomly split into training and test sets of equal size. An external dataset was drawn from a second cohort. A risk model to predict mortality within 12 months was derived from the training set using Cox regression with backwards elimination. The model was given the acronym BARC based on putative prognostic factors including body mass index and blood results (B), age (A), respiratory variables (airflow obstruction, exacerbations, smoking) (R) and comorbidities (C). The predictive performance of the BARC index was validated in the test set and external dataset by assessing calibration and discrimination. The observed and expected probabilities of death were assessed for increasing quartiles of mortality risk (very low risk, low risk, moderate risk, high risk). The BARC index was compared to the established index scores body mass index, obstruction, dyspnoea and exacerbations (BODEx), dyspnoea, obstruction, smoking and exacerbations (DOSE) and age, dyspnoea and obstruction (ADO).


Fifty-four thousand nine hundred ninety patients were eligible from the first cohort and 4931 from the second cohort. Eighteen variables were included in the BARC, including age, airflow obstruction, body mass index, smoking, exacerbations and comorbidities. The risk model had acceptable predictive performance (test set: C-index = 0.79, 95% CI 0.78–0.81, D-statistic = 1.87, 95% CI 1.77–1.96, calibration slope = 0.95, 95% CI 0.9–0.99; external dataset: C-index = 0.67, 95% CI 0.65–0.7, D-statistic = 0.98, 95% CI 0.8–1.2, calibration slope = 0.54, 95% CI 0.45–0.64) and acceptable accuracy predicting the probability of death (probability of death in 1 year in the high-risk group, test set: expected = 0.31, observed = 0.30; external dataset: expected = 0.22, observed = 0.27). The BARC compared favourably to existing index scores that can also be applied without specialist respiratory variables (area under the curve, test set: BARC = 0.78, 95% CI 0.76–0.79; BODEx = 0.48, 95% CI 0.45–0.51; DOSE = 0.60, 95% CI 0.57–0.61; ADO = 0.68, 95% CI 0.66–0.69; external dataset: BARC = 0.70, 95% CI 0.67–0.72; BODEx = 0.41, 95% CI 0.38–0.45; DOSE = 0.52, 95% CI 0.49–0.55; ADO = 0.57, 95% CI 0.54–0.60).


The BARC index performed better than existing tools in predicting 1-year mortality. Critically, the risk score only requires routinely collected non-specialist information which, therefore, could help identify patients seen in primary care that may benefit from palliative care.



Chronic obstructive pulmonary disease (COPD) is associated with significant mortality and morbidity and is one of the most prevalent chronic diseases globally; in the UK, it is the fifth highest cause of death [1, 2]. As COPD progresses, patients experience significant decreases in functional capacity, quality of life, social ability and psychological well-being, impairments that are analogous to those from lung cancer. There is growing evidence and increasing expert opinion that palliative care should have a prominent role in patients with end-stage COPD [3, 4]. UK clinical guidelines (National Health Service, National Institute for Health and Care Excellence, National Council for Palliative Care) all recommend starting palliative care in the year before people die, with the goal of both improving their quality of life and addressing end-of-life planning [3, 5]. The healthcare workers best placed to enable this are often those in primary care. However, we have previously shown in the UK that only 1 in 5 COPD patients within the last year of life are provided palliative care, and a recent Canadian study of COPD patients with advanced disease found a similarly low proportion [6, 7]. One major barrier to provision is the challenge of predicting patient survival, due to the irregular disease trajectory of COPD, which is usually one of slow decline, punctuated by sudden unpredictable exacerbations that often end in death [4, 8,9,10]. This is in contrast to lung cancer, where there is often a reasonable level of physical function until a short period of relatively predictable decline. This may partly explain why COPD patients are much less likely to receive palliative care than patients with lung cancer [7].

Many derived prognostic indices can help long-term mortality prediction in COPD, but the ability to predict death at 12 months is currently limited. This is thought to be partly because some of the scores were originally derived to predict mortality over several years, and partly because they omit important prognostic factors, such as comorbidities [4, 8, 11]. Furthermore, these risk scores have been derived using subgroups of patients, in particular patients from secondary care, where more specialised test results are available. Hence, these indices often cannot be applied to the general COPD population. For example, the BODE index is the most commonly used, yet it requires knowledge of a patient’s exercise capacity, measured by the 6-min walk test, which is not routinely carried out in a primary care setting. This limitation prevents those who most commonly attend to COPD patients, healthcare workers within primary care, from identifying COPD patients who would benefit from palliative care. Lastly, the simplicity of the most commonly used predictive indexes may impede their predictive ability, such that addition of clinical variables increased their performance [11]. This seems especially relevant when adding comorbidities as putative prognostic predictors; comorbidities such as cardiovascular disease, cerebrovascular disease and lung cancer are associated with increased mortality and are highly prevalent in COPD patients. Moreover, there is evidence to suggest COPD patients are more likely to die from their comorbidities than the disease itself [12].

The aim of this study was to devise a prognostic tool, based on routinely collected variables within primary care, which could provide a 12-month mortality prognosis for general COPD patients. To carry this out, we used the UK’s largest longitudinal database of electronic healthcare records and incorporated in our analysis all recorded putative predictive risk factors; these risk factors were based on previous published indices and risk scores.


Data sources

Data from the Clinical Practice Research Datalink (CPRD) were used to derive the prognostic risk model. CPRD currently covers more than 11 million patients, who are representative of the UK population with respect to gender and age, and contains primary care clinical, prescription and test data [13]. To obtain data on exacerbations, socioeconomic status and mortality, linkage to Hospital Episode Statistics (HES), Index of Multiple Deprivation (IMD) and Office for National Statistics (ONS) data, respectively, was obtained; just over 60% of CPRD practices have patient-level linkage to HES-IMD-ONS.

Study populations

All patients had a COPD diagnosis as determined using a previously validated algorithm [14]. Patients’ data were eligible for inclusion after the latest of their COPD diagnosis date, the date the GP practice began recording research quality data, their continuous CPRD registration date, or cohort start date. Patients’ data were censored at the earliest of their date of death, end of study (26 June 2015), the GP practice last collection date or the date of transfer out of a CPRD-linked practice. Two study populations were drawn. The first had a cohort start date of 1 January 2010, and an arbitrary index date (time from which the 1-year mortality prognosis model could be applied) set as the first annual COPD review that occurred 12 months after eligibility. This cohort was used to derive the model and internally validate the model.
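The eligibility and censoring rules above amount to taking the latest of several start dates and the earliest of several end dates. A minimal sketch follows; the function and argument names are illustrative assumptions and are not taken from the study code.

```python
from datetime import date

STUDY_END = date(2015, 6, 26)  # end of study, as stated in the text


def follow_up_window(diagnosis, research_quality, registration, cohort_start,
                     death=None, last_collection=None, transfer_out=None):
    """Return (start, end) of eligible follow-up, or None if the window is empty.

    Start is the latest of the four eligibility dates; end is the earliest of
    death, end of study, last collection and transfer out (missing dates are skipped).
    """
    start = max(diagnosis, research_quality, registration, cohort_start)
    end_candidates = [STUDY_END]
    end_candidates += [d for d in (death, last_collection, transfer_out) if d is not None]
    end = min(end_candidates)
    return (start, end) if start <= end else None
```

Under this sketch, a patient diagnosed in 2008 who registered in 2009 would enter the first cohort on the cohort start date of 1 January 2010.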

A second population was drawn that did not have a recorded annual review date and had data drawn from an earlier time period. The second cohort start date was 1 January 2004, and the index date was arbitrarily set as the first day after 12 months of eligible data had occurred. Patients were excluded if they had a recorded annual review date between 1 January 2004 and 26 June 2015, or if they had missing values required for the model.

Outcome and prognostic predictors

Death was defined as mortality from any cause. The following prognostic predictors were chosen, based on published indices and risk scores, using appropriate Read codes (codes are available upon request): history of smoking (current or ex-smoker), MRC dyspnoea score, bereavement, myocardial infarction, asthma, osteoporosis, diabetes, hypertension, dementia, lung cancer, heart failure, stroke, anxiety, depression, atrial fibrillation, pulmonary embolism, coronary artery disease, gastric/duodenal ulcer disease, breast cancer, pancreatic cancer, pulmonary fibrosis, long-term oxygen therapy, influenza and pneumococcal vaccinations (the latter can be given every 5 years; if records did not extend beyond 5 years and did not show vaccination, this was recorded as missing) [8]. The COTE score (based on the presence of multiple comorbidities, including lung fibrosis, pancreatic cancer and diabetes with neuropathy) was also calculated [15]. Lung fibrosis was defined as any interstitial lung disease (ILD), e.g. sarcoidosis, idiopathic pulmonary fibrosis, rheumatoid arthritis-associated ILD. Prescription data were used to identify patients who had ever used an inhaled corticosteroid (ICS), long-acting beta agonist (LABA), or long-acting muscarinic antagonist (LAMA). Test results were used to identify the following variables: FEV1, GOLD staging (FEV1 and FVC), C-reactive protein (CRP), albumin (low = < 35 g/L), haemoglobin, fibrinogen, platelets (low = < 150 × 10⁹/L, high = > 400 × 10⁹/L) and creatinine; creatinine above 120 μmol/L for males, or 110 μmol/L for females, was used to define chronic kidney disease (CKD). BMI was measured as kg/m² (underweight < 19, normal = 19–25, overweight = 25–30, obese ≥ 30). Exacerbations, treated within primary (labelled as moderate) or secondary care (labelled as severe), were identified using a validated algorithm [16, 17]. Severe exacerbations were categorised as none, 1–2 hospitalisations annually and ≥ 3 hospitalisations annually.
The rules for variable inclusion are defined in Additional file 1: Table S1.
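The categorical thresholds described above can be written down directly. The sketch below is illustrative only: the function names are ours, and the handling of the shared boundaries (a BMI of exactly 25 or 30) is an assumption, since the text gives overlapping ranges.

```python
def bmi_category(bmi):
    """BMI in kg/m2; a value of exactly 25 is assumed to count as overweight."""
    if bmi < 19:
        return "underweight"
    if bmi < 25:
        return "normal"
    if bmi < 30:
        return "overweight"
    return "obese"


def has_ckd(creatinine_umol_l, sex):
    """CKD flag: creatinine above 120 umol/L (males) or 110 umol/L (females)."""
    threshold = 120 if sex == "male" else 110
    return creatinine_umol_l > threshold


def platelet_category(count_e9_per_l):
    """Platelet count in units of 10^9/L: low below 150, high above 400."""
    if count_e9_per_l < 150:
        return "low"
    if count_e9_per_l > 400:
        return "high"
    return "normal"


def severe_exacerbation_category(hospitalisations_per_year):
    """None, 1-2, or 3 or more hospitalised exacerbations annually."""
    if hospitalisations_per_year == 0:
        return "none"
    return "1-2" if hospitalisations_per_year <= 2 else ">=3"
```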

Multivariable prognostic scores

Only three of the nine multivariable scores, which have previously been used to address mortality in unselected COPD patients at 1 year, were able to be derived from routinely collected primary care data. These were ADO (age, dyspnoea and airflow obstruction), BODEx (BMI, airflow obstruction, dyspnoea and exacerbations) and DOSE (dyspnoea, airflow obstruction, smoking status and exacerbations). The scores were derived as per original publication, using variables as defined above (MRC dyspnoea score, FEV1, smoking status, exacerbations, BMI) [18,19,20].

Modelling the putative prognostic predictors

The dataset was randomly divided equally into two datasets: a training set, used to derive the model, and a test set, used to internally validate the risk model.
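A random equal split of this kind can be sketched as follows. The seed and the use of patient identifiers are assumptions made for reproducibility of the example; the study's own randomisation procedure is not described in more detail.

```python
import random


def split_half(patient_ids, seed=0):
    """Shuffle the identifiers and split them into two halves (training, test)."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed so the split can be reproduced
    rng.shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]


training_ids, test_ids = split_half(range(100))
```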

Variables exceeding 50% missing were excluded from the model. An imputation model was defined for each variable with ≤ 50% missing data. Data were assumed to be missing at random, and values for the missing predictors were imputed using multiple imputation techniques based on chained equations [21]. A total of 10 imputed datasets were generated.
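The 50% missingness rule can be illustrated with a small helper that sorts variables into complete, to-be-imputed and excluded. This is a sketch on toy data; `None` stands in for a missing value, and the imputation step itself (chained equations) is not reproduced here.

```python
def missing_fraction(values):
    """Share of entries recorded as None (our stand-in for a missing value)."""
    return sum(v is None for v in values) / len(values)


def partition_by_missingness(columns, max_missing=0.5):
    """Split variable names into complete, to-impute and excluded lists."""
    complete, to_impute, excluded = [], [], []
    for name, values in columns.items():
        fraction = missing_fraction(values)
        if fraction == 0:
            complete.append(name)
        elif fraction <= max_missing:
            to_impute.append(name)
        else:
            excluded.append(name)
    return complete, to_impute, excluded


# Toy data, invented for illustration only.
toy_columns = {
    "age": [70, 65, 80, 72],
    "fev1": [1.5, None, 1.2, 1.4],   # 25% missing: imputed
    "crp": [None, None, None, 3.0],  # 75% missing: excluded
}
complete, to_impute, excluded = partition_by_missingness(toy_columns)
```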

To derive the risk model, Cox regression models were fitted using the data from the training set with all predictors (with the exclusion of the COTE scores). Backwards elimination with a stack approach [21] was used, using a 5% significance level for variable selection and weights equal to 1/10 for each one of the imputed training datasets. The coefficient estimates for the final model were combined from the imputed datasets using Rubin’s rules [22]. Proportional hazard assumptions were tested for the final model.
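Rubin's rules pool a coefficient across imputed datasets by averaging the estimates and combining within-imputation and between-imputation variance. The sketch below shows the standard formulas for a single coefficient; the function name and inputs are illustrative, and it does not reproduce the stacked variable-selection step.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean([se ** 2 for se in std_errors])  # average sampling variance
    between = variance(estimates)                  # sample variance, divisor m - 1
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)
```

When all imputations give the same estimate, the between-imputation term is zero and the pooled standard error reduces to the within-imputation value.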

The probability of mortality at 1 year for a patient can be calculated using the following equation, derived from the Cox proportional hazards model:

$$ P\left(\mathrm{death}\ \mathrm{at}\ 1\ \mathrm{year}\right)=1-{\left({S}_0(t)\right)}^{\exp \left(\mathrm{prognostic}\ \mathrm{index}\right)}, $$

where S0(t) is the baseline survival probability at time t (i.e. at 1 year in this study). The prognostic index, i.e. the linear predictor of the Cox model, is the quantity we used as our proposed index. The index was given the acronym BARC based on putative prognostic factors including body mass index and blood results (B), age (A), respiratory variables (airflow obstruction, exacerbations, smoking) (R) and comorbidities (C).
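To make the relationship between the prognostic index and the predicted risk concrete, the sketch below computes a linear predictor and converts it to a probability of death. The coefficients, covariate names and the baseline survival of 0.98 are hypothetical values chosen for illustration; they are not the fitted BARC values.

```python
from math import exp

# Hypothetical coefficients for illustration only; they are not the fitted model.
EXAMPLE_COEFFICIENTS = {
    "age_years_above_65": 0.05,
    "current_smoker": 0.30,
    "lung_cancer": 1.10,
}


def prognostic_index(coefficients, patient):
    """Linear predictor: sum of coefficient times covariate value."""
    return sum(beta * patient[name] for name, beta in coefficients.items())


def death_probability(baseline_survival, index):
    """P(death by time t) = 1 - S0(t) ** exp(prognostic index)."""
    return 1.0 - baseline_survival ** exp(index)


patient = {"age_years_above_65": 5, "current_smoker": 1, "lung_cancer": 0}
index = prognostic_index(EXAMPLE_COEFFICIENTS, patient)
risk = death_probability(0.98, index)  # 0.98 is an assumed baseline survival
```

A higher index always gives a higher predicted risk, because the baseline survival is raised to a larger power.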

Validation of the risk model

To validate the predictive ability of the risk model at 12 months, we calculated the BARC index in the test set using the coefficients obtained in the development phase. The model was validated internally in the test dataset and externally in the external dataset (drawn from the second COPD cohort). Measures assessing calibration (calibration slope) and discrimination (Harrell’s C-index and D-statistic) were calculated [23, 24, 25]. Calibration slope assesses the agreement between predicted and observed risks. A calibration slope of 1 suggests perfect calibration, while a value diverging from 1 is indicative of poorer agreement. A value of 0.5 for the C-index indicates no discrimination, and 1 indicates perfect discrimination. A model with no discriminatory ability will produce a D value equal to 0, and better separation is achieved with higher values. In the test set, the performance measures were estimated in each imputed dataset and overall measures were calculated by combining the estimates using Rubin’s rules; the same measures were also calculated in the external dataset.
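As an illustration of the discrimination measure, Harrell's C-index can be computed by comparing every usable pair of patients. The sketch below is a simple quadratic-time version that ignores tied survival times; it is not the routine used in the study, and the variable names are ours.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's C-index for right-censored data (tied times are not handled).

    A pair (i, j) is usable when patient i died and did so before patient j's
    last observed time. The pair is concordant when i has the higher risk score;
    tied scores count as half.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / usable
```

Scores that rank patients perfectly by time of death give 1, uninformative (constant) scores give 0.5, and a perfectly reversed ranking gives 0.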

Graphical illustration of calibration is given by comparing observed (Kaplan–Meier) and predicted survival probabilities in several prognostic groups. Groups were derived by placing cut points on the BARC based on meaningful quantiles [26, 27]. We categorised BARC index values at the 1st quartile, median and 3rd quartile of the time of death, i.e. not counting censored observations, to create four risk groups.
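One way to form four risk groups from three cut points is sketched below. It assumes one reading of the description above, namely that the quartiles are taken over the index values of patients who died, with censored patients left out; the exact quantile definition used in the study is not stated, so the `inclusive` method is also an assumption.

```python
from bisect import bisect_right
from statistics import quantiles

LABELS = ["very low", "low", "moderate", "high"]


def risk_cut_points(index_values, died):
    """Quartile cut points of the index among patients who died (assumed reading)."""
    deaths_only = [value for value, d in zip(index_values, died) if d]
    return quantiles(deaths_only, n=4, method="inclusive")


def risk_group(index_value, cut_points):
    """Map an index value to one of the four groups defined by the cut points."""
    return LABELS[bisect_right(cut_points, index_value)]


# Toy values, invented for illustration; the last patient is censored.
cuts = risk_cut_points([1, 2, 3, 4, 5, 10], [True, True, True, True, True, False])
```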

Comparing observed and predicted mortality probability

The observed mortality probability was calculated by the proportion of deceased patients in the sample within a year. The same four groups used to graphically calibrate the model were used to classify subjects in very low, low, moderate and high risk [28]. Mortality could then be compared for patients in each risk group between that observed and the predicted mortality using the BARC index.
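The comparison described above can be sketched as a per-group table of observed and predicted mortality. This example assumes every patient has a known vital status at 1 year (no censoring within the year) and uses invented inputs; the function name is ours.

```python
from collections import defaultdict


def observed_vs_predicted(groups, died_within_year, predicted_risk):
    """Per risk group: (observed proportion who died, mean predicted probability)."""
    totals = defaultdict(lambda: [0, 0, 0.0])  # patients, deaths, sum of predictions
    for group, died, risk in zip(groups, died_within_year, predicted_risk):
        entry = totals[group]
        entry[0] += 1
        entry[1] += int(died)
        entry[2] += risk
    return {
        group: (deaths / n, risk_sum / n)
        for group, (n, deaths, risk_sum) in totals.items()
    }


# Invented example: four patients in two risk groups.
summary = observed_vs_predicted(
    ["high", "high", "low", "low"],
    [True, False, False, False],
    [0.4, 0.2, 0.05, 0.15],
)
```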

Comparing the risk model with established multivariable prognostic scores

To compare the predictive capability of the BARC index with that of ADO, BODEx and DOSE scores, we plotted the receiver operating characteristic (ROC) curves and calculated their associated area under the curves (AUC) for the survival threshold of interest, i.e. 1 year. As a sensitivity analysis, the scores were compared on the first cohort (training and test set) without lung cancer.
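For a fixed 1-year horizon, the AUC equals the probability that a randomly chosen patient who died scores higher than a randomly chosen survivor. The sketch below computes this directly from pairs. It assumes complete 1-year follow-up and is not necessarily the estimator used in the study, which may have accounted for censoring.

```python
def auc(scores, died_within_year):
    """AUC as the share of (death, survivor) pairs ranked correctly; ties count half."""
    deaths = [s for s, d in zip(scores, died_within_year) if d]
    survivors = [s for s, d in zip(scores, died_within_year) if not d]
    wins = 0.0
    for death_score in deaths:
        for survivor_score in survivors:
            if death_score > survivor_score:
                wins += 1.0
            elif death_score == survivor_score:
                wins += 0.5
    return wins / (len(deaths) * len(survivors))
```

An AUC below 0.5, as reported for BODEx, means that patients who died tended to have lower scores than survivors.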

All statistical analyses were carried out using Stata (version 15) and R (version 3.5.0).


Characteristics of the COPD populations

There were 54,990 eligible COPD patients in the first cohort, from which the training and test datasets were drawn, of whom 21% died during study follow-up; median follow-up was 2.7 years (Additional file 1). The cohort had a median age of 70 years, around half were male, the median BMI fell in the overweight range and the median FEV1 was 1.48 L (Table 1). All of the cohort had a history of at least one documented comorbidity. Only 1.2% of the cohort had a high COTE index. As might be expected, the patients who died were slightly older, had a lower FEV1, had experienced more moderate and severe exacerbations, were on more inhaled medication and had in general more comorbidities. There were 4931 eligible COPD patients in the external validation dataset (Additional file 2: Figure S1), drawn from the second COPD cohort, of whom 29% died during study follow-up; median follow-up was 2.1 years. The dataset had a median age of 71 years, 55% were male, and the median FEV1 was 1.52 L (Table 2). The patients who died were older, had a lower FEV1 and had more exacerbations and comorbidities.

Table 1 Demographic and clinical characteristics of the first COPD cohort (training and test datasets)
Table 2 Demographic and clinical characteristics of the external dataset

Prevalence of prognostic predictors

In the first primary care COPD cohort, there was < 5% missing data for the most commonly applied prognostic predictors (MRC dyspnoea score, BMI, smoking status, exacerbation history and age), except for FEV1, for which 20% was missing. Other predictors that had missing values were blood tests: CRP had 79% missing; albumin, haemoglobin and platelets had around 30% missing; and creatinine had 23% missing. Only 16% of patients did not have a blood test within 12 months of their annual review, and only 3% of tests were taken within 7 days either side of an exacerbation. Seventy percent of patients had missing data for the pneumococcal vaccine. All other variables, unless derived from the abovementioned variables, were < 5% missing.

Identification of the risk model

After imputation for the missing values and stepwise elimination, 18 different variables remained in the model, including age, BMI, FEV1, severe exacerbations, smoking status, multiple comorbidities, haemoglobin and platelets (Table 3).

Table 3 Estimated beta coefficients and their standard errors (SE) for the final Cox proportional hazards model

The marginal predictions for the risk of death at 1 year were obtained by the following equation

$$ P\left(\mathrm{death}\ \mathrm{at}\ 1\ \mathrm{year}\right)=1-{(0.9837)}^{\exp \left(\mathrm{prognostic}\ \mathrm{index}\right)}, $$

where the baseline survival is estimated by means of a fractional polynomial and the prognostic index is the linear combination of the coefficients given in Table 3 with the values of the corresponding variables.
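As a worked illustration of this equation, consider two prognostic index values chosen purely for illustration (they do not correspond to any patient or to specific coefficients in Table 3). For an index of 0 the exponent is 1, and for an index of 2 the exponent is exp(2) ≈ 7.389:

$$ P\left(\mathrm{death}\ \mathrm{at}\ 1\ \mathrm{year}\mid \mathrm{index}=0\right)=1-0.9837=0.0163, $$

$$ P\left(\mathrm{death}\ \mathrm{at}\ 1\ \mathrm{year}\mid \mathrm{index}=2\right)=1-{(0.9837)}^{7.389}\approx 1-0.886=0.114. $$

Under these assumed values, an increase of 2 in the prognostic index raises the predicted 1-year risk from about 1.6% to about 11%.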

Validation of the risk model

The predictive performance and calibration of the BARC index were high in the test dataset and satisfactory in the external dataset (test set: C-index 0.79, 95% CI 0.78–0.81; D-statistic 1.9, 95% CI 1.8–2.0; calibration slope 0.95, 95% CI 0.90–0.99, and external dataset: C-index 0.67, 95% CI 0.65–0.70; D-statistic 0.98, 95% CI 0.83–1.14; calibration slope 0.54, 95% CI 0.45–0.64) (Table 4). We depict the observed and fitted survival probabilities, with pointwise 95% confidence intervals for the latter, at 3, 6 and 9 months, in addition to 1 year, to give a visual trend of the survival probabilities. The graphical analysis confirms the satisfactory calibration of the BARC index (Additional file 3: Figure S2), even if the predictions in some of the groups were slightly higher than the observed values.

Table 4 Validation at 12 months using the test and external validation datasets

Comparing mortality between that observed and predicted

There was an increasing probability of death with each increasing risk group (Fig. 1). The BARC index estimated the probability of dying to within 1% of the observed probability in the high-risk group, in the training and test sets, and within 5% in the external dataset.

Fig. 1 Mortality probability by PI group in training, test and external validation datasets

Comparing the BARC to ADO, BODEx and DOSE

The ROC curve of the BARC index was consistently above the curves of the ADO, BODEx and DOSE scores, showing that our model performed better in the test dataset than the three scores (Fig. 2 and Additional file 4: Figure S3). This result is confirmed by the associated AUCs and their corresponding 95% confidence intervals (Table 5). The BARC index still performed better in the sensitivity analysis, which excluded lung cancer patients (Additional file 1: Tables S2-S4).

Fig. 2 Receiver operating curves comparing the BARC index with ADO, BODEx and DOSE indexes

Table 5 AUCs for the BARC, ADO, BODEx and DOSE indexes


From a large cohort of primary care COPD patients, we have derived a 12-month mortality predictive model, the BARC index, with acceptable discrimination and calibration when externally validated. The predictive performance of the model also compared favourably to the commonly used ADO, DOSE and BODEx indexes. The BARC index is comprised of variables commonly included in established predictive indexes, such as airway obstruction, age, smoking status and dyspnoea assessment, as well as several comorbidities and blood biomarkers linked to general health (including serum albumin and haemoglobin).

A significant difference between our more favourable model and established scores is the addition of comorbidities. The presence of comorbid disease is common, with at least 80% of COPD patients estimated to have one or more additional chronic disorders; indeed, those within 1 year of death have an even larger proportion with comorbid disease [7, 29]. It is also associated with significantly increased mortality; up to two thirds of deaths are thought to be from comorbid disease rather than COPD [12, 15, 30]. Perhaps unexpectedly, most cardiovascular comorbidities were not included in the model at the 5% significance level; however, this may be because this model addressed shorter-term 12-month mortality, whereas cardiovascular disease has relatively longer-term effects than some of the comorbidities that were included, such as cirrhosis, lung cancer and cerebrovascular disease. Furthermore, cardiovascular mortality continues to decrease [31]. The specific comorbidities index (COTE) uses 12 comorbidities, but no respiratory parameters, and provides a good 5-year mortality prediction [15]. However, COTE has not been assessed for predicting mortality at 1 year, and as it was derived in secondary care, it requires specialised knowledge on disease status that is not always available. In this respect, the CODEX index (based on the Charlson index and BODEx), derived from a selective cohort of hospitalised COPD patients, also requires in-depth knowledge on comorbidities [18]. In comparison, many variables that are associated with COPD severity, including medication use, moderate exacerbations and GOLD staging, were not included in the model at the 5% significance level. This in itself points to the complexity of understanding COPD mortality and highlights again the influence of comorbid conditions on mortality.

One advantage of the BARC index is that it is practical and user-friendly, as it incorporates routinely collected data easily available within primary care, which could also allow the risk score to be embedded in the electronic healthcare records system. In addition, because it was derived and validated in two large nationally representative COPD populations, and nearly 90% of the UK population is registered in primary care, this aids the generalisability of the risk score to all COPD populations. The cohorts used had similar mortality rates to other COPD cohorts (data not shown) [11, 32]. However, the generalisability could have been reduced as we used an annual review as the arbitrary time point from which to start the study; 20% of the cohort did not have one during their study period, and this was largely due to the short length of research quality data available (i.e. they had only just over 1 year of CPRD data, which was not long enough to have 1 year of data and an annual review) rather than lack of attendance at their annual review. This generalisability issue was overcome as the external dataset contained patients without an annual review during that time period. Another possible limitation of the derivation of the risk score is that five variables (FEV1, albumin, haemoglobin, platelets and creatinine) had to be imputed due to missing data, which potentially could have led to misclassification, though the percentage missing was only around 10 to 30%. The low percentage of missing data in the first cohort was likely due to some selection bias as these patients all had an annual review; there was a higher percentage missing in the second cohort, with 15% missing FEV1 and 50% missing MRC dyspnoea score. In the first cohort, many of the missing variables appeared to be missing due to a relatively short follow-up period before death (in the UK FEV1 is routinely measured every 18 months); nevertheless, FEV1 can readily be measured by spirometry if required for the index.
Although blood tests were missing from some patients, these provided significant predictive value to the model and were mostly performed less than a year before the annual review date. Moreover, we feel it is likely in patients where a GP is considering this index, they will have had a blood test in the recent past; if not, this information can easily be obtained from a simple single blood test. A strength of this study is the use of such a large cohort of patients to derive the model from; this also provided the power to assess less-common comorbidities (including cirrhosis and dementia) as statistically significant prognostic markers that may not have been found in a smaller sample size.

Information on the end of life has been identified as of intrinsic interest to patients, carers and healthcare professionals, but the lack of the ability to approximately predict mortality is thought to be one of the key barriers to providing this information. Therefore, the identification of this accurate, user-friendly, predictive model that is applicable in primary care, could aid communication, shared decision-making and ultimately a palliative care approach directed from primary care. Our findings suggest the currently used predictive scores may be too simple and that incorporating more clinical variables, in particular comorbidities, significantly improves predictive performance. Of course, a risk score only aids decision-making, and physicians should use their clinical acumen and discuss with patients and their families to decide when palliative care is appropriate (it may be appropriate long before the last year of life); a risk score should not be used in isolation as a screening tool for palliative care [28].


This is the first published prognostic tool designed to predict all-cause mortality within 12 months in patients with COPD. In addition, its applicability in primary care, and validation in a large general COPD cohort, gives the BARC index significant clinical and practical advantages over previously identified risk indices.



ADO: Age, dyspnoea and obstruction

BARC: Body mass index, blood test, age, respiratory variable and comorbidities

BODEx: Body mass index, obstructive, dyspnoea and exacerbations

COPD: Chronic obstructive pulmonary disease

CPRD: Clinical Practice Research Datalink

CRP: C-reactive protein

DOSE: Dyspnoea, obstruction, smoking and exacerbations

GP: General practitioner

HES: Hospital Episode Statistics

ICS: Inhaled corticosteroid

IMD: Index of Multiple Deprivation

LABA: Long-acting beta agonist

LAMA: Long-acting muscarinic antagonist

NHS: National Health Service

ONS: Office of National Statistics

SD: Standard deviation

UK: United Kingdom


  1. Snell N, et al. S32 epidemiology of chronic obstructive pulmonary disease (COPD) in the UK: findings from the British lung foundation’s ‘respiratory health of the nation’ project. Thorax. 2016;71:A20.1–A20.

    Article  Google Scholar 

  2. GBD 2015 Chronic Respiratory Disease Collaborators, J. B, et al. Global, regional, and national deaths, prevalence, disability-adjusted life years, and years lived with disability for chronic obstructive pulmonary disease and asthma, 1990-2015: a systematic analysis for the Global Burden of Disease Study 2015. Lancet Respir Med. 2017;5:691–706.

    Article  Google Scholar 

  3. National Institute for Health and Care Excellence (NICE). Chronic Obstructive Pulmonary Disease in over 16s: diagnosis and managment (NICE Guideline). 2018.

  4. Maddocks M, Lovell N, Booth S, Man WD-C, Higginson IJ. Palliative care and management of troublesome symptoms for people with chronic obstructive pulmonary disease. Lancet. 2017;390:988–1002.

    Article  Google Scholar 

  5. National Council for Palliative Care. Commissioning End of Life Care. (2011).

    Google Scholar 

  6. Gershon AS, et al. End of life strategies among patients with advanced chronic obstructive pulmonary disease (COPD). AJRCCM Artic Press. 2018:03–592.

  7. Bloom CI, et al. Low uptake of palliative care for COPD patients within primary care in the UK. Eur Respir J. 2018;51:1701879.

    Article  Google Scholar 

  8. Smith L-JE, et al. Prognostic variables and scores identifying the end of life in COPD: a systematic review. Int J Chron Obstruct Pulmon Dis. 2017;12:2239–56.

    Article  Google Scholar 

  9. Spathis A, Booth S. End of life care in chronic obstructive pulmonary disease: in search of a good death. Int J Chron Obstruct Pulmon Dis. 2008;3:11–29.

    Article  Google Scholar 

  10. Halpin DM. Palliative care for COPD: signs of progress, but still a long way to go. AJRCCM Artic Press. 2018:05–955.

  11. Morales DR, et al. External validation of ADO, DOSE, COTE and CODEX at predicting death in primary care patients with COPD using standard and machine learning approaches. Respir Med. 2018;138:150–5.

    Article  Google Scholar 

  12. McGarvey LP, et al. Ascertainment of cause-specific mortality in COPD: operations of the TORCH Clinical Endpoint Committee. Thorax. 2007;62:411–5.

    Article  Google Scholar 

  13. Herrett E, et al. Data resource profile: clinical practice research datalink (CPRD). Int J Epidemiol. 2015;44:827–36.

    Article  Google Scholar 

  14. Quint JK, et al. Validation of chronic obstructive pulmonary disease recording in the clinical practice research datalink (CPRD-GOLD). BMJ Open. 2014;4:–e005540.

  15. Divo M, et al. Comorbidities and risk of mortality in patients with chronic obstructive pulmonary disease. Am J Respir Crit Care Med. 2012;186:155–61.

    Article  Google Scholar 

  16. Rothnie KJ, et al. Recording of hospitalizations for acute exacerbations of COPD in UK electronic health care records. Clin Epidemiol. 2016;8:771–82.

    Article  Google Scholar 

  17. Rothnie KJ, et al. Validation of the recording of acute exacerbations of COPD in UK primary care electronic healthcare records. PLoS One. 2016;11:e0151357.

    Article  Google Scholar 

  18. Jones RC, et al. Derivation and validation of a composite index of severity in chronic obstructive pulmonary disease: the DOSE index. Am J Respir Crit Care Med. 2009;180:1189–95.

    Article  Google Scholar 

  19. Soler-Cataluña JJ, Martínez-García MA, Sánchez LS, Tordera MP, Sánchez PR. Severe exacerbations and BODE index: two independent risk factors for death in male COPD patients. Respir Med. 2009;103:692–9.

    Article  Google Scholar 

  20. Puhan MA, et al. Expansion of the prognostic assessment of patients with chronic obstructive pulmonary disease: the updated BODE index and the ADO index. Lancet. 2009;374:704–11.

    Article  Google Scholar 

  21. van Buuren S, Boshuizen HC, Knook DL. Multiple imputation of missing blood pressure covariates in survival analysis. Stat Med. 1999;18:681–94.

    Article  Google Scholar 

  22. Rubin D. Multiple Imputation for Nonresponse in Surveys. Wiley; 1987.

  23. Royston P, Sauerbrei W. A new measure of prognostic separation in survival data. Stat Med. 2004;23:723–48.

  24. Harrell FE, Califf RM, Pryor DB, Lee KL, Rosati RA. Evaluating the yield of medical tests. JAMA. 1982;247:2543–6.

  25. van Houwelingen HC. Validation, calibration, revision and combination of prognostic survival models. Stat Med. 2000;19:3401–15.

  26. Royston P. Tools for checking calibration of a Cox model in external validation: prediction of population-averaged survival curves based on risk groups. Stata J. 2015;15:275–91.

  27. Royston P, Altman DG. External validation of a Cox prognostic model: principles and methods. BMC Med Res Methodol. 2013;13:33.

  28. Small N, et al. Using a prediction of death in the next 12 months as a prompt for referral to palliative care acts to the detriment of patients with heart failure and chronic obstructive pulmonary disease. Palliat Med. 2010;24:740–1.

  29. Putcha N, Drummond MB, Wise RA, Hansel NN. Comorbidities and chronic obstructive pulmonary disease: prevalence, influence on outcomes, and management. Semin Respir Crit Care Med. 2015;36:575–91.

  30. Berry CE, Wise RA. Mortality in COPD: causes, risk factors, and prevention. COPD J Chronic Obstr Pulm Dis. 2010;7:375–82.

  31. Bhatnagar P, Wickramasinghe K, Wilkins E, Townsend N. Trends in the epidemiology of cardiovascular disease in the UK. Heart. 2016;102:1945–52.

  32. Gayle A, Axson E, Bloom C, Navaratnam V, Quint J. Changing causes of death for patients with chronic respiratory disease in England, 2005-2015. Thorax. 2019.


Acknowledgements

Not applicable.


Funding

The study was funded by Wellcome. PS and FR are supported by Marie Curie I-CAN-CARE Program grant (MCCC-FPO-16-U), Marie Curie core funding (CORE MCCC-FCO-16-U) and the UCLH NIHR Biomedical Research Centre. PS is supported by the Marie Curie Chair’s grant (MCCC-509537).

Availability of data and materials

The data that support the findings of this study are available from the UK CPRD, but restrictions apply to the availability of these data, which were used under licence for the current study, and so are not publicly available. The data are, however, available from the authors upon reasonable request and with permission of the UK CPRD.

Author information

Contributions

CIB, FR, LS, PS and JKQ were the sole contributors and authors of this study. PS, LS and JKQ contributed in developing the research question, writing the protocol and obtaining the data. CIB and FR carried out the analysis. All authors contributed to preparing the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to C. I. Bloom.

Ethics declarations

Ethics approval and consent to participate

The protocol for this research was approved by the Independent Scientific Advisory Committee (ISAC) for MHRA Database Research (protocol number 17_083). Generic ethical approval for observational research using CPRD with approval from ISAC has been granted by a Health Research Authority (HRA) Research Ethics Committee (East Midlands – Derby, REC reference number 05/MRE04/87). Linked pseudonymised data was provided for this study by CPRD. Data is linked by NHS Digital, the statutory trusted third party for linking data, using identifiable data held only by NHS Digital. Select practices consent to this process at a practice level, with individual patients having the right to opt out.

Consent for publication

Not applicable.

Competing interests

All authors have completed the ICMJE uniform disclosure form and declare the following. LS reports grants from Wellcome Trust during the conduct of the study; outside the submitted work, LS reports grants from Wellcome, MRC, NIHR, BHF and Diabetes UK and grants and personal fees from GlaxoSmithKline. JKQ, outside the submitted work, reports grants from The Health Foundation, MRC, and British Lung Foundation; grants and personal fees from GlaxoSmithKline; grants and personal fees from Boehringer Ingelheim; grants and personal fees from AstraZeneca; grants and personal fees from Chiesi; personal fees from Teva; grants and personal fees from Insmed; grants and personal fees from Bayer and grants from IQVIA.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Additional information

Bloom CI and Ricciardi F are joint first authors.

Additional files

Additional file 1:

Tables S1-S4. Table S1. Time scale of when variable data were collected relative to the index date (annual review for the training and test datasets and 12 months after the eligibility date for the external validation dataset). Table S2. AUCs for the BARC, ADO, BODEx and DOSE indexes in the sensitivity analysis, removing patients with lung cancer from the test dataset. Table S3. Model performance in the sensitivity analysis, removing patients with lung cancer from the external dataset. Table S4. AUCs for the BARC, ADO, BODEx and DOSE indexes in the sensitivity analysis, removing patients with lung cancer from the external dataset. (DOCX 19 kb)

Additional file 2:

Figure S1. Flow diagram of inclusion criteria and patient numbers. (PNG 38 kb)

Additional file 3:

Figure S2. Calibration of a Cox model in the test datasets. Smooth dashed lines represent predicted survival probabilities, and vertical capped lines denote Kaplan–Meier estimates with 95% confidence intervals. Four prognosis groups are plotted (from darkest to palest): the “very low” risk group, the “low” risk group, the “moderate” risk group and the “high” risk group. (PNG 239 kb)

Additional file 4:

Figure S3. Receiver operating characteristic curves comparing the BARC index with ADO, BODEx and DOSE indexes in the external dataset. (PNG 56 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Bloom, C.I., Ricciardi, F., Smeeth, L. et al. Predicting COPD 1-year mortality using prognostic predictors routinely measured in primary care. BMC Med 17, 73 (2019).

Keywords

  • COPD
  • Prediction
  • Risk score
  • Mortality
  • Palliative care