Predicting dementia risk in primary care: development and validation of the Dementia Risk Score using routinely collected data

Background Existing dementia risk scores require collection of additional data from patients, limiting their use in practice. Routinely collected healthcare data have the potential to assess dementia risk without the need to collect further information. Our objective was to develop and validate a 5-year dementia risk score derived from primary healthcare data.
Methods We used data from general practices in The Health Improvement Network (THIN) database from across the UK, randomly selecting 377 practices for a development cohort and identifying 930,395 patients aged 60–95 years without a recording of dementia, cognitive impairment or memory symptoms at baseline. We developed risk algorithm models for two age groups (60–79 and 80–95 years). We externally validated the models in a separate cohort of 264,224 patients from 95 randomly chosen THIN practices that did not contribute to the development cohort. Our main outcome was 5-year risk of first recorded dementia diagnosis. Potential predictors included sociodemographic, cardiovascular, lifestyle and mental health variables.
Results Dementia incidence was 1.88 (95 % CI, 1.83–1.93) and 16.53 (95 % CI, 16.15–16.92) per 1000 person-years at risk (PYAR) for those aged 60–79 (n = 6017) and 80–95 years (n = 7104), respectively. Predictors for those aged 60–79 included age, sex, social deprivation, smoking, BMI, heavy alcohol use, anti-hypertensive drugs, diabetes, stroke/TIA, atrial fibrillation, aspirin, and depression. The discrimination and calibration of the risk algorithm were good for the 60–79 years model: D statistic 2.03 (95 % CI, 1.95–2.11), C index 0.84 (95 % CI, 0.81–0.87), and calibration slope 0.98 (95 % CI, 0.93–1.02). The algorithm had a high negative predictive value, but lower positive predictive value at most risk thresholds. Discrimination and calibration were poor for the 80–95 years model.
Conclusions Routinely collected data predict 5-year risk of recorded diagnosis of dementia for those aged 60–79, but not those aged 80+. This algorithm can identify higher risk populations for dementia in primary care. The risk score has a high negative predictive value and may be most helpful in ‘ruling out’ those at very low risk from further testing or intensive preventative activities.
Electronic supplementary material The online version of this article (doi:10.1186/s12916-016-0549-y) contains supplementary material, which is available to authorized users.


Background
More than 115 million people are predicted to have dementia by 2050 [1], with huge associated health and social care costs [2]. There is both epidemiological [3,4] and policy [5] support for the identification and management of modifiable risk factors for dementia to delay dementia onset. Around a third of Alzheimer's disease cases might be attributable to potentially modifiable risk factors (diabetes, mid-life hypertension, mid-life obesity, depression, physical inactivity, smoking, low education) [3]. It has been estimated that reducing the seven main modifiable risk factors by 10-25 % would prevent 1-3 million dementia cases worldwide [4]. There is a strong drive internationally for clinicians to be more proactive in dementia diagnosis [6,7]. There is, however, a limited evidence base for current approaches to dementia screening and case finding [8,9], and further work is needed to validate new methods across different settings, including primary care [9].
Many multi-factorial prognostic dementia risk models have been developed based on neuropsychological testing and sociodemographic, health, lifestyle, and environmental variables from a range of cohort studies, e.g. [10][11][12][13][14][15][16][17][18][19][20]. These have had variable discriminating power [10,11]; no single model is recommended for population-based settings [11], and none is widely used in practice. These risk scores entail collecting extra information from patients that would not form part of routine clinical care for the general population, for example, on fish oil intake [20], pesticide exposure [20], needing assistance with money or medication [19], years of education [12,19,20], depression symptom score [19,20], genotype [12][13][14], or neuropsychological testing [13,15,17,18], making these scores potentially more difficult and costly to implement in large populations in non-specialized clinical settings. One tool has recently been developed as a brief screening indicator to identify a high risk population for cognitive screening in primary care, using data from four cohort studies [19]. However, three of the seven factors in this tool are not routinely recorded in General Practitioner (GP) records in the United Kingdom (UK), and would have to be collected from patients individually. Validated risk scores developed using routinely collected primary care data have been used in practice for other disease areas, such as cardiovascular disease prediction, where they performed better than standard algorithms (e.g. Framingham) originally derived from cohort studies [21]. These scores can be easy to implement and can be calculated without collecting extra information from the patient. They can be used to risk stratify an eligible practice population, as the process is automated and uses data already in medical records.
No dementia risk model has yet been developed and validated using routinely collected primary care data in the general population. Our study objectives were to develop and validate a 5-year dementia risk score utilizing routinely collected data from a large nationally representative primary care database in the UK.

Study design
Cohort studies using routinely collected data; development and validation of a 5-year risk score for predicting newly recorded dementia diagnoses.

Setting and data source
We used The Health Improvement Network (THIN) primary care database, which derives data from routine clinical practice in the UK [22]. Around 6 % of General Practices in the UK contribute data to the THIN database, which contains nearly 12 million patients and is broadly representative of the UK population [22,23]. Data are collected longitudinally during routine care and include consultations, symptoms, diagnoses, investigations, health measurements, prescriptions, surgical procedures, and referrals. Diagnoses from secondary care and other health information received by the practice are coded and entered using Read codes, a hierarchical coding system which maps onto ICD-10 codes, but which also includes symptom descriptions. THIN data are collected and anonymized centrally and linked by postal (zip) code for 150 households to population census data, including neighbourhood deprivation (quintiles of Townsend deprivation index) [24]. Diagnostic and prescribing information are generally well recorded and accurate [25,26] and have been successfully used in numerous studies [22], including studies of dementia [27][28][29]. Further, THIN data are subject to a range of quality assurance procedures [30,31]. A validation study of dementia recording suggested that a GP recorded dementia diagnosis had a specificity of 83 %, with no false negatives in a small sample without recorded dementia [27].
We randomly selected 377 practices from 472 eligible practices providing acceptable quality data to THIN during our study period for a development cohort. The remaining 95 randomly selected eligible practices formed a completely separate validation cohort.
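The practice-level split described above can be illustrated with a minimal sketch. The practice identifiers, the seed, and the function name `split_practices` are hypothetical; the source does not describe how the randomisation was carried out.

```python
import random

def split_practices(practice_ids, n_development, seed=0):
    """Randomly assign whole practices to a development or validation set."""
    rng = random.Random(seed)  # seed is an arbitrary, hypothetical choice
    development = set(rng.sample(sorted(practice_ids), n_development))
    validation = set(practice_ids) - development
    return development, validation

eligible = range(1, 473)  # 472 hypothetical practice identifiers
development, validation = split_practices(eligible, n_development=377)
print(len(development), len(validation))  # 377 95
```

Splitting by practice rather than by patient keeps every patient from a given practice in the same cohort, which is what makes the validation cohort separate from the development cohort.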

Participants
In both development and validation cohort studies we included individuals aged between 60 and 95 years contributing to the THIN database between January 1, 2000, and December 31, 2011. We excluded individuals with recorded dementia, cognitive impairment, memory symptoms and confusion prior to study entry, those with an exclusion diagnosis indicating specific sub-types of dementia syndrome (Parkinson's disease, Huntington's disease, Pick's disease, alcohol-induced dementia, dementia in other conditions, Human Immunodeficiency Virus (HIV), Lewy body disease, Creutzfeldt-Jakob disease), and those with less than a year's follow-up data, to allow time for patient history and risk factor information to be recorded (Fig. 1 and Additional file 1: Figure A1).

Follow-up period
Follow-up time was restricted to a maximum of 5 years in both cohort studies. The start of follow-up was the latest of: 1) January 1, 2000; 2) when the individual turned 60 years; 3) one year following new registration with a THIN practice; 4) one year after the practice met standard criteria for accurate recording of deaths, consultation, health measurements, and prescribing [30,31]. The end date was the earliest of dementia incident date, 5 years follow-up, patient died, patient developed an exclusion diagnosis (as listed above), patient left practice, practice left THIN database, or December 31, 2011.
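The start and end rules above amount to taking the latest of several candidate start dates and the earliest of several candidate end dates. The sketch below illustrates this under stated assumptions: the function and argument names are hypothetical, 29 February is mapped to 28 February when shifting by whole years, and the separate exclusion of people with less than a year of follow-up is not modelled.

```python
from datetime import date

STUDY_START = date(2000, 1, 1)
STUDY_END = date(2011, 12, 31)

def add_years(d, years):
    """Shift a date by whole years, mapping 29 February to 28 February if needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)

def follow_up_window(birth, registered, practice_quality_date, end_events):
    """Return (start, end) of follow-up, or None if no follow-up time remains.

    end_events holds the dates of dementia diagnosis, death, exclusion
    diagnosis, patient leaving the practice, or practice leaving THIN
    (None for events that were not observed).
    """
    start = max(
        STUDY_START,
        add_years(birth, 60),                 # 60th birthday
        add_years(registered, 1),             # one year after registration
        add_years(practice_quality_date, 1),  # one year after recording criteria met
    )
    candidates = [add_years(start, 5), STUDY_END]
    candidates += [d for d in end_events if d is not None]
    end = min(candidates)
    return (start, end) if end > start else None

window = follow_up_window(
    birth=date(1942, 6, 15),
    registered=date(1995, 3, 1),
    practice_quality_date=date(1999, 1, 1),
    end_events=[None, date(2005, 2, 10)],  # e.g. no dementia, died in 2005
)
print(window)
```

In this hypothetical example follow-up starts on the 60th birthday (the latest start candidate) and ends at death, which precedes the 5-year cap.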

Main outcome
The primary outcome was a newly recorded dementia diagnosis, including Alzheimer's disease, vascular dementia, and unspecified or mixed dementia, but excluding dementia diagnoses associated with Parkinson's disease, Lewy body dementia, Huntington's disease, Pick's disease, HIV, and drug-induced and alcohol-related dementia (Read code lists available from the authors).

Risk factor measurements
Based on potential risk factors for dementia [3,4,32] available in THIN, we examined the following as predictor variables in the risk model: (1) Sociodemographic measures: age (years), sex, social deprivation (quintiles of Townsend Index), calendar year at baseline (to account for temporal trends). (2) Health status/lifestyle measurements: smoking status up to 5 years prior to baseline (current, non-smoker or ex-smoker), body mass index (BMI), lipids (total cholesterol/high density lipoprotein (HDL) cholesterol ratio), systolic blood pressure (SBP), and history of heavy alcohol use (more than 56 units per week for men/49 units per week for women, or a Read-code entry in their medical records indicating an alcohol problem). Medications examined were anti-hypertensive drugs, hypnotic medication, statins, aspirin and other non-steroidal anti-inflammatory drugs (NSAIDs). Patients were identified as exposed to medications if they had received at least two consecutive prescriptions in the 12 months before baseline.
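As a rough illustration of the medication exposure rule, the sketch below flags a patient as exposed when at least two prescriptions fall in the 365 days before baseline. This is an assumption: the source does not say how "consecutive" was operationalised, and the function name and example dates are hypothetical.

```python
from datetime import date, timedelta

def exposed_to_medication(prescription_dates, baseline, min_prescriptions=2):
    """Flag exposure if enough prescriptions fall in the year before baseline.

    Assumption: "two consecutive prescriptions" is approximated here as
    at least two prescriptions dated within the 365 days before baseline.
    """
    window_start = baseline - timedelta(days=365)
    in_window = [d for d in prescription_dates if window_start <= d < baseline]
    return len(in_window) >= min_prescriptions

baseline = date(2005, 1, 1)
print(exposed_to_medication([date(2004, 3, 1), date(2004, 6, 1)], baseline))
print(exposed_to_medication([date(2003, 1, 1), date(2004, 6, 1)], baseline))
```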

Analysis
For both the development and validation cohort studies the study population was divided into two groups: those aged 60-79 years and those aged 80-95 years at baseline. Beyond 80 years, a sharp increase in dementia risk has previously been found [19], and in our population there were differences in the distribution of risk factors and their associations with dementia in those aged 60-79 years and older individuals. We considered additional stratification by sex but age-adjusted risk factor associations with dementia in men and women were similar, justifying combining both sexes in a single model. Separate model development was carried out for the two age groups in the development cohort and separate validation and calibration was performed for each age group in the validation cohort. Analyses were performed using Stata version 12.1.

Sample size calculation
We conservatively estimated that 20 events were required per coefficient to fit a risk model based on studies evaluating the relationship between the number of events and the performance of a risk prediction model, which have shown that 15 events at least may be required to achieve a satisfactory level of model calibration [33]. There were a total of 25 coefficients for all the predictors initially considered, corresponding to 500 dementia events needed. Applying an inflation factor to adjust for clustering within practices of 10.741 for the 60-79 years age model (based on intra-class correlation coefficient of 0.00117, estimated from the data, and a mean cluster size of 2,122 people aged 60-79 years per practice), corresponded to a total of 500 × 10.741 = 5,371 dementia events. For the 80-95 years model, the inflation factor was 10.915 (based on intra-class correlation coefficient of 0.00863 and a mean cluster size of 346 people aged 80-95 years per practice), which corresponded to a total of 500 × 10.915 = 5,458 dementia events.
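The event targets above follow from simple arithmetic, sketched here. The inflation factors are taken as reported in the text rather than re-derived; the function name `required_events` is hypothetical.

```python
import math

EVENTS_PER_COEFFICIENT = 20
N_COEFFICIENTS = 25

def required_events(inflation_factor):
    """Events needed after inflating the unclustered target for clustering."""
    base_events = EVENTS_PER_COEFFICIENT * N_COEFFICIENTS  # 20 x 25 = 500
    return math.ceil(base_events * inflation_factor)

print(required_events(10.741))  # 5371 (60-79 years model)
print(required_events(10.915))  # 5458 (80-95 years model)
```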

Missing data imputation
We used the two-fold Fully Conditional Specification algorithm for multiple imputation of longitudinal clinical datasets to impute missing data for both fixed (smoking and height) and time-varying variables (total cholesterol and HDL cholesterol, SBP and weight) in both the development and validation cohorts [34]. This algorithm is an efficient way to use the full longitudinal patient record rather than just the baseline measurements to inform the imputation. Missing data in the validation cohort were imputed separately from those in the development cohort. The remaining variables were complete. The imputation model included all variables in the analysis model, plus the outcome and cumulative hazard function. In the backwards elimination process, variables were included in the final model only if they were retained in at least 7 out of 10 imputed datasets, to avoid over-selection of variables [35].
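The retention rule across imputed datasets can be sketched as a simple count. The variable names and the per-dataset selections below are hypothetical, and the rule is read here as "at least 7 of 10".

```python
from collections import Counter

def retained_variables(selected_per_imputation, min_datasets=7):
    """Keep variables selected in at least `min_datasets` imputed datasets."""
    counts = Counter(
        variable
        for selected in selected_per_imputation
        for variable in set(selected)  # count each variable once per dataset
    )
    return sorted(v for v, n in counts.items() if n >= min_datasets)

# Hypothetical selections from backwards elimination in 10 imputed datasets
selections = (
    [{"age", "bmi"}] * 7
    + [{"age", "bmi", "sbp"}]
    + [{"age", "sbp"}] * 2
)
print(retained_variables(selections))  # ['age', 'bmi']
```

Here `age` is selected in all 10 datasets and `bmi` in 8, so both are kept, while `sbp` is selected in only 3 and is dropped.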

Development cohort: model development
For each age group (60-79 years and 80-95 years), we derived the dementia risk score using a Cox proportional hazards regression model, with robust standard errors to account for clustering of individuals within general practices. The assumption of proportional hazards was checked using plots of the log cumulative hazard function and Schoenfeld residuals. Continuous variables were centred and the assumption of a linear relationship was assessed using fractional polynomials, visual checks by plotting graphs of the log hazard ratio by increasing category of the continuous variable, and by inclusion of squared and cubic terms in the Cox models; transformations were made when linear relationships were not confirmed.
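For orientation, the general form of a Cox proportional hazards model and the 5-year risk it implies can be written as below. The notation is generic and introduced here for illustration: $h_0$ is the baseline hazard, $S_0(5)$ the baseline survival at 5 years, $\beta$ the coefficient vector, and $\bar{x}$ the centring values (applied to continuous variables). The fitted coefficients and baseline survival are not reproduced.

```latex
h(t \mid x) = h_0(t)\,\exp\{\beta^{\top}(x - \bar{x})\},
\qquad
\text{5-year risk}(x) = 1 - S_0(5)^{\exp\{\beta^{\top}(x - \bar{x})\}}
```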
All variables were included in the full model prior to backwards elimination. We used backwards elimination to determine which variables should be retained, using the Akaike Information Criterion. After the elimination process we considered the interaction terms systolic blood pressure*anti-hypertensive medication and lipid ratio*statin prescriptions. Interactions were retained if significant and clinically meaningful.
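A generic backwards elimination loop driven by the AIC might look like the sketch below. It is not the authors' Stata procedure: `aic_of` is a hypothetical callable that would, in a real analysis, fit a Cox model on the given variables and return its AIC, and the toy log-likelihood contributions are invented purely so that the example runs.

```python
def backward_eliminate(variables, aic_of):
    """Drop one variable at a time for as long as doing so lowers the AIC."""
    current = list(variables)
    best_aic = aic_of(current)
    while len(current) > 1:
        # AIC of every candidate model with exactly one variable removed
        trials = [
            (aic_of([v for v in current if v != drop]), drop)
            for drop in current
        ]
        trial_aic, drop = min(trials)
        if trial_aic >= best_aic:
            break  # no single removal improves the AIC
        current.remove(drop)
        best_aic = trial_aic
    return current, best_aic

# Toy stand-in for model fitting: invented log-likelihood contributions
TOY_CONTRIBUTION = {"age": 50.0, "diabetes": 6.0, "statin": 0.3, "sbp": 0.5}

def toy_aic(variables):
    log_likelihood = sum(TOY_CONTRIBUTION[v] for v in variables)
    return 2 * len(variables) - 2 * log_likelihood  # AIC = 2k - 2 log L

kept, aic = backward_eliminate(TOY_CONTRIBUTION, toy_aic)
print(kept, aic)
```

With this toy function, removing a variable lowers the AIC only when its log-likelihood contribution is below 1, so `statin` and `sbp` are eliminated and `age` and `diabetes` are kept.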

Validation cohort: validation and calibration
For each age group, the model developed using the development cohort was applied to the validation cohort to assess performance. We assessed the discriminative performance of the dementia risk models by computing Uno's C [36] and Royston's D [37] statistics for the validation cohort. These statistics were chosen as they have been shown to be less biased in the presence of censored data than other discriminative statistics [36,37]. Each validation statistic was estimated separately for each imputed validation dataset, and the estimates were then combined using Rubin's rules to obtain an overall validation statistic. For Uno's C statistic we calculated confidence intervals from bootstrapping. A random sub-sample of 15 % of the validation cohort was used, as the vast size of the dataset made computation of bootstrap confidence intervals for the full sample infeasible. We assessed calibration by comparing the observed and predicted dementia risk in the validation cohort per decile of predicted risk, and by computing the calibration slope. We calculated the sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) using a range of potential risk thresholds, to explore the clinical utility of the risk algorithms.
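Combining a statistic across imputed datasets with Rubin's rules can be sketched as follows. The per-dataset estimates and variances are hypothetical inputs chosen for illustration, not values from the study.

```python
import math
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and variances using Rubin's rules."""
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)

# Hypothetical statistic and variance from three imputed datasets
pooled, se = pool_rubin([0.83, 0.84, 0.85], [0.0004, 0.0004, 0.0004])
print(round(pooled, 3), round(se, 4))
```

The pooled estimate is the mean across datasets, and its standard error reflects both the uncertainty within each dataset and the variation between datasets introduced by imputation.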

Development cohort study
We identified 930,395 eligible patients aged 60-95 years in 377 practices in the development cohort study, of whom 800,013 were aged 60-79 years and 130,382 were aged 80-95 years at baseline (Fig. 1).

Development cohort aged 60-79 years
Baseline characteristics There were 413,974 (52 %) women in the 60-79 years development cohort, and the mean age at baseline was 65.6 years (SD 6.1 years; Table 1).
Missing data on health measurements are detailed in Additional file 1. Age, sex, social deprivation, smoking, BMI, anti-hypertensive drug use, harmful alcohol drinking, current depression, current aspirin use, and history of diabetes, stroke, TIA and atrial fibrillation were all retained in the model (Table 2). Because statin use, lipid ratio, and SBP were all eliminated in the backwards elimination, interaction terms for statin use*lipid ratio and anti-hypertensive use*SBP were not considered.

Development cohort aged 80-95 years
Baseline characteristics There were 86,096 (66 %) women in the 80-95 years development cohort, with a mean age at baseline of 85 years (SD 3.9 years; Table 3).
Missing data on health measurements are reported in Additional file 1. There were no significant associations with living in a deprived area, CHD, or total cholesterol/HDL ratio.
There was a small negative association with current smoking, BMI, systolic blood pressure, anti-hypertensive drugs, and NSAIDs (excluding aspirin) (Table 4). As statin use was excluded, the interaction term statin use*lipid ratio was not considered. An interaction term for SBP*anti-hypertensive use was considered, but was not statistically significant (P = 0.6) and therefore was not included.

Validation cohort study
We identified 264,224 eligible patients aged 60-95 years in 95 practices for the validation cohort, of whom 226,140 were aged 60-79 years and 38,084 were aged 80-95 years at baseline (Additional file 1: Figure A1).

Validation cohort aged 60-79 years
Baseline characteristics/incidence of dementia The characteristics of the validation cohort were similar to the development cohort (Table 1). Missing data on health measurements are reported in Additional file 1.
Risk classification Utilizing a range of possible cut-offs to indicate 'high risk' for dementia, the specificity of the risk algorithm was high but with lower sensitivity, and there was a high NPV, but a low PPV.

Discussion
This study developed risk algorithms for predicting a new recorded dementia diagnosis in two age groups in primary care. In our validation study, the dementia risk algorithm developed for the 60-79 year old population performed well, but the algorithm for the older 80-95 years population did not. Our model is the first to be derived entirely from routinely collected health data, and can be calculated without collecting additional information from the patient. In people aged 60-79 years, the dementia risk score included records of depression, stroke, high alcohol consumption, diabetes, atrial fibrillation, aspirin use, smoking, decreasing weight, and untreated blood pressure. Aspirin use may be a marker for underlying vascular risk. The directions of associations of some factors, such as weight and cholesterol, have been shown to change in later life with the onset of disability, frailty and cognitive decline and potential pre-clinical dementia [38,39]. In our study, the 'high risk' population may include those with pre-clinical or undetected or unrecorded dementia, which may explain some of the associations observed with individual factors. Our algorithm uses routinely collected healthcare data to predict the risk of a GP recorded diagnosis within 5 years, and the profile of risk factors within the score is different to those aimed at identifying future risk, for example mid-life risk scores for dementia [40]. At a low threshold of 1 %, our risk algorithm had a sensitivity of 78 % and specificity of 73 %. With thresholds of 2 % or above, our risk algorithm had higher specificity (85 %) but a correspondingly lower sensitivity (58 %). Previous prediction models derived from cohort studies have generally had either high specificity with low sensitivity or vice versa [10,11], and the choice of threshold will depend on the intended use.

Strengths and limitations
Our development cohort study included more than 900,000 older people from across the UK registered with THIN General Practices, with more than 13,000 new dementia events recorded. The findings are likely to be generalizable to the UK population, but may not be generalizable to other healthcare settings. The data source includes longitudinal data on a wide range of potential risk factors, including demographic factors, lifestyle, health status measurements, medical history/diagnoses, and drugs. We had power to consider a wide range of potentially important risk factors, in comparison to cohort studies with smaller samples [10][11][12][13][14][15][16][17][18][19][20]. In those aged 60-79 years, we had good recording of data for most factors, and for missing data at baseline we used robust multiple imputation techniques utilizing the entire patient record, taking into account the longitudinal records rather than relying solely on baseline parameters.
Using routinely collected data to develop the risk algorithm has some inherent limitations. Such data may be less complete in terms of potential predictor variables than cohorts designed for research. The older cohort (80-95 years) had fewer routine measurements of health status such as BMI and lipid profile. The current validation applies to use of the risk score in the case where the GP has complete information on the factors in the model. There were low levels of missing data in some individuals on smoking status and BMI for those aged 60-79 years, which we imputed for our analysis. For all other factors in the final model, if missing, the factor was presumed to be absent. Some potential risk factors, such as family history of dementia, physical activity or educational status, are poorly recorded in routine UK primary care and could not be included. Studies suggest that chronic and significant medical diagnoses entered in electronic records are likely to be accurate [25]. Other evidence suggests dementia is under-recorded in primary care [41]. Our incidence rates for dementia were lower than rates reported in studies using screening, particularly for those over 80 years [42]; however, there is some evidence that dementia prevalence is stabilizing more recently, despite population ageing [43], and our study is based on more contemporary data. This potential under-recording of dementia diagnoses in GP records may lead to an underestimation of the true predictive power of the risk score. In common with most risk models, we only accounted for baseline variables; for time-varying factors, exposure status may change during the follow-up period. Routinely collected data has the advantage of reflecting the data normally available to a clinician in practice.

Implications
We used routinely collected primary care data to derive a relatively simple new risk algorithm, predicting a new GP recorded dementia diagnosis within 5 years, which worked well in those aged 60-79 years, but not in older age groups. This supports the previous suggestion that given the steep rise in risk of dementia at 80 years, it would be reasonable to test for dementia beyond this point on the basis of age alone [19]. It is likely that risk scores using traditional risk factors will not perform well in this population, and a different approach might be needed to identify a higher risk group aged 80 or above using, for example, measures of frailty.
Our new dementia risk algorithm for 60-79 year olds can be added to clinical software systems and a practice could, for example, run this risk model on all eligible people and offer those at risk more detailed testing or specific preventive management. Using a range of thresholds, there was good specificity but lower sensitivity, and a very high NPV but a low PPV. This risk algorithm may be most helpful to 'rule out' those at low risk from dementia case finding programs. This might avoid unnecessary investigations and anxiety for those at very low risk and make these programs more cost-effective. The risk algorithm may enable the identification of 'at risk' groups to approach for future research studies. We report a range of thresholds to allow clinicians or researchers to select the threshold that gives the optimum balance of sensitivity and specificity for dementia risk, depending on the intended use.
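The trade-off between thresholds can be explored with a small calculation of sensitivity, specificity, PPV and NPV. The sketch below is illustrative only: the predicted risks and outcomes are invented, outcomes are treated as simple yes/no values (ignoring censoring), and "high risk" is assumed to mean a predicted risk at or above the threshold.

```python
def classification_metrics(predicted_risks, outcomes, threshold):
    """Sensitivity, specificity, PPV and NPV when 'high risk' means risk >= threshold."""
    tp = fp = tn = fn = 0
    for risk, event in zip(predicted_risks, outcomes):
        flagged = risk >= threshold
        if flagged and event:
            tp += 1
        elif flagged:
            fp += 1
        elif event:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }

# Invented 5-year predicted risks and observed dementia outcomes (1 = diagnosed)
risks = [0.005, 0.008, 0.012, 0.020, 0.030, 0.004, 0.015, 0.050]
events = [0, 0, 0, 1, 1, 0, 0, 1]
for threshold in (0.01, 0.02):
    print(threshold, classification_metrics(risks, events, threshold))
```

Running the same function over a grid of thresholds gives the kind of table a clinician or researcher could use to pick a cut-off suited to the intended use.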
Further research should be undertaken to explore the performance of the Dementia Risk Score in different settings and populations, including variations in performance in areas where the prevalence, detection, and recording of dementia by GPs is very low or very high. We also need to further understand how the tool might be used in practice, the ethical implications, and what the impact of this might be for older people, clinicians, and the potential costs for health services.

Conclusion
Routinely collected health data can predict 5-year risk of recorded diagnosis of dementia in primary care for individuals aged 60-79 years, but not for those aged 80 years or more. This risk score can be used to identify higher risk populations for dementia in primary care. The risk score has a high negative predictive value and may be most helpful in 'ruling out' those at very low risk from further testing.