A systematic assessment of the association between frequently prescribed medicines and the risk of common cancers: a series of nested case-control studies

Background: Studies systematically screening medications have successfully identified prescription medicines associated with cancer risk. However, adjustment for confounding factors in these studies has been limited. We therefore investigated the association between frequently prescribed medicines and the risk of common cancers, adjusting for a range of confounders.

Methods: A series of nested case-control studies was undertaken using the Primary Care Clinical Informatics Unit Research (PCCIUR) database, which contains general practice (GP) records from Scotland. Cancer cases at 22 cancer sites, diagnosed between 1999 and 2011, were identified from GP records and matched with up to five controls (based on age, gender, GP practice and date of registration). Odds ratios (OR) and 95% confidence intervals (CI) comparing any versus no prescriptions for each of the most commonly prescribed medicines, identified from prescription records, were calculated using conditional logistic regression, adjusting for comorbidities. Additional analyses adjusted for smoking status. An association was considered a signal based upon the magnitude of its adjusted OR, its p-value and evidence of an exposure-response relationship. Supplementary analyses were undertaken comparing 6 or more prescriptions versus fewer than 6 for each medicine.

Results: Overall, 62,019 cases and 276,580 controls were included in the analyses and a total of 5622 medicine-cancer associations were studied across the 22 cancer sites. After adjusting for comorbidities, 2060 medicine-cancer associations for any prescription had adjusted ORs greater than 1.25 (or less than 0.8), 214 had a corresponding p-value less than or equal to 0.01 and 118 had evidence of an exposure-response relationship, hence meeting the criteria for a signal. Seventy-seven signals were identified after additionally adjusting for smoking. Based upon an exposure of 6 or more prescriptions, there were 118 signals after adjusting for comorbidities and 82 after additionally adjusting for smoking.

Conclusions: In this study, a number of novel associations between medicine and cancer were identified which require further clinical and epidemiological investigation. The majority of medicines were not associated with an altered cancer risk and many identified signals reflected known associations between medicine and cancer.

Supplementary Information: The online version contains supplementary material available at 10.1186/s12916-020-01891-5.


Background
Cancer remains a leading cause of disease burden and death [1,2]. In 2018, there were an estimated 18.1 million new cases of cancer worldwide and 9.5 million deaths, with the yearly incidence estimated to increase to 29.5 million by 2040 [3]. Despite continuing advances in medical research, survival remains low for a range of cancers [4], highlighting the need to avoid precipitating factors.
Certain medicines may possess unintended carcinogenic or chemoprotective properties. For instance, hormone replacement therapy (HRT) has been shown to increase breast cancer risk [5], and aspirin has traditionally been considered to reduce colorectal cancer risk [6] (although recent studies have questioned the perceived protective effect associated with aspirin on cancer risk [7,8]). Clinical trials conducted during the development of new medications are unlikely to identify medications that alter cancer risk due to the relatively small number of patients exposed to the medication and the generally short duration of follow-up [9]. Spontaneous reporting systems (such as the United Kingdom (UK) Yellow Card Scheme) may also identify medications which cause cancer, but their ability is limited by the long induction time of many cancers and by design they will not identify medicines with a potential chemoprotective effect. Consequently, pharmacoepidemiology, the study of the use and effects of drugs in large numbers of people [10], has proved a valuable tool in the identification of medications which can cause or reduce the risk of cancer.
In traditional pharmacoepidemiology studies, investigators have identified a clinical mechanism whereby a medication may increase cancer risk a priori and then investigated that specific association. However, it is difficult to predict possible carcinogenic mechanisms in advance, particularly as the number of licensed medicines is growing; between 2013 and 2018, the Food and Drug Administration licensed, on average, between 40 and 50 new drugs per year [11]. Consequently, studies have been conducted systematically investigating, or screening, large numbers of medicines in relation to cancer risk as a means of complementing traditional pharmacoepidemiologic practice. Screening studies have the potential to identify medicine-cancer associations which require more detailed study, as well as highlighting associations which are not widely recognised.
To date, studies systematically screening medicines in relation to cancer risk have been conducted among subscribers to Kaiser Permanente healthcare plans in the USA [12][13][14][15][16] as well as more recent population-based studies in Denmark [17], Sweden [18] and Norway [19]. However, these screening studies have had limitations, such as a relatively short follow-up period or a limited number of cancer sites studied [18], and none have controlled for site-specific risk factors or smoking (an accepted risk factor for many cancers [20]). It has been suggested that confounding by smoking is a possible explanation for the positive association observed in some screening studies between certain medicines and cancer, such as respiratory/allergy medicines with lung cancer [16,21]. Therefore, we systematically assessed the associations between commonly prescribed medicines and cancer risk, adjusting for a wide range of confounders including smoking, using a series of nested case-control studies.

Data source
Data for this study were obtained from the Primary Care Clinical Informatics Unit Research (PCCIUR) database [22], a high-quality population-based database of over two million patients registered at 393 general practices in Scotland between 1993 and 2011. PCCIUR data contain up to 20 years of demographic, clinical and diagnostic information and have been widely used in epidemiological research [23][24][25][26].

Study design
A series of retrospective nested case-control studies were conducted using PCCIUR data. Cases were identified based upon a new diagnosis of primary cancer (including only the twenty-two most common cancers in Scotland). Cases were excluded if they had a previous cancer, excluding non-melanoma skin cancer, or they were diagnosed with multiple primary cancers on the date of their first cancer diagnosis (due to uncertainty about the primary cancer and the potential for coding errors). Up to five controls were matched to cases on practice, year of birth (plus or minus five years), gender and year of registration (in categories). The index date within each matched set was defined as the diagnosis date of cancer in the case. Controls were required to be alive and free from cancer (with the exception of non-melanoma skin cancer) on the index date. Both cases and controls were required to have at least 3 years of follow-up data and remain registered with the same general practice over the follow-up period.
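The matching rule described above can be sketched in a few lines. This is a minimal illustration only, not the code used in the study: the `Patient` record, the `registration_band` label and the choice to keep the first five eligible candidates are all assumptions, and the additional eligibility checks (alive and cancer-free on the index date, at least 3 years of follow-up) are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    # Hypothetical record layout; field names are not taken from PCCIUR.
    patient_id: str
    practice: str
    birth_year: int
    gender: str
    registration_band: str  # assumed category label for year of registration


def match_controls(case, candidates, max_controls=5, birth_window=5):
    """Return up to max_controls candidates that share the case's matching factors."""
    eligible = [
        c for c in candidates
        if c.patient_id != case.patient_id
        and c.practice == case.practice
        and c.gender == case.gender
        and c.registration_band == case.registration_band
        and abs(c.birth_year - case.birth_year) <= birth_window
    ]
    # Assumption: keep the first eligible candidates; the paper does not
    # state how controls were chosen when more than five were eligible.
    return eligible[:max_controls]
```

In practice the selection among eligible candidates would usually be random rather than positional.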
Within each matched set, the exposure period, i.e. the period of time over which medicine use was determined, started on either 1 January 1993 (as prescriptions before this time were less likely to be electronically recorded) or the most recent GP registration date if this occurred after January 1993. This ensured that all members within each matched set had the same exposure period. The exposure period ended 1 year before the index date, to reduce the risk of reverse causality and exclude medications that are unlikely to have had sufficient time to cause the cancer [27,28].
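As an illustration of how such an exposure window could be derived, the sketch below takes the registration dates of the members of one matched set and the index date. It assumes that the start is the later of 1 January 1993 and the most recent registration date in the set; the function and parameter names are hypothetical. The `lag_years` argument corresponds to the 1-year lag (2 years in the sensitivity analysis).

```python
from datetime import date

RECORDING_START = date(1993, 1, 1)  # prescriptions before this date were less reliably recorded


def _years_before(d, years):
    """Shift a date back by whole years, mapping 29 February to 28 February if needed."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:
        return d.replace(year=d.year - years, day=28)


def exposure_period(registration_dates, index_date, lag_years=1):
    """Return (start, end) of the exposure period shared by one matched set."""
    # Assumption: the latest registration date in the set defines the start,
    # so every member of the set has the same window.
    start = max([RECORDING_START, *registration_dates])
    end = _years_before(index_date, lag_years)
    return start, end
```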

Classification and definition of medicine
Prescription entries were extracted from PCCIUR, and over 99% were converted to a generic name, formulation and route of administration. Foods, nutritional supplements, homoeopathic items and emollients (which are pharmaceutically inert) were excluded from the analyses. Single-agent medicines were grouped together if the active substance was the same and patient indications were similar (e.g. hydrocortisone). A distinction was made between low-dose aspirin (defined as 75 mg or less in the UK [29]) and high-dose aspirin (over 75 mg); low-dose aspirin is not usually considered a non-steroidal anti-inflammatory drug (NSAID) in the UK [30]. This distinction between low- and high-dose aspirin is also made in observational studies [31] (including analyses examining associations between aspirin and cancer risk [32]) and clinical trials [33]. Combination drugs of two or more medicines with different pharmaceutical effects were split into their component parts and considered as two or more separate medicines. Where combination drugs comprised active and inert medicines or agents not affecting physiology, or active medicines combined with other substances used to enhance the effect of the active ingredient (e.g. clavulanic acid which enhances the effect of penicillin in co-amoxiclav), only the active medicines were considered for association. All forms of insulin were grouped together as insulin. Systemic formulations of medicines (oral and parenteral formulations, together with all topical items applied for a systemic effect) and local formulations (all topical items applied for a local effect) were analysed separately. These groupings of medications were reviewed independently by a GP and a pharmacist.

Covariates
The following comorbidities, based upon published Read codes for the Charlson Comorbidity Index (CCI) [34], were identified prior to or during the exposure period: diabetes, myocardial infarction, coronary heart disease, heart failure, peripheral vascular disease, dementia, cerebrovascular disease, chronic obstructive pulmonary disease, osteoporosis, rheumatological disease, renal disease, liver disease, irritable bowel disease, human immunodeficiency virus (HIV) and hemiplegia/paraplegia. Additional risk factors relevant to specific cancers were identified from the literature. For example, studies have shown a reduced risk of prostate cancer in patients with Parkinson's disease [35], an increased risk of lung cancer in patients with tuberculosis [36] and a 30% increased relative risk of kidney cancer among women undergoing a hysterectomy [37]. These additional risk factors were independently reviewed by a pharmacist and a GP to determine which could be considered confounders between medicine use and cancer risk and were extracted from PCCIUR where they were recorded. These are listed in Table 2 as potential site-specific confounders. Smoking status (non-smoker, current smoker, former smoker) and alcohol consumption (non-drinker, light or moderate drinker, heavy drinker) were determined from the most recent smoking or alcohol record prior to or during the exposure period.

Statistical analysis
Analyses were conducted for each of the 250 medicines most commonly prescribed (three or more times) within the exposure period in the matched controls for each cancer site. Where a number of medicines were equally prevalent at rank 250, all were studied. Descriptive statistics were used to summarise the cases and controls. Conditional logistic regression was used to calculate odds ratios (OR) and 95% confidence intervals (CI) for the association between any prescription and each cancer. The matched design accounted for age (± 5 years), GP practice, gender and year of registration, and the adjusted model contained age (in years, allowing for the fact that patients were matched in age bands rather than by calendar year) and comorbidities. Analyses were repeated additionally adjusting for smoking, and were restricted to the 77.9% of patients (n = 263,615) with a smoking record before the end of the exposure period. These supplementary analyses additionally adjusted for smoking status rather than both smoking and alcohol use, as smoking status was recorded for a greater number of patients than alcohol use (67.5%, n = 228,425) and has been shown to be well recorded in primary care [38]. Body mass index (BMI) could not be controlled for as it was only recorded for one-quarter of PCCIUR patients. Exposure-response analyses were conducted calculating ORs for low and high use compared with none, with low/high use based upon numbers of prescriptions equal to or below/above the median (among the control patients who were users), respectively. To illustrate, for a medicine associated with an increased risk of cancer and where the median number of items among users without cancer was 4, we required the odds ratio comparing use of 1-4 items of medicine to none to be less than the odds ratio comparing use of 5 or more items to none.
For a medicine associated with a decreased cancer risk and with the same median number of items, we required the odds ratio comparing use of 1-4 items of medicine to none to be larger than the odds ratio comparing use of 5 or more items to none. This approach to quantifying exposure-response relationships is found in other screening studies [16].
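The median-split rule in the preceding paragraph can be illustrated with a short sketch. It assumes prescription counts per patient are already available and that the direction of the association is read from the any-use OR; the function names are hypothetical, and the ORs themselves would come from the conditional logistic regression models, which are not reproduced here.

```python
from statistics import median


def control_user_median(control_counts):
    """Median number of prescriptions among controls with at least one prescription."""
    users = [n for n in control_counts if n > 0]
    return median(users)


def use_category(n_items, user_median):
    """Classify a patient as 'none', 'low' (at or below the median) or 'high' (above it)."""
    if n_items <= 0:
        return "none"
    return "low" if n_items <= user_median else "high"


def has_exposure_response(or_any, or_low, or_high):
    """Check that the high-use OR is more extreme than the low-use OR.

    Assumption: the any-use OR indicates whether the association is an
    increased (OR > 1) or a decreased risk.
    """
    if or_any > 1:
        return or_low < or_high
    return or_low > or_high
```

With a median of 4 among control users, a patient with 4 prescriptions falls in the low-use group and a patient with 5 in the high-use group, matching the worked example in the text.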
Supplementary analyses were undertaken with an exposure of six or more prescriptions (versus fewer than six), using the exposure-response analysis based on the median.

Definition of signal
For each set of analyses undertaken, the following criteria were used to identify signals, i.e. medicines deemed worthy of further consideration, as no accepted definitions of a signal exist [16,17,19,21]: (step 1) an adjusted OR for the association between the medicine and cancer risk greater than 1.25 (or less than 0.80); (step 2) the OR in step 1 being statistically significant at the 1% level; (step 3) evidence of an exposure-response relationship such that the OR comparing low use to none was less extreme than the OR comparing high use to none.
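The three criteria can be expressed as a simple sequential filter. The sketch below is an illustration under stated assumptions rather than the study code: the function name and argument names are hypothetical, the p-value threshold is taken as "less than or equal to 0.01", and "less extreme" is interpreted relative to the direction of the any-use OR.

```python
def is_signal(or_any, p_value, or_low, or_high,
              upper=1.25, lower=0.80, alpha=0.01):
    """Apply the three signal criteria to one medicine-cancer association."""
    # Step 1: effect size of interest.
    step1 = or_any > upper or or_any < lower
    # Step 2: statistical significance at the 1% level.
    step2 = p_value <= alpha
    # Step 3: low-use OR less extreme than high-use OR, in the direction
    # indicated by the any-use OR.
    if or_any > 1:
        step3 = or_low < or_high
    else:
        step3 = or_low > or_high
    return step1 and step2 and step3
```

For example, an association with an any-use OR of 1.40, a p-value of 0.004 and ORs of 1.2 and 1.6 for low and high use would meet all three criteria, whereas the same association with a p-value of 0.02 would not.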

Sensitivity analyses
A number of sensitivity analyses were undertaken as follows: (1) the period of time before the index date during which prescriptions were not counted was increased from 1 year to 2 years to reduce the risk of reverse causation; (2) matched sets where the case had a new primary cancer diagnosis in a different cancer site within 12 months of the index date were excluded to allow for possible misclassification of the original cancer site; (3) adjustments were made for comorbidities, smoking and alcohol status for the 221,570 (65.4%) patients with available data. Additionally, analyses adjusting for comorbidities and smoking were rerun using multiple imputation with chained equations (MICE) to impute smoking status. This is a simulation-based method appropriate for handling missing data when it is assumed that such values are missing at random or missing completely at random. Ordered logit models including age, gender, deprivation and comorbidities were used for the imputations, stratified by case-control status, and 25 imputed datasets were generated.

Medication-wide association study (MWAS) plots
Results from the primary analyses, estimating associations between medicine use and cancer, adjusting for comorbidities, were depicted graphically using medication-wide association study (MWAS) plots. MWAS plots display the p values for the associations against the medicines grouped by British National Formulary (BNF) chapter [39].
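A minimal sketch of how the points for such a plot might be prepared is given below. It assumes that p values are shown on a negative log10 scale, as is common for Manhattan-style plots, which the text does not state; the function name, the tuple layout and any values passed in are hypothetical. The plotting itself is left to whichever graphics library is preferred.

```python
import math


def mwas_points(results):
    """Order associations by BNF chapter and compute -log10(p) for each.

    results: iterable of (medicine, bnf_chapter, p_value) tuples.
    Returns a list of dicts with an x position, so that medicines in the
    same BNF chapter sit next to each other on the horizontal axis.
    """
    ordered = sorted(results, key=lambda r: (r[1], r[0]))
    return [
        {
            "x": x,
            "medicine": medicine,
            "bnf_chapter": chapter,
            "neg_log10_p": -math.log10(p_value),  # assumed plotting scale
        }
        for x, (medicine, chapter, p_value) in enumerate(ordered)
    ]
```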

Descriptive statistics
The study included 62,019 cases (29,653 males and 32,366 females) and 276,580 matched controls. The most common cancers were breast (12,269), lung (9409), colorectal (8674) and prostate (7471). Overall, 53,533 cases (86.3%) had at least four matched controls. The median exposure period was 8.1 years in cases and controls (inter-quartile range 5.5 to 11.0). The overall characteristics of cases and controls are shown in Table 1.

Signals
In total, 5622 medicine-cancer associations were investigated across the 22 cancer sites. Of these, 2060 had a comorbidity-adjusted OR for any prescription greater than 1.25 (or less than 0.80), 214 were statistically significant at the 1% level and 118 had an exposure-response relationship with cancer risk. Repeating these analyses additionally adjusting for smoking, 2139 medicine-cancer associations had an OR greater than 1.25 (or less than 0.80), of which 143 were statistically significant at the 1% level and 77 had an exposure-response relationship with cancer. There were 142 unique medicine-cancer signals.
For the supplementary analyses, 2714 medicine-cancer associations had a comorbidity-adjusted OR for six or more prescriptions greater than 1.25 (or less than 0.80), 138 were statistically significant at the 1% level and 118 had an exposure-response relationship with cancer risk. Repeating these analyses additionally adjusting for smoking, 2926 medicine-cancer associations had an OR greater than 1.25 (or less than 0.80), of which 89 were statistically significant at the 1% level and 82 had an exposure-response relationship with cancer. There were 147 unique medicine-cancer signals.
Across all analyses, there were 231 unique medicine-cancer signals, of which 22 were found in every analysis; 89 signals were only identified with an exposure of at least six prescriptions. One hundred and eighty-six signals were identified after adjusting for comorbidities, of which fewer than half (85) met the signal criteria after additionally adjusting for smoking. A further 45 signals were only identified in the analyses which controlled for both smoking and comorbidities; 169 signals were associated with an increased cancer risk and 62 with a lower risk of cancer.
The number of signals identified by each criterion for each analysis is listed in Table 2. The signals found are summarised in Table 3, Table 4, Table 5, and Table 6, with full details of the signals given in Additional file 1: Tables S1 to S4. Additional file 2 details potentially relevant clinical or epidemiological references for these signals.

MWAS plots
An MWAS plot for the most frequently prescribed medicines analysed in the most prevalent cancer site (breast) is given for illustrative purposes in Fig. 1. MWAS plots for all the medicines studied in each cancer site, with an exposure of any prescription, are found in Additional file 3: Fig. S1 & S2.

Sensitivity analyses
Increasing the lag-time from 1 year to 2 years, or removing the 1155 matched sets where cases had an additional primary cancer diagnosis within 12 months of the index date, had a minimal effect on the estimated ORs and p values. Results obtained using multiple imputation for the comorbidity and smoking adjusted analyses were similar to those obtained using the 77.9% of patients who had available smoking data.

Principal findings
Using a population-based database, we conducted an exploratory set of analyses, systematically screening frequently prescribed medicines in relation to their potential carcinogenic or chemo-preventative properties for commonly diagnosed cancers, adjusting for relevant comorbidities and smoking. The vast majority of medicines did not meet the criteria for our definition of a signal. From these analyses, we identified 231 signals potentially worthy of further consideration. The majority of these signals (169) were associated with an increased cancer risk and the remainder with a reduced cancer risk; they covered a variety of medicine types. Adjusting for smoking in addition to comorbidities identified 45 signals not identified when adjusting for comorbidities only.

Context of other studies
This study follows the principles established in other screening studies to identify potential signals, namely by identifying effect sizes of interest which are statistically significant and where there is an exposure-response relationship between medicine and cancer. However, this study adjusts more extensively for comorbid conditions than previous screening studies, by using individual conditions, and includes smoking in the analyses. Low availability of data on lifestyle factors is a recognised limitation of current screening studies [17]. Due to differences between studies, such as country of location, time of study, medicine licensing and grouping of cancers studied, it is not always possible to compare results directly between screening papers. However, as with other studies which have taken place to date, the vast majority of medicines are not associated with an increased risk of cancer. This should provide some reassurance to both patients and clinicians. The signals which have been identified can broadly be divided into three groups. Firstly, there are signals which replicate well-known associations in the literature between medicine use and cancer risk, such as the increased risk of breast cancer associated with HRT medicine (Tables 3 and 5) [5], the reduced risk of oesophageal cancer with HRT medicine (Table 4) [41] and the reduced risk of colorectal cancer associated with some NSAIDs (e.g. diclofenac, naproxen (Table 6)) [42]. As such, our results provide reassurance that the study design and methodology employed are appropriate and informative.
Secondly, there are signals for which the relationship is unlikely to be causal. This may be due to a variety of factors, such as a chance finding due to multiple testing, reverse causation (e.g. tamoxifen and breast cancer) or omission of other appropriate confounders (such as BMI, a risk factor for cancers such as liver and colon [38]). Finally, there are some signals which merit further consideration. These include, for example, some antiplatelet/anticoagulant medicines and upper gastrointestinal cancer (warfarin and oesophageal cancer, clopidogrel and pancreatic cancer, Table 4). Both medicines are commonly prescribed, are intended for long-term use and can cause inflammation [43,44], a well-known risk factor for cancer. Possible mechanisms for a harmful association between clopidogrel and cancer include indirect modulation of tumour growth, long-term platelet inhibition or instability of platelet-tumour cell aggregates [45]. As with all other signals, these need to be evaluated carefully in relation to clinical plausibility and causality [21], including application of the Bradford Hill criteria [46] in conjunction with more bespoke analyses.
This is the first screening study to adjust for smoking status. We observed that the effect of adjusting for smoking in addition to comorbidities varied with medicines and cancer sites. Some medicine-cancer associations which met the signal criteria after adjusting for comorbidities did not do so after additionally adjusting for smoking (e.g. cerivastatin and prostate cancer, cimetidine and stomach cancer (Table 3)). This is not unexpected for cancer sites where smoking is an important risk factor (e.g. lung, bladder, pancreas, prostate and stomach). Other medicine-cancer associations met the signal criteria regardless of whether smoking was controlled for, even if the effect sizes were attenuated to some extent. We speculate that this variation suggests that smoking can both confound and synergise medicine-cancer associations in highly complex genetic, pharmacological and biological interactions.
In summary, the findings from our analyses highlight the need for additional analyses for signals of interest, tailored to the specific medicines and cancers.

Strengths and limitations of study
There are a number of strengths to our study. This is the first time PCCIUR data has been used to undertake a systematic screening study determining medicines associated with an altered cancer risk. The PCCIUR is a nationally representative database, covering 15% of Scotland. The comprehensive linking of practice data to Scottish Cancer Registry data means there is a high coverage of cancer cases and a relatively long follow-up period of patients. Thorough cleaning and validation of the data has reduced the loss of prescription items due to transcription errors.
A further strength of the study is the incorporation of a wider range of risk factors into the models, including conditions relevant to individual cancer sites and smoking status, which have not been incorporated in any screening studies to date. The replication of well-known associations between medicines and cancer risk suggests that the study design and methodology are appropriate and hence that other signals which are less well documented are worthy of consideration in relation to their potential carcinogenic or chemo-preventative properties.
There are a number of limitations to this study. There are alternative ways in which prescriptions can be studied in relation to cancer risk other than by medicine, such as by Anatomical Therapeutic Chemical (ATC) code [47]. For topical medicines, there will be uncertainty as to how much medicine was administered, and absorption will vary due to factors such as the patient, the site of application, the formulation, the agent and the medical condition [48].
For our analyses, we grouped cancers together by cancer site as has been done in some other screening studies [12][13][14][15][16][18]. This is useful in giving an overview by cancer site; however, histological subtypes of cancer may each vary in relation to their causal relationship with medicine, e.g. oesophageal adenocarcinoma and squamous cell carcinoma have different aetiologies and risk factors [28]. Smoking data were based upon primary care records, which have been shown to be reasonably accurate [49], but there remains the possibility of misclassification of smoking status. We did not have access to detailed smoking data, such as the quantity of cigarettes a patient smoked or the length of time they were a smoker. Where changes in the odds ratios after additionally adjusting for smoking are not as expected, smoking may possibly act as a proxy for other characteristics of an unhealthy lifestyle, such as lack of exercise or stress [50].
There are also a number of limitations to the statistical analyses. The large number of medicines studied within each cancer site increases the probability of type I error and undoubtedly some of the signals identified are false positives. Although we used a 1% significance level, we did not apply a more stringent method to control for multiple testing, such as the Bonferroni correction [51] or false discovery rate control [52], as these would have reduced the likelihood of identifying true associations and because our analyses were exploratory in nature. This is similar to previous screening studies that have not applied any corrections for multiple testing [17] and consistent with arguments against adjusting for multiple testing in general [53]. Due to the number of medicine-cancer associations investigated, it was not possible to create a set of bespoke confounders for each of the associations investigated nor to undertake more advanced handling of missing data. The use of median splits to categorise users as low users or high users can produce spurious results [54]; however, dose-response relationships are not necessarily linear and indeed these may often be non-linear [55]. Finally, it is inevitable that some of our analyses will be underpowered.
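To give a sense of how much more stringent the corrections mentioned above would be, the following sketch computes a Bonferroni-adjusted threshold and applies the Benjamini-Hochberg step-up procedure to a list of p values. It is illustrative only: these corrections were not applied in the study, the function names are hypothetical, and treating all 5622 associations as a single family of tests is an assumption.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / n_tests


def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans marking hypotheses rejected at false discovery rate q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= q * k / m.
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below that cut-off.
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[i] = True
    return rejected
```

Under these assumptions, `bonferroni_threshold(0.01, 5622)` gives a per-test threshold of roughly 1.8 × 10⁻⁶, compared with the 0.01 used in the signal definition.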

Implications for policy and research
Results from our study show that the vast majority of prescribed medicines are not associated with an increase in the risk of cancer. This should provide reassurance to both patients and clinicians. However, given the increasing volume and consumption of medicines, the identification of medicines with cancer-limiting or cancer-increasing potential remains a global priority.
We recommend that researchers with expertise in specific cancers and/or medications examine the individual signals we have identified to prioritise those worthy of further investigation, in preclinical studies and/or other prescribing databases. Medicines which are more likely to be prescribed long-term and/or prescribed to a greater number of people could be prioritised. We think more detailed analyses of specific medicine-cancer associations identified within PCCIUR data are of value. These analyses could include additional relevant confounding factors not included in this paper, consideration of defined daily doses (DDDs) (where these are available) and more sophisticated approaches to analysing and addressing missing data [56]. Finally, additional screening studies should be conducted to identify further signals and to validate our findings.
The medicine-cancer associations identified in our study require replication elsewhere. Should these associations be replicated, the current licensing and use of medicines not previously known to increase the risk of cancer may require reconsideration. Medicines with chemo-preventative properties may warrant further study in clinical trials with a view to repurposing.

Conclusions
This screening study has examined associations between medicine use and cancer risk in a sample of Scottish patients. The majority of medications are not associated with an altered risk of common cancers. There are novel candidate medicines which may have chemopreventative or carcinogenic properties. Further analyses of such medicines are warranted.