Study design and participants
UK Biobank is a prospective population-based cohort study of 502,616 participants enrolled at 22 assessment centres across England, Scotland and Wales between 2006 and 2010 (5% response rate). Individuals were invited to participate on a voluntary basis if they lived within 25 miles of a UK Biobank assessment centre and were registered with a General Practitioner; all participants gave informed consent for data provision and linkage. UK Biobank has full ethical approval from the NHS National Research Ethics Service (16/NW/0274). Detailed information on alcohol consumption patterns and on sociodemographic, lifestyle and medical characteristics was collected from all participants recruited to the study.
Information on alcohol consumption
Participants completed a touchscreen questionnaire reporting their frequency of alcohol intake, the average amount and type of alcoholic beverage consumed and whether they consumed alcohol with food. Participants who abstained from alcohol (for various reasons; see https://biobank.ctsu.ox.ac.uk/crystal/field.cgi?id=3859) and those with missing values were excluded from the analysis. Participants who reported drinking alcohol infrequently (e.g. on special occasions or one to three times a month) were also excluded: although they did report their average amount of alcohol consumed, the purpose of this analysis was to study the health risks associated with different drinking patterns among regular alcohol drinkers. Average weekly intake of red wine, spirits, beer plus cider, champagne plus white wine, fortified wine and other alcoholic drinks was reported at the time of study recruitment. Using this information, we calculated the total average weekly units of alcohol. If red wine accounted for more than 50% of the total weekly units consumed by a participant on average, that participant was labelled a “red wine drinker”. Using similar definitions, type of beverage was classified into five categories: red wine, beer/cider, spirits, white wine/fortified wine/champagne and mixed. Participants were asked whether they usually drink alcohol with food and, based on their answers, were classified as yes, no or mixed. The frequency of alcohol intake over the week was divided into three categories: daily or almost daily, three to four times a week and once or twice a week. Average weekly alcohol units were used as a continuous variable and were also divided into the following categories for sensitivity analysis: 1–14 units (low risk), 15–35 units in females and 15–50 units in males (increasing risk) and > 35 units in females and > 50 units in males (higher risk), adapted from the latest health survey in England and from information incorporated in NICE guidelines [13].
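The classification rules above can be sketched in a few lines of Python. This is an illustrative sketch only, not the study's code (the analysis itself was run in R). The item keys, the function names, the treatment of "other" drinks as counting towards the total only, and the handling of non-integer values between 14 and 15 units are all assumptions made for the example; the conversion from reported drinks to units is not shown because the text does not give it.

```python
from typing import Dict, Optional

# Hypothetical questionnaire item keys mapped to the beverage groups in the text.
# Assumption: "other" drinks add to the weekly total but cannot define a group.
GROUPS: Dict[str, Optional[str]] = {
    "red_wine": "red wine",
    "beer_cider": "beer/cider",
    "spirits": "spirits",
    "white_wine_champagne": "white wine/fortified wine/champagne",
    "fortified_wine": "white wine/fortified wine/champagne",
    "other": None,
}


def total_units(units: Dict[str, float]) -> float:
    """Total average weekly units across all reported drink types."""
    return sum(units.values())


def beverage_type(units: Dict[str, float]) -> str:
    """Label a participant by the group supplying more than 50% of weekly units."""
    total = total_units(units)
    if total <= 0:
        raise ValueError("regular drinkers are expected to report a positive total")
    grouped: Dict[str, float] = {}
    for item, value in units.items():
        label = GROUPS.get(item)
        if label is not None:
            grouped[label] = grouped.get(label, 0.0) + value
    for label, value in grouped.items():
        if value > 0.5 * total:
            return label
    return "mixed"  # no single group exceeds half of the total


def risk_category(weekly_units: float, sex: str) -> str:
    """Sex-specific bands: 1-14 low, 15-35 (F) / 15-50 (M) increasing, above that higher."""
    if sex not in ("female", "male"):
        raise ValueError("sex must be 'female' or 'male'")
    upper = 35 if sex == "female" else 50
    if weekly_units <= 14:  # assumption: values in (14, 15) fall in the next band
        return "low risk"
    if weekly_units <= upper:
        return "increasing risk"
    return "higher risk"
```

For example, a participant reporting 8 units of red wine and 4 units of beer or cider would be labelled "red wine" (8 of 12 units), whereas an even 5/5 split would be labelled "mixed".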
Demographics, lifestyle, biomarkers and long-term conditions (LTCs) information
Socioeconomic status was classified using the Townsend score, a measure of deprivation in the UK [14]. The score was calculated from the participant’s home postcode, based on the preceding national census output areas; a higher score implies a higher level of socioeconomic deprivation. Townsend score was divided into quintiles. Smoking status was divided into three categories: non-smokers, previous smokers and current smokers. Physical activity was self-reported and classified as: none (no physical activity in the last 4 weeks), low (light activity only in the last 4 weeks), medium (heavy walking for pleasure and/or other exercises in the last 4 weeks) and high (strenuous sports in the last 4 weeks) [15, 16]. Body mass index (BMI), calculated from anthropometric measurements at the baseline assessment, was classified as per the WHO classification into < 18.5, 18.5–24.9, 25–29.9, 30–34.9, 35–39.9 and ≥ 40 kg/m² [17]. Systolic blood pressure was recorded at baseline as two readings from an automated machine and classified into < 120, 120–139, 140–159 and ≥ 160 mmHg [18]. Total cholesterol levels were measured at baseline and categorised into ≥ 5.0 mmol/L and < 5.0 mmol/L [19]. C-reactive protein levels (in mg/L) and gamma glutamyltransferase levels (in U/L) were measured at baseline and used as continuous variables. Self-rated health was classified by participants at baseline as excellent, good, fair or poor. The physical and mental health conditions (including diabetes and hypertension) self-reported by participants were organised into a list of 42 long-term conditions (LTCs) based on our previously published literature on multimorbidity (see supplementary Table S1) [20, 21]. The number of LTCs was classified into 0 LTCs, 1 LTC, 2 LTCs, 3 LTCs and ≥ 4 LTCs.
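The banding of continuous measurements described above could be implemented along the following lines. This Python sketch is illustrative and not the study's code; the function and constant names are hypothetical. It assumes that a value lying exactly on a cut-point belongs to the upper band (so 160 mmHg falls in the top category), and it takes a single systolic value as input because the text does not say how the two readings were combined.

```python
from bisect import bisect_right
from typing import List, Sequence

# Lower bounds of every band except the first, taken from the text.
BMI_CUTS: List[float] = [18.5, 25.0, 30.0, 35.0, 40.0]
BMI_LABELS: List[str] = ["< 18.5", "18.5–24.9", "25–29.9", "30–34.9", "35–39.9", "≥ 40"]

SBP_CUTS: List[float] = [120, 140, 160]
SBP_LABELS: List[str] = ["< 120", "120–139", "140–159", "≥ 160"]


def categorise(value: float, cuts: Sequence[float], labels: Sequence[str]) -> str:
    """Return the label of the band containing value.

    bisect_right counts the cut-points that are <= value, so a value equal
    to a cut-point is placed in the band that starts at that cut-point.
    """
    return labels[bisect_right(cuts, value)]


def ltc_category(count: int) -> str:
    """Collapse a long-term condition count into the five groups used in the text."""
    if count < 0:
        raise ValueError("count cannot be negative")
    if count >= 4:
        return "≥ 4 LTCs"
    return "1 LTC" if count == 1 else f"{count} LTCs"
```

With these definitions, `categorise(27.3, BMI_CUTS, BMI_LABELS)` selects the 25–29.9 kg/m² band and `ltc_category(5)` returns the top group.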
Clinical outcomes
The baseline assessment centre data were linked to national mortality, cancer and hospital episode statistics records by UK Biobank data analysts. The six outcomes studied were all-cause mortality, major adverse cardiovascular events (MACE: stroke, myocardial infarction (MI) or vascular death), liver cirrhosis, external causes of injuries/accidents, incidence of all-cause cancer and incidence of alcohol-related cancers (colon, rectum, breast, liver, oesophagus and larynx) [22]. Participants with a previous history of stroke, MI, liver cirrhosis or any cancer were removed from the analysis to avoid reverse causality. All-cause mortality, stroke, MI and cancer incidence events were reported by the UK Biobank data analysis team through data linkage. We used ICD-10 classifications to define vascular deaths (ICD-10 codes “I00-I78”, “G45” and “G46” as primary cause of death), liver cirrhosis (hospitalisation events with primary diagnostic ICD-10 codes “K70” and “K74”) and external causes of injuries/accidents (hospitalisation events with primary diagnostic ICD-10 codes beginning with “W”, “X”, “V” and “Y0”) [23]. The follow-up period ended between November 2015 and January 2016, depending on the assessment centre. The median length of follow-up was 9 years (interquartile range 8.3–9.5 years).
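The ICD-10 definitions above amount to simple prefix and range checks on the primary code. The following Python sketch shows one way to express them; it is not the study's code, the function names are hypothetical, and it assumes that “I00-I78” means every three-character category from I00 to I78 inclusive and that codes may arrive with or without a decimal point.

```python
def _normalise(code: str) -> str:
    """Upper-case an ICD-10 code and drop the decimal point, e.g. 'i21.9' -> 'I219'."""
    return code.strip().upper().replace(".", "")


def is_vascular_death(primary_cause: str) -> bool:
    """True for I00-I78 (assumed inclusive category range), G45 or G46."""
    code = _normalise(primary_cause)
    if code.startswith(("G45", "G46")):
        return True
    category = code[1:3]
    return code[:1] == "I" and len(category) == 2 and category.isdigit() and int(category) <= 78


def is_liver_cirrhosis(primary_diagnosis: str) -> bool:
    """True for primary diagnostic codes K70 or K74."""
    return _normalise(primary_diagnosis).startswith(("K70", "K74"))


def is_external_injury(primary_diagnosis: str) -> bool:
    """True for primary diagnostic codes beginning with W, X, V or Y0."""
    return _normalise(primary_diagnosis).startswith(("W", "X", "V", "Y0"))
```

Under these assumptions, a death coded I21.9 counts as vascular, while I80 does not; a hospitalisation coded Y09 counts as an external cause, while Y10 does not.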
Statistical analysis
The distribution of demographic and health-related behaviour characteristics, frequency of alcohol consumption, type of alcoholic beverage consumed and alcohol consumption with or without food was described across the three levels of average weekly alcohol units (low risk, increasing risk, higher risk), using mean and standard deviation for continuous variables and percentages for categorical variables. Six Cox proportional hazards regression models [24] were fitted, one for each of the six clinical outcomes under consideration (all-cause mortality, MACE, liver cirrhosis, injuries/accidents, all-cause cancer incidence and alcohol-related cancer incidence), using the final study sample after all exclusions. In each of these models, age was used as the underlying time variable [25]. Three patterns of alcohol consumption were used together as predictor variables: frequency of alcohol consumption, type of alcoholic beverage and alcohol consumption with food. Results were presented as hazard ratios (HR) with 95% confidence intervals (CI), adjusted for confounding variables (average weekly alcohol units as a continuous variable, sex, socioeconomic status (based on Townsend score), smoking, physical activity levels, BMI and number of LTCs). The models for MACE were additionally adjusted for the presence of diabetes and hypertension and for systolic blood pressure and total cholesterol values at baseline, as these have been recognised as cardiovascular risk factors by the WHO [26]. The total number of participants included in the survival analysis models varied according to the completeness of the putative confounding variables, and participants with missing data were excluded from regression modelling. In view of the large number of covariates, a global p value for heterogeneity was calculated for each regression model using the “globaltest” package [27].
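For reference, a Cox model with age as the time scale can be written as below. The notation is ours rather than the paper's: x_i collects the three alcohol consumption pattern variables, z_i the confounders, and the entry and exit ages bound the interval during which participant i is at risk. Handling delayed entry by left truncation at the age of recruitment is the usual approach for age-scale models and is assumed here; the text does not state it explicitly.

```latex
% Hazard for participant i at attained age a (assumed notation)
h_i(a) = h_0(a)\,\exp\!\left(\boldsymbol{\beta}^{\top}\mathbf{x}_i
         + \boldsymbol{\gamma}^{\top}\mathbf{z}_i\right),
\qquad a_i^{\mathrm{entry}} < a \le a_i^{\mathrm{exit}}

% Risk set at age a under left truncation
R(a) = \left\{\, j : a_j^{\mathrm{entry}} < a \le a_j^{\mathrm{exit}} \,\right\}

% Hazard ratio and 95% CI for the k-th pattern coefficient
\mathrm{HR}_k = \exp(\hat{\beta}_k), \qquad
\mathrm{CI}_{95\%} = \exp\!\left(\hat{\beta}_k \pm 1.96\,\mathrm{SE}(\hat{\beta}_k)\right)
```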
Marginal fractional polynomials [28] were used to visualise the relationship between average weekly alcohol units (continuous) and two outcomes of interest (all-cause mortality and MACE). In the next step, a set of predicted probabilities of each outcome (all-cause mortality and MACE at a minimum of 7 years of follow-up) was calculated from the Cox regression models described above, using multiple fractional polynomials for weekly alcohol units and marginal standardisation, in which predicted probabilities of the outcome were calculated for every observed confounder value (each category for categorical covariates and the mean value for continuous covariates) [29]. The results were visualised by plotting the 7-year predicted probability of all-cause mortality and of MACE against weekly alcohol units, with a separate sub-plot for each of the three patterns of alcohol consumption and sub-groups defined by the categories of that pattern [30].
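To make the prediction step concrete, the Python sketch below shows how a 7-year risk can be derived from a fitted Cox model and averaged over confounder profiles. It is a simplified illustration, not the study's code: the baseline survival, coefficient and profile values passed in are hypothetical inputs, only a first-degree fractional polynomial term is shown, and the weighted average over profiles is a generic form of marginal standardisation rather than the exact scheme used by the authors.

```python
import math
from typing import Iterable, Tuple

# Conventional set of candidate fractional polynomial powers (0 denotes log).
FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_term(x: float, power: float) -> float:
    """First-degree fractional polynomial transform of a positive value."""
    if x <= 0:
        raise ValueError("fractional polynomials require x > 0")
    return math.log(x) if power == 0 else x ** power


def predicted_risk(baseline_survival: float, linear_predictor: float) -> float:
    """Cox-model risk by time t: 1 - S0(t) ** exp(linear predictor)."""
    return 1.0 - baseline_survival ** math.exp(linear_predictor)


def standardised_risk(
    baseline_survival: float,
    units: float,
    fp_power: float,
    beta_units: float,
    profiles: Iterable[Tuple[float, float]],
) -> float:
    """Average predicted risk over confounder profiles.

    Each profile is (weight, confounder contribution to the linear predictor);
    all numbers supplied by the caller are hypothetical in this sketch.
    """
    profiles = list(profiles)
    total_weight = sum(weight for weight, _ in profiles)
    lp_units = beta_units * fp_term(units, fp_power)
    weighted = sum(
        weight * predicted_risk(baseline_survival, lp_units + lp)
        for weight, lp in profiles
    )
    return weighted / total_weight
```

Evaluating `standardised_risk` over a grid of weekly units, once per category of a consumption pattern, would give the curves for one sub-plot.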
Mediation analysis
The mediating effects of five variables on the relationship between alcohol consumption patterns and clinical outcomes (all-cause mortality and MACE) were examined: average weekly alcohol units (continuous), socioeconomic status using Townsend score (continuous), CRP levels (continuous), smoking status (categorical) and self-rated health (categorical). The outcomes of interest (all-cause mortality and MACE) were regressed on the primary exposure variable (alcohol consumption pattern) and all other covariates as per the main analysis. Each potential mediator listed above was then regressed on the primary exposure variable and all other covariates. The results of the outcome and mediator models were combined to estimate the average proportion of the effect that was mediated, and 95% confidence intervals were calculated using quasi-Bayesian estimates with 100 iterations. This analysis was performed using the “mediation” package [31].
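The quantity reported by this kind of analysis can be summarised as follows. The symbols are ours and follow the potential-outcomes notation commonly used for causal mediation analysis; they are not taken from the text. Here the average causal mediation effect (ACME) is the part of the exposure effect transmitted through the mediator and the average direct effect (ADE) is the remainder.

```latex
% Total effect decomposed into mediated and direct components (assumed notation)
\tau = \delta + \zeta,
\qquad \delta = \text{ACME}, \quad \zeta = \text{ADE}

% Proportion of the total effect that is mediated
\mathrm{PM} = \frac{\delta}{\tau} = \frac{\delta}{\delta + \zeta}
```

In the quasi-Bayesian approach, model parameters are repeatedly drawn from their approximate sampling distribution, the effects are recomputed for each draw, and the confidence interval is read from the percentiles of the resulting values.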
Sensitivity analyses
The above analysis was repeated using a different classification for type of alcoholic beverage, in which a participant was assigned to the drink type that contributed more of their total weekly alcohol units than any other drink type, rather than the drink type contributing > 50% of the total. In addition, the entire analysis was repeated (i) in sub-groups stratified by average weekly alcohol units (low risk, increasing risk, higher risk) and by sex (male and female), (ii) after excluding participants with poor self-rated health at baseline and (iii) after excluding the first 2 years of follow-up, the latter two to mitigate the impact of reverse causality.
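The alternative beverage classification can be illustrated with a short Python sketch. It is not the study's code; the function name is hypothetical, the input is assumed to be weekly units already summed by beverage group, and labelling ties for the largest share as "mixed" is an assumption because the text does not say how ties were handled.

```python
from typing import Dict


def dominant_beverage(units_by_group: Dict[str, float]) -> str:
    """Return the beverage group with the largest share of weekly units.

    Unlike the main definition, the leading group does not need to exceed
    50% of the total; it only needs to exceed every other group.
    """
    positive = {group: value for group, value in units_by_group.items() if value > 0}
    if not positive:
        raise ValueError("at least one beverage group must have positive units")
    top = max(positive.values())
    leaders = [group for group, value in positive.items() if value == top]
    # Assumption: an exact tie for first place is labelled "mixed".
    return leaders[0] if len(leaders) == 1 else "mixed"
```

A participant reporting 4 units of red wine, 3 of beer or cider and 3 of spirits would be "mixed" under the > 50% rule but "red wine" under this rule.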
Repeated measurement of alcohol consumption
A small number of participants, selected at random for an imaging study, self-reported their alcohol intake and consumption pattern again during the follow-up period. For these participants, average weekly alcohol units and alcohol consumption pattern were classified using the same methods described above. Changes in the amount and pattern of alcohol consumption were reported using percentages. To model the repeat measurements of alcohol consumption patterns against the outcomes of interest, extended Cox proportional hazards models were fitted, which allowed for time-varying exposure variables.
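Extended Cox models with time-varying exposures are usually fitted to data in counting-process form, where each participant contributes one row per interval during which the exposure is constant. The Python sketch below shows that restructuring step for a single repeat assessment. It is illustrative only: the function name and tuple layout are hypothetical, and the use of age as the interval scale simply mirrors the main analysis.

```python
from typing import List, Optional, Tuple

# (start_age, stop_age, exposure_pattern, event_indicator)
Interval = Tuple[float, float, str, int]


def split_follow_up(
    entry_age: float,
    exit_age: float,
    event: int,
    baseline_pattern: str,
    repeat_age: Optional[float] = None,
    repeat_pattern: Optional[str] = None,
) -> List[Interval]:
    """Split one participant's follow-up at the repeat assessment.

    Without a usable repeat measurement the participant contributes a single
    interval. Otherwise the baseline pattern applies up to the repeat age and
    the updated pattern afterwards; the event can only occur in the last interval.
    """
    has_repeat = (
        repeat_age is not None
        and repeat_pattern is not None
        and entry_age < repeat_age < exit_age
    )
    if not has_repeat:
        return [(entry_age, exit_age, baseline_pattern, event)]
    return [
        (entry_age, repeat_age, baseline_pattern, 0),
        (repeat_age, exit_age, repeat_pattern, event),
    ]
```

For instance, a participant recruited at age 55.0 who was reassessed at 60.5 and died at 63.2 would contribute the intervals (55.0, 60.5] with the baseline pattern and no event, and (60.5, 63.2] with the updated pattern and the event.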
All statistical analyses were conducted using R version 3.6.1.