The UK Biobank is a prospective population-based cohort study that recruited 502,414 participants aged 40–69 years at baseline. Between March 2006 and July 2010, participants were invited to one of 22 assessment centres across the UK. The participants were mostly White European (94.1%), with smaller proportions of Asian (2.3%) and Black (1.6%) participants and participants of other ethnic origins. Participants provided detailed baseline information on sociodemographic factors, lifestyle behaviours, and health-related history via a self-completed touchscreen questionnaire and a computer-assisted interview. Anthropometric measurements and biological samples were also obtained. The UK Biobank has full ethical approval from the National Health Service (NHS) National Research Ethics Service. All individuals participated voluntarily and provided written informed consent for participation and follow-up.
Inclusion and exclusion criteria
Of the 502,414 UK Biobank participants, 210,970 completed the online 24-h dietary recall questionnaire on at least one occasion. Of these, 47,472 participants had a history of a single cardiometabolic disease (coronary heart disease, hypertension, stroke, or diabetes) at baseline and were eligible for the analysis. Participants with missing data (n = 8830, of whom 7905 had missing data on physical activity) or with implausible total energy intake (n = 648; <500 or >3500 kcal/day for women, <800 or >4200 kcal/day for men) were excluded, as in a previous study, leaving 37,994 participants for analysis (Additional file 1: Fig. S1).
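The energy-intake screen and the participant flow described above can be sketched as follows. This is a minimal illustration rather than the study's code; the function name `plausible_energy` and the sex labels are assumptions introduced here.

```python
# Accepted total energy intake (kcal/day) by sex, as stated in the text.
ENERGY_BOUNDS = {"female": (500, 3500), "male": (800, 4200)}


def plausible_energy(sex: str, kcal_per_day: float) -> bool:
    """Return True if reported energy intake lies within the accepted range."""
    low, high = ENERGY_BOUNDS[sex]  # raises KeyError for an unknown label
    return low <= kcal_per_day <= high


# Participant flow reported in the text.
single_disease_at_baseline = 47_472
excluded_missing_data = 8_830
excluded_implausible_energy = 648
analysed = (
    single_disease_at_baseline
    - excluded_missing_data
    - excluded_implausible_energy
)
print(analysed)  # 37994
```

The subtraction reproduces the reported analytic sample of 37,994 participants.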
Assessment of beverage consumption
Dietary information was collected using a web-based dietary assessment tool (Oxford WebQ), which comprises a detailed set of questions on the intake of up to 206 types of foods and 32 types of beverages consumed during the previous 24 h. Beverage intake was assessed by asking participants how many glasses/cans/cartons/250 mL of SSBs (carbonated drinks, fruit drinks, squash, and cordial), ASBs (low-calorie and diet drinks), and pure fruit/vegetable juices they had consumed the previous day. The Oxford WebQ has been validated against a traditional interviewer-based 24-h dietary recall as a suitable method for measuring dietary intake in large population studies [35, 36]. Participants were invited to complete the Oxford WebQ at baseline and on four separate follow-up occasions between February 2011 and June 2012. Of the 37,994 participants in our analysis, 39.9% completed the questionnaire once, 22.6% twice, 20.2% three times, 14.5% four times, and 2.8% five times. For participants who completed more than one questionnaire, we used the mean consumption across questionnaires in the main analysis and the consumption from the baseline questionnaire in the sensitivity analysis. The Pearson correlation coefficient between mean beverage consumption and beverage consumption from the baseline questionnaire was 0.874 for SSBs, 0.898 for ASBs, and 0.876 for pure fruit/vegetable juices.
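The two exposure definitions (mean across completed questionnaires for the main analysis, baseline value for the sensitivity analysis) can be illustrated with a short sketch. The function names are hypothetical, and the code assumes each participant's recalls are stored in chronological order with the baseline questionnaire first.

```python
from statistics import mean


def mean_intake(recalls):
    """Mean servings/day across all completed 24-h recalls (main analysis)."""
    if not recalls:
        raise ValueError("at least one completed recall is required")
    return mean(recalls)


def baseline_intake(recalls):
    """Servings/day from the first recall (sensitivity analysis).

    Assumes the list is chronological and starts with the baseline recall.
    """
    if not recalls:
        raise ValueError("at least one completed recall is required")
    return recalls[0]


ssb_recalls = [1.0, 0.0, 2.0]  # illustrative values, not study data
print(mean_intake(ssb_recalls), baseline_intake(ssb_recalls))
```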
Assessment of covariates
Information on sociodemographic factors and lifestyle behaviours was ascertained at baseline using a self-reported questionnaire, which covered age, sex, ethnicity, alcohol consumption, fruit and vegetable consumption, red meat consumption, sedentary behaviour (using computers, driving, and watching television), smoking status, and drug use (insulin, antihypertensive drugs, lipid-lowering drugs, and aspirin). Socioeconomic status was assessed using the area-based Townsend deprivation index, which is derived from census data on employment, housing, car ownership, and household overcrowding for the postcode of residence. Physical activity was assessed at recruitment using a questionnaire based on the International Physical Activity Questionnaire (IPAQ), which records the duration and frequency of activities of different intensities. Overall energy expenditure from physical activity, expressed in metabolic equivalent hours per week (MET-h/week), was calculated by multiplying the weekly duration of light, moderate, and vigorous physical activity by weights of 2.5, 4, and 8, respectively, and summing the products. Body mass index (BMI) was calculated as weight (kg) divided by height squared (m²). Total sugar, fat, and energy intake for each participant were provided by the UK Biobank and were calculated by multiplying the quantity consumed by the nutrient composition of each food or beverage, as taken from the UK food composition database, McCance and Widdowson’s The Composition of Foods and its supplements. More detailed information on all covariates is provided in Additional file 2: Table S1.
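The two derived covariates above reduce to simple arithmetic, sketched below. The function and key names are assumptions for illustration; the weights are those stated in the text, and durations are taken to be in hours per week.

```python
# MET weights per activity intensity, as stated in the text.
MET_WEIGHTS = {"light": 2.5, "moderate": 4.0, "vigorous": 8.0}


def met_hours_per_week(hours_per_week: dict) -> float:
    """Sum of weekly hours at each intensity multiplied by its MET weight."""
    return sum(
        weight * hours_per_week.get(intensity, 0.0)
        for intensity, weight in MET_WEIGHTS.items()
    )


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


# Illustrative values, not study data.
activity = {"light": 2.0, "moderate": 1.0, "vigorous": 0.5}
print(met_hours_per_week(activity))  # 2.5*2 + 4*1 + 8*0.5 = 13.0
print(round(bmi(81.0, 1.8), 1))      # 25.0
```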
Ascertainment of outcomes
The main outcome was CMM, defined as progression to at least two of the following cardiometabolic diseases: coronary heart disease, hypertension, stroke, and diabetes. Participants were regarded as having a cardiometabolic disease if they reported a diagnosis, cardiometabolic disease medication, or relevant surgery history on the touchscreen questionnaire or in the verbal interview, or if their electronic health records were consistent with a diagnosis of cardiometabolic disease. Electronic in-patient data were accessed by linkage to National Health Service (NHS) Digital for England, the Information and Statistics Division for Scotland, and the Secure Anonymised Information Linkage for Wales. The specific diagnostic criteria, which follow a previous study, are shown in Additional file 3: Table S3. For each participant with a cardiometabolic disease, the date of diagnosis was compared with the date of recruitment to determine whether onset occurred before recruitment or during follow-up. The date of onset of CMM was defined as the earliest date of the second cardiometabolic disease during the follow-up period, ascertained via any of the data sources. Deaths were ascertained via linkage to death register data from NHS Digital for England and Wales and from the NHS Central Register, National Records of Scotland, for Scotland.
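The rule for dating CMM onset can be sketched as below. This is an illustration under assumptions, not the study's procedure: the function name is hypothetical, and each disease is assumed to have already been reduced to its earliest diagnosis date across all data sources (or `None` if never diagnosed).

```python
from datetime import date


def cmm_onset(diagnosis_dates: dict):
    """Date of the second cardiometabolic disease, or None if fewer than two.

    `diagnosis_dates` maps each disease to its earliest diagnosis date from
    any data source, or to None when the disease was never recorded.
    """
    dated = sorted(d for d in diagnosis_dates.values() if d is not None)
    return dated[1] if len(dated) >= 2 else None


# Illustrative participant: hypertension before recruitment, then diabetes.
example = {
    "hypertension": date(2004, 5, 1),
    "diabetes": date(2015, 3, 2),
    "stroke": date(2018, 1, 1),
    "coronary_heart_disease": None,
}
print(cmm_onset(example))  # 2015-03-02
```

Because every analysed participant had exactly one disease at baseline, the second-earliest date necessarily falls within follow-up.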
Study design and statistical analyses
In this population-based prospective cohort study, participants were categorised into three groups for each of SSBs, ASBs, and pure fruit/vegetable juices according to their consumption: 0, >0–1, and >1 servings per day. Baseline characteristics are expressed as mean (standard deviation) or number (percentage) and were compared among beverage consumption groups using Student’s t-tests for continuous variables and chi-square tests for categorical variables. The Fine and Gray competing risk model was used to estimate the cumulative incidence of CMM and death. We also calculated hazard ratios (HRs) and 95% confidence intervals (95% CIs) for the risk of CMM associated with beverage consumption in patients with a single cardiometabolic disease using the multivariable Fine and Gray competing risk model. Follow-up began when participants completed their last 24-h dietary questionnaire and continued until the first occurrence of CMM, death, or the end of the study period (June 2021), whichever came first. Linear trends were tested by assigning the median intake of each category to its members and entering this value as a continuous variable in the multivariable competing risk model. To assess potential differences in the associations between beverage consumption and CMM risk in patients with specific cardiometabolic diseases, we conducted separate analyses in patients with hypertension, patients with diabetes, and patients with cardiovascular disease (stroke or coronary heart disease). Because of the small sample sizes, participants who consumed >0–1 serving per day and those who consumed >1 serving per day were combined (>0 servings per day) in the analyses for cardiovascular disease and diabetes.
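The exposure grouping, the median-based trend variable, and the follow-up rule can be sketched as follows. All names are hypothetical. The status coding (1 = CMM, 2 = death before CMM, 0 = censored) and the tie-break order on identical dates are assumptions of this sketch, and the end-of-study date is left as a parameter because the text gives only the month.

```python
from datetime import date
from statistics import median


def intake_category(servings_per_day: float) -> str:
    """Assign one of the three intake groups: 0, >0-1, or >1 servings/day."""
    if servings_per_day < 0:
        raise ValueError("intake cannot be negative")
    if servings_per_day == 0:
        return "0"
    return ">0-1" if servings_per_day <= 1 else ">1"


def trend_scores(intakes):
    """Median intake within each category, used as a continuous trend score."""
    groups = {}
    for value in intakes:
        groups.setdefault(intake_category(value), []).append(value)
    return {category: median(values) for category, values in groups.items()}


def follow_up(start, cmm_date, death_date, study_end):
    """Return (years, status): 1 = CMM, 2 = death before CMM, 0 = censored."""
    # Middle element is a tie-break priority (assumption): CMM, death, censoring.
    candidates = [(study_end, 2, 0)]
    if cmm_date is not None:
        candidates.append((cmm_date, 0, 1))
    if death_date is not None:
        candidates.append((death_date, 1, 2))
    end, _, status = min(candidates)
    return (end - start).days / 365.25, status


# Illustrative values; the end date below is a placeholder within June 2021.
print(trend_scores([0, 0, 0.5, 1, 1, 2, 3, 4]))
print(follow_up(date(2012, 1, 1), date(2015, 1, 1), None, date(2021, 6, 30)))
```

The resulting time and status pair is the input a Fine and Gray model expects, with death treated as the competing event.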
Four multivariable-adjusted models were fitted in the analyses for patients with any single cardiometabolic disease: model 0 was adjusted for sociodemographic factors, including age, sex, ethnicity, and deprivation index; model 1 was additionally adjusted for BMI and lifestyle behaviours, including smoking status (current smokers or non-current smokers), alcohol consumption (over three times a week or not), physical activity, and sedentary time; model 2 was additionally adjusted for dietary factors, including total sugar, energy, fat, vegetable and fruit intake, fish intake, and red meat intake; and model 3 was additionally adjusted for drug use, including insulin, antihypertensive drugs, lipid-lowering drugs, and aspirin use. Model 3 was fitted in the analysis for patients with specific cardiometabolic diseases.
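The nested structure of models 0 to 3 can be made explicit by accumulating covariate blocks, as in the sketch below. The variable names are placeholders chosen for illustration and are not UK Biobank field names.

```python
# Covariates added at each step, following the description in the text.
MODEL_BLOCKS = [
    ("model 0", ["age", "sex", "ethnicity", "deprivation_index"]),
    ("model 1", ["bmi", "smoking_status", "alcohol_consumption",
                 "physical_activity", "sedentary_time"]),
    ("model 2", ["total_sugar", "energy", "fat", "vegetable_and_fruit_intake",
                 "fish_intake", "red_meat_intake"]),
    ("model 3", ["insulin", "antihypertensive_drugs",
                 "lipid_lowering_drugs", "aspirin"]),
]


def cumulative_covariates(blocks):
    """Map each model name to the full covariate list it adjusts for."""
    result, running = {}, []
    for name, added in blocks:
        running = running + added  # new list, so earlier models are unchanged
        result[name] = running
    return result


models = cumulative_covariates(MODEL_BLOCKS)
for name, covariates in models.items():
    print(name, len(covariates))
```

Each model therefore adjusts for everything in the previous model plus its own block.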
Eight sensitivity analyses were conducted to test the robustness of our results. First, we excluded the first 2 years of follow-up to minimize reverse causality. Second, we recalculated the risks after excluding participants who reported, in the baseline touchscreen questionnaire, having lost weight compared with 1 year before recruitment, to reduce the influence of dieting. Third, we re-ran the models using beverage intake from the baseline questionnaire only, instead of the mean intake across all completed questionnaires, to better approximate baseline status. Fourth, to test for residual confounding by alcohol consumption, we re-ran the models replacing alcohol consumption frequency (drinking ≥3 times per week or not) with alcohol consumption units (≥14 units per week or not). Fifth, we re-ran the models using a different approach to assessing physical activity: participants were divided into two groups according to whether they met the 2017 UK Physical Activity Guidelines of 150 min of walking or moderate activity per week or 75 min of vigorous activity per week. Sixth, to investigate whether the associations of beverage consumption were partially mediated by obesity, we re-ran the models without adjusting for BMI. Seventh, to minimize confounding by consumption of other beverages, we re-ran the analysis with mutual adjustment for the three types of beverages; that is, SSBs were adjusted for ASB and pure fruit/vegetable juice intake, and vice versa. We also conducted stratified analyses by several covariates selected prior to data analysis, as in a previous study, including age, sex, ethnicity, deprivation index, smoking status, alcohol consumption, physical activity, sedentary time, and BMI. Lastly, we used a multi-state model (MSM), a classical and reliable approach for multimorbidity studies, to recalculate the associations between the three types of beverages and cardiometabolic outcomes.
The MSM approach allowed for simultaneous estimation of the role of risk factors in the transitions from healthy to single cardiometabolic disease and from single cardiometabolic disease to CMM [39, 40]. Two-tailed P < 0.05 was considered statistically significant. The Benjamini-Hochberg correction was used for P-values in the main analysis and P for interaction in the stratified analysis to account for multiple comparisons. All statistical analyses were performed using R software (version 4.1.0).
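For reference, the Benjamini-Hochberg adjustment mentioned above can be written in a few lines. This is a generic sketch of the standard step-up procedure, not the study's code (the analyses were run in R); the function name is an assumption.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Illustrative p-values, not study results.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Each raw p-value is scaled by the number of tests divided by its rank, and an adjusted value is compared with the 0.05 threshold in the same way as an unadjusted one.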