Study design and participants
UK Biobank is a population-based prospective cohort study that recruited about 500,000 males and females aged 37 to 73 years from 2006 to 2010 [15]. The study was approved by the North West Multi-centre Research Ethics Committee (Research Ethics Committee reference: 16/NW/0274). All participants attended one of 22 assessment centers in England, Scotland, or Wales, which covered a variety of settings to provide socio-economic and ethnic heterogeneity and an urban-rural mix. This ensures a wide distribution across all exposure ranges, so that generalizable associations between baseline characteristics and health outcomes can be detected. After giving informed consent, participants completed a touchscreen questionnaire, a verbal interview, and a physical examination and provided biological samples.
Measurement of sunlight exposure
Participants reported how many hours they spent outdoors in daylight on a typical day in summer and on a typical day in winter. Data were collected at the baseline assessment (2006–2010). These items correspond to fields 1050 and 1060 in the UK Biobank database. Participants entered an integer number of hours on a touchscreen pad or selected one of the predefined options “less than an hour a day,” “do not know,” or “prefer not to answer.” Participants who spent a lot of time outdoors were asked to provide their average daily time. Initial data preparation included removing participants who reported “do not know” or “prefer not to answer” (n=33856) and recoding “less than an hour a day” as 0 (n=19865). Considering the effective daytime duration in the UK, extreme values larger than the typical day length in summer (16 h, n=253) and winter (8 h, n=5472) were excluded. To obtain a single measure of outdoor light exposure time, the durations reported for winter and summer were averaged within each participant. We excluded from the analysis participants whose average time was more than two standard deviations from the mean. The durations of outdoor light exposure reported for summer and winter were positively correlated (Pearson’s r=0.541, P<0.001), indicating that participants who spent more time in outdoor light in summer also tended to spend more time outdoors in winter.
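The preparation steps above can be illustrated with a minimal Python sketch. It is not the code used in the study (the analyses were run in R). The answer labels, the function names, and the reading of the outlier rule as "more than two standard deviations from the mean" are assumptions made for illustration, and the actual UK Biobank codings for fields 1050 and 1060 should be checked against the data dictionary.

```python
from statistics import mean, stdev

# Hypothetical labels for the non-numeric answers (assumed, not the real field codings).
LESS_THAN_HOUR = "less than an hour a day"
NON_RESPONSES = {"do not know", "prefer not to answer"}

# Typical day lengths used in the text as upper limits.
SUMMER_MAX_H = 16
WINTER_MAX_H = 8


def clean_hours(answer, max_hours):
    """Return the reported hours as a float, or None if the answer is excluded."""
    if answer in NON_RESPONSES:
        return None  # "do not know" / "prefer not to answer" are removed
    if answer == LESS_THAN_HOUR:
        return 0.0  # recoded as 0 hours
    hours = float(answer)
    if hours > max_hours:
        return None  # longer than the typical day length
    return hours


def average_exposure(records):
    """Average summer and winter hours within each participant.

    records: iterable of (participant_id, summer_answer, winter_answer).
    Participants with an excluded answer in either season are dropped.
    """
    averaged = {}
    for pid, summer, winter in records:
        s = clean_hours(summer, SUMMER_MAX_H)
        w = clean_hours(winter, WINTER_MAX_H)
        if s is None or w is None:
            continue
        averaged[pid] = (s + w) / 2
    return averaged


def drop_outliers(averaged, n_sd=2.0):
    """Keep participants within n_sd sample standard deviations of the mean (assumed rule)."""
    values = list(averaged.values())
    m, sd = mean(values), stdev(values)
    return {pid: v for pid, v in averaged.items() if abs(v - m) <= n_sd * sd}
```

For example, `average_exposure([("a", 4, 2), ("b", "less than an hour a day", 1)])` gives one averaged value per retained participant, which can then be passed to `drop_outliers`.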
Covariates
The selection of covariates was based on the following criteria: demographic variables, exposure-related variables, and confounding variables associated with AD [16, 17]. The following variables were selected: age, sex, education, skin color, use of sun/UV protection, employment status, sleep duration, air pollution, fracture history, vitamin D supplement use, hearing loss, smoking status, alcohol use, cardiovascular disease (CVD), total physical activity (TPA), and body mass index (BMI). The response of the skin to UV depends on differences in skin color, which arise from the size, volume, and distribution of melanocyte pigment in keratinocytes rather than from differences in the number of melanocytes between races [18, 19]. Thus, we chose skin color rather than race as a covariate. Information on education (with or without a college or university degree), skin color (white or colored), use of sun/UV protection (never/rarely, yes, or do not go out in the sunshine), employment status (yes or no), fracture history in the past 5 years (yes or no), vitamin D supplement use (yes or no), hearing loss (yes or no), and smoking status and alcohol consumption (never, former, or current) was collected from the touchscreen questionnaire. Sleep duration was categorized as short (<7 h per night), normal (7 h per night), or long (>7 h per night) [20]. Because high air pollution concentrations induce smog, which attenuates sunlight exposure, we included PM2.5 as one of the adjustment factors. CVD, including heart attack, angina, stroke, and hypertension, was ascertained from both the touchscreen questionnaire and the verbal interview. TPA was measured with the revised International Physical Activity Questionnaire (IPAQ), which covers the frequency and duration of walking (Fields 864 and 874), moderate activity (Fields 884 and 894), and vigorous activity (Fields 904 and 914) on a typical day/week over the past 4 weeks [21].
BMI was calculated as weight (kg) divided by the square of standing height (m), both measured during the medical examination.
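Two of the derived covariates, BMI and the sleep duration category, follow simple rules that can be sketched in Python. The function names are hypothetical and chosen for illustration only. The cut-offs are those stated in the text.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms divided by squared height in meters."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def sleep_category(hours_per_night):
    """Classify sleep duration as short (<7 h), normal (7 h), or long (>7 h)."""
    if hours_per_night < 7:
        return "short"
    if hours_per_night == 7:
        return "normal"
    return "long"
```

A participant weighing 70 kg with a standing height of 1.75 m would have `bmi(70, 1.75)`, about 22.9 kg/m².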
Measurement of outcome
Dementia syndromes were identified from International Classification of Diseases, 9th and 10th revision (ICD-9 and ICD-10) codes in hospital inpatient admission data provided by the UK Hospital Episode Statistics. Corresponding start dates and annual review dates were obtained from the Scottish Morbidity Records and the Patient Episode Database. Participants were followed up until the earliest diagnosis of dementia, the date of death, the date of the last data collection by general practitioners, or the date of the last hospitalization, whichever occurred first. All-cause dementia was defined by ICD-9 codes 290, 290.4, 291.2, 294.1, 331.0-331.2, and 331.5 and ICD-10 codes A81.0, F00, F01, F02, F03, F05.1, F10.6, G31.0, G31.1, G31.8, G30, and I67.3 (Additional file 1: Table S1). Dementia diagnoses were also retrieved from primary care data using Read codes (version 2 [Read v2] and version 3 [CTV3 or Read v3]) [22]. We excluded participants who reported “dementia, Alzheimer’s disease, or cognitive impairment” at baseline to reduce the possibility of including prevalent cases in our analyses (n=34488). Participants with missing dementia outcome data were also excluded (n=34572). The missing rate of the cohort was 6.88%.
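The outcome definition and the end of follow-up can be sketched as follows. This is an illustrative Python sketch rather than the study code. The function names are hypothetical, matching ICD-10 codes by prefix after removing the dot is an assumption, and converting days to years with 365.25 is a common convention that the text does not specify.

```python
from datetime import date

# ICD-10 codes for all-cause dementia as listed in the text, with dots removed.
ICD10_DEMENTIA = (
    "A810", "F00", "F01", "F02", "F03", "F051", "F106",
    "G310", "G311", "G318", "G30", "I673",
)


def is_dementia_icd10(code):
    """True if the code starts with one of the listed codes (prefix matching is assumed)."""
    normalized = code.replace(".", "").upper()
    return normalized.startswith(ICD10_DEMENTIA)


def follow_up_end(dementia_date, death_date, last_gp_date, last_hospital_date):
    """Earliest available date among the four end points named in the text."""
    candidates = [
        d
        for d in (dementia_date, death_date, last_gp_date, last_hospital_date)
        if d is not None
    ]
    if not candidates:
        raise ValueError("no follow-up date available")
    return min(candidates)


def follow_up_years(baseline, end):
    """Follow-up time in years, using 365.25 days per year (assumed convention)."""
    return (end - baseline).days / 365.25
```

For a participant without a dementia diagnosis, `follow_up_end(None, date(2018, 5, 1), date(2017, 3, 1), date(2019, 1, 1))` returns the earliest of the remaining dates.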
Statistical analyses
Normally distributed continuous variables were compared between the incident dementia group and the group without incident dementia using the t test. Otherwise, the Mann-Whitney U test was used. Categorical variables were presented as numbers (percentages) and compared with the chi-square test. The dose-response relationship was flexibly modeled with a restricted cubic spline (RCS) with five knots to explore the potential nonlinear association between sunlight exposure and the risk of all-cause dementia [23]. With the change points as the reference, univariate and multivariate Cox proportional hazards regression models were fitted to estimate hazard ratios (HRs) and 95% confidence intervals (CIs) for the association of sunlight exposure with dementia outcomes.
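To make the spline term concrete, the sketch below builds a restricted cubic spline basis for a single exposure value. It assumes the truncated-power parameterization commonly attributed to Harrell, scaled by the squared distance between the outer knots. The text does not state which parameterization or knot locations were used, so the function name and any knots passed to it are illustrative only. With k knots the basis has k-1 terms (four terms for five knots), which would enter the Cox model as covariates.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline basis for one value x.

    Returns a list of len(knots) - 1 terms: the linear term followed by
    len(knots) - 2 nonlinear terms that are zero below the first knot and
    linear in x above the last knot.
    """
    k = len(knots)
    if k < 3:
        raise ValueError("at least three knots are required")
    if any(a >= b for a, b in zip(knots, knots[1:])):
        raise ValueError("knots must be strictly increasing")

    def cube_plus(u):
        # Truncated cubic: u**3 for positive u, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    last, second_last = knots[-1], knots[-2]
    gap = last - second_last
    scale = (last - knots[0]) ** 2  # keeps the terms on the scale of x

    terms = [x]
    for t_j in knots[:-2]:
        term = (
            cube_plus(x - t_j)
            - cube_plus(x - second_last) * (last - t_j) / gap
            + cube_plus(x - last) * (second_last - t_j) / gap
        )
        terms.append(term / scale)
    return terms
```

The two subtracted terms cancel the cubic and quadratic parts above the last knot, which is what constrains the fitted curve to be linear in the tails.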
A series of sensitivity analyses was conducted to test the robustness of our findings. First, we ran our survival models in white participants only to check the potential role of skin color, since most participants in this study were white. Second, to avoid potential reverse causation, we excluded participants with dementia events within the first 3 years and, in a further analysis, within the first 10 years of follow-up. To assess whether the association between time spent in outdoor light and risk of dementia differed across subpopulations, we examined potential effect modification by age at baseline (<60 and ≥60 years old), age at dementia onset (early onset: <65 years old; late onset: ≥65 years old), and sex (male and female). Considering that limited light can lead to sleep disorders, we also performed a subgroup analysis according to sleep duration (<7 h, 7 h, and >7 h per night). We used an interaction term between sunlight exposure time and each potential modifier to test the homogeneity of the stratum-specific HRs.
Statistical analyses were performed using R 3.6.1 (R Foundation, Vienna, Austria). P values less than 0.05 were considered statistically significant.