Study design and participants
Details of the CAC study have been described elsewhere [27]. Children aged 6–8 years from 23 selected schools who were willing to participate have been recruited yearly since 2013. Baseline information included sociodemographic characteristics, dietary intake and eating behaviours, physical activity and sedentary behaviours, anthropometry and pubertal development. Follow-up data on nutrition, growth, metabolism, and health status were collected at regular intervals until the children were 15 years old: anthropometry and puberty assessments were conducted annually, and dietary intake and physical activity data were collected biennially. This study was approved by the Ethics Committee of Sichuan University, and the parents of all participants provided written informed consent before enrolment. All examinations and questionnaires were administered with parental consent.
Between January 2013 and December 2018, 6967 children aged 6–8 years were included at baseline. Of these children, 5439 had completed at least 2 follow-up assessments by the end of 2020. Since we were interested in the prospective relevance of diet to puberty timing, 389 children who had already reached B2/G2 at baseline were excluded from the current analysis. Of the remaining children, 141 with implausible energy intakes (below or above age- and sex-specific cut-offs) [28] and 128 with incomplete information on potential confounders were further excluded. In total, 4781 children (2152 girls, 2629 boys) were eligible (Additional file 1: Fig. S1), of whom 1311 provided first morning voided midstream urine samples.
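The exclusion steps above can be checked with simple arithmetic. The sketch below is illustrative only: the function name `participant_flow` is hypothetical, and all counts are taken directly from the text.

```python
def participant_flow(start, exclusions):
    """Apply exclusions in order and return (reason, n_excluded, n_remaining) rows."""
    remaining = start
    flow = []
    for reason, n_excluded in exclusions:
        remaining -= n_excluded
        flow.append((reason, n_excluded, remaining))
    return flow


# Counts as reported in the text.
steps = participant_flow(
    5439,  # children with at least 2 follow-up assessments
    [
        ("reached B2/G2 at baseline", 389),
        ("implausible energy intake", 141),
        ("incomplete confounder information", 128),
    ],
)
for reason, n_excluded, remaining in steps:
    print(f"{reason}: -{n_excluded} -> {remaining}")

# The final count matches the reported 4781 (2152 girls + 2629 boys).
print(steps[-1][2] == 2152 + 2629)  # True
```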
Nutrition assessment
Nutrition data were collected by trained investigators using a validated food frequency questionnaire (FFQ) [29]. The FFQ covered the 53 most representative local foods or food groups among children, grouped into 17 categories: whole grains, refined grains, tubers, vegetables, fruits, nuts, meat, fish and shrimp, animal viscera, eggs, dairy and dairy products, total soy (soybean and soy products), fried foods, sugary snacks, sugar-sweetened beverages, fruit juices and dietary supplements. The participants reported their consumption frequency (never, daily, weekly, monthly or annually) for each item and estimated portion sizes using food models and picture aids. During the interviews, the investigators checked the FFQs for potentially incorrect responses and sought clarification when necessary. Dietary intake data were converted into energy and nutrient intake data using the continuously updated in-house nutrient database based on NCCW software (version 11.0, 2014), which reflects food composition in China.
This study investigated individual mean daily intakes of total soy (soy and soy products), dietary fibre and major fibre subtypes (cereal fibre: cereals, noodles, rice, tubers, cookies and cakes; fruit fibre: fruits and their products; vegetable fibre: vegetables and their products).
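As an illustration of how reported FFQ frequencies and portion sizes might be turned into mean daily intakes, consider the minimal sketch below. The divisors (7, 30 and 365 days), the function names and the two example items are assumptions made for illustration; the source does not state the conversion rules actually used.

```python
# Hypothetical divisors: the source does not specify how reporting periods
# were converted to days.
DAYS_PER_PERIOD = {"daily": 1.0, "weekly": 7.0, "monthly": 30.0, "annually": 365.0}


def mean_daily_intake(times_per_period, period, portion_g):
    """Mean daily intake (g/day) for one FFQ item."""
    if period == "never":
        return 0.0
    return times_per_period * portion_g / DAYS_PER_PERIOD[period]


def total_daily_intake(items):
    """Sum mean daily intake over (times_per_period, period, portion_g) tuples."""
    return sum(mean_daily_intake(t, p, g) for t, p, g in items)


# Made-up example: one soy item eaten 3 times per week at 70 g per portion,
# and another consumed once daily at 200 g per portion.
soy_items = [(3, "weekly", 70.0), (1, "daily", 200.0)]
print(round(total_daily_intake(soy_items), 1))  # 230.0
```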
Urine analysis
Detailed instructions on collecting first morning voided midstream urine samples were given to parents and children. All urine samples were stored immediately at −20 °C before transportation and then at −80 °C until analysis. Equol levels were determined using a previously validated gas chromatography-mass spectrometry method [7]. The detection limit was 3.8 ng/ml. All laboratory equipment was calibrated, and blinded duplicate samples were used. All of the data were double entered into the database.
Puberty timing
According to standardized Tanner stage criteria [28], B2 and pubic hair (girls and boys) were assessed by investigators at each examination. G2 was assessed by comparative palpation with a Prader orchidometer. If the volumes of the two testes differed, the larger volume was recorded. A testicular volume of less than 1 mL was recorded as 1 mL. Moreover, at each annual physical examination, children were asked whether M or VB had occurred; if so, the respective month and year were recorded.
Anthropometry
An ultrasonic weight and height meter (DHM-30, Dingheng Ltd, Zhengzhou, China) was used to measure standing height to the nearest 0.1 cm and weight to the nearest 0.1 kg with the subject lightly dressed and barefoot. Skinfold thicknesses at the triceps and subscapular sites were measured on the right side to the nearest 0.1 mm using Holtain callipers (Holtain Ltd, Crymych, UK). All measurements were performed twice, and the averages were used. Sex- and age-independent body mass index (BMI) standard deviation scores (SDS) were calculated using Chinese reference curves [30]. Overweight was defined according to the International Obesity Task Force (IOTF) BMI cut-offs for children, which correspond to an adult BMI of 25 kg/m² [31]. The percent body fat (%BF) was calculated using the Slaughter equations [32].
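The derivation of BMI and a BMI SDS from duplicate measurements can be sketched as follows. This is an illustration under stated assumptions: it assumes the reference curves are supplied as LMS parameters (a common format for growth references, not confirmed by the source), and the `L`, `M` and `S` values shown are made up rather than taken from the Chinese reference [30].

```python
import math


def bmi(weight_kg, height_cm):
    """Body mass index in kg/m^2."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


def lms_sds(value, L, M, S):
    """Standard deviation score from LMS parameters (assumed reference format)."""
    if L == 0:
        return math.log(value / M) / S
    return ((value / M) ** L - 1.0) / (L * S)


# Average the duplicate measurements before computing BMI.
weight = (25.1 + 24.9) / 2    # kg
height = (125.0 + 125.0) / 2  # cm
child_bmi = bmi(weight, height)
print(round(child_bmi, 1))  # 16.0

# Hypothetical LMS values for one age and sex group, for illustration only.
print(round(lms_sds(child_bmi, L=-1.5, M=15.5, S=0.10), 2))
```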
Covariates
Information on the frequency, duration and type of physical activity in various settings among children was collected by a validated physical activity questionnaire with 38 items (e.g., walking, running, climbing stairs, ball games, dancing) [33]. The participants reported typical time spent on sedentary behaviours associated with television, computers, smartphones and homework.
Furthermore, parents provided information about pregnancy and infancy (i.e., children's birth weight, exclusive breastfeeding duration, timing of complementary feeding) and domestic characteristics (i.e., residency, income, family size, smoking status, parental age, occupations and education level).
Statistical analysis
SAS® procedures (version 9.4, SAS Institute Inc., Cary, NC, USA) and Stata 14 (StataCorp, College Station, TX, USA) were used for data analyses. A p value < 0.05 was considered statistically significant in all analyses. Although there was no statistical interaction between dietary soy intake and sex, dietary oestrogen, like endogenous oestrogen, might in theory affect the course of puberty differently in girls and boys [34]. Data from girls and boys were thus analysed separately.
Since energy intake has been suggested to influence pubertal development, and energy intake is dependent on age [35], intake of total soy was expressed as age-specific residuals from the regression of soybean and its product intake on energy intake. Similarly, dietary fibre intake was expressed as age-specific residuals from the regression of fibre intake on energy intake. To examine the potential associations of total soy intake or dietary fibre intake with puberty timing, their distributions were grouped into tertiles (T1–T3).
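The residual method and tertile grouping described above can be sketched as follows. This is a minimal illustration, assuming a separate simple linear regression of intake on energy within each age group and tertile cut points from `statistics.quantiles`; the record layout and function names are hypothetical, and the example data are made up.

```python
from collections import defaultdict
from statistics import mean, quantiles


def energy_adjusted_residuals(records):
    """Residuals of intake regressed on energy, fitted separately per age group.

    records: list of dicts with keys 'age', 'energy' and 'intake'.
    Returns one residual per record, in input order.
    """
    by_age = defaultdict(list)
    for idx, rec in enumerate(records):
        by_age[rec["age"]].append(idx)

    residuals = [0.0] * len(records)
    for idxs in by_age.values():
        xs = [records[i]["energy"] for i in idxs]
        ys = [records[i]["intake"] for i in idxs]
        x_bar, y_bar = mean(xs), mean(ys)
        sxx = sum((x - x_bar) ** 2 for x in xs)
        sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        slope = sxy / sxx if sxx else 0.0
        intercept = y_bar - slope * x_bar
        for i, x, y in zip(idxs, xs, ys):
            residuals[i] = y - (intercept + slope * x)
    return residuals


def tertile_labels(values):
    """Assign 1, 2 or 3 (T1 to T3) according to the tertile cut points."""
    cut1, cut2 = quantiles(values, n=3)
    return [1 if v <= cut1 else 2 if v <= cut2 else 3 for v in values]


# Made-up example data: energy in kcal/day, soy intake in g/day.
children = [
    {"age": 6, "energy": 1400, "intake": 20},
    {"age": 6, "energy": 1600, "intake": 35},
    {"age": 6, "energy": 1800, "intake": 30},
    {"age": 7, "energy": 1500, "intake": 10},
    {"age": 7, "energy": 1700, "intake": 40},
    {"age": 7, "energy": 1900, "intake": 25},
]
resid = energy_adjusted_residuals(children)
print(tertile_labels(resid))
```

Because the residuals are by construction uncorrelated with energy intake within each age group, the tertiles rank children by soy or fibre intake relative to what would be expected for their energy intake.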
The Kolmogorov-Smirnov and Shapiro-Wilk tests were conducted to examine the data for normality. Baseline birth weight and %BF were nonnormally distributed and are presented as medians with interquartile ranges; other continuous variables were normally distributed and are presented as means with standard deviations (SD). Differences in anthropometric, sociodemographic and nutritional data between tertiles were analysed using ANOVA for normally distributed continuous variables, the Kruskal-Wallis test for nonnormally distributed continuous variables, and the chi-square test for categorical variables. Statistical models and descriptive tables were stratified by sex.
Cox proportional hazards regression models were used to investigate the prospective relevance of total soy or dietary fibre (and its types) intake at baseline to age at B2/G2 or M/VB. Follow-up time ended at the age of reaching B2/G2 or M/VB; children for whom the respective puberty event had not been reported were censored at their age at the last follow-up.
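The time-to-event coding that a Cox model requires can be sketched as follows. The function name and the example ages are hypothetical; the sketch only shows how an observed event and a censored observation would be distinguished, not the authors' actual data preparation.

```python
from typing import Optional, Tuple


def survival_record(age_at_event: Optional[float],
                    age_last_follow_up: float) -> Tuple[float, int]:
    """Return (time, event): event is 1 if the puberty marker was observed,
    0 if the child is censored at the last follow-up."""
    if age_at_event is not None:
        return age_at_event, 1
    return age_last_follow_up, 0


# Made-up examples: one child with the event at 10.4 years, and one child
# without a reported event who was last seen at 11.2 years.
print(survival_record(10.4, 12.0))  # (10.4, 1)
print(survival_record(None, 11.2))  # (11.2, 0)
```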
In the basic models, the tertiles of total soy intake (residuals) or dietary fibre (and its types) intake (residuals) at baseline were the main independent fixed effects. The following potential confounders were considered for the Cox regression models: birth weight, age at baseline, school location, physical activity, body composition (Z scores of BMI, overweight (Y/N), %BF), parental/paternal/maternal educational level, family income, mother’s age at menarche, smoking status in the household, and total energy intake at baseline, as well as dietary fibre intake (residuals) at baseline (in the total soy intake model) and total soy intake (residuals) at baseline (in the dietary fibre intake model). In addition, we conceptualized confounders using the DAG platform [36] to validate and justify the potential confounders. Each potential confounder was initially considered separately and was included if it was associated with both the dietary index and the indicators of puberty timing and if it altered the estimate by more than 10% [37]. As high levels of isoflavones and dietary fibre often coexist in food, Model 2 was adjusted for parental education level, energy intake at baseline, mother’s age at menarche, and fibre intake (residuals) at baseline (in the total soy intake model) or total soy intake (residuals) at baseline (in the dietary fibre intake model). In the final model (Model 3), we additionally controlled for confounding and/or mediation by percent body fat at baseline, because it has been proposed that body composition in childhood might be relevant to the timing of puberty [38]. Hazard ratios (HRs) and 95% confidence intervals (CIs) were estimated by comparing the 2nd and 3rd tertiles with the 1st tertile in these models. We assessed linear trends by entering dietary fibre or soy intake as a continuous variable in the above models.
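The change-in-estimate criterion for confounder selection can be sketched as follows. This is an illustration only: it assumes the percent change is computed on the hazard ratio scale (the source does not say whether HRs or log HRs were compared), and the function names and example HRs are made up.

```python
def percent_change(hr_crude, hr_adjusted):
    """Absolute percent change in the estimate after adding one covariate."""
    return abs(hr_adjusted - hr_crude) / hr_crude * 100.0


def keep_confounder(hr_crude, hr_adjusted,
                    associated_with_exposure, associated_with_outcome,
                    threshold=10.0):
    """Keep a covariate only if it is associated with both exposure and outcome
    and shifts the estimate by more than the threshold (in percent)."""
    return (associated_with_exposure
            and associated_with_outcome
            and percent_change(hr_crude, hr_adjusted) > threshold)


# Made-up hazard ratios for illustration.
print(keep_confounder(0.80, 0.70, True, True))  # True  (12.5% change)
print(keep_confounder(0.80, 0.78, True, True))  # False (2.5% change)
```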
To explore potential nonlinear relationships, we examined the associations (based on Model 3) of dietary soy intake and fibre intake with pubertal markers using restricted cubic spline models (four knots, according to Harrell’s recommendation [39]) among all of the participants. Four knots offer an adequate fit of the model and constitute a good compromise between flexibility and loss of precision caused by overfitting.
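A restricted cubic spline basis with four knots can be sketched as follows. The knot positions at the 5th, 35th, 65th and 95th percentiles are an assumption based on Harrell's commonly cited defaults for four knots; the source does not state the positions used. The function names and the example data are hypothetical.

```python
def quantile(sorted_vals, q):
    """Linear-interpolation quantile of pre-sorted data, with 0 <= q <= 1."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def rcs_basis(x, knots):
    """Restricted cubic spline terms for one value x.

    Returns k - 1 terms for k knots: the linear term followed by k - 2
    nonlinear terms, each constrained to be linear beyond the outer knots.
    """
    t_last, t_prev = knots[-1], knots[-2]
    scale = (t_last - knots[0]) ** 2  # keeps all terms on the scale of x
    width = t_last - t_prev

    def cube(u):
        return max(u, 0.0) ** 3

    terms = [x]
    for t_j in knots[:-2]:
        term = (cube(x - t_j)
                - cube(x - t_prev) * (t_last - t_j) / width
                + cube(x - t_last) * (t_prev - t_j) / width)
        terms.append(term / scale)
    return terms


# Made-up intake residuals; knots at the assumed percentiles.
intake = sorted([-12.0, -8.5, -5.0, -2.0, 0.0, 1.5, 3.0, 6.0, 9.5, 14.0])
knots = [quantile(intake, q) for q in (0.05, 0.35, 0.65, 0.95)]
print(len(rcs_basis(2.0, knots)))  # 3 terms: one linear, two nonlinear
```

The resulting terms would enter the regression model in place of the single continuous intake variable; a joint test of the nonlinear terms then assesses departure from linearity.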
Moreover, we tested whether urinary equol level (or intake of fibre and its subtypes) modified the relationship between dietary soy intake and puberty timing. Further stratified analyses were conducted if the p for interaction was < 0.05.
To test the robustness of our results, we re-ran our analyses using mixed models (PROC MIXED in SAS), with school clustering as a random effect, to investigate the associations of total soy or fibre intake in childhood with puberty timing.