  • Research article
  • Open access

The relationships between women’s reproductive factors: a Mendelian randomisation analysis

Abstract

Background

Women’s reproductive factors include their age at menarche and menopause, the age at which they start and stop having children and the number of children they have. Studies that have linked these factors with disease risk have largely investigated individual reproductive factors and have not considered the genetic correlation and total interplay that may occur between them. This study aimed to investigate the nature of the relationships between eight female reproductive factors.

Methods

We used data from the UK Biobank and genetic consortia with data available for the following reproductive factors: age at menarche, age at menopause, age at first birth, age at last birth, number of births, being parous, age first had sexual intercourse and lifetime number of sexual partners. Linkage disequilibrium score regression (LDSC) was performed to investigate the genetic correlation between reproductive factors. We then applied Mendelian randomisation (MR) methods to estimate the causal relationships between these factors. Sensitivity analyses were used to investigate directionality of the effects, test for evidence of pleiotropy and account for sample overlap.

Results

LDSC indicated that most reproductive factors are genetically correlated (rg range: |0.06–0.94|), though there was little evidence for genetic correlations between lifetime number of sexual partners and age at last birth, number of births and ever being parous (rg < 0.01). MR revealed potential causal relationships between many reproductive factors, including later age at menarche (1 SD increase) leading to a later age at first sexual intercourse (beta (B) = 0.09 SD, 95% confidence interval (CI) = 0.06, 0.11), age at first birth (B = 0.07 SD, CI = 0.04, 0.10), age at last birth (B = 0.06 SD, CI = 0.04, 0.09) and age at menopause (B = 0.06 SD, CI = 0.03, 0.10). Later age at first birth was found to lead to a later age at menopause (B = 0.21 SD, CI = 0.13, 0.29), a later age at last birth (B = 0.72 SD, CI = 0.67, 0.77) and a lower number of births (B = −0.38 SD, CI = −0.44, −0.32).

Conclusion

This study presents evidence that women’s reproductive factors are genetically correlated and causally related. Future studies examining the health sequelae of reproductive factors should consider a woman’s entire reproductive history, including the causal interplay between reproductive factors.

Background

A woman’s reproductive life course includes her age at menarche and menopause, the age at which she starts and stops having children and the number of children she has, as well as the age she first has sexual intercourse and the number of sexual partners she has in her lifetime. Some of these reproductive factors have been identified as risk factors for chronic diseases, including breast cancer, respiratory disease and cardiometabolic diseases [1]. A younger age at menarche and older age at menopause were associated with an increased risk of breast cancer in one large meta-analysis [2], while having fewer children and a higher age at first birth (AFB) were positively associated with breast cancer risk in another [3]. Other studies have implicated age at menarche, AFB, number of still births and miscarriages, age at menopause and parity in relation to respiratory and cardiovascular disease [4,5,6]. One study found that later age at menarche was associated with a reduced risk of coronary artery disease [7]. Having any children and later AFB have been associated with a lower risk of lung cancer [8]. Older age at menarche and a shorter reproductive period have also been linked with a higher risk of chronic kidney disease [9, 10].

However, on the whole, studies have not considered a woman’s entire reproductive history and the potential interplay between reproductive factors. Understanding the inter-relationships between reproductive factors is important to correctly identify potential confounders (common causes of the exposure and outcome of interest) and mediators (factors that lie on the causal pathway between exposure and outcome). Information on multiple reproductive factors will provide useful additions to algorithms for predicting disease risk in women [1].

Evidence of association between age at menarche and menopause is inconsistent, with some studies reporting earlier age at menarche associated with earlier menopause [11,12,13,14,15,16], others showing the inverse association [17, 18] and some showing no evidence of this association [19,20,21,22,23,24]. While there is some evidence of an association between an earlier age at menarche and earlier AFB [25, 26], there is little evidence of the association between age at menarche and parity [26]. Another study has also investigated reproductive factors in relation to sexual history, suggesting a younger age at menarche is not a risk factor for younger age at first having sexual intercourse (AFS) [27]. Associations between reproductive factors could be reflective of causal relationships, or common genetic or non-genetic environmental causes, i.e. confounding.

Observational studies are prone to confounding bias as it is difficult to capture all confounders accurately. Mendelian randomisation (MR) is a method that assesses the causal relationship between an exposure and outcome by using genetic variants robustly associated with the exposure. MR is advantageous as it is less likely to be affected by confounding and reverse causation than standard multivariable regression analysis [28,29,30]. There have been an increasing number of genome-wide association studies (GWAS) of reproductive factors [31,32,33], which can be used to investigate genetic correlation (i.e. shared genetic causes) between these factors as well as whether relationships between reproductive factors may be causal using MR.

The present study aims to identify and clarify the nature of any relationships between women’s reproductive factors, by investigating their genetic overlap and the causal relationships between eight reproductive factors, including potential bidirectional effects where the temporal order between the traits is not clear.

Methods

UK Biobank

The UK Biobank study is a large population-based cohort of 502,682 individuals who were recruited at ages 37–73 years across the UK between 2006 and 2010. The study includes extensive health and lifestyle questionnaire data, physical measures and biological samples from which genetic data has been generated. The study protocol is available online, and more details have been published elsewhere [34]. At recruitment, the participants gave informed consent to participate and be followed up.

Reproductive factors

The reproductive factors investigated in this study were age at menarche, age at menopause, age at first live birth, age at last live birth, number of live births, age first had sexual intercourse, lifetime number of sexual partners (at the time of assessment) and parous status (ever/never given birth at the time of assessment). In UK Biobank, these reproductive factors were derived from questionnaire responses at the baseline assessment; further details can be found in Additional file 1.

Phenotypic correlation

We calculated the correlation between reproductive factors using the Pearson correlation coefficient.

GWAS

To identify genetic variants robustly related to each of the reproductive factors, we first performed a GWAS for each reproductive factor among women of European ancestry in the entire UK Biobank sample. Each GWAS was performed using the Medical Research Council (MRC) Integrative Epidemiology Unit (IEU) UK Biobank GWAS pipeline [35, 36]. BOLT-LMM was used to conduct the analysis in the GWAS pipeline [37], which accounts for population stratification and relatedness using linear mixed modelling. Genotyping chip and age were included as covariates. Genome-wide significant single nucleotide polymorphisms (SNPs) were selected at p < 5 × 10⁻⁸ and were clumped to ensure independence at linkage disequilibrium (LD) r² < 0.001 within a 10,000 kb window using the TwoSampleMR package [35].
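The selection and clumping step can be illustrated with a minimal sketch. This is not the TwoSampleMR implementation: the Snp record, the clump function and the ld_r2 lookup table are hypothetical names introduced here, and pairs missing from the lookup are assumed to be in linkage equilibrium (r² = 0). The defaults mirror the thresholds stated above.

```python
from typing import Dict, FrozenSet, List, NamedTuple


class Snp(NamedTuple):
    rsid: str
    chrom: int
    pos: int      # base-pair position
    pval: float   # GWAS p value for the trait


def clump(snps: List[Snp],
          ld_r2: Dict[FrozenSet[str], float],
          p_threshold: float = 5e-8,
          r2_threshold: float = 0.001,
          window_kb: int = 10_000) -> List[Snp]:
    """Greedy clumping: keep the most significant SNP, drop correlated neighbours."""
    # Keep only genome-wide significant SNPs, most significant first.
    significant = sorted((s for s in snps if s.pval < p_threshold),
                         key=lambda s: s.pval)
    kept: List[Snp] = []
    for cand in significant:
        # A candidate is dropped if it is in LD with an already retained SNP
        # on the same chromosome and inside the distance window.
        in_ld = any(
            cand.chrom == lead.chrom
            and abs(cand.pos - lead.pos) <= window_kb * 1000
            and ld_r2.get(frozenset((cand.rsid, lead.rsid)), 0.0) >= r2_threshold
            for lead in kept
        )
        if not in_ld:
            kept.append(cand)
    return kept
```

In practice the pairwise r² values would come from a reference panel rather than a hand-built dictionary.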

Genetic correlation

Genetic correlations between the reproductive factors were calculated using LD score regression (LDSC) and the UK Biobank GWAS summary statistics [38, 39]. The regressions were performed using pre-computed LD scores for each SNP, calculated from 1000 Genomes data on individuals of European ancestry, which are appropriate for use with European GWAS data [38]. These LD scores were filtered to HapMap3 SNPs as these are well-imputed in most studies [40]. SNPs on chromosome 6 in the region 26 to 34 Mb were excluded. GWAS summary statistics were converted for LDSC using the munge_sumstats.py script from the command line tool “ldsc”, and LDSC was performed using the ldsc.py script.

Mendelian randomisation

We conducted MR analysis using the “TwoSampleMR” R package [35], where the inverse variance weighted (IVW) method was used in the primary analysis to assess the causal relationships between pairs of reproductive factors. This method combines Wald ratios, each calculated by dividing the SNP-outcome association by the SNP-exposure association, in a multiplicative random-effects meta-analysis where the weight of each ratio is the inverse of the variance of the SNP-outcome association [41].
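The IVW calculation described above can be sketched in a few lines. This is an illustrative sketch rather than the TwoSampleMR code: the function names are hypothetical, the Wald ratio standard errors use a first-order approximation (SNP-outcome standard error divided by the absolute SNP-exposure effect), and the multiplicative random-effects step is assumed to inflate, but never deflate, the fixed-effect standard error.

```python
import math
from typing import List, Tuple


def wald_ratios(beta_exposure: List[float],
                beta_outcome: List[float],
                se_outcome: List[float]) -> Tuple[List[float], List[float]]:
    """Per-SNP Wald ratios and their first-order standard errors."""
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    ses = [sy / abs(bx) for bx, sy in zip(beta_exposure, se_outcome)]
    return ratios, ses


def ivw_estimate(beta_exposure: List[float],
                 beta_outcome: List[float],
                 se_outcome: List[float]) -> Tuple[float, float]:
    """IVW causal estimate and multiplicative random-effects standard error."""
    ratios, ses = wald_ratios(beta_exposure, beta_outcome, se_outcome)
    weights = [1.0 / s ** 2 for s in ses]
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    fixed_se = math.sqrt(1.0 / total)
    # Over-dispersion of the ratios around the pooled estimate (Cochran's Q).
    q = sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))
    df = len(ratios) - 1
    # Assumption: scale the SE by sqrt(Q / df) only when that exceeds 1.
    scale = max(1.0, math.sqrt(q / df)) if df > 0 else 1.0
    return estimate, fixed_se * scale
```

With standardised summary statistics, the returned estimate is in SD units of the outcome per 1 SD increase in the exposure.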

We assessed earlier-occurring reproductive factors as the exposure in relation to later-occurring factors (the outcomes), e.g. age at menarche was investigated as a potential cause of AFB but not vice versa. In some cases where there was no clear temporal ordering, we carried out analyses in both possible directions, e.g. between ever parous status and lifetime number of sexual partners. These cases are shown in Additional file 2: Table S1. Additionally, we investigated the effect of age at menopause on earlier-occurring factors: age at menarche, AFS and AFB to assess the effect of ovarian reserve, using age at menopause as a proxy.

All relationships tested by MR are shown in Additional file 2: Table S2, and GWAS estimates were standardised (mean = 0 and standard deviation (SD) = 1) prior to performing MR.

The IVW method makes a number of assumptions: that the genetic instruments are strongly associated with the exposure; do not share common causes, either genetic or other confounders such as population stratification, with the outcome; and are not pleiotropic, i.e. do not have an effect on the outcome through a pathway other than via the exposure [41]. We therefore performed a series of sensitivity analyses to evaluate the robustness of our results to these assumptions (see the “Evaluating MR assumptions” section).

In our primary analysis, we applied two-sample MR methods on a single large dataset, UK Biobank, which is advantageous over other methods due to the large sample size. In this analysis, the GWAS used for the exposure and outcome were both performed on women in the UK Biobank study, and therefore, the exposure and outcome samples overlap entirely. Large overlap in the sample(s) used to generate genetic variant-exposure and genetic variant-outcome associations can introduce bias in estimates obtained using two-sample MR [42]. In particular, sample overlap between the exposure and outcome samples may bias estimates towards the observational (and potentially confounded) exposure-outcome association and may lead to an overestimation of effects [42]. While it has been proposed that this approach of applying two-sample MR methods in a single sample may be performed within large studies with minimal bias introduced to the causal estimates by sample overlap [43], we performed a series of sensitivity analyses to evaluate the robustness of our results to this (see the “Assessing the impact of sample overlap” section).

Evaluating MR assumptions

We evaluated the likelihood that MR assumptions were violated where we found evidence of effects in our primary analysis.

Instrument strength

The strength of the genetic instrument for each reproductive factor in the main IVW analysis was assessed using the mean F statistic, calculated based on the variance explained (r²) by the genetic instrument and the sample size of the exposure [30].
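The text does not give the exact formula used. One widely used expression, shown here as an assumption, is F = (r² / (1 − r²)) × ((N − k − 1) / k), where N is the exposure sample size and k is the number of SNPs in the instrument. The function name and the example numbers below are hypothetical.

```python
def instrument_f_statistic(r2: float, n_samples: int, n_snps: int) -> float:
    """F statistic from variance explained, sample size and number of SNPs.

    Assumed formula: F = (r2 / (1 - r2)) * ((N - k - 1) / k).
    """
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must lie in [0, 1)")
    if n_snps < 1 or n_samples <= n_snps + 1:
        raise ValueError("need at least one SNP and N > k + 1")
    return (r2 / (1.0 - r2)) * ((n_samples - n_snps - 1) / n_snps)


# Hypothetical instrument: 10 SNPs explaining 5% of variance in 1,011 people.
example_f = instrument_f_statistic(0.05, 1011, 10)
```

Values above 10 are conventionally taken to indicate that weak instrument bias is unlikely to be substantial.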

Negative controls

We repeated our primary analysis for five “negative control” pairs of reproductive factors, for which we would not expect to see causal effects due to their temporal ordering (the outcome occurring before the exposure). These negative controls included the effect of AFB on age at menarche, AFS on age at menarche, AFB on AFS, age at menopause on AFS and age at last birth (ALB) on age at menarche. In these cases, any evidence of an effect would suggest pleiotropy. This may occur when a genetic instrument affects the exposure and outcome through a shared heritable factor, which could be a shared process or pathway [44].

Heterogeneity

We tested for heterogeneity between instruments using Cochran’s Q statistic, as implemented in the TwoSampleMR package. A Q larger than the number of instruments minus one provides evidence for heterogeneity and invalid instruments, which can imply the presence of pleiotropy [45, 46].
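As a rough illustration of this check, the sketch below computes Cochran's Q from per-SNP Wald ratios and returns it alongside its degrees of freedom (number of instruments minus one). The function name is hypothetical, and the formal test, which compares Q with a chi-squared distribution on those degrees of freedom, is omitted to keep the example dependency-free.

```python
from typing import List, Tuple


def cochran_q(ratios: List[float], ratio_ses: List[float]) -> Tuple[float, int]:
    """Cochran's Q for per-SNP causal estimates, with its degrees of freedom."""
    weights = [1.0 / s ** 2 for s in ratio_ses]
    pooled = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    # Weighted squared deviations of each ratio from the pooled estimate.
    q = sum(w * (r - pooled) ** 2 for w, r in zip(weights, ratios))
    return q, len(ratios) - 1


q_stat, df = cochran_q([0.0, 1.0], [1.0, 1.0])
more_spread_than_expected = q_stat > df  # heuristic comparison from the text
```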

Pleiotropy

We used additional MR methods: weighted mode [47], weighted median [48] and MR-Egger [49] to assess evidence of pleiotropy [50]. The intercept of the MR-Egger regression line and its 95% confidence interval were used to assess directional pleiotropy using the TwoSampleMR package [49]. In addition, the I² statistic was calculated to quantify the strength of the instruments used for MR-Egger and to determine whether the genetic variants violated the “NO Measurement Error” (NOME) assumption [51]. For instruments with an I² statistic < 90%, we performed additional MR-Egger sensitivity analyses using simulation extrapolation (SIMEX) to adjust for regression dilution bias.
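The two MR-Egger quantities mentioned above can be sketched as follows. This is a simplified illustration under stated assumptions, not the TwoSampleMR code: the function names are hypothetical, SNPs are oriented so that every SNP-exposure effect is positive, the regression is weighted by the inverse variance of the SNP-outcome association, standard errors for the intercept are omitted, and I² is computed in its basic form from the SNP-exposure estimates and their standard errors.

```python
from typing import List, Tuple


def egger_regression(beta_exposure: List[float],
                     beta_outcome: List[float],
                     se_outcome: List[float]) -> Tuple[float, float]:
    """Weighted regression with intercept; returns (intercept, slope)."""
    # Orient each SNP so that its effect on the exposure is positive.
    bx = [abs(b) for b in beta_exposure]
    by = [y if x >= 0 else -y for x, y in zip(beta_exposure, beta_outcome)]
    w = [1.0 / s ** 2 for s in se_outcome]
    sw = sum(w)
    mean_x = sum(wi * x for wi, x in zip(w, bx)) / sw
    mean_y = sum(wi * y for wi, y in zip(w, by)) / sw
    sxx = sum(wi * (x - mean_x) ** 2 for wi, x in zip(w, bx))
    sxy = sum(wi * (x - mean_x) * (y - mean_y) for wi, x, y in zip(w, bx, by))
    slope = sxy / sxx
    # A non-zero intercept is read as evidence of directional pleiotropy.
    return mean_y - slope * mean_x, slope


def i_squared_gx(beta_exposure: List[float], se_exposure: List[float]) -> float:
    """Basic I-squared of the SNP-exposure estimates, as a proportion in [0, 1]."""
    bx = [abs(b) for b in beta_exposure]
    w = [1.0 / s ** 2 for s in se_exposure]
    mean = sum(wi * b for wi, b in zip(w, bx)) / sum(w)
    q = sum(wi * (b - mean) ** 2 for wi, b in zip(w, bx))
    if q == 0:
        return 0.0
    return max(0.0, (q - (len(bx) - 1)) / q)
```

A returned proportion of 0.90 corresponds to the 90% threshold below which the SIMEX correction was applied.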

We also applied the R function MR-PRESSO (Mendelian Randomisation Pleiotropy RESidual Sum and Outlier) to identify and correct for potential outliers (p < 0.05) [52]. Further details can be found in Additional file 1 [46,47,48,49,50, 52, 53].

Steiger filtering for bidirectional relationships

We performed the MR Steiger test and Steiger filtering bidirectionally for pairs of reproductive factors where the temporal ordering was not clear [54] (Additional file 2: Table S1). This was performed to assess whether the hypothesised causal direction of the relationship was correct for each genetic instrument [54]. Further details can be found in Additional file 1 [54].
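The idea behind Steiger filtering is that a valid instrument should explain more variance in the exposure than in the outcome. The sketch below illustrates that comparison only: the names are hypothetical, variance explained is approximated from the t statistic as t² / (t² + N − 2), which assumes continuous traits, and the formal test of the difference between the two correlations is omitted.

```python
from typing import List, NamedTuple


class SnpAssociation(NamedTuple):
    rsid: str
    beta_exposure: float
    se_exposure: float
    n_exposure: int
    beta_outcome: float
    se_outcome: float
    n_outcome: int


def variance_explained(beta: float, se: float, n_samples: int) -> float:
    """Approximate r2 for one SNP from its t statistic and sample size."""
    t2 = (beta / se) ** 2
    return t2 / (t2 + n_samples - 2)


def steiger_filter(snps: List[SnpAssociation]) -> List[SnpAssociation]:
    """Keep SNPs that explain more variance in the exposure than the outcome."""
    return [
        s for s in snps
        if variance_explained(s.beta_exposure, s.se_exposure, s.n_exposure)
        > variance_explained(s.beta_outcome, s.se_outcome, s.n_outcome)
    ]
```

SNPs removed by this filter are those more consistent with an effect running from the outcome to the exposure.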

Assessing the impact of sample overlap

To investigate whether the degree of bias introduced by sample overlap impacted our findings, we conducted a series of sensitivity analyses.

Firstly, we performed MR on GWAS summary statistics using a “split-sample” approach, in which the UK Biobank sample was divided into two halves at random. The MR analysis was performed twice for each relationship, once using the exposure GWAS from one half and the outcome GWAS from the second half and once with the halves swapped, and the two resulting MR effect estimates were meta-analysed using a fixed effects model.
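The final pooling step is a standard inverse-variance fixed-effect meta-analysis, sketched below. The function name and the example numbers are hypothetical and do not come from the study.

```python
import math
from typing import List, Tuple


def fixed_effect_meta(estimates: List[float],
                      ses: List[float]) -> Tuple[float, float]:
    """Inverse-variance weighted pooled estimate and its standard error."""
    weights = [1.0 / s ** 2 for s in ses]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total)


# Hypothetical MR estimates from the two split-sample directions.
pooled, pooled_se = fixed_effect_meta([0.10, 0.20], [0.05, 0.05])
ci_low = pooled - 1.96 * pooled_se
ci_high = pooled + 1.96 * pooled_se
```

Because each half contributes roughly equal precision, the pooled estimate sits close to the simple average of the two split-sample estimates.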

Secondly, we performed two-sample MR using results from largely non-UK Biobank replication studies and consortia to estimate SNP effects on the exposure (sample 1) and UK Biobank estimates to estimate SNP effects on the outcome (sample 2), and vice versa where appropriate [31, 32, 55,56,57,58]. Further details on the number of studies and sample sizes used for the replication consortia are shown in Additional file 2: Table S3. Using replication studies may also avoid bias introduced by winner’s curse, which is the overestimation of SNP effects on the exposure in a discovery GWAS [59, 60].

Finally, we used a recently developed MR method, MRlap, that is robust to bias introduced by sample overlap, winner’s curse and weak instruments [61]. MRlap was performed using the UK Biobank GWAS summary statistics for reproductive factors where both the exposure and outcome were continuous, i.e. excluding associations involving ever parous status, as the correction for biases cannot account for a different degree of overlap for cases/controls in case of binary traits [61]. Further details can be found in Additional file 1 [42, 60,61,62].

Only those reproductive factor associations for which there was evidence of an effect from the primary analysis were taken forward for this sensitivity analysis, as the causal effect would likely be overestimated when performing MR with overlapping exposure and outcome samples [42].

Evaluating the role of adiposity

Childhood adiposity may confound the relationships between reproductive factors that we identify since adiposity is genetically correlated with age at menarche [63], and age at menarche is genetically correlated with other reproductive factors [64, 65]. To investigate this, we performed multivariable MR (MVMR) using the “MVMR” R package, adjusting for childhood body size from UK Biobank [66].

Results

UK Biobank

A total of 264,698 women from UK Biobank were included in this analysis. The mean age at assessment was 56.4 years (SD = 8.0); further sample characteristics are shown in Table 1. Many of the reproductive factors were weakly phenotypically correlated. The strongest correlations were between AFB and ALB (Pearson correlation coefficient = 0.71), and AFB and number of births (Pearson correlation coefficient = −0.34) (Fig. 1A).

Table 1 UK Biobank reproductive factor descriptives
Fig. 1 A Phenotypic correlation between reproductive factors using the Pearson correlation coefficient. B Genetic correlation between reproductive factors using LD score regression. ***p < 0.001; **p < 0.01; *p < 0.05. The values in each correlation square are the Pearson correlation coefficient (A) and rg (B)

UK Biobank GWAS

Table 2 displays the number of variants associated with the eight reproductive factors at genome-wide significance (p < 5 × 10⁻⁸) after LD clumping within the full UK Biobank sample. Between four (ever parous status) and 223 (age at menarche) SNPs were identified. All F statistics were above the standard threshold of 10, indicative of strong genetic instruments (Table 2).

Table 2 Sample size of the exposure (N), F statistic and the number of SNPs (nSNPs) used within the primary analysis

Genetic correlation

The LDSC revealed that the 8 reproductive factors were genetically correlated (rg range: |0.06–0.94|), except for the lifetime number of sexual partners, which was not correlated with ALB, number of births or ever parous status (rg < 0.01). Age at menarche was only weakly genetically correlated with other reproductive factors (rg range: 0.06–0.12). In general, genetic correlations were larger in magnitude than the corresponding phenotypic correlations (Fig. 1B, Additional file 2: Table S4).

Mendelian randomisation

Effect of age at menarche

MR findings from the primary analysis suggest that the positive genetic correlation reflects a causal relationship between later age at menarche (1 SD increase) and later AFS (beta (B) = 0.09 SD, 95% confidence interval (CI) = 0.06, 0.11), AFB (B = 0.07 SD, CI = 0.04, 0.10), ALB (B = 0.06 SD, CI = 0.04, 0.09) and age at menopause (B = 0.06 SD, CI = 0.03, 0.10) (Fig. 2A).

Fig. 2 Mendelian randomisation of inter-relationships between UK Biobank reproductive factors. Panels to the right of plots A–L refer to the exposures investigated by MR, with outcomes shown on the y axis. GWAS summary statistics were standardised prior to performing MR

Effects of age first had sexual intercourse

In addition, later AFS (1 SD increase) appears to lead to later age at menopause (B = 0.11 SD, CI = 0.04, 0.18), later AFB (B = 0.56 SD, CI = 0.49, 0.63), later ALB (B = 0.42 SD, CI = 0.35, 0.50), lower number of births (B = −0.24 SD, CI = −0.31, −0.17), lower lifetime number of sexual partners (B = −0.51 SD, CI = −0.58, −0.44) and increased likelihood of not having any children (odds ratio (OR) = 0.90 SD, CI = 0.88, 0.93) (Fig. 2B, F).

Effect of age at first birth

Findings suggest later AFB (1 SD increase) may lead to a later age at menopause (B = 0.21 SD, CI = 0.13, 0.29), later ALB (B = 0.72 SD, CI = 0.67, 0.77) and lower number of births (B = −0.38 SD, CI = −0.44, −0.32) (Fig. 2C).

Effect of age at last birth

Findings suggest later ALB (1 SD increase) may lead to a lower number of births (B = −0.19 SD, CI = −0.31, −0.07) (Fig. 2D).

Effect of lifetime number of sexual partners

Finally, a higher lifetime number of sexual partners appears to decrease the likelihood of having children (OR = 0.96 SD, CI = 0.92, 1.0), although the upper confidence limit reaches 1.0 (Fig. 2L).

Number of births, ever having children and age at menopause do not appear to have strong effects on any of the other reproductive factors (Fig. 2G, H, I, K), although confidence intervals for the effects of number of births and ever having children are wide.

Full results of this analysis can be found in Additional file 2: Table S5, and a causal graph shows where we found evidence of an effect between reproductive factors (Fig. 3).

Fig. 3 Relationships identified in the primary analysis with evidence of a causal effect. The relationship between age at menarche and lifetime number of sexual partners is not highlighted here due to the attenuation of effect in the sensitivity analyses. + along with a green arrow indicates a positive relationship, and – along with a red arrow indicates a negative relationship. The weight of the arrows represents the relative magnitude of the effect

Evaluating MR assumptions

Negative controls

We found little evidence for an effect of age at menopause on AFS (B = 0.03 SD, CI = −1.32 × 10⁻³, 0.05), of ALB on age at menarche (B = 0.11 SD, CI = −0.12, 0.34), of AFB on age at menarche (B = 0.04 SD, CI = −0.07, 0.16) or of AFS on age at menarche (B = 0.10 SD, CI = −7.03 × 10⁻⁴, 0.21) (Additional file 2: Table S6). However, there was strong evidence for an effect of AFB on AFS (B = 0.58 SD, CI = 0.52, 0.65), suggestive of shared pleiotropy. To assess whether the effect identified between AFB and AFS was due to shared pleiotropy via age at menarche, we performed MVMR using the “MVMR” R package, including age at menarche as an additional exposure. Adjusting for age at menarche did not attenuate the effect of AFB on AFS (Additional file 2: Table S6), suggesting shared pleiotropy is likely to occur via another pathway.

Heterogeneity

For the relationships identified in the primary analysis, evidence for heterogeneity in the individual SNP effects in the IVW was present across many of the investigated relationships, except for between AFB and ALB and between ALB and number of births (Additional file 2: Table S7). Evidence for heterogeneity could indicate the presence of SNP outliers which were investigated using MR-PRESSO (see the “Pleiotropy” section).

Pleiotropy

The effects of age at menarche on AFS, of AFS on AFB and ALB, and of AFB on ALB, age at menopause and number of births were consistent across the MR-Egger, weighted median and weighted mode methods used to assess pleiotropy (Additional file 2: Table S8, Additional file 1: Fig. S1).

Effects were less consistent across the additional MR methods between age at menarche and AFB, ALB, menopause and lifetime number of sexual partners, as well as AFS and age at menopause, lifetime number of sexual partners, number of births and ever being parous.

Furthermore, the effects of AFB on age at menopause, of ALB on number of births and of lifetime number of sexual partners on ever being parous appeared inconsistent across the different MR methods.

In the primary analysis, the only instance where the MR-Egger intercept test revealed evidence for directional pleiotropy was in the relationship between age at menarche and lifetime number of sexual partners (Additional file 2: Table S9).

We assessed the heterogeneity in gene-exposure estimates, or I²GX. The I²GX was > 97% in all analyses, suggesting MR-Egger is performing optimally (Additional file 2: Table S10).

We also applied MR-PRESSO to the UK Biobank full overlap GWAS to additionally test for evidence of pleiotropy and correct for outliers (Additional file 2: Table S11). MR-PRESSO revealed evidence for outliers in almost all tests, other than for the relationships between AFB and ALB. However, after outlier correction, there was little change in the strength of evidence in the IVW estimates (Additional file 2: Table S12).

We applied an MR Steiger method to assess whether we had captured the intended causal direction between reproductive factors where the causal direction was unclear. Findings show aggregated instruments have successfully captured the intended causal direction in all cases (Additional file 2: Table S13). Steiger filtering was also implemented to assess whether there were any individual SNPs that did not capture the intended causal direction, and results are displayed in Additional file 2: Table S14. Where instruments contained SNPs that did not capture the intended causal direction, MR analysis was then performed excluding those SNPs and the strength of evidence for the causal estimate using the IVW method did not change (Additional file 2: Table S15).

Assessing the impact of sample overlap

UK Biobank split-sample

In the split-sample GWAS within UK Biobank, between 1 and 101 SNPs were identified at genome-wide significance (p < 5 × 10⁻⁸) after LD clumping (r² < 0.001 and a distance of 10,000 kb) (Additional file 2: Table S16). No SNPs were identified at genome-wide significance in relation to ALB and parous status in the GWAS performed on one of the UK Biobank split-samples; therefore, the split-sample MR was only conducted once when ALB or ever parous status was the exposure.

Where SNPs were identified in the split-sample analysis, F statistics were above the standard threshold of 10, indicative of strong genetic instruments (Additional file 2: Table S16). However, there was little overlap in the SNPs which surpassed genome-wide significance between sample 1 and sample 2, with 9 SNPs overlapping between samples for age at menarche and age at menopause but none for the other traits (Additional file 2: Table S17). A number of the SNPs identified in one of the samples of the split-sample GWAS were identified above the significance threshold but removed during LD clumping in the GWAS of the other sample, while other SNPs were just below the significance threshold or appeared not to be associated (Additional file 2: Table S17).

We performed MR for each relationship twice, i.e. MR of exposure in sample 1 on outcome in sample 2 and MR of exposure in sample 2 on outcome in sample 1. This was with the exception of the MR analyses when ALB and parous status were the exposure, which were assessed only once (Additional file 2: Table S18). We then meta-analysed findings between both samples, which showed limited evidence of heterogeneity between the causal estimates obtained from the split-sample MRs. Full results of the meta-analysis can be found in Additional file 2: Table S19.

Replication consortia

Inter-relations between the reproductive factors were also investigated using GWAS summary statistics from consortia studies which excluded UK Biobank. Sixty SNPs were identified at genome-wide significance (p < 5 × 10⁻⁸) for age at menarche (ReproGen) and 5 for AFB (SSGAC) (Additional file 2: Table S20). All F statistics were above the standard threshold of 10, indicative of strong genetic instruments (Additional file 2: Table S20). Full results of this analysis can be found in Additional file 2: Table S21. Estimates were consistent when using a larger replication GWAS from ReproGen for age at menopause, although the sample this GWAS was performed in had a large proportion of UK Biobank overlap (Additional file 2: Table S3).

MRlap UK Biobank

MRlap was performed using the reproductive factor GWAS summary statistics for the full UK Biobank sample. This method identified slightly more variants at genome-wide significance (p < 5 × 10⁻⁸) after LD pruning (10,000 kb, r² = 0.001) compared to the main analysis. Between 11 (ALB and number of births) and 231 (age at menarche) SNPs were identified (Additional file 2: Table S22). MR estimates were largely similar to the primary analysis, although in some cases the effect size was slightly larger, including for the relationship between ALB and number of births. Full results of this analysis can be found in Additional file 2: Table S23.

Assessing evidence of causal effects across sensitivity analyses

Figure 4 illustrates the effects which appear robust across multiple sensitivity analyses. In particular, a later age at menarche appears to have consistent effects on a later AFB, ALB and AFS. In addition, the effects of a later AFB on a later ALB, of a later AFS on a later AFB and of a later AFS on a lower lifetime number of sexual partners were consistent across all sensitivity analyses. There was no consistent evidence for a causal relationship between age at menarche and lifetime number of sexual partners across sensitivity analyses and limited evidence between AFS and age at menopause.

Fig. 4 Mendelian randomisation estimates from the primary analysis and across the sensitivity analyses. Panels to the right of the plots refer to the relationships investigated, and each analysis is shown on the y axis. All analyses were performed using the IVW MR method. GWAS summary statistics were standardised prior to performing MR

Evaluating the role of adiposity

We used MVMR analysis to adjust for childhood body size and assess for the presence of confounding. We found little evidence that adjustment of childhood body size impacts the relationships identified in the primary analysis. One exception was the relationship between age at menarche and lifetime number of sexual partners, which attenuates with adjustment for childhood body size, and which additional sensitivity analyses indicate may be affected by pleiotropy and bias induced by sample overlap (Additional file 2: Table S24).

Discussion

This study provides evidence supporting causal effects of several female reproductive factors on other reproductive traits. We show evidence that earlier reproductive factors including age at menarche, AFS and AFB have effects on subsequent events and factors, while ever parous status, age at menopause, number of births, ALB and lifetime number of sexual partners appear to have limited effects on other reproductive factors.

We substantiate the genetic correlation between reproductive factors shown in previous studies, while showing additional correlations that have not been previously investigated [64, 65]. Our study supports evidence for a positive causal link between age at menarche and age at menopause [11,12,13,14,15,16, 67, 68] and opposes previous studies that have shown the inverse association [17, 18] or no association [19,20,21,22,23,24]. Furthermore, our findings support one study that found little evidence for an association between age at menarche and parity [26]. Additionally, we corroborated the findings of previous MR studies that identified a positive causal relationship between age at menarche and AFB, ALB and age at menopause, and between AFS and ALB [67,68,69].

Many estimates identified in the primary analysis appear consistent across sensitivity analyses that aim to account for biases. However, some results did not persist in sensitivity analyses checking for robustness to sample overlap and winner’s curse.

The split-sample meta-analysed MR shows a weaker magnitude of effect compared to our primary analysis, which may be due to sample size reduction in this sensitivity analysis or bias introduced by sample overlap in the primary analysis.

Overall, using replication GWAS studies as the exposure or outcome showed weaker strength of evidence and/or magnitude of effects, although evidence for a causal effect for many relationships assessed was maintained. This may be due to bias introduced by winner’s curse in the primary analysis or smaller sample sizes available for the replication studies. In particular, age at menopause from the ReproGen consortium has a sample size of 69,360, compared to 143,791 in our primary analysis, and where this is used as the outcome, we found little evidence of an effect of reproductive factors on age at menopause. A more recent GWAS of age at menopause conducted by the ReproGen consortium has a much larger sample size (n = 201,323) [70, 71], although more than half of the sample comprise UK Biobank women, meaning a large sample overlap in the MR analysis. Nonetheless, MR estimates using this more recent GWAS revealed similar results compared to the previous smaller GWAS [32]. While there are more recent, larger GWAS available for age at menarche [72] and AFB [73], UK Biobank has formed a large contribution to these GWAS. We decided to prioritise studies which had a smaller number of participants from UK Biobank for the replication GWAS, in order to reduce the likelihood of bias due to sample overlap.

Differences in how the age at menopause phenotype was derived in UK Biobank and ReproGen may contribute to differences in estimated effects. While both GWAS excluded women who had a hysterectomy, ReproGen additionally excluded women who had a bilateral ovariectomy, those whose menopause was induced by radiation or chemotherapy and those using hormone replacement therapy [70].

MRlap gave results almost identical to our primary analysis, suggesting that sample overlap may not substantially bias the estimates.

Pleiotropy occurs when genetic variants affect multiple phenotypes, which can be an issue in MR because the genetic instruments used as a proxy for the exposure can affect the outcome independently of the exposure of interest [29, 60]. The resulting effect estimates may therefore not correctly capture the exposure-outcome relationship of interest. This could be a problem here because many of the reproductive factors are genetically correlated; consequently, multiple sensitivity analyses were used to assess whether the exclusion restriction assumption was violated. We implemented additional MR methods, and numerous relationships did not appear to be affected by pleiotropy. Where outlier correction was possible, results were consistent with the primary analysis, with the exception of the effect of lifetime number of sexual partners on ever having children, which was completely attenuated after outlier correction.
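To make the contrast between estimators concrete, the sketch below compares an inverse variance weighted (IVW) estimate with an MR-Egger fit on summary statistics. It is a simplified illustration under stated assumptions: first-order weights (1/se_y²), SNPs oriented so that SNP-exposure effects are positive, and no standard errors computed. The function names and the toy inputs are hypothetical and do not reproduce the analysis pipeline of this study.

```python
def ivw(bx, by, se_y):
    """IVW estimate: weighted regression of SNP-outcome on SNP-exposure effects through the origin."""
    w = [1.0 / s ** 2 for s in se_y]
    num = sum(wi * x * y for wi, x, y in zip(w, bx, by))
    den = sum(wi * x * x for wi, x in zip(w, bx))
    return num / den

def mr_egger(bx, by, se_y):
    """MR-Egger: the same weighted regression with an intercept (average directional pleiotropy)."""
    # Orient each SNP so that its exposure effect is positive
    x = [abs(b) for b in bx]
    y = [b_out if b_exp >= 0 else -b_out for b_exp, b_out in zip(bx, by)]
    w = [1.0 / s ** 2 for s in se_y]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Toy data: true causal effect 0.5 plus a constant pleiotropic effect of 0.01 per SNP
bx = [0.05, 0.10, 0.15, 0.20]
by = [0.5 * b + 0.01 for b in bx]
se_y = [0.01] * 4
print("IVW:", ivw(bx, by, se_y))
print("MR-Egger slope, intercept:", mr_egger(bx, by, se_y))
```

In this toy setting the IVW estimate is pulled away from 0.5 by the directional pleiotropy, whereas the MR-Egger slope recovers it and the intercept absorbs the pleiotropic term.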

However, it is worth considering that a recent study found that using MR-Egger on overlapping exposure and outcome samples may induce bias in the direction and magnitude of the confounding [43]. This bias attenuates when the MR-Egger method is performing optimally, i.e. when it is employed with maximum variability in instrument strength. This variability is expressed as heterogeneity in gene-exposure estimates across SNPs, referred to as I2GX, which can be calculated using the I2 statistic. It is estimated that the bias in MR-Egger when used in a one-sample setting is substantially reduced when I2GX is higher than the recommended 90% [43]. Conversely, other two-sample methods appear to perform similarly in a one-sample and a two-sample setting when the sample sizes are similarly large [43]. Where there was evidence of non-null effects in the primary analysis, I2GX was >97%, suggesting MR-Egger is performing optimally. Nonetheless, the MR-Egger test can be underpowered, especially when few instruments are available.
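The I2GX statistic described above can be sketched as follows. This is a minimal, unweighted version that assumes I2GX is computed as (Q − (k − 1))/Q, where Q is Cochran's Q across the k SNP-exposure estimates; the function name `i2_gx` and the example inputs are hypothetical, and published implementations may additionally weight by the SNP-outcome standard errors.

```python
def i2_gx(bx, se_x):
    """I2GX: share of variation in SNP-exposure estimates not attributable to sampling error."""
    x = [abs(b) for b in bx]  # orient SNP-exposure effects to be positive, as for MR-Egger
    w = [1.0 / s ** 2 for s in se_x]
    mean = sum(wi * xi for wi, xi in zip(w, x)) / sum(w)
    q = sum(wi * (xi - mean) ** 2 for wi, xi in zip(w, x))  # Cochran's Q
    k = len(x)
    if q <= 0:
        return 0.0
    return max(0.0, (q - (k - 1)) / q)

# Hypothetical SNP-exposure estimates and standard errors
value = i2_gx([0.1, 0.2, 0.3], [0.01, 0.01, 0.01])
print(f"I2GX = {value:.2%}")
```

Values close to 100% indicate that instrument strength varies widely relative to its measurement error, which is the setting in which MR-Egger is expected to suffer least from dilution of its slope estimate.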

Mechanisms underlying causal links

We show that an earlier age at menarche may lead to an earlier AFS and AFB, and that an earlier AFS may lead to an earlier AFB. It is likely that earlier maturation leads to earlier sexual activity, which in turn increases the chance of an earlier pregnancy. In UK Biobank, a proportion of women may have first had sexual intercourse prior to the introduction of the National Health Service (Family Planning) Act 1967, which made contraception readily available through the NHS. This may have strengthened the effect of AFS on AFB in this cohort, and findings may not be generalisable to more contemporary studies. We also show that an earlier AFS may lead to a higher number of sexual partners, which may occur because there is more time to acquire partners if sexual activity commences earlier. Furthermore, we identify that having a higher lifetime number of sexual partners may lead to a lower chance of having children. This may be due to the increased prevalence of short-term relationships and regularly changing sexual partners [74], which might in turn reduce the chance of starting a family. However, it is worth noting that after excluding outlying variants, the effect of lifetime number of sexual partners on ever parous status attenuated. We present strong evidence for a positive relationship between AFB and ALB. One explanation for this link could be that parents tend to have their children within a relatively short period of time, as shown in UK Biobank, where the average AFB is 26 years and the average ALB is 30 years for women.

Life history theory offers another explanation as to why earlier age at menarche leads to earlier subsequent reproductive events and a likelihood of an increased number of children. This theory distinguishes the allocation of resources into growth and reproductive efforts and categorises “fast” or “slow” life history strategies [75, 76]. A “fast” life history strategy exerts more effort towards reproduction: earlier puberty and sexual activity leading to an earlier AFB and an increased number of births [75, 76]. This is corroborated by our finding that women who experience an earlier AFS have children earlier and have more children. If a woman starts having children earlier, she has more opportunity to conceive again before menopause, which may explain the effect we identify between an earlier AFB and a higher number of children. A “fast” life history may also lead to an earlier age at menopause, as allocating resources towards reproductive efforts earlier in life and towards a higher number of children may result in completing reproduction at a younger age.

There were a number of relationships where we did not find evidence for an effect in our primary analysis. Of note, we did not find a causal effect of age at menarche on the number of births or on ever parous status. Considering life history theory, we might have expected to find an inverse effect, with an earlier age at menarche leading to a higher number of births.

Furthermore, we did not find evidence of an effect of ever parous status on lifetime number of sexual partners, or of number of births on ALB. We investigated bidirectional effects between reproductive factors where there was not a clear temporal order and identified no bidirectional effects. Specifically, there were no effects between age at menopause and each of ALB, lifetime number of sexual partners, number of births and ever parous status; between ALB and lifetime number of sexual partners; or between number of births and lifetime number of sexual partners.

Several relationships between reproductive factors separated by many years could be mediated by other intervening reproductive events. For example, we identify effects between age at menarche and AFS, AFS and AFB, and age at menarche and AFB; therefore, the effect we find between age at menarche and AFB may be mediated by AFS. Similarly, we found effects between AFS and AFB, AFB and ALB, and AFS and ALB, which could suggest that the effect of an earlier AFS on an earlier ALB is mediated through an earlier AFB. In addition, the relationships we have identified are likely to involve mediating mechanisms other than reproductive factors, such as body mass index [63]. Future investigations could use mediation analyses to further elucidate these relationships [77].
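One way such a mediation analysis could be framed is the product-of-coefficients approach, sketched below under stated assumptions: `a` is the effect of the exposure on the mediator, `b` is the direct effect of the mediator on the outcome (for example, from a multivariable MR adjusting for the exposure), and the standard error uses a first-order delta-method approximation that treats the two estimates as independent. All names and numbers are hypothetical illustrations and are not estimates from this study.

```python
import math

def indirect_effect(a, se_a, b, se_b):
    """Product-of-coefficients indirect effect with a first-order delta-method SE."""
    effect = a * b
    se = math.sqrt((a * se_b) ** 2 + (b * se_a) ** 2)
    return effect, se

# Hypothetical inputs in SD units (illustrative only)
a, se_a = 0.5, 0.1   # exposure -> mediator
b, se_b = 0.4, 0.05  # mediator -> outcome, adjusted for the exposure
total = 0.3          # hypothetical total effect of exposure on outcome

effect, se = indirect_effect(a, se_a, b, se_b)
print(f"indirect = {effect:.3f} (SE {se:.3f}), proportion mediated = {effect / total:.2f}")
```

The proportion mediated is only interpretable when the total and indirect effects act in the same direction, and its uncertainty is typically wide when the total effect is small.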

Implication of findings

When investigating one reproductive factor in relation to a health outcome, our findings might aid in identifying reproductive factors that could confound this relationship. For example, becoming a parent at an earlier age has been identified as a risk factor for depressive symptoms in young adulthood [78, 79]. We have presented evidence that age at menarche has a causal effect on AFB, and previous studies have identified earlier age at menarche as a risk factor for poor mental health outcomes [80, 81]. The evidence presented in this study suggests it would be important to adjust for age at menarche in an investigation of the effects of AFB on mental health outcomes.

Our work also suggests that reproductive factors might lie on the causal pathway between an earlier reproductive factor and a later outcome. We present evidence for a causal effect of AFB on number of births, and both reproductive factors have been identified as risk factors for cardiovascular disease [82]. An investigation of the effect of AFB on the risk of cardiovascular disease might therefore want to consider mediation via the number of births.

Finally, a number of reproductive factors have been identified as risk factors for breast cancer, including age at menarche, age at menopause [2], number of births and AFB [3]. We have presented a number of causal inter-relationships between reproductive factors; therefore, researchers should carefully consider the total impact of reproductive factor variability on chronic diseases such as breast cancer rather than the impact of single reproductive indicators, and a multivariable approach could be particularly useful [83].

Strengths and limitations

The strengths of the study include the range of reproductive factors investigated using the MR approach, the use of the large UK Biobank resource and data from other genetic consortia, and the extent of MR sensitivity analyses to evaluate MR assumptions and address sample overlap. However, this study has a number of limitations.

Firstly, negative control analysis revealed strong evidence of an effect of AFB on AFS, suggesting possible evidence of pleiotropy which has been previously identified for the AFS genetic instrument [84]. As this may reduce the reliability of our results, future work could further assess whether the associations identified for AFS reflect true causal effects.

For some exposures such as ALB, the number of births and ever being parous, the number of SNPs used as genetic instruments was limited, meaning we cannot reliably evaluate pleiotropy and heterogeneity in these instances. Increasing the number of SNPs in the genetic instruments for each of these reproductive factors through larger GWAS would be valuable.

Another limitation is the issue of selection bias in UK Biobank. While 9 million individuals were invited to participate in the study, the response rate was 5%. Additionally, the participants in the UK Biobank and replication studies we used were largely restricted to women of European ancestry. These samples are therefore not representative of the entire UK female population, and estimates may not be generalisable to women in other ancestry groups. In addition, these findings may not be representative of younger generations of women, considering the average age of UK Biobank participants and the evidence of secular trends in some reproductive factors. For example, there is evidence of a long-term downward trend in age at menarche [85] and an increase in AFB [86]. Future work is required to replicate our findings in contemporary independent studies and to extend the results to women in other ancestry groups.

While the majority of the reproductive factors are likely to be accurately captured through a questionnaire (such as AFB, number of births and ALB), other factors such as age at menarche may not be as reliably recalled [87]. Self-report of lifetime number of sexual partners is also known to be overestimated by some, which could explain the positively skewed distribution we identified [88]. To account for this, we performed a rank-based inverse normal transformation of this variable.
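A rank-based inverse normal transformation of a skewed variable can be sketched as follows. This illustration assumes the Blom offset (c = 3/8) and average ranks for ties; the offset and tie-handling used in the analysis are not stated here, so both are assumptions, and the function name `rank_int` and the example values are hypothetical.

```python
from statistics import NormalDist

def rank_int(values, c=3.0 / 8.0):
    """Map values to normal quantiles based on their ranks (ties receive the average rank)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of tied values starting at position i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    nd = NormalDist()
    return [nd.inv_cdf((r - c) / (n - 2 * c + 1)) for r in ranks]

# Hypothetical positively skewed counts, e.g. a self-reported number with one extreme value
print(rank_int([1, 2, 3, 50, 4]))
```

The transformation preserves the ordering of the observations while removing the influence of extreme values, so an implausibly large report is pulled in to the upper tail of a standard normal distribution rather than dominating the analysis.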

It is also worth noting that some reproductive events may not have been fully captured in the analysis, as certain reproductive milestones may not have been reached by some women. For example, younger women who reported not having had children may subsequently have children. In addition, the ALB and number of births may not reflect final reproductive milestones if some women go on to have more children. However, considering that the mean age of UK Biobank women is 56.4 years (SD = 8.0), there are likely to be few women who go on to have more children.

The split-sample GWAS revealed little overlap between the genome-wide significant SNPs identified in each sample. While some of these SNPs fell only slightly below the significance threshold in the other sample, others appeared not to be associated. This suggests that some SNPs may have been identified through spurious associations and may indicate winner’s curse.

Conclusion

In conclusion, we present evidence of inter-relationships between reproductive factors. In particular, we find strong evidence of an effect of age at menarche, AFS and AFB on subsequent reproductive events and factors. Future work should consider the inter-relationships between reproductive factors when assessing the effects of reproductive factors on disease outcomes.

Availability of data and materials

The availability of all data analysed in this study has been referenced throughout the manuscript and supplementary materials.

https://www.reprogen.org/

https://www.thessgac.org/

Abbreviations

AFB: Age at first birth

AFS: Age first had sexual intercourse

ALB: Age at last birth

B: Beta

CI: Confidence interval

GWAS: Genome-wide association studies

IEU: Integrative Epidemiology Unit

IVW: Inverse variance weighted

LD: Linkage disequilibrium

LDSC: Linkage disequilibrium score regression

MR: Mendelian randomisation

MRC: Medical Research Council

MR-PRESSO: Mendelian Randomisation Pleiotropy RESidual Sum and Outlier

OR: Odds ratio

SD: Standard deviation

SNP: Single nucleotide polymorphism

References

  1. Rich-Edwards JW. Reproductive health as a sentinel of chronic disease in women. Womens Health (Lond). 2009;5(2):101–5.

  2. Collaborative Group on Hormonal Factors in Breast Cancer. Menarche, menopause, and breast cancer risk: individual participant meta-analysis, including 118 964 women with breast cancer from 117 epidemiological studies. Lancet Oncol. 2012;13(11):1141–51.

  3. Ewertz M, Duffy SW, Adami HO, Kvale G, Lund E, Meirik O, et al. Age at first birth, parity and risk of breast cancer: a meta-analysis of 8 studies from the Nordic countries. Int J Cancer. 1990;46(4):597–603.

  4. Tang R, Fraser A, Magnus MC. Female reproductive history in relation to chronic obstructive pulmonary disease and lung function in UK biobank: a prospective population-based cohort study. BMJ Open. 2019;9(10):e030318.

  5. Okoth K, Chandan JS, Marshall T, Thangaratinam S, Thomas GN, Nirantharakumar K, et al. Association between the reproductive health of young women and cardiovascular disease in later life: umbrella review. BMJ. 2020;371:m3502.

  6. Parikh NI, Jeppson RP, Berger JS, Eaton CB, Kroenke CH, LeBlanc ES, et al. Reproductive risk factors and coronary heart disease in the Women’s health initiative observational study. Circulation. 2016;133(22):2149–58.

  7. Cao M, Cui B. Negative Effects of Age at Menarche on Risk of Cardiometabolic Diseases in Adulthood: A Mendelian Randomization Study. J Clin Endocrinol Metab. 2019;105(2):515-522.

  8. Yin X, Zhu Z, Hosgood HD, Lan Q, Seow WJ. Reproductive factors and lung cancer risk: a comprehensive systematic review and meta-analysis. BMC Public Health. 2020;20(1):1458.

  9. Noh JH, Koo H. Older menarche age and short reproductive period linked to chronic kidney disease risk. Medicine (Baltimore). 2019;98(18):e15511.

  10. Kang SC, Jhee JH, Joo YS, Lee SM, Nam KH, Yun HR, et al. Association of reproductive lifespan duration and chronic kidney disease in postmenopausal women. Mayo Clin Proc. 2020;95(12):2621–32.

  11. Hardy R, Kuh D. Reproductive characteristics and the age at inception of the perimenopause in a British National Cohort. Am J Epidemiol. 1999;149(7):612–20.

  12. Henderson KD, Bernstein L, Henderson B, Kolonel L, Pike MC. Predictors of the timing of natural menopause in the multiethnic cohort study. Am J Epidemiol. 2008;167(11):1287–94.

  13. Brand JS, Onland-Moret NC, Eijkemans MJ, Tjonneland A, Roswall N, Overvad K, et al. Diabetes and onset of natural menopause: results from the European prospective investigation into cancer and nutrition. Hum Reprod. 2015;30(6):1491–8.

  14. Li J, Eriksson M, Czene K, Hall P, Rodriguez-Wallberg KA. Common diseases as determinants of menopausal age. Hum Reprod. 2016;31(12):2856–64.

  15. Mishra GD, Pandeya N, Dobson AJ, Chung HF, Anderson D, Kuh D, et al. Early menarche, nulliparity and the risk for premature and early natural menopause. Hum Reprod. 2017;32(3):679–86.

  16. Ruth KS, Perry JR, Henley WE, Melzer D, Weedon MN, Murray A. Events in early life are associated with female reproductive ageing: a UK biobank study. Sci Rep. 2016;6:24710.

  17. van Keep PA, Brand PC, Lehert P. Factors affecting the age at menopause. J Biosoc Sci Suppl. 1979;6:37–55.

  18. Boulet MJ, Oddens BJ, Lehert P, Vemer HM, Visser A. Climacteric and menopause in seven south-east Asian countries. Maturitas. 1994;19(3):157–76.

  19. van Noord PA, Dubas JS, Dorland M, Boersma H, te Velde E. Age at natural menopause in a population-based screening cohort: the role of menarche, fecundity, and lifestyle factors. Fertil Steril. 1997;68(1):95–102.

  20. Kato I, Toniolo P, Akhmedkhanov A, Koenig KL, Shore R, Zeleniuch-Jacquotte A. Prospective study of factors influencing the onset of natural menopause. J Clin Epidemiol. 1998;51(12):1271–6.

  21. Nagel G, Altenburg HP, Nieters A, Boffetta P, Linseisen J. Reproductive and dietary determinants of the age at menopause in EPIC-Heidelberg. Maturitas. 2005;52(3-4):337–47.

  22. Dratva J, Gomez Real F, Schindler C, Ackermann-Liebrich U, Gerbase MW, Probst-Hensch NM, et al. Is age at menopause increasing across Europe? Results on age at menopause and determinants from two population-based studies. Menopause. 2009;16(2):385–94.

  23. Rizvanovic M, Balic D, Begic Z, Babovic A, Bogadanovic G, Kameric L. Parity and menarche as risk factors of time of menopause occurrence. Med Arch. 2013;67(5):336–8.

  24. Zsakai A, Mascie-Taylor N, Bodzsar EB. Relationship between some indicators of reproductive history, body fatness and the menopausal transition in Hungarian women. J Physiol Anthropol. 2015;34:35.

  25. Zhang Q, Wang YY, Zhang Y, Zhang HG, Yang Y, He Y, et al. The influence of age at menarche, menstrual cycle length and bleeding duration on time to pregnancy: a large prospective cohort study among rural Chinese women. BJOG. 2017;124(11):1654–62.

  26. Sandler DP, Wilcox AJ, Horney LF. Age at menarche and subsequent reproductive events. Am J Epidemiol. 1984;119(5):765–74.

  27. Marino JL, Skinner SR, Doherty DA, Rosenthal SL, Cooper Robbins SC, Cannon J, et al. Age at menarche and age at first sexual intercourse: a prospective cohort study. Pediatrics. 2013;132(6):1028–36.

  28. Smith GD, Ebrahim S. ‘Mendelian randomization’: can genetic epidemiology contribute to understanding environmental determinants of disease? Int J Epidemiol. 2003;32(1):1–22.

  29. Davey Smith G, Hemani G. Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Hum Mol Genet. 2014;23(R1):R89–98.

  30. Davies NM, Holmes MV, Davey Smith G. Reading Mendelian randomisation studies: a guide, glossary, and checklist for clinicians. BMJ. 2018;362:k601.

  31. Perry JR, Day F, Elks CE, Sulem P, Thompson DJ, Ferreira T, et al. Parent-of-origin-specific allelic associations among 106 genomic loci for age at menarche. Nature. 2014;514(7520):92–7.

  32. Day FR, Ruth KS, Thompson DJ, Lunetta KL, Pervjakova N, Chasman DI, et al. Large-scale genomic analyses link reproductive aging to hypothalamic signaling, breast cancer susceptibility and BRCA1-mediated DNA repair. Nat Genet. 2015;47(11):1294–303.

  33. Mathieson I, Day FR, Barban N, Tropf FC, Brazel DM, Consortium e, Consortium B, et al. Genome-wide analysis identifies genetic effects on reproductive success and ongoing natural selection at the FADS locus. bioRxiv. 2020.05.19.104455.

  34. Sudlow C, Gallacher J, Allen N, Beral V, Burton P, Danesh J, et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 2015;12(3):e1001779.

  35. Hemani G, Zheng J, Elsworth B, Wade KH, Haberland V, Baird D, Laurin C, Burgess S, Bowden J, Langdon R, et al. The MR-Base platform supports systematic causal inference across the human phenome. Elife. 2018;7:e34408.

  36. Mitchell RC, Elsworth BL, Mitchell R, Raistrick CA, Paternoster L, Hemani G, et al. MRC IEU UK biobank GWAS pipeline version 2: University of Bristol; 2019. https://doi.org/10.5523/bris.pnoat8cxo0u52p6ynfaekeigi.

  37. Loh PR, Tucker G, Bulik-Sullivan BK, Vilhjalmsson BJ, Finucane HK, Salem RM, et al. Efficient Bayesian mixed-model analysis increases association power in large cohorts. Nat Genet. 2015;47(3):284–90.

  38. Bulik-Sullivan BK, Loh PR, Finucane HK, Ripke S, Yang J, Schizophrenia Working Group of the Psychiatric Genomics Consortium, et al. LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat Genet. 2015;47(3):291–5.

  39. Bulik-Sullivan B, Finucane HK, Anttila V, Gusev A, Day FR, Loh PR, et al. An atlas of genetic correlations across human diseases and traits. Nat Genet. 2015;47(11):1236–41.

  40. International HapMap C, Altshuler DM, Gibbs RA, Peltonen L, Altshuler DM, Gibbs RA, et al. Integrating common and rare genetic variation in diverse human populations. Nature. 2010;467(7311):52–8.

  41. Burgess S, Butterworth A, Thompson SG. Mendelian randomization analysis with multiple genetic variants using summarized data. Genet Epidemiol. 2013;37(7):658–65.

  42. Burgess S, Davies NM, Thompson SG. Bias due to participant overlap in two-sample Mendelian randomization. Genet Epidemiol. 2016;40(7):597–608.

  43. Minelli C, Del Greco M. F, van der Plaat DA, Bowden J, Sheehan NA, Thompson J. The use of two-sample methods for Mendelian randomization analyses on single large datasets. Int J Epidemiol. 2021;50(5):1651-1659.

  44. Morrison J, Knoblauch N, Marcus JH, Stephens M, He X. Publisher correction: Mendelian randomization accounting for correlated and uncorrelated pleiotropic effects using genome-wide summary statistics. Nat Genet. 2020;52(7):750.

  45. Greco MF, Minelli C, Sheehan NA, Thompson JR. Detecting pleiotropy in Mendelian randomisation studies with summary data and a continuous outcome. Stat Med. 2015;34(21):2926–40.

  46. Bowden J, Del Greco MF, Minelli C, Zhao Q, Lawlor DA, Sheehan NA, et al. Improving the accuracy of two-sample summary-data Mendelian randomization: moving beyond the NOME assumption. Int J Epidemiol. 2019;48(3):728–42.

  47. Hartwig FP, Davey Smith G, Bowden J. Robust inference in summary data Mendelian randomization via the zero modal pleiotropy assumption. Int J Epidemiol. 2017;46(6):1985–98.

  48. Bowden J, Davey Smith G, Haycock PC, Burgess S. Consistent estimation in Mendelian randomization with some invalid instruments using a weighted median estimator. Genet Epidemiol. 2016;40(4):304–14.

  49. Bowden J, Davey Smith G, Burgess S. Mendelian randomization with invalid instruments: effect estimation and bias detection through egger regression. Int J Epidemiol. 2015;44(2):512–25.

  50. Lawlor DA, Wade K, Borges MC, Palmer TM, Hartwig FP, Hemani G: A Mendelian Randomization dictionary: useful definitions and descriptions for undertaking, understanding and interpreting Mendelian Randomization studies [Internet]. OSF Preprints 2019.

  51. Bowden J, Del Greco MF, Minelli C, Davey Smith G, Sheehan NA, Thompson JR. Assessing the suitability of summary data for two-sample Mendelian randomization analyses using MR-egger regression: the role of the I2 statistic. Int J Epidemiol. 2016;45(6):1961–74.

  52. Verbanck M, Chen CY, Neale B, Do R. Publisher correction: detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat Genet. 2018;50(8):1196.

  53. Burgess S, Thompson SG. Interpreting findings from Mendelian randomization using the MR-egger method. Eur J Epidemiol. 2017;32(5):377–89.

  54. Hemani G, Tilling K, Davey Smith G. Orienting the causal relationship between imprecisely measured traits using GWAS summary data. PLoS Genet. 2017;13(11):e1007081.

  55. Barban N, Jansen R, de Vlaming R, Vaez A, Mandemakers JJ, Tropf FC, et al. Genome-wide analysis identifies 12 loci influencing human reproductive behavior. Nat Genet. 2016;48(12):1462–72.

  56. Barban N, Jansen R, de Vlaming R, Vaez A, Mandemakers JJ, Tropf FC, Shen X, Wilson JF, Chasman DI, Nolte IM et al: Genome-wide analysis identifies 12 loci influencing human reproductive behavior. In. https://www.ebi.ac.uk/gwas/publications/27798627: EBI GWAS Catalog; 2016.

  57. Perry JR, Day F, Elks CE, Sulem P, Thompson DJ, Ferreira T, He C, Chasman DI, Esko T, Thorleifsson G: Parent-of-origin-specific allelic associations among 106 genomic loci for age at menarche. In. https://www.reprogen.org/data_download.html: ReproGen Consortium; 2014.

  58. Day FR, Ruth KS, Thompson DJ, Lunetta KL, Pervjakova N, Chasman DI, Stolk L, Finucane HK, Sulem P, Bulik-Sullivan B: Large-scale genomic analyses link reproductive aging to hypothalamic signaling, breast cancer susceptibility and BRCA1-mediated DNA repair. In. https://www.reprogen.org/data_download.html: ReproGen Consortium; 2015.

  59. Taylor AE, Davies NM, Ware JJ, VanderWeele T, Smith GD, Munafo MR. Mendelian randomization in health research: using appropriate genetic variants and avoiding biased estimates. Econ Hum Biol. 2014;13:99–106.

  60. Zheng J, Baird D, Borges MC, Bowden J, Hemani G, Haycock P, et al. Recent developments in Mendelian randomization studies. Curr Epidemiol Rep. 2017;4(4):330–45.

  61. Mounier N, Kutalik Z. Correction for sample overlap, winner’s curse and weak instrument bias in two-sample Mendelian Randomization. bioRxiv. 2021.03.26.437168.

  62. Lawlor DA, Harbord RM, Sterne JA, Timpson N, Davey Smith G. Mendelian randomization: using genes as instruments for making causal inferences in epidemiology. Stat Med. 2008;27(8):1133–63.

  63. Burgess S, Thompson DJ, Rees JMB, Day FR, Perry JR, Ong KK. Dissecting causal pathways using Mendelian randomization with summarized genetic data: application to age at menarche and risk of breast cancer. Genetics. 2017;207(2):481–7.

  64. Ni G, Amare AT, Zhou X, Mills N, Gratten J, Lee SH. The genetic relationship between female reproductive traits and six psychiatric disorders. Sci Rep. 2019;9(1):12041.

  65. Day FR, Helgason H, Chasman DI, Rose LM, Loh PR, Scott RA, et al. Physical and neurobehavioral determinants of reproductive onset and success. Nat Genet. 2016;48(6):617–23.

  66. Sanderson E, Spiller W, Bowden J. Testing and correcting for weak and pleiotropic instruments in two-sample multivariable Mendelian randomization. Stat Med. 2021;40(25):5434–52.

  67. Magnus MC, Guyatt AL, Lawn RB, Wyss AB, Trajanoska K, Kupers LK, et al. Identifying potential causal effects of age at menarche: a Mendelian randomization phenome-wide association study. BMC Med. 2020;18(1):71.

  68. Ding X, Tang R, Zhu J, He M, Huang H, Lin Z, et al. An appraisal of the role of previously reported risk factors in the age at menopause using Mendelian randomization. Front Genet. 2020;11:507.

  69. Lawn RB, Sallis HM, Wootton RE, Taylor AE, Demange P, Fraser A, et al. The effects of age at menarche and first sexual intercourse on reproductive and behavioural outcomes: a Mendelian randomization study. PLoS One. 2020;15(6):e0234488.

  70. Ruth KS, Day FR, Hussain J, Martinez-Marchal A, Aiken CE, Azad A, et al. Genetic insights into biological mechanisms governing human ovarian ageing. Nature. 2021;596(7872):393–7.

  71. Ruth KS, Day FR, Hussain J, Martinez-Marchal A, Aiken CE, Azad A, Thompson DJ, Knoblochova L, Abe H, Tarry-Adkins JL: Genetic insights into biological mechanisms governing human ovarian ageing. In. https://www.reprogen.org/data_download.html: ReproGen Consortium; 2021.

  72. Day FR, Thompson DJ, Helgason H, Chasman DI, Finucane H, Sulem P, et al. Genomic analyses identify hundreds of variants associated with age at menarche and support a role for puberty timing in cancer risk. Nat Genet. 2017;49(6):834–41.

  73. Mills MC, Tropf FC, Brazel DM, van Zuydam N, Vaez A, Agbessi M, Ahsan H, Alves I, Andiappan AK, Arindrarto W, et al. Identification of 371 genetic variants for age at first sex and birth linked to externalising behaviour. Nat Hum Behav. 2021;5(12):1717-1730.

  74. Fenton KA, Hughes G. Sexual behaviour in Britain: why sexually transmitted infections are common. Clin Med (Lond). 2003;3(3):199–202.

  75. Ellis BJ, Bjorklund DF. Beyond mental health: an evolutionary analysis of development under risky and supportive environmental conditions: an introduction to the special section. Dev Psychol. 2012;48(3):591–7.

  76. Ellis BJ. Timing of pubertal maturation in girls: an integrated life history approach. Psychol Bull. 2004;130(6):920–58.

  77. Carter AR, Sanderson E, Hammerton G, Richmond RC, Davey Smith G, Heron J, et al. Mendelian randomisation for mediation analysis: current methods and challenges for implementation. Eur J Epidemiol. 2021;36(5):465–78.

  78. Falci CD, Mortimer JT, Noel H. Parental timing and depressive symptoms in early adulthood. Adv Life Course Res. 2010;15(1):1–10.

  79. Aitken Z, Hewitt B, Keogh L, LaMontagne AD, Bentley R, Kavanagh AM. Young maternal age at first birth and mental health later in life: does the association vary by birth cohort? Soc Sci Med. 2016;157:9–17.

  80. Mendle J, Ryan RM, McKone KMP. Age at menarche, depression, and antisocial behavior in adulthood. Pediatrics. 2018;141(1):e20171703.

  81. Copeland W, Shanahan L, Miller S, Costello EJ, Angold A, Maughan B. Outcomes of early pubertal timing in young women: a prospective population-based study. Am J Psychiatry. 2010;167(10):1218–25.

  82. Peters SA, Woodward M. Women’s reproductive factors and incident cardiovascular disease in the UK biobank. Heart. 2018;104(13):1069–75.

  83. Burgess S, Thompson SG. Multivariable Mendelian randomization: the use of pleiotropic genetic variants to estimate causal effects. Am J Epidemiol. 2015;181(4):251–60.

  84. Gormley M, Dudding T, Kachuri L, Burrows K, Chong AHW, Martin RM, et al. Investigating the effect of sexual behaviour on oropharyngeal cancer risk: a methodological assessment of Mendelian randomization. medRxiv. 2021.06.21.21259261.

  85. Forman MR, Mangini LD, Thelus-Jean R, Hayward MD. Life-course origins of the ages at menarche and menopause. Adolesc Health Med Ther. 2013;4:1–21.

  86. Office for National Statistics. Births by parents’ characteristics, England and Wales. www.ons.gov.uk; 2019.

  87. Cooper R, Blell M, Hardy R, Black S, Pollard TM, Wadsworth ME, et al. Validity of age at menarche self-reported in adulthood. J Epidemiol Community Health. 2006;60(11):993–7.

  88. Graham CA, Catania JA, Brand R, Duong T, Canchola JA. Recalling sexual behavior: a methodological analysis of memory recall bias via interview using the diary as the gold standard. J Sex Res. 2003;40(4):325–32.

Acknowledgements

This research was conducted using the UK Biobank Resource under application number 6326. We thank the participants and researchers from the UK Biobank who contributed or collected data. This work was carried out using the computational facilities of the Advanced Computing Research Centre, University of Bristol—http://www.bris.ac.uk/acrc/.

Funding

All authors work in a unit that receives funding from the University of Bristol and the UK Medical Research Council (MC_UU_00011/1, MC_UU_00011/5, MC_UU_00011/6). Further support was provided by the CRUK-funded Integrative Cancer Epidemiology Programme (C18281/A29019). C.P. is supported by a Wellcome Trust PhD studentship in Molecular, Genetic and Lifecourse Epidemiology (108902/B/15/Z). G.C.S. is supported by the MRC (New Investigator Research Grant, MR/S009310/1) and the European Joint Programming Initiative ‘A Healthy Diet for a Healthy Life’ (JPI HDHL, NutriPROGRAM project, UK MRC MR/S036520/1). L.D.H. is supported by Career Development Awards from the UK Medical Research Council (MR/M020894/1). R.C.R. is a de Pass Vice Chancellor’s Research Fellow at the University of Bristol.

Author information

Contributions

C.P. was responsible for the analysis, investigation, and writing of the original draft. G.C.S., L.D.H., A.F. and R.C.R. were responsible for conceptualisation, writing, review, editing, and supervision. R.C.R. was additionally responsible for investigation. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Claire Prince.

Ethics declarations

Ethics approval and consent to participate

UK Biobank received ethical approval from the North West Multi-Centre Research Ethics Committee (REC reference: 16/NW/0274) and was conducted in accordance with the principles of the Declaration of Helsinki.

Consent for publication

This manuscript does not include details, images or videos relating to an individual person; therefore, consent for publication is not required, beyond the informed consent provided by all study participants as described above.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Figure S1.

Forest plots showing effect estimates from additional MR methods for the relationships identified in the primary MR analysis. Panels A–P correspond to the relationships assessed using MR, and the MR method used is shown on the y-axis.

Additional file 2: Tables S1–S24.

  • Table S1. Relationships where bi-directional MR was performed due to unclear temporal ordering.
  • Table S2. Relationships investigated.
  • Table S3. Replication consortia and studies.
  • Table S4. Genetic correlation results.
  • Table S5. Primary analysis (IVW).
  • Table S6. Negative control results.
  • Table S7. Heterogeneity for primary analysis.
  • Table S8. Additional MR methods in relation to primary analysis.
  • Table S9. Egger intercept test for the primary analysis.
  • Table S10. I-squared statistics.
  • Table S11. MR PRESSO Global test for primary analysis.
  • Table S12. MR PRESSO Outlier correction for primary analysis.
  • Table S13. Steiger results for the primary analysis.
  • Table S14. Steiger: SNPs found to be in the incorrect direction for the primary analysis.
  • Table S15. Steiger filtered MR results for the primary analysis.
  • Table S16. Split sample SNPs, R2, F statistics and number of overlapping SNPs.
  • Table S17. Split sample GWAS overlapping SNPs between samples.
  • Table S18. IVW UKBB split sample results.
  • Table S19. UKBB meta-analysed split sample results.
  • Table S20. Replication SNPs, R2 and F statistics.
  • Table S21. IVW UKBB and replication results.
  • Table S22. MRlap number of SNPs.
  • Table S23. MRlap observed and corrected results.
  • Table S24. MVMR findings adjusted for childhood body size (UK Biobank).

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Prince, C., Sharp, G.C., Howe, L.D. et al. The relationships between women’s reproductive factors: a Mendelian randomisation analysis. BMC Med 20, 103 (2022). https://doi.org/10.1186/s12916-022-02293-5

  • DOI: https://doi.org/10.1186/s12916-022-02293-5

Keywords