- Research article
- Open Access
The hepato-ovarian axis: genetic evidence for a causal association between non-alcoholic fatty liver disease and polycystic ovary syndrome
BMC Medicine volume 21, Article number: 62 (2023)
Recent studies found associations between non-alcoholic fatty liver disease (NAFLD) and polycystic ovary syndrome (PCOS), but the causal nature of this association is still uncertain.
We performed a bidirectional two-sample Mendelian randomization (MR) analysis to test for the causal association between NAFLD and PCOS using data from a large-scale biopsy-confirmed NAFLD genome-wide association study (GWAS) (1483 cases and 17,781 controls) and PCOS GWAS (10,074 cases and 103,164 controls) in European ancestries. Data from glycemic-related traits GWAS (in up to 200,622 individuals) and sex hormones GWAS (in 189,473 women) in the UK Biobank (UKB) were used in the MR mediation analysis to assess potential mediating roles of these molecules in the causal pathway between NAFLD and PCOS. Replication analysis was conducted using two independent datasets from NAFLD and PCOS GWASs in the UKB and a meta-analysis of data from FinnGen and the Estonian Biobank, respectively. A linkage disequilibrium score regression was conducted to assess genetic correlations between NAFLD, PCOS, glycemic-related traits, and sex hormones using full summary statistics.
Individuals with higher genetic liability to NAFLD were more likely to develop PCOS (OR per one-unit log odds increase in NAFLD: 1.10, 95% CI: 1.02–1.18; P = 0.013). MR mediation analyses revealed an indirect causal effect of NAFLD on PCOS via fasting insulin alone (OR: 1.02, 95% CI: 1.01–1.03; P = 0.004) and a suggestive indirect causal effect via fasting insulin in concert with androgen levels. However, the conditional F statistics of NAFLD and fasting insulin were less than 10, suggesting likely weak instrument bias in the MVMR and MR mediation analyses.
Our study suggests that genetically predicted NAFLD is associated with a higher risk of developing PCOS, with little evidence for an effect in the reverse direction. Fasting insulin and sex hormones might mediate the link between NAFLD and PCOS.
Polycystic ovary syndrome (PCOS) is the most common cause of anovulatory infertility, affecting up to approximately 10% of reproductive-age women [1, 2], and it was recently reported that there are up to ~1.55 million incident cases of PCOS among women globally. In addition, women with PCOS are at increased risk of developing long-term endocrine complications and cardiometabolic diseases. Linkages between PCOS and non-alcoholic fatty liver disease (NAFLD), which is characterized by excessive hepatic fat accumulation (steatosis) in the absence of significant alcohol consumption, have been consistently reported [6, 7], and recent large-scale cohort and meta-analysis studies observed that women with PCOS had a higher risk of NAFLD and its more progressive form, non-alcoholic steatohepatitis [8, 9]. The global prevalence of NAFLD has now reached 32.4%, and its incidence among women has been estimated to be nearly 30 cases per 1000 person-years. The annual burden of PCOS and the direct medical costs of NAFLD and related complications were nearly $8 billion and over $137 billion, respectively, in the USA and Europe [11, 12]. However, to date, there are no effective preventive or therapeutic interventions for these two common and burdensome diseases.
In view of the close connection between these two diseases, a novel hepato-ovarian axis was recently hypothesized. Moreover, growing evidence shows that insulin resistance and sex hormones (especially increased serum androgen levels) may play essential roles in the pathophysiology of both NAFLD and PCOS [14, 15]. To date, however, the causal relationship between NAFLD and PCOS, and whether serum androgen levels and insulin resistance mediate the link between these two conditions, have been insufficiently addressed because conventional observational analyses are susceptible to residual confounding and reverse causation bias.
Mendelian randomization (MR) is a statistical approach that can minimize the risk of bias due to residual confounding or reverse causation because it uses germline genetic variants as instrumental variables (IVs) to estimate possible causal effects of modifiable exposures on outcome measures.
Thus, in the present study, we investigated the causal relationship between NAFLD and PCOS using a bidirectional two-sample MR analysis. A linkage disequilibrium score regression (LDSR) was then used to assess the genetic correlation between these two diseases. Furthermore, we performed stepwise multivariable MR (MVMR) analyses to test for the mediating roles of glycemic-related traits and serum androgens.
A schematic overview of the data sources, genetic instrument selection, and statistical analysis in this study is presented in Fig. 1 (panel a). Summary data on NAFLD were obtained from a large genome-wide association study (GWAS) conducted by Anstee et al., which included 1483 cases and 17,781 controls. All NAFLD cases were diagnosed using strict criteria (i.e., liver biopsy). Because no sex-specific GWAS of NAFLD was available, we used data from the NAFLD GWAS in the general population and assumed there were no sex-specific genetic effects for NAFLD, as supported by previous studies. Data on PCOS were obtained from a large-scale meta-analysis of PCOS GWAS conducted by Day et al., including 10,074 cases and 103,164 controls of European ancestry, where participants were diagnosed with PCOS according to National Institutes of Health (NIH) criteria, Rotterdam criteria, or self-reported diagnoses.
Summary data on glycemic-related traits, including fasting glucose and fasting insulin levels (the latter a proxy of insulin resistance), were obtained from a GWAS conducted by Chen et al. that involved 200,622 individuals of European ancestry without known diabetes. Summary data on sex hormones were extracted from a GWAS of serum sex hormone-binding globulin (SHBG) and bioavailable testosterone levels (bioavailable testosterone is calculated using an equation that includes serum total testosterone, SHBG, and albumin concentrations) in up to 189,473 women of European ancestry in the UK Biobank (UKB).
For replication analysis, we used two independent GWASs: a GWAS of NAFLD in the UKB (5921 cases and 366,616 controls) and a GWAS meta-analysis of PCOS (3609 cases and 229,788 controls) in FinnGen and the Estonian Biobank (EstBB). Information on the International Classification of Diseases (ICD) codes that were used to define cases of NAFLD in the UKB and cases of PCOS in FinnGen and the EstBB is presented in Table 1. Detailed information on each set of GWAS summary statistics used in our study can be found in Additional file 1: Table S1.
Genetic instrument selection
In the primary MR analysis, ten genome-wide significant (P < 5 × 10⁻⁸) single nucleotide polymorphisms (SNPs) were identified in the biopsy-based NAFLD GWAS. After linkage disequilibrium (LD) clumping (a window of 10 Mb and r² < 0.001) using the clump_data function in the TwoSampleMR R package, 4 bi-allelic SNPs with minor allele frequency (MAF) > 0.01 were retained as genetic instruments (Table 2). Of 14 genome-wide significant SNPs identified in the PCOS GWAS conducted by Day et al., 13 SNPs were selected as genetic instruments for PCOS after excluding rs853854 (MAF close to 0.5) and LD clumping with the same thresholds as above.
Proxy variant selection and data harmonization
For genetic instruments that were not available in the outcome GWAS summary data, a proxy variant was looked up (a window of 1 Mb and r² ≥ 0.8) in the European 1000 Genomes dataset using LDlink (https://ldlink.nci.nih.gov/?tab=ldproxy). In the data harmonization procedure, we coded the effect allele and the reference allele on the same strand for both exposure and outcome.
Eligible genetic instruments for glycemic-related traits and serum sex hormone levels were selected following the same procedure of LD clumping, proxy variant selection, and data harmonization as above, and are detailed in Additional file 1: Tables S2-S3. In the replication MR analysis, 6 SNPs were selected as genetic instruments for NAFLD and for PCOS, respectively. After obtaining the eligible IVs, we compared the IV-specific causal effect estimates between the most significant variants used in our analysis (i.e., rs17216588, rs2068834, and rs73001065) and the causal variants in high LD with them that were previously reported in the literature (i.e., rs58542926 on TM6SF2 and rs1260326 on GCKR) (Additional file 2: Fig. S1).
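The allele-alignment step of data harmonization can be illustrated with a minimal Python sketch. This is not the authors' code (the analyses used R packages); the function name and arguments are hypothetical, and the sketch assumes both datasets already report alleles on the same strand. It does not handle strand flips or palindromic SNPs, which need allele-frequency checks.

```python
def harmonise_outcome_beta(exp_effect, exp_other, out_effect, out_other, beta_out):
    """Return the outcome beta per copy of the exposure effect allele.

    Hypothetical helper: assumes same-strand allele coding in both datasets.
    Returns None when the allele pairs cannot be matched.
    """
    exposure_pair = (exp_effect.upper(), exp_other.upper())
    outcome_pair = (out_effect.upper(), out_other.upper())
    if outcome_pair == exposure_pair:
        # Alleles already aligned: keep the outcome effect as reported.
        return beta_out
    if outcome_pair == exposure_pair[::-1]:
        # Effect and reference alleles are swapped: flip the sign.
        return -beta_out
    # Allele sets disagree (e.g. different strand or a tri-allelic site).
    return None


# Illustrative values only, not taken from any GWAS.
print(harmonise_outcome_beta("A", "G", "G", "A", 0.05))
```

Returning None instead of guessing leaves it to the analyst to drop or manually resolve SNPs whose alleles cannot be matched.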
Primary MR analysis
A bidirectional MR analysis was performed to determine the causal relationship between NAFLD and PCOS (Fig. 1, panel b). The random-effects or fixed-effects inverse-variance weighted (IVW) method was used in the primary MR analysis, as implemented in the TwoSampleMR R package. In particular, we used the fixed-effects IVW method when there were three or fewer genetic instruments available; otherwise, the random-effects IVW method was used. To assess the strength of the selected genetic instruments in MR analysis, F statistics were calculated, which can be used to examine whether MR estimates are likely to be influenced by weak instrument bias. Instruments with F statistics greater than 10 are generally considered strong. In addition, Cochran’s Q test was conducted to assess the heterogeneity of causal effect estimates between NAFLD and PCOS.
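The quantities named in this paragraph can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the TwoSampleMR code: it uses first-order IVW weights, a multiplicative random-effects model, and the common per-SNP approximation F = (beta/SE)². All function names and input values are hypothetical.

```python
import math


def f_statistic(beta_x, se_x):
    """Approximate per-SNP instrument strength, F = (beta / SE)^2."""
    return (beta_x / se_x) ** 2


def ivw(beta_x, beta_y, se_y, random_effects=None):
    """Inverse-variance weighted MR estimate with Cochran's Q.

    If random_effects is None, follow the rule described in the text:
    fixed effects for three or fewer instruments, random effects otherwise.
    """
    k = len(beta_x)
    if random_effects is None:
        random_effects = k > 3
    # Wald ratio per SNP and its first-order inverse-variance weight.
    ratios = [by / bx for bx, by in zip(beta_x, beta_y)]
    weights = [bx ** 2 / sy ** 2 for bx, sy in zip(beta_x, se_y)]
    total_weight = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    # Cochran's Q: weighted squared deviations of the SNP-specific ratios.
    q = sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))
    if random_effects and k > 1:
        # Multiplicative random effects: inflate SE under over-dispersion.
        se *= max(1.0, math.sqrt(q / (k - 1)))
    return {"estimate": estimate, "se": se, "q": q, "df": k - 1}


# Hypothetical SNP-exposure and SNP-outcome associations (not study data).
bx = [0.20, 0.35, 0.15, 0.40]
by = [0.025, 0.030, 0.020, 0.035]
sy = [0.010, 0.012, 0.011, 0.015]
print(ivw(bx, by, sy))
print([round(f_statistic(b, 0.03), 1) for b in bx])
```

Under the null of no heterogeneity, Q is compared with a chi-squared distribution on the returned degrees of freedom.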
MR mediation analysis
A stepwise MR analysis approach was used to examine whether glycemic-related traits and sex hormones (i.e., serum SHBG and bioavailable testosterone levels) mediate the effect of NAFLD on PCOS (Fig. 1, panel c) [32, 33]. To assess the direct causal effects between NAFLD, glycemic-related traits, sex hormones, and PCOS in each step, we performed an MVMR analysis using the MVMR R package. Conditional F statistics were calculated to assess the strength of the genetic instruments in the MVMR analysis (Additional file 1: Tables S4-S6). Furthermore, to minimize the risk of bias due to horizontal pleiotropy, the MR mediation analysis was conducted after excluding obesity-related genetic variants identified from the PhenoScanner V2 database and the GWAS Catalog. The product of coefficients method and the multivariate delta method were used to calculate the indirect effects of NAFLD on PCOS via mediators. The detailed stepwise MR mediation analysis and the procedure for selecting obesity-related SNPs can be found in Additional file 1: Table S7 and Additional file 2: “Step-wise MR mediation analysis” and “Obesity-related genetic variants selection.”
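For a single mediator, the product of coefficients method and a first-order delta-method standard error can be written as follows. The symbols are assumptions introduced here for illustration: θ_a is the effect of the exposure on the mediator, θ_b is the direct effect of the mediator on the outcome (on the log-odds scale for a binary outcome), and the two estimates are treated as independent.

```latex
\hat{\theta}_{\text{indirect}} = \hat{\theta}_{a}\,\hat{\theta}_{b},
\qquad
\operatorname{SE}\!\left(\hat{\theta}_{\text{indirect}}\right) \approx
\sqrt{\hat{\theta}_{b}^{2}\,\operatorname{SE}(\hat{\theta}_{a})^{2}
    + \hat{\theta}_{a}^{2}\,\operatorname{SE}(\hat{\theta}_{b})^{2}}
```

If the two estimates are correlated, the variance gains a covariance term, 2 θ_a θ_b Cov(θ_a, θ_b), which this simplified form omits.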
Replication MR analysis
A replication bidirectional MR analysis between NAFLD and PCOS was performed using two independent NAFLD and PCOS GWAS datasets [26, 27]. To increase the statistical power and precision of our causal estimates, a fixed-effects meta-analysis was conducted to combine the causal estimates derived from the primary MR analysis and the replication MR analysis using the meta R package. We also replicated the findings of the mediation effects of glycemic traits and serum sex hormone levels using the replication analysis datasets.
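A fixed-effects meta-analysis of two MR estimates amounts to inverse-variance pooling on the log odds ratio scale. The sketch below is a generic Python illustration, not the meta R package, and the input values are hypothetical rather than the study's estimates.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def fixed_effects_pool(log_ors, ses):
    """Inverse-variance weighted pooled log odds ratio and its SE."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


def or_with_ci(log_or, se):
    """Convert a log odds ratio and SE to (OR, lower 95% CI, upper 95% CI)."""
    return (
        math.exp(log_or),
        math.exp(log_or - Z_975 * se),
        math.exp(log_or + Z_975 * se),
    )


# Hypothetical primary and replication estimates on the log-odds scale.
pooled, pooled_se = fixed_effects_pool([math.log(1.10), math.log(1.05)], [0.04, 0.03])
print(or_with_ci(pooled, pooled_se))
```

The pooled standard error is never larger than the smallest input standard error, which is the gain in precision that pooling is meant to deliver.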
MR sensitivity analysis
To examine the robustness of MR effect estimates to potentially invalid genetic variants, we conducted MR-Egger regression, weighted median, and Mendelian randomization pleiotropy residual sum and outlier (MR-PRESSO) tests as sensitivity analyses. Unlike the IVW method, which assumes that all SNPs are valid IVs, MR-Egger regression can generate a consistent estimate even if all the genetic instruments are invalid, provided that the Instrument Strength Independent of Direct Effect (InSIDE) assumption holds. The weighted median model is a robust approach that provides consistent estimates when more than half of the weight in the analysis comes from valid genetic instruments. We used MR-PRESSO to detect the presence of outliers (i.e., potentially pleiotropic SNPs) and to estimate the causal effect after excluding outliers. The leave-one-out (LOO) analysis was used to assess whether the causal effect was driven by an influential SNP by recalculating the MR estimates with one instrument left out at a time. Moreover, we performed an IVW analysis after excluding obesity-related genetic variants.
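Two of these sensitivity analyses are simple enough to sketch directly. The Python below is a simplified illustration, not the implementation in any MR package: MR-Egger is written as a weighted regression of SNP-outcome on SNP-exposure associations with an intercept, and leave-one-out repeats a fixed-effects IVW estimate with each SNP dropped in turn. Function names and data are hypothetical, and standard errors are omitted.

```python
def mr_egger(beta_x, beta_y, se_y):
    """Return (intercept, slope) from a weighted regression with intercept.

    SNPs are first oriented so that all SNP-exposure associations are
    positive. A non-zero intercept points to directional pleiotropy.
    """
    xs, ys = [], []
    for bx, by in zip(beta_x, beta_y):
        sign = -1.0 if bx < 0 else 1.0
        xs.append(sign * bx)
        ys.append(sign * by)
    w = [1.0 / s ** 2 for s in se_y]
    sw = sum(w)
    mean_x = sum(wi * x for wi, x in zip(w, xs)) / sw
    mean_y = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mean_x) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mean_x) * (y - mean_y) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def ivw_estimate(beta_x, beta_y, se_y):
    """Fixed-effects IVW estimate (regression through the origin)."""
    num = sum(bx * by / s ** 2 for bx, by, s in zip(beta_x, beta_y, se_y))
    den = sum(bx ** 2 / s ** 2 for bx, s in zip(beta_x, se_y))
    return num / den


def leave_one_out(beta_x, beta_y, se_y):
    """IVW estimates obtained by dropping each SNP in turn."""
    return [
        ivw_estimate(beta_x[:i] + beta_x[i + 1:],
                     beta_y[:i] + beta_y[i + 1:],
                     se_y[:i] + se_y[i + 1:])
        for i in range(len(beta_x))
    ]


# Hypothetical summary statistics (not study data).
bx = [0.10, 0.20, 0.30, -0.40]
by = [0.012, 0.018, 0.035, -0.041]
sy = [0.010, 0.010, 0.012, 0.015]
print(mr_egger(bx, by, sy))
print(leave_one_out(bx, by, sy))
```

A leave-one-out estimate that departs sharply from the others flags the omitted SNP as potentially influential.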
Genetic correlation analysis
We estimated the genetic correlation between NAFLD, PCOS, glycemic-related traits, and sex hormones via LDSR using the primary and replication GWAS summary datasets, respectively.
Non-collapsibility of the odds ratio
Non-collapsibility of the odds ratio is a challenge in mediation analysis when the outcome is binary, such as NAFLD. To assess whether the binary outcomes used in the MR analysis would affect the estimates and conclusions of our study, a GWAS of magnetic resonance imaging-derived proton density fat fraction (PDFF) in the UKB, which was conducted using a linear model, was used to replicate the causal associations between NAFLD and PCOS (Additional file 2: “Non-collapsibility of the odds ratio” and Fig. S2).
All statistical analyses were undertaken with R version 4.0.2 (R Foundation for Statistical Computing, Vienna, Austria). Given that up to five risk factors (NAFLD, two glycemic-related traits, and two sex hormone traits) were investigated in the MVMR analysis, we applied a Bonferroni correction for multiple testing: a P value less than 0.01 (0.05/5 traits) was considered strong evidence for a causal effect, whereas a P value between 0.01 and 0.05 was considered suggestive evidence.
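The evidence thresholds described above can be expressed as a small Python helper. The function names and category labels are hypothetical and serve only to make the rule explicit.

```python
def bonferroni_threshold(alpha=0.05, n_tests=5):
    """Bonferroni-corrected significance threshold."""
    return alpha / n_tests


def classify_evidence(p_value, alpha=0.05, n_tests=5):
    """Label a P value using the thresholds described in the text."""
    if p_value < bonferroni_threshold(alpha, n_tests):
        return "strong"
    if p_value < alpha:
        return "suggestive"
    return "little evidence"


# P values in the style reported in the results section.
for p in (0.004, 0.0323, 0.564):
    print(p, classify_evidence(p))
```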
Causal effect between NAFLD and PCOS
In the primary MR analysis, we found that genetically predicted NAFLD increased the risk of PCOS by 10% (odds ratio [OR] per one-unit log odds increase in NAFLD: 1.10, 95% confidence interval [CI]: 1.02–1.18; P = 0.013) (Fig. 2, panel a). Additionally, a total effect equated to an OR for PCOS of 1.12 (95% CI: 1.02–1.24; P = 0.019) was estimated in the two-sample MR analysis after excluding an obesity-related SNP (i.e., rs2068834). A similar causal effect (OR: 1.08, 95% CI: 1.01–1.15; P = 0.029) was observed in the replication analysis after excluding one obesity-related SNP (i.e., rs429358). Furthermore, the fixed-effects meta-analysis of the IVW causal estimates derived from the primary and replication MR IVW analysis results generated a pooled positive causal effect equated to an OR for PCOS of 1.08 (95% CI: 1.02–1.14; P = 0.009) per one-unit log odds increase in NAFLD. In contrast, there was little evidence for a causal effect of genetically predicted PCOS on NAFLD risk, which was consistent with the results of replication analysis and sensitivity analyses (Fig. 2, panel b).
F statistics for the respective genetic instruments ranged from 30.8 to 249.4 (Table 2), suggesting that the MR analysis results were unlikely to be influenced by weak instrument bias. For the causal effect of NAFLD on the risk of PCOS, Cochran’s Q statistic was 1.99 (P = 0.575), whereas for the reverse causal effect of PCOS on NAFLD risk, Cochran’s Q statistic was 29.94 (P = 0.003), suggesting potential heterogeneity across SNP-specific causal effect estimates in the reverse direction. The results of the LOO analysis suggested that there was no potentially influential SNP in the primary and replication MR analyses (Additional file 2: Fig. S3). The MR-Egger intercept test results did not show any directional pleiotropy. An outlier-corrected MR-PRESSO test was performed after removing strong outliers among the IVs. Detailed MR-Egger intercept test results and MR-PRESSO global test results can be found in Additional file 1: Tables S8-S9.
Notably, a positive genetic correlation (rg = 0.73, standard error [SE] = 0.27; P = 0.007) between NAFLD and PCOS was observed using the primary GWAS summary statistics via LDSR (Additional file 1: Table S10). Although the replication LDSR analysis generated a weaker genetic correlation (rg = 0.27, SE = 0.19; P = 0.150), the direction was consistent with that observed in the primary analysis. We further tested pair-wise genetic correlations between all traits in the primary and replication analyses, respectively. Detailed information can be found in Additional file 1: Table S10 and Additional file 2: Fig. S4.
Causal effects of NAFLD, glycemic-related traits, sex hormones, and PCOS via stepwise MR mediation analysis
After excluding obesity-related SNPs, MVMR analysis revealed direct causal effects of NAFLD (OR: 1.11, 95% CI: 1.05–1.17; P < 0.001), fasting insulin (OR per increase in natural log-transformed pmol/L fasting insulin: 3.11, 95% CI: 1.68–5.76; P < 0.001), and serum bioavailable testosterone levels (OR per increase in natural log-transformed nmol/L bioavailable testosterone: 1.90, 95% CI: 1.27–2.85; P = 0.002) on the risk of developing PCOS, respectively (Fig. 3, panel a; Additional file 1: Table S4). By contrast, no causal effect was observed for fasting glucose (OR: 0.89, 95% CI: 0.61–1.31; P = 0.564) and SHBG levels (OR: 1.21, 95% CI: 0.72–2.04; P = 0.461) on PCOS risk.
In the following steps of the MR mediation analysis, we found strong evidence for a causal effect of serum SHBG levels (beta: −0.929, 95% CI: −0.969 to −0.888; P < 0.001) on serum bioavailable testosterone levels (Fig. 3, panel b; Additional file 1: Table S5). Furthermore, we observed an inverse causal association between fasting insulin and SHBG levels (beta: −0.280, 95% CI: −0.424 to −0.135; P < 0.001), but no causal association between SHBG levels and either NAFLD (beta: −0.006, 95% CI: −0.023 to 0.010; P = 0.468) or fasting glucose (beta: −0.060, 95% CI: −0.141 to 0.020; P = 0.144) (Fig. 3, panel c; Additional file 1: Table S6).
When we further estimated causal effects on glycemic-related traits, the MR analysis results did not support a causal effect of genetically predicted NAFLD on fasting insulin levels; however, a significant positive causal effect was observed (beta: 0.0152, 95% CI: 0.0087–0.0216; P < 0.001) after excluding the pleiotropic obesity-related SNP (Fig. 3, panel d). Meanwhile, little evidence was found to support a causal effect of NAFLD on fasting glucose levels, which was consistent with the results of sensitivity analyses.
Taken together, we found the following two potential mediation pathways between NAFLD and PCOS: (1) an indirect causal effect of NAFLD on PCOS risk via fasting insulin levels only (θ2×θ6) (OR: 1.02, 95% CI: 1.01–1.03; P = 0.004) and (2) a suggestive indirect causal effect of NAFLD on PCOS risk via circulating levels of fasting insulin, SHBG, and bioavailable testosterone (θ3×θ4×θ5×θ6) (OR: 1.0025, 95% CI: 1.0002–1.0049; P = 0.0323) (Additional file 1: Table S11). These two pathways mediated 14.9% and 2.2% of the total causal effect of NAFLD on PCOS risk, respectively. Detailed estimates of direct and indirect causal effects using the replication datasets can be found in both Additional file 1: Tables S4-S6 and Additional file 2: Fig. S5. The conditional F statistics can be found in Fig. 3 (panel a to panel c) and Additional file 1: Table S4-S6, which suggested weak instrument bias may occur in the MVMR analysis for NAFLD and fasting insulin.
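As a rough arithmetic check of the first pathway, the reported estimates can be combined with the product of coefficients method and a first-order delta-method standard error. The Python sketch below rests on several assumptions that are not stated in the source: the NAFLD to fasting insulin estimate (beta 0.0152) and the direct fasting insulin to PCOS estimate (OR 3.11) are taken to be the two coefficients multiplied, standard errors are back-calculated from the reported 95% CIs, and the two estimates are treated as independent.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def se_from_ci(lower, upper):
    """Back-calculate a standard error from a symmetric 95% CI."""
    return (upper - lower) / (2 * Z_975)


def indirect_effect(a, se_a, b, se_b):
    """Product of coefficients with a first-order delta-method SE."""
    estimate = a * b
    se = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    z = estimate / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P value
    return estimate, se, p_value


# Reported NAFLD -> fasting insulin effect (beta and 95% CI).
a = 0.0152
se_a = se_from_ci(0.0087, 0.0216)
# Reported direct fasting insulin -> PCOS effect (OR and 95% CI), log scale.
b = math.log(3.11)
se_b = se_from_ci(math.log(1.68), math.log(5.76))

est, se, p = indirect_effect(a, se_a, b, se_b)
or_indirect = math.exp(est)
ci_low = math.exp(est - Z_975 * se)
ci_high = math.exp(est + Z_975 * se)
print(round(or_indirect, 2), round(ci_low, 2), round(ci_high, 2), round(p, 3))
```

Under these assumptions the sketch gives an OR of about 1.02 (95% CI about 1.01 to 1.03) with P of about 0.004, in line with the values reported for the first pathway. Small differences from the published figures would be expected from rounding of the inputs.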
In bidirectional MR analyses, we found that genetically predicted NAFLD was causally associated with a higher risk of developing PCOS, whereas there was little evidence for a causal effect of genetically predicted PCOS on the risk of developing NAFLD. In addition, our MR mediation analyses supported a direct causal effect of NAFLD on the risk of developing PCOS along with indirect causal effects via circulating levels of insulin and sex hormones (namely serum SHBG and bioavailable testosterone levels). Therefore, these findings suggest that fasting insulin and serum androgen levels might play mediating roles in the putative causal pathway, which might be the recently proposed hepato-ovarian axis.
Our MR analysis further indicated a causal effect of increased fasting insulin levels (a proxy of insulin resistance) on the risk of PCOS, which was supported by a suggestive causal effect of insulin resistance on PCOS reported in a previous MR study. In the ovarian theca cells, insulin may exert a co-gonadotropin effect by upregulating luteinizing hormone (LH)-induced androgen production. Furthermore, increased serum LH levels and insulin resistance could impair follicle maturation and even cause anovulatory cycles. Previous studies suggested that disruption of insulin receptor signaling in the central nervous system may also contribute to the development of PCOS via the hypothalamic-pituitary-gonadal axis [48–50].
Accumulating evidence supports an association between higher serum androgen levels and PCOS. Moreover, a causal association between increased serum androgen levels and PCOS was reported in a recent MR study and was replicated in the present study. In particular, we found that higher serum bioavailable testosterone levels were causally associated with a higher risk of PCOS, but little evidence was found for a direct causal effect of serum SHBG levels on PCOS risk when adjusting for circulating bioavailable testosterone levels.
A previous MR study reported causal associations of increased fasting insulin with decreased SHBG levels and with higher bioavailable testosterone levels. Our present MR analysis results supported the existence of inverse causal effects of fasting insulin on SHBG levels and of SHBG on serum bioavailable testosterone levels; however, there was little evidence for a direct causal effect of fasting insulin levels on serum bioavailable testosterone after adjusting for SHBG levels. Taken together, our findings suggested that higher fasting insulin levels might affect serum bioavailable testosterone levels mainly through serum SHBG reduction.
In our study, obesity was an essential confounder between NAFLD and glycemic-related traits. Previous research reported that obesity could upregulate pro-inflammatory gene expression, increase pro-inflammatory cytokine production in the liver, and induce hepatic and systemic insulin resistance [52, 53]. Using genetic instruments for NAFLD that excluded one obesity-related SNP (rs2068834), we observed a causal effect of NAFLD on fasting insulin levels, which was consistent with a previous MR study, but not on fasting glucose levels. This finding was also supported by studies showing that hepatic steatosis could impair insulin action and then induce insulin resistance in the liver. It is noteworthy that our study did not find any causal associations between genetically predicted NAFLD and serum SHBG or bioavailable testosterone levels after adjusting for obesity and glycemic-related traits, which was inconsistent with observations from some previous studies suggesting that NAFLD patients were more likely to have lower serum SHBG levels [55, 56]. Previous studies found that circulating levels of SHBG could be upregulated by adiponectin, which was inversely associated with obesity [57, 58]. Thus, the lower serum SHBG levels and higher bioavailable testosterone levels observed among patients with NAFLD in previous observational studies might reflect the effect of obesity, independent of NAFLD.
Our study has several strengths. First, we used the largest and most recent data from GWASs in European ancestry. Second, we comprehensively tested for the potential mediators in the causal pathway between NAFLD and PCOS. Third, we used independent data sources to validate our causal inference.
There are also some important limitations to this study. First, both the biopsy-based NAFLD GWAS by Anstee et al. and the NAFLD GWAS in the UKB, which were used in the present MR study due to a lack of large-scale sex-specific NAFLD GWAS, were conducted in a sex-combined population. Although previous studies found that NAFLD is a sexually dimorphic condition, no sex differences in genetic effects were found for SNPs in genes (or in high LD with genes) including PNPLA3, HSD17B13, TM6SF2, and GCKR, which were the selected genetic instruments for NAFLD in our MR analyses. Second, the disparity in results of mediation and LDSR analyses between different independent datasets might be, at least in part, attributed to varying definitions used for cases of NAFLD and PCOS. In the datasets for primary analysis, cases of NAFLD and PCOS were diagnosed using strict criteria (i.e., liver biopsy and NIH or Rotterdam criteria, respectively), whereas cases of both conditions were identified only by ICD codes in the datasets for replication analysis. PCOS was identified in the FinnGen study using electronic health records since 1968, which may not be as accurate as data using the recent diagnostic criteria. Third, because independent large-scale GWASs of glycemic-related traits and sex hormones were lacking, sample overlap exists between fasting insulin and fasting glucose and between SHBG and bioavailable testosterone levels in the mediation analysis. However, we searched for all available GWASs and selected independent NAFLD and PCOS GWASs in the primary and replication analyses, respectively. Therefore, our main causal effect estimates between NAFLD and PCOS were unlikely to be affected by sample overlap. Fourth, although each exposure was strongly predicted by the genetic variants in the two-sample MR analysis, the MVMR analysis was still likely to be biased by conditionally weak instruments, and weak instrument bias cannot be ruled out in either the primary or the replication MR mediation analyses. The underlying mechanisms of the suggestive causal pathways between NAFLD and PCOS in our study need further investigation.
It is noteworthy that the primary MR analysis found a positive causal effect of NAFLD on PCOS risk using the NAFLD GWAS by Anstee et al., which was conducted within the population of Southern Europe, while the NAFLD and PCOS GWASs used in the replication analysis were conducted in Western and Northern Europe (i.e., the UK and Finland/Estonia) [26, 27]. Although a statistically non-significant causal effect of NAFLD on PCOS risk was observed in our replication MR analysis, the causal effect magnitude and direction were consistent with the primary analysis results. Moreover, the statistically significant pooled MR estimates of primary and replication analysis results supported a causal effect of NAFLD on PCOS risk. Thus, collectively, our results can largely be generalized to European populations.
Our study supported a causal association between genetically predicted NAFLD and higher risk of developing PCOS. Moreover, the underlying mechanisms from NAFLD to PCOS might be linked via higher circulating levels of fasting insulin (a proxy of insulin resistance) and sex hormones (mainly bioavailable testosterone levels). The findings of this study suggested the potential clinical and public health significance of early diagnosis and management of NAFLD for future PCOS prevention. Given that the likelihood of our MVMR analysis results being potentially biased by conditional weak instruments cannot be ruled out, the mediating biomarkers investigated in this study should be cautiously considered as potential therapeutic targets and need to be validated in future larger genetic studies and intervention studies.
Availability of data and materials
All data generated or analyzed during this study are included in Additional file 2: “Additional information on data sources.”
EAF: Effect allele frequency
GWAS: Genome-wide association study
ICD: International Classification of Diseases
LDSR: Linkage disequilibrium score regression
MR-PRESSO: Mendelian randomization pleiotropy residual sum and outlier
MVMR: Multivariable Mendelian randomization
NAFLD: Non-alcoholic fatty liver disease
NIH: National Institutes of Health
OA: Other alleles (reference allele)
PCOS: Polycystic ovary syndrome
SHBG: Sex hormone-binding globulin
SNPs: Single nucleotide polymorphisms
Bozdag G, Mumusoglu S, Zengin D, Karabulut E, Yildiz BO. The prevalence and phenotypic features of polycystic ovary syndrome: a systematic review and meta-analysis. Hum Reprod. 2016;31(12):2841–55.
McCartney CR, Marshall JC. Polycystic ovary syndrome. N Engl J Med. 2016;375(1):54–64.
Liu J, Wu Q, Hao Y, Jiao M, Wang X, Jiang S, et al. Measuring the global disease burden of polycystic ovary syndrome in 194 countries: Global Burden of Disease Study 2017. Hum Reprod. 2021;36(4):1108–19.
Norman RJ, Dewailly D, Legro RS, Hickey TE. Polycystic ovary syndrome. Lancet. 2007;370(9588):685–97.
Anstee QM, Targher G, Day CP. Progression of NAFLD to diabetes mellitus, cardiovascular disease or cirrhosis. Nat Rev Gastroenterol Hepatol. 2013;10(6):330–44.
Macut D, Bozic-Antic I, Bjekic-Macut J, Tziomalos K. Management of endocrine disease: polycystic ovary syndrome and nonalcoholic fatty liver disease. Eur J Endocrinol. 2017;177(3):R145–R58.
Von-Hafe M, Borges-Canha M, Vale C, Leite AR, Sérgio Neves J, Carvalho D, et al. Nonalcoholic fatty liver disease and endocrine axes—a scoping review. Metabolites. 2022;12(4):298.
Kumarendran B, O'Reilly MW, Manolopoulos KN, Toulis KA, Gokhale KM, Sitch AJ, et al. Polycystic ovary syndrome, androgen excess, and the risk of nonalcoholic fatty liver disease in women: a longitudinal study based on a United Kingdom primary care database. PLoS Med. 2018;15(3):e1002542.
Rocha ALL, Faria LC, Guimaraes TCM, Moreira GV, Candido AL, Couto CA, et al. Non-alcoholic fatty liver disease in women with polycystic ovary syndrome: systematic review and meta-analysis. J Endocrinol Investig. 2017;40(12):1279–88.
Riazi K, Azhari H, Charette JH, Underwood FE, King JA, Afshar EE, et al. The prevalence and incidence of NAFLD worldwide: a systematic review and meta-analysis. Lancet Gastroenterol Hepatol. 2022;7(9):851–61.
Funding
This work was supported by the Xinhua Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China [grant number 2021YJRC02]; the National Natural Science Foundation of China [grant number 82103949]; and the Applied Basic Research Project of Shanxi Province, China [grant number 20210302124186].
Ethics approval and consent to participate
All genetic instruments and summary statistics used in this study were obtained from published GWASs. All the participants in each study provided informed consent, and all the participating studies had been approved by the relevant institutional review board. Ethical approval for the present study was not required because no individual-level data were used.
Consent for publication
Competing interests
The authors declare that they have no competing interests.
Additional file 1:
- Table S1. Key characteristics of participating studies.
- Table S2. GWAS significant SNPs used as genetic instruments for fasting insulin and fasting glucose.
- Table S3. GWAS significant SNPs used as genetic instruments for serum SHBG levels and bioavailable testosterone levels in women.
- Table S4. Direct causal effects of NAFLD, fasting insulin, fasting glucose, serum SHBG levels, and serum bioavailable testosterone levels on PCOS risk via multivariable MR analysis.
- Table S5. Direct causal effects of NAFLD, fasting insulin, fasting glucose, and serum SHBG levels on serum bioavailable testosterone levels via multivariable MR analysis.
- Table S6. Direct causal effects of NAFLD, fasting insulin, and fasting glucose on serum SHBG levels via multivariable MR analysis.
- Table S7. Obesity-related genome-wide significant genetic variants.
- Table S8. Directional pleiotropy test using MR-Egger intercepts.
- Table S9. Horizontal pleiotropy test using MR-PRESSO.
- Table S10. Linkage disequilibrium score regression results on genetic correlations between NAFLD, fasting insulin, fasting glucose, SHBG, BT, and PCOS.
- Table S11. Indirect causal effects between NAFLD and PCOS via fasting insulin, serum SHBG levels, and serum bioavailable testosterone levels through step-wise MR analysis.
Additional file 2:
- Additional information on data sources.
- Additional methods and results.
- Fig. S1. Causal effects of NAFLD on PCOS using specific variants.
- Fig. S2. Results of bidirectional MR analysis between NAFLD and PCOS using linear PDFF GWAS and binary NAFLD GWAS.
- Fig. S3. Leave-one-out analysis results of MR causal effects between NAFLD and PCOS.
- Fig. S4. Genetic correlations between NAFLD, FI, FG, SHBG, BT, and PCOS in LDSR analysis.
- Fig. S5. Results of step-wise MR mediation analysis for causal associations between NAFLD, glycemic-related traits, sex hormones, and PCOS using replication GWAS datasets.
- Fig. S6. The overview of the step-wise MR mediation analysis between NAFLD and PCOS via glycemic-related traits and sex hormones.
- Fig. S7. A schematic diagram of calculating the indirect causal effect of NAFLD on PCOS via fasting insulin and sex hormones.
- Fig. S8. Causal effect of NAFLD on PCOS using two-sample MR after excluding BMI-related IV and using MVMR.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
Cite this article
Liu, D., Gao, X., Pan, XF. et al. The hepato-ovarian axis: genetic evidence for a causal association between non-alcoholic fatty liver disease and polycystic ovary syndrome. BMC Med 21, 62 (2023). https://doi.org/10.1186/s12916-023-02775-0
Keywords
- Non-alcoholic fatty liver disease
- Polycystic ovary syndrome
- Mendelian randomization
- Fasting insulin
- Sex hormones
- Hepato-ovarian axis