Mendelian randomization study of inflammatory bowel disease and bone mineral density

Background Recently, the association between inflammatory bowel disease (including ulcerative colitis and Crohn’s disease) and BMD has attracted great interest in the research community. However, the results of the published epidemiological observational studies on the relationship between inflammatory bowel disease and BMD are still inconclusive. Here, we performed a two-sample Mendelian randomization analysis to investigate the causal link between inflammatory bowel disease and level of BMD using publically available GWAS summary statistics. Methods A series of quality control steps were taken in our analysis to select eligible instrumental SNPs which were strongly associated with exposure. To make the conclusions more robust and reliable, we utilized several robust analytical methods (inverse-variance weighting, MR-PRESSO method, mode-based estimate method, weighted median, MR-Egger regression, and MR.RAPS method) that are based on different assumptions of two-sample MR analysis. The MR-Egger intercept test, Cochran’s Q test, and “leave-one-out” sensitivity analysis were performed to evaluate the horizontal pleiotropy, heterogeneities, and stability of these genetic variants on BMD. Outlier variants identified by the MR-PRESSO outlier test were removed step-by-step to reduce heterogeneity and the effect of horizontal pleiotropy. Results Our two-sample Mendelian randomization analysis with two groups of exposure GWAS summary statistics and four groups of outcome GWAS summary statistics suggested a definitively causal effect of genetically predicted ulcerative colitis on TB-BMD and FA-BMD but not on FN-BMD or LS-BMD (after Bonferroni correction), and we merely determined a causal effect of Crohn’s disease on FN-BMD but not on the others, which was somewhat inconsistent with many published observational researches. The causal effect of inflammatory bowel disease on TB-BMD was significant and robust but not on FA-BMD, FN-BMD, and LS-BMD, which might result from the cumulative effect of ulcerative colitis and Crohn’s disease on BMDs. Conclusions Our Mendelian randomization analysis supported the causal effect of ulcerative colitis on TB-BMD and FA-BMD. As to Crohn’s disease, only the definitively causal effect of it on decreased FN-BMD was observed. Updated MR analysis is warranted to confirm our findings when a more advanced method to get less biased estimates and better precision or GWAS summary data with more ulcerative colitis and Crohn’s disease patients was available.


Background
The incidence of aging-related disorders has dramatically increased in modern society for improved healthcare, socio-economic, and lifestyle changes which greatly increased life expectancy [1]. Osteoporosis is a common, aging-related systemic skeletal disease characterized by decreased bone strength, micro-architectural deterioration of bone tissue, and consequent increased fracture risk [2,3]. It is clinically diagnosed largely through measurement of bone mineral density (BMD) at central sites (the lumbar spine and the proximal femur) and peripheral sites (including the distal forearm) as examined by dual-energy X-ray absorptiometry (DXA) [2,4]. In the USA, the prevalence of osteoporosis is estimated to increase to more than14 million cases in 2020, and the burden is projected to increase to exceed 3 million fractures and $25.3 billion each year by 2025 [5]. Clearly, the severe clinical and economic consequences of osteoporosis urgently call for a concerted effort to identify the risk factors causing osteoporosis and assess patients at risk to allow for prevention and early intervention when appropriate. The etiology of osteoporosis is not well understood. It is well recognized that increasing age, female gender, and a wide range of clinical factors, medical factors, behavior factors, nutritional factors, and genetic factors are associated with the disease [2,[6][7][8][9]. Many studies demonstrated that the potential risk factors including cigarette smoking, heavy alcohol intake, caffeine intake, glucocorticoid therapy, low body mass index (BMI), physical inactivity, gastrointestinal diseases, hematologic disorders, and calcium and vitamin D deficiency may contribute to low BMD and fractures [2,10,11].
Inflammatory bowel disease (IBD), which includes ulcerative colitis (UC) and Crohn's disease (CD), is a chronic, relapsing inflammatory condition of the gastrointestinal tract [12]. It affects more than 2.5 million people in Europe, with increasing prevalence in Asia and developing countries [13]. Recently, the association between IBD and BMD has gained great interest. However, available epidemiological evidences on the effects of IBD on the level of BMD are inconclusive. A populationbased prospective study containing 60 UC patients and 60 CD patients demonstrated that only minor changes in BMD were observed in both CD and UC patients during a 2-year period [14]. Another study found that steroid-naive young male patients with IBD had lower bone density values than healthy controls [15]. Some studies have revealed that decreased BMD in individuals with IBD was related to corticosteroid use but not the disease itself [16]. And some studies concluded that BMD is reduced in patients with CD but not in patients with UC [17][18][19]. Given that the studies, which have drawn inconsistent conclusions, were either based on limited samples or only explored the correlations between IBD (including UC and CD) and BMD and osteoporosis, and the epidemiological observational studies may be subjected to confounding factors and reverse causality [20]. A study, like randomized controlled trials (RCTs), directly inferring the causal relationship between IBD and BMD and osteoporosis is helpful for the prevention and early intervention of osteoporosis and consequent fractures in high-risk populations. However, RCTs are difficult or impractical to perform for they are expensive, labor resource-intensive, time-consuming, and ethical limitations. As an alternative, Mendelian randomization (MR), mimic the design of RCT, is a popular yet more convenient technique to test the causality between an exposure (IBD) and an outcome (BMD or osteoporosis) [20].
Two-sample MR is a technique, using germline genetic variants as instrument variables (IV) for exposure to study the causal relations between the exposure phenotype and the outcome phenotype. It enables the use of publically available results from very large genome-wide association studies (GWAS) for both risk factor "exposures" and disease "outcomes" and overcomes the typical pitfalls present in observational studies. In order to obtain unbiased estimates, MR need to fulfill three key assumptions: IV1-genetic variants used in the analysis should be significantly associated with the exposure; IV2-genetic variants extracted as instrument variables for exposure are independent of confounding factors that are associated with the selected exposure and outcome; and IV3-the genetic variants affects the outcome only through the exposure and not via other biological pathways (i.e., no horizontal pleiotropic effect) [21]. Here, a two-sample Mendelian randomization analysis was performed to investigate the causal link between IBD (including UC and CD) and decreased BMD, in which we used the summary statistics from GWAS data of IBD (including UC and CD) and BMDs (including total body BMD (TB-BMD), femoral neck BMD (FN-BMD), lumbar spine BMD (LS-BMD), and forearm BMD (FA-BMD)).

IBD and BMD GWAS summary statistics
To obtain a more comprehensive and reliable conclusion of the causal link between IBD and BMDs, we selected the largest GWAS published to date for IBD including UC and CD [22]. Another study with a larger GWAS of IBD was also included for replication purposes [23]. Full summary statistics for the IBD (unit, logOR) GWAS are available for download from the International IBD Genetics Consortium's website at https://www.ibdgenetics. org/downloads.html. The datasets used for replication are available at https://gwas.mrcieu.ac.uk/datasets/. The femoral neck, lumbar spine, and forearm are the three common skeletal sites of postmenopausal women and men who are 50 years or older for measurement of BMD based on DXA. Total body BMD (TB-BMD) GWAS summary data is used to estimate the general effect of IBD on whole-body BMD. TB-BMD measurement is the most appropriate method for an unbiased assessment of BMD variation in the same skeletal site from childhood to old age [24]. GWAS summary statistics for BMDs (unit, g/cm 2 ) was downloaded from the GEnetic Factors for OSteoporosis Consortium website (GEFOS, http:// www.gefos.org/). We also could download GWAS summary statistics of IBD and BMD from the publicly available GWAS catalog website (https://www.ebi.ac.uk/ gwas/downloads/summary-statistics) or IEU GWAS database (https://gwas.mrcieu.ac.uk/datasets/). The corresponding effect estimates of SNP on IBD (including UC and CD) and BMD had been adjusted for many principal components. The diagnosis of IBD was based on accepted radiologic, endoscopic, and histopathologic criteria. Measurement of BMD was recommended utilizing dual-energy X-ray absorptiometry.
The summary statistics of the largest GWAS published to date for IBD (N = 12,882 cases, 21,770 controls), UC (N = 6968 cases, 20,464 controls), and CD (N = 5956 cases, 14,927 controls) was obtained from the International IBD Genetics Consortium [22]. All participants were of European ancestry.
Summary statistics of a combined analysis including 38,565 IBD cases and 37,747controls and immunochipwide association analyses with UC (N = 10,920 cases, 15, 977 controls) and CD (N = 14,763 cases, 15,977 controls) were included for replication purposes [23]. To reduce the possibility of population stratification, all participants were of European ancestry. GWAS summary statistics were downloaded from https://gwas.mrcieu.ac.uk/datasets/.
A meta-analysis comprising 56,284 individuals of European ancestry was performed to investigate the genetic determinants of total body bone mineral density (TB-BMD) [24]. The meta-analyzed effect size estimates were used in this study. The GWAS summary statistic of TB-BMD was downloaded from the GEFOS website.

Genetic instrumental variables
From the GWAS summary data of IBD including UC and CD, we conducted a series of quality control steps to select eligible instrumental SNPs. Firstly, we extracted SNPs associated with IBD with genome-wide significance (P < 5 × 10 −8 ). Secondly, it is important to ensure that all the instrumental SNPs for the exposure are not in linkage disequilibrium (LD), since instrumental SNPs in strong LD may cause biased results. In this study, we performed the clumping process (R 2 < 0.001, window size = 10,000 kb) with the European samples from the 1000 genomes project which were used to estimate LD between SNPs. Among those pairs of SNPs that had LD R 2 above the specified threshold (R 2 = 0.001) only the SNP with the lower P value would be retained. SNPs absent from the LD reference panel were also removed. Thirdly, SNPs with minor allele frequency (MAF) < 0.01 were removed. Fourthly, extracting data for the aboveselected SNPs from the outcome trait (BMDs) GWAS summary. By default, if a particular requested SNP was not present in the outcome GWAS, then a SNP (proxy) that was in LD with the requested SNP (target) would be searched for instead. LD proxies were defined using 1000 genomes of European sample data. The effect of the proxy SNP on the outcome was returned, along with the proxy SNP, the effect allele of the proxy SNP, and the corresponding allele (in phase) for the target SNP. Fifthly, the effect of ambiguous SNPs with nonconcordant alleles (e.g., A/G vs. A/C) and palindromic SNPs with an ambiguous strand (i.e., A/T or G/C) was corrected or the ambiguous and palindromic SNPs were directly excluded from the above-selected instrument SNPs in harmonizing process to ensure that the effect of a SNP on the exposure, and the effect of that same SNP on the outcome, corresponds to the same allele. These stringently selected SNPs were used as the instrumental variables for subsequent two-sample MR analysis.
According to the assumptions of MR analysis, the selected instrumental SNPs should strongly associate with exposure. To test whether there was a weak instrumental variable bias, namely genetic variants selected as instrumental variables had a weak association with exposure, we calculated the F statistic (F = R 2 (n − k − 1)/k(1 − R 2 ); R 2 , variance of exposure explained by selected instrumental variables, and we got the value of R 2 in MR Steiger directionality test; n, sample size; and k, number of instrumental variables). If the F statistic is much greater than 10 for the instrument-exposure association, the possibility of weak instrumental variable bias is small [25].

Mendelian randomization estimates
MR analysis uses genetic variants as instrumental variables to estimate the causative effect of exposure variables on an outcome. In the study, we combined the summary statistics (β coefficients and standard errors) to estimate the causal associations between IBD (including UC and CD) and BMDs (including TB-BMD, FN-BMD, LS-BMD, and FA-BMD) using different methods. Since it is unlikely that all genetic variants would be valid instrumental variables, several robust methods have been proposed. The methods which included inverse variance weighting (IVW), MR-Pleiotropy RESidual Sum and Outlier (MR-PRESSO) method, mode-based estimate (MBE) method, weighted median (WM), MR-Egger regression, and robust adjusted profile score (MR.RAPS) method were based on different assumptions.
The IVW method uses a meta-analysis approach to combine Wald estimates for each SNP (i.e., the β coefficient of the SNP for BMD divides by the β coefficient of the SNP for IBD) to get the overall estimates of the effect of IBD on BMD [26]. If there is no violation of the IV2 assumption (no horizontal pleiotropy), or the horizontal pleiotropy is balanced, an unbiased causal estimate can be obtained by IVW linear regression [27]. Fixed and random effects IVW approaches are available. If significant heterogeneity (P < 0.05) is observed, a random-effect IVW model is applied. MR-PRESSO is a method for the detection and correction of outliers in IVW linear regression. MR-PRESSO has three components, including (a) detection of horizontal pleiotropy (MR-PRESSO global test), (b) correction for horizontal pleiotropy via outlier removal (MR-PRESSO outlier test), and (c) testing of significant differences in the causal estimates before and after correction for outliers (MR-PRESSO distortion test). The MR-PRESSO outlier test requires that at least 50% of the variants are valid instruments, has balanced pleiotropy, and relies on the Instrument Strength Independent of Direct Effect (InSIDE) condition that instrument-exposure and pleiotropic effects are uncorrelated [28]. The mode-based method clusters the SNPs into groups basing on the similarity of causal effects and returns the causal effect estimate basing on the cluster that has the largest number of SNPs.
The causal estimate from the mode-based estimator is unbiased if the SNPs contributing to the largest cluster are valid instruments even if the majority of instruments are invalid [29]. The median-based approach will provide an unbiased estimate of the causal effect in the presence of unbalanced horizontal pleiotropy even when up to 50% of SNPs are invalid IVs (e.g., due to pleiotropy) [30]. If there is a particular direction of the horizontal pleiotropic effect, then constraining the slope to go through zero will introduce bias. Egger regression which allows the intercept to pass through a value other than zero will relax the constraint. The MR-Egger regression, based on the assumption of InSIDE, performs a weighted linear regression of the outcome coefficients on the exposure coefficients [31]. Under the InSIDE assumption, it gives a valid test of the null causal hypothesis and a consistent causal effect estimate even when all the genetic variants are invalid IVs [31]. However, MR-Egger estimates may be inaccurate and can be strongly influenced by outlying genetic variants. The WM estimate which does not require the InSIDE assumption has been confirmed to have distinct superiorities over MR-Egger for its improved power of causal effect detection and lower type I error [30]. When the InSIDE assumption is valid and the percentage of horizontal pleiotropic variants is small (≤ 10%), the causal estimate of the MR-PRESSO outlier adjustment is less biased and has better precision (smaller standard deviation) than MR-Egger. However, when the percentage of horizontal pleiotropic variants is high (≥ 50%), the opposite is found [28]. The weighted median has less bias but also less precision in the causal estimate compared to the MR-PRESSO outlier test, particularly when the percentage of horizontal pleiotropic variants is < 50% [28]. Since we included many weak instrumental variables in the analyses, we carried out a recently proposed method called MR.RAPS to make our results more reliable [32]. This method is robust to both systematic and idiosyncratic pleiotropy and can give a robust inference for MR analysis with many weak instruments. It is able to correct for pleiotropy using robust adjusted profile scores and is recommended to routinely use the RAPS estimator in practice, especially if the exposure and the outcome are both complex traits.
If the estimates of different methods are inconclusive, the link between exposure and outcome phenotype with an adjusted P value < 0.05/5 = 0.01 (Bonferroni correction for multiple testing) is considered significant.

Pleiotropy and sensitivity analysis
We conducted the MR-Egger regression to assess the potential pleiotropic effects of the SNPs used as IVs. The intercept term in MR Egger regression can be a useful indication of whether directional horizontal pleiotropy is driving the results of a MR analysis [33]. In MR-PRESSO analysis, it attempts to reduce heterogeneity in the estimate of the causal effect by removing SNPs that contribute to the heterogeneity disproportionately more than expected. The number of distributions in MR-PRESSO analysis was set to 1000. We used the IVW method and MR-Egger regression to detect heterogeneity. The heterogeneities were quantified by Cochran Q statistic; a P value of < 0.05 would be regarded as significant heterogeneity. Additionally, to identify potentially influential SNPs, we performed a "leave-one-out" sensitivity analysis to where the MR is performed again but leaving out each SNP in turn.

Procedures of MR analysis
In our study, we firstly performed MR analysis with all the above-selected SNPs as IVs. If the MR-PRESSO analysis detected a significant horizontal pleiotropy, we shall remove the outlier variants (with P value less than the threshold in the MR-PRESSO outlier test) and perform MR analysis again. After the MR-PRESSO outlier removal step, if the heterogeneity was still significant, we would perform MR analysis under the condition of removing all the SNPs of which the P value was less than 1 in the MR-PRESSO outlier test. At last, if potentially influential SNPs were identified in the "leave-one-out" sensitivity analysis, we should draw the conclusion with caution. A flow chart about the analytical methods and how the MR analysis was performed step-by-step was shown in Fig. 1.

Ethics
Our analysis used published study or publicly available GWAS summary data. No original data was collected for this manuscript, and thus, no ethical committee approval was required. Each study included was approved by their institutional ethics review committees, and all participants provided written informed consent.
All statistical analyses were conducted using R version 3.6.3 (R Foundation for Statistical Computing, Vienna, Austria) using the Two-Sample MR package [27]. P values < 0.05 were considered statistically significant. In multiple testing, an adjusted P value after Bonferroni correction (P < 0.05/N, N = the number of testing methods) was considered statistically significant.

Selection of instrumental variables
Detailed information of LD-independent SNPs (after clumping process) for exposure (IBD, UC, and CD) was listed in Additional file 1. The listed SNPs would be excluded in the following situations: first, in the process of extracting particular SNPs from the outcome (BMDs) GWAS, a particular requested SNP was not present in and a proxy that was in LD with the requested SNP could not be searched from the outcome GWAS. Second, the effect of ambiguous SNPs with non-concordant alleles or palindromic SNPs with ambiguous strand could not be corrected. Eventually, the number of SNPs selected as IVs for exposure in further analyses would be equal to or less than that listed in Additional file 1. F statistics for every instrument-exposure association were much greater than 10 in our study, demonstrating the small possibility of weak instrumental variable bias.

Two-sample Mendelian randomization analysis for causal link of IBD with BMDs
The MR estimates from different methods of assessing the causal effect of IBD on BMDs were presented in Table 1. MR estimates of assessing the causal effect of IBD on BMDs at different steps were presented in Additional file 2: Table S1. The results of Table 1 which contained the last step of MR estimates of Additional file 2: Table S1 demonstrated that genetically predicted IBD was negatively associated with the level of TB-BMD  (117), P = 0.035). Our analysis suggested no significant evidence of horizontal pleiotropy (as indicated by MR-Egger regression intercept close to zero, with a P value larger than 0.05). It was likely that there were SNPs exhibited horizontal pleiotropy in this part (which then tended to cancel out when the estimates were combined together in meta-analysis/Egger regression). The estimated effect sizes of the SNPs on both the exposure (IBD) and BMD outcomes are displayed in scatter plots (Fig. 2). The funnel plots providing an indication of where there existed directional horizontal pleiotropy for each outcome were shown in Additional file 3: Fig. S1. Plots of leave-one-out analysis which were shown in Additional file 3: Fig. S2 demonstrated that there was a potentially influential SNP driving the causal link between IBD and FN-BMD. Thus, we need to carefully interpret the result and draw a cautious conclusion.
In replication practice, the sample size of IBD was much larger than that in initial practice. The results of Table 1 showed the strong causal link of IBD and TB-BMD (IVW: β , which were consistent with that in initial practice. However, no causal effect of IBD on LS-BMD and FA-BMD was found in this section. We detected no heterogeneity and pleiotropy in this part. The scatter plots and funnel plots for each outcome in replication practice were shown in Fig. 3 and Additional file 3: Fig. S3. Plots of the leave-one-out analysis (Additional file 3: Fig. S4) demonstrated that the causal link between IBD and FN-BMD was driven by potentially influential SNPs, and we should carefully interpret the result and draw a cautious conclusion.
The F statistics for instrument IBD are 136.97 in the initial practice and 86.16 in the replication practice, demonstrating the small possibility of weak instrumental variable bias. As the results mentioned above, we may conclude the causal effect of genetically predicted IBD on TB-BMD but not on LS-BMD or FA-BMD. As to FN-BMD, the results of MR analysis in initial practice and replication practice were driven by potentially influential SNPs identified in the   "leave one out" analysis, and we cannot draw a robust or definitive conclusion.   BMD was found in this initial practice using an adjusted P value after Bonferroni correction (P < 0.01). MR estimates of assessing the causal effect of UC on BMDs at different steps were presented in Additional file 2: Table S2. MR-Egger regression tests and heterogeneity tests suggested no significant horizontal pleiotropy and heterogeneities in this part. The scatter plots, funnel plots, and "leave-one-out analysis" plots were shown in Fig. 4 and Additional file 3: Fig. S5 and Fig. S6.

Two-sample Mendelian randomization analysis for causal link of UC with BMDs
In the replication practice, Table 2 demonstrated the negative causal link between UC and TB-BMD (using the IVW, WMM, and MR.RAPS methods) and FA-BMD (using the IVW, MR-Egger regression, and MR.RAPS methods), which was consistent with that in initial   practice. In the FN-BMD and LS-BMD groups, no causal effect of UC on decreased FN-BMD or LS-BMD was found. No directional horizontal pleiotropy and heterogeneities were detected in this section. The scatter plots and funnel plots were shown in Fig. 5 and Additional file 3: Fig. S7. Plots of the leave-one-out analysis (Additional file 3: Fig. S8) demonstrated that there was no potentially influential SNP driving the causal link and our conclusion was of stability. The F statistics for instrument UC in the initial practice and the replication practice are 103.21 and 89.11, respectively. Summarizing the results of Table 2, we could receive the definite causal effect of genetically predicted UC on TB-BMD and FA-BMD but not on FN-BMD or LS-BMD.

Two-sample Mendelian randomization analysis for causal link of CD with BMDs
In the two-sample MR analysis, we found CD did not have a causal link with the change of TB-BMD, LS-BMD, and FA-BMD under different MR methods (Table 3). MR estimates of assessing the causal effect of CD on BMDs at different steps were presented in Additional file 2: Table S3. As to FN-BMD, a negative causal link was found using the IVW method (β (95%CI)    Step # : 2, MR analysis after removing the SNPs (with P value less than threshold in MR-PRESSO outlier test); 3, MR analysis after removing all the SNPs (with   Fig. 7 and Additional file 3: Fig. S11. The plots of the leave-one-out analysis (Additional file 3: Fig. S12) demonstrated no potentially influential SNPs driving the causal link between CD and BMDs in the replication practice.
The F statistics for instrument CD in the initial practice and replication practice are 160.18 and 102.38, respectively. Summarizing the results above, we could conclude there was a causal link of CD on FN-BMD, but not on TB-BMD, LS-BMD, or FA-BMD.

Discussion
In this study, we used summary statistics from GWASs to identify the causal relationships between IBD (including UC and CD) and BMD at different skeletal sites. The results suggested that UC causally decreased TB-BMD and FA-BMD, the estimated effect sizes of UC on FN-BMD were not significant with an adjusted P value after Bonferroni correction, and UC did not definitely decrease LS-BMD, implying that the causal effects of UC on BMD at different skeletal sites were different. Previous studies suggested that only cortical thickness and cortical BMD were different, with smaller values in the UC patients than controls, and no differences were found in the trabecular or endocortical compartments [34]. The adult human skeleton is composed of 80% cortical bone and 20% trabecular bone overall. The vertebra is composed of the cortical to trabecular bone in a ratio of 25:75; this ratio is 50:50 in the femoral head and 95:5 in the radial diaphysis [35]. Thus, we inferred the different effect of UC on BMDs at different skeletal sites may be differently associated with various components of the bone, since the bone from different skeletal sites differs in composition (e.g., different proportions of the trabecular and cortical bones). Radial diaphysis and total body have the highest percentage of the cortical bone in the skeletal sites studied here. From our results, we found that the genetically predicted UC significantly caused a decrease in FA-BMD and TB-BMD. The femoral head and the vertebra have the lowest percentage of the cortical bone for the BMD phenotypes, and the effect of UC on FN-BMD and LS-BMD was not obvious.
Some publications reported that BMD was reduced in patients with CD but not in patients with UC [18,19]. Haschka et al.'s research demonstrated that CD patients exhibited a more severe bone loss phenotype compared with UC patients [34]. The possible reasons might be as follows: CD is a systemic disease with a long premorbid phase, while UC is a mucosal disease with an acute onset and is often limited to the distal colonic tracts. In addition, CD has important immunological differences when compared to UC. The localization of CD is in the small intestine, and intestinal resection may cause malnutrition and estrogen deficiency [36]. However, in Schoon et al.'s research, it concluded no significant differences in BMD between patients with either CD or UC [37]. In this two-sample MR analysis assessing the causal link of IBD (including UC and CD) on BMDs, we determined a causal effect of genetically predicted UC on TB-BMD and FA-BMD, but only get a causal effect of CD on BMD, which was somewhat inconsistent with many published observational researches. The reasons for the difference between our MR analysis results and most other observational researches may be explained as follows: firstly, the results of epidemiological observational studies were affected by other related factors. For example, Bernstein et al.'s publication revealed that decreased BMD in IBD patients was related to corticosteroid use but not the disease itself [16]. The results of Andreassen et al.'s research with 113 CD patients and 113 healthy subjects, individually matched for gender, age, and body weight, showed that BMD of patients with CD was not different from that of healthy controls except for a decreased BMD of the hip in female patients, and gender, age, and body weight are the major determinants of BMD in patients with CD [38]. And in this MR analysis, the corresponding effect estimate of SNP on IBD (including UC and CD) and BMD had been adjusted for many principal components. Secondly, the results of our MR analysis might be biased by pleiotropy. We did not search through the Ensembl Project or Phe-noScanner database as previous studies to screen genetic variants which are associated with confounding factors [39,40]. We just performed the MR-PRESSO outlier test to identify and remove outlier variants. However, we deemed the possibility that pleiotropy significantly biased the results of our analysis was tiny, as several robust methods for MR have been performed, which can provide reliable inferences when some genetic variants violate the IV assumptions. Otherwise, we included an IBD (including UC and CD) GWAS dataset for replication purposes. It would make our conclusions more robust and reliable. Further MR analysis with more CD patients or more advanced methods to get less biased estimates and better precision is warranted in the future to confirm the relationship between CD and the level of BMDs. The causal effect of IBD on TB-BMD was significant and robust but not on FA-BMD or LS-BMD after Bonferroni correction. As to FN-BMD, the causal effect was the lack of stability. This might result from the cumulative effect of UC and CD on BMDs.
There are two types of pleiotropy (vertical pleiotropy and horizontal pleiotropy). Vertical pleiotropy occurs when a variant is directly associated with the exposure and another phenotype on the same biological pathway. This does not lead to the violation of the IV assumptions providing the only causal pathway from the genetic variant to the outcome passes via the exposure. Horizontal pleiotropy occurs when the second phenotype is on a different biological pathway, and so, there may exist different causal pathways from the variant to the outcome. This would violate the IV3 assumption [41]. To solve the problems that arise due to horizontal pleiotropy, several robust MR methods besides IVW have been performed in our study. The methods can be divided into three categories: consensus methods (e.g., weighted median, mode-based method), outlier-robust methods (e.g., MR-PRESSO), and modeling methods (e.g., MR-Egger and MR-RAPS), and the methods mentioned each possesses its own advantages [41]. Investigators should perform a range of robust methods that come from different categories and that operate in different ways and rely on different assumptions for valid inferences to assess the reliability of MR analyses. The other measures that might be taken to reduce the effect of horizontal pleiotropy were searching through the Ensembl Project or PhenoScanner database to identify and exclude genetic variants relating to confounding factors. We have not admitted this measure as it would not necessarily differentiate between horizontal and vertical pleiotropy, where only the former would bias MR studies. On the other hand, the exact biological function of many genetic variants is unknown.
Our research was the first MR analysis of this topic. In this study, we selected SNPs with genome-wide association and independent inheritance without any LD as IVs to detect the causal link between IBD (including UC and CD) and BMDs. To make our conclusions more robust and reliable, the outlier variants identified by the MR-PRESSO outlier test were removed step-by-step. We also utilized several robust analytical methods based on different assumptions of two-sample MR analysis with four groups of outcome summary GWAS data (TB-BMD, FN-BMD, LS-BMD, and FA-BMD) and two groups of exposure summary GWAS data. Instead of using just a few strong SNPs as IVs, we utilized many (potentially hundreds of) stringently selected weak SNPs as the IVs for our two-sample MR analysis, which usually substantially decreases the variance of the estimator. Since we included many weak instrumental variables in the analysis, the F statistic was used to assess the strength of the association between the genetic variants and exposure. The F statistics were much greater than 10 in our analysis, hinting the small possibility of weak instrumental variable bias [25]. We also carried out the MR.RAPS method, which can give a robust inference for our MR analysis with many weak IVs. Lastly, the summary GWAS data we drew for IBD (including UC and CD) and BMDs consisted uniquely of individuals of European descent and had been adjusted for many principal components, which would reduce potential bias.
Some limitations of our MR analysis need to be considered. First, the exposure and outcome studies used in two-sample MR analysis should not involve overlapping participants. We were not able to estimate the degree of overlap in the study. However, bias from sample overlap can be minimized by using strong instruments (e.g., F statistic much greater than 10), [42]. Second, the summary GWAS data merely concern individuals of European descent, and our results may not be fully representative of the whole population. So, we should carefully utilize our conclusion in racially and ethnically diverse populations. Third, we cannot expel the possibility that horizontal pleiotropy affected our results, even though we took steps to identify and exclude outlier variants. Fourth, each method we utilized in the analysis has its own strengths and weaknesses. However, the use of so many methods based on different assumptions may increase the possibility of getting inconsistent or contrary results and make the conclusion become obscured.

Conclusion
In this study, our aim is to assess the causal effect of IBD (including UC and CD) on decreased BMD by using two-sample MR analysis. The results of our research got a definitively causal effect of genetically predicted UC on TB-BMD and FA-BMD but not on FN-BMD or LS-BMD, and we merely determined a causal effect of CD on FN-BMD, which was somewhat inconsistent with many published observational researches. Updated MR analysis is warranted to confirm our findings when a more advanced method to get less biased estimates and better precision or GWAS summary data with more UC and CD patients was available. Foremost, our research reminded clinicians that measures and concerted efforts for prevention of bone loss and early intervention of osteoporosis should be considered when IBD patients are diagnosed.
Additional file 1 Detailed information of LD-independent SNPs (after clumping process) for exposure (IBD, UC and CD).
Additional file 2: Table S1. MR estimates from different methods of assessing the causal effect of IBD on BMDs step by step. Table S2. MR estimates from different methods of assessing the causal effect of UC on BMDs step by step. Table S3. MR estimates from different methods of assessing the causal effect of CD on BMDs step by step.