IBD and BMD GWAS summary statistics
To obtain a more comprehensive and reliable conclusion of the causal link between IBD and BMDs, we selected the largest GWAS published to date for IBD including UC and CD [22]. Another study with a larger GWAS of IBD was also included for replication purposes [23]. Full summary statistics for the IBD (unit, logOR) GWAS are available for download from the International IBD Genetics Consortium’s website at https://www.ibdgenetics.org/downloads.html. The datasets used for replication are available at https://gwas.mrcieu.ac.uk/datasets/. The femoral neck, lumbar spine, and forearm are the three common skeletal sites of postmenopausal women and men who are 50 years or older for measurement of BMD based on DXA. Total body BMD (TB-BMD) GWAS summary data is used to estimate the general effect of IBD on whole-body BMD. TB-BMD measurement is the most appropriate method for an unbiased assessment of BMD variation in the same skeletal site from childhood to old age [24]. GWAS summary statistics for BMDs (unit, g/cm2) was downloaded from the GEnetic Factors for OSteoporosis Consortium website (GEFOS, http://www.gefos.org/). We also could download GWAS summary statistics of IBD and BMD from the publicly available GWAS catalog website (https://www.ebi.ac.uk/gwas/downloads/summary-statistics) or IEU GWAS database (https://gwas.mrcieu.ac.uk/datasets/). The corresponding effect estimates of SNP on IBD (including UC and CD) and BMD had been adjusted for many principal components. The diagnosis of IBD was based on accepted radiologic, endoscopic, and histopathologic criteria. Measurement of BMD was recommended utilizing dual-energy X-ray absorptiometry.
The summary statistics of the largest GWAS published to date for IBD (N = 12,882 cases, 21,770 controls), UC (N = 6968 cases, 20,464 controls), and CD (N = 5956 cases, 14,927 controls) was obtained from the International IBD Genetics Consortium [22]. All participants were of European ancestry.
Summary statistics of a combined analysis including 38,565 IBD cases and 37,747controls and immunochip-wide association analyses with UC (N = 10,920 cases, 15,977 controls) and CD (N = 14,763 cases, 15,977 controls) were included for replication purposes [23]. To reduce the possibility of population stratification, all participants were of European ancestry. GWAS summary statistics were downloaded from https://gwas.mrcieu.ac.uk/datasets/.
Three separate GWAS summary statistics of European participants’ femoral neck bone mineral density (FN-BMD, n = 32,735), lumbar spine bone mineral density (LS-BMD, n = 28,498), and forearm bone mineral density (FA-BMD, n = 8143) were downloaded from GEFOS; it is the largest GWAS on DXA-measured BMD to date [8].
A meta-analysis comprising 56,284 individuals of European ancestry was performed to investigate the genetic determinants of total body bone mineral density (TB-BMD) [24]. The meta-analyzed effect size estimates were used in this study. The GWAS summary statistic of TB-BMD was downloaded from the GEFOS website.
Genetic instrumental variables
From the GWAS summary data of IBD including UC and CD, we conducted a series of quality control steps to select eligible instrumental SNPs. Firstly, we extracted SNPs associated with IBD with genome-wide significance (P < 5 × 10−8). Secondly, it is important to ensure that all the instrumental SNPs for the exposure are not in linkage disequilibrium (LD), since instrumental SNPs in strong LD may cause biased results. In this study, we performed the clumping process (R2 < 0.001, window size = 10,000 kb) with the European samples from the 1000 genomes project which were used to estimate LD between SNPs. Among those pairs of SNPs that had LD R2 above the specified threshold (R2 = 0.001) only the SNP with the lower P value would be retained. SNPs absent from the LD reference panel were also removed. Thirdly, SNPs with minor allele frequency (MAF) < 0.01 were removed. Fourthly, extracting data for the above-selected SNPs from the outcome trait (BMDs) GWAS summary. By default, if a particular requested SNP was not present in the outcome GWAS, then a SNP (proxy) that was in LD with the requested SNP (target) would be searched for instead. LD proxies were defined using 1000 genomes of European sample data. The effect of the proxy SNP on the outcome was returned, along with the proxy SNP, the effect allele of the proxy SNP, and the corresponding allele (in phase) for the target SNP. Fifthly, the effect of ambiguous SNPs with non-concordant alleles (e.g., A/G vs. A/C) and palindromic SNPs with an ambiguous strand (i.e., A/T or G/C) was corrected or the ambiguous and palindromic SNPs were directly excluded from the above-selected instrument SNPs in harmonizing process to ensure that the effect of a SNP on the exposure, and the effect of that same SNP on the outcome, corresponds to the same allele. These stringently selected SNPs were used as the instrumental variables for subsequent two-sample MR analysis.
According to the assumptions of MR analysis, the selected instrumental SNPs should strongly associate with exposure. To test whether there was a weak instrumental variable bias, namely genetic variants selected as instrumental variables had a weak association with exposure, we calculated the F statistic (F = R2(n − k − 1)/k(1 − R2); R2, variance of exposure explained by selected instrumental variables, and we got the value of R2 in MR Steiger directionality test; n, sample size; and k, number of instrumental variables). If the F statistic is much greater than 10 for the instrument-exposure association, the possibility of weak instrumental variable bias is small [25].
Mendelian randomization estimates
MR analysis uses genetic variants as instrumental variables to estimate the causative effect of exposure variables on an outcome. In the study, we combined the summary statistics (β coefficients and standard errors) to estimate the causal associations between IBD (including UC and CD) and BMDs (including TB-BMD, FN-BMD, LS-BMD, and FA-BMD) using different methods. Since it is unlikely that all genetic variants would be valid instrumental variables, several robust methods have been proposed. The methods which included inverse variance weighting (IVW), MR-Pleiotropy RESidual Sum and Outlier (MR-PRESSO) method, mode-based estimate (MBE) method, weighted median (WM), MR-Egger regression, and robust adjusted profile score (MR.RAPS) method were based on different assumptions.
The IVW method uses a meta-analysis approach to combine Wald estimates for each SNP (i.e., the β coefficient of the SNP for BMD divides by the β coefficient of the SNP for IBD) to get the overall estimates of the effect of IBD on BMD [26]. If there is no violation of the IV2 assumption (no horizontal pleiotropy), or the horizontal pleiotropy is balanced, an unbiased causal estimate can be obtained by IVW linear regression [27]. Fixed and random effects IVW approaches are available. If significant heterogeneity (P < 0.05) is observed, a random-effect IVW model is applied. MR-PRESSO is a method for the detection and correction of outliers in IVW linear regression. MR-PRESSO has three components, including (a) detection of horizontal pleiotropy (MR-PRESSO global test), (b) correction for horizontal pleiotropy via outlier removal (MR-PRESSO outlier test), and (c) testing of significant differences in the causal estimates before and after correction for outliers (MR-PRESSO distortion test). The MR-PRESSO outlier test requires that at least 50% of the variants are valid instruments, has balanced pleiotropy, and relies on the Instrument Strength Independent of Direct Effect (InSIDE) condition that instrument-exposure and pleiotropic effects are uncorrelated [28]. The mode-based method clusters the SNPs into groups basing on the similarity of causal effects and returns the causal effect estimate basing on the cluster that has the largest number of SNPs. The causal estimate from the mode-based estimator is unbiased if the SNPs contributing to the largest cluster are valid instruments even if the majority of instruments are invalid [29]. The median-based approach will provide an unbiased estimate of the causal effect in the presence of unbalanced horizontal pleiotropy even when up to 50% of SNPs are invalid IVs (e.g., due to pleiotropy) [30]. If there is a particular direction of the horizontal pleiotropic effect, then constraining the slope to go through zero will introduce bias. Egger regression which allows the intercept to pass through a value other than zero will relax the constraint. The MR-Egger regression, based on the assumption of InSIDE, performs a weighted linear regression of the outcome coefficients on the exposure coefficients [31]. Under the InSIDE assumption, it gives a valid test of the null causal hypothesis and a consistent causal effect estimate even when all the genetic variants are invalid IVs [31]. However, MR-Egger estimates may be inaccurate and can be strongly influenced by outlying genetic variants. The WM estimate which does not require the InSIDE assumption has been confirmed to have distinct superiorities over MR-Egger for its improved power of causal effect detection and lower type I error [30]. When the InSIDE assumption is valid and the percentage of horizontal pleiotropic variants is small (≤ 10%), the causal estimate of the MR-PRESSO outlier adjustment is less biased and has better precision (smaller standard deviation) than MR-Egger. However, when the percentage of horizontal pleiotropic variants is high (≥ 50%), the opposite is found [28]. The weighted median has less bias but also less precision in the causal estimate compared to the MR-PRESSO outlier test, particularly when the percentage of horizontal pleiotropic variants is < 50% [28]. Since we included many weak instrumental variables in the analyses, we carried out a recently proposed method called MR.RAPS to make our results more reliable [32]. This method is robust to both systematic and idiosyncratic pleiotropy and can give a robust inference for MR analysis with many weak instruments. It is able to correct for pleiotropy using robust adjusted profile scores and is recommended to routinely use the RAPS estimator in practice, especially if the exposure and the outcome are both complex traits.
If the estimates of different methods are inconclusive, the link between exposure and outcome phenotype with an adjusted P value < 0.05/5 = 0.01 (Bonferroni correction for multiple testing) is considered significant.
Pleiotropy and sensitivity analysis
We conducted the MR-Egger regression to assess the potential pleiotropic effects of the SNPs used as IVs. The intercept term in MR Egger regression can be a useful indication of whether directional horizontal pleiotropy is driving the results of a MR analysis [33]. In MR-PRESSO analysis, it attempts to reduce heterogeneity in the estimate of the causal effect by removing SNPs that contribute to the heterogeneity disproportionately more than expected. The number of distributions in MR-PRESSO analysis was set to 1000. We used the IVW method and MR-Egger regression to detect heterogeneity. The heterogeneities were quantified by Cochran Q statistic; a P value of < 0.05 would be regarded as significant heterogeneity. Additionally, to identify potentially influential SNPs, we performed a “leave-one-out” sensitivity analysis to where the MR is performed again but leaving out each SNP in turn.
Procedures of MR analysis
In our study, we firstly performed MR analysis with all the above-selected SNPs as IVs. If the MR-PRESSO analysis detected a significant horizontal pleiotropy, we shall remove the outlier variants (with P value less than the threshold in the MR-PRESSO outlier test) and perform MR analysis again. After the MR-PRESSO outlier removal step, if the heterogeneity was still significant, we would perform MR analysis under the condition of removing all the SNPs of which the P value was less than 1 in the MR-PRESSO outlier test. At last, if potentially influential SNPs were identified in the “leave-one-out” sensitivity analysis, we should draw the conclusion with caution. A flow chart about the analytical methods and how the MR analysis was performed step-by-step was shown in Fig. 1.
Ethics
Our analysis used published study or publicly available GWAS summary data. No original data was collected for this manuscript, and thus, no ethical committee approval was required. Each study included was approved by their institutional ethics review committees, and all participants provided written informed consent.
All statistical analyses were conducted using R version 3.6.3 (R Foundation for Statistical Computing, Vienna, Austria) using the Two-Sample MR package [27]. P values < 0.05 were considered statistically significant. In multiple testing, an adjusted P value after Bonferroni correction (P < 0.05/N, N = the number of testing methods) was considered statistically significant.