  • Research article
  • Open access

The association between the pre-pregnancy vaginal microbiome and time-to-pregnancy: a Chinese pregnancy-planning cohort study



Although sexually transmitted infections are regarded as the main cause of tubal infertility, the association between the common vaginal microbiome and female fecundability has yet to be determined. The objective of this study was to find convincing evidence relating to the impact of the vaginal bacterial structure on the fecundability of women planning pregnancy.


We recruited women who took part in the Free Pre-pregnancy Health Examination Project from 13 June 2018 to 31 October 2018 (n = 89, phase I) and from 1 November 2018 to 30 May 2020 (n = 389, phase II). We collected pre-pregnancy vaginal swabs from each subject; then, we followed up each subject to acquire the pregnancy-planning outcome in 1 year. In phase I, 16S rRNA gene sequencing was performed to investigate the vaginal bacterial content between the pregnancy and non-pregnancy groups. These findings were verified in phase II by applying a quantitative real-time polymerase chain reaction for the measurement of the absolute abundance of specific species. Cox models were used to estimate fecundability ratios (FR) for each vaginal microbiome type.


In phase I, 59.6% (53/89) of women became pregnant within 1 year. The principal coordinate analysis showed that the pre-pregnancy vaginal microbial community structures of the pregnant and non-pregnant groups were significantly different (PERMANOVA test, R² = 0.025, P = 0.049). The abundance of the genus Lactobacillus in the pregnant group was higher than that in the non-pregnant group (linear discriminant analysis (LDA) effect size > 4.0). The abundance of the genus Gardnerella in the non-pregnant group was higher than that in the pregnant group (LDA > 4.0). In phase II, female fecundability increased with higher absolute loads of Lactobacillus gasseri (quartile Q4 vs Q1, FR = 1.71, 95%CI 1.02–2.87) but decreased with higher absolute loads of Fannyhessea vaginae (Q4 vs Q1, FR = 0.62, 95%CI 0.38–1.00). Clustering analysis showed that the vaginal microbiome of type D (characterized by a higher abundance of Lactobacillus iners and lower abundances of Lactobacillus crispatus and Lactobacillus gasseri) was associated with a 55% reduction in fecundability (FR = 0.45, 95%CI 0.26–0.76) compared with type A (featuring three Lactobacillus species and low Gardnerella vaginalis and Fannyhessea vaginae abundance).


This cohort study demonstrated an association between the pre-pregnancy vaginal microbiome and female fecundability. A vaginal microbiome characterized by a higher abundance of L. iners and lower abundances of L. crispatus and L. gasseri appeared to be associated with a lower fecundability. Further research now needs to confirm whether manipulation of the vaginal microenvironment might improve human fecundability.



Infertility has become a severe public health problem; more than 186 million people suffer from infertility worldwide [1]. A reduction of human fecundability not only affects the physical and mental health of pregnancy-planning couples, but also results in a general trend towards an aging population [1, 2]. Despite many efforts to explore the factors that influence human fecundability, there are still many unanswered questions. Previous case-control studies showed that there were some potential differences between infertile and fertile women with regard to the vaginal microbiota and that a low-Lactobacillus vaginal microbiome appeared to be a risk factor for infertility [3, 4]. However, it has proven difficult to determine the causal relationship between these factors, largely because vaginal sampling is not performed before infertility diagnosis. On this basis, our previous study used a prospective design and the Chinese National Free Pre-conception Check-up Project database to illustrate that a poor vaginal microenvironment was associated with a longer time-to-pregnancy (TTP) in normal healthy women [5]. In another study, Lokken et al. found that women with bacterial vaginosis (BV) may be at an increased risk of sub-fecundity in a Kenyan pregnancy-planning cohort [6].

However, traditional microscopic examination cannot reveal the structural characteristics of the vaginal microbiome [7], thus limiting the study of fertility-related vaginal species. Common vaginal Lactobacillus species include Lactobacillus crispatus (L. crispatus), Lactobacillus iners (L. iners), and Lactobacillus gasseri (L. gasseri). Different vaginal Lactobacillus species have been found to exert different health effects over recent years; for example, when the microbiota is dominated by L. iners, there is a higher likelihood of a shift towards dysbiosis [8, 9]. However, routine tests cannot distinguish different Lactobacillus species, and there is no evidence to show which Lactobacillus is most beneficial for female fecundability from prospective studies. Furthermore, recent studies have identified substantial divergences in the vaginal microbiome structure between healthy individuals from different races and ethnicity [10]. The incidences of vaginal communities with several non-Lactobacillus species gradually increase from European to Asian to African populations [11]. Therefore, it is vital that we investigate the association between female fecundability and the vaginal microbiome in Chinese cohorts. In the present study, we recruited a pregnancy-planning cohort of subjects to investigate the association between female fecundability and the vaginal microbiome, thus providing a new concept for female fertility intervention strategies.


Study population

Between 13 June 2018 and 30 May 2020, all couples who took part in the Free Pre-pregnancy Health Examination Project in the Maternal and Child Center of Gulou district in Nanjing, China, were invited to join this study cohort. The inclusion criteria were as follows: (1) in accordance with the Chinese legal marriageable age, the woman was older than 20 years and the man was older than 22 years, and both were younger than 49 years; and (2) the couple reported that they were ready to become pregnant. The exclusion criteria were as follows: (1) women who were already pregnant when they took part in the project; (2) couples in which either partner had been diagnosed with a medical condition unsuitable for pregnancy, including uterine malformation, testicular loss, and Treponema pallidum infection; (3) women with fertility-related diseases, such as endometritis, polycystic ovarian syndrome, uterine fibroids, and pelvic inflammatory disease; (4) women who refused to provide vaginal swabs; (5) women who had used antibiotics in the previous 2 weeks; and (6) women who were lost to follow-up (only baseline data available, with no follow-up visit).

This study was divided into two phases. Phase I was conducted from 13 June 2018 to 31 October 2018; 106 women participated in this phase. This phase featured a nested case-control design. All participants were divided into pregnant or non-pregnant groups according to pregnancy outcomes after 1 year of participation. The potential biomarkers for bacteria that were identified in phase I were then detected in phase II, and their associations with TTP were verified by a cohort design. Phase II was conducted from 1 November 2018 to 30 May 2020. In total, 500 women were invited, and 495 women signed the informed consent; 23 women refused to provide vaginal swabs because of a menstrual period, and 51 women withdrew without the first visit. The final study included 89 women in phase I and 332 women in phase II. Further details are shown in Fig. 1.

Fig. 1 Study design and participants

Sample size estimations were performed when the research protocol was first designed (further details are provided in Additional file 1 [12]). All participants signed an informed consent form, and the study was approved by the Ethics Committee of Zhongda Hospital (Reference: 2018ZDSYLL116-P01).

Acquisition of data for covariate analysis

At baseline, we performed a unified epidemiological survey for every female so that we could collate their age, the age difference within couples, educational level (high school and below/higher education and above), occupation (workers/office clerk/others), pregnancy history (yes/no), and menstrual status (regular or not). A regular menstrual cycle was defined as a cycle length of 21–35 days [13]. All data were acquired by one professional nurse to ensure that the information was credible.

Outcome assessment

All women were contacted by the medical staff (by telephone) every 3 months. The main outcome was clinical pregnancy, as self-reported by the subjects; this needed to be confirmed by a pelvic ultrasound scan in the hospital. TTP was the interval between the dates of the last menstrual period (LMP) obtained at follow-up and before conception (pregnant within 1 year) or the last follow-up call (if not pregnant). TTP in months was calculated as TTP in days/30. TTP in cycles was calculated as TTP in days/average length of the menstrual cycle. These indices were all rounded up to an integer.
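The two TTP conversions can be sketched as follows. This is a minimal illustration that takes TTP in days as given; the function names are hypothetical and are not taken from the study's analysis code.

```python
import math


def ttp_months(ttp_days):
    """TTP in months: days / 30, rounded up to an integer."""
    return math.ceil(ttp_days / 30)


def ttp_cycles(ttp_days, mean_cycle_length_days):
    """TTP in cycles: days / average cycle length, rounded up to an integer."""
    return math.ceil(ttp_days / mean_cycle_length_days)


# Illustrative values only: 95 days of follow-up, 28-day average cycle.
months = ttp_months(95)
cycles = ttp_cycles(95, 28)
```

Rounding up means that a woman who conceives partway through a month or cycle is counted as having used that whole month or cycle.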

Vaginal swab collection and nucleic acid sequencing

Women were placed in a lithotomy position under standard operating procedures; then, gynecologists obtained two vaginal swabs with the aid of a sterile speculum. Swabs were rotated three times on the vaginal fornix to uniformly scrape any discharge and were transported to the laboratory within 4 h. Vaginal cleanliness was graded via microscopic examination of cervical smears. Grades I and II were regarded as normal while grades III and IV were regarded as disordered; these gradings were in accordance with the Chinese standards [14]. The second swab was stored in a dry tube at −80 °C to await nucleic acid extraction. The detailed procedures for DNA extraction and 16S rRNA gene sequencing are described in Additional file 1. In brief, the swabs were eluted with PBS buffer, and the TIANamp Bacterial DNA Kit (Tiangen Biochemical Technology, Beijing, China) was used to extract and purify nucleic acids. The V3–V4 region of the 16S rRNA gene was amplified and sequenced on an Illumina HiSeq 2500 platform (Beijing Biomarker Technologies Co. Ltd., Beijing, China). The raw sequencing data are stored on the figshare platform [15]. Then, sequencing data were processed using a standard procedure (Additional file 1). Denoised sequences were clustered using USEARCH (version 10.0), and tags with ≥ 97% similarity were regarded as an operational taxonomic unit (OTU). Representative sequences were annotated through the National Center for Biotechnology Information (NCBI) dataset using the QIIME software. The numbers of reads for each sample were normalized according to the sample with the fewest sequences. All bioinformatics analyses were completed on the Biomarker BioCloud platform.
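Normalizing read counts to the shallowest sample is commonly done by random subsampling (rarefaction). The text does not state which procedure the platform applied, so the sketch below is an assumption rather than the authors' pipeline; the function names and the toy OTU counts are hypothetical.

```python
import random
from collections import Counter


def rarefy(counts, depth, rng):
    """Draw `depth` reads without replacement from one sample's OTU counts."""
    # Expand {"OTU1": 3, "OTU2": 1} into ["OTU1", "OTU1", "OTU1", "OTU2"].
    reads = [otu for otu, n in counts.items() for _ in range(n)]
    return dict(Counter(rng.sample(reads, depth)))


def rarefy_to_shallowest(samples, seed=0):
    """Subsample every sample down to the smallest total read count."""
    rng = random.Random(seed)
    depth = min(sum(counts.values()) for counts in samples.values())
    return {name: rarefy(counts, depth, rng) for name, counts in samples.items()}


# Hypothetical OTU tables for two swabs (illustrative numbers only).
samples = {
    "swab_1": {"Lactobacillus": 900, "Gardnerella": 100},
    "swab_2": {"Lactobacillus": 300, "Gardnerella": 150, "Streptococcus": 50},
}
rarefied = rarefy_to_shallowest(samples)
```

After this step every sample has the same total number of reads, so richness and diversity indices are not driven by differences in sequencing depth.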

We used the QIIME2 software to calculate α diversity for the vaginal microbiome, including Shannon, Simpson, Chao1, and ACE indices. These indices reflect the richness and diversity of the microbial community structure [16]. Based on the matrix of relative abundance of bacteria, we estimated the Jaccard Distance Index and then performed principal coordinate analysis (PCoA) to intuitively display different groups of the microbiome. Next, we performed a permutational multivariate analysis of variance (PERMANOVA) to test for statistical significance. Linear discriminant analysis (LDA) effect size (LEfSe) was used to identify potential biomarkers among different groups, which should meet P < 0.05, adjusted P < 0.01, and LDA > 4.0 [17]. The Benjamini-Hochberg (BH) method was used to adjust P values to control the false discovery rate when performing multiple comparisons.
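For readers unfamiliar with these quantities, the sketch below shows textbook definitions of two α-diversity indices, the Jaccard distance, and the BH adjustment. It is not the QIIME2 implementation. The natural-log Shannon index, the Gini-Simpson form (1 − Σp²), and the presence/absence Jaccard distance are assumptions, since the text does not say which variants were used.

```python
import math


def shannon(counts):
    """Shannon index H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def gini_simpson(counts):
    """Gini-Simpson index 1 - sum(p^2)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def jaccard_distance(a, b):
    """1 - |shared taxa| / |taxa present in either sample|."""
    present_a = {i for i, c in enumerate(a) if c > 0}
    present_b = {i for i, c in enumerate(b) if c > 0}
    union = present_a | present_b
    if not union:
        return 0.0
    return 1.0 - len(present_a & present_b) / len(union)


def benjamini_hochberg(pvalues):
    """BH-adjusted P values, returned in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A different logarithm base only rescales the Shannon index, so group comparisons are unaffected by that choice.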

Assessment of absolute bacterial loads and the clustering of microbial communities

Quantitative real-time polymerase chain reaction (qPCR) was used to measure the absolute loads of specific vaginal bacteria, including L. crispatus, L. gasseri, L. iners, Gardnerella vaginalis (G. vaginalis), and Fannyhessea vaginae (F. vaginae, also called Atopobium vaginae). We used specific primers that had been verified by previous studies (further details are given in Additional file 2: Table S1 [18,19,20,21]). We used the NCBI Blast database to predict the amplified products, and specific plasmids were synthesized by Sangon Biotech Company (Shanghai, China). The copy number concentration of the plasmid was calculated using the following formula: copies/mL = $6.02 \times 10^{23} \times 10^{-6}$ × concentration (ng/μL)/(fragment length × 660). Then, 10-fold serial dilutions of the plasmid were prepared and subjected to qPCR to obtain a standard curve. In order to reduce variations in the total bacterial load from different swab samples, the copy number concentrations of the 16S V3–V4 region were also measured; these were then standardized to $1 \times 10^{10}$ copies/mL for each sample. Under this condition, we measured the absolute abundance of the five species.
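The copy-number formula and the use of a standard curve can be sketched as follows. The factor 10⁻⁶ converts ng/μL to g/mL, and 660 is the average molar mass (g/mol) of one base pair. The least-squares fit of Ct against log10(copies) is the usual way to build a qPCR standard curve, but it is an assumption here; the function names and the dilution-series values in the test are hypothetical.

```python
AVOGADRO = 6.02e23   # copies per mole, as written in the text
BP_MASS = 660.0      # average g/mol per base pair of double-stranded DNA


def plasmid_copies_per_ml(conc_ng_per_ul, fragment_length_bp):
    """copies/mL = 6.02e23 * 1e-6 * conc (ng/uL) / (fragment length * 660)."""
    grams_per_ml = conc_ng_per_ul * 1e-6
    return AVOGADRO * grams_per_ml / (fragment_length_bp * BP_MASS)


def fit_standard_curve(log10_copies, ct_values):
    """Least-squares line Ct = slope * log10(copies) + intercept."""
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_copies, ct_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate copies from a measured Ct."""
    return 10 ** ((ct - intercept) / slope)
```

Each species would need its own curve, built from the 10-fold dilution series of its plasmid.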

For each species, we calculated the absolute abundance z-score after the logarithmic transformation of the absolute loads. The clustering of microbial communities was explored with the k-means algorithm which minimizes the error inside the groups and maximizes the distance between clusters. We considered the Euclidean distance metric in our analysis and then tried to use the elbow method to determine the optimum number of clusters [22]. In this method, the slow-down point denotes the optimum number of clusters. Then, we compared the average z-score for specific species among different clusters using variance analysis.
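The clustering steps above can be sketched in plain Python. This is a bare-bones Lloyd's k-means with random initialization, written for illustration only; the study's actual software, initialization scheme, and number of restarts are not stated, and all names below are hypothetical.

```python
import math
import random


def log_zscores(loads):
    """Z-scores of log10-transformed absolute loads for one species."""
    logs = [math.log10(x) for x in loads]
    mean = sum(logs) / len(logs)
    sd = math.sqrt(sum((x - mean) ** 2 for x in logs) / (len(logs) - 1))
    return [(x - mean) / sd for x in logs]


def nearest(point, centers):
    """Index of the center closest to `point` in Euclidean distance."""
    return min(range(len(centers)), key=lambda j: math.dist(point, centers[j]))


def kmeans(points, k, seed=0, max_iter=100):
    """Lloyd's algorithm; returns (labels, within-cluster sum of squares)."""
    points = [tuple(p) for p in points]
    centers = random.Random(seed).sample(points, k)
    for _ in range(max_iter):
        labels = [nearest(p, centers) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old center if a cluster happens to be empty.
            updated.append(
                tuple(sum(col) / len(members) for col in zip(*members))
                if members else centers[j]
            )
        if updated == centers:
            break
        centers = updated
    labels = [nearest(p, centers) for p in points]
    wss = sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, wss


def elbow_curve(points, max_k):
    """Within-cluster sum of squares for k = 1..max_k."""
    return [kmeans(points, k)[1] for k in range(1, max_k + 1)]
```

In this setting each point would be one woman's vector of five z-scores, and the elbow is read off as the value of k beyond which the within-cluster sum of squares stops dropping sharply.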

Statistical analysis

All data were uploaded into the EpiData (Version 3.1) software by two independent researchers. The analyses followed a defined approach that was determined before running the models. Continuous variables are described by the mean and standard deviation (SD) if normally distributed, or by the median and quartiles otherwise. The t test or the Kruskal-Wallis test was used to test for differences between the groups. Categorical variables are described by frequency and percentage; the chi-squared test or Fisher’s exact test was used to compare the distribution between the groups. Missing data were imputed using the multivariate imputation by chained equations (MICE) package in the R software [23]. We set up five imputed datasets; the main analysis results were aggregated with Rubin’s rule after appropriate transformation [24]. We performed analyses using the complete-case dataset as a sensitivity analysis.
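Pooling with Rubin's rule can be illustrated as follows. The sketch assumes that the "appropriate transformation" is the natural log of the FR, which is the usual choice for ratio estimates but is not stated in the text. It also uses a normal approximation (z = 1.96) for the interval, whereas the full rule uses a t distribution with adjusted degrees of freedom. The function names are hypothetical.

```python
import math


def pool_rubin(estimates, variances):
    """Pooled estimate and total variance across m imputed datasets."""
    m = len(estimates)
    q_bar = sum(estimates) / m                       # pooled point estimate
    within = sum(variances) / m                      # mean within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    total = within + (1 + 1 / m) * between           # Rubin's total variance
    return q_bar, total


def pooled_fr(log_frs, standard_errors, z=1.96):
    """Back-transform the pooled log(FR) to an FR with an approximate 95% CI."""
    q_bar, total = pool_rubin(log_frs, [se ** 2 for se in standard_errors])
    se = math.sqrt(total)
    return math.exp(q_bar), math.exp(q_bar - z * se), math.exp(q_bar + z * se)
```

The between-imputation term widens the interval to reflect the uncertainty introduced by the missing values.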

Spearman coefficients were calculated to determine the correlation between two relative abundances of bacteria. The Kaplan-Meier (KM) method was used to calculate the cumulative pregnancy rates in different types of microbiomes, and the log-rank test was used to test the differences. Cox models were used to estimate the fecundability ratios (FRs) and their 95% confidence intervals (CIs) for different types of microbiome after adjusting for potential confounding factors. FR reflects the ratio of pregnancies among females with certain characteristics compared with the reference groups; thus, an FR < 1.0 implies a lower fecundability or a longer TTP. All of these analyses were carried out using the R software (version 4.1.0), and two-sided probability values of < 0.05 were deemed to be statistically significant.
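The cumulative pregnancy rate from the KM method is one minus the KM "survival" (not-yet-pregnant) probability. A minimal sketch is given below, assuming TTP is recorded in whole months or cycles and that women who did not conceive are censored at their last follow-up; the function name is hypothetical.

```python
def km_cumulative_pregnancy(times, pregnant):
    """Return [(t, cumulative pregnancy rate)] at each time with a pregnancy.

    times    -- TTP (or censoring time) for each woman
    pregnant -- True if the woman conceived at that time, False if censored
    """
    not_pregnant = 1.0
    curve = []
    for t in sorted(set(times)):
        at_risk = sum(1 for x in times if x >= t)
        conceived = sum(1 for x, e in zip(times, pregnant) if x == t and e)
        if conceived:
            # Multiply by the conditional probability of not conceiving at t.
            not_pregnant *= 1 - conceived / at_risk
            curve.append((t, 1 - not_pregnant))
    return curve
```

Censored women still count in the at-risk denominator up to their last follow-up, which is what distinguishes this estimate from a simple proportion.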


The vaginal microbiome and fecundability in phase I

In phase I, the mean age of the participants was 28.66 ± 3.14 years old; most women did not have a history of pregnancy (79/89, 88.76%). In total, 59.6% (53/89) of women achieved pregnancy within 1 year. A comparison of the baseline characteristics between the two groups (pregnant or non-pregnant) revealed that there were no significant differences in terms of the age difference within couples, educational level, occupation, history of pregnancy, and the regularity of menstruation (P > 0.05, Table 1), although the mean age of the pregnancy group was significantly lower than that of the non-pregnancy group (27.98 vs 29.52, P = 0.036).

Table 1 Baseline characteristics of cohort phases I and II

All nucleic acid samples from vaginal swabs were sequenced successfully, and the sequencing depths were sufficient (Additional file 2: Fig. S1). The sequencing quality of all samples was confirmed to be good (all Q20 indices were > 95%, Additional file 2: Table S2). At the genus level, the most abundant bacteria were Lactobacillus (mean relative abundance 79.95%), Gardnerella (8.57%), Streptococcus (1.79%), and Atopobium (1.54%) (Additional file 2: Fig. S2). Comparisons of the Shannon, Simpson, Chao1, and ACE indices showed that pre-pregnancy vaginal bacterial diversity did not differ significantly between the pregnant and non-pregnant groups (P > 0.05, Additional file 2: Fig. S3). However, PCoA showed that the vaginal microbial community structures of these two groups were slightly different; 2.5% of the variation was associated with pregnancy outcomes (PERMANOVA test, R² = 0.025, P = 0.049, Fig. 2A, B). To further identify the key taxa, we performed a LEfSe analysis; the results showed that the abundance of the Lactobacillales order in the pregnant group was higher than that in the non-pregnant group (LDA > 4.0, average relative abundance 86.33% vs 75.63%). The abundances of the Actinobacteria phylum (8.64% vs 14.01%, LDA > 4.0) and the Gardnerella genus (6.34% vs 11.84%, LDA > 4.0) in the non-pregnant group were higher than those in the pregnant group (Fig. 2C). At the genus level, random forest model analysis identified potential biomarkers for distinguishing pregnancy from non-pregnancy, including Gardnerella, Lactobacillus, and Fannyhessea, with a relatively high Gini index (Fig. 2D). Based on these results, we further compared the relative abundances of these genera between the two groups (Fig. 3). The relative abundance of Gardnerella in the pregnant group was significantly lower than that in the non-pregnant group (P = 0.0029).

Fig. 2 Differences in the vaginal microbiome with different pregnancy outcomes. A, B PCoA based on Jaccard distance: PC1, PC2, and PC3 explained 33.33%, 9.07%, and 8.58% of the variation, respectively. C LEfSe analysis; the threshold of the LDA value was 4.0. D The rank of the Gini index from the random forest model. PCoA, principal coordinate analysis; LEfSe, linear discriminant analysis effect size

Fig. 3 Scatter diagrams showing the relative abundances of the genera between the pregnant and non-pregnant groups. The middle lines represent the median, while the error bars represent the interquartile range. P values were acquired by the Kruskal-Wallis test

Association validation in phase II

Based on the findings from phase I, we further focused on the Gardnerella, Fannyhessea, and Lactobacillus genera. In consideration of the leading role of Lactobacillus in the vaginal microbiome, and the potentially differential effects of different Lactobacillus species, in addition to G. vaginalis and F. vaginae, we also detected the three most common Lactobacillus species: L. crispatus, L. iners, and L. gasseri. In total, 332 women were included in the phase II analysis. The mean age was 29.50 ± 3.95 years, and the mean age difference within couples was 1.35 years. The baseline characteristics of women who were excluded for various reasons were comparable with those included (Additional file 2: Table S3). The absolute loads of L. crispatus, L. gasseri, L. iners, G. vaginalis, and F. vaginae in the vaginal swabs taken at baseline were detected using standard curves (Additional file 2: Fig. S4). The correlation analysis showed that L. crispatus was positively associated with L. gasseri (ρ = 0.56, P < 0.001) and negatively associated with L. iners (ρ = −0.18, P = 0.004). L. iners was negatively associated with L. gasseri (ρ = −0.14, P = 0.009). G. vaginalis was positively associated with F. vaginae (ρ = 0.31, P < 0.001). The associations between the remaining species were not statistically significant (P > 0.05); details are shown in Additional file 2: Table S4.

The absolute loads of these species at baseline were then divided into four groups (Q1–Q4) based on quartiles. Cox models were then used to estimate the association between these four groups and female fecundability (Additional file 2: Table S5). The data showed that female fecundability increased with higher absolute loads of L. gasseri (Q4 vs Q1, FR = 1.71, 95%CI 1.02–2.87) but decreased with higher absolute loads of F. vaginae (Q4 vs Q1, FR = 0.62, 95%CI 0.38–1.00). Other species were not statistically associated with female fecundability (P > 0.05).

Vaginal microbiome type and fecundability

Based on the absolute loads of five bacterial species, we found that the vaginal microbiome clustered into five types (A–E). Figure 4A shows that type A (24.7%, 82/332) featured high abundances of the three Lactobacillus species, with low abundances of G. vaginalis and F. vaginae. Type B (13.0%, 43/332) was characterized by a high abundance of G. vaginalis and a low abundance of the three Lactobacillus species. Type C (20.2%, 67/332) was characterized by a high abundance of F. vaginae and a modest abundance of the three Lactobacillus species. Type D (22.3%, 74/332) was characterized by a high abundance of L. iners and low abundances of the other four species. Type E (19.9%, 66/332) was characterized by high abundances of L. crispatus and L. gasseri and low abundances of the other three species. The z-scores of the absolute abundance of specific species grouped by different types are shown in Additional file 2: Table S6. Figure 4B shows that women with different types of vaginal microbiome had different levels of fecundability (log-rank test, P = 0.014). Women with the type A vaginal microbiome had the highest cumulative pregnancy rate (12th month, 54.7%, 95%CI 41.2–65.1%) while women with the type D vaginal microbiome had the lowest cumulative pregnancy rate (12th month, 28.2%, 95%CI 16.8–38.1%). Types B, C, and E had similar cumulative pregnancy rates (12th month, 44.5% vs 45.0% vs 45.2%).

Fig. 4 Types of vaginal microbiome and fecundability. A Clustering analysis based on the absolute loads of five bacterial species, based on the z-score of the log10(absolute load). B Kaplan-Meier plots for the cumulative pregnancy rate across different vaginal microbiome types

After adjusting for potential confounding factors, including female age, educational level, occupation, pregnancy history, vaginal cleanliness grading, and the age difference between couples, we found that compared to women with a type A vaginal microbiome, women with a type D microbiome showed a 55% reduction in fecundability (model A, FR = 0.45, 95%CI 0.26–0.76). This association was robust irrespective of whether the TTP was determined by month or menstrual cycle (model B, FR = 0.45, 95%CI 0.27–0.77). Women with vaginal microbiome types B, C, and E tended to have lower fecundability than women with type A, although these differences were not statistically significant (Table 2). Sensitivity analysis based on the complete-case dataset was consistent with these primary results (Additional file 2: Table S7).

Table 2 Fecundability ratios for different vaginal microbiome types


Our two-stage cohort study aimed to demonstrate the association between the pre-pregnancy vaginal microbiome and female fecundability among healthy pregnancy-planning women. The results supported this association and suggested that the higher relative abundances of L. crispatus and L. gasseri were positively associated with female fecundability, while a higher relative abundance of F. vaginae appeared to be detrimental to female fecundability. From a community perspective, a vaginal microbiome characterized by a higher abundance of L. iners and a lower abundance of L. crispatus and L. gasseri appears to be associated with a lower fecundability. This study provides more credible evidence than previous studies in that we demonstrated that it is possible to predict female fecundability by assessing the pre-pregnancy vaginal microbiome.

Many studies have focused on the damaging effects of BV on infertility and, in particular, tubal infertility. However, the case-controlled design of these studies limited causal inference [4]. It is difficult to collect vaginal swabs before a patient is diagnosed as being infertile. Furthermore, the precise role played by the vaginal microbiome in cases of non-tubal infertility, and in particular, unexplained infertility, remains unknown [25]. While some studies found that women with a better vaginal environment appeared to have a higher chance of successful embryo implantation when undergoing in vitro fertilization (IVF) [26], a recent meta-analysis did not identify a significant impact of BV on the live birth rate or clinical pregnancy rate in women undergoing IVF [27]. Thus far, the screening and treatment of BV before attempting conception remain a possibility but are not a widely accepted consensus [28]. A Kenyan cohort study provided a clue that BV appeared to be negatively associated with female fecundability [6]; however, the microscopy-based vaginal microenvironment assessment could not fully reflect the status of the vaginal microbiota, especially considering the diverse effects of different Lactobacillus spp. [8]. Next-generation sequencing technology provides an opportunity to explain many unknown problems [29]. A retrospective case-control study, with a small sample size, revealed that major vaginal microbiota clusters could not be grouped by infertility status [30]. The present, prospective study is the first to demonstrate the different effects of L. crispatus, L. gasseri, and L. iners, on female fecundability.

Lactobacillus has always been regarded as a biomarker for a healthy vaginal microenvironment. One of the most important reasons for this is that this genus can produce lactic acid to maintain a locally acidic environment that prevents pathogen colonization [31]. Local inflammation, caused by disordered vaginal microbiota, may lead to reduced levels of fertility; higher levels of cervical interleukin (IL)-1β, IL-6, and IL-8 cytokines have been reported to be associated with infertility [32]. L. iners was also shown to produce a protein toxin (inerolysin) that might play a role in the pathogenesis of bacterial vaginosis [33]. This mechanism might explain our current finding that the vaginal microbiome characterized by a higher abundance of L. iners and lower abundances of L. crispatus and L. gasseri was the only type associated with reduced fecundability when compared with other microbiome types. However, studying the effects of L. iners in isolation seems less important than studying the effects of the microbiome as a whole. Our study found that a type A microbiome (characterized by three Lactobacillus species, including L. iners) was the best type for fecundability. Thus, a comprehensive assessment of vaginal microbial structure seems necessary, especially with regard to different Lactobacillus species. Furthermore, Li et al. demonstrated that the vaginal probiotic L. crispatus greatly affected sperm activity and could also reduce pregnancies via its adhesive properties; this might account for some cases of unexplained infertility [34]. In the present study, we identified a positive effect of L. crispatus on fecundability; this suggests that it is important to investigate the dual role of Lactobacillus in future research.

G. vaginalis and F. vaginae have always been regarded as BV-related bacteria [35]; however, the direct association between these species and female fecundability remains unknown. Recent molecular analyses of protein-coding genes demonstrated that G. vaginalis consists of at least four distinct sub-species, although not all of these sub-species cause clinical symptoms [36]. Thus, an asymptomatic carrier of G. vaginalis might be a potential reason for unexplained infertility. Meanwhile, the presence of F. vaginae would lead to the creation of biofilms in the vagina and would resist some antimicrobial substances [37]; however, the effects of these biofilms on sperm motility have yet to be investigated.

Our study was strengthened by the two-stage cohort design. Although we applied several statistical safeguards, the omics analysis may still have produced false positives [38]. Thus, the mutual verification of the results from our two phases increased the robustness of our findings. Compared with a register-based cohort [5], our refined cohort guaranteed the accuracy of TTP estimation. In addition, our novel strategy for defining the vaginal microbiome type provides a new concept for studying the vaginal microflora in the future. However, our study also had some limitations that need to be considered. First, it was very difficult to collect data relating to sperm quality from the couples who were planning pregnancy; this is a vital confounding factor for pregnancy outcome. This potential confounding effect is a critical problem that needs to be solved in future research. Secondly, the sample size was insufficient in phase II, especially when investigating new types of vaginal microbiome; several types showed a decreasing trend for cumulative pregnancy rate, but without statistical significance. Thirdly, the vaginal microbiota appears to change dynamically with menstruation [39]; one sampling event is not able to fully reflect the characteristics of the vaginal microbiota. Fourthly, all of the samples were obtained from a single center; this could limit the extrapolation of our conclusions, especially when considering the variation of vaginal microbiota across different races [40]. Finally, we focused on only three genera in phase II; further studies should examine other potential bacterial species in order to gain a more comprehensive understanding of the vaginal microbiome.


This cohort study demonstrated an association between the pre-pregnancy vaginal microbiome and female fecundability. A vaginal microbiome characterized by a higher abundance of L. iners and a lower abundance of L. crispatus and L. gasseri appears to be associated with a lower fecundability. Further research now needs to confirm whether manipulation of the vaginal microenvironment might improve human fecundability.

Availability of data and materials

The raw sequencing data is stored in the figshare platform: Hong X and Wang B. The raw data for the analysis of the association between the pre-pregnancy vaginal microbiome and time-to-pregnancy. figshare 2022:





Abbreviations

BV: Bacterial vaginosis

CI: Confidence intervals

F. vaginae: Fannyhessea vaginae

FR: Fecundability ratios

G. vaginalis: Gardnerella vaginalis

IVF: In vitro fertilization

L. crispatus: Lactobacillus crispatus

L. gasseri: Lactobacillus gasseri

L. iners: Lactobacillus iners

LEfSe: Linear discriminant analysis effect size

LMP: Last menstrual period

MICE: Multivariate imputation by chained equations

NCBI: National Center for Biotechnology Information

OTU: Operational taxonomic unit

PCoA: Principal coordinate analysis

PERMANOVA: Permutational multivariate analysis of variance

qPCR: Quantitative real-time polymerase chain reaction

SD: Standard deviation




  1. Inhorn MC, Patrizio P. Infertility around the globe: new thinking on gender, reproductive technologies and global movements in the 21st century. Human Reprod Update. 2015;21(4):411–26.

  2. Sun H, Gong TT, Jiang YT, Zhang S, Zhao YH, Wu QJ. Global, regional, and national prevalence and disability-adjusted life-years for infertility in 195 countries and territories, 1990-2017: results from a global burden of disease study, 2017. Aging. 2019;11(23):10952–91.

  3. Campisciano G, Florian F, D’Eustacchio A, Stanković D, Ricci G, De Seta F, et al. Subclinical alteration of the cervical-vaginal microbiome in women with idiopathic infertility. J Cell Physiol. 2017;232(7):1681–8.

  4. Hong X, Ma J, Yin J, Fang S, Geng J, Zhao H, et al. The association between vaginal microbiota and female infertility: a systematic review and meta-analysis. Arch Gynecol Obstet. 2020;302(3):569–78.

  5. Hong X, Zhao J, Zhu X, Dai Q, Zhang H, Xuan Y, et al. The association between the vaginal microenvironment and fecundability: a register-based cohort study among Chinese women. BJOG. 2022;129(1):43–51.

  6. Lokken EM, Manhart LE, Kinuthia J, Hughes JP, Jisuvei C, Mwinyikai K, et al. Association between bacterial vaginosis and fecundability in Kenyan women planning pregnancies: a prospective preconception cohort study. Human Reprod (Oxford, England). 2021;36(5):1279–87.

  7. Berman HL, McLaren MR, Callahan BJ. Understanding and interpreting community sequencing measurements of the vaginal microbiome. BJOG. 2020;127(2):139–46.

  8. Petrova MI, Reid G, Vaneechoutte M, Lebeer S. Lactobacillus iners: friend or foe? Trends Microbiol. 2017;25(3):182–91.

  9. Verstraelen H, Verhelst R, Claeys G, De Backer E, Temmerman M, Vaneechoutte M. Longitudinal analysis of the vaginal microflora in pregnancy suggests that L. crispatus promotes the stability of the normal vaginal microflora and that L. gasseri and/or L. iners are more conducive to the occurrence of abnormal vaginal microflora. BMC Microbiol. 2009;9:116.

  10. Gupta VK, Paul S, Dutta C. Geography, ethnicity or subsistence-specific variations in human microbiome composition and diversity. Front Microbiol. 2017;8:1162.

  11. Zhou X, Brown CJ, Abdo Z, Davis CC, Hansmann MA, Joyce P, et al. Differences in the composition of vaginal microbial communities found in healthy Caucasian and black women. ISME J. 2007;1(2):121–33.

  12. Kelly BJ, Gross R, Bittinger K, Sherrill-Mix S, Lewis JD, Collman RG, et al. Power and sample-size estimation for microbiome studies using pairwise distances and PERMANOVA. Bioinformatics (Oxford, England). 2015;31(15):2461–8.

  13. Doi SA, Al-Zaid M, Towers PA, Scott CJ, Al-Shoumer KA. Irregular cycles and steroid hormones in polycystic ovary syndrome. Human Reprod (Oxford, England). 2005;20(9):2402–8.

  14. Yu F, Tang YT, Hu ZQ, Lin XN. Analysis of the vaginal microecological status and genital tract infection characteristics of 751 pregnant women. Med Sci Monit. 2018;24:5338–45.

  15. Hong X, Wang B. The raw data for the analysis on the association between the pre-pregnancy vaginal microbiome and time-to-pregnancy. figshare. 2022.

  16. Hall M, Beiko RG. 16S rRNA gene analysis with QIIME2. Methods Mol Biol (Clifton, NJ). 2018;1849:113–29.

  17. Segata N, Izard J, Waldron L, Gevers D, Miropolsky L, Garrett WS, et al. Metagenomic biomarker discovery and explanation. Genome Biol. 2011;12(6):R60.

  18. Byun R, Nadkarni MA, Chhour KL, Martin FE, Jacques NA, Hunter N. Quantitative analysis of diverse Lactobacillus species present in advanced dental caries. J Clin Microbiol. 2004;42(7):3128–36.

  19. De Backer E, Verhelst R, Verstraelen H, Alqumber MA, Burton JP, Tagg JR, et al. Quantitative determination by real-time PCR of four vaginal Lactobacillus species, Gardnerella vaginalis and Atopobium vaginae indicates an inverse relationship between L. gasseri and L. iners. BMC Microbiol. 2007;7:115.

  20. Zariffard MR, Saifuddin M, Sha BE, Spear GT. Detection of bacterial vaginosis-related organisms by real-time PCR for Lactobacilli, Gardnerella vaginalis and Mycoplasma hominis. FEMS Immunol Med Microbiol. 2002;34(4):277–81.

  21. Hong X, Qin P, Yin J, Shi Y, Xuan Y, Chen Z, et al. Clinical manifestations of polycystic ovary syndrome and associations with the vaginal microbiome: a cross-sectional based exploratory study. Front Endocrinol. 2021;12:662725.

  22. Pandey A, Malviya AK. Enhancing test case reduction by k-means algorithm and elbow method. Int J Comput Sci Eng. 2018;6(6):299–303.

  23. Liu Y, De A. Multiple imputation by fully conditional specification for dealing with missing data in a large epidemiologic study. Int J Stat Med Res. 2015;4(3):287–95.

  24. Marshall A, Altman DG, Holder RL, Royston P. Combining estimates of interest in prognostic modelling studies after multiple imputation: current practice and guidelines. BMC Med Res Methodol. 2009;9:57.

  25. Mol BW, Tjon-Kon-Fat R, Kamphuis E, van Wely M. Unexplained infertility: is it over-diagnosed and over-treated? Best Pract Res Clin Obstet Gynaecol. 2018;53:20–9.

  26. Koedooder R, Singer M, Schoenmakers S, Savelkoul PHM, Morré SA, de Jonge JD, et al. The vaginal microbiome as a predictor for outcome of in vitro fertilization with or without intracytoplasmic sperm injection: a prospective study. Human Reprod (Oxford, England). 2019;34(6):1042–54.

  27. Haahr T, Zacho J, Bräuner M, Shathmigha K, Skov Jensen J, Humaidan P. Reproductive outcome of patients undergoing in vitro fertilisation treatment and diagnosed with bacterial vaginosis or abnormal vaginal microbiota: a systematic PRISMA review and meta-analysis. BJOG. 2019;126(2):200–7.

  28. Ravel J, Moreno I, Simón C. Bacterial vaginosis and its association with infertility, endometritis, and pelvic inflammatory disease. Am J Obstet Gynecol. 2021;224(3):251–7.

  29. Greenbaum S, Greenbaum G, Moran-Gilad J, Weintraub AY. Ecological dynamics of the vaginal microbiome in relation to health and disease. Am J Obstet Gynecol. 2019;220(4):324–35.

  30. Wee BA, Thomas M, Sweeney EL, Frentiu FD, Samios M, Ravel J, et al. A retrospective pilot study to determine whether the reproductive tract microbiota differs between women with a history of infertility and fertile women. ANZJOG. 2018;58(3):341–8.

  31. Tachedjian G, Aldunate M, Bradshaw CS, Cone RA. The role of lactic acid production by probiotic Lactobacillus species in vaginal health. Res Microbiol. 2017;168(9-10):782–92.

  32. Spandorfer SD, Neuer A, Giraldo PC, Rosenwaks Z, Witkin SS. Relationship of abnormal vaginal flora, proinflammatory cytokines and idiopathic infertility in women undergoing IVF. J Reprod Med. 2001;46(9):806–10.

  33. Rampersaud R, Planet PJ, Randis TM, Kulkarni R, Aguilar JL, Lehrer RI, et al. Inerolysin, a cholesterol-dependent cytolysin produced by Lactobacillus iners. J Bacteriol. 2011;193(5):1034–41.

  34. Li P, Wei K, He X, Zhang L, Liu Z, Wei J, et al. Vaginal probiotic Lactobacillus crispatus seems to inhibit sperm activity and subsequently reduces pregnancies in rat. Front Cell Dev Biol. 2021;9:705690.

  35. Menard JP, Fenollar F, Henry M, Bretelle F, Raoult D. Molecular quantification of Gardnerella vaginalis and Atopobium vaginae loads to predict bacterial vaginosis. Clin Infect Dis. 2008;47(1):33–43.

  36. Schellenberg JJ, Patterson MH, Hill JE. Gardnerella vaginalis diversity and ecology in relation to vaginal symptoms. Res Microbiol. 2017;168(9-10):837–44.

  37. Mendling W, Palmeira-de-Oliveira A, Biber S, Prasauskas V. An update on the role of Atopobium vaginae in bacterial vaginosis: what to consider when choosing a treatment? A mini review. Arch Gynecol Obstet. 2019;300(1):1–6.

  38. Lay JO Jr, Liyanage R, Borgmann S, Wilkins CL. Problems with the “omics”. TrAC. 2006;25(11):1046–56.

  39. dos Santos L, Santiago G, Cools P, Verstraelen H, Trog M, Missine G, et al. Longitudinal study of the dynamics of vaginal microflora during two consecutive menstrual cycles. PloS One. 2011;6(11):e28180.

  40. Hudson PL, Ling W, Wu MC, Hayward MR, Mitchell AJ, Larson J, et al. Comparison of the vaginal microbiota in postmenopausal Black and White women. J Infect Dis. 2020;224(11):1945–9.



Acknowledgements

We would like to express our sincere gratitude to the health workers and participants in the project for their considerable efforts and collaboration. We thank International Science Editing for editing this manuscript.


Funding

This research was supported by the National Natural Science Foundation of China (Grant No. 81872634); the National Key Research and Development Program of China (Grant No. 2016YFC1000307); the Scientific Research Project of Jiangsu Provincial Health Commission (Grant No. ZD2021047); the National Human Genetic Resources Sharing Service Platform (Grant No. 2005DKA21300); the National Population and Reproductive Health Science Data Center (Grant No. 2005DKA32408), People’s Republic of China; and the Fundamental Research Funds for the Central Universities (No. 2242022R20067).

Author information



Contributions

Conceptualization: HX and WB. Formal analysis: HX and YJ. Funding acquisition: WB, ZJ, MX, and HX. Methodology: HX, YJ, ZF, and WW. Supervision: WB, YH, and DX. Writing: HX. Review and editing: WB, ZJ, and MX. The authors read and approved the final manuscript.

Corresponding authors

Correspondence to Jun Zhao, Xu Ma or Bei Wang.

Ethics declarations

Ethics approval and consent to participate

All participants signed an informed consent form, and the study was approved by the Ethics Committee of Zhongda Hospital (Reference: 2018ZDSYLL116-P01).

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Supplementary Information

Additional file 1. Supplementary information.

Additional file 2: Table S1. The specific primers used for qPCR. Table S2. The sequencing quality for all samples. Table S3. The baseline characteristics for the women included and excluded in Phase II. Table S4. Spearman’s correlation coefficients between different species. Table S5. Fecundability ratios for the absolute loads of different species. Table S6. Z scores for the absolute abundance of species grouped by clusters A to E. Table S7. Fecundability ratios for different vaginal microbiome types based on a complete case dataset. Fig. S1. Rarefaction curves for OTU number. Fig. S2. Histogram showing the relative abundance of different genera. Fig. S3. Scatter diagram showing different α diversities between pregnancy and non-pregnancy groups. Fig. S4. Standard curves for the detection of different species by qPCR.

Rights and permissions

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Hong, X., Zhao, J., Yin, J. et al. The association between the pre-pregnancy vaginal microbiome and time-to-pregnancy: a Chinese pregnancy-planning cohort study. BMC Med 20, 246 (2022).
