Sample collection and processing
A total of 34 fresh endometrial tissues (Control = 14, Endometriosis = 20) and 26 blood samples (Control = 9, Endometriosis = 17) were collected from women with and without endometriosis undergoing surgery (total n = 60). Clinical features of the patients (cycle phase, diagnosis, and stage of disease) are listed in Additional file 1: Table S1. Diagnosis was made by the physician and pathologists at the University of California, San Francisco (UCSF), and stage of disease was determined following the rASRM classification system (stages I and II were classified as mild and stages III and IV as severe). Women without visualized endometriosis at the time of surgery or without a history of endometriosis were defined as controls. Women with any type of cancer and/or endometrial hyperplasia were excluded. Some patients exhibited non-malignant gynecologic disorders such as leiomyomas or uterine polyps. Cycle phase was determined following the Noyes et al. system of classification of endometrial histology. Clinical features of the patients were collected only by authorized personnel using the REDCap database, after informed written consent. Only patients of reproductive age (18–49 years old) were included in the study. In addition, patients presenting with immune-related comorbidities, such as systemic lupus erythematosus (SLE), endometritis, or other immune disorders, were excluded, as were those positive for HIV, HBV, or HCV. Moreover, patients had not been exposed to hormone therapies for at least 3 months prior to biospecimen collection. Finally, women under any treatment containing iodine were also excluded from the study, as this element interferes with the CyTOF instrument. All samples were obtained between 2019 and 2021 under the auspices of the UCSF Institutional Review Board Protocol #: IRB#10-03964, using the WERF EPHect standardized protocols for tissue collection, processing, and clinical annotation.
All patient data were de-identified and handled in accordance with HIPAA and the Declaration of Helsinki.
Endometrium was obtained either by endometrial biopsy using a Pipelle catheter (CooperSurgical, Trumbull, CT, USA) or from hysterectomy specimens. Tissues were placed into transport medium and processed within 5 h of collection: they were first washed with serum-containing media (SCM) and then digested mechanically and enzymatically using a mix of collagenase IV and hyaluronidase, as previously described. After 1 h of digestion at 37 °C under rotation, samples were processed for incorporation of cisplatin (for live/dead discrimination) and fixed for later use. Briefly, the single-cell suspension was washed with FACS/EDTA buffer (PBS supplemented with 2% FBS and 2 mM EDTA). Cells were counted, and an appropriate amount of cisplatin (Fluidigm, South San Francisco, CA, USA) (25 mM per 1–6 million cells) was added to the suspension (4 ml PBS/EDTA per 1–6 million cells) for exactly 60 s at room temperature (RT). The reaction was then quenched with CyFACS (metal contaminant-free PBS (Rockland, Pottstown, PA, USA) supplemented with 0.1% BSA and 0.1% sodium azide). Finally, cells were fixed with 1.6% formaldehyde for 10 min, washed three times in CyFACS, and stored at −80 °C until further use.
Peripheral blood mononuclear cells (PBMCs)
Blood was collected in collection tubes containing the anticoagulant acid citrate dextrose (ACD) Solution B (Fisher Scientific, Hampton, NH, USA). Ficoll (Stemcell Technologies, Vancouver, Canada) was slowly added to the bottom of the blood in a Falcon tube at a ratio of 2:1. The samples were then centrifuged at 2000 rpm for 30 min at RT. After centrifugation, the supernatant was removed and the PBMC layer was carefully collected. PBMCs were washed twice with FACS buffer and then counted. The same protocol described above was used to incorporate cisplatin, fix, and store the cells.
First, we designed a broad CyTOF panel to identify the main cell types in endometrial tissue of women with endometriosis compared to controls. This panel consisted of 42 markers and is shown in Additional file 2: Table S2. Then, we designed a 40-parameter CyTOF panel (hereafter the focused, or myeloid, panel) that includes mostly myeloid surface markers as well as functional markers, including efferocytosis and phagocytosis, activation, and inhibition markers (Additional file 2: Table S2). For the broad panel, all 42 antibodies were conjugated in house. For the myeloid panel, sixteen of the 40 antibodies required in-house conjugation to their corresponding metal isotope. Antibodies were conjugated to metals according to the manufacturer’s instructions (Fluidigm, South San Francisco, CA, USA). Briefly, the metal was loaded onto a polymer (1 h incubation at RT). The unconjugated antibody was transferred into a 50-kDa Amicon Ultra 500-μl V-bottom filter (Fisher Scientific, Hampton, NH, USA) and reduced at 37 °C with a 1:125 dilution of Tris(2-carboxyethyl)phosphine hydrochloride (TCEP) (Thermo Fisher, Waltham, MA, USA) for 30 min. Then, the column was washed twice with C-buffer (Fluidigm, South San Francisco, CA, USA) and the metal-loaded polymer was suspended in 200 μl of C-buffer in a 3-kDa Amicon Ultra 500-μl V-bottom filter (Fisher Scientific, Hampton, NH, USA). The suspension was then transferred to the 50-kDa filter containing the antibody and incubated for 1.5 h at 37 °C. After the incubation, antibodies were washed three times with W-buffer (Fluidigm, South San Francisco, CA, USA) and quantified for protein content by Nanodrop. Once the concentration was determined, the antibodies were resuspended in Antibody Stabilizer (Boca Scientific, Dedham, MA, USA) at a concentration of 0.2 mg/ml and stored at 4 °C. The rest of the antibodies were commercially available (Fluidigm, South San Francisco, CA, USA). Optimal concentrations of all antibodies were determined by several rounds of titration.
Barcoding and cell staining with metal antibodies
The staining protocol was optimized to use each antibody in aliquots of 6 million cells, as previously described. Samples were thawed and washed with FACS buffer. Cells were then counted and, since some samples had fewer than 6 million cells, samples were barcoded before antibody staining, following the manufacturer’s instructions (Fluidigm, South San Francisco, CA, USA). Briefly, each sample was incubated with 10 μl of each barcode and perm buffer (Fluidigm, South San Francisco, CA, USA) for 30 min at RT, and then samples were combined and split into different tubes for the first day of staining. For the staining, samples were blocked using rat, mouse, and human serum for 15 min on ice. They were then washed and stained with the primary cocktail of antibodies for 45 min at 4 °C. After this incubation, cells were washed and fixed using 2% paraformaldehyde (PFA) diluted in CyPBS. Cells were incubated overnight at 4 °C and the next day were washed with perm buffer (Fluidigm, South San Francisco, CA, USA), washed with CyPBS, and blocked with rat and mouse serum for 15 min on ice. They were then washed, and the intracellular staining was performed. Cells were resuspended in the intracellular cocktail of antibodies for 45 min on ice, washed, and incubated for 20 min at RT with Ir-intercalator (Biolegend CNS, San Diego, CA, USA), prepared at a dilution of 1:500 in fresh 2% PFA. After the incubation, cells were washed and kept at 4 °C overnight. Finally, on the third day, cells were washed with cell staining media (CSM, Fluidigm, South San Francisco, CA, USA), then with water, and then with cell acquisition solution (CAS, Fluidigm, South San Francisco, CA, USA) at RT. Subsequently, cells were counted and resuspended in 1× EQ™ calibration beads (Fluidigm, South San Francisco, CA, USA) and CAS, and samples were run on the CyTOF®2 instrument (Fluidigm, South San Francisco, CA, USA).
The .fcs files obtained from the instrument were concatenated, normalized to EQ™ calibration beads, and de-barcoded using CyTOF software (Fluidigm, South San Francisco, CA, USA). This study comprises eight different sample groups: control (Ctrl) and endometriosis (Endo) eutopic endometrium (EM) in the proliferative (PE) and secretory (SE) phases (Ctrl_EM_PE, Ctrl_EM_SE, Endo_EM_PE, Endo_EM_SE) and control and endometriosis blood (PBMCs) in the proliferative and secretory phases (Ctrl_PBMC_PE, Ctrl_PBMC_SE, Endo_PBMC_PE, Endo_PBMC_SE). Normalized data from the broad panel were imported into FlowJo (BD, Franklin Lakes, NJ, USA) for manual gating, and we performed unsupervised analysis of the manually gated CD45+ cells. We also manually gated the different populations in the focused panel; the 8 populations obtained are shown in Additional file 3: Fig. S1. Then, unsupervised analysis of the manually gated myeloid cells of interest (including macrophages, monocytes, dendritic cells, and plasmacytoid dendritic cells) was also performed. Finally, statistical analyses for specific markers and populations of the focused panel were performed by manual gating to validate the results from the unsupervised analysis. The datasets generated and analyzed during the current study are available in the Dryad repository (doi:10.7272/Q6Q52MVQ).
Data and statistical analysis
Samples included in the study
The broad panel included a total of 17 endometrial samples (4 controls in the proliferative phase, 2 controls in the secretory phase, 6 from endometriosis patients (cases) in the proliferative phase, and 5 cases in the secretory phase). For the focused panel, the endometrial data included 13 control samples (9 in the PE phase and 4 in the SE phase) and 18 samples from women with endometriosis: 13 in the PE phase (8 mild and 5 severe stage) and 5 in the SE phase (all mild stage). For the PBMCs, data from the focused panel included 9 control samples (6 in the PE phase and 3 in the SE phase) and 17 disease samples: 13 in the PE phase (8 mild stage and 5 severe stage) and 4 in the SE phase (3 mild and 1 severe). Note that some samples were used for both panels. In total, the study comprised 60 samples: 34 endometrial tissues (n = 14 controls and n = 20 endometriosis cases) and 26 blood samples (n = 9 controls and n = 17 endometriosis cases).
For the broad panel, endometrial samples with large numbers of cells were downsampled by random cell selection to a maximum of 169,599 cells per sample, a number compatible with the system’s memory limits and computational efficiency. The Seurat R package for single-cell analysis was used to identify clusters of cells. After combining samples, a total of 2,223,274 endometrial cells were subjected to Seurat clustering, using the levels of the 42 CyTOF markers as expression values to create Seurat objects. CyTOF clusters were identified using a shared nearest neighbor (SNN) graph. Similarly, in the focused panel, PBMC samples in each subject group were downsampled to 50,000 cells per sample to enable processing within memory limits and with computational efficiency. In contrast, endometrial samples in the focused panel were not downsampled, as they contained an average of 15,000 mononuclear phagocytic cells per sample after elimination of lineage-positive cells (CD3+, CD56+, CD66b+). Samples with fewer than 1000 CD45+ cells were discarded from the study and are not shown here. Because of the different origin and properties of the two tissues, two independent Seurat objects were created. A total of 355,240 endometrial myeloid phagocytic cells and 890,602 myeloid phagocytic cells from blood were subjected to Seurat clustering. The expression levels of the panel markers were used as expression values to create the Seurat objects.
As only myeloid populations were gated and analyzed, markers for other cell types (CD45, CD3, CD56, CD66b) were excluded from the analysis. In addition, the signals for the MerTK and ERα antibodies were negative, indicating that the staining did not perform well, so these two markers were also excluded, leaving 34 of the 40 panel markers in the final analysis. The downsampling of cells should not have introduced any bias, because cells from all subjects in the study were included in the analyses and the downsampling was random. Moreover, only samples whose cell numbers exceeded the chosen threshold (which depended on the sample type, as explained above) were downsampled.
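The capped random downsampling described above can be illustrated with a minimal Python sketch. This is not the authors' code (the analysis was run in R); the function name `downsample`, the fixed seed, and the toy data are assumptions made for illustration only.

```python
import random

def downsample(cells, max_cells, seed=0):
    """Return at most max_cells cells, chosen uniformly at random without replacement.

    Samples at or below the cap are returned unchanged, mirroring the text:
    only samples exceeding the threshold are downsampled.
    """
    if len(cells) <= max_cells:
        return list(cells)
    rng = random.Random(seed)  # fixed seed is an assumption, used for reproducibility
    return rng.sample(cells, max_cells)

# Toy example: integer IDs stand in for per-cell marker vectors.
samples = {"A": list(range(10)), "B": list(range(3))}
capped = {name: downsample(cells, max_cells=5) for name, cells in samples.items()}
print({name: len(cells) for name, cells in capped.items()})
```

In the study the cap would be 169,599 cells per sample for the broad panel and 50,000 for focused-panel PBMCs; small samples pass through untouched.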
Visualization using UMAPs colored by biological variables and technical variables
Biological variables (disease, menstrual cycle phase) and technical variables (batch/run) were visualized in different colors using the DimPlot function in the Seurat package. Because two sampling methods were used to collect the endometrial samples (hysterectomy and biopsy), we also generated and compared UMAP coordinates of the cells between the two collection methods. We concluded that they were comparable, and thus all endometrial samples were used (Additional file 4: Fig. S2A). To determine whether the sampling method affected the cell composition of each cluster, we compared the proportion of cells obtained by each method using a t-test (significance threshold p < 0.05). We did not observe any significant differences (Additional file 4: Fig. S2B). Thus, we concluded that the tissue sampling method was not a confounder in our study.
Batch correction procedure using Harmony
After visualizing batch effects in the broad panel using DimPlot, the first batch and the remaining two of the three batches were treated as two batch groups, and the cells were combined using the Harmony batch correction function RunHarmony. Similarly, samples run in batch/run 5 for the focused panel (both endometrium and PBMCs) were systematically different from the samples run in batches 1–4 (Additional file 5: Fig. S3). Therefore, samples from run 5 and the rest of the samples were treated as two separate batches and corrected using the RunHarmony function before further analysis.
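As a small illustration of the grouping step only (not of the Harmony correction itself), the two-level batch variable for the focused panel could be derived from the run number as in the Python sketch below. The function name and the labels `group_1`/`group_2` are hypothetical.

```python
def batch_group(run, outlier_runs=frozenset({5})):
    """Collapse run numbers into two batch groups.

    For the focused panel, run 5 forms one group and runs 1-4 the other,
    as described in the text. The label strings are illustrative.
    """
    return "group_2" if run in outlier_runs else "group_1"

runs = [1, 2, 3, 4, 5]
groups = [batch_group(r) for r in runs]
print(dict(zip(runs, groups)))
```

For the broad panel the same helper would apply with `outlier_runs={1}`, since the first batch was separated from the remaining ones.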
Clustering of cells at different resolutions
Clustering of the cells was performed using the FindClusters function (implementing the “original Louvain” algorithm) in the Seurat package at resolutions 0.01, 0.1, 0.2, 0.4, 0.8, 1, and 1.5. At each of these resolutions, the markers for each cluster were determined using the FindMarkers function in Seurat, with the processing batch for each sample encoded as a latent variable. The average expression of each marker was determined using the AverageExpression function in Seurat. Clusters with high cisplatin levels, which indicate a high proportion of dead cells, were removed, and very small clusters containing fewer than 100 cells were also excluded. In the broad panel, the resolution parameter was set to 0.2 for endometrium. In the focused panel, the resolution parameter was set to 0.4 for endometrium and 0.2 for PBMCs.
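The cluster filtering step can be sketched in Python as follows. The 100-cell minimum comes from the text; the cisplatin cutoff and the use of the per-cluster median are hypothetical, since the text does not state how "high" cisplatin was defined.

```python
from statistics import median

MIN_CELLS = 100          # from the text: clusters with fewer than 100 cells are excluded
CISPLATIN_CUTOFF = 2.0   # hypothetical value; no numeric cutoff is given in the text

def keep_clusters(cisplatin_by_cluster, min_cells=MIN_CELLS, cutoff=CISPLATIN_CUTOFF):
    """Return IDs of clusters that are large enough and not dominated by dead cells."""
    kept = []
    for cluster, values in cisplatin_by_cluster.items():
        if len(values) < min_cells:
            continue  # too few cells
        if median(values) > cutoff:
            continue  # high cisplatin signal, i.e. mostly dead cells (assumed criterion)
        kept.append(cluster)
    return kept

# Toy per-cell cisplatin intensities for three clusters.
toy = {"0": [0.1] * 500, "1": [5.0] * 300, "2": [0.2] * 40}
kept = keep_clusters(toy)
print(kept)
```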
Between cluster association with disease and menstrual cycle state
A generalized linear mixed model (GLMM), implemented in the lme4 package in R with the family argument set to the binomial probability distribution, was used to estimate the association between cluster membership, disease, and menstrual cycle phase. The model had cluster membership as the response variable and five explanatory variables: subject as a random effect, and disease, menstrual cycle phase, their interaction, and batch as fixed effects.
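One way to write this model down, assuming it was fit separately for each cluster with a per-cell membership indicator, the logit link (the default for the binomial family in lme4), and a random intercept per subject, is given below. All symbols are introduced here for illustration: $y_{ij}^{(k)}$ indicates whether cell $i$ of subject $j$ belongs to cluster $k$, $D_j$ is disease status, $P_j$ is menstrual cycle phase, and $B_j$ is the batch group.

```latex
\operatorname{logit} \Pr\!\left(y_{ij}^{(k)} = 1\right)
  = \beta_0 + \beta_1 D_j + \beta_2 P_j + \beta_3 D_j P_j + \beta_4 B_j + u_j,
\qquad u_j \sim \mathcal{N}\!\left(0, \sigma_u^2\right)
```

Under this reading, $\beta_1$ and $\beta_2$ capture the disease and cycle-phase associations, $\beta_3$ their interaction, and $u_j$ absorbs between-subject variation in cluster abundance.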
Between cluster association with disease stage
A GLMM was also used to explore the association between cluster membership and disease stage. Disease samples from patients with mild or severe stage were selected for the disease stage association analysis. The explanatory variables included subject as a random effect, and disease stage, menstrual cycle, and the processing batch (for analyses involving disease samples processed across multiple batches) as fixed effects. Only proliferative phase samples were studied because few secretory phase samples were available from patients with severe disease.
Within cluster association with disease and menstrual cycle
To assess the association (per cluster) between CyTOF marker expression quantile levels and disease, menstrual cycle phase, and their interaction, a linear model was fit. The tested marker expression quantiles (25%, 50%, and 75%) were estimated for each subject across all of that subject’s cells. In addition to variables capturing disease status, menstrual cycle phase, and their interaction, a variable capturing the processing batch was included as an explanatory variable in the linear model.
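The per-subject quantile summaries that serve as the response in these linear models can be sketched in Python as below. The function name and toy values are hypothetical, and the "inclusive" interpolation method is an assumption, since the text does not say how quantiles were interpolated.

```python
from statistics import quantiles

def subject_quartiles(expr_by_subject):
    """Map each subject to the (25%, 50%, 75%) quantiles of one marker in one cluster."""
    out = {}
    for subject, values in expr_by_subject.items():
        # n=4 yields the three quartile cut points; interpolation method is assumed.
        q25, q50, q75 = quantiles(values, n=4, method="inclusive")
        out[subject] = (q25, q50, q75)
    return out

# Toy expression values for one marker in one cluster, two subjects.
toy = {"S1": [0.0, 1.0, 2.0, 3.0, 4.0], "S2": [2.0, 2.0, 6.0]}
summary = subject_quartiles(toy)
print(summary)
```

Each of the three quantiles would then be regressed, one model per marker and cluster, on disease, cycle phase, their interaction, and batch.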
Within cluster association with disease stage
Similarly, a linear model was used to explore the association between CyTOF marker expression quantile levels and disease stage. The variables included in these models are described in the section “Between cluster association with disease stage”. All p-values were corrected using the false discovery rate (FDR); a threshold of 0.05 was used to determine significance for the distribution of cells in each cluster, and a threshold of 0.1 for marker expression. The GLMM was fit using the R package lme4, and linear regression was performed using the lm function in R. Only proliferative phase samples were studied.
Validation of the unsupervised analysis by manual gating
To validate the results, we manually gated the populations of interest in FlowJo and studied their abundance and marker expression. To find differences in expression between the manually gated populations, we extracted the mean signal intensity (MSI) of each marker and compared it between the different groups by t-test (p ≤ 0.05), with Benjamini-Hochberg (BH) multiple-testing correction at a false discovery rate (FDR) below 0.05.
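For readers unfamiliar with the Benjamini-Hochberg procedure, a minimal Python sketch of the adjustment is shown below. It is a generic implementation for illustration, not the authors' code, and the p-values are made up.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p-value
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

pvals = [0.01, 0.04, 0.03, 0.20]  # illustrative values only
adj = benjamini_hochberg(pvals)
significant = [q < 0.05 for q in adj]
print(adj, significant)
```

Note that a raw p-value below 0.05 (here 0.03 and 0.04) may no longer pass once adjusted, which is why both thresholds are reported.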