Study population
This study included three populations: one cross-sectional population and two screening cohorts. Women were eligible if they had an intact cervix and no prior history of CIN. Women eligible for the screening cohorts were additionally required to be aged 25 to 65. Women who were pregnant, had undergone hysterectomy, or had received treatment for cervical disease were excluded. Further details on the study design are provided in Fig. 1. Institutional review board (IRB) approval was provided by the Ethics Committee of the Cancer Hospital, Chinese Academy of Medical Sciences. All participants agreed to the study protocol and provided informed consent.
Cross-sectional population
Participants were recruited from five hospitals in China between 2014 and 2015 and included women attending routine cervical cancer screening programs, outpatients referred for colposcopy, and inpatients scheduled for treatment of CIN2+. A questionnaire was used to collect information on demographic factors and obstetric and gynecologic history. Two cervical exfoliated cell samples were collected. One was kept in PreservCyt Solution (Hologic) and aliquoted for cobas HPV (Roche), Aptima HPV (Hologic), and Onclarity HPV (BD Diagnostics) testing, p16/Ki-67 dual staining (Roche), and liquid-based cytology (LBC) assessment. The other was collected on a Dacron swab for HPV16/18 E6 protein detection (Arbor Vita Corporation). Cervical biopsies were performed using a previously described protocol [14]. Local pathologists provided the primary diagnosis, and a panel of five pathologists from each center conducted a blinded diagnostic review to reach consensus.
Screening cohorts
Both screening cohorts included a baseline phase and a 3-year follow-up phase. Participants in screening cohort I (SC-I) were recruited from Shanxi Province, China, between 2017 and 2020. At baseline, all participants received Aptima HPV, INNO-LiPA HPV genotyping (Innogenetics), and LBC. Aptima HPV positive samples were further tested with Aptima HPV16/18/45. Women who were HPV16/18/45 positive or had abnormal cervical cytology (ASC-US+) were referred for colposcopy, and women with HPV16/18/45 results had an additional swab collected for the E6 oncoprotein test before colposcopy.
Participants in screening cohort II (SC-II) were recruited from the Inner Mongolia Autonomous Region of China between 2016 and 2019. At baseline, all participants received cobas HPV, INNO-LiPA HPV genotyping, and LBC. Women who were HPV16/18 positive or had ASC-US+ cytology were referred for colposcopy.
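The baseline referral rule shared by the two cohorts can be sketched as follows. This is an illustrative sketch only, not the study's software: the function name, argument names, and the set of abnormal cytology labels (taken from the cytology categories listed under Statistical analyses) are assumptions made for the example.

```python
# Illustrative sketch of the baseline colposcopy referral rule.
# All names here are hypothetical; they are not taken from the study's code.

# Cytology categories treated as ASC-US or worse (ASC-US+).
ASCUS_PLUS = {"ASC-US", "LSIL", "ASC-H", "AGC", "HSIL/AIS", "SCC/ADC"}


def refer_for_colposcopy(cohort, genotype_positive, cytology):
    """Return True if a woman is referred for colposcopy at baseline.

    cohort            -- "SC-I" (HPV16/18/45 by Aptima) or "SC-II" (HPV16/18 by cobas)
    genotype_positive -- True if the cohort's genotype-specific result is positive
    cytology          -- LBC result, e.g. "NILM" or "ASC-US"
    """
    if cohort not in ("SC-I", "SC-II"):
        raise ValueError(f"unknown cohort: {cohort!r}")
    # Referral is triggered by either a positive genotype result or ASC-US+ cytology.
    return bool(genotype_positive) or cytology in ASCUS_PLUS
```

For example, `refer_for_colposcopy("SC-II", False, "LSIL")` evaluates to `True` because the cytology result alone meets the ASC-US+ criterion.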
For both screening cohorts, women who were HPV positive or had ASC-US+ cytology continued with annual follow-up visits, and all women, regardless of their baseline results, returned in the third year for a final visit. At each visit, an LBC specimen was obtained and women with ASC-US+ were referred for colposcopy. Women diagnosed with CIN2+ at baseline or during follow-up exited the study after the colposcopy visit and were referred for treatment.
Laboratory tests
The Onclarity HPV is a PCR assay for the detection of six individual HPV genotypes (16, 18, 31, 45, 51, and 52) and three groups of types (33/58, 59/56/66, and 39/68/35). The cobas HPV is another PCR assay for the detection of viral DNA of the 14 hrHPV types, which simultaneously differentiates HPV16 and HPV18. The Aptima HPV is based on the qualitative detection of E6/E7 mRNA of 14 hrHPV types. The Aptima HPV16/18/45 uses the same technology as Aptima HPV to detect E6/E7 mRNA from HPV16/18/45; the assay differentiates genotype 16 from 18 and 45 but does not differentiate between 18 and 45. The INNO-LiPA HPV genotyping assay allows simultaneous and separate detection of 25 different HPV genotypes (14 hrHPV and HPV6, 11, 34, 40, 42, 43, 44, 53, 54, 70, and 74). All HPV tests were performed on fully automated systems according to the manufacturer’s instructions. The OncoE6 cervical test is an immunochromatographic test for the detection of HPV16/18 E6 oncoprotein. The operating procedures have been described previously [15].
Cytology slides were first evaluated by junior cytologists and then diagnosed by senior cytologists. Results were reported using the Bethesda 2014 nomenclature. For the cross-sectional samples, a second cytology slide was prepared from the residual PreservCyt Solution for p16/Ki-67 dual staining using the CINtecPLUS Cytology kit according to the manufacturer’s instructions. Technicians were blinded to each other’s findings to minimize bias.
Statistical analyses
Model development
Base models using logistic regression and support vector machine (SVM) were implemented in R (version 3.5.2). Model construction and internal validation were performed in the cross-sectional population, which was randomly split into a training set (70%) and a testing set (30%).
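A random 70/30 split of this kind can be sketched as below. The study itself used R; this Python sketch is only an illustration, and the function name, the fixed seed, and the use of participant identifiers as the unit of splitting are assumptions.

```python
import random


def split_train_test(participant_ids, train_fraction=0.7, seed=2024):
    """Randomly split participant IDs into training and testing sets.

    The seed is an arbitrary value chosen for reproducibility of the example;
    it is not a value reported by the study.
    """
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    # Number of participants assigned to the training set.
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]
```

Splitting once with a fixed seed keeps the testing set untouched during model construction, so that performance estimates in the testing set are not influenced by model fitting.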
Logistic regression or SVM using age, cytology, and hrHPV as predictors was set as the base model. Among the predictors, age was a continuous covariate; hrHPV testing was dichotomous (positive for any of the 14 hrHPV types vs. negative for all 14 hrHPV types); and cytology was a seven-level covariate: negative for intraepithelial lesion or malignancy (NILM), ASC-US, low-grade squamous intraepithelial lesion (LSIL), atypical squamous cells, cannot exclude high-grade lesion (ASC-H), atypical glandular cells (AGC), high-grade squamous intraepithelial lesion/adenocarcinoma in situ (HSIL/AIS), and squamous cell carcinoma/adenocarcinoma (SCC/ADC). HSIL and AIS, as well as SCC and ADC, were each combined because few cases were available at these levels. The outcome of interest, CIN2+ or CIN3+, was dichotomous. The receiver operating characteristic (ROC) curve (sensitivity against 1 − specificity) and the area under the curve (AUC) were used to assess predictive accuracy. Sensitivity, specificity, and colposcopy referral rate were also calculated for current screening methods and for the models, using the threshold that maximized the Youden index.
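The Youden index is defined as J = sensitivity + specificity − 1. A minimal sketch of choosing the threshold that maximizes J, and of deriving the corresponding referral rate, is given below. It assumes that women with a predicted score at or above the threshold are referred; the function and variable names are hypothetical, and the study's own R implementation may differ.

```python
def youden_threshold(scores, labels):
    """Pick the score threshold that maximizes the Youden index.

    scores -- predicted risk for each woman (higher means more suspicious)
    labels -- 1 for disease (e.g. CIN2+), 0 otherwise
    Returns a dict with the threshold, sensitivity, specificity,
    Youden index, and referral rate (fraction scoring at or above threshold).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")

    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        sensitivity = tp / n_pos
        specificity = tn / n_neg
        j = sensitivity + specificity - 1
        # Keep the first threshold that achieves the largest J.
        if best is None or j > best["youden"]:
            referred = sum(1 for s in scores if s >= t)
            best = {
                "threshold": t,
                "sensitivity": sensitivity,
                "specificity": specificity,
                "youden": j,
                "referral_rate": referred / len(scores),
            }
    return best
```

Only observed score values are tried as candidate thresholds, which is sufficient because sensitivity and specificity change only at those values.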
The base model was extended by substituting the hrHPV detection method, i.e., the cobas result was replaced by the Aptima or Onclarity result. Additional covariates were also added to the base model, including E6 oncoprotein (dichotomous: HPV16 or HPV18 positive vs. both negative), p16/Ki-67 (dichotomous: positive vs. negative), and HPV genotyping (nine dummy variables: HPV16, 18, 31, 45, 51, 52, 33/58, 59/56/66, and 39/68/35, each positive vs. negative). AUCs were compared using the “pROC” package in R. Whichever of logistic regression or SVM showed better clinical performance was chosen for further analysis. Statistical significance was assessed by two-tailed tests with an α level of 0.05.
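The nine genotyping dummy variables can be illustrated with a short sketch. The channel names and their member genotypes follow the groups listed in the text; the function name and the representation of a result as a set of detected genotype strings are assumptions made for the example.

```python
# Nine genotype channels: six individual types and three pooled groups.
GENOTYPE_CHANNELS = {
    "HPV16": {"16"},
    "HPV18": {"18"},
    "HPV31": {"31"},
    "HPV45": {"45"},
    "HPV51": {"51"},
    "HPV52": {"52"},
    "HPV33/58": {"33", "58"},
    "HPV59/56/66": {"59", "56", "66"},
    "HPV39/68/35": {"39", "68", "35"},
}


def genotype_dummies(detected_types):
    """Encode detected genotypes as nine 0/1 dummy variables.

    detected_types -- iterable of genotype strings, e.g. {"16", "58"}
    A pooled channel is coded 1 if any of its member types is detected.
    """
    detected = set(detected_types)
    return {
        channel: int(bool(detected & members))
        for channel, members in GENOTYPE_CHANNELS.items()
    }
```

For a sample positive for types 16 and 58, `genotype_dummies({"16", "58"})` sets `HPV16` and `HPV33/58` to 1 and the remaining seven variables to 0.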
External validation in screening cohorts
The base model and the extended versions with HPV genotyping were applied to both screening cohorts. The extended models with E6 oncoprotein were applied to SC-I only because swab samples were not collected in SC-II. Cytology results reported by junior cytologists and by senior cytologists were also evaluated in the models. Three-year cumulative risks of CIN2+ were estimated for women who were negative on hrHPV and cytology co-testing and for women predicted to be negative by the models.
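As a rough illustration of comparing risk between these groups, the sketch below computes a crude three-year cumulative risk as the proportion of women in a group with CIN2+ detected within three years. This is an assumption for illustration only: the estimator actually used is not specified in this section, and a time-to-event method may have been applied instead. The record fields and group labels are hypothetical.

```python
def crude_cumulative_risk(women, group):
    """Crude 3-year cumulative risk of CIN2+ in one group.

    women -- list of dicts with hypothetical keys:
             "group"              e.g. "cotest_negative" or "model_negative"
             "cin2plus_within_3y" True if CIN2+ was detected within 3 years
    Returns cases / group size. Loss to follow-up is ignored in this sketch.
    """
    members = [w for w in women if w["group"] == group]
    if not members:
        raise ValueError(f"no women in group {group!r}")
    cases = sum(1 for w in members if w["cin2plus_within_3y"])
    return cases / len(members)
```

A lower crude risk in the predicted-negative group than in the co-testing negative group would suggest that a negative model prediction offers at least comparable reassurance, subject to the caveats above.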