Is there an added value of faecal calprotectin and haemoglobin in the diagnostic work-up for primary care patients suspected of significant colorectal disease? A cross-sectional diagnostic study

Background The majority of primary care patients referred for bowel endoscopy do not have significant colorectal disease (SCD), and are – in hindsight – unnecessarily exposed to a small but realistic risk of severe endoscopy-associated complications. We developed a diagnostic strategy to better exclude SCD in these patients and evaluated the value of adding a faecal calprotectin point-of-care (POC) and/or a POC faecal immunochemical test for haemoglobin (FIT) to routine clinical information.
Methods We used data from a prospective diagnostic study in SCD-suspected patients from 266 Dutch primary care practices referred for endoscopy to develop a diagnostic model for SCD with routine clinical information, which we extended with faecal calprotectin POC (quantitatively in μg/g faeces) and/or POC FIT results (qualitatively with a 6 μg/g faeces detection limit). We defined SCD as colorectal cancer (CRC), inflammatory bowel disease, diverticulitis, or advanced adenoma (>1 cm).
Results Of 810 patients, 141 (17.4 %) had SCD. A diagnostic model with routine clinical data discriminated between patients with and without SCD with an area under the receiver operating characteristic curve (AUC) of 0.741 (95 % CI, 0.694–0.789). This AUC increased to 0.763 (95 % CI, 0.718–0.809; P = 0.078) when adding the calprotectin POC test, to 0.831 (95 % CI, 0.791–0.872; P < 0.001) when adding the POC FIT, and to 0.837 (95 % CI, 0.798–0.876; P < 0.001) upon combined extension. At a ≥ 5.0 % SCD probability threshold for endoscopy referral, 30.4 % of the patients tested negative based on this combined POC-tests extended model (95 % CI, 25.7–35.3 %), with 96.4 % negative predictive value (95 % CI, 93.1–98.2 %) and 93.7 % sensitivity (95 % CI, 88.2–96.8 %). Excluding the calprotectin POC test from this model still yielded 30.1 % test negatives (95 % CI, 24.7–35.6 %) and 96.0 % negative predictive value (95 % CI, 92.6–97.9 %), with 93.0 % sensitivity (95 % CI, 87.4–96.4 %).
Conclusions FIT – and to a much lesser extent calprotectin – POC testing showed incremental value for SCD diagnosis beyond standard clinical information. A diagnostic strategy with routine clinical data and a POC FIT test may safely rule out SCD and prevent unnecessary endoscopy referral in approximately one third of SCD-suspected primary care patients. Please see related article: http://bmcmedicine.biomedcentral.com/articles/10.1186/s12916-016-0694-3. Electronic supplementary material The online version of this article (doi:10.1186/s12916-016-0684-5) contains supplementary material, which is available to authorized users.


Background
Patients with persistent lower abdominal complaints are common in primary care [1]. At presentation, the general practitioner (GP) has to differentiate between potentially life-threatening significant colorectal diseases (SCD), such as colorectal cancer (CRC) and inflammatory bowel disease (IBD), and functional bowel disorders such as irritable bowel syndrome. As symptoms and signs alone have insufficient specificity, GPs refer many patients for endoscopy so as not to miss an SCD diagnosis. Consequently, 60-80 % of referred patients do not have SCD at endoscopy [2][3][4][5][6], unnecessarily straining healthcare budgets and endoscopy schedules, and exposing many non-SCD patients to a small but realistic risk of severe endoscopy-associated complications.
Thus, an improved diagnostic strategy that can safely rule out SCD is needed. Previous (largely non-primary care) studies have shown that diagnostic strategies solely based on symptoms and signs are unlikely to suffice [7,8]. Adding faecal biomarkers to such diagnostic strategies may, however, improve their performance. One promising faecal biomarker is calprotectin, which indicates the presence of intestinal inflammation [9]. Calprotectin has been recommended by the National Institute for Health and Care Excellence (NICE) to help distinguish between IBD and non-IBD [10]. However, calprotectin has only been evaluated as a single test without accounting for other diagnostic information [11][12][13]. Furthermore, the presence of faecal haemoglobin (Hb) may indicate neoplastic disease [14]. Faecal occult blood tests have previously been included in diagnostic strategies for CRC with limited success [15,16]. Over the past decade these tests have improved substantially, mainly because of specific immunochemical detection of human Hb, resulting in so-called faecal immunochemical tests for Hb (FITs) [14].
We designed the large-scale prospective CEDAR study (Cost-Effectiveness of a Decision rule for Abdominal complaints in primary caRe), to develop a new diagnostic strategy to safely rule out SCD in primary care patients with lower abdominal complaints, thus reducing the number of unnecessary endoscopy referrals. To meet this aim, we specifically quantified the incremental diagnostic accuracy of a point-of-care (POC) calprotectin test and a POC FIT above routine diagnostic information, both individually and in combination. We specifically focused on POC tests as these can be easily executed at the time and place of patient care.

Study design
The prospective diagnostic CEDAR study enrolled patients from 266 Dutch primary care practices referred for endoscopy from July 2009 through January 2012 [11]. Patients were eligible if suspected of SCD, defined by lower abdominal complaints for at least 2 weeks, combined with rectal bleeding, change in bowel habit, abdominal pain, fever, diarrhoea, weight loss, and/or a sudden onset of abdominal complaints at > 50 years of age. Patients were excluded if they were younger than 18 years, were already known to have SCD, or had a confirmed parasitic bowel infection. Recruitment was at the GP's office (19.0 %) or directly following endoscopy scheduling (81.0 %). If patients were not directly recruited by their GP, our research staff contacted them. If at any time during the study patient referral outpaced our study resources, every nth scheduled patient was screened and contacted to guarantee representativeness of the study population. The University Medical Center Utrecht ethics committee approved the study (protocol number 08-462E), and all patients gave written informed consent.

History taking and physical examination
Patient and GP questionnaires facilitated a structured history taking. Abdominal pain, rectal blood loss or mucus, weight loss, and fever were considered present upon patient or GP report; duration of abdominal pain, abdominal bloating, and family history of CRC upon patient report; and change in bowel habit upon GP report. We defined constipation as at least two of the following symptoms: less than three bowel movements/week, difficult/incomplete defecation, hard/lumpy faeces, sensation of anorectal obstruction, or laxative use. We based diarrhoea on frequently loose/liquid faeces, or anti-diarrhoea medication use. GPs reported the presence of a palpable abdominal mass or an abnormal digital rectal examination.

Blood and faecal SCD biomarkers
A pre-endoscopy venous blood sample was drawn to estimate Hb and C-reactive protein (CRP) concentrations according to routine clinical practice. Directly following study inclusion, patients provided faeces samples collected before bowel preparation for endoscopy in a plain blue-capped faecal container, and kept refrigerated (4°C) for a maximum of 2 days before handing in. Study protocol allowed freezing (-20°C) of faecal samples before processing (this occurred in 67.9 % of samples; median days between collection and processing: 10; 10th-90th percentile: 4-21). If not frozen, the refrigerated faecal samples needed to be processed for calprotectin testing within 6 days (adherence 96.3 %; median days: 2; 10th-90th percentile: 0-3), and needed to be tested for Hb within 3 days of collection (adherence 94.5 %; median days: 2; 10th-90th percentile: 0-3).
We analysed the faecal samples for calprotectin concentration by a quantitative POC test (Quantum Blue®; dynamic range 30-300 μg/g) and by an enzyme-linked immunosorbent assay (ELISA; EK-CAL Calprotectin ELISA, both from Bühlmann Laboratories), both yielding estimates of μg calprotectin/g faeces, and for faecal Hb by a qualitative POC FIT (Clearview® iFOBT One Step Faecal Occult Blood Test Device, Alere Health), yielding either a positive or negative test result (lower detection limit of 6 μg/g). Laboratory technicians performed the ELISA, and trained research nurses the POC tests, blinded for clinical information and according to the manufacturers' instructions. Briefly, for the calprotectin assays, 80 mg homogenized faeces was centrifuged and the supernatant was tested for calprotectin (1:16 diluted for the POC test and undiluted for the ELISA; supernatant for the ELISA was stored at -20°C for maximally 4 months before analysis); for the POC FIT three separate random areas of the faecal sample were stabbed by the specimen collection stick and transferred to the collection tube, and two drops of extracted specimen were then applied to the test device. For more details see Kok et al. [11].

Diagnostic outcome
Experienced gastroenterologists from three high-volume centres (i.e. > 1000 endoscopies annually) performed endoscopy in all patients, i.e. colonoscopy or sigmoidoscopy. A final diagnosis was established according to routine clinical practice, including histopathology of biopsies if required, and 3 months follow-up after negative endoscopy. We defined SCD as CRC, IBD, diverticulitis, or advanced adenoma (AA; > 1 cm). Outcome assessment was blinded for the biomarker test results and other diagnostic information.

Statistical analysis
In view of the number of SCD diagnoses [17], we first developed a basic diagnostic model for SCD considering 15 patient history and physical examination predictors (listed in Table 1) and simple blood analyses (Hb and CRP concentrations). We started by selecting patient history and physical examination predictors using Akaike's Information Criterion (AIC)-based stepwise backward logistic regression; first considering and selecting only the patient history predictors, and then considering and selecting the physical examination predictors while keeping the selected patient history predictors fixed. Subsequently, Hb and/or CRP were only selected if they significantly improved the patient history/physical examination model. We deliberately used a more stringent selection criterion for the blood analyses (P < 0.05 instead of AIC-based) in view of the patient burden associated with obtaining this information. Blood Hb and CRP were modelled continuously instead of using a threshold for abnormal values (e.g. defining anaemia), to preserve as much diagnostic information as possible.
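The AIC-based backward step described above can be illustrated with a minimal sketch. The log-likelihoods and predictor names below are hypothetical placeholders, not CEDAR results, and the function names are ours; the sketch only shows the selection rule (drop the predictor whose removal lowers the AIC most, and stop when no removal lowers it).

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike's Information Criterion: 2k - 2*logL (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def backward_step(full_fit, reduced_fits):
    """One backward-elimination step.

    full_fit: (log_likelihood, n_params) of the current model.
    reduced_fits: dict mapping a predictor name to the (log_likelihood,
    n_params) of the model refitted without that predictor.
    Returns the predictor to drop, or None if no removal lowers the AIC.
    """
    best_name, best_aic = None, aic(*full_fit)
    for name, fit in reduced_fits.items():
        candidate = aic(*fit)
        if candidate < best_aic:
            best_name, best_aic = name, candidate
    return best_name


# Hypothetical fits (illustration only, not study data).
current = (-320.0, 6)                 # AIC 652.0
without = {
    "fever": (-320.4, 5),             # AIC 650.8, so removal lowers the AIC
    "rectal_bleeding": (-329.0, 5),   # AIC 668.0, so removal raises the AIC
}
print(backward_step(current, without))
```

Repeating this step until it returns `None` gives a backward selection; in practice each reduced model has to be refitted with a statistics package.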
We then added the faecal biomarker tests to this basic diagnostic model (the calprotectin tests continuously and the POC FIT dichotomously), resulting in five extended models: three separate extensions (calprotectin POC or ELISA, or POC FIT), and two combined extensions (calprotectin POC or ELISA with POC FIT). As faecal testing may also be burdensome, we used the same stringent selection criterion for each faecal biomarker test as for the blood analyses (i.e. P < 0.05 for model improvement). Any blood analysis included in these extended models was subsequently removed if non-significant. For those models extended with the FIT, we also considered whether the FIT diagnostic odds ratio for SCD was lower in patients with overt rectal blood loss compared to those without (implying less diagnostic information), by testing a [FIT*blood loss] interaction term. All predictor selection tests were based on the log likelihood ratio. In all modelling, continuous predictors were included as such, using transformations if necessary to maintain linearity, while truncating outliers. Transformations were necessary for blood Hb (U-shape relation with SCD risk), and for duration of abdominal pain and CRP (logarithmic relations). See Additional file 1 for further model development details.
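The likelihood ratio test used for predictor selection can be sketched as follows for the case of one added parameter (for example, a single biomarker term or the interaction term). The log-likelihood values are hypothetical, and the function name is an assumption made for illustration.

```python
import math


def lr_test_1df(loglik_base: float, loglik_extended: float):
    """Likelihood ratio test for adding ONE parameter to a nested model.

    Returns (statistic, p_value). For 1 degree of freedom the chi-square
    upper tail equals erfc(sqrt(x / 2)), so no external library is needed.
    """
    stat = 2.0 * (loglik_extended - loglik_base)
    stat = max(stat, 0.0)  # guard against tiny negative values from rounding
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Hypothetical log-likelihoods (illustration only, not CEDAR values).
stat, p = lr_test_1df(loglik_base=-330.0, loglik_extended=-327.0)
print(f"LR statistic = {stat:.2f}, P = {p:.3f}")
```

A predictor would be retained under the P < 0.05 criterion when the returned p-value falls below 0.05; predictors that add more than one parameter need the general chi-square distribution instead.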
The final six diagnostic models were assessed for discrimination (area under the receiver operating characteristic curve; AUC), calibration, explained variation (Nagelkerke R²), accuracy (i.e. sensitivity, specificity, negative and positive predictive values (NPV and PPV) at different SCD probability thresholds: 2.5 %, 5.0 % and 7.5 %), and net benefit (decision curve analysis) [18][19][20]. All faecal biomarker extended models were compared to the basic model and the combined biomarker extended models to the individual biomarker extended models, in terms of discrimination, explained variation, and reclassification (net reclassification improvement (NRI) at 5.0 % and 50.0 % probability threshold for low and high risk, and (relative) integrated discrimination improvement (IDI)) [21].
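Two of these performance measures can be sketched in a few lines. The example below computes the AUC as a concordance probability and the net benefit at a referral threshold using the usual decision curve formula; the predicted probabilities and outcomes are hypothetical, and the code is an illustration rather than the analysis code used in the study.

```python
def auc(probs, outcomes):
    """AUC as the probability that a random diseased patient has a higher
    predicted probability than a random non-diseased one (ties count 0.5)."""
    pos = [p for p, y in zip(probs, outcomes) if y == 1]
    neg = [p for p, y in zip(probs, outcomes) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(probs, outcomes, threshold):
    """Net benefit of referring everyone with probability >= threshold:
    TP/n - FP/n * threshold / (1 - threshold)."""
    n = len(probs)
    tp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Hypothetical predicted probabilities and outcomes (1 = SCD, 0 = no SCD).
probs = [0.02, 0.04, 0.10, 0.30, 0.60, 0.03, 0.45, 0.08]
outcomes = [0, 0, 1, 1, 1, 0, 0, 0]
print(auc(probs, outcomes), net_benefit(probs, outcomes, 0.05))
```

The pairwise AUC is quadratic in the number of patients, which is acceptable for a cohort of this size; a rank-based formula gives the same value faster.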

Basic and extended diagnostic models
Nine of the 15 candidate predictors from patient history and physical examination were selected for the basic diagnostic model, to which blood Hb did not significantly contribute (P = 0.23) but CRP did (P = 0.03; see Table 2 for specification of the basic diagnostic model). This basic model significantly improved upon individual or combined extension with the calprotectin POC or ELISA and the POC FIT tests. Although CRP significantly contributed to the basic diagnostic model, it did not contribute to any of the five faecal biomarker extended models and was thus excluded from these. In none of the models with POC FIT did the odds ratio for SCD significantly differ in patients with and without rectal blood loss (Additional file 1), so we did not stratify the FIT results for overt rectal bleeding subgroups in the final models.

Model performance and comparison
The basic model's AUC increased from 0.741 (95 % CI, 0.694-0.789) to 0.763 (95 % CI, 0.718-0.809; P = 0.078) and 0.831 (95 % CI, 0.791-0.872; P < 0.001) upon extension with POC calprotectin and FIT, respectively, and to 0.837 (95 % CI, 0.798-0.876; P < 0.001) upon combined extension (Fig. 2 and Table 2). All three POC test extended models showed significant net reclassification improvement compared to the basic model. The FIT-only extended model and the combined POC extended model both yielded the highest NRI (both 0.38; see Additional file 1 for the corresponding reclassification tables). When adding FIT to the calprotectin POC extended model, both the AUC and NRI significantly increased, which was not true for adding calprotectin to the FIT extended model (Table 2). The basic model explained 19.0 % of the variation in SCD, which increased to 23.5, 34.5, and 35.8 % for the calprotectin, the FIT, and the combined POC extended models, respectively. All diagnostic models showed excellent calibration (Additional file 1).
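As an illustration of how a categorical NRI is obtained, the sketch below assumes three risk categories split at the 5.0 % and 50.0 % thresholds (our reading of the low-risk and high-risk cut-offs) and uses hypothetical predicted probabilities from an "old" and a "new" model. It is not the study's analysis code.

```python
def risk_category(p, low=0.05, high=0.50):
    """0 = low risk (< low), 1 = intermediate, 2 = high risk (>= high)."""
    return 0 if p < low else (2 if p >= high else 1)


def categorical_nri(old_probs, new_probs, outcomes, low=0.05, high=0.50):
    """NRI = [P(up | event) - P(down | event)]
           + [P(down | non-event) - P(up | non-event)]."""
    up_e = down_e = up_n = down_n = n_e = n_n = 0
    for old, new, y in zip(old_probs, new_probs, outcomes):
        move = risk_category(new, low, high) - risk_category(old, low, high)
        if y == 1:
            n_e += 1
            up_e += move > 0
            down_e += move < 0
        else:
            n_n += 1
            up_n += move > 0
            down_n += move < 0
    return (up_e - down_e) / n_e + (down_n - up_n) / n_n


# Hypothetical predictions for three SCD and three non-SCD patients.
old = [0.04, 0.20, 0.30, 0.10, 0.06, 0.03]
new = [0.08, 0.55, 0.30, 0.03, 0.02, 0.04]
outcomes = [1, 1, 1, 0, 0, 0]
print(categorical_nri(old, new, outcomes))
```

A positive value means that the new model moves diseased patients towards higher and non-diseased patients towards lower risk categories more often than the reverse.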

Ruling out SCD
Using the combined POC extended model at the ≥ 5.0 % SCD probability threshold for referral would rule out SCD (i.e. prevent referral) in 30.4 % of all patients in our study, with 96.4 % NPV and 93.7 % sensitivity (inappropriately not referring one CRC [stage 1], four diverticulitis, and four AA patients; Table 3). At the same threshold, the FIT-only extended model would rule out SCD in 30.1 % of patients, with 96.0 % NPV and 93.0 % sensitivity. Regarding the net benefit at the ≥ 5.0 % SCD probability threshold for referral when compared to the basic model, the combined POC extended model resulted in 60 more correctly non-referred patients without increasing the number of non-referred SCD patients, and three more correctly referred SCD patients without increasing unnecessary referrals (all per 1000 tested patients). These numbers were 34 and two, respectively, for the FIT extended model (Additional file 1).
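The relation between these rule-out figures can be checked with simple arithmetic. In the sketch below, the 810 patients, 141 SCD cases, and nine missed SCD patients come from the text, whereas the 246 test negatives are an assumed rounding of 30.4 % of 810. Because the reported percentages were averaged over imputed datasets, the recomputed values are expected to differ slightly from the published ones.

```python
def rule_out_summary(n_total, n_disease, n_test_negative, n_missed):
    """Summarise a rule-out strategy from four counts.

    n_test_negative: patients below the referral threshold (not referred).
    n_missed: diseased patients among those test negatives (false negatives).
    """
    true_negatives = n_test_negative - n_missed
    return {
        "fraction_not_referred": n_test_negative / n_total,
        "npv": true_negatives / n_test_negative,
        "sensitivity": (n_disease - n_missed) / n_disease,
        "specificity": true_negatives / (n_total - n_disease),
    }


# 810, 141 and 9 are taken from the text; 246 is an ASSUMED rounded count.
summary = rule_out_summary(n_total=810, n_disease=141, n_test_negative=246, n_missed=9)
for name, value in summary.items():
    print(f"{name}: {100 * value:.1f} %")
```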

Calprotectin POC versus ELISA test
Substituting the calprotectin POC with an ELISA test yielded similar results with regard to discrimination, explained variation, reclassification, and diagnostic accuracy (Tables 2 and 3; see Additional file 1 for ROC curves).

Towards use in new patients
To improve valid estimation of SCD risk in future patients, Table 4 shows the optimism-corrected regression coefficients of the combined POC and the FIT-only extended models (see Additional file 1 for the other models); the optimism-corrected AUC and explained variation of these models were 0.818 (95 % CI, 0.779-0.857) and 0.813 (95 % CI, 0.772-0.853), and 30.6 % and 29.5 %, respectively. See Additional file 1 for nomograms.
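Applying such a logistic model to a new patient amounts to evaluating the linear predictor and transforming it to a probability. The sketch below uses HYPOTHETICAL predictor names and coefficients (they are not the values in Table 4) together with the 5.0 % referral threshold discussed in this paper.

```python
import math


def scd_probability(intercept, coefficients, patient):
    """Logistic model: p = 1 / (1 + exp(-(intercept + sum(b_i * x_i))))."""
    linear_predictor = intercept + sum(
        coefficients[name] * patient[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def refer_for_endoscopy(probability, threshold=0.05):
    """Refer when the predicted SCD probability is at or above the threshold."""
    return probability >= threshold


# HYPOTHETICAL intercept and coefficients, for illustration only.
intercept = -4.0
coefficients = {"age_per_10_years": 0.2, "rectal_bleeding": 0.7, "poc_fit_positive": 2.0}
patient = {"age_per_10_years": 5.5, "rectal_bleeding": 1, "poc_fit_positive": 0}

p = scd_probability(intercept, coefficients, patient)
print(f"Predicted SCD probability: {p:.3f}; refer: {refer_for_endoscopy(p)}")
```

With the published coefficients substituted for the placeholders, the same two functions would reproduce what the nomograms do graphically.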

Discussion
We are the first to develop a diagnostic strategy in primary care patients suspected of SCD, considering signs, symptoms, simple blood analyses, and both faecal calprotectin and Hb levels. This study showed that especially a POC FIT, and to a much lesser extent calprotectin tests, have incremental value beyond patient history, physical examination, and CRP in ruling out SCD in primary care patients with persistent lower abdominal complaints. Use of a simple diagnostic model including calprotectin POC and POC FIT test results could safely rule out SCD and prevent endoscopy referral in about 30 % of patients with 96.4 % NPV (at a 5.0 % SCD probability referral threshold). Excluding the calprotectin test from this model yielded similar results, missing one additional AA patient (of the 49 present in our study). Substituting the calprotectin POC test by an ELISA did not substantially change these results. A perfect strategy would not miss any SCD patients. A substantial reduction of the number of unnecessary endoscopy referrals, as we show is feasible, will, however, inevitably result in a small risk of missing serious SCD. In our study, one patient with stage 1 CRC was not selected for referral by any of the POC FIT extended models at the ≥ 5.0 % SCD probability threshold (this patient tested negative on both the calprotectin POC test and the POC FIT). Provided that, in case of non-referral at first consultation, keen attention is paid to symptoms persisting over a time frame of 2-3 weeks, we think this will result in delaying, but not missing, such diagnoses. Such a limited delay is also unlikely to advance the disease stage substantially for CRC patients who were initially non-referred [29].
Notwithstanding the 2013 NICE recommendation for use in diagnosing IBD [10], calprotectin has so far only been studied in absence of other diagnostic information [11][12][13]. One retrospective study investigating the use of calprotectin in irritable bowel syndrome-suspected primary care patients from the United Kingdom reported an AUC for SCD of 0.89 (95 % CI, 0.85-0.93), much higher than we report here (0.68; 95 % CI, 0.63-0.73 [POC], 0.66; 95 % CI, 0.61-0.72 [ELISA]) [12]. Besides the different patient populations, adenomas were not considered SCD in that study, as they were in ours. As calprotectin levels are low in (advanced) adenoma patients [11], this partly explains the observed difference between the studies (AUCs for SCD without adenomas in our data:  [11]. Still, calprotectin did not show as much incremental diagnostic value as expected. This observation remained when analysing the data for IBD instead of SCD, and when considering adenomas non-SCD (data not shown).
Faecal Hb testing for CRC screening is widely accepted. Here, we showed that a qualitative POC FIT also has large incremental value for ruling out SCD in primary care. Our data further suggest that the POC FIT has value even in patients with overt rectal bleeding, equally so as in those without (Additional file 1). Additional analysis showed that the POC FIT was negative in 65.6 % of our patients with overt rectal bleeding. It may be more specific for blood mixed with faeces, thereby better reflecting the generally higher gastrointestinal location of SCD compared to other causes of rectal bleeding (e.g. haemorrhoids).
In a recent United Kingdom-based primary care study that ran from 2013 to 2014, 755 patients referred for bowel examination had available data on both faecal calprotectin (same ELISA as in our study) and Hb levels (using the quantitative EIKEN OC-Sensor assay) [16]. The authors concluded that undetectable faecal Hb may be sufficient to exclude CRC/IBD/higher-risk adenomas with 41.7 % test negatives, 96.2 % NPV and 88.2 % sensitivity, thereby questioning the added value of calprotectin, as in our study. Other studies have also advocated quantitative faecal Hb testing for ruling out SCD [30,31], or advanced neoplasia [32][33][34], in symptomatic patients. We could not confirm these promising results of faecal Hb by itself (Table 1), which is possibly because of the higher threshold of our POC FIT (with a detection limit of 6 μg/g), and it being a qualitative and not a quantitative test. Previous results suggest that using a single test could, in fact, be sufficient in deciding whom to refer for endoscopy. Indeed, our results also underscore that a positive POC FIT already implies the need for referral by itself (at the ≥ 5.0 % SCD probability threshold; see nomogram in Additional file 1). Here, the clinical data do not add much, but they do when the POC FIT returns negative. Also, in daily clinical practice, and certainly in primary care, it is rare that, except in a screening situation, physicians would immediately apply such a test in suspected patients presenting with symptoms and signs of SCD without even considering any other pre-test diagnostic information from history taking and physical examination. The diagnostic process in primary care is sequential, starting with history taking and physical examination, with follow-up testing only in cases where these first steps provide indications that justify additional testing.
To adhere as much as possible to primary care practice, we therefore explicitly first evaluated the diagnostic value of history taking, physical examination, and simple blood analysis, and subsequently the added value of the POC FIT test, rather than the other way around. Obviously, in unsuspected people, in the realm of screening, a single-test approach using first and foremost the POC FIT test seems a very reasonable approach, but in our view not for diagnostic work-up of clinically suspected patients, which was the focus of this paper.
A major strength of our study is its prospective conduct in a primary care setting, where results from secondary care studies may not be applicable [8]. We also took care to enrol representative patients from 266 general practices, while measuring all potentially relevant diagnostic information, including blood and faecal biomarkers, under routine conditions, enhancing the generalizability of our results. Moreover, patients underwent reference testing by the same standard, including 3 months follow-up after inconclusive endoscopy to identify any initially missed SCD, and index and reference tests were interpreted independently in each patient. Finally, we purposely developed diagnostic models for SCD, and not solely for CRC (or IBD) as commonly done. This resulted in a diagnostic strategy applicable to primary care patients with persistent lower abdominal complaints that is optimally aligned with the diagnostic challenge at hand: ruling out SCD.
Table 3 Diagnostic accuracy when basing endoscopy referral on varying SCD probability thresholds for the basic and the five faecal biomarker extended models, as observed in 810 Dutch patients with lower abdominal complaints referred for endoscopy in the CEDAR study
a The percentage referred and the accuracy measures are each averaged over the 10 imputed datasets. Hence, it is possible that, e.g. 100.0 % sensitivity does not directly match with 100.0 % NPV
b A patient with SCD was considered missed if his/her predicted SCD probability was below the respective threshold for referral in at least 5 of the 10 imputed datasets
When defining SCD, we only included adenomas > 1 cm as AA, without taking histologic high-risk features such as the presence of high-grade dysplasia or villous components in smaller adenomas into account. However, such high-risk features are seldom present in small adenomas [35], and we estimate that about 2 to 3 of the small adenomas we have considered non-SCD are actually high-risk lesions. This amount of misclassification (i.e. only ~2 % of all SCD cases in CEDAR) will likely not have importantly influenced the results.
Table note: These models can be used to calculate the probability for a certain patient of having SCD.
Some other limitations of our study also need discussion. For instance, we did not enrol primary care patients urgently referred for endoscopy (e.g. for on-going bleeding or imminent obstruction) or at very low SCD-suspicion (not necessitating endoscopy). Our study population thus reflects patients at intermediate risk of SCD. These patients, however, pose the largest diagnostic dilemma, where an improved diagnostic work-up is especially urgent. Further, most diagnostic predictors had missing data despite systematic data collection, and we had to use state-of-the-art multiple imputation of the 5.2 % missing data points to prevent selection bias and loss of information [23][24][25]. Furthermore, as we used all available data to optimally develop the best diagnostic strategy, and despite using bootstrapping techniques for internal validation to correct for over-optimism, formal external validation of our findings is still warranted. Finally, the use of a qualitative POC FIT in the way that we did in this study, although easily implemented in primary care, also has limitations. First, as the qualitative POC FIT yields a positive or a negative test result (with a detection limit of 6 μg Hb/g faeces), the diagnostic information that would be available by quantitatively assessing the amount of Hb present in faeces is lost. Second, patients collected faecal samples in regular blue-capped containers without Hb stabilizing buffer (so each patient needed to fill only one faecal container for both calprotectin and Hb analysis). Samples were kept refrigerated, and, if not frozen before further processing, 90 % were tested within 3 days of collection.
Additional data analysis showed that the chance of a positive POC FIT slightly decreased with increasing time between collection and testing (0.3 % absolute decrease per day; P = 0.19), and that frozen samples were more likely to be POC FIT negative than non-frozen samples (absolute 8.6 % decrease in POC FIT positivity; P = 0.017; calprotectin results seemed not to be affected). Some patients have thus likely tested falsely negative for the POC FIT because of Hb degradation in our study. However, in none of the models with POC FIT did its odds ratio for SCD significantly differ in patients whose faecal samples were and were not frozen. Furthermore, the POC FIT performed well in our study despite these limitations, and the sensitivity and discriminatory performance of faecal Hb testing in primary care will thus likely be even better when using Hb stabilizing buffers in faecal sample collection devices and using a quantitative FIT.

Conclusions
A simple model including information from history taking, physical examination, and a POC FIT may safely rule out SCD and prevent unnecessary endoscopy referral in approximately one-third of SCD-suspected primary care patients. Adding a calprotectin test to such a strategy has limited value.

Additional file
Additional file 1: Supplementary appendix. Contents: interaction between POC FIT and overt rectal bleeding; Table S1: model development strategy and specification; Figure S1: calibration curves; Figure S2: decision curve analysis; Figure S3: ROC curves substituting calprotectin POC with ELISA; Figure S4: nomogram of the combined POC extended model; Figure S5: nomogram of the POC FIT extended model. (DOCX 2031 kb)

Abbreviations
AA: advanced adenoma; AIC: Akaike's information criterion; AUC: area under the receiver operating characteristic curve; CEDAR: Cost-Effectiveness of a Decision rule for Abdominal complaints in primary caRe; CRC: colorectal cancer; CRP: C-reactive protein; ELISA: enzyme-linked immunosorbent assay; FIT: faecal immunochemical test for haemoglobin; GP: general practitioner; Hb: haemoglobin; IBD: inflammatory bowel disease; IDI: integrated discrimination improvement; NICE: National Institute for Health and Care Excellence; NPV: negative predictive value; NRI: net reclassification improvement; POC: point-of-care; PPV: positive predictive value; ROC curve: receiver operating characteristic curve; SCD: significant colorectal disease