The major global gap in knowledge of causes of death, particularly for adult mortality in LMICs, might be filled by adopting nationally representative VA surveys [1–5]. We develop and implement simple metrics that measure the performance of national VA-based COD systems, such as the MDS. Applying these metrics, we find that lay reporting with double physician coding yields plausible results in the MDS. The MDS retains ill-defined deaths as a check on quality, and finds few ill-defined deaths during young and middle age (below age 70 years), but far more at older ages. This pattern aligns with public health priorities, which are mostly concerned with avoidable death in young and middle age [1, 37]. Indeed, misclassification is common at older ages even for medically certified deaths occurring in hospitals in high-income countries [38, 39].
A simple but effective way to establish the usefulness of CSMFs from VA-based national surveys is to test whether an independent resample yields similar results. The CSMFs and rank order of CODs at the population level are similar between randomly resampled deaths and those from the main MDS. This suggests stability and reproducibility of the RHIME method, but does not itself prove validity. By contrast, CSMFs differed substantially between hospital- and home-based deaths, and between urban and rural deaths, highlighting the need for VA studies to use true random samples to reliably capture the COD distribution at the national level. The age-, sex- and temporal-plausibility for major conditions is high. Consistency of coding does not depend on various household characteristics, but does depend on whether the respondent lived with the deceased and on the availability of a good-quality narrative. Physicians agreed at initial coding about two-thirds of the time. Finally, the MDS classification is roughly comparable to WHO’s classification system in being able to classify most ICD-10 codes assigned to surveyed deaths and to minimize (but not to eliminate) ill-defined conditions, and performs better than the GBD classification system on these metrics.
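The resample comparison described above can be illustrated with a minimal sketch. It is not the MDS analysis code: the function names, the cause labels, and the toy counts are all hypothetical, and the largest absolute CSMF difference is used here only as one convenient summary of agreement.

```python
from collections import Counter


def csmf(causes):
    """Cause-specific mortality fractions: share of all deaths assigned to each cause."""
    counts = Counter(causes)
    total = sum(counts.values())
    return {cause: n / total for cause, n in counts.items()}


def rank_order(fractions):
    """Causes sorted from largest to smallest fraction (ties broken alphabetically)."""
    return sorted(fractions, key=lambda cause: (-fractions[cause], cause))


def max_abs_difference(a, b):
    """Largest absolute gap in CSMF over every cause seen in either sample."""
    return max(abs(a.get(c, 0.0) - b.get(c, 0.0)) for c in set(a) | set(b))


# Hypothetical toy data: one assigned cause per death (not MDS results).
main_survey = ["cardiovascular"] * 50 + ["respiratory"] * 30 + ["injury"] * 20
resample = ["cardiovascular"] * 24 + ["respiratory"] * 16 + ["injury"] * 10

main_csmf = csmf(main_survey)
resample_csmf = csmf(resample)

same_ranking = rank_order(main_csmf) == rank_order(resample_csmf)
largest_gap = max_abs_difference(main_csmf, resample_csmf)
print(same_ranking, round(largest_gap, 3))
```

As the text notes, agreement of this kind indicates reproducibility only; a resample that shares the same systematic errors would agree equally well.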
National patterns of CODs based solely on hospital and/or urban deaths can be misleading. For example, the GBD estimates for India over-report injury deaths, in particular fires, by relying in large part on urban hospital-based deaths. Unpublished data from the MDS also suggest that the leading COD in India (ischemic heart disease) may be overestimated by the GBD, given the GBD’s reliance on urban hospital deaths. Finally, the MDS shows that the leading cause of cancer death in women in India is cervical cancer, followed by breast cancer. The GBD finds the exact opposite, due to its reliance on mostly urban cancer registries. Hospital-home and urban–rural differences persisted after adjustment for age, education, religion, region and other variables, suggesting that there are underlying biases which cannot be easily corrected for, and that caution is needed in extrapolating CSMFs from hospitals to non-hospitalized populations. Moreover, there are differences in the distribution of CODs, treatment patterns, and underlying pathogens for infectious causes between hospital deaths (mostly urban) and rural, unattended deaths in the home [1, 40–44]. For example, malaria is observed mostly among rural, unattended deaths. Thus, hospital deaths should not be regarded as a gold standard from which to ‘validate’ rural, medically unattended deaths.
VA generally produces a proportion of deaths that are coded as ill-defined or unspecified causes, particularly at older ages. However, ill-defined categories are important to maintain as they permit a check on the quality of a VA system, as well as individual surveyors’ quality of fieldwork. For example, the Indian government ceased an earlier system of obtaining COD from rural health centers in part because the ill-defined rate was rising, suggesting decreasing quality [5, 46]. The MDS methods explicitly keep ill-defined codes visible, rather than re-classifying them into other causes and artificially reducing ill-defined codes to zero. The MDS COD classification system groups several of the ill-defined codes with more certain diagnoses (for example, adding R96 for sudden death to the acute myocardial infarction group, see Additional files 3 and 4), though the majority of ill-defined codes remain in a distinct ill-defined group. WHO’s Global Health Estimates (GHE) similarly groups ill-defined codes with other diseases, and the system is reasonably transparent about these re-allocations, which are reproducible. The GBD system tends to distribute ill-defined codes to other diseases. GBD uses an unpublished method for re-distributing ill-defined codes to well-defined categories that has not yet been reproduced.
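Tracking the ill-defined share as a quality check can be sketched as follows. This is an illustration under stated assumptions rather than the MDS classification: it treats every ICD-10 chapter XVIII code (those beginning with "R") as ill-defined, except R96, which the text says is grouped with acute myocardial infarction. The function names and the toy records are hypothetical; the actual MDS groupings are in Additional files 3 and 4.

```python
def is_ill_defined(icd10_code, regrouped=frozenset({"R96"})):
    """Assumption: 'R' codes are ill-defined unless regrouped with a specific cause."""
    category = icd10_code.strip().upper()[:3]  # three-character ICD-10 category
    return category.startswith("R") and category not in regrouped


def ill_defined_fraction_by_age(deaths, cutoff_years=70):
    """Share of ill-defined deaths below and at/above an age cutoff.

    deaths: iterable of (age_in_years, icd10_code) pairs.
    Returns None for a group that contains no deaths.
    """
    counts = {"below_cutoff": [0, 0], "cutoff_and_above": [0, 0]}  # [ill-defined, total]
    for age_years, code in deaths:
        group = "below_cutoff" if age_years < cutoff_years else "cutoff_and_above"
        counts[group][0] += int(is_ill_defined(code))
        counts[group][1] += 1
    return {g: (ill / total if total else None) for g, (ill, total) in counts.items()}


# Hypothetical toy records (age, ICD-10 code); not MDS data.
toy_deaths = [
    (45, "I21"), (52, "J44"), (38, "V89"), (60, "R96"), (66, "R99"),
    (55, "A09"), (49, "I64"), (30, "B54"), (68, "C53"), (58, "E14"),
    (75, "R54"), (82, "R99"), (79, "I64"), (88, "J18"), (71, "I21"),
]

fractions = ill_defined_fraction_by_age(toy_deaths)
print(fractions)
```

The same tally could be run per surveyor or per survey round, since a rising ill-defined share is the signal the paragraph describes as indicating declining quality.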
The simple ICD-10 classification systems used by the MDS and WHO are preferable to the complex re-classification systems used by the GBD. The GBD’s poor performance in classifying deaths is due in part to the peculiar decision to treat cerebrovascular diseases (ICD-10 code I64) as a ‘garbage code’ subject to misclassification. In most VA studies (such as by the INDEPTH network [48, 49]), and indeed in the United States, I64 constitutes the majority of the cerebrovascular ICD-10 codes (I60-I69) on death certificates.
These findings carry implications for other LMICs considering introduction of VA-based methods. First, three simple but important statistical features should be ensured: random sampling of deaths, random resampling of fieldwork, and double coding by physicians. The most important limitation in global estimation of CODs is simply an insufficient number of countries that implement simple, large-scale VA studies. The debate about physician versus machine coding is somewhat misleading, as it misses the key point that far more nationally representative studies are needed.
Innovations in electronic capture of field records, as well as electronic physician recruitment, training, certification, and coding, have resulted in the ability to rapidly conduct large physician-coded VA studies. Indeed, the main rate-limiting steps are organizational and financial. Technical innovations can further simplify the fieldwork and ensure physician coding is supported with computer-based diagnosis. The use of electronic data entry with time- and GPS-tracking of fieldwork, as well as the important feature of resampling deaths, can further improve field quality. In the MDS, the major delay in the coding of records has occurred due to administrative issues and reliance on paper-to-electronic scanning, and not due to the rate of physician coding. Advancing the field will also require commitment to open-source materials, methods, and software. To this end, all the MDS tools are freely available to use without restriction. Open-source data sets are the logical next step in the evolution of global estimates of COD.