  • Research article
  • Open access

An in-depth assessment of a diagnosis-based risk adjustment model based on national health insurance claims: the application of the Johns Hopkins Adjusted Clinical Group case-mix system in Taiwan

Abstract

Background



Diagnosis-based risk adjustment is becoming an important issue globally as a result of its implications for payment, high-risk predictive modelling and provider performance assessment. The Taiwanese National Health Insurance (NHI) programme provides universal coverage and maintains a single national computerized claims database, which enables the application of diagnosis-based risk adjustment. However, research regarding risk adjustment is limited. This study aims to examine the performance of the Adjusted Clinical Group (ACG) case-mix system using claims-based diagnosis information from the Taiwanese NHI programme.

Methods


A random sample of NHI enrollees was selected. Those continuously enrolled in 2002 were included for concurrent analyses (n = 173,234), while those in both 2002 and 2003 were included for prospective analyses (n = 164,562). Health status measures derived from 2002 diagnoses were used to explain the 2002 and 2003 health expenditure. A multivariate linear regression model was adopted after comparing the performance of seven different statistical models. Split-validation was performed in order to avoid overfitting. The performance measures were adjusted R2 and mean absolute prediction error of five types of expenditure at individual level, and predictive ratio of total expenditure at group level.

Results


The more comprehensive models performed better when used for explaining resource utilization. Adjusted R2 of total expenditure in concurrent/prospective analyses were 4.2%/4.4% in the demographic model, 15%/10% in the ACGs or ADGs (Aggregated Diagnosis Group) model, and 40%/22% in the models containing EDCs (Expanded Diagnosis Cluster). When predicting expenditure for groups based on expenditure quintiles, all models underpredicted the highest expenditure group and overpredicted the four other groups. For groups based on morbidity burden, the ACGs model had the best performance overall.

Conclusions


Given the widespread availability of claims data and the superior explanatory power of claims-based risk adjustment models over demographics-only models, Taiwan's government should consider using claims-based models for policy-relevant applications. The performance of the ACG case-mix system in Taiwan was comparable to that found in other countries. This suggested that the ACG system could be applied to Taiwan's NHI even though it was originally developed in the USA. Many of the findings in this paper are likely to be relevant to other diagnosis-based risk adjustment methodologies.

Background


Risk adjustment has become an increasingly important tool in healthcare around the globe. It is being extensively applied to provider performance assessment [1–3], high risk predictive modelling for disease management [4–6] and payment adjustment [7–10]. Through these applications the broader goals of equity, efficiency and improved outcomes may be achieved in a healthcare system. Health indicators that are often used for risk adjustment include demographic factors, subjective/self-reported health status [11, 12], biomedical clinical indicators [13, 14], prior expenditure [15, 16] and claims-based morbidity burden indicators (using diagnostic codes [17, 18] or medication codes [9, 19–21]). Even though demographic factors are most often used given the availability of the data, both prior-expenditure and claims-based health indicators perform much better than demographic factors [8, 9, 13, 14, 16, 19, 22]. However, since risk adjustment models are usually adopted for payment adjustment, models using diagnosis and/or pharmacy data are preferred as prior use models could offer inappropriate incentives to increase services in order to receive higher payments. Diagnosis and/or pharmacy-based risk adjustment models have been developed and gradually adopted in Canada, the USA and Europe [9, 23–29].

The Johns Hopkins Adjusted Clinical Group (ACG) system is a comprehensive risk adjustment software package that incorporates diagnosis information, pharmacy information, or both, to capture an individual's morbidity burden [23, 24, 30]. ACG actuarial cells and Aggregated Diagnosis Group (ADG) binary morbidity markers have been extensively examined in the USA [8, 16], Canada [31] and European countries [32]. However, Expanded Diagnosis Clusters (EDCs), another output of the ACG system, have not been assessed carefully in terms of their performance in risk adjustment. EDCs are binary indicators which show whether or not an individual has specific diseases/symptoms. It has been shown that a substantial fraction of health costs results from the treatment of a relatively small number of common, but expensive, chronic diseases [15]. Therefore, our supposition is that the basic ACG models can be improved upon by the inclusion of selected disease conditions (represented by EDCs) [17].

Taiwan launched a government-run, single-payer National Health Insurance (NHI) programme in May 1995. All Taiwanese nationals are obligated by law to join this programme to ensure adequate risk pooling. Under the jurisdiction of the national government's Department of Health, the NHI is administered by the Bureau of National Health Insurance (BNHI), and six regional branches are in charge of administering the NHI in each area. The NHI's benefit packages are comprehensive, including inpatient and outpatient services, pharmacy services, Chinese medicine and dental services. Beneficiaries have complete freedom of choice of providers and therapies, and they do not need to go through 'gatekeepers' in order to obtain medical services from specialists. The primary source of funding for the NHI is the payment of premiums shared by the insured, the employers and the government. In terms of reimbursement, the global budget payment system was adopted in order to contain the growth of medical expenditure. Within budget limits, the NHI reimburses contracted providers mostly on a fee-for-service basis, using uniform national fee schedules. As the programme enters its second decade, reform of Taiwan's NHI has focused on three aspects: quality improvement; financial balance; and expansion of social participation. In order to achieve the first two goals, the implementation of risk adjustment is crucial [33].

Diagnosis-based risk adjustment is still a very new concept in Asia and all existing risk adjustment technologies were developed using claims data from Western countries. Several studies have evaluated their performance in Taiwan [34, 35]. However, they were either methodologically limited (for example, split-validation was not performed), only reported a single measure at individual level (R2), or simply focused on total expenditure. Given that diagnosis-based risk adjustment has different implications at individual and group level (such as budget allocation) and across different types of expenditure, it is necessary to thoroughly examine risk adjustment models before they can be directly applied in Taiwan. In addition, in most cases, previous evaluations of risk adjustment models have used regional datasets or focused only on sub-populations. Taiwan is one of very few health care systems in the world which has universal coverage and a single national computerized database that includes medical diagnosis information on almost 100% of the population. For this reason the results of this paper have potential policy and methodology implications for most other high or middle income nations.

In this paper we aimed to assess the performance of the ACG system using Taiwan's NHI claims data and to evaluate how adding EDCs could affect the performance of the ACG system.


Methods

Data sources

The source of the data was a longitudinal dataset prepared by Taiwan's BNHI, which is available to researchers interested in observing longitudinal changes in medical utilization. Individuals' identifiers in this dataset have been encrypted in order to protect privacy and confidentiality, and this study has been approved by the Johns Hopkins School of Public Health Institutional Review Board. This dataset contained enrollment and claims files of a randomly chosen 1% of Taiwan's population (about 200,000 individuals). The enrollment files contained individual subscription information and demographic factors, including sex, date of birth, type of beneficiaries and location. The claims files contained comprehensive records of inpatient care, ambulatory care, pharmacy store, dental care and Chinese medicine services, including date of service, ICD-9-CM (International Classification of Diseases) diagnosis codes, claimed medical expenses and the amount of co-payment for each encounter. Concurrent analyses required 12 months of enrollment in 2002, while prospective analyses required 24 months of enrollment across 2002 and 2003. The final sample size was 173,234 in the concurrent and 164,562 in the prospective analyses.

Annual health expenditures were aggregated from all inpatient, outpatient and pharmacy store claimed expenses for every enrollee, including claimed reimbursement, medication expenses and co-payments; expenses for dental care and Chinese medicine were excluded from this aggregation. The total expenditure could be further divided into inpatient/outpatient/pharmacy store expenditure, or medical/drug/pharmacy service expenditure. Given that pharmacy store and pharmacy service expenditure was very small (each accounted for less than 2.5% of the total expenditure), results of both categories were not reported. The 2002 expenditure was used for concurrent analyses while year 2003 expenditure was used for prospective analyses. The unit of money in Taiwan is the New Taiwan Dollar (NTD); the exchange rate is about 32 NTD: 1 US dollar as of November 2009. Demographic factors included: sex; type of beneficiaries (insured or dependent); categorical age (0-17, 18-34, 35-49, 50-64, 65+); insurance category (based on insured's type of job); residence (three levels with different degrees of population density); and locality (six regions: Taipei, Northern, Central, Southern, Kao-Ping and Eastern). Diagnosis-based risk adjustment factors, including ACGs, ADGs and EDCs, were derived from the ACG case-mix system (Version 7.1) using the individuals' overall ICD-9-CM codes from both inpatient and outpatient records in 2002 (diagnosis codes from dental care and Chinese medicine were excluded).
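The aggregation rule described above can be sketched as follows. This is only an illustration, not the study's SAS code: the record layout and the field names (`person_id`, `year`, `service_type`, `claimed_amount`, `copayment`) are assumptions, and it is assumed that the claimed amount and the co-payment are recorded separately, so that adding them does not double count.

```python
from collections import defaultdict

# Service types that count towards annual expenditure in the text;
# dental care and Chinese medicine are deliberately left out.
# The label strings themselves are hypothetical.
INCLUDED_SERVICE_TYPES = {"inpatient", "outpatient", "pharmacy_store"}


def annual_expenditure(claims, year):
    """Sum claimed amount plus co-payment per enrollee for one calendar year.

    `claims` is an iterable of dicts with the hypothetical keys
    person_id, year, service_type, claimed_amount and copayment.
    """
    totals = defaultdict(float)
    for claim in claims:
        if claim["year"] != year:
            continue
        if claim["service_type"] not in INCLUDED_SERVICE_TYPES:
            continue
        totals[claim["person_id"]] += claim["claimed_amount"] + claim["copayment"]
    return dict(totals)
```

Enrollees with no qualifying claims do not appear in the result and would need to be added with an expenditure of zero before modelling.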

The ACG risk adjustment system

ACG actuarial cells are mutually exclusive health status categories defined by morbidity pattern, age and sex. The ACG system assigns all ICD-9-CM codes to one of 32 diagnostic clusters (ADGs) based on five clinical dimensions: duration; severity; diagnostic certainty; aetiology; and specialty care involvement [23, 24]. Each ADG is a grouping of diagnosis codes similar in terms of severity and likelihood of persistence of the health condition treated over a relevant period of time, typically 1 year. ADGs are not mutually exclusive and individuals can have multiple ADGs (up to 32). Individuals are then placed into one of 93 discrete ACG categories according to their assigned ADGs, age and sex. The result is that individuals within a given ACG experienced a similar pattern of morbidity and resource consumption. The Johns Hopkins EDC methodology assigns each ICD code to a single disease category or EDC; there are 264 EDCs in total. ICD codes within an EDC share similar clinical characteristics and are expected to induce similar types of diagnostic and therapeutic responses.
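To make the difference between mutually exclusive ACG cells and non-exclusive ADG markers concrete, the sketch below builds the 32 binary ADG markers for one person. The lookup table is entirely hypothetical: the codes `code_A` to `code_D` and their ADG numbers are placeholders, since the real assignment of ICD-9-CM codes to ADGs is part of the ACG software and is not reproduced here.

```python
N_ADGS = 32  # number of ADGs stated in the text

# Hypothetical lookup from diagnosis code to ADG number (1..32).
# These placeholder codes and assignments are NOT the real ACG grouper table.
HYPOTHETICAL_CODE_TO_ADG = {
    "code_A": 1,
    "code_B": 10,
    "code_C": 10,
    "code_D": 32,
}


def adg_markers(diagnosis_codes):
    """Return 32 binary markers (index 0 corresponds to ADG 1) for one person."""
    markers = [0] * N_ADGS
    for code in diagnosis_codes:
        adg = HYPOTHETICAL_CODE_TO_ADG.get(code)
        if adg is not None:  # codes missing from the lookup stay non-grouped
            markers[adg - 1] = 1
    return markers
```

A person can carry several markers at once, and two codes that map to the same ADG set that marker only once; EDC indicators could be built the same way with a wider vector of 264 entries.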

Measuring predictive performance

The following risk adjustment models (from the simplest to the most comprehensive) were used to explain five types of expenditure (total, inpatient, outpatient, medical and drug), both concurrently and prospectively:

  1. Demographics only,
  2. ACGs only,
  3. ADGs with demographics,
  4. ADGs plus selected EDCs with demographics, and
  5. Full EDCs with demographics.

Selected EDCs were derived from the results of stepwise analyses using all EDCs (Additional file 1), and the final set of selected EDCs differed between the concurrent (33 EDCs) and prospective analyses (19 EDCs). As expenditure is a non-negative variable, negative predicted expenditures from the models were set to zero.

The performance of five risk adjustment models was evaluated at two levels: adjusted R2 and mean absolute prediction error (MAPE) [22, 36] at individual level, and predictive ratio (PR) at group level. MAPE was the average of all absolute differences between the observed and the predicted. MAPEs of different types of expenditure were divided by their respective means so that results on different types of expenditures could be compared. PR was calculated by dividing mean predicted expenditure by mean actual expenditure within a selected group of subjects. The model performed better if R2 was larger, MAPE was smaller and the PR was closer to one. Split analysis was performed (a randomly selected 70% of study subjects were used for model development while the rest was set aside for model validation), and measures of model performance were obtained from the validation set to avoid overfitting.
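The three performance measures defined above can be written out as a short sketch. The function names are illustrative, and the text does not state exactly how adjusted R2 was computed on the validation set, so the usual textbook formula is assumed here, with `n_params` counting predictors excluding the intercept.

```python
def adjusted_r2(observed, predicted, n_params):
    """Adjusted R-squared (assumed textbook formula).

    n_params is the number of predictors, excluding the intercept.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_params - 1)


def normalized_mape(observed, predicted):
    """Mean absolute prediction error divided by mean observed expenditure."""
    n = len(observed)
    mean_abs_error = sum(abs(o - p) for o, p in zip(observed, predicted)) / n
    return mean_abs_error / (sum(observed) / n)


def predictive_ratio(observed, predicted):
    """Mean predicted over mean observed expenditure within one group.

    Both means share the same group size, so the ratio of sums is equivalent.
    """
    return sum(predicted) / sum(observed)
```

For example, with observed values 0, 100, 200 and 700 and predictions 50, 150, 250 and 550, the mean absolute error is 75 and the mean observed expenditure is 250, giving a normalized MAPE of 0.3 (30%), while the predictive ratio is exactly 1 because under- and overpredictions cancel at group level.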

Among these three indicators, R2 was easily influenced by outliers [37]. Therefore, each model was estimated under three levels of truncation in order to reduce the influence of outliers: no truncation (raw expenditure); truncation at two standard deviations above mean of log expenditure plus one; and truncation at the top 0.5%. Definitions of groups used to calculate PR included actual total expenditure quintiles, disease burden and age/sex group. Disease burden was classified into six categories from very low to very high morbidity and was also based on an output of the ACG system. In concurrent analyses, group classification could only be based on the 2002 (current) information; in prospective analyses, however, group classification could be based on either the 2002 (prior) or the 2003 (current) information.
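The two truncation rules can be sketched as below, under several assumptions that the text does not settle: "truncation" is taken to mean capping values at the threshold rather than dropping them, "log expenditure plus one" is read as log(expenditure + 1), the sample standard deviation is used, and the top 0.5% cut-off uses a simple rank-based rule. The function names are illustrative.

```python
import math
import statistics


def cap_at_log_threshold(expenditures, n_sd=2.0):
    """Cap values at exp(mean + n_sd * SD of log(x + 1)) - 1.

    Assumes capping (not dropping) and the sample standard deviation.
    Returns the capped list and the threshold on the original scale.
    """
    logs = [math.log(x + 1.0) for x in expenditures]
    log_cutoff = statistics.mean(logs) + n_sd * statistics.stdev(logs)
    threshold = math.exp(log_cutoff) - 1.0
    return [min(x, threshold) for x in expenditures], threshold


def cap_at_top_fraction(expenditures, fraction=0.005):
    """Cap the highest `fraction` of values at the largest remaining value."""
    ordered = sorted(expenditures)
    n = len(ordered)
    n_top = round(n * fraction)  # number of observations to cap
    threshold = ordered[max(0, n - n_top - 1)]
    return [min(x, threshold) for x in expenditures], threshold
```

With 1,000 observations and the default fraction, the five largest values are pulled down to the 995th-ranked value; everything else is left unchanged.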

Statistical analysis

All statistical analyses were conducted using SAS™ software version 9.1. Several statistical methods have been proposed for the analysis of expenditure, and no single method was found to be the best under the different conditions examined in these studies [38–40]. Comparisons of these statistical models for the explanation of expenditure are presented in Additional file 2. In this study, the ordinary least squares (OLS) regression model was used because it yielded a very high R2 and a comparable MAPE, it is the standard approach usually adopted in studies involving risk adjustment [41, 42], and the sample size was very large [16].
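The study itself used SAS; the following is only a small Python sketch of the same workflow, assuming a 70/30 random split, an OLS fit on the development set via the normal equations, and negative predictions set to zero as described earlier. All function names are illustrative, and the solver is meant for small, well-conditioned examples rather than models with hundreds of indicators.

```python
import random


def split_70_30(n, seed=0):
    """Randomly split indices 0..n-1 into development (70%) and validation (30%)."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = (7 * n) // 10
    return indices[:cut], indices[cut:]


def fit_ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination.

    X is a list of rows; a leading 1.0 in every row acts as the intercept.
    Raises ZeroDivisionError if X'X is singular (e.g. collinear indicators).
    """
    p = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(p)]
    for col in range(p):
        # Partial pivoting: bring the largest remaining entry to the diagonal
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def predict(X, beta):
    """Linear prediction with negative predicted expenditure set to zero."""
    return [max(0.0, sum(b * x for b, x in zip(beta, row))) for row in X]
```

In use, the coefficients would be estimated on the rows selected by the development indices and the performance measures computed only on predictions for the validation rows.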


Results

Characteristics of the population (Table 1)

Table 1 Characteristics of the Taiwanese population for concurrent and prospective analyses

The distribution of demographic factors and medical utilization was similar among all subjects included in concurrent and prospective analyses. About half of the study subjects were male and 40% were the insured. The mean age in 2002 was 35 years and 10% were elderly. About one-third lived in the areas within the Taipei Branch, while only 2% were from the Eastern Branch. About 65% were living in rural county areas. Only 10% had not made any outpatient visit, while 8% had at least one inpatient stay. About 90% had non-zero total expenditure and a similar percentage had non-zero drug expenditure. The 2003 expenditure of the prospective sample was about the same as the 2002 expenditure of the concurrent sample. Mean total expenditure was about 14,500 NTDs, among which medical expenditure (10,000 NTDs) was much higher than pharmacy expenditure (4000 NTDs). Outpatient expenditure (9500 NTDs) was also much higher than inpatient expenditure (4800 NTDs).

Proportion of total variance explained by the models (adjusted R2) (Tables 2 and 3)

Table 2 Concurrent adjusted R-squared and mean absolute prediction error of alternate risk factors and different categories of expenditure.
Table 3 Prospective adjusted R2 and mean absolute prediction error of alternate risk factors and different categories of expenditure.

In concurrent analyses, the demographic model explained 4% of the variance in total expenditure, the ACGs and ADGs models explained about 15%, and the ADGs plus selected EDCs model and the full EDCs model explained roughly 40%. In prospective analyses, the demographic model explained 4%, the ACGs and ADGs models explained about 10%, and the ADGs plus selected EDCs model and the full EDCs model explained over 20%. In concurrent analyses, the adjusted R2 of medical expenditure was much higher than that of pharmacy expenditure across all models. In the prospective analyses, the adjusted R2 of medical expenditure was higher only in the two EDC-related models and comparable in the other models. In addition, in concurrent analyses, the adjusted R2 of outpatient expenditure was slightly higher than that of inpatient expenditure in the simpler models and comparable in the two EDC-related models. In prospective analyses, the adjusted R2 of outpatient expenditure was always higher across all models. Comparing the concurrent and prospective analyses, the lower adjusted R2 of total expenditure in prospective analyses resulted from the lower prospective adjusted R2 of medical and inpatient expenditure (the adjusted R2 of pharmacy and outpatient expenditure was similar in concurrent and prospective analyses).

In both concurrent and prospective analyses, truncation increased adjusted R2 across all types of expenditure, especially pharmacy and outpatient expenditure. The only exception was that the adjusted R2 of prospective medical expenditure in the EDC-related models remained the same after truncation. After truncation, the adjusted R2 was higher for pharmacy expenditure (compared to medical expenditure) and for outpatient expenditure (relative to inpatient expenditure), and these differences in adjusted R2 (pharmacy/medical expenditure and inpatient/outpatient expenditure) were much larger in prospective analyses. The adjusted R2 of pharmacy expenditure increased the most after truncation. In addition, the more observations were truncated, the higher the adjusted R2. After truncation at the top 0.5%, the adjusted R2 of total expenditure in the two EDCs models increased from 40% to 53% in concurrent analyses and from 22% to 29% in prospective analyses; the increase was larger in concurrent analyses.

Adjusted R2 differed between the elderly and non-elderly groups (Table 4). In concurrent analyses, the adjusted R2 in the four ACG-related risk adjustment models was always larger in the elderly population, with the exception of the adjusted R2 of outpatient expenditure from the two EDCs models. The biggest difference was in the adjusted R2 of inpatient expenditure, which was about 20 percentage points larger in the elderly population in the EDC-related models. The adjusted R2 of total expenditure in the non-elderly population ranged from 1.4% in the demographic model to 33% in the EDCs-related models, while in the elderly population it ranged from 0.4% to 45%. In the prospective analyses, the adjusted R2 in the elderly population was larger only for pharmacy expenditure, and smaller or similar for all other types of expenditure. The adjusted R2 in the non-elderly population ranged from 1.7% in the demographic model to 22% in the EDC-related models, while in the elderly population it ranged from 0.3% to 16%. Demographic models performed badly in the elderly population in both prospective and concurrent analyses.

Table 4 Concurrent and prospective adjusted R-Squared of raw expenditures by two age groups.

Mean absolute prediction error (%) (Tables 2 and 3)

In concurrent analyses, MAPE of total expenditure in the demographic model was 109%; that from the ACG model was 87%, which was better than the ADGs model (94%), and the MAPEs from the two EDC-related models were roughly the same (78%). In the prospective analyses, the MAPE of total expenditure in the demographic model was 112%. Those of the ACG and ADG models were close (103%), while the MAPEs of the two EDCs-related models were about 96%. Both concurrent and prospective analyses showed that MAPEs were the smallest in outpatient expenditures, then total, pharmacy, and medical expenditures, while the largest MAPEs were from inpatient expenditures. MAPEs of outpatient expenditures were about half of those from inpatient expenditures across all models. In addition, MAPEs were smaller in concurrent analyses than in prospective analyses.

Predictive ratio (PR) (Tables 5 and 6)

Table 5 Concurrent predictive ratios of alternate risk factors and different categories of expenditures.
Table 6 Prospective predictive ratios of alternate risk factors and different categories of expenditures.

Expenditure levels by quintiles

All models underpredicted total expenditure in the highest quintile group while expenditure was overpredicted in the four other groups. When groups were defined based on current information (2002 expenditure for the concurrent and 2003 expenditure for the prospective analyses), PR decreased from the lowest to the highest quintile group. There was an especially large drop moving from the lowest to the second lowest group. Among the current classification, PR was smaller in concurrent than prospective analyses. When groups were defined using prior classification (in prospective analyses only), the decreasing trend was not clear and PR was much smaller. In general, comprehensive models usually performed better than simpler models and the demographic model performed much worse than the other models.
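A small sketch may help show how predictive ratios by expenditure quintile are formed and why overprediction at the bottom and underprediction at the top is the expected direction. It is an illustration rather than the study's procedure: quintiles are formed by simple ranking with ties broken arbitrarily, and the function name is illustrative.

```python
def predictive_ratio_by_quintile(observed, predicted):
    """Return five predictive ratios, from the lowest to the highest quintile
    of observed expenditure (sum of predicted / sum of observed per group)."""
    order = sorted(range(len(observed)), key=lambda i: observed[i])
    n = len(order)
    ratios = []
    for q in range(5):
        members = order[q * n // 5:(q + 1) * n // 5]
        obs_sum = sum(observed[i] for i in members)
        pred_sum = sum(predicted[i] for i in members)
        # A group with zero observed expenditure has no finite ratio
        ratios.append(pred_sum / obs_sum if obs_sum > 0 else float("inf"))
    return ratios


# Toy data: a skewed distribution and a predictor that returns the overall
# mean (40) for everyone, i.e. a model with no explanatory power at all.
observed = [10, 10, 20, 20, 30, 30, 40, 40, 100, 100]
predicted = [40.0] * len(observed)
ratios = predictive_ratio_by_quintile(observed, predicted)
```

In this toy case the ratios fall steadily from 4.0 in the lowest quintile to 0.4 in the highest. Any model that explains only part of the variance pulls predictions towards the mean in the same way, which is consistent with the pattern reported above, and a more informative model moves every ratio closer to 1.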

Morbidity status

Overall, comprehensive models tended to perform better, with the exception of the ACG model. The ACG model performed far better than all other models in concurrent analyses. PR of people with the lowest morbidity burden was only about 0.9 in the ACG model but was more than 100 in the other models. As with the expenditure quintiles, all models tended to overpredict total expenditure for people with lower morbidity burden, who had lower total expenditure, while underpredicting expenditure for people with higher morbidity burden (hence higher total expenditure). There was no decreasing trend for PR other than that in the demographic model or current classification in prospective analyses. PR based on current classification usually deviated further from 1 in prospective than in concurrent analyses. PR based on prior classification was much better than that based on current classification, probably because the difference in mean expenditures across groups was much smaller.

Age/sex group

Comprehensive models tended to perform better. Among younger groups, PR deviated further from 1 in females. Among older groups, by contrast, the deviation was larger in males. Overall, PR was closer to 1 in male than in female groups in both concurrent and prospective analyses. In addition, there was a tendency for all models to overpredict total expenditure in the older groups while underpredicting in the younger groups in both sexes, especially in prospective analyses and simpler models.

Discussion


We found that the adjusted R2 of total expenditures in concurrent/prospective analyses was about 4% in the demographic model, 15%/10% in the ACGs or ADGs models and 40%/22% in the models containing EDCs. The adjusted R2 of medical/outpatient expenditure was always larger than that of pharmacy/inpatient expenditure. The performance of the ADGs plus selected EDCs models was comparable to that of the full EDCs model. When predicting expenditure for groups based on expenditure quintiles, all models underpredicted the highest group while overpredicting the other four groups. For population sub-groups selected on morbidity burden, however, the ACGs model had the best performance overall.

The prerequisite for adopting diagnosis-based risk adjustment models is that individuals' diagnosis information has to be complete and available. Given the consistently high enrollment rate (99% by the end of 2006 [43]), the high NHI-contracted rate of providers (above 90% [43, 44]), a comprehensive benefit package and the centralization of claims data, diagnosis information should be able to capture an individual's morbidity information and is readily available in Taiwan. Among the 1.25 million unique diagnoses encountered by 173,234 subjects, only 0.393% were non-grouped and 0.78% were unknown to the ACG system. Given the very small proportion in both cases, this provided face validity for the quality of diagnosis coding and for the ability of the ACG system to process claims data in Taiwan.

One possible weakness of the study is that coding may have improved over the study years, so that people with the same condition had more complete ICD codes reported if they used medical services during a later period. Therefore, we examined the number of ICD codes reported for each person from 2000 to 2003. If the number did not differ very much, it would imply that improved coding might not be a problem for this analysis. The numbers of ICD codes assigned to each patient from 2000 to 2003 were 17.86, 18.07, 18.62 and 18.20, respectively. Given this slight variation, it seemed that increased coding was not likely to be a problem for this study.

Several risk adjusters have been evaluated in Taiwan, including catastrophic disease status [45], prior utilization [45–48], diagnosis-based models [45–47, 49, 50] and pharmacy-based models [34, 51]. It was found that prior utilization yielded the highest R2. Diagnosis-based models performed better than pharmacy-based models, while the catastrophic disease status was somewhat less efficient than the pharmacy-based models. The ACG system has been examined in several studies [35, 49, 50, 52]. Given the difference in the truncation levels, statistical methods and how expenditure was calculated, it was difficult to make direct comparisons. However, the general findings that the ACGs/ADGs categorical model did not perform as well as other claims-based risk adjustment models that document individual diseases (such as the EDCs) and that the adjusted R2 of outpatient expenditure was much higher than that of inpatient expenditure, still hold in this study.

The performance of the ACGs/ADGs model in Taiwan was comparable to what the models had achieved among the general population in other countries. This in part suggested that the ACG system can be directly applied to Taiwan's NHI system. The lower R2 performance of both categorical models compared to other disease-specific diagnosis-based models is probably due to the limited numbers of variables included in both models and the difference in the grouping algorithm (ACGs has 93 mutually exclusive categories and ADGs consist of only 32 binary variables). However, after adding selected disease indicators to the ADGs model, the performance was comparable to what could be achieved by other diagnosis-based models (40% concurrently and 22% prospectively in raw total expenditure). This finding was consistent with results from previous research that patients with some common and expensive chronic diseases accounted for a relatively large proportion of healthcare costs and that adding these disease indicators improved the predictive power of the risk adjustment models [15, 17]. It may be necessary to incorporate important disease indicators if the ACG system is used. That is the approach used by the current ACG-PM model in ACG version 7.0 and after.

Quality improvement and financial balance are two of three main goals of Taiwan's NHI reform set up by the NHI's Second Generation Planning Committee [33] and both require strong risk adjustment tools. One major approach suggested by the Committee to achieve quality improvement is to release valid and understandable quality information regarding healthcare providers to the public in order that beneficiaries can make informed decisions. However, before quality information can be released, it is important and necessary to implement risk adjustment so that patient differences across healthcare organizations are controlled for and variation in the quality of care can be attributed to providers. In addition, the Planning Committee also concluded that the payment system reform should involve healthcare providers in taking on more financial responsibility for containing costs and it was suggested that the per-case payments and partial capitation should replace the current fee-for-service payment system [33]. The implementation of risk adjustment is necessary in order to ensure equity if any form of capitation or budgeted payment system is used in the future. Both of these issues are likely to be applicable to most other developed healthcare systems around the globe.

Given the availability and comprehensiveness of claims data, Taiwan has the necessary information to implement diagnosis-based risk adjustment. This study further shows that the ACG case-mix system performs comparably to diagnosis-based risk adjustment models applied in other health care systems and far better than the demographics-only model currently employed for the NHI. Therefore, incorporating diagnosis-based risk adjustment into NHI will be a major task facing healthcare policymakers and administrators in Taiwan.

Limitations


The ACG system was developed using American health insurance claims; given the differences in healthcare systems and care-seeking behaviours, it may be necessary to adjust the risk classification system inherent in the model so that it can reflect the local patterns of disease burden and health services utilization, such as Chinese medicine.

The calculation of an individual's enrollment period was a concern in this study. As only an individual's latest enrollment record was included in the yearly enrollment files starting from 2003, it was only possible to calculate the exact length of enrollment before 2003 but not afterwards. It was assumed in this study that all subjects in the 2003 file were enrolled in NHI starting from January 2003 for the following reasons: (1) the enrollment type of all enrollment records in 2003 was the same - 'transferring in' indicates that an individual had a new enrollment record because of the change in insurance identity or unit and they were all enrolled prior to this change; (2) the enrollment rate was consistently high in Taiwan; (3) the distribution of individuals' length of enrollment in 2003 and after, based on this assumption, was similar to that in 2002 or earlier. The effect of this assumption was that some subjects who were not 12-month enrollees in 2003 would be included in the study.

In this analysis people who did not have continuous enrollment over the study period were excluded. This led to some differences in characteristics between subjects in the analysis sample and the target population, which may have moderately affected the generalizability of the results. There were statistically significant differences in demographics and medical utilization between those who had full 12-month enrollment and those who did not in 2002 (Table 7) and between those having 24-month enrollment in both 2002 and 2003 and those having 12-month enrollment only in 2002 (Table 8). Those excluded from the analyses had a much higher average healthcare expenditure and more inpatient visits (even though they did not have full-year enrollment), although a higher proportion of them did not use any medical service.

Table 7 Characteristics of subjects with continuous and incomplete enrollment among 2002 enrollees (N = 181,790).
Table 8 Characteristics of subjects with continuous and incomplete enrollment in 2003 among 2002 continuous enrollees.

The reason for this seemingly conflicting result might be that the group without full enrollment mainly consisted of two different types of people: those who died during the year and those who served in the army in that year (and thus were removed from this dataset for national security reasons). People tend to consume far more medical resources shortly before they die, which raises the average expenditure of the group considerably. On the other hand, those who served in the army were mostly in their twenties and, hence, less likely to use medical services, which lowers the proportion of people using any service. Therefore, the results of this study may not be fully generalizable to those who died during the year or who were, or may have been, in the army.
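The mixture argument above can be checked with simple arithmetic. The sketch below uses entirely hypothetical subgroup sizes, non-use shares and costs (none of these figures come from the study) to show how a group made up of a few very high-cost decedents and many low-use conscripts can have both a higher mean expenditure and a higher share of non-users than full-year enrollees.

```python
def summarize(subgroups):
    """Return (share of non-users, mean expenditure per person) for a group.

    `subgroups` is a list of tuples:
    (number of people, share with no service use, mean cost among users).
    """
    n = sum(size for size, _, _ in subgroups)
    non_users = sum(size * p_none for size, p_none, _ in subgroups)
    total_cost = sum(size * (1 - p_none) * cost for size, p_none, cost in subgroups)
    return non_users / n, total_cost / n


# All numbers below are hypothetical and chosen only for illustration.
full_year = [(1000, 0.10, 20_000)]   # continuously enrolled subjects
incomplete = [
    (100, 0.05, 300_000),            # died during the year: few, very costly
    (400, 0.40, 5_000),              # military service: many, low utilization
]

full_share, full_mean = summarize(full_year)
inc_share, inc_mean = summarize(incomplete)
print(f"full-year:  non-users {full_share:.2f}, mean cost {full_mean:,.0f}")
print(f"incomplete: non-users {inc_share:.2f}, mean cost {inc_mean:,.0f}")
```

With these made-up inputs, the incomplete-enrollment group has a larger share of non-users and a larger mean expenditure at the same time, which is the pattern reported in Tables 7 and 8.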

Future research directions

Taiwan's NHI provides beneficiaries with comprehensive drug coverage. With drug information readily available, it will be interesting to evaluate how a pharmacy-based risk adjustment model, such as the ACG system's pharmacy-based morbidity groups (Rx-MG) measures, works in Taiwan and how much improvement can be made by including pharmacy information in the claims-based risk adjustment model. Furthermore, most diagnosis information used for risk adjustment models is cross-sectional, due in part to the difficulty of obtaining an individual's longitudinal diagnosis information. Given the universal and lifelong coverage under NHI in Taiwan, this setting provides a very good opportunity to examine how bringing in longitudinal claims data will affect the performance of risk adjustment models.


Given the availability of claims data and the much better performance of claims-based risk adjustment models compared with the demographics-only model, Taiwan's government should incorporate claims-based models in important policy-setting processes, such as resource allocation, predictive modeling for high-risk case finding and cost prediction. The performance of the ACG risk adjustment system in Taiwan is comparable to that found in other countries, which suggests that the ACG system may be directly applied to Taiwan's NHI even though it was originally developed using US claims data. In addition, it may be necessary to utilize the disease indicators component (EDCs) of the ACG system in order to obtain its highest performance. Given the experience in Taiwan, it is very likely that other nations will be able to apply the ACG system or other similar diagnosis-based risk adjustment tools if insurance claims or other computerized data sources capturing ambulatory and inpatient medical diagnoses are available.



Adjusted Clinical Group
Aggregated Diagnosis Group
Bureau of NHI
Expanded Diagnosis Cluster
International Classification of Diseases
mean absolute prediction error
ordinary least squares
National Health Insurance
New Taiwan dollar
predictive modelling
predictive ratio.


  1. Liu CF, Sales AE, Sharp ND, Fishman P, Sloan KL, Todd-Stenberg J, Nichol WP, Rosen AK, Loveland S: Case-mix adjusting performance measures in a veteran population: pharmacy- and diagnosis-based approaches. Health Serv Res. 2003, 38 (5): 1319-1337. 10.1111/1475-6773.00179.

    Article  PubMed  PubMed Central  Google Scholar 

  2. Thomas JW, Grazier KL, Ward K: Comparing accuracy of risk-adjustment methodologies used in economic profiling of physicians. Inquiry. 2004, 41 (5): 218-231.

    PubMed  Google Scholar 

  3. Huang IC, Dominici F, Frangakis C, Diette GB, Damberg CL, Wu AW: Is risk-adjustor selection more important than statistical approach for provider profiling? Asthma as an example. Med Decis Making. 2005, 25 (5): 20-34. 10.1177/0272989X04273138.

    Article  PubMed  Google Scholar 

  4. Ash AS, Zhao Y, Ellis RP, Schlein Kramer M: Finding future high-cost cases: comparing prior cost versus diagnosis-based methods. Health Serv Res. 2001, 36 (6 Pt 2): 194-206.

    CAS  PubMed  PubMed Central  Google Scholar 

  5. Radcliff TA, Cote MJ, Duncan RP: The identification of high-cost patients. Hosp Top. 2005, 83 (5): 17-24. 10.3200/HTPS.83.3.17-24.

    Article  PubMed  Google Scholar 

  6. Meenan RT, Goodman MJ, Fishman PA, Hornbrook MC, O'Keeffe-Rosetti MC, Bachman DJ: Using risk-adjustment models to identify high-cost risks. Med Care. 2003, 41 (5): 1301-1312. 10.1097/01.MLR.0000094480.13057.75.

    Article  PubMed  Google Scholar 

  7. FitzHenry F, Shultz EK: Health-risk-assessment tools used to predict costs in defined populations. J Healthc Inf Manag. 2000, 14 (5): 31-57.

    CAS  PubMed  Google Scholar 

  8. A Comparative Analysis of Claims-Based Tools for Health Risk Assessment. []

  9. Zhao Y, Ash AS, Ellis RP, Ayanian JZ, Pope GC, Bowen B, Weyuker L: Predicting pharmacy costs and other medical costs using diagnoses and drug claims. Med Care. 2005, 43 (5): 34-43.

    PubMed  Google Scholar 

  10. Fowles JB, Weiner JP, Knutson D, Fowler E, Tucker AM, Ireland M: Taking health status into account when setting capitation rates: a comparison of risk-adjustment methods. JAMA. 1996, 276 (5): 1316-1321. 10.1001/jama.276.16.1316.

    Article  CAS  PubMed  Google Scholar 

  11. Pietz K, Ashton CM, McDonell M, Wray NP: Predicting healthcare costs in a population of veterans affairs beneficiaries using diagnosis-based risk adjustment and self-reported health status. Med Care. 2004, 42 (5): 1027-1035. 10.1097/00005650-200410000-00012.

    Article  PubMed  Google Scholar 

  12. Parkerson GR, Harrell FE Jr, Hammond WE, Wang XQ: Characteristics of adult primary care patients as predictors of future health services charges. Med Care. 2001, 39 (5): 1170-1181. 10.1097/00005650-200111000-00004.

    Article  PubMed  Google Scholar 

  13. Newhouse JP, Manning WG, Keeler EB, Sloss EM: Adjusting capitation rates using objective health measures and prior utilization. Health Care Financ Rev. 1989, 10 (5): 41-54.

    CAS  PubMed  PubMed Central  Google Scholar 

  14. Sernyak MJ, Rosenheck R: Risk adjustment in studies using administrative data. Schizophr Bull. 2003, 29 (5): 267-271.

    Article  PubMed  Google Scholar 

  15. van Vliet RC, Ven van de WP: Towards a capitation formula for competing health insurers. An empirical analysis. Soc Sci Med. 1992, 34 (5): 1035-1048. 10.1016/0277-9536(92)90134-C.

    Article  CAS  PubMed  Google Scholar 

  16. Shen Y, Ellis RP: How profitable is risk selection? A comparison of four risk adjustment models. Health Econ. 2002, 11 (5): 165-174. 10.1002/hec.661.

    Article  PubMed  Google Scholar 

  17. Dudley RA, Medlin CA, Hammann LB, Cisternas MG, Brand R, Rennie DJ, Luft HS: The best of both worlds? Potential of hybrid prospective/concurrent risk adjustment. Med Care. 2003, 41 (5): 56-69. 10.1097/00005650-200301000-00009.

    Article  PubMed  Google Scholar 

  18. Ash AS, Ellis RP, Pope GC, Ayanian JZ, Bates DW, Burstin H, Iezzoni LI, MacKay E, Yu W: Using diagnoses to describe populations and predict costs. Health Care Financ Rev. 2000, 21 (5): 7-28.

    CAS  PubMed  PubMed Central  Google Scholar 

  19. Sales AE, Liu CF, Sloan KL, Malkin J, Fishman PA, Rosen AK, Loveland S, Paul Nichol W, Suzuki NT, Perrin E, et al: Predicting costs of care using a pharmacy-based measure risk adjustment in a veteran population. Med Care. 2003, 41 (5): 753-760. 10.1097/00005650-200306000-00008.

    PubMed  Google Scholar 

  20. Lamers LM, van Vliet RC: Multiyear diagnostic information from prior hospitalization as a risk-adjuster for capitation payments. Med Care. 1996, 34 (5): 549-561. 10.1097/00005650-199606000-00005.

    Article  CAS  PubMed  Google Scholar 

  21. Kuhlthau K, Ferris TG, Davis RB, Perrin JM, Iezzoni LI: Pharmacy-and diagnosis-based risk adjustment for children with Medicaid. Med Care. 2005, 43 (5): 1155-1159. 10.1097/01.mlr.0000182551.87591.73.

    Article  PubMed  Google Scholar 

  22. A comparative analysis of claims-based methods of health risk assessment for commercial populations. []

  23. Starfield B, Weiner J, Mumford L, Steinwachs D: Ambulatory care groups: a categorization of diagnoses for research and management. Health Serv Res. 1991, 26 (5): 53-74.

    CAS  PubMed  PubMed Central  Google Scholar 

  24. Weiner JP, Starfield BH, Steinwachs DM, Mumford LM: Development and application of a population-oriented measure of ambulatory care case-mix. Med Care. 1991, 29 (5): 452-472. 10.1097/00005650-199105000-00006.

    Article  CAS  PubMed  Google Scholar 

  25. Ash A, Porell F, Gruenberg L, Sawitz E, Beiser A: Adjusting Medicare capitation payments using prior hospitalization data. Health Care Financ Rev. 1989, 10 (5): 17-29.

    CAS  PubMed  PubMed Central  Google Scholar 

  26. Ellis RP, Ash A: Refinements to the Diagnostic Cost Group (DCG) model. Inquiry. 1995, 32 (5): 418-429.

    PubMed  Google Scholar 

  27. Pope GC, Ellis RP, Ash AS, Liu CF, Ayanian JZ, Bates DW, Burstin H, Iezzoni LI, Ingber MJ: Principal inpatient diagnostic cost group model for Medicare risk adjustment. Health Care Financ Rev. 2000, 21 (5): 93-118.

    CAS  PubMed  PubMed Central  Google Scholar 

  28. Fishman PA, Goodman MJ, Hornbrook MC, Meenan RT, Bachman DJ, O'Keeffe Rosetti MC: Risk adjustment using automated ambulatory pharmacy data: the RxRisk model. Med Care. 2003, 41 (5): 84-99. 10.1097/00005650-200301000-00011.

    Article  PubMed  Google Scholar 

  29. Sloan KL, Sales AE, Liu CF, Fishman P, Nichol P, Suzuki NT, Sharp ND: Construction and characteristics of the RxRisk-V: a VA-adapted pharmacy-based case-mix instrument. Med Care. 2003, 41 (5): 761-774. 10.1097/00005650-200306000-00009.

    PubMed  Google Scholar 

  30. The Johns Hopkins ACG Case-Mix System Reference Manual Version 7.0. 2005, Baltimore: Johns Hopkins Bloomberg School of Public Health

  31. Reid RJ, Roos NP, MacWilliam L, Frohlich N, Black C: Assessing population health care need using a claims-based ACG morbidity measure: a validation analysis in the Province of Manitoba. Health Serv Res. 2002, 37 (5): 1345-1364. 10.1111/1475-6773.01029.

    Article  PubMed  PubMed Central  Google Scholar 

  32. Carlsson L, Borjesson U, Edgren L: Patient based 'burden-of-illness' in Swedish primary health care. Applying the Johns Hopkins ACG case-mix system in a retrospective study of electronic patient records. Int J Health Plann Manage. 2002, 17 (5): 269-282. 10.1002/hpm.674.

    Article  CAS  PubMed  Google Scholar 

  33. NHI's Second Generation Planning Committee: Towards A National Health Insurance System Where Rights and Duties Are Met: The Final Report by NHI's Second Generation Planning Committee. 2004, In NHI's Second Generation Planning Committee, Executive Yuan. Taiwan: ROC

    Google Scholar 

  34. Hsieh M: Refining Diagnosis-Based Risk Adjustment Models with Prescription Information. Taipei. 2005, National Taiwan University

    Google Scholar 

  35. Hung S: Using ACG Case-Mix System to Evaluate the Ambulatory Utilization of Liver Disease in Taiwan. Taipei. 2006, National Yang-Ming University

    Google Scholar 

  36. Cousins MS, Shickle LM, Bander JA: An introduction to predictive modeling for disease management risk stratification. Dis Manag. 2002, 5 (5): 157-167. 10.1089/109350702760301448.

    Article  Google Scholar 

  37. Hu G, Root M: Accuracy of prediction models in the context of disease management. Dis Manag. 2005, 8 (5): 42-47. 10.1089/dis.2005.8.42.

    Article  PubMed  Google Scholar 

  38. Blough DK, Madden CW, Hornbrook MC: Modeling risk using generalized linear models. J Health Econ. 1999, 18 (5): 153-171. 10.1016/S0167-6296(98)00032-0.

    Article  CAS  PubMed  Google Scholar 

  39. Manning WG, Mullahy J: Estimating log models: to transform or not to transform?. J Health Econ. 2001, 20 (5): 461-494. 10.1016/S0167-6296(01)00086-8.

    Article  CAS  PubMed  Google Scholar 

  40. Buntin MB, Zaslavsky AM: Too much ado about two-part models and transformation? Comparing methods of modeling Medicare expenditures. J Health Econ. 2004, 23 (5): 525-542. 10.1016/j.jhealeco.2003.10.005.

    Article  PubMed  Google Scholar 

  41. Iezzoni LI: Risk adjustment for measuring healthcare outcomes. 1997, Chicago: Health Administration Press, 2

    Google Scholar 

  42. Greenwald LM: Medicare risk-adjusted capitation payments: from research to implementation. Health Care Financ Rev. 2000, 21 (5): 1-5.

    CAS  PubMed  PubMed Central  Google Scholar 

  43. Department of Health: Health Statistics in Taiwan, 2006: Part III. Current Situation of Medical Facilities, Medical Personnel, and Medical Services. 2007, Taipei: Department of Health, ROC

    Google Scholar 

  44. Bureau of National Health Insurance: National Health Insurance Annual Statistical Report: 2006. 2007, Taipei: Bureau of National Health Insurance, Department of Health, ROC

    Google Scholar 

  45. Chang RE, Lin W, Hsieh CJ, Chiang TL: Healthcare utilization patterns and risk adjustment under Taiwan's National Health Insurance system. J Formos Med Assoc. 2002, 101 (5): 52-59.

    PubMed  Google Scholar 

  46. Chang RE, Lai CL: Use of diagnosis-based risk adjustment models to predict individual health care expenditure under the National Health Insurance system in Taiwan. J Formos Med Assoc. 2005, 104 (5): 883-890.

    PubMed  Google Scholar 

  47. Chang S: Development of Risk-Adjusted Diagnostic Groups Based on All Diagnostic Information and Applications to Risk Adjustment Models. 2006, Taipei: National Taiwan University

    Google Scholar 

  48. Tsai WD, Lo JC: Capitation payment system: risk factors estimation. Acad Econ Pap. 2000, 28 (3): 31-261.

    Google Scholar 

  49. Lee WC, Huang TP: Explanatory ability of the ACG system regarding the utilization and expenditure of the national health insurance population in Taiwan--a 5-year analysis. J Chin Med Assoc. 2008, 71 (5): 191-199. 10.1016/S1726-4901(08)70103-5.

    Article  PubMed  Google Scholar 

  50. Lee WC: Quantifying morbidities by Adjusted Clinical Group system for a Taiwan population: a nationwide analysis. BMC Health Serv Res. 2008, 8: 153-10.1186/1472-6963-8-153.

    Article  PubMed  PubMed Central  Google Scholar 

  51. Liu F: Ambulatory Risk-Adjusted Model Based on Prescription of Chronic Disease. 2004, Taipei: National Taiwan University

    Google Scholar 

  52. Lin C: Comparing the Ability of Different Diagnosis-Based Risk Adjusters to Predict Individuals' Ambulatory Expenditures under the National Health Insurance. 2006, Taipei: National Taiwan University

    Google Scholar 



The authors thank Dr Weng-Foung Huang, a professor at National Yang-Ming University, Taipei, Taiwan, for his collaboration in this project. This study is based in part on data from the National Health Insurance Research Database provided by the Bureau of National Health Insurance, Department of Health, and managed by the National Health Research Institutes in Taiwan. The interpretation and conclusions contained herein do not represent those of the Bureau of National Health Insurance, the Department of Health or the National Health Research Institutes in Taiwan.

Author information

Corresponding author

Correspondence to Hsien-Yen Chang.

Additional information

Competing interests

HYC worked as a part-time research assistant for Dr Jonathan Weiner, one of the founders of the ACG system, while pursuing his PhD at Johns Hopkins University. He is currently employed as a postdoctoral fellow at Johns Hopkins University, with part of the funding coming from the ACG team. JW is one of the developers of the ACG system. The Johns Hopkins University receives royalties for non-academic use of software based on the ACG methodology.

Authors' contributions

HYC designed the study, cleaned the data, performed the statistical analysis and drafted the manuscript. JW provided insight into the concept and design of the study and revised the manuscript critically for important intellectual content. All authors read and approved the final manuscript.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Chang, HY., Weiner, J.P. An in-depth assessment of a diagnosis-based risk adjustment model based on national health insurance claims: the application of the Johns Hopkins Adjusted Clinical Group case-mix system in Taiwan. BMC Med 8, 7 (2010).
