  • Research article
  • Open access

Racial differences in healthcare expenditures for prevalent multimorbidity combinations in the USA: a cross-sectional study

Abstract

Background

We aimed to model total charges for the most prevalent multimorbidity combinations in the USA and assess model accuracy across Asian/Pacific Islander, African American, Biracial, Caucasian, Hispanic, and Native American populations.

Methods

We used Cerner HealthFacts data from 2016 to 2017 to model the cost of previously identified prevalent multimorbidity combinations among 38 major diagnostic categories for cohorts stratified by age (45–64 and 65 +). Examples of prevalent multimorbidity combinations include lipidemia with hypertension or hypertension with diabetes. We applied generalized linear models (GLM) with gamma distribution and log link function to total charges for all cohorts and assessed model accuracy using residual analysis. In addition to the 38 major diagnostic categories, our adjusted model incorporated demographic, BMI, hospital, and census division information.

Results

The mean ages were 55 (45–64 cohort, N = 333,094) and 75 (65 + cohort, N = 327,260), respectively. We found actual total charges to be highest for African Americans (means $78,544 [45–64], $176,274 [65 +]) and lowest for Hispanics (means $29,597 [45–64], $66,911 [65 +]). African American race was strongly predictive of higher costs (p < 0.05 [45–64]; p < 0.05 [65 +]). Each total charge model had a good fit. With African American as the index race, only Asian/Pacific Islander and Biracial were non-significant in the 45–64 cohort and Biracial in the 65 + cohort. Mean residuals were lowest for Hispanics in both cohorts, highest in African Americans for the 45–64 cohort, and highest in Caucasians for the 65 + cohort. Model accuracy varied substantially by race when multimorbidity grouping was considered. For example, costs were markedly overestimated for 65 + Caucasians with multimorbidity combinations that included heart disease (e.g., hypertension + heart disease and lipidemia + hypertension + heart disease). Additionally, model residuals varied by age/obesity status. For instance, model estimates for Hispanic patients were highly underestimated for most multimorbidity combinations in the 65 + with obesity cohort compared with other age/obesity status groupings.

Conclusions

Our findings demonstrate the need for more robust models to ensure the healthcare system can better serve all populations. Future cost modeling efforts will likely benefit from factoring in multimorbidity type stratified by race/ethnicity and age/obesity status.


Background

The scarcity of economic models in multimorbidity research represents a serious challenge [1]. A significant limitation of current models is that most do not consider the varying costs of different disease combinations [2]. A better understanding of the burden of multimorbidity through cost assessment for various multimorbidity combinations will assist in targeting highest cost patients for intensive interventions [1]. Most high healthcare utilizers have at least two chronic conditions [3]. Among US adults, multimorbidity is estimated to have a prevalence of 58.4% [4]. Addressing the economic burden of multimorbidity is crucial to developing effective strategies for managing care.

Factoring in multimorbidity has been shown to explain healthcare expenditures better than models based on population characteristics (size and demographics) alone [5]. For example, in the case of diabetes, different comorbidities have a varying impact on cost [6]. Previous studies evaluating the impact of specific multimorbidity combinations on expenditures have so far focused on a few diseases [2]. The most prevalent multimorbidities in the USA represent a broad spectrum of diseases [7]. Effective care planning and resource management require accurately projecting patient costs for these disease combinations [2, 8].

Modeling the associations of most prevalent multimorbidity combinations with healthcare expenditures is essential to further aging research because the majority of the 65 + population have two or more chronic conditions and account for 47% of Medicare spending [9]. Interventions aimed at slowing the aging process need to target patients with multiple diseases to be effective [10]. Mercer et al. (2016) found that multimorbidity-focused interventions are cost-effective for this patient population [11].

Clay et al. found that among African American men, comorbidity clusters are associated with poor outcomes, including poor health-related quality of life, disability, and higher mortality rate. As these authors suggest, better modeling of expenditures will be essential for improving the health of racial and ethnic minorities [12]. Multimorbidities exacerbate health inequalities as underserved populations are at greater risk for multimorbidity, increasing their disease burden [13]. Despite a clear need to better understand health disparities, research shows that even robust methods can be susceptible to bias. Predictive models derived from primarily homogeneous populations may be poorly generalizable and can exacerbate racial/ethnic disparities [14]. Cost estimates for multimorbidity must therefore account for racial/ethnic bias in the underlying models. To date, no large-scale study of the expenditures associated with common multimorbidity combinations has assessed the accuracy of model predictions across races and obesity status.

This research compares total healthcare expenditures for the most prevalent multimorbidity combinations across racial/ethnic groups. We also aim to determine whether multimorbidity expenditure models have similar accuracy across racial/ethnic groups after adjustment for potential confounding factors. In addition, the study aims to assess possible differences in total charges for 45–64 versus 65 + patients, as the incidence of chronic disease rises exponentially with age [15]. This study is among the first to model total charges associated with the most prevalent multimorbidity combinations by race/ethnicity. Our previous work identified the most prevalent multimorbidity combinations by race/ethnicity, serving as the foundation for this current research [7]. Our primary objective is to identify the expected total charges associated with the most prevalent multimorbidity combinations by race/ethnicity. Additionally, we seek to assess differences in expenditures for these multimorbidity combinations and differences in model accuracy by race/ethnicity.

Methods

Research design

This cross-sectional study employed de-identified data for 2016–2017 from the Cerner HealthFacts® data warehouse. The dataset includes electronic health records (EHR) representing over 490 million patient encounters for over 70 million patients treated at hospitals and clinics at 792 non-affiliated healthcare systems throughout the USA. Variable categories include encounter type, medical history, diagnoses, labs, prescriptions, patient demographics, clinic type, and procedures. Inclusion criteria for patients were the following: (1) age 45 +, (2) body mass index (BMI) value present and between 18.5 and 206, (3) EHR-identified race category, (4) EHR-identified gender, (5) patient encounters not missing total charges, and (6) an encounter with an International Classification of Diseases-10th Version-Clinical Modification (ICD-10-CM) diagnosis code among the 38 broad diagnoses that make up the most prevalent multimorbidities in the USA. These diagnoses involved in prevalent multimorbidities were identified in our previous research and are listed in Additional file 1: Table S1 [7]. In our previous research, we identified disease combinations with a frequency above the 5% threshold that were shared by all races/ethnicities for each age/obesity level. The current work considers the economic impact of those multimorbidity combinations in their respective age/obesity level cohort. We aggregated ICD-10-CM sub-classifications of diseases into broad categories for all 38 diagnoses; for example, I11.9 (hypertensive heart disease without heart failure) fell under the broader parent category I11 (hypertensive heart disease). Using a prevalence-based approach for assessing multimorbidity validated through prior research [16,17,18], we defined multimorbidity as the presence of two or more ICD-10-CM diagnosis codes in an individual during the 2-year (i.e., 2016–2017) study period.
Our upper BMI cutoff reflects reports that the highest recorded BMI value was between 206 and 224, so we considered a BMI value valid if it was 206 or less [19, 20]. Since our interest was addressing multimorbidities associated with obesity, underweight patients (BMI < 18.5) were excluded as they might have different multimorbidity issues. The 2-year assessment period was employed to maximize the probability of identifying all major prevalent multimorbidities experienced by individuals during the study period. Since diseases might not be diagnosed at the same visit or within the same year, this longer period allowed us to capture more data than would a single year.
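
The aggregation of ICD-10-CM sub-classifications and the two-or-more-diagnoses definition described above can be illustrated with a minimal Python sketch. The function names and the small TRACKED set are hypothetical and are not the study's code or its 38-category list (Additional file 1: Table S1). The sketch also assumes that the diagnoses are counted as distinct parent categories after aggregation.

```python
def parent_category(icd10cm_code: str) -> str:
    """Collapse an ICD-10-CM code to its three-character parent category."""
    return icd10cm_code.strip().upper().split(".")[0][:3]


def has_multimorbidity(codes, tracked):
    """Return True when the codes map to two or more distinct tracked categories.

    Assumption: diagnoses are counted after aggregation to parent categories.
    """
    categories = {parent_category(code) for code in codes}
    return len(categories & set(tracked)) >= 2


# Hypothetical subset of tracked categories, for illustration only.
TRACKED = {"I10", "I11", "E11", "E78"}

print(parent_category("I11.9"))
print(has_multimorbidity(["I11.9", "E11.65"], TRACKED))
print(has_multimorbidity(["I11.9", "I11.0"], TRACKED))  # same parent category
```

Under this assumption, two sub-classifications of the same parent category (the last call) do not count as multimorbidity.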

Ethical considerations

The data were de-identified and excluded the 16 identifiable variables that necessitate Institutional Review Board (IRB) approval for access. Because the study employed only de-identified data, it was not considered human subjects research. Per the National Institutes of Health Office of Human Subjects Research policy, the University of Tennessee Health Science Center (UTHSC) IRB determined that the research was exempt. We performed this research following all other relevant research requirements.

Independent variables

Demographic, multimorbidity, and healthcare utilization variables were the primary independent variables. Demographic variables included race, age, gender, BMI, payer information, and rural or urban status. BMI was treated as a dichotomous variable, classifying patients with obesity (30 ≤ BMI < 206) and without obesity (18.5 ≤ BMI < 30). When assessing the financial burden across races in adults aged 45–64 and 65 +, controlling for factors impacting disease severity and socioeconomic issues affecting cost is critical [21]. Therefore, we assessed payer status, rurality, length of stay, and the Elixhauser Comorbidity Index (ECI) score. Hospital information and healthcare usage variables included the number of inpatient and outpatient visits, emergency visits, teaching hospital status, care-type status, and total hospital admission days, if any. Because ethnicity is not a separate variable in the Cerner HealthFacts database, Hispanic is listed as a racial category. Other racial categories included Caucasian, African American, Biracial, Asian/Pacific Islander, and Native American. Patients were stratified into two cohorts (age 45–64 and age 65 +) according to their age at the beginning of the study. Only patients with an EHR-identified gender (i.e., male or female) were included in the study. Additional file 1: Table S2 clarifies the remaining variables.
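
As an illustration of the stratification described above, the following Python sketch assigns an obesity status and an age cohort to a patient. The function names and labels are hypothetical. Treating a BMI of exactly 206 as valid is an assumption, since the text states the upper bound both as "206 or less" and as "< 206".

```python
def obesity_status(bmi: float) -> str:
    """Dichotomize BMI: obesity is BMI >= 30, no obesity is 18.5 <= BMI < 30."""
    # Assumption: the upper cutoff of 206 is treated as inclusive.
    if not 18.5 <= bmi <= 206:
        raise ValueError("BMI outside the range included in the study")
    return "obesity" if bmi >= 30 else "no obesity"


def age_cohort(age_at_study_start: int) -> str:
    """Assign the 45-64 or 65+ cohort from age at the beginning of the study."""
    if age_at_study_start < 45:
        raise ValueError("patients younger than 45 were not included")
    return "45-64" if age_at_study_start <= 64 else "65+"


print(obesity_status(31.2), age_cohort(58))
print(obesity_status(24.0), age_cohort(70))
```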

Dependent variable

Our primary dependent variable was the sum of total charges for all encounters over the 2-year study period for each patient. Healthcare utilization information included the total charges for each encounter. We categorized patient encounters into one of three categories: inpatient, outpatient, or emergency visit. We chose total charges as our dependent variable since it is the amount that reflects the expense of the service received before any discounts or negotiations. Arora et al. (2015) described the challenge of answering the question “how much does healthcare cost?” and divided healthcare expenditures into three categories: price or charge, cost, and reimbursement [22]. Price or charge is defined as the amount billed by a provider for a healthcare service. Hospitals in the USA use a price list called a chargemaster, which lists all billable services before any discounts or negotiation to arrive at the price charged and which varies across hospitals [23, 24]. The definition of cost varies with perspective. For the provider, the cost is simply the expense incurred to deliver healthcare services to the patient; for the payer, it is the amount that they will pay providers for these services; and for the patient, it is the amount they pay out-of-pocket for healthcare services rendered. Finally, reimbursement is defined as the amount paid to a provider by a third party (the payer) for the services rendered to the patient. Due to different agreements and negotiations between hospital providers and payers, cost and reimbursement can vary across patients receiving the same service from the same hospital [22, 25].

Missing data

Due to their minimal numbers, we deleted hospitals with no census division or rural/urban status information. For hospitals with missing teaching facility information, we imputed the most prevalent category [26]. We excluded encounters with $0 listed for total charges from the study. According to the Cerner HealthFacts® database data dictionary, total charges of $0 indicate that the administrative staff did not enter the billing information into the database. We compared demographics for sources with missing cost data and those with cost data present; the demographics were not statistically different. For ease of interpretation, we removed a patient record from the study if the patient was treated in two different census divisions or in both a rural and an urban hospital.
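
The construction of the dependent variable under these exclusions can be sketched in Python as follows. The record layout (dictionary keys) and the function name are hypothetical. The sketch also assumes that the census-division check is applied to the encounters retained after dropping $0 charges; the text does not specify the order.

```python
from collections import defaultdict


def patient_total_charges(encounters):
    """Sum total charges per patient over all retained encounters.

    Each encounter is a dict with the hypothetical keys 'patient_id',
    'total_charges', and 'census_division'.
    """
    totals = defaultdict(float)
    divisions = defaultdict(set)
    for enc in encounters:
        # $0 (or missing) charges mean billing information was never entered.
        if not enc["total_charges"]:
            continue
        totals[enc["patient_id"]] += enc["total_charges"]
        divisions[enc["patient_id"]].add(enc["census_division"])
    # Drop patients treated in more than one census division.
    return {pid: total for pid, total in totals.items() if len(divisions[pid]) == 1}


example = [
    {"patient_id": "a", "total_charges": 100.0, "census_division": "Pacific"},
    {"patient_id": "a", "total_charges": 0.0, "census_division": "Pacific"},
    {"patient_id": "a", "total_charges": 50.0, "census_division": "Pacific"},
    {"patient_id": "b", "total_charges": 10.0, "census_division": "Pacific"},
    {"patient_id": "b", "total_charges": 20.0, "census_division": "Mountain"},
]
print(patient_total_charges(example))
```

The same pattern would extend to the rural/urban exclusion by tracking a second set per patient.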

Statistical analysis

We examined the distribution of our dependent variable, total charges over the 2 years, and checked for skewness and outliers. Because the model included many variables, we also tested for multicollinearity, a linear relationship between two or more variables [27]. We used a generalized variance inflation factor (GVIF) analysis, which is appropriate for a mix of categorical and numerical variables, to identify variables with high multicollinearity [28, 29]. We computed GVIF^(1/(2·Df)) scores using the car R package and removed the variable with the highest score [30]. We repeated this process until no variable had a score above two, a conservative threshold for considering multicollinearity [29].
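
For reference, the score used in this screening step can be written out as below. This is the standard generalized variance inflation factor definition, not a formula taken from the study; the symbols are ours. Here R_11 is the correlation matrix of the columns coding term j, R_22 that of the remaining regressors, R that of all regressors, and Df_j the number of columns coding term j.

```latex
\mathrm{GVIF}_j = \frac{\det(\mathbf{R}_{11})\,\det(\mathbf{R}_{22})}{\det(\mathbf{R})},
\qquad
s_j = \mathrm{GVIF}_j^{\,1/(2\,\mathrm{Df}_j)}
```

For a term with a single degree of freedom, GVIF reduces to the ordinary VIF and s_j is its square root, so a threshold of two on s_j corresponds to a VIF of four.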

We used regression analysis to compare the total charges of the most prevalent multimorbidity combinations by race/ethnicity. A generalized linear model (GLM) with gamma distribution and log link function was applied to estimate the total charges based on the morbidity variables [31, 32]. ECI rank was categorized into three categories based on quantile range: low, medium, and high, indicating comorbidity severity. We ran a 3-way ANOVA test on the model residuals to determine whether there was an interaction effect between BMI and race, as a combined effect, and ECI ranks on total charges (the dependent variable).
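
To make the model form concrete: a gamma GLM with a log link assumes log E[Y | x] = xβ, with variance proportional to the squared mean. The Python sketch below fits such a model with one predictor by iteratively reweighted least squares (IRLS). It is a toy illustration under stated assumptions and not the study's R implementation; the function name and the example charges are hypothetical.

```python
import math


def fit_gamma_log_glm(x, y, max_iter=100, tol=1e-10):
    """Fit log(E[y]) = b0 + b1 * x for positive y by IRLS.

    With a gamma family and a log link the IRLS weights are constant, so each
    step is an ordinary least-squares fit of the working response on x.
    """
    n = len(y)
    b0, b1 = math.log(sum(y) / n), 0.0  # start from the intercept-only fit
    x_bar = sum(x) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    for _ in range(max_iter):
        eta = [b0 + b1 * xi for xi in x]
        mu = [math.exp(e) for e in eta]
        # Working response for the log link: eta + (y - mu) / mu.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        z_bar = sum(z) / n
        s_xz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
        new_b1 = s_xz / s_xx
        new_b0 = z_bar - new_b1 * x_bar
        change = abs(new_b0 - b0) + abs(new_b1 - b1)
        b0, b1 = new_b0, new_b1
        if change < tol:
            break
    return b0, b1


# Hypothetical charges for two groups coded 0 and 1.
charges = [100.0, 200.0, 300.0, 400.0, 500.0, 600.0]
group = [0, 0, 0, 1, 1, 1]
b0, b1 = fit_gamma_log_glm(group, charges)
print(round(math.exp(b0), 2), round(math.exp(b0 + b1), 2))
```

With a single binary predictor the exponentiated coefficients recover the two group means, which is why coefficients from a log-link model are read as multiplicative effects on expected charges.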

Results

Demographics

In this study, most patients in both age cohorts were female. Tables 1 and 2 show the breakdown of demographics by race for the 45–64 and 65 + cohorts, respectively. The percentages were calculated relative to the whole patient population. The average age for the 45–64 cohort (333,094 patients) was 55 years and for the 65 + cohort (327,260 patients) was 75 years.

Table 1 Demographics of the 45–64 cohort
Table 2 Demographics of the 65 + cohort

Outcomes

The breakdown of visit type, mean emergency room visits, mean ECI score, mean admission days, and mean charges for the 45–64 and 65 + cohorts are shown in Table 3. The Cerner HealthFacts® database included data from 1,500,580 45–64 patients and 1,213,069 65 + patients for the period 2016–2017. We excluded some of the ICD-9-CM diagnosis codes and removed patient encounters with missing total charges, leaving 647,801 patients in the 45–64 cohort and 534,534 patients in the 65 + cohort. Patients excluded due to not having a morbidity, BMI, race, gender, or age value totaled 189,213 in the 45–64 cohort and 98,222 in the 65 + cohort. A further 68,856 patients in the 45–64 cohort and 60,388 in the 65 + cohort were excluded due to duplicate hospital information. After applying these inclusion/exclusion criteria, 333,094 patients remained in the 45–64 cohort and 327,260 in the 65 + cohort. A complete breakdown of the effect of our exclusion/inclusion criteria on the patient population is displayed in Additional file 1: Fig. S1.

Table 3 Outcomes of the 45–64 and 65 + cohorts

Due to the skewness of the dependent variable (mean total charges), we performed an outlier test and used the interquartile method to eliminate outliers. After testing for collinearity, we removed the total number of morbidities variable from the analysis, as it was an aliased coefficient in the model, meaning that this variable was equivalent to one or more other variables. We determined the unadjusted and adjusted models’ residuals for the 45–64 and 65 + cohorts using a generalized linear model with gamma distribution and log link function (Additional file 1: Fig. S1) and assessed model performance by inspecting the residuals’ quantile–quantile (Q-Q) plots in R. Due to the skewness of the dependent variable, total charges, these models did not fit the data well. To obtain a better-fitting model, we tested log and exponential transformations. We then inspected the residuals’ Q-Q plots to measure model performance (Additional file 1: Fig. S2 and Additional file 1: Fig. S3). We selected the exponential model as optimal for this dataset, since it exhibited the lowest sum of squared errors (SSE) in both cohorts [33].
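
The interquartile method mentioned above can be sketched in Python as follows. The text does not state the multiplier applied to the interquartile range; the conventional value of 1.5 used here is an assumption, as are the function name and the example values.

```python
import statistics


def remove_iqr_outliers(values, k=1.5):
    """Keep values within [Q1 - k * IQR, Q3 + k * IQR].

    Assumption: k = 1.5, the conventional fence multiplier.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Hypothetical charges with one extreme value.
charges = [10, 12, 13, 14, 15, 16, 18, 1000]
print(remove_iqr_outliers(charges))
```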

The model

The healthcare total charges model in the 45–64 and 65 + cohorts had adjusted R-squared values of 0.3906 and 0.4695, respectively. Tables 4 and 5 show the model estimates for key demographic and patient hospital utilization factors. African American was selected as the index race. The Asian/Pacific Islander and Biracial variables (Table 4) were not significant in the 45–64 cohort. The Caucasian, Hispanic, and Native American races had negative total charges estimates. In the 65 + cohort (Table 5), the Biracial variable was not significant. The Asian/Pacific Islander and the Caucasian races had positive total charges estimates, while the Hispanic and the Native American races had negative estimates. Additional file 1: Tables S3 and S4 include model estimates for hospital-related variables and the 38 diagnoses that comprise the most prevalent multimorbidities across race/ethnicity in the USA. For each model, all diagnosis estimates were significant. All diagnosis estimates for the 45–64 cohort (Additional file 1: Table S3) were positive except heart failure, vitamin D deficiency, and chronic kidney disease, which were all slightly negative. Hospital teaching status was not a significant predictor. For the 65 + cohort (Additional file 1: Table S4), all diagnoses were positive except vitamin D deficiency and chronic kidney disease. Living in the South Atlantic region was not a significant predictor for healthcare charges for this cohort.

Table 4 The 45–64 cohort total charge model estimates
Table 5 The 65 + cohort total charge model estimates

The overall mean of the absolute value of the model residuals and the standard deviations for the 45–64 and 65 + cohorts are shown in Table 6. The model best predicted the total charges for the Hispanic race and was least accurate for the African American race. This table also displays extreme standard deviation values for the model’s residuals. All standard deviations were greater than the mean, and some races exhibited remarkably high standard deviations.
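
A summary of this kind can be computed as in the following Python sketch. The function name, the group labels, and the numbers are hypothetical. The sketch assumes that the reported standard deviation is taken over the absolute residuals, which the text does not specify.

```python
import statistics
from collections import defaultdict


def residual_summary(rows):
    """Return {group: (mean absolute residual, standard deviation)}.

    rows is an iterable of (group, actual_charge, estimated_charge).
    Assumption: the standard deviation is computed on absolute residuals.
    """
    by_group = defaultdict(list)
    for group, actual, estimated in rows:
        by_group[group].append(abs(actual - estimated))
    return {
        group: (statistics.mean(res), statistics.stdev(res))
        for group, res in by_group.items()
    }


# Hypothetical values for two groups.
rows = [
    ("Group A", 100.0, 90.0),
    ("Group A", 100.0, 130.0),
    ("Group B", 50.0, 50.0),
    ("Group B", 50.0, 54.0),
]
print(residual_summary(rows))
```

A standard deviation larger than the mean, as in Group B here, indicates that a few large misses dominate the average error for that group.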

Table 6 Mean model residuals for the 45–64 and 65 + cohorts

The actual vs. the estimated mean total charges over the study period (2016–2017) for the most prevalent multimorbidity combinations shared by all races in the 45–64 and 65 + cohorts, respectively, are shown in Fig. 1. The multimorbidity combinations varied by age and obesity status as they represent the multimorbidities frequent at or above 5% for all racial/ethnic groups within each cohort based on our previous work. For the hypertension + GERD multimorbidity combination in the 45–64 cohort with obesity, the actual and estimated mean total charges differed by a factor of almost two. In the 65 + cohort with obesity, the hypertension + heart disease, lipidemia + hypertension + heart disease, and lipidemia + heart disease multimorbidities exhibited the largest gaps between the actual and estimated mean total charges, again differing by a factor of almost two. In general, the mean total charges and the gap between actual and estimated mean total charges were higher in the 65 + cohorts than in the 45–64 cohorts.

Fig. 1

Actual and estimated mean total charges for the most prevalent multimorbidity combinations by race in the 45–64 and 65 + cohorts, with and without obesity. Abbreviations: BackP, severe back pain; GERD, gastroesophageal reflux disease; HD, heart disease; HypT, hypertension; Lip, lipidemia; LimbP, pain in limb, hand, foot, fingers, and toes; T2D, type 2 diabetes mellitus

The mean residuals by race for each of the most prevalent multimorbidity combinations in both cohorts with and without obesity are shown in Fig. 2. Due to many prevalent multimorbidity combinations in the 65 + cohorts, we examined the residuals by race for only the top seven most prevalent. The mean model residuals for the shared multimorbidity patterns by race in the 45–64 cohort without obesity are shown in Fig. 2a. The model overestimated the total charges for both shared multimorbidities for the African American race and one shared multimorbidity for the Hispanic race, and it underestimated the total charges for the Caucasian race. The best estimates were for the Native American race. In contrast, the mean model residuals for the shared multimorbidity patterns by race in the 45–64 cohort with obesity (Fig. 2b) indicated that the model highly overestimated the total charges for the GERD + hypertension multimorbidity pattern in the African American race. The model underestimated all of the total charges for the multimorbidity patterns for the Hispanic race, while the estimates for the Native American race fluctuated between over- and underestimation.

Fig. 2

Mean model residuals for shared multimorbidity patterns by race in the 65 + and 45–64 cohorts with and without obesity. * =  − 1,749,059 residual value, ** =  − 316,995 residual value (these values were too large to display in the figure). Abbreviations: BackP, severe back pain; CKD, chronic kidney disease; GERD, gastroesophageal reflux disease; HD, heart disease; HypT, hypertension; Lip, lipidemia; LimbP, pain in limb, hand, foot, fingers, and toes; OJD, other joint disorder; T2D, type 2 diabetes mellitus

The mean model residuals for the shared multimorbidity patterns by race in the 65 + cohort without obesity are shown in Fig. 2c. The total charges for the African American race were overestimated or highly overestimated, while almost all of those for the Asian/Pacific Islander race were highly underestimated. The model also overestimated two patterns for the Hispanic race and highly underestimated the remainder. In contrast, the mean model residuals for the shared multimorbidity patterns by race in the 65 + cohort with obesity (Fig. 2d) indicated that the model underestimated all of the total charges for the Asian/Pacific Islander race. The Hispanic race exhibited the most highly underestimated patterns, while the Caucasian race had one triad pattern that was substantially overestimated (− 316,995). The model estimated two multimorbidity patterns for the African American race better than others but overestimated the remaining patterns. The remaining patterns were either highly overestimated or underestimated.
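
The per-race, per-pattern comparison reported above can be sketched in Python as below. The sketch assumes the residual is defined as actual minus estimated charges, which matches the text's description of large negative residuals as overestimates. The function name, group labels, and numbers are hypothetical.

```python
from collections import defaultdict


def mean_residual_by_group_and_pattern(rows):
    """Return {(group, pattern): (mean residual, direction)}.

    rows is an iterable of (group, pattern, actual_charge, estimated_charge).
    Assumption: residual = actual - estimated, so a negative mean residual
    means the model overestimated charges.
    """
    cells = defaultdict(lambda: [0.0, 0])
    for group, pattern, actual, estimated in rows:
        cell = cells[(group, pattern)]
        cell[0] += actual - estimated
        cell[1] += 1
    result = {}
    for key, (total, count) in cells.items():
        mean = total / count
        if mean < 0:
            direction = "overestimated"
        elif mean > 0:
            direction = "underestimated"
        else:
            direction = "exact"
        result[key] = (mean, direction)
    return result


# Hypothetical values for one pattern in two groups.
rows = [
    ("Group A", "HypT+GERD", 100.0, 150.0),
    ("Group A", "HypT+GERD", 120.0, 130.0),
    ("Group B", "HypT+GERD", 200.0, 150.0),
]
print(mean_residual_by_group_and_pattern(rows))
```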

The model estimated the mean total charges for the hypertension + GERD multimorbidity pattern as much higher than the actual charges. The mean model residual for the same cohort showed that charges were highly overestimated for the African American race. The model also estimated the mean total charges for the lipidemia + hypertension + heart disease triad multimorbidity pattern as much higher than the actual charges. The mean model residual for the same cohort showed that charges were highly overestimated for the Caucasian race. The multimorbidities with heart disease also showed extreme differences in actual vs. estimated mean total charges and extreme over- or underestimation in the mean model residuals of certain races.

The variability of mean model residuals for the African American race increased with obesity in the 45–64 cohort, yet variability decreased with obesity in the 65 + cohort. The model best estimated the lipidemia + hypertension multimorbidity pattern for this race across all cohorts. The Asian/Pacific Islander race model residuals were more extreme for patients in the 65 + cohort without obesity than those with obesity. Although the Asian/Pacific Islander race exhibited negative model estimates compared to the African American race, when the mean model residuals were categorized as a function of multimorbidity, some combinations were better estimated than others. For the Caucasian race, the variability of mean model residuals was comparable by weight class in the 45–64 cohort, but it was more accurate for patients without obesity in the 65 + cohort. The most accurate overall model estimates were for the Native American race, although variability in mean model residuals increased with obesity and aging. The mean model residuals also increased substantially by age group for the African American, Hispanic, and Native American races.

We conducted a post hoc sensitivity analysis to determine if some groups’ small numbers caused the extreme average residual values. We tested this hypothesis by re-running our models with the smaller groups (Native American and Hispanic) combined. Additional file 1: Fig. S4 displays the results. The extreme values remain extreme, and two additional multimorbidities (HypT_HD and Lip_HD) had extreme values for the Caucasian group in the 65 + with obesity cohort.

For the 45–64 cohort, BMI and race as a combined effect were not significant (p-value = 0.870) when analyzed by 3-way ANOVA. After we removed this interaction, only ECI rank was significant (p-value = 0.0353). For the 65 + cohort, BMI and race as a combined effect were likewise not significant (p-value = 1.000), and only ECI rank was significant (p-value < 0.001).

Discussion

This study adds significantly to previous literature by elucidating the complex relationships between multimorbidity and costs across racial groups [34, 35]. The well-fit cost models developed through this study show that the accuracy of estimating cost varies across race and by multimorbidity, age group, and obesity status. In particular, the models exhibited varying patterns of over- or underestimating total charges for specific racial groups. This suggests that more robust methods will be necessary to ensure accurate cost capture, particularly for vulnerable populations. Capturing such a complex interplay is challenging. While this type of modeling has some limitations, it can help to identify the costs associated with multimorbidities and to project future patient costs. This study also showed that aging does not have a straightforward relationship with cost estimates for certain races. For example, with African American as the index race in both models, the Caucasian race had a negative total charges estimate for the 45–64 cohort and a positive total charges estimate for the 65 + cohort.

While previous literature notes that levels and most prevalent categories of multimorbidity vary by race [36], our research shows that the relationship between cost and multimorbidity is inconsistent for each racial group. Total charge estimates for some multimorbidity patterns were less accurate for some groups than for others. Additionally, our study demonstrated that some racial groups could be driving the overall inaccuracy of cost estimates for specific multimorbidity combinations. For example, the average estimated total charges for hypertension + GERD deviated markedly from the actual total charges. Residual analysis indicated that these charges were overestimated for the African American population in particular. As multimorbidity is associated with higher outpatient and inpatient utilization of healthcare services [37], the importance of accurately modeling cost cannot be overstated. Given the seriousness of the inequalities in healthcare access and outcomes by race [38], it is crucial that we generate accurate models across racial groups. Our findings provide necessary information for understanding the complexity of the relationship between cost and multimorbidity. Researchers modeling multimorbidity and cost must analyze estimates for specific patterns stratified by race to know how much specific estimates can be trusted.

Our results indicate that the pattern of model accuracy across the obesity category varies by race. These findings can help the research community identify areas for improved modeling to better estimate costs for patient populations. Except for one multimorbidity combination (hypertension + GERD), multimorbidities in the patient population with obesity exhibited less extreme average residuals in the African American group in both the 45–64 and 65 + cohorts. We observed a similar relationship in the 65 + Asian population. On the other hand, the Hispanic population exhibited more extreme residuals in the absence of obesity. We observed a similar trend in the 65 + Native American and Caucasian populations. In some instances, this could be attributed to differences in the type of multimorbidities in distinct groups, but we also observed this trend in cases where the multimorbidity is the same (e.g., lipidemia + heart disease in the 65 + cohort). Our results demonstrate the importance of stratification by weight category for improved model accuracy.

Limitations

The cross-sectional design of our study restricts our comprehension of multimorbidity, race, age, BMI, and ECI as risk factors that impact patients’ mean total charges. We could not produce a single model that was consistent in predicting total charges across races in the same weight and age groups. The Cerner HealthFacts database contains patient records with missing charges due to information not being transferred to the data warehouse. Consequently, we excluded these records from our study. However, as noted in the “Methods” section, these data are likely missing at random. A non-trivial percentage of patients had missing data in the following categories: BMI, race, gender, or age. These variables are likely to have some data that are missing not at random, which may bias our estimates. Some of these data could be missing at random as, for example, not all HealthFacts sites provide BMI values. The Uniform Hospital Discharge Data Set (UHDDS) definitions and regulations drive hospital reporting requirements for race and ethnicity data, which may not accurately reflect these categories [39]. The Cerner HealthFacts database categorized the Hispanic ethnicity as a race, yet these patients could identify as a member of the Native American, Black, White, or Asian races or could be Biracial. Our exclusion of patients with unknown race, gender, BMI, or age data substantially reduced our sample, which could impact specific groups disproportionately. The Asian/Pacific Islander and Biracial races were not significant in the 45–64 cohort’s model, and the Biracial category was not significant in the 65 + cohort’s model, most likely because we had small samples for these two races. Healthcare expenditures at the end of life can differ significantly from the cost of general medical care. This study was not able to include these expenditures since we do not have out-of-hospital death information.

Despite these limitations, our study is unique because it included the Biracial and Native American groups, which are often not studied. In addition, the study population is a nationwide sample drawn from across the country and is representative of the patients that doctors generally treat in a clinical setting.

If building a model for the most prevalent multimorbidity combinations by race is so challenging, how accurate will expenditure models be for multimorbidities that are not shared by all racial groups and how can we evaluate them? Although the model we developed was a good fit for the data we accessed, its variability in predicting total charges by race demonstrated that we need more robust models that accurately predict total healthcare charges for all racial groups. In particular, multimorbidity and race need to be studied more comprehensively in this regard.

Conclusions

This is the first study to identify trends in total charges across Asian/Pacific Islander, African American, Biracial, Caucasian, Hispanic, and Native American populations for the most prevalent multimorbidity combinations. We also demonstrated that our model was inconsistent in its ability to predict total charges by race based on multimorbidity patterns. In general, the total charges were either over- or underestimated across multimorbidity patterns, and in some cases, the model predictions were far from the expected values. This highlights the difficulty of modeling total charges for diseases that may interact within a multimorbidity, because their effects are not simply additive. It also demonstrates the need to develop more robust models so that the healthcare system can better serve all populations. Improved modeling of underserved populations is necessary, and multimorbidity and race need to be studied more comprehensively.
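Writing out the log link shows why combined conditions need not add up on the dollar scale. The following is an illustrative sketch only: the indicators $x_1$ and $x_2$ (presence of two conditions), the coefficients $\beta$, and the interaction term $\beta_{12}$ are notation assumed here and are not taken from the fitted models.

$$
\log \mathbb{E}[Y \mid x_1, x_2] = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2
\quad\Longrightarrow\quad
\frac{\mathbb{E}[Y \mid 1, 1]}{\mathbb{E}[Y \mid 0, 0]} = e^{\beta_1}\, e^{\beta_2}\, e^{\beta_{12}}
$$

Under this form, each condition scales expected charges multiplicatively, and any nonzero $\beta_{12}$ moves the combination away from the product of the single-condition effects.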

Availability of data and materials

The datasets generated and/or analyzed during the current study are not publicly available as they must be licensed by Oracle Cerner (https://www.cerner.com/ap/en/solutions/data-research).

Abbreviations

BackP: Severe back pain

BMI: Body mass index

CKD: Chronic kidney disease

ECI: Elixhauser Comorbidity Index

GERD: Gastroesophageal reflux disease

HD: Heart disease

HypT: Hypertension

ICD-10-CM: International Classification of Diseases, Tenth Revision, Clinical Modification

LimbP: Pain in limb, hand, foot, fingers, and toes

Lip: Lipidemia

OJD: Other joint disorder

T2D: Type 2 diabetes mellitus

UHDDS: Uniform Hospital Discharge Data Set

References

  1. McPhail SM. Multimorbidity in chronic disease: impact on health care resources and costs. Risk Manag Healthc Policy. 2016;9:143–56.

  2. Sambamoorthi U, Tan X, Deb A. Multiple chronic conditions and healthcare costs among adults. Expert Rev Pharmacoecon Outcomes Res. 2015;15(5):823–32.

  3. Harris LJ, Graetz I, Podila PS, Wan J, Waters TM, Bailey JE. Characteristics of hospital and emergency care super-utilizers with multiple chronic conditions. J Emerg Med. 2016;50(4):e203–14.

  4. Mossadeghi B, Caixeta R, Ondarsuhu D, Luciani S, Hambleton IR, Hennis AJM. Multimorbidity and social determinants of health in the US prior to the COVID-19 pandemic and implications for health outcomes: a cross-sectional analysis based on NHANES 2017–2018. BMC Public Health. 2023;23(1):887.

  5. Caballer-Tarazona V, Guadalajara-Olmeda N, Vivas-Consuelo D. Predicting healthcare expenditure by multimorbidity groups. Health Policy. 2019;123(4):427–34.

  6. Sancho-Mestre C, Vivas-Consuelo D, Alvis-Estrada L, Romero M, Usó-Talamantes R, Caballer-Tarazona V. Pharmaceutical cost and multimorbidity with type 2 diabetes mellitus using electronic health record data. BMC Health Serv Res. 2016;16(1):394.

  7. Alshakhs M, Jackson B, Ikponmwosa D, Reynolds R, Madlock-Brown C. Multimorbidity patterns across race/ethnicity as stratified by age and obesity. Sci Rep. 2022;12(1):9716.

  8. Schoenberg NE, Kim H, Edwards W, Fleming ST. Burden of common multiple-morbidity constellations on out-of-pocket medical expenditures among older adults. Gerontologist. 2007;47(4):423–37.

  9. Lochner KA, Cox CS. Prevalence of multiple chronic conditions among Medicare beneficiaries United States. Prev Chronic Dis. 2013;10:E61.

  10. Espeland MA, Crimmins EM, Grossardt BR, et al. Clinical trials targeting aging and age-related multimorbidity. J Gerontol A Biol Sci Med Sci. 2017;72(3):355–61.

  11. Mercer SW, Fitzpatrick B, Guthrie B, et al. The CARE Plus study - a whole-system intervention to improve quality of life of primary care patients with multimorbidity in areas of high socioeconomic deprivation: exploratory cluster randomised controlled trial and cost-utility analysis. BMC Med. 2016;14(1):88.

  12. Clay OJ, Perkins M, Wallace G, Crowe M, Sawyer P, Brown CJ. Associations of multimorbid medical conditions and health-related quality of life among older African American men. J Gerontol B Psychol Sci Soc Sci. 2018;73(2):258–66.

  13. Lynch CP, Gebregziabher M, Axon RN, Hunt KE, Payne E, Egede LE. Geographic and racial/ethnic variations in patterns of multimorbidity burden in patients with type 2 diabetes. J Gen Intern Med. 2015;30(1):25–32.

  14. Noseworthy PA, Attia ZI, Brewer LC, et al. Assessing and mitigating bias in medical artificial intelligence: the effects of race and ethnicity on a deep learning model for ECG analysis. Circ Arrhythm Electrophysiol. 2020;13(3):e007988.

  15. St Sauver JL, Boyd CM, Grossardt BR, et al. Risk of developing multimorbidity across all ages in an historical cohort study: differences by sex and ethnicity. BMJ Open. 2015;5(2):e006413.

  16. Almirall J, Fortin M. The coexistence of terms to describe the presence of multiple concurrent diseases. J Comorb. 2013;3:4–9.

  17. Smith SM, Soubhi H, Fortin M, Hudon C, O’Dowd T. Managing patients with multimorbidity: systematic review of interventions in primary care and community settings. BMJ. 2012;345:e5205.

  18. Simard M, Rahme E, Calfat AC, Sirois C. Multimorbidity measures from health administrative data using ICD system codes: a systematic review. Pharmacoepidemiol Drug Saf. 2022;31(1):1–12.

  19. Guinness World Records. Heaviest man ever. London, UK: Guinness World Records, Ltd.; Cited 22 June 2022. Available from: https://www.guinnessworldrecords.com/world-records/heaviest-man.

  20. Terkawi AS, Rafiq M, Algadaan R, et al. General anesthesia for the heaviest man in the world. Saudi J Anaesth. 2014;8(Suppl 1):S101–4.

  21. Kuo RN, Lai MS. The influence of socio-economic status and multimorbidity patterns on healthcare costs: a six-year follow-up under a universal healthcare system. Int J Equity Health. 2013;12:69.

  22. Arora V, Moriates C, Shah N. The challenge of understanding health care costs and charges. AMA J Ethics. 2015;17(11):1046–52.

  23. Batty M, Ippolito B. Mystery of the chargemaster: examining the role of hospital list prices in what patients actually pay. Health Aff (Millwood). 2017;36(4):689–96.

  24. Linde S, Egede LE. Hospital price transparency in the United States: an examination of chargemaster, cash, and negotiated, price variation for 14 common procedures. Med Care. 2022;60(10):768–74.

  25. Brookings Health System. Charge, Cost & Price. Brookings, SD, USA: Brookings Health System. Cited 2023, Mar 1. Available from: https://www.brookingshealth.org/why-brookings-health/health-care-value/understanding-medical-prices/charge-cost-price.

  26. Sterne JA, White IR, Carlin JB, et al. Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls. BMJ. 2009;338:b2393.

  27. Alin A. Multicollinearity. WIREs Comput Stat. 2010;2(3):370–4.

  28. Allison P. When can you safely ignore multicollinearity?. Ardmore, PA, USA. Statistical Horizons: 2012. [updated 2012; cited 2022, Nov 22]. Available from: https://statisticalhorizons.com/multicollinearity/.

  29. Fox J, Monette G. Generalized collinearity diagnostics. J Am Stat Assoc. 1992;87(417):178–83.

  30. Fox J, Weisberg S. An R Companion to Applied Regression. 3rd ed. SAGE Publications, Inc.; 2018.

  31. Turner H. Introduction to generalized linear models. Warwick, UK: ESRC National Centre for Research Methods: 2008 [updated 2008; cited 2023, Mar 1]. Available from: http://statmath.wu.ac.at/courses/heather_turner/glmCourse_001.pdf.

  32. Manning WG, Mullahy J. Estimating log models: to transform or not to transform? J Health Econ. 2001;20(4):461–94.

  33. Mohr DL, Wilson WJ, Freund RJ. Chapter 6 - inferences for two or more means. In: Mohr DL, Wilson WJ, Freund RJ, editors. Statistical Methods (Fourth Edition). Academic Press; 2022. p. 243–99.

  34. Koné Pefoyo AJ, Bronskill SE, Gruneir A, et al. The increasing burden and complexity of multimorbidity. BMC Public Health. 2015;15:415.

  35. Hajat C, Siegal Y, Adler-Waxman A. Clustering and healthcare costs with multiple chronic conditions in a US study. Front Public Health. 2020;8:607528.

  36. Rocca WA, Boyd CM, Grossardt BR, et al. Prevalence of multimorbidity in a geographically defined American population: patterns by age, sex, and race/ethnicity. Mayo Clin Proc. 2014;89(10):1336–49.

  37. Zulman DM, Pal Chee C, Wagner TH, et al. Multimorbidity and healthcare utilisation among high-cost patients in the US Veterans Affairs Health Care System. BMJ Open. 2015;5(4):e007771.

  38. Dickman SL, Himmelstein DU, Woolhandler S. Inequality and the health-care system in the USA. Lancet. 2017;389(10077):1431–41.

  39. Strmic-Pawl HV, Jackson BA, Garner S. Race counts: racial and ethnic data on the U.S. census and the implications for tracking inequality. Sociol Race Ethn. 2017;4(1):1–13.

Acknowledgements

The authors wish to thank Dr. Kyle Johnson Moore in the UTHSC Office of Scientific Writing for editing the manuscript.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: this work was supported by the National Institute on Aging award R15AG067232.

Author information

Contributions

M.A.: conception, design, analysis, methods, data collection, and writing. P.J.G.: writing, design, and review. J.B.: writing and review. C.M.B.: conception, supervision, design, and review. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Charisse Madlock-Brown.

Ethics declarations

Ethics approval and consent to participate

Because the study only employed de-identified data, the University of Tennessee Health Science Center (UTHSC) Institutional Review Board (IRB) determined that the research was exempt.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1:

 Supplemental Table 1. Most Prevalent Morbidities. Supplemental Table 2. Study variables. Supplemental Table 3. Additional model estimates for the 45–64 cohort. Supplemental Table 4. Additional model estimates for the 65 + cohort. Supplemental Figure 1. Patient population. Supplemental Figure 2. Model residuals for the 45–64 and 65 + cohorts. Supplemental Figure 3. Log and exponential model residuals. Supplemental Figure 4. Mean model residuals by race.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Alshakhs, M., Goedecke, P.J., Bailey, J.E. et al. Racial differences in healthcare expenditures for prevalent multimorbidity combinations in the USA: a cross-sectional study. BMC Med 21, 399 (2023). https://doi.org/10.1186/s12916-023-03084-2

  • DOI: https://doi.org/10.1186/s12916-023-03084-2

Keywords