Healthcare worker absenteeism, child care costs, and COVID-19 school closures: a simulation analysis

School closures have been enacted as a measure of mitigation during the ongoing COVID-19 pandemic due to their ability to reduce transmission. It has been shown that school closures could cause absenteeism amongst healthcare workers with dependent children, but there is a need for detailed high-resolution analysis of the relationship between school closures and healthcare worker absenteeism to inform local community preparedness. We provide national- and county-level simulations of school closures and absenteeism across the United States. At the national level, we estimate the projected absenteeism rate to range from 7.5% to 8.6%, and the effectiveness of school closures to range from 172 to 218 fewer hospital beds used per 100,000 people at peak demand. At the county level, we find substantial variations of projected absenteeism and school closure effects, ranging from 2.0% to 18.6% absenteeism and 88 to 280 fewer hospital beds used per 100,000 people at peak demand. We also find significant associations between levels of absenteeism and COVID-19 complication factors. We observe from our models that an estimated 98.8% of counties would find it less expensive to provide child care to all healthcare workers with children than to bear the costs of healthcare worker absenteeism during school closures, identifying child care subsidization as a potential solution to help maintain healthcare systems during a pandemic.


Introduction
School closures are a common measure of pandemic mitigation for many countries, driven by the logic that social distancing reduces transmission. [1][2][3] Although school closures are known to reduce transmission, previous works have suggested that school closures could have downstream consequences on the healthcare system such as healthcare worker absenteeism. 4,5 However, lack of granular, high-resolution data has restrained previous studies to providing estimates based on national data with strong assumptions, underscoring the need for more detailed analysis. 3 Newly available demographic and occupational data from the American Community Survey permits county-level scenario analyses of school closures with respect to healthcare worker absenteeism across the United States for the first time.
In the absence of a federal mandate for school closures, the decision of whether or not to close a school is determined by local authorities. The needs and capabilities of both schools and healthcare systems vary drastically across the United States, so county-level simulations of healthcare worker absenteeism and school closures could be far more impactful and targeted for local communities than state or national-level simulations. China's CDC reports that patients with comorbidities such as cardiovascular disease or diabetes had higher rates of fatality from COVID-19 (0.9% case fatality with no reported comorbidities, 10.5% and 7.3% for comorbidities of cardiovascular disease and diabetes, respectively). 6 The prevalence of these comorbidities also varies geographically in the US, further highlighting the importance of regional analysis. [7][8][9] To maintain healthcare systems in the event of a school closure, it could be beneficial to assist healthcare workers with child care. Previous work has shown that increased wages are associated with lower absenteeism, so it is possible that child care subsidies could reduce absenteeism by alleviating the financial burden of child care for healthcare workers as well as further incentivizing them to remain at work. 10,11 Furthermore, the costs of child care (which is the main barrier to finding child care) and the salaries of healthcare workers vary regionally, so the necessity of child care subsidization for healthcare workers could also vary regionally. 12 Here, we provide county-level microsimulation analyses on the potential effects of COVID-19 school closures on healthcare worker absenteeism and hospital bed demand across the entirety of the United States over a range of plausible scenarios. 
We estimate projected healthcare worker absenteeism, county-level associations between absenteeism and severe COVID-19 risk factors, and the effectiveness of school closures on reducing hospital bed demand; we also identify child care subsidies as a potential solution to help maintain healthcare systems during a pandemic.

Figure 1. Study design. HCW = healthcare worker, SEIR = Susceptible-Exposed-Infectious-Recovered, ICU = Intensive Care Unit, GLM = generalized linear model.

Data
To find county-level demographic and occupational data, we used 5-year estimates from the American Community Survey (ACS) 13 and the Integrated Public Use Microdata Series (IPUMS) 14 , a database derived from ACS. The ACS provides comprehensive coverage of data at the county level across factors such as education, housing, employment, and income. For probability estimates of child care dependency, we used data from The National Household Education Survey and a Pew Research Center survey on working parents. 12, 15 For county-level estimates of health assessments we used the Institute for Health Metrics and Evaluation and the CDC Diabetes Interactive Atlas. 16,17 For county-level fair market rent estimates, we use data from the U.S. Department of Housing and Urban Development (HUD). 18 For child care cost estimates, we use data from Child Care Aware of America (CCAoA). 19 We defined healthcare workers as individuals belonging to the ACS categories of practitioners (e.g. physicians, nurses, technicians) or support staff (e.g. orderlies, aides, assistants).

Population simulation
We simulated county-level demographic and occupational data for each county in the United States. We obtained estimates of the number of healthcare workers in each county, and simulated distributions of them into gender and household type (no children, married with children, single male with children, single female with children) based on existing county-specific estimates from the ACS. We focus our analyses on households with children below age 13. Although children under the age of 5 do not attend school, daycare services would likely also be closed in the event of school closures.
We seed probabilities of being unable to find child care with data from NHES, Pew Research Center, the US Census Bureau, and IPUMS. Child care arrangements vary significantly with parental employment, familial relations, household type (single or dual-parent), and gender differences in caretaking of children. 20 In order to simulate which individual in a married couple would be responsible for child care in the event of a school closure, we draw upon survey data from both the Pew Research Center and the US Census Bureau indicating that 89% of working couples rely on the mother for primary child care. 15 To simulate ability to find child care in the event of a school closure, we test two different model assumptions:
1. Healthcare workers have difficulty finding child care at the same rates as national estimates. To simulate the probability a worker can find a child care alternative, we draw upon data from the NHES, which found that 50% of households had difficulty finding or could not find satisfactory child care.
2. Difficulty finding child care could be estimated from the household structure of healthcare workers. To simulate household statistics of healthcare workers, we use nationally representative microdata from IPUMS. We take employed healthcare workers who are either the head of the household or the partner of the head of the household and extract the age, relationship, and employment status of each member of the household. We estimate the ability to find child care by identifying other members of a household that could provide care. We define alternative child care as any member within the household that is over 13 and not employed (under 16, unemployed, or not in the labor force). We stratify the data by state, sex, occupation (practitioner or support staff), and partnership status (single or couple) to estimate the state-specific family structures of healthcare workers. We weight these state-specific derived absenteeism rates based on county-level demographic information to obtain estimates for each county.
Models under the first assumption may provide better estimates in that they include cases beyond household structure (e.g. child care from a relative living elsewhere), but are limited by the assumption that healthcare workers have the same difficulties finding childcare as the national average. Models under the second assumption may provide better estimates in that they account for child care difficulties specific to healthcare workers, but are limited by the assumption that all possible caregivers live in the same household as the child.
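The household-structure rule under the second assumption can be sketched as follows. This is a minimal illustration and not the authors' code: the `Member` record, its field names, and the example households are hypothetical, and the age cutoff is read literally from the text as "over 13".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Member:
    """One household member other than the healthcare worker (hypothetical record)."""
    age: int
    employed: bool  # False covers both unemployed and not in the labor force


def has_alternative_care(others: List[Member]) -> bool:
    """True if any other household member is over 13 and not employed."""
    return any(m.age > 13 and not m.employed for m in others)


def share_without_care(households: List[List[Member]]) -> float:
    """Fraction of households with no in-home alternative caregiver."""
    if not households:
        raise ValueError("no households supplied")
    lacking = sum(1 for h in households if not has_alternative_care(h))
    return lacking / len(households)


# Hypothetical households, listing members other than the healthcare worker.
example = [
    [Member(age=40, employed=True), Member(age=8, employed=False)],   # no caregiver
    [Member(age=15, employed=False), Member(age=6, employed=False)],  # teenage sibling
    [Member(age=67, employed=False)],                                 # retired relative
    [Member(age=12, employed=False)],                                 # too young
]
print(share_without_care(example))
```

In the study this share would be computed within each stratum (state, sex, occupation, partnership status) rather than over a single pooled list.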

Absenteeism estimation
We estimate healthcare worker absenteeism for each county in the United States. Using the probabilities determined in the previous step, we simulate whether or not a given healthcare worker will be able to find alternative child care in the event of a school closure. At both the national and county level, we draw 1000 simulations from a multinomial distribution. We determine absenteeism by simulating whether a healthcare worker is the primary caregiver of a household, and whether they are able to find alternative child care in the event of a school closure. We calculate absenteeism rates by dividing the number of absent healthcare workers by the total number of healthcare workers.
We then repeat absenteeism estimation across healthcare worker subgroups (practitioner or support staff) to get a range of estimates. We estimate absenteeism separately for each group as well as combined. We also perform different absenteeism estimations based on the different model assumptions proposed in the previous step.
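The simulation loop described in this section can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' implementation: the household-type shares are hypothetical, the 0.11 primary-caregiver probability for married fathers is inferred as the complement of the 89% figure, single parents are assumed to always be the primary caregiver, and the 0.5 probability of lacking child care follows the NHES-based assumption.

```python
import random


def simulate_absenteeism(n_workers, household_probs, p_primary, p_no_care,
                         n_sims=1000, seed=0):
    """Mean simulated absenteeism rate across n_sims multinomial draws."""
    rng = random.Random(seed)
    types = list(household_probs)
    weights = [household_probs[t] for t in types]
    rates = []
    for _ in range(n_sims):
        # Multinomial draw: assign every worker a household type.
        draws = rng.choices(types, weights=weights, k=n_workers)
        absent = 0
        for household_type in draws:
            is_primary = rng.random() < p_primary[household_type]
            lacks_care = rng.random() < p_no_care
            if is_primary and lacks_care:
                absent += 1
        rates.append(absent / n_workers)
    return sum(rates) / n_sims


# Hypothetical shares of healthcare workers by household type in one county.
household_probs = {"no_children": 0.70, "married_mother": 0.15,
                   "married_father": 0.05, "single_parent": 0.10}
# Probability that the worker is the household's primary caregiver (assumed).
p_primary = {"no_children": 0.0, "married_mother": 0.89,
             "married_father": 0.11, "single_parent": 1.0}

rate = simulate_absenteeism(200, household_probs, p_primary, p_no_care=0.5)
print(f"simulated absenteeism rate: {rate:.3f}")
```

Running the same function with subgroup-specific shares (practitioners or support staff) or with a state-specific `p_no_care` would mirror the repeated estimations described above.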

Transmission models
We modeled the impact of school closures by county using a Susceptible-Exposed-Infected-Recovered (SEIR) model. 21 We divided the population into four age groups: 0-19 years, 20-39 years, 40-59 years, and 60+ years. Transmission events occur through contact between susceptible and infectious individuals. Since rates of contact differ between age groups, we constructed a WAIFW (Who Acquires Infection From Whom) matrix from non-physical and physical contact data between age groups. 22 We assume that social distancing will result in a 50% reduction of interactions and school closures will result in a 90% reduction in interactions among children. 23 Since increased household interactions are often cited as an unintended side effect of school closures, 24,25 we also increase interactions between children and other age groups by 10%.

This version of the preprint was posted to medRxiv on March 23, 2020 (doi: https://doi.org/10.1101/2020.03.19.20039404). It was not certified by peer review. The copyright holder is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

We assumed an incubation period of 5.1 days and an infectious period of 6.5 days. 25,26 The R0 of COVID-19 is estimated to be between 2.0 and 6.0, and we examine values within that range (R0 = 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0). Since the COVID-19 epidemic curve spans a short duration, we ignore births, deaths, and immigration. We assume that 86% of infections are mild or asymptomatic, with asymptomatic individuals being 50% less infectious than symptomatic individuals. 27 Symptomatic individuals are assumed to reduce contact by 75%. Age-stratified hospitalization rates and infection fatality ratios were obtained from Verity et al. 28 We choose to only apply the infection fatality ratios to symptomatic individuals to obtain a conservative estimate. COVID-19 is not currently known to have reinfections, so we assume that individuals develop immunity after infection in the short term. To estimate the demand on the healthcare system, we assume that 30% of hospitalizations will require critical care (invasive mechanical ventilation, vasopressor support, or further intensive care-level intervention), that individuals requiring hospitalization will stay for 8 days, and that individuals requiring critical care will stay in the ICU for 10.4 days. 25 We calculate peak hospitalization and ICU bed demand by integrating the number of individuals within ±4 days and ±5 days of the time with the maximum number of patients, respectively. To simulate the effects on a particular county, we seed the simulation with county age demographics. We estimate the effectiveness of school closures by calculating the reduction in peak ICU bed demand between social distancing and social distancing plus school closure conditions.
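To make the comparison between the two contact conditions concrete, the following is a deliberately simplified SEIR sketch integrated with forward Euler steps. It is not the authors' model: it uses two age groups (children and adults) instead of four, omits the asymptomatic/symptomatic split and the hospitalization step, and sets the per-contact transmission probability `TAU` directly instead of deriving it from R0. The contact matrix and population shares are hypothetical; only the incubation and infectious periods and the 50%, 90%, and 10% contact adjustments come from the text.

```python
SIGMA = 1 / 5.1   # 1 / incubation period in days (from the text)
GAMMA = 1 / 6.5   # 1 / infectious period in days (from the text)
TAU = 0.05        # hypothetical transmission probability per contact


def peak_infectious(contacts, shares, days=730, dt=0.1, seed_frac=1e-4):
    """Forward-Euler SEIR; returns the peak population-wide infectious fraction."""
    n = len(shares)
    s = [1 - seed_frac] * n   # susceptible fraction within each group
    e = [0.0] * n             # exposed fraction
    i = [seed_frac] * n       # infectious fraction (recovered is implicit)
    peak = 0.0
    for _ in range(int(days / dt)):
        # Force of infection on group a from contacts with every group b.
        lam = [TAU * sum(contacts[a][b] * i[b] for b in range(n))
               for a in range(n)]
        for a in range(n):
            new_exposed = lam[a] * s[a] * dt
            new_infectious = SIGMA * e[a] * dt
            new_recovered = GAMMA * i[a] * dt
            s[a] -= new_exposed
            e[a] += new_exposed - new_infectious
            i[a] += new_infectious - new_recovered
        peak = max(peak, sum(w * x for w, x in zip(shares, i)))
    return peak


# Hypothetical daily contacts, rows and columns ordered [children, adults].
base = [[10.0, 3.0], [1.0, 8.0]]
shares = [0.25, 0.75]  # hypothetical population shares

# Social distancing: all interactions reduced by 50%.
distancing = [[c * 0.5 for c in row] for row in base]

# School closure on top of distancing: child-child contacts cut by 90%,
# child-adult contacts raised by 10% for extra household interaction.
closure = [row[:] for row in distancing]
closure[0][0] *= 0.1
closure[0][1] *= 1.1
closure[1][0] *= 1.1

peak_sd = peak_infectious(distancing, shares)
peak_sc = peak_infectious(closure, shares)
print(f"peak infectious per 100,000, distancing only: {peak_sd * 1e5:.0f}")
print(f"peak infectious per 100,000, with closure:    {peak_sc * 1e5:.0f}")
```

Converting infectious counts to bed demand would additionally require the age-stratified hospitalization rates and lengths of stay described above, which this sketch leaves out.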

Regression analysis
We perform multiple ecological regression analysis to find associations between healthcare worker absenteeism and known COVID-19 complicating factors. We use a quasi-Poisson GLM with absenteeism rates as the outcome, and healthcare worker population as weights. 29 Our factors of interest, based on available county data, are adult cardiovascular disease mortality and adult diabetes rates, due to how cardiovascular disease and diabetes are reported to exacerbate COVID-19 outcomes. We control for race, age, state, household status, sex, population, and fair market rent. We run separate models for cardiovascular disease and diabetes.

Economic analysis
We calculate the economic costs of healthcare worker absenteeism from school closures and compare them to the costs of providing child care to healthcare workers with children. We estimate the cost of absenteeism as worker wages multiplied by number of workers (split by gender and practitioner/support staff subgroups) within a county, multiplied by a constant δ to account for value not captured by wages, such as taxes, pension, cost of overtime, paid sick leave, etc. A previous study used δ = 1.4; we set δ = {1.4, 1.2, 1.0} to test sensitivity.
We estimate the cost of providing child care to healthcare workers with children by estimating county-level child care costs and the number of healthcare workers with children per county. We estimate county-level child care costs with a method similar to that used by the Economic Policy Institute's Family Budget Calculator (see Supplement). 30 To compare the two costs, we divide the cost of healthcare worker absenteeism from school closure by the cost of providing child care to all healthcare workers with children at the county level to get a coefficient ω. We then calculate the percentage of counties with ω > 1 at each level of δ, indicating the percentage of counties where it is cheaper to provide child care to all healthcare workers with children than it is to bear the costs of healthcare worker absenteeism from school closure.
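The ω comparison can be illustrated with a short calculation. This is a sketch under assumptions, not the authors' code: the county figures are invented, wages and child care costs are assumed to be expressed over the same time period, and the worker counts in the wage term are taken to be absent workers per subgroup.

```python
def absenteeism_cost(groups, delta):
    """groups: (absent_workers, wage) pairs per subgroup; delta scales wages."""
    return delta * sum(n * wage for n, wage in groups)


def omega(groups, delta, workers_with_children, child_care_cost):
    """Ratio of absenteeism cost to the cost of child care for all HCW parents."""
    return absenteeism_cost(groups, delta) / (workers_with_children * child_care_cost)


# Hypothetical weekly figures for two counties.
counties = {
    "county_a": {"groups": [(40, 1500.0), (60, 700.0)],
                 "workers_with_children": 300, "child_care_cost": 250.0},
    "county_b": {"groups": [(5, 1800.0), (5, 800.0)],
                 "workers_with_children": 40, "child_care_cost": 400.0},
}

shares = {}
for delta in (1.4, 1.2, 1.0):
    ratios = [omega(c["groups"], delta, c["workers_with_children"],
                    c["child_care_cost"]) for c in counties.values()]
    # Share of counties where child care is cheaper than absenteeism.
    shares[delta] = sum(r > 1 for r in ratios) / len(ratios)
    print(f"delta={delta}: {shares[delta]:.0%} of counties have omega > 1")
```

Because δ multiplies only the numerator, ω scales linearly with δ, so a county sitting near ω = 1 can change sides as δ varies, as the hypothetical `county_b` does here.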

Absenteeism estimation
Our national level simulation based on NHES data provided absenteeism estimates of 7.5%, 7.2%, and 7.9% for all healthcare workers, healthcare practitioners/technicians, and healthcare support staff, respectively. Our simulation based on IPUMS data provided higher estimates of 8.6%, 9.2%, and 7.4%. Our county-level approach revealed substantial variation in estimated healthcare worker absenteeism across counties (Figure 2), ranging from 2.0% to 18.6%.

Transmission models
In our transmission analysis, our national level simulation provided estimated decreases of 171.53 and 62.18 hospital beds and ICU beds per 100,000 people, respectively. Our county-level estimates showed a reduction in peak hospitalization and ICU rates for all counties under school closure conditions, with substantial variation in the decrease in hospital demand across counties, ranging from 87.85 to 280.38 hospital beds and 18.17 to 125.48 ICU beds per 100,000 people when assuming R0 = 2.5. Our sensitivity analyses show the effectiveness of school closures decreases with increasing R0 values, which is consistent with past findings. 31 We observe from our models a reduction in hospital demand as a result of school closures with and without increased household interactions (Supplement Table 2). We compare absenteeism with effectiveness of school closures as estimated by reduction in peak hospital and ICU bed demand (Figure 3).

Regression analysis
Our regression analysis showed significant associations between healthcare worker absenteeism and COVID-19 complicating factors at the county level (p < 0.05). Diabetes is positively associated with healthcare worker absenteeism with an effect size of 0.22, meaning a one percent increase in diabetes prevalence is associated with a 0.22% increase in healthcare worker absenteeism. Cardiovascular disease mortality is negatively associated with absenteeism, with an effect size of less than −1.859 × 10^−4, suggesting a negligible effect despite significance (Supplement Table 3).

Aggregate analysis
We observe a number of counties that could be viable targets for child care subsidies based on our estimates (Table 1, Figure 3). Counties like Conecuh County, Alabama, have high rates of diabetes, high projected absenteeism, and a high ω, suggesting that they would suffer disproportionately from COVID-19 in the event of school closures, but also that a child care subsidy would be relatively inexpensive for them. Similarly, Hidalgo County, Texas, and Fresno County, California, have high projected absenteeism rates and ω, suggesting they are viable targets for child care subsidies. San Francisco County, California, is one of the few counties with ω < 1 (due to high child care costs, low wages, and low projected absenteeism), illustrating the variance of our estimates within states. Counties like Morrison County in Minnesota or Franklin County in Pennsylvania, which have high projected school closure effectiveness but also high projected absenteeism rates, could also consider child care subsidies given the large estimated benefit of school closures.

Discussion
We observed large variance of our estimates between counties for all of our county-level analyses, emphasizing the importance of county-level information. Our transmission models projected reduced hospitalizations from school closures, but it is highly likely that hospitalizations and ICU bed demand would still far exceed bed capacity for many hospitals. 25 Furthermore, our models estimated generally high absenteeism rates (> 7%) across different assumptions, highlighting the need for reducing absenteeism in the event of school closures. Our regression analysis estimated that counties that are already vulnerable to COVID-19 complications due to diabetes prevalence would also have higher rates of absenteeism from school closures, illustrating an exacerbated scenario in the absence of adequate child care.
To identify a potential approach to reducing absenteeism, we estimated that the vast majority of counties (> 98%) could save money by providing child care to their healthcare workers with children in the event of a school closure (ω > 1). Although it is likely that many child care avenues would also be closed in the event of school closures, subsidized child care costs could still prevent absenteeism by (1) incentivizing work attendance with extra wages, and (2) alleviating the financial burden on the entire household, enabling other family or household members to participate in child care.
Our findings contribute to the literature of school closures during pandemics by providing county-level estimates of absenteeism and school closure effectiveness, relating those estimates to county-level COVID-19 complication factors, and providing an economic analysis of child care subsidies. In previous works, Sadique et al. and Lempel et al. provided national level cost analyses of school closures under a variety of model assumptions and closure lengths. 4,5 In a recent preprint, Bayham and Fenichel provide state-level estimates and include a tradeoff analysis on whether closing schools reduces mortality after accounting for disruption to healthcare systems from absenteeism. Given the close tradeoff in mortality for school closures and absenteeism, we believe it would be beneficial to explore ways to circumvent the ostensible tradeoff through child care subsidies. 32
As a simulation study, there are important limitations to our analysis. Simulations rely on assumptions to make predictions, and ours use assumptions derived from available data. For example, we do not know the number of healthcare workers with dependents; we estimate this based on representative data that could be inaccurate for some regions. Similarly, there are no datasets that tell us how many healthcare workers would be unable to find child care in the event of school closures; we instead estimate this using representative microdata. Lack of available data prohibits us from making precise estimates for counties with small populations. Given the current uncertainty of transmission parameters, our transmission models should not be used to accurately predict infection and hospitalization rates, but rather to estimate the relative effectiveness of school closures based on the age-demographics of each county. Additionally, although our economic analysis demonstrates the affordability of a child care subsidy, our method does not prove that child care subsidies would necessarily reduce absenteeism resulting from school closures. We emphasize that our work does not argue for or against school closures due to currently unclear fatality and transmission data, but rather that we highlight areas that would suffer more in the event of school closure and could therefore benefit more from child care subsidies.
Further research should investigate whether child care subsidies for healthcare workers would reduce absenteeism in the event of school closures from a pandemic. Additionally, research efforts should identify how school closures in pandemics impact more vulnerable populations for whom robust data does not currently exist. Further research efforts should also be placed to determine the effect of school closures on the absenteeism of other kinds of essential workers, instead of just healthcare workers.
Our results provide detailed estimates that could help local communities prepare for the event of school closures during the COVID-19 pandemic, and potentially for future pandemics. Such preparations could help reduce preventable harm resulting from school closures.

County estimates of child care and wages
Here we describe how we obtained county-level estimates of childcare costs and wages.
We use state-level child care costs from CCAoA and adjust them to county-level by applying the ratio between state-level and county-level fair market rents from HUD. We calculate state-level rents from HUD by taking population-weighted averages of county rents. To estimate the number of healthcare workers with children at the county-level, we take the state-level proportion of healthcare workers with children from IPUMS and apply it to the county-level number of healthcare workers from ACS. We then calculate the county-level cost of providing child care to healthcare workers by multiplying child care costs by the proportion of healthcare workers with children.
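The rent-based adjustment described above can be sketched as follows. This is an illustration under assumptions, not the authors' code: the county names and figures are hypothetical, and the ratio is assumed to be applied as county rent divided by the population-weighted state rent.

```python
def state_rent(counties):
    """Population-weighted average of county fair market rents."""
    total_pop = sum(c["population"] for c in counties)
    return sum(c["rent"] * c["population"] for c in counties) / total_pop


def county_child_care_costs(state_cost, counties):
    """Scale the state-level child care cost by each county's rent ratio."""
    avg_rent = state_rent(counties)
    return {c["name"]: state_cost * c["rent"] / avg_rent for c in counties}


# Hypothetical state with two counties.
counties = [
    {"name": "urban", "rent": 1500.0, "population": 100_000},
    {"name": "rural", "rent": 750.0, "population": 200_000},
]
costs = county_child_care_costs(state_cost=10_000.0, counties=counties)
print(costs)
```

A useful property of this scaling is that the population-weighted average of the county costs equals the original state-level cost, so the adjustment redistributes cost across counties without changing the state total per capita.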
For estimating county-level wages, some counties with low populations had redacted wages to preserve anonymity. We used multiple imputation by chained equations to impute these cases. To get all county-level wages, we multiplied the number of healthcare workers (by occupation group and sex) by their subgroup-respective county-level median wages.

Figure 6. Differential equations for SEIR models. i is the age group; p_a = 0.83 is the proportion of infected that are asymptomatic; r_a = 0.5 is the reduction of infectiveness of an asymptomatic individual; r_i = 0.25 is the reduction in interaction of a symptomatic individual; β_i is the age-stratified contact rate derived from a WAIFW matrix; and τ is the probability of transmission given contact, derived from R0. The average length of incubation was set to 1/σ = 5.1 days and the average length of infection was set to 1/γ = 6.5 days.

Table 2. Sensitivity analysis of transmission models under varying R0 values and contact conditions. School closures (SC) reduce the risk of child-child interactions by 90%. Household (HH) interactions increase child-other age group interactions by 10%. Both models assume social distancing, which reduces all interactions by 50%.

Table 3. Output from quasi-Poisson regressions for diabetes and cardiovascular mortality.
