An assessment of the vaccination of school-aged children in England against SARS-CoV-2

Background Children and young persons are known to have a high number of close interactions, often within the school environment, which can facilitate rapid spread of infection; yet for SARS-CoV-2, it is the elderly and vulnerable that suffer the greatest health burden. Vaccination, initially targeting the elderly and vulnerable before later expanding to the entire adult population, has been transformative in the control of SARS-CoV-2 in England. However, early concerns over adverse events and the lower risk associated with infection in younger individuals mean that the expansion of the vaccine programme to those under 18 years of age needs to be rigorously and quantitatively assessed. Methods Here, using a bespoke mathematical model matched to case and hospital data for England, we consider the potential impact of vaccinating 12–17 and 5–11-year-olds. This analysis is reported from an early model (generated in June 2021) that formed part of the evidence base for the decisions in England, and a later model (from November 2021) that benefits from a richer understanding of vaccine efficacy, greater knowledge of the Delta variant wave and uses data on the rate of vaccine administration. For both models, we consider the population wide impact of childhood vaccination as well as the specific impact on the age groups targeted for vaccination. Results Projections from June suggested that an expansion of the vaccine programme to those 12–17 years old could generate substantial reductions in infection, hospital admission and deaths in the entire population, depending on population behaviour following the relaxation of control measures. The benefits within the 12–17-year-old cohort were less marked, saving between 660 and 1100 (95% PI (prediction interval) 280–2300) hospital admissions and between 22 and 38 (95% PI 9–91) deaths depending on assumed population behaviour.
For the more recent model, the benefits within this age group are reduced, saving on average 630 (95% PI 300–1300) hospital admissions and 11 (95% PI 5–28) deaths for 80% vaccine uptake, while the benefits to the wider population represent a reduction of 8–10% in hospital admissions and deaths. The vaccination of 5–11-year-olds is projected to have a far smaller impact, in part due to the later roll-out of vaccines to this age group. Conclusions Vaccination of 12–17-year-olds and 5–11-year-olds is projected to generate a reduction in infection, hospital admission and deaths for both the age groups involved and the population in general. For any decision involving childhood vaccination, these benefits need to be balanced against potential adverse events from the vaccine, the operational constraints on delivery and the potential for diverting resources from other public health campaigns. Supplementary Information The online version contains supplementary material available at (10.1186/s12916-022-02379-0).


Infection modelling
As is common to most epidemiological modelling, we stratify the population into multiple disjoint compartments and capture the flow of the population between compartments in terms of ordinary differential equations. At the heart of the model is a modified SEIR equation, where individuals may be susceptible (S), exposed (E), infectious with symptoms (I), infectious and either asymptomatic or with very mild symptoms (A) or recovered (R). Both symptomatic and asymptomatic individuals are able to transmit infection, but asymptomatic infections do so at a reduced rate given by τ. Hence, the force of infection is proportional to I + τA. The separation into symptomatic (I) and asymptomatic (A) states within the model is somewhat artificial, as there is a wide spectrum of symptom severities that can be experienced, with the classification of symptoms changing over time. Our classification reflects early case detection, when only relatively severe symptoms were recognised.
To obtain a better match to the time from infection to becoming infectious, we model the exposed class as a 3-stage process, such that in a stochastic formulation the latent period would follow an Erlang distribution.
where α^{-1} and γ^{-1} are the mean latent and infectious periods, while d is the proportion of infections that develop symptoms.
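The structure described above (three latent stages, symptomatic and asymptomatic infectious classes, and a force of infection proportional to I + τA) can be sketched numerically. The code below is a minimal illustration rather than the model used in the analysis: the single transmission rate `beta`, the forward-Euler scheme, the function name and all parameter values are assumptions introduced here, and age structure, quarantining, variants and vaccination are ignored.

```python
def simulate_se3iar(beta, alpha, gamma, d, tau, n_pop, i0, days, dt=0.05):
    """Forward-Euler integration of an S-E1-E2-E3-I-A-R model (illustrative only).

    alpha: 1 / mean latent period, gamma: 1 / mean infectious period,
    d: proportion of infections that develop symptoms,
    tau: relative transmission rate of asymptomatic infections.
    Returns one tuple (S, E1, E2, E3, I, A, R) per simulated day.
    """
    s, e1, e2, e3 = float(n_pop - i0), 0.0, 0.0, 0.0
    i, a, r = float(i0), 0.0, 0.0
    steps_per_day = round(1 / dt)
    history = []
    for step in range(days * steps_per_day + 1):
        if step % steps_per_day == 0:
            history.append((s, e1, e2, e3, i, a, r))
        # Force of infection is proportional to I + tau * A.
        new_infections = beta * (i + tau * a) / n_pop * s
        # Each latent stage is left at rate 3 * alpha, so the total latent
        # period has mean 1 / alpha (Erlang-distributed in a stochastic model).
        out1, out2, out3 = 3 * alpha * e1, 3 * alpha * e2, 3 * alpha * e3
        ds = -new_infections
        de1 = new_infections - out1
        de2 = out1 - out2
        de3 = out2 - out3
        di = d * out3 - gamma * i
        da = (1 - d) * out3 - gamma * a
        dr = gamma * (i + a)
        s, e1, e2, e3 = s + dt * ds, e1 + dt * de1, e2 + dt * de2, e3 + dt * de3
        i, a, r = i + dt * di, a + dt * da, r + dt * dr
    return history


# Hypothetical parameter values, chosen only to produce an epidemic.
history = simulate_se3iar(beta=1.0, alpha=1 / 5, gamma=1 / 4, d=0.6, tau=0.25,
                          n_pop=1_000_000, i0=10, days=300)
final_recovered = history[-1][6]
```

Because every outflow from one compartment is an inflow to another, the total population is conserved, which is a useful check on any implementation of this kind.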

Age Structure and Transmission Structure
The simple model structure is expanded to twenty-one 5-year age-groups (0-4, 5-9, ..., 95-99, 100+). Age has three major impacts on the epidemiological dynamics, with each element parameterised from the available data:
• Older individuals have a higher susceptibility to SARS-CoV-2 infection (captured by the parameter σ).
• Older individuals have a higher risk of developing symptoms, and therefore have a greater rate of transmission per contact.
• Older individuals have a higher risk of more severe consequences of infection including hospital admission and death.
The age-groups interact through four who-acquired-infection-from-whom transmission matrices, which capture the epidemiologically relevant mixing in four settings: household (β^H), school (β^S), workplace (β^W) and other (β^O). We took these matrices from Prem et al. [41] to allow easy translation to other geographic settings, although other sources could be used.
One of the main modifiers of mixing, and therefore transmission, is the level of precautionary behaviour, φ (see Figure 2 of the main text). This scaling parameter changes the who-acquired-infection-from-whom transmission matrices in each transmission setting, such that when φ = 1 mixing in workplaces and other settings takes its lowest value, whereas when φ = 0 the mixing returns to pre-pandemic levels. Mixing within the school setting follows the prescribed opening and closing of schools.
For simplicity of notation, we write the sum of the four age-structured mixing matrices as β.
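As a rough illustration of how the four setting-specific matrices might be combined into a single matrix, the sketch below sums them after scaling the workplace and other settings by the level of precautionary behaviour. The linear interpolation, the `min_scale` floor, the on/off treatment of schools and the unchanged household matrix are all assumptions made for this example; the text does not specify the functional form used in the model.

```python
def combined_mixing(beta_h, beta_s, beta_w, beta_o, phi, schools_open, min_scale=0.3):
    """Sum four age-structured mixing matrices into one (illustrative scaling).

    phi = 0 leaves work and other mixing at pre-pandemic levels;
    phi = 1 reduces them to the assumed lowest level, min_scale.
    """
    if not 0.0 <= phi <= 1.0:
        raise ValueError("phi must lie between 0 and 1")
    # Assumed linear interpolation between pre-pandemic and lowest mixing.
    scale = 1.0 - phi * (1.0 - min_scale)
    # Assumed on/off school mixing following school opening and closing.
    school = 1.0 if schools_open else 0.0
    n = len(beta_h)
    return [
        [
            beta_h[a][b] + school * beta_s[a][b] + scale * (beta_w[a][b] + beta_o[a][b])
            for b in range(n)
        ]
        for a in range(n)
    ]


# Hypothetical 2 x 2 matrices (two age-groups) used purely as an example.
beta_h = [[1.0, 0.5], [0.5, 1.0]]
beta_s = [[2.0, 0.0], [0.0, 0.0]]
beta_w = [[0.0, 0.0], [0.0, 1.0]]
beta_o = [[0.5, 0.5], [0.5, 0.5]]
pre_pandemic = combined_mixing(beta_h, beta_s, beta_w, beta_o, phi=0.0, schools_open=True)
strict = combined_mixing(beta_h, beta_s, beta_w, beta_o, phi=1.0, schools_open=False)
```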
To ensure that we can replicate the long-term dynamics of infection, we allow the population to age. The ageing process occurs annually (corresponding to the new school year in September), with approximately one fifth of each age-group moving to the next oldest age cohort; small changes to the proportion moving between age-groups are made to keep the population size within each age-group constant.
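One simple way to picture the annual ageing step is to work with the share of each age-group in each epidemiological state, so that group sizes stay constant by construction. The sketch below is an assumption-laden simplification: it moves exactly one fifth of every group (rather than the slightly adjusted proportions described above), treats newborns as fully susceptible, and uses hypothetical state names.

```python
def age_one_year(proportions, move_fraction=0.2):
    """Apply one annual ageing step to per-age-group state shares.

    proportions[a] maps a state name to the share of age-group a in that
    state (shares sum to 1). A fraction move_fraction of each group is
    replaced by individuals arriving from the next youngest group.
    """
    aged = []
    for a, current in enumerate(proportions):
        if a == 0:
            # Assumption: individuals entering the youngest group are susceptible.
            incoming = {name: (1.0 if name == "S" else 0.0) for name in current}
        else:
            incoming = proportions[a - 1]
        aged.append({
            name: (1 - move_fraction) * current[name] + move_fraction * incoming[name]
            for name in current
        })
    return aged


# Hypothetical two-group example with susceptible (S) and recovered (R) shares.
before = [{"S": 0.5, "R": 0.5}, {"S": 0.2, "R": 0.8}]
after = age_one_year(before)
```

In this example the older group gains susceptibles from the younger one, which is the mechanism by which ageing gradually erodes immunity in school-age cohorts.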

Capturing Quarantining
One of the key characteristics of the COVID-19 pandemic in the UK has been the use of self-isolation and household quarantining to reduce transmission. We approximate this process by distinguishing between first infections (caused by infection related to any non-household mixing) and subsequent household infections (caused by infection due to household mixing). The first symptomatic case within a household (which might not be the first infection) has a probability (H) of leading to household quarantining; this curtails the non-household mixing of the individual and all subsequent infections generated by this individual.
In our notation, we let superscripts denote the first infection in a household (F), a subsequent infection from a symptomatic household member (SI) and a subsequent infection from an asymptomatic household member (SA); the first detected case in a household who is quarantined (QF) and all their subsequent household infections (QS). For a simple SEIR model (ignoring multiple E categories and age-structure) our extension would give:
This formulation has been shown to be able to reduce R below one even when there is strong within-household transmission, as infection from quarantined individuals cannot escape the household [?].

Spatial Modelling
Within England the model operates at the scale of NHS regions (East of England, London, Midlands, North East, North West, South East and South West). For simplicity and speed of simulation we assume that each of these regions acts independently and in isolation; we do not model the movement of people or infection across borders. In addition, the majority of parameters are regionally specific, reflecting different demographics, deprivation and social structures within each region. However, we include a hyper-prior on the shared parameters such that the behaviour of each region helps inform the value in others.

Variant Modelling
The model also captures the three main variants that have been responsible for most infections in England: the wildtype virus (encapsulating all pre-Alpha variants), the Alpha variant and the Delta variant. Each of these requires a replication of the infectious states for each variant type modelled. We assume that infection with each variant confers immunity to all variants, such that there is indirect competition for susceptible individuals. This competition is driven by the transmission advantage of each variant which is estimated by matching to the proportion of positive community PCR tests (Pillar 2 test) that are positive for the S-gene. The TaqPath system that is used for the majority of PCR tests in England is unable to detect the S-gene in Alpha variants, due to mutations in the S-gene. The switch from S-gene positive to S-gene negative and back to S-gene positive corresponds with the dominance of wildtype, Alpha and Delta variants. We infer the transmissibility of Alpha and Delta variants to be 52% (CI 35-71%) and 156% (CI 117-210%) greater than wildtype, respectively.
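The two transmissibility estimates are both expressed relative to wildtype, so the implied advantage of Delta over Alpha follows from a ratio of multipliers. The short calculation below uses only the central estimates quoted above; it does not propagate the credible intervals, and the variable names are introduced here for illustration.

```python
# Central estimates from the text, expressed as increases over wildtype.
alpha_increase = 0.52   # Alpha: 52% greater than wildtype
delta_increase = 1.56   # Delta: 156% greater than wildtype

# Convert percentage increases to multipliers of wildtype transmissibility.
alpha_multiplier = 1 + alpha_increase
delta_multiplier = 1 + delta_increase

# Implied transmissibility of Delta relative to Alpha (central estimates only).
delta_vs_alpha = delta_multiplier / alpha_multiplier
```

On these central estimates Delta is roughly 68% more transmissible than Alpha.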

Vaccination Modelling
We capture vaccination using a leaky approach, although non-leaky (all-or-nothing) models produce extremely similar results over the time-scales considered. The model replicates the action of:
• first and second doses of vaccine, at rates v_1 and v_2 respectively, that move susceptible individuals through to vaccinated states (V^S_1 and V^S_2) but have no impact on infected or recovered individuals;
• waning vaccine efficacy at rates ω_1 and ω_2, giving a two-step process from fully vaccinated to waned efficacy (in the equation below, for simplicity we assume everyone who gets a first dose of vaccine also gets a second, so that waning from state S_1 is unnecessary);
• waning immunity at rates Ω_1 and Ω_2, which are assumed to be slower than the waning of vaccine efficacy.
In the June model waning was not included, hence ω_1 = ω_2 = Ω_1 = Ω_2 = 0. The model also needs to capture the total number of individuals who have been given a first or second dose of vaccine (V_1 or V_2 out of a total population size of N) to ensure that only individuals that have not been vaccinated are offered a first dose, and only individuals that have been vaccinated once are offered a second dose.
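The role of the running totals V_1 and V_2 can be illustrated with a small allocation sketch. It assumes that a day's doses are spread evenly over everyone still eligible for that dose, so that a given state receives doses in proportion to its share of the eligible pool; this proportional rule, the function name and the numbers are assumptions for illustration and are not taken from the model equations.

```python
def doses_reaching_state(doses_today, state_size, eligible_total):
    """Number of today's doses that land in one epidemiological state.

    Assumes doses are spread evenly over all eligible individuals, so a
    state receives doses in proportion to its share of the eligible pool.
    """
    if eligible_total <= 0:
        return 0.0
    doses = min(doses_today, eligible_total)  # cannot vaccinate more than are eligible
    return doses * state_size / eligible_total


# Hypothetical numbers for one age-group.
n_total = 1000        # N: population size
given_first = 400     # V_1: individuals who have had a first dose
given_second = 100    # V_2: individuals who have had a second dose
susceptible_unvaccinated = 300
susceptible_one_dose = 150

# First doses are offered only to the N - V_1 individuals not yet vaccinated.
first_to_susceptible = doses_reaching_state(60, susceptible_unvaccinated,
                                            n_total - given_first)
# Second doses are offered only to the V_1 - V_2 individuals vaccinated once.
second_to_one_dose = doses_reaching_state(30, susceptible_one_dose,
                                          given_first - given_second)
```

The remaining doses go to infected or recovered individuals, for whom vaccination is assumed above to have no effect.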
For those in the classes where the vaccines generate protection (V^S_1, V^S_2 and W^S_1), the degree of protection is determined by the ratio of AstraZeneca (ChAdOx1) vaccine to mRNA vaccines (either Pfizer BNT162b2 or the Moderna COVID-19 vaccine) that has been given to that age-group (see Tables 1 and 2). If a vaccinated individual becomes infected, their probability of being admitted to hospital or dying (which normally depends only on age) is modified by the appropriate vaccine efficacy according to the ratio of the two vaccine types deployed within that age-group. Booster vaccinations are implemented by moving individuals from the vaccinated or waned class into the booster class where the level of protection is enhanced. Waning from the booster state is assumed to occur at a low rate comparable to that of recovered individuals (Fig. S2.1).
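A minimal sketch of how protection might be blended across vaccine types is given below. It assumes a simple weighted average by the share of each vaccine type in the age-group and a multiplicative (leaky) reduction in risk; the efficacy values and baseline risk are placeholders and are not the values in Tables 1 and 2.

```python
def blended_efficacy(share_az, efficacy_az, efficacy_mrna):
    """Average efficacy in an age-group, weighted by the vaccine mix given."""
    return share_az * efficacy_az + (1 - share_az) * efficacy_mrna


def leaky_risk(baseline_risk, efficacy):
    """Risk for a vaccinated individual under a leaky model of protection."""
    return baseline_risk * (1 - efficacy)


# Placeholder values for illustration only.
share_az = 0.25                 # assumed share of AstraZeneca doses in the age-group
efficacy = blended_efficacy(share_az, efficacy_az=0.80, efficacy_mrna=0.90)
admission_risk = leaky_risk(baseline_risk=0.02, efficacy=efficacy)
```

Under a leaky model every vaccinated individual has their risk reduced by the same factor, in contrast to an all-or-nothing model in which a fraction of individuals is fully protected and the rest are not protected at all.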

Parameter Inference
Key to the accuracy of any model are the parameters that underpin the dynamics. With a model of this complexity, a large number of parameters are required. Some, such as vaccine efficacy, are assumed values based on the current literature, while others are inferred from the epidemic dynamics. Of these inferred parameters there are three basic classes: those, such as scalings of the case-hospitalisation ratios, that differ between regions and variants; those, such as age-dependent susceptibility, that are universal (the same for all regions and variants); and the level of precautionary behaviour, which changes on a weekly time-scale. Bayesian inference, using an MCMC process, is applied to each of the seven NHS regions in England to determine posterior distributions for each of the regional parameters (further details are given in [52]). The distribution of parameters leads to uncertainty in model projections, which is represented by the 95% prediction interval in all graphs (this interval contains 95% of all predictions). We note that when we compare two scenarios (for example vaccination of 12-17 year olds, with no vaccination in this age-group) we compare simulations with the same parameters chosen from the posterior distributions, and then compute means and 95% prediction intervals based on these results.
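The paired comparison of scenarios can be sketched as follows. Each posterior draw is run under both scenarios, the difference is taken draw by draw, and the mean and 95% prediction interval are computed from those differences. The toy model, the evenly spaced parameter draws and the interpolation rule for quantiles are all assumptions made for this example.

```python
def quantile(sorted_values, q):
    """Quantile by linear interpolation between neighbouring sorted values."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def summarise_paired_difference(parameter_draws, run_model):
    """Mean and 95% prediction interval of admissions averted by vaccination.

    Both scenarios are simulated with the same parameter draw before the
    difference is taken, so parameter uncertainty is shared between them.
    """
    differences = sorted(
        run_model(p, vaccinate=False) - run_model(p, vaccinate=True)
        for p in parameter_draws
    )
    mean = sum(differences) / len(differences)
    return mean, quantile(differences, 0.025), quantile(differences, 0.975)


def toy_model(p, vaccinate):
    """Hypothetical stand-in for a full simulation; returns total admissions."""
    return (900.0 if vaccinate else 1000.0) * p


# Evenly spaced values standing in for draws from a posterior distribution.
draws = [k / 100 for k in range(1, 101)]
mean_saved, lower_saved, upper_saved = summarise_paired_difference(draws, toy_model)
```

Pairing the simulations in this way means the interval reflects uncertainty in the benefit of vaccination itself, rather than the much larger uncertainty in the overall size of the epidemic.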
As the epidemic has progressed, new posterior distributions based on the latest data are initialised from previous MCMC chains, ensuring a rapid fit to historical data. In general this refitting process has been performed weekly (or twice weekly) throughout the pandemic. We currently match to six observations: hospital admissions, hospital occupancy, ICU occupancy, deaths, the proportion of Pillar 2 (community) tests that are positive, and the proportion of Pillar 2 tests that are S-gene positive (as a signal of the ratio of wildtype to Alpha variant, then a signal of the ratio of Delta to Alpha variant, and more recently a signal of Omicron to Delta). (We note that in [52], which was written in the early stages of the pandemic, we did not fit to S-gene data as we had been dealing with a single variant.) Although not part of the underlying transmission dynamics, these six quantities for each region can be generated from the number, age and type of infection within the model. Observations and model results are compared by considering the likelihood of generating the observations assuming they are Poisson distributed (for numbers) or Binomially distributed (for proportions) with a mean given by the results of the deterministic model.
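The comparison between observations and model output can be sketched as a log-likelihood that sums Poisson terms for counts and Binomial terms for proportions. The function names and the way observations are bundled into tuples are assumptions made for this illustration; the model means are assumed to be strictly positive and the model proportions to lie strictly between 0 and 1.

```python
import math


def poisson_log_pmf(observed, mean):
    """Log-probability of an observed count under a Poisson with the given mean."""
    return observed * math.log(mean) - mean - math.lgamma(observed + 1)


def binomial_log_pmf(successes, trials, probability):
    """Log-probability of a number of successes under a Binomial distribution."""
    log_choose = (math.lgamma(trials + 1) - math.lgamma(successes + 1)
                  - math.lgamma(trials - successes + 1))
    return (log_choose + successes * math.log(probability)
            + (trials - successes) * math.log(1 - probability))


def log_likelihood(count_data, proportion_data):
    """Total log-likelihood of the observations given deterministic model output.

    count_data: iterable of (observed count, model mean), e.g. daily admissions.
    proportion_data: iterable of (positives, tests, model proportion).
    """
    total = sum(poisson_log_pmf(obs, mean) for obs, mean in count_data)
    total += sum(binomial_log_pmf(k, n, p) for k, n, p in proportion_data)
    return total


# Hypothetical observations paired with hypothetical model predictions.
example = log_likelihood(
    count_data=[(3, 3.0), (0, 2.0)],
    proportion_data=[(1, 2, 0.5)],
)
```

Working with log-probabilities avoids numerical underflow when many daily observations across several data streams are combined.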

Comparison between Model and Data for Age-structured Hospital Admissions
In Fig. 2 of the main text we compare the two models with data on age-dependent hospital admissions to highlight that we capture general trends, especially in terms of the scale of admissions for the different age-groups. Here we extend this comparison, and plot data and model results on the same figure for six different age ranges (Fig. S2.2). We note that the age-groups for which hospital admissions are reported (as shown in Fig. 2 and Fig. S2.2) do not necessarily correspond with the 5-year age bins used within the simulations. We therefore assume homogeneity within each age-group of the model (i.e. all individuals aged 0 to 4 have the same mean risk of infection and hospital admission). This means that when computing the expected number of daily admissions for those aged 6-17 we sum 80% of the second age-group in the model (age 5-9), 100% of the third age-group (age 10-14) and 60% of the fourth age-group (age 15-19). This homogeneity assumption, which implies (i) equal population size in each one-year cohort and (ii) equal risk of severe illness across each five-year age-group, is likely to have minimal impact.
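The re-binning described above amounts to weighting each 5-year model age-group by the fraction of its one-year cohorts that fall inside the reported age range. A minimal sketch, assuming inclusive integer age ranges and leaving out the open-ended 100+ group, is given below; the function names are introduced here for illustration.

```python
def bin_weights(age_from, age_to, bin_width=5, n_closed_bins=20):
    """Fraction of each model age-group inside the inclusive range [age_from, age_to].

    Bins are indexed from 0 (ages 0-4). The open-ended 100+ group is not
    handled in this sketch. Assumes equal-sized one-year cohorts within a bin.
    """
    weights = {}
    for b in range(n_closed_bins):
        low = b * bin_width
        high = low + bin_width - 1
        overlap = min(high, age_to) - max(low, age_from) + 1
        if overlap > 0:
            weights[b] = overlap / bin_width
    return weights


def admissions_for_range(model_admissions, age_from, age_to):
    """Expected admissions in a reported age range from per-bin model output."""
    weights = bin_weights(age_from, age_to)
    return sum(weight * model_admissions[b] for b, weight in weights.items())


# Ages 6-17 cover 4/5 of the 5-9 bin, all of 10-14 and 3/5 of 15-19.
weights_6_to_17 = bin_weights(6, 17)
# Hypothetical model output: 10 admissions per day in every age-group.
example_total = admissions_for_range([10.0] * 21, 6, 17)
```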
It should be noted, as described above, that the inference process uses only aggregate (non-age-structured) data to determine the match between the model and the unfolding epidemic. The exception is the risk of hospital admission and death, which for each variant of concern are matched to the total reported for each age-group. In general we see that the June model tends to overestimate the number of hospital admissions for the younger age-groups (from July to October 2021), but underestimates the number of admissions in 18-84 year olds. This leads to the generally lower levels of total admissions seen in Fig. 2.
Fig. S2.2: Comparison between models and data for age-structured hospital admissions. The June model is shown in red, the November model in blue and the data as black dots. The shaded area shows the 95% prediction interval (i.e. it contains 95% of all predictions at each point in time); for the November model this simply captures parameter uncertainty but for the June model this also captures the seven different assumptions about precautionary behaviour and the return to pre-pandemic behaviour.