The impact of local vaccine coverage and recent incidence on measles transmission in France between 2009 and 2018

Background Subnational heterogeneity in immunity to measles can create pockets of susceptibility and result in long-lasting outbreaks despite high levels of national vaccine coverage. The elimination status defined by the World Health Organization aims to identify countries where the virus is no longer circulating and can be verified after 36 months of interrupted transmission. However, since 2018, numerous countries have lost their elimination status soon after reaching it, showing that the indicators defining elimination may not be associated with lower risks of outbreaks. Methods We quantified the impact of local vaccine coverage and recent levels of incidence on the dynamics of measles in each French department between 2009 and 2018, using mathematical models based on the “Endemic-Epidemic” regression framework. After fitting the models using daily case counts, we simulated the effect of variations in the vaccine coverage and recent incidence on future transmission. Results High values of local vaccine coverage were associated with fewer imported cases and lower risks of local transmissions, but regions that had recently reported high levels of incidence were also at a lower risk of local transmission. This may be due to additional immunity accumulated during recent outbreaks. Therefore, the risk of local transmission was not lower in areas fulfilling the elimination criteria. A decrease of 3% in the 3-year average vaccine uptake led to a fivefold increase in the average annual number of cases in simulated outbreaks. Conclusions Local vaccine uptake was a reliable indicator of the intensity of transmission in France, even if it only describes yearly coverage in a given age group, and ignores population movements. Therefore, spatiotemporal variations in vaccine coverage, caused by disruptions in routine immunisation programmes, or lower trust in vaccines, can lead to large increases in both local and cross-regional transmission. The incidence indicator used to define the elimination status was not associated with a lower number of local transmissions in France, and may not illustrate the risks of imminent outbreaks. More detailed models of local immunity levels or subnational seroprevalence studies may yield better estimates of local risk of measles outbreaks. Supplementary Information The online version contains supplementary material available at 10.1186/s12916-022-02277-5.


Background
Immunity against infectious diseases accumulates following infection and, if a vaccine is available, routine immunisation programmes and vaccination campaigns. Measles is highly infectious and can cause large outbreaks in populations with low immunity [1,2]. Therefore, high levels of vaccine coverage are required to minimise the risks of outbreaks [3]. Furthermore, vaccine uptake must be homogeneously high across the territory to avoid local transmission sustained by regional discrepancies [4,5]. The large-scale implementation of routine immunisation programmes led to a drastic reduction in measles cases worldwide, and measles was targeted for elimination in five World Health Organization (WHO) Regions by 2020 under the Global Vaccine Action Plan 2011-2020 [6].
Elimination status, as defined by the WHO, refers to "the absence of endemic measles transmission for more than 12 months in the presence of a well-performing surveillance system" in a given country or region and is verified "after 36 months of interrupted endemic measles virus transmission" [7]. Although imported cases, or cases directly related to importations could still be expected, there should be no continuous transmission persisting over a long period of time in a region where measles was eliminated. A given WHO region can declare measles eliminated when all countries in the region document interruption of endemic transmission for more than 36 months.
Recently, several countries had their elimination status revoked following large outbreaks less than 5 years after it was verified. For instance, the UK achieved elimination in 2017, and lost the status in 2019 along with Albania, Czechia, Greece, Venezuela, and Brazil [8,9]. In these countries, interruption of transmission during a few years was not indicative of reduced risks of major outbreaks. Such occurrences can be explained by several factors, such as a replenishment of susceptible individuals after years without transmission, or importations of cases into subnational areas with lower levels of immunity caused by heterogeneity in vaccine coverage [10][11][12][13]. The number and geographical distribution of the susceptible individuals is not routinely monitored in most countries given the perceived cost and logistical challenges of large serological surveys, yet it is a main predictor of outbreak risk [3]. Local values of vaccine coverage can be an alternative measure of heterogeneity, but they are not always available and can be outdated because of the mobility between regions. Furthermore, they only describe vaccine-induced immunity, and therefore ignore the immunity caused by previous outbreaks. In this study, we aim to (i) estimate the impact of recent local transmission and local vaccine coverage on the current risk of outbreaks, and the changes in transmission dynamics that would result from variations in these factors, and (ii) identify areas most at risk for local transmission using France as a case study.
To do so, we implemented an Endemic-Epidemic timeseries model using hhh4, a framework developed by Held, Höhle and Hofmann to study the separate impact of covariates on importation, cross-regional transmission and local transmissions on aggregated case counts [14,15]. We adapted this framework to daily case counts and applied it to the daily number of measles cases per department (NUTS3 levels) in France reported to the European Center for Disease Prevention and Control (ECDC) between January 2009 and December 2018. We computed the average values of vaccine uptake and the number of cases per department in the past 3 years to mimic the timeframe used to define the elimination status, and modelled their impact on the local risks of outbreaks.

Description of the hhh4 framework
We used the modelling framework implemented in the "hhh4" model, which is part of the R package "surveillance" [15], to analyse infectious disease case counts. All the notations are defined in Table 1. The expected number of cases (μ i, t ) reported in the region i at time Table 1 Table of  Since the endemic component does not depend on Y t − 1 , it represents the background importations that cannot be linked to the mechanistic components. Therefore, these cases either correspond to importations from outside the modelled area (France in our case), or cases that are not otherwise predicted by the other two components.
The full equation for the expected number of cases in region i at time t in an Endemic-Epidemic model is as follows: The number of observed cases at t in i Y i, t follows a negative binomial distribution of mean μ i, t [16]. The overdispersion parameter ψ is estimated. The predictors λ i, t , φ i, t and ν i, t are independently impacted by different covariates, which means that, for instance, a covariate may be associated with a reduction of importations while having little impact on local transmission.
The predictors λ i, t , φ i, t and ν i, t are estimated using log-linear regressions (as a standard Generalised Linear Model). For each predictor, we estimate (i) the intercept α (identical across spatial units) and (ii) the vector of (1) coefficients β associated with z i, t the vector of covariates at t in i included in each component. The parameters α and β are estimated using a (penalised) maximum likelihood approach [15].

Data
The observed case counts Y i, t was computed from 14,461 cases (10,988 confirmed and 3473 probable cases) routinely collected in metropolitan France and reported to the ECDC between January 2009 and December 2018 (Fig. 1A). This data was retrieved on The European Surveillance System (TESSy) on 22 January 2019. The cases were stratified by the metropolitan department they were reported in. The department correspond to French NUTS3 regions. We excluded three cases where this information was not available. We used the date of symptom onset reported for each case to compute the daily number of cases from 2009 to 2018 per department.

Adaptation of hhh4 to daily case counts
In hhh4, the average number of new cases stemming from the autoregressive and neighbourhood components depends on the number of cases at the previous time step. Therefore, if we use daily case counts, the number of cases at t is only impacted by the number of cases the day before. In reality, however, the serial interval of measles is estimated to be 11 days on average [17]. Previous studies using hhh4 relied on temporally aggregated case counts, which partially solved this problem: if the time step is close to the average serial interval, cases of the same generation of transmission can be assumed to be roughly grouped together in the same time point [18]. Nevertheless, studying weekly (or fortnightly) aggregated case counts does not reflect the distribution of the serial interval (i.e. it ignores overlapping generations of transmission because of shorter or longer delays between primary and secondary cases). This can lead to directly connected cases being grouped in the same time step, or separated by more than one time step. This aggregation also ignores the potential for unreported cases, which may lead to cases causing transmission 2 to 3 weeks after their onset date via an intermediate, unobserved case.
Finally, the starting date of aggregation influences how cases are grouped, which can lead to discrepancies in the parameter estimates.
Recent developments in the surveillance package included weight estimation to represent the relative impact of previous time steps on the number of cases at t [19]. Since we are using daily case counts, we set the weights of the different time steps from the distribution of the serial interval. We computed Y′ it , the transmission potential for each department and time step, by multiplying the number of recent cases by the distribution of the serial interval f(t): . Only a subset of measles cases are reported to the surveillance system [20]; therefore, we accounted for the risks of unreported cases by computing a composite serial interval from three different transmission scenarios ( Fig. 1B):

1-In case of direct transmission between two cases
i and j, the number of days between the two cases f 1 (t) follows a normal distribution truncated at 0: f 1 (t)~N(11.7, 2) [17]. 2-In case of unreported cases between i and j, the number of days between the two cases f 2 (t) follows a normal distribution truncated at 0: f 2 (t)~N (23.4, √8).
This distribution corresponds to the convolution of f 1 (t) with itself. 3-If i and j share the same unreported index case, the number of days between i and j (with i being the first reported case) follows a half-normal distribution (excluding 0) of standard deviation √8 days. This distribution corresponds to the distribution of the difference of f 1 (t) with itself, excluding values below 1. We added this last scenario to account for multiple concurrent importations stemming from an unreported infector.
We considered that 50% of the composite serial interval reflected direct transmission (scenario 1, without missing generations between cases), and 50% came from the two scenarios with unreported cases (scenarios 2 and 3, 25% each). The distribution of the composite serial interval is shown in Fig. 1B. We ran sensitivity analysis to estimate the parameters of the model using composite serial intervals computed with different proportions of direct transmission, and observed it had little influence on the estimation of each parameter (Additional file 1: Section 1). The potential for transmission Y′ replaces Y in the equation of the expected number of cases (5):

Connectivity between departments
In the hhh4 framework, the average number of cases caused in the department i at time t by cases from another department j is quantified by the neighbourhood component. It is equal to φ i, t × ω ji × Y j, t − 1 (Eq. 1). Therefore, the number of cases caused by cases from j in i is influenced by three factors: • The susceptibility of the department i, quantified by the neighbourhood predictor φ i, t , defined as it . • The number of connections from j to i, calculated using an exponential gravity model [21], whereby the number of connections between i and j is proportional to the product of the number of inhabitants in the department of origin m j , the department of destination m i and an exponential decrease in the distance between i and j d ji . Therefore, the number of connections from j to i in an exponential gravity model is calculated as w ji = e −δd ji m ǫ it m γ jt . • The proportion of the population in j that is infectious.
Therefore, the average number of cases expected from department j to department i at t can be written as the product of these three factors: Therefore, the log-population log(m it ) was added as a covariate of the predictor φ, and the connectivity matrix ω was computed as ω = The number of inhabitants in each French department between 2009 and 2018 was retrieved from the INSEE website [22].
We implemented two models, using different methods to compute the connectivity between departments.
1. In Model 1, every department can be connected to each other; therefore, only importations coming from outside the departments included in the study fall into the endemic component. The connectivity matrix was computed using the distance between the population centroids of each depart- ment, which were calculated using the 1 km 2 European Grid dataset [23]. This dataset contains the number of inhabitants in each grid cell covering the country (resolution 1 km). We computed the weighted population centre in each department using the R function zonal from the package raster [24] and calculated the distance between population centres.
2. In Model 2, the neighbourhood component only takes into account transmission between neighbouring departments, assuming that cross-regional transmissions between non-neighbouring departments would be captured by the baseline number of daily importations (i.e. the endemic component): Therefore, the neighbourhood component in Model 1 includes both the neighbourhood component and part of the endemic transmission in Model 2.

Covariates
Different covariates can be added in each component of the hhh4 framework [25]. We implemented the same set of covariates in the two models. The two covariates of interest were the impact of vaccine coverage and the category of incidence in each department in the past 3 years. We chose this timeframe in order to match the requirements of the elimination status assessment. We also included the number of inhabitants, the surface area of each department, and the seasonality as control variables, as explained below:

Vaccine coverage
For each department i and time step t, we computed u i, t , the average proportion unvaccinated in the department i over the 3 years prior to t according to local m jt * Y j,t−1 if i and j share a border 0 otherwise coverage reports. We averaged over the past three years in order to use the same timeframe as the elimination status assessment. We used the yearly first dose uptake among 2-year-old children in each French department between 2006 and 2017. This data is publicly available on the website Santé Publique France [26][27][28]. The uptake of the second dose was not reported before 2010, and many departments had missing entries after 2010. Therefore, only the local coverage of the first dose was used in the model.
Since 26% of the entries in the coverage dataset were missing, we ran a beta mixed model to infer the missing values [29]. We used the time and squared time (in years) as covariates, and random effects stratified by department. We used the average prediction to infer the missing values from the fitted model and get the complete vaccine coverage dataset. More details on the regression, and the sensitivity analyses that were run are presented in the Appendix (Additional file 1: Section 2). All values of coverage in 2009 were missing, and were not imputed; we computed the average vaccine coverage in 2010, 2011 and 2012 using only two of the three previous years.
Adding the log-proportion of unvaccinated to the model was the most appropriate approach, since it allows the rate of disease spread (i.e. the value of the predictors λ, ν and φ) to be proportional to the density of susceptibles [25]. Therefore, we calculated the average log-proportion of unvaccinated in the three years before t and added it as a covariate in all three components.

Impact of recent incidence
This covariate quantifies the impact of past outbreaks on current transmission. Departments are eligible for WHO certification of elimination status if they have maintained low levels of transmission over the past 3 years [7]. Therefore, we computed n i, t , the number of cases per million reported between a month and 3 years before t in i. We excluded cases reported in the last month since recent cases may be directly linked to current transmission.
We aggregated n i, t in three categories: major outbreak reported in the department in recent years. The threshold of 45 cases per million corresponds to the last tercile of n i, t , hence 33% of n i, t fall into this last category. The two thresholds correspond to an annual incidence of respectively 3.3 and 15 reported cases per million. Sensitivity analyses were ran using other incidence thresholds, or incidence as a continuous variable, and showed the estimates were robust (Additional file 1: Section 3). In every model, the category of reference for the level of recent incidence was N (0) i,t .Computing the level of recent incidence required the number of cases per department in the past 3 years. Therefore, since this analysis integrates case counts data from 2009, we needed to compute the incidence in each department between 2006 and 2008. Less than 50 cases were reported in France per year in 2006 and 2007 [30], we considered their contribution to the recent level of incidence per department was null. On the other hand, 597 measles cases were reported to the ECDC in France in 2008, but were not stratified by department. However, the number of cases reported per department in 2008 was publicly available on Sante-Publique-France [31], without information on the individual onset date. Therefore, we integrated the number of cases reported per department in 2008 in the computation of N i, t for t < 2012.
The level of recent incidence was a covariate in all three components.

Number of inhabitants and surface area
In the subsection "Connectivity between departments", we discussed the impact of the number of inhabitants on the number of movements between departments. Furthermore, several studies have indicated a potential association between the population density and the number of secondary transmissions [32][33][34]. Therefore, we controlled for the impact of the number of inhabitants in each department, and the surface area (i.e. the geographical size) on the number of local transmissions.
The log-number of inhabitants log(m i, t ) in the department i at time t was added as a covariate in all three components. The log-surface of the department log(s i, t ) was added as a covariate in the autoregressive component.

Seasonality
We control for the impact of the seasonality of measles outbreaks in France on transmission by adding two covariates (sine-cosine) to all three components.

Full model equations for predictors
The covariates are all integrated in the covariate vectors in Eqs. 2, 3 and 4, yielding:

Model calibration
A model is deemed well-calibrated if it is able to correctly identify its own uncertainty in making predictions [35,36]. The most straightforward method to evaluate whether hhh4 models are well-calibrated is to generate a one-stepahead forecast over a chosen test period and compare them with the data [15]. Since we use daily case counts, this method would only assess the ability of the models to capture the number of cases on the next day. We explored the calibration of our models several days ahead. To do so, we selected the last 2 years of data as the test period, fit the model up to each day, and simulated the number of cases over the next 3, 7, 10 and 14 days for each day of the test period in each department. For each date, we ran at least 100,000 simulations. If the number of cases observed in the data had not been generated in 100,000 simulations, we ran simulations until it was reached.
From these simulations, we generated the predictive probability distribution at each time step in each department. The probability integral transform (PIT) histogram were used to illustrate whether the forecasts and uncertainty of the model were correctly calibrated to the data. In a model with perfect calibration, the actual number of cases follows the predictive probability distribution (μ it~Pit for all predictive distributions P it ), i.e. the PIT histogram is uniform (e.g. 10% of the actual values fall into the first decile of the forecasts). We computed the PIT histograms in both models for predictions over 3, 7, 10 and 14 days. The PIT histograms were computed using a non-randomised yet uniform version of the PIT histogram correcting for the use of discrete values described in Czado et al. [37] and implemented in hhh4.
The PIT histograms were used to estimate whether the short-term forecasts were in line with the data, and whether the models were consistently missing some scenarios of transmission.

Simulation study
In order to highlight the impact of variations in the local vaccine coverage or the level of recent transmission on the risks of Endemic predictor : outbreaks, we generated simulations of the number of cases in France across 1 year under different conditions. To compute these simulations, we used the last values of average vaccine coverage (the average was computed from the values in 2015, 2016 and 2017) and the levels of recent incidence in mid-2018, and simulated the daily number of cases between the 1st of August 2018 and the 31st of December 2019. We started the simulations during the period of the year associated with the lowest number of cases so that the spatial distribution of the simulated outbreaks does not depend on the distribution of cases at the starting point. Indeed, if we had used the last 3 months of data (until November 2018), some departments may have been repeatedly associated with higher numbers of cases in our simulations, not because they are more at risk of importation or transmission, but because there had been cases reported in these departments at the beginning of the epidemic year. We were only interested in highlighting the impact of variations in coverage and recent transmission, rather than predicting the level of transmission for the entire year of 2019. We generated 100 samples of the regression coefficients using the variance-covariance matrix and assumed they followed a multivariate normal distribution. For each sample, we computed the values of the three predictors between the 1st of August 2018 and the 31st of December 2019, and simulated the daily number of cases in each department across the year. We ran 100 simulations per sample (i.e. 10,000 simulations were generated per scenario).
We studied four scenarios: (i) using the latest local values of coverage (averaged over the past three years), population and category of recent incidence, (ii) increasing the vaccination coverage in each department by 3%, (iii) decreasing the vaccination coverage in each department by 3% and (iv) setting the recent incidence in each department to minimal levels, in order to observe the outbreaks generated under conditions fulfilling the current WHO elimination status requirements.
Finally, since tourism and local events can lead to mass gatherings and trigger repeated importations independent of parameters included in the model [38,39], we studied the impact of repeated local importations of cases into specific departments. To do so, we simulated one year of transmission (i.e. until the end of 2019) following the importations of 10 cases in a given department in December 2018. In these simulations, we did not allow for any other baseline importations throughout the year, in order to assess the potential for geographical spread throughout the country after importation in one department.

Impact of the covariates on each component
The parameter estimates obtained in both models are shown in . This indicates that these departments were at higher risks of background importations, and secondary transmission upon importation. In both components, the effect of vaccination was slightly stronger in Model 2, where cross-regional transmission is restricted to neighbouring departments, than in Model 1, where cross-regional transmission can happen between all departments, although the confidence intervals overlapped. In Model 1, the proportion unvaccinated also had an aggravating effect on the number of cross-departmental transmissions (0.47 [0.23-0.71]), whereas in Model 2 there was no clear association between the proportion unvaccinated and an increase in cross-regional transmission (− 0.02 [− 0.29-0.25]). The differences between the models' coefficients were due to the crossregional transmission in Model 1 corresponding to both the neighbourhood component and some of the endemic transmission in Model 2.
The association between the level of incidence over the past 3 years (parameters: incid 1 and incid 2 in Fig. 2) and the components of transmission was similar in both models. In the autoregressive component, departments that reported high incidence over the past 3 years (incid 2) were associated with fewer secondary cases per case in the department ( . This could be linked to outbreak-induced immunity causing a depletion of susceptibles in departments where incidence was high over the past few years. On the other hand, departments with high incidence in the past 3 years were at higher , meaning departments that recently reported moderate levels of transmission were associated with more background importations, but no difference was noticeable in cross-regional or withinregion transmission. The effect of increasing the level of recent incidence on the risks of cross-regional transmission and background importation was opposite to its impact on the autoregressive component. Several mechanisms could explain this result: Although the depletion of susceptibles should decrease the risks of importations, different categories of population may be involved in importation and local transmission (e.g. different age groups). Furthermore, regions that report high levels of transmissions may be more connected than others, therefore having relatively higher risks of importation, and offsetting the impact of a depletion of susceptibles.
The other covariates included in the model showed that the number of inhabitants in a department had an important impact on both the endemic and neighbourhood components: departments with more individuals were more likely to report background importations and cross-regional transmission, and slightly more likely to report local transmission. We also observed a strong impact of seasonality on the three components (Fig. 2). Indeed, the peak values of the predictors were 20 to 37% higher than the average value in all components of transmission (Additional file 1: Section 4). The peak of the autoregressive component was in February for both models, the endemic peak was in May for Model 1 (April in Model 2), whereas the neighbourhood component peaked in December in Model 1 (March in Model 2).
Using the mean parameter estimates, and the latest values of vaccination coverage, incidence and number of inhabitants per department, we computed the local predictors φ i , λ i and ν i in both models to highlight the spatial heterogeneity of the transmission risks (Fig. 3). The predictors were computed ignoring the impact of seasonality, which is not region-dependent and does not change the geographic distribution of risks. The maps correspond to the average local value of the predictors the year following the last data entry (i.e. the 30th of November 2018). The geographic distributions of the autoregressive predictor are similar in Model 1 and Model 2. This indicates that the same departments were classified as having higher risks of local transmission in both models. Areas with lower values of vaccine uptake such as the South East and South West of France were associated with higher risks of secondary transmission. Indeed, the highest values of within-region transmission were reported in Bouches-du-Rhône and Var (in the South East of France). Populous departments in the North of France were also at risk of secondary transmission despite higher vaccination coverage.
As expected, the overall number of baseline importations (i.e. connected to the endemic component) in Model 1 was lower than in Model 2, which was compensated by a higher number of cross-regional transmissions (i.e. due to the neighbourhood component) (Fig. 3). This shows that some of the cases that could not be linked to local transmission, or transmission between neighbouring departments in Model 2, were classified as crossregional transmissions in Model 1, which would indicate long-distance transmission events. In both models, departments with a higher number of inhabitants were most at risk of cross-regional and baseline importations, which corresponds to the strong association between the number of inhabitants and the endemic and neighbourhood components highlighted in Fig. 2. Departments like Bouches-du-Rhône that combine a high number of inhabitants with low vaccine coverage were associated with the highest number of baseline and crossregional importations in both models. The variations in the autoregressive component were smaller than in the importation-related components: For instance, the highest autoregressive predictor value (Var 0.81 [0.74-0.88]) was 35% higher than the lowest value (Lozère 0.60 [0.53-0.66]) in Model 1, whereas the number of baseline importations in Bouches-du-Rhônes was more than 100 times above the number of importations in Lozère (South of France). This can be explained by the coefficients of the autoregressive components being much closer to 0 than the most extreme coefficients in the importation-related components (Fig. 2).

Model fit and calibration
The fits of Model 1 and Model 2 indicate that they were able to match the transmission dynamics observed in France between 2009 and 2017, despite the wide variations in the annual number of cases (Fig. 4A, B, Additional file 1: Section 5). In years where active transmission was reported, most of the cases stemmed from the autoregressive component, indicating that the local outbreaks were sustained by transmission within the departments. Indeed, across all years, the autoregressive component accounted for 72.9% of the cases, whereas 23.7% of the cases came from cross-regional transmission and 3.4% from the endemic component (Additional file 1: Fig. S14). This shows that in Model 1, 96.6% of the cases were explained by the transmission stemming from other cases reported in the dataset (93.2% in Model 2). The endemic component described the minority of isolated cases that could not be linked to any concurrent transmission cluster. Therefore, these cases would be more likely to be reported at times of low national levels of transmission when no other case could be linked to them, which explains the shift in seasonality of the endemic component ( Fig. 2 and Additional file 1: Section 4).
In order to visually assess the calibration of the model, and its ability to provide reliable short-term predictions for the number of cases per department, we generated PIT histograms showing the probability integral transform obtained when forecasting the number of cases 3, 7, 10 and 14 days ahead (Fig. 4C-F). The PIT histogram is uniform for predictions 3 and 7 days ahead (all groups are above 0.95 and below 1.05), which shows the number of occurrences where the predictions of the model did not capture the number of cases 1 week ahead was not higher than expected under a uniform distribution. As we increased the number of days of forecast, there were more occurrences of the model mis-predicting the number of cases to come. Indeed, the U-shape observed in panel F of Fig. 4 indicates the model was less capable of identifying extreme events 2 weeks in advance. The calibration study indicated that Model 2 was more prone to under-estimating the number of cases than Model 1 and showed signs of bias for the 7-, 10-and 14-day predictions (Additional file 1: Section 5). The national number of cases predicted by Model 1 and Model 2 were similar, and match the data for predictions 7 days ahead (Additional file 1: Fig. S13). The AIC scores and the calibration study indicated Model 1 was able to fit the data better than Model 2 and was better calibrated. The rest of the "Results" section therefore focuses on the conclusions

Impact of vaccination and recent incidence on onwards transmission
In order to illustrate the impact of recent outbreaks and variations in vaccine coverage on the transmission dynamics in France, we generated 10,000 simulations and computed the number of cases per department in 2019. We ran the simulations from August 2018 (during the historically low transmission season), until 31 December 2019. We generated four sets of simulations under different initial conditions: using the last measures of average local vaccine coverage, category of recent incidence and number of inhabitants; increasing or decreasing the vaccine coverage by 3%, and setting the category of recent incidence to 0 in each department.
Under the latest measures of coverage and incidence (Additional file 1: Section 6), the simulated outbreaks display a wide variation in the number of cases in 2019 (minimum 100 cases, median 1100 cases, maximum 11,100 cases). Active transmission was generated in a wide range of departments. Indeed, across the simulation set, 44 of the 94 French departments reported more than 10 cases in at least 25% of the simulations. There was noteworthy spatial heterogeneity in the levels of incidence. Indeed, in 12 departments, there was no case generated in more than half of the simulations (Fig. 5, top right panel). The departments most vulnerable to active transmissions were highly populated urban areas, such as Paris, the Bouches-du-Rhône, and the North of France. Because they are highly populated, these departments were susceptible to repeated importations (they reported at least 1 case in more than 95% of the simulations), which could then cause large transmission clusters. This was especially evident in the South East of France, where we highlighted that the number of secondary cases per case in the department was among the highest in the country (Figs. 3 and 5). Numerous departments were affected by large outbreaks in a subset of the simulated datasets: 27 departments reported more than 50 cases in at least 5% of the simulations (Fig. 5). Further, at least one major outbreak was generated in the majority of the simulations: in 55% of the simulations, one department  Percentage of simulations where the number of cases reported in each department in 2019 was at least 1, 10 and 50 cases for each scenario using parameter estimates from Model 1. Each row corresponds to a different scenario: (i) reference, (ii) minimum level of recent incidence in each department, (iii) local vaccine coverage decreased by 3% in each department, (iv) local vaccine coverage increased by 3% in each department reported more than 100 cases (the most commonly affected department were Paris and its surroundings, the Nord, and Bouches-du-Rhône).
Decreasing the average 3-year vaccine coverage by 3% led to an important increase in the number of cases per outbreak (median 4900 cases, more than 95% of the simulations resulted in more than 1000 cases). This was first due to an increase in importations and cross-regional transmission: all 94 departments had at least one case in more than half of the simulations, 77 in at least 90% of the simulations. Furthermore, the decrease in vaccination coverage resulted in higher chances of uncontrolled transmissions in many departments (Fig. 5, third  row). On the other hand, increasing the vaccine coverage by 3% caused an important drop in the number of cases (median 605 cases, 80% of the simulations generated less than 1000 cases), caused by both a decrease in the number of importations, and in the potential for secondary transmission following importations. Although outbreaks were still punctually generated, these events are much rarer than in the other two simulation sets: in 25.8% of the simulations, at least one department generated more than 100 cases (54.1% with the baseline scenario, 95.4% when we reduced the local vaccine coverage).Finally, we set the local recent incidence to the minimum level in each department in order to analysis what transmission dynamics would be generated by the model if the elimination requirements were fulfilled in every department. We observed a decrease in the number of importations and cross-regional transmission, and an increase in the number of infections within each department (Fig. 2). In this simulation set, the number of departments where no cases were generated in more than half of the simulations was similar to when the vaccine coverage was increased (24 departments in this simulation set, 29 when the vaccine coverage was increased, Fig. 5), which shows the reduction in the number of cross-regional transmission and background importations. Conversely, the number of large outbreaks was only marginally inferior to the reference simulation set: in 44% of the simulations, there were more than 100 cases generated in at least one department (54% in the reference dataset). The geographical distribution of the risks of large outbreaks was almost identical to the reference simulation set (Fig. 5). Therefore, although the number of importations was reduced, changing the level of recent incidence did not have a clear impact on the risks of active transmission. More departments became vulnerable to secondary transmission, and despite importations in these departments being rarer, they were more likely to lead to large outbreaks when they happened. The two opposing effects recent incidence had on importation and transmission therefore created a different dynamic of transmission observed in the simulation set, without strongly reducing the risks of outbreaks.
Each of these simulation sets highlighted the wide range of scenarios that could be generated using the parameter distributions inferred by our model. In order to gain more understanding on the spatial spread and consequences of importations, we then explored the impact of repeated and localised importations on overall transmission.

Impact of local clusters of transmission
Since the endemic component, which can be interpreted as external importations, represented a minority of the cases in our model (Additional file 1: Fig. S14), repeated importations in a given department over a short timespan rarely occurred in the simulations. Furthermore, due to the seasonality of the endemic component, fewer importations are generated early in December to February, which corresponds to the peak period of the other components, and would therefore be more likely to cause secondary transmissions (Additional file 1: Section 4). We simulated 1 year of transmission following ten importations in December 2018 to illustrate (i) the potential for local outbreaks and (ii) the spatial spread of transmission following repeated local importations. We selected four departments to compare the impact of repeated importations in a range of settings: Paris (many inhabitants, 91% vaccine coverage, surrounded by urban areas), Bouchesdu-Rhône (many inhabitants, 84% vaccine coverage), Haute Garonne (many inhabitants, 91% vaccine coverage but high levels of recent incidence, surrounded by rural areas with lower vaccine coverage), and Gers (Rural area, 79% vaccine coverage) (Fig. 6).
Firstly, major local outbreaks in the department of importation were generated in all four simulation sets, and especially in Paris and Bouches-du-Rhône, where the proportion of simulations that yielded more than 100 subsequent cases in the department was 40% and 39%, respectively. In the Bouches-du-Rhône, large outbreaks were mostly due to the low vaccination coverage, whereas in Paris, outbreaks were mostly linked to the connectivity to nearby areas and the high number of inhabitants, which meant the department was likely to attract crossregional transmissions. Major local outbreaks were rarer in the other two scenarios (9% of simulations above 100 in Haute Garonne, 10% in Gers). The lower proportion of large outbreaks resulted from different factors: recent large outbreaks in Haute Garonne reduced the autoregressive predictor, lowering the number of secondary cases per case imported, whereas since Gers is a rural department, with a low number of inhabitants, almost all the local cases were due to local transmission Fig. 6 Percentage of simulations where the number of cases reported in each department in 2019 was at least 1, 10 and 50 cases following the importations of ten cases in December 2018, and using the parameter estimates from Model 1. For each row, the department of importation is indicated by a black dot (autoregressive component), with very few cross-regional transmissions into Gers.
Conversely, the simulations where cases were imported in Gers yielded the largest spatial spread throughout the country: the median number of departments that reported at least 1 case was 53 (16 when the importations were generated in Haute Garonne; 15 in Bouchesdu-Rhône; 39 in Paris). As stated in the method, the number of cross-regional transmissions is the product of the predictor and the connectivity matrix, divided by the number of inhabitants in the department of origin, to represent that only a fraction of commuters will be infected. Therefore, populous areas are more likely to attract cross-regional transmissions, whereas more rural departments are more likely to seed outbreaks in other areas. The relatively high spatial spread when cases were imported in Paris is due to the short distance between Paris and its suburbs, which is then more likely to cause cross-regional transmission in the northern departments. Despite the cross-regional spread observed in both of these simulations sets, outbreaks remained local, and occurrences of nationwide outbreaks were almost null. The departments most at risk of outbreak following cross-regional spread were some of the direct neighbours of the department of importations, or the large urban areas (Fig. 6). To further explore this, we ran the same simulations decreasing the vaccine coverage by 3%, which greatly increased the number of departments exposed in each simulation set, and increased the risk of local transmission (Additional file 1: Section 7). Therefore, although repeated importations could cause active transmission in and around the departments of importation, the current values of vaccine coverage and the seasonality of transmission were able to prevent nationwide transmission.

Discussion
This analysis explored which local factors were associated with high risks of transmission in France over the last decade. Since 2017, immunity gaps, caused by failures to vaccinate, have been linked to a resurgence of measles in all WHO regions [40]. In countries near-elimination, large outbreaks have been linked to heterogeneity in the levels of immunity, with pockets of susceptibles fuelling punctual outbreaks despite high national vaccine uptake [1,2,4,25]. Our study showed that local values of vaccine coverage were linked to lower transmission, whereas lower levels of recent incidence were not associated with lower risks of local transmission. Furthermore, we highlighted that a drop of 3% in the 3-year vaccine coverage triggered a fivefold increase in the number of cases simulated in a year.
Insights from our analysis could help inform planning and control for measles in near-elimination settings. Our results show the importance of collecting accurate measures of local vaccine coverage, which can then be used to identify areas at risk and anticipate imminent outbreaks. The existence of areas with consistently low vaccine uptake was shown to be a better indicator of the risks of transmission events than recent incidence, which suggests it should be the first factor to consider when assessing the elimination status of a country, rather than recent incidence. As spatial heterogeneity will be concealed by national-level vaccine uptake, high-quality data collection of local vaccine uptake is crucial for correctly estimating risk. Given that previous studies showed that the immunity levels required to minimise risks of outbreaks were age-dependent [3,41,42], it would be beneficial to collect and report local vaccine uptake data across several age groups (e.g. Wales is collecting MMR coverage in children aged 2, 4, 5 and 16 years old [43]), so these data can be used as an proxy for immunity. Such data could then be integrated in Endemic-Epidemic models, to better account for the immune profile of the population.
The fact that higher vaccine coverage was associated with a lower number of secondary cases is consistent with prior expectations and would confirm that the local values of first dose vaccine coverage are a good indicator of the actual immunity in the population and risks of future transmission. Reporting accurate values of local vaccine coverage is challenging, for instance because the vaccination status of people moving regions can be hard to track and lead to measurement errors. Furthermore, we did not have access to complete data on the coverage of the second MMR dose, which would be a better indicator of vulnerable areas. Therefore, detecting the association between recent vaccine uptake and incidence is encouraging. The impact of local vaccination coverage on transmission may also be muddled by sub-regional vaccine heterogeneity. For instance, pockets of susceptibles within a region, i.e. areas within the region where the vaccine coverage is substantially lower than the regional average, may be at high risk of transmission and would not be observable in regional coverage [44]. This phenomenon can only be hypothesised here and could be explored using local data on incidence and vaccine uptake at a sub-regional scale.
Variations in vaccine coverage had a noticeable impact on the number of cases generated in the simulation study. We showed the effects of a 3% increase and decrease of the 3-year average vaccine coverage on the number of cases, which highlighted the risks of uncontrolled transmission in the event of a decrease of vaccine-induced protection. Events such as the disruption caused by the SARS-COV-19 pandemic on routine measles vaccination campaigns could therefore highly increase the risks of uncontrolled measles transmission in the years to come [45,46].
The departments that reported low incidence in the past 3 years were associated with higher risks of local transmission (autoregressive component). Therefore, all else being equal, regions fulfilling the elimination status criteria were not associated with lower risks of local transmission. Conversely, high levels of recent transmission were associated with a higher number of crossregional transmissions and importations, although we cannot methodologically establish the causality of this association. When we set the category of recent incidence to the lowest level in the simulations, departments were less exposed to cases (since importations and spatial spread were rarer), but there was little change in the number of major outbreaks compared to the baseline scenario. Therefore, the simulations showed an "all-or-nothing" situation: lower risks of exposure, but higher risks of local outbreak conditional on exposure. These results would indicate that looking into the level of incidence to quantify the future risks of outbreaks can be deceptive, and importations in a department with low recent incidence would result in large transmission clusters.
We proposed a new framing of the Endemic-Epidemic model implemented in hhh4 by adapting it to daily count data using the distribution of the serial interval to compute the local transmission potential. Using daily case counts allowed us to avoid biases associated with aggregated case counts, such as the influence of the arbitrary aggregation date, by accounting for the impact of variation in the serial intervals. We also accounted for the risks of unreported cases by computing a composite multimodal serial interval, thus allowing for transmission with a missing generation, or an unreported ancestor. The model was able to capture the dynamic of transmission better than the 10-day aggregated model, as shown by the calibration study (Additional file 1: Section 8). Nevertheless, our framing of the hhh4 model introduced new biases: we used a distribution of the serial interval based on previous studies rather than estimating the weights during the fitting procedure and set the proportion of missing generations in the composite serial interval. We explored the impact of the proportion of missing generations by fitting the model with different composite serial intervals and concluded that the impact of each covariate was robust to these changes (Additional file 1: Section 1). We also integrated a potential day-of-theweek effect, and observed that although it had an impact on the autoregressive component, it did not change the estimates of the other parameters, and therefore did not change the conclusions of the study (Additional file 1: Section 9). Using the hhh4 model allowed us to analyse the different impact of various covariates on local and crossregional transmission, and background importation of cases. According to the models we implemented, an overwhelming majority (> 90%) of the transmission came from the cross-regional and local components of the regression. This indicates that in the models, the endemic component only corresponds to rare background cases that could not be linked to concurrent transmission events. This could point towards model misspecifications, for example, connecting unrelated importations to concurrent local transmission. Since endemic transmission tends to refer to cases otherwise unexplained by the mechanistic components, the seasonality of the endemic component is decoupled from the other components, i.e. endemic cases are likely when local and cross-regional transmission are lower.
Since the endemic component accounted for such a small minority of the cases, group importations of cases in a given department were rarely observed in the simulations. However, tourism and local events lead to large gatherings and can increase the risks of group importations in a limited period of time [38,39]. We simulated the spatial spread following repeated importations in a given department, and highlighted that although large outbreaks in the department of importations were common, nationwide transmission following these importations was very rare. Only the departments where all cases had been imported, and its neighbours, were at risk of uncontrolled outbreaks. Decreasing the level of vaccination by 3% was associated with a large increase in the level of exposure of all departments, and in the number of departments where large outbreaks were generated (Additional file 1: Section 7 and 8). The high levels of transmission observed in recent years in France suggest that importations are frequent, and even a small drop in vaccination could dramatically increase measles transmission in the country.
The calibration study showed that both models were able to generate accurate short-term predictions. However, the 14-day PIT histograms indicate that longer-term predictions may suffer from biases. We identify several factors that could explain the discrepancies observed for longer-term predictions: (i) the indicator of local immunity we used was flawed: two-dose coverage may be a better indicator of the proportion of the population that is protected. (ii) The sub-regional heterogeneity in coverage and past incidence could be concealed by NUTS3 aggregated data: because of social groups that rarely mix with one another, or large NUTS regions. (iii) Other indicators that could not be integrated in the model had an impact on the risks of transmission or importations, such as socio-economic heterogeneity, local investment in public health [47], or local mass gatherings (which could lead to repeated importations). Nevertheless, we believe that the results obtained using limited publicly available covariates are encouraging and we intend to apply this method using more complete data.
We identified a number of limitations of this study that have not yet been mentioned: Firstly, potential reactive control measures in response to high levels of transmission were not accounted for. It is likely that if the level of incidence was increasing over a short period of time, control measures would be implemented and the behaviour of the individuals may change (e.g. school closures, catch-up vaccination campaigns). This could impact the number of expected cases after a certain threshold is passed, and change the dynamics in the simulated outbreaks. Secondly, we did not include information on the age or the genotype of the cases. Therefore, unrelated importations in successive time steps in a given region may be considered as linked by our model. Further development of this method could focus on taking this aspect into account, in order to give information on the number of independent concurrent chains. Thirdly, since this is not a transmission model which does not explicitly take into account the depletion of susceptibles, some extreme values could trigger unlikely behaviour. For instance, sporadic transmissions could still happen if the vaccination rate would be 100%. We do not think this would be entirely implausible since only the vaccination coverage reported in the past 3 years was taken into account in the models (i.e. even if it was 100% coverage, there could be susceptible individuals in different age groups). Finally, the impact of the different covariates on the number of cases was constant through time. However, the impact of seasonality may depend on factors varying each year (e.g. weather), which would not be accounted for in the model we developed.
We used variables collected in a wide range of settings (regional vaccine coverage, incidence, number of inhabitants, surface); therefore, this analysis can be reproduced in other countries to analyse the potential for local transmission as well as the impact of recent incidence and vaccine-induced immunity. Since the case counts data are not publicly available, we share the code used to generate the analysis applied to a simulated dataset on a Github repository (https:// github. com/ alxsr obert/ measl es-regio nal-immun ity).

Conclusions
Local vaccine coverage was shown to be a better indicator of imminent risks of measles outbreak in France than the recent level of incidence. Homogeneously high vaccine coverage was required to substantially mitigate the number of local transmission following importations. Simulated outbreaks generated using the parameter estimates of the model showed that drops in vaccine coverage, potentially due to increased vaccine hesitancy or disruption of routine immunisation programmes, would rapidly lead to large increases in the level of transmission in each department.
Additional file 1: Figure S1. Estimates of the parameters in each component of Model 1, using the reference fit (grey) and ten different values of the composite interval (orange). unvax corresponds to the effect of u i, t , the mean proportion unvaccinated over the three years before t in i; incid1 and incid2 correspond to the effect of N 1 i,t and N 2 i,t the category of incidence in the three years before t in i; pop corresponds to the effect of m i, t the number of inhabitants at t in i; area corresponds to the effect of the surface; sin and cos correspond to the effects of seasonality; distance and population correspond to the spatial parameters of the connectivity matrix ω (δ and γ); overdisp is the estimate of the log-overdispersion parameter in the negative binomial distribution of Y i, t . Dots show the mean values associated with the parameters; arrows show the 95% Confidence interval. The orange arrows indicate the extreme values of the 95% confidence interval obtained using different distributions of the composite serial intervals. Figure S2. Estimates of the parameters in each component of Model 2, using the reference fit (grey) and ten different values of the composite interval (orange). unvax corresponds to the effect of u i, t , the mean proportion unvaccinated over the three years before t in i; incid1 and incid2 correspond to the effect of N 1 i,t and N 2 i,t the category of incidence in the three years before t in i; pop corresponds to the effect of m i, t the number of inhabitants at t in i; area corresponds to the effect of the surface; sin and cos correspond to the effects of seasonality; distance and population correspond to the spatial parameters of the connectivity matrix ω (δ and γ); overdisp is the estimate of the log-overdispersion parameter in the negative binomial distribution of Y i, t . Dots show the mean values associated with the parameters; arrows show the 95% Confidence interval. The orange arrows indicate the extreme values of the 95% confidence interval obtained using different distributions of the composite serial intervals.  Figure S7.
Estimates of the parameters in each component of Model 1, using the reference fit (grey) and 100 different values of coverage for the inferred entries (orange). unvax corresponds to the effect of u i, t , the mean proportion unvaccinated over the three years before t in i; incid 1 and incid2 correspond to the effect of N 1 i,t and N 2 i,t the category of incidence in the three years before t in i; pop corresponds to the effect of m i, t the number of inhabitants at t in i; area corresponds to the effect of the surface; sin and cos correspond to the effects of seasonality; distance and population correspond to the spatial parameters of the connectivity matrix ω (δ and γ); overdisp is the estimate of the log-overdispersion parameter in the negative binomial distribution of Y i, t . Dots show the mean values associated with the parameters; arrows show the 95% Confidence interval. The orange arrows indicate the extreme values of the 95% confidence interval obtained drawing different values of coverage for the missing entries. Figure S8. Estimates of the parameters in each component of Model 2, using the reference fit (grey) and 100 different values of coverage for the inferred entries (orange). unvax corresponds to the effect of u i, t , the mean proportion unvaccinated over the three years before t in i; incid1 and incid2 correspond to the effect of N 1 i,t andN 2 i,t the category of incidence in the three years before t in i; pop corresponds to the effect of m i, t the number of inhabitants at t in i; area corresponds to the effect of the surface; sin and cos correspond to the effects of seasonality; distance and population correspond to the spatial parameters of the connectivity matrix ω (δ and γ in Equation X); overdisp is the estimate of the log-overdispersion parameter in the negative binomial distribution of Y i, t . Dots show the mean values associated with the parameters; arrows show the 95% Confidence interval. The orange arrows indicate the extreme values of the 95% confidence interval obtained drawing different values of coverage for the missing entries. Figure S9. i,t the category of incidence in the three years before t in i compared to the level of reference (i.e. N 0 0,t ); pop corresponds to the effect of m i, t the number of inhabitants at t in i; area corresponds to the effect of the surface; sin and cos correspond to the effects of seasonality; distance and population correspond to the spatial parameters of the connectivity matrix ω (δ and γ); overdisp is the estimate of the log-overdispersion parameter in the negative binomial distribution of Y i, t . Dots show the mean values associated with the parameters; arrows show the 95% Confidence interval. Note different y-axes between graphs. The y-axis. unvax corresponds to the effect of u i, t , the mean proportion unvaccinated over the three years before t in i; incid correspond to the effect of increasing the incidence in the three years before t in i; pop corresponds to the effect of m i, t the number of inhabitants at t in i; area corresponds to the effect of the surface; sin and cos correspond to the effects of seasonality; distance and population correspond to the spatial parameters of the connectivity matrix ω (δ and γ); overdisp is the estimate of the log-overdispersion parameter in the negative binomial distribution of Y i, t . Dots show the mean values associated with the parameters; arrows show the 95% Confidence interval. Note different y-axes between graphs. Figure S11. Seasonality of each component in Model 1 (Panel A) and Model 2 (Panel B). We quantified the impact of seasonality using the percent of variation around the mean value every day. Figure S12. Panel A and B: Daily and weekly fit between the data and Model 2. The inferred number of cases is split among the three components of the model. Panel C to F: PIT histograms of Model 2, generated respectively for predictions 3, 7, 10, and 14 days ahead. Figure S13. Comparison between predictions 3, 7, 10, and 14 days ahead using model 1 and model 2 and the data. Lines correspond to the median estimates, shaded areas correspond to the 95% predictions intervals. The blue and purple predictions are similar for the entire calibration period, hence the curves overlap in all panels. Black dots represent the number of cases 3, 7, 10, and 14 days ahead in France at each date. Figure S14. Proportion of cases per component in both models. Figure S15. Percentage of simulations where the number of cases reported in each region in 2019 was at least 1, 10, and 50 cases for each scenario using parameter estimates from Model 2. Each row corresponds to a different scenario: i) Reference, ii) Minimum level of recent incidence in each region, iii) Local vaccine coverage increased by 3% in each region, iv) Local vaccine coverage decreased by 3% in each region. Figure S16. Percentage of simulations where the number of cases reported in each region in 2019 was at least 1, 10, and 50 cases following the importations of ten cases in December 2018, and using the parameter estimates from Model 2. For each row, the region of importation is indicated by a black dot. Figure S17. Geographic distribution of the number of inhabitants (right panel), average vaccine coverage (central panel), and recent incidence (left panel) at the end of 2018. Figure S18.
Percentage of simulations where the number of cases reported in each region in 2019 was at least 1, 10, and 50 cases following the importations of ten cases in December 2018, and using the parameter estimates from Model 1 and a three percent decrease in vaccine coverage in each region. For each row, the region of importation is indicated by a black dot. Figure  S19. Percentage of simulations where the number of cases reported in each region in 2019 was at least 1, 10, and 50 cases following the importations of ten cases in December 2018, and using the parameter estimates from Model 1 and a three percent increase in vaccine coverage in each region. For each row, the region of importation is indicated by a black dot. Figure S20. Sharpness, bias and RPS scores of the Daily model with and without Random Effects (RE), and of the aggregated model. Figure S21. Number of cases by day of the week. We used the onset date for each case reported to the ECDC between 2009 and 2018. Figure S22.
Comparison of the parameter estimates obtained in Model 1 (similar to Fig. 2) with or without a weekday covariate added in each compartment. Weekday was computed as a binary covariate, whose value was 1 on Saturdays and Sundays, and 0 otherwise.