Individual social contact data and population mobility data as early markers of SARS-CoV-2 transmission dynamics during the first wave in Germany—an analysis based on the COVIMOD study

Background The effect of contact reduction measures on infectious disease transmission can only be assessed indirectly and with considerable delay. However, individual social contact data and population mobility data can offer near real-time proxy information. The aim of this study is to compare social contact data and population mobility data with respect to their ability to reflect transmission dynamics during the first wave of the SARS-CoV-2 pandemic in Germany.

Methods We quantified the change in social contact patterns derived from self-reported contact survey data collected by the German COVIMOD study from April to June 2020 (compared to the pre-pandemic period from previous studies) and estimated the percentage mean reduction over time. We compared these results as well as the percentage mean reduction in population mobility data (corrected for pre-pandemic mobility) with and without the introduction of scaling factors and specific weights for different types of contacts and mobility to the relative reduction in transmission dynamics measured by changes in R values provided by the German Public Health Institute.

Results We observed the largest reduction in social contacts (90%, compared to pre-pandemic data) in late April, corresponding to the strictest contact reduction measures. Thereafter, the reduction in contacts dropped continuously to a minimum of 73% in late June. Relative reduction of infection dynamics derived from contact survey data underestimated the one based on reported R values in the time of strictest contact reduction measures but reflected it well thereafter. Relative reduction of infection dynamics derived from mobility data overestimated the one based on reported R values considerably throughout the study. After the introduction of a scaling factor, specific weights for different types of contacts and mobility reduced the mean absolute percentage error considerably; in all analyses, estimates based on contact data reflected measured R values better than those based on mobility.

Conclusions Contact survey data reflected infection dynamics better than population mobility data, indicating that both data sources cover different dimensions of infection dynamics. The use of contact type-specific weights reduced the mean absolute percentage errors to less than 1%. Measuring the changes in mobility alone is not sufficient for understanding the changes in transmission dynamics triggered by public health measures.

Supplementary Information The online version contains supplementary material available at 10.1186/s12916-021-02139-6.


Reproduction number of the German Public Health Institute (Robert Koch Institute (RKI))
The method applied by the RKI to obtain R values is based on the reported numbers of individuals notified as newly infected with SARS-CoV-2. Incident cases are attributed to the day of first symptoms.
If this information is not available, it is imputed taking into account the age of the case, the day and week of the notification, and the estimated delays from the day of the first symptom to the notification date of the cases. The estimation of the notification delay was based on cases for which all information was available; changes in these delays over the course of the pandemic are taken into account [1][2][3].
Because, among recent SARS-CoV-2 cases, only those with a short delay between the positive test and the notification have been reported at the time of analysis, a nowcasting approach is applied [4]. The nowcasting produces an estimate of the additional number of SARS-CoV-2 cases that have already occurred but have not yet been identified by the surveillance system in Germany, taking into account the delay in diagnosis, reporting and transmission. For this purpose, the proportion of cases that were reported within a certain number of days, x, after the onset of the disease was determined. This proportion is then used to correct the number of cases submitted with an onset of illness x days before the analysis [1][2][3].
The imputation and the nowcasting lead to an estimated epidemic curve based on which the time-dependent reproduction number (R) can be estimated. With an assumed constant generation time of 4 days, R is defined as the quotient of the number of new infections in two consecutive time frames of 4 days each:
$$R_t = \frac{\bar{N}_4(t)}{\bar{N}_4(t-4)}$$
where $\bar{N}_4(t)$ is the sliding average of the number of infected cases over the 4 days up to and including day $t$ [1,2].
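As a minimal sketch of this ratio in Python, assuming a plain list of daily nowcast case counts, the 4-day R for a given day could be computed as below. The function name `four_day_r` and the example counts are hypothetical and are not taken from the RKI; the ratio of two 4-day sums equals the ratio of the two 4-day averages.

```python
def four_day_r(cases, t):
    """Ratio of new cases in days t-3..t to new cases in days t-7..t-4.

    `cases` is a list of daily (nowcast) case counts, `t` a 0-based day index.
    """
    if t < 7:
        raise ValueError("need eight consecutive days of data ending at index t")
    recent = sum(cases[t - 3 : t + 1])    # current 4-day window
    previous = sum(cases[t - 7 : t - 3])  # window one generation time earlier
    if previous == 0:
        raise ValueError("no cases in the earlier window, ratio undefined")
    return recent / previous


# Hypothetical nowcast counts, for illustration only.
nowcast = [100, 110, 120, 130, 120, 110, 100, 90]
r_value = four_day_r(nowcast, 7)
print(round(r_value, 3))
```

A value below 1 indicates that the most recent 4-day window contains fewer new cases than the window before it.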
In this study, the 4-day reproduction number calculated by the RKI was used. This provides information on the transmission dynamics 8 to 13 days prior [1][2][3]. The R values are continuously corrected by the RKI retrospectively for delayed notifications. As we extracted the estimated R values more than a year after the dates to which they refer, all delayed notifications were already accounted for in the estimates used in this study.

Estimating the relative reduction in contacts (COVIMOD), population mobility (Google and Apple) and the reproduction number obtained from the RKI
NOTE: We assumed R0 pre-pandemically to follow a normal distribution with mean 2.6 and standard deviation 0.54 [5].
The Google and Apple data are provided relative to pre-pandemic mobility. The baseline for Google is the median value of the corresponding day of the week during the 5-week period January 3 to February 6, 2020.
The analysis below produced Figure 2 and Figure 4a in the manuscript. The same analysis was performed for Figure 3, where we considered the relative reduction of mobility/contacts in Google and COVIMOD for different settings (i.e. home, school, work, transport, and others).
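The day-of-week baseline described for the Google data can be illustrated with a short Python sketch. It assumes a hypothetical raw mobility index per calendar day (Google publishes only the already normalised percent change, so `raw` and both function names are illustrative assumptions) and returns the reduction as a positive percentage, i.e. the negative of a percent change from baseline.

```python
from datetime import date, timedelta
from statistics import median

BASELINE_START = date(2020, 1, 3)
BASELINE_END = date(2020, 2, 6)


def weekday_baseline(daily_values):
    """Median value per weekday (0 = Monday) over the 5-week baseline period."""
    by_weekday = {}
    day = BASELINE_START
    while day <= BASELINE_END:
        if day in daily_values:
            by_weekday.setdefault(day.weekday(), []).append(daily_values[day])
        day += timedelta(days=1)
    return {wd: median(vals) for wd, vals in by_weekday.items()}


def relative_reduction(value, day, baseline):
    """Percentage reduction of `value` relative to the same weekday's baseline."""
    base = baseline[day.weekday()]
    return (base - value) / base * 100


# Hypothetical raw index: 100 on each of the 35 baseline days.
raw = {BASELINE_START + timedelta(days=i): 100.0 for i in range(35)}
baseline = weekday_baseline(raw)
print(relative_reduction(60.0, date(2020, 4, 20), baseline))
```

Comparing each day with the median of the same weekday avoids mistaking the ordinary weekend drop in mobility for an effect of contact reduction measures.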

STEP 1 - weighting
For COVIMOD and POLYMOD we assigned survey weights based on participants' age, sex, household size, day of the week the questionnaire was filled in for and region of residence. We weighted the population mobility data (Google and Apple) and our reference reproduction number (RKI) by the day of the week only.

STEP 2 - weighted means
We generated the weighted mean relative reduction in population mobility for Google and Apple, $\bar{M}_{\text{Google}}$ and $\bar{M}_{\text{Apple}}$, the weighted mean reproduction number, $\bar{R}_{\text{RKI}}$, the weighted mean of the reported number of contacts during the SARS-CoV-2 pandemic (COVIMOD simple approach), $\bar{C}_{\text{COVIMOD}}$, and the weighted mean number of contacts before the pandemic (POLYMOD), $\bar{C}_{\text{POLYMOD}}$.
From the COVIMOD data, we also estimated the mean reproduction number from the eigenvalue obtained from the symmetric next generation matrix by assuming R0 pre-pandemically to follow a normal distribution with mean 2.6 and standard deviation 0.54 and taking the survey weights into account (COVIMOD complex approach; $\bar{R}_{\text{COVIMOD}}$).
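The eigenvalue-based estimate can be sketched in Python under a common assumption that is not spelled out in this text: the reproduction number scales with the ratio of the dominant eigenvalues of the pandemic and pre-pandemic contact matrices, multiplied by a draw of R0 from the stated normal distribution. The two 2x2 matrices below are hypothetical, survey weights are omitted, and power iteration stands in for whatever eigenvalue routine the authors used.

```python
import random


def dominant_eigenvalue(matrix, iterations=500):
    """Dominant eigenvalue of a non-negative square matrix by power iteration."""
    n = len(matrix)
    vec = [1.0] * n
    value = 0.0
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        value = max(abs(x) for x in nxt)
        if value == 0:
            return 0.0
        vec = [x / value for x in nxt]  # rescale so the largest entry is 1
    return value


def sampled_r(pandemic, baseline, n_samples=1000, seed=1):
    """Draw R0 ~ Normal(2.6, 0.54) and scale it by the eigenvalue ratio."""
    rng = random.Random(seed)
    ratio = dominant_eigenvalue(pandemic) / dominant_eigenvalue(baseline)
    return [rng.gauss(2.6, 0.54) * ratio for _ in range(n_samples)]


# Hypothetical symmetric contact matrices (two age groups), illustration only.
baseline_matrix = [[2.0, 1.0], [1.0, 2.0]]
pandemic_matrix = [[1.0, 0.5], [0.5, 1.0]]
draws = sampled_r(pandemic_matrix, baseline_matrix)
print(sum(draws) / len(draws))
```

In this toy example every contact rate is halved, so the eigenvalue ratio is 0.5 and the sampled reproduction numbers centre on roughly half of the assumed R0.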

STEP 3 - relative reductions in transmission dynamics
Reproduction number (RKI): We estimated the relative reduction of the reproduction number reported by the RKI with respect to the basic reproduction number, assuming R0 to follow a normal distribution with mean 2.6. That is,
$$\text{RKI relative reduction} = \frac{2.6 - \bar{R}_{\text{RKI}}}{2.6} \times 100$$
where $\bar{R}_{\text{RKI}}$ is the weighted mean reproduction number reported by the RKI.
Population mobility: As both Google and Apple mobility data already describe the relative reduction in mobility compared to mobility before the SARS-CoV-2 pandemic in Germany, the weighted means obtained above ($\bar{M}_{\text{Google}}$ and $\bar{M}_{\text{Apple}}$) directly describe the relative reduction.
$$\text{Google relative reduction} = \bar{M}_{\text{Google}}, \qquad \text{Apple relative reduction} = \bar{M}_{\text{Apple}}$$
COVIMOD simple approach: We estimated the relative reduction in contacts during the pandemic compared to the pre-pandemic period using the weighted mean reported number of contacts from COVIMOD, $\bar{C}_{\text{COVIMOD}}$, and using the weighted mean reported number of contacts from the POLYMOD data, $\bar{C}_{\text{POLYMOD}}$, as a baseline.
$$\text{COVIMOD simple approach relative reduction} = \frac{\bar{C}_{\text{POLYMOD}} - \bar{C}_{\text{COVIMOD}}}{\bar{C}_{\text{POLYMOD}}} \times 100$$
COVIMOD complex approach: For the complex approach we used the estimated reproduction number based on the COVIMOD data, $\bar{R}_{\text{COVIMOD}}$, and calculated the relative reduction compared to the basic reproduction number:
$$\text{COVIMOD complex approach relative reduction} = \frac{2.6 - \bar{R}_{\text{COVIMOD}}}{2.6} \times 100$$
We used the package boot to compute 95% confidence intervals around the weighted mean of the relative reductions (1000 samples) [8].
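The relative reductions of this step and a percentile bootstrap interval can be sketched in Python as follows. This is an unweighted illustration only: the authors used the R package boot together with survey weights, and the input values and function names here are hypothetical.

```python
import random
from statistics import mean

R0_MEAN = 2.6  # assumed pre-pandemic basic reproduction number (mean)


def reduction_from_r(r_value, r0=R0_MEAN):
    """Percentage reduction of a reproduction number relative to R0."""
    return (r0 - r_value) / r0 * 100


def reduction_from_contacts(pandemic_mean, baseline_mean):
    """Percentage reduction in mean contacts relative to the baseline survey."""
    return (baseline_mean - pandemic_mean) / baseline_mean * 100


def percentile_ci(values, n_boot=1000, level=0.95, seed=1):
    """Percentile bootstrap interval for the mean of `values` (unweighted)."""
    rng = random.Random(seed)
    means = sorted(
        mean(rng.choices(values, k=len(values))) for _ in range(n_boot)
    )
    lower = means[int((1 - level) / 2 * n_boot)]
    upper = means[max(int((1 + level) / 2 * n_boot) - 1, 0)]
    return lower, upper


# Hypothetical inputs, for illustration only.
print(reduction_from_r(1.3))
print(reduction_from_contacts(2.0, 10.0))
reductions = [reduction_from_r(r) for r in [0.7, 0.8, 0.9, 1.0, 0.85, 0.75]]
print(percentile_ci(reductions))
```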

Relative reduction in transmission dynamics with a scaling factor but without separate weighting for home/non-home contacts
The analysis below produced Figure 4b in the manuscript.
The following estimates obtained under (2) are used as a base: RKI relative reduction, COVIMOD simple approach relative reduction, COVIMOD complex approach relative reduction, Google relative reduction, Apple relative reduction.

STEP 1
We performed an analysis in which we fitted a scaling factor with the same weight for all types of contacts and mobility. For this, we minimised the residual sum of squares across the four survey waves using the optim function in R, i.e.:
$$\min_{\alpha} \sum_{w=1}^{4} \left(\text{RKI relative reduction}_{w} - \alpha \cdot \text{DATA}_{w}\right)^2$$
where $\alpha$ is the scaling factor, $w$ indexes the survey waves, and DATA is either COVIMOD simple approach relative reduction, COVIMOD complex approach relative reduction, Google relative reduction or Apple relative reduction.
The relative reduction including the scaling is therefore:
$$\alpha \cdot \text{DATA}$$
with $\alpha$ and DATA as defined above.
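A minimal Python sketch of fitting one scaling factor is given below. It assumes the objective is the sum over waves of (reference minus scaling factor times data) squared; for that one-parameter least-squares problem the minimiser has a closed form, which a numerical optimiser such as R's optim would approximate. All wave values and function names are hypothetical.

```python
def fit_scaling_factor(reference, data):
    """Least-squares scaling factor minimising sum((reference - alpha*data)^2)."""
    if len(reference) != len(data):
        raise ValueError("reference and data must cover the same waves")
    denominator = sum(d * d for d in data)
    if denominator == 0:
        raise ValueError("data are all zero, scaling factor undefined")
    return sum(r * d for r, d in zip(reference, data)) / denominator


def residual_sum_of_squares(reference, data, alpha):
    """Objective value for a given scaling factor."""
    return sum((r - alpha * d) ** 2 for r, d in zip(reference, data))


# Hypothetical relative reductions (%) in four survey waves.
data_waves = [90.0, 85.0, 80.0, 73.0]
reference_waves = [70.0, 68.0, 62.0, 60.0]

alpha = fit_scaling_factor(reference_waves, data_waves)
scaled = [alpha * d for d in data_waves]
print(alpha, residual_sum_of_squares(reference_waves, data_waves, alpha))
```

Because the objective is a convex parabola in the scaling factor, any other value of the factor gives a larger residual sum of squares.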

STEP 2
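To compute confidence intervals, we ran the optim function inside a bootstrap routine with 1000 bootstrapped samples, i.e.
$$\min_{\alpha} \sum_{w=1}^{4} \left(\text{RKI relative reduction}_{\text{bootstrapped},w} - \alpha \cdot \text{DATA}_{\text{bootstrapped},w}\right)^2$$
where the subscript bootstrapped denotes the bootstrapped version of the respective relative reduction and $\alpha$ is the scaling factor.
For the estimation of the reproduction number based on COVIMOD data (COVIMOD complex approach), 10,000 bootstrapped samples had already been drawn, and we used those instead: we calculated the relative reduction for each of the estimated reproduction numbers of the 10,000 samples and multiplied it by the scaling factor for the COVIMOD complex approach from step 1.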
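The bootstrap around the scaling factor can be illustrated with the Python sketch below. It is an assumption-laden stand-in: the authors resampled their actual data, whereas here Gaussian noise around hypothetical wave values only mimics bootstrap replicates, and the closed-form least-squares fit replaces the numerical optimiser.

```python
import random


def fit_alpha(reference, data):
    """Closed-form least-squares scaling factor for one replicate."""
    return sum(r * d for r, d in zip(reference, data)) / sum(d * d for d in data)


def percentile_interval(values, level=0.95):
    """Percentile interval from a list of replicate estimates."""
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[int((1 - level) / 2 * n)]
    upper = ordered[max(int((1 + level) / 2 * n) - 1, 0)]
    return lower, upper


rng = random.Random(1)
data_waves = [85.0, 80.0, 75.0, 70.0]       # hypothetical relative reductions (%)
reference_waves = [68.0, 64.0, 60.0, 56.0]  # hypothetical reference, 0.8 x data

alphas = []
for _ in range(1000):
    # Noise stands in for resampling the underlying observations.
    data_b = [d + rng.gauss(0, 2) for d in data_waves]
    reference_b = [r + rng.gauss(0, 2) for r in reference_waves]
    alphas.append(fit_alpha(reference_b, data_b))

interval = percentile_interval(alphas)
print(interval)
```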

Relative reduction in transmission dynamics with fitted weights for home/non-home contacts/mobility
The analysis below produced Figure 4c in the manuscript.
NOTE: Apple mobility data could not be used for these analyses, as they do not differentiate between home and non-home mobility.
The following estimates obtained under (2) are used as a base: RKI relative reduction, COVIMOD simple approach relative reduction, COVIMOD complex approach relative reduction, Google relative reduction.

STEP 1 - weighted means home/non-home contacts
We generated the weighted mean relative reduction in population mobility for Google separately for home and non-home mobility, $\bar{M}_{\text{Google, home}}$ and $\bar{M}_{\text{Google, non-home}}$.
From the COVIMOD data, we also estimated the mean reproduction number from the eigenvalue obtained from the symmetric next generation matrix by assuming R0 pre-pandemically to follow a normal distribution with mean 2.6 and standard deviation 0.54 and taking the survey weights into account, also separately for home and non-home contacts (COVIMOD complex approach; $\bar{R}_{\text{COVIMOD, home}}$ and $\bar{R}_{\text{COVIMOD, non-home}}$).

STEP 2 - relative reductions in transmission dynamics separately for home and non-home contacts
Population mobility: As Google mobility data already describe the relative reduction in mobility compared to mobility before the SARS-CoV-2 pandemic in Germany, the weighted means obtained above describe the relative reduction.
$$\text{Google relative reduction\_home} = \bar{M}_{\text{Google, home}}, \qquad \text{Google relative reduction\_non-home} = \bar{M}_{\text{Google, non-home}}$$
We then computed a weighting variable ($W_{\text{Google}}$), so that:
$$\frac{\text{Google relative reduction\_home} + W_{\text{Google}} \cdot \text{Google relative reduction\_non-home}}{2} = \text{Google relative reduction}$$
COVIMOD simple approach: We estimated the relative reduction in contacts during the pandemic compared to the pre-pandemic period using the weighted mean reported number of contacts from COVIMOD and using the weighted mean reported number of contacts from the POLYMOD data as a baseline.
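The condition on the Google weighting variable in this step can be solved directly: rearranging (home + W x non-home) / 2 = overall gives W = (2 x overall - home) / non-home. A small Python sketch with hypothetical input values follows; the function name is an assumption, not the authors' code.

```python
def weighting_variable(home, non_home, overall):
    """Solve (home + W * non_home) / 2 = overall for W."""
    if non_home == 0:
        raise ValueError("non-home reduction is zero, W is undefined")
    return (2 * overall - home) / non_home


# Hypothetical relative reductions (%): more time at home shows up as a
# negative reduction, non-home mobility as a positive one.
home, non_home, overall = -10.0, 50.0, 30.0
w_google = weighting_variable(home, non_home, overall)
print(w_google, (home + w_google * non_home) / 2)
```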

STEP 3
We performed an analysis in which we fitted the relative reductions in contacts/mobility to the reference (RKI estimates) and allowed independent scaling factors for home contacts/mobility and non-home contacts/mobility. For this, we minimised the residual sum of squares across the four survey waves using the optim function in R. The relative reduction including the scaling for home and non-home contacts/mobility is then obtained by applying the two fitted scaling factors to the home and non-home relative reductions, respectively.
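How two independent scaling factors might be fitted is sketched below in Python. The exact objective is not recoverable from this text, so the sketch assumes the fitted estimate is the sum of the scaled home and scaled non-home relative reductions; under that assumption the least-squares solution follows from the 2x2 normal equations, which a numerical optimiser would approximate. All names and values are hypothetical.

```python
def fit_two_factors(reference, home, non_home):
    """Least-squares (a_home, a_non_home) for reference ~ a_home*home + a_non_home*non_home."""
    s_hh = sum(h * h for h in home)
    s_nn = sum(n * n for n in non_home)
    s_hn = sum(h * n for h, n in zip(home, non_home))
    s_hr = sum(h * r for h, r in zip(home, reference))
    s_nr = sum(n * r for n, r in zip(non_home, reference))
    det = s_hh * s_nn - s_hn ** 2
    if det == 0:
        raise ValueError("home and non-home series are collinear")
    # Cramer's rule on the normal equations.
    a_home = (s_hr * s_nn - s_nr * s_hn) / det
    a_non_home = (s_nr * s_hh - s_hr * s_hn) / det
    return a_home, a_non_home


# Hypothetical relative reductions (%) in four survey waves.
home_waves = [10.0, 20.0, 15.0, 5.0]
non_home_waves = [80.0, 70.0, 60.0, 50.0]
reference_waves = [0.5 * h + 0.9 * n for h, n in zip(home_waves, non_home_waves)]

a_home, a_non_home = fit_two_factors(reference_waves, home_waves, non_home_waves)
print(a_home, a_non_home)
```

With only four waves and two free parameters the fit has little redundancy, which is one reason to accompany such estimates with bootstrap intervals.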

STEP 4
To compute confidence intervals, we ran the optim function inside a bootstrap routine with 1000 bootstrapped samples, i.e. the minimisation in step 3 was repeated on each bootstrapped sample.
For the estimation of the reproduction number based on COVIMOD data (COVIMOD complex approach), 10,000 bootstrapped samples had already been drawn, and we used those instead: we calculated the relative reductions for each of the estimated reproduction numbers of the 10,000 samples and multiplied those by the scaling factors for the COVIMOD complex approach from step 3.

Relative reduction in transmission dynamics with normalised weights for home/non-home contacts/mobility as well as allowing a scaling factor.
We used normalised weights derived from setting-specific secondary attack rates (SAR) reported in a meta-analysis by Thompson et al. [9]. The household SAR was estimated to be 21.1% (home), and the average of healthcare, workplace and casual close contacts was estimated to be 2.23% (non-home). These two values were normalised to give the home and non-home weights.
The analysis below produced Figure 4d in the manuscript.
NOTE: Apple mobility data could not be used for these analyses, as they do not differentiate between home and non-home mobility.
The following estimates obtained under (2) are used as a base: RKI relative reduction, COVIMOD simple approach relative reduction, COVIMOD complex approach relative reduction, Google relative reduction.
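One plausible reading of the normalisation of the two secondary attack rates is that each is divided by their sum, so that the weights add up to 1. The Python sketch below shows that reading; it is an assumption, since the original expression is not available in this text, and the names are illustrative.

```python
SAR_HOME = 21.1      # household secondary attack rate (%), from the text
SAR_NON_HOME = 2.23  # average non-household secondary attack rate (%), from the text


def normalised_weights(home=SAR_HOME, non_home=SAR_NON_HOME):
    """Divide each attack rate by the total so the two weights sum to 1."""
    total = home + non_home
    if total <= 0:
        raise ValueError("attack rates must sum to a positive value")
    return home / total, non_home / total


w_home, w_non_home = normalised_weights()
print(round(w_home, 4), round(w_non_home, 4))
```

Under this reading, a home contact carries roughly nine times the weight of a non-home contact.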

STEP 1 - weighted means home/non-home contacts
We generated the weighted mean relative reduction in population mobility for Google separately for home and non-home mobility, $\bar{M}_{\text{Google, home}}$ and $\bar{M}_{\text{Google, non-home}}$.
From the COVIMOD data, we also estimated the mean reproduction number from the eigenvalue obtained from the symmetric next generation matrix by assuming R0 pre-pandemically to follow a normal distribution with mean 2.6 and standard deviation 0.54 and taking the survey weights into account, also separately for home and non-home contacts (COVIMOD complex approach; $\bar{R}_{\text{COVIMOD, home}}$ and $\bar{R}_{\text{COVIMOD, non-home}}$).

STEP 2 - relative reductions in transmission dynamics separately for home and non-home contacts
Population mobility: As Google mobility data already describe the relative reduction in mobility compared to mobility before the SARS-CoV-2 pandemic in Germany, the weighted means obtained above describe the relative reduction.
We then computed a weighting variable ($W_{\text{Google}}$), so that:
$$\frac{\text{Google relative reduction\_home} + W_{\text{Google}} \cdot \text{Google relative reduction\_non-home}}{2} = \text{Google relative reduction}$$
COVIMOD simple approach: We estimated the relative reduction in contacts during the pandemic compared to the pre-pandemic period using the weighted mean reported number of contacts from COVIMOD and using the weighted mean reported number of contacts from the POLYMOD data as a baseline.

STEP 3
In addition to the normalised weights, we allowed for an additional scaling factor per data source, i.e. COVIMOD simple approach, COVIMOD complex approach and Google mobility data.
For this, we minimised the residual sum of squares across the four survey waves using the optim function in R, where α is the scaling factor, DATA_home/DATA_non-home are either COVIMOD simple approach relative reduction_home/COVIMOD simple approach relative reduction_non-home, COVIMOD complex approach relative reduction_home/COVIMOD complex approach relative reduction_non-home, or Google relative reduction_home/Google relative reduction_non-home, and W is W_Google, W_COVIMOD simple or W_COVIMOD complex.
The relative reduction including the scaling for home and non-home contacts/mobility is therefore obtained by applying the fitted α to the weighted combination of DATA_home and DATA_non-home, with α, DATA_home, DATA_non-home and W as defined above.
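The combination of fixed weights and one additional scaling factor can be sketched in Python as below, together with the mean absolute percentage error used to compare estimates with the reference. The form assumed here, a scaling factor times (home weight x home reduction + non-home weight x non-home reduction), is an illustration only, since the original expression is not available in this text; all values and names are hypothetical.

```python
def combine(home, non_home, w_home, w_non_home):
    """Weighted combination of home and non-home relative reductions per wave."""
    return [w_home * h + w_non_home * n for h, n in zip(home, non_home)]


def fit_alpha(reference, data):
    """Closed-form least-squares scaling factor."""
    return sum(r * d for r, d in zip(reference, data)) / sum(d * d for d in data)


def mape(reference, estimate):
    """Mean absolute percentage error of `estimate` against `reference`."""
    return sum(abs((r - e) / r) for r, e in zip(reference, estimate)) / len(reference) * 100


# Weights normalised from the attack rates 21.1 and 2.23 (assumed normalisation).
w_home = 21.1 / (21.1 + 2.23)
w_non_home = 2.23 / (21.1 + 2.23)

# Hypothetical relative reductions (%) in four survey waves.
home_waves = [10.0, 20.0, 15.0, 5.0]
non_home_waves = [80.0, 70.0, 60.0, 50.0]
reference_waves = [60.0, 66.0, 58.0, 40.0]

combined = combine(home_waves, non_home_waves, w_home, w_non_home)
alpha = fit_alpha(reference_waves, combined)
estimate = [alpha * c for c in combined]
print(alpha, mape(reference_waves, estimate))
```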

STEP 4
To compute confidence intervals, we ran the optim function inside a bootstrap routine with 1000 bootstrapped samples, i.e. the minimisation in step 3 was repeated with the bootstrapped quantities, where the subscript _bootstrapped denotes the bootstrapped COVIMOD simple approach relative reduction_home/COVIMOD simple approach relative reduction_non-home, Google relative reduction_home/Google relative reduction_non-home, or RKI relative reduction.
For the estimation of the reproduction number based on COVIMOD data (COVIMOD complex approach), 10,000 bootstrapped samples had already been drawn, and we used those instead: we calculated the relative reductions for each of the estimated reproduction numbers of the 10,000 samples and multiplied those by the scaling factors for the COVIMOD complex approach from step 3.

Repeated measures ANOVA
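Repeated measures ANOVA was used to assess differences between the error rates of the different data sources (COVIMOD, Google, Apple). The repeated measures ANOVA yields an F statistic, which is used to determine statistical significance. The F statistic is
$$F = \frac{SS_{\text{data}} / (k - 1)}{SS_{\text{error}} / \big((k - 1)(n - 1)\big)}$$
with
$$SS_{\text{data}} = \sum_{i=1}^{k} n_i \left(\bar{x}_{i\cdot} - \bar{x}\right)^2, \qquad SS_{\text{error}} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left(x_{ij} - \bar{x}_{i\cdot}\right)^2 - k \sum_{j=1}^{n} \left(\bar{x}_{\cdot j} - \bar{x}\right)^2$$
where $k$ is the number of data categories (e.g., Google and Apple) being considered, $n_i$ is the number of waves (wave 1, wave 2, wave 3, wave 4) under the $i$th data category (here $n_i = n$ for all categories), $x_{ij}$ is the score of the $i$th data category in wave $j$, $\bar{x}_{i\cdot}$ is the mean score for the $i$th data category, $\bar{x}$ is the grand mean and $\bar{x}_{\cdot j}$ is the mean of wave $j$.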
The R aov function was used for this analysis.
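For readers who want to see how such an F statistic arises, the following Python sketch computes the standard one-way repeated measures ANOVA by hand, treating the survey waves as the repeated units and the data sources as conditions. The error scores are hypothetical, and the sketch is not a substitute for the aov call used by the authors.

```python
def repeated_measures_f(scores):
    """F statistic for scores[i][j]: error of data source i in wave j.

    Returns (F, df_between, df_error).
    """
    k = len(scores)     # number of data sources
    n = len(scores[0])  # number of waves
    grand = sum(sum(row) for row in scores) / (k * n)
    source_means = [sum(row) / n for row in scores]
    wave_means = [sum(scores[i][j] for i in range(k)) / k for j in range(n)]

    ss_between = n * sum((m - grand) ** 2 for m in source_means)
    ss_within = sum(
        (scores[i][j] - source_means[i]) ** 2 for i in range(k) for j in range(n)
    )
    ss_waves = k * sum((m - grand) ** 2 for m in wave_means)
    ss_error = ss_within - ss_waves  # remove variation explained by waves

    df_between = k - 1
    df_error = (k - 1) * (n - 1)
    f_stat = (ss_between / df_between) / (ss_error / df_error)
    return f_stat, df_between, df_error


# Hypothetical error scores: three data sources (rows) by four waves (columns).
errors = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 5.0, 7.0],
    [4.0, 5.0, 7.0, 8.0],
]
f_stat, df1, df2 = repeated_measures_f(errors)
print(f_stat, df1, df2)
```

The resulting statistic is compared with an F distribution on the returned degrees of freedom to obtain a p-value.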
R version 4.0.2 was used for all analyses [10].