Contact surveys
Pandemic contact survey—COVIMOD
The COVIMOD contact survey was initiated in April 2020 and recruited participants from the online panel i-say.com. To ensure that the sample was broadly representative of the German population, participants were recruited by sending email invitations to existing panel members based on age, sex and regional quotas. To gain information on children’s social contacts, a defined subgroup of adult participants with underage children (< 18 years of age) living in their household was invited to provide information as a proxy for their children. This approach, however, meant that the sample was no longer representative of the German population, as we under-sampled the middle-aged participants who instead filled out the questionnaire for their children. The first COVIMOD survey wave was launched on 30/04/2020, corresponding to the time of the strictest contact reduction measures in Germany. Survey waves 2 to 4 were launched during a period of gradual easing of the contact reduction measures in May and June 2020. For wave 1, a sample of 1500 participants was recruited, with an expected response rate of 85% for the subsequent survey waves. Before the launch of survey wave 4, the sample was increased by 1000 additional participants.
The COVIMOD questionnaire is based on the questionnaire of the CoMix study and includes questions on demographics, current behaviours, attitudes towards SARS-CoV-2 and the social contacts of participants [3]. Participants were asked to report each social contact between 5 am on the preceding day and 5 am on the day of the survey, the age and sex of the contact, the duration they spent with each contact, the setting where the contact occurred and whether the contact was a household member. The questionnaire can be found in Additional file 1.
We defined a contact in COVIMOD in line with the POLYMOD study’s definition as “people who you met in person and with whom you exchanged at least a few words, or with whom you had physical contact” [12]. During survey waves 1 and 2, participants were asked to provide each contact separately. Instead of providing each contact one by one, some participants included a group of contacts as one contact (e.g. “customers”). For these groups, we assumed a specific number of social contacts (Additional file 2). From survey wave 3 onwards, participants were offered the opportunity to provide a number of additional contacts (group contacts) they were not able to list individually in case they had too many contacts.
Because these additional contacts were entered separately from the individually listed contacts, we analysed them in several ways (sensitivity analyses). The main scenario includes all reported contacts plus group contacts, weighted for the German population, for both COVIMOD and POLYMOD. Unweighted results and those without group contacts can be found in Additional file 3.
Pre-pandemic contact survey—POLYMOD
The European contact survey POLYMOD was used as a baseline pre-pandemic comparison. In Germany, POLYMOD was conducted as a paper-based survey with the help of a market research company in 2005/2006. Further details about POLYMOD can be found elsewhere [12]. As in COVIMOD, participants in POLYMOD were allowed to enter the number of additional contacts (group contacts) if they had too many contacts to report them separately.
Mobility data
We obtained publicly available aggregated mobility data from the Google COVID-19 Community Mobility Reports and from the COVID-19 Apple Mobility trends for the times corresponding to the COVIMOD survey waves [20, 21].
Google COVID-19 Community Mobility Reports provide the percentage change in mobility from February 2020 onwards compared to the median of the corresponding weekday between 03/01/2020 and 06/02/2020. Google COVID-19 Community Mobility Reports use aggregated information about true individual movement histories to provide location-specific changes in mobility over time. Data are stratified by the destination of the movement, i.e. retail and recreation, grocery and pharmacy, transit stations, workplace, residential and parks. COVID-19 Apple Mobility trends provide information about the relative volume of requests for directions for all weeks in 2020 compared to a base volume on 13/01/2020.
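The baseline comparison described above can be illustrated with a minimal Python sketch. It assumes daily visit counts keyed by date; the counts, the helper names `weekday_baseline` and `percent_change`, and the toy data are hypothetical and are not taken from Google's actual processing pipeline.

```python
from datetime import date, timedelta
from statistics import median


def weekday_baseline(counts, start, end):
    """Median count per weekday (0 = Monday) over the baseline window [start, end]."""
    by_weekday = {}
    for day, count in counts.items():
        if start <= day <= end:
            by_weekday.setdefault(day.weekday(), []).append(count)
    return {weekday: median(values) for weekday, values in by_weekday.items()}


def percent_change(counts, baseline):
    """Percentage change of each day's count relative to its weekday baseline."""
    return {
        day: 100.0 * (count - baseline[day.weekday()]) / baseline[day.weekday()]
        for day, count in counts.items()
    }


# Hypothetical visit counts: 100 on weekdays and 50 on weekends in the baseline window.
BASE_START, BASE_END = date(2020, 1, 3), date(2020, 2, 6)
toy_counts = {}
day = BASE_START
while day <= BASE_END:
    toy_counts[day] = 50 if day.weekday() >= 5 else 100
    day += timedelta(days=1)
toy_counts[date(2020, 4, 30)] = 60  # a Thursday during survey wave 1

baseline = weekday_baseline(toy_counts, BASE_START, BASE_END)
change = percent_change(toy_counts, baseline)
print(round(change[date(2020, 4, 30)], 1))  # -40.0
```

Comparing each day with the median of the same weekday, rather than with a single overall median, keeps the regular weekday/weekend pattern from being mistaken for a change in mobility.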
Reproduction number estimates by the German Public Health Institute
R values used in our analysis as the “reference standard” for infection dynamics were obtained from the German Public Health Institute (Robert Koch Institute (RKI)) [22, 23]. The method applied by the RKI to obtain current R values is based on the reported numbers of individuals notified as newly infected with SARS-CoV-2 and includes a nowcasting approach that takes into account the delay in diagnosis, reporting and data delivery. If possible, incident cases are attributed to the day of first symptoms (information that is available for the majority of cases in the German notification system). If this information is not available, it is imputed taking into account measured delays from the day of the first symptom to the notification date, the age of the case and the day and week of notification. Based on this nowcasting, the RKI estimates the time-dependent reproduction number [24]. The 4-day reproduction number calculated by the RKI provides information on the transmission dynamics 8 to 13 days prior [23]. The R values are continuously corrected retrospectively for delayed notifications. We used R values provided by the RKI for 10 days after the timing of our survey waves as a reference for the comparison of infection dynamics. Since we extracted R values more than 1 year after the day they were calculated for, all delayed notifications were already accounted for. The R values based on case numbers as reported by the RKI reflect changes in transmission dynamics due to contact reduction measures as well as due to developing immunity in the population, while contact survey and mobility data cannot take population immunity into account. For this analysis, we assumed that SARS-CoV-2 immunity in the population was negligible, as this study only includes the first wave of the SARS-CoV-2 pandemic in Germany, and seroprevalence estimates for this period are below 1% in representative studies [25].
Data management and statistical analyses
Contact surveys
As the COVIMOD sample is not fully representative of the German population, we used data from the 2011 census to apply survey weights based on the participants’ age, sex, household size and region of residence [26]. The region of residence was not available for POLYMOD, so the POLYMOD data were only weighted according to the participants’ age, sex and household size using the R package “survey” [27]. As the COVIMOD data collection was not always started on the same day of the week and the duration of the survey waves varied slightly, we also weighted both COVIMOD and POLYMOD for weekdays/weekends.
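The idea behind such survey weights can be sketched in Python with simple cell-based post-stratification. This is an illustration under stated assumptions, not the procedure of the R package “survey”: the function names, the two strata and the population shares below are hypothetical.

```python
from collections import Counter


def poststratification_weights(sample_strata, population_share):
    """Weight per participant: population share of the stratum / sample share of the stratum."""
    n = len(sample_strata)
    sample_share = {s: c / n for s, c in Counter(sample_strata).items()}
    return [population_share[s] / sample_share[s] for s in sample_strata]


def weighted_mean(values, weights):
    """Mean of values in which each participant counts in proportion to their weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical sample: three "young" participants and one "old" participant,
# drawn from a population that is half young and half old.
strata = ["young", "young", "young", "old"]
contacts = [4, 4, 4, 2]
weights = poststratification_weights(strata, {"young": 0.5, "old": 0.5})

print(sum(contacts) / len(contacts))               # unweighted mean: 3.5
print(round(weighted_mean(contacts, weights), 6))  # weighted mean: 3.0
```

In this toy sample the over-represented stratum is down-weighted, so the weighted mean moves towards the value expected in the target population. The same logic extends to weekday/weekend weighting by treating the day type as a further stratum.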
We calculated the mean number of social contacts per participant per day as well as the 95% confidence interval of the bootstrapped mean of 1000 samples. We stratified social contacts by age group, sex, household size and the day of the week. Additionally, we assessed setting-specific contacts, i.e. home, childcare/school/university, work, public transport and others; childcare/school/university contacts were assessed in the subgroup of participants who reported attending childcare, school or university, and work contacts were assessed in the subgroup of participants who worked full- or part-time. We calculated social contact matrices for the age-specific mean number of direct social contacts using the “socialmixr” package in R [28]. To obtain the final contact matrices, the age-specific mean numbers of daily contacts were adjusted so that the total number of contacts of one age group with another was the same in both directions [28]. For the calculation of the contact matrices, participants who reported more than 100 group contacts were excluded from the analysis (COVIMOD: wave 3, 6 participants; wave 4, 13 participants; POLYMOD, 10 participants).
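Two of the steps above, the bootstrapped confidence interval and the reciprocity adjustment of the contact matrix, can be sketched in Python as follows. Both functions are illustrative assumptions: the percentile bootstrap is one common choice for the interval, and the adjustment uses the widely used form c'_ij = (c_ij N_i + c_ji N_j) / (2 N_i), where N_i is the population size of age group i. The function names and numbers are hypothetical and do not reproduce the “socialmixr” code.

```python
import random


def bootstrap_mean_ci(values, n_boot=1000, seed=1):
    """Percentile 95% interval of the mean from n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(values)
    means = sorted(sum(rng.choices(values, k=n)) / n for _ in range(n_boot))
    return means[int(0.025 * n_boot)], means[int(0.975 * n_boot) - 1]


def symmetrise(contact_matrix, population):
    """Adjust mean contacts so that total contacts from group i to j equal those from j to i.

    contact_matrix[i][j] is the mean number of daily contacts that a participant
    in age group i reports with people in age group j.
    """
    k = len(population)
    return [
        [
            (contact_matrix[i][j] * population[i] + contact_matrix[j][i] * population[j])
            / (2 * population[i])
            for j in range(k)
        ]
        for i in range(k)
    ]


# Hypothetical two-group example.
matrix = [[2.0, 1.0], [3.0, 4.0]]
population = [100, 200]
adjusted = symmetrise(matrix, population)
print(adjusted)  # [[2.0, 3.5], [1.75, 4.0]]

lower, upper = bootstrap_mean_ci([1, 2, 2, 3, 5, 8, 0, 4])
print(lower <= upper)  # True
```

After the adjustment, 3.5 × 100 contacts from group 0 to group 1 equal 1.75 × 200 contacts in the opposite direction, which is the reciprocity property the text describes.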
To assess how changes in infection dynamics are reflected by contact survey data, we applied two different approaches. First, we performed a simple analysis for which we calculated the mean relative reduction in contacts for each COVIMOD wave when compared to pre-pandemic data. For this, we translated the number of the mean contacts and the corresponding 95% confidence interval values into a mean relative reduction from baseline, i.e. in this case, the number of mean contacts before the SARS-CoV-2 pandemic as estimated in the POLYMOD study.
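As a worked illustration of this translation, the following Python sketch converts a mean and its confidence bounds into a relative reduction from a fixed baseline. Treating the baseline as a fixed number is a simplifying assumption of the sketch, and the function names and all values are hypothetical.

```python
def relative_reduction(value, baseline):
    """Relative reduction from baseline: 0 means no change, 1 means a complete reduction."""
    return 1.0 - value / baseline


def reduction_with_ci(mean, lower, upper, baseline):
    """Translate a mean and its confidence bounds into relative reductions.

    A higher value implies a smaller reduction, so the bounds swap places:
    the upper bound of the mean gives the lower bound of the reduction.
    """
    return (
        relative_reduction(mean, baseline),
        relative_reduction(upper, baseline),
        relative_reduction(lower, baseline),
    )


# Hypothetical values: 3.0 contacts per day (95% CI 2.5 to 3.5) against a baseline of 10.
reduction, reduction_low, reduction_high = reduction_with_ci(3.0, 2.5, 3.5, 10.0)
print(round(reduction, 2), round(reduction_low, 2), round(reduction_high, 2))  # 0.7 0.65 0.75
```

The same function applies unchanged when the quantity is a reproduction number and the baseline is R0.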
Second, we performed a more complex analysis by using additional information from the contact survey for calculating the next-generation matrix. We assumed that the next-generation matrix for SARS-CoV-2 is a function of the age-specific effective contact rate, given by the number of age-specific contacts multiplied by the probability of transmission per contact, and the duration of infectiousness [29]. Hence, the basic reproduction number (R0) is proportional to the dominant eigenvalue of the contact matrix [30]. To be able to calculate R as the result of a relative reduction in R0, we assumed that the social contact patterns before the implementation of the contact reduction measures were similar to the POLYMOD contact patterns and that the duration of infectiousness and the per-contact transmission probability remained constant. Additionally, we assumed that the transmission probability did not depend on age. Under these assumptions, the relative reduction of R compared to R0 is equivalent to the reduction in the contact matrices’ dominant eigenvalue allowing us to estimate the reproduction number corresponding to contacts recorded in COVIMOD. We assumed R0 during the first wave in Germany to follow a normal distribution with a mean of 2.6 and a standard deviation of 0.54 [3]. We drew 10,000 bootstrap samples from POLYMOD and COVIMOD to assess uncertainty.
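The eigenvalue-ratio argument can be illustrated with a minimal Python sketch. It uses plain power iteration to approximate the dominant eigenvalue, which is a choice made for this sketch and not necessarily the numerical routine used in the study; the function names and the matrices are hypothetical, and R0 is fixed at the assumed mean of 2.6 instead of being drawn from the normal distribution.

```python
def dominant_eigenvalue(matrix, iterations=1000, tol=1e-12):
    """Approximate the dominant eigenvalue of a non-negative square matrix by power iteration."""
    k = len(matrix)
    vector = [1.0] * k
    value = 0.0
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(k)) for i in range(k)]
        new_value = max(abs(x) for x in product)
        if new_value == 0.0:
            return 0.0
        vector = [x / new_value for x in product]
        if abs(new_value - value) < tol:
            return new_value
        value = new_value
    return value


def estimate_r(baseline_matrix, pandemic_matrix, r0=2.6):
    """Scale R0 by the ratio of dominant eigenvalues (pandemic / pre-pandemic)."""
    ratio = dominant_eigenvalue(pandemic_matrix) / dominant_eigenvalue(baseline_matrix)
    return r0 * ratio


# Hypothetical contact matrices: every contact rate is halved during the pandemic.
pre_pandemic = [[2.0, 1.0], [1.0, 2.0]]
pandemic = [[1.0, 0.5], [0.5, 1.0]]
print(round(estimate_r(pre_pandemic, pandemic), 6))  # 1.3
```

Because the transmission probability per contact and the duration of infectiousness are assumed constant, they cancel in the ratio, and only the contact matrices enter the calculation.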
Similar to the first approach, we then translated the R estimates from the COVIMOD study into a mean relative reduction from baseline, i.e. in this case, the basic reproduction number (assumed as R0 = 2.6).
Mobility data
We used mobility data collected for the same time intervals as the COVIMOD waves and compared them to the pre-pandemic data available from the respective data sources. In addition to assessing the distinct movement types provided by Google, we also composed an indicator for overall mobility by averaging across all movement types, separately for the Google mobility data and the Apple mobility data. Movements to parks were excluded from this indicator, as they are expected to vary considerably across seasons.
We calculated the mean relative change compared to pre-pandemic data within the time intervals corresponding to the COVIMOD waves as well as the 95% confidence interval of the bootstrapped mean of 1000 samples for Google and Apple mobility. In line with the approach we applied for COVIMOD and POLYMOD, we weighted the population mobility data for weekdays/weekends.
RKI reproduction number estimates
We calculated the mean R estimates for the corresponding time intervals 10 days after the COVIMOD waves as well as the 95% confidence interval of the bootstrapped mean of 1000 samples based on the daily R estimates provided by the RKI, the German Public Health Institute. We then translated the mean and 95% confidence interval values into a relative reduction from baseline, i.e. in this case, the basic reproduction number (assumed as R0 = 2.6 during the first wave in Germany), to provide a reference standard for infection dynamics against which the changes in social contact data and population mobility data could be compared.
Weights by contact type and calibration of scaling factors
As the probability that a contact leads to a transmission varies according to the setting, we performed additional analyses using two different concepts to take this into account. First, we assigned different but specific weights to home contacts/home mobility and non-home contacts/non-home mobility (i.e. all other contact settings combined) based on setting-specific secondary attack rates (SAR) from a systematic review by Thompson et al. [31]. Based on Thompson et al., the household SAR was estimated to be 21.1% and the SAR in a healthcare setting, at the workplace and with casual close contacts to be 3.6%, 1.9% and 1.2%, respectively. We used normalised weights based on the household SAR and the average of the healthcare, workplace and casual close contact SARs (SAR = 2.23%) and applied the household weight to the home contacts/home mobility and the non-household weight to the non-home contacts/non-home mobility. We then allowed for an additional scaling factor per approach (simple approach: mean relative reduction in contacts; complex approach: contact data with next-generation matrix; Google mobility data). The same scaling factor was used within each approach for all waves, for all types of contacts in the contact survey approaches and for all types of mobility in the mobility approach. We fitted this scaling factor to minimise the residual sum of squares across the four survey waves when compared to our reference standard, i.e. relative reductions estimated based on R values reported by the RKI. For a better understanding of the effect of contact/mobility-type weights, we also performed an analysis in which we fitted the scaling factor with the same weight for all types of contacts and mobility.
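The weighting and scaling in this first concept can be sketched in Python. Several points are assumptions of the sketch and are not stated in the text: the weights are normalised to sum to 1, the weighted reduction is a weighted sum of the home and non-home reductions, and the scaling factor is obtained from the closed-form least-squares solution instead of a numerical optimiser. All per-wave numbers and function names are hypothetical.

```python
def normalised_weights(household_sar, other_sars):
    """Household and non-household weights that sum to 1 (assumed normalisation)."""
    non_household_sar = sum(other_sars) / len(other_sars)
    total = household_sar + non_household_sar
    return household_sar / total, non_household_sar / total


def fit_scaling_factor(predicted, reference):
    """Scaling factor s minimising sum((s * predicted - reference) ** 2) over the waves."""
    return sum(p * r for p, r in zip(predicted, reference)) / sum(p * p for p in predicted)


# SAR values (%) as quoted from Thompson et al.: household, then healthcare, workplace, casual.
w_home, w_non_home = normalised_weights(21.1, [3.6, 1.9, 1.2])
print(round(w_home, 3), round(w_non_home, 3))  # 0.904 0.096

# Hypothetical relative reductions per wave (not study data).
home_reduction = [0.10, 0.05, 0.00, -0.05]
non_home_reduction = [0.80, 0.70, 0.60, 0.50]
reference_reduction = [0.70, 0.68, 0.66, 0.60]

# Assumed combination rule: weighted sum of the two reductions per wave.
weighted = [w_home * h + w_non_home * n for h, n in zip(home_reduction, non_home_reduction)]
scale = fit_scaling_factor(weighted, reference_reduction)
residual_ss = sum((scale * w - r) ** 2 for w, r in zip(weighted, reference_reduction))
print(scale > 0, residual_ss >= 0)  # True True
```

For a single multiplicative factor the least-squares problem has the closed-form solution used here; a general-purpose optimiser should converge to the same value.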
In the second concept, we did not apply pre-defined weights for home/non-home contacts and for home/non-home mobility but fitted them from the data by allowing independent scaling factors for home contacts/home mobility and non-home contacts/non-home mobility per approach (simple approach: mean relative reduction in contacts; complex approach: contact data with next-generation matrix; Google mobility data). By doing so, we estimated the relative weights for both contact/mobility types based on the data collected for this study and did not take into account external information on transmission probabilities in different settings. The optim function in R was used for the fitting/scaling. Apple mobility data could not be used for these analyses, as they do not differentiate between home and non-home mobility.
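Fitting two independent scaling factors can be sketched in Python as a two-parameter least-squares problem. The sketch assumes a linear model, reference ≈ a × home + b × non-home, and solves the normal equations directly; this replaces the numerical search performed with optim in R and is offered only as an illustration. The function name and all values are hypothetical.

```python
def fit_two_scaling_factors(home, non_home, reference):
    """Least-squares factors (a, b) for reference ~ a * home + b * non_home."""
    s_hh = sum(h * h for h in home)
    s_hn = sum(h * n for h, n in zip(home, non_home))
    s_nn = sum(n * n for n in non_home)
    s_hy = sum(h * y for h, y in zip(home, reference))
    s_ny = sum(n * y for n, y in zip(non_home, reference))
    determinant = s_hh * s_nn - s_hn * s_hn
    if determinant == 0:
        raise ValueError("home and non-home series are collinear; factors are not identifiable")
    a = (s_hy * s_nn - s_hn * s_ny) / determinant
    b = (s_hh * s_ny - s_hn * s_hy) / determinant
    return a, b


# Hypothetical series for four waves, constructed so that a = 0.3 and b = 0.7 fit exactly.
home = [1.0, 0.0, 1.0, 2.0]
non_home = [0.0, 1.0, 1.0, 1.0]
reference = [0.3, 0.7, 1.0, 1.3]
a, b = fit_two_scaling_factors(home, non_home, reference)
print(round(a, 6), round(b, 6))  # 0.3 0.7
```

With only four waves and two free parameters, the fit has few degrees of freedom, so the fitted factors are sensitive to each individual wave.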
Comparison of the results of the different approaches with the reference standard
For all analyses, we calculated the mean absolute percentage error of the estimates obtained by the approaches for the COVIMOD contact data as well as for the Google and Apple mobility data when compared to the reference standard of relative changes in infection dynamics based on R estimates from the RKI. We did this in the base case concept without scaling factor and contact type-specific weighting, as well as in all three concepts with scaling factors. Moreover, we applied repeated measures ANOVA to assess the differences between error rates provided by the different data sources.
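The error measure can be written out as a short Python sketch. It uses the standard definition of the mean absolute percentage error; the function name and the numbers are hypothetical.

```python
def mean_absolute_percentage_error(estimates, reference):
    """Mean of |estimate - reference| / |reference| across waves, expressed in percent."""
    if len(estimates) != len(reference):
        raise ValueError("estimates and reference must have the same length")
    errors = [abs(e - r) / abs(r) for e, r in zip(estimates, reference)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical relative reductions for two waves.
mape = mean_absolute_percentage_error([0.5, 0.9], [0.4, 1.0])
print(round(mape, 2))  # 17.5
```

The measure is undefined when a reference value is zero, so it suits this setting only as long as the reference reductions stay away from zero.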
R version 4.0.2 was used for all analyses [32]. Further specifications of the analyses can be found in Additional file 4.