Social networks have proven useful for understanding and predicting infectious disease dynamics. There is ongoing discussion about how detailed network data must be in order to be useful in epidemiological applications [6, 14, 22]. However, even mapping low-detail social contact networks is typically too resource-intensive to be practical for most communities and institutions. What is needed instead are low-cost proxies for individual network properties that can serve as epidemiological predictors. Spatial distance measures, for example, have recently been found to be significant predictors of social ties (among other predictors), and it is therefore reasonable to expect that spatial proxies can also serve as useful epidemiological predictors. The collocation ranking method presented here is based on spatio-temporal considerations, and our results suggest that it may effectively identify subpopulations suited for sentinel surveillance systems and prevention strategies.
Current methods to identify subpopulations for sentinel surveillance systems and prevention strategies typically rely on demographic variables such as age (for example, children and young adults in influenza surveillance systems [24–26]) and geographic location (for example, administrative units in invasive meningococcal disease surveillance systems [27–29]). These methods work because there is sufficient variance of such demographic variables at the societal level. However, at the level of communities and institutions such as schools, there is often too little variance to make these methods applicable. Furthermore, because demographic variables are not direct proxies for transmission routes, they may fail to identify individuals with high transmission potential who fall outside of the targeted range of the demographic variable. In contrast, the collocation ranking indicator proposed here is a direct proxy of potential disease transmission events as given by the contact network.
Random selection serves as a null model in the absence of epidemiologically relevant information about a population. The collocation ranking method significantly outperforms random selection. Some network indicators, such as strength, outperformed the collocation ranking method in identifying subpopulations for early detection or targeted intervention strategies. This is not surprising: strength is essentially a direct measure of exposure, so it identifies subpopulations that are almost identical to the optimal subpopulation. Nevertheless, measuring strength is resource-intensive, while collocation ranking is not.
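To make the comparison between the strength indicator and the random null model concrete, the following is a minimal sketch, not the implementation used in this study. It assumes a contact network given as a list of weighted edges, where the weight stands for some measure of contact intensity such as total contact duration. The edge list, the function names, and the tie-breaking rule are all hypothetical and chosen for illustration only.

```python
import random
from collections import defaultdict


def node_strength(weighted_edges):
    """Return the strength of each node: the sum of its contact weights."""
    strength = defaultdict(float)
    for u, v, w in weighted_edges:
        strength[u] += w
        strength[v] += w
    return dict(strength)


def top_k_by_strength(weighted_edges, k):
    """Select the k nodes with the highest strength (ties broken by node label)."""
    strength = node_strength(weighted_edges)
    ranked = sorted(strength, key=lambda n: (-strength[n], n))
    return ranked[:k]


def random_subpopulation(nodes, k, seed=None):
    """Null model: select k nodes uniformly at random, ignoring the network."""
    rng = random.Random(seed)
    return rng.sample(sorted(nodes), k)


# Hypothetical toy network: (person, person, contact weight).
edges = [("a", "b", 10), ("a", "c", 5), ("b", "c", 1), ("c", "d", 2)]

by_strength = top_k_by_strength(edges, 2)        # ["a", "b"]
at_random = random_subpopulation(node_strength(edges), 2, seed=0)
```

The strength-based selection requires the full weighted contact network as input, which is exactly the data that is costly to collect, whereas the random selection requires only a list of individuals.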
Our research is not without limitations. The first limitation is that we rely on widely used computational simulation models of disease spread, rather than validating our method in an empirical setting. Our simulation model is based on high-resolution contact network data as well as established disease transmission parameters [20, 21], but ideally, any benchmark would be based on empirical outbreak data instead of simulated data. However, because infection transmission is a highly stochastic process, a robust empirical evaluation of the collocation ranking method would require data from multiple outbreaks.
Limitations and uncertainties of our model are, in particular, the following: (i) There is still debate on the relative importance of the different potential pathways of influenza transmission [30–32]. Most models of influenza spread assume transmission by close contact, but there is the possibility that other transmission pathways are more important than currently thought. (ii) We model the spread between members of the school population during school hours, but we do not capture potentially infectious contacts between school members during their leisure time. (iii) We assumed that the probability of being an index case is homogeneous. In reality, this is most likely not the case. (iv) We also assumed that all individuals are fully susceptible. In reality, individuals differ in their serostatus and (partial) immunity is linked to patterns of previous exposure. (v) It might be that an ongoing epidemic changes the contact behavior not only of the symptomatic individuals, but also of the healthy ones who continue to attend school. Such potential behavior changes are not reflected in our model.
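Assumptions (iii) and (iv) can be made explicit in code. The sketch below is a deliberately simplified, hypothetical outbreak simulation on a weighted contact network and is not the simulation model used in this study. The per-contact-unit transmission probability `beta`, the fixed infectious period `recovery_days`, the daily time step, and the toy edge list are illustrative assumptions. The index case is drawn uniformly at random, and every other individual starts fully susceptible.

```python
import random


def simulate_outbreak(weighted_edges, beta, recovery_days, max_days, seed=None):
    """Simplified stochastic outbreak; returns (index_case, set of ever-infected nodes)."""
    rng = random.Random(seed)
    nodes = sorted({n for u, v, _ in weighted_edges for n in (u, v)})
    neighbors = {n: [] for n in nodes}
    for u, v, w in weighted_edges:
        neighbors[u].append((v, w))
        neighbors[v].append((u, w))

    # Assumption (iii): every individual is equally likely to be the index case.
    index_case = rng.choice(nodes)
    # Assumption (iv): everyone except the index case starts fully susceptible.
    days_left = {index_case: recovery_days}  # infectious node -> infectious days remaining
    ever_infected = {index_case}

    for _ in range(max_days):
        if not days_left:
            break  # no infectious individuals remain
        newly_infected = set()
        for infectious in days_left:
            for other, w in neighbors[infectious]:
                if other in ever_infected:
                    continue
                # Probability of at least one transmission over w contact units.
                p = 1.0 - (1.0 - beta) ** w
                if rng.random() < p:
                    newly_infected.add(other)
        # Infectious individuals lose one day; those at their last day recover.
        days_left = {n: d - 1 for n, d in days_left.items() if d > 1}
        for n in newly_infected:
            days_left[n] = recovery_days
            ever_infected.add(n)
    return index_case, ever_infected


# Hypothetical toy network: (person, person, contact weight).
edges = [("a", "b", 10), ("a", "c", 5), ("b", "c", 1), ("c", "d", 2)]
index_case, infected = simulate_outbreak(edges, beta=0.05, recovery_days=3,
                                         max_days=30, seed=42)
```

Relaxing assumption (iii) would amount to replacing the uniform draw of the index case with a weighted one, and relaxing assumption (iv) would amount to starting some individuals outside the susceptible state. Neither variant is evaluated here.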
Another limitation is that the data used to test our method were collected in one school only. Moreover, the data cover only one school day. While the method worked very well in this setting, its generalizability to other settings remains to be established.
Finally, we had to reconstruct individual schedules from aggregated schedules and mote data. Reconstructions may be incomplete (compare with Additional file 1), and the real course of a school day may differ from the scheduled sequence of classes. While it is important to recognize that we currently cannot conclusively validate our method, our simulation results indicate that the collocation method is an effective, low-cost tool that warrants further research.