
Table 27 Relationship and collaboration between TG9 (topic group 9) and the other topic groups of the STRATOS initiative

From: Statistical analysis of high-dimensional biomedical data: a gentle introduction to analytical goals, common approaches and challenges

All other topic groups work on issues that are also relevant for the analysis of high-dimensional data (HDD). Their papers, however, are all written in the context of low-dimensional data (LDD). Appropriate study designs (TG5) are key to improving research in the health sciences, and it is well known that mistakes in design are often irremediable [229]. Nearly all studies in HDD and LDD have to cope with missing data (TG1, [58]), and data preprocessing is a relevant topic for all studies, closely related to tasks in initial data analysis (IDA; TG3). In analyses of LDD, the importance of IDA has been largely ignored, and a recent review showed that reporting of IDA is sparse [230]. In section “IDA: Initial data analysis and preprocessing,” we provided a discussion of IDA aspects in the context of HDD. Measurement error and misclassification (TG4) are common problems in many LDD and HDD studies and are often ignored in practice [231]. Studies with a survival time outcome are popular in HDD, and they have to cope with several issues discussed in the survival analysis group (TG8, [232]).

In the context of LDD, TG2 published a review focusing on approaches and issues in deriving multivariable regression models for description [136]. Although analyses of HDD concentrate more on models for prediction, some of these issues remain relevant, and the very large number of variables and (too) small sample sizes severely exacerbate some of the problems. For LDD, issues in deriving models for prediction are discussed in TG6 [233]. Finally, the overarching aim of many HDD studies is to discover knowledge that is causally related to an outcome of interest. However, causal inference poses several important challenges (TG7, [234]).