The “network approach to psychopathology,” a collective term for theoretical, methodological, and empirical work conceptualizing psychiatric disorders as networks of causally interacting symptoms (i.e., nodes), reflective of complex systems, has become increasingly prominent over the last decade [1]. Specifically, according to this approach, psychopathology is not the result of an underlying latent variable responsible for causing the observed symptoms, but rather emerges from the dynamic and causal interactions among symptoms [2,3,4,5,6,7]. Thus, a presumably causal network of symptoms (“nodes”) and the connections between them (“edges”) constitutes a specific disorder [8]. While key theoretical concepts and hypotheses underlying this approach have been outlined by several different contributors [2, 5, 6, 9,10,11,12,13,14,15], they all share the fundamental assumption that applying concepts and methods developed in “network science” will yield novel insights into the nature of psychopathology, with relevant and important clinical implications (e.g., [13]).
In this regard, high hopes have been placed especially on the concept of node centrality [11], an indicator of the importance of different nodes within a specific network [16]. Put differently, a node’s centrality reflects its influence over other nodes in the network, or its relevance to the entire network structure, such that nodes with high centrality are considered to have above-average influence on the rest of the network [2]. In empirical data, node centrality can be determined using several centrality metrics [6], including, among others, node strength, predictability, and expected influence (for more details see [17, 18]), with higher values reflecting greater node centrality/influence. More formally, strength is defined as the sum of the absolute values of all edge weights of a node [17]. Expected influence is similar to strength, but takes directionality (i.e., whether an edge weight is negative or positive) into account by using the actual edge weights rather than their absolute values [18]. Predictability is equal to the upper bound of the shared variance of a given node (measured as R2) with all its neighboring nodes, assuming that all connections are directed towards that given node [19]. Thus, strength and expected influence are both relative measures of node centrality, whereas predictability is considered a more “objective” centrality measure, as it can be compared across different networks. From a clinical standpoint, it has been argued that if central nodes within a psychopathology network represent highly influential causal symptoms, then treatments specifically targeting these central nodes/symptoms should be more efficacious than treatments that do not.
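As an illustration, strength and expected influence can be computed directly from a network’s weight matrix. The matrix below is a hypothetical toy example; predictability is omitted because it is estimated as a nodewise R2 from the raw data rather than from the weight matrix alone.

```python
import numpy as np

# Hypothetical symmetric weight matrix of a 4-node symptom network
# (edge weights as estimated by, e.g., a partial-correlation model).
W = np.array([
    [ 0.0, 0.3, -0.2, 0.0],
    [ 0.3, 0.0,  0.4, 0.1],
    [-0.2, 0.4,  0.0, 0.0],
    [ 0.0, 0.1,  0.0, 0.0],
])

# Strength: sum of the absolute edge weights of each node.
strength = np.abs(W).sum(axis=1)

# Expected influence: sum of the signed edge weights of each node,
# so negative edges lower (rather than raise) a node's value.
expected_influence = W.sum(axis=1)

print(strength)            # node 1 has the highest strength (0.8)
print(expected_influence)  # node 0 drops from 0.5 to 0.1 once signs count
```

Note how the single negative edge leaves node 0’s strength unchanged but sharply reduces its expected influence, which is precisely the distinction between the two metrics described above.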
Specifically, targeting highly central nodes to reduce their severity should propagate to other nodes in the network causally affected by them, thereby eventually collapsing the entire network structure and alleviating the overall psychopathology [2]. For example, if sleep quality is a central node causally affecting concentration and irritability, then enhancing sleep quality would also improve concentration and reduce irritability. Indeed, the results of numerous empirical network studies in psychopathology have been interpreted in light of this stipulation (e.g., [20,21,22,23]), with some grounding the rationale for conducting their studies, at least partially, on this claim (e.g., [24,25,26,27]).Footnote 1 Hence, elucidating the validity of this hypothesis is of crucial importance for the network approach to psychopathology in general, and, more specifically, for its clinical significance and implications.
Here we summarize key theoretical, methodological, and empirical evidence pertaining to the centrality hypothesis. We focus on networks derived from cross-sectional between-subject data, as most network research in psychopathology has used this kind of data [1, 30, 31], including prior empirical investigations specifically exploring the predictive validity of central nodes as treatment targets [18, 32,33,34]. We first introduce and discuss several theoretical limitations of the centrality hypothesis. We then summarize existing empirical evidence pertaining to the centrality hypothesis and discuss key methodological issues of extant research. Next, using a specific dataset as an example, we empirically test the centrality hypothesis by replicating the methods used by prior studies while addressing some of their limitations. Specifically, we examine a sample of 710 treatment-seeking adult veteran patients with posttraumatic stress disorder (PTSD) who completed a PTSD assessment, including both clinician-assessed and self-reported measures, before and after PTSD-specific treatment. Finally, we discuss the implications of our empirical results, in light of the presented theoretical and methodological arguments, for both researchers and clinicians working under the “network approach to psychopathology.”
Theoretical aspects
The validity of any hypothesis is always built upon the validity of its underlying assumptions. Thus, here we will outline and examine the validity of some explicit and implicit assumptions underlying the centrality hypothesis in general, and, more specifically, in the context of networks based on cross-sectional between-subject data. In doing so, we assume that the critical hypothesis of the network approach, namely, that “symptoms may cohere as syndromes because of causal relations among the symptoms themselves” [1] is true and that this can indeed be modeled by network analytic methods.
A first fundamental assumption underlying the centrality hypothesis is that centrality metrics reliably model the causal importance of individual nodes. This assumption, however, has been questioned on different grounds. First, commonly used centrality metrics stem from the field of social networks, and it remains unclear whether they can indeed be effectively applied to complex networks describing psychopathology, as they are based on assumptions that seem implausible in relation to psychopathology [35]. For example, the nodes of a network are assumed to be fully interchangeable (i.e., conceptually equivalent), which seems implausible when considering the clinical meaning of psychopathological symptoms. For instance, although suicidality and insomnia are both symptoms of a major depressive episode, their clinical meaning and implications differ significantly when estimating depression severity, prognosis, and treatment options; no clinician would consider the two substitutable. Thus, the assumption that nodes are fully interchangeable is clearly violated. Moreover, the conceptual validity of these centrality metrics has been doubted even within social network science (for more details see [34]). Second, a network, and thus its centrality measures, can only reflect true causal relations if all variables with a relevant causal effect are indeed included in the model [36], without omitting any important causal variables [8]. Currently, however, it seems highly implausible that all relevant causal influences on an examined psychopathology are even known, let alone included in the corresponding networks.
To assume the validity of the centrality hypothesis, a second fundamental assumption must be made, namely, that the first assumption above (i.e., that centrality metrics reliably model the causal importance of individual nodes) holds in any specific empirical context in which it is used or examined. Here, too, however, several discrepancies and inconsistencies arise. First, the assumption that symptoms causally interact with each other implies that they do so within the individual and over time, necessitating empirical methods that can recover these effects with adequate precision. Regarding the within-individual requirement, the sufficient and necessary conditions under which individual effects can be recovered from between-subject data, known as group-to-individual generalizability, are highly debated [37,38,39,40]. While some claim that generalizability is only possible if group effects are homogeneous across individuals (i.e., that they are ergodic, a process in which every sample is equally representative of the whole [37]), others consider ergodicity a sufficient condition while questioning it as a necessary one [39] (for more details on this important debate, see [40,41,42]).Footnote 2 Regarding the over-time aspect of the causality assumption, research has shown that networks based on longitudinal data differ from networks based on the same “cross-sectionalized” data (e.g., obtained by averaging the data) [43]; that some effects (e.g., temporal ones) can only be assessed in longitudinal data [44]; and that centrality derived from a network based on longitudinal data does not correlate with centrality derived from a cross-sectional network based on the same averaged longitudinal data [45]. Second, while networks based on cross-sectional data and/or group-level analysis are most common [30, 46], some have used idiographically collected data to estimate centrality measures.
However, a recent simulation study demonstrated that current network analytic methods are only partially successful in recovering the properties and dynamics of bi-stable systems (representing a healthy and a “sick” state) in a common idiographic research setting [47]. Third, Gaussian graphical models, the models most often used to estimate networks from cross-sectional between-subject data, have been shown to be incapable of differentiating between several possible underlying causal models (i.e., directed acyclic graphs [48]), with centrality found to potentially reflect common endpoints (i.e., causal results) rather than causally important symptoms [1]. Finally, the methodological choices made during the process of estimating a data-driven network have a substantial influence on the resulting network structure and, hence, on the emerging centrality measures [49, 50].Footnote 3 Moreover, even when following the same procedure outlined and implemented in an R package, instability of some centrality indices across studies still emerges [30].
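The point that centrality may flag common endpoints can be illustrated with a minimal simulation (a hypothetical three-node toy model, not drawn from the cited studies): a node that is purely a common effect of two independent symptoms ends up with the highest strength in the Gaussian graphical model estimated from the data, even though intervening on it would change nothing upstream.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200_000

# Hypothetical causal model: two independent symptoms x and y both
# cause a third symptom z, which itself causes nothing.
x = rng.normal(size=n)
y = rng.normal(size=n)
z = 0.7 * x + 0.7 * y + rng.normal(size=n)
data = np.column_stack([x, y, z])

# Gaussian graphical model: partial correlations from the precision matrix.
precision = np.linalg.inv(np.cov(data, rowvar=False))
d = np.sqrt(np.diag(precision))
pcorr = -precision / np.outer(d, d)
np.fill_diagonal(pcorr, 0.0)

strength = np.abs(pcorr).sum(axis=1)
print(strength)     # z, the pure endpoint, has the highest strength
print(pcorr[0, 1])  # spurious negative x-y edge from conditioning on z
```

Note that the undirected network even contains a spurious (negative) edge between the two true causes, a known consequence of conditioning on their common effect.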
In sum, from a theoretical standpoint, centrality metrics appear limited in their ability to reveal causally influential nodes [52]. In addition, standardized processing pipelines are sorely needed to enable comparability of empirical results across studies. Taken together, these theoretical assumptions and concerns challenge the validity of centrality measures in identifying symptoms constituting optimal treatment targets, especially in cross-sectional between-person networks, which have nevertheless dominated empirical network research over the last several years [1, 30].
Methodological aspects
Putting aside the theoretical aspects described above, one should also consider some of the methodological aspects of research efforts aimed at exploring the centrality hypothesis, which we will now discuss. First, we will describe extant studies examining the centrality hypothesis more indirectly, not focusing specifically on symptom change over time. We will then elaborate on a more direct approach used to examine the centrality hypothesis, describe findings of studies that have used it, and address some of the inherent limitations characterizing it.
While no study has yet investigated the centrality hypothesis head-on by comparing the clinical efficacy of an intervention targeting pre-treatment central symptoms with that of an intervention targeting pre-treatment non-central symptoms, different studies have tried to elucidate the validity of the centrality hypothesis, or some of its assumptions, using different methodological approaches. Some have compared features of networks constructed for the same sample at two different time-points, between which symptoms were expected to differ. However, opposite and contrasting network connectivity-to-overall-symptom associations emerged [53, 54]. Others have compared the baseline network structures (i.e., assessed at a single time-point) of two sub-samples of a single cohort “created” based on a difference in symptoms found at a later time-point (i.e., poor vs. good treatment responders). Here, too, opposite result patterns emerged [55, 56]. Some have tried to address the centrality hypothesis using simulation-aided procedures, showing that the removal of central nodes from a given network has no larger effect on the resulting network structure than removing nodes at random [57]. However, simulation studies can provide only indirect evidence with regard to the centrality hypothesis. Finally, others have examined whether centrality measures could predict clinical outcomes at a later time-point [33, 58]. While yielding some positive findings, these studies did not examine symptom change over time, providing only indirect evidence for the centrality hypothesis. Furthermore, the latter study also found that the centrality-outcome relationships were not significantly stronger than those obtained for the simple feature of symptom count [58].
While the aforementioned research has considerably advanced our knowledge in the field, only three studies to date were designed to more directly assess the centrality hypothesis as it relates to symptom change over time [18, 32, 34], with two examining the validity of pre-treatment central nodes in predicting symptom change over the course of treatment [18, 32]. All three studies used the same procedure, developed by Robinaugh et al. [18]. This procedure is based on the assumption that if nodes are causally connected, then a change in one node’s severity from one time-point to another (Δnode) will impact the severity of all remaining nodes of the network to which it is connected (summed up as Δnetwork [18]). Hence, a relation between Δnode and Δnetwork is assumed. Given that centrality identifies nodes with higher causal importance within a network, changes in central nodes from one time-point to the next should cause proportionally greater changes in the rest of the network than changes in less central nodes. Consequently, centrality should be associated with the strength of the relation between Δnode and Δnetwork [18].
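A simplified sketch of this logic, using synthetic data and arbitrary centrality values (the published studies estimate centrality from a regularized network model), might look as follows:

```python
import numpy as np

rng = np.random.default_rng(0)
n_subj, n_nodes = 300, 10

# Hypothetical pre- and post-treatment symptom ratings (subjects x nodes).
pre = rng.normal(size=(n_subj, n_nodes))
post = pre - rng.normal(0.5, 1.0, size=(n_subj, n_nodes))
delta = post - pre  # per-subject symptom change for each node

# Hypothetical centrality values, one per node (arbitrary here; in the
# actual procedure these come from the estimated pre-treatment network).
centrality = rng.uniform(size=n_nodes)

# For each node i: the Δnode-Δnetwork association, i.e., the correlation
# across subjects between node i's change and the summed change of all
# other nodes.
assoc = np.empty(n_nodes)
for i in range(n_nodes):
    d_node = delta[:, i]
    d_network = delta.sum(axis=1) - d_node
    assoc[i] = np.corrcoef(d_node, d_network)[0, 1]

# The procedure's test statistic: the correlation, across nodes, between
# centrality and the Δnode-Δnetwork association (n_nodes observations).
r = np.corrcoef(centrality, assoc)[0, 1]
print(round(r, 3))
```

The final correlation is computed over as many observations as there are nodes, a feature whose consequences for statistical power are discussed below.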
Examining the results of the studies using the Robinaugh et al. procedure reveals mixed findings and some limitations characterizing each of them [18]. First, Robinaugh et al., examining complicated grief using a 13-item questionnaire among 195 participants, reported that all assessed centrality measures (e.g., strength, closeness, betweenness, expected influence) strongly correlated with the Δnode-Δnetwork association [18]. However, the obtained estimates had large confidence intervals (e.g., for strength, r(11) = .66 [.18, .89]), limiting their precision. Also, and most relevant to the present investigation, the authors did not investigate a treatment sample with pre- and post-treatment assessments, but rather a cohort from a longitudinal study of bereavement. Second, Rodebaugh et al., examining social anxiety using a 22-item measure in a sample of 244 patients undergoing treatment, also found a significant correlation between several centrality measures (strength, betweenness, and a composite centrality index) and the Δnode-Δnetwork association [32]. However, the observed effects failed to generalize to three additional social anxiety measures,Footnote 4 a generalization that is to be expected under the centrality hypothesis. Moreover, infrequency of symptom endorsement (i.e., the number of times a symptom was rated zero by participants), specifically chosen because it has no obvious causal bearing on the Δnode-Δnetwork association, was not only found to be predictive but also generalized across the other measures. Finally, Papini et al., examining posttraumatic symptoms using a 17-item questionnaire in a sample of 306 female patients with co-occurring substance use disorders and full or subthreshold PTSD, found two pre-treatment centrality measures (i.e., node strength and predictability) and one non-centrality node property (i.e., symptom severity) to be significantly correlated with the Δnode-Δnetwork association [34].
However, these measures were also found to be of limited robustness. Moreover, neither generalization to other measures nor the effect of infrequency of symptom endorsement was examined. Finally, a limitation shared by all three studies is their relatively small samples, which limit the stability of the network structure and the corresponding centrality metrics.Footnote 5
While the Robinaugh et al. procedure is assumed to examine the centrality hypothesis more directly, some of the procedure’s inherent limitations should be discussed, as they may also explain the aforementioned mixed findings [18]. First, as pointed out by Rodebaugh et al., centrality measures are known to be affected by item properties such as variance or ceiling effects [32, 59]. Thus, it may be that the predictiveness of centrality measures is simply driven by these simple, non-causal item properties. Second, (symptom) change is a second-order concept that is inferred from differences between constructs/networks obtained at two (or more) assessments assumed to measure the “same” thing [42, 60, 61]. However, this assumption of invariance has been mostly overlooked in the context of repeated cross-sectional network analyses. Third, as the test statistic used in this procedure is the correlation between nodes’ centrality and the Δnode-Δnetwork association, the number of nodes corresponds to the number of observations. Thus, the power of this analysis is a priori restricted by the number of nodes included in the examined network (assuming a constant effect size and alpha level). Consequently, to reach an adequate statistical power of 0.8, a network must comprise at least 21 nodes for a strong effect and 64 nodes for a medium effect. However, psychopathology measures containing symptom checklists with 64 items rarely exist. Moreover, the precise estimation of centrality in a network containing 64 nodes would require a sample of several hundred participants, limiting the contexts in which the hypothesis can be investigated using this approach. Consequently, investigations based on fewer nodes will not only have limited power but will also yield imprecise estimates of the investigated correlation (i.e., large confidence intervals).
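These node counts can be sanity-checked with a standard Fisher-z power approximation (assuming one-tailed α = .05 and power = .80); being an approximation, it yields slightly more conservative counts than the exact routine presumably behind the figures above, but the order of magnitude is the same.

```python
import math

# Fisher-z approximation of the sample size needed to detect a correlation
# of size r. Here the "observations" are the network's nodes, so the
# result is a minimum node count.
Z_ALPHA = 1.6449  # standard normal quantile for one-tailed alpha = .05
Z_POWER = 0.8416  # standard normal quantile for power = .80

def min_nodes(r):
    c = math.atanh(r)  # Fisher z-transform of the target correlation
    return math.ceil(((Z_ALPHA + Z_POWER) / c) ** 2 + 3)

print(min_nodes(0.5))  # strong effect: 24 nodes under this approximation
print(min_nodes(0.3))  # medium effect: 68 nodes under this approximation
```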
Taken together, prior empirical investigations of the validity of centrality measures as signals of symptom change have produced mixed findings, with different methodologies, centrality measures, and effects used and examined across studies [33, 57, 58]. While the three studies using the Robinaugh et al. procedure, which more directly examines the validity of central nodes in predicting treatment change, showed that centrality was partially successful in doing so, this success was limited to the measure used to construct the network and did not generalize to other measures of the same psychopathology, as would be expected under the centrality hypothesis [18]. In addition, results also showed some simple non-centrality measures to outperform centrality measures.
The empirical study
Notwithstanding the aforementioned methodological and theoretical arguments, if we do choose to assume that the centrality hypothesis is true, and that the procedure by Robinaugh et al. [18] is, in principle, adequate for investigating the centrality hypothesis in a cross-sectional, between-subject context, then we should be able to reliably demonstrate (a) the predictive validity of centrality indices and (b) their generalizability to different measures of the same psychopathology (i.e., predictiveness across different questionnaires).
In this empirical part of our study, we aimed to test these two hypotheses in a large sample of PTSD patients (N = 710), assessed before and after treatment completion. To ensure comparability with previous work, we tested the centrality hypothesis using the same method applied by the three studies mentioned above. As different centrality measures were used in these studies, we chose to examine all those found to be predictive of the Δnode-Δnetwork association in any of them (i.e., strength, expected influence, and predictability). To test the generalizability of obtained results to other measures of the same psychopathology, explored in only one of the previous studies [32], we examined the predictive validity of the included centrality indices using two measures of PTSD. We also repeated the above-described analyses for two simple non-centrality symptom measures (i.e., mean symptom severity and infrequency of symptom endorsement), whose predictive properties are not based on causal assumptions deduced from network theory, to better tease apart predictiveness from causality. We chose measures that were used in previous studies but that have yielded mixed results [32, 34]. Finally, to examine the invariance assumption, we assessed the degree of invariance of both networks from before to after treatment by comparing the pre- and post-treatment networks.
In sum, here we examine the centrality hypothesis using (1) three centrality measures (i.e., strength, expected influence, predictability); (2) in a large sample of patients with the same primary disorder (N = 710); (3) assessed before and after treatment; (4) using two psychopathology measures of PTSD; (5) while also incorporating two simple non-causal node properties (i.e., mean symptom severity and infrequency of symptom endorsement).