I personally prefer constraining the layout of F-R to be equal in case there are considerable similarities across networks, using the averageLayout() function in R (e.g. https://osf.io/2t7qp/).

This is an excellent post (clearly I’m a bit behind on finding it), thank you so much. I had a short question for you regarding your suggestion to Amy. Will using the method you suggested to her create a sort of “averaged” Fruchterman-Reingold layout that is no longer technically a Fruchterman-Reingold layout (but more of a “forced” layout)? I am working on a psychopathy paper with multiple samples and the reviewers want the nodes to be in the same place for ease of comparison, and I am struggling to choose between having supplementary graphs that just use a “circle” layout and using the method you specify here.

Thanks so much for these tutorials and any help!

Jon

“Okay, if you use the full partial-correlation permutation (A~B+C etc) then you indeed have a lot of estimation to do. But what if you take the raw correlations. Aren’t they somehow ‘observed’ (measured as the strength of association between eg. sleep issues and lack of concentration – similar to asking A how much of a friend s/he is with B and vice versa [1-7]?)? Or would you say there is still a difference between a low-noise observation and a more noisy measurement (estimation?) of the association between two fMRI timeseries?”

—— I am conceptually not interested in correlations: I do not want a fully connected network where every symptom at time t predicts every symptom at time t+1. I want to know which symptoms at t predict symptoms at t+1 when controlling for all other associations. I am interested in unique variances, the same way I am interested in predicting mortality by all my relevant variables in one prediction model. Univariate prediction will give you spurious relationships, many of which would disappear once you control for other variables. Regarding the question of whether a correlation is observed: not in a statistical sense. Your dataframe has the item responses, and you can estimate the mean of an item, or a correlation, no? From how I’ve seen the words observed vs. estimated used, a correlation is estimated. Regarding fMRI time series, I’m afraid I can’t answer that question, sorry. I really don’t know what models you are using in that field for time series estimation.

Regarding your second point:

“My reservation, however, is that you can find about as many papers arguing the opposite and backing that up with simulation and limited empirical data”

—— Cool, will have to check this out. The point we are making is conceptual, not statistical (nothing wrong with correlations, but they don’t give us what we’re looking for, given our assumptions about where our data come from).

“However, in the more exploratory search of how different symptoms hang together as a network, doesn’t this kind of model come with assumptions? In particular, it seems to ignore interactions or higher-order relationships (beyond pairs) that seem theoretically plausible to me, especially from a dynamical systems perspective. E.g. this article blew my mind (https://arxiv.org/pdf/1608.03520.pdf) when I read it.”

—— Absolutely, this is a big limitation of pairwise estimation (i.e. both correlation and partial correlation): we miss higher order interactions. A PhD student in our lab is looking at interactions at the moment, and it looks like there is a considerable amount of higher order stuff happening in the data.

“(regularized) partial correlations are just an analysis method. They don’t introduce new information to the data.”

—— If your data come from a unidimensional factor model — if “g” really causes all subtests of intelligence — then a factor model is the appropriate model to fit to your data. I’m not sure it “introduces new information”, but factor loadings are actual estimates and reflect the data, while other models would not adequately reflect the data.

“I was wondering about all the challenges you mention and about potential downsides of an exclusive focus on partial correlation.”

—— I wish we had discussed this more in the paper, you’re absolutely right that this deserves more thought.

Another point I just discussed with Sacha Epskamp: if your data come from A→B→C, your partial correlation network will recover this structure A—B—C, but your correlation network will be fully connected. If your data come from a latent variable model, both your partial correlation structure and your correlation structure will be fully connected. That means that the partial correlation network differs across data-generating mechanisms (at least in many cases), while the correlation network simply doesn’t allow for any such inferences.
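To make the latent variable case concrete, here is a minimal sketch, assuming a standardized one-factor model with three indicators. The loadings are hypothetical values chosen for illustration, not estimates from any dataset. Under that assumption the implied correlation between two indicators is the product of their loadings, and the first-order partial correlation formula shows that no edge drops out after conditioning.

```python
import math
from itertools import combinations

# Hypothetical standardized loadings of three indicators on one latent factor.
loadings = {"x1": 0.8, "x2": 0.7, "x3": 0.6}

# Under a standardized one-factor model, corr(xi, xj) = loading_i * loading_j.
corr = {}
for i, j in combinations(loadings, 2):
    corr[(i, j)] = corr[(j, i)] = loadings[i] * loadings[j]


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation of x and y after controlling for a single third variable z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


# Partial correlation of each pair, controlling for the remaining indicator.
partials = {}
for i, j in combinations(loadings, 2):
    (k,) = [v for v in loadings if v not in (i, j)]
    partials[(i, j)] = partial_corr(corr[(i, j)], corr[(i, k)], corr[(j, k)])

for pair in partials:
    print(pair, "correlation:", round(corr[pair], 3),
          "partial correlation:", round(partials[pair], 3))
```

Every partial correlation stays positive here, so both networks are fully connected, in contrast to the chain A→B→C where the A-C edge vanishes once B is controlled for.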

Thanks. Very helpful, though I’ll need a bit of time to digest. Just a brief follow-up.

Let’s start with the 2nd question, which indeed I myself was not fully clear how to ask: I see how estimating many noise-laden things will get you into trouble. Agreed. You mention that the huge difference from social networks is that there the edges are observed (friendships), whereas you have to estimate them. Okay, if you use the full partial-correlation permutation (A~B+C etc) then you indeed have a lot of estimation to do. But what if you take the raw correlations. Aren’t they somehow ‘observed’ (measured as the strength of association between eg. sleep issues and lack of concentration – similar to asking A how much of a friend s/he is with B and vice versa [1-7]?)? Or would you say there is still a difference between a low-noise observation and a more noisy measurement (estimation?) of the association between two fMRI timeseries?

I guess that brings us back to question #1, i.e. whether you want/should start getting into (regularized) partial correlations. Thanks for the Schmittmann paper, which I had not known and indeed liked. My reservation, however, is that you can find about as many papers arguing the opposite and backing that up with simulation and limited empirical data (e.g. Kim, Wozniak, Mueller, Pan, 2015; Zalesky, Fornito, & Bullmore, 2012; or Smith, Miller, Salimi et al., 2011). Okay, that is not in itself a great argument, but it casts at least a little doubt on the point that partial correlations are the demonstrated go-to method. I can see how in your example about mortalityT2 ~ depressionT1 + age + gender you want to control for these things. However, in the more exploratory search of how different symptoms hang together as a network, doesn’t this kind of model come with assumptions? In particular, it seems to ignore interactions or higher-order relationships (beyond pairs) that seem theoretically plausible to me, especially from a dynamical systems perspective. E.g. this article blew my mind (https://arxiv.org/pdf/1608.03520.pdf) when I read it. Again, I am not confident enough in my knowledge to make any strong statement here, but the bottom line to me seems to be: (regularized) partial correlations are just an analysis method. They don’t introduce new information to the data. Any added insight comes from either the fact that their assumptions and the transformations they introduce are plausible or that they let you observe the data from a perspective that is otherwise obstructed (as in your A-B-C example). It seems to boil down to that ‘conditional independence’ issue which I will need to think about more. However, given the typically larger complexity of real-world data, my intuition has always been to stay close to the phenomenon and I haven’t yet fully grasped why partial corrs are necessary.
I agree that in your mortality prediction and A-B-C examples isolating/controlling is desirable, but in the case of uncovering the network structure of “depression” symptoms in the first place, I was wondering about all the challenges you mention and about potential downsides of an exclusive focus on partial correlation.

Thanks again for the reply. Your papers have tremendously enriched my thinking.

Best r

1) If you want to predict mortality at time 2 by depression at time 1, you arguably want to control for a few things, such as gender, age, and maybe comorbid conditions and health behaviors such as smoking. Visualizing the correlation table as a network here (which is not a model of any kind) will not get you anywhere; you would likely find that all variables relate to mortality. But a multiple regression that is able to account for shared variance among the predictors might give you the answer that depression does not predict mortality once you control for other variables (this is actually what we found in a recent study). This is one of the reasons we use regularized partial correlation networks. We are interested in conditional (in-)dependence relations among items. If the true model among 3 items is A → B → C, and you have cross-sectional data, your bivariate correlations will reveal correlations between all variables. Only the conditional dependence, however, will truthfully return A — B — C *without* a connection between A and C.
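The A → B → C point can be checked with a small simulation. This is a minimal sketch under assumptions of my own choosing (linear effects of 0.6, standard normal noise, 20,000 simulated cases, a fixed seed), using only the Python standard library and the textbook formula for a first-order partial correlation.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation of x and y after controlling for a single third variable z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


rng = random.Random(1)  # fixed seed so the run is reproducible
n = 20000

# Data-generating chain A -> B -> C (the effect sizes are illustrative assumptions).
a = [rng.gauss(0, 1) for _ in range(n)]
b = [0.6 * ai + rng.gauss(0, 1) for ai in a]
c = [0.6 * bi + rng.gauss(0, 1) for bi in b]

r_ab, r_bc, r_ac = pearson(a, b), pearson(b, c), pearson(a, c)

# Partial correlations: each pair, controlling for the remaining variable.
p_ab = partial_corr(r_ab, r_ac, r_bc)
p_bc = partial_corr(r_bc, r_ab, r_ac)
p_ac = partial_corr(r_ac, r_ab, r_bc)  # population value is exactly 0 for this chain

print("correlations:         A-B", round(r_ab, 3), "B-C", round(r_bc, 3), "A-C", round(r_ac, 3))
print("partial correlations: A-B", round(p_ab, 3), "B-C", round(p_bc, 3), "A-C", round(p_ac, 3))
```

All three bivariate correlations come out clearly positive, while the A-C partial correlation is close to zero, which is the A — B — C structure described above. The simulation uses unregularized partial correlations; regularization is a separate step that this sketch does not cover.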

Verena Schmittmann et al. have a kickass paper in PLOS ONE showing why neuroimaging should move to partial correlation networks.

For more information on this topic, see work by Pearl (e.g. his 2000 or 2001 book on causality), and the two papers on the Ising Model and the Gaussian Graphical Model, which also explain the difference between partial correlations and regularized partial correlations in more detail.

2) I’m not quite sure I get the second question, and I admit I don’t know how time-series networks look in brain connectivity research. But in principle, the huge difference between e.g. social networks and psychological networks is that in social networks, both nodes (people) and edges (e.g. friendships) are observed, while the latter need to be estimated in our case (and I assume in your case, too). If you have 30 people, 10 measurements per day for 2 weeks, and 10 items of interest, you can calculate the crazy amount of associations you need to estimate, especially if you want to get both contemporaneous and temporal networks out of the data (see reference manual; note also that people and timepoints give you power, while more items reduce your power). Maybe in brain networks you just have a ridiculously high number of timepoints, which gives you many orders of magnitude more power?
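As a rough back-of-the-envelope count for this example, here is a sketch assuming a lag-1 vector autoregression, which is my assumption about the model rather than something stated above. Under that assumption the temporal network has one directed coefficient per ordered pair of items (including each item on itself), and the contemporaneous network has one undirected edge per unordered pair. It also assumes that lags are not computed across nights.

```python
# Design from the example: 30 people, 10 measurements per day, 14 days, 10 items.
people, beeps_per_day, days, items = 30, 10, 14, 10

# Temporal network under a lag-1 VAR (assumption): items x items directed
# coefficients, including autoregressive effects of each item on itself.
temporal_edges = items * items

# Contemporaneous network: one undirected edge per unordered pair of items.
contemporaneous_edges = items * (items - 1) // 2

# Observations available per person.
obs_per_person = beeps_per_day * days

# Usable lag-1 pairs per person if the first measurement of each day
# is not predicted from the last measurement of the previous day.
lagged_pairs_per_person = (beeps_per_day - 1) * days

print("temporal edges:", temporal_edges)
print("contemporaneous edges:", contemporaneous_edges)
print("observations per person:", obs_per_person)
print("lag-1 pairs per person:", lagged_pairs_per_person)
print("lag-1 pairs across all people:", lagged_pairs_per_person * people)
```

Multilevel versions of such models add further parameters (for instance random effects per person), so these counts are a lower bound on what has to be estimated.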

The same parameter issue holds for cross-sectional data: there is no observation of edges, we need psychometric models to estimate these, and that is what regularized partial correlation networks do. If you have 30 items, and you are interested in conditional dependence relationships, you need 30*29/2 = 435 edge parameters (because you regress every node on every other node; so for 3 items A B C you estimate A~B+C; B~A+C; C~A+B; then choose a specific rule for dealing with the fact that A~B and B~A can be estimated differently; and then you draw the graph). Hope this makes sense, please let me know if this remains unclear.
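The counting and the "specific rule" step can be sketched as follows. The coefficients are made-up numbers standing in for the output of three regularized node-wise regressions, and the AND/OR rules with averaging are one common convention that I am assuming here, not necessarily the rule used in any particular package.

```python
from itertools import combinations


def n_edges(p):
    """Number of undirected edges among p nodes."""
    return p * (p - 1) // 2


def symmetrize(coefs, nodes, rule="and"):
    """Combine the two node-wise estimates for each pair into one edge weight.

    coefs[(i, j)] is the coefficient of predictor j in the regression of node i.
    AND rule: keep the edge only if both estimates are nonzero.
    OR rule: keep the edge if at least one estimate is nonzero.
    Retained edges get the average of the two estimates.
    """
    edges = {}
    for i, j in combinations(nodes, 2):
        b_ij, b_ji = coefs[(i, j)], coefs[(j, i)]
        if rule == "and":
            keep = b_ij != 0 and b_ji != 0
        elif rule == "or":
            keep = b_ij != 0 or b_ji != 0
        else:
            raise ValueError("rule must be 'and' or 'or'")
        edges[(i, j)] = (b_ij + b_ji) / 2 if keep else 0.0
    return edges


nodes = ["A", "B", "C"]

# Hypothetical coefficients from A~B+C, B~A+C, C~A+B; regularization has
# shrunk one of the two A-C estimates exactly to zero.
coefs = {
    ("A", "B"): 0.30, ("B", "A"): 0.26,
    ("A", "C"): 0.05, ("C", "A"): 0.00,
    ("B", "C"): 0.40, ("C", "B"): 0.36,
}

and_net = symmetrize(coefs, nodes, rule="and")
or_net = symmetrize(coefs, nodes, rule="or")

print("edges among 30 items:", n_edges(30))
print("AND rule:", and_net)
print("OR rule: ", or_net)
```

With these numbers the AND rule drops the A-C edge while the OR rule keeps a small one, which shows why the choice of rule has to be made explicitly before drawing the graph.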

I have benefitted greatly from the insights of the network approach ever since I learned about it around a year ago, and it complements related interests I have in various fields. However, I have two basic questions that have been bothering me for a while now, and so far I haven’t been able to find good answers/solutions. Because this post is about pitfalls, challenges, and future directions, I wanted to bring them up here:

1) Why partial correlation? I know what they are and how attractive it is to ‘isolate’ things. But what’s actually wrong with using straightforward Pearson correlations to construct weighted networks?

2) Heterogeneity/Reliability/Sample size? I get the issue, re-read relevant sections in your article Fried & Cramer, and I follow related discussions in neuroimaging (part of my work is in that field). Specifically, in functional brain connectivity, networks are constructed from basically correlations (or other measures – including partial corrs) between different regional time series. At the individual level, this is also affected by lots of noise. However, the network here is estimated for each individual and can then be aggregated across people, which will help to ‘beat down the noise’. What I don’t quite understand is where exactly a different issue arises with psychological networks (or is neuroimaging just blind to it? [rhetorical q – there are many papers on reliability coming out over the past years and recently particularly for connectivity]). Isn’t it possible to perform significance testing with multiple comparison correction for each edge? Or what about building a null model based on data from normals (whatever that means) and comparing the disordered group against it? I must be missing sth. … As said, I totally get the importance of good measures (minimize noise to see clearly) and the problem of heterogeneity (even noise-free measurement will still give blurred solutions if people differ), but why is this so special to networks? Your writing seems to imply that’s because so many parameters have to be estimated? But why do they have to be estimated instead of being ‘measured’ (see q. above)?

Thanks a ton – awesome work

Ralf

That’s already helpful, I am reading Barabasi’s book and looking for more valuable sources.