Block models for network data

Aleks writes,

Edoardo Airoldi has written a new paper (with S. Fienberg, D. Blei, and E. Xing) about blockmodeling. This is perhaps the state-of-the-art model for network data. You might be interested in this in the context of social network data. Another potentially interesting thing is their use of variational methods for coping with complex Bayesian models.

I also had an idea regarding network modeling. In particular, a node’s high gregariousness does explain that node’s high connectivity, but the connections radiating from a gregarious node might be quite random. For that reason, it might make sense to treat links as secondary to gregariousness: a link is worth modeling only when the observed connections cannot be explained by high gregariousness or by, say, transitivity. Such an approach would help in understanding highly connected networks.
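
To make this idea concrete, here is a minimal sketch (mine, not Aleks’s and not from the paper) of a “links beyond gregariousness” baseline: each node gets a gregariousness parameter, the baseline tie probability depends only on the two nodes’ gregariousness, and a link is flagged for further modeling only if it is surprising under that baseline. The logistic form, the cutoff, and all the names below are assumptions for illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 50
    greg = rng.normal(0, 1, size=n)               # latent gregariousness per node

    # Baseline: the probability of a tie between i and j depends only on the two
    # nodes' gregariousness, nothing else.
    logit = greg[:, None] + greg[None, :]
    p_base = 1 / (1 + np.exp(-logit))
    np.fill_diagonal(p_base, 0)

    # An "observed" network: simulated from the baseline, plus one planted tie
    # between the two least gregarious nodes, which the baseline should not explain.
    y = (rng.random((n, n)) < p_base).astype(int)
    i, j = np.sort(np.argsort(greg)[:2])
    y[i, j] = 1
    y = np.triu(y, 1)
    y = y + y.T                                   # symmetric, no self-ties

    # A tie (or a missing tie) is worth modeling further only if it is surprising
    # under the gregariousness baseline.
    surprise = -np.log(np.where(y == 1, p_base, 1 - p_base) + 1e-12)
    interesting = np.argwhere(np.triu(surprise > 3, 1))  # arbitrary cutoff
    print(interesting)

The point is the last two lines: whatever the gregariousness (plus, say, transitivity) baseline cannot account for is what is left over for block structure or anything else to explain.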

I haven’t tried to follow all the details of the model, but it looks pretty cool. I’m not thrilled with the treatment of selecting the number of groups. Ultimately, there are as many groups as there are people (maybe more!), but for the purposes of data reduction and model understanding, you have to pick something. As always, I don’t see the number of groups as any Platonic underlying parameter, and it doesn’t make sense to me to use a BIC-type statistic to pick this number, although it might give a reasonable answer in this particular example. (See here for more on this topic.)
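
For concreteness, here is a hedged sketch of the kind of BIC-type calculation at issue, applied to a plain stochastic blockmodel rather than the paper’s mixed-membership version; the spectral-clustering shortcut that stands in for a real fit, and all the names, are assumptions for illustration.

    import numpy as np
    from sklearn.cluster import SpectralClustering

    def sbm_bic(y, K, seed=0):
        """BIC-type score for a K-group blockmodel of the adjacency matrix y."""
        n = y.shape[0]
        # Crude stand-in for a real fit: cluster the nodes spectrally, then use
        # empirical tie frequencies between clusters as the block probabilities.
        labels = SpectralClustering(n_clusters=K, affinity="precomputed",
                                    random_state=seed).fit_predict(y + 1e-3)
        loglik = 0.0
        for a in range(K):
            for b in range(a, K):
                block = y[np.ix_(labels == a, labels == b)]
                if a == b:                        # within a block, count each dyad once
                    block = block[np.triu_indices_from(block, 1)]
                m, ones = block.size, block.sum()
                if m == 0:
                    continue
                p = np.clip(ones / m, 1e-6, 1 - 1e-6)
                loglik += ones * np.log(p) + (m - ones) * np.log(1 - p)
        n_params = K * (K + 1) / 2                # one tie probability per pair of blocks
        return -2 * loglik + n_params * np.log(n * (n - 1) / 2)

    # Toy data from a three-group blockmodel, scored at several choices of K.
    rng = np.random.default_rng(0)
    n = 60
    z = rng.integers(3, size=n)
    P = np.full((3, 3), 0.05)
    np.fill_diagonal(P, 0.3)
    y = (rng.random((n, n)) < P[z][:, z]).astype(int)
    y = np.triu(y, 1)
    y = y + y.T
    print({K: round(sbm_bic(y, K)) for K in range(2, 7)})

Smaller is better by this criterion, and on toy data like this it may well land on the “true” K = 3; the point above is just that getting a tidy number out of such a formula does not make the number of groups any less of a convenience parameter.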

Figure 2 is interesting. I’d like to see the authors take the next step and simulate a few sets of fake data from the fitted model and compare the replicated data to the actual data (as in Chapter 6 of BDA).
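
In case it’s useful, here is a minimal sketch of that kind of check, in a plug-in form that uses a single fitted matrix of tie probabilities rather than draws from the full posterior; y_obs, p_hat, and the triangle-count statistic are assumptions standing in for whatever model and test quantities the authors would actually use.

    import numpy as np

    def triangle_count(y):
        """Number of triangles in an undirected adjacency matrix."""
        return int(np.trace(y @ y @ y) / 6)

    def posterior_predictive_check(y_obs, p_hat, n_rep=200, seed=0):
        """Compare a test statistic between the observed network and replications
        drawn from the fitted tie probabilities (independent dyads assumed)."""
        rng = np.random.default_rng(seed)
        n = y_obs.shape[0]
        t_obs = triangle_count(y_obs)
        t_rep = np.empty(n_rep)
        for r in range(n_rep):
            draw = rng.random((n, n)) < p_hat
            y_rep = np.triu(draw, 1).astype(int)
            y_rep = y_rep + y_rep.T               # symmetric, no self-ties
            t_rep[r] = triangle_count(y_rep)
        # Bayesian p-value: fraction of replications at least as triangle-heavy
        # as the observed network.
        return t_obs, t_rep.mean(), float((t_rep >= t_obs).mean())

    # Usage, given y_obs and p_hat from whatever model was actually fit:
    # t_obs, t_rep_mean, p_val = posterior_predictive_check(y_obs, p_hat)

If the replicated networks look nothing like the data on statistics such as this, that is the interesting finding.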

On an unrelated topic, it’s a little distracting that all the references in the paper are in blue. Finally, Figure 14 is unreadable (and also has a bizarre red color); something needs to be fixed.
