Paper title
Statistical inference of assortative community structures
Authors
Abstract
We develop a principled methodology to infer assortative communities in networks, based on a nonparametric Bayesian formulation of the planted partition model. We show that this approach succeeds in finding statistically significant assortative modules in networks, unlike alternatives such as modularity maximization, which systematically overfits in both artificial and empirical examples. In addition, we show that our method is not subject to a resolution limit and can uncover an arbitrarily large number of communities, as long as there is statistical evidence for them. Our formulation is amenable to model selection procedures, which allow us to compare it with more general approaches based on the stochastic block model, and in this way reveal whether assortativity is in fact the dominant large-scale mixing pattern. We perform this comparison on several empirical networks and identify numerous cases where the network's assortativity is exaggerated by traditional community detection methods, and we show how a more faithful degree of assortativity can be identified.
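To make the two objects named in the abstract concrete, the following minimal sketch samples a network from a simple planted partition model and evaluates the modularity of the planted groups. It is an illustration under stated assumptions, not the paper's inference method: the function names `sample_planted_partition` and `modularity`, the equal group sizes, and the parameters `p_in` and `p_out` (edge probabilities within and between groups) are all hypothetical choices made for this example.

```python
import random
from itertools import combinations


def sample_planted_partition(n_groups, group_size, p_in, p_out, seed=0):
    """Sample an undirected graph with equal-sized planted groups.

    Node pairs in the same group are connected with probability p_in,
    pairs in different groups with probability p_out (illustrative
    parametrization, assumed for this sketch).
    """
    rng = random.Random(seed)
    n = n_groups * group_size
    labels = [v // group_size for v in range(n)]
    edges = []
    for u, v in combinations(range(n), 2):
        p = p_in if labels[u] == labels[v] else p_out
        if rng.random() < p:
            edges.append((u, v))
    return labels, edges


def modularity(labels, edges):
    """Newman modularity Q = sum_r [ e_rr / m - (d_r / 2m)^2 ].

    e_rr is the number of edges inside group r, d_r the sum of degrees
    of the nodes in group r, and m the total number of edges.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = {}
    degree_sum = {}
    for u, v in edges:
        degree_sum[labels[u]] = degree_sum.get(labels[u], 0) + 1
        degree_sum[labels[v]] = degree_sum.get(labels[v], 0) + 1
        if labels[u] == labels[v]:
            internal[labels[u]] = internal.get(labels[u], 0) + 1
    return sum(
        internal.get(r, 0) / m - (d / (2 * m)) ** 2
        for r, d in degree_sum.items()
    )


if __name__ == "__main__":
    labels, edges = sample_planted_partition(4, 25, p_in=0.3, p_out=0.02, seed=1)
    print("edges:", len(edges))
    print("modularity of planted partition:", modularity(labels, edges))
```

Setting `p_in > p_out` yields an assortative network. A high modularity value for a given partition does not by itself establish statistical significance, which is the concern the abstract raises about modularity maximization.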