论文标题

社区结构意识到网络中节点的嵌入

Community Structure aware Embedding of Nodes in a Network

论文作者

Chattopadhyay, Swarup, Ganguly, Debasis

论文摘要

检测社区或现实生活网络的模块化结构(例如,社交网络或产品购买网络)是一项重要任务,因为网络功能通常由其社区确定。传统的社区检测方法涉及基于模块化的算法,这些算法一般来说,基于启发式方法构建分区,旨在最大程度地提高分区内边缘与它们之间的边缘的比率。另一方面,节点嵌入方法表示图中的每个节点作为实值向量,从而能够将图中社区检测的问题转换为群集一组向量的问题。现有的节点嵌入方法主要基于首先基于从每个节点启动随机步行以构建节点的上下文,然后使节点的向量表示靠近其上下文。但是,标准节点嵌入方法并未直接考虑网络的社区结构,同时构建每个节点周围的上下文。为了减轻这一点,我们探索了两个不同的工作线程。首先,我们研究了最大基于熵的随机步道的使用,以获得节点的嵌入的更多中心性,这可能会导致嵌入式空间中更有效的簇。其次,我们提出了一种社区结构感知的节点嵌入方法,在该方法中,我们将基于模块化的启发式方法纳入节点嵌入的目标函数。我们证明,我们提出的组合组合和社区检测的嵌入方法优于在标准节点插入的(Node2Vec)矢量上的许多基于模块化的基准和K-均值聚类,这些基线在广泛的现实生活和合成网络上的不同范围和密度的范围。

Detecting communities or the modular structure of real-life networks (e.g. a social network or a product purchase network) is an important task because the way a network functions is often determined by its communities. Traditional approaches to community detection involve modularity-based algorithms, which generally speaking, construct partitions based on heuristics that seek to maximize the ratio of the edges within the partitions to those between them. On the other hand, node embedding approaches represent each node in a graph as a real-valued vector and is thereby able to transform the problem of community detection in a graph to that of clustering a set of vectors. Existing node embedding approaches are primarily based on, first, initiating random walks from each node to construct a context of a node, and then make the vector representation of a node close to its context. However, standard node embedding approaches do not directly take into account the community structure of a network while constructing the context around each node. To alleviate this, we explore two different threads of work. First, we investigate the use of maximum entropy-based random walks to obtain more centrality preserving embedding of nodes, which may lead to more effective clusters in the embedded space. Second, we propose a community structure-aware node embedding approach, where we incorporate modularity-based partitioning heuristics into the objective function of node embedding. We demonstrate that our proposed combination of the combinatorial and the embedding approaches for community detection outperforms a number of modularity-based baselines and K-means clustering on a standard node-embedded (node2vec) vector space on a wide range of real-life and synthetic networks of different sizes and densities.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源