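Yi Can, He Min, Wu Banglv, Lv Liang. Link prediction algorithm combining community relations and community information of common neighbors[J]. Journal of Electronic Measurement and Instrumentation, 2021, 35(5): 174-181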
Link prediction algorithm combining community relations and community information of common neighbors
  
DOI:
Keywords: community information  common neighbors (CN)  link prediction  similarity index
Funding: Supported by the Yunnan Province Science and Technology Innovation Strong Province Program (2014AB016)
Author  Institution
Yi Can  1. School of Information Science and Technology, Yunnan University
He Min  1. School of Information Science and Technology, Yunnan University
Wu Banglv  1. School of Information Science and Technology, Yunnan University
Lv Liang  1. School of Information Science and Technology, Yunnan University
Abstract:
      Similarity indices based on common neighbors (CN) use only the local information of a network, so their prediction performance is unsatisfactory. The community information of a network, by contrast, captures structural features of its nodes and can help improve the accuracy of link prediction. To improve prediction accuracy, community structure information is introduced and a link prediction algorithm is proposed that combines community relations with the community information of common neighbors. First, the algorithm divides the network into communities using two graph embedding methods, DeepWalk and Node2vec: the Skip-Gram deep learning model is trained on node sequences generated by short random walks, and the resulting node embedding vectors are used to partition the network, yielding high-quality communities that carry more network topology information. Next, a community similarity model is proposed by defining the edge relationships between communities. Finally, the similarity of the nodes, the similarity of the communities in which the nodes are located, and the community information of the nodes' common neighbors are combined to measure the link probability of two nodes whose connection is unknown. Experiments on six real-world networks from different domains, including USAir, show that the AUC improves by up to 2.3% compared with four baselines including the CN index, indicating that community structure information plays an important role in improving link prediction.
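The ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the formulas, so the community labels are assigned by hand rather than obtained from DeepWalk or Node2vec embeddings, and the functions `community_edge_density` and `community_aware_score`, together with the weights `alpha` and `beta`, are hypothetical stand-ins for the paper's community similarity model and combined score. Only the CN index and the pairwise AUC comparison follow their standard definitions.

```python
from itertools import combinations

def build_adjacency(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def cn_score(adj, u, v):
    """Common neighbors (CN) index: number of shared neighbors."""
    return len(adj.get(u, set()) & adj.get(v, set()))

def community_edge_density(edges, community, a, b):
    """Hypothetical community similarity: edges running between
    communities a and b, divided by the number of possible node pairs
    between them. Defined as 1.0 when a == b."""
    if a == b:
        return 1.0
    size_a = sum(1 for c in community.values() if c == a)
    size_b = sum(1 for c in community.values() if c == b)
    between = sum(1 for u, v in edges
                  if {community[u], community[v]} == {a, b})
    return between / (size_a * size_b)

def community_aware_score(adj, edges, community, u, v, alpha=0.5, beta=0.5):
    """Hypothetical combined score (an assumption, not the paper's formula):
    each common neighbor counts 1, plus alpha if it shares a community with
    u or v; beta times the community similarity of u and v is then added."""
    cu, cv = community[u], community[v]
    shared = adj.get(u, set()) & adj.get(v, set())
    neighbor_term = sum(1.0 + alpha * (community[z] in (cu, cv)) for z in shared)
    return neighbor_term + beta * community_edge_density(edges, community, cu, cv)

def auc(score, test_edges, non_edges):
    """AUC as commonly used in link prediction: the probability that a
    held-out edge scores higher than a non-existent edge (ties count 0.5)."""
    higher = ties = 0
    for e in test_edges:
        for n in non_edges:
            se, sn = score(*e), score(*n)
            if se > sn:
                higher += 1
            elif se == sn:
                ties += 1
    return (higher + 0.5 * ties) / (len(test_edges) * len(non_edges))

# Toy network: two hand-labelled communities joined by one bridge edge (3, 4).
train_edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3),
               (4, 5), (4, 6), (5, 6), (5, 7), (6, 7), (3, 4)]
test_edges = [(0, 3), (4, 7)]          # held-out links to be predicted
community = {n: 0 if n < 4 else 1 for n in range(8)}

adj = build_adjacency(train_edges)
known = {frozenset(e) for e in train_edges + test_edges}
non_edges = [p for p in combinations(range(8), 2) if frozenset(p) not in known]

auc_cn = auc(lambda u, v: cn_score(adj, u, v), test_edges, non_edges)
auc_comm = auc(lambda u, v: community_aware_score(adj, train_edges, community, u, v),
               test_edges, non_edges)
print("AUC (CN):", auc_cn)
print("AUC (community-aware sketch):", auc_comm)
```

In a fuller pipeline the `community` dictionary would come from clustering node embeddings rather than from hand labelling, and the toy graph is too small to say anything about the improvement reported in the abstract.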