Abstract:
This paper proposes multiplying each neighbor node's attributes by the attributes of the edge connecting it to the central node, thereby extracting interaction information between nodes; it also designs a novel sampling strategy to speed up training.
Key ideas:
Building on the usual simple summation of neighbor attributes
Heterogeneity across links can be a problem
The paper proposes multiplying neighbor node attributes by their corresponding edge attributes
In the details, the paper further addresses the high dimensionality and heavy redundancy of link attributes
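The contrast between plain summation and the proposed neighbor-times-edge product can be sketched in a few lines; shapes and the element-wise form of the product are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

# Toy neighborhood: 5 neighbors with 4-dim node attributes, and a
# matching 4-dim attribute vector on each connecting edge (assumed shapes).
rng = np.random.default_rng(1)
neighbors = rng.normal(size=(5, 4))
edges = rng.normal(size=(5, 4))

h_sum = neighbors.sum(axis=0)            # simple summation: ignores edge attributes
h_mul = (neighbors * edges).sum(axis=0)  # neighbor * edge: injects link information
print(h_sum.shape, h_mul.shape)  # (4,) (4,)
```

The product lets each edge modulate which components of its neighbor's attributes pass through, which a plain sum cannot express.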
How LASE works:
LASE can be divided into three common modules, namely a gate, an amplifier and an aggregator.
Gate: the gate evaluates v's influence in u's neighborhood
Amplifier: the amplifier amplifies the node attributes using link information
Aggregator: the aggregator sums up neighbor embeddings and combines them with the central node embedding using various strategies
Aggregators proposed in [Hamilton et al., 2017] may also be used in LASE.
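The three modules can be sketched as follows; the sigmoid gate, the linear edge projection, and all parameter names and shapes are assumptions for illustration, not the paper's exact design:

```python
import numpy as np

def gate(edge_attr, w_gate):
    # Gate: a scalar in (0, 1) scoring neighbor v's influence on u,
    # computed here (as an assumption) from the edge attributes.
    return 1.0 / (1.0 + np.exp(-edge_attr @ w_gate))

def amplifier(node_attr, edge_attr, W_edge):
    # Amplifier: element-wise product of the neighbor's attributes with
    # a projection of the edge attributes, injecting link information.
    return node_attr * (edge_attr @ W_edge)

def aggregate(center, neighbors, edges, w_gate, W_edge):
    # Aggregator: gate + amplify each neighbor, sum the messages, and
    # combine with the central node embedding (concatenation here).
    msgs = [gate(e, w_gate) * amplifier(n, e, W_edge)
            for n, e in zip(neighbors, edges)]
    return np.concatenate([center, np.sum(msgs, axis=0)])

rng = np.random.default_rng(0)
d_node, d_edge = 4, 3
center = rng.normal(size=d_node)
neighbors = [rng.normal(size=d_node) for _ in range(5)]
edges = [rng.normal(size=d_edge) for _ in range(5)]
w_gate = rng.normal(size=d_edge)
W_edge = rng.normal(size=(d_edge, d_node))
out = aggregate(center, neighbors, edges, w_gate, W_edge)
print(out.shape)  # (8,)
```

Concatenation is only one combination strategy; as the notes say, mean- or pool-style aggregators from [Hamilton et al., 2017] could be swapped in.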
Experiments
Datasets
Reddit: nodes are posts; an edge's attribute is the distribution, averaged over the users who commented on both posts, of those users' comments across different communities
dblp: papers are nodes, with tf-idf vectors extracted from each paper as node attributes; edge attributes are one-hot embeddings of the common authors, reduced to 200 dimensions with PCA
email and fmobile: nodes are contacts; edge attributes record the contacts between two nodes across time slices
Results
Node classification on the datasets above
In aggregating neighbor attributes, GCN-style models are stronger than proximity-based models (LINE, DeepWalk)
In exploiting edge attributes, LASE is stronger than the others
LASE-RW and LASE-SAGE outperform the naive concatenation variant LASE-concat
Although there are no original node features in the two temporal networks, LASE still outperforms pre-trained features by exploiting edge attributes, while GCN and GraphSAGE fail to capture this additional information and overfit the proximity-based features.
Contributions:
LASE provides a ubiquitous solution for a wider class of graph data by incorporating link attributes, so it applies to a broad range of datasets;
LASE outperforms strong baselines and naive concatenation implementations by adequately leveraging the information in the link attributes;
LASE adopts a more principled approach to determining the neural architecture and thus enjoys better explainability
Limitations:
- The sampling setup is not ideal and becomes clumsy on very large graphs
- To stay general, the model sacrifices some simplicity and elegance; it can be streamlined when applied to a specific domain
Sampling: To control batch scales, we leverage the Monte Carlo method to estimate the summed neighborhood information by sampling a fixed number of neighbors.
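A minimal sketch of this fixed-size Monte Carlo sampling (the with-replacement draw and the `adj`/`k` names are assumptions; the paper may differ in detail):

```python
import random

def sample_neighbors(adj, node, k, seed=None):
    """Draw a fixed number k of neighbors (with replacement), so the
    per-node cost is bounded regardless of degree; the mean of the
    sampled messages is a Monte Carlo estimate of the neighborhood sum
    up to the degree factor."""
    rng = random.Random(seed)
    nbrs = adj[node]
    return [rng.choice(nbrs) for _ in range(k)]

# Toy adjacency list: node 0 has five neighbors.
adj = {0: [1, 2, 3, 4, 5]}
sampled = sample_neighbors(adj, 0, k=3, seed=42)
print(len(sampled))  # 3
```

Capping every neighborhood at k samples is what keeps batch scales controlled even when a few hub nodes have very high degree.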