GLEE: Geometric Laplacian Eigenmap Embedding

Leo Torres; Kevin S Chan; Tina Eliassi-Rad

doi:10.1093/comnet/cnaa007


Journal of Complex Networks ◽

10.1093/comnet/cnaa007 ◽

2020 ◽

Vol 8 (2) ◽

Author(s):

Leo Torres ◽

Kevin S Chan ◽

Tina Eliassi-Rad

Keyword(s):

Link Prediction ◽

Graph Embedding ◽

Laplacian Matrix ◽

Dimensional Representation ◽

Laplacian Eigenmaps ◽

New Approach ◽

Graph Reconstruction ◽

Node Similarity ◽

Distance Minimization ◽

Low Dimensional

Graph embedding seeks to build a low-dimensional representation of a graph $G$. This low-dimensional representation is then used for various downstream tasks. One popular approach is Laplacian Eigenmaps (LE), which constructs a graph embedding based on the spectral properties of the Laplacian matrix of $G$. The intuition behind it, and many other embedding techniques, is that the embedding of a graph must respect node similarity: similar nodes must have embeddings that are close to one another. Here, we dispose of this distance-minimization assumption. Instead, we use the Laplacian matrix to find an embedding with geometric properties rather than spectral ones, by leveraging the so-called simplex geometry of $G$. We introduce a new approach, Geometric Laplacian Eigenmap Embedding (GLEE), and demonstrate that it outperforms various other techniques (including LE) in the tasks of graph reconstruction and link prediction.
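To make the contrast in this abstract concrete, the sketch below computes a classic Laplacian Eigenmaps embedding (eigenvectors belonging to the smallest nonzero eigenvalues of the unnormalized Laplacian L = D - A) next to a geometric-style embedding that keeps the largest eigenvalues and scales each eigenvector by the square root of its eigenvalue, so that dot products between embedded nodes approximate entries of L. The second construction is an assumption about what a simplex-geometry embedding could look like; the abstract does not spell out the procedure, and the function names are hypothetical.

```python
import numpy as np

def laplacian(adj):
    """Unnormalized graph Laplacian L = D - A for a symmetric adjacency matrix."""
    adj = np.asarray(adj, dtype=float)
    return np.diag(adj.sum(axis=1)) - adj

def le_embedding(adj, d):
    """Laplacian Eigenmaps sketch: eigenvectors of the d smallest nonzero eigenvalues."""
    vals, vecs = np.linalg.eigh(laplacian(adj))  # eigenvalues come back in ascending order
    # Skip the first (constant) eigenvector; this assumes a connected graph.
    return vecs[:, 1:d + 1]

def geometric_embedding(adj, d):
    """Assumed geometric variant: top-d eigenpairs, scaled by sqrt(eigenvalue)."""
    vals, vecs = np.linalg.eigh(laplacian(adj))
    top = np.argsort(vals)[::-1][:d]             # indices of the d largest eigenvalues
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))

# Toy graph: the 4-cycle 0-1-2-3-0.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]])
S = geometric_embedding(A, d=3)  # d = n - 1 keeps every nonzero eigenvalue
gram = S @ S.T                   # matrix of pairwise dot products
```

In this toy case `gram` reproduces L because d = n - 1; with a smaller d it is only an approximation. Under the stated assumption, adjacent nodes have a dot product near -1 and non-adjacent nodes near 0, which suggests one way to score candidate links.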


Appraisal Study of Similarity-Based and Embedding-Based Link Prediction Methods on Graphs

10.5121/csit.2021.111106 ◽

2021 ◽

Author(s):

Md Kamrul Islam ◽

Sabeur Aridhi ◽

Malika Smail-Tabbone

Keyword(s):

Link Prediction ◽

Black Box ◽

Prediction Methods ◽

Good Prediction ◽

Dimensional Representation ◽

Good Prediction Performance ◽

Node Similarity ◽

Category Comparison ◽

Low Dimensional ◽

Selection Of

The task of inferring missing links or predicting future ones in a graph based on its current structure is referred to as link prediction. Link prediction methods based on pairwise node similarity are well established in the literature and show good prediction performance on many real-world graphs, although they are heuristic. Graph embedding approaches, on the other hand, learn low-dimensional representations of the nodes in a graph and can capture inherent graph features, and thus support the subsequent link prediction task. This appraisal paper studies a selection of methods from both categories on several benchmark (homogeneous) graphs with different properties from various domains. Beyond the intra- and inter-category comparison of the methods' performance, our aim is also to uncover interesting connections between Graph Neural Network (GNN)-based methods and heuristic ones, as a means to alleviate the well-known black-box limitation.
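As an illustration of the similarity-based family mentioned in this abstract, the sketch below scores unconnected node pairs with two widely used heuristics, common neighbors and Adamic-Adar, and ranks the pairs by score. The toy graph and the helper names are hypothetical, and the abstract does not say which heuristics the paper evaluates.

```python
import math

def common_neighbors(graph, u, v):
    """Number of neighbors shared by u and v."""
    return len(graph[u] & graph[v])

def adamic_adar(graph, u, v):
    """Shared neighbors weighted by 1 / log(degree); low-degree neighbors count more."""
    return sum(1.0 / math.log(len(graph[w]))
               for w in graph[u] & graph[v]
               if len(graph[w]) > 1)

def rank_candidates(graph, score):
    """Rank all currently unconnected pairs from most to least likely link."""
    nodes = sorted(graph)
    pairs = [(u, v) for i, u in enumerate(nodes)
             for v in nodes[i + 1:] if v not in graph[u]]
    return sorted(pairs, key=lambda p: score(graph, *p), reverse=True)

# Undirected toy graph stored as adjacency sets.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}
ranked = rank_candidates(graph, common_neighbors)
```

Here the pair ("a", "d") ranks first because it shares two neighbors, while ("a", "e") shares none and ranks last.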


TransET: Knowledge Graph Embedding with Entity Types

Electronics ◽

10.3390/electronics10121407 ◽

2021 ◽

Vol 10 (12) ◽

pp. 1407

Author(s):

Peng Wang ◽

Jing Zhou ◽

Yuzhang Liu ◽

Xingchen Zhou

Keyword(s):

Link Prediction ◽

State Of The Art ◽

Score Function ◽

Graph Embedding ◽

Vector Spaces ◽

Knowledge Graph ◽

Semantic Features ◽

Knowledge Graphs ◽

Real World Datasets ◽

Low Dimensional

Knowledge graph embedding aims to embed entities and relations into low-dimensional vector spaces. Most existing methods focus only on triple facts in knowledge graphs. In addition, models based on translation or distance measurement cannot fully represent complex relations. As well-constructed prior knowledge, entity types can be employed to learn the representations of entities and relations. In this paper, we propose a novel knowledge graph embedding model named TransET, which takes advantage of entity types to learn more semantic features. More specifically, circle convolution based on the embeddings of entities and entity types is used to map the head entity and tail entity to type-specific representations, and a translation-based score function is then used to learn the representations of triples. We evaluated our model on real-world datasets with two benchmark tasks, link prediction and triple classification. Experimental results demonstrate that it outperforms state-of-the-art models in most cases.
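The two ingredients named in this abstract can be sketched roughly as follows. The code reads "circle convolution" as the standard circular convolution of two vectors and uses a TransE-style L1 distance as the translation-based score; both readings are assumptions, and the actual TransET formulation may differ. All function names are hypothetical.

```python
import numpy as np

def circular_convolution(a, b):
    """Circular convolution of two equal-length vectors, computed via the FFT."""
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def translation_score(head, relation, tail):
    """TransE-style score: a smaller distance means a more plausible triple."""
    return float(np.linalg.norm(head + relation - tail, ord=1))

def typed_score(head, head_type, relation, tail, tail_type):
    """Map each entity to a type-specific vector, then apply the translation score."""
    h = circular_convolution(head, head_type)
    t = circular_convolution(tail, tail_type)
    return translation_score(h, relation, t)

# Convolving with a shifted unit vector rotates the entries of the other vector.
shifted = circular_convolution(np.array([1.0, 2.0, 3.0]), np.array([0.0, 1.0, 0.0]))

# With "identity" type vectors the typed score reduces to the plain translation score.
identity = np.array([1.0, 0.0, 0.0])
head = np.array([0.2, 0.1, 0.4])
relation = np.array([0.3, 0.3, -0.1])
tail = head + relation
score = typed_score(head, identity, relation, tail, identity)
```

In a trained model the entity, type and relation vectors would be learned so that observed triples receive lower scores than corrupted ones.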


Exploiting node metadata to predict interactions in large networks using graph embedding and neural networks

10.1101/2021.06.10.447991 ◽

2021 ◽

Author(s):

Rogini Runghen ◽

Daniel B Stouffer ◽

Giulio Valentino Dalla Riva

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Link Prediction ◽

Graph Embedding ◽

Feature Space ◽

Machine Learning Techniques ◽

Large Networks ◽

Data Set ◽

Learning Techniques ◽

Low Dimensional

Collecting network interaction data is difficult. Non-exhaustive sampling and complex hidden processes often result in an incomplete data set. Thus, identifying potentially present but unobserved interactions is crucial both in understanding the structure of large-scale data and in predicting how previously unseen elements will interact. Recent studies in network analysis have shown that accounting for metadata (such as node attributes) can improve both our understanding of how nodes interact with one another and the accuracy of link prediction. However, the dimension of the object we need to learn to predict interactions in a network grows quickly with the number of nodes. Therefore, it becomes computationally and conceptually challenging for large networks. Here, we present a new predictive procedure combining a graph embedding method with machine learning techniques to predict interactions on the basis of nodes' metadata. Graph embedding methods project the nodes of a network onto a low-dimensional latent feature space. The position of the nodes in the latent feature space can then be used to predict interactions between nodes. Learning a mapping of the nodes' metadata to their position in a latent feature space corresponds to a classic, low-dimensional machine learning problem. In our current study we used the Random Dot Product Graph model to estimate the embedding of an observed network, and we tested different neural network architectures to predict the position of nodes in the latent feature space. Flexible machine learning techniques for mapping the nodes onto their latent positions allow us to account for multivariate and possibly complex node metadata. To illustrate the utility of the proposed procedure, we apply it to a large dataset of tourist visits to destinations across New Zealand. We found that our procedure accurately predicts interactions for both existing nodes and nodes newly added to the network, while being computationally feasible even for very large networks. Overall, our study highlights that by exploiting the properties of a well-understood statistical model for complex networks and combining it with standard machine learning techniques, we can simplify the link prediction problem when incorporating multivariate node metadata. Our procedure can be immediately applied to different types of networks, and to a wide variety of data from different systems. As such, from both a network science and a data science perspective, our work offers a flexible and generalisable procedure for link prediction.
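The embedding step described in this abstract can be sketched as follows. Under a Random Dot Product Graph, the probability of a link between two nodes is the dot product of their latent positions, and a common way to estimate those positions is a truncated eigendecomposition of the adjacency matrix. The abstract does not specify the estimator, so the spectral approach and the function names here are assumptions; the neural network that maps metadata to positions is omitted.

```python
import numpy as np

def adjacency_spectral_embedding(adj, d):
    """Estimate latent positions from the d eigenpairs of largest magnitude."""
    vals, vecs = np.linalg.eigh(np.asarray(adj, dtype=float))
    top = np.argsort(np.abs(vals))[::-1][:d]
    return vecs[:, top] * np.sqrt(np.abs(vals[top]))

def link_probabilities(positions):
    """Under an RDPG, P(i ~ j) is the dot product of the latent positions."""
    return np.clip(positions @ positions.T, 0.0, 1.0)

# Hypothetical ground-truth latent positions for four nodes in two dimensions.
X0 = np.array([[0.8, 0.1],
               [0.7, 0.2],
               [0.1, 0.9],
               [0.2, 0.6]])
P = X0 @ X0.T                            # noise-free link probability matrix
X = adjacency_spectral_embedding(P, d=2)
P_hat = link_probabilities(X)            # recovered link probabilities
```

The recovered positions match the originals only up to an orthogonal transformation, but the dot products, and therefore the predicted link probabilities, are preserved. With an observed 0/1 adjacency matrix in place of `P`, the result is an estimate rather than an exact recovery.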


Proximity Measures as Graph Convolution Matrices for Link Prediction in Biological Networks

10.1101/2020.11.14.382655 ◽

2020 ◽

Author(s):

Mustafa Coşkun ◽

Mehmet Koyutürk

Keyword(s):

Link Prediction ◽

Similarity Measures ◽

Graph Representation ◽

Supplementary Information ◽

Great Promise ◽

Network Embedding ◽

Common Neighbor ◽

Node Similarity ◽

Topological Characteristics ◽

Low Dimensional

Motivation: Link prediction is an important and well-studied problem in computational biology, with a broad range of applications including disease gene prioritization, drug-disease associations, and drug response in cancer. The general principle in link prediction is to use the topological characteristics and, if available, the attributes of the nodes in the network to predict new links that are likely to emerge or disappear. Recently, graph representation learning methods, which aim to learn a low-dimensional representation of the topological characteristics and attributes of the nodes, have drawn increasing attention as a way to solve the link prediction problem via learnt low-dimensional features. Most prominently, Graph Convolution Network (GCN)-based network embedding methods have demonstrated great promise in link prediction due to their ability to capture non-linear information in the network. To date, GCN-based network embedding algorithms use a Laplacian matrix as the convolution matrix in their convolution layers, and the effect of the convolution matrix on algorithm performance has not been comprehensively characterized in the context of link prediction in biomedical networks. On the other hand, for a variety of biomedical link prediction tasks, traditional node similarity measures such as Common Neighbor, Adamic-Adar, and others have shown promising results, and hence there is a need to systematically evaluate node similarity measures as convolution matrices in terms of their usability and potential to further the state of the art.

Results: We select 8 representative node similarity measures as convolution matrices within the single-layered GCN graph embedding method and conduct a systematic comparison on 3 important biomedical link prediction tasks: drug-disease association (DDA) prediction, drug–drug interaction (DDI) prediction, and protein–protein interaction (PPI) prediction. Our experimental results demonstrate that node similarity-based convolution matrices significantly improve GCN-based embedding algorithms and deserve more attention in future biomedical link prediction.

Availability: Our method is implemented as a Python library and is available at [email protected]

Supplementary information: Supplementary data are available at Bioinformatics online.
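The idea of swapping the convolution matrix can be sketched with a single GCN layer of the usual form H = ReLU(C X W), where C is built from a node similarity measure instead of the Laplacian-derived matrix. The example uses the common-neighbor count matrix as C; the normalization, the random weights and the function names are assumptions made for illustration and are not taken from the authors' library.

```python
import numpy as np

def common_neighbor_matrix(adj):
    """Entry (i, j) counts the neighbors shared by nodes i and j (diagonal zeroed)."""
    adj = np.asarray(adj, dtype=float)
    cn = adj @ adj
    np.fill_diagonal(cn, 0.0)
    return cn

def normalize(mat):
    """Symmetric normalization D^{-1/2} (M + I) D^{-1/2}, as in standard GCNs."""
    m = mat + np.eye(mat.shape[0])
    inv_sqrt = 1.0 / np.sqrt(m.sum(axis=1))
    return m * inv_sqrt[:, None] * inv_sqrt[None, :]

def gcn_layer(conv, features, weights):
    """One graph convolution layer with a ReLU activation."""
    return np.maximum(conv @ features @ weights, 0.0)

# Path graph 0-1-2 with one-hot node features and untrained random weights.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
conv = normalize(common_neighbor_matrix(adj))
features = np.eye(3)
weights = np.random.default_rng(0).normal(size=(3, 2))
embedding = gcn_layer(conv, features, weights)
```

In practice the weights would be trained on a link prediction objective, and other similarity measures could replace `common_neighbor_matrix` without changing the layer itself.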


Understanding Negative Sampling in Knowledge Graph Embedding

International Journal of Artificial Intelligence & Applications ◽

10.5121/ijaia.2021.12105 ◽

2021 ◽

Vol 12 (1) ◽

pp. 71-81

Author(s):

Jing Qian ◽

Gangmin Li ◽

Katie Atkinson ◽

Yong Yue

Keyword(s):

Link Prediction ◽

Graph Embedding ◽

Knowledge Graph ◽

Direct Impact ◽

Dimensional Vector Space ◽

Dynamic Distribution ◽

Space Efficiency ◽

Node Classification ◽

Low Dimensional

Knowledge graph embedding (KGE) projects the entities and relations of a knowledge graph (KG) into a low-dimensional vector space, and has made steady progress in recent years. Conventional KGE methods, especially translational distance-based models, are trained by discriminating positive samples from negative ones. Most KGs store only positive samples for space efficiency. Negative sampling thus plays a crucial role in encoding triples of a KG. The quality of generated negative samples has a direct impact on the performance of learnt knowledge representation in a myriad of downstream tasks, such as recommendation, link prediction and node classification. We summarize current negative sampling approaches in KGE into three categories: static distribution-based, dynamic distribution-based and custom cluster-based. Based on this categorization we discuss the most prevalent existing approaches and their characteristics. We hope that this review can provide some guidelines for new thoughts about negative sampling in KGE.
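A minimal sketch of the simplest static distribution-based strategy is shown below: a positive triple is corrupted by replacing its head or tail with an entity drawn uniformly at random, and corruptions that happen to be known true triples are rejected. The function name and the toy data are hypothetical, and the review covers more refined strategies than this one.

```python
import random

def corrupt_uniform(triple, entities, known, rng):
    """Uniform negative sampling: replace the head or the tail with a random entity."""
    head, rel, tail = triple
    # Assumes at least one corruption is not a known triple; otherwise this loops forever.
    while True:
        candidate = rng.choice(entities)
        if rng.random() < 0.5:
            negative = (candidate, rel, tail)
        else:
            negative = (head, rel, candidate)
        if negative not in known:  # reject corruptions that are actually true
            return negative

entities = ["paris", "france", "berlin", "germany"]
known = {
    ("paris", "capital_of", "france"),
    ("berlin", "capital_of", "germany"),
}
rng = random.Random(0)
positive = ("paris", "capital_of", "france")
negative = corrupt_uniform(positive, entities, known, rng)
```

Dynamic distribution-based approaches would replace the uniform `rng.choice` with a sampling distribution that changes during training.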


An Experimental Evaluation of Similarity-Based and Embedding-Based Link Prediction Methods on Graphs

International Journal of Data Mining & Knowledge Management Process ◽

10.5121/ijdkp.2021.11501 ◽

2021 ◽

Vol 11 (5) ◽

pp. 1-18

Author(s):

Md Kamrul Islam ◽

Sabeur Aridhi ◽

Malika Smail-Tabbone

Keyword(s):

Link Prediction ◽

Graph Embedding ◽

Black Box ◽

Prediction Performance ◽

Prediction Methods ◽

Good Prediction ◽

Good Prediction Performance ◽

Node Similarity ◽

Category Comparison ◽

Selection Of

The task of inferring missing links or predicting future ones in a graph based on its current structure is referred to as link prediction. Link prediction methods based on pairwise node similarity are well established in the literature and show good prediction performance on many real-world graphs, although they are heuristic. Graph embedding approaches, on the other hand, learn low-dimensional representations of the nodes in a graph and can capture inherent graph features, and thus support the subsequent link prediction task. This paper studies a selection of methods from both categories on several benchmark (homogeneous) graphs with different properties from various domains. Beyond the intra- and inter-category comparison of the methods' performance, our aim is also to uncover interesting connections between Graph Neural Network (GNN)-based methods and heuristic ones, as a means to alleviate the well-known black-box limitation.


Persona2vec: a flexible multi-role representations learning framework for graphs

PeerJ Computer Science ◽

10.7717/peerj-cs.439 ◽

2021 ◽

Vol 7 ◽

pp. e439

Author(s):

Jisung Yoon ◽

Kai-Cheng Yang ◽

Woo-Sung Jung ◽

Yong-Yeol Ahn

Keyword(s):

Community Structure ◽

Link Prediction ◽

Graph Mining ◽

State Of The Art ◽

Multiple Representations ◽

Graph Embedding ◽

Learning Framework ◽

Overlapping Community ◽

Art Performance ◽

Low Dimensional

Graph embedding techniques, which learn low-dimensional representations of a graph, are achieving state-of-the-art performance in many graph mining tasks. Most existing embedding algorithms assign a single vector to each node, implicitly assuming that a single representation is enough to capture all characteristics of the node. However, across many domains, it is common to observe pervasively overlapping community structure, where most nodes belong to multiple communities, playing different roles depending on the contexts. Here, we propose persona2vec, a graph embedding framework that efficiently learns multiple representations of nodes based on their structural contexts. Using link prediction-based evaluation, we show that our framework is significantly faster than the existing state-of-the-art model while achieving better performance.


Enhanced Unsupervised Graph Embedding via Hierarchical Graph Convolution Network

Mathematical Problems in Engineering ◽

10.1155/2020/5702519 ◽

2020 ◽

Vol 2020 ◽

pp. 1-9

Author(s):

H. Zhang ◽

J. J. Zhou ◽

R. Li

Keyword(s):

Graph Embedding ◽

Network Node ◽

Dimensional Representation ◽

Undirected Graphs ◽

Node Label ◽

Structure Information ◽

Initial Node ◽

Hierarchical Graph ◽

Node Classification ◽

Low Dimensional

Graph embedding aims to learn low-dimensional representations of the nodes in a network, and has recently received growing attention in many graph-based tasks. Graph Convolution Network (GCN) is a typical deep semi-supervised graph embedding model, which can acquire node representations from a complex network. However, GCN usually needs a lot of labeled data and additional expressive features in the graph embedding learning process, so the model cannot be effectively applied to undirected graphs with only network structure information. In this paper, we propose a novel unsupervised graph embedding method via hierarchical graph convolution network (HGCN). First, HGCN builds the initial node embedding and pseudo-labels for the undirected graph. It then uses GCNs to learn the node embedding and update the labels. Finally, it combines the HGCN output representation with the initial embedding to obtain the graph embedding. Furthermore, we improve the model to match different undirected networks according to the number of network node label types. Comprehensive experiments demonstrate that our proposed HGCN and HGCN∗ can significantly enhance the performance of the node classification task.


Deep Attributed Network Embedding

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/467 ◽

2018 ◽

Cited By ~ 34

Author(s):

Hongchang Gao ◽

Heng Huang

Keyword(s):

Topological Structure ◽

Link Prediction ◽

Network Embedding ◽

Dimensional Representation ◽

Attributed Network ◽

Real World Applications ◽

Benchmark Datasets ◽

Node Attributes ◽

Novel Strategy ◽

Low Dimensional

Network embedding has attracted a surge of attention in recent years. It aims to learn low-dimensional representations of the nodes in a network, which benefit downstream tasks such as node classification and link prediction. Most existing approaches learn node representations based only on the topological structure, yet nodes are often associated with rich attributes in many real-world applications. Thus, it is important and necessary to learn node representations based on both the topological structure and node attributes. In this paper, we propose a novel deep attributed network embedding approach, which can capture the high non-linearity and preserve various proximities in both topological structure and node attributes. At the same time, a novel strategy is proposed to guarantee that the learned node representation encodes the consistent and complementary information from the topological structure and node attributes. Extensive experiments on benchmark datasets have verified the effectiveness of our proposed approach.
