Discovering Correlation Indices for Link Prediction Using Differential Evolution

Giulio Biondi; Valentina Franzoni

doi:10.3390/math8112097

Mathematics ◽

10.3390/math8112097 ◽

2020 ◽

Vol 8 (11) ◽

pp. 2097

Author(s):

Giulio Biondi ◽

Valentina Franzoni

Keyword(s):

Differential Evolution ◽

Link Prediction ◽

Predictive Power ◽

Similarity Measures ◽

Correlation Index ◽

Domain Specific ◽

Similarity Indices ◽

Binary Correlation ◽

Correlation Space ◽

The Given

Binary correlation indices are crucial for forecasting and modelling tasks in different areas of scientific research. Setting sound binary correlations and similarity measures is a long and mostly empirical interactive process: researchers start from experimental correlations in one domain, which usually prove effective in other similar fields, and then progressively evaluate and modify those correlations to adapt their predictive power to the specific characteristics of the domain under examination. In research on link prediction in complex networks, it has been found that no single correlation index can always obtain excellent results, even in similar domains. The search for domain-specific correlation indices, or the adaptation of known ones, is therefore a problem of critical concern. This paper presents a solution to the problem of setting new binary correlation indices that achieve efficient performance on specific network domains. The proposed solution is based on Differential Evolution, evolving the coefficient vectors of meta-correlations, structures that describe classes of binary similarity indices and subsume the best-known correlation indices for link prediction. Experiments show that the proposed evolutionary approach always results in improved performance, in some cases significantly enhanced, compared to the best correlation indices available in the link prediction literature, effectively exploring the correlation space and exploiting its self-adaptability to the given domain to improve over generations.
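
The idea of evolving a coefficient vector for a family of similarity indices can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the two-coefficient family score = sum over common neighbors z of k_z^(-a), divided by (k_x * k_y)^b, stands in for the paper's meta-correlations (it reduces to Common Neighbors at a=0, b=0, Resource Allocation at a=1, b=0, and the Salton index at a=0, b=0.5); the fitness is a pairwise AUC on a hand-made toy graph; and the optimizer is a plain DE/rand/1/bin loop with hypothetical parameter values.

```python
import random

def build_adj(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def score(adj, x, y, a, b):
    """Hypothetical parametric index (an assumption, not the paper's form)."""
    if not adj[x] or not adj[y]:
        return 0.0
    num = sum(len(adj[z]) ** (-a) for z in adj[x] & adj[y])
    return num / (len(adj[x]) * len(adj[y])) ** b

def auc(adj, positives, negatives, a, b):
    """Fraction of (held-out edge, non-edge) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in positives:
        sp = score(adj, p[0], p[1], a, b)
        for n in negatives:
            sn = score(adj, n[0], n[1], a, b)
            wins += 1.0 if sp > sn else 0.5 if sp == sn else 0.0
    return wins / (len(positives) * len(negatives))

def differential_evolution(fitness, bounds, seeds=(), pop_size=12, gens=30,
                           F=0.7, CR=0.9, rng_seed=0):
    """Maximize `fitness` with DE/rand/1/bin; `seeds` are known vectors
    placed in the initial population (a design choice of this sketch)."""
    rng = random.Random(rng_seed)
    dim = len(bounds)
    pop = [list(s) for s in seeds][:pop_size]
    while len(pop) < pop_size:
        pop.append([rng.uniform(lo, hi) for lo, hi in bounds])
    fit = [fitness(ind) for ind in pop]
    for _ in range(gens):
        for i in range(pop_size):
            # Three distinct individuals, all different from the target i.
            r1, r2, r3 = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for d in range(dim):
                if rng.random() < CR or d == j_rand:
                    v = pop[r1][d] + F * (pop[r2][d] - pop[r3][d])
                    lo, hi = bounds[d]
                    v = min(max(v, lo), hi)  # clip to the search bounds
                else:
                    v = pop[i][d]
                trial.append(v)
            f = fitness(trial)
            if f >= fit[i]:  # greedy one-to-one selection
                pop[i], fit[i] = trial, f
    best = max(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best]

# Toy data (made up for illustration): observed edges, held-out edges, non-edges.
train_edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3),
               (2, 4), (3, 4), (4, 5), (4, 6), (5, 6)]
positives = [(0, 3), (3, 5)]
negatives = [(0, 5), (1, 6), (0, 6)]
adj = build_adj(train_edges)

bounds = [(0.0, 2.0), (0.0, 2.0)]                # search ranges for (a, b)
seeds = [(0.0, 0.0), (1.0, 0.0), (0.0, 0.5)]     # CN, RA, Salton

def fit_fn(w):
    return auc(adj, positives, negatives, w[0], w[1])

best, best_fit = differential_evolution(fit_fn, bounds, seeds=seeds)
print("best coefficients:", best, "AUC:", best_fit)
```

Because the known indices are seeded into the initial population and selection never replaces an individual with a worse one, the returned vector scores at least as well on the toy fitness as the best seeded index. On a real network the fitness would be computed on a proper train/probe split rather than on hand-picked pairs.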

SIMILARITY INDEX BASED ON THE INFORMATION OF NEIGHBOR NODES FOR LINK PREDICTION OF COMPLEX NETWORK

Modern Physics Letters B ◽

10.1142/s0217984913500395 ◽

2013 ◽

Vol 27 (06) ◽

pp. 1350039 ◽

Cited By ~ 9

Author(s):

JING WANG ◽

LILI RONG

Keyword(s):

Link Prediction ◽

High Efficiency ◽

Similarity Index ◽

Similarity Measures ◽

Nearest Neighbors ◽

Clustering Coefficient ◽

Local Similarity ◽

Common Neighbor ◽

Similarity Indices ◽

Node Similarity

Link prediction in complex networks has attracted much attention recently. Many local similarity measures based on the measurement of node similarity have been proposed. Among these local similarity indices, the neighborhood-based indices Common Neighbors (CN), Adamic-Adar (AA) and Resource Allocation (RA) perform best. It is found that node similarity indices that require only information on the nearest neighbors are assigned high scores and have very low computational complexity. In this paper, a new index based on the contribution of common neighbor nodes to edges is proposed. It is shown to predict competitively well, or even better than other neighborhood-based indices, especially for networks with a low clustering coefficient, while retaining high efficiency and simplicity.
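
For reference, the three neighborhood-based indices named above can be sketched in a few lines, using their standard definitions: CN counts shared neighbors, AA sums 1/log(k_z) over shared neighbors z, and RA sums 1/k_z, where k_z is the degree of z. The adjacency-map representation and the toy graph are assumptions made for illustration; the new index proposed in the paper is not reproduced here.

```python
import math

def common_neighbors(adj, x, y):
    """CN: number of neighbors shared by x and y."""
    return len(adj[x] & adj[y])

def adamic_adar(adj, x, y):
    """AA: shared neighbors weighted by 1 / log(degree)."""
    # A shared neighbor of two distinct nodes has degree >= 2, so log > 0;
    # the guard only protects against malformed input.
    return sum(1.0 / math.log(len(adj[z]))
               for z in adj[x] & adj[y] if len(adj[z]) > 1)

def resource_allocation(adj, x, y):
    """RA: shared neighbors weighted by 1 / degree."""
    return sum(1.0 / len(adj[z]) for z in adj[x] & adj[y])

# Toy undirected graph (made up for illustration), as {node: set(neighbors)}.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}

# Nodes "a" and "d" are not linked and share the neighbors "b" and "c",
# each of degree 3.
print(common_neighbors(adj, "a", "d"))     # 2
print(adamic_adar(adj, "a", "d"))          # 2 / ln(3)
print(resource_allocation(adj, "a", "d"))  # 2 / 3
```

All three need only the neighbor sets of the two endpoints and the degrees of their shared neighbors, which is why such local indices are cheap to compute.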

An effective community-based link prediction model for improving accuracy in social networks

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-211821 ◽

2021 ◽

pp. 1-17

Author(s):

M. Mohamed Iqbal ◽

K. Latha

Keyword(s):

Social Networks ◽

Prediction Model ◽

Real World ◽

Link Prediction ◽

Centrality Measures ◽

Community Based ◽

Proposed Model ◽

Similarity Indices ◽

Community Information ◽

The Given

Link prediction plays a predominant role in complex network analysis. It aims to determine the probability of the presence of future links based on available information. Existing link prediction models based on standard classical similarity indices assume that neighbor nodes have a similar effect on link probability. Nevertheless, in real-world networks the effect of common neighbor nodes residing in different communities may vary. In this paper, a novel community information-based link prediction model is proposed, in which every neighboring node’s community information (community centrality) is considered to predict the link between the given node pair. In the proposed model, the given social network graph is divided into different communities, and community centrality is calculated for every derived community based on the basic graph centrality measures of degree, closeness, and betweenness. Afterward, new community centrality-based similarity indices are introduced to compute the community centralities, which are applied to nine existing basic similarity indices. The empirical analysis on 13 real-world social network datasets shows that the proposed model yields a prediction accuracy of 97%, better than existing models. Moreover, the proposed model is parallelized efficiently to work on large complex networks using the Spark GraphX Big Data-based parallel graph processing technique, and it attains a lower execution time of 250 seconds.

An information theoretic approach to link prediction in multiplex networks

Scientific Reports ◽

10.1038/s41598-021-92427-1 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Seyed Hossein Jafari ◽

Amir Mahdi Abdolhosseini-Qomi ◽

Masoud Asadpour ◽

Maseud Rahgozar ◽

Naser Yazdani

Keyword(s):

Real World ◽

Link Prediction ◽

Large Scale ◽

Similarity Measures ◽

Prediction Method ◽

General Purpose ◽

Fast Method ◽

Theoretic Approach ◽

Multiplex Networks ◽

Wide Range

The entities of real-world networks are connected via different types of connections (i.e., layers). The task of link prediction in multiplex networks is about finding missing connections based on both intra-layer and inter-layer correlations. Our observations confirm that in a wide range of real-world multiplex networks, from social to biological and technological, a positive correlation exists between connection probability in one layer and similarity in other layers. Accordingly, a similarity-based automatic general-purpose multiplex link prediction method, SimBins, is devised that quantifies the amount of connection uncertainty based on observed inter-layer correlations in a multiplex network. Moreover, SimBins enhances the prediction quality in the target layer by incorporating the effect of link overlap across layers. Applying SimBins to various datasets from diverse domains, our findings indicate that SimBins outperforms the compared methods (both baseline and state-of-the-art methods) in most instances when predicting links. Furthermore, it is discussed that SimBins imposes minor computational overhead on the base similarity measures, making it a potentially fast method, suitable for large-scale multiplex networks.

Semantic Similarity Measures for Topological Link Prediction

Computational Science and Its Applications – ICCSA 2020 - Lecture Notes in Computer Science ◽

10.1007/978-3-030-58814-4_10 ◽

2020 ◽

pp. 132-142 ◽

Cited By ~ 1

Author(s):

Giulio Biondi ◽

Valentina Franzoni

Keyword(s):

Semantic Similarity ◽

Link Prediction ◽

Similarity Measures

Domain-specific Evaluation Dataset Generator for Multilingual Text Analysis

Journal of Intelligent Systems with Applications ◽

10.54856/jiswa.201912084 ◽

2019 ◽

pp. 140-147

Author(s):

Emrah Inan ◽

Vahab Mostafapour ◽

Fatif Tekbacak

Keyword(s):

Text Analysis ◽

General Purpose ◽

Entity Linking ◽

Named Entity ◽

Domain Specific ◽

Benchmark Datasets ◽

Concise Information ◽

Multilingual Text ◽

The Given ◽

Specific Evaluation

The Web makes it possible to retrieve concise information about specific entities, including people, organizations, movies and their features. However, a large amount of Web resources is generally in unstructured form, which makes it difficult to find critical information about specific entities. Text analysis approaches such as Named Entity Recognizer and Entity Linking aim to identify entities and link them to relevant entities in the given knowledge base. To evaluate these approaches, there is a vast number of general-purpose benchmark datasets. However, it is difficult to evaluate domain-specific approaches due to the lack of evaluation datasets for specific domains. This study presents WeDGeM, a multilingual evaluation set generator for specific domains that exploits Wikipedia category pages and the DBpedia hierarchy. Also, Wikipedia disambiguation pages are used to adjust the ambiguity level of the generated texts. Based on this generated test data, well-known Entity Linking systems supporting Turkish texts are evaluated in a use case in the movie domain.

DISPOSITIONAL TRAITS AS PREDICTORS OF SELF-EFFICACY

10.36315/2021pad04 ◽

2020 ◽

pp. 32-44

Author(s):

Elena Lisá ◽

Keyword(s):

Personality Traits ◽

Achievement Motivation ◽

Self Efficacy ◽

Predictive Power ◽

Career Decision ◽

Self Control ◽

Personality Theory ◽

Domain Specific ◽

Specific Career ◽

General Self Efficacy

Introduction: We started from Bandura's theory of self-efficacy, the onion model of achievement motivation according to Schuler & Prochaska, and the 5-factor personality theory by Costa & McCrae. The study aimed to analyze the predictive power of achievement motivation and personality traits on general self-efficacy and domain-specific career decision self-efficacy. We expected a more significant relationship of stable personality characteristics with general self-efficacy than with domain-specific career decision self-efficacy. Methods: 690 adult participants (university students and working adults) completed a career decision self-efficacy questionnaire, and 268 of them a general self-efficacy scale. All participants also completed an achievement motivation questionnaire and a five-factor personality theory questionnaire. Results: All five personality traits, combined with four dimensions of achievement motivation (dominance, confidence in success, self-control, and competitiveness), explain 61% of general self-efficacy variability. Extraversion, agreeableness, and conscientiousness, with six achievement motivation dimensions (dominance, engagement, confidence in success, fearlessness, competitiveness, and goal setting), explain 42.5% of career decision self-efficacy variability. Discussion: Stable traits and achievement motivation dimensions had more significant predictive power on general self-efficacy than on domain-specific career decision self-efficacy. For further research, we suggest a theoretically and empirically integrated model of dispositional and social-cognitive approaches.

Embedding Methods or Link-based Similarity Measures, Which is Better for Link Prediction?

10.1109/ic-nidc54101.2021.9660590 ◽

2021 ◽

Author(s):

Masoud Reyhani Hamedani ◽

Sang-Wook Kim

Keyword(s):

Link Prediction ◽

Similarity Measures ◽

Embedding Methods

User Link Prediction based on Logistic Regression Model with Local Similarity Indices in Microblog Network

Journal of Convergence Information Technology ◽

10.4156/jcit.vol8.issue2.7 ◽

2013 ◽

Vol 8 (2) ◽

pp. 49-58

Author(s):

Jie Lian ◽

Haiqiang Chen ◽

Yun Liu ◽

Fei Xiong ◽

Yuan Wen

Keyword(s):

Logistic Regression ◽

Regression Model ◽

Link Prediction ◽

Logistic Regression Model ◽

Local Similarity ◽

Similarity Indices

A Machine Learning Approach to Data Cleaning in Databases and Data Warehouses

Handbook of Research on Fuzzy Information Processing in Databases ◽

10.4018/978-1-59904-853-6.ch030 ◽

2011 ◽

pp. 745-759

Author(s):

Hamid Haidarian Shahri

Keyword(s):

Fuzzy Inference ◽

Data Cleaning ◽

Similarity Measures ◽

Entity Resolution ◽

String Similarity ◽

Domain Specific ◽

Meta Level ◽

Neuro Fuzzy ◽

Resolution Problem ◽

String Similarity Measures

Entity resolution (also known as duplicate elimination) is an important part of the data cleaning process, especially in data integration and warehousing, where data are gathered from distributed and inconsistent sources. Learnable string similarity measures are an active area of research in the entity resolution problem. Our proposed framework builds upon our earlier work on entity resolution, in which fuzzy rules and membership functions are defined by the user. Here, we exploit neuro-fuzzy modeling for the first time to produce a unique adaptive framework for entity resolution, which automatically learns and adapts to the specific notion of similarity at a meta-level. This framework encompasses much of the previous work on trainable and domain-specific similarity measures. Employing fuzzy inference, it removes the repetitive task of hard-coding a program based on a schema, which is usually required in previous approaches. In addition, our extensible framework is very flexible for the end user. Hence, it can be utilized in the production of an intelligent tool to increase the quality and accuracy of data.

A new study of using temporality and weights to improve similarity measures for link prediction of social networks

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-17770 ◽

2018 ◽

Vol 34 (4) ◽

pp. 2667-2678

Author(s):

Farshad Aghabozorgi ◽

Mohammad Reza Khayyambashi

Keyword(s):

Social Networks ◽

Link Prediction ◽

Similarity Measures
