On Investigating Both Effectiveness and Efficiency of Embedding Methods in Task of Similarity Computation of Nodes in Graphs

Masoud Reyhani Hamedani; Sang-Wook Kim

doi:10.3390/app11010162

Applied Sciences ◽

10.3390/app11010162 ◽

2020 ◽

Vol 11 (1) ◽

pp. 162

Author(s):

Masoud Reyhani Hamedani ◽

Sang-Wook Kim

Keyword(s):

Social Relations ◽

Dimensional Space ◽

Similarity Measures ◽

Parameter Tuning ◽

Original Graph ◽

Wide Range ◽

Similarity Computation ◽

Effectiveness And Efficiency ◽

Low Dimensional ◽

Embedding Methods

One of the important tasks in a graph is to compute the similarity between two nodes; link-based similarity measures (in short, similarity measures) are well-known, conventional techniques for this task that exploit the relations between nodes (i.e., links) in the graph. Graph embedding methods (in short, embedding methods) convert the nodes of a graph into vectors in a low-dimensional space while preserving the social relations among nodes in the original graph. Instead of applying a similarity measure to the graph to compute the similarity between nodes a and b, we can take the proximity between the vectors of a and b obtained by an embedding method as the similarity between a and b. Although embedding methods have been analyzed in a wide range of machine learning tasks such as link prediction and node classification, they have not been investigated in terms of similarity computation of nodes. In this paper, we investigate both the effectiveness and the efficiency of embedding methods in the task of similarity computation of nodes by comparing them with those of similarity measures. To the best of our knowledge, this is the first work that examines the application of embedding methods in this special task. Based on the results of our extensive experiments with five well-known and publicly available datasets, we made the following observations about embedding methods: (1) they are less effective than similarity measures on all datasets except one, (2) they are less efficient than similarity measures on all datasets except one, (3) they have more parameters than similarity measures, which leads to a time-consuming parameter tuning process, and (4) increasing the number of dimensions does not necessarily improve their effectiveness in computing the similarity of nodes.
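The two routes to node similarity described in this abstract can be contrasted in a minimal sketch. It assumes cosine similarity as the vector proximity and the Jaccard coefficient over neighbor sets as a stand-in link-based measure; the abstract names neither, and the toy graph, the vectors, and all function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Proximity between two embedding vectors (assumed choice of proximity)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def jaccard_similarity(graph, a, b):
    """Link-based similarity computed directly on the graph (stand-in measure)."""
    neighbors_a, neighbors_b = graph[a], graph[b]
    union = neighbors_a | neighbors_b
    if not union:
        return 0.0
    return len(neighbors_a & neighbors_b) / len(union)


# Hypothetical undirected toy graph as adjacency sets.
toy_graph = {
    "a": {"c", "d"},
    "b": {"c", "d", "e"},
    "c": {"a", "b"},
    "d": {"a", "b"},
    "e": {"b"},
}

# Hypothetical 2-dimensional vectors, as if produced by some embedding method.
toy_embeddings = {"a": [0.9, 0.1], "b": [0.8, 0.3]}

link_based = jaccard_similarity(toy_graph, "a", "b")
embedding_based = cosine_similarity(toy_embeddings["a"], toy_embeddings["b"])
print(f"link-based: {link_based:.3f}, embedding-based: {embedding_based:.3f}")
```

The first route needs only the graph, while the second needs a trained embedding with its own parameters (dimension count and so on), which is the tuning cost the abstract refers to.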

Relation Structure-Aware Heterogeneous Information Network Embedding

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33014456 ◽

2019 ◽

Vol 33 ◽

pp. 4456-4463 ◽

Cited By ~ 8

Author(s):

Yuanfu Lu ◽

Chuan Shi ◽

Linmei Hu ◽

Zhiyuan Liu

Keyword(s):

Real World ◽

Dimensional Space ◽

Structural Characteristics ◽

Information Network ◽

Network Embedding ◽

Heterogeneous Information Network ◽

Heterogeneous Information ◽

Real World Datasets ◽

Low Dimensional ◽

Embedding Methods

Heterogeneous information network (HIN) embedding aims to embed multiple types of nodes into a low-dimensional space. Although most existing HIN embedding methods consider heterogeneous relations in HINs, they usually employ one single model for all relations without distinction, which inevitably restricts the capability of network embedding. In this paper, we take the structural characteristics of heterogeneous relations into consideration and propose a novel Relation structure-aware Heterogeneous Information Network Embedding model (RHINE). By exploring real-world networks with thorough mathematical analysis, we present two structure-related measures that can consistently distinguish heterogeneous relations into two categories: Affiliation Relations (ARs) and Interaction Relations (IRs). To respect the distinctive characteristics of relations, in RHINE we propose different models specifically tailored to handle ARs and IRs, which can better capture the structures and semantics of the networks. Finally, we combine and optimize these models in a unified and elegant manner. Extensive experiments on three real-world datasets demonstrate that our model significantly outperforms the state-of-the-art methods in various tasks, including node clustering, link prediction, and node classification.

Processing Top-N Queries Based on p-Norm Distances

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.490-491.1293 ◽

2014 ◽

Vol 490-491 ◽

pp. 1293-1297

Author(s):

Liang Zhu ◽

Fei Fei Liu ◽

Wu Chen ◽

Qing Ma

Keyword(s):

Fundamental Principle ◽

High Dimensional ◽

Query Point ◽

Ranking Functions ◽

Important Method ◽

Wide Range ◽

Effectiveness And Efficiency ◽

Data Objects ◽

Low Dimensional ◽

Ranked List

Top-N queries are employed in a wide range of applications to obtain a ranked list of data objects that have the highest aggregate scores over certain attributes. The threshold algorithm (TA) is an important method in many scenarios. However, TA is effective only when the ranking function is monotone and the query point is fixed. In this paper, we propose an approach that alleviates the limitations of TA-like methods for processing top-N queries. Based on p-norm distances as ranking functions, our methods utilize the fundamental principle of Functional Analysis so that the candidate tuples of a top-N query with a p-norm distance can be obtained by the maximum distance. We conduct extensive experiments to demonstrate the effectiveness and efficiency of our method for both low-dimensional (2, 3 and 4) and high-dimensional (25, 50 and 104) data.
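One way to read the idea of obtaining candidates by the maximum distance is through the standard inequality max-distance(x, q) ≤ p-norm-distance(x, q) for p ≥ 1: every tuple within p-norm distance r of the query also lies within maximum distance r, so a hypercube of radius r around the query contains all of them. The sketch below illustrates only that inequality; it is not the authors' algorithm, and the function names, the in-memory point list, and the caller-supplied radius are assumptions.

```python
import heapq


def p_norm_distance(x, q, p):
    """p-norm (Minkowski) distance between tuple x and query point q, p >= 1."""
    return sum(abs(a - b) ** p for a, b in zip(x, q)) ** (1.0 / p)


def max_distance(x, q):
    """Maximum (Chebyshev) distance; never larger than any p-norm distance."""
    return max(abs(a - b) for a, b in zip(x, q))


def top_n_brute_force(points, q, n, p):
    """Reference answer: rank every tuple by its p-norm distance."""
    return heapq.nsmallest(n, points, key=lambda x: p_norm_distance(x, q, p))


def top_n_with_max_filter(points, q, n, p, radius):
    """Rank only the tuples inside the hypercube of the given radius around q.

    Returns (ranked, complete). The answer is guaranteed exact only when
    `complete` is True, i.e. the n-th ranked candidate lies within `radius`
    in p-norm distance; otherwise the caller should retry with a larger radius.
    """
    candidates = [x for x in points if max_distance(x, q) <= radius]
    ranked = heapq.nsmallest(n, candidates, key=lambda x: p_norm_distance(x, q, p))
    complete = len(ranked) == n and p_norm_distance(ranked[-1], q, p) <= radius
    return ranked, complete


# Hypothetical 2-dimensional data and query point.
toy_points = [(0.0, 0.0), (1.0, 1.0), (3.0, 0.0), (5.0, 5.0)]
toy_query = (0.0, 0.0)
print(top_n_with_max_filter(toy_points, toy_query, n=2, p=2, radius=2.0))
```

In a database setting the hypercube filter would typically be answered by range predicates on each attribute, which is why the maximum distance is a convenient candidate generator; here it is a plain list comprehension for brevity.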

Modeling Complementarity in Behavior Data with Multi-Type Itemset Embedding

ACM Transactions on Intelligent Systems and Technology ◽

10.1145/3458724 ◽

2021 ◽

Vol 12 (4) ◽

pp. 1-25

Author(s):

Daheng Wang ◽

Qingkai Zeng ◽

Nitesh V. Chawla ◽

Meng Jiang

Keyword(s):

Computational Models ◽

Team Building ◽

Representation Learning ◽

Behavior Prediction ◽

Team Members ◽

Space Experiments ◽

Reading Materials ◽

Effectiveness And Efficiency ◽

Low Dimensional ◽

Embedding Methods

People look for complementary contexts, such as team members with complementary skills for project team building or reading materials with complementary knowledge for effective student learning, to make their behaviors more likely to succeed. Complementarity has been revealed by behavioral sciences as one of the most important factors in decision making. Existing computational models that learn low-dimensional context representations from behavior data have poor scalability, and recent network embedding methods focus only on preserving the similarity between the contexts. In this work, we formulate a behavior entry as a set of context items and propose a novel representation learning method, Multi-type Itemset Embedding, to learn context representations that preserve the itemset structures. We propose a measurement of complementarity between context items in the embedding space. Experiments demonstrate both the effectiveness and the efficiency of the proposed method over the state-of-the-art methods on behavior prediction and context recommendation. We discover that complementary contexts and similar contexts are significantly different in human behaviors.

Speaker/Style-Dependent Neural Network Speech Synthesis Based on Speaker/Style Embedding

JUCS - Journal of Universal Computer Science ◽

10.3897/jucs.2020.023 ◽

2020 ◽

Vol 26 (4) ◽

pp. 434-453

Author(s):

Milan Sečujski ◽

Darko Pekar ◽

Siniša Suzić ◽

Anton Smirnov ◽

Tijana Nosek

Keyword(s):

Neural Network ◽

Speech Synthesis ◽

American English ◽

Dimensional Space ◽

Training Data ◽

Customer Interaction ◽

Wide Range ◽

Low Dimensional ◽

Speaking Style ◽

Target Speaker

The paper presents a novel architecture and method for training neural networks to produce synthesized speech in a particular voice and speaking style, based on a small quantity of target speaker/style training data. The method is based on neural network embedding, i.e. mapping of discrete variables into continuous vectors in a low-dimensional space, which has been shown to be a very successful universal deep learning technique. In this particular case, different speaker/style combinations are mapped into different points in a low-dimensional space, which enables the network to capture the similarities and differences between speakers and speaking styles more efficiently. The initial model from which speaker/style adaptation was carried out was a multi-speaker/multi-style model based on 8.5 hours of American English speech data, corresponding to 16 different speaker/style combinations. The results of the experiments show that both versions of the obtained system, one using 10 minutes and the other as little as 30 seconds of target data, outperform the state of the art in parametric speaker/style-dependent speech synthesis. This opens a wide range of applications for speaker/style-dependent speech synthesis based on small quantities of training data, in domains ranging from customer interaction in call centers to robot-assisted medical therapy.

Parameter tuning is a key part of dimensionality reduction via deep variational autoencoders for single cell RNA transcriptomics

10.1101/385534 ◽

2018 ◽

Cited By ~ 5

Author(s):

Qiwen Hu ◽

Casey S. Greene

Keyword(s):

Gene Expression ◽

Single Cell ◽

Gene Expression Data ◽

Large Scale ◽

Dimensional Space ◽

Parameter Tuning ◽

Generative Models ◽

Underlying Structure ◽

Expression Data ◽

Low Dimensional

Single-cell RNA sequencing (scRNA-seq) is a powerful tool to profile the transcriptomes of a large number of individual cells at a high resolution. These data usually contain measurements of gene expression for many genes in thousands or tens of thousands of cells, though some datasets now reach the million-cell mark. Projecting high-dimensional scRNA-seq data into a low-dimensional space aids downstream analysis and data visualization. Many recent preprints accomplish this using variational autoencoders (VAEs), generative models that learn the underlying structure of data by compressing it into a constrained, low-dimensional space. The low-dimensional spaces generated by VAEs have revealed complex patterns and novel biological signals from large-scale gene expression data and drug response predictions. Here, we evaluate a simple VAE approach for gene expression data, Tybalt, by training and measuring its performance on sets of simulated scRNA-seq data. We find a number of counter-intuitive performance features: for example, deeper neural networks can struggle when datasets contain more observations under some parameter configurations. We show that these methods are highly sensitive to parameter tuning: when tuned, the Tybalt model, which was not optimized for scRNA-seq data, outperforms other popular dimension reduction approaches: PCA, ZIFA, UMAP and t-SNE. On the other hand, without tuning, performance can also be remarkably poor on the same data. Our results should discourage authors and reviewers from relying on self-reported performance comparisons to evaluate the relative value of contributions in this area at this time. Instead, we recommend that attempts to compare or benchmark autoencoder methods for scRNA-seq data be performed by disinterested third parties, or by methods developers only on unseen benchmark data that are provided to all participants simultaneously, because the potential for performance differences due to unequal parameter tuning is so high.

Using social face-space to estimate social variables in real and artificial faces

10.31234/osf.io/btv84 ◽

2019 ◽

Cited By ~ 1

Author(s):

Benjamin Balas ◽

Josselyn Thrash

Keyword(s):

Dimensional Space ◽

Social Characteristics ◽

Social Variables ◽

Social Judgments ◽

Social Evaluation ◽

Face Space ◽

Wide Range ◽

Low Dimensional ◽

Human Faces ◽

Judgment Data

Observers estimate a range of social characteristics from images of human faces. An important unifying framework for these judgments is the observation that a low-dimensional social face-space based on perceived valence and dominance captures most of the variance across a wide range of social evaluation judgments. However, it is not known whether this low-dimensional space can be used to infer the outcome of new social judgments. Further, the extent to which such social inference may differ across real and computer-generated faces is also unknown. We address both of these issues by recovering valence/dominance axes from social judgments made of real and artificial faces, then attempting to use these coordinates to predict independent social judgment data obtained from new human observers. We find that above-chance performance can be achieved, though performance appears to be better with artificial faces than with real ones.

Link Prediction in Complex Networks using Embedding Techniques and Similarity Measures

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.e2762.039520 ◽

2020 ◽

Vol 9 (5) ◽

pp. 1690-1696

Keyword(s):

Biological Networks ◽

Link Prediction ◽

Preferential Attachment ◽

Similarity Measures ◽

Telecommunication Networks ◽

Different Dimensions ◽

Low Dimensional ◽

Embedding Methods ◽

Node Embeddings ◽

Interacting Components

Networks have proved to be very helpful in modelling complex systems with interacting components. Problems across many domains involve systems that can be modelled as a network with links between interacting components. The problem of link prediction deals with predicting missing links in a given network. Its applications range across various disciplines, including biological networks, transportation networks, social networks, telecommunication networks, etc. In this paper, we use node embedding methods to encode the nodes into low-dimensional embeddings and predict links based on edge embeddings computed by taking the Hadamard product of the embeddings of the participating nodes. We further compare the accuracy of the models trained on embeddings of different dimensions. We also study how the accuracy changes when additional features are introduced alongside node embeddings of various dimensions. The additional features include overlapping measures such as Jaccard similarity, Adamic-Adar score and the dot product between node embeddings, as well as heuristic features, i.e. Common Neighbors, Resource Allocation, Preferential Attachment and friend tns score.
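The feature construction described in this abstract can be sketched as follows, using the textbook definitions of the Hadamard product and of the neighborhood-based scores. The toy graph, the vectors, and the function names are hypothetical, the friend tns score is omitted, and the classifier that would consume these features is not shown.

```python
import math


def hadamard_edge_embedding(u_vec, v_vec):
    """Edge embedding as the element-wise (Hadamard) product of two node vectors."""
    return [a * b for a, b in zip(u_vec, v_vec)]


def dot_product(u_vec, v_vec):
    return sum(a * b for a, b in zip(u_vec, v_vec))


def common_neighbors(g, u, v):
    return len(g[u] & g[v])


def jaccard(g, u, v):
    union = g[u] | g[v]
    return len(g[u] & g[v]) / len(union) if union else 0.0


def adamic_adar(g, u, v):
    # Shared neighbors of degree 1 are skipped because log(1) = 0.
    return sum(1.0 / math.log(len(g[w])) for w in g[u] & g[v] if len(g[w]) > 1)


def resource_allocation(g, u, v):
    return sum(1.0 / len(g[w]) for w in g[u] & g[v])


def preferential_attachment(g, u, v):
    return len(g[u]) * len(g[v])


def edge_features(g, emb, u, v):
    """Concatenate the edge embedding with the additional scalar features."""
    return hadamard_edge_embedding(emb[u], emb[v]) + [
        dot_product(emb[u], emb[v]),
        jaccard(g, u, v),
        adamic_adar(g, u, v),
        common_neighbors(g, u, v),
        resource_allocation(g, u, v),
        preferential_attachment(g, u, v),
    ]


# Hypothetical undirected toy graph (adjacency sets) and 2-dimensional vectors.
toy_graph = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3}}
toy_embeddings = {1: [1.0, 2.0], 4: [3.0, 4.0]}
print(edge_features(toy_graph, toy_embeddings, 1, 4))
```

The Hadamard part of the feature vector grows with the embedding dimension while the scalar features stay fixed in number, which is one reason the effect of the additional features may vary across dimensions.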

Typologization of security regulation

Scientific and Informational Bulletin of Ivano-Frankivsk University of Law Named after King Danylo Halytskyi ◽

10.33098/2078-6670.2020.9.21.20-27 ◽

2020 ◽

pp. 20-27

Author(s):

Denis Tikhomirov

Keyword(s):

Social Relations ◽

Scientific Literature ◽

Legal Regulation ◽

Procedural Understanding ◽

Limited Ability ◽

Methodological Basis ◽

Wide Range ◽

The Subject ◽

Definition Of ◽

Legal Understanding

The purpose of the article is to typologize terminological definitions of security, to identify what they have in common, and to identify the distinctive features of their interpretations depending on the subject of legal regulation. The methodological basis of the study consists of methods that made it possible to obtain valid conclusions, in particular: the method of comparison, which made it possible to correlate different interpretations of the term "security"; the method of hermeneutics, which made it possible to analyze the texts of normative legal acts of Ukraine; and the method of typologization, which made it possible to form typological groups of the variants of understanding of the term "security". Scientific novelty. The article analyzes the understanding of the term "security" in various regulatory acts in force in Ukraine. Typological groups of interpretations of the term "security" were identified. Conclusions. The analysis of the legal material confirms that issues of security fall within the scope of both legislative regulation and various specialized by-laws. However, today there is no single conception of how to interpret security terminology. This is due both to the wide range of social relations that are the subject of legal regulation and to the relativity of the notion of security itself and the lack of coherence of views on its definition in legal acts and in the scientific literature. The multiplicity of definitions is explained by combinations of material and procedural, and static and dynamic, understandings, and is conditioned by the peculiarities of a particular branch of legal regulation, the limited ability to use the methods of one branch or another, the inter-branch nature of some variations of security, etc. Separating what is common from what differs in the definitions of "security" can be used to further standardize the regulatory legal understanding of security and so implement legal regulation in the security field more effectively.

Geometry and Physics: Volume II

10.1093/oso/9780198802020.001.0001 ◽

2018 ◽

Keyword(s):

Moduli Spaces ◽

Mirror Symmetry ◽

Geometric Analysis ◽

Mathematical Physics ◽

Poisson Geometry ◽

Special Holonomy ◽

Low Dimensional Topology ◽

Generalized Complex Structures ◽

Wide Range ◽

Low Dimensional

These volumes contain the proceedings of the conference held at Aarhus, Oxford and Madrid in September 2016 to mark the seventieth birthday of Nigel Hitchin, one of the world’s foremost geometers and Savilian Professor of Geometry at Oxford. The proceedings contain twenty-nine articles, including three by Fields medallists (Donaldson, Mori and Yau). The articles cover a wide range of topics in geometry and mathematical physics, including the following: Riemannian geometry, geometric analysis, special holonomy, integrable systems, dynamical systems, generalized complex structures, symplectic and Poisson geometry, low-dimensional topology, algebraic geometry, moduli spaces, Higgs bundles, geometric Langlands programme, mirror symmetry and string theory. These volumes will be of interest to researchers and graduate students both in geometry and mathematical physics.

A Generative-Discriminative Framework that Integrates Imaging, Genetic, and Diagnosis into Coupled Low Dimensional Space

NeuroImage ◽

10.1016/j.neuroimage.2021.118200 ◽

2021 ◽

pp. 118200

Author(s):

Sayan Ghosal ◽

Qiang Chen ◽

Giulio Pergola ◽

Aaron L. Goldman ◽

William Ulrich ◽

...

Keyword(s):

Dimensional Space ◽

Low Dimensional
