On Investigating Both Effectiveness and Efficiency of Embedding Methods in Task of Similarity Computation of Nodes in Graphs

2020 ◽  
Vol 11 (1) ◽  
pp. 162
Author(s):  
Masoud Reyhani Hamedani ◽  
Sang-Wook Kim

One of the important tasks in a graph is to compute the similarity between two nodes; link-based similarity measures (in short, similarity measures) are well-known and conventional techniques for this task that exploit the relations between nodes (i.e., links) in the graph. Graph embedding methods (in short, embedding methods) convert nodes in a graph into vectors in a low-dimensional space by preserving social relations among nodes in the original graph. Instead of applying a similarity measure to the graph to compute the similarity between nodes a and b, we can consider the proximity between corresponding vectors of a and b obtained by an embedding method as the similarity between a and b. Although embedding methods have been analyzed in a wide range of machine learning tasks such as link prediction and node classification, they have not been investigated in terms of similarity computation of nodes. In this paper, we investigate both the effectiveness and efficiency of embedding methods in the task of similarity computation of nodes by comparing them with those of similarity measures. To the best of our knowledge, this is the first work that examines the application of embedding methods in this special task. Based on the results of our extensive experiments with five well-known and publicly available datasets, we made the following observations for embedding methods: (1) they are less effective than similarity measures on all datasets except one, (2) they are less efficient than similarity measures on all datasets except one, (3) they have more parameters than similarity measures, which leads to a time-consuming parameter tuning process, and (4) increasing the number of dimensions does not necessarily improve their effectiveness in computing the similarity of nodes.
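The idea of treating the proximity of two embedding vectors as the similarity of the corresponding nodes can be illustrated with a minimal sketch. Cosine similarity is assumed here as the proximity function; the abstract does not say which one the authors use. The three-dimensional vectors and the helper name `most_similar` are hypothetical and stand in for the output of a real embedding method.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch when a vector is all zeros
    return dot / (norm_u * norm_v)

# Hypothetical embeddings; in practice these come from an embedding method.
embeddings = {
    "a": [1.0, 0.0, 1.0],
    "b": [1.0, 0.0, 1.0],
    "c": [0.0, 1.0, 0.0],
}

def most_similar(node, embeddings, k=1):
    """Rank all other nodes by cosine similarity to `node` and keep the top k."""
    scores = [
        (other, cosine_similarity(embeddings[node], vec))
        for other, vec in embeddings.items()
        if other != node
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]
```

Calling `most_similar("a", embeddings)` ranks node b ahead of node c, because the vectors of a and b coincide while those of a and c are orthogonal.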

Author(s):  
Yuanfu Lu ◽  
Chuan Shi ◽  
Linmei Hu ◽  
Zhiyuan Liu

Heterogeneous information network (HIN) embedding aims to embed multiple types of nodes into a low-dimensional space. Although most existing HIN embedding methods consider heterogeneous relations in HINs, they usually employ one single model for all relations without distinction, which inevitably restricts the capability of network embedding. In this paper, we take the structural characteristics of heterogeneous relations into consideration and propose a novel Relation structure-aware Heterogeneous Information Network Embedding model (RHINE). By exploring real-world networks with thorough mathematical analysis, we present two structure-related measures which can consistently distinguish heterogeneous relations into two categories: Affiliation Relations (ARs) and Interaction Relations (IRs). To respect the distinctive characteristics of relations, in our RHINE, we propose different models specifically tailored to handle ARs and IRs, which can better capture the structures and semantics of the networks. Finally, we combine and optimize these models in a unified and elegant manner. Extensive experiments on three real-world datasets demonstrate that our model significantly outperforms the state-of-the-art methods in various tasks, including node clustering, link prediction, and node classification.


2014 ◽  
Vol 490-491 ◽  
pp. 1293-1297
Author(s):  
Liang Zhu ◽  
Fei Fei Liu ◽  
Wu Chen ◽  
Qing Ma

Top-N queries are employed in a wide range of applications to obtain a ranked list of data objects that have the highest aggregate scores over certain attributes. The threshold algorithm (TA) is an important method in many scenarios. However, TA is effective only when the ranking function is monotone and the query point is fixed. In this paper, we propose an approach that alleviates the limitations of TA-like methods for processing top-N queries. Based on p-norm distances as ranking functions, our methods utilize the fundamental principle of Functional Analysis so that the candidate tuples of a top-N query with a p-norm distance can be obtained by the Maximum distance. We conduct extensive experiments to prove the effectiveness and efficiency of our method for both low-dimensional (2, 3 and 4) and high-dimensional (25, 50 and 104) data.
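One way to read the relation between p-norm distances and the maximum distance is through the standard inequality that the maximum (Chebyshev) distance never exceeds the p-norm distance for p >= 1. Every tuple within p-norm distance r of the query therefore lies inside the axis-aligned cube of half-width r around it. The sketch below only illustrates that containment with a brute-force scan; it is not the authors' algorithm, and the function names and sample points are hypothetical.

```python
def p_norm_distance(x, q, p):
    """Minkowski (p-norm) distance between tuple x and query point q."""
    return sum(abs(a - b) ** p for a, b in zip(x, q)) ** (1.0 / p)

def max_distance(x, q):
    """Maximum (Chebyshev) distance between tuple x and query point q."""
    return max(abs(a - b) for a, b in zip(x, q))

def top_n(points, q, n, p):
    """Brute-force top-N: the n points closest to q under the p-norm distance."""
    return sorted(points, key=lambda x: p_norm_distance(x, q, p))[:n]

def cube_candidates(points, q, r):
    """Points inside the cube of half-width r around q (max distance <= r)."""
    return [x for x in points if max_distance(x, q) <= r]

# Hypothetical two-dimensional data and query point.
points = [(0.0, 0.0), (1.0, 1.0), (3.0, 0.5), (2.0, 2.0), (5.0, 5.0)]
query = (1.0, 0.0)

result = top_n(points, query, n=2, p=2)
radius = p_norm_distance(result[-1], query, p=2)   # distance of the N-th answer
candidates = cube_candidates(points, query, radius)
```

Because the cube test uses only per-attribute range conditions, a database could evaluate it with ordinary range predicates and then rank the surviving candidates by the p-norm distance. Whether the paper proceeds exactly this way is not stated in the abstract.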


2021 ◽  
Vol 12 (4) ◽  
pp. 1-25
Author(s):  
Daheng Wang ◽  
Qingkai Zeng ◽  
Nitesh V. Chawla ◽  
Meng Jiang

People are looking for complementary contexts, such as team members of complementary skills for project team building and/or reading materials of complementary knowledge for effective student learning, to make their behaviors more likely to be successful. Complementarity has been revealed by behavioral sciences as one of the most important factors in decision making. Existing computational models that learn low-dimensional context representations from behavior data have poor scalability, and recent network embedding methods only focus on preserving the similarity between the contexts. In this work, we formulate a behavior entry as a set of context items and propose a novel representation learning method, Multi-type Itemset Embedding, to learn the context representations preserving the itemset structures. We propose a measurement of complementarity between context items in the embedding space. Experiments demonstrate both the effectiveness and efficiency of the proposed method over the state-of-the-art methods on behavior prediction and context recommendation. We discover that complementary contexts and similar contexts are significantly different in human behaviors.


2020 ◽  
Vol 26 (4) ◽  
pp. 434-453
Author(s):  
Milan Sečujski ◽  
Darko Pekar ◽  
Siniša Suzić ◽  
Anton Smirnov ◽  
Tijana Nosek

The paper presents a novel architecture and method for training neural networks to produce synthesized speech in a particular voice and speaking style, based on a small quantity of target speaker/style training data. The method is based on neural network embedding, i.e. mapping of discrete variables into continuous vectors in a low-dimensional space, which has been shown to be a very successful universal deep learning technique. In this particular case, different speaker/style combinations are mapped into different points in a low-dimensional space, which enables the network to capture the similarities and differences between speakers and speaking styles more efficiently. The initial model from which speaker/style adaptation was carried out was a multi-speaker/multi-style model based on 8.5 hours of American English speech data, which corresponds to 16 different speaker/style combinations. The results of the experiments show that both versions of the obtained system, one using 10 minutes and the other as little as 30 seconds of target data, outperform the state of the art in parametric speaker/style-dependent speech synthesis. This opens a wide range of applications of speaker/style-dependent speech synthesis based on small quantities of training data, in domains ranging from customer interaction in call centers to robot-assisted medical therapy.


2018 ◽  
Author(s):  
Qiwen Hu ◽  
Casey S. Greene

Single-cell RNA sequencing (scRNA-seq) is a powerful tool to profile the transcriptomes of a large number of individual cells at a high resolution. These data usually contain measurements of gene expression for many genes in thousands or tens of thousands of cells, though some datasets now reach the million-cell mark. Projecting high-dimensional scRNA-seq data into a low-dimensional space aids downstream analysis and data visualization. Many recent preprints accomplish this using variational autoencoders (VAE), generative models that learn the underlying structure of data by compressing it into a constrained, low-dimensional space. The low-dimensional spaces generated by VAEs have revealed complex patterns and novel biological signals from large-scale gene expression data and drug response predictions. Here, we evaluate a simple VAE approach for gene expression data, Tybalt, by training and measuring its performance on sets of simulated scRNA-seq data. We find a number of counter-intuitive performance features: i.e., deeper neural networks can struggle when datasets contain more observations under some parameter configurations. We show that these methods are highly sensitive to parameter tuning: when tuned, the Tybalt model, which was not optimized for scRNA-seq data, outperforms other popular dimension reduction approaches: PCA, ZIFA, UMAP and t-SNE. On the other hand, without tuning, performance can also be remarkably poor on the same data. Our results should discourage authors and reviewers from relying on self-reported performance comparisons to evaluate the relative value of contributions in this area at this time.
Instead, we recommend that attempts to compare or benchmark autoencoder methods for scRNA-seq data be performed by disinterested third parties or by methods developers only on unseen benchmark data that are provided to all participants simultaneously because the potential for performance differences due to unequal parameter tuning is so high.


2019 ◽  
Author(s):  
Benjamin Balas ◽  
Josselyn Thrash

Observers estimate a range of social characteristics from images of human faces. An important unifying framework for these judgments is the observation that a low-dimensional social face-space based on perceived valence and dominance captures most of the variance across a wide range of social evaluation judgments. However, it is not known whether or not this low-dimensional space can be used to infer the outcome of new social judgments. Further, the extent to which such social inference may differ across real and computer-generated faces is also unknown. We address both of these issues by recovering valence/dominance axes from social judgments made to real and artificial faces, and then attempting to use these coordinates to predict independent social judgment data obtained from new human observers. We find that above-chance performance can be achieved, though performance appears to be better with artificial faces than real ones.


Networks have proved to be very helpful in modelling complex systems with interacting components. Problems across many domains involve systems that can be modelled as a network with links between interacting components. The problem of link prediction deals with predicting missing links in a given network. The applications of link prediction range across various disciplines, including biological networks, transportation networks, social networks, telecommunication networks, etc. In this paper, we use node embedding methods to encode the nodes into low-dimensional embeddings and predict links based on the edge embeddings computed by taking the Hadamard product of the embeddings of the participating nodes. We further compare the accuracy of the models trained on different dimensions of embeddings. We also study how the accuracy changes when additional features are introduced at various dimensions of node embeddings. The additional features include overlapping measures such as Jaccard similarity, Adamic-Adar score and dot product between node embeddings, as well as heuristic features, i.e. Common Neighbors, Resource Allocation, preferential attachment and friend tns score.
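The feature construction described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the adjacency sets, the two-dimensional embeddings and the function names are hypothetical, the heuristics use their textbook definitions on an undirected graph, and the friend tns score is omitted because the abstract does not define it.

```python
import math

def hadamard(u, v):
    """Edge embedding as the element-wise (Hadamard) product of two node vectors."""
    return [a * b for a, b in zip(u, v)]

def dot_product(u, v):
    return sum(a * b for a, b in zip(u, v))

def common_neighbors(adj, u, v):
    return len(adj[u] & adj[v])

def jaccard(adj, u, v):
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0

def adamic_adar(adj, u, v):
    # Shared neighbours of degree 1 are skipped to avoid dividing by log(1) = 0.
    return sum(1.0 / math.log(len(adj[w])) for w in adj[u] & adj[v] if len(adj[w]) > 1)

def resource_allocation(adj, u, v):
    return sum(1.0 / len(adj[w]) for w in adj[u] & adj[v])

def preferential_attachment(adj, u, v):
    return len(adj[u]) * len(adj[v])

def edge_features(adj, emb, u, v):
    """Concatenate the Hadamard edge embedding with the additional features."""
    return hadamard(emb[u], emb[v]) + [
        dot_product(emb[u], emb[v]),
        jaccard(adj, u, v),
        adamic_adar(adj, u, v),
        common_neighbors(adj, u, v),
        resource_allocation(adj, u, v),
        preferential_attachment(adj, u, v),
    ]

# Hypothetical undirected graph with edges a-c, a-d, b-c, b-d, d-e.
adj = {
    "a": {"c", "d"},
    "b": {"c", "d"},
    "c": {"a", "b"},
    "d": {"a", "b", "e"},
    "e": {"d"},
}
# Hypothetical two-dimensional node embeddings.
emb = {"a": [0.5, -1.0], "b": [2.0, 0.5]}

features_ab = edge_features(adj, emb, "a", "b")
```

A feature vector like `features_ab` would be computed for each candidate node pair and passed to a binary classifier that predicts whether the link exists. The choice of classifier is outside what the abstract specifies.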


Author(s):  
Denis Tikhomirov

The purpose of the article is to typologize terminological definitions of security, to identify what they have in common, and to identify the distinctive features of their interpretation depending on the subject of legal regulation. The methodological basis of the study comprises the methods that made it possible to obtain valid conclusions, in particular: the method of comparison, which made it possible to correlate different interpretations of the term "security"; the method of hermeneutics, which made it possible to examine the texts of normative legal acts of Ukraine; and the method of typologization, which made it possible to create typological groups of the variants of understanding of the term "security". Scientific novelty. The article analyzes the understanding of the term "security" in various regulatory acts in force in Ukraine. Typological groups of understandings of the term "security" were identified. Conclusions. The analysis of the legal material confirms that security issues fall within the scope of both legislative regulation and various specialized by-laws. However, today there is no single conception of how to interpret security terminology. This is due both to the wide range of social relations that are the subject of legal regulation and to the relativity of the notion of security itself and the lack of coherence of views on its definition in legal acts and in the scientific literature. The multiplicity of definitions is explained by combinations of material and procedural, and static and dynamic, understandings, and is conditioned by the peculiarities of a particular branch of legal regulation, the limited ability to use the methods of one branch or another, the inter-branch nature of some variations of security, etc. Separating what is common from what is different in the definitions of "security" can be used to further standardize the regulatory legal understanding of security in order to implement the legal regulation of the security field more effectively.


These volumes contain the proceedings of the conference held at Aarhus, Oxford and Madrid in September 2016 to mark the seventieth birthday of Nigel Hitchin, one of the world’s foremost geometers and Savilian Professor of Geometry at Oxford. The proceedings contain twenty-nine articles, including three by Fields medallists (Donaldson, Mori and Yau). The articles cover a wide range of topics in geometry and mathematical physics, including the following: Riemannian geometry, geometric analysis, special holonomy, integrable systems, dynamical systems, generalized complex structures, symplectic and Poisson geometry, low-dimensional topology, algebraic geometry, moduli spaces, Higgs bundles, geometric Langlands programme, mirror symmetry and string theory. These volumes will be of interest to researchers and graduate students both in geometry and mathematical physics.


NeuroImage ◽  
2021 ◽  
pp. 118200
Author(s):  
Sayan Ghosal ◽  
Qiang Chen ◽  
Giulio Pergola ◽  
Aaron L. Goldman ◽  
William Ulrich ◽  
...  
