On the estimation of latent distances using graph distances

2021 ◽  
Vol 15 (1) ◽  
pp. 722-747
Author(s):  
Ery Arias-Castro ◽  
Antoine Channarond ◽  
Bruno Pelletier ◽  
Nicolas Verzelen
PLoS ONE ◽  
2021 ◽  
Vol 16 (8) ◽  
pp. e0255067
Author(s):  
Annamaria Ficara ◽  
Lucia Cavallaro ◽  
Francesco Curreri ◽  
Giacomo Fiumara ◽  
Pasquale De Meo ◽  
...  

Data collected in criminal investigations may suffer from issues such as: (i) incompleteness, due to the covert nature of criminal organizations; (ii) incorrectness, caused by either unintentional data collection errors or intentional deception by criminals; (iii) inconsistency, when the same information is entered into law enforcement databases multiple times or in different formats. In this paper we analyze nine real criminal networks of different kinds (i.e., Mafia networks, criminal street gangs and terrorist organizations) in order to quantify the impact of incomplete data and to determine which network type is most affected by it. The networks are first pruned using two methods: (i) random edge removal, simulating the scenario in which law enforcement agencies fail to intercept some calls or to spot sporadic meetings among suspects; (ii) node removal, modeling the situation in which some suspects cannot be intercepted or investigated. Finally, we compute spectral distances (i.e., Adjacency, Laplacian and normalized Laplacian Spectral Distances) and matrix distances (i.e., Root Euclidean Distance) between the complete and pruned networks, which we compare using statistical analysis. Our investigation yields two main findings: first, the overall understanding of the criminal networks remains high even with incomplete data on criminal interactions (i.e., when 10% of edges are removed); second, leaving even a small fraction of suspects uninvestigated (i.e., when 2% of nodes are removed) may lead to significant misinterpretation of the overall network.
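
The edge-pruning and spectral-comparison steps described in this abstract can be illustrated with a small, dependency-free sketch. It is not the authors' code. It assumes the adjacency spectral distance is the Euclidean distance between the sorted adjacency eigenvalues of two graphs on the same node set, and it uses a plain cyclic Jacobi iteration for the eigenvalues only to avoid third-party imports. All function names are hypothetical.

```python
import math
import random


def adjacency_matrix(n, edges):
    """Dense 0/1 adjacency matrix of an undirected graph on nodes 0..n-1."""
    a = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        a[u][v] = a[v][u] = 1.0
    return a


def symmetric_eigenvalues(matrix, max_sweeps=100, tol=1e-20):
    """Sorted eigenvalues of a real symmetric matrix (cyclic Jacobi rotations)."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    for _ in range(max_sweeps):
        # Stop once the squared off-diagonal mass is negligible.
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes the (p, q) entry.
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


def adjacency_spectral_distance(n, edges_a, edges_b):
    """Assumed definition: Euclidean distance between sorted adjacency spectra."""
    spec_a = symmetric_eigenvalues(adjacency_matrix(n, edges_a))
    spec_b = symmetric_eigenvalues(adjacency_matrix(n, edges_b))
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(spec_a, spec_b)))


def remove_random_edges(edges, fraction, seed=0):
    """Drop a given fraction of edges uniformly at random (edge pruning)."""
    rng = random.Random(seed)
    n_drop = round(fraction * len(edges))
    dropped = set(rng.sample(range(len(edges)), n_drop))
    return [e for i, e in enumerate(edges) if i not in dropped]
```

For instance, `adjacency_spectral_distance(n, edges, remove_random_edges(edges, 0.10))` would compare a network with a copy that lost 10% of its edges. Node removal changes the matrix size and needs an additional convention (such as padding the shorter spectrum), which this sketch does not attempt.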


Author(s):  
Simón Lunagómez ◽  
Sofia C. Olhede ◽  
Patrick J. Wolfe

2021 ◽  
Author(s):  
Tobias Wängberg ◽  
Chun-Biu Li ◽  
Joanna Tyrcha

The t-distributed Stochastic Neighbour Embedding (t-SNE) method has emerged as one of the leading methods for visualising High Dimensional (HD) data in a wide variety of fields, especially for revealing cluster structure in HD single cell transcriptomics data. However, several shortcomings of the algorithm have been identified. Specifically, t-SNE is often unable to correctly represent hierarchical relationships between clusters, and spurious patterns may arise in the embedding due to incorrect parameter settings, which could lead to misinterpretations of the data. Here we combine t-SNE with shape-aware graph distances, a method termed shape-aware stochastic neighbour embedding (SASNE), to mitigate these limitations of t-SNE. The merits of SASNE are first demonstrated using synthetic data sets, where we see a significant improvement in embedding imbalanced and nonlinear clusters, as well as preservation of hierarchical structure, based on quantitative validation in clustering and dimensionality reduction. Moreover, we propose a data-driven parameter setting which we find consistently optimal in all test cases. Lastly, we demonstrate the superior performance of SASNE in embedding the MNIST image data and the single cell transcriptomics gene expression data.
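
To make the notion of a graph distance concrete, the sketch below builds a symmetrised k-nearest-neighbour graph over a point set and computes shortest-path distances on it with Dijkstra's algorithm. This is a generic geodesic-style distance, not the shape-aware distance defined for SASNE, whose construction the abstract does not give. The function names and the parameter `k` are illustrative assumptions.

```python
import heapq
import math


def knn_graph(points, k):
    """Symmetrised k-nearest-neighbour graph with Euclidean edge weights."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        nearest = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in nearest[:k]:
            graph[i][j] = d
            graph[j][i] = d  # keep the graph undirected
    return graph


def graph_distances(graph, source):
    """Shortest-path distance from `source` to every node (Dijkstra)."""
    dist = {v: math.inf for v in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Points sampled along a U shape: the two ends are close in Euclidean
# terms but far apart when distances must follow the data.
u_shape = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (3, 2), (3, 1), (3, 0)]
from_first = graph_distances(knn_graph(u_shape, k=2), source=0)
```

Here `from_first[7]` follows the U and is therefore larger than the straight-line distance between the two end points. Replacing Euclidean distances with graph distances of this kind is one way an embedding method can respect nonlinear cluster shapes.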


Algorithmica ◽  
2020 ◽  
Vol 82 (8) ◽  
pp. 2292-2315
Author(s):  
Karl Bringmann ◽  
Thore Husfeldt ◽  
Måns Magnusson

Author(s):  
Armin Moharrer ◽  
Jasmin Gao ◽  
Shikun Wang ◽  
Jose Bento ◽  
Stratis Ioannidis

2019 ◽  
Vol 286 (1902) ◽  
pp. 20182727 ◽  
Author(s):  
Arjun Chandrasekhar ◽  
Saket Navlakha

Neural arbors (dendrites and axons) can be viewed as graphs connecting the cell body of a neuron to various pre- and post-synaptic partners. Several constraints have been proposed on the topology of these graphs, such as minimizing the amount of wire needed to construct the arbor (wiring cost), and minimizing the graph distances between the cell body and synaptic partners (conduction delay). These two objectives compete with each other: optimizing one results in poorer performance on the other. Here, we describe how well neural arbors resolve this network design trade-off using the theory of Pareto optimality. We develop an algorithm to generate arbors that near-optimally balance these two objectives, and demonstrate that this algorithm improves over previous algorithms. We then use this algorithm to study how close neural arbors are to being Pareto optimal. Analysing 14 145 arbors across numerous brain regions, species and cell types, we find that neural arbors are much closer to being Pareto optimal than would be expected by chance and other reasonable baselines. We also investigate how the location of the arbor on the Pareto front, and the distance from the arbor to the Pareto front, can be used to distinguish between some arbor types (e.g. axons versus dendrites, or different cell types), highlighting a new potential connection between arbor structure and function. Finally, using this framework, we find that another biological branching structure, plant shoot architectures used to collect and distribute nutrients, is also Pareto optimal, suggesting shared principles of network design between two systems separated by millions of years of evolution.
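
The trade-off described in this abstract can be made concrete with a toy sketch. It assumes wiring cost is the total Euclidean edge length of a rooted tree and conduction delay is the sum of path lengths from the root to every other node. It then filters a handful of hand-written candidate trees down to the non-dominated ones. This is only an illustration of Pareto dominance, not the arbor-generation algorithm of the paper, and all names are hypothetical.

```python
import math


def path_length_to_root(points, parent, node):
    """Length of the tree path from `node` up to the root."""
    length = 0.0
    while node in parent:  # the root has no parent entry
        length += math.dist(points[node], points[parent[node]])
        node = parent[node]
    return length


def wiring_cost(points, parent):
    """Assumed objective 1: total length of all tree edges."""
    return sum(math.dist(points[c], points[p]) for c, p in parent.items())


def conduction_delay(points, parent):
    """Assumed objective 2: summed root-to-node path lengths."""
    return sum(path_length_to_root(points, parent, node) for node in parent)


def dominates(a, b):
    """True if score `a` is no worse than `b` on both objectives and better on one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def pareto_front(scores):
    """Names of candidates whose (cost, delay) score is not dominated."""
    return {
        name
        for name, score in scores.items()
        if not any(dominates(other, score)
                   for other_name, other in scores.items() if other_name != name)
    }


# Root (node 0) and two targets; each tree maps child -> parent.
points = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.5)}
trees = {
    "star": {1: 0, 2: 0},    # direct wires: low delay, more wire
    "chain": {1: 0, 2: 1},   # shared wire: less wire, more delay
    "detour": {2: 0, 1: 2},  # worse on both objectives
}
scores = {
    name: (wiring_cost(points, tree), conduction_delay(points, tree))
    for name, tree in trees.items()
}
front = pareto_front(scores)
```

In this example the star and the chain both sit on the Pareto front because neither beats the other on both objectives, while the detour tree is dominated. Measuring how far an observed arbor lies from such a front is the kind of comparison the abstract refers to.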


2021 ◽  
Author(s):  
Annamaria Ficara ◽  
Francesco Curreri ◽  
Lucia Cavallaro ◽  
Pasquale De Meo ◽  
Giacomo Fiumara ◽  
...  
