Artificial Benchmark for Community Detection (ABCD)—Fast random graph model with community structure

2021 ◽  
pp. 1-26
Author(s):  
Bogumił Kamiński ◽  
Paweł Prałat ◽  
François Théberge

Abstract Most complex networks of interest to practitioners possess a community structure that plays an important role in understanding their properties. For instance, closely connected social communities exhibit a faster rate of information transmission than loosely connected communities. Moreover, many machine learning algorithms and tools developed for complex networks try to take advantage of the existence of communities to improve their performance or speed. As a result, there are many competing algorithms for detecting communities in large networks. Unfortunately, these algorithms are often quite sensitive, so they cannot be fine-tuned for a given, but constantly changing, real-world network at hand. It is therefore important to test these algorithms in various scenarios, which can only be done using synthetic graphs that have built-in community structure, a power-law degree distribution, and other typical properties observed in complex networks. The standard and extensively used method for generating artificial networks is the LFR graph generator. Unfortunately, this model has some scalability limitations, and it is challenging to analyze theoretically. Finally, the mixing parameter μ, the main parameter of the model guiding the strength of the communities, has a non-obvious interpretation and so can lead to unnaturally defined networks. In this paper, we provide an alternative random graph model with community structure and power-law distributions for both degrees and community sizes, the Artificial Benchmark for Community Detection (ABCD graph). The model generates graphs with properties similar to those of the LFR model, and its main parameter ξ can be tuned to mimic its counterpart in the LFR model, the mixing parameter μ. We show that the new model solves the three issues identified above and more. In particular, we test the speed of our algorithm and run a number of experiments comparing basic properties of ABCD and LFR. The conclusion is that these models produce graphs with comparable properties, but ABCD is fast, simple, and can be easily tuned to allow the user to make a smooth transition between the two extremes: pure (independent) communities and a random graph with no community structure.

2001 ◽  
Vol 10 (1) ◽  
pp. 53-66 ◽  
Author(s):  
William Aiello ◽  
Fan Chung ◽  
Linyuan Lu

2013 ◽  
Vol 2013 ◽  
pp. 1-12 ◽  
Author(s):  
István Fazekas ◽  
Bettina Porvázsnyik

A random graph evolution mechanism is defined. The evolution studied is a combination of the preferential attachment model and the interaction of four vertices. The asymptotic behaviour of the graph is described. It is proved that the graph exhibits a power law degree distribution; in other words, it is scale-free. It turns out that any exponent in $(2, \infty)$ can be achieved. The proofs are based on martingale methods.


2019 ◽  
Vol 51 (2) ◽  
pp. 358-377 ◽  
Author(s):  
Tobias Müller ◽  
Merlijn Staps

Abstract We consider a random graph model that was recently proposed as a model for complex networks by Krioukov et al. (2010). In this model, nodes are chosen randomly inside a disk in the hyperbolic plane and two nodes are connected if they are at most a certain hyperbolic distance from each other. It has previously been shown that this model has various properties associated with complex networks, including a power-law degree distribution and a strictly positive clustering coefficient. The model is specified using three parameters: the number of nodes $N$, which we think of as going to infinity, and $\alpha, \nu > 0$, which we think of as constant. Roughly speaking, $\alpha$ controls the power-law exponent of the degree sequence and $\nu$ the average degree. Earlier work of Kiwi and Mitsche (2015) has shown that, when $\alpha < 1$ (which corresponds to the exponent of the power-law degree sequence being $< 3$), the diameter of the largest component is asymptotically almost surely (a.a.s.) at most polylogarithmic in $N$. Friedrich and Krohmer (2015) showed it was a.a.s. $\Omega(\log N)$ and improved the exponent of the polynomial in $\log N$ in the upper bound. Here we show the maximum diameter over all components is a.a.s. $O(\log N)$, thus giving a bound that is tight up to a multiplicative constant.
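The connection rule described in this abstract (two nodes are joined when their hyperbolic distance is at most some threshold) can be illustrated with a minimal Python sketch. It assumes points are given in native polar coordinates (r, θ) of the hyperbolic plane with curvature -1 and uses the hyperbolic law of cosines; the function names and the `radius` threshold are illustrative and do not come from the paper, and the sampling of node positions is left out.

```python
import math


def hyperbolic_distance(r1, theta1, r2, theta2):
    """Distance between two points given in polar coordinates (r, theta)
    of the hyperbolic plane with curvature -1 (hyperbolic law of cosines)."""
    cosh_d = (math.cosh(r1) * math.cosh(r2)
              - math.sinh(r1) * math.sinh(r2) * math.cos(theta1 - theta2))
    # Clamp to 1.0 so rounding errors never push acosh outside its domain.
    return math.acosh(max(1.0, cosh_d))


def connected(p, q, radius):
    """Hypothetical helper: p and q are (r, theta) pairs; they are joined
    when their hyperbolic distance is at most `radius`."""
    return hyperbolic_distance(p[0], p[1], q[0], q[1]) <= radius
```

Two points on the same ray are |r1 - r2| apart, while two points on opposite rays are r1 + r2 apart, so nodes close to the centre of the disk tend to connect to many others.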


2019 ◽  
Vol 66 ◽  
pp. 443-472
Author(s):  
Carlos Ansótegui ◽  
Maria Luisa Bonet ◽  
Jesús Giráldez-Cru ◽  
Jordi Levy ◽  
Laurent Simon

Modern SAT solvers have made remarkable progress in solving industrial instances. It is believed that most of these successful techniques exploit the underlying structure of industrial instances. Recently, there have been some attempts to analyze the structure of industrial SAT instances in terms of complex networks, with the aim of explaining the success of SAT solving techniques, and possibly improving them. In this paper, we study the community structure, or modularity, of industrial SAT instances. In a graph with clear community structure, or high modularity, we can find a partition of its nodes into communities such that most edges connect variables of the same community. Representing SAT instances as graphs, we show that most application benchmarks are characterized by high modularity. By contrast, random SAT instances are closer to the classical Erdős-Rényi random graph model, where no structure can be observed. We also analyze how this structure evolves under the execution of a CDCL SAT solver, and observe that new clauses learned by the solver during the search contribute to destroying the original structure of the formula. Motivated by this observation, we finally present an application that exploits the community structure to detect relevant learned clauses, and we show that detecting these clauses improves the performance of the SAT solver. Empirically, we observe that this improves the performance of several SAT solvers on industrial SAT formulas, especially on satisfiable instances.


2020 ◽  
Author(s):  
Shalin Shah

Consumer behavior in retail stores gives rise to product graphs based on co-purchasing or co-viewing behavior. These product graphs can be analyzed using the known methods of graph analysis. In this paper, we analyze the product graph at Target Corporation based on the Erdős-Rényi random graph model. In particular, we compute clustering coefficients of actual and random graphs, and we find that the clustering coefficients of actual graphs are much higher than those of random graphs. We conduct the analysis on the entire set of products and also on a per-category basis and find interesting results. We also compute the degree distribution and we find that the degree distribution is a power law, as expected from real-world networks, in contrast with the ER random graph.
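The comparison described here, clustering coefficients of an observed graph against an Erdős-Rényi graph, can be sketched in plain Python. This is a minimal illustration and not the author's pipeline: it assumes the average local clustering coefficient is the quantity of interest and that the graph is simple and undirected, and the helper names (`from_edges`, `average_clustering`, `erdos_renyi`) as well as the toy graph are made up for the example.

```python
import random
from itertools import combinations


def from_edges(edges):
    """Build an adjacency dict {node: set of neighbours} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def average_clustering(adj):
    """Average local clustering coefficient; nodes of degree < 2 count as 0."""
    if not adj:
        return 0.0
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count the edges that exist among the neighbours of this node.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def erdos_renyi(n, p, seed=None):
    """Sample G(n, p): each of the n*(n-1)/2 possible edges appears
    independently with probability p."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj


# Toy "observed" graph: two triangles joined by a single edge.
toy = from_edges([(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)])
print(average_clustering(toy))
print(average_clustering(erdos_renyi(200, 0.05, seed=1)))
```

In an Erdős-Rényi graph the expected local clustering coefficient is roughly the edge probability p, so a sparse random graph clusters far less than a graph built from tight groups of related items.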


2015 ◽  
Vol 3 (3) ◽  
pp. 348-360 ◽  
Author(s):  
DAVID MEHRLE ◽  
AMY STROSSER ◽  
ANTHONY HARKIN

Abstract Modularity maximization has been one of the most widely used approaches in the last decade for discovering community structure in networks of practical interest in biology, computing, social science, statistical mechanics, and more. Modularity is a quality function that measures the difference between the number of edges found within clusters and the number of edges one would statistically expect to find based on some equivalent random graph model. We explore a natural generalization of modularity based on the difference between the actual and expected number of walks within clusters, which we refer to as walk-modularity. Walk-modularity can be expressed in matrix form, and community detection can be performed by finding the leading eigenvector of the walk-modularity matrix. We demonstrate community detection on both synthetic and real-world networks and find that walk-modularity maximization returns significantly improved results compared to traditional modularity maximization.
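The traditional modularity mentioned in this abstract can be sketched for a given partition as follows. The sketch assumes the usual Newman-Girvan form, in which the expected number of within-cluster edges comes from a random graph with the same degree sequence, and a simple undirected graph whose edges are each listed once; it does not implement the walk-modularity generalization, and the function and variable names are illustrative.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity of a partition.

    edges: list of (u, v) pairs of a simple undirected graph, each edge once.
    community: dict mapping every node to its community label.
    Q = sum over communities c of  m_c / m - (d_c / (2 m))^2,
    where m_c is the number of edges inside c and d_c the total degree of c.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)    # m_c: edges with both ends in c
    degree_sum = defaultdict(int)  # d_c: sum of degrees of the nodes in c
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Two triangles joined by one edge, split into their natural communities.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(modularity(edges, split))  # 5/14, about 0.357
```

Placing all nodes in a single community gives a modularity of 0, which is why the score rewards partitions that keep more edges inside clusters than the random baseline would predict.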

