Walk-modularity and community structure in networks

2015 ◽  
Vol 3 (3) ◽  
pp. 348-360 ◽  
Author(s):  
DAVID MEHRLE ◽  
AMY STROSSER ◽  
ANTHONY HARKIN

Abstract: Modularity maximization has been one of the most widely used approaches in the last decade for discovering community structure in networks of practical interest in biology, computing, social science, statistical mechanics, and more. Modularity is a quality function that measures the difference between the number of edges found within clusters and the number of edges one would statistically expect to find there under an equivalent random graph model. We explore a natural generalization of modularity based on the difference between the actual and expected number of walks within clusters, which we refer to as walk-modularity. Walk-modularity can be expressed in matrix form, and community detection can be performed by finding the leading eigenvector of the walk-modularity matrix. We demonstrate community detection on both synthetic and real-world networks and find that walk-modularity maximization returns significantly improved results compared to traditional modularity maximization.
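As an illustration of the leading-eigenvector idea mentioned in the abstract, the following is a minimal sketch for the traditional (edge-based) modularity matrix B = A - k kᵀ / (2m), not the walk-modularity matrix itself, whose exact form the abstract does not give. The function names and the toy graph (two triangles joined by one edge) are assumptions made for the example.

```python
import numpy as np

def modularity_matrix(A):
    """Edge-based modularity matrix B = A - k k^T / (2m) for an undirected graph."""
    k = A.sum(axis=1)          # degree of every node
    two_m = k.sum()            # twice the number of edges
    return A - np.outer(k, k) / two_m

def spectral_bisection(A):
    """Split the nodes in two by the signs of the leading eigenvector of B."""
    B = modularity_matrix(A)
    vals, vecs = np.linalg.eigh(B)            # B is symmetric
    lead = vecs[:, np.argmax(vals)]           # eigenvector of the largest eigenvalue
    s = np.where(lead >= 0, 1, -1)            # +1 / -1 community labels
    Q = s @ B @ s / (2 * A.sum())             # Q = s^T B s / (4m); A.sum() equals 2m
    return s, Q

# Toy graph (hypothetical): triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

labels, Q = spectral_bisection(A)
print(labels, Q)
```

On this toy graph the sign pattern separates the two triangles. A walk-based variant would, per the abstract, replace edge counts with walk counts in the same kind of matrix expression.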

2021 ◽  
pp. 1-26
Author(s):  
Bogumił Kamiński ◽  
Paweł Prałat ◽  
François Théberge

Abstract: Most of the current complex networks that are of interest to practitioners possess a certain community structure that plays an important role in understanding the properties of these networks. For instance, closely connected social communities exhibit a faster rate of information transmission than loosely connected communities. Moreover, many machine learning algorithms and tools that are developed for complex networks try to take advantage of the existence of communities to improve their performance or speed. As a result, there are many competing algorithms for detecting communities in large networks. Unfortunately, these algorithms are often quite sensitive and so they cannot be fine-tuned for a given but constantly changing real-world network at hand. It is therefore important to test these algorithms across various scenarios, which can only be done using synthetic graphs that have built-in community structure, power law degree distribution, and other typical properties observed in complex networks. The standard and extensively used method for generating artificial networks is the LFR graph generator. Unfortunately, this model has some scalability limitations and it is challenging to analyze it theoretically. Finally, the mixing parameter μ, the main parameter of the model guiding the strength of the communities, has a non-obvious interpretation and so can lead to unnaturally defined networks. In this paper, we provide an alternative random graph model with community structure and power law distribution for both degrees and community sizes, the Artificial Benchmark for Community Detection (ABCD graph). The model generates graphs with properties similar to those of the LFR model, and its main parameter ξ can be tuned to mimic its counterpart in the LFR model, the mixing parameter μ. We show that the new model solves the three issues identified above and more.
In particular, we test the speed of our algorithm and do a number of experiments comparing basic properties of both ABCD and LFR. The conclusion is that these models produce graphs with comparable properties but ABCD is fast, simple, and can be easily tuned to allow the user to make a smooth transition between the two extremes: pure (independent) communities and a random graph with no community structure.
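To make the role of a mixing parameter concrete, here is a minimal sketch that measures, for a given graph and partition, the fraction of edges that run between communities, both globally and per node. It assumes the common reading of the LFR parameter μ as the share of a node's edges that leave its community; it is not the ABCD or LFR generator, and the function names and toy graph are hypothetical.

```python
from collections import defaultdict

def global_mixing(edges, community):
    """Fraction of all edges whose endpoints lie in different communities."""
    external = sum(1 for u, v in edges if community[u] != community[v])
    return external / len(edges)

def per_node_mixing(edges, community):
    """For every node, the fraction of its edges that leave its community."""
    degree = defaultdict(int)
    outside = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community[u] != community[v]:
            outside[u] += 1
            outside[v] += 1
    return {node: outside[node] / degree[node] for node in degree}

# Toy graph (hypothetical): two triangles joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
community = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}

print(global_mixing(edges, community))
print(per_node_mixing(edges, community))
```

A value near 0 corresponds to the "pure communities" extreme described in the abstract, and larger values move toward a graph with no community structure.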


2019 ◽  
Vol 66 ◽  
pp. 443-472
Author(s):  
Carlos Ansótegui ◽  
Maria Luisa Bonet ◽  
Jesús Giráldez-Cru ◽  
Jordi Levy ◽  
Laurent Simon

Modern SAT solvers have made remarkable progress in solving industrial instances. It is believed that most of these successful techniques exploit the underlying structure of industrial instances. Recently, there have been some attempts to analyze the structure of industrial SAT instances in terms of complex networks, with the aim of explaining the success of SAT solving techniques, and possibly improving them. In this paper, we study the community structure, or modularity, of industrial SAT instances. In a graph with clear community structure, or high modularity, we can find a partition of its nodes into communities such that most edges connect variables of the same community. Representing SAT instances as graphs, we show that most application benchmarks are characterized by a high modularity. In contrast, random SAT instances are closer to the classical Erdős–Rényi random graph model, where no structure can be observed. We also analyze how this structure evolves during the execution of a CDCL SAT solver, and observe that new clauses learned by the solver during the search contribute to destroying the original structure of the formula. Motivated by this observation, we finally present an application that exploits the community structure to detect relevant learned clauses, and we show that detecting these clauses results in an improvement in the performance of the SAT solver. Empirically, we observe that this improves the performance of several SAT solvers on industrial SAT formulas, especially on satisfiable instances.
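One way to picture "representing SAT instances as graphs" is a variable incidence graph, in which two variables are linked when they occur in the same clause. The sketch below builds such a graph from DIMACS-style clauses and evaluates the modularity of a given partition. The abstract does not specify the graph construction, so the clause weighting used here (each clause spreads a total weight of 1 over its variable pairs) and all function names are assumptions.

```python
from collections import defaultdict
from itertools import combinations

def variable_incidence_graph(clauses):
    """Weighted graph on variables; each clause shares weight 1 among its variable pairs."""
    weights = defaultdict(float)
    for clause in clauses:
        variables = sorted({abs(lit) for lit in clause})   # ignore literal polarity
        if len(variables) < 2:
            continue                                        # unit clauses add no edge
        pairs = len(variables) * (len(variables) - 1) / 2
        for u, v in combinations(variables, 2):
            weights[(u, v)] += 1.0 / pairs
    return dict(weights)

def modularity(weights, community):
    """Newman modularity of a partition of a weighted undirected graph."""
    total = sum(weights.values())
    inside = defaultdict(float)     # weight of edges inside each community
    degree = defaultdict(float)     # summed weighted degree of each community
    for (u, v), w in weights.items():
        degree[community[u]] += w
        degree[community[v]] += w
        if community[u] == community[v]:
            inside[community[u]] += w
    return sum(inside[c] / total - (degree[c] / (2 * total)) ** 2 for c in degree)

# Toy formula (hypothetical): two groups of variables linked by one clause.
clauses = [[1, -2], [2, 3], [-1, 3], [4, 5], [-5, 6], [4, -6], [3, -4]]
graph = variable_incidence_graph(clauses)
partition = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1}
print(modularity(graph, partition))
```

Putting every variable in a single community gives modularity 0, which is the baseline against which a "high modularity" partition is judged.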


2014 ◽  
Vol 28 (05) ◽  
pp. 1450037 ◽  
Author(s):  
Hui-Jia Li ◽  
Bingying Xu ◽  
Liang Zheng ◽  
Jia Yan

Revealing ideal community structure efficiently is very important for scientists from many fields. However, it is difficult to infer an ideal community division from topology information alone, because social networks keep growing in size and complexity. Recent research on community detection shows that performance can be improved by incorporating node attribute information. Along this direction, this paper improves the Blondel–Guillaume–Lambiotte (BGL) method, which is a fast algorithm based on modularity maximization, by integrating the community attribute entropy. To achieve this goal, our algorithm minimizes the community attribute entropy by removing the boundary nodes which are generated in the modularity maximization at each iteration. In this way, the communities detected by our algorithm strike a balance between modularity maximization and community attribute entropy minimization. In addition, another merit of our algorithm is that it is free of parameters. Comprehensive experiments have been conducted on both artificial and real networks to compare the proposed community detection algorithm with several state-of-the-art ones. As the experimental results indicate, our algorithm demonstrates superior performance.
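The abstract does not define community attribute entropy, so the following is only a sketch under an assumed definition: the Shannon entropy of the attribute labels inside each community, averaged with weights proportional to community size. Function names and data are hypothetical.

```python
import math
from collections import Counter

def attribute_entropy(labels):
    """Shannon entropy (in bits) of a list of attribute labels."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def community_attribute_entropy(communities, attribute):
    """Size-weighted mean of the attribute entropy of each community (assumed definition)."""
    total = sum(len(members) for members in communities)
    return sum(
        len(members) / total * attribute_entropy([attribute[v] for v in members])
        for members in communities
    )

# Toy data (hypothetical): four nodes with a two-valued attribute.
attribute = {0: "x", 1: "x", 2: "y", 3: "y"}
aligned = [[0, 1], [2, 3]]   # communities agree with the attribute
mixed = [[0, 2], [1, 3]]     # every community mixes both attribute values

print(community_attribute_entropy(aligned, attribute))
print(community_attribute_entropy(mixed, attribute))
```

Under this definition, a partition whose communities are homogeneous in the attribute has entropy 0, so lowering the entropy pulls the partition toward attribute-consistent groups.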


Author(s):  
Mark Newman

A discussion of the most fundamental of network models, the configuration model, which is a random graph model of a network with a specified degree sequence. Following a definition of the model, a number of basic properties are derived, including the probability of an edge, the expected number of multiedges, the excess degree distribution, the friendship paradox, and the clustering coefficient. This is followed by derivations of some more advanced properties, including the condition for the existence of a giant component, the size of the giant component, the average size of a small component, and the expected diameter. Generating function methods for network models are also introduced and used to perform some more advanced calculations, such as the calculation of the distribution of the number of second neighbors of a node and the complete distribution of sizes of small components. The chapter ends with a brief discussion of extensions of the configuration model to directed networks, bipartite networks, networks with degree correlations, networks with high clustering, and networks with community structure, among other possibilities.
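A minimal sketch of the configuration model by stub matching may help here: every node receives as many "stubs" as its degree, and stubs are paired uniformly at random, which can produce self-loops and multiedges. The sketch also evaluates the friendship paradox through the mean degree of a neighbor, ⟨k²⟩/⟨k⟩, which is never smaller than the mean degree ⟨k⟩. Function names and the degree sequence are assumptions for illustration.

```python
import random
from collections import Counter

def configuration_model(degrees, seed=0):
    """Pair stubs uniformly at random; returns a list of edges (multigraph)."""
    if sum(degrees) % 2:
        raise ValueError("the degree sequence must have an even sum")
    stubs = [node for node, k in enumerate(degrees) for _ in range(k)]
    rng = random.Random(seed)
    rng.shuffle(stubs)
    # Consecutive stubs in the shuffled list form the edges.
    return list(zip(stubs[::2], stubs[1::2]))

def mean_neighbor_degree(degrees):
    """Expected degree of the node at the end of a random edge: <k^2> / <k>."""
    return sum(k * k for k in degrees) / sum(degrees)

degrees = [3, 3, 2, 2, 1, 1]          # hypothetical degree sequence
edges = configuration_model(degrees, seed=1)

realised = Counter()
for u, v in edges:                    # a self-loop contributes 2 to its node
    realised[u] += 1
    realised[v] += 1

print(edges)
print(sum(degrees) / len(degrees), mean_neighbor_degree(degrees))
```

For large networks the expected number of edges between nodes i and j in this model is approximately k_i k_j / (2m), where m is the number of edges.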


Author(s):  
Mark Newman

An introduction to the mathematics of the Poisson random graph, the simplest model of a random network. The chapter starts with a definition of the model, followed by derivations of basic properties like the mean degree, degree distribution, and clustering coefficient. This is followed by a detailed derivation of the large-scale structural properties of random graphs, including the position of the phase transition at which a giant component appears, the size of the giant component, the average size of the small components, and the expected diameter of the network. The chapter ends with a discussion of some of the shortcomings of the random graph model.
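The giant-component result can be illustrated numerically. For a Poisson random graph with mean degree c, the fraction S of nodes in the giant component satisfies the standard self-consistency equation S = 1 - exp(-cS), which has a nonzero solution only for c > 1. The sketch below solves it by fixed-point iteration; the function name and tolerances are assumptions.

```python
import math

def giant_component_fraction(c, tol=1e-12, max_iter=100000):
    """Solve S = 1 - exp(-c S) by fixed-point iteration, starting from S = 1."""
    s = 1.0
    for _ in range(max_iter):
        nxt = 1.0 - math.exp(-c * s)
        if abs(nxt - s) < tol:
            return nxt
        s = nxt
    return s  # may not have converged; happens only very close to c = 1

for c in (0.5, 1.5, 2.0, 4.0):
    print(c, giant_component_fraction(c))
```

Below the transition (c < 1) the iteration collapses to S = 0, while above it the iteration settles on the positive root. Convergence becomes slow as c approaches 1.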


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Vesa Kuikka

Abstract: We present methods for analysing hierarchical and overlapping community structure and spreading phenomena on complex networks. Different models can be developed for describing static connectivity or dynamical processes on a network topology. In this study, classical network connectivity and influence spreading models are used as examples of network models. Analysis of results is based on a probability matrix describing interactions between all pairs of nodes in the network. One popular research area has been detecting communities and their structure in complex networks. The community detection method of this study is based on optimising a quality function calculated from the probability matrix. The same method is proposed for detecting underlying groups of nodes that are building blocks of different sub-communities in the network structure. We present different quantitative measures for comparing and ranking solutions of the community detection algorithm. These measures describe properties of sub-communities: strength of a community, probability of formation and robustness of composition. The main contribution of this study is proposing a common methodology for analysing network structure and dynamics on complex networks. We illustrate the community detection methods with two small network topologies. In the case of network spreading models, time development of spreading in the network can be studied. Two different temporal spreading distributions demonstrate the methods with three real-world social networks of different sizes. The Poisson distribution describes a random response time and the e-mail forwarding distribution describes a process of receiving and forwarding messages.


Biology ◽  
2021 ◽  
Vol 10 (6) ◽  
pp. 499
Author(s):  
Ali Andalibi ◽  
Naoru Koizumi ◽  
Meng-Hao Li ◽  
Abu Bakkar Siddique

Kanagawa and Hokkaido were affected by COVID-19 in the early stage of the pandemic. Japan’s initial response included contact tracing and PCR analysis on anyone who was suspected of having been exposed to SARS-CoV-2. In this retrospective study, we analyzed publicly available COVID-19 registry data from Kanagawa and Hokkaido (n = 4392). Exponential random graph model (ERGM) network analysis was performed to examine demographic and symptomological homophilies. Age, symptomatic, and asymptomatic status homophilies were seen in both prefectures. Symptom homophilies suggest that nuanced genetic differences in the virus may affect its epithelial cell type range and can result in the diversity of symptoms seen in individuals infected by SARS-CoV-2. Environmental variables such as temperature and humidity may also play a role in the overall pathogenesis of the virus. A higher level of asymptomatic transmission was observed in Kanagawa. Moreover, patients who contracted the virus through secondary or tertiary contacts were shown to be asymptomatic more frequently than those who contracted it from primary cases. Additionally, most of the transmissions stopped at the primary and secondary levels. As expected, significant viral transmission was seen in healthcare settings.

