Generating bipartite networks with a prescribed joint degree distribution

Asma Azizi Boroojeni; Jeremy Dewar; Tong Wu; James M Hyman

doi:10.1093/comnet/cnx014

Journal of Complex Networks ◽

10.1093/comnet/cnx014 ◽

2017 ◽

Vol 5 (6) ◽

pp. 839-857 ◽

Cited By ~ 1

Author(s):

Asma Azizi Boroojeni ◽

Jeremy Dewar ◽

Tong Wu ◽

James M Hyman

Keyword(s):

Degree Distribution ◽

Bipartite Network ◽

Degree Sequences ◽

Bipartite Networks ◽

Original Network ◽

Software Environment ◽

Target Network ◽

Disjoint Sets ◽

New Algorithms ◽

Partnership Networks

Abstract. We describe a class of new algorithms to construct bipartite networks that preserve a prescribed degree and joint-degree (degree–degree) distribution of the nodes. Bipartite networks are graphs that can represent real-world interactions between two disjoint sets, such as actor–movie networks, author–article networks, co-occurrence networks and heterosexual partnership networks. Often there is a strong correlation between the degree of a node and the degrees of the neighbours of that node, and this correlation must be preserved when generating a network that reflects the structure of the underlying system. Our bipartite $2K$ ($B2K$) algorithms generate an ensemble of networks that preserve the prescribed degree sequences for the two disjoint sets of nodes in the bipartite network, and the joint-degree distribution, that is, the distribution of the degrees of all neighbours of nodes with the same degree. We illustrate the effectiveness of the algorithms on a romance network, using the NetworkX software environment to compare other properties of a target network that are not directly enforced by the $B2K$ algorithms. We observe that when the average degree of nodes is low, as is the case for romance and heterosexual partnership networks, the $B2K$ networks tend to preserve additional properties, such as the clustering coefficients, better than algorithms that do not preserve the joint-degree distribution of the original network.
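To make the preserved quantities concrete, the following minimal sketch counts the degree sequences and the joint-degree counts of a small bipartite edge list. It is an illustration only, not the $B2K$ construction itself; the function name `bipartite_joint_degree` and the toy actor–movie data are assumptions introduced here, and the code uses only the Python standard library rather than NetworkX.

```python
from collections import Counter

def bipartite_joint_degree(edges):
    """Return (deg_u, deg_v, jdm) for a bipartite edge list of (u, v) pairs.

    The first element of each pair is taken to belong to one node set and
    the second element to the other. jdm[(j, k)] is the number of edges
    that join a degree-j node of the first set to a degree-k node of the
    second set.
    """
    edges = set(edges)  # ignore repeated edges
    deg_u = Counter(u for u, _ in edges)
    deg_v = Counter(v for _, v in edges)
    jdm = Counter((deg_u[u], deg_v[v]) for u, v in edges)
    return deg_u, deg_v, jdm

# Hypothetical toy data: three actors, two movies.
edges = [("a1", "m1"), ("a1", "m2"), ("a2", "m2"), ("a3", "m2")]
deg_u, deg_v, jdm = bipartite_joint_degree(edges)
print(dict(jdm))
```

A generator that preserves the joint-degree distribution would have to reproduce every entry of `jdm`, not only the two degree sequences.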

ON THE CONDITIONAL MATCHING OF FRACTAL NETWORKS

Fractals ◽

10.1142/s0218348x16500547 ◽

2016 ◽

Vol 24 (04) ◽

pp. 1650054 ◽

Cited By ~ 1

Author(s):

YANCHUN WANG ◽

WEIGANG SUN ◽

JINGYUAN ZHANG ◽

SEN QIN

Keyword(s):

Cayley Tree ◽

Degree Sequences ◽

Original Network

In this paper, we propose a new matching (called a conditional matching), where the condition refers to a matching of the newly constructed network that includes all the nodes in the original network. We then enumerate the conditional matchings of the new network and prove that the number of conditional matchings is just the product of the degree sequences of the original network. We choose two families of fractal networks to illustrate our results: the pseudofractal network and the Cayley tree. Finally, we calculate the entropy of the conditional matchings on the considered networks and find that the entropy of the Cayley tree is smaller than that of the pseudofractal network.

Continual representation learning for evolving biomedical bipartite networks

Bioinformatics ◽

10.1093/bioinformatics/btab067 ◽

2021 ◽

Author(s):

Kishlay Jha ◽

Guangxu Xun ◽

Aidong Zhang

Keyword(s):

Network Structure ◽

Learning Strategy ◽

Structure Learning ◽

Fundamental Problem ◽

Representation Learning ◽

Research Area ◽

Bipartite Network ◽

Bipartite Networks ◽

Straightforward Application ◽

Low Dimensional

Abstract. Motivation: Many real-world biomedical interactions such as ‘gene-disease’, ‘disease-symptom’ and ‘drug-target’ are modeled as a bipartite network structure. Learning meaningful representations for such networks is a fundamental problem in the research area of Network Representation Learning (NRL). NRL approaches aim to translate the network structure into low-dimensional vector representations that are useful to a variety of biomedical applications. Despite significant advances, the existing approaches still have certain limitations. First, a majority of these approaches do not model the unique topological properties of bipartite networks. Consequently, their straightforward application to the bipartite graphs yields unsatisfactory results. Second, the existing approaches typically learn representations from static networks. This is limiting for the biomedical bipartite networks that evolve at a rapid pace, and thus necessitate the development of approaches that can update the representations in an online fashion. Results: In this research, we propose a novel representation learning approach that accurately preserves the intricate bipartite structure, and efficiently updates the node representations. Specifically, we design a customized autoencoder that captures the proximity relationship between nodes participating in the bipartite bicliques (2 × 2 sub-graph), while preserving both the global and local structures. Moreover, the proposed structure-preserving technique is carefully interleaved with the central tenets of continual machine learning to design an incremental learning strategy that updates the node representations in an online manner. Taken together, the proposed approach produces meaningful representations with high fidelity and computational efficiency. Extensive experiments conducted on several biomedical bipartite networks validate the effectiveness and rationality of the proposed approach.

A Bayesian Inference Method Using Monte Carlo Sampling for Estimating the Number of Communities in Bipartite Networks

Scientific Programming ◽

10.1155/2019/9471201 ◽

2019 ◽

Vol 2019 ◽

pp. 1-12 ◽

Cited By ~ 1

Author(s):

Guo-Zheng Wang ◽

Li Xiong ◽

Hu-Chen Liu

Keyword(s):

Monte Carlo ◽

Bayesian Inference ◽

Probability Distribution ◽

Community Detection ◽

Degree Distribution ◽

Prior Probability ◽

Bipartite Networks ◽

Inference Method ◽

Prior Probability Distribution ◽

Bayesian Inference Method

Community detection is an important analysis task for complex networks, including bipartite networks, which consist of nodes of two types and edges connecting only nodes of different types. Many community detection methods take the number of communities in the networks as a fixed known quantity; however, it is impossible to give such information in advance in real-world networks. In this paper, we propose a projection-free Bayesian inference method to determine the number of pure-type communities in bipartite networks. This paper makes the following contributions: (1) we present the first-principles derivation of a practical method, using the degree-corrected bipartite stochastic block model that is able to deal with networks with broad degree distributions, for estimating the number of pure-type communities of bipartite networks; (2) a prior probability distribution is proposed over the partition of a bipartite network; (3) we design a Monte Carlo algorithm that incorporates our proposed method and prior probability distribution. We demonstrate our algorithm on synthetic bipartite networks, including an easy case with a homogeneous degree distribution and a difficult case with a heterogeneous degree distribution. The results show that the algorithm gives the correct number of communities of synthetic networks in most cases and outperforms the projection method, especially in networks with heterogeneous degree distributions.

An Empirical Analysis of Developer Collaboration Network

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.303-306.2177 ◽

2013 ◽

Vol 303-306 ◽

pp. 2177-2181

Author(s):

Cheng Xiang Peng

Keyword(s):

Open Source Software ◽

Network Theory ◽

Collaboration Network ◽

Node Degree ◽

Bipartite Network ◽

Bipartite Networks ◽

Software Projects ◽

Intrinsic Nature ◽

Domain Experts ◽

Social Collaboration

To further verify the uses of bipartite network theory and to understand the intrinsic nature of social collaboration networks, we collect information on open source software projects from the SourceForge website and construct a project management collaboration network by analyzing the project and manager data. Then, through ordinary projection, two kinds of one-mode networks are built; the degree distributions of the one-mode networks and of the original bipartite network are power-law-like. Finally, we evaluate the importance of nodes in the manager network to identify the core nodes, namely domain experts, using the metrics of node degree, betweenness and topological potential respectively, and describe some helpful applications.
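The one-mode projection step described above can be sketched as follows. This is a minimal, unweighted projection under the assumption that two managers are linked whenever they share at least one project; the function names and the toy data are hypothetical and do not come from the paper.

```python
from collections import Counter, defaultdict
from itertools import combinations

def project_onto_managers(edges):
    """Project (manager, project) pairs onto the manager set.

    Two managers are connected if they share at least one project.
    """
    members = defaultdict(set)
    for manager, proj in edges:
        members[proj].add(manager)
    projected = set()
    for group in members.values():
        # sorted() gives each undirected pair one canonical orientation
        for a, b in combinations(sorted(group), 2):
            projected.add((a, b))
    return projected

def degree_histogram(pairs):
    """Map each degree value to the number of nodes having that degree."""
    deg = Counter()
    for a, b in pairs:
        deg[a] += 1
        deg[b] += 1
    return Counter(deg.values())

# Hypothetical toy data: four managers, two projects.
edges = [("ann", "p1"), ("bob", "p1"), ("bob", "p2"),
         ("cat", "p2"), ("dan", "p2")]
manager_net = project_onto_managers(edges)
hist = degree_histogram(manager_net)
print(sorted(manager_net), dict(hist))
```

Swapping the two elements of each pair before calling the function gives the other one-mode network (projects linked by shared managers).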

Detecting Communities in 2-Mode Networks via Fast Nonnegative Matrix Trifactorization

Mathematical Problems in Engineering ◽

10.1155/2015/937090 ◽

2015 ◽

Vol 2015 ◽

pp. 1-10 ◽

Cited By ~ 1

Author(s):

Liu Yang ◽

Wang Tao ◽

Ji Xin-sheng ◽

Liu Caixia ◽

Xu Mingyan

Keyword(s):

Community Structure ◽

Rapid Development ◽

Nonnegative Matrix ◽

Communication Technologies ◽

Bipartite Network ◽

Bipartite Networks ◽

Slow Convergence ◽

Network Information ◽

Relational Networks ◽

Multiple Dimensions

With the rapid development of the Internet and communication technologies, a large number of multitype relational networks have emerged in real-world applications. The bipartite network is a representative and important kind of complex network. Detecting community structure in bipartite networks is crucial to obtaining a better understanding of network structures and functions. Traditional nonnegative matrix factorization methods usually focus on homogeneous networks, and they are subject to several problems such as slow convergence and heavy computation. It is challenging to effectively integrate the network information of multiple dimensions in order to discover the hidden community structure underlying heterogeneous interactions. In this work, we present a novel fast nonnegative matrix trifactorization (F-NMTF) method to cocluster the 2-mode nodes in bipartite networks. By constructing the affinity matrices of 2-mode nodes as manifold regularizations of NMTF, we manage to incorporate the intratype and intertype information of 2-mode nodes to reveal the latent community structure in bipartite networks. Moreover, we decompose the NMTF problem into two subproblems, which involve far fewer matrix multiplications and achieve faster convergence. Experimental results on synthetic and real bipartite networks show that the proposed method improves the slow convergence of NMTF and achieves high accuracy and stability in community detection.
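For orientation, nonnegative matrix trifactorization with manifold regularization is commonly written as the constrained problem below. This is a generic textbook form, not necessarily the exact F-NMTF objective of the paper; the symbols $X$, $F$, $S$, $G$, $L_F$, $L_G$, $\lambda$ and $\mu$ are notation assumed here.

```latex
\min_{F \ge 0,\; S \ge 0,\; G \ge 0}
\; \lVert X - F S G^{\top} \rVert_F^2
+ \lambda \,\mathrm{tr}\!\left(F^{\top} L_F F\right)
+ \mu \,\mathrm{tr}\!\left(G^{\top} L_G G\right)
```

Here $X$ is the $n \times m$ biadjacency matrix of the bipartite network, $F$ ($n \times k$) and $G$ ($m \times l$) hold the cluster memberships of the two node types, $S$ ($k \times l$) couples the two sets of clusters, and $L_F$, $L_G$ are graph Laplacians built from the affinity matrices of each node type.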

Generation of 2-mode scale-free graphs for link-level internet topology modeling

PLoS ONE ◽

10.1371/journal.pone.0240100 ◽

2020 ◽

Vol 15 (11) ◽

pp. e0240100

Author(s):

Khalid Bakhshaliyev ◽

Mehmet Hadi Gunes

Keyword(s):

Power Law ◽

Degree Distribution ◽

Autonomous Systems ◽

Node Degree ◽

Bipartite Network ◽

Research Issues ◽

Scale Free ◽

Internet Topology ◽

Scale Free Graphs ◽

Multi Access

Comprehensive analysis that aims to understand the topology of real-world networks and the development of algorithms that replicate their characteristics have been significant research issues. Although the accuracy of newly developed network protocols or algorithms does not depend on the underlying topology, the performance generally does. As a result, network practitioners have concentrated on generating representative synthetic topologies and using them to investigate the performance of their designs in simulation or emulation environments. Network generators typically represent the Internet topology as a graph composed of point-to-point links. In this study, we discuss the implications of multi-access links for synthetic network generation and the modeling of networks as bipartite graphs that represent both subnetworks and routers. We then analyze the characteristics of sampled Internet topology data sets from backbone Autonomous Systems (AS) and observe that, in addition to the commonly recognized power-law node degree distribution, the subnetwork size and the router interface distributions often exhibit power-law characteristics. We introduce a SubNetwork Generator (SubNetG) topology generation approach that incorporates the observed measurements to produce bipartite network topologies. In particular, the generated topologies capture the 2-mode relation between layer 2 (i.e., the subnetwork and interface distributions) and layer 3 (i.e., the degree distribution) that is missing from current network generators, which produce 1-mode graphs. The SubNetG source code and experimental data are available at https://github.com/netml/sonet.

An Effective Approach for Modular Community Detection in Bipartite Network Based on Integrating Rider with Harris Hawks Optimization Algorithms

Journal of Mathematics ◽

10.1155/2021/9511425 ◽

2021 ◽

Vol 2021 ◽

pp. 1-16

Author(s):

Bader Fahad Alkhamees ◽

Mogeeb A. A. Mosleh ◽

Hussain AlSalman ◽

Muhammad Azeem Akbar

Keyword(s):

Genetic Algorithm ◽

Community Detection ◽

Bipartite Network ◽

Bipartite Networks ◽

Simulation Experiments ◽

H Index ◽

The Hierarchical Structure ◽

Node Similarity ◽

Information Recommendation ◽

Evaluation Metric

The mining and discovery of concealed community structure in complex networks has received tremendous attention from the research community and is a trending topic for multifaceted networks: it not only reveals details about the hierarchical structure of the network but also assists in understanding the core functions of the network and, subsequently, in information recommendation. Bipartite networks are multifaceted networks whose nodes can be divided into two dissimilar node sets such that no edges exist between vertices of the same set. Even though the discovery of communities in one-mode networks has been studied briefly, community detection in bipartite networks has not been studied. In this paper, we propose a novel Rider-Harris Hawks Optimization (RHHO) algorithm for community detection in a bipartite network through node similarity. The proposed RHHO is developed by integrating the Rider Optimization (RO) algorithm with the Harris Hawks Optimization (HHO) algorithm. Moreover, a new evaluation metric, the h-Tversky Index (h-TI), is proposed for computing node similarity, and a new fitness function is devised that takes modularity into account. The goal of modularity is to quantify the goodness of a specific division of the network in order to evaluate the accuracy of the proposed community detection. The quantitative assessment of the proposed approach, together with a thorough comparative evaluation, was conducted in terms of fitness and modularity on citation network datasets (cit-HepPh and cit-HepTh) and bipartite network datasets (MovieLens 100K and American Revolution). The performance was analyzed for 250 iterations of the simulation experiments. Experimental results show that the proposed method achieved a maximal fitness of 0.74353 and a maximal modularity of 0.77433, outperforming the state-of-the-art approaches, including h-index-based link prediction, such as Multiagent Genetic Algorithm (MAGA), Genetic Algorithm (GA), Memetic Algorithm for Community Detection in Bipartite Networks (MATMCD-BN), and HHO.
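As a point of reference for the modularity scores quoted above, the sketch below computes the standard Newman modularity of a given partition, $Q = \sum_c \left[ e_c/m - (d_c/2m)^2 \right]$, where $e_c$ is the number of edges inside community $c$, $d_c$ is the total degree of its nodes and $m$ is the number of edges. This is an assumption for illustration: the abstract does not say which modularity variant (for example, a bipartite-specific one) or which fitness function the authors use, and the function name and toy graph are hypothetical.

```python
from collections import Counter

def modularity(edges, community):
    """Newman modularity of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to its community label.
    """
    m = len(edges)
    intra = Counter()    # edges with both endpoints in the same community
    degsum = Counter()   # total degree of the nodes in each community
    for u, v in edges:
        degsum[community[u]] += 1
        degsum[community[v]] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    return sum(intra[c] / m - (degsum[c] / (2 * m)) ** 2 for c in degsum)

# Hypothetical toy bipartite graph: two 4-cycles joined by one bridge edge.
edges = [("u1", "v1"), ("u1", "v2"), ("u2", "v1"), ("u2", "v2"),
         ("u3", "v3"), ("u3", "v4"), ("u4", "v3"), ("u4", "v4"),
         ("u2", "v3")]
community = {"u1": "A", "u2": "A", "v1": "A", "v2": "A",
             "u3": "B", "u4": "B", "v3": "B", "v4": "B"}
q = modularity(edges, community)
print(round(q, 4))
```

Values closer to 1 indicate that more edges fall inside communities than would be expected from the node degrees alone.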

Extracting Core Users Based on Features of Users and Their Relationships in Recommender Systems

International Journal of Web Services Research ◽

10.4018/ijwsr.2017040101 ◽

2017 ◽

Vol 14 (2) ◽

pp. 1-23

Author(s):

Li Kuang ◽

Gaofeng Cao ◽

Liang Chen

Keyword(s):

Recommender System ◽

Degree Distribution ◽

Information Overload ◽

Extraction Methods ◽

Long Tail ◽

Trust Relationships ◽

Tail Distribution ◽

Trust Degree ◽

Interest Similarity ◽

New Algorithms

As an effective way to address information overload, recommender systems have drawn the attention of scholars from various fields. However, existing works mainly focus on improving the accuracy of recommendation by designing new algorithms, while the differing importance of individual users has not been well addressed. In this paper, the authors propose new approaches to identifying core users based on trust relationships and interest similarity between users, and on the popularity, trust influence and resources of individual users. First, the trust degree and interest similarity between all user pairs, as well as the three attributes of individuals, are calculated. Second, a global core user set is constructed based on three strategies: frequency-based, rank-based, and fusion-sorting-based. Finally, the authors compare their proposed methods with other existing methods in terms of accuracy, novelty, long-tail distribution and user degree distribution. Experiments show the effectiveness of the authors' core user extraction methods.

Undirected Bipartite Networks as an Alternative Methodology to Probabilistic Exploration

Advances in Wireless Technologies and Telecommunication - Graph Theoretic Approaches for Analyzing Large-Scale Social Networks ◽

10.4018/978-1-5225-2814-2.ch005 ◽

2018 ◽

pp. 75-94

Author(s):

Juan-Francisco Martínez-Cerdá ◽

Joan Torrent-Sellens

Keyword(s):

Learning Analytics ◽

Online Courses ◽

Binary Logistic Regression ◽

First Year ◽

Bipartite Network ◽

Bipartite Networks ◽

Online Behavior ◽

Binary Logistic Regression Model ◽

Speed Up ◽

Alternative Methodology

This chapter explores how graph analysis techniques are able to complement and speed up the process of learning analytics and probability theory. It uses a sample of 2,353 e-learners from six European countries (France, Germany, Greece, Poland, Portugal, and Spain), who were enrolled in their first year of open online courses offered by HarvardX and MITx. After controlling for socio-demographic variables and online content interactions, the research reveals two main results relating student-content interactions and online behavior. First, a multiple binary logistic regression model indicates that students who explore online chapters are more likely to be certified. Second, the authors propose an algorithm to generate an undirected bipartite network based on tabular data of student-content interactions (2,392 nodes, 25,883 edges, a visual representation based on modularity, degree and ForceAtlas2 layout); the graph shows a clear relationship between interactions with online chapters and chances of getting certified.
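The idea of turning tabular interaction data into an undirected bipartite network can be sketched as below. This is not the authors' algorithm; it assumes each table row is a (student, chapter) pair, and the function name, node labels and toy rows are hypothetical.

```python
def build_bipartite(rows):
    """Build an undirected bipartite graph from (student, chapter) rows.

    Nodes are tagged with their type so that a student ID and a chapter ID
    can never collide. Repeated rows collapse into a single edge.
    """
    students, chapters, edges = set(), set(), set()
    for student, chapter in rows:
        s_node = ("student", student)
        c_node = ("chapter", chapter)
        students.add(s_node)
        chapters.add(c_node)
        edges.add((s_node, c_node))
    return students, chapters, edges

# Hypothetical toy table; the last row repeats the first interaction.
rows = [("s1", "ch1"), ("s1", "ch2"), ("s2", "ch1"), ("s1", "ch1")]
students, chapters, edges = build_bipartite(rows)
print(len(students) + len(chapters), "nodes,", len(edges), "edges")
```

Keeping the two node sets separate makes it easy to export the graph to a layout tool or to compute per-type degree statistics afterwards.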

MVGCN: data integration through multi-view graph convolutional network for predicting links in biomedical bipartite networks

Bioinformatics ◽

10.1093/bioinformatics/btab651 ◽

2021 ◽

Author(s):

Haitao Fu ◽

Feng Huang ◽

Xuan Liu ◽

Yang Qiu ◽

Wen Zhang

Keyword(s):

Molecular Mechanisms ◽

Learning Strategy ◽

Information Aggregation ◽

Supplementary Information ◽

Bipartite Network ◽

Bipartite Networks ◽

Biomolecular Systems ◽

Convolutional Network ◽

Benchmark Datasets ◽

Node Attributes

Abstract. Motivation: There are various interaction/association bipartite networks in biomolecular systems. Identifying unobserved links in biomedical bipartite networks helps to understand the underlying molecular mechanisms of human complex diseases and thus benefits the diagnosis and treatment of diseases. Although a great number of computational methods have been proposed to predict links in biomedical bipartite networks, most of them heavily depend on features and structures involving the bioentities in one specific bipartite network, which limits the generalization capacity of applying the models to other bipartite networks. Meanwhile, bioentities usually have multiple features, and how to leverage them has also been challenging. Results: In this study, we propose a novel multi-view graph convolution network (MVGCN) framework for link prediction in biomedical bipartite networks. We first construct a multi-view heterogeneous network (MVHN) by combining the similarity networks with the biomedical bipartite network, and then perform a self-supervised learning strategy on the bipartite network to obtain node attributes as initial embeddings. Further, a neighborhood information aggregation (NIA) layer is designed for iteratively updating the embeddings of nodes by aggregating information from inter- and intra-domain neighbors in every view of the MVHN. Next, we combine embeddings of multiple NIA layers in each view, and integrate multiple views to obtain the final node embeddings, which are then fed into a discriminator to predict the existence of links. Extensive experiments show MVGCN performs better than or on par with baseline methods and has the generalization capacity on six benchmark datasets involving three typical tasks. Availability and implementation: Source code and data can be downloaded from https://github.com/fuhaitao95/MVGCN. Supplementary information: Supplementary data are available at Bioinformatics online.
