BGRAP: Balanced GRAph Partitioning Algorithm for Large Graphs

2021 ◽  
Vol 2 (2) ◽  
pp. 116-135
Author(s):  
Adnan El Moussawi ◽  
Nacera Bennacer Seghouani ◽  
Francesca Bugiotti

The definition of effective strategies for graph partitioning is a major challenge in distributed environments, since effective graph partitioning can considerably improve the performance of large-graph analytics computations. In this paper, we propose a multi-objective and scalable Balanced GRAph Partitioning (BGRAP) algorithm, based on the Label Propagation (LP) approach, to produce balanced graph partitions. BGRAP defines a new, efficient initialization procedure and different objective functions to handle either vertex- or edge-balance constraints while taking edge direction into account. BGRAP is implemented on top of Giraph, the open-source distributed graph processing system. The experiments are performed on graphs with various structures and sizes (up to 50.6M vertices and 1.9B edges) while varying the number of partitions. We evaluate BGRAP using several quality measures and the computation time. The results show that BGRAP (i) provides a good balance while reducing the cuts between the computed partitions and (ii) reduces the global computation time compared with other LP-based algorithms.
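To make the label-propagation idea behind balanced partitioning concrete, here is a minimal single-machine sketch: vertices repeatedly adopt the most frequent label among their neighbors, but a move is refused when the target partition is already at capacity. This is only an illustration of the general LP-with-balance-constraint approach, not BGRAP itself; the function name, round-robin initialization, capacity slack, and tie-breaking rule are assumptions made for this example.

```python
import random
from collections import Counter

def balanced_label_propagation(adj, k, max_iters=20, capacity_slack=1.1, seed=0):
    """Assign each vertex one of k partition labels by label propagation,
    refusing moves into partitions that are already over capacity."""
    rng = random.Random(seed)
    vertices = list(adj)
    capacity = capacity_slack * len(vertices) / k   # max vertices per partition
    # Round-robin initialization gives a balanced starting point.
    label = {v: i % k for i, v in enumerate(vertices)}
    size = Counter(label.values())
    for _ in range(max_iters):
        changed = False
        rng.shuffle(vertices)
        for v in vertices:
            # Score candidate labels by neighbor frequency; prefer smaller
            # partitions on ties to help balance.
            freq = Counter(label[u] for u in adj[v])
            if not freq:
                continue
            best = max(freq, key=lambda l: (freq[l], -size[l]))
            if best != label[v] and size[best] + 1 <= capacity:
                size[label[v]] -= 1
                size[best] += 1
                label[v] = best
                changed = True
        if not changed:
            break
    return label
```

In a distributed setting such as Giraph, the same neighbor-vote step runs per vertex in supersteps, with partition sizes tracked via aggregators rather than a shared counter.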

IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 72801-72813
Author(s):  
Minho Bae ◽  
Minjoong Jeong ◽  
Sangyoon Oh

Author(s):  
Fatma Chiheb ◽  
Fatima Boumahdi ◽  
Hafida Bouarfa

Big Data is an important topic for discussion and research. It has gained this importance due to the meaningful value that can be extracted from these data. The application of Big Data in modern business allows enterprises to make faster and smarter decisions, achieving a real competitive advantage. However, many Big Data projects deliver disappointing results that do not address the decision-makers' needs, for many reasons. The main reason for this failure can be summarized as neglecting the decision-making aspect of these projects. In light of this challenge, this study proposes the integration of the decision aspect into Big Data as a solution. This article therefore presents three main contributions: 1) it clarifies the definition of Big Data; 2) it presents the BD-Da model, a conceptual model describing the levels that should be considered when developing a Big Data project aimed at solving a problem that calls for a decision; 3) it describes a particular, logical, requirements-like approach that explains how a company can develop a Big Data analytics project to support decision-making.


2016 ◽  
Vol 2016 ◽  
pp. 1-15 ◽  
Author(s):  
Dongqing Zhou ◽  
Xing Wang

This paper applies particle swarm optimization (PSO) to the community detection problem and proposes an algorithm based on a new label strategy. In contrast with other label propagation strategies, the main contribution of this paper is to define the impact of a node and put it to use: special initialization and update approaches based on this measure are designed to exploit it fully. Experiments on synthetic and real-life networks show the effectiveness of the proposed strategy. Furthermore, the strategy is extended to signed networks, and the corresponding objective function, called modularity density, is modified for use in signed networks. Experiments on real-life networks also demonstrate that it is an efficacious way to solve the community detection problem.


Kybernetes ◽  
2016 ◽  
Vol 45 (3) ◽  
pp. 508-520
Author(s):  
Mario Iván Tarride

Purpose – The purpose of this paper is to discuss the condition of human beings and organizations producing goods and/or services as autopoietic and allopoietic machines, with the aim of establishing a functional homomorphism between the productive system of an organization and the productive system of human beings, a matter that involves reflecting on what human beings do that is distinguished as allopoietic by an observer.
Design/methodology/approach – Use is made of Ashby’s concept of functional homomorphism to establish similarities between human beings and organizations. The definitions of autopoietic and allopoietic machine of Maturana and Varela are used to distinguish similarities and differences between what organizations do and what human beings do.
Findings – As a result of using the autopoietic/allopoietic viewpoint, it is proposed to homologate the human nervous system with the production system of an organization, defining the latter as a world-creating energy/communication processing system.
Research limitations/implications – A homomorphism is established here between a human nervous system and the production system of an organization; the other homomorphisms that can be made between the systems of the human body and the organization remain pending.
Practical implications – A proposal is made to understand an organization as a world-creating energy/communication processing system, and it is estimated that this would imply displacing attention, at present strongly centered on the generated products and/or services, toward the sense that they have for both persons and society, restating the question on the world we construct/live in, from the organizational standpoint.
Originality/value – Human beings are seen as allopoietic machines, aiming to contribute to the discussion about what it is that we call human, homologating it with the work of an organization. As a result a new definition of organization is proposed.


2017 ◽  
Author(s):  
Alex D. Washburne

Abstract
Data from biological communities are composed of species connected by the phylogeny. A greedy algorithm, 'phylofactorization', was developed to construct an isometric log-ratio transform whose balances correspond to edges along which traits arose, controlling for previously made inferences. In this paper, the general theory of phylofactorization is presented as a graph-partitioning algorithm. A special case, regression phylofactorization, chooses coordinates based on sequential maximization of objective functions from regression on "contrast" variables such as an isometric log-ratio transform. The connections between regression phylofactorization and other methods are discussed, including matrix factorization, hierarchical regression, factor analysis, and latent variable models. Open challenges in the statistical analysis of phylofactorization are presented, including criteria for choosing the number of factors and approximating null distributions of commonly used test statistics and objective functions. As a graph-partitioning algorithm, cross-validation of phylofactorization across datasets requires graph-topological considerations, such as how to deal with novel nodes and edges and whether or not to control for partition order. Overcoming these challenges can accelerate our analysis of phylogenetically structured data and allow annotation of edges in an online tree of life.
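The isometric log-ratio balance at the heart of this construction has a simple closed form: for disjoint groups R and S of parts (e.g. the two sides of a chosen phylogenetic edge), the balance is sqrt(|R||S|/(|R|+|S|)) times the log of the ratio of geometric means. A minimal sketch, with function names chosen for this example:

```python
import math

def geometric_mean(xs):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

def ilr_balance(abundances, group_r, group_s):
    """Isometric log-ratio 'balance' contrasting two disjoint groups of parts,
    such as the two sides of an edge chosen by phylofactorization.
    b = sqrt(r*s/(r+s)) * ln(gmean(R) / gmean(S))."""
    r, s = len(group_r), len(group_s)
    coef = math.sqrt(r * s / (r + s))
    g_r = geometric_mean([abundances[i] for i in group_r])
    g_s = geometric_mean([abundances[i] for i in group_s])
    return coef * math.log(g_r / g_s)
```

Regression phylofactorization would then, at each step, regress such balances on covariates and greedily pick the edge whose balance maximizes the objective.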


Author(s):  
Masatomo Inui ◽  
Kouhei Nishimiya ◽  
Nobuyuki Umezu

Abstract Clearance is a basic parameter in the design of mechanical products, generally specified as the distance between two shape elements, for example, the width of a slot. This definition is unsuitable for evaluating clearance during assembly or manufacturing tasks, where depth information is also critical. In this paper, we propose a novel definition of clearance for the surface of three-dimensional objects. Unlike typical methods used to define clearance, the proposed method can simultaneously handle the relationship between width and depth in the clearance and thus provides an intuitive understanding of the assembly and manufacturing capability of a product. Our definition is based on the accessibility cone of a point on the object's surface; the peak angle of the accessibility cone corresponds to the clearance at that point. A computation method for the clearance is presented and the results of its application are demonstrated. Our method uses the rendering function of a graphics processing unit to compute the clearance. The large computation time required for the analysis remains a problem for the practical use of this clearance definition.
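The coupling of width and depth in this clearance definition can be illustrated with an idealized 2D case. Assume a point at the bottom center of a straight slot of a given width and depth; the accessibility cone's peak angle then follows from simple trigonometry. This is only a hand-derived toy model to show why both dimensions matter, not the paper's GPU-based computation method.

```python
import math

def slot_clearance_angle(width, depth):
    """Peak angle (radians) of the accessibility cone at the bottom center of
    an idealized 2D slot. Wider slots open the cone; deeper slots close it."""
    if depth <= 0:
        return math.pi  # unobstructed half-space above the point
    half_angle = math.atan(width / (2.0 * depth))
    return 2.0 * half_angle
```

A wide, shallow slot yields a peak angle near pi (easy tool access), while a narrow, deep slot yields a small angle, matching the intuition that distance-only clearance misses.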


2020 ◽  
Vol 10 (21) ◽  
pp. 7842
Author(s):  
Hyundo Yoon ◽  
Soojung Moon ◽  
Youngki Kim ◽  
Changhee Hahn ◽  
Wonjun Lee ◽  
...  

Public key encryption with keyword search (PEKS) enables users to search over encrypted data outsourced to an untrusted server. Unfortunately, updates to the outsourced data may incur information leakage by exploiting previously submitted queries. Prior works addressed this issue by means of forward privacy, but most of them suffer from significant performance degradation. In this paper, we present a novel forward-private PEKS scheme leveraging Software Guard Extensions (SGX), a trusted execution environment provided by Intel. The proposed scheme offers substantial performance improvements over prior work. Specifically, we reduce the query processing cost from O(n) to O(1), where n is the number of encrypted data items. According to our performance analysis, the overall computation time is reduced by 80% on average. Lastly, we provide a formal security definition of SGX-based forward-private PEKS, as well as a rigorous security proof of the proposed scheme.
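The forward-privacy property itself can be illustrated with a toy counter-based index: update tokens are derived from a per-keyword counter, so a search token issued at counter c cannot match documents added afterwards. This sketch is not the paper's scheme (it is symmetric-key, keeps client state instead of an SGX enclave, and its search loop is linear rather than O(1)); all names and the fixed demo key are assumptions for the example.

```python
import hmac
import hashlib

class ToyForwardPrivateIndex:
    """Server side of a toy forward-private searchable index: an opaque
    token-to-document map with no keyword structure visible to the server."""
    def __init__(self):
        self.store = {}  # token (hex) -> document id

    @staticmethod
    def token(key, keyword, counter):
        # Update/search token bound to (keyword, counter) via HMAC.
        msg = f"{keyword}|{counter}".encode()
        return hmac.new(key, msg, hashlib.sha256).hexdigest()

class ToyClient:
    """Client keeps a per-keyword counter; old search tokens cannot be
    linked by the server to updates made after the search (forward privacy)."""
    def __init__(self):
        self.key = b"demo-key-not-for-real-use"  # assumption: fixed demo key
        self.counters = {}                       # keyword -> next counter

    def add(self, index, keyword, doc_id):
        c = self.counters.get(keyword, 0)
        index.store[ToyForwardPrivateIndex.token(self.key, keyword, c)] = doc_id
        self.counters[keyword] = c + 1

    def search(self, index, keyword):
        # Reveals tokens only up to the current counter.
        results = []
        for c in range(self.counters.get(keyword, 0)):
            tok = ToyForwardPrivateIndex.token(self.key, keyword, c)
            if tok in index.store:
                results.append(index.store[tok])
        return results
```

In the paper's setting, the per-keyword state lives inside the SGX enclave on the server, which is what removes the linear token-regeneration loop and yields the claimed O(1) query processing cost.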


2020 ◽  
Vol 27 (9) ◽  
pp. 1466-1475
Author(s):  
Lytske Bakker ◽  
Jos Aarts ◽  
Carin Uyl-de Groot ◽  
William Redekop

Abstract
Objective: Much has been invested in big data analytics to improve health and reduce costs. However, it is unknown whether these investments have achieved the desired goals. We performed a scoping review to determine the health and economic impact of big data analytics for clinical decision-making.
Materials and Methods: We searched Medline, Embase, Web of Science and the National Health Services Economic Evaluations Database for relevant articles. We included peer-reviewed papers that report the health economic impact of analytics that assist clinical decision-making. We extracted the economic methods and estimated impact and also assessed the quality of the methods used. In addition, we estimated how many studies assessed “big data analytics” based on a broad definition of this term.
Results: The search yielded 12 133 papers but only 71 studies fulfilled all eligibility criteria. Only a few papers were full economic evaluations; many were performed during development. Papers frequently reported savings for healthcare payers but only 20% also included the costs of analytics. Twenty studies examined “big data analytics” and only 7 reported both cost savings and better outcomes.
Discussion: The promised potential of big data is not yet reflected in the literature, partly because only a few full and properly performed economic evaluations have been published. This and the lack of a clear definition of “big data” limit policy makers and healthcare professionals in determining which big data initiatives are worth implementing.


Big Data ◽  
2016 ◽  
pp. 1-29 ◽  
Author(s):  
Yushi Shen ◽  
Yale Li ◽  
Ling Wu ◽  
Shaofeng Liu ◽  
Qian Wen

This chapter provides an overview of big data and its environment and opportunities. It starts with a definition of big data and describes the unique characteristics, structure, and value of big data, and the business drivers for big data analytics. It defines the role of the data scientist and describes the new ecosystem for big data processing and analysis.

