MAPSkew: Metaheuristic Approaches for Partitioning Skew in MapReduce

Algorithms ◽  
2018 ◽  
Vol 12 (1) ◽  
pp. 5
Author(s):  
Matheus Pericini ◽  
Lucas Leite ◽  
Francisco de Carvalho-Junior ◽  
Javam Machado ◽  
Cenez Rezende

MapReduce is a parallel computing model in which a large dataset is split into smaller parts and executed on multiple machines. Due to its simplicity, MapReduce has been widely used in various application domains. It can significantly reduce the processing time of a large amount of data by dividing the dataset into smaller parts and processing them in parallel on multiple machines. However, when data are not uniformly distributed, we have the so-called partitioning skew: the allocation of tasks to machines becomes unbalanced, either because the distribution function splits the dataset unevenly or because part of the data is more complex and requires greater computational effort. To solve this problem, we propose an approach based on metaheuristics. For evaluation purposes, three metaheuristics were implemented: Simulated Annealing, Local Beam Search, and Stochastic Beam Search. Our experimental evaluation, using a MapReduce implementation of the Bron-Kerbosch clique algorithm, shows that the proposed method can find good partitionings while better balancing data among machines.
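To illustrate the idea behind the metaheuristic approach, the sketch below uses Simulated Annealing to assign weighted data parts to machines so that the most loaded machine carries as little as possible. This is a minimal illustration, not the authors' MAPSkew implementation; the cost function (makespan), neighbour move, and all parameter values (`t0`, `cooling`, `steps`) are assumptions made for the example.

```python
import math
import random

def partition_cost(assignment, weights, n_machines):
    """Makespan of a partitioning: the load of the most loaded machine."""
    loads = [0.0] * n_machines
    for item, machine in enumerate(assignment):
        loads[machine] += weights[item]
    return max(loads)

def simulated_annealing(weights, n_machines, t0=10.0, cooling=0.995,
                        steps=5000, seed=0):
    """Search for a balanced assignment of data parts to machines."""
    rng = random.Random(seed)
    # Start from a random assignment of parts to machines.
    assignment = [rng.randrange(n_machines) for _ in weights]
    cost = partition_cost(assignment, weights, n_machines)
    best, best_cost = assignment[:], cost
    t = t0
    for _ in range(steps):
        # Neighbour move: reassign one randomly chosen part.
        i = rng.randrange(len(weights))
        old = assignment[i]
        assignment[i] = rng.randrange(n_machines)
        new_cost = partition_cost(assignment, weights, n_machines)
        # Accept improvements always; accept worse moves with a
        # temperature-dependent (Boltzmann) probability.
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / t):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = assignment[:], cost
        else:
            assignment[i] = old  # revert the move
        t *= cooling  # geometric cooling schedule
    return best, best_cost
```

For instance, `simulated_annealing([5, 3, 8, 2, 7, 1, 4, 6], 3)` searches for a 3-machine partitioning of eight parts whose total weight is 36, so no assignment can have a makespan below 12.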

2011 ◽  
Vol 10 (1-2) ◽  
pp. 39
Author(s):  
A. N. Diógenes ◽  
L. O. E. dos Santos ◽  
C. P. Fernandes

The procedure for obtaining the particle size distribution by visual inspection of a sample involves stereological errors introduced by the cut of the sample. A cut particle, assumed spherical with radius R, will be counted as a circular particle with radius r, r ≤ R. The difference between r and R depends on how far from the center of the sphere the cut was performed. This introduces errors when extrapolating properties from two to three dimensions during the analysis of a sample. The usual method is to correct the distribution with probabilistic functions, which carry large errors. This paper presents a method to reduce the error inherent to this problem. The method simulates the preparation process on a sample whose structure can be described by non-penetrating spheres of various diameters that follow a known probability distribution function, for example a log-logistic function or even a constant function. For each distribution radius, a number of spheres is generated and virtually cut, producing a two-dimensional (2D) distribution. The 2D sphere-distribution curves obtained in this simulation are compared with those obtained by the experimental procedure, and the parameters of the three-dimensional distribution function are then adjusted until the 2D curves match the experimental one, using Simulated Annealing as the optimization method for the curve fitting. In the future, this method will be applied to the analysis of oil reservoir rocks.
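The core of the virtual-cutting step can be sketched with a small Monte Carlo simulation: a sphere of radius R cut by a plane at distance h from its center shows a circle of radius r = sqrt(R² − h²) ≤ R. This is a minimal sketch, not the paper's implementation; the function names, the choice of h uniform in [0, R), and the number of cuts per sphere are assumptions made for the example.

```python
import math
import random

def apparent_radii(radius_3d, n_cuts, rng):
    """Cut a sphere of radius R with random planes. The distance h from
    the center to the cut plane is drawn uniformly in [0, R), so the
    observed circle radius is r = sqrt(R^2 - h^2), always <= R."""
    radii = []
    for _ in range(n_cuts):
        h = rng.uniform(0.0, radius_3d)
        radii.append(math.sqrt(radius_3d ** 2 - h ** 2))
    return radii

def simulate_2d_distribution(radii_3d, cuts_per_sphere=100, seed=0):
    """Monte Carlo 2D (cut) radius sample for a set of 3D sphere radii.
    The resulting sample can be histogrammed and compared against the
    experimentally measured 2D distribution."""
    rng = random.Random(seed)
    observed = []
    for R in radii_3d:
        observed.extend(apparent_radii(R, cuts_per_sphere, rng))
    return observed
```

In the curve-fitting loop described above, a routine like `simulate_2d_distribution` would be called for each candidate set of 3D distribution parameters, and Simulated Annealing would adjust those parameters until the simulated 2D curve matches the experimental one.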

