Optimal hot spot allocation on meshes for large-scale data-parallel algorithms

1995 ◽  
Vol 6 (8) ◽  
pp. 788-802 ◽  
Author(s):  
Soo-Young Lee ◽  
Chung-Ming Chen
Author(s):  
Krzysztof Jurczuk ◽  
Marcin Czajkowski ◽  
Marek Kretowski

Abstract: This paper concerns the evolutionary induction of decision trees (DT) for large-scale data. Such a global approach is one of the alternatives to top-down inducers. It searches for the tree structure and the tests simultaneously, and thus in many situations improves both the prediction accuracy and the size of the resulting classifiers. However, it is a population-based, iterative approach that can be too computationally demanding to apply directly to big data mining. The paper demonstrates that this barrier can be overcome by smart distributed/parallel processing. Moreover, we ask whether the global approach can truly compete with greedy systems on large-scale data. For this purpose, we propose a novel multi-GPU approach. It combines knowledge of global DT induction and evolutionary algorithm parallelization with efficient use of GPU memory and computing resources. The searches for the tree structure and tests are performed simultaneously on a CPU, while the fitness calculations are delegated to GPUs. A data-parallel decomposition strategy and the CUDA framework are applied. Experimental validation is performed on both artificial and real-life datasets. In both cases, the obtained acceleration is very satisfactory. The solution is able to process even billions of instances in a few hours on a single workstation equipped with 4 GPUs. The impact of data characteristics (size and dimension) on the convergence and speedup of the evolutionary search is also shown. When the number of GPUs grows, nearly linear scalability is observed, which suggests that data size boundaries for evolutionary DT mining are fading.
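The data-parallel idea in this abstract (split the training set into chunks, score a candidate tree on each chunk independently, then sum the partial results) can be illustrated with a minimal CPU-only sketch. This is not the authors' multi-GPU CUDA implementation: the tuple-based tree encoding, the function names, and the use of a thread pool as a stand-in for GPUs are all assumptions made for illustration, and the dataset is assumed to be non-empty.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical tree encoding (an assumption, not taken from the paper):
#   ("leaf", label)  or  ("node", feature_index, threshold, left, right)

def predict(tree, x):
    """Walk from the root to a leaf and return the leaf's label."""
    while tree[0] == "node":
        _, feat, thr, left, right = tree
        tree = left if x[feat] <= thr else right
    return tree[1]

def count_errors(tree, chunk):
    """Partial fitness on one data chunk (stands in for one device's work)."""
    return sum(1 for x, y in chunk if predict(tree, x) != y)

def split(data, n_workers):
    """Cut the dataset into contiguous, near-equal chunks."""
    size = max(1, (len(data) + n_workers - 1) // n_workers)
    return [data[i:i + size] for i in range(0, len(data), size)]

def accuracy(tree, data, n_workers=4):
    """Score one tree: per-chunk error counts are computed independently,
    then reduced into a single fitness value."""
    chunks = split(data, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        errors = sum(pool.map(lambda c: count_errors(tree, c), chunks))
    return 1.0 - errors / len(data)

if __name__ == "__main__":
    tree = ("node", 0, 0.5, ("leaf", 0), ("leaf", 1))
    data = [((0.1,), 0), ((0.9,), 1), ((0.4,), 1), ((0.7,), 1)]
    print(accuracy(tree, data, n_workers=2))
```

Because each chunk contributes only an error count, the reduction step is a plain sum, which is what makes this kind of decomposition straightforward to spread across several devices.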


2017 ◽  
Vol 7 (4) ◽  
pp. 37-49
Author(s):  
Amrit Pal ◽  
Manish Kumar

Frequent itemset mining is a well-known area in data mining. Most of the available techniques for frequent itemset mining require complete information about the data, from which the association rules can be generated. The amount of data is increasing day by day, taking the form of Big Data, which requires changes to the algorithms so that they can work on such large-scale data. Parallel implementations of the mining techniques can provide a solution to this problem. This paper surveys frequent itemset mining techniques that can be used in a parallel environment. Programming models like MapReduce provide an efficient architecture for working with Big Data; the paper also discusses the issues and feasibility of implementing these techniques in such an environment.
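As a rough illustration of how itemset counting fits a map/reduce pattern, the sketch below counts k-itemsets per data partition (the map step), merges the partial counts (the reduce step), and keeps the itemsets that meet a minimum support. It runs in a single process and does not use any MapReduce framework; the function names and the brute-force candidate generation are assumptions for illustration, not a technique taken from the survey.

```python
from collections import Counter
from functools import reduce
from itertools import combinations

def map_partition(transactions, k):
    """Map step: count every k-itemset occurring in one partition."""
    counts = Counter()
    for t in transactions:
        # sorted(set(t)) gives each itemset one canonical tuple form
        for itemset in combinations(sorted(set(t)), k):
            counts[itemset] += 1
    return counts

def reduce_counts(a, b):
    """Reduce step: merge two partial count tables."""
    return a + b

def frequent_itemsets(partitions, k, min_support):
    """Combine per-partition counts and keep itemsets with enough support."""
    partials = (map_partition(p, k) for p in partitions)
    total = reduce(reduce_counts, partials, Counter())
    return {s: c for s, c in total.items() if c >= min_support}

if __name__ == "__main__":
    partitions = [
        [["a", "b", "c"], ["a", "b"]],
        [["a", "c"], ["b", "c"], ["a", "b"]],
    ]
    print(frequent_itemsets(partitions, k=2, min_support=3))
```

Enumerating all k-combinations grows quickly with transaction length, so a practical method would prune candidates (as Apriori-style algorithms do) before counting.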


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Zihe Huang ◽  
Shangbing Gao ◽  
Chuangxin Cai ◽  
Hao Zheng ◽  
Zhigeng Pan ◽  
...  

Abstract: With the growth of cities and of vehicle interconnection, visual analysis technology is playing a very important role in city computing and city perception. A reasonable visual model can effectively present the features of a city. Traditional density-based algorithms cluster large-scale data slowly and cannot find cluster centers suited to taxi trajectory data. To solve this problem, the paper proposes the DBSCAN+ (density-based spatial clustering of applications with noise plus) algorithm, which can split the data and extract maximum-density clusters from large-scale data. First, the passenger points are cleaned from the original passenger trajectory data; then the massive set of passenger points is sliced and clustered cyclically. During clustering, the cluster centers are extracted based on maximum density, and finally the clustering results are visualized. The experimental results show that, compared with other popular methods, the proposed method has significant advantages in clustering speed, precision, and visualization for large-scale city passenger hotspots. Moreover, it supports decisions in further urban planning and promotes traffic efficiency.
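The slice-then-cluster idea can be sketched with a plain textbook DBSCAN applied to fixed-size slices, taking as each cluster's center the member point with the most neighbors within `eps`. This is a simplified illustration under stated assumptions, not the DBSCAN+ algorithm from the paper: the slicing rule, the definition of "maximum density", and all function names are hypothetical, and the brute-force neighbor search is quadratic per slice.

```python
from math import hypot

def neighbors(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    xi, yi = points[i]
    return [j for j, (x, y) in enumerate(points)
            if hypot(x - xi, y - yi) <= eps]

def dbscan(points, eps, min_pts):
    """Textbook DBSCAN; returns one label per point, -1 meaning noise."""
    labels = [None] * len(points)  # None = not yet visited
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(points, i, eps)
        if len(nbrs) < min_pts:
            labels[i] = -1  # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(nbrs)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(points, j, eps)
            if len(j_nbrs) >= min_pts:  # expand only from core points
                queue.extend(j_nbrs)
    return labels

def density_centers(points, labels, eps):
    """For each cluster, pick the member with the most eps-neighbors."""
    best = {}
    for i, lab in enumerate(labels):
        if lab == -1:
            continue
        d = len(neighbors(points, i, eps))
        if lab not in best or d > best[lab][0]:
            best[lab] = (d, points[i])
    return {lab: p for lab, (_, p) in best.items()}

def cluster_in_slices(points, slice_size, eps, min_pts):
    """Cluster each fixed-size slice separately and collect the centers."""
    centers = []
    for s in range(0, len(points), slice_size):
        chunk = points[s:s + slice_size]
        labels = dbscan(chunk, eps, min_pts)
        centers.extend(density_centers(chunk, labels, eps).values())
    return centers
```

Slicing bounds the cost of each clustering pass, but a cluster that straddles a slice boundary yields one center per slice, so a fuller method would need a merge step across slices.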


2009 ◽  
Vol 28 (11) ◽  
pp. 2737-2740
Author(s):  
Xiao ZHANG ◽  
Shan WANG ◽  
Na LIAN

2016 ◽  
Author(s):  
John W. Williams ◽  
Simon Goring ◽  
Eric Grimm ◽  
Jason McLachlan
