scholarly journals A MapReduce-Based Parallel Frequent Pattern Growth Algorithm for Spatiotemporal Association Analysis of Mobile Trajectory Big Data

Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-16 ◽  
Author(s):  
Dawen Xia ◽  
Xiaonan Lu ◽  
Huaqing Li ◽  
Wendong Wang ◽  
Yantao Li ◽  
...  

Frequent pattern mining is an effective approach for spatiotemporal association analysis of mobile trajectory big data in data-driven intelligent transportation systems. While existing parallel algorithms have been successfully applied to frequent pattern mining of large-scale trajectory data, two major challenges are how to overcome the inherent defects of Hadoop to cope with taxi trajectory big data including massive small files and how to discover the implicitly spatiotemporal frequent patterns with MapReduce. To conquer these challenges, this paper presents a MapReduce-based Parallel Frequent Pattern growth (MR-PFP) algorithm to analyze the spatiotemporal characteristics of taxi operating using large-scale taxi trajectories with massive small file processing strategies on a Hadoop platform. More specifically, we first implement three methods, that is, Hadoop Archives (HAR), CombineFileInputFormat (CFIF), and Sequence Files (SF), to overcome the existing defects of Hadoop and then propose two strategies based on their performance evaluations. Next, we incorporate SF into Frequent Pattern growth (FP-growth) algorithm and then implement the optimized FP-growth algorithm on a MapReduce framework. Finally, we analyze the characteristics of taxi operating in both spatial and temporal dimensions by MR-PFP in parallel. The results demonstrate that MR-PFP is superior to existing Parallel FP-growth (PFP) algorithm in efficiency and scalability.

2014 ◽  
Vol 2014 ◽  
pp. 1-11 ◽  
Author(s):  
Weifeng Li ◽  
Xiaoyun Cheng ◽  
Zhengyu Duan ◽  
Dongyuan Yang ◽  
Gaohua Guo

The overall understanding of spatial interaction and the exact knowledge of its dynamic evolution are required in the urban planning and transportation planning. This study aimed to analyze the spatial interaction based on the large-scale mobile phone data. The newly arisen mass dataset required a new methodology which was compatible with its peculiar characteristics. A three-stage framework was proposed in this paper, including data preprocessing, critical activity identification, and spatial interaction measurement. The proposed framework introduced the frequent pattern mining and measured the spatial interaction by the obtained association. A case study of three communities in Shanghai was carried out as verification of proposed method and demonstration of its practical application. The spatial interaction patterns and the representative features proved the rationality of the proposed framework.


2014 ◽  
pp. 225-259 ◽  
Author(s):  
David C. Anastasiu ◽  
Jeremy Iverson ◽  
Shaden Smith ◽  
George Karypis

2015 ◽  
Vol 2015 ◽  
pp. 1-18 ◽  
Author(s):  
Dawen Xia ◽  
Binfeng Wang ◽  
Yantao Li ◽  
Zhuobo Rong ◽  
Zili Zhang

Traffic subarea division is vital for traffic system management and traffic network analysis in intelligent transportation systems (ITSs). Since existing methods may not be suitable for big traffic data processing, this paper presents a MapReduce-based Parallel Three-PhaseK-Means (Par3PKM) algorithm for solving traffic subarea division problem on a widely adopted Hadoop distributed computing platform. Specifically, we first modify the distance metric and initialization strategy ofK-Means and then employ a MapReduce paradigm to redesign the optimizedK-Means algorithm for parallel clustering of large-scale taxi trajectories. Moreover, we propose a boundary identifying method to connect the borders of clustering results for each cluster. Finally, we divide traffic subarea of Beijing based on real-world trajectory data sets generated by 12,000 taxis in a period of one month using the proposed approach. Experimental evaluation results indicate that when compared withK-Means, Par2PK-Means, and ParCLARA, Par3PKM achieves higher efficiency, more accuracy, and better scalability and can effectively divide traffic subarea with big taxi trajectory data.


2013 ◽  
pp. 788-789 ◽  
Author(s):  
Jesús Aguilar-Ruiz ◽  
Domingo Rodríguez -Baena ◽  
Ronnie Alves

Sign in / Sign up

Export Citation Format

Share Document