An Efficient MapReduce-Based Parallel Clustering Algorithm for Distributed Traffic Subarea Division

2015 ◽  
Vol 2015 ◽  
pp. 1-18 ◽  
Author(s):  
Dawen Xia ◽  
Binfeng Wang ◽  
Yantao Li ◽  
Zhuobo Rong ◽  
Zili Zhang

Traffic subarea division is vital for traffic system management and traffic network analysis in intelligent transportation systems (ITSs). Since existing methods may not be suitable for big traffic data processing, this paper presents a MapReduce-based Parallel Three-Phase K-Means (Par3PKM) algorithm for solving the traffic subarea division problem on the widely adopted Hadoop distributed computing platform. Specifically, we first modify the distance metric and initialization strategy of K-Means and then employ the MapReduce paradigm to redesign the optimized K-Means algorithm for parallel clustering of large-scale taxi trajectories. Moreover, we propose a boundary identifying method to connect the borders of clustering results for each cluster. Finally, we divide the traffic subareas of Beijing based on real-world trajectory data sets generated by 12,000 taxis over a period of one month using the proposed approach. Experimental evaluation results indicate that, compared with K-Means, Par2PK-Means, and ParCLARA, Par3PKM achieves higher efficiency, greater accuracy, and better scalability and can effectively divide traffic subareas with big taxi trajectory data.
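As a rough illustration of how a K-Means iteration can be split into map and reduce steps, the sketch below runs the two phases in a single Python process. It is not the Par3PKM implementation: it assumes plain Euclidean distance and caller-supplied initial centroids (the abstract's modified metric and initialization strategy are not specified here), and the function names `map_phase`, `reduce_phase`, and `kmeans_mapreduce` are hypothetical.

```python
from collections import defaultdict
from math import dist


def map_phase(points, centroids):
    """Mapper: emit (index of nearest centroid, point) for every point."""
    for p in points:
        idx = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        yield idx, p


def reduce_phase(pairs, centroids):
    """Reducer: average the points assigned to each centroid index.

    Centroids that received no points keep their previous position.
    """
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    updated = list(centroids)
    for idx, pts in groups.items():
        updated[idx] = tuple(sum(coord) / len(pts) for coord in zip(*pts))
    return updated


def kmeans_mapreduce(points, centroids, max_iter=20):
    """Repeat map and reduce until the centroids stop changing."""
    for _ in range(max_iter):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids


if __name__ == "__main__":
    # Made-up 2-D points standing in for trajectory coordinates.
    pts = [(0, 0), (0, 2), (10, 10), (10, 12)]
    print(kmeans_mapreduce(pts, [(0, 0), (10, 10)]))
```

On a real cluster the mapper output would be shuffled by key across machines; here a dictionary plays that role.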

PLoS ONE ◽  
2014 ◽  
Vol 9 (4) ◽  
pp. e91315 ◽  
Author(s):  
Minchao Wang ◽  
Wu Zhang ◽  
Wang Ding ◽  
Dongbo Dai ◽  
Huiran Zhang ◽  
...  

Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-16 ◽  
Author(s):  
Dawen Xia ◽  
Xiaonan Lu ◽  
Huaqing Li ◽  
Wendong Wang ◽  
Yantao Li ◽  
...  

Frequent pattern mining is an effective approach for spatiotemporal association analysis of mobile trajectory big data in data-driven intelligent transportation systems. While existing parallel algorithms have been successfully applied to frequent pattern mining of large-scale trajectory data, two major challenges remain: how to overcome the inherent defects of Hadoop in coping with taxi trajectory big data that includes massive numbers of small files, and how to discover the implicit spatiotemporal frequent patterns with MapReduce. To address these challenges, this paper presents a MapReduce-based Parallel Frequent Pattern growth (MR-PFP) algorithm to analyze the spatiotemporal characteristics of taxi operations using large-scale taxi trajectories with massive small file processing strategies on a Hadoop platform. More specifically, we first implement three methods, that is, Hadoop Archives (HAR), CombineFileInputFormat (CFIF), and Sequence Files (SF), to overcome the existing defects of Hadoop and then propose two strategies based on their performance evaluations. Next, we incorporate SF into the Frequent Pattern growth (FP-growth) algorithm and then implement the optimized FP-growth algorithm on a MapReduce framework. Finally, we analyze the characteristics of taxi operations in both spatial and temporal dimensions by MR-PFP in parallel. The results demonstrate that MR-PFP is superior to the existing Parallel FP-growth (PFP) algorithm in efficiency and scalability.
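To make the map/reduce view of frequent pattern mining concrete, the sketch below counts small itemsets with a mapper that emits `(itemset, 1)` pairs and a reducer that sums them. This is deliberately a brute-force counting pass, not FP-growth and not MR-PFP; the transaction contents, the `max_size` cap, and all function names are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def map_transaction(transaction, max_size=2):
    """Mapper: emit (itemset, 1) for every itemset of up to max_size items."""
    items = sorted(set(transaction))
    for size in range(1, max_size + 1):
        for itemset in combinations(items, size):
            yield itemset, 1


def reduce_counts(pairs):
    """Reducer: sum the counts emitted for each itemset."""
    counts = Counter()
    for itemset, n in pairs:
        counts[itemset] += n
    return counts


def frequent_itemsets(transactions, min_support, max_size=2):
    """Keep only itemsets that occur in at least min_support transactions."""
    pairs = (kv for t in transactions for kv in map_transaction(t, max_size))
    counts = reduce_counts(pairs)
    return {k: v for k, v in counts.items() if v >= min_support}


if __name__ == "__main__":
    # Hypothetical (place, time-slot) records for three taxi trips.
    trips = [
        ["airport", "morning"],
        ["airport", "morning"],
        ["station", "evening"],
    ]
    print(frequent_itemsets(trips, min_support=2))
```

Enumerating every candidate itemset grows combinatorially with transaction length, which is the cost that tree-based methods such as FP-growth are designed to avoid.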


2021 ◽  
Vol 13 (4) ◽  
pp. 544
Author(s):  
Guohao Zhang ◽  
Bing Xu ◽  
Hoi-Fung Ng ◽  
Li-Ta Hsu

Accurate localization of road agents (GNSS receivers) is the basis of intelligent transportation systems, but it is still difficult to achieve with GNSS positioning in urban areas due to signal interference from buildings. Various collaborative positioning techniques have recently been developed to improve positioning performance with the aid of neighboring agents. However, it is still challenging to study their performance comprehensively. The GNSS measurement error behavior is complicated in urban areas and cannot be represented by naive models. On the other hand, real experiments requiring large numbers of devices are difficult to conduct, especially for a large-scale test. Therefore, a realistic GNSS urban measurement simulator is developed to provide measurements for collaborative positioning studies. The proposed simulator employs a ray-tracing technique to search for all possible interferences in the urban area. Then, it categorizes them into direct, reflected, diffracted, and multipath signals to simulate the pseudorange, C/N0, and Doppler shift measurements correspondingly. The performance of the proposed simulator is validated through comparisons with real experiments in different scenarios based on commercial-grade receivers. The proposed simulator is also applied with different positioning algorithms, which verifies that it is sophisticated enough for collaborative positioning studies in urban areas.


Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-16 ◽  
Author(s):  
Yiwen Zhang ◽  
Yuanyuan Zhou ◽  
Xing Guo ◽  
Jintao Wu ◽  
Qiang He ◽  
...  

The K-means algorithm is one of the ten classic algorithms in the area of data mining and has been studied by researchers in numerous fields for a long time. However, the value of the clustering number k in the K-means algorithm is not always easy to determine, and the selection of the initial centers is vulnerable to outliers. This paper proposes an improved K-means clustering algorithm called the covering K-means algorithm (C-K-means). The C-K-means algorithm can not only acquire efficient and accurate clustering results but also self-adaptively provide a reasonable number of clusters based on the data features. It includes two phases: the initialization of the covering algorithm (CA) and the Lloyd iteration of the K-means. The first phase executes the CA. CA self-organizes and recognizes the number of clusters k based on the similarities in the data, and it requires neither the number of clusters to be prespecified nor the initial centers to be manually selected. Therefore, it has a “blind” feature, that is, k is not preselected. The second phase performs the Lloyd iteration based on the results of the first phase. The C-K-means algorithm combines the advantages of CA and K-means. Experiments are carried out on the Spark platform, and the results verify the good scalability of the C-K-means algorithm. This algorithm can effectively solve the problem of large-scale data clustering. Extensive experiments on real data sets show that the accuracy and efficiency of the C-K-means algorithm outperform those of the existing algorithms under both sequential and parallel conditions.
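The idea of letting a covering step choose k before any Lloyd iteration can be sketched as follows. The abstract does not spell out the covering algorithm, so this uses a simple greedy radius-based cover as a stand-in: the `radius` parameter, the choice of the first uncovered point as seed, and the function name `covering_init` are all assumptions, not the authors' CA.

```python
from math import dist


def covering_init(points, radius):
    """Greedy cover: returns one center per cover, so k = len(result).

    Each pass takes the first uncovered point as a seed, covers every
    uncovered point within `radius` of it, and records the mean of the
    covered points as an initial center.
    """
    uncovered = list(points)
    centers = []
    while uncovered:
        seed = uncovered[0]
        covered = [p for p in uncovered if dist(p, seed) <= radius]
        centers.append(tuple(sum(c) / len(covered) for c in zip(*covered)))
        uncovered = [p for p in uncovered if dist(p, seed) > radius]
    return centers


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (10, 0), (11, 0)]
    centers = covering_init(pts, radius=2)
    print(len(centers), centers)  # k is discovered, not preselected
```

The returned centers would then seed an ordinary Lloyd iteration; note that in this stand-in the radius implicitly controls k, so it shifts the tuning burden rather than removing it.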


Sensors ◽  
2019 ◽  
Vol 19 (10) ◽  
pp. 2229 ◽  
Author(s):  
Sen Zhang ◽  
Yong Yao ◽  
Jie Hu ◽  
Yong Zhao ◽  
Shaobo Li ◽  
...  

Traffic congestion prediction is critical for implementing intelligent transportation systems that improve the efficiency and capacity of transportation networks. However, despite its importance, traffic congestion prediction is far less investigated than traffic flow prediction, which is partially due to the severe lack of large-scale high-quality traffic congestion data and advanced algorithms. This paper proposes an accessible and general workflow to acquire large-scale traffic congestion data and to create traffic congestion datasets based on image analysis. With this workflow we create a dataset named Seattle Area Traffic Congestion Status (SATCS) based on traffic congestion map snapshots from a publicly available online traffic service provider, the Washington State Department of Transportation. We then propose a deep autoencoder-based neural network model with symmetrical layers for the encoder and the decoder to learn temporal correlations of a transportation network and predict traffic congestion. Our experimental results on the SATCS dataset show that the proposed DCPN model can efficiently and effectively learn temporal relationships of congestion levels of the transportation network for traffic congestion forecasting. Our method outperforms two other state-of-the-art neural network models in prediction performance, generalization capability, and computation efficiency.


Sensors ◽  
2020 ◽  
Vol 20 (14) ◽  
pp. 3928 ◽  
Author(s):  
Rateb Jabbar ◽  
Mohamed Kharbeche ◽  
Khalifa Al-Khalifa ◽  
Moez Krichen ◽  
Kamel Barkaoui

The concept of smart cities has become prominent in modern metropolises due to the emergence of embedded and connected smart devices, systems, and technologies. They have enabled the connection of every “thing” to the Internet. Therefore, in the upcoming era of the Internet of Things, the Internet of Vehicles (IoV) will play a crucial role in newly developed smart cities. The IoV has the potential to solve various traffic and road safety problems effectively in order to prevent fatal crashes. However, a particular challenge in the IoV, especially in Vehicle-to-Vehicle (V2V) and Vehicle-to-Infrastructure (V2I) communications, is to ensure fast, secure transmission and accurate recording of the data. To overcome these challenges, this work adapts Blockchain technology for real time application (RTA) to solve Vehicle-to-Everything (V2X) communications problems. The main novelty of this paper is therefore the development of a Blockchain-based IoT system that establishes secure communication and creates an entirely decentralized cloud computing platform. Moreover, the authors qualitatively tested the performance and resilience of the proposed system against common security attacks. Computational tests showed that the proposed solution solved the main challenges of V2X communications such as security, centralization, and lack of privacy. In addition, it guaranteed an easy data exchange between different actors of intelligent transportation systems.


2011 ◽  
Vol 34 (7) ◽  
pp. 850-861 ◽  
Author(s):  
Guan Yuan ◽  
Shixiong Xia ◽  
Lei Zhang ◽  
Yong Zhou ◽  
Cheng Ji

With the development of location-based services, such as the Global Positioning System and Radio Frequency Identification, a great deal of trajectory data can be collected. Therefore, how to mine knowledge from these data has become an attractive topic. In this paper, we propose an efficient trajectory-clustering algorithm based on an index tree. Firstly, an index tree is proposed to store trajectories and their similarity matrix, with which trajectories can be retrieved efficiently; secondly, a new concept of trajectory structure is introduced to analyse both the internal and external features of trajectories; then, trajectories are partitioned into trajectory segments according to their corners; furthermore, the similarity between every pair of trajectory segments is computed with a structural similarity function; finally, trajectory segments are grouped into different clusters according to their location in the different levels of the index tree. Experimental results on real data sets demonstrate not only the efficiency and effectiveness of our algorithm, but also its flexibility: feature sensitivity can be adjusted through different parameters, and the cluster results are more practically significant.
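The step of partitioning trajectories at their corners can be sketched as below. The abstract does not define a corner, so this assumes a corner is any interior point where the heading changes by at least a threshold angle; the 45-degree default and the function names are hypothetical, and the index tree and structural similarity function are not reproduced.

```python
from math import atan2, degrees


def turning_angle(a, b, c):
    """Absolute change in heading (0 to 180 degrees) at point b on a -> b -> c."""
    h1 = atan2(b[1] - a[1], b[0] - a[0])
    h2 = atan2(c[1] - b[1], c[0] - b[0])
    diff = abs(degrees(h2 - h1)) % 360
    return min(diff, 360 - diff)


def partition_at_corners(trajectory, angle_threshold=45.0):
    """Split a list of (x, y) points into segments that share corner points."""
    if len(trajectory) < 2:
        return [list(trajectory)]
    segments, current = [], [trajectory[0]]
    for i in range(1, len(trajectory) - 1):
        current.append(trajectory[i])
        angle = turning_angle(trajectory[i - 1], trajectory[i], trajectory[i + 1])
        if angle >= angle_threshold:
            segments.append(current)
            current = [trajectory[i]]  # the corner also starts the next segment
    current.append(trajectory[-1])
    segments.append(current)
    return segments


if __name__ == "__main__":
    # An L-shaped path with a single right-angle turn at (2, 0).
    path = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
    print(partition_at_corners(path))
```

Coordinates are treated as planar here; raw latitude and longitude would first need projecting for the angles to be meaningful.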


2018 ◽  
Vol 2018 ◽  
pp. 1-15 ◽  
Author(s):  
Xiang Ji ◽  
Huiqun Yu ◽  
Guisheng Fan ◽  
Huaiying Sun ◽  
Liqiong Chen

Vehicular ad hoc network (VANET) is an emerging technology for future intelligent transportation systems (ITSs). Current research focuses intensely on the problems of routing protocol reliability and scalability across urban VANETs. Vehicle clustering has been shown to be a promising approach to improving routing reliability and scalability by grouping vehicles together to serve as the foundation for ITS applications. However, some prominent characteristics, like high mobility and uneven spatial distribution of vehicles, may affect the clustering performance. Therefore, how to establish and maintain stable clusters has become a challenging problem in VANETs. This paper proposes a link reliability-based clustering algorithm (LRCA) to provide efficient and reliable data transmission in VANETs. Before clustering, a novel link lifetime-based (LLT-based) neighbor sampling strategy is put forward to filter out redundant unstable neighbors. The proposed clustering scheme mainly consists of three parts: cluster head selection, cluster formation, and cluster maintenance. Furthermore, we propose a routing protocol of LRCA to serve the infotainment applications in VANET. To make appropriate routing decisions, we nominate special nodes at intersections to evaluate the network condition by assigning weights to the road segments. Routes with the lowest weights are then selected as the optimal data forwarding paths. We evaluate the clustering stability and routing performance of the proposed approach by comparing it with some existing schemes. The extensive simulation results show that our approach outperforms them in both cluster stability and data transmission.
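Selecting the route with the lowest total weight over weighted road segments is a shortest-path problem, which the sketch below solves with Dijkstra's algorithm. This is a generic illustration rather than the LRCA routing protocol: the graph, the intersection names, and the weights are invented, and how the paper actually assigns weights to road segments is not reproduced.

```python
import heapq


def lowest_weight_route(graph, source, target):
    """Dijkstra's algorithm over {node: {neighbor: weight}} with weights >= 0.

    Returns (path, total_weight), or (None, inf) if target is unreachable.
    """
    best = {source: 0.0}
    prev = {}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(queue, (new_cost, nxt))
    if target not in best:
        return None, float("inf")
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], best[target]


if __name__ == "__main__":
    # Hypothetical intersections A-D; each weight scores one road segment.
    roads = {
        "A": {"B": 1.0, "C": 4.0},
        "B": {"C": 1.0, "D": 5.0},
        "C": {"D": 1.0},
        "D": {},
    }
    print(lowest_weight_route(roads, "A", "D"))
```

In a vehicular setting the weights would change as traffic moves, so a route computed this way is only valid for the snapshot of weights it was given.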


2009 ◽  
Vol 4 (10) ◽  
Author(s):  
Jianfeng Yang ◽  
Puliu Yan ◽  
Yinbo Xie ◽  
Qing Geng ◽  
Jolly Wang ◽  
...  
