Research on Taxi Travel Time Prediction Based on GBDT Machine Learning Method

Author(s):  
Liqiang Huang ◽  
Linying Xu
Logistics ◽  
2019 ◽  
Vol 4 (1) ◽  
pp. 1
Author(s):  
Nikolaos Servos ◽  
Xiaodi Liu ◽  
Michael Teucke ◽  
Michael Freitag

Accurate travel time prediction is of high value for freight transports, as it allows supply chain participants to increase their logistics quality and efficiency. It requires both sufficient input data, which can be generated, e.g., by mobile sensors, and adequate prediction methods. Machine Learning (ML) algorithms are well suited to modeling the non-linear and complex relationships in the collected tracking data. Despite that, only a minority of recent publications use ML for travel time prediction in multimodal transports. We apply the ML algorithms extremely randomized trees (ExtraTrees), adaptive boosting (AdaBoost), and support vector regression (SVR) to this problem because of their ability to deal with low data volumes and their low processing times. Using different combinations of features derived from the data, we have built several models for travel time prediction. Tracking data from a real-world multimodal container transport relation from Germany to the USA are used to evaluate the established models. We show that SVR provides the best prediction accuracy, with a mean absolute error of 17 h for a transport time of up to 30 days. We also show that our model performs better than average-based approaches.
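
The comparison described in this abstract, a mean absolute error (MAE) set against an average-based approach, can be sketched as follows. This is a minimal illustration only, not the authors' evaluation code: the function names and every number below are hypothetical, and the "average-based approach" is assumed here to mean predicting the mean of past travel times.

```python
from statistics import mean


def mean_absolute_error(actual, predicted):
    """Average absolute difference between observed and predicted times."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def average_baseline(train_times, n_predictions):
    """Assumed baseline: always predict the mean of the training times."""
    return [mean(train_times)] * n_predictions


# Hypothetical transport times in hours (not taken from the paper).
train_hours = [500.0, 540.0, 520.0]
test_hours = [510.0, 530.0]
# Stand-in for the output of any trained regressor, such as an SVR model.
model_hours = [505.0, 525.0]

baseline_mae = mean_absolute_error(
    test_hours, average_baseline(train_hours, len(test_hours))
)
model_mae = mean_absolute_error(test_hours, model_hours)
print(f"baseline MAE: {baseline_mae} h, model MAE: {model_mae} h")
```

A model is only worth deploying over the baseline if its MAE on held-out transports is clearly lower.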


2021 ◽  
Vol 13 (13) ◽  
pp. 7454
Author(s):  
Bo Qiu ◽  
Wei (David) Fan

Due to the increasing traffic volume in metropolitan areas, short-term travel time prediction (TTP) can be an important and useful tool for both travelers and traffic management. Accurate and reliable short-term travel time prediction can greatly help vehicle routing and congestion mitigation. One of the most challenging tasks in TTP is developing and selecting the most appropriate prediction algorithm for the available data. In this study, the travel time data were collected from the Regional Integrated Transportation Information System (RITIS). Travel times were then predicted for short horizons (ranging from 15 to 60 min) on the selected freeway corridors by applying four machine learning algorithms: Decision Trees (DT), Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Long Short-Term Memory neural network (LSTM). Many spatial and temporal characteristics that may affect travel time were used when developing the models. The prediction accuracy and reliability of the models are compared. Numerical results suggest that RF achieves better prediction performance than any of the other methods, in terms of both accuracy and stability.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Bo Qiu ◽  
Wei Fan

Purpose: Metropolitan areas suffer from frequent road traffic congestion not only during peak hours but also during off-peak periods. Different machine learning methods have been used in travel time prediction; however, in practice such methods face the problem of overfitting. Tree-based ensembles have been applied in various prediction fields, and such approaches usually produce high prediction accuracy by aggregating and averaging individual decision trees. These approaches not only yield better prediction results but also offer a good bias-variance trade-off, which helps to avoid overfitting. However, the application of tree-based ensemble algorithms in traffic prediction is still limited. This study aims to improve the accuracy and interpretability of the models by using random forest (RF) to analyze and model the travel time on freeways.

Design/methodology/approach: As traffic conditions often change greatly, prediction results are often unsatisfactory. To improve the accuracy of short-term travel time prediction in the freeway network, a practically feasible and computationally efficient RF prediction method for real-world freeways was developed using probe traffic data. In addition, the variables' relative importance was ranked, which provides an investigation platform to gain a better understanding of how different contributing factors might affect travel time on freeways.

Findings: The parameters of the RF model were estimated using the training sample set. After the parameter tuning process was completed, the proposed RF model was developed. The features' relative importance showed that the variables travel time 15 min before and time of day (TOD) contribute the most to the predicted travel time. The model performance was also evaluated and compared against the extreme gradient boosting method, and the results indicated that RF always produces more accurate travel time predictions.

Originality/value: This research developed an RF method to predict freeway travel time using probe vehicle-based traffic data and weather data. Detailed information about the input variables and data pre-processing was presented. To measure the effectiveness of the proposed travel time prediction algorithms, the mean absolute percentage errors were computed for different observation segments combined with different prediction horizons ranging from 15 to 60 min.
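
The mean absolute percentage error (MAPE) used as the evaluation measure above can be computed as in the following sketch. It uses the standard definition of MAPE; the function name and the sample travel times are hypothetical and are not taken from the study.

```python
def mean_absolute_percentage_error(actual, predicted):
    """MAPE in percent: mean of |actual - predicted| / |actual|, times 100.

    Observed values must be non-zero, which holds for travel times.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an observed value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical observed and predicted travel times in minutes
# for one segment and one prediction horizon.
observed_min = [10.0, 20.0]
predicted_min = [11.0, 18.0]
segment_mape = mean_absolute_percentage_error(observed_min, predicted_min)
print(f"MAPE: {segment_mape:.1f}%")
```

In a setup like the one described, this function would be called once per combination of observation segment and prediction horizon.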


2020 ◽  
Author(s):  
Homa Taghipour ◽  
Amir Bahador Parsa ◽  
Abolfazl Mohammadian

Access to accurate travel times is of great importance for both highway network users and traffic engineers. The travel time currently reported on several highways is estimated with naïve methods and limited sources of data. This results in unreliable and inaccurate travel time predictions and can impose delays on travelers. Therefore, the main objective of this study is the short-term prediction of highway travel time using multiple data sources, including loop detectors, probe vehicles, weather conditions, network, accidents, road works, and special events, in order to consider the effect of different factors on travel time. To this end, two machine learning methods, K-Nearest Neighbors and Random Forest, are employed. After the datasets are cleaned and combined, the models are trained to predict the short-term harmonic average speed, used as a representative of travel time, at 5-minute prediction horizons up to one hour ahead, and their results are compared. The travel time is calculated as the ratio of the length of each link to the harmonic average speed of all reporting vehicles. Hence, for each technique a model is trained to predict travel time 5 minutes ahead, 10 minutes ahead, and so on up to 60 minutes ahead. The results confirm the satisfactory performance of both models in short-term travel time prediction, with the Random Forest model performing slightly better. A feature importance and sensitivity analysis was also applied to the Random Forest model, and traffic variables were found to be the most influential in predicting travel time.
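
The travel time calculation described in this abstract, link length divided by the harmonic average speed of the reporting vehicles, can be sketched as follows. The function name, the units, and the sample values are assumptions made for illustration; the abstract does not specify them.

```python
from statistics import harmonic_mean


def link_travel_time_minutes(link_length_km, vehicle_speeds_kmh):
    """Travel time over one link from the harmonic mean of vehicle speeds.

    The harmonic mean is the appropriate average here because every
    vehicle covers the same distance, so travel times (not speeds) add up.
    """
    if not vehicle_speeds_kmh:
        raise ValueError("at least one reported speed is required")
    if any(speed <= 0 for speed in vehicle_speeds_kmh):
        raise ValueError("speeds must be positive")
    mean_speed_kmh = harmonic_mean(vehicle_speeds_kmh)
    return link_length_km / mean_speed_kmh * 60.0


# Hypothetical link of 2 km with two reporting vehicles.
minutes = link_travel_time_minutes(2.0, [40.0, 60.0])
print(f"estimated link travel time: {minutes:.2f} min")
```

Note that the arithmetic mean of the two speeds would give 50 km/h and understate the travel time, whereas the harmonic mean gives 48 km/h.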


2020 ◽  
Vol 2020 ◽  
pp. 1-15
Author(s):  
Xu Miao ◽  
Bing Wu ◽  
Yajie Zou ◽  
Lingtao Wu

Freeway travel time prediction is a key technology of Intelligent Transportation Systems (ITS). Many scholars have found that periodic functions play a positive role in improving the accuracy of travel time prediction models. However, very few studies have comprehensively evaluated the impacts of different periodic functions on statistical and machine learning models. In this paper, our primary objective is to evaluate the performance of six commonly used multistep ahead travel time prediction models (three statistical models and three machine learning models). In addition, we compare the impacts of three periodic functions on multistep ahead travel time prediction for different temporal scales (5-minute, 10-minute, and 15-minute). The results indicate that the periodic functions improve the performance of machine learning models for predictions more than 60 minutes ahead and improve the accuracy of statistical models for predictions more than 30 minutes ahead. The three periodic functions differ only slightly in how much they improve the prediction accuracy of the six models. For the same prediction step, the effect of the periodic function is more pronounced at a higher level of aggregation.

