Applying Machine Learning and Statistical Approaches for Travel Time Estimation in Partial Network Coverage

2019 ◽  
Vol 11 (14) ◽  
pp. 3822
Author(s):  
Fahad Alrukaibi ◽  
Rushdi Alsaleh ◽  
Tarek Sayed

The objective of this study is to estimate real-time travel times on urban networks that are partially covered by moving sensors. The study proposes two machine learning approaches, the random forest (RF) model and the multi-layer feed-forward neural network (MFFN), to estimate travel times on such partially covered networks. An MFFN with three hidden layers was developed and trained using the back-propagation learning algorithm, and the neural weights were optimized using the Levenberg–Marquardt optimization technique. A case study of an urban network with 100 links is considered. The performance of the proposed models was compared to a statistical model that uses the empirical Bayes (EB) method and the spatial correlation between travel times. The models’ performance was evaluated using data generated from a VISSIM microsimulation model. Results show that the machine learning algorithms (RF and MFFN) achieve average improvements of about 4.1% and 2.9%, respectively, compared with the statistical approach. The RF, MFFN, and statistical models correctly predict real-time travel times with estimation accuracies reaching 90.7%, 89.5%, and 86.6%, respectively. Moreover, results show that at low moving-sensor penetration rates, the RF and MFFN achieve higher estimation accuracy than the statistical approach. At a probe penetration rate of 1%, the RF, MFFN, and statistical models correctly predict real-time travel times with estimation accuracies of 85.6%, 84.4%, and 80.9%, respectively. Furthermore, the study investigated the impact of the probe penetration rate on real-time neighbor-link coverage. Results show that at probe penetration rates of 1%, 3%, and 5%, the models cover the estimation of real-time travel times on 73.8%, 94.8%, and 97.2% of the estimation intervals, respectively.
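The statistical baseline described above blends sparse probe observations on a link with a prior built from spatially correlated neighbor links. A minimal sketch of that empirical Bayes shrinkage idea in Python (the function name, data, and prior values are illustrative, not the authors' implementation):

```python
from statistics import mean, variance

def eb_travel_time(probe_obs, prior_mean, prior_var):
    """Empirical-Bayes shrinkage: blend a link's own probe
    observations with a prior derived from neighboring links.

    With few probes the estimate leans on the prior; as probe
    counts grow, the observed mean dominates."""
    n = len(probe_obs)
    if n == 0:
        return prior_mean  # no coverage: fall back to neighbors
    obs_mean = mean(probe_obs)
    # sampling variance of the observed mean
    obs_var = (variance(probe_obs) if n > 1 else prior_var) / n
    w = prior_var / (prior_var + obs_var)  # weight on the data
    return w * obs_mean + (1 - w) * prior_mean

# One probe vehicle reports 62 s; neighbor links suggest ~50 s.
sparse = eb_travel_time([62.0], prior_mean=50.0, prior_var=25.0)
# Ten probes near 62 s pull the estimate strongly toward the data.
dense = eb_travel_time([62.0] * 9 + [60.0], prior_mean=50.0, prior_var=25.0)
```

At a 1% penetration rate most links see only a handful of probes per interval, which is exactly the regime where this weighting matters.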

Author(s):  
Nicholas Westing ◽  
Brett Borghetti ◽  
Kevin Gross

The increasing spatial and spectral resolution of hyperspectral imagers yields detailed spectroscopy measurements from both space-based and airborne platforms. Machine learning algorithms have achieved state-of-the-art material classification performance on benchmark hyperspectral data sets; however, these techniques often do not consider the varying atmospheric conditions experienced in real-world detection scenarios. To reduce the impact of atmospheric effects on the at-sensor signal, atmospheric compensation must be performed. Radiative transfer (RT) modeling can generate high-fidelity atmospheric estimates at detailed spectral resolutions, but it is often too time-consuming for real-time detection scenarios. This research uses machine learning methods to perform dimension reduction on the transmittance, upwelling radiance, and downwelling radiance (TUD) data to create high-accuracy atmospheric estimates at lower computational cost than RT modeling. The utility of this approach is investigated using the instrument line shape of the Mako long-wave infrared hyperspectral sensor. The study employs physics-based metrics and loss functions to identify promising dimension reduction techniques. As a result, TUD vectors can be produced in real time, allowing atmospheric compensation across diverse remote sensing scenarios.
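The abstract does not specify which dimension reduction techniques were evaluated, but the core idea of compressing high-dimensional TUD vectors into a few coefficients can be sketched with linear PCA via an SVD. The spectral sizes and data below are synthetic stand-ins, not the Mako sensor's actual line shape or band count:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for TUD vectors: 200 atmospheric states,
# each a 128-band spectrum that varies along only a few latent
# directions (as smooth transmittance/radiance curves tend to).
latent = rng.normal(size=(200, 4))
basis = rng.normal(size=(4, 128))
tud = latent @ basis + 0.01 * rng.normal(size=(200, 128))

# PCA via SVD of the mean-centered data.
mu = tud.mean(axis=0)
U, s, Vt = np.linalg.svd(tud - mu, full_matrices=False)

k = 4  # retain the top-k principal components
codes = (tud - mu) @ Vt[:k].T   # low-dimensional encoding
recon = codes @ Vt[:k] + mu     # fast approximate TUD vector

rel_err = np.linalg.norm(recon - tud) / np.linalg.norm(tud)
```

Producing a TUD estimate then costs one small matrix multiply instead of a full RT run, which is the speedup the study is after.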


T-Comm ◽  
2021 ◽  
Vol 15 (9) ◽  
pp. 24-35
Author(s):  
Irina A. Krasnova

The paper analyzes the impact of Machine Learning algorithm parameter settings on the results of real-time traffic classification. The Random Forest and XGBoost algorithms are considered. A brief description of both methods and of the metrics used to evaluate classification results is given. Experimental studies are conducted on a database collected from a real network, separately for TCP and UDP flows. So that the results can be used in real time, a special feature matrix is built from the first 15 packets of each flow. The main Random Forest (RF) parameters to configure are the number of trees, the partition criterion, the maximum number of features for constructing the partition function, the tree depth, and the minimum number of samples in a node and in a leaf. For XGBoost, the tuned parameters are the number of trees, the tree depth, the minimum number of samples in a leaf, the fraction of features, and the percentage of samples used to build each tree. Increasing the number of trees raises accuracy up to a point, but, as shown in the article, it is important to ensure the model is not overfitted. The remaining tree parameters are used to combat overfitting. On the data set under study, eliminating overfitting made it possible to increase classification accuracy for individual applications by 11-12% for Random Forest and by 12-19% for XGBoost. The results show that parameter tuning is a very important step in building a traffic classification model, because it helps to combat overfitting and significantly increases the accuracy of the algorithm’s predictions. In addition, it was shown that, when properly configured, XGBoost, which is not very popular in traffic classification work, becomes a competitive algorithm and shows better results than the widespread Random Forest.
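The tuning process described above amounts to sweeping a grid over the listed parameters and cross-validating each combination. A stdlib-only sketch of enumerating such a grid (the parameter names mirror common scikit-learn/XGBoost conventions; the value ranges are invented for illustration, not taken from the paper):

```python
from itertools import product

# Illustrative search space mirroring the RF parameters the paper
# tunes: forest size, split criterion, depth, and leaf-size
# constraints that act as the main guards against overfitting.
rf_grid = {
    "n_estimators": [50, 100, 200],
    "criterion": ["gini", "entropy"],
    "max_depth": [5, 10, None],       # None = grow until pure
    "min_samples_leaf": [1, 5, 10],
}

def expand(grid):
    """Yield every parameter combination as a dict."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

configs = list(expand(rf_grid))
# 3 * 2 * 3 * 3 = 54 candidate models to cross-validate
```

Each resulting dict would be passed to the model constructor and scored on held-out flows; the depth and leaf-size entries are the ones that rein in overfitting as the tree count grows.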


2020 ◽  
Vol 39 (5) ◽  
pp. 6579-6590
Author(s):  
Sandy Çağlıyor ◽  
Başar Öztayşi ◽  
Selime Sezgin

The motion picture industry is one of the largest industries worldwide and has significant importance in the global economy. Considering the high stakes and high risks in the industry, forecast models and decision support systems are gaining importance. Several attempts have been made to estimate the theatrical performance of a movie before, or at the early stages of, its release. Nevertheless, these models are mostly used for predicting domestic performance, and the industry still struggles to predict box office performance in overseas markets. The aim of this study is to design a forecast model using different machine learning algorithms to estimate the theatrical success of US movies in Turkey. A dataset of 1559 movies is constructed from various sources. First, the independent variables are grouped as pre-release, distributor type, and international distribution based on their characteristics. The number of attendances is discretized into three classes. Four popular machine learning algorithms (artificial neural networks, decision tree regression, gradient boosted trees, and random forests) are employed, and the impact of each variable group is assessed by comparing model performance. The number of target classes is then increased to five and eight, and the results are compared with models previously developed in the literature.
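Turning raw attendance counts into three (and later five or eight) classes is a discretization step; one common way to do it is quantile binning, so each class holds roughly the same number of movies. A stdlib sketch under that assumption (the abstract does not say which binning rule the authors used, and the attendance figures below are invented):

```python
def quantile_bins(values, n_classes):
    """Cut points splitting `values` into n_classes bins of
    (roughly) equal population."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // n_classes] for i in range(1, n_classes)]

def discretize(value, cuts):
    """Return the class index (0..len(cuts)) for `value`."""
    return sum(value >= c for c in cuts)

# Hypothetical attendance figures for nine releases.
attendance = [12_000, 85_000, 430_000, 9_500, 150_000,
              2_300_000, 64_000, 510_000, 31_000]
cuts = quantile_bins(attendance, 3)   # two cut points -> 3 classes
labels = [discretize(a, cuts) for a in attendance]
```

Raising `n_classes` to 5 or 8 reuses the same two functions, which is why the class-count comparison in the study is cheap to run.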


2019 ◽  
Vol 9 (6) ◽  
pp. 1154 ◽  
Author(s):  
Ganjar Alfian ◽  
Muhammad Syafrudin ◽  
Bohan Yoon ◽  
Jongtae Rhee

Radio frequency identification (RFID) is an automated identification technology that can be used to monitor product movements within a supply chain in real time. However, one problem that occurs during RFID data capture is false positives, i.e., tags that are accidentally detected by the reader but are not of interest to the business process. This paper investigates the use of machine learning algorithms to filter such false positives. Raw RFID data were collected for various tagged product movements, and statistical features were extracted from the received signal strength derived from the raw RFID data. Because abnormal RFID readings (outliers) may arise in real deployments, outlier detection models were used to remove them. The experimental results showed that machine learning-based models classified RFID readings with high accuracy, and that integrating outlier detection with the machine learning models further improved classification accuracy. We demonstrated that the proposed classification model can be applied to real-time monitoring, ensuring that false positives are filtered out and hence not stored in the database. The proposed model is expected to improve warehouse management systems by monitoring products delivered to other supply chain partners.
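The pipeline above has two steps that can be sketched concretely: extracting statistical features from a tag's received signal strength (RSSI) readings, and removing outlier readings first. The paper evaluates dedicated outlier detection models; the simple z-score filter below is a stand-in for that step, and the feature set and RSSI values are illustrative:

```python
from statistics import mean, pstdev

def rssi_features(readings):
    """Statistical features over a tag's RSSI readings, of the
    kind used to separate in-range products from stray tags."""
    return {
        "mean": mean(readings),
        "std": pstdev(readings),
        "min": min(readings),
        "max": max(readings),
        "count": len(readings),
    }

def drop_outliers(readings, z=2.0):
    """Remove readings more than `z` population standard
    deviations from the mean before feature extraction."""
    mu, sd = mean(readings), pstdev(readings)
    if sd == 0:
        return list(readings)
    return [r for r in readings if abs(r - mu) <= z * sd]

raw = [-62, -61, -63, -60, -62, -35, -61]  # -35 dBm: likely spurious
clean = drop_outliers(raw)
feats = rssi_features(clean)
```

The resulting feature dict is what a downstream classifier would consume; filtering the stray -35 dBm reading first keeps the mean and spread representative of the true tag position.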

