An Exploration Study for Augmented and Virtual Reality Enhancing Situation Awareness for Plant Teleanalysis

Author(s):  
Doris Aschenbrenner ◽  
Nicolas Maltry ◽  
Klaus Schilling ◽  
Jouke Verlinden

This work investigates which visualization method best supports remote teleanalysis of industrial plants with respect to comprehension, usability, and situation awareness. The application goal is the remote optimization of an industrial plant, and the examined scenario was generated from a large data set of a real production entity. The plant consists of an industrial manipulator, a molding machine, and an assembly system. Prior studies on the same plant, in which remote experts explored a video-based visualization, showed a large potential for optimization but indicated a higher demand for situation awareness. To test the influence of the visualization method, a user study was carried out with 60 student participants and six different visualization methods, including various VR and AR implementations. Overall, the AR environment performed significantly better than the VR and video implementations, but the VR implementation surpassed AR regarding situation awareness.

2019 ◽  
Vol 8 (2S11) ◽  
pp. 3523-3526

This paper describes an efficient algorithm for classification in large data sets. While many classification algorithms exist, they are not suitable for larger volumes and varied data sets. Various ELM (Extreme Learning Machine) algorithms are available in the literature for working with large data sets. However, the existing algorithms use a fixed activation function, which may lead to deficiencies when working with large data. In this paper, we propose a novel ELM that employs a sigmoid activation function. The experimental evaluations demonstrate that our ELM-S algorithm performs better than ELM, SVM, and other state-of-the-art algorithms on large data sets.
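As a sketch of the general technique, a basic ELM fixes random input weights, passes inputs through a sigmoid hidden layer, and solves the output weights in closed form. This is a generic illustration, not the paper's ELM-S; the hidden-layer size and toy XOR data are arbitrary choices.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def elm_train(X, y, n_hidden=20, seed=0):
    """Basic ELM: random (fixed) input weights and biases, sigmoid hidden
    layer, output weights solved by least squares via the pseudoinverse."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)                # random hidden biases, never trained
    H = sigmoid(X @ W + b)                       # hidden-layer activation matrix
    beta = np.linalg.pinv(H) @ y                 # closed-form output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return sigmoid(X @ W + b) @ beta

# Toy binary problem (XOR): classify by the sign of the network output.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([-1., 1., 1., -1.])
W, b, beta = elm_train(X, y)
pred = np.sign(elm_predict(X, W, b, beta))
```

The appeal of the ELM family is that only `beta` is fitted, so training reduces to one linear solve even on large data.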


Author(s):  
Brian Hoeschen ◽  
Darcy Bullock ◽  
Mark Schlappi

Historically, stopped delay was used to characterize the operation of intersection movements because it was relatively easy to measure. During the past decade, the traffic engineering community has moved away from using stopped delay and now uses control delay. That measurement is more precise but quite difficult to extract from large data sets if strict definitions are used to derive the data. This paper evaluates two procedures for estimating control delay. The first is based on a historical approximation that control delay is 30% larger than stopped delay. The second is new and based on segment delay. The procedures are applied to a diverse data set collected in Phoenix, Arizona, and compared with control delay calculated by using the formal definition. The new approximation was observed to be better than the historical stopped delay procedure; it provided an accurate prediction of control delay. Because it is an approximation, this methodology would be most appropriately applied to large data sets collected from travel time studies for ranking and prioritizing intersections for further analysis.
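The two estimation procedures can be sketched as simple functions. The 1.3 factor is the historical approximation named in the abstract; the segment-delay form below is a hypothetical illustration (excess of measured segment travel time over free-flow time), since the paper's exact formulation is not reproduced here.

```python
def control_delay_from_stopped(stopped_delay_s: float) -> float:
    """Historical approximation: control delay is ~30% larger than stopped delay."""
    return 1.3 * stopped_delay_s

def control_delay_from_segment(travel_time_s: float, free_flow_time_s: float) -> float:
    """Hypothetical segment-delay formulation: delay is the excess of measured
    segment travel time over the unimpeded (free-flow) travel time."""
    return max(0.0, travel_time_s - free_flow_time_s)

# e.g. 20 s of measured stopped delay, or a 95 s segment run vs. 60 s free flow
d1 = control_delay_from_stopped(20.0)   # ≈ 26 s
d2 = control_delay_from_segment(95.0, 60.0)  # 35 s
```

Both are approximations intended for screening large travel-time data sets, not for replacing control delay measured under its formal definition.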


Author(s):  
V. Jinubala ◽  
P. Jeyakumar

Data mining is an emerging research field in the analysis of agricultural data. The most important problem in extracting knowledge from agricultural data is missing values for attributes in the selected data set. Such deficiencies must be cleaned during preprocessing in order to obtain a functional data set. The main objective of this paper is to analyse the effectiveness of various imputation methods in producing a complete data set that is more useful for applying data mining techniques, and to present a comparative analysis of imputation methods for handling missing values. The pest data set of the rice crop, collected throughout Maharashtra state under the Crop Pest Surveillance and Advisory Project (CROPSAP) during 2009-2013, was used for the analysis. Methodologies including deletion of rows, mean and median imputation, linear regression, and predictive mean matching were analysed for the imputation of missing values. The comparative analysis shows that predictive mean matching was better than the other methods and is effective for imputing missing values in a large data set.
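The simpler strategies compared above can be sketched in a few lines. This shows row deletion and mean/median imputation only; linear-regression imputation and predictive mean matching need a fitted model and are omitted for brevity. The column name is hypothetical.

```python
import statistics

def impute(rows, col, method="mean"):
    """Fill missing (None) values in `col` with the column mean or median;
    method='delete' instead drops rows where the value is missing."""
    observed = [r[col] for r in rows if r[col] is not None]
    if method == "delete":
        return [r for r in rows if r[col] is not None]
    fill = statistics.mean(observed) if method == "mean" else statistics.median(observed)
    return [dict(r, **{col: r[col] if r[col] is not None else fill}) for r in rows]

# Hypothetical pest-survey records with one missing observation.
pest = [{"larvae": 4.0}, {"larvae": None}, {"larvae": 6.0}, {"larvae": 2.0}]
filled = impute(pest, "larvae", "mean")       # missing value replaced by 4.0
kept = impute(pest, "larvae", "delete")       # row with None dropped
```

Deletion discards information that the other rows' attributes still carry, which is why model-based methods such as predictive mean matching tend to win on large data sets.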


2020 ◽  
Vol 8 (10) ◽  
pp. 743
Author(s):  
Björn Almström ◽  
Magnus Larson

Primary ship waves generated by conventional marine vessels were investigated in the Furusund fairway located in the Stockholm archipelago, Sweden. Continuous water level measurements at two locations in the fairway were analyzed. In total, 466 drawdown events were extracted during two months of measurements. The collected data were used to evaluate 13 existing predictive equations for drawdown height or squat. However, none of the equations were able to satisfactorily predict the drawdown height. Instead, a new equation for drawdown height and period was derived based on simplified descriptions of the main physical processes together with field measurements, employing multiple regression analysis to derive the coefficients in the equation. The proposed equation for drawdown height performed better than the existing equations with an R2 value of 0.65, whereas the equation for the drawdown period achieved R2 = 0.64. The main conclusion from this study is that an empirical equation can satisfactorily predict primary ship waves for a large data set.
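The coefficient-fitting step described above can be illustrated with multiple regression on a linearized model. The functional form below is purely hypothetical (the paper's actual equation and predictors are not reproduced); noise is omitted so the regression recovers the synthetic coefficients exactly.

```python
import numpy as np

# Hypothetical power-law drawdown model for illustration only.
rng = np.random.default_rng(1)
U = rng.uniform(3.0, 8.0, 100)     # stand-in for vessel speed (m/s)
h = rng.uniform(10.0, 30.0, 100)   # stand-in for water depth (m)
drawdown = 0.02 * U**2 / h         # synthetic "measured" drawdown height (m)

# Linearize with logarithms and solve for the coefficients by multiple regression.
X = np.column_stack([np.ones(len(U)), np.log(U), np.log(h)])
coef, *_ = np.linalg.lstsq(X, np.log(drawdown), rcond=None)
# coef ≈ [ln(0.02), 2.0, -1.0]
```

With field data the residual scatter determines R2, which is how the paper's 0.65/0.64 figures would arise.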


2021 ◽  
Vol 4 (3) ◽  
pp. 47-63
Author(s):  
Owhondah P.S. ◽  
Enegesele D. ◽  
Biu O.E. ◽  
Wokoma D.S.A.

The study deals with discriminating between second-order models, with and without interaction, centered on measures of central tendency, using the ordinary least squares (OLS) method for estimation of the model parameters. The paper considered two data sets of different (small and large) sample sizes. The small sample used the unemployment rate as the response and the inflation rate and exchange rate as predictors from 2007 to 2018, while the large sample was flow-rate data on hydrate formation for a Niger Delta deep offshore field. The R2, AIC, SBC, and SSE were computed for both data sets to test the adequacy of the models. The results show that all three models are similar for the smaller data set, while for the large data set the second-order model centered on the median, with or without interaction, is the best based on the number of significant parameters. The model selection criterion values (R2, AIC, SBC, and SSE) were found to be equal for models centered on the median and mode for both large and small data sets. However, the models centered on the median and mode, with or without interaction, were better than the model centered on the mean for the large data set. This study therefore shows that second-order regression models centered on the median and mode, with or without interaction, are better than the model centered on the mean for a large data set, while the models are similar for a smaller data set.
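A sketch of the modelling step: a second-order (quadratic) model with interaction in two predictors, fitted by OLS after centering at a chosen measure of central tendency. The data below are synthetic stand-ins for the paper's series.

```python
import numpy as np

def fit_second_order(X, y, center):
    """OLS fit of a second-order model in two predictors with interaction,
    after centering the predictors at `center` (e.g. column means or medians)."""
    Z = X - center
    D = np.column_stack([np.ones(len(Z)), Z[:, 0], Z[:, 1],
                         Z[:, 0]**2, Z[:, 1]**2, Z[:, 0] * Z[:, 1]])
    beta, *_ = np.linalg.lstsq(D, y, rcond=None)
    sse = float(np.sum((D @ beta - y) ** 2))
    return beta, sse

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))   # stand-ins for inflation rate / exchange rate
y = 1.0 + 0.5 * X[:, 0] - 0.3 * X[:, 1] + 0.2 * X[:, 0] * X[:, 1]
_, sse_mean = fit_second_order(X, y, X.mean(axis=0))
_, sse_median = fit_second_order(X, y, np.median(X, axis=0))
```

When all second-order terms are included, centering changes the interpretation (and significance counts) of individual coefficients but not the fitted space, which is why the overall criterion values can coincide across centerings as the paper reports.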


2020 ◽  
Vol 494 (3) ◽  
pp. 3663-3674 ◽  
Author(s):  
Andrei P Igoshev

Understanding the natal kicks, or birth velocities, of neutron stars is essential for understanding the evolution of massive binaries and double neutron star formation. We use maximum likelihood methods as published in Verbunt et al. to analyse a new large data set of parallaxes and proper motions measured by Deller et al. This sample is roughly three times larger than the number of measurements available before. For both the complete sample and its younger part (spin-down ages τ < 3 Myr), we find that a bimodal Maxwellian distribution describes the measured parallaxes and proper motions better than a single Maxwellian, with probabilities of 99.3 and 95.0 per cent, respectively. The bimodal Maxwellian distribution has three parameters: the fraction of low-velocity pulsars and the distribution parameters σ1 and σ2 for the low- and high-velocity modes. For the complete sample, these parameters are as follows: $42_{-15}^{+17}$ per cent, $\sigma _1=128_{-18}^{+22}$ km s−1, and σ2 = 298 ± 28 km s−1. For younger pulsars, which are assumed to represent the natal kick, these parameters are as follows: $20_{-10}^{+11}$ per cent, $\sigma _1=56_{-15}^{+25}$ km s−1, and σ2 = 336 ± 45 km s−1. In the young population, 5 ± 3 per cent of pulsars have velocities less than 60 km s−1. We perform multiple Monte Carlo tests of the method, taking into account realistic observational selection. We find that the method reliably estimates all parameters of the natal kick distribution. Results of the velocity analysis are weakly sensitive to the exact values of the scale lengths of the Galactic pulsar distribution.
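The core model comparison above can be sketched by evaluating the likelihood of a two-component Maxwellian mixture against a single Maxwellian on a speed sample. The synthetic sample below loosely echoes the paper's fitted scales; the actual analysis works on parallaxes and proper motions with selection effects, which this sketch ignores.

```python
import numpy as np

def maxwell_pdf(v, sigma):
    """Maxwellian speed distribution with scale parameter sigma."""
    return np.sqrt(2 / np.pi) * v**2 * np.exp(-v**2 / (2 * sigma**2)) / sigma**3

def neg_loglike(v, w, s1, s2):
    """Negative log-likelihood of a two-component Maxwellian mixture:
    fraction w in the low-velocity mode s1, the rest in s2."""
    return -np.sum(np.log(w * maxwell_pdf(v, s1) + (1 - w) * maxwell_pdf(v, s2)))

rng = np.random.default_rng(0)
def draw_maxwell(sigma, n):
    # A Maxwellian speed is the norm of an isotropic 3D Gaussian velocity.
    return np.linalg.norm(rng.normal(0.0, sigma, (n, 3)), axis=1)

# Synthetic sample: 42% low-velocity (sigma1=128) and 58% high-velocity (sigma2=298) km/s.
v = np.concatenate([draw_maxwell(128, 420), draw_maxwell(298, 580)])

nll_bimodal = neg_loglike(v, 0.42, 128, 298)   # true (bimodal) parameters
nll_single = neg_loglike(v, 1.0, 265, 265)     # best-guess single Maxwellian
```

The bimodal model attains the lower negative log-likelihood; in the paper the comparison is made rigorous with likelihood ratios and Monte Carlo calibration.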


PeerJ ◽  
2016 ◽  
Vol 4 ◽  
pp. e2684 ◽  
Author(s):  
Ruijing Gan ◽  
Ni Chen ◽  
Daizheng Huang

This study compares and evaluates the prediction of hepatitis in Guangxi Province, China using back-propagation neural networks based on a genetic algorithm (BPNN-GA), generalized regression neural networks (GRNN), and wavelet neural networks (WNN). To compare forecasting results, data from 2004 to 2013 were used as modeling samples and data from 2014 as forecasting samples. The results show that when a small hepatitis data set has seasonal fluctuation, BPNN-GA predicts better than the two other methods. The WNN method is suitable for predicting a large hepatitis data set with seasonal fluctuation, and the GRNN method is suitable when the data increase steadily.


2016 ◽  
Vol 855 ◽  
pp. 153-158
Author(s):  
Kritwara Rattanaopas ◽  
Sureerat Kaewkeerat ◽  
Yanapat Chuchuen

Big Data is widely used in many organizations nowadays. Hive is an open source data warehouse system for managing large data sets. It provides a SQL-like interface to Hadoop over the Map-Reduce framework. Currently, Big Data solutions are starting to adopt the HiveQL tool to improve the execution time of relational queries. In this paper, we investigate the execution time of query processing, comparing two compression codecs for the ORC file format: ZLIB and SNAPPY. The results show that ZLIB can compress data by up to 87% relative to uncompressed data, better than SNAPPY, which achieved a space saving of 79%. However, the key to reducing execution time is Map-Reduce: query execution time was lowest when the numbers of mappers and data nodes were equal. For example, all query suites on 6 nodes (ZLIB/SNAPPY) with 250 million table rows had execution times quite similar to 9 nodes (ZLIB/SNAPPY) with 350 million table rows.
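The space-saving figures quoted above follow the standard definition, sketched below. Only ZLIB is demonstrated because it is in the Python standard library (SNAPPY is not); the sample data is an arbitrary repetitive byte string, not ORC-encoded columns.

```python
import zlib

def space_saving(raw: bytes, compressed: bytes) -> float:
    """Space saving = 1 - (compressed size / raw size).
    The paper reports ~87% for ZLIB and ~79% for SNAPPY on its ORC data."""
    return 1.0 - len(compressed) / len(raw)

# Highly repetitive rows compress well, loosely mimicking columnar ORC data.
raw = b"row_id,region,amount\n" + b"1,north,100\n" * 5000
zlib_saving = space_saving(raw, zlib.compress(raw, level=6))
```

ZLIB typically trades a higher compression ratio for more CPU than SNAPPY, which is why the paper finds cluster sizing (mappers vs. data nodes), not the codec, to be the dominant factor in query time.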


Author(s):  
MICHEL BRUYNOOGHE

The clustering of large data sets is of great interest in fields such as pattern recognition, numerical taxonomy, and image or speech processing. The traditional Ascendant Hierarchical Clustering (AHC) algorithm cannot be run on sets of more than a few thousand elements. The reducible neighborhoods clustering algorithm, which is presented in this paper, overcomes the limits of the traditional hierarchical clustering algorithm by generating an exact hierarchy on a large data set. The theoretical justification of this algorithm is the so-called Bruynooghe reducibility principle, which lays down the condition under which the exact hierarchy may be constructed locally, by carrying out aggregations in restricted regions of the representation space. As for the Day and Edelsbrunner algorithm, the maximum theoretical time complexity of the reducible neighborhoods clustering algorithm is O(n2 log n), regardless of the chosen clustering strategy. But the reducible neighborhoods clustering algorithm uses the original data table, and its practical performance is far better than that of Day and Edelsbrunner's algorithm, thus allowing the hierarchical clustering of large data sets, i.e. those composed of more than 10 000 objects.
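For contrast with the bounds discussed above, here is a deliberately naive agglomerative clustering sketch. It is O(n^3) and single-linkage only; the point of the reducible-neighborhoods algorithm is to produce the same exact hierarchy within an O(n^2 log n) bound by restricting aggregations to local regions.

```python
def single_linkage(points):
    """Naive agglomerative hierarchical clustering with single linkage.
    Returns the merge history as (kept_cluster, merged_cluster, distance)."""
    clusters = {i: [i] for i in range(len(points))}
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest single-linkage distance.
        (i, j), d = min(
            (((a, b), min(dist(points[p], points[q])
                          for p in clusters[a] for q in clusters[b]))
             for a in clusters for b in clusters if a < b),
            key=lambda t: t[1])
        clusters[i] += clusters.pop(j)   # aggregate j into i
        merges.append((i, j, d))
    return merges

# Two well-separated pairs: each pair merges internally before the final merge.
merges = single_linkage([(0, 0), (0, 1), (10, 0), (10, 1)])
```

Even this toy version makes the scaling problem visible: every merge rescans all pairwise distances, which is exactly what locality-based methods avoid.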


2020 ◽  
Vol 39 (5) ◽  
pp. 6419-6430
Author(s):  
Dusan Marcek

To forecast time series data, two methodological frameworks of statistical and computational intelligence modelling are considered. The statistical approach is based on the theory of invertible ARIMA (Auto-Regressive Integrated Moving Average) models with the Maximum Likelihood (ML) estimation method. As a competitive tool to statistical forecasting models, we use the popular classic neural network (NN) of perceptron type. To train the NN, the Back-Propagation (BP) algorithm and heuristics such as the genetic and micro-genetic algorithms (GA and MGA) are implemented on the large data set. A comparative analysis of the selected learning methods is performed and evaluated. From the experiments we find that the optimal population size is likely 20, giving the lowest training time of all NNs trained by the evolutionary algorithms, while the prediction accuracy is lower, but still acceptable to managers.
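The evolutionary training loop described above can be sketched minimally: a population of weight vectors is ranked by fitness, the better half survives, and mutated copies refill the population. This is a generic GA on a linear stand-in for the perceptron, not the paper's implementation; the population size of 20 echoes its finding.

```python
import numpy as np

def mse(w, X, y):
    """Fitness: mean squared error of a linear model with weight vector w."""
    return float(np.mean((X @ w - y) ** 2))

def ga_train(X, y, pop_size=20, generations=200, seed=0):
    """Minimal genetic algorithm: elitist selection of the better half,
    refilled with Gaussian-mutated copies of the survivors."""
    rng = np.random.default_rng(seed)
    pop = rng.normal(size=(pop_size, X.shape[1]))
    for _ in range(generations):
        pop = pop[np.argsort([mse(w, X, y) for w in pop])]      # rank by fitness
        parents = pop[:pop_size // 2]                            # elitism
        children = parents + rng.normal(0, 0.1, parents.shape)   # mutation
        pop = np.vstack([parents, children])
    return pop[0]   # best-so-far individual (preserved by elitism)

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.5])   # synthetic target weights
w = ga_train(X, y)
```

Gradient-free training like this costs many fitness evaluations per update, which matches the paper's observation that GA/MGA trade some accuracy for a simple, robust search.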

