Accurate Pupil Center Detection in Off-the-Shelf Eye Tracking Systems Using Convolutional Neural Networks

Sensors ◽  
2021 ◽  
Vol 21 (20) ◽  
pp. 6847
Author(s):  
Andoni Larumbe-Bergera ◽  
Gonzalo Garde ◽  
Sonia Porta ◽  
Rafael Cabeza ◽  
Arantxa Villanueva

Remote eye tracking technology has experienced increasing growth in recent years due to its applicability in many research areas. In this paper, a video-oculography method based on convolutional neural networks (CNNs) for pupil center detection in webcam images is proposed. As the first contribution of this work, and in order to train the model, pupil centers were manually labeled in a facial landmark dataset. The model has been tested on both real and synthetic databases and outperforms state-of-the-art methods, achieving pupil center estimation errors below the size of a constricted pupil in more than 95% of the images, while reducing computing time by a factor of 8. The results show the importance of using high-quality training data and well-known architectures to achieve outstanding performance.
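The 95% figure above is a success rate at a fixed error threshold. A minimal sketch of how such a metric can be computed (the function name and pixel threshold are illustrative, not the authors' code):

```python
import numpy as np

def success_rate(pred, gt, threshold_px):
    """Fraction of images whose pupil-center error falls below a
    threshold, e.g. a constricted-pupil radius in pixels.
    pred and gt are (n, 2) arrays of (x, y) centers."""
    err = np.linalg.norm(pred - gt, axis=1)   # Euclidean error per image
    return float((err < threshold_px).mean())
```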

Author(s):  
Y. A. Lumban-Gaol ◽  
K. A. Ohori ◽  
R. Y. Peters

Abstract. Satellite-Derived Bathymetry (SDB) has been used in many applications related to coastal management. SDB can efficiently fill gaps in data obtained from traditional echo-sounding measurements. However, it still requires numerous training samples, which are not available in many areas. Furthermore, accuracy problems arise because a linear model cannot capture the non-linear relationship between reflectance and depth caused by bottom variations and noise. Convolutional Neural Networks (CNNs) offer the ability to capture both the connection between neighbouring pixels and this non-linear relationship. These characteristics make CNNs compelling for shallow-water depth extraction. We investigate the accuracy of different architectures using different window sizes and band combinations. We use Sentinel-2 Level 2A images to provide reflectance values, and Lidar and Multi Beam Echo Sounder (MBES) datasets as depth references to train and test the model. A set of Sentinel-2 and in-situ depth subimage pairs are extracted to perform CNN training. The model is compared to the linear transform and applied to two other study areas. The resulting accuracy ranges from 1.3 m to 1.94 m, and the coefficient of determination reaches 0.94. The SDB model generated using a window size of 9×9 indicates compatibility with the reference depths, especially in areas deeper than 15 m. Adding both short-wave infrared bands to the four visible bands in training improves the overall accuracy of SDB. Applying the pre-trained model to other study areas provides similar results, depending on the water conditions.
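The subimage-pair extraction step above can be sketched as pairing each pixel that has a reference depth with the reflectance window centred on it. The function name, shapes, and the use of NaN for missing depths are assumptions for illustration, not the authors' code:

```python
import numpy as np

def extract_subimage_pairs(reflectance, depth, window=9):
    """Pair each valid depth sample with the window x window
    reflectance patch centred on it.
    reflectance: (rows, cols, bands); depth: (rows, cols), NaN = no reference."""
    half = window // 2
    rows, cols, _ = reflectance.shape
    patches, depths = [], []
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            d = depth[r, c]
            if np.isnan(d):          # skip pixels without a reference depth
                continue
            patches.append(reflectance[r - half:r + half + 1,
                                       c - half:c + half + 1])
            depths.append(d)
    return np.stack(patches), np.asarray(depths)
```

The resulting (patch, depth) pairs are what a CNN regressor would be trained on in place of the per-pixel linear transform.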


Geophysics ◽  
2021 ◽  
pp. 1-45
Author(s):  
Runhai Feng ◽  
Dario Grana ◽  
Niels Balling

Segmentation of faults based on seismic images is an important step in reservoir characterization. With the recent development of deep-learning methods and the availability of massive computing power, automatic interpretation of seismic faults has become possible. The likelihood of occurrence of a fault can be quantified using a sigmoid function. Our goal is to quantify the fault-model uncertainty that is generally not captured by deep-learning tools. We propose to use the dropout approach, a regularization technique that prevents overfitting and co-adaptation in hidden units, to approximate Bayesian inference and estimate a principled uncertainty over functions. In particular, the variance of the learned model is decomposed into aleatoric and epistemic parts. The proposed method is applied to a real dataset from the Netherlands F3 block with two different dropout ratios in convolutional neural networks. The aleatoric uncertainty is irreducible, since it relates to the stochastic dependency within the input observations. As the number of Monte-Carlo realizations increases, the epistemic uncertainty asymptotically converges and the model standard deviation decreases, because the variability of the model parameters is better simulated or explained with a larger sample size. This analysis quantifies the confidence with which low-uncertainty fault predictions can be used, and it suggests where more training data are needed to reduce the uncertainty in low-confidence regions.
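The Monte-Carlo dropout idea above can be sketched with a toy stand-in for the CNN: run the model many times with fresh dropout masks, take the spread between passes as epistemic uncertainty, and the Bernoulli noise of the sigmoid output as the aleatoric part. The forward pass, parameter values, and decomposition below are a hedged illustration, not the paper's network:

```python
import numpy as np

def forward_with_dropout(x, w, rate, rng):
    """One stochastic pass: drop weights at the given rate (with
    inverted-dropout rescaling), then squash with a sigmoid to get a
    fault likelihood. Toy single-layer stand-in for a CNN."""
    mask = rng.random(w.shape) >= rate
    logits = x @ (w * mask) / (1.0 - rate)
    return 1.0 / (1.0 + np.exp(-logits))

def mc_dropout_uncertainty(x, w, rate=0.5, passes=200, seed=0):
    """Mean prediction plus epistemic/aleatoric split over many
    stochastic forward passes."""
    rng = np.random.default_rng(seed)
    probs = np.stack([forward_with_dropout(x, w, rate, rng)
                      for _ in range(passes)])
    mean = probs.mean(axis=0)
    epistemic = probs.var(axis=0)                    # spread between passes
    aleatoric = (probs * (1 - probs)).mean(axis=0)   # Bernoulli noise term
    return mean, epistemic, aleatoric
```

As `passes` grows, the epistemic estimate stabilises, mirroring the convergence behaviour described in the abstract.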


2019 ◽  
Vol 11 (12) ◽  
pp. 1461 ◽  
Author(s):  
Husam A. H. Al-Najjar ◽  
Bahareh Kalantar ◽  
Biswajeet Pradhan ◽  
Vahideh Saeidi ◽  
Alfian Abdul Halin ◽  
...  

In recent years, remote sensing researchers have investigated the use of different modalities (or combinations of modalities) for classification tasks. Such modalities can be extracted via a diverse range of sensors and images. Currently, there are no (or only a few) studies that have attempted to increase land cover classification accuracy via unmanned aerial vehicle (UAV)–digital surface model (DSM) fused datasets. Therefore, this study looks at improving the accuracy of these datasets by exploiting convolutional neural networks (CNNs). In this work, we focus on the fusion of DSM and UAV images for land use/land cover mapping via classification into seven classes: bare land, buildings, dense vegetation/trees, grassland, paved roads, shadows, and water bodies. Specifically, we investigated the effectiveness of the two datasets with the aim of inspecting whether the fused DSM yields remarkable outcomes for land cover classification. The datasets were: (i) orthomosaic image data only (Red, Green and Blue channels), and (ii) a fusion of the orthomosaic image and DSM data, where the final classification was performed using a CNN. As a classification method, the CNN is promising due to its hierarchical learning structure, regularization and weight sharing with respect to training data, generalization, optimization and parameter reduction, automatic feature extraction, and robust discriminative ability with high performance. The experimental results show that a CNN trained on the fused dataset obtains better results, with a Kappa index of ~0.98, an average accuracy of 0.97 and a final overall accuracy of 0.98. Comparing the CNN with DSM against the CNN without DSM revealed improvements in overall accuracy, average accuracy and Kappa index of 1.2%, 1.8% and 1.5%, respectively. Accordingly, adding the heights of features such as buildings and trees improved the differentiation of vegetation classes, specifically where plants were dense.
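One common way to realise the fusion described above is to append the DSM as a fourth input channel alongside RGB so the CNN sees height next to colour. The sketch below assumes this channel-stacking scheme (the abstract does not specify the exact fusion mechanism), with a min-max normalisation of heights:

```python
import numpy as np

def fuse_rgb_dsm(rgb, dsm):
    """Stack a min-max normalised DSM as a fourth input channel.
    rgb: (rows, cols, 3); dsm: (rows, cols) heights.
    Returns a (rows, cols, 4) array for a 4-channel CNN input."""
    lo, hi = dsm.min(), dsm.max()
    dsm_norm = (dsm - lo) / (hi - lo + 1e-8)   # heights scaled to [0, 1]
    return np.concatenate([rgb, dsm_norm[..., None]], axis=-1)
```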


2020 ◽  
Vol 12 (23) ◽  
pp. 3953
Author(s):  
Ashley N. Ellenson ◽  
Joshua A. Simmons ◽  
Greg W. Wilson ◽  
Tyler J. Hesser ◽  
Kristen D. Splinter

Nearshore morphology is a key driver of wave breaking and the resulting nearshore circulation, recreational safety, and nutrient dispersion. Morphology persists within the nearshore in specific shapes that can be classified into equilibrium states. Equilibrium states convey qualitative information about bathymetry and relevant physical processes. While nearshore bathymetry is a challenge to collect, much information about the underlying bathymetry can be gained from remote sensing of the surfzone. This study presents a new method to automatically classify beach state from Argus daytime time-exposure (timex) imagery using a machine learning technique called convolutional neural networks (CNNs). The CNN processed imagery from two locations: Narrabeen, New South Wales, Australia and Duck, North Carolina, USA. Three different CNN models are examined: one trained at Narrabeen, one at Duck, and one trained at both locations. Each model was tested at the location where it was trained in a self-test, and the single-beach models were tested at the location where they were not trained in a transfer-test. For the self-tests, skill (as measured by the F-score) was comparable to expert agreement (CNN F-scores at Duck = 0.80 and Narrabeen = 0.59). For the transfer-tests, the CNN model skill was reduced by 24–48%, suggesting the algorithm requires additional local data to improve transferability. Transferability tests showed that F-scores comparable (within 10%) to the self-trained cases can be achieved at both locations when at least 25% of the training data is from each site. This suggests that, if applied to additional locations, a CNN model trained at one location may be skillful at new sites with only limited new imagery data needed. Finally, a CNN visualization technique (Guided Grad-CAM) confirmed that the CNN determined classifications using image regions (e.g., incised rip channels, terraces) that were consistent with beach state labelling rules.
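The skill metric above is the per-class F-score. A minimal, dependency-free sketch of how it is computed from labelled beach states (the example state labels are illustrative):

```python
def f_score(y_true, y_pred, positive):
    """F1 for one beach-state class: harmonic mean of precision and
    recall, treating `positive` as the class of interest."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)
```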


2019 ◽  
Vol 128 (8-9) ◽  
pp. 2126-2145 ◽  
Author(s):  
Zhen-Hua Feng ◽  
Josef Kittler ◽  
Muhammad Awais ◽  
Xiao-Jun Wu

Abstract. Efficient and robust facial landmark localisation is crucial for the deployment of real-time face analysis systems. This paper presents a new loss function, namely Rectified Wing (RWing) loss, for regression-based facial landmark localisation with Convolutional Neural Networks (CNNs). We first systematically analyse different loss functions, including L2, L1 and smooth L1. The analysis suggests that the training of a network should pay more attention to small-medium errors. Motivated by this finding, we design a piece-wise loss that amplifies the impact of samples with small-medium errors. In addition, we rectify the loss function for very small errors to mitigate the impact of inaccurate manual annotation. The use of our RWing loss boosts performance significantly for regression-based CNNs in facial landmarking, especially for lightweight network architectures. To address the under-representation of samples with large pose variations, we propose a simple but effective boosting strategy, referred to as pose-based data balancing. In particular, we deal with the data imbalance problem by duplicating the minority training samples and perturbing them with random image rotation, bounding box translation and other data augmentation strategies. Last, the proposed approach is extended to create a coarse-to-fine framework for robust and efficient landmark localisation, which is also able to deal effectively with the small-sample-size problem. The experimental results obtained on several well-known benchmarking datasets demonstrate the merits of our RWing loss and the superiority of the proposed method over state-of-the-art approaches.
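The piece-wise shape described above can be sketched as: zero in a small dead zone (rectification against annotation noise), logarithmic for small-medium errors (amplified gradient), and linear for large errors. The exact parameterisation and default values below are a hedged reading of the paper, not its released code:

```python
import numpy as np

def rwing_loss(x, r=0.5, w=10.0, eps=2.0):
    """Piece-wise loss in the spirit of Rectified Wing:
      |x| <  r : 0                        (dead zone for tiny errors)
      r<=|x|<w : w * ln(1 + (|x| - r)/eps)  (log region, small-medium errors)
      |x| >= w : |x| - c                  (linear region, large errors)
    c is chosen so the log and linear pieces meet continuously at |x| = w."""
    x = np.abs(np.asarray(x, dtype=float))
    c = w - w * np.log(1.0 + (w - r) / eps)
    return np.where(x < r, 0.0,
           np.where(x < w, w * np.log(1.0 + (x - r) / eps), x - c))
```

The dead zone keeps noisy sub-pixel annotation errors from dominating training, while the log region steepens the loss exactly where the analysis says attention is needed.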


2020 ◽  
Vol 12 (7) ◽  
pp. 1092
Author(s):  
David Browne ◽  
Michael Giering ◽  
Steven Prestwich

Scene classification is an important aspect of image/video understanding and segmentation. However, remote-sensing scene classification is a challenging image recognition task, partly due to the limited training data, which causes deep-learning Convolutional Neural Networks (CNNs) to overfit. Another difficulty is that images often have very different scales and orientation (viewing angle). Yet another is that the resulting networks may be very large, again making them prone to overfitting and unsuitable for deployment on memory- and energy-limited devices. We propose an efficient deep-learning approach to tackle these problems. We use transfer learning to compensate for the lack of data, and data augmentation to tackle varying scale and orientation. To reduce network size, we use a novel unsupervised learning approach based on k-means clustering, applied to all parts of the network: most network reduction methods use computationally expensive supervised learning methods, and apply only to the convolutional or fully connected layers, but not both. In experiments, we set new standards in classification accuracy on four remote-sensing and two scene-recognition image datasets.
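One standard way to apply k-means to network reduction, consistent with the description above, is to cluster a layer's weights into k shared values so that only the centroids and cluster indices need to be stored. The sketch below is one possible reading of that idea applied to a single weight tensor, not the authors' exact procedure:

```python
import numpy as np

def kmeans_compress(weights, k=4, iters=20, seed=0):
    """Cluster a layer's weights into k shared values (1-D k-means on
    the flattened tensor), then rebuild the tensor from the centroids."""
    flat = weights.ravel()
    rng = np.random.default_rng(seed)
    centroids = rng.choice(flat, size=k, replace=False)  # init from data
    for _ in range(iters):
        # assign each weight to its nearest centroid
        labels = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
        # move each centroid to the mean of its members
        for j in range(k):
            members = flat[labels == j]
            if members.size:
                centroids[j] = members.mean()
    return centroids[labels].reshape(weights.shape)
```

Because the same k values can serve convolutional and fully connected layers alike, this kind of clustering applies to all parts of the network, as the abstract emphasises.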

