Object Detection at Level Crossing Using Deep Learning

Micromachines ◽  
2020 ◽  
Vol 11 (12) ◽  
pp. 1055
Author(s):  
Muhammad Asad Bilal Fayyaz ◽  
Christopher Johnson

Multiple projects within the rail industry across different regions have been initiated to address the issue of over-population. These expansion plans and technology upgrades increase the number of intersections, junctions, and level crossings. A level crossing is where a railway line is crossed by a road or right of way on the level, without the use of a tunnel or bridge. Level crossings still pose a significant risk to the public, often leading to serious accidents between rail, road, and footpath users; the risk depends on users' unpredictable behavior. In Great Britain, there were three fatalities and 385 near misses at level crossings in 2015–2016. Furthermore, in its annual safety report, the Rail Safety and Standards Board (RSSB) highlighted the risk of incidents at level crossings during 2016/17, with a further six fatalities, including four pedestrians and two road-vehicle users. The relevant authorities have suggested an upgrade of the existing sensing system and the integration of novel technologies at level crossings. The present work addresses this key issue and discusses the current sensing systems along with the relevant algorithms used for post-processing their information. This information is adequate for a manual operator to make a decision or to start an automated operational cycle. Traditional sensors have certain limitations and are often installed as a “single sensor”. A single sensor does not provide sufficient information; hence another sensor is required. The algorithms integrated with these sensing systems rely on the traditional approach, in which background pixels are compared with new pixels. Such an approach is not effective in a dynamic and complex environment. The proposed model integrates deep learning technology with the current vision system (e.g., CCTV) to detect and localize an object at a level crossing. The proposed sensing system should be able to detect and localize particular objects (e.g., pedestrians, bicycles, and vehicles) in level crossing areas. A radar system is also discussed for a “two out of two” logic interlocking system as a fail-safe mechanism. Different techniques for training a deep learning model are discussed along with their respective results. The MobileNet model achieved a classification accuracy of about 88% and a loss metric of 0.092 for object detection. Some related future work is also discussed.
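
As a hedged illustration of the kind of vision pipeline the abstract describes (not the authors' implementation), the sketch below runs a pretrained MobileNet-based detector from torchvision and flags pedestrians, bicycles, and vehicles that overlap a level-crossing region of interest; the model choice, ROI coordinates, and score threshold are all assumptions.

```python
# A minimal sketch, assuming a COCO-pretrained SSDLite/MobileNetV3 detector
# and a hypothetical level-crossing region of interest (ROI).
import torch
import torchvision
from torchvision.io import read_image

CLASSES_OF_INTEREST = {1: "person", 2: "bicycle", 3: "car"}  # COCO ids
CROSSING_ROI = (200, 100, 800, 600)  # hypothetical (x1, y1, x2, y2) in pixels

model = torchvision.models.detection.ssdlite320_mobilenet_v3_large(
    weights="DEFAULT")  # MobileNetV3 backbone, pretrained on COCO
model.eval()

def objects_in_crossing(image_path, score_threshold=0.5):
    """Return detections of interest that overlap the crossing ROI."""
    img = read_image(image_path).float() / 255.0  # CHW tensor in [0, 1]
    with torch.no_grad():
        out = model([img])[0]
    hits = []
    for box, label, score in zip(out["boxes"], out["labels"], out["scores"]):
        if score < score_threshold or label.item() not in CLASSES_OF_INTEREST:
            continue
        x1, y1, x2, y2 = box.tolist()
        rx1, ry1, rx2, ry2 = CROSSING_ROI
        if x1 < rx2 and x2 > rx1 and y1 < ry2 and y2 > ry1:  # boxes overlap
            hits.append((CLASSES_OF_INTEREST[label.item()], score.item()))
    return hits
```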

Electronics ◽  
2019 ◽  
Vol 8 (3) ◽  
pp. 331 ◽  
Author(s):  
Yifeng Xu ◽  
Huigang Wang ◽  
Xing Liu ◽  
Henry He ◽  
Qingyue Gu ◽  
...  

Recent advances in deep learning have shown exciting promise in low-level artificial intelligence tasks such as image classification, speech recognition, object detection, and semantic segmentation. Artificial intelligence has made an important contribution to autopilot, which is a complex high-level intelligence task. However, real autopilot scenes are quite complicated. The first autopilot accident, a fatal crash, occurred in 2016 when the white side of a vehicle appeared similar to a brightly lit sky. The root of the problem is that the autopilot vision system cannot identify a part of a vehicle when that part is similar to the background. A method called DIDA, based on a deep learning network, is proposed to “see” this hidden part. DIDA cascades the following steps: object detection, scaling, image inpainting assuming a hidden part beside the car, object re-detection from the inpainted image, zooming back to the original size, and setting an alarm region by comparing the two detected regions. DIDA was tested in a similar scene and achieved exciting results. This method solves the aforementioned problem using only optical signals. Additionally, the vehicle dataset captured in Xi’an, China, can be used in subsequent research.
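
The DIDA cascade is concrete enough to sketch. The following is a schematic outline under stated assumptions: OpenCV's Telea inpainting stands in for the paper's inpainting step, and `detect_vehicles` is a placeholder for any deep learning detector returning (x1, y1, x2, y2) boxes; none of this is claimed to match the authors' implementation.

```python
# A schematic sketch of the DIDA cascade: detect, scale, inpaint a region
# assumed hidden beside the car, re-detect, rescale, and compare detections.
import cv2
import numpy as np

def dida_alarm_region(frame, detect_vehicles, scale=2.0, margin=40):
    boxes_before = detect_vehicles(frame)             # step 1: object detection
    up = cv2.resize(frame, None, fx=scale, fy=scale)  # step 2: scaling
    mask = np.zeros(up.shape[:2], dtype=np.uint8)
    for (x1, y1, x2, y2) in boxes_before:
        # step 3: assume a hidden part beside each detected car; mark it
        sx2 = int(x2 * scale)
        sy1, sy2 = int(y1 * scale), int(y2 * scale)
        mask[sy1:sy2, sx2:min(sx2 + margin, up.shape[1])] = 255
    inpainted = cv2.inpaint(up, mask, 3, cv2.INPAINT_TELEA)
    boxes_after = detect_vehicles(inpainted)          # step 4: re-detection
    boxes_after = [tuple(int(v / scale) for v in b)   # step 5: zoom back
                   for b in boxes_after]
    # step 6: boxes that appear or change only after inpainting suggest a
    # hidden part (naive set comparison here; the paper compares the two
    # detected regions to set the alarm region)
    return [b for b in boxes_after if b not in boxes_before]
```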


Author(s):  
M. N. Favorskaya ◽  
L. C. Jain

Introduction: Saliency detection is a fundamental task of computer vision. Its ultimate aim is to localize the objects of interest that grab human visual attention with respect to the rest of the image. A great variety of saliency models based on different approaches have been developed since the 1990s. In recent years, saliency detection has become one of the most actively studied topics in the theory of Convolutional Neural Networks (CNNs). Many original solutions using CNNs have been proposed for salient object detection and even event detection.

Purpose: A detailed survey of saliency detection methods in the deep learning era makes it possible to understand the current capabilities of the CNN approach for visual analysis conducted by human eye tracking and digital image processing.

Results: The survey reflects the recent advances in saliency detection using CNNs. Different models available in the literature, such as static and dynamic 2D CNNs for salient object detection and 3D CNNs for salient event detection, are discussed in chronological order. It is worth noting that automatic salient event detection in long videos became possible using recently developed 3D CNNs combined with 2D CNNs for salient audio detection. We also present a short description of public image and video datasets with annotated salient objects or events, as well as the metrics often used for evaluating results.

Practical relevance: This survey contributes to the study of rapidly developing deep learning methods for saliency detection in images and videos.
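
The evaluation metrics mentioned above are standard in the saliency literature; as a concrete illustration (not drawn from the article itself), here are two of the most common, mean absolute error (MAE) and the F-measure with beta-squared = 0.3, computed between a predicted saliency map and a binary ground-truth mask.

```python
# Standard saliency evaluation metrics, shown as an illustration.
import numpy as np

def saliency_mae(pred, gt):
    """MAE between a [0, 1] saliency map and a {0, 1} ground-truth mask."""
    return np.abs(pred.astype(float) - gt.astype(float)).mean()

def saliency_f_measure(pred, gt, beta2=0.3):
    """F-measure at an adaptive threshold (twice the mean saliency)."""
    binary = pred >= min(2.0 * pred.mean(), 1.0)
    tp = np.logical_and(binary, gt).sum()
    precision = tp / max(binary.sum(), 1)
    recall = tp / max(gt.sum(), 1)
    if precision + recall == 0:
        return 0.0
    return (1 + beta2) * precision * recall / (beta2 * precision + recall)
```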


Symmetry ◽  
2020 ◽  
Vol 12 (10) ◽  
pp. 1718
Author(s):  
Chien-Hsing Chou ◽  
Yu-Sheng Su ◽  
Che-Ju Hsu ◽  
Kong-Chang Lee ◽  
Ping-Hsuan Han

In this study, we designed a four-dimensional (4D) audiovisual entertainment system called Sense. This system comprises a scene recognition system and hardware modules that provide haptic sensations for users when they watch movies and animations at home. In the scene recognition system, we used Google Cloud Vision to detect common scene elements in a video, such as fire, explosions, wind, and rain, and to further determine whether the scene depicts hot weather, rain, or snow. Additionally, for animated videos, we applied deep learning with a single shot multibox detector (SSD) to detect whether the animated video contained scenes of fire-related objects. The hardware module was designed to provide six types of haptic sensations, arranged with line symmetry, to provide a better user experience. Based on the object detection results from the scene recognition system, the system generates the corresponding haptic sensations. The system integrates deep learning, auditory signals, and haptic sensations to provide an enhanced viewing experience.
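
A minimal sketch of the label-based scene check described above, using the google-cloud-vision client library; the trigger keyword set and score threshold are assumptions, not values from the paper.

```python
# Flag scene elements (fire, rain, ...) via Google Cloud Vision label
# detection, in the spirit of the Sense scene recognition step.
from google.cloud import vision

HAPTIC_TRIGGERS = {"fire", "explosion", "wind", "rain", "snow"}  # assumed set

def detect_scene_elements(image_path, min_score=0.7):
    client = vision.ImageAnnotatorClient()  # needs GCP credentials configured
    with open(image_path, "rb") as f:
        image = vision.Image(content=f.read())
    response = client.label_detection(image=image)
    return {
        label.description.lower()
        for label in response.label_annotations
        if label.score >= min_score
        and label.description.lower() in HAPTIC_TRIGGERS
    }
```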


Sensors ◽  
2021 ◽  
Vol 21 (8) ◽  
pp. 2611
Author(s):  
Andrew Shepley ◽  
Greg Falzon ◽  
Christopher Lawson ◽  
Paul Meek ◽  
Paul Kwan

Image data is one of the primary sources of ecological data used in biodiversity conservation and management worldwide. However, classifying and interpreting large numbers of images is expensive in time and resources, particularly in the context of camera trapping. Deep learning models have been used to achieve this task but are often not suited to specific applications due to their inability to generalise to new environments and their inconsistent performance. Models need to be developed for specific species cohorts and environments, but the technical skills required to achieve this are a key barrier to the accessibility of this technology for ecologists. Thus, there is a strong need to democratize access to deep learning technologies by providing an easy-to-use software application that allows non-technical users to train custom object detectors. U-Infuse addresses this issue by providing ecologists with the ability to train customised models using publicly available images and/or their own images without specific technical expertise. Auto-annotation and annotation-editing functionalities minimize the burden of manually annotating and pre-processing large numbers of images. U-Infuse is a free and open-source software solution that supports both multi-class and single-class training and object detection, allowing ecologists to access deep learning technologies usually only available to computer scientists, on their own device, customised for their application, without sharing intellectual property or sensitive data. It provides ecological practitioners with the ability to (i) easily achieve object detection within a user-friendly GUI, generating a species distribution report and other useful statistics, (ii) custom-train deep learning models using publicly available and custom training data, and (iii) achieve supervised auto-annotation of images for further training, with the benefit of editing annotations to ensure quality datasets. Broad adoption of U-Infuse by ecological practitioners will improve ecological image analysis and processing by allowing significantly more image data to be processed with minimal expenditure of time and resources, particularly for camera trap images. Ease of training and the use of transfer learning mean that domain-specific models can be trained rapidly and updated frequently without computer science expertise or data sharing, protecting intellectual property and privacy.
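
U-Infuse wraps this workflow in a GUI, but the underlying transfer-learning step can be sketched. The following is an illustrative head swap on a torchvision Faster R-CNN, not U-Infuse's actual internals: the pretrained backbone is kept, and only the box predictor is replaced for the user's classes before fine-tuning on their images.

```python
# Transfer learning for a custom object detector: reuse a COCO-pretrained
# backbone and retrain only what the new species classes require.
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

def build_custom_detector(num_classes):
    """num_classes includes the background class, e.g. 2 for one species."""
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(
        weights="DEFAULT")  # backbone and heads pretrained on COCO
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    # Replace the classification head so only the new classes are predicted;
    # pretrained weights are kept and fine-tuned on the user's images.
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    return model
```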


Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 343
Author(s):  
Kim Bjerge ◽  
Jakob Bonde Nielsen ◽  
Martin Videbæk Sepstrup ◽  
Flemming Helsing-Nielsen ◽  
Toke Thomas Høye

Insect monitoring methods are typically very time-consuming and involve substantial investment in species identification following manual trapping in the field. Insect traps are often serviced only weekly, resulting in low temporal resolution of the monitoring data, which hampers ecological interpretation. This paper presents a portable computer vision system capable of attracting and detecting live insects. More specifically, the paper proposes detection and classification of species by recording images of live individuals attracted to a light trap. An Automated Moth Trap (AMT) with multiple light sources and a camera was designed to attract and monitor live insects during twilight and night hours. A computer vision algorithm referred to as Moth Classification and Counting (MCC), based on deep learning analysis of the captured images, tracked and counted the number of insects and identified moth species. Observations over 48 nights resulted in the capture of more than 250,000 images, with an average of 5675 images per night. A customized convolutional neural network was trained on 2000 labeled images of live moths representing eight different classes, achieving a high validation F1-score of 0.93. The algorithm achieved an average classification-and-tracking F1-score of 0.71 and a tracking detection rate of 0.79. Overall, the proposed computer vision system and algorithm showed promising results as a low-cost solution for non-destructive and automatic monitoring of moths.
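
As a hedged sketch of a customized CNN classifier for eight moth classes, in the spirit of the MCC classifier (the layer configuration below is an assumption, not the authors' exact architecture):

```python
# A small CNN for classifying crops of individual insects into eight classes.
import torch
import torch.nn as nn

class MothCNN(nn.Module):
    def __init__(self, num_classes=8):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, num_classes),
        )

    def forward(self, x):  # x: (N, 3, H, W) crops of individual insects
        return self.classifier(self.features(x))

# Typical training objective for the labeled crops: cross-entropy loss,
# with the per-class validation F1-score tracked during training.
```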


Electronics ◽  
2021 ◽  
Vol 10 (4) ◽  
pp. 517
Author(s):  
Seong-heum Kim ◽  
Youngbae Hwang

Owing to recent advancements in deep learning methods and relevant databases, it is becoming increasingly easy to recognize 3D objects using only RGB images from single viewpoints. This study investigates the major breakthroughs and current progress in deep learning-based monocular 3D object detection. For relatively low-cost data acquisition systems without depth sensors or cameras at multiple viewpoints, we first consider existing databases with 2D RGB photos and their relevant attributes. Based on this simple sensor modality for practical applications, deep learning-based monocular 3D object detection methods that overcome significant research challenges are categorized and summarized. We present the key concepts and detailed descriptions of representative single-stage and multiple-stage detection solutions. In addition, we discuss the effectiveness of the detection models on their baseline benchmarks. Finally, we explore several directions for future research on monocular 3D object detection.
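
Monocular 3D detectors ultimately reason about how 3D boxes project into the image plane; the following worked example uses the standard pinhole model u = fx·X/Z + cx, v = fy·Y/Z + cy to project the eight corners of a 3D box (the intrinsics and KITTI-style box parameterization are illustrative assumptions, not tied to any particular method in the survey).

```python
# Project the 8 corners of a 3D bounding box into pixel coordinates.
import numpy as np

K = np.array([[721.5,   0.0, 609.6],   # fx,  0, cx  (illustrative values)
              [  0.0, 721.5, 172.9],   #  0, fy, cy
              [  0.0,   0.0,   1.0]])

def project_box_corners(center, dims, yaw):
    """center: (x, y, z) in camera coords; dims: (h, w, l); yaw about Y axis.
    The box origin sits at the bottom-center face, KITTI-style."""
    h, w, l = dims
    x = np.array([ l,  l, -l, -l,  l,  l, -l, -l]) / 2
    y = np.array([ 0,  0,  0,  0, -h, -h, -h, -h], dtype=float)
    z = np.array([ w, -w, -w,  w,  w, -w, -w,  w]) / 2
    R = np.array([[ np.cos(yaw), 0, np.sin(yaw)],
                  [ 0,           1, 0          ],
                  [-np.sin(yaw), 0, np.cos(yaw)]])
    corners = R @ np.vstack([x, y, z]) + np.asarray(center).reshape(3, 1)
    uvw = K @ corners                  # pinhole projection
    return (uvw[:2] / uvw[2]).T        # 8 rows of (u, v) pixel coordinates
```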

