A Baseline for General Music Object Detection with Deep Learning

2018 ◽  
Vol 8 (9) ◽  
pp. 1488 ◽  
Author(s):  
Alexander Pacha ◽  
Jan Hajič ◽  
Jorge Calvo-Zaragoza

Deep learning is bringing breakthroughs to many computer vision subfields including Optical Music Recognition (OMR), which has seen a series of improvements to musical symbol detection achieved by using generic deep learning models. However, so far, each such proposal has been based on a specific dataset and different evaluation criteria, which has made it difficult to quantify the new deep learning-based state of the art and assess the relative merits of these detection models on music scores. In this paper, a baseline for general detection of musical symbols with deep learning is presented. We consider three datasets of heterogeneous typology but with the same annotation format, three neural models of different nature, and establish their performance in terms of a common evaluation standard. The experimental results confirm that direct music object detection with deep learning is indeed promising, but at the same time illustrate some of the domain-specific shortcomings of the general detectors. A qualitative comparison then suggests avenues for OMR improvement, based both on properties of the detection model and on how the datasets are defined. To the best of our knowledge, this is the first time that competing music object detection systems from the machine learning paradigm are directly compared to each other. We hope that this work will serve as a reference to measure the progress of future developments of OMR in music object detection.

Author(s):  
Vibhavari B Rao

Current crime rates can put civilians' lives in danger. While consistent efforts are being made to reduce crime, there is also a pressing need for a smart, proactive surveillance system. Our project implements a smart surveillance system that alerts the authorities in real time when a crime is being committed. During armed robberies and hostage situations, the police often cannot reach the scene in time to intervene, owing to the lag in communication between informants at the crime scene and the police. We propose an object detection model that uses deep learning to detect weapons such as pistols, knives, and rifles in video surveillance footage and, in turn, sends real-time alerts to the authorities. A number of object detection algorithms are being developed, each evaluated using the mean average precision (mAP) metric. Implementing Faster R-CNN with a ResNet-101 architecture, we found the mAP score to be about 91%. The downside, however, is its excessive training and inference time. The YOLOv5 architecture, on the other hand, resulted in a model that performed very well in terms of speed: its training speed was found to be 0.012 s/image, although its accuracy was not as high as that of Faster R-CNN. With suitable hardware, it can run at about 40 fps. Thus, there is a tradeoff between speed and accuracy, and it is important to strike a balance. We use transfer learning to improve accuracy by training the model on our custom dataset. This project can be deployed on any generic CCTV camera by setting up a live RTSP (Real-Time Streaming Protocol) stream and streaming the footage to a laptop or desktop on which the deep learning model is running.


Author(s):  
Limu Chen ◽  
Ye Xia ◽  
Dexiong Pan ◽  
Chengbin Wang

Deep-learning-based navigational object detection is discussed with respect to an active monitoring system for preventing collisions between vessels and bridges. The motion-based object detection method widely used in existing anti-collision monitoring systems cannot cope with complicated and changeable waterways because of its limitations in accuracy, robustness and efficiency. The proposed video surveillance system contains six modules: image acquisition, detection, tracking, prediction, risk evaluation and decision-making. The detection module is discussed in detail. A vessel-exclusive dataset with a large number of image samples is established for neural network training, and an SSD (Single Shot MultiBox Detector) based object detection model with both universality and pertinence is generated through sample filtering, data augmentation and large-scale optimization, which make it capable of stable and intelligent vessel detection. Comparison with conventional methods indicates that the proposed deep-learning method shows remarkable advantages in robustness, accuracy, efficiency and intelligence. An in-situ test is carried out at Songpu Bridge in Shanghai, and the results illustrate that the method is suitable for long-term monitoring and for providing information support for further analysis and decision making.


Sensors ◽  
2020 ◽  
Vol 20 (6) ◽  
pp. 1650 ◽  
Author(s):  
Xiaoming Lv ◽  
Fajie Duan ◽  
Jia-Jia Jiang ◽  
Xiao Fu ◽  
Lin Gan

Most current object detection approaches deliver competitive results under the assumption that a large amount of labeled data is available and can be fed into a deep network at once. However, because labeling is expensive, it is difficult to deploy object detection systems in more complex and challenging real-world environments, especially for defect detection in real industrial settings. To reduce the labeling effort, this study proposes an active learning framework for defect detection. First, an uncertainty sampling strategy is proposed to produce the candidate list for annotation, since uncertain images provide more informative knowledge for the learning process. Then, an Average Margin method is designed to set the sampling scale for each defect category. In addition, an iterative pattern of training and selection is adopted to train an effective detection model. Extensive experiments demonstrate that the proposed method achieves the required performance with less labeled data.
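The selection step of such an active learning loop can be illustrated with a small sketch. It uses a generic margin-based uncertainty score, which is an assumption made for illustration: the abstract does not define the paper's Uncertainty Sampling or Average Margin methods, and the function names, the image-level aggregation, and the toy probabilities below are all hypothetical.

```python
from typing import Dict, List, Sequence


def margin(probs: Sequence[float]) -> float:
    """Gap between the two highest class probabilities of one detection."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1] if len(top) > 1 else top[0]


def image_uncertainty(detections: List[Sequence[float]]) -> float:
    """Image-level uncertainty: 1 minus the mean margin over its detections.

    Assumption: an image with no detections counts as maximally uncertain.
    """
    if not detections:
        return 1.0
    return 1.0 - sum(margin(p) for p in detections) / len(detections)


def select_for_annotation(
    pool: Dict[str, List[Sequence[float]]], budget: int
) -> List[str]:
    """Return the `budget` most uncertain image ids, most uncertain first."""
    ranked = sorted(pool, key=lambda k: image_uncertainty(pool[k]), reverse=True)
    return ranked[:budget]


# Toy unlabeled pool: image id -> class-probability vectors per detection.
pool = {
    "img_a": [[0.95, 0.03, 0.02]],                      # confident prediction
    "img_b": [[0.40, 0.35, 0.25]],                      # ambiguous prediction
    "img_c": [[0.70, 0.20, 0.10], [0.55, 0.40, 0.05]],  # mixed
}
print(select_for_annotation(pool, budget=2))
```

In an iterative setup, the selected images would be labeled, added to the training set, and the detector retrained before the next selection round.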


2021 ◽  
Vol 40 ◽  
pp. 01005
Author(s):  
Mudit Shrivastava ◽  
Rahul Jadhav ◽  
Pranjal Singhal ◽  
Savita R. Bhosale

Over the past decades the use of vehicles has expanded rapidly, giving rise to problems such as managing and controlling traffic, keeping watch on vehicles, and managing parking areas. License plate recognition software is required to address these problems. The proposed work aims to detect the speed of a moving vehicle through its license plate. It fetches vehicle owner details with the help of a CNN model. The main focus of this project is to detect a moving car whenever it crosses dynamic markings. It uses TensorFlow with an SSD object detection model to detect cars. From the detections in each frame the license plate is detected, each vehicle is tracked across the video, and the program checks whether the vehicle has crossed the markings defined in the program itself, from which the speed of the vehicle can be calculated. The detected license plate is forwarded to a trained model in which PyTesseract is used to convert the image to text.
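The speed calculation from marking crossings reduces to simple arithmetic once the crossing frames are known. The sketch below assumes two markings a known real-world distance apart and a constant video frame rate. The function name and all parameter values are hypothetical and are not taken from the paper.

```python
def speed_kmh(frame_enter: int, frame_exit: int, fps: float, distance_m: float) -> float:
    """Estimate vehicle speed from the frames at which it crosses two markings.

    frame_enter / frame_exit: frame indices of the first and second crossing.
    fps: video frame rate (assumed constant).
    distance_m: real-world distance between the two markings, in metres.
    """
    if fps <= 0 or distance_m <= 0:
        raise ValueError("fps and distance_m must be positive")
    if frame_exit <= frame_enter:
        raise ValueError("frame_exit must come after frame_enter")
    elapsed_s = (frame_exit - frame_enter) / fps  # time between crossings
    return distance_m / elapsed_s * 3.6           # m/s converted to km/h


# A car crosses the first marking at frame 120 and the second at frame 150
# in a 30 fps video; the markings are assumed to be 10 m apart.
print(speed_kmh(120, 150, fps=30.0, distance_m=10.0))
```

The accuracy of such an estimate depends on the frame rate and on how precisely the distance between the markings is known.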


2021 ◽  
Vol 2021 ◽  
pp. 1-16
Author(s):  
Di Tian ◽  
Yi Han ◽  
Biyao Wang ◽  
Tian Guan ◽  
Wei Wei

Pedestrian detection is a specific application of object detection. Compared with general object detection, it shows both similarities and unique characteristics. In addition, it has important application value in the fields of intelligent driving and security monitoring. In recent years, with the rapid development of deep learning, pedestrian detection technology has also made great progress. However, a large gap remains between it and human perception, many problems are still unsolved, and there is considerable room for further research. For the application of pedestrian detection in intelligent driving, it is necessary to ensure real-time performance and to make the model lightweight while maintaining detection accuracy. This paper first briefly describes the development of pedestrian detection and then concentrates on summarizing the research results of pedestrian detection technology in the deep learning stage. Subsequently, by summarizing pedestrian detection datasets and evaluation criteria, the core issues in the current development of pedestrian detection are analyzed. Finally, possible future directions for pedestrian detection technology are discussed.


Sensors ◽  
2021 ◽  
Vol 21 (16) ◽  
pp. 5323
Author(s):  
Yongsu Kim ◽  
Hyoeun Kang ◽  
Naufal Suryanto ◽  
Harashta Tatimma Larasati ◽  
Afifatul Mukaroh ◽  
...  

Deep neural networks (DNNs), especially those used in computer vision, are highly vulnerable to adversarial attacks, such as adversarial perturbations and adversarial patches. Adversarial patches, often considered more appropriate for a real-world attack, are attached to the target object or its surroundings to deceive the target system. However, most previous research employed adversarial patches that are conspicuous to human vision, making them easy to identify and counter. Previously, the spatially localized perturbation GAN (SLP-GAN) was proposed, in which the perturbation was only added to the most representative area of the input images, creating a spatially localized adversarial camouflage patch that excels in terms of visual fidelity and is, therefore, difficult to detect by human vision. In this study, the use of the method called eSLP-GAN was extended to deceive classifiers and object detection systems. Specifically, the loss function was modified for greater compatibility with an object-detection model attack and to increase robustness in the real world. Furthermore, the applicability of the proposed method was tested on the CARLA simulator for a more authentic real-world attack scenario.


2021 ◽  
Author(s):  
Sixian Chan ◽  
Jingcheng Zheng ◽  
Lina Wang ◽  
Tingting Wang ◽  
Xiaolong Zhou ◽  
...  

Deep learning models have become the mainstream approach to computer vision tasks. In object detection tasks, the detection box is usually a rectangle aligned with the coordinate axes that fully encloses the object. However, for objects with a large aspect ratio that are rotated relative to the axes, the bounding box has to become large and therefore contains a large amount of useless background information. This study takes a different approach based on YOLOv5: an angle dimension is added to the head, and angle regression is performed alongside bounding box regression, combining CIoU and smooth L1 losses to compute the bounding box loss, so that the resulting box fits the actual object more closely. The original dataset labels are also preprocessed to calculate the angle information of interest. The purpose of these improvements is to realize angle-aware object detection in remote-sensing images, especially for objects with large aspect ratios, such as ships, airplanes, and automobiles. Compared with traditional deep learning object detection models, experimental results show that the proposed method is particularly effective at detecting rotated objects.
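The background problem with axis-aligned boxes can be made concrete with a little geometry. The sketch below compares the area of a rotated rectangle with the area of the smallest axis-aligned box that encloses it. The `(cx, cy, w, h, angle)` parameterization and the example dimensions are assumptions chosen for illustration and do not describe the paper's label format.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def rotated_corners(cx: float, cy: float, w: float, h: float, angle_deg: float) -> List[Point]:
    """Corners of a w-by-h rectangle centred at (cx, cy), rotated by angle_deg."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    half = ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2))
    # Rotate each corner offset about the centre, then translate.
    return [(cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a) for dx, dy in half]


def enclosing_axis_aligned_area(corners: List[Point]) -> float:
    """Area of the smallest axis-aligned box containing all corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


# A long, thin object (for example a ship) of 100 x 10 pixels, rotated 45 degrees.
w, h = 100.0, 10.0
corners = rotated_corners(0.0, 0.0, w, h, 45.0)
aabb_area = enclosing_axis_aligned_area(corners)
background_fraction = 1.0 - (w * h) / aabb_area
print(f"object area: {w * h:.0f}, enclosing box area: {aabb_area:.0f}")
print(f"background fraction of enclosing box: {background_fraction:.2f}")
```

For this example the enclosing box is roughly six times larger than the object itself, which is the kind of wasted background that an angle-aware box avoids.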


Author(s):  
Dasom Seo ◽  
Kyoung-Chul Kim ◽  
Meonghun Lee ◽  
Kyung-Do Kwon ◽  
Gookhwan Kim

Author(s):  
Ryan Motley ◽  
Andrew L Fielding ◽  
Prabhakar Ramachandran

Purpose: The aim of this study was to assess the feasibility of developing and training a deep learning object detection model for automating the assessment of fiducial marker migration and tracking of the prostate in radiotherapy patients. Methods and Materials: A fiducial marker detection model was trained on the YOLO v2 detection framework using approximately 20,000 pelvis kV projection images with fiducial markers labelled. The ability of the trained model to detect marker positions was validated by tracking the motion of markers in a respiratory phantom and comparing detection data with the expected displacement from a reference position. Marker migration was then assessed in 14 prostate radiotherapy patients using the detector for comparison with previously conducted studies. This was done by determining variations in intermarker distance between the first and subsequent fractions in each patient. Results: On completion of training, the detection model operated at a 96% detection efficacy with a root mean square error of 0.3 pixels. By determining the displacement from a reference position in a respiratory phantom, both experimentally and with the detector, it was found that the detector computed displacements with a mean accuracy of 97.8% compared to the actual values. Interfraction marker migration was measured in 14 patients, and the average and maximum ± standard deviation marker migration were found to be 2.0 ± 0.9 mm and 2.3 ± 0.9 mm, respectively. Conclusion: This study demonstrates the benefits of pairing deep learning object detection with image-guided radiotherapy and shows how a workflow to automate the assessment of organ motion and seed migration during prostate radiotherapy can be developed. The high detection efficacy and low error demonstrate the advantages of using a pre-trained model to automate the assessment of the target volume positional variation and the migration of fiducial markers between fractions.
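The idea of assessing migration through variations in intermarker distance between the first and subsequent fractions can be sketched as follows. This is a minimal illustration under stated assumptions: marker positions are taken to be already available as 3D coordinates in millimetres, the per-pair absolute change is one plausible statistic rather than the study's confirmed one, and all function names and coordinates are hypothetical.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

Marker = Tuple[float, float, float]


def intermarker_distances(markers: Sequence[Marker]) -> List[float]:
    """Euclidean distance for every pair of markers, in a fixed pair order."""
    return [math.dist(a, b) for a, b in combinations(markers, 2)]


def migration_vs_first(fractions: Sequence[Sequence[Marker]]) -> List[List[float]]:
    """Absolute change in each intermarker distance relative to the first fraction.

    Assumes the markers are listed in the same order in every fraction.
    """
    reference = intermarker_distances(fractions[0])
    changes = []
    for markers in fractions[1:]:
        current = intermarker_distances(markers)
        changes.append([abs(c - r) for c, r in zip(current, reference)])
    return changes


# Hypothetical marker coordinates (mm) for two treatment fractions.
fractions = [
    [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 10.0)],  # first fraction
    [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 12.0)],  # later fraction
]
for i, change in enumerate(migration_vs_first(fractions), start=2):
    print(f"fraction {i}: max change {max(change):.2f} mm")
```

Working with distances between markers, rather than absolute marker positions, makes the measure insensitive to rigid shifts of the whole patient between fractions.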

