An Ensemble One Dimensional Convolutional Neural Network with Bayesian Optimization for Environmental Sound Classification

Mohammed Gamal Ragab; Said Jadid Abdulkadir; Norshakirah Aziz; Hitham Alhussian; Abubakar Bala; Alawi Alqushaibi

doi:10.3390/app11104660

Applied Sciences ◽

10.3390/app11104660 ◽

2021 ◽

Vol 11 (10) ◽

pp. 4660

Author(s):

Mohammed Gamal Ragab ◽

Said Jadid Abdulkadir ◽

Norshakirah Aziz ◽

Hitham Alhussian ◽

Abubakar Bala ◽

...

Keyword(s):

Neural Network ◽

Deep Learning ◽

Audio Signal ◽

Bayesian Optimization ◽

Classification Problems ◽

Environmental Sound ◽

One Dimensional ◽

Sound Classification ◽

Hyperparameter Selection ◽

End To End

With the growth of deep learning across classification problems, many researchers have applied deep learning methods to environmental sound classification. This paper introduces an end-to-end method for environmental sound classification based on a one-dimensional convolutional neural network with Bayesian optimization and ensemble learning, which learns feature representations directly from the audio signal. Several convolutional layers capture the signal and learn filters relevant to the classification problem. The proposed method can handle audio signals of any length, because a sliding window divides the signal into overlapping frames. Hyperparameter selection and model evaluation were carried out with Bayesian optimization and cross-validation. Multiple models with different settings were developed based on Bayesian optimization to ensure network convergence in both convex and non-convex optimization. The proposed end-to-end model was evaluated on the UrbanSound8K dataset. It achieved a classification accuracy of 94.46%, which is 5% higher than existing end-to-end approaches, with fewer trainable parameters. Model performance was measured with the following indices: sensitivity, specificity, accuracy, precision, recall, F-measure, area under the ROC curve, and area under the precision-recall curve. The proposed approach outperformed state-of-the-art end-to-end approaches that use hand-crafted features as input on the selected measurement indices and in time complexity.
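
The sliding-window step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the frame length, the overlap, or the way per-frame outputs are combined, so `frame_len`, `hop_len`, the zero-padding of the tail, and the probability averaging in `aggregate_predictions` are all assumptions made for illustration, not the authors' settings.

```python
import numpy as np

def frame_signal(signal, frame_len, hop_len):
    """Split a 1-D signal into overlapping frames (assumed: zero-pad the tail)."""
    signal = np.asarray(signal, dtype=np.float32)
    if len(signal) <= frame_len:
        n_frames = 1
    else:
        n_frames = 1 + int(np.ceil((len(signal) - frame_len) / hop_len))
    # Pad so that the last frame is complete.
    padded_len = frame_len + (n_frames - 1) * hop_len
    padded = np.zeros(padded_len, dtype=np.float32)
    padded[:len(signal)] = signal
    starts = np.arange(n_frames) * hop_len
    return np.stack([padded[s:s + frame_len] for s in starts])

def aggregate_predictions(frame_probs):
    """Assumed aggregation rule: average per-frame class probabilities."""
    return int(np.argmax(np.mean(frame_probs, axis=0)))

# A 10-sample signal with 4-sample frames and 50% overlap gives 4 frames.
frames = frame_signal(np.arange(10), frame_len=4, hop_len=2)
print(frames.shape)
```

Because every frame has the same length, a fixed-input network can score each frame separately, and a clip of any duration reduces to a variable number of fixed-size inputs.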

Environmental Sound Classification Using Neural Network and Deep Learning

Springer Tracts in Nature-Inspired Computing - Nature-Inspired Computing for Smart Application Design ◽

10.1007/978-981-33-6195-9_3 ◽

2021 ◽

pp. 25-59

Author(s):

Dharma Rane ◽

Pushkar Shirodkar ◽

Trilochan Panigrahi ◽

S. Mini

Keyword(s):

Neural Network ◽

Deep Learning ◽

Environmental Sound ◽

Sound Classification

End-to-end environmental sound classification using a 1D convolutional neural network

Expert Systems with Applications ◽

10.1016/j.eswa.2019.06.040 ◽

2019 ◽

Vol 136 ◽

pp. 252-263 ◽

Cited By ~ 28

Author(s):

Sajjad Abdoli ◽

Patrick Cardinal ◽

Alessandro Lameiras Koerich

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Environmental Sound ◽

Sound Classification ◽

End To End

Matching Large Baseline Oblique Stereo Images Using an End-to-End Convolutional Neural Network

Remote Sensing ◽

10.3390/rs13020274 ◽

2021 ◽

Vol 13 (2) ◽

pp. 274

Author(s):

Guobiao Yao ◽

Alper Yilmaz ◽

Li Zhang ◽

Fei Meng ◽

Haibin Ai ◽

...

Keyword(s):

Neural Network ◽

Deep Learning ◽

Convolutional Neural Network ◽

Stereo Matching ◽

Least Square ◽

Affine Invariant ◽

Stereo Images ◽

Distance Ratio ◽

Matching Algorithm ◽

End To End

The available stereo matching algorithms either produce a large number of false-positive matches or produce only a few true positives across oblique stereo images with a large baseline. This undesired result is due to the complex perspective deformation and radiometric distortion across the images. To address this problem, we propose a novel affine-invariant feature matching algorithm with subpixel accuracy based on an end-to-end convolutional neural network (CNN). In our method, we adopt and modify a Hessian affine network, which we refer to as IHesAffNet, to obtain affine-invariant Hessian regions using a deep learning framework. To improve the correlation between corresponding features, we introduce an empirical weighted loss function (EWLF) based on negative samples selected using K nearest neighbors, and then generate highly discriminative deep learning-based descriptors with our multiple hard network structure (MTHardNets). Following this step, the conjugate features are produced using the Euclidean distance ratio as the matching metric, and the accuracy of the matches is optimized through deep learning transform based least square matching (DLT-LSM). Finally, experiments on large-baseline oblique stereo images acquired by ground close-range and unmanned aerial vehicle (UAV) platforms verify the effectiveness of the proposed approach, and comprehensive comparisons demonstrate that our matching algorithm outperforms state-of-the-art methods in terms of accuracy, distribution, and correct ratio. The main contributions of this article are: (i) our proposed MTHardNets can generate high-quality descriptors; and (ii) the IHesAffNet can produce substantial affine-invariant corresponding features with reliable transform parameters.
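
Matching with a Euclidean distance ratio, as named in the abstract, is commonly implemented as a nearest-neighbor ratio test. The sketch below shows that general technique on plain NumPy arrays; the function name `ratio_test_matches` and the threshold `max_ratio=0.8` are illustrative assumptions, and the descriptors here are toy vectors rather than MTHardNets outputs.

```python
import numpy as np

def ratio_test_matches(desc_a, desc_b, max_ratio=0.8):
    """Match descriptors in desc_a to desc_b with a nearest-neighbor ratio test.

    A match (i, j) is kept only when the nearest descriptor j is clearly
    closer than the second nearest. desc_b needs at least two rows.
    """
    desc_a = np.asarray(desc_a, dtype=np.float64)
    desc_b = np.asarray(desc_b, dtype=np.float64)
    # Pairwise Euclidean distances, shape (len(desc_a), len(desc_b)).
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    matches = []
    for i, row in enumerate(dists):
        order = np.argsort(row)
        nearest, second = order[0], order[1]
        if row[nearest] < max_ratio * row[second]:
            matches.append((i, int(nearest)))
    return matches

left = [[0.0, 0.0], [5.0, 5.0]]
right = [[0.0, 0.1], [5.0, 5.2], [2.5, 2.5]]
print(ratio_test_matches(left, right))
```

The ratio test discards ambiguous candidates: a descriptor that is equally close to two candidates produces no match, which trades a few true positives for fewer false ones.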

Deep Convolutional Neural Network with Transfer Learning for Environmental Sound Classification

2021 International Conference on Computer, Control and Robotics (ICCCR) ◽

10.1109/icccr49711.2021.9349393 ◽

2021 ◽

Author(s):

Jianrui Lu ◽

Ruofei Ma ◽

Gongliang Liu ◽

Zhiliang Qin

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Transfer Learning ◽

Deep Convolutional Neural Network ◽

Environmental Sound ◽

Sound Classification

SHEDR: An End-to-End Deep Neural Event Detection and Recommendation Framework for Hyperlocal News Using Social Media

INFORMS Journal on Computing ◽

10.1287/ijoc.2021.1112 ◽

2021 ◽

Author(s):

Yuheng Hu ◽

Yili Hong

Keyword(s):

Neural Network ◽

Social Media ◽

Deep Learning ◽

Event Detection ◽

Large Scale ◽

Short Term Memory ◽

State Of The Art ◽

Neural Network Models ◽

Neural Event ◽

End To End

Residents often rely on newspapers and television to gather hyperlocal news for community awareness and engagement. More recently, social media have emerged as an increasingly important source of hyperlocal news. Thus far, the literature on using social media to create desirable societal benefits, such as civic awareness and engagement, is still in its infancy. One key challenge in this research stream is to distill information from noisy social media data streams for community members in a timely and accurate manner. In this work, we develop SHEDR (social media–based hyperlocal event detection and recommendation), an end-to-end neural event detection and recommendation framework with a particular use case for Twitter to facilitate residents’ information seeking of hyperlocal events. The key model innovation in SHEDR lies in the design of the hyperlocal event detector and the event recommender. First, we harness the power of two popular deep neural network models, the convolutional neural network (CNN) and long short-term memory (LSTM), in a novel joint CNN-LSTM model to characterize spatiotemporal dependencies for capturing unusualness in a region of interest, which is classified as a hyperlocal event. Next, we develop a neural pairwise ranking algorithm for recommending detected hyperlocal events to residents based on their interests. To alleviate the sparsity issue and improve personalization, our algorithm incorporates several types of contextual information covering topic, social, and geographical proximities. We perform comprehensive evaluations based on two large-scale data sets comprising geotagged tweets covering Seattle and Chicago. We demonstrate the effectiveness of our framework in comparison with several state-of-the-art approaches. We show that our hyperlocal event detection and recommendation models consistently and significantly outperform other approaches in terms of precision, recall, and F-1 scores.
Summary of Contribution: In this paper, we focus on a novel and important, yet largely underexplored application of computing: how to improve civic engagement in local neighborhoods via local news sharing and consumption based on social media feeds. To address this question, we propose two new computational and data-driven methods: (1) a deep learning–based hyperlocal event detection algorithm that scans spatially and temporally to detect hyperlocal events from geotagged Twitter feeds; and (2) a personalized deep learning–based hyperlocal event recommender system that systematically integrates several contextual cues such as topical, geographical, and social proximity to recommend the detected hyperlocal events to potential users. We conduct a series of experiments to examine our proposed models. The outcomes demonstrate that our algorithms are significantly better than the state-of-the-art models and can provide users with more relevant information about the local neighborhoods that they live in, which in turn may boost their community engagement.
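
The abstract does not specify the loss behind its neural pairwise ranking algorithm. As a rough illustration of pairwise ranking in general, the sketch below uses a standard logistic pairwise loss, which is an assumed stand-in rather than the SHEDR objective; `pairwise_ranking_loss` and the toy scores are hypothetical.

```python
import numpy as np

def pairwise_ranking_loss(pos_scores, neg_scores):
    """Logistic pairwise loss: mean of -log(sigmoid(pos - neg)).

    pos_scores[k] is the model score of an event the user engaged with,
    neg_scores[k] the score of an event the same user did not engage with.
    """
    diff = np.asarray(pos_scores, dtype=np.float64) - np.asarray(neg_scores, dtype=np.float64)
    # -log(sigmoid(d)) equals log(1 + exp(-d)); logaddexp keeps it stable.
    return float(np.mean(np.logaddexp(0.0, -diff)))

# The loss is small when relevant events are scored above irrelevant ones.
well_ranked = pairwise_ranking_loss([2.0, 3.0], [0.0, 1.0])
badly_ranked = pairwise_ranking_loss([0.0, 1.0], [2.0, 3.0])
print(well_ranked < badly_ranked)
```

A pairwise objective only needs relative preferences between events for the same user, which suits sparse interaction data better than predicting absolute ratings.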

Convolutional Neural Network-Gated Recurrent Unit Neural Network with Feature Fusion for Environmental Sound Classification

Automatic Control and Computer Sciences ◽

10.3103/s0146411621040106 ◽

2021 ◽

Vol 55 (4) ◽

pp. 311-318

Author(s):

Yu Zhang ◽

Jinfang Zeng ◽

Youming Li ◽

Da Chen

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Feature Fusion ◽

Environmental Sound ◽

Sound Classification ◽

Gated Recurrent Unit

Research on fault diagnosis of automobile engines based on the deep learning 1D-CNN method

Engineering Research Express ◽

10.1088/2631-8695/ac4834 ◽

2022 ◽

Author(s):

Canyi Du ◽

Rui Zhong ◽

Yishen Zhuo ◽

Xinyu Zhang ◽

Feifei Yu ◽

...

Keyword(s):

Neural Network ◽

Pattern Recognition ◽

Deep Learning ◽

Fault Diagnosis ◽

Convolutional Neural Network ◽

Pattern Recognition Method ◽

Recognition Method ◽

Vibration Signals ◽

One Dimensional ◽

Sample Set

Traditional engine fault diagnosis methods usually require features to be extracted manually before a pattern recognition method classifies them, which makes end-to-end fault diagnosis difficult. In recent years, deep learning has been applied in many fields, and in the automotive field its applications include image recognition, language processing, and assisted driving. In this paper, a one-dimensional convolutional neural network (1D-CNN) is used to process vibration signals for fault diagnosis and classification. Vibration signals collected under different engine working conditions are organized into sets of data by working cycle and divided into a training sample set and a test sample set. Then, a one-dimensional convolutional neural network model is built in Python so that the feature filters (convolution kernels) learn from the training set, and the learned kernels then process the input data of the test set. Convolution and pooling extract features and map them to a new space, so the network learns features directly from the original vibration signals and completes fault diagnosis. The experimental results show that the pattern recognition method based on a one-dimensional convolutional neural network can be effectively applied to engine fault diagnosis and has higher diagnostic accuracy than traditional methods.
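
The convolution and pooling operations the abstract refers to can be shown on a toy signal. The sketch below is plain NumPy rather than the authors' model: the kernel is hand-picked instead of learned, and the signal, `conv1d_valid`, and `max_pool1d` are illustrative assumptions.

```python
import numpy as np

def conv1d_valid(signal, kernel):
    """Slide the kernel over the signal without padding (cross-correlation,
    which is what CNN layers compute)."""
    signal = np.asarray(signal, dtype=np.float64)
    kernel = np.asarray(kernel, dtype=np.float64)
    n_out = len(signal) - len(kernel) + 1
    return np.array([np.dot(signal[i:i + len(kernel)], kernel) for i in range(n_out)])

def max_pool1d(features, pool_size):
    """Non-overlapping max pooling; a trailing remainder is dropped."""
    n_out = len(features) // pool_size
    return features[:n_out * pool_size].reshape(n_out, pool_size).max(axis=1)

# Toy periodic "vibration" signal and a hand-picked kernel (a trained
# network would learn its kernels from the training set).
signal = np.array([0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0])
kernel = np.array([1.0, 0.0, -1.0])

features = np.maximum(conv1d_valid(signal, kernel), 0.0)  # ReLU activation
pooled = max_pool1d(features, pool_size=2)
print(features, pooled)
```

The kernel responds where the signal falls from a peak to a trough, and pooling keeps the strongest response in each window while halving the sequence length.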

End-to-End Deep Learning Fusion of Fingerprint and Electrocardiogram Signals for Presentation Attack Detection

Sensors ◽

10.3390/s20072085 ◽

2020 ◽

Vol 20 (7) ◽

pp. 2085 ◽

Cited By ~ 1

Author(s):

Rami M. Jomaa ◽

Hassan Mathkour ◽

Yakoub Bazi ◽

Md Saiful Islam

Keyword(s):

Neural Network ◽

Deep Learning ◽

Convolutional Neural Network ◽

Attack Detection ◽

Ecg Signal ◽

Ecg Signals ◽

Biometric Systems ◽

Fingerprint Biometrics ◽

End To End ◽

The Impact

Although fingerprint-based systems are among the most commonly used biometric systems, they suffer from a critical vulnerability to presentation attacks (PAs). Therefore, several approaches based on fingerprint biometrics have been developed to increase robustness against PAs. We propose an alternative approach based on the combination of fingerprint and electrocardiogram (ECG) signals. An ECG signal has advantageous characteristics that prevent replication. Combining a fingerprint with an ECG signal is a potentially interesting solution to reduce the impact of PAs in biometric systems. We also propose a novel end-to-end deep learning-based fusion neural architecture between a fingerprint and an ECG signal to improve PA detection in fingerprint biometrics. Our model uses state-of-the-art EfficientNets for generating a fingerprint feature representation. For the ECG, we investigate three different architectures based on fully-connected layers (FC), a 1D-convolutional neural network (1D-CNN), and a 2D-convolutional neural network (2D-CNN). The 2D-CNN converts the ECG signals into an image and uses inverted Mobilenet-v2 layers for feature generation. We evaluated the method on a multimodal dataset, that is, a customized fusion of the LivDet 2015 fingerprint dataset and ECG data from real subjects. Experimental results reveal that this architecture yields a better average classification accuracy compared to a single fingerprint modality.

A Position Weighted Information Based Word Embedding Model for Machine Translation

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213020400059 ◽

2020 ◽

Vol 29 (07n08) ◽

pp. 2040005

Author(s):

Zhen Li ◽

Dan Qu ◽

Yanxia Li ◽

Chaojie Xie ◽

Qi Chen

Keyword(s):

Neural Network ◽

Deep Learning ◽

Machine Translation ◽

Semantic Information ◽

Word Embedding ◽

Vector Model ◽

Learning Technology ◽

Initial Value ◽

Input Layer ◽

End To End

Deep learning technology has driven the development of neural machine translation (NMT), and the end-to-end (E2E) approach has become the mainstream in NMT. E2E-NMT uses word vectors as the initial values of the input layer, so the quality of the word vector model directly affects translation accuracy. Researchers have proposed many approaches to learning word representations and have achieved significant results. However, the drawbacks of these methods still limit the performance of E2E-NMT systems. This paper focuses on word embedding technology and proposes the PW-CBOW word vector model, which can represent semantic information better. We apply these word vector models to the IWSLT14 German-English, WMT14 English-German, and WMT14 English-French corpora to evaluate the performance of the PW-CBOW model. The results show that the PW-CBOW word vector model can improve the performance of the latest E2E-NMT systems.