Deep Learning-Based Cattle Vocal Classification Model and Real-Time Livestock Monitoring System with Noise Filtering

Animals ◽  
2021 ◽  
Vol 11 (2) ◽  
pp. 357
Author(s):  
Dae-Hyun Jung ◽  
Na Yeon Kim ◽  
Sang Ho Moon ◽  
Changho Jhin ◽  
Hak-Jin Kim ◽  
...  

The priority placed on animal welfare in the meat industry is increasing the importance of understanding livestock behavior. In this study, we developed a web-based monitoring and recording system based on artificial intelligence analysis for the classification of cattle sounds. The deep learning classification model of the system is a convolutional neural network (CNN) that takes vocal information converted to Mel-frequency cepstral coefficients (MFCCs) as input. The CNN model initially achieved an accuracy of 91.38% in recognizing cattle sounds. Short-time Fourier transform-based noise filtering was then applied to remove background noise, improving the recognition accuracy to 94.18%. The recognized cattle vocalizations were categorized into four classes, and a total of 897 labeled records were acquired to develop the classification model, which reached a final accuracy of 81.96%. Our proposed web-based platform, fed by a total of 12 sound sensors, provides cattle vocalization monitoring in real time, enabling farm owners to determine the status of their cattle.
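The STFT-based noise filtering step described above can be sketched as spectral gating: estimate a per-frequency noise floor from a noise-only clip, suppress time-frequency bins below it, and invert the transform. A minimal numpy/scipy sketch (the margin and window length are illustrative assumptions, not the paper's parameters):

```python
import numpy as np
from scipy.signal import stft, istft

def spectral_gate(signal, fs, noise_clip, margin=2.0, nperseg=512):
    """Zero out time-frequency bins whose magnitude falls below a
    per-frequency noise floor estimated from a noise-only clip."""
    _, _, noise_spec = stft(noise_clip, fs=fs, nperseg=nperseg)
    noise_floor = margin * np.abs(noise_spec).mean(axis=1, keepdims=True)

    _, _, spec = stft(signal, fs=fs, nperseg=nperseg)
    mask = np.abs(spec) >= noise_floor   # keep only bins above the floor
    _, cleaned = istft(spec * mask, fs=fs, nperseg=nperseg)
    return cleaned

# Synthetic demo: a 1 kHz "vocalization" buried in white background noise.
rng = np.random.default_rng(0)
fs = 16000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1000 * t)
noisy = tone + 0.5 * rng.standard_normal(fs)
cleaned = spectral_gate(noisy, fs, noise_clip=0.5 * rng.standard_normal(fs))
```

Because most noise bins fall below the gated floor, the cleaned signal retains the tonal content while shedding most of the broadband noise energy.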

Sensors ◽  
2021 ◽  
Vol 21 (12) ◽  
pp. 4045
Author(s):  
Alessandro Sassu ◽  
Jose Francisco Saenz-Cogollo ◽  
Maurizio Agelli

Edge computing is the best approach for meeting the exponential demand and the real-time requirements of many video analytics applications. Since most of the recent advances regarding the extraction of information from images and video rely on computation-heavy deep learning algorithms, there is a growing need for solutions that allow the deployment and use of new models on scalable and flexible edge architectures. In this work, we present Deep-Framework, a novel open-source framework for developing edge-oriented real-time video analytics applications based on deep learning. Deep-Framework has a scalable multi-stream architecture based on Docker and abstracts away from the user the complexity of cluster configuration, orchestration of services, and GPU resource allocation. It provides Python interfaces for integrating deep learning models developed with the most popular frameworks, as well as high-level APIs based on standard HTTP and WebRTC interfaces for consuming the extracted video data on clients running in browsers or on any other web-based platform.


Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 555
Author(s):  
Jui-Sheng Chou ◽  
Chia-Hsuan Liu

Sand theft and illegal mining in river dredging areas have been a problem in recent decades. For this reason, increasing the use of artificial intelligence in dredging areas, building automated monitoring systems, and reducing human involvement can effectively deter crime and lighten the workload of security guards. In this investigation, a smart dredging construction site system was developed using automated techniques suited to various areas. The aim in the initial period of the smart dredging construction was to automate the audit work at the control point, which manages trucks in river dredging areas. Images of dump trucks entering the control point were captured using monitoring equipment in the construction area. The obtained images and the deep learning technique YOLOv3 were used to detect the positions of the vehicle license plates. Framed images of the vehicle license plates were captured and used as input to an image classification model, C-CNN-L3, to identify the number of characters on the license plate. Based on the classification results, the images of the vehicle license plates were transmitted to a text recognition model, R-CNN-L3, which recognized the characters on the license plate. Finally, the models of each stage were integrated into a real-time truck license plate recognition (TLPR) system; the single-character recognition rate was 97.59%, the overall recognition rate was 93.73%, and the speed was 0.3271 s/image. The TLPR system reduces the labor and time required to identify license plates, effectively reducing the probability of crime and increasing the transparency, automation, and efficiency of the frontline personnel's work. The TLPR is the first step toward an automated operation to manage trucks at the control point. The subsequent and ongoing development of system functions can advance dredging operations toward the goal of a smart construction site.
By providing a vehicle LPR system intended to support an intelligent and highly efficient management system for dredging-related departments, this paper contributes to the current body of knowledge by presenting an objective approach to truck license plate recognition.
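The three-stage flow (plate detection, character-count classification, then text recognition) can be outlined as a simple pipeline. The stage functions below are hypothetical stubs standing in for the trained YOLOv3, C-CNN-L3, and R-CNN-L3 models:

```python
from dataclasses import dataclass

@dataclass
class PlateResult:
    text: str
    num_chars: int

def detect_plate(image):
    """Stub for the YOLOv3 stage: a real detector would return the
    cropped license plate region of the image."""
    return image

def classify_char_count(plate_crop):
    """Stub for C-CNN-L3: predicts how many characters the plate holds
    (fixed at 7 here purely for the sketch)."""
    return 7

def recognize_text(plate_crop, num_chars):
    """Stub for R-CNN-L3: reads the characters, routed by the predicted
    count; the +1 accounts for the separator in this toy plate format."""
    return "ABC-1234"[:num_chars + 1]

def recognize_truck_plate(image):
    crop = detect_plate(image)
    n = classify_char_count(crop)
    return PlateResult(text=recognize_text(crop, n), num_chars=n)

result = recognize_truck_plate(image=None)
```

Routing recognition by the predicted character count is what lets a dedicated text model be selected per plate layout, as the abstract describes.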


Energies ◽  
2020 ◽  
Vol 13 (15) ◽  
pp. 3930 ◽  
Author(s):  
Ayaz Hussain ◽  
Umar Draz ◽  
Tariq Ali ◽  
Saman Tariq ◽  
Muhammad Irfan ◽  
...  

Increasing waste generation has become a significant issue across the globe due to the rapid increase in urbanization and industrialization. In the literature, many issues that have a direct impact on the increase of waste and the improper disposal of waste have been investigated. Most of the existing work has focused on providing a cost-efficient solution for monitoring garbage collection systems using the Internet of Things (IoT). Though an IoT-based solution provides real-time monitoring of a garbage collection system, it is limited in controlling the spread of overspill and the blowout of odorous gases. Poor and inadequate disposal of waste produces toxic gases and radiation in the environment, which have adverse effects on human health, the greenhouse system, and global warming. Considering the importance of air pollutants, it is imperative to monitor and forecast their concentration in addition to managing the waste itself. In this paper, we present an IoT-based smart bin that uses machine and deep learning models to manage the disposal of garbage and to forecast the air pollutants present in the surrounding bin environment. The smart bin is connected to an IoT-based server on the Google Cloud Platform (GCP), which performs the computation necessary for predicting the status of the bin and for forecasting air quality based on real-time data. We experimented with traditional models (the k-nearest neighbors algorithm (k-NN) and logistic regression) and a non-traditional algorithm (a long short-term memory (LSTM) network-based deep learning model) for creating alert messages regarding bin status and for forecasting the amount of the air pollutant carbon monoxide (CO) present in the air at a specific instance. The recalls of logistic regression and the k-NN algorithm are 79% and 83%, respectively, in a real-time testing environment for predicting the status of the bin.
The accuracies of the modified LSTM and simple LSTM models are 90% and 88%, respectively, in predicting the future concentration of gases present in the air. The system resulted in a delay of 4 s in the creation and transmission of the alert message to a sanitary worker. The system provided real-time monitoring of garbage levels along with notifications from the alert mechanism. The proposed work provides improved accuracy by utilizing machine learning compared to existing solutions based on simpler approaches.
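The bin-status step can be illustrated with a plain k-NN classifier and the recall metric the authors report; the sensor readings below are synthetic stand-ins, not the paper's data:

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=3):
    """Plain k-nearest-neighbors majority vote using Euclidean distance."""
    preds = []
    for x in X_test:
        dists = np.linalg.norm(X_train - x, axis=1)
        nearest = y_train[np.argsort(dists)[:k]]
        preds.append(np.bincount(nearest).argmax())
    return np.array(preds)

def recall(y_true, y_pred, positive=1):
    """Recall = true positives / actual positives."""
    actual = y_true == positive
    return np.sum((y_pred == positive) & actual) / np.sum(actual)

# Synthetic (fill level, gas reading) pairs: class 1 = "bin full".
rng = np.random.default_rng(1)
X_empty = rng.normal(0.2, 0.05, size=(50, 2))
X_full = rng.normal(0.9, 0.05, size=(50, 2))
X = np.vstack([X_empty, X_full])
y = np.array([0] * 50 + [1] * 50)
y_hat = knn_predict(X, y, X, k=3)
r = recall(y, y_hat)
```

On these well-separated synthetic clusters recall is near perfect; the paper's 83% figure reflects the much noisier real-time sensor environment.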


2020 ◽  
Vol 12 (1) ◽  
pp. 1-11
Author(s):  
Arivudainambi D. ◽  
Varun Kumar K.A. ◽  
Vinoth Kumar R. ◽  
Visu P.

Ransomware is malware that locks a system's data with modern encryption techniques, releasing it only once a ransom is paid. In this research, the authors show how ransomware propagates and infects devices. Live ransomware traffic has been meticulously analyzed. Further, a novel method for the classification of ransomware traffic using deep learning methods is presented. Based on this classification, the detection of ransomware is approached through the characteristics of the network traffic and its communications. In more detail, the behavior of the popular ransomware CryptoWall is analyzed, and based on this knowledge, a real-time ransomware live traffic classification model is proposed.
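The paper's deep model is not reproduced here, but one classic signal such a traffic classifier can exploit is that encrypted ransomware payloads exhibit near-maximal byte entropy. A stdlib-only sketch of that heuristic (the threshold is an assumption, not the paper's method):

```python
import math
import os
from collections import Counter

def byte_entropy(payload: bytes) -> float:
    """Shannon entropy of a byte sequence, in bits per byte (0..8)."""
    counts = Counter(payload)
    total = len(payload)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def looks_encrypted(payload: bytes, threshold: float = 7.0) -> bool:
    """Encrypted or compressed payloads sit near 8 bits/byte; plaintext
    protocols sit far lower. The 7.0 cutoff is an assumed value."""
    return byte_entropy(payload) > threshold

encrypted_like = os.urandom(4096)  # stand-in for an encrypted flow payload
plaintext_like = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n" * 100
```

A deep model would learn many such flow features jointly; this single-feature gate only illustrates why encrypted command-and-control traffic is separable from plaintext protocols at all.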


Sensors ◽  
2020 ◽  
Vol 21 (1) ◽  
pp. 210
Author(s):  
Dongsuk Park ◽  
Seungeui Lee ◽  
SeongUk Park ◽  
Nojun Kwak

With the upsurge in the use of Unmanned Aerial Vehicles (UAVs) in various fields, detecting and identifying them in real time are becoming important topics. However, the identification of UAVs is difficult due to their characteristics such as low altitude, slow speed, and small radar cross-section (LSS). With the existing deterministic approach, the algorithm becomes complex and requires a large number of computations, making it unsuitable for real-time systems. Hence, effective alternatives enabling real-time identification of these new threats are needed. Deep learning-based classification models learn features from data by themselves and have shown outstanding performance in computer vision tasks. In this paper, we propose a deep learning-based classification model that learns the micro-Doppler signatures (MDS) of targets represented on radar spectrogram images. To enable this, we first recorded five LSS targets (three types of UAVs and two different types of human activities) with a frequency-modulated continuous-wave (FMCW) radar in various scenarios. Then, we converted the signals into spectrogram images using the short-time Fourier transform (STFT). After data refinement and augmentation, we built our own radar spectrogram dataset. Second, we analyzed the characteristics of the radar spectrogram dataset with the ResNet-18 model and designed the ResNet-SP model, with less computation and higher accuracy and stability, based on the ResNet-18 model. The results show that the proposed ResNet-SP has a training time of 242 s and an accuracy of 83.39%, which is superior to the ResNet-18, which takes 640 s for training with an accuracy of 79.88%.
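The signal-to-spectrogram conversion can be sketched with plain numpy: frame the signal, window each frame, and stack log-magnitude FFTs into an image for the CNN. Window and hop sizes below are illustrative, not the paper's radar settings:

```python
import numpy as np

def log_spectrogram(signal, nperseg=256, hop=128):
    """Frame the signal, apply a Hann window to each frame, and stack
    log-magnitude FFTs into a (freq_bins, frames) image."""
    window = np.hanning(nperseg)
    n_frames = 1 + (len(signal) - nperseg) // hop
    frames = np.stack([signal[i * hop : i * hop + nperseg] * window
                       for i in range(n_frames)])
    spec = np.abs(np.fft.rfft(frames, axis=1)).T   # (freq, time)
    return 20 * np.log10(spec + 1e-10)             # dB scale

# Chirp-like test signal standing in for a micro-Doppler radar return.
t = np.linspace(0, 1, 8000)
sig = np.sin(2 * np.pi * (100 + 400 * t) * t)
img = log_spectrogram(sig)
```

The resulting 2-D array is what gets saved as an image and fed to ResNet-style classifiers; the time-varying frequency track of a rotating rotor is what forms the micro-Doppler signature.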


2021 ◽  
Author(s):  
Xiangjian Liu ◽  
Yishan Zou ◽  
Yu Sun

Dogs have a tendency to bark at loud noises that they perceive as an intruder or a threat, and the hostile barking can often last for hours depending on the duration of the noise. These barking sessions are unnecessary and negatively impact the quality of life of others in the community, causing annoyance to neighbors [1]. Neighbors have the right to file noise complaints with the Home Owners Association, potentially resulting in fines or even the removal of the pet [2]. In this paper, we discuss the development of an algorithm that takes in audio inputs through a microphone, processes the audio and identifies through machine learning whether the audio clip contains dog barking, and ultimately sends a notification to the user. By integrating our application into the everyday life of dog owners, it allows them to accurately determine the status of their dog in real time with minimal false reports.
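The audio front end of such a system can be sketched as a simple energy gate that flags sustained loud segments as candidate bark events for the classifier; the RMS threshold and window length are assumptions for the sketch:

```python
import numpy as np

def detect_loud_events(audio, fs, win_s=0.05, rms_threshold=0.1):
    """Return (start, end) sample indices of contiguous runs of
    short windows whose RMS exceeds the (assumed) threshold."""
    win = int(fs * win_s)
    n = len(audio) // win
    rms = np.sqrt(np.mean(audio[:n * win].reshape(n, win) ** 2, axis=1))
    loud = rms > rms_threshold
    events, start = [], None
    for i, flag in enumerate(loud):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            events.append((start * win, i * win))
            start = None
    if start is not None:
        events.append((start * win, n * win))
    return events

# Silence with one mid-clip burst standing in for a bark.
fs = 8000
audio = np.zeros(fs)
audio[3000:4000] = 0.5 * np.sin(2 * np.pi * 600 * np.arange(1000) / fs)
events = detect_loud_events(audio, fs)
```

Only the flagged segments would then be passed to the machine learning classifier, which keeps false reports low by never classifying silence.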


2021 ◽  
Vol 18 (2(Suppl.)) ◽  
pp. 0925
Author(s):  
Asroni Asroni ◽  
Ku Ruhana Ku-Mahamud ◽  
Cahya Damarjati ◽  
Hasan Basri Slamat

Deep learning convolutional neural networks have been widely used to recognize or classify voice. Various techniques have been used together with a convolutional neural network to prepare voice data before the training process when developing a classification model. However, not all models can produce good classification accuracy, as there are many types of voice or speech. Classification of Arabic alphabet pronunciation is one such type, and accurate pronunciation is required in learning to read the Qur'an. Thus, processing the pronunciations and training on the processed data require a specific approach. To address this issue, a method based on padding and a deep learning convolutional neural network is proposed to evaluate the pronunciation of the Arabic alphabet. Voice data from six school children were recorded and used to test the performance of the proposed method. The padding technique was used to augment the voice data before feeding the data to the CNN structure to develop the classification model. In addition, three other feature extraction techniques were introduced to enable comparison with the proposed method, which employs the padding technique. The performance of the proposed method with the padding technique is on a par with the spectrogram but better than the mel-spectrogram and Mel-frequency cepstral coefficients. Results also show that the proposed method was able to distinguish the Arabic alphabets that are difficult to pronounce. The proposed method with the padding technique may be extended to address voice pronunciation abilities other than the Arabic alphabet.
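The padding technique itself can be sketched in a few lines of numpy: zero-pad each variable-length recording to a common length so the batch stacks into one array for the CNN (the paper's exact padding scheme may differ from this end-padding):

```python
import numpy as np

def pad_batch(clips, target_len=None, value=0.0):
    """Zero-pad each 1-D clip at the end to a common length so the
    batch can be stacked into a single array for the CNN input."""
    if target_len is None:
        target_len = max(len(c) for c in clips)
    return np.stack([
        np.pad(np.asarray(c, dtype=float)[:target_len],
               (0, max(0, target_len - len(c))), constant_values=value)
        for c in clips
    ])

# Variable-length pronunciation recordings (sample counts are made up).
clips = [np.ones(300), np.ones(450), np.ones(380)]
batch = pad_batch(clips)
```

Padding preserves each utterance unmodified, which is likely why it competes with spectrogram features: no temporal information is discarded before the CNN sees the signal.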


2020 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Stefano Bromuri ◽  
Alexander P. Henkel ◽  
Deniz Iren ◽  
Visara Urovi

Purpose
A vast body of literature has documented the negative consequences of stress on employee performance and well-being. These deleterious effects are particularly pronounced for service agents who need to constantly endure and manage customer emotions. The purpose of this paper is to introduce and describe a deep learning model to predict, in real time, service agent stress from emotion patterns in voice-to-voice service interactions.

Design/methodology/approach
A deep learning model was developed to identify emotion patterns in call center interactions based on 363 recorded service interactions, subdivided into 27,889 manually expert-labeled three-second audio snippets. In a second step, the deep learning model was deployed in a call center for a period of one month to be further trained by the data collected from 40 service agents in another 4,672 service interactions.

Findings
The deep learning emotion classifier reached a balanced accuracy of 68% in predicting discrete emotions in service interactions. Integrating this model into a binary classification model, it was able to predict service agent stress with a balanced accuracy of 80%.

Practical implications
Service managers can benefit from employing the deep learning model to continuously and unobtrusively monitor the stress level of their service agents, with numerous practical applications, including real-time early warning systems for service agents, customized training, and automatically linking stress to customer-related outcomes.

Originality/value
The present study is the first to document an artificial intelligence (AI)-based model that is able to identify emotions in natural (i.e., nonstaged) interactions. It is further a pioneer in developing a smart emotion-based stress measure for service agents. Finally, the study contributes to the literature on the role of emotions in service interactions and employee stress.
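The balanced accuracy reported above is the mean of per-class recall, the appropriate metric when stressed snippets are far rarer than neutral ones. A minimal numpy sketch with a toy imbalanced example:

```python
import numpy as np

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall: robust to class imbalance, unlike
    plain accuracy, which a majority-class guesser can inflate."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(recalls))

# Imbalanced toy example: 8 "no stress" (0) vs 2 "stress" (1) snippets.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
score = balanced_accuracy(y_true, y_pred)  # (7/8 + 1/2) / 2 = 0.6875
```

Note that always predicting "no stress" would score 90% plain accuracy here but only 50% balanced accuracy, which is why the paper's 80% balanced figure is meaningful.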


Sensors ◽  
2021 ◽  
Vol 21 (21) ◽  
pp. 7320
Author(s):  
Rajesh Baliram Singh ◽  
Hanqi Zhuang ◽  
Jeet Kiran Pawani

Distinguishing between a dangerous audio event like a gun firing and other non-life-threatening events, such as a plastic bag bursting, can mean the difference between life and death and, therefore, the necessary and unnecessary deployment of public safety personnel. Sounds generated by plastic bag explosions are often confused with real gunshot sounds, by either humans or computer algorithms. As a case study, the research reported in this paper offers insight into sounds of plastic bag explosions and gunshots. An experimental study in this research reveals that a deep learning-based classification model trained with a popular urban sound dataset containing gunshot sounds cannot distinguish plastic bag pop sounds from gunshot sounds. This study further shows that the same deep learning model, if trained with a dataset containing plastic pop sounds, can effectively detect the non-life-threatening sounds. For this purpose, first, a collection of plastic bag-popping sounds was recorded in different environments with varying parameters, such as plastic bag size and distance from the recording microphones. The audio clips’ duration ranged from 400 ms to 600 ms. This collection of data was then used, together with a gunshot sound dataset, to train a classification model based on a convolutional neural network (CNN) to differentiate life-threatening gunshot events from non-life-threatening plastic bag explosion events. A comparison between two feature extraction methods, the Mel-frequency cepstral coefficients (MFCC) and Mel-spectrograms, was also done. Experimental studies conducted in this research show that once the plastic bag pop sounds are injected into model training, the CNN classification model performs well in distinguishing actual gunshot sounds from plastic bag sounds.
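The two feature types compared differ only in a final step: a Mel-spectrogram is the log of the power spectrum passed through a mel filterbank, while MFCCs additionally apply a DCT to decorrelate the bands. A minimal numpy sketch with illustrative sizes (not the paper's configuration):

```python
import numpy as np

def mel_filterbank(n_filters, n_fft, fs):
    """Triangular filters spaced evenly on the mel scale."""
    hz_to_mel = lambda f: 2595 * np.log10(1 + f / 700)
    mel_to_hz = lambda m: 700 * (10 ** (m / 2595) - 1)
    mel_pts = np.linspace(0, hz_to_mel(fs / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / fs).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fb[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)   # rising edge
        fb[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)   # falling edge
    return fb

def features(power_spec, fb, n_mfcc=13):
    """power_spec: (frames, n_fft//2+1). Returns both representations."""
    log_mel = np.log(power_spec @ fb.T + 1e-10)              # Mel-spectrogram
    n = fb.shape[0]
    dct = np.cos(np.pi * np.outer(np.arange(n_mfcc),
                                  2 * np.arange(n) + 1) / (2 * n))
    mfcc = log_mel @ dct.T                                   # MFCCs
    return log_mel, mfcc

spec = np.abs(np.fft.rfft(
    np.random.default_rng(0).standard_normal((40, 512)), axis=1)) ** 2
fb = mel_filterbank(26, 512, 16000)
log_mel, mfcc = features(spec, fb)
```

Which representation works better is exactly the kind of empirical question the paper's MFCC-versus-Mel-spectrogram comparison answers for impulsive sounds like gunshots.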


Author(s):  
Javier Orlando Pinzón-Arenas ◽  
Robinson Jiménez-Moreno

This paper presents the development of a system that compares spoken and handwritten words by means of deep learning techniques. Ten words are acquired through an audio function and the same words, written by hand, are captured by a webcam, in order to verify whether the two inputs match and to indicate whether the required word was given. For this, two different CNN architectures were used, one for each function: for voice recognition, a CNN suited to identifying complete words from features obtained with Mel-frequency cepstral coefficients, and for handwriting, a Faster R-CNN that both locates and identifies the captured word. To implement the system, an easy-to-use graphical interface was developed, which unites the two neural networks for its operation. With this, tests were performed in real time, obtaining an overall accuracy of 95.24%, demonstrating the good performance of the implemented system, with a response time of less than 200 ms per comparison.

