Campus Violence Detection Based on Artificial Intelligent Interpretation of Surveillance Video Sequences

Liang Ye; Tong Liu; Tian Han; Hany Ferdinando; Tapio Seppänen; Esko Alasaarela

doi:10.3390/rs13040628

Remote Sensing ◽

10.3390/rs13040628 ◽

2021 ◽

Vol 13 (4) ◽

pp. 628

Author(s):

Liang Ye ◽

Tong Liu ◽

Tian Han ◽

Hany Ferdinando ◽

Tapio Seppänen ◽

...

Keyword(s):

Neural Network ◽

Recognition Accuracy ◽

Role Playing ◽

School Bullying ◽

Image Features ◽

Campus Violence ◽

Surveillance Video ◽

Acoustic Features ◽

Mel Frequency Cepstral Coefficients ◽

Violence Detection

Campus violence is a common social phenomenon all over the world and is the most harmful type of school bullying. As artificial intelligence and remote sensing techniques develop, several methods have become available for detecting campus violence, e.g., movement sensor-based methods and video sequence-based methods, which rely on sensors and surveillance cameras, respectively. In this paper, the authors use image features and acoustic features for campus violence detection. Campus violence data are gathered by role-playing, and 4096-dimension feature vectors are extracted from every 16 frames of video images. The C3D (Convolutional 3D) neural network is used for feature extraction and classification, and an average recognition accuracy of 92.00% is achieved. Mel-frequency cepstral coefficients (MFCCs) are extracted as acoustic features, and three speech emotion databases are involved. The C3D neural network is used for classification, and the average recognition accuracies are 88.33%, 95.00%, and 91.67%, respectively. To solve the problem of evidence conflict, the authors propose an improved Dempster–Shafer (D–S) algorithm. Compared with existing D–S theory, the improved algorithm increases the recognition accuracy by 10.79%, and the recognition accuracy can ultimately reach 97.00%.
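For orientation, the classical Dempster rule of combination that the abstract takes as its baseline can be sketched as follows. This is not the authors' improved algorithm, whose details the abstract does not give. The hypothesis names and the mass values assigned to the video and audio classifiers are illustrative assumptions only.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Fuse two mass functions with Dempster's classical rule.

    Each mass function maps a frozenset of hypotheses to its mass.
    Returns the fused mass function and the conflict coefficient K.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses counts as conflict.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    fused = {h: mass / (1.0 - conflict) for h, mass in combined.items()}
    return fused, conflict


# Hypothetical evidence from the two modalities (values are made up).
V = frozenset({"violent"})
N = frozenset({"non_violent"})
U = V | N  # "uncertain": either hypothesis may hold

video = {V: 0.7, N: 0.2, U: 0.1}
audio = {V: 0.6, N: 0.3, U: 0.1}

fused, k = dempster_combine(video, audio)
print(f"conflict K = {k:.2f}, fused belief in violence = {fused[V]:.3f}")
```

When K approaches 1 the normalisation by 1 - K amplifies small agreeing masses, which is the evidence-conflict problem that motivates modified combination rules.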

Physical Violence Detection for Preventing School Bullying

Advances in Artificial Intelligence ◽

10.1155/2014/740358 ◽

2014 ◽

Vol 2014 ◽

pp. 1-9 ◽

Cited By ~ 13

Author(s):

Liang Ye ◽

Hany Ferdinando ◽

Tapio Seppänen ◽

Esko Alasaarela

Keyword(s):

Physical Violence ◽

Promising Result ◽

Detection System ◽

Role Playing ◽

School Bullying ◽

Dropping Out ◽

Daily Life Activities ◽

Everyday Activities ◽

Violence Detection ◽

Physical Bullying

School bullying is a serious problem among teenagers, causing depression, dropping out of school, or even suicide. It is thus important to develop antibullying methods. This paper proposes a physical bullying detection method based on activity recognition. The architecture of the physical violence detection system is described, and a Fuzzy Multithreshold classifier is developed to detect physical bullying behaviour, including pushing, hitting, and shaking. Importantly, the application has the capability of distinguishing these types of behaviour from such everyday activities as running, walking, falling, or doing push-ups. To accomplish this, the method uses acceleration and gyro signals. Experimental data were gathered by role playing school bullying scenarios and by doing daily-life activities. The simulations achieved an average classification accuracy of 92%, which is a promising result for smartphone-based detection of physical bullying.

Research on Recognition Effect of DSCN Network Structure in Hand-Drawn Sketch

Computational Intelligence and Neuroscience ◽

10.1155/2021/4056454 ◽

2021 ◽

Vol 2021 ◽

pp. 1-12

Author(s):

Qunjing Ji

Keyword(s):

Neural Network ◽

Recognition Accuracy ◽

Rapid Development ◽

Image Features ◽

Sketch Recognition ◽

Freehand Sketch ◽

Classical Models ◽

Manual Selection ◽

Stroke Sequence ◽

Deep Learning Model

With the rapid development of image recognition technology, freehand sketch recognition has attracted more and more attention. Achieving good recognition in the absence of color and texture information is the key challenge for freehand sketch recognition. Traditional nonlearning classical models depend heavily on manually selected features. To solve this problem, a neural network sketch recognition method based on the DSCN structure is proposed in this paper. First, the stroke sequence of the sketch is drawn; then, features are extracted according to the stroke sequence with a neural network, and the extracted image features are used as the input of the model to construct the temporal relationship between different image features. A control experiment on the TU-Berlin dataset shows that, compared with the traditional nonlearning methods HOG-SVM, SIFT-Fisher Vector, MKL-SVM, and FV-SP, the recognition accuracy of the DSCN network is improved by 15.8%, 10.3%, 6.0%, and 2.9%, respectively. Compared with the classical deep learning model Alex-Net, the recognition accuracy is improved by 5.6%. These results show that the DSCN network proposed in this paper has a strong capacity for feature extraction and nonlinear expression and that introducing the stroke order effectively improves the recognition accuracy of hand-drawn sketches.

Qualitative Analysis of PLP in LSTM for Bangla Speech Recognition

The International Journal of Multimedia & Its Applications ◽

10.5121/ijma.2020.12501 ◽

2020 ◽

Vol 12 (5) ◽

pp. 1-8

Author(s):

Nahyan Al Mahmud ◽

Shahfida Amjad Munni

Keyword(s):

Neural Network ◽

Speech Recognition ◽

Linear Prediction ◽

Short Term Memory ◽

Acoustic Features ◽

Linear Predictive Coding ◽

Acoustic Feature ◽

Mel Frequency Cepstral Coefficients ◽

Bhattacharyya Distance ◽

Perceptual Linear Prediction

The performance of various acoustic feature extraction methods is compared in this work using a Long Short-Term Memory (LSTM) neural network in a Bangla speech recognition system. The acoustic features are a series of vectors that represent the speech signals. They can be classified into either words or subword units such as phonemes. In this work, linear predictive coding (LPC) is used first as the acoustic vector extraction technique. LPC was chosen because of its widespread popularity. Other vector extraction techniques, namely Mel frequency cepstral coefficients (MFCC) and perceptual linear prediction (PLP), are then also used. These two methods closely resemble the human auditory system. The LSTM neural network is then trained on these feature vectors. Finally, the obtained models of different phonemes are compared using two statistical tools, the Bhattacharyya distance and the Mahalanobis distance, to investigate the nature of those acoustic features.
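As a rough illustration of one of the two statistical tools named above, the Bhattacharyya distance between two discrete probability distributions can be computed as below. The abstract does not say which form of the distance the authors use or how the phoneme models are represented, so the discrete form and the example distributions are assumptions.

```python
import math


def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two discrete distributions.

    p and q are sequences of probabilities over the same bins.
    The distance is -ln(BC), where BC = sum(sqrt(p_i * q_i)).
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    coefficient = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    if coefficient == 0.0:
        return math.inf  # no overlap at all
    return -math.log(coefficient)


# Hypothetical histograms for two phoneme models.
same = bhattacharyya_distance([0.5, 0.5], [0.5, 0.5])
partial = bhattacharyya_distance([0.5, 0.5], [1.0, 0.0])
disjoint = bhattacharyya_distance([1.0, 0.0], [0.0, 1.0])
print(same, partial, disjoint)
```

A smaller distance means the two distributions overlap more; identical distributions give zero and non-overlapping ones give infinity.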

Artificial Neural Network Model for Road Pavement Classification using Features of Tire-Pavement Noise and Road Surface Images

INTER-NOISE and NOISE-CON Congress and Conference Proceedings ◽

10.3397/in-2021-2964 ◽

2021 ◽

Vol 263 (1) ◽

pp. 5101-5105

Author(s):

Seo Il Chang ◽

Bo Kyeong Kim ◽

Jae Kwan Lee

Keyword(s):

Neural Network ◽

Artificial Neural Network ◽

Sound Intensity ◽

Image Features ◽

Road Surface ◽

Intensity Level ◽

Acoustic Features ◽

Road Pavement ◽

Artificial Neural ◽

Pavement Noise

Artificial neural network models were developed to classify road pavement types into transverse-tined, longitudinal-tined, NGCS (Next Generation Concrete Surface), Diamond Grinding, and Stone Mastic Asphalt by utilizing tire-pavement noise and road surface images. Tire-pavement noise data were collected by the OBSI (On-Board Sound Intensity) method and analyzed to obtain sound intensity level, sound pressure level, and sound quality indices. Road surface image data were analyzed through the image feature extraction algorithms of the Hough transform and HOG (histogram of oriented gradients). The important features among the acoustic and image characteristics were selected by a random forest model. The acoustic features selected by the random forest algorithm are the overall sound intensity level of the 400 Hz to 5 kHz 1/3-octave bands, the sound intensities (W/m²) of the 800 Hz to 2 kHz 1/3-octave bands, loudness, fluctuation strength, and tonality. The image features selected are the number of longitudinal lines extracted by the Hough transform algorithm and the HOG of the central cell. The two groups of selected features were applied separately or together to an artificial neural network model to assess classification performance. The classification accuracy rates of the models using acoustic features only, image features only, and both acoustic and image features combined were 90.8%, 88.8%, and 97.3%, respectively.
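The relation between a sound intensity in W/m² and a sound intensity level in dB, both of which appear among the selected acoustic features, can be sketched as follows. The reference intensity of 1e-12 W/m² is the standard convention; the band values are invented for illustration, and summing band intensities to get an overall level is a common approach rather than a procedure confirmed by the abstract.

```python
import math

REFERENCE_INTENSITY = 1e-12  # W/m^2, conventional reference for intensity level


def intensity_level_db(intensity):
    """Convert a sound intensity in W/m^2 to a level in dB re 1e-12 W/m^2."""
    return 10.0 * math.log10(intensity / REFERENCE_INTENSITY)


def overall_level_db(band_intensities):
    """Overall level of several bands: sum the intensities, then convert."""
    return intensity_level_db(sum(band_intensities))


# Hypothetical 1/3-octave band intensities in W/m^2.
bands = [1e-6, 1e-6]
print(f"single band: {intensity_level_db(bands[0]):.1f} dB")
print(f"overall:     {overall_level_db(bands):.1f} dB")
```

Two equal bands raise the overall level by about 3 dB over a single band, because levels add on the intensity scale rather than the decibel scale.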

Violence Detection Using Spatiotemporal Features with 3D Convolutional Neural Network

Sensors ◽

10.3390/s19112472 ◽

2019 ◽

Vol 19 (11) ◽

pp. 2472 ◽

Cited By ~ 18

Author(s):

Fath U Min Ullah ◽

Amin Ullah ◽

Khan Muhammad ◽

Ijaz Ul Haq ◽

Sung Wook Baik

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Smart Cities ◽

Automatic Monitoring ◽

Video Stream ◽

Surveillance Video ◽

Softmax Classifier ◽

Spatiotemporal Features ◽

Violence Detection ◽

3D CNN

The worldwide utilization of surveillance cameras in smart cities has enabled researchers to analyze a gigantic volume of data to ensure automatic monitoring. An enhanced security system in smart cities, schools, hospitals, and other surveillance domains is mandatory for the detection of violent or abnormal activities to avoid any casualties which could cause social, economic, and ecological damages. Automatic detection of violence for quick actions is very significant and can efficiently assist the concerned departments. In this paper, we propose a triple-staged end-to-end deep learning violence detection framework. First, persons are detected in the surveillance video stream using a light-weight convolutional neural network (CNN) model to reduce and overcome the voluminous processing of useless frames. Second, a sequence of 16 frames with detected persons is passed to 3D CNN, where the spatiotemporal features of these sequences are extracted and fed to the Softmax classifier. Furthermore, we optimized the 3D CNN model using an open visual inference and neural networks optimization toolkit developed by Intel, which converts the trained model into intermediate representation and adjusts it for optimal execution at the end platform for the final prediction of violent activity. After detection of a violent activity, an alert is transmitted to the nearest police station or security department to take prompt preventive actions. We found that our proposed method outperforms the existing state-of-the-art methods for different benchmark datasets.
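The second stage above feeds sequences of 16 frames to the 3D CNN. A minimal sketch of how a frame stream might be cut into such clips is shown below. The function name, the stride parameter, and the choice to drop a trailing partial clip are assumptions, not details from the paper, and integers stand in for real frames.

```python
def iter_clips(frames, clip_len=16, stride=16):
    """Yield consecutive clips of clip_len frames.

    A stride equal to clip_len gives non-overlapping clips; a smaller
    stride gives overlapping ones. A trailing partial clip is dropped.
    """
    for start in range(0, len(frames) - clip_len + 1, stride):
        yield frames[start:start + clip_len]


# Integers stand in for decoded video frames.
frames = list(range(40))
clips = list(iter_clips(frames))
overlapping = list(iter_clips(frames, stride=8))
print(len(clips), len(overlapping))
```

Each clip would then be stacked into one input tensor for the 3D network; that step depends on the deep learning framework and is left out here.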

An Unconstrained Face Recognition Method Based on Siamese Networks

10.21203/rs.3.rs-707159/v1 ◽

2021 ◽

Author(s):

Song CunLi ◽

Shouyong Ji

Keyword(s):

Neural Network ◽

Face Recognition ◽

Network Model ◽

High Frequency ◽

Recognition Accuracy ◽

Recognition Rate ◽

Low Frequency ◽

Image Features ◽

Frequency Features ◽

Frequency Feature

This paper addresses the low accuracy and low efficiency of face recognition under unconstrained conditions. A Siamese neural network model, SN-LF (Siamese Network based on LBP and Frequency Feature perception), is designed based on the Local Binary Pattern (LBP) and a frequency sensing model. Building on Siamese neural networks, the network adopts the circular LBP algorithm and frequency feature perception to realize face recognition under unconstrained conditions. The LBP algorithm can eliminate the influence of light on the image and at the same time provide directional input to the network model. Frequency feature sensing divides the image features into low frequency features and high frequency features. The low frequency features are compressed in the Siamese neural network to increase the recognition efficiency of the network. At the same time, information is exchanged with the high frequency features, so that the target noise data can be eliminated while the feature data are retained. In this way, the recognition rate of the network is maintained and the computing speed of the network is improved. Simulation experiments are carried out on the standard face datasets CASIA-Webface and Yale-B, and the results are compared with those of other network models. The experimental results show that the proposed SN-LF network structure improves the recognition accuracy of the algorithm and achieves good recognition accuracy.
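To make the LBP step concrete, here is a sketch of the basic 3×3 Local Binary Pattern operator. The paper uses a circular LBP variant whose radius and sampling are not stated in the abstract, so this simpler square-neighbourhood form and the example patch are stand-ins chosen for illustration.

```python
def lbp_code(image, row, col):
    """Return the 8-bit LBP code of the pixel at (row, col).

    image is a list of rows of grey levels; the pixel must not lie on
    the image border. Each neighbour that is at least as bright as the
    centre contributes one bit.
    """
    center = image[row][col]
    # Neighbours visited clockwise, starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


# Hypothetical 3x3 grey-level patch and a uniformly brighter copy of it.
patch = [[5, 9, 1],
         [4, 6, 7],
         [2, 3, 8]]
brighter = [[value + 50 for value in line] for line in patch]

print(lbp_code(patch, 1, 1), lbp_code(brighter, 1, 1))
```

Both calls give the same code because the operator only compares each neighbour with the centre, which is why a uniform brightness shift leaves LBP features unchanged.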

Digitalization system of ancient architecture decoration art based on neural network and image features

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-189251 ◽

2020 ◽

pp. 1-12

Author(s):

Wu Xin ◽

Qiu Daping

Keyword(s):

Neural Network ◽

Construction Industry ◽

Three Dimensional ◽

Performance Testing ◽

Image Features ◽

Three Dimensional Model ◽

Performance Effect ◽

Data Process ◽

And Performance ◽

Construction Mode

The inheritance and innovation of ancient architecture decoration art is an important path for the development of the construction industry. Traditional data processing for ancient architecture decoration art is relatively outdated, which leads to obvious distortion when the art is digitized. To improve the digitalization of ancient architecture decoration art, this paper combines image features with a neural network to construct a data system model for ancient architecture decoration art, and graphically expresses the static construction mode and the dynamic construction process of the architecture group. On this basis, three-dimensional model reconstruction and scene simulation experiments of architecture groups are realized. The performance of the proposed system is verified through simulation and performance testing, and the data are visualized through statistical methods. The results of the study show that the proposed digitalization of ancient architecture decoration art performs well.

Graphene-based 3D XNOR-VRRAM with ternary precision for neuromorphic computing

npj 2D Materials and Applications ◽

10.1038/s41699-021-00236-x ◽

2021 ◽

Vol 5 (1) ◽

Author(s):

Batyrbek Alimkhanuly ◽

Joon Sohn ◽

Ik-Joon Chang ◽

Seunghyun Lee

Keyword(s):

Neural Network ◽

Energy Consumption ◽

Recognition Accuracy ◽

Material Selection ◽

Weighted Sum ◽

Device Design ◽

Key Factors ◽

Neuromorphic Computing ◽

Device Scaling ◽

The Impact

Recent studies on neural network quantization have demonstrated a beneficial compromise between accuracy, computation rate, and architecture size. Implementing a 3D Vertical RRAM (VRRAM) array accompanied by device scaling may further improve such networks’ density and energy consumption. Individual device design, optimized interconnects, and careful material selection are key factors determining the overall computation performance. In this work, the impact of replacing conventional devices with microfabricated, graphene-based VRRAM is investigated at the circuit and algorithmic levels. By exploiting a sub-nm thin 2D material, the VRRAM array demonstrates improved read/write margins and read inaccuracy level for the weighted-sum procedure. Moreover, energy consumption is significantly reduced in array programming operations. Finally, an XNOR logic-inspired architecture designed to integrate 1-bit ternary precision synaptic weights into graphene-based VRRAM is introduced. Simulations on VRRAM with metal and graphene word-planes demonstrate 83.5% and 94.1% recognition accuracy, respectively, denoting the importance of material innovation in neuromorphic computing.

Classification of papillary thyroid carcinoma histological images based on deep learning

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-210100 ◽

2021 ◽

pp. 1-11

Author(s):

Yaning Liu ◽

Lin Han ◽

Hexiang Wang ◽

Bo Yin

Keyword(s):

Neural Network ◽

Differential Diagnosis ◽

Deep Learning ◽

Papillary Thyroid Carcinoma ◽

Thyroid Carcinoma ◽

Image Features ◽

Papillary Thyroid ◽

Histological Image ◽

Histological Images

Papillary thyroid carcinoma (PTC) is a common carcinoma of the thyroid. Many benign thyroid nodules have a papillary structure that can easily be confused with PTC in morphology. Pathologists therefore have to spend a lot of time on the differential diagnosis of PTC and must rely on personal diagnostic experience, which makes the diagnosis subjective and makes consistency among observers difficult to obtain. To address this issue, we applied deep learning to the differential diagnosis of PTC and proposed a histological image classification method for PTC based on the Inception Residual convolutional neural network (IRCNN) and support vector machine (SVM). First, in order to expand the dataset and solve the problem of histological image color inconsistency, a pre-processing module was constructed that included color transfer and mirror transform. Then, to alleviate overfitting of the deep learning model, we optimized the convolutional neural network by combining Inception Network and Residual Network to extract image features. Finally, the SVM was trained on image features extracted by IRCNN to perform the classification task. Experimental results show the effectiveness of the proposed method in the classification of PTC histological images.

Image Restoration by Learning Morphological Opening-Closing Network

Mathematical Morphology - Theory and Applications ◽

10.1515/mathm-2020-0103 ◽

2020 ◽

Vol 4 (1) ◽

pp. 87-107

Author(s):

Ranjan Mondal ◽

Moni Shankar Dey ◽

Bhabatosh Chanda

Keyword(s):

Neural Network ◽

Image Restoration ◽

State Of The Art ◽

Source Code ◽

Back Propagation ◽

Image Features ◽

Main Difficulty ◽

The Right ◽

Right Order ◽

Morphological Opening

Mathematical morphology is a powerful tool for image processing tasks. The main difficulty in designing a mathematical morphological algorithm is deciding the order of operators/filters and the corresponding structuring elements (SEs). In this work, we develop a morphological network composed of alternating sequences of dilation and erosion layers, which, depending on the learned SEs, may form opening or closing layers. These layers in the right order, along with a linear combination of their outputs, are useful in extracting image features and processing them. Structuring elements in the network are learned by the back-propagation method guided by minimization of the loss function. The efficacy of the proposed network is established by applying it to two interesting image restoration problems, namely de-raining and de-hazing. Results are comparable to those of many state-of-the-art algorithms for most of the images. It is also worth mentioning that the number of network parameters to handle is much smaller than that of popular convolutional neural networks for similar tasks. The source code is available at https://github.com/ranjanZ/Mophological-Opening-Closing-Net
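The way erosion and dilation compose into opening and closing can be sketched on a one-dimensional grey-level signal. The network described above learns its structuring elements by back-propagation; this sketch instead assumes a fixed flat structuring element of width 3, and the sample signals are invented.

```python
def erode(signal, size=3):
    """Grey-level erosion with a flat window: sliding minimum."""
    half = size // 2
    n = len(signal)
    return [min(signal[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def dilate(signal, size=3):
    """Grey-level dilation with a flat window: sliding maximum."""
    half = size // 2
    n = len(signal)
    return [max(signal[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def opening(signal, size=3):
    """Erosion followed by dilation: removes narrow bright peaks."""
    return dilate(erode(signal, size), size)


def closing(signal, size=3):
    """Dilation followed by erosion: fills narrow dark gaps."""
    return erode(dilate(signal, size), size)


# A one-sample spike (9) next to a three-sample plateau (5, 5, 5).
spiky = [0, 0, 9, 0, 0, 5, 5, 5, 0, 0]
gapped = [5, 5, 0, 5, 5]

print(opening(spiky))
print(closing(gapped))
```

Opening removes the spike, which is narrower than the window, and keeps the plateau; closing fills the one-sample gap. The same pair of operations applied in the opposite order gives the two different filters, which is why the order of layers matters.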
