scholarly journals Towards Automatic Depression Detection: A BiLSTM/1D CNN-Based Model

2020 ◽  
Vol 10 (23) ◽  
pp. 8701
Author(s):  
Lin Lin ◽  
Xuri Chen ◽  
Ying Shen ◽  
Lin Zhang

Depression is a global mental health problem, the worst cases of which can lead to self-injury or suicide. An automatic depression detection system is of great help in facilitating clinical diagnosis and early intervention of depression. In this work, we propose a new automatic depression detection method utilizing speech signals and linguistic content from patient interviews. Specifically, the proposed method consists of three components, which include a Bidirectional Long Short-Term Memory (BiLSTM) network with an attention layer to deal with linguistic content, a One-Dimensional Convolutional Neural Network (1D CNN) to deal with speech signals, and a fully connected network integrating the outputs of the previous two models to assess the depressive state. Evaluated on two publicly available datasets, our method achieves state-of-the-art performance compared with the existing methods. In addition, our method utilizes audio and text features simultaneously. Therefore, it can get rid of the misleading information provided by the patients. As a conclusion, our method can automatically evaluate the depression state and does not require an expert to conduct the psychological evaluation on site. Our method greatly improves the detection accuracy, as well as the efficiency.

2019 ◽  
Vol 9 (14) ◽  
pp. 2941 ◽  
Author(s):  
Donghwoon Kwon ◽  
Ritesh Malaiya ◽  
Geumchae Yoon ◽  
Jeong-Tak Ryu ◽  
Su-Young Pi

One of the recent news headlines is that a pedestrian was killed by an autonomous vehicle because safety features in this vehicle did not detect an object on a road correctly. Due to this accident, some global automobile companies announced plans to postpone development of an autonomous vehicle. Furthermore, there is no doubt about the importance of safety features for autonomous vehicles. For this reason, our research goal is the development of a very safe and lightweight camera-based blind spot detection system, which can be applied to future autonomous vehicles. The blind spot detection system was implemented in open source software. Approximately 2000 vehicle images and 9000 non-vehicle images were adopted for training the Fully Connected Network (FCN) model. Other data processing concepts such as the Histogram of Oriented Gradients (HOG), heat map, and thresholding were also employed. We achieved 99.43% training accuracy and 98.99% testing accuracy of the FCN model, respectively. Source codes with respect to all the methodologies were then deployed to an off-the-shelf embedded board for actual testing on a road. Actual testing was conducted with consideration of various factors, and we confirmed 93.75% average detection accuracy with three false positives.


2020 ◽  
Vol 17 (3) ◽  
pp. 705-715
Author(s):  
Chuyue Zhang ◽  
Xiaofan Zhao ◽  
Manchun Cai ◽  
Dawei Wang ◽  
Luzhe Cao

In this paper, we propose a new model to predict the age and number of suspects through the feature modeling of historical data. We discrete the case information into values of 20 dimensions. After feature selection, we use 9 machine learning algorithms and Deep Neural Networks to extract the numerical features. In addition, we use Convolutional Neural Networks and Long Short- Term Memory to extract the text features of case description. These two types of features are fused and fed into fully connected layer and softmax layer. This work is an extension of our short conference proceeding paper. The experimental results show that the new model improved accuracy by 3% in predicting the number of suspects and improved accuracy by 12% in predicting the number of suspects. To the best of our knowledge, it is the first time to combine machine learning and deep learning in crime prediction.


2021 ◽  
Vol 11 (11) ◽  
pp. 4887
Author(s):  
Ting He ◽  
Xiaohong Xu ◽  
Yating Wu ◽  
Huazhen Wang ◽  
Jian Chen

Intent detection and slot filling are important modules in task-oriented dialog systems. In order to make full use of the relationship between different modules and resource sharing, solving the problem of a lack of semantics, this paper proposes a multitasking learning intent-detection system, based on the knowledge-base and slot-filling joint model. The approach has been used to share information and rich external utility between intent and slot modules in a three-part process. First, this model obtains shared parameters and features between the two modules based on long short-term memory and convolutional neural networks. Second, a knowledge base is introduced into the model to improve its performance. Finally, a weighted-loss function is built to optimize the joint model. Experimental results demonstrate that our model achieves better performance compared with state-of-the-art algorithms on a benchmark Airline Travel Information System (ATIS) dataset and the Snips dataset. Our joint model achieves state-of-the-art results on the benchmark ATIS dataset with a 1.33% intent-detection accuracy improvement, a 0.94% slot filling F value improvement, and with 0.19% and 0.31% improvements respectively on the Snips dataset.


2021 ◽  
Vol 2021 ◽  
pp. 1-15
Author(s):  
Ihtisham Ullah ◽  
Basit Raza ◽  
Sikandar Ali ◽  
Irshad Ahmed Abbasi ◽  
Samad Baseer ◽  
...  

Software Defined Network (SDN) is a next-generation networking architecture and its power lies in centralized control intelligence. The control plane of SDN can be extended to many underlying networks such as fog to Internet of Things (IoT). The fog-to-IoT is currently a promising architecture to manage a real-time large amount of data. However, most of the fog-to-IoT devices are resource-constrained and devices are widespread that can be potentially targeted with cyber-attacks. The evolving cyber-attacks are still an arresting challenge in the fog-to-IoT environment such as Denial of Service (DoS), Distributed Denial of Service (DDoS), Infiltration, malware, and botnets attacks. They can target varied fog-to-IoT agents and the whole network of organizations. The authors propose a deep learning (DL) driven SDN-enabled architecture for sophisticated cyber-attacks detection in fog-to-IoT environment to identify new attacks targeting IoT devices as well as other threats. The extensive simulations have been carried out using various DL algorithms and current state-of-the-art Coburg Intrusion Detection Data Set (CIDDS-001) flow-based dataset. For better analysis five DL models are compared including constructed hybrid DL models to distinguish the DL model with the best performance. The results show that proposed Long Short-Term Memory (LSTM) hybrid model outperforms other DL models in terms of detection accuracy and response time. To show unbiased results 10-fold cross-validation is performed. The proposed framework is so effective that it can detect several types of cyber-attacks with 99.92% accuracy rate in multiclass classification.


Sensors ◽  
2020 ◽  
Vol 20 (13) ◽  
pp. 3635 ◽  
Author(s):  
Guoming Zhang ◽  
Xiaoyu Ji ◽  
Yanjie Li ◽  
Wenyuan Xu

As a critical component in the smart grid, the Distribution Terminal Unit (DTU) dynamically adjusts the running status of the entire smart grid based on the collected electrical parameters to ensure the safe and stable operation of the smart grid. However, as a real-time embedded device, DTU has not only resource constraints but also specific requirements on real-time performance, thus, the traditional anomaly detection method cannot be deployed. To detect the tamper of the program running on DTU, we proposed a power-based non-intrusive condition monitoring method that collects and analyzes the power consumption of DTU using power sensors and machine learning (ML) techniques, the feasibility of this approach is that the power consumption is closely related to the executing code in CPUs, that is when the execution code is tampered with, the power consumption changes accordingly. To validate this idea, we set up a testbed based on DTU and simulated four types of imperceptible attacks that change the code running in ARM and DSP processors, respectively. We generate representative features and select lightweight ML algorithms to detect these attacks. We finally implemented the detection system on the windows and ubuntu platform and validated its effectiveness. The results show that the detection accuracy is up to 99.98% in a non-intrusive and lightweight way.


2021 ◽  
Vol 11 (11) ◽  
pp. 4894
Author(s):  
Anna Scius-Bertrand ◽  
Michael Jungo ◽  
Beat Wolf ◽  
Andreas Fischer ◽  
Marc Bui

The current state of the art for automatic transcription of historical manuscripts is typically limited by the requirement of human-annotated learning samples, which are are necessary to train specific machine learning models for specific languages and scripts. Transcription alignment is a simpler task that aims to find a correspondence between text in the scanned image and its existing Unicode counterpart, a correspondence which can then be used as training data. The alignment task can be approached with heuristic methods dedicated to certain types of manuscripts, or with weakly trained systems reducing the required amount of annotations. In this article, we propose a novel learning-based alignment method based on fully convolutional object detection that does not require any human annotation at all. Instead, the object detection system is initially trained on synthetic printed pages using a font and then adapted to the real manuscripts by means of self-training. On a dataset of historical Vietnamese handwriting, we demonstrate the feasibility of annotation-free alignment as well as the positive impact of self-training on the character detection accuracy, reaching a detection accuracy of 96.4% with a YOLOv5m model without using any human annotation.


Author(s):  
Nujud Aloshban ◽  
Anna Esposito ◽  
Alessandro Vinciarelli

AbstractDepression is one of the most common mental health issues. (It affects more than 4% of the world’s population, according to recent estimates.) This article shows that the joint analysis of linguistic and acoustic aspects of speech allows one to discriminate between depressed and nondepressed speakers with an accuracy above 80%. The approach used in the work is based on networks designed for sequence modeling (bidirectional Long-Short Term Memory networks) and multimodal analysis methodologies (late fusion, joint representation and gated multimodal units). The experiments were performed over a corpus of 59 interviews (roughly 4 hours of material) involving 29 individuals diagnosed with depression and 30 control participants. In addition to an accuracy of 80%, the results show that multimodal approaches perform better than unimodal ones owing to people’s tendency to manifest their condition through one modality only, a source of diversity across unimodal approaches. In addition, the experiments show that it is possible to measure the “confidence” of the approach and automatically identify a subset of the test data in which the performance is above a predefined threshold. It is possible to effectively detect depression by using unobtrusive and inexpensive technologies based on the automatic analysis of speech and language.


Electronics ◽  
2020 ◽  
Vol 9 (10) ◽  
pp. 1686 ◽  
Author(s):  
Nancy Agarwal ◽  
Mudasir Ahmad Wani ◽  
Patrick Bours

This work focuses on designing a grammar detection system that understands both structural and contextual information of sentences for validating whether the English sentences are grammatically correct. Most existing systems model a grammar detector by translating the sentences into sequences of either words appearing in the sentences or syntactic tags holding the grammar knowledge of the sentences. In this paper, we show that both these sequencing approaches have limitations. The former model is over specific, whereas the latter model is over generalized, which in turn affects the performance of the grammar classifier. Therefore, the paper proposes a new sequencing approach that contains both information, linguistic as well as syntactic, of a sentence. We call this sequence a Lex-Pos sequence. The main objective of the paper is to demonstrate that the proposed Lex-Pos sequence has the potential to imbibe the specific nature of the linguistic words (i.e., lexicals) and generic structural characteristics of a sentence via Part-Of-Speech (POS) tags, and so, can lead to a significant improvement in detecting grammar errors. Furthermore, the paper proposes a new vector representation technique, Word Embedding One-Hot Encoding (WEOE) to transform this Lex-Pos into mathematical values. The paper also introduces a new error induction technique to artificially generate the POS tag specific incorrect sentences for training. The classifier is trained using two corpora of incorrect sentences, one with general errors and another with POS tag specific errors. Long Short-Term Memory (LSTM) neural network architecture has been employed to build the grammar classifier. The study conducts nine experiments to validate the strength of the Lex-Pos sequences. The Lex-Pos -based models are observed as superior in two ways: (1) they give more accurate predictions; and (2) they are more stable as lesser accuracy drops have been recorded from training to testing. To further prove the potential of the proposed Lex-Pos -based model, we compare it with some well known existing studies.


Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1820
Author(s):  
Xiaotao Shao ◽  
Qing Wang ◽  
Wei Yang ◽  
Yun Chen ◽  
Yi Xie ◽  
...  

The existing pedestrian detection algorithms cannot effectively extract features of heavily occluded targets which results in lower detection accuracy. To solve the heavy occlusion in crowds, we propose a multi-scale feature pyramid network based on ResNet (MFPN) to enhance the features of occluded targets and improve the detection accuracy. MFPN includes two modules, namely double feature pyramid network (FPN) integrated with ResNet (DFR) and repulsion loss of minimum (RLM). We propose the double FPN which improves the architecture to further enhance the semantic information and contours of occluded pedestrians, and provide a new way for feature extraction of occluded targets. The features extracted by our network can be more separated and clearer, especially those heavily occluded pedestrians. Repulsion loss is introduced to improve the loss function which can keep predicted boxes away from the ground truths of the unrelated targets. Experiments carried out on the public CrowdHuman dataset, we obtain 90.96% AP which yields the best performance, 5.16% AP gains compared to the FPN-ResNet50 baseline. Compared with the state-of-the-art works, the performance of the pedestrian detection system has been boosted with our method.


Sign in / Sign up

Export Citation Format

Share Document