Coarse-to-Fine Adaptive People Detection for Video Sequences by Maximizing Mutual Information

Sensors ◽  
2018 ◽  
Vol 19 (1) ◽  
pp. 4
Author(s):  
Álvaro García-Martín ◽  
Juan SanMiguel ◽  
José Martínez

Applying people detectors to unseen data is challenging since pattern distributions, such as viewpoints, motion, poses, backgrounds, occlusions and people sizes, may differ significantly from those of the training dataset. In this paper, we propose a coarse-to-fine framework to adapt people detectors frame by frame during runtime classification, without requiring any additional manually labeled ground truth beyond the offline training of the detection model. Such adaptation makes use of the detectors' mutual information, i.e., similarities and dissimilarities estimated and agreed upon by pair-wise correlating their outputs. Globally, the proposed adaptation discriminates between relevant instants in a video sequence, i.e., it identifies the representative frames for adapting the system. Locally, it identifies the best configuration (i.e., detection threshold) of each detector under analysis by maximizing the mutual information. The proposed coarse-to-fine approach does not require retraining the detectors for each new scenario and uses standard people detector outputs, i.e., bounding boxes. The experimental results demonstrate that the proposed approach outperforms state-of-the-art detectors whose optimal threshold configurations are previously determined and fixed from offline training data.
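The pair-wise correlation step can be sketched as follows: for each candidate threshold, one detector's scores are binarized and the mutual information between the two detectors' decision sequences is computed; the threshold that maximizes it is kept. This is an illustrative sketch, not the authors' implementation; the fixed reference threshold for the second detector and the candidate grid are assumptions.

```python
import numpy as np

def mutual_information(a, b):
    """Mutual information (in nats) between two binary decision arrays."""
    mi = 0.0
    for va in (0, 1):
        for vb in (0, 1):
            p_ab = np.mean((a == va) & (b == vb))
            p_a, p_b = np.mean(a == va), np.mean(b == vb)
            if p_ab > 0:
                mi += p_ab * np.log(p_ab / (p_a * p_b))
    return mi

def best_threshold(scores_a, scores_b, candidate_thresholds):
    """Pick the threshold for detector A that maximizes MI with detector B."""
    fixed_b = scores_b > 0.5  # reference threshold for detector B (assumption)
    best_t, best_mi = None, -1.0
    for t in candidate_thresholds:
        mi = mutual_information(scores_a > t, fixed_b)
        if mi > best_mi:
            best_t, best_mi = t, mi
    return best_t
```

When the two detectors' decisions coincide exactly, the mutual information equals the entropy of either decision sequence, which is why agreement across detectors is a usable proxy for correctness without ground truth.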

Sensors ◽  
2018 ◽  
Vol 18 (12) ◽  
pp. 4385
Author(s):  
Rafael Martín-Nieto ◽  
Álvaro García-Martín ◽  
José Martínez ◽  
Juan SanMiguel

Finding optimal parametrizations for people detectors is a complicated task due to the large number of parameters and the high variability of application scenarios. In this paper, we propose a framework to adapt and improve any detector automatically in multi-camera scenarios where people are observed from various viewpoints. By accurately transferring detector results between camera viewpoints and by self-correlating these transferred results, the best configuration (in this paper, the detection threshold) for each detector-viewpoint pair is identified online, without requiring any additional manually labeled ground truth beyond the offline training of the detection model. Such a configuration consists of setting the detection confidence threshold, a parameter present in every people detector that critically affects detection performance. The experimental results demonstrate that the proposed framework improves the performance of four different state-of-the-art detectors (DPM, ACF, Faster R-CNN, and YOLO9000) whose Optimal Fixed Thresholds (OFTs) have been determined and fixed at training time using standard datasets.
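Detections transferred from another viewpoint can be correlated with local ones via IoU matching; a minimal sketch follows. The agreement measure and the 0.5 IoU threshold are assumptions for illustration, not the paper's exact criterion.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def agreement(dets_view1, dets_view2_transferred, iou_thr=0.5):
    """Fraction of view-1 detections matched by a transferred view-2 detection."""
    if not dets_view1:
        return 0.0
    matched = sum(
        1 for b1 in dets_view1
        if any(iou(b1, b2) >= iou_thr for b2 in dets_view2_transferred)
    )
    return matched / len(dets_view1)
```

Sweeping a detector's threshold and keeping the value that maximizes such cross-view agreement would instantiate the self-correlation idea described above.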


2021 ◽  
Vol 33 (5) ◽  
pp. 83-104
Author(s):  
Aleksandr Igorevich Getman ◽  
Maxim Nikolaevich Goryunov ◽  
Andrey Georgievich Matskevich ◽  
Dmitry Aleksandrovich Rybolovlev

The paper discusses the issues of training models for detecting computer attacks using machine learning methods. The results of an analysis of publicly available training datasets and of tools for analyzing network traffic and extracting features of network sessions are presented in turn. The drawbacks of existing tools, and possible errors in the datasets formed with their help, are noted. It is concluded that collecting one's own training data is necessary when there are no guarantees of the reliability of public datasets, and that pre-trained models are of limited use in networks whose characteristics differ from those of the network in which the training traffic was collected. A practical approach to generating training data for computer attack detection models is proposed. The proposed solutions have been tested to evaluate the quality of model training on the collected data and the quality of attack detection in a real network infrastructure.


2020 ◽  
Vol 36 (12) ◽  
pp. 3863-3870
Author(s):  
Mischa Schwendy ◽  
Ronald E Unger ◽  
Sapun H Parekh

Abstract
Motivation: The use of deep learning for quantitative image analysis is increasing exponentially. However, training accurate, widely deployable deep learning algorithms requires a plethora of annotated (ground truth) data. Image collections must not only contain thousands of images to provide sufficient example objects (i.e., cells), but must also exhibit an adequate degree of image heterogeneity.
Results: We present a new dataset, EVICAN (Expert Visual Cell Annotation), comprising partially annotated grayscale images of 30 different cell lines from multiple microscopes, contrast mechanisms and magnifications that is readily usable as training data for computer vision applications. With 4600 images and ∼26,000 segmented cells, our collection offers an unparalleled heterogeneous training dataset for cell biology deep learning application development.
Availability and implementation: The dataset is freely available (https://edmond.mpdl.mpg.de/imeji/collection/l45s16atmi6Aa4sI?q=). Using a Mask R-CNN implementation, we demonstrate automated segmentation of cells and nuclei from brightfield images with a mean average precision of 61.6% at a Jaccard index above 0.5.
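The reported precision is evaluated at a Jaccard index (mask IoU) above 0.5 between predicted and ground-truth segmentations; a minimal sketch of that per-mask criterion:

```python
import numpy as np

def jaccard(mask_pred, mask_gt):
    """Jaccard index (IoU) of two boolean segmentation masks."""
    inter = np.logical_and(mask_pred, mask_gt).sum()
    union = np.logical_or(mask_pred, mask_gt).sum()
    return inter / union if union > 0 else 0.0

def is_true_positive(mask_pred, mask_gt, thr=0.5):
    """A predicted mask counts as correct when its Jaccard index exceeds thr."""
    return jaccard(mask_pred, mask_gt) > thr
```

Mean average precision is then computed over the detections ranked by confidence, with matches defined by this criterion.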


2021 ◽  
Vol 11 (22) ◽  
pp. 10966
Author(s):  
Hsiang-Chieh Chen ◽  
Zheng-Ting Li

This article introduces an automated data-labeling approach for generating crack ground truths (GTs) in concrete images. The main algorithm comprises generating first-round GTs, pre-training a deep learning-based model, and generating second-round GTs. Using the generated second-round GTs of the training data, a learning-based crack detection model can be trained in a self-supervised manner. The pre-trained deep learning-based model becomes effective for crack detection after it is re-trained using the second-round GTs. The main contribution of this study is an automated GT generation process for training a pixel-level crack detection model. Experimental results show that the second-round GTs are similar to manually marked labels. Accordingly, the cost of implementing learning-based methods is reduced significantly because human data labeling is not required.
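The two-round GT pipeline can be sketched as below. The first-round rule (dark-pixel thresholding) and the model interface are stand-ins, since the abstract does not specify them; they only illustrate the coarse-labels → pretrained-model → refined-labels flow.

```python
import numpy as np

def first_round_gt(image, dark_thr=60):
    """Stand-in first-round labeling: mark dark pixels as crack candidates
    (the paper's actual first-round rule is not specified in the abstract)."""
    return (image < dark_thr).astype(np.uint8)

def second_round_gt(prob_map, confidence=0.5):
    """Refine labels by binarizing the pretrained model's pixelwise crack probabilities."""
    return (prob_map >= confidence).astype(np.uint8)

def self_supervised_labels(images, pretrained_predict):
    """Two-round GT generation: coarse labels pretrain the model, whose
    predictions are then binarized into refined second-round labels."""
    round1 = [first_round_gt(img) for img in images]   # used to pretrain the model
    round2 = [second_round_gt(pretrained_predict(img)) for img in images]
    return round1, round2
```

The detector re-trained on `round2` is then used directly, with no manual pixel labeling in the loop.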


Electronics ◽  
2019 ◽  
Vol 8 (3) ◽  
pp. 329 ◽  
Author(s):  
Yong Li ◽  
Guofeng Tong ◽  
Huashuai Gao ◽  
Yuebin Wang ◽  
Liqiang Zhang ◽  
...  

Panoramic images have a wide range of applications in many fields, owing to their ability to capture all-round information. Object detection based on panoramic images has certain advantages in environment perception due to the characteristics of panoramic images, e.g., a larger field of view. In recent years, deep learning methods have achieved remarkable results in image classification and object detection, but their performance depends on large amounts of training data; a good training dataset is therefore a prerequisite for achieving better recognition results. To this end, we construct a benchmark named Pano-RSOD for panoramic road scene object detection. Pano-RSOD contains vehicles, pedestrians, traffic signs and guiding arrows, labelled by bounding boxes in the images. Different from traditional object detection datasets, Pano-RSOD contains more objects per panoramic image, and its high-resolution images offer 360-degree environmental perception, more annotations, more small objects and diverse road scenes. State-of-the-art deep learning algorithms trained on Pano-RSOD demonstrate that it is a useful benchmark, providing a better panoramic training dataset for object detection tasks, especially for small and deformed objects.


Sensors ◽  
2018 ◽  
Vol 18 (12) ◽  
pp. 4272 ◽  
Author(s):  
Jun Sang ◽  
Zhongyuan Wu ◽  
Pei Guo ◽  
Haibo Hu ◽  
Hong Xiang ◽  
...  

Vehicle detection is one of the important applications of object detection in intelligent transportation systems. It aims to extract specific vehicle-type information from pictures or videos containing vehicles. To address the problems of existing vehicle detection methods, such as the lack of vehicle-type recognition, low detection accuracy, and slow speed, a new vehicle detection model named YOLOv2_Vehicle, based on YOLOv2, is proposed in this paper. The k-means++ clustering algorithm was used to cluster the vehicle bounding boxes in the training dataset, and six anchor boxes of different sizes were selected. Considering that vehicle scale may influence the detection model, normalization was applied to improve the loss calculation for the width and height of bounding boxes. To improve the feature extraction ability of the network, a multi-layer feature fusion strategy was adopted, and repeated convolutional layers in the higher layers were removed. Experimental results on the Beijing Institute of Technology (BIT)-Vehicle validation dataset showed that the mean Average Precision (mAP) reaches 94.78%. The proposed model also showed excellent generalization ability on the CompCars test dataset, where the “vehicle face” differs considerably from the training dataset. Comparison experiments proved that the proposed method is effective for vehicle detection, and network visualization further confirmed its feature extraction ability.
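Anchor selection by clustering bounding-box widths and heights, as in YOLOv2, typically uses a 1 − IoU distance rather than Euclidean distance so that large and small boxes are treated fairly. A minimal sketch (plain random initialization here; the paper uses k-means++ initialization):

```python
import numpy as np

def iou_wh(box, clusters):
    """IoU between a (w, h) box and cluster (w, h) centroids, both anchored at the origin."""
    inter = np.minimum(box[0], clusters[:, 0]) * np.minimum(box[1], clusters[:, 1])
    union = box[0] * box[1] + clusters[:, 0] * clusters[:, 1] - inter
    return inter / union

def kmeans_anchors(boxes, k=6, iters=100, seed=0):
    """Cluster (w, h) pairs with a 1 - IoU distance to obtain anchor boxes."""
    rng = np.random.default_rng(seed)
    clusters = boxes[rng.choice(len(boxes), k, replace=False)]
    for _ in range(iters):
        # assign each box to the centroid with maximal IoU (minimal 1 - IoU)
        assign = np.array([np.argmax(iou_wh(b, clusters)) for b in boxes])
        new = np.array([boxes[assign == j].mean(axis=0) if np.any(assign == j)
                        else clusters[j] for j in range(k)])
        if np.allclose(new, clusters):
            break
        clusters = new
    return clusters
```

With k = 6 on the dataset's box statistics, the resulting centroids serve directly as the detector's anchor boxes.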


Author(s):  
Sulaimon Adebayo Bashir ◽  
Andrei Petrovski ◽  
Daniel Doolan

Purpose: The purpose of this paper is to develop a change detection technique for an activity recognition model. The approach aims to detect changes in the initial accuracy of the model after training, when the model is deployed for recognizing new unseen activities without access to the ground truth. Changes between the two sessions may occur because of differences in sensor placement, orientation and user characteristics such as age and gender. Many of the existing approaches for model adaptation in activity recognition are blind methods, because they continuously adapt the recognition model without explicitly detecting changes in model performance.
Design/methodology/approach: The approach determines the variation between reference activity data belonging to different classes and newly classified unseen data. If the data are coherent, the model is classifying the instances correctly; otherwise, a significant variation indicates that instances are being classified to the wrong classes. The approach is formulated as a two-level architectural framework comprising an off-line phase and an online phase. The off-line phase extracts Shewhart chart change parameters from the training dataset. The online phase classifies new samples and detects changes in each class of activity present in the dataset using the change parameters computed earlier.
Findings: The approach is evaluated using a real activity-recognition dataset. The results show consistent detections that correlate with the error rate of the model.
Originality/value: The developed approach does not use ground truth to detect classifier performance degradation. Rather, it uses a data discrimination method and a base classifier to detect changes, using the parameters computed from the reference data of each class to discriminate outliers in the new data being classified to the same class. To the best of the authors' knowledge, the approach is the first to address the problem of detecting within-user and cross-user variations that lead to concept drift in activity recognition, and the first to use a statistical process control method for change detection in activity recognition, within a robust integrated framework that seamlessly detects variations in the underlying model performance.
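A Shewhart-chart detector of the kind described can be sketched as follows: control limits are estimated from the reference (training) data of a class, and new samples falling outside them are flagged. The mean ± 3σ limits on a univariate statistic are a textbook assumption; the paper's actual change parameters may differ.

```python
import numpy as np

def shewhart_limits(reference, k=3.0):
    """Control limits (mean ± k·sigma) estimated from reference (training) data."""
    mu, sigma = reference.mean(), reference.std(ddof=1)
    return mu - k * sigma, mu + k * sigma

def detect_changes(new_values, limits):
    """Flag samples that fall outside the control limits (out-of-control points)."""
    lo, hi = limits
    return [x < lo or x > hi for x in new_values]
```

In the online phase, a sustained rise in the fraction of flagged samples for a class signals degradation of the underlying classifier without any ground-truth labels.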


2020 ◽  
Author(s):  
Loïc Dutrieux ◽  
Radhouene Azzabi ◽  
Sébastien Bauwens ◽  
Ulrich Gaël Bouka Dipelet ◽  
Olivier Chenoz ◽  
...  

As part of a project aiming to support FSC-certified logging concessions in their forest inventory and management tasks, we collected aerial imagery over 9000 ha of tropical forest in Northern Congo using long-range Unmanned Aerial Vehicles (UAVs). Once processed into orthomosaics, the aerial imagery is used in combination with reference training samples to train a deep learning object detection model (Faster R-CNN) capable of detecting and predicting tree species. The remoteness and diversity of these forests make both data acquisition and the generation of a training dataset challenging. Unlike natural images containing common objects such as cars, bicycles, cats and dogs, there is no easy way to create a training dataset of tree species from overhead imagery of tropical forests. The first reason is that a human operator cannot as easily recognize and label the objects. The second is that the polymorphism of tree species, phenological variations and the uncertainty associated with visual recognition make the exhaustive labeling of all instances of each class very difficult, yet such exhaustive labeling is required to successfully train any object detection model. To overcome these challenges we built an interactive and ergonomic interface that allows a human operator to work in a spatial context, guided by the approximate geographic locations of already inventoried trees. We solved the issue of non-exhaustive instance labeling by building synthetic images, allowing full control of the training data. In addition to these developments related to training data generation, we will present details of the UAV missions, modelling results on synthetic images, and preliminary results of model transfer to aerial imagery.


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Mingshu He ◽  
Xiaojuan Wang ◽  
Junhua Zhou ◽  
Yuanyuan Xi ◽  
Lei Jin ◽  
...  

With the increase of Internet visits and connections, it is becoming essential, and arduous, to protect networks and the various devices of the Internet of Things (IoT) from malicious attacks. Intrusion detection systems (IDSs) based on supervised machine learning (ML) require a large number of labeled samples. However, abnormal behaviors are far less frequent than normal ones, and the malicious behavior samples that can be intercepted for a training dataset are limited. Consequently, anomaly detection from a small number of abnormal behavior samples is a key research topic. This paper proposes an anomaly detection model that addresses this few-shot detection problem using convolutional neural networks (CNN) and an autoencoder (AE). The model mainly consists of a CNN-based supervised pretraining module and an AE-based data reconstruction module. Only a few abnormal samples are used in the pretraining module to build the deep feature extraction structure. The data reconstruction module uses only the deep features of normal samples as training data. The pretraining module also incorporates effective attention mechanisms. Through pretraining on small samples, the accuracy of anomaly detection is improved compared with training an AE on normal samples alone. The simulation results show that this solution addresses the above problems in network behavior anomaly detection. In comparison with the original AE model and other clustering methods, the proposed model visibly improves the detection results.
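Reconstruction-error thresholding, the core of the AE-based module, can be illustrated with a closed-form linear autoencoder (equivalent to PCA). The paper's model is a nonlinear AE trained on CNN features, so this is only a sketch of the decision rule: fit on normal samples, flag anything the bottleneck cannot reconstruct.

```python
import numpy as np

def fit_linear_ae(X_normal, n_components=2):
    """Closed-form linear autoencoder (equivalent to PCA): encoder/decoder from SVD."""
    mu = X_normal.mean(axis=0)
    _, _, vt = np.linalg.svd(X_normal - mu, full_matrices=False)
    W = vt[:n_components]            # encoder rows = top principal directions
    return mu, W

def reconstruction_error(X, mu, W):
    """Per-sample squared error after encoding/decoding through the bottleneck."""
    Z = (X - mu) @ W.T               # encode
    X_hat = Z @ W + mu               # decode
    return ((X - X_hat) ** 2).sum(axis=1)

def detect_anomalies(X, mu, W, threshold):
    """Samples that reconstruct poorly are flagged as anomalous."""
    return reconstruction_error(X, mu, W) > threshold
```

Because only normal samples shape the bottleneck, abnormal behaviors that leave the learned subspace incur large reconstruction errors even when no labeled anomalies are available at training time.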


2020 ◽  
Vol 27 ◽  
Author(s):  
Zaheer Ullah Khan ◽  
Dechang Pi

Background: S-sulfenylation (S-sulphenylation, or sulfenic acid) of proteins is a special kind of post-translational modification that plays an important role in various physiological and pathological processes such as cytokine signaling, transcriptional regulation, and apoptosis. Complementing existing wet-lab methods, several computational models have been developed for predicting sulfenylation cysteine sites. However, the performance of these models has not been satisfactory, owing to inefficient feature schemes, severe class imbalance, and the lack of an intelligent learning engine.
Objective: In this study, our motivation is to establish a strong and novel computational predictor for discriminating sulfenylation from non-sulfenylation sites.
Methods: We report an innovative bioinformatics feature encoding tool, named DeepSSPred, in which the encoded features are obtained via an n-segmented hybrid feature scheme; the synthetic minority oversampling technique (SMOTE) was then employed to cope with the severe imbalance between SC sites (minority class) and non-SC sites (majority class). A state-of-the-art 2D convolutional neural network was employed, with a rigorous 10-fold jackknife cross-validation technique for model validation and authentication.
Results: Following the proposed framework, the strong discrete presentation of the feature space, the machine learning engine, and the unbiased presentation of the underlying training data yielded an excellent model that outperforms all existing established studies. The proposed approach is 6% higher in MCC than the previous best; on an independent dataset, the previous best study failed to provide sufficient details for comparison. The model obtained an increase of 7.5% in accuracy, 1.22% in Sn, 12.91% in Sp and 13.12% in MCC on the training data, and 12.13% in ACC, 27.25% in Sn, 2.25% in Sp, and 30.37% in MCC on an independent dataset, in comparison with the second-best method. These empirical analyses show the superlative performance of the proposed model over both the training and independent datasets in comparison with existing studies.
Conclusion: In this research, we have developed a novel sequence-based automated predictor for SC sites, called DeepSSPred. The empirical simulation outcomes on a training dataset and an independent validation dataset have revealed the efficacy of the proposed theoretical model. The good performance of DeepSSPred is due to several factors, such as the novel discriminative feature encoding schemes, the SMOTE technique, and the careful construction of the prediction model through the tuned 2D-CNN classifier. We believe that our work will provide insight into the further prediction of S-sulfenylation characteristics and functionalities, and we hope that our predictor will significantly help in the large-scale discrimination of unknown SC sites in particular and in designing new pharmaceutical drugs in general.
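The SMOTE step can be sketched as follows; this is a minimal textbook implementation (the number of neighbors k and the random seed are arbitrary), not the exact variant used in the paper.

```python
import numpy as np

def smote(minority, n_new, k=3, seed=0):
    """Minimal SMOTE: each synthetic sample is a random interpolation between
    a minority point and one of its k nearest minority neighbors."""
    rng = np.random.default_rng(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(minority))
        d = np.linalg.norm(minority - minority[i], axis=1)
        neighbors = np.argsort(d)[1:k + 1]        # skip the point itself
        j = rng.choice(neighbors)
        lam = rng.random()                        # interpolation factor in [0, 1)
        synthetic.append(minority[i] + lam * (minority[j] - minority[i]))
    return np.array(synthetic)
```

Because the synthetic points lie on segments between real minority samples, the oversampled class occupies the same region of feature space, which balances the training set without duplicating examples verbatim.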

