Feature Extraction and Fusion Using Deep Convolutional Neural Networks for Face Detection

2017 ◽  
Vol 2017 ◽  
pp. 1-9 ◽  
Author(s):  
Xiaojun Lu ◽  
Xu Duan ◽  
Xiuping Mao ◽  
Yuanyuan Li ◽  
Xiangde Zhang

This paper proposes a method that uses feature fusion to represent images better for face detection after feature extraction by a deep convolutional neural network (DCNN). First, we learn features from the data with Clarifai net and VGG Net-D (16 layers), respectively; then we fuse the features extracted by the two nets. To obtain a more compact feature representation and reduce computational complexity, we reduce the dimension of the fused features by PCA. Finally, we perform binary face/non-face classification with an SVM classifier. In particular, we exploit offset max-pooling to extract features densely with a sliding window, which leads to better matches between faces and detection windows and thus more accurate detection. Experimental results show that our method can detect faces with severe occlusion and large variations in pose and scale. In particular, our method achieves an 89.24% recall rate on FDDB and 97.19% average precision on AFW.
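As a rough illustration of the pipeline described above (fuse two DCNN feature sets, compress with PCA, classify with an SVM), the following sketch uses random arrays in place of the Clarifai and VGG features; the dimensions and PCA component count are assumptions, not the paper's settings.

```python
# Minimal sketch: concatenate two DCNN feature sets, reduce with PCA, classify with an SVM.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC

n_windows = 1000
feat_clarifai = np.random.randn(n_windows, 4096)   # stand-in for Clarifai net features (assumed dim)
feat_vgg = np.random.randn(n_windows, 4096)        # stand-in for VGG Net-D features (assumed dim)
labels = np.random.randint(0, 2, n_windows)        # 1 = face, 0 = non-face

# Fuse by concatenation, then compress with PCA for a compact representation
fused = np.hstack([feat_clarifai, feat_vgg])
pca = PCA(n_components=256)                        # illustrative target dimension
fused_low = pca.fit_transform(fused)

# Binary face / non-face classification with an SVM
clf = SVC(kernel="linear").fit(fused_low, labels)
print(clf.score(fused_low, labels))
```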

2020 ◽  
Vol 10 (12) ◽  
pp. 4177
Author(s):  
Chaowei Tang ◽  
Shiyu Chen ◽  
Xu Zhou ◽  
Shuai Ruan ◽  
Haotian Wen

Face detection is an important basic technique for face-related applications, such as face analysis, recognition, and reconstruction. Images in unconstrained scenes may contain many small-scale faces. The features that a detector can extract from small-scale faces are limited, which causes missed detections and greatly reduces the precision of face detection. Therefore, this study proposes a novel method to detect small-scale faces based on the region-based fully convolutional network (R-FCN). First, we propose a novel R-FCN framework with the ability of feature fusion and receptive field adaptation. Second, a bottom-up feature fusion branch is established to enrich the local information of high-layer features. Third, a receptive field adaptation block (RFAB) is proposed to ensure that the receptive field can be adaptively selected to strengthen the expressive ability of features. Finally, we improve the anchor setting method and adopt soft non-maximum suppression (SoftNMS) as the selection method for candidate boxes. Experimental results show that the average precision for small-scale face detection of R-FCN with the feature fusion branch and RFAB (RFAB-f-R-FCN) is improved by 0.8%, 2.9%, and 11% on the three subsets of Wider Face compared with that of the baseline R-FCN.
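The receptive field adaptation idea can be loosely illustrated with parallel dilated branches whose outputs are mixed by a learned gate; the sketch below is only one plausible reading, with branch dilations and channel sizes chosen for illustration rather than taken from the RFAB design.

```python
# Rough PyTorch sketch of a receptive-field-adaptive block: parallel dilated
# branches mixed by a learned softmax gate (illustrative, not the exact RFAB).
import torch
import torch.nn as nn

class ReceptiveFieldAdaptiveBlock(nn.Module):
    def __init__(self, channels=256, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d) for d in dilations
        )
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, len(dilations), 1),
            nn.Softmax(dim=1),
        )

    def forward(self, x):
        outs = torch.stack([b(x) for b in self.branches], dim=1)  # (N, B, C, H, W)
        w = self.gate(x).unsqueeze(2)                              # (N, B, 1, 1, 1) branch weights
        return (outs * w).sum(dim=1)                               # weighted mix of receptive fields

x = torch.randn(1, 256, 38, 38)
print(ReceptiveFieldAdaptiveBlock()(x).shape)
```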


2021 ◽  
Author(s):  
Yu Xiang ◽  
Liwei Hu ◽  
Jun Zhang ◽  
Wenyong Wang

The perception of the geometric features of airfoils is fundamental in aerodynamics for performance prediction, parameterization, aircraft inverse design, etc. There are three approaches to perceiving the geometric shape of an airfoil: manually designed airfoil geometry parameters, polynomial definitions, and deep learning. The first two methods can directly extract geometry features of airfoils or polynomial equations of airfoil curves, but the number of features extracted is limited. Deep learning algorithms can extract a large number of potential features (called latent features); however, the features extracted by deep learning lack explicit geometric meaning. Motivated by the advantages of polynomial definition and deep learning, we propose a geometry-based deep learning feature extraction scheme for airfoils (named Bézier-based feature extraction, BFE), which consists of two parts: manifold metric feature extraction and a geometry-feature fusion encoder (GF encoder). Manifold metric feature extraction, with the help of the Bézier curve, captures features from the tangent space of airfoil curves, and the GF encoder combines airfoil coordinate data and manifold metrics to form a novel feature representation. The public UIUC airfoil dataset is used to verify the proposed BFE. Compared with a classic auto-encoder, the mean square error (MSE) of BFE is reduced by 17.97%~29.14%.
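A minimal sketch of the Bézier ingredient follows: fit a cubic Bézier curve to airfoil coordinates by least squares and evaluate unit tangent vectors along it as simple tangent-space features. The degree, parameterization, and toy points are assumptions for illustration, not the paper's BFE scheme.

```python
# Fit a cubic Bézier to 2D points by least squares, then compute unit tangents along the curve.
import numpy as np

pts = np.stack([np.linspace(0, 1, 50),
                0.1 * np.sin(np.pi * np.linspace(0, 1, 50))], axis=1)  # toy airfoil-like curve

t = np.linspace(0, 1, len(pts))
B = np.stack([(1 - t) ** 3, 3 * t * (1 - t) ** 2, 3 * t ** 2 * (1 - t), t ** 3], axis=1)
ctrl, *_ = np.linalg.lstsq(B, pts, rcond=None)      # 4 control points of the cubic Bézier

# Derivative of the cubic Bernstein basis gives the (un-normalised) tangent at each t
dB = np.stack([-3 * (1 - t) ** 2,
               3 * (1 - t) ** 2 - 6 * t * (1 - t),
               6 * t * (1 - t) - 3 * t ** 2,
               3 * t ** 2], axis=1)
tangents = dB @ ctrl
tangents /= np.linalg.norm(tangents, axis=1, keepdims=True)
print(ctrl.shape, tangents.shape)   # (4, 2) control points, (50, 2) unit tangents
```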


Sensors ◽  
2020 ◽  
Vol 20 (13) ◽  
pp. 3766
Author(s):  
Behnood Rasti ◽  
Pedram Ghamisi ◽  
Peter Seidel ◽  
Sandra Lorenz ◽  
Richard Gloaguen

Geological objects are characterized by a high complexity inherent to a strong compositional variability at all scales and usually unclear class boundaries. Therefore, dedicated processing schemes are required for the analysis of such data for mineralogical mapping. On the other hand, the variety of optical sensing technologies reveals different data attributes, and therefore multi-sensor approaches are adopted to solve such complicated mapping problems. In this paper, we devise an adapted multi-optical sensor fusion (MOSFus) workflow which takes the geological characteristics into account. The proposed processing chain exhaustively covers all relevant stages, including data acquisition, preprocessing, feature fusion, and mineralogical mapping. The concept includes (i) a spatial feature extraction based on morphological profiles on RGB data with high spatial resolution, (ii) a specific noise reduction applied to the hyperspectral data that assumes mixed sparse and Gaussian contamination, and (iii) a subsequent dimensionality reduction using a sparse and smooth low-rank analysis. The feature extraction approach allows one to fuse heterogeneous data at variable resolutions, scales, and spectral ranges and improves classification substantially. The last step of the approach, an SVM classifier, is robust to unbalanced and sparse training sets and is particularly efficient with complex imaging data. We evaluate the performance of the procedure with two different multi-optical sensor datasets. The results demonstrate the superiority of this dedicated approach over common strategies.
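Step (i) can be sketched roughly as follows: build a morphological profile on one band of the high-resolution RGB data by stacking openings and closings with increasing structuring elements. Real profiles often use opening/closing by reconstruction; the plain operators and the radii here are simplifying assumptions.

```python
# Simplified morphological profile: stack openings/closings with growing structuring elements.
import numpy as np
from skimage.morphology import opening, closing, disk

band = np.random.rand(128, 128)                 # stand-in for one RGB band
radii = [2, 4, 8]                               # illustrative structuring-element sizes

profile = [band]
for r in radii:
    profile.append(opening(band, disk(r)))      # removes bright structures smaller than the disk
    profile.append(closing(band, disk(r)))      # removes dark structures smaller than the disk

spatial_features = np.stack(profile, axis=-1)   # (H, W, 1 + 2 * len(radii)) per-pixel feature stack
print(spatial_features.shape)
```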


Nowadays, image processing is used in various areas such as agriculture, healthcare, and security. In crime investigation, image processing can be used to identify a particular suspect from an available dataset; for that purpose, an image retrieval technique is presented in this paper. A number of techniques are available for image retrieval. Earlier, Block Truncation Coding was used, but due to its disadvantages a feature extraction method is used instead. Using the DDBTC technique, two features are derived: the Color Co-occurrence Feature (CCF), obtained from the color quantizer, and the Bit Pattern Feature (BPF), derived from the bitmap image. Five different distance metrics are used to measure the similarity between two images. The simulation results show that the proposed technique gives better results in terms of Average Precision Rate (APR) and Average Recall Rate (ARR) compared with other techniques.
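A hedged sketch of the retrieval step follows: compare a query's CCF and BPF histograms against database images under several distance metrics and rank by the combined distance. The histogram sizes, equal weighting, and the particular metrics chosen are illustrative assumptions, not the paper's exact five metrics.

```python
# Rank database images by the combined CCF + BPF distance under several metrics.
import numpy as np
from scipy.spatial.distance import cityblock, euclidean, chebyshev, canberra, braycurtis

def combined_distance(q_ccf, q_bpf, d_ccf, d_bpf, metric):
    # Equal-weight sum of the two feature distances (illustrative choice)
    return metric(q_ccf, d_ccf) + metric(q_bpf, d_bpf)

rng = np.random.default_rng(0)
query_ccf, query_bpf = rng.random(64), rng.random(32)          # query feature histograms
db_ccf, db_bpf = rng.random((100, 64)), rng.random((100, 32))  # 100 database images

for metric in (cityblock, euclidean, chebyshev, canberra, braycurtis):
    dists = [combined_distance(query_ccf, query_bpf, c, b, metric)
             for c, b in zip(db_ccf, db_bpf)]
    print(metric.__name__, int(np.argmin(dists)))              # best match under each metric
```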


2021 ◽  
Vol 13 (10) ◽  
pp. 1912
Author(s):  
Zhili Zhang ◽  
Meng Lu ◽  
Shunping Ji ◽  
Huafen Yu ◽  
Chenhui Nie

Accurately extracting water-bodies from very high resolution (VHR) remote sensing imagery is a great challenge. The boundaries of a water body are commonly hard to identify due to the complex spectral mixtures caused by aquatic vegetation, distinct lake/river colors, silt near the bank, shadows from surrounding tall plants, and so on. The diversity and semantic information of features need to be increased for better extraction of water-bodies from VHR remote sensing images. In this paper, we address these problems by designing a novel multi-feature extraction and combination module. This module consists of three feature extraction sub-modules based on spatial and channel correlations in the feature maps at each scale, which extract the complete target information from the local space, the larger space, and the between-channel relationship to achieve a rich feature representation. Simultaneously, to better predict the fine contours of water-bodies, we adopt a multi-scale prediction fusion module. Besides, to solve the semantic inconsistency of feature fusion between the encoding stage and the decoding stage, we apply an encoder-decoder semantic feature fusion module to promote the fusion effect. We carry out extensive experiments on VHR aerial and satellite imagery, respectively. The results show that our method achieves state-of-the-art segmentation performance, surpassing classic and recent methods. Moreover, our proposed method is robust in challenging water-body extraction scenarios.
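The pattern of combining local-space, larger-space, and between-channel information at one scale can be sketched loosely as below; the paper's actual sub-modules differ, and the layer choices here (a 3x3 convolution, a dilated convolution, and a squeeze-and-excitation-style channel branch) are assumptions for illustration.

```python
# Loose sketch: fuse local, wide-context, and channel-attention branches at one scale.
import torch
import torch.nn as nn

class MultiFeatureBlock(nn.Module):
    def __init__(self, c=64):
        super().__init__()
        self.local = nn.Conv2d(c, c, 3, padding=1)              # local spatial context
        self.wide = nn.Conv2d(c, c, 3, padding=4, dilation=4)   # larger spatial context
        self.channel = nn.Sequential(                           # between-channel relationship
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(c, c, 1), nn.Sigmoid()
        )
        self.fuse = nn.Conv2d(3 * c, c, 1)                      # combine the three branches

    def forward(self, x):
        return self.fuse(torch.cat(
            [self.local(x), self.wide(x), x * self.channel(x)], dim=1))

print(MultiFeatureBlock()(torch.randn(1, 64, 32, 32)).shape)
```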


2021 ◽  
Vol 11 (2) ◽  
pp. 424-431
Author(s):  
Yingxin Wang ◽  
Qianqian Zeng

Texture analysis has always been an active area of ultrasound image processing research, and using texture features to classify ultrasound images is the focus of researchers' attention. Extracting representative texture features is an important part of successful texture description. The goal of this paper is to apply deep neural networks to the ultrasound classification of ovarian tumors and to design a novel ovarian cancer diagnosis system. An improved HOG feature extraction method and the gray-level co-occurrence matrix of the LBP image are first adopted to extract low-level features; these features are then cascaded into a new feature vector and input into an auto-encoder neural network to learn high-level features. Finally, an SVM classifier is used to classify the ovarian lesion. A large number of qualitative and quantitative experiments show that the improved method outperforms the comparison algorithms for ovarian ultrasound lesions and can significantly improve classification performance while ensuring the accuracy and recall rates.
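The low-level stage can be sketched with off-the-shelf scikit-image routines: HOG features plus gray-level co-occurrence statistics of an LBP image, cascaded into one vector. The parameter values are plain defaults rather than the paper's improved-HOG settings (assumes scikit-image >= 0.19 for the graycomatrix naming).

```python
# HOG features + GLCM statistics of an LBP image, cascaded into one low-level vector.
import numpy as np
from skimage.feature import hog, local_binary_pattern, graycomatrix, graycoprops

img = np.random.rand(128, 128)                              # stand-in for an ultrasound ROI

hog_vec = hog(img, orientations=9, pixels_per_cell=(16, 16), cells_per_block=(2, 2))

lbp = local_binary_pattern(img, P=8, R=1, method="uniform").astype(np.uint8)  # values 0..9
glcm = graycomatrix(lbp, distances=[1], angles=[0, np.pi / 2], levels=10, normed=True)
glcm_vec = np.concatenate([graycoprops(glcm, p).ravel()
                           for p in ("contrast", "homogeneity", "energy", "correlation")])

low_level = np.concatenate([hog_vec, glcm_vec])             # cascaded feature vector
print(low_level.shape)                                      # would feed the auto-encoder stage
```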


2019 ◽  
Vol 11 (24) ◽  
pp. 3006 ◽  
Author(s):  
Yafei Lv ◽  
Xiaohan Zhang ◽  
Wei Xiong ◽  
Yaqi Cui ◽  
Mi Cai

Remote sensing image scene classification (RSISC) is an active task in the remote sensing community and has attracted great attention due to its wide applications. Recently, deep convolutional neural network (CNN)-based methods have achieved a remarkable breakthrough in the performance of remote sensing image scene classification. However, the problem that the feature representation is not discriminative enough still exists, mainly caused by inter-class similarity and intra-class diversity. In this paper, we propose an efficient end-to-end local-global-fusion feature extraction (LGFFE) network for a more discriminative feature representation. Specifically, global and local features are extracted from the channel and spatial dimensions, respectively, based on a high-level feature map from deep CNNs. For the local features, a novel recurrent neural network (RNN)-based attention module is first proposed to capture the spatial layout and context information across different regions. Gated recurrent units (GRUs) are then exploited to generate the importance weight of each region by taking a sequence of features from image patches as input. A reweighted regional feature representation can be obtained by focusing on the key regions. Then, the final feature representation is acquired by fusing the local and global features. The whole process of feature extraction and feature fusion can be trained in an end-to-end manner. Finally, extensive experiments have been conducted on four public and widely used datasets, and the experimental results show that our LGFFE method outperforms baseline methods and achieves state-of-the-art results.
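The GRU-based region reweighting can be sketched as follows: treat the spatial positions of a high-level feature map as a sequence of region descriptors, score them with a GRU, and pool with the resulting weights. The dimensions and the linear scoring head are assumptions, not the exact LGFFE module.

```python
# GRU scores each spatial region; the feature map is pooled with the resulting weights.
import torch
import torch.nn as nn

class RegionGRUAttention(nn.Module):
    def __init__(self, c=512, hidden=128):
        super().__init__()
        self.gru = nn.GRU(c, hidden, batch_first=True)
        self.score = nn.Linear(hidden, 1)

    def forward(self, fmap):                       # fmap: (N, C, H, W) from the backbone
        regions = fmap.flatten(2).transpose(1, 2)  # (N, H*W, C) sequence of region features
        out, _ = self.gru(regions)
        weights = torch.softmax(self.score(out), dim=1)        # importance of each region
        return (regions * weights).sum(dim=1)                  # reweighted local feature (N, C)

print(RegionGRUAttention()(torch.randn(2, 512, 7, 7)).shape)
```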


2019 ◽  
Vol 8 (3) ◽  
pp. 3305-3310

With the advent of medical endoscopes, earth observation satellites, and personal phones, content-based image retrieval (CBIR) has attracted considerable attention, driven by its broad applications, e.g., medical image analysis, remote sensing, and person re-identification. However, developing effective feature extraction is still reported as a challenging issue. In this paper, to overcome the feature extraction problems, a hybrid Tile-Based Feature Extraction (TBFE) method is introduced. The TBFE algorithm hybridizes the local binary pattern (LBP) and the local derivative pattern (LDP). This hybrid TBFE method extracts color image features automatically. A support vector machine (SVM) is used as the classifier in this image retrieval approach to retrieve images from the database. The hybrid TBFE combined with the SVM classifier is named IR-TBFE-SVM. Experiments show that IR-TBFE-SVM delivers higher precision and recall rates than retrieval systems employing a single feature, and offers decent load-balancing and query-efficiency performance.
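The tile-based part can be sketched as below: split the LBP image into a grid of tiles and concatenate per-tile histograms into one descriptor for the SVM (the LDP component is omitted). The grid size and LBP parameters are illustrative assumptions.

```python
# Per-tile LBP histograms concatenated into one image descriptor.
import numpy as np
from skimage.feature import local_binary_pattern

def tile_lbp_features(img, grid=(4, 4), P=8, R=1):
    lbp = local_binary_pattern(img, P, R, method="uniform")
    th, tw = img.shape[0] // grid[0], img.shape[1] // grid[1]
    hists = []
    for i in range(grid[0]):
        for j in range(grid[1]):
            tile = lbp[i * th:(i + 1) * th, j * tw:(j + 1) * tw]
            hist, _ = np.histogram(tile, bins=P + 2, range=(0, P + 2), density=True)
            hists.append(hist)
    return np.concatenate(hists)                 # one descriptor per image, fed to the SVM

print(tile_lbp_features(np.random.rand(128, 128)).shape)   # 16 tiles * 10 bins
```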


2020 ◽  
Vol 10 (15) ◽  
pp. 5385
Author(s):  
Zheng Liu ◽  
Feixiang Du ◽  
Wang Li ◽  
Xu Liu ◽  
Qiang Zou

Given a video containing a person, the video-based person re-identification (Re-ID) task aims to identify the same person from videos captured by different cameras. How to embed the spatial-temporal information of a video into its feature representation is a crucial challenge. Most existing methods fail to make full use of the relationship between frames during feature extraction. In this work, we propose a plug-and-play non-local attention module (NLAM) for frame-level feature extraction. NLAM, based on global spatial attention and channel attention, helps the network determine the location of the person in each frame. Besides, we propose a non-local temporal pooling (NLTP) method for the aggregation of temporal features, which can effectively capture long-range and global dependencies among the frames of the video. Our model obtained impressive results on different datasets compared to state-of-the-art methods. In particular, it achieved a rank-1 accuracy of 86.3% on the MARS (Motion Analysis and Re-identification Set) dataset without re-ranking, which is 1.4% higher than the state-of-the-art method. On the DukeMTMC-VideoReID (Duke Multi-Target Multi-Camera Video Re-identification) dataset, our method also achieved an excellent performance of 95% rank-1 accuracy and 94.5% mAP (mean Average Precision).
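A compact sketch of the kind of non-local (self-attention) block such a module builds on is shown below, where every spatial position attends to every other one; channel sizes are illustrative, and the actual NLAM and NLTP modules add channel attention and temporal pooling on top.

```python
# Standard non-local block: each spatial position attends to all others, with a residual.
import torch
import torch.nn as nn

class NonLocalBlock(nn.Module):
    def __init__(self, c=256, inner=128):
        super().__init__()
        self.theta = nn.Conv2d(c, inner, 1)
        self.phi = nn.Conv2d(c, inner, 1)
        self.g = nn.Conv2d(c, inner, 1)
        self.out = nn.Conv2d(inner, c, 1)

    def forward(self, x):                              # x: (N, C, H, W) frame-level feature map
        n, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)   # (N, HW, inner)
        k = self.phi(x).flatten(2)                     # (N, inner, HW)
        v = self.g(x).flatten(2).transpose(1, 2)       # (N, HW, inner)
        attn = torch.softmax(q @ k, dim=-1)            # (N, HW, HW) global dependencies
        y = (attn @ v).transpose(1, 2).reshape(n, -1, h, w)
        return x + self.out(y)                         # residual connection

print(NonLocalBlock()(torch.randn(2, 256, 16, 8)).shape)
```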


2020 ◽  
Vol 2020 ◽  
pp. 1-13
Author(s):  
Bao-Yuan Chen ◽  
Yu-Kun Shen ◽  
Kun Sun

At present, object detectors based on convolutional neural networks generally rely on the last layer of features extracted by the feature extraction network. In the process of continuous convolution and pooling of deep features, position information cannot be completely transferred backward. This paper proposes a multiscale feature reuse detection model, which includes the basic feature extraction network DenseNet, a feature fusion network, a multiscale anchor region proposal network, and a classification and regression network. The fusion of high-dimensional and low-dimensional features not only strengthens the model's sensitivity to objects of different sizes but also strengthens the transmission of information, so that the feature map has rich deep semantic information and shallow location information at the same time, which significantly improves the robustness and detection accuracy of the model. The algorithm is trained and tested on the Pascal VOC2007 dataset. The experimental results show that the mean average precision over the objects in the dataset is 73.87%. At the same time, compared with the mainstream Faster RCNN and SSD detection models, the mean average precision of the DenseNet-based object detection algorithm is improved by 5.63% and 3.86%, respectively.
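The general pattern behind fusing a deep, low-resolution feature map with a shallow, high-resolution one (upsample, project with 1x1 convolutions, add, smooth) can be sketched as below; the channel counts are assumptions and this is not the paper's exact DenseNet-based fusion network.

```python
# FPN-style fusion of a deep semantic map with a shallow high-resolution map.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FuseHighLow(nn.Module):
    def __init__(self, c_low=256, c_high=1024, c_out=256):
        super().__init__()
        self.lat_low = nn.Conv2d(c_low, c_out, 1)    # shallow features: location detail
        self.lat_high = nn.Conv2d(c_high, c_out, 1)  # deep features: semantic information
        self.smooth = nn.Conv2d(c_out, c_out, 3, padding=1)

    def forward(self, low, high):
        up = F.interpolate(self.lat_high(high), size=low.shape[-2:], mode="nearest")
        return self.smooth(self.lat_low(low) + up)   # fused map keeps both kinds of information

low = torch.randn(1, 256, 64, 64)    # early-layer feature map
high = torch.randn(1, 1024, 16, 16)  # last-layer feature map
print(FuseHighLow()(low, high).shape)
```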

