Multi-Block Color-Binarized Statistical Images for Single-Sample Face Recognition

Sensors ◽  
2021 ◽  
Vol 21 (3) ◽  
pp. 728
Author(s):  
Insaf Adjabi ◽  
Abdeldjalil Ouahabi ◽  
Amir Benzaoui ◽  
Sébastien Jacques

Single-Sample Face Recognition (SSFR) is a computer vision challenge. In this scenario, there is only one example from each individual on which to train the system, making it difficult to identify persons in unconstrained environments, mainly when dealing with changes in facial expression, posture, lighting, and occlusion. This paper presents an original method for SSFR, called Multi-Block Color-Binarized Statistical Image Features (MB-C-BSIF), which exploits several kinds of features, namely local, regional, global, and textured-color characteristics. First, the MB-C-BSIF method decomposes a facial image into three channels (e.g., red, green, and blue); then it divides each channel into equal non-overlapping blocks to select the local facial characteristics that are subsequently employed in the classification phase. Finally, the identity is determined by computing the similarities among the feature vectors using a distance-based K-nearest neighbors (K-NN) classifier. Extensive experiments on several subsets of the unconstrained Aleix and Robert (AR) and Labeled Faces in the Wild (LFW) databases show that MB-C-BSIF achieves superior or competitive results in unconstrained situations when compared with current state-of-the-art methods, especially when dealing with changes in facial expression, lighting, and occlusion. The average classification accuracies are 96.17% and 99% for the AR database with two specific protocols (Protocols I and II, respectively), and 38.01% for the challenging LFW database; these performances are clearly superior to those obtained by state-of-the-art methods. Furthermore, the proposed method relies only on simple, elementary image-processing operations that avoid the higher computational costs of holistic, sparse-representation, or deep-learning methods, making it well suited to real-time identification.
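The block-based matching pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: plain per-block gray-level histograms stand in for the learned BSIF filter responses, and the 4×4 block grid, 16 histogram bins, and L1 (city-block) distance are illustrative choices.

```python
import numpy as np

def block_histograms(channel, blocks=4, bins=16):
    """Split one color channel into equal non-overlapping blocks and
    histogram each block (a stand-in for per-block BSIF codes)."""
    h, w = channel.shape
    bh, bw = h // blocks, w // blocks
    feats = []
    for i in range(blocks):
        for j in range(blocks):
            patch = channel[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw]
            hist, _ = np.histogram(patch, bins=bins, range=(0, 256))
            feats.append(hist / max(hist.sum(), 1))  # normalize per block
    return np.concatenate(feats)

def describe(image_rgb):
    """Concatenate block features from the R, G, and B channels."""
    return np.concatenate([block_histograms(image_rgb[..., c]) for c in range(3)])

def identify(probe, gallery):
    """1-NN over the single gallery descriptor per subject, L1 distance."""
    probe_feat = describe(probe)
    dists = {name: np.abs(probe_feat - feat).sum() for name, feat in gallery.items()}
    return min(dists, key=dists.get)
```

In the SSFR setting, `gallery` holds exactly one descriptor per enrolled subject, which is what makes the distance-based 1-NN decision the natural classifier here.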


Sensors ◽  
2019 ◽  
Vol 19 (1) ◽  
pp. 146 ◽  
Author(s):  
Vittorio Cuculo ◽  
Alessandro D’Amelio ◽  
Giuliano Grossi ◽  
Raffaella Lanzarotti ◽  
Jianyi Lin

Face recognition using a single reference image per subject is challenging, above all when referring to a large gallery of subjects. Furthermore, the problem becomes considerably harder when the images are acquired in unconstrained conditions. In this paper we address the challenging Single Sample Per Person (SSPP) problem on large datasets of images acquired in the wild, thus possibly featuring illumination, pose, facial expression, partial occlusion, and low-resolution hurdles. The proposed technique alternates a sparse dictionary learning technique based on the method of optimal directions with the iterative ℓ0-norm minimization algorithm called k-LiMapS. It works on robust deep-learned features, provided that the image variability is extended by standard augmentation techniques. Experiments show the effectiveness of our method against the difficulties introduced above: first, we report extensive experiments on the unconstrained LFW dataset with large galleries of up to 1680 subjects; second, we present experiments on very low-resolution test images down to 8 × 8 pixels; third, tests on the AR dataset are analyzed against specific disguises such as partial occlusions, facial expressions, and illumination problems. In all three scenarios our method outperforms state-of-the-art approaches adopting similar configurations.


2014 ◽  
Vol 2014 ◽  
pp. 1-10 ◽  
Author(s):  
Rui Min ◽  
Abdenour Hadid ◽  
Jean-Luc Dugelay

While there has been an enormous amount of research on face recognition under pose, illumination, and expression changes and image degradations, problems caused by occlusion have attracted relatively little attention. Facial occlusions, due, for example, to sunglasses, hats/caps, scarves, and beards, can significantly deteriorate the performance of face recognition systems in uncontrolled environments such as video surveillance. The goal of this paper is to explore face recognition in the presence of partial occlusion, with emphasis on real-world scenarios (e.g., sunglasses and scarves). We propose an efficient approach that first analyzes a face for the presence of potential occlusion and then performs recognition on the non-occluded facial regions using selective local Gabor binary patterns. Experiments demonstrate that the proposed method outperforms state-of-the-art works including KLD-LGBPHS, S-LNMF, OA-LBP, and RSC. Furthermore, the proposed approach is also evaluated under illumination and extreme facial-expression changes and again yields significant results.
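The selective-region idea (occluded blocks contribute nothing to matching) can be sketched as below. This is a simplified illustration: plain 8-neighbour LBP codes stand in for the paper's Gabor-filtered binary patterns, and the block grid and the externally supplied occlusion mask are assumptions for the sketch.

```python
import numpy as np

def lbp_codes(gray):
    """Basic 8-neighbour LBP codes for interior pixels (plain LBP here,
    standing in for Gabor-filtered binary patterns)."""
    c = gray[1:-1, 1:-1]
    neighbours = [gray[:-2, :-2], gray[:-2, 1:-1], gray[:-2, 2:],
                  gray[1:-1, 2:], gray[2:, 2:], gray[2:, 1:-1],
                  gray[2:, :-2], gray[1:-1, :-2]]
    codes = np.zeros_like(c, dtype=np.uint8)
    for bit, n in enumerate(neighbours):
        codes |= (n >= c).astype(np.uint8) << bit
    return codes

def selective_histogram(gray, occluded_blocks, blocks=4, bins=32):
    """Histogram LBP codes block by block, zeroing out blocks flagged as
    occluded so they contribute nothing at the matching stage."""
    codes = lbp_codes(gray)
    h, w = codes.shape
    bh, bw = h // blocks, w // blocks
    feats = []
    for i in range(blocks):
        for j in range(blocks):
            if (i, j) in occluded_blocks:
                feats.append(np.zeros(bins))  # masked-out region
                continue
            patch = codes[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw]
            hist, _ = np.histogram(patch, bins=bins, range=(0, 256))
            feats.append(hist / max(hist.sum(), 1))
    return np.concatenate(feats)
```

In the paper, the occlusion mask itself is estimated from the image; here it is passed in as a set of block indices to keep the sketch self-contained.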


Author(s):  
Priya Saha ◽  
Debotosh Bhattacharjee ◽  
Barin Kumar De ◽  
Mita Nasipuri

There is a large body of research on facial expression analysis and recognition in both the visible and the thermal spectrum, and several facial expression databases have been designed in both modalities. However, little attention has been given to analyzing blended facial expressions in the thermal infrared spectrum. In this paper, we introduce the Visual-Thermal Blended Facial Expression Database (VTBE), which contains visual and thermal face images with both basic and blended facial expressions: 12 posed blended expressions and the six spontaneous basic expressions in both modalities. We propose the Deformed Thermal Facial Area (DTFA) in thermal expressive face images and analyze it to differentiate between basic and blended expressions. We then propose fusing the DTFA with the Deformed Visual Facial Area (DVFA), combining features from both modalities, and conduct experiments on this new database. To show the effectiveness of our approach, we also compare our method with state-of-the-art methods on the USTC-NVIE database. Experimental results reveal that our approach is superior to state-of-the-art methods.


Author(s):  
Wanshun Gao ◽  
Xi Zhao ◽  
Jun An ◽  
Jianhua Zou

In this paper, we propose a novel approach for 3D face reconstruction from multiple facial images. Given original pose-variant images, coarse 3D face templates are initialized and a refined 3D face mesh is reconstructed iteratively. Then, we warp the original facial images to the 2D meshes projected from 3D using a Sparse Mesh Affine Warp (SMAW). Finally, we weight the face patches in each view and map the patch with the higher weight to a canonical UV space. For facial images with arbitrary pose, the invisible regions are filled with the corresponding UV patches, and Poisson editing is applied to blend the different patches seamlessly. We evaluate the proposed method on the LFW dataset in terms of texture refinement and face recognition. The results demonstrate competitive performance compared to state-of-the-art methods.


Author(s):  
Haitao Pu ◽  
Jian Lian ◽  
Mingqu Fan

In this paper, we propose an automatic convolutional neural network (CNN)-based method for recognizing chicken behavior within a poultry farm using a Kinect sensor. It addresses the difficulties of flock-behavior image classification by leveraging a data-driven mechanism and exploiting automatically extracted multi-scale image features that combine both the local and global characteristics of the image. To the best of our knowledge, this is among the first applications of a deep learning strategy to domestic-animal behavior recognition. To verify its performance, we conducted experiments comparing state-of-the-art methods with our method. The experimental results show that our approach outperforms the state-of-the-art methods in both effectiveness and efficiency: the proposed CNN architecture recognizes flock behavior of chickens with an accuracy of 99.17%.


Author(s):  
Yuliang Liu ◽  
Sheng Zhang ◽  
Lianwen Jin ◽  
Lele Xie ◽  
Yaqiang Wu ◽  
...  

Scene text in the wild commonly exhibits highly variable characteristics, and quadrilateral bounding boxes are nearly indispensable for localizing text instances in detection methods. However, recent research reveals that introducing quadrilateral bounding boxes for scene text detection brings an easily overlooked label-confusion issue that may significantly undermine detection performance. To address this issue, we propose a novel method called Sequential-free Box Discretization (SBD), which discretizes the bounding box into key edges (KEs) from which further, more effective methods can be derived to improve detection performance. Experiments show that the proposed method outperforms state-of-the-art methods on many popular scene text benchmarks, including ICDAR 2015, MLT, and MSRA-TD500. An ablation study also shows that simply integrating SBD into the Mask R-CNN framework substantially improves detection performance. Furthermore, an experiment on the general object dataset HRSC2016 (multi-oriented ships) shows that our method outperforms recent state-of-the-art methods by a large margin, demonstrating its strong generalization ability.
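The label-confusion issue arises because the four corner points of a quadrilateral must be assigned an order, and different orderings label the same box differently. A toy version of the key-edge idea replaces the ordered corners with order-invariant sorted coordinates; this is a deliberate simplification of the paper's KE parameterization, meant only to show the invariance.

```python
def key_edges(quad):
    """Discretize a quadrilateral into order-invariant 'key edges': the
    sorted x and sorted y coordinates of its four corners. Any permutation
    of the corner list yields the same representation, sidestepping the
    corner-ordering (label-confusion) ambiguity."""
    xs = sorted(p[0] for p in quad)
    ys = sorted(p[1] for p in quad)
    return xs, ys
```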


Author(s):  
In Seop Na ◽  
Chung Tran ◽  
Dung Nguyen ◽  
Sang Dinh

Pose-invariant face recognition refers to the problem of identifying or verifying a person by analyzing face images captured from different poses. This problem is challenging due to the large variation of pose, illumination, and facial expression. A promising approach to dealing with pose variation is to complete the partial UV maps extracted from in-the-wild faces, attach the completed UV map to a fitted 3D mesh, and finally generate 2D faces at arbitrary poses. The synthesized faces increase the pose variation for training deep face recognition models and reduce the pose discrepancy during the testing phase. In this paper, we propose a novel generative model called Attention ResCUNet-GAN to improve UV map completion. We enhance the original UV-GAN by using a pair of U-Nets, where the skip connections within each U-Net are boosted by attention gates and the features from the two U-Nets are fused with trainable scalar weights. Experiments on popular benchmarks, including the Multi-PIE, LFW, CPLFW, and CFP datasets, show that the proposed method yields superior performance compared to existing methods.
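An attention gate on a skip connection can be sketched as below. This is a generic additive attention gate in dense (per-pixel matrix-multiply) form, with convolutions replaced by shared linear maps for brevity; the weight shapes and the ReLU/sigmoid choices are illustrative, not the paper's exact architecture.

```python
import numpy as np

def attention_gate(skip, gate, w_s, w_g, w_psi):
    """Additive attention gate on a U-Net skip connection: a gating signal
    from the decoder re-weights the skip features before they are passed on.
    skip: (h, w, c), gate: (h, w, cg); returns gated skip and coefficients."""
    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))
    q = np.maximum(skip @ w_s + gate @ w_g, 0.0)  # ReLU(W_s x + W_g g)
    alpha = sigmoid(q @ w_psi)                    # per-pixel coefficients in [0, 1]
    return skip * alpha, alpha
```

The gate learns to suppress skip-connection activations in regions the decoder deems irrelevant, which is the "boosted skip connection" behavior the abstract refers to.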


2021 ◽  
Vol 6 (1) ◽  
pp. 1-4
Author(s):  
Zobeir Raisi ◽  
Mohamed A. Naiel ◽  
Paul Fieguth ◽  
Steven Wardell ◽  
John Zelek

Recent state-of-the-art scene text recognition methods are primarily based on Recurrent Neural Networks (RNNs); however, these methods require one-dimensional (1D) features and are not designed for recognizing irregular text instances, owing to the loss of the spatial information present in the original two-dimensional (2D) images. In this paper, we leverage a Transformer-based architecture for recognizing both regular and irregular text-in-the-wild images. The proposed method uses a 2D positional encoder with the Transformer architecture to preserve the spatial information of 2D image features better than previous methods. Experiments on popular benchmarks, including the challenging COCO-Text dataset, demonstrate that the proposed scene text recognition method outperforms the state of the art in most cases, especially on irregular-text recognition.
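A common construction of such a 2D positional encoder splits the channels between sinusoids of the row index and sinusoids of the column index, so each feature-map location gets a unique code. The sketch below shows one standard variant; the paper's exact encoder may differ.

```python
import numpy as np

def positional_encoding_2d(h, w, d_model):
    """Sinusoidal 2D positional encoding: the first d_model/2 channels encode
    the row index, the second half the column index (one common variant)."""
    assert d_model % 4 == 0, "need channels divisible by 4 for sin/cos pairs"
    d = d_model // 2
    pe = np.zeros((h, w, d_model))
    div = np.exp(np.arange(0, d, 2) * (-np.log(10000.0) / d))  # frequencies
    pos_y = np.arange(h)[:, None] * div  # (h, d/2)
    pos_x = np.arange(w)[:, None] * div  # (w, d/2)
    pe[..., 0:d:2] = np.sin(pos_y)[:, None, :]   # row sin channels
    pe[..., 1:d:2] = np.cos(pos_y)[:, None, :]   # row cos channels
    pe[..., d::2] = np.sin(pos_x)[None, :, :]    # column sin channels
    pe[..., d + 1::2] = np.cos(pos_x)[None, :, :]  # column cos channels
    return pe
```

The encoding is added to the flattened 2D feature map before the Transformer layers, so attention can distinguish positions in both axes rather than only along a 1D sequence.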


2020 ◽  
Vol 2020 ◽  
pp. 1-9
Author(s):  
Mohammed Alghaili ◽  
Zhiyong Li ◽  
Hamdi A. R. Ali

Although significant advances have been made recently in the field of face recognition, limitations remain, especially when faces are captured in different poses or under different levels of illumination, or when the face is blurred. In this study, we present a system that can identify an individual under all of these conditions by extracting the most important features and using them to identify a person. Our method uses a deep convolutional network trained to extract the most important features. A filter then selects the most significant of these features by finding the features greater than zero, storing their indices, and comparing the features of other identities at the same indices as those of the original image. Finally, the selected features of each identity in the dataset are subtracted from the features of the original image, and the identity with the minimum difference is returned. This method gives good results because we extract only the most important features, using the filter to recognize the face in different poses. We achieve state-of-the-art face recognition performance using only half of the 128 bytes per face. The system has an accuracy of 99.7% on the Labeled Faces in the Wild dataset and 94.02% on YouTube Faces DB.
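The filter-and-subtract matching step can be sketched as follows. This assumes the embeddings have already been produced by a trained network; it is a reading of the description above (keep the query's strictly positive dimensions, compare all identities on those same indices, pick the minimum difference), not the authors' code.

```python
import numpy as np

def match_identity(query_feat, gallery):
    """Keep only the query's strictly positive feature dimensions, then
    compare every gallery identity on those same indices; the identity
    with the smallest summed absolute difference wins."""
    idx = np.flatnonzero(query_feat > 0)  # indices of significant features
    q = query_feat[idx]
    scores = {name: np.abs(feat[idx] - q).sum() for name, feat in gallery.items()}
    return min(scores, key=scores.get)
```

Restricting the comparison to the query's active dimensions is what lets the method use roughly half of the 128-dimensional embedding per face.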

