Self-Erasing Network for Person Re-Identification

Xinyue Fan; Yang Lin; Chaoxi Zhang; Jia Zhang

doi:10.3390/s21134262

Sensors ◽

10.3390/s21134262 ◽

2021 ◽

Vol 21 (13) ◽

pp. 4262

Author(s):

Xinyue Fan ◽

Yang Lin ◽

Chaoxi Zhang ◽

Jia Zhang

Keyword(s):

Background Noise ◽

Visual Cues ◽

State Of The Art ◽

Local Information ◽

Whole Body ◽

Global Information ◽

Viewing Angle ◽

Intelligent Surveillance ◽

Fine Grained ◽

Activation Map

Person re-identification (ReID) plays an important role in intelligent surveillance and receives widespread attention from academia and industry. Due to extreme changes in viewing angles, some discriminative local regions are suppressed. In addition, data with similar backgrounds collected by a fixed-viewing-angle camera also affects the model’s ability to distinguish a person. Therefore, we need to discover more fine-grained information to form the overall characteristics of each identity. The proposed self-erasing network, composed of three branches, benefits the extraction of global information, the suppression of background noise and the mining of local information. The two self-erasing strategies that we propose encourage the network to focus on foreground information and strengthen the model’s ability to encode weak features so as to form more effective and richer visual cues of a person. Extensive experiments show that the proposed method is competitive with advanced methods and achieves state-of-the-art performance on the DukeMTMC-ReID and CUHK-03(D) datasets. Furthermore, the activation map shows that the proposed method helps spread attention over the whole body. Both the metrics and the activation map validate the effectiveness of our proposed method.


Action-Guided Attention Mining and Relation Reasoning Network for Human-Object Interaction Detection

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/154 ◽

2020 ◽

Author(s):

Xue Lin ◽

Qi Zou ◽

Xixia Xu

Keyword(s):

State Of The Art ◽

Subtle Difference ◽

Fine Grained ◽

Interaction Detection ◽

Object Interaction ◽

Proposed Model ◽

Human Object ◽

Domination Problem ◽

Combination Space ◽

Activation Map

Human-object interaction (HOI) detection is important for understanding human-centric scenes and is challenging due to subtle differences between fine-grained actions and multiple co-occurring interactions. Most approaches tackle these problems by considering multi-stream information and even introducing extra knowledge, and so suffer from a huge combination space and the non-interactive pair domination problem. In this paper, we propose an Action-Guided attention mining and Relation Reasoning (AGRR) network to solve these problems. Relation reasoning on human-object pairs is performed by exploiting contextual compatibility consistency among pairs to filter out the non-interactive combinations. To better discriminate the subtle differences between fine-grained actions, an action-aware attention based on the class activation map is proposed to mine the most relevant features for recognizing HOIs. Extensive experiments on the V-COCO and HICO-DET datasets demonstrate the effectiveness of the proposed model compared with state-of-the-art approaches.
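For readers unfamiliar with the class activation map (CAM) mentioned in the abstract above, the following is a minimal sketch of the standard CAM computation: a weighted sum of the final convolutional feature maps, using the classifier weights of one class. It is not the AGRR network's attention module, whose details the abstract does not give. The function names `class_activation_map` and `normalize` and the toy inputs are assumptions made for illustration.

```python
def class_activation_map(feature_maps, class_weights):
    """Standard CAM: sum over channels k of class_weights[k] * feature_maps[k].

    feature_maps: list of K maps, each an H x W list of lists (hypothetical toy input).
    class_weights: list of K classifier weights for a single class.
    """
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * fmap[i][j]
    return cam


def normalize(cam):
    """Min-max scale a map to [0, 1] so it can act as a soft attention mask."""
    lo = min(min(row) for row in cam)
    hi = max(max(row) for row in cam)
    if hi == lo:
        # A constant map carries no spatial preference.
        return [[0.0 for _ in row] for row in cam]
    return [[(value - lo) / (hi - lo) for value in row] for row in cam]


# Two 2x2 feature maps and the weights of one (hypothetical) action class.
toy_maps = [[[1.0, 0.0], [0.0, 0.0]],
            [[0.0, 0.0], [0.0, 2.0]]]
toy_weights = [1.0, 0.5]
attention = normalize(class_activation_map(toy_maps, toy_weights))
```

Positions with high values in `attention` are those that contribute most to the chosen class score. How such a map is then combined with the features in the actual model is not specified in the abstract.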


A new dataset of dog breed images and a benchmark for finegrained classification

Computational Visual Media ◽

10.1007/s41095-020-0184-6 ◽

2020 ◽

Vol 6 (4) ◽

pp. 477-487

Author(s):

Ding-Nan Zou ◽

Song-Hai Zhang ◽

Tai-Jiang Mu ◽

Min Zhang

Keyword(s):

Real World ◽

State Of The Art ◽

Whole Body ◽

Classification Models ◽

Neural Models ◽

Fine Grained ◽

Image Dataset ◽

Dog Breed ◽

Bounding Boxes

In this paper, we introduce an image dataset for fine-grained classification of dog breeds: the Tsinghua Dogs Dataset. It is currently the largest dataset for fine-grained classification of dogs, including 130 dog breeds and 70,428 real-world images. It has only one dog in each image and provides annotated bounding boxes for the whole body and head. In comparison to previous similar datasets, it contains more breeds and more carefully chosen images for each breed. The diversity within each breed is greater, with between 200 and 7000+ images for each breed. Annotation of the whole body and head makes the dataset suitable not only for improving fine-grained image classification models based on overall features, but also for those that locate local informative parts. We show that the dataset provides a tough challenge by benchmarking several state-of-the-art deep neural models. The dataset is available for academic purposes at https://cg.cs.tsinghua.edu.cn/ThuDogs/.


Graph Based Translation Memory for Neural Machine Translation

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33017297 ◽

2019 ◽

Vol 33 ◽

pp. 7297-7304 ◽

Cited By ~ 4

Author(s):

Mengzhou Xia ◽

Guoping Huang ◽

Lemao Liu ◽

Shuming Shi

Keyword(s):

Machine Translation ◽

State Of The Art ◽

Local Information ◽

Global Information ◽

Translation Memory ◽

Neural Machine Translation ◽

Efficient Approach ◽

Compact Graph ◽

Decoding Efficiency ◽

Space Occupation

A translation memory (TM) has been shown to help improve neural machine translation (NMT). Existing approaches either pursue decoding efficiency by merely accessing local information in a TM or encode the global information in a TM at the cost of efficiency due to redundancy. We propose an efficient approach to making use of the global information in a TM. The key idea is to pack a redundant TM into a compact graph and perform additional attention mechanisms over the packed graph for integrating the TM representation into the decoding network. We implement the model by extending the state-of-the-art NMT, Transformer. Extensive experiments on three language pairs show that the proposed approach is efficient in terms of running time and space occupation, and in particular that it outperforms multiple strong baselines in terms of BLEU scores.


Bilinear CNN Model for Fine-Grained Classification Based on Subcategory-Similarity Measurement

Applied Sciences ◽

10.3390/app9020301 ◽

2019 ◽

Vol 9 (2) ◽

pp. 301 ◽

Cited By ~ 1

Author(s):

Xinghua Dai ◽

Shengrong Gong ◽

Shan Zhong ◽

Zongming Bao

Keyword(s):

Background Noise ◽

Supervised Classification ◽

State Of The Art ◽

Equal Treatment ◽

Training Set ◽

Significant Similarity ◽

Fine Grained ◽

Triplet Loss ◽

Weakly Supervised ◽

Weakly Supervised Classification

One of the challenges in fine-grained classification is that subcategories with significant similarity are hard to distinguish because existing algorithms treat all subcategories equally. To solve this problem, a fine-grained image classification method that combines a bilinear convolutional neural network (B-CNN) with the measurement of subcategory similarities is proposed. Firstly, an improved weakly supervised localization method is designed to obtain the bounding box of the main object, which allows the model to eliminate the influence of background noise and obtain more accurate features. Then, sample features in the training set are computed by B-CNN so that the fuzzing similarity matrix for measuring interclass similarities can be obtained. To further improve classification accuracy, the loss function is designed by weighting triplet loss and softmax loss. Extensive experiments on two benchmark datasets, Stanford Cars-196 and Caltech-UCSD Birds-200-2011 (CUB-200-2011), show that the newly proposed method outperforms several state-of-the-art weakly supervised classification models in accuracy.
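The abstract above says the loss is formed by weighting triplet loss and softmax loss, but it does not give the exact form. A minimal sketch of one common way to do this is shown below. The convex combination, the `weight` and `margin` parameters, the squared Euclidean distance, and all function names are assumptions made for illustration, not the authors' formulation.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against an integer class label."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[label]


def squared_distance(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss pushing the negative at least `margin` farther than the positive."""
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


def combined_loss(logits, label, anchor, positive, negative, weight=0.5, margin=1.0):
    """Weighted sum of the two terms; `weight` is a hypothetical hyperparameter."""
    metric_term = triplet_loss(anchor, positive, negative, margin)
    class_term = softmax_cross_entropy(logits, label)
    return weight * metric_term + (1.0 - weight) * class_term
```

With `weight=0.0` this reduces to plain softmax classification, and with `weight=1.0` to pure metric learning. Intermediate values trade class separation against embedding-space margins between similar subcategories.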


Behavioral Genetics: Concepts for Research and Practice in Language Development and Disorders

Journal of Speech Language and Hearing Research ◽

10.1044/jshr.3805.1126 ◽

1995 ◽

Vol 38 (5) ◽

pp. 1126-1142 ◽

Cited By ~ 14

Author(s):

Jeffrey W. Gilger

Keyword(s):

Language Development ◽

Behavioral Genetics ◽

State Of The Art ◽

Genetic Research ◽

Great Promise ◽

Behavioral Genetic ◽

Fine Grained ◽

Future Goals ◽

Current State ◽

Research Designs

This paper is an introduction to behavioral genetics for researchers and practitioners in language development and disorders. The specific aims are to illustrate some essential concepts and to show how behavioral genetic research can be applied to the language sciences. Past genetic research on language-related traits has tended to focus on simple etiology (i.e., the heritability or familiality of language skills). The current state of the art, however, suggests that great promise lies in addressing more complex questions through behavioral genetic paradigms. In terms of future goals it is suggested that: (a) more behavioral genetic work of all types should be done, including replications and expansions of preliminary studies already in print; (b) work should focus on fine-grained, theory-based phenotypes with research designs that can address complex questions in language development; and (c) work in this area should utilize a variety of samples and methods (e.g., twin and family samples, heritability and segregation analyses, linkage and association tests, etc.).


Do You Want to See the Tree? Ignore the Forest

Experimental Psychology (formerly Zeitschrift für Experimentelle Psychologie) ◽

10.1027/1618-3169/a000240 ◽

2014 ◽

Vol 61 (3) ◽

pp. 205-214 ◽

Cited By ~ 12

Author(s):

Nicolas Poirel ◽

Claire Sara Krakowski ◽

Sabrina Sayah ◽

Arlette Pineau ◽

Olivier Houdé ◽

...

Keyword(s):

Visual Recognition ◽

Local Level ◽

Local Information ◽

Visual Environment ◽

Global Information ◽

Focus Attention ◽

Global Level ◽

Hierarchical Stimuli ◽

Priming Paradigm ◽

The One

The visual environment consists of global structures (e.g., a forest) made up of local parts (e.g., trees). When compound stimuli are presented (e.g., large global letters composed of arrangements of small local letters), the global unattended information slows responses to local targets. Using a negative priming paradigm, we investigated whether inhibition is required to process hierarchical stimuli when information at the local level is in conflict with that at the global level. The results show that when local and global information is in conflict, global information must be inhibited to process local information, but that the reverse is not true. This finding has potential direct implications for brain models of visual recognition, by suggesting that when local information conflicts with global information, inhibitory control reduces feedback activity from global information (e.g., inhibits the forest), which allows the visual system to process local information (e.g., to focus attention on a particular tree).


State-of-the-Art: A Systematic Literature Review on Image Segmentation in Latent Fingerprint Forensics

Recent Patents on Computer Science ◽

10.2174/2213275912666190429153952 ◽

2019 ◽

Vol 12 ◽

Cited By ~ 2

Author(s):

Megha Chhabra ◽

Manoj Kumar Shukla ◽

Kiran Kumar Ravulakollu

Keyword(s):

Image Segmentation ◽

Background Noise ◽

State Of The Art ◽

Recognition Rate ◽

Poor Quality ◽

Latent Fingerprint ◽

In The Beginning ◽

The Comparative Study ◽

Finger Skin

Latent fingerprints are unintentional finger skin impressions left as ridge patterns at crime scenes. A major challenge in latent fingerprint forensics is the poor quality of the image lifted from the crime scene. Forensic investigators are constantly searching for novel, effective technologies to capture and process low-quality images. The accuracy of the results depends on the quality of the image captured in the beginning, the metrics used to assess that quality and, thereafter, the level of enhancement required. Images collected by low-quality scanners, unstructured background noise, poor ridge quality and overlapping structured noise result in the detection of false minutiae and hence reduce the recognition rate. Traditionally, image segmentation and enhancement are done partly by hand with the help of highly skilled experts. With automated systems for this work, images of varying and challenging quality can be investigated faster. This survey presents a comparative study of the various segmentation techniques available for latent fingerprint forensics.


Integumentary Colour Allocation in the Stork Family (Ciconiidae) Reveals Short-Range Visual Cues for Species Recognition

Birds ◽

10.3390/birds2010010 ◽

2021 ◽

Vol 2 (1) ◽

pp. 138-146

Author(s):

Eduardo J. Rodríguez-Rodríguez ◽

Juan J. Negro

Keyword(s):

Short Range ◽

Visual Cues ◽

Species Recognition ◽

The Body ◽

Whole Body ◽

Divergent Evolution ◽

Black And White ◽

Extant Species ◽

Chromatic Spectrum ◽

Mixed Species Flocks

The family Ciconiidae comprises 19 extant species which are highly social when nesting and foraging. All species share similar morphotypes, with long necks, bills, and legs, and are mostly coloured in the achromatic spectrum (white, black, black and white, or shades of grey). Storks may have, however, brightly coloured integumentary areas in, for instance, the bill, legs, or the eyes. These chromatic patches are small in surface compared with the whole body. We have analyzed the degree of conservatism of colouration in 10 body areas along an all-species stork phylogeny derived from BirdTree using Geiger models. We obtained low conservatism in frontal areas (head and neck), contrasting with a high conservatism in the rest of the body. The frontal areas tend to concentrate the chromatic spectrum whereas the rear areas, much larger in surface, are basically achromatic. These results lead us to suggest that the divergent evolution of the colouration of frontal areas is related to species recognition through visual cue assessment at short range, when storks form mixed-species flocks in foraging or resting areas.


Representation Learning for Fine-Grained Change Detection

Sensors ◽

10.3390/s21134486 ◽

2021 ◽

Vol 21 (13) ◽

pp. 4486

Author(s):

Niall O’Mahony ◽

Sean Campbell ◽

Lenka Krpalkova ◽

Anderson Carvalho ◽

Joseph Walsh ◽

...

Keyword(s):

Deep Learning ◽

Change Detection ◽

Model Calibration ◽

State Of The Art ◽

Representation Learning ◽

Machine Intelligence ◽

The State ◽

Sensor Data ◽

Fine Grained ◽

Learning Techniques

Fine-grained change detection in sensor data is very challenging for artificial intelligence though it is critically important in practice. It is the process of identifying differences in the state of an object or phenomenon where the differences are class-specific and are difficult to generalise. As a result, many recent technologies that leverage big data and deep learning struggle with this task. This review focuses on the state-of-the-art methods, applications, and challenges of representation learning for fine-grained change detection. Our research focuses on methods of harnessing the latent metric space of representation learning techniques as an interim output for hybrid human-machine intelligence. We review methods for transforming and projecting embedding space such that significant changes can be communicated more effectively and a more comprehensive interpretation of underlying relationships in sensor data is facilitated. We conduct this research in our work towards developing a method for aligning the axes of latent embedding space with meaningful real-world metrics so that the reasoning behind the detection of change in relation to past observations may be revealed and adjusted. This is an important topic in many fields concerned with producing more meaningful and explainable outputs from deep learning and also for providing means for knowledge injection and model calibration in order to maintain user confidence.


ShadingNet: Image Intrinsics by Fine-Grained Shading Decomposition

International Journal of Computer Vision ◽

10.1007/s11263-021-01477-5 ◽

2021 ◽

Author(s):

Anil S. Baslamisli ◽

Partha Das ◽

Hoang-An Le ◽

Sezer Karaoglu ◽

Theo Gevers

Keyword(s):

Neural Network ◽

Large Scale ◽

State Of The Art ◽

Image Decomposition ◽

Natural Environments ◽

Decomposition Algorithms ◽

Ambient Light ◽

Fine Grained ◽

Large Scale Dataset ◽

Direct Illumination

In general, intrinsic image decomposition algorithms interpret shading as one unified component including all photometric effects. As shading transitions are generally smoother than reflectance (albedo) changes, these methods may fail in distinguishing strong photometric effects from reflectance variations. Therefore, in this paper, we propose to decompose the shading component into direct (illumination) and indirect shading (ambient light and shadows) subcomponents. The aim is to distinguish strong photometric effects from reflectance variations. An end-to-end deep convolutional neural network (ShadingNet) is proposed that operates in a fine-to-coarse manner with a specialized fusion and refinement unit exploiting the fine-grained shading model. It is designed to learn specific reflectance cues separated from specific photometric effects to analyze the disentanglement capability. A large-scale dataset of scene-level synthetic images of outdoor natural environments is provided with fine-grained intrinsic image ground-truths. Large scale experiments show that our approach using fine-grained shading decompositions outperforms state-of-the-art algorithms utilizing unified shading on NED, MPI Sintel, GTA V, IIW, MIT Intrinsic Images, 3DRMS and SRD datasets.
