scholarly journals Self-Erasing Network for Person Re-Identification

Sensors ◽  
2021 ◽  
Vol 21 (13) ◽  
pp. 4262
Author(s):  
Xinyue Fan ◽  
Yang Lin ◽  
Chaoxi Zhang ◽  
Jia Zhang

Person re-identification (ReID) plays an important role in intelligent surveillance and receives widespread attention from academics and the industry. Due to extreme changes in viewing angles, some discriminative local regions are suppressed. In addition, the data with similar backgrounds collected by a fixed viewing angle camera will also affect the model’s ability to distinguish a person. Therefore, we need to discover more fine-grained information to form the overall characteristics of each identity. The proposed self-erasing network structure composed of three branches benefits the extraction of global information, the suppression of background noise and the mining of local information. The two self-erasing strategies that we proposed encourage the network to focus on foreground information and strengthen the model’s ability to encode weak features so as to form more effective and richer visual cues of a person. Extensive experiments show that the proposed method is competitive with the advanced methods and achieves state-of-the-art performance on DukeMTMC-ReID and CUHK-03(D) datasets. Furthermore, it can be seen from the activation map that the proposed method is beneficial to spread the attention to the whole body. Both metrics and the activation map validate the effectiveness of our proposed method.

Author(s):  
Xue Lin ◽  
Qi Zou ◽  
Xixia Xu

Human-object interaction (HOI) detection is important to understand human-centric scenes and is challenging due to subtle difference between fine-grained actions, and multiple co-occurring interactions. Most approaches tackle the problems by considering the multi-stream information and even introducing extra knowledge, which suffer from a huge combination space and the non-interactive pair domination problem. In this paper, we propose an Action-Guided attention mining and Relation Reasoning (AGRR) network to solve the problems. Relation reasoning on human-object pairs is performed by exploiting contextual compatibility consistency among pairs to filter out the non-interactive combinations. To better discriminate the subtle difference between fine-grained actions, an action-aware attention based on class activation map is proposed to mine the most relevant features for recognizing HOIs. Extensive experiments on V-COCO and HICO-DET datasets demonstrate the effectiveness of the proposed model compared with the state-of-the-art approaches.


2020 ◽  
Vol 6 (4) ◽  
pp. 477-487
Author(s):  
Ding-Nan Zou ◽  
Song-Hai Zhang ◽  
Tai-Jiang Mu ◽  
Min Zhang

AbstractIn this paper, we introduce an image dataset for fine-grained classification of dog breeds: the Tsinghua Dogs Dataset. It is currently the largest dataset for fine-grained classification of dogs, including 130 dog breeds and 70,428 real-world images. It has only one dog in each image and provides annotated bounding boxes for the whole body and head. In comparison to previous similar datasets, it contains more breeds and more carefully chosen images for each breed. The diversity within each breed is greater, with between 200 and 7000+ images for each breed. Annotation of the whole body and head makes the dataset not only suitable for the improvement of finegrained image classification models based on overall features, but also for those locating local informative parts. We show that dataset provides a tough challenge by benchmarking several state-of-the-art deep neural models. The dataset is available for academic purposes at https://cg.cs.tsinghua.edu.cn/ThuDogs/.


Author(s):  
Mengzhou Xia ◽  
Guoping Huang ◽  
Lemao Liu ◽  
Shuming Shi

A translation memory (TM) is proved to be helpful to improve neural machine translation (NMT). Existing approaches either pursue the decoding efficiency by merely accessing local information in a TM or encode the global information in a TM yet sacrificing efficiency due to redundancy. We propose an efficient approach to making use of the global information in a TM. The key idea is to pack a redundant TM into a compact graph and perform additional attention mechanisms over the packed graph for integrating the TM representation into the decoding network. We implement the model by extending the state-of-the-art NMT, Transformer. Extensive experiments on three language pairs show that the proposed approach is efficient in terms of running time and space occupation, and particularly it outperforms multiple strong baselines in terms of BLEU scores.


2019 ◽  
Vol 9 (2) ◽  
pp. 301 ◽  
Author(s):  
Xinghua Dai ◽  
Shengrong Gong ◽  
Shan Zhong ◽  
Zongming Bao

One of the challenges in fine-grained classification is that subcategories with significant similarity are hard to be distinguished due to the equal treatment of all subcategories in existing algorithms. In order to solve this problem, a fine-grained image classification method by combining a bilinear convolutional neural network (B-CNN) and the measurement of subcategory similarities is proposed. Firstly, an improved weakly supervised localization method is designed to obtain the bounding box of the main object, which allows the model to eliminate the influence of background noise and obtain more accurate features. Then, sample features in the training set are computed by B-CNN so that the fuzzing similarity matrix for measuring interclass similarities can be obtained. To further improve classification accuracy, the loss function is designed by weighting triplet loss and softmax loss. Extensive experiments implemented on two benchmarks datasets, Stanford Cars-196 and Caltech-UCSD Birds-200-2011 (CUB-200-2011), show that the newly proposed method outperforms in accuracy several state-of-the-art weakly supervised classification models.


1995 ◽  
Vol 38 (5) ◽  
pp. 1126-1142 ◽  
Author(s):  
Jeffrey W. Gilger

This paper is an introduction to behavioral genetics for researchers and practioners in language development and disorders. The specific aims are to illustrate some essential concepts and to show how behavioral genetic research can be applied to the language sciences. Past genetic research on language-related traits has tended to focus on simple etiology (i.e., the heritability or familiality of language skills). The current state of the art, however, suggests that great promise lies in addressing more complex questions through behavioral genetic paradigms. In terms of future goals it is suggested that: (a) more behavioral genetic work of all types should be done—including replications and expansions of preliminary studies already in print; (b) work should focus on fine-grained, theory-based phenotypes with research designs that can address complex questions in language development; and (c) work in this area should utilize a variety of samples and methods (e.g., twin and family samples, heritability and segregation analyses, linkage and association tests, etc.).


Author(s):  
Nicolas Poirel ◽  
Claire Sara Krakowski ◽  
Sabrina Sayah ◽  
Arlette Pineau ◽  
Olivier Houdé ◽  
...  

The visual environment consists of global structures (e.g., a forest) made up of local parts (e.g., trees). When compound stimuli are presented (e.g., large global letters composed of arrangements of small local letters), the global unattended information slows responses to local targets. Using a negative priming paradigm, we investigated whether inhibition is required to process hierarchical stimuli when information at the local level is in conflict with the one at the global level. The results show that when local and global information is in conflict, global information must be inhibited to process local information, but that the reverse is not true. This finding has potential direct implications for brain models of visual recognition, by suggesting that when local information is conflicting with global information, inhibitory control reduces feedback activity from global information (e.g., inhibits the forest) which allows the visual system to process local information (e.g., to focus attention on a particular tree).


Author(s):  
Megha Chhabra ◽  
Manoj Kumar Shukla ◽  
Kiran Kumar Ravulakollu

: Latent fingerprints are unintentional finger skin impressions left as ridge patterns at crime scenes. A major challenge in latent fingerprint forensics is the poor quality of the lifted image from the crime scene. Forensics investigators are in permanent search of novel outbreaks of the effective technologies to capture and process low quality image. The accuracy of the results depends upon the quality of the image captured in the beginning, metrics used to assess the quality and thereafter level of enhancement required. The low quality of the image collected by low quality scanners, unstructured background noise, poor ridge quality, overlapping structured noise result in detection of false minutiae and hence reduce the recognition rate. Traditionally, Image segmentation and enhancement is partially done manually using help of highly skilled experts. Using automated systems for this work, differently challenging quality of images can be investigated faster. This survey amplifies the comparative study of various segmentation techniques available for latent fingerprint forensics.


Birds ◽  
2021 ◽  
Vol 2 (1) ◽  
pp. 138-146
Author(s):  
Eduardo J. Rodríguez-Rodríguez ◽  
Juan J. Negro

The family Ciconiidae comprises 19 extant species which are highly social when nesting and foraging. All species share similar morphotypes, with long necks, a bill, and legs, and are mostly coloured in the achromatic spectrum (white, black, black, and white, or shades of grey). Storks may have, however, brightly coloured integumentary areas in, for instance, the bill, legs, or the eyes. These chromatic patches are small in surface compared with the whole body. We have analyzed the conservatism degree of colouration in 10 body areas along an all-species stork phylogeny derived from BirdTRee using Geiger models. We obtained low conservatism in frontal areas (head and neck), contrasting with a high conservatism in the rest of the body. The frontal areas tend to concentrate the chromatic spectrum whereas the rear areas, much larger in surface, are basically achromatic. These results lead us to suggest that the divergent evolution of the colouration of frontal areas is related to species recognition through visual cue assessment in the short-range, when storks form mixed-species flocks in foraging or resting areas.


Sensors ◽  
2021 ◽  
Vol 21 (13) ◽  
pp. 4486
Author(s):  
Niall O’Mahony ◽  
Sean Campbell ◽  
Lenka Krpalkova ◽  
Anderson Carvalho ◽  
Joseph Walsh ◽  
...  

Fine-grained change detection in sensor data is very challenging for artificial intelligence though it is critically important in practice. It is the process of identifying differences in the state of an object or phenomenon where the differences are class-specific and are difficult to generalise. As a result, many recent technologies that leverage big data and deep learning struggle with this task. This review focuses on the state-of-the-art methods, applications, and challenges of representation learning for fine-grained change detection. Our research focuses on methods of harnessing the latent metric space of representation learning techniques as an interim output for hybrid human-machine intelligence. We review methods for transforming and projecting embedding space such that significant changes can be communicated more effectively and a more comprehensive interpretation of underlying relationships in sensor data is facilitated. We conduct this research in our work towards developing a method for aligning the axes of latent embedding space with meaningful real-world metrics so that the reasoning behind the detection of change in relation to past observations may be revealed and adjusted. This is an important topic in many fields concerned with producing more meaningful and explainable outputs from deep learning and also for providing means for knowledge injection and model calibration in order to maintain user confidence.


Author(s):  
Anil S. Baslamisli ◽  
Partha Das ◽  
Hoang-An Le ◽  
Sezer Karaoglu ◽  
Theo Gevers

AbstractIn general, intrinsic image decomposition algorithms interpret shading as one unified component including all photometric effects. As shading transitions are generally smoother than reflectance (albedo) changes, these methods may fail in distinguishing strong photometric effects from reflectance variations. Therefore, in this paper, we propose to decompose the shading component into direct (illumination) and indirect shading (ambient light and shadows) subcomponents. The aim is to distinguish strong photometric effects from reflectance variations. An end-to-end deep convolutional neural network (ShadingNet) is proposed that operates in a fine-to-coarse manner with a specialized fusion and refinement unit exploiting the fine-grained shading model. It is designed to learn specific reflectance cues separated from specific photometric effects to analyze the disentanglement capability. A large-scale dataset of scene-level synthetic images of outdoor natural environments is provided with fine-grained intrinsic image ground-truths. Large scale experiments show that our approach using fine-grained shading decompositions outperforms state-of-the-art algorithms utilizing unified shading on NED, MPI Sintel, GTA V, IIW, MIT Intrinsic Images, 3DRMS and SRD datasets.


Sign in / Sign up

Export Citation Format

Share Document