Video-Based Person Re-Identification With Unregulated Sequences

2020 ◽  
Vol 12 (2) ◽  
pp. 59-76
Author(s):  
Wenjun Huang ◽  
Chao Liang ◽  
Chunxia Xiao ◽  
Zhen Han

Video-based person re-identification (re-id) has recently attracted widespread attention because the extra space-time information and additional appearance cues in videos can be used to improve the performance of image-based person re-id. Most existing approaches treat person video images equally, ignoring their individual discrepancies. However, in real scenarios, captured images are usually contaminated by various noises, especially occlusions, resulting in a series of unregulated sequences. By investigating the impact of unregulated sequences on the feature representation of video-based person re-id, the authors find a remarkable improvement from eliminating noisy subsequences. Based on this finding, an adaptive unregulated subsequence detection and refinement method is proposed to purify the original video sequence and obtain a more effective and discriminative feature representation for video-based person re-id. Experimental results on two public datasets demonstrate that the proposed method outperforms state-of-the-art work.
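The purification idea can be illustrated with a minimal sketch: split a track into fixed-length subsequences, score each for noise, discard the noisy ones, and pool the survivors. The `quality_score` function, the frame features, and the threshold below are hypothetical stand-ins, not the paper's actual detection method.

```python
# Simplified sketch: drop occluded subsequences before pooling features.
# Frame features and the occlusion-based quality score are illustrative.

def split_subsequences(frames, length=4):
    """Chop a frame list into non-overlapping subsequences."""
    return [frames[i:i + length] for i in range(0, len(frames), length)]

def refine_and_pool(frames, quality_score, threshold=0.5, length=4):
    """Keep subsequences whose mean quality clears the threshold,
    then average-pool the surviving frame features element-wise."""
    kept = [
        sub for sub in split_subsequences(frames, length)
        if sum(quality_score(f) for f in sub) / len(sub) >= threshold
    ]
    pooled_frames = [f for sub in kept for f in sub]
    dim = len(pooled_frames[0]["feat"])
    return [
        sum(f["feat"][d] for f in pooled_frames) / len(pooled_frames)
        for d in range(dim)
    ]

# Toy track: a clean block followed by a heavily occluded block.
frames = [{"feat": [1.0, 0.0], "occ": 0.1}] * 4 + \
         [{"feat": [9.0, 9.0], "occ": 0.9}] * 4
score = lambda f: 1.0 - f["occ"]          # higher = cleaner frame
print(refine_and_pool(frames, score))     # occluded block is discarded
```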

Sensors ◽  
2019 ◽  
Vol 19 (18) ◽  
pp. 3861 ◽  
Author(s):  
Changxin Gao ◽  
Jin Wang ◽  
Leyuan Liu ◽  
Jin-Gang Yu ◽  
Nong Sang

Most existing person re-identification methods focus on matching still person images across non-overlapping camera views. Despite their excellent performance in some circumstances, these methods still suffer from occlusion and changes of pose, viewpoint or lighting. Video-based re-id is a natural way to overcome these problems, by exploiting space–time information from videos. One of the most challenging problems in video-based person re-identification is temporal alignment, in addition to spatial alignment. To address this problem, we propose an effective superpixel-based temporally aligned representation for video-based person re-identification, which represents a video sequence using only one walking cycle. Particularly, we first build a candidate set of walking cycles by extracting motion information at the superpixel level, which is more robust than that at the pixel level. Then, from the candidate set, we propose an effective criterion to select the walking cycle that best matches the intrinsic periodicity of walking persons. Finally, we propose a temporally aligned pooling scheme to describe the video data in the selected walking cycle. In addition, to characterize the individual still images in the cycle, we propose a superpixel-based representation to improve spatial alignment. Extensive experimental results on three public datasets demonstrate the effectiveness of the proposed method compared with the state-of-the-art approaches.
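A simplified stand-in for the walking-cycle selection step: treat per-frame motion energy (the paper derives it from superpixel-level motion, not reproduced here) as a 1-D signal, take local minima as candidate cycle boundaries, and keep the candidate whose span is closest to the dominant period found by autocorrelation. The synthetic signal below is illustrative.

```python
# Pick a walking cycle from a periodic motion-energy signal.
import math

def local_minima(signal):
    return [i for i in range(1, len(signal) - 1)
            if signal[i] < signal[i - 1] and signal[i] < signal[i + 1]]

def dominant_period(signal, max_lag=None):
    """Lag (>= 2) with the highest autocorrelation of the centered signal."""
    n = len(signal)
    mean = sum(signal) / n
    x = [s - mean for s in signal]
    max_lag = max_lag or n // 2
    def ac(lag):
        return sum(x[i] * x[i + lag] for i in range(n - lag))
    return max(range(2, max_lag + 1), key=ac)

def select_cycle(signal):
    """Adjacent-minima pair whose span best matches the dominant period."""
    minima = local_minima(signal)
    period = dominant_period(signal)
    pairs = list(zip(minima, minima[1:]))
    return min(pairs, key=lambda p: abs((p[1] - p[0]) - period))

# Synthetic motion signal: a true period of 8 frames over 32 frames.
signal = [math.sin(2 * math.pi * t / 8) for t in range(32)]
start, end = select_cycle(signal)
print(end - start)  # recovers the 8-frame cycle
```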


2021 ◽  
Vol 13 (4) ◽  
pp. 2031
Author(s):  
Fabio Grandi ◽  
Riccardo Karim Khamaisi ◽  
Margherita Peruzzini ◽  
Roberto Raffaeli ◽  
Marcello Pellicciari

Product and process digitalization is pervading numerous areas in the industry to improve quality and reduce costs. In particular, digital models enable virtual simulations to predict product and process performances, as well as to generate digital contents to improve the general workflow. Digital models can also contain additional contents (e.g., model-based design (MBD)) to provide online and on-time information about process operations and management, as well as to support operator activities. The recent developments in augmented reality (AR) offer new specific interfaces to promote the great diffusion of digital contents into industrial processes, thanks to flexible and robust applications, as well as cost-effective devices. However, the impact of AR applications on sustainability is still poorly explored in research. In this direction, this paper proposes an innovative approach to exploit MBD and introduce AR interfaces in the industry to support human-intensive processes. Indeed, in those processes, the human contribution is still crucial to guaranteeing the expected product quality (e.g., quality inspection). The paper also analyzes how this new concept can benefit sustainability and defines a set of metrics to assess the positive impact on sustainability, focusing on social aspects.


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Alina Trifan ◽  
José Luis Oliveira

Abstract With the continuous increase in the use of social networks, social mining is steadily becoming a powerful component of digital phenotyping. In this paper we explore social mining for the classification of self-diagnosed depressed users of the Reddit social network. We conduct a cross-evaluation study based on two public datasets in order to understand the impact of transfer learning when the data source is virtually the same. We further complement these results with an experiment of transfer learning in post-partum depression classification, using a corpus we have collected for this purpose. Our findings show that transfer learning in social mining might still be at an early stage in computational research, and we thoroughly discuss its implications.
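The cross-evaluation setup can be sketched as: fit a text classifier on one corpus, then test it on a corpus from a similar but distinct source. The two toy "datasets" and the bag-of-words Naive Bayes below are stand-ins, not the paper's actual features or corpora.

```python
# Minimal cross-dataset evaluation: train on corpus A, test on corpus B.
import math
from collections import Counter, defaultdict

class NaiveBayes:
    def fit(self, texts, labels):
        self.counts = defaultdict(Counter)   # label -> word counts
        self.priors = Counter(labels)
        self.vocab = set()
        for text, y in zip(texts, labels):
            words = text.lower().split()
            self.counts[y].update(words)
            self.vocab.update(words)
        return self

    def predict(self, text):
        def log_score(y):
            total = sum(self.counts[y].values()) + len(self.vocab)
            return math.log(self.priors[y]) + sum(
                math.log((self.counts[y][w] + 1) / total)  # Laplace smoothing
                for w in text.lower().split())
        return max(self.priors, key=log_score)

# Corpus A (training) and corpus B (held-out, different source), toy data.
train_texts = ["i feel hopeless and sad", "great day with friends",
               "cannot sleep feel empty", "enjoyed a lovely walk"]
train_labels = ["depressed", "control", "depressed", "control"]
test_texts = ["feel sad and empty", "lovely day with friends"]
test_labels = ["depressed", "control"]

model = NaiveBayes().fit(train_texts, train_labels)
preds = [model.predict(t) for t in test_texts]
accuracy = sum(p == y for p, y in zip(preds, test_labels)) / len(test_labels)
print(accuracy)
```

In a real study the drop from in-domain to cross-domain accuracy is the quantity of interest; here the corpora are too small to show that, only the evaluation wiring.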


2010 ◽  
Vol 16 (4) ◽  
pp. 112-121 ◽  
Author(s):  
Brennen W. Mills ◽  
Owen B. J. Carter ◽  
Robert J. Donovan

The objective of this case study was to experimentally manipulate the impact on arousal and recall of two characteristics frequently occurring in gruesome depictions of body parts in smoking cessation advertisements: the presence or absence of an external physical insult to the body part depicted; whether or not the image contains a clear figure/ground demarcation. Three hundred participants (46% male, 54% female; mean age 27.3 years, SD = 11.4) participated in a two-stage online study wherein they viewed and responded to a series of gruesome 4-s video images. Seventy-two video clips were created to provide a sample of images across the two conditions: physical insult versus no insult and clear figure/ground demarcation versus merged or no clear figure/ground demarcation. In stage one, participants viewed a randomly ordered series of 36 video clips and rated how “confronting” they considered each to be. Seven days later (stage two), to test recall of each video image, participants viewed all 72 clips and were asked to identify those they had seen previously. Images containing a physical insult were consistently rated more confronting and were remembered more accurately than images with no physical insult. Images with a clear figure/ground demarcation were rated as no more confronting but were consistently recalled with greater accuracy than those with unclear figure/ground demarcation. Makers of gruesome health warning television advertisements should incorporate some form of physical insult and use a clear figure/ground demarcation to maximize image recall and subsequent potential advertising effectiveness.


2020 ◽  
Vol 2020 ◽  
pp. 1-16
Author(s):  
Xinman Zhang ◽  
Kunlei Jing ◽  
Guokun Song

The security problems of online transactions on smartphones reveal an extreme demand for reliable identity authentication systems. Compared with face, fingerprint, and iris, palmprint offers a lower risk of forgery, richer texture, and a more comfortable acquisition mode, yet it is rarely adopted for identity authentication. In this paper, we develop an effective and full-function palmprint authentication system for application on an Android smartphone, which bridges the algorithmic study and application of palmprint authentication. In more detail, an overall system framework is designed with complete functions, including palmprint acquisition, key point location, ROI segmentation, feature extraction, and feature coding. Basically, we develop a palmprint authentication system with user-friendly interfaces and good compatibility with the Android smartphone. Particularly, on the one hand, to guarantee the effectiveness and efficiency of the system, we exploit the practical Log-Gabor filter for feature extraction and discuss the impact of filtering direction, downsampling ratio, and discriminative feature coding, proposing an improved algorithm. On the other hand, after exploring the hardware components of the smartphone and the technical development of the Android system, we provide an open technology to extend biometric methods to real-world applications. On the public PolyU databases, simulation results suggest that the improved algorithm outperforms the original one with a promising accuracy of 100% and a good speed of 0.041 seconds. In real-world authentication, the developed system achieves an accuracy of 98.40% and a speed of 0.051 seconds. All the results verify the accuracy and timeliness of the developed system.
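To make the feature-coding pipeline concrete: a Log-Gabor filter has the radial frequency response G(f) = exp(−(ln(f/f₀))² / (2·ln(σ)²)), and palmprint codes are commonly formed by binarizing filter responses and matched with a normalized Hamming distance. The parameter values and toy response vectors below are illustrative, not the paper's tuned settings.

```python
# Log-Gabor radial response, sign-based coding, Hamming matching.
import math

def log_gabor(f, f0=0.1, sigma=0.6):
    """Radial Log-Gabor response; zero at DC by construction."""
    if f <= 0:
        return 0.0
    return math.exp(-(math.log(f / f0) ** 2) / (2 * math.log(sigma) ** 2))

def binarize(responses):
    """Code each filtered value by its sign, as in bitwise palmprint codes."""
    return [1 if r >= 0 else 0 for r in responses]

def hamming(code_a, code_b):
    """Fraction of mismatched bits; 0 means identical codes."""
    return sum(a != b for a, b in zip(code_a, code_b)) / len(code_a)

# The response peaks (value 1) exactly at the centre frequency f0.
print(log_gabor(0.1))

# Toy matching: two noisy versions of the same response pattern still
# produce identical sign codes; an inverted pattern does not.
enrolled = binarize([0.8, -0.2, 0.5, -0.9, 0.3, -0.1])
probe_same = binarize([0.7, -0.3, 0.4, -0.8, 0.2, -0.2])
probe_diff = binarize([-0.8, 0.2, -0.5, 0.9, -0.3, 0.1])
print(hamming(enrolled, probe_same), hamming(enrolled, probe_diff))
```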


2020 ◽  
Vol 79 (11-12) ◽  
pp. 7783-7809
Author(s):  
Yunbo Gu ◽  
Hui Tang ◽  
Tianling Lv ◽  
Yang Chen ◽  
Zhiping Wang ◽  
...  


Author(s):  
Yitian Yuan ◽  
Tao Mei ◽  
Wenwu Zhu

We have witnessed the tremendous growth of videos over the Internet, where most of these videos are typically paired with abundant sentence descriptions, such as video titles, captions and comments. Therefore, it has been increasingly crucial to associate specific video segments with the corresponding informative text descriptions, for a deeper understanding of video content. This motivates us to explore an overlooked problem in the research community — temporal sentence localization in video, which aims to automatically determine the start and end points of a given sentence within a paired video. For solving this problem, we face three critical challenges: (1) preserving the intrinsic temporal structure and global context of video to locate accurate positions over the entire video sequence; (2) fully exploring the sentence semantics to give clear guidance for localization; (3) ensuring the efficiency of the localization method to adapt to long videos. To address these issues, we propose a novel Attention Based Location Regression (ABLR) approach to localize sentence descriptions in videos in an efficient end-to-end manner. Specifically, to preserve the context information, ABLR first encodes both video and sentence via Bi-directional LSTM networks. Then, a multi-modal co-attention mechanism is presented to generate both video and sentence attentions. The former reflects the global video structure, while the latter highlights the sentence details for temporal localization. Finally, a novel attention based location prediction network is designed to regress the temporal coordinates of sentence from the previous attentions. We evaluate the proposed ABLR approach on two public datasets ActivityNet Captions and TACoS. Experimental results show that ABLR significantly outperforms the existing approaches in both effectiveness and efficiency.
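A heavily simplified sketch of the regression idea: given per-frame attention weights (ABLR learns them with Bi-LSTMs and a co-attention mechanism, neither reproduced here), a temporal coordinate can be regressed from attention-weighted frame positions. The scores below are hand-picked toy numbers.

```python
# Attention-weighted temporal location from per-frame relevance scores.
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attended_center(frame_scores):
    """Attention-weighted expected frame index, a crude temporal location."""
    attn = softmax(frame_scores)
    return sum(a * t for t, a in enumerate(attn))

# A 10-frame video where frames 6-8 strongly match the sentence query.
scores = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, 5.0, 5.0, 0.0]
center = attended_center(scores)
print(round(center, 2))  # the attention mass concentrates near frame 7
```

ABLR regresses both start and end coordinates from the attention signal; this single expected index only shows why sharp attention makes that regression well-posed.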


Information ◽  
2020 ◽  
Vol 11 (5) ◽  
pp. 280
Author(s):  
Shaoxiu Wang ◽  
Yonghua Zhu ◽  
Wenjing Gao ◽  
Meng Cao ◽  
Mengyao Li

The sentiment analysis of microblog text has always been a challenging research field due to the limited and complex contextual information. However, most existing sentiment analysis methods for microblogs focus on classifying the polarity of emotional keywords while ignoring the transitional or progressive impact of words in different positions in the Chinese syntactic structure on global sentiment, as well as the utilization of emojis. To this end, we propose the emotion-semantic-enhanced bidirectional long short-term memory (BiLSTM) network with the multi-head attention mechanism model (EBILSTM-MH) for sentiment analysis. This model uses BiLSTM to learn feature representations of input texts, given the word embeddings. Subsequently, the attention mechanism is used to assign attentive weights to each word for sentiment analysis based on the impact of emojis. The attentive weights can be combined with the output of the hidden layer to obtain the feature representation of posts. Finally, the sentiment polarity of a microblog can be obtained through the dense connection layer. The experimental results show the feasibility of our proposed model on microblog sentiment analysis when compared with other baseline models.
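The multi-head step can be sketched as: split each hidden vector into h heads, attend within each head using scaled dot-product weights, and concatenate the per-head contexts. The dimensions, the "BiLSTM states", and the "emoji-derived query" below are illustrative stand-ins, not the model's learned parameters.

```python
# Toy multi-head scaled dot-product attention over hidden-state slices.
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attend(query, keys, values):
    """Scaled dot-product attention for one head."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

def multi_head(query, states, heads=2):
    """Run attention per head on vector slices, then concatenate."""
    d = len(query) // heads
    out = []
    for h in range(heads):
        sl = slice(h * d, (h + 1) * d)
        out += attend(query[sl], [s[sl] for s in states],
                      [s[sl] for s in states])
    return out

states = [[1.0, 0.0, 0.0, 1.0],   # e.g. per-token BiLSTM outputs
          [0.0, 1.0, 1.0, 0.0]]
query = [1.0, 0.0, 0.0, 1.0]      # e.g. derived from an emoji embedding
context = multi_head(query, states)
print(len(context))  # same dimensionality as one hidden state
```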


Author(s):  
Oleg Y. Borbulevych ◽  
Roger I. Martin ◽  
Lance M. Westerhoff

Abstract Conventional protein:ligand crystallographic refinement uses stereochemistry restraints coupled with a rudimentary energy functional to ensure the correct geometry of the model of the macromolecule—along with any bound ligand(s)—within the context of the experimental, X-ray density. These methods generally lack explicit terms for electrostatics, polarization, dispersion, hydrogen bonds, and other key interactions, and instead they use pre-determined parameters (e.g. bond lengths, angles, and torsions) to drive structural refinement. In order to address this deficiency and obtain a more complete and ultimately more accurate structure, we have developed an automated approach for macromolecular refinement based on a two-layer QM/MM (ONIOM) scheme as implemented within our DivCon Discovery Suite and "plugged in" to two mainstream crystallographic packages: PHENIX and BUSTER. This implementation is able to use one or more region layers, each characterized using linear-scaling, semi-empirical quantum mechanics, followed by a system layer which includes the balance of the model and which is described using a molecular mechanics functional. In this work, we applied our Phenix/DivCon refinement method—coupled with our XModeScore method for experimental tautomer/protomer state determination—to the characterization of structure sets relevant to structure-based drug design (SBDD). We then use these newly refined structures to show the impact of QM/MM X-ray refined structure on our understanding of function by exploring the influence of these improved structures on protein:ligand binding affinity prediction (and we likewise show how we use post-refinement scoring outliers to inform subsequent X-ray crystallographic efforts). Through this endeavor, we demonstrate a computational chemistry ↔ structural biology (X-ray crystallography) "feedback loop" which has utility in industrial and academic pharmaceutical research as well as other allied fields.
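For readers unfamiliar with ONIOM, the two-layer subtractive combination the abstract refers to is conventionally written E_ONIOM = E_MM(full) + E_QM(region) − E_MM(region): the cheap MM description of the region layer is subtracted out and replaced by the QM one. The energy values below are arbitrary stand-ins, not output from any real QM/MM engine.

```python
# Subtractive two-layer ONIOM total energy (arbitrary energy units).

def oniom_two_layer(e_mm_full, e_qm_region, e_mm_region):
    """E = E_MM(full model) + E_QM(region) - E_MM(region)."""
    return e_mm_full + e_qm_region - e_mm_region

# Toy numbers: MM on the whole macromolecule, then QM and MM on the
# ligand/active-site region only.
total = oniom_two_layer(e_mm_full=-120.0, e_qm_region=-45.5, e_mm_region=-40.0)
print(total)
```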

