Extracting Effective Image Attributes with Refined Universal Detection

Sensors ◽  
2020 ◽  
Vol 21 (1) ◽  
pp. 95
Author(s):  
Qiang Yu ◽  
Xinyu Xiao ◽  
Chunxia Zhang ◽  
Lifei Song ◽  
Chunhong Pan

Recently, image attributes containing high-level semantic information have been widely used in computer vision tasks, including visual recognition and image captioning. Existing attribute extraction methods map visual concepts to the probabilities of frequently-used words by directly using Convolutional Neural Networks (CNNs). Typically, two main problems exist in those methods. First, words of different parts of speech (POSs) are handled in the same way, but non-nominal words can hardly be mapped to visual regions through CNNs alone. Second, synonymous nominal words are treated as independent, distinct words, so their similarities are ignored. In this paper, a novel Refined Universal Detection (RUDet) method is proposed to solve these two problems. Specifically, a Refinement (RF) module is designed to extract refined attributes of non-nominal words based on the attributes of nominal words and visual features. In addition, a Word Tree (WT) module is constructed to integrate synonymous nouns, which ensures that similar words hold similar and more accurate probabilities. Moreover, a Feature Enhancement (FE) module is adopted to enhance the ability to mine different visual concepts at different scales. Experiments conducted on the large-scale Microsoft (MS) COCO dataset illustrate the effectiveness of our proposed method.
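
The RF and WT ideas described above could be sketched roughly as follows; the module shapes, the MLP refinement head, and the max-pooling merge over synonym groups are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of the two ideas in the abstract, not the authors' code.
import torch
import torch.nn as nn

class RefinementModule(nn.Module):
    """Predict non-nominal (verb/adjective) attribute probabilities from
    nominal-word probabilities plus pooled visual features (assumption)."""
    def __init__(self, num_nouns, feat_dim, num_non_nominal, hidden=512):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(num_nouns + feat_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_non_nominal),
        )

    def forward(self, noun_probs, visual_feat):
        # noun_probs: (B, num_nouns), visual_feat: (B, feat_dim)
        x = torch.cat([noun_probs, visual_feat], dim=-1)
        return torch.sigmoid(self.mlp(x))   # multi-label attribute probabilities

def word_tree_merge(noun_probs, synonym_groups):
    """Merge synonymous nouns so similar words share one (max-pooled) probability."""
    merged = noun_probs.clone()
    for group in synonym_groups:            # e.g., [[3, 17, 42], [5, 9]]
        pooled = noun_probs[:, group].max(dim=1, keepdim=True).values
        merged[:, group] = pooled.expand(-1, len(group))
    return merged

# Toy usage
rf = RefinementModule(num_nouns=1000, feat_dim=2048, num_non_nominal=300)
nouns = torch.rand(2, 1000)
feats = torch.randn(2, 2048)
attrs = rf(word_tree_merge(nouns, [[3, 17, 42]]), feats)
```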

Author(s):  
Jianping Fan ◽  
Xingquan Zhu ◽  
Jing Xiao

Recent advances in digital video compression and networks have made videos more accessible than ever. Several content-based video retrieval systems have been proposed in the past. In this chapter, we first review these existing content-based video retrieval systems and then propose a new framework, called ClassView, to make some advances towards more efficient content-based video retrieval. This framework includes: (a) an efficient video content analysis and representation scheme to support high-level visual concept characterization; (b) a hierarchical video classification technique to bridge the semantic gap between low-level visual features and high-level semantic visual concepts; and (c) a hierarchical video database indexing structure to enable video access over large-scale databases. Integrating video access with efficient database indexing tree structures has provided a great opportunity for supporting more powerful video search engines.
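
As a rough illustration of what a hierarchical concept index enables, the following sketch descends a small concept tree toward the nearest centroid; the node structure and the nearest-centroid search are simplifications of my own, not the ClassView design.

```python
# Minimal, hypothetical hierarchical index in the spirit of the framework above.
import numpy as np

class IndexNode:
    def __init__(self, concept, centroid, children=None, videos=None):
        self.concept = concept            # semantic label, e.g. "sports"
        self.centroid = np.asarray(centroid, dtype=float)
        self.children = children or []    # sub-concepts
        self.videos = videos or []        # video ids stored at a leaf

def search(node, query_feat):
    """Descend the concept hierarchy toward the closest centroid."""
    if not node.children:
        return node.concept, node.videos
    best = min(node.children,
               key=lambda c: np.linalg.norm(c.centroid - query_feat))
    return search(best, query_feat)

root = IndexNode("all", [0.5, 0.5], children=[
    IndexNode("sports", [0.9, 0.2], videos=["v1", "v7"]),
    IndexNode("news",   [0.1, 0.8], videos=["v3"]),
])
print(search(root, np.array([0.8, 0.3])))   # -> ('sports', ['v1', 'v7'])
```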


2022 ◽  
Author(s):  
Laurent Caplette ◽  
Nicholas Turk-Browne

Revealing the contents of mental representations is a longstanding goal of cognitive science. However, there is currently no general framework for providing direct access to representations of high-level visual concepts. We asked participants to indicate what they perceived in images synthesized from random visual features in a deep neural network. We then inferred a mapping between the semantic features of their responses and the visual features of the images. This allowed us to reconstruct the mental representation of virtually any common visual concept, both those reported and others extrapolated from the same semantic space. We successfully validated 270 of these reconstructions as containing the target concept in a separate group of participants. The visual-semantic mapping uncovered with our method further generalized to new stimuli, participants, and tasks. Finally, it allowed us to reveal how the representations of individual observers differ from each other and from those of neural networks.
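
A heavily simplified sketch of inferring a linear visual-semantic mapping from paired response/image features is shown below; the ridge-regression estimator and all matrix names are assumptions for illustration, not the authors' exact procedure.

```python
# Hedged illustration of fitting a linear map from semantic response features
# to the visual features of the viewed images.
import numpy as np

rng = np.random.default_rng(0)
n_trials, sem_dim, vis_dim = 200, 50, 300
S = rng.normal(size=(n_trials, sem_dim))   # semantic features of participants' responses
V = rng.normal(size=(n_trials, vis_dim))   # visual features of the synthesized images

# Ridge regression: find W such that S @ W approximates V.
lam = 1.0
W = np.linalg.solve(S.T @ S + lam * np.eye(sem_dim), S.T @ V)

# "Reconstruct" a concept: project its semantic embedding into visual-feature
# space, which a generative network could then render back into an image.
concept_embedding = rng.normal(size=(1, sem_dim))
reconstructed_visual = concept_embedding @ W
print(reconstructed_visual.shape)          # (1, 300)
```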


2019 ◽  
Vol 11 (15) ◽  
pp. 1792 ◽  
Author(s):  
Xiaorui Song ◽  
Lingda Wu

Due to the sparsity of hyperspectral images, the dictionary learning framework has been applied to hyperspectral endmember extraction. However, current endmember extraction methods based on dictionary learning are not robust enough in noisy environments. To solve this problem, this paper proposes a novel endmember extraction approach based on online robust dictionary learning, termed EEORDL. Because of the large scale of hyperspectral image (HSI) data, an online scheme is introduced to reduce the computational time of dictionary learning. In the proposed algorithm, a new form of the objective function is introduced into the dictionary learning process to improve the robustness to noisy HSI data. The experimental results, conducted with both synthetic and real-world hyperspectral datasets, illustrate that the proposed EEORDL outperforms state-of-the-art approaches under different signal-to-noise ratio (SNR) conditions, especially at high noise levels.
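
The general flavor of online dictionary learning with a robustness-inducing weighting can be sketched as follows; the Huber-style weights, the ISTA-style sparse coding step, and all constants are illustrative assumptions rather than the EEORDL objective itself.

```python
# Rough, hypothetical sketch: online dictionary learning with robust weights.
import numpy as np

rng = np.random.default_rng(1)
n_bands, n_endmembers = 100, 5
D = rng.random((n_bands, n_endmembers))          # dictionary = candidate endmembers

def sparse_code(X, D, n_iter=50, lam=0.1):
    """Non-negative ISTA-style sparse coding of pixels X (n_bands x n_pixels)."""
    L = np.linalg.norm(D, 2) ** 2                # Lipschitz constant of D^T D
    A = np.zeros((D.shape[1], X.shape[1]))
    for _ in range(n_iter):
        A = A - (D.T @ (D @ A - X)) / L
        A = np.maximum(A - lam / L, 0.0)         # soft-threshold, abundances >= 0
    return A

def huber_weights(R, delta=0.5):
    """Down-weight pixels with large residual norms (robustness to noise)."""
    norms = np.linalg.norm(R, axis=0) + 1e-12
    return np.minimum(1.0, delta / norms)

for _ in range(20):                              # stream of mini-batches (online scheme)
    X = rng.random((n_bands, 64))                # one batch of hyperspectral pixels
    A = sparse_code(X, D)
    w = huber_weights(X - D @ A)
    D -= 0.01 * ((D @ A - X) * w) @ A.T          # weighted gradient step on D
    D = np.clip(D, 0.0, None)
print(D.shape)                                   # (100, 5) learned endmember signatures
```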


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-19
Author(s):  
Ariyo Oluwasammi ◽  
Muhammad Umar Aftab ◽  
Zhiguang Qin ◽  
Son Tung Ngo ◽  
Thang Van Doan ◽  
...  

With the emergence of deep learning, computer vision has witnessed extensive advancement and has seen immense applications in multiple domains. In particular, image captioning has become an attractive research direction, as it requires object identification, localization, and semantic understanding as prerequisites. In this paper, semantic segmentation and image captioning are comprehensively investigated based on traditional and state-of-the-art methodologies. In this survey, we deliberate on the use of deep learning techniques for the segmentation analysis of both 2D and 3D images using fully convolutional networks and other high-level hierarchical feature extraction methods. First, each domain’s preliminaries and concepts are described, and then semantic segmentation is discussed alongside its relevant features, available datasets, and evaluation criteria. Also, the capture of semantic information about objects and their attributes is presented in relation to caption generation. Finally, existing methods, their contributions, and their relevance are analyzed, underscoring the importance of these methods and illuminating possible directions for future research on semantic image segmentation and image captioning.
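
For readers unfamiliar with fully convolutional networks, a minimal segmentation head might look like the sketch below; the layer sizes and the number of classes are arbitrary assumptions for demonstration.

```python
# Minimal, illustrative fully convolutional network (FCN) for semantic segmentation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyFCN(nn.Module):
    def __init__(self, num_classes=21):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Conv2d(64, num_classes, kernel_size=1)  # 1x1 conv head

    def forward(self, x):
        h, w = x.shape[-2:]
        feats = self.encoder(x)                   # downsampled feature map
        logits = self.classifier(feats)           # coarse per-pixel class scores
        return F.interpolate(logits, size=(h, w), mode="bilinear",
                             align_corners=False) # upsample to input resolution

img = torch.randn(1, 3, 128, 128)
print(TinyFCN()(img).shape)                       # torch.Size([1, 21, 128, 128])
```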


2011 ◽  
Vol 268-270 ◽  
pp. 1427-1432
Author(s):  
Chang Yong Ri ◽  
Min Yao

This paper presents the key problems in narrowing the “semantic gap” between low-level visual features and high-level semantic features in order to achieve high-level semantic image retrieval. First, it introduces ontology-based semantic image description and machine-learning-based semantic extraction methods. Then, it discusses image grammar for high-level semantic image understanding and retrieval, including and-or graph and context-based methods for semantic images. Finally, we discuss the development directions and research emphases in this field.


Author(s):  
Zhiwei Shi ◽  
Zhongzhi Shi ◽  
Hong Hu

Traditionally, bridging the gap between low-level visual features and high-level semantic concepts has been a tough task for researchers. In this article, we propose a novel plausible model, namely cellular Bayesian networks (CBNs), to model the process of visual perception. The new model takes advantage of both the low-level visual features, such as colors, textures, and shapes, of target objects and the interrelationships between the known objects, and integrates them into a Bayesian framework, which possesses both a firm theoretical foundation and wide practical applications. The novel model successfully overcomes some weaknesses of the traditional Bayesian network (BN), which prohibit BNs from being applied to large-scale cognitive problems. The experimental simulation also demonstrates that the CBNs model outperforms a purely bottom-up strategy by 6% or more in the task of shape recognition. Finally, although the CBNs model is designed for visual perception, it has great potential to be applied to other areas as well.
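
The underlying intuition of fusing bottom-up feature evidence with contextual relations between objects can be illustrated with a toy Bayesian update; this example is not the CBN formulation itself, and the numbers are invented for demonstration.

```python
# Toy illustration: combine bottom-up likelihoods with a contextual prior.
import numpy as np

objects = ["cup", "ball"]
# Bottom-up likelihoods P(observed shape features | object), e.g. from a shape matcher.
likelihood = np.array([0.4, 0.6])
# Contextual prior P(object | "table" already recognized in the scene).
context_prior = np.array([0.8, 0.2])

posterior = likelihood * context_prior
posterior /= posterior.sum()
for name, p in zip(objects, posterior):
    print(f"P({name} | features, context) = {p:.2f}")   # context flips the decision
```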


2020 ◽  
Author(s):  
Joshua S. Rule ◽  
Maximilian Riesenhuber

Abstract Humans quickly learn new visual concepts from sparse data, sometimes just a single example. Decades of prior work have established the hierarchical organization of the ventral visual stream as key to this ability. Computational work has shown that networks which hierarchically pool afferents across scales and positions can achieve human-like object recognition performance and predict human neural activity. Prior computational work has also reused previously acquired features to efficiently learn novel recognition tasks. These approaches, however, require orders of magnitude more examples than human learners and only reuse intermediate features at the object level or below. None has attempted to reuse extremely high-level visual features capturing entire visual concepts. We used a benchmark deep learning model of object recognition to show that leveraging prior learning at the concept level leads to vastly improved abilities to learn from few examples. These results suggest computational techniques for learning even more efficiently as well as neuroscientific experiments to better understand how the brain learns from sparse data. Most importantly, however, the model architecture provides a biologically plausible way to learn new visual concepts from a small number of examples, and makes several novel predictions regarding the neural bases of concept representations in the brain.

Author summary We are motivated by the observation that people regularly learn new visual concepts from as little as one or two examples, far better than, e.g., current machine vision architectures. To understand the human visual system’s superior visual concept learning abilities, we used an approach inspired by computational models of object recognition which: 1) use deep neural networks to achieve human-like performance and predict human brain activity; and 2) reuse previous learning to efficiently master new visual concepts. These models, however, require many times more examples than human learners and, critically, reuse only low-level and intermediate information. None has attempted to reuse extremely high-level visual features (i.e., entire visual concepts). We used a neural network model of object recognition to show that reusing concept-level features leads to vastly improved abilities to learn from few examples. Our findings suggest techniques for future software models that could learn even more efficiently, as well as neuroscience experiments to better understand how people learn so quickly. Most importantly, however, our model provides a biologically plausible way to learn new visual concepts from a small number of examples.
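
The claimed benefit of reusing concept-level features can be illustrated schematically: with a frozen high-level feature map, a new category can be read out from a couple of examples. The random-projection "concept layer" and nearest-class-mean readout below are stand-ins, not the authors' model.

```python
# Schematic few-shot learning on top of reused, frozen high-level features.
import numpy as np

rng = np.random.default_rng(2)

def concept_features(x, W):
    """Frozen 'concept-level' layer: a fixed nonlinear projection (assumption)."""
    return np.maximum(x @ W, 0.0)

W = rng.normal(size=(256, 64))                 # frozen weights learned previously
# Two examples per new class ("few-shot"), plus one query.
class_a = concept_features(rng.normal(loc=1.0, size=(2, 256)), W)
class_b = concept_features(rng.normal(loc=-1.0, size=(2, 256)), W)
query   = concept_features(rng.normal(loc=1.0, size=(1, 256)), W)

# Nearest-class-mean readout on top of the reused features.
centroids = np.stack([class_a.mean(0), class_b.mean(0)])
pred = np.argmin(np.linalg.norm(centroids - query, axis=1))
print("predicted class:", ["A", "B"][pred])    # expected: A
```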


Author(s):  
Georgi Derluguian

The author develops ideas about the origin of social inequality during the evolution of human societies and reflects on the possibilities of its overcoming. What makes human beings different from other primates is a high level of egalitarianism and altruism, which contributed to more successful adaptability of human collectives at early stages of the development of society. The transition to agriculture, coupled with substantially increasing population density, was marked by the emergence and institutionalisation of social inequality based on the inequality of tangible assets and symbolic wealth. Then, new institutions of warfare came into existence, and they were aimed at conquering and enslaving the neighbours engaged in productive labour. While exercising control over nature, people also established and strengthened their power over other people. Chiefdom as a new type of polity came into being. Elementary forms of power (political, economic and ideological) served as a basis for the formation of early states. The societies in those states were characterised by social inequality and cruelties, including slavery, mass violence and numerous victims. Nowadays, the old elementary forms of power that are inherent in personalistic chiefdom are still functioning along with modern institutions of public and private bureaucracy. This constitutes the key contradiction of our time: the juxtaposition of individual despotic power and public infrastructural power. However, society is evolving towards an ever more efficient combination of social initiatives with the sustainability and viability of large-scale organisations.


Author(s):  
Zewen Xu ◽  
Zheng Rong ◽  
Yihong Wu

Abstract In recent years, simultaneous localization and mapping in dynamic environments (dynamic SLAM) has attracted significant attention from both academia and industry. Some pioneering work on this technique has expanded the potential of robotic applications. Compared to standard SLAM under the static world assumption, dynamic SLAM divides features into static and dynamic categories and leverages each type of feature properly. Therefore, dynamic SLAM can provide more robust localization for intelligent robots that operate in complex dynamic environments. Additionally, to meet the demands of some high-level tasks, dynamic SLAM can be integrated with multiple object tracking. This article presents a survey on dynamic SLAM from the perspective of feature choices. A discussion of the advantages and disadvantages of different visual features is provided in this article.
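
A much-simplified illustration of the static/dynamic feature split that dynamic SLAM systems rely on is given below; the median-flow consistency test and thresholds are assumptions for demonstration, not a specific published method.

```python
# Simplified sketch: keep motion-consistent (static) features for pose estimation.
import numpy as np

rng = np.random.default_rng(3)
flow = rng.normal(0.0, 0.1, size=(100, 2))       # feature displacements (mostly static scene)
flow[:10] += np.array([3.0, 0.0])                # 10 features lie on a moving object

# Motion-consistency test: features whose displacement deviates strongly from
# the dominant (median) motion are labelled dynamic and excluded.
median_flow = np.median(flow, axis=0)
residual = np.linalg.norm(flow - median_flow, axis=1)
mad = np.median(np.abs(residual - np.median(residual)))
static_mask = residual < np.median(residual) + 5.0 * mad

static_features = flow[static_mask]
print(f"{static_mask.sum()} static features kept for pose estimation, "
      f"{(~static_mask).sum()} dynamic features excluded")
```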


Genetics ◽  
2001 ◽  
Vol 159 (4) ◽  
pp. 1765-1778
Author(s):  
Gregory J Budziszewski ◽  
Sharon Potter Lewis ◽  
Lyn Wegrich Glover ◽  
Jennifer Reineke ◽  
Gary Jones ◽  
...  

Abstract We have undertaken a large-scale genetic screen to identify genes with a seedling-lethal mutant phenotype. From screening ~38,000 insertional mutant lines, we identified >500 seedling-lethal mutants, completed cosegregation analysis of the insertion and the lethal phenotype for >200 mutants, molecularly characterized 54 mutants, and provided a detailed description for 22 of them. Most of the seedling-lethal mutants seem to affect chloroplast function because they display altered pigmentation and affect genes encoding proteins predicted to have chloroplast localization. Although a high level of functional redundancy in Arabidopsis might be expected because 65% of genes are members of gene families, we found that 41% of the essential genes found in this study are members of Arabidopsis gene families. In addition, we isolated several interesting classes of mutants and genes. We found three mutants in the recently discovered nonmevalonate isoprenoid biosynthetic pathway and mutants disrupting genes similar to Tic40 and tatC, which are likely to be involved in chloroplast protein translocation. Finally, we directly compared T-DNA and Ac/Ds transposon mutagenesis methods in Arabidopsis on a genome scale. In each population, we found only about one-third of the insertion mutations cosegregated with a mutant phenotype.

