Multimedia Data Engineering Applications and Processing
Latest Publications


Total documents: 15 (five years: 0)
H-index: 0 (five years: 0)

Published By IGI Global

9781466629400, 9781466629417

Author(s): Ning Yu, Kien A. Hua, Danzhou Liu

During the last decade, high-quality (i.e., over 1 megapixel) built-in cameras have become standard features of handheld devices. Users can take high-resolution pictures and share them with friends via the internet. At the same time, meeting the demand for multimedia information retrieval using those pictures on mobile devices has become an urgent problem and has therefore attracted attention. A relevance feedback information retrieval process includes several rounds of query refinement, which require exchanging images between the mobile device and the server. With limited wireless bandwidth, this process can incur substantial delay, making the system unfriendly to use. This issue is addressed by considering a Client-side Relevance Feedback (CRF) technique. In the CRF system, Relevance Feedback (RF) is done on the client side alone. Because images are not exchanged between server and client, the mobile device's battery power is saved and system response is instantaneous, which significantly enhances system usability. Furthermore, because the server is not involved in RF processing, it is able to support more users simultaneously. The experiment indicates that the system outperforms traditional server-client relevance feedback systems in terms of system response time, mobile battery power saving, and retrieval results.
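As a rough illustration of what one round of query refinement on the client could look like, the sketch below uses the classic Rocchio update. This is an assumption made for illustration: the abstract does not say which refinement rule the CRF system uses, and the function names, weights, and feature vectors here are hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def mean(vectors: Sequence[Vector], dim: int) -> Vector:
    """Component-wise mean; a zero vector if no examples were given."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def rocchio_refine(query: Vector,
                   relevant: Sequence[Vector],
                   irrelevant: Sequence[Vector],
                   alpha: float = 1.0,
                   beta: float = 0.75,
                   gamma: float = 0.15) -> Vector:
    """Move the query toward images marked relevant and away from the rest.

    The weights alpha, beta, gamma are illustrative defaults, not values
    taken from the paper.
    """
    dim = len(query)
    pos = mean(relevant, dim)
    neg = mean(irrelevant, dim)
    return [alpha * q + beta * p - gamma * n
            for q, p, n in zip(query, pos, neg)]


def rank(query: Vector, candidates: Sequence[Vector]) -> List[int]:
    """Indices of locally cached candidates, nearest to the query first."""
    def squared_distance(v: Vector) -> float:
        return sum((a - b) ** 2 for a, b in zip(query, v))
    return sorted(range(len(candidates)),
                  key=lambda i: squared_distance(candidates[i]))
```

Because both functions only touch feature vectors already held on the device, a refinement round in this sketch needs no image transfer, which is the property the abstract attributes to client-side processing.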


Author(s): Seunghan Han, Walter Stechele

Default reasoning can provide a means of deriving plausible semantic conclusions from imprecise and contradictory information in forensic visual surveillance. Such reasoning under uncertainty requires a proper uncertainty handling formalism. A discrete species of bilattice for multivalued default logic has been shown to support default reasoning in visual surveillance. In this article, the authors present an approach to default reasoning using subjective logic, which operates in a continuous space. As an uncertainty representation and handling formalism, subjective logic bridges Dempster-Shafer belief theory and second-order Bayesian probability, making it an attractive tool for artificial reasoning. To verify the proposed approach, the authors extend the inference scheme on the bilattice for multivalued default logic to L-fuzzy set based logics that can be modeled with continuous species of bilattice structures. The authors present illustrative case studies in visual surveillance scenarios to contrast the proposed approach with L-fuzzy set based approaches.
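For readers unfamiliar with subjective logic, the sketch below shows only its basic building block, the binomial opinion, and the standard mapping from evidence counts to an opinion. It does not implement the authors' default reasoning scheme; the class and function names are hypothetical, and the prior weight of 2 is the conventional choice in the subjective logic literature rather than a value taken from this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    """Binomial opinion: belief + disbelief + uncertainty must equal 1."""
    belief: float
    disbelief: float
    uncertainty: float
    base_rate: float = 0.5

    def __post_init__(self) -> None:
        total = self.belief + self.disbelief + self.uncertainty
        if abs(total - 1.0) > 1e-9:
            raise ValueError("belief, disbelief and uncertainty must sum to 1")

    def projected_probability(self) -> float:
        """Uncertainty mass is distributed according to the base rate."""
        return self.belief + self.base_rate * self.uncertainty


def opinion_from_evidence(positive: float, negative: float,
                          base_rate: float = 0.5,
                          prior_weight: float = 2.0) -> Opinion:
    """Map counts of supporting and opposing observations to an opinion."""
    total = positive + negative + prior_weight
    return Opinion(belief=positive / total,
                   disbelief=negative / total,
                   uncertainty=prior_weight / total,
                   base_rate=base_rate)
```

With no evidence at all the opinion is fully uncertain and its projected probability falls back to the base rate, which is how a continuous formalism of this kind can encode a default assumption that later evidence overrides.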


Author(s): Jyh-Ren Shieh, Ching-Yung Lin, Shun-Xuan Wang, Ja-Ling Wu

The abundance of Web 2.0 social media in various media formats calls for integration that takes into account the tags associated with these resources. The authors present a new approach to multi-modal media search, based on novel related-tag graphs, in which a query is a resource in one modality, such as an image, and the results are semantically similar resources in various modalities, for instance text and video. Resource tagging thus enables multi-modal queries and multi-modal results, a marked departure from the traditional text-based search paradigm. Tag relation graphs are built from multi-partite networks of existing Web 2.0 social media such as Flickr and Wikipedia. These multi-partite linkage networks (contributor-tag, tag-category, and tag-tag) are extracted from Wikipedia to construct relational tag graphs. In fusing these networks, the authors propose incorporating contributor-category networks to model each contributor's specialization; it is shown that this step significantly enhances the accuracy of the inferred relatedness of the term-semantic graphs. Experiments based on 200 TREC-5 ad-hoc topics show that the algorithms outperform existing approaches. In addition, user studies demonstrate the superiority of this visualization system and its usefulness in the real world.


Author(s): Qiusha Zhu, Lin Lin, Mei-Ling Shyu, Dianting Liu

Traditional image classification relies on text information such as tags, which require considerable human effort to annotate. Therefore, recent work focuses more on training classifiers directly on visual features extracted from image content. The performance of content-based classification is improving steadily, but it is still far below users' expectations. Moreover, in a web environment, the HTML surrounding texts associated with images naturally serve as context information and are complementary to content information. This paper proposes a novel two-stage image classification framework that aims to improve the performance of content-based image classification by utilizing the context information of web-based images. A new TF*IDF weighting scheme is proposed to extract discriminant textual features from HTML surrounding texts. Both content-based and context-based classifiers are built by applying multiple correspondence analysis (MCA). Experiments on web-based images from the Microsoft Research Asia (MSRA-MM) dataset show that the proposed framework achieves promising results.
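For context, the sketch below computes the textbook TF*IDF weight (term frequency times the log of inverse document frequency) over tokenized surrounding texts. It is the baseline that such schemes start from, not the new weighting the paper proposes, whose details the abstract does not give. The function name and the token lists are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(documents: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {term: weight} mapping per tokenized document."""
    n_docs = len(documents)

    # Document frequency: in how many documents does each term occur?
    doc_freq: Counter = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        length = len(tokens) or 1  # guard against empty documents
        weights.append({
            term: (count / length) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights
```

A term that appears in the text around every image receives a weight of zero under this formula, so only terms that help tell images apart survive as features.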


Author(s): Liping Zhou, Wei-Bang Chen, Chengcui Zhang

This paper describes a framework to detect the authorship of eBay images. It contains three modules: editing style summarization, classification, and multi-account linking detection. For editing style summarization, three approaches, namely the edge-based approach, the color-based approach, and the color probability approach, are proposed to encode the common patterns inside a group of images with similar editing styles into common edge or color models. Prior to the summarization step, an edge-based clustering algorithm is developed. Corresponding to the three summarization approaches, three classification methods are developed to predict the authorship of an unlabeled test image. For multi-account linking detection, which aims to reveal the hidden owner behind multiple eBay seller accounts, two methods are presented that measure the similarity between seller accounts based on similar models.


Author(s): Ehsan Younessian, Deepu Rajan

In this paper, the authors propose an effective content-based clustering method for keyframes of news video stories using the Near-Duplicate Keyframe (NDK) identification concept. In the first step, the authors investigate the near-duplicate relationship, a content-based visual similarity across keyframes, using the presented NDK identification algorithm, and assign a near-duplicate score to each pair of keyframes within a story. Using an efficient keypoint matching technique followed by matching pattern analysis, this NDK identification algorithm can handle extreme zooming and significant object motion. In the second step, a weighted adjacency matrix is determined for each story based on the assigned near-duplicate scores. The authors then use a spectral clustering scheme to remove outlier keyframes and partition the remainder. Two sets of experiments are carried out to evaluate the NDK identification method and to assess the performance of the proposed keyframe clustering method.
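To make the second step more concrete, the sketch below splits keyframes into two groups from a weighted adjacency matrix by taking the sign of the Fiedler vector of the unnormalized graph Laplacian. This is one simple form of spectral partitioning chosen for illustration; the abstract does not specify the authors' spectral clustering scheme or their outlier removal, and the function name and example scores are hypothetical. The matrix is assumed to be symmetric with a zero diagonal.

```python
from typing import List


def fiedler_bipartition(weights: List[List[float]],
                        iterations: int = 500) -> List[int]:
    """Two-way spectral split of a graph given as a symmetric weight matrix.

    Power iteration on (shift*I - L), with the constant vector projected
    out, converges to the eigenvector of the second-smallest eigenvalue of
    the Laplacian L = D - W. Nodes are labeled by the sign of their entry.
    """
    n = len(weights)
    degree = [sum(row) for row in weights]
    # Twice the largest degree bounds the largest eigenvalue of L.
    shift = 2.0 * max(degree)

    x = [float(i) for i in range(n)]  # deterministic starting vector
    for _ in range(iterations):
        mean = sum(x) / n
        x = [v - mean for v in x]  # remove the constant component
        y = [(shift - degree[i]) * x[i]
             + sum(weights[i][j] * x[j] for j in range(n))
             for i in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        if norm == 0.0:
            break
        x = [v / norm for v in y]
    return [1 if v > 0 else 0 for v in x]
```

For example, with near-duplicate scores of 0.9 among keyframes 0 to 2, 0.9 between keyframes 3 and 4, and only 0.1 linking the two groups, the function places the first three keyframes in one group and the last two in the other.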


Author(s): Massimiliano Albanese, Antonio d’Acierno, Vincenzo Moscato, Fabio Persia, Antonio Picariello

One of the most important challenges in the information access field, especially for multimedia repositories, is information overload. To cope with this problem, the authors present a strategy for a recommender system that computes customized recommendations for users accessing multimedia collections, using the semantic contents and low-level features of multimedia objects, the past behaviour of individual users, and the social behaviour of the users' community as a whole. The authors implement their strategy in a recommender prototype for browsing digital image libraries in the Cultural Heritage domain. They then investigate the effectiveness of the proposed approach based on users' satisfaction. The preliminary experimental results show that the approach is promising and encourage further research in this direction.


Author(s): Hongli Luo

Video transmission over wireless networks has quality of service (QoS) requirements, and the time-varying characteristics of wireless channels make it a challenging task. IEEE 802.11 Wireless LAN has been widely used for the last-mile connection in multimedia transmission. In this paper, a cross-layer design is presented for video streaming over IEEE 802.11e HCF Controlled Channel Access (HCCA) WLAN. The goal of the cross-layer design is to improve the quality of the video received in a wireless network under the constraint of network bandwidth. The approach is composed of two algorithms. First, an allocation of optimal TXOP is calculated which aims at maintaining a short queuing delay at the wireless station at the cost of a small TXOP allocation. Second, the transmission of the packets is scheduled according to the importance of the packets in order to maximize the visual quality of the video. The approach is compared with standard HCCA in the NS2 simulator using the H.264 video codec. The proposed cross-layer design outperforms the standard approach in terms of the PSNR of the received video. The approach reduces packet loss and allows graceful video degradation, especially under heavy network traffic.
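The second algorithm, importance-based scheduling within an allocated TXOP, can be pictured with the greedy sketch below. It is only an illustration under stated assumptions: the abstract does not describe how importance is computed or how ties and oversized packets are handled, so the `Packet` fields, the integer importance scale, and the skip-if-too-large rule are all hypothetical.

```python
import heapq
from typing import List, NamedTuple


class Packet(NamedTuple):
    seq: int          # sequence number, used to break ties
    importance: int   # hypothetical scale: higher means more important
    tx_time_us: int   # air time needed to send the packet, in microseconds


def schedule(packets: List[Packet], txop_us: int) -> List[Packet]:
    """Send the most important packets first within the TXOP budget.

    Packets that do not fit in the remaining budget are skipped, so a
    smaller, less important packet may still be sent after them.
    """
    heap = [(-p.importance, p.seq, p) for p in packets]
    heapq.heapify(heap)

    sent: List[Packet] = []
    remaining = txop_us
    while heap:
        _, _, packet = heapq.heappop(heap)
        if packet.tx_time_us <= remaining:
            sent.append(packet)
            remaining -= packet.tx_time_us
    return sent
```

When the budget is too small for the whole queue, the packets left unsent in this sketch are the least important ones, which is the intuition behind graceful degradation under heavy traffic.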


Author(s): Nathaniel Rossol, Irene Cheng, Iqbal Jamal, John Berezowski, Anup Basu

Geographic Information Systems (GISs), which map spatiotemporal event data onto geographical maps, have proven useful in many applications. Time-based GISs allow practitioners to visualize collected data in an intuitive way. However, while current GISs have proven useful in post hoc analysis and provide simple two-dimensional geographic visualizations, their design typically lacks the features necessary for highly targeted real-time surveillance with the goal of spread prevention. This paper outlines the design, implementation, and usage of a 3D framework for real-time geospatial temporal visualization. In a case study using livestock movements, the authors show that the framework is capable of tracking and simulating the spread of epidemic diseases. Although the application discussed in this paper relates to livestock disease, the proposed framework can be used to manage and visualize other types of high-dimensional multimedia data as well.


Author(s): Kimiaki Shirahama, Kuniaki Uehara

This paper examines video retrieval based on the Query-By-Example (QBE) approach, where shots relevant to a query are retrieved from large-scale video data based on their similarity to example shots. This involves two crucial problems. The first is that similarity in features does not necessarily imply similarity in semantic content. The second is the expensive computational cost of computing the similarity of a huge number of shots to the example shots. The authors have developed a method that can filter out a large number of shots irrelevant to a query, based on a video ontology, which is a knowledge base about the concepts displayed in a shot. The method utilizes various concept relationships (e.g., generalization/specialization, sibling, part-of, and co-occurrence) defined in the video ontology. In addition, although the video ontology assumes that shots are accurately annotated with concepts, accurate annotation is difficult due to the diversity of forms and appearances of the concepts. Dempster-Shafer theory is used to account for the uncertainty in determining the relevance of a shot based on its inaccurate annotation. Experimental results on TRECVID 2009 video data validate the effectiveness of the method.
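As background on the Dempster-Shafer step, the sketch below implements Dempster's rule of combination for two mass functions over a small frame of discernment. It shows the general rule only; how the authors derive mass functions from concept annotations is not stated in the abstract, and the function name and the example masses are hypothetical.

```python
from itertools import product
from typing import Dict, FrozenSet

Mass = Dict[FrozenSet[str], float]


def combine(m1: Mass, m2: Mass) -> Mass:
    """Dempster's rule: multiply masses, intersect sets, renormalize."""
    combined: Mass = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass assigned to contradictory evidence
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {h: w / (1.0 - conflict) for h, w in combined.items()}


# Hypothetical example: two uncertain annotations of the same shot.
RELEVANT = frozenset({"relevant"})
IRRELEVANT = frozenset({"irrelevant"})
UNKNOWN = RELEVANT | IRRELEVANT  # mass left on the whole frame

detector_a: Mass = {RELEVANT: 0.6, UNKNOWN: 0.4}
detector_b: Mass = {RELEVANT: 0.5, IRRELEVANT: 0.2, UNKNOWN: 0.3}
fused = combine(detector_a, detector_b)
```

Mass left on the whole frame expresses that an annotation may simply be wrong, which is what lets the rule represent inaccurate concept detection instead of forcing a hard relevant-or-irrelevant decision.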

