A Case for Automated Large Scale Semantic Annotation

Author(s):  
Stephen Dill ◽  
Nadav Eiron ◽  
David Gibson ◽  
Daniel Gruhl ◽  
R. Guha ◽  
...  
2011 ◽  
Vol 268-270 ◽  
pp. 1386-1389
Author(s):  
Xiao Ying Wu ◽  
Yun Juan Liang ◽  
Li Li ◽  
Li Juan Ma

In this paper, we improve image annotation with semantic meaning and propose a new algorithm for semantic fusion of image annotation. Given an image to be labeled, the algorithm uses the training data set, the word set, the collection of image regions, and other information to build a probability model that estimates the joint probability of words and the given image regions. Using these probability values as weights, combined with a keyword-relevance table that integrates lexical semantics, it extracts the most representative keywords as the image's semantic annotation. The algorithm can effectively exploit large-scale training data with rich annotations, achieving better recall and precision than existing automatic image annotation methods; we validate the algorithm on the Corel data set.
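The joint-probability step can be sketched as a simple co-occurrence estimate; this is a minimal illustration, not the authors' exact model, and the region labels, words, and training triples below are invented:

```python
from collections import Counter
from itertools import product

# Toy training set: each image is (set of quantized region labels, set of words).
training = [
    ({"sky", "water"}, {"sea", "sky"}),
    ({"sky", "grass"}, {"field", "sky"}),
    ({"water", "sand"}, {"beach", "sea"}),
]

# Estimate the joint probability P(w, b) of word w and region b by counting
# co-occurrences across the training set.
joint = Counter()
for regions, words in training:
    for b, w in product(regions, words):
        joint[(w, b)] += 1
total = sum(joint.values())

def annotate(regions, top_k=2):
    """Score each word by summing P(w, b) over the regions of the new image,
    then keep the top-k words as the annotation."""
    scores = Counter()
    for (w, b), c in joint.items():
        if b in regions:
            scores[w] += c / total
    return [w for w, _ in scores.most_common(top_k)]

print(annotate({"sky", "water"}))
```

In the full method the scores would additionally be re-weighted by the keyword-relevance table before the top words are selected.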


2009 ◽  
Vol 11 (4) ◽  
pp. 47-73 ◽  
Author(s):  
Daniel Sonntag ◽  
Pinar Wennerberg ◽  
Paul Buitelaar ◽  
Sonja Zillner

In this chapter the authors describe the three pillars of ontology treatment in the medical domain in a comprehensive case study within the large-scale THESEUS MEDICO project. MEDICO addresses the need for advanced semantic technologies in medical image and patient data search. The objective is to enable a seamless integration of medical images and different user applications by providing direct access to image semantics. Semantic image retrieval should provide the basis for clinical decision support and computer-aided diagnosis. During the course of lymphoma diagnosis and continual treatment, image data is produced several times using different image modalities. After semantic annotation, the images need to be integrated with medical (textual) data repositories and ontologies. The authors build upon the three pillars of knowledge engineering, ontology mediation and alignment, and ontology population and learning to achieve the objectives of the MEDICO project.


Author(s):  
Paula M Mabee ◽  
Wasila M Dahdul ◽  
James P Balhoff ◽  
Hilmar Lapp ◽  
Prashanti Manda ◽  
...  

The study of how the observable features of organisms, i.e., their phenotypes, result from the complex interplay between genetics, development, and the environment is central to much research in biology. The varied language used in the description of phenotypes, however, impedes the large-scale and interdisciplinary analysis of phenotypes by computational methods. The Phenoscape project (www.phenoscape.org) has developed semantic annotation tools and a gene–phenotype knowledgebase, the Phenoscape KB, that uses machine reasoning to connect evolutionary phenotypes from the comparative literature to mutant phenotypes from model organisms. These semantically annotated data enable the linking of novel species phenotypes with the candidate genes that may underlie them. Semantic annotation of evolutionary phenotypes further enables previously difficult or novel analyses of comparative anatomy and evolution. These include generating large, synthetic character matrices of presence/absence phenotypes based on inference, and searching for taxa and genes with similar variation profiles using semantic similarity. Phenoscape is further extending these tools to enable users to automatically generate synthetic supermatrices for diverse character types, and to use the domain knowledge encoded in ontologies for evolutionary trait analysis. Curating the annotated phenotypes necessary for this research requires significant human curator effort, although semi-automated natural language processing tools promise to expedite the curation of free text. As semantic tools and methods are developed for the biodiversity sciences, new insights are anticipated from the increasingly connected stores of interoperable phenotypic and genetic data.
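The two analyses mentioned above, inferred presence/absence characters and semantic-similarity search, both reduce to reasoning over the ontology's subsumption hierarchy. A minimal sketch with a toy anatomy ontology and invented annotation profiles (not Phenoscape KB data, nor its actual similarity metric):

```python
# Toy anatomy ontology as a child -> parent map (invented for illustration).
PARENT = {"pectoral fin": "paired fin", "pelvic fin": "paired fin",
          "paired fin": "fin", "dorsal fin": "fin", "fin": "appendage"}

def ancestors(term):
    """Return the term plus all of its ontology ancestors (subsumption closure)."""
    out = {term}
    while term in PARENT:
        term = PARENT[term]
        out.add(term)
    return out

def closure(profile):
    """Expand an annotation profile to every term it implies."""
    return set().union(*(ancestors(t) for t in profile))

# Presence/absence inference: an annotation to 'pectoral fin' implies
# presence of every subsuming structure, e.g. 'fin' and 'appendage'.
assert "fin" in ancestors("pectoral fin")

def semantic_similarity(p1, p2):
    """Jaccard similarity of ancestor-expanded annotation profiles."""
    a, b = closure(p1), closure(p2)
    return len(a & b) / len(a | b)

taxon_profile = {"pectoral fin", "dorsal fin"}   # hypothetical taxon annotations
gene_profile = {"pelvic fin"}                    # hypothetical mutant-gene annotations
print(semantic_similarity(taxon_profile, gene_profile))
```

The profiles share no terms directly, yet score above zero because both imply paired-fin and fin structures, which is exactly why subsumption reasoning makes such searches possible.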


F1000Research ◽  
2017 ◽  
Vol 6 ◽  
pp. 686 ◽  
Author(s):  
Ferdinando Villa ◽  
Stefano Balbi ◽  
Ioannis N. Athanasiadis ◽  
Caterina Caracciolo

Correct and reliable linkage of independently produced information is a requirement for sophisticated applications and processing workflows. These can ultimately help address the challenges posed by complex systems (such as socio-ecological systems), whose many components can only be described through independently developed data and model products. We discuss the first outcomes of an investigation into the conceptual and methodological aspects of semantic annotation of data and models, aimed at enabling a high standard of interoperability of information. The results, operationalized in the context of a long-term, active, large-scale project on ecosystem services assessment, include: a definition of interoperability based on semantics and scale; a conceptual foundation for the phenomenology underlying scientific observations, aimed at guiding the practice of semantic annotation in domain communities; and a dedicated language and software infrastructure that operationalizes the findings and allows practitioners to reap the benefits of data and model interoperability. The work presented is the first detailed description of almost a decade of work with communities active in socio-ecological system modeling. After defining the boundaries of possible interoperability based on the understanding of scale, we discuss examples of the practical use of the findings to obtain consistent, interoperable and machine-ready semantic specifications that can integrate semantics across diverse domains and disciplines.
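The idea of interoperability defined over both semantics and scale can be illustrated with a small sketch; the concept URIs, resolutions, and compatibility threshold below are invented, and this is not the project's actual language or infrastructure:

```python
from dataclasses import dataclass

@dataclass
class Annotation:
    """Semantic annotation of a data or model product: what it observes,
    and at what scale. All values here are invented for illustration."""
    concept: str            # observed phenomenon, as an ontology concept
    spatial_res_m: float    # grid resolution in metres
    temporal_res_days: int  # time step in days

def interoperable(a: Annotation, b: Annotation, max_ratio=4.0):
    """Two products can be linked only if they observe the same concept and
    their spatial scales are close enough that resampling the finer one is
    defensible (a hypothetical threshold stands in for a real scale policy)."""
    if a.concept != b.concept:
        return False
    ratio = max(a.spatial_res_m, b.spatial_res_m) / min(a.spatial_res_m, b.spatial_res_m)
    return ratio <= max_ratio

rain_model = Annotation("ex:Precipitation", 1000.0, 1)
rain_obs = Annotation("ex:Precipitation", 250.0, 1)
lulc_map = Annotation("ex:LandCoverType", 30.0, 365)

print(interoperable(rain_model, rain_obs))  # same concept, modest resolution gap
print(interoperable(rain_model, lulc_map))  # different concepts: never linkable
```

The point of the sketch is the two-part test: semantic identity is necessary but not sufficient, because two datasets about the same phenomenon at wildly different scales still cannot be meaningfully combined.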


Author(s):  
DEJAN GJORGJEVIKJ ◽  
GJORGJI MADJAROV ◽  
SAŠO DŽEROSKI

Multi-label learning (MLL) problems abound in many areas, including text categorization, protein function classification, and semantic annotation of multimedia. An issue that severely limits the applicability of many current machine learning approaches to MLL is the large scale of such problems, which has a strong impact on the computational complexity of learning. This is especially pronounced for approaches that transform MLL problems into a set of binary classification problems, for which Support Vector Machines (SVMs) are used. On the other hand, the most efficient approaches to MLL, based on decision trees, have clearly lower predictive performance. We propose a hybrid decision tree architecture, where the leaves do not give multi-label predictions directly, but rather utilize local SVM-based classifiers that do. A binary relevance architecture is employed in the leaves, where a binary SVM classifier is built for each of the labels relevant to that particular leaf. We use a broad range of multi-label datasets with a variety of evaluation measures to evaluate the proposed method against related and state-of-the-art methods, in terms of both predictive performance and time complexity. On almost every large classification problem, our hybrid architecture outperforms the competing approaches in predictive performance, while its computational efficiency is significantly improved as a result of the integrated decision tree.
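The architecture, a tree that routes an instance to a leaf, where per-label binary classifiers make the multi-label prediction, can be miniaturized as follows. A single hand-picked split stands in for the learned tree and a nearest-centroid rule stands in for the per-label SVMs; the data, features, and labels are invented:

```python
from statistics import mean
from math import dist

# Toy training set: (feature vector, set of relevant labels).
train = [([0.1, 0.2], {"A"}), ([0.2, 0.8], {"A", "B"}), ([0.3, 0.9], {"B"}),
         ([0.9, 0.1], {"C"}), ([0.8, 0.7], {"C"}), ([0.6, 0.9], set())]

SPLIT = 0.5  # decision-tree test: x[0] <= SPLIT routes left, else right

def centroid(points):
    return [mean(c) for c in zip(*points)]

def fit_leaf(instances):
    """Binary relevance in a leaf: one binary classifier per label that is
    relevant there (here, a positive/negative centroid pair per label)."""
    leaf = {}
    for lab in set().union(*(y for _, y in instances)):
        pos = [x for x, y in instances if lab in y]
        neg = [x for x, y in instances if lab not in y]
        if pos and neg:
            leaf[lab] = (centroid(pos), centroid(neg))
    return leaf

leaves = {side: fit_leaf([(x, y) for x, y in train
                          if (x[0] <= SPLIT) == (side == "L")])
          for side in ("L", "R")}

def predict(x):
    """Route to a leaf, then emit every label whose local classifier fires."""
    leaf = leaves["L" if x[0] <= SPLIT else "R"]
    return {lab for lab, (p, n) in leaf.items() if dist(x, p) < dist(x, n)}

print(predict([0.15, 0.3]), predict([0.85, 0.5]))
```

The efficiency argument carries over directly: each leaf only ever trains classifiers for the labels actually seen there, so the expensive per-label work is confined to small, homogeneous subsets of the data.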


2013 ◽  
Vol 07 (03) ◽  
pp. 257-290 ◽  
Author(s):  
KE HAO ◽  
PHILLIP C-Y SHEU ◽  
HIROSHI YAMAGUCHI

This paper addresses semantic search of Web services using natural language processing. First we survey various existing approaches, focusing on the fact that the high cost of current semantic annotation frameworks results in limited use of semantic search in large-scale applications. We then propose a service search framework based on the vector space model that combines the traditional frequency-weighted term-document matrix with syntactic information extracted from a lexical database and a dependency grammar parser. In particular, instead of using terms as the rows of the term-document matrix, we propose using synsets from WordNet, both to distinguish different meanings of a word in different contexts and to cluster different words with similar meanings. Based on the characteristics of Web service descriptions, we also propose an approach to identifying semantically important terms and adjusting their weights. Our experiments show that our approach achieves its goal well.
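The synset-row idea can be sketched as follows; the synset map is a toy stand-in for WordNet lookup, and the plain frequency weighting omits the paper's dependency-grammar and term-importance components:

```python
from math import sqrt

# Toy synonym map standing in for WordNet: synonyms ("car", "auto") collapse
# onto one synset row, so they match even with zero surface-term overlap.
SYNSET = {"car": "vehicle.n.01", "auto": "vehicle.n.01",
          "automobile": "vehicle.n.01", "rent": "rent.v.01",
          "hire": "rent.v.01", "price": "price.n.01"}

def vectorize(text):
    """Frequency-weighted synset vector for one service description."""
    vec = {}
    for term in text.lower().split():
        syn = SYNSET.get(term, term)   # unknown terms keep their surface form
        vec[syn] = vec.get(syn, 0) + 1
    return vec

def cosine(a, b):
    """Cosine similarity between two sparse vectors (dicts)."""
    dot = sum(a[k] * b.get(k, 0) for k in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

service = vectorize("rent a car check car price")
query = vectorize("hire an automobile")
print(round(cosine(query, service), 3))
```

A purely term-based matrix would score this query and description as unrelated; sharing synset rows is what recovers the match.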

