A Case for Automated Large Scale Semantic Annotation

Author(s):  
Stephen Dill ◽  
Nadav Eiron ◽  
David Gibson ◽  
Daniel Gruhl ◽  
R. Guha ◽  
...  
2011 ◽  
Vol 268-270 ◽  
pp. 1386-1389
Author(s):  
Xiao Ying Wu ◽  
Yun Juan Liang ◽  
Li Li ◽  
Li Juan Ma

In this paper, we improve image annotation with semantic meaning and propose a new algorithm for semantic fusion of image annotation. Given an image to be labeled, the algorithm uses the training data set, the word set, the collection of image regions, and other information to build a probability model that estimates the joint probability of words and the given image regions. Using these probability values as weights, combined with a keyword-relevance table that integrates lexical semantics, it extracts the most representative keywords as the image's semantic annotation. The algorithm can effectively exploit large-scale training data with rich annotations, achieving better recall and precision than existing automatic image annotation methods; we validate the algorithm on the Corel data set.
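The joint-probability step can be sketched as a simple co-occurrence estimate; this is a minimal illustration, not the authors' exact model, and the region labels, words, and training triples below are invented:

```python
from collections import Counter
from itertools import product

# Toy training set: each image is (set of quantized region labels, set of words).
training = [
    ({"sky", "water"}, {"sea", "sky"}),
    ({"sky", "grass"}, {"field", "sky"}),
    ({"water", "sand"}, {"beach", "sea"}),
]

# Estimate the joint probability P(w, b) of word w and region b by counting
# co-occurrences across the training set.
joint = Counter()
for regions, words in training:
    for b, w in product(regions, words):
        joint[(w, b)] += 1
total = sum(joint.values())

def annotate(regions, top_k=2):
    """Score each word by summing P(w, b) over the regions of the new image,
    then keep the top-k words as the annotation."""
    scores = Counter()
    for (w, b), c in joint.items():
        if b in regions:
            scores[w] += c / total
    return [w for w, _ in scores.most_common(top_k)]

print(annotate({"sky", "water"}))
```

In the full method the scores would additionally be re-weighted by the keyword-relevance table before the top words are selected.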


2009 ◽  
Vol 11 (4) ◽  
pp. 47-73 ◽  
Author(s):  
Daniel Sonntag ◽  
Pinar Wennerberg ◽  
Paul Buitelaar ◽  
Sonja Zillner

In this chapter the authors describe the three pillars of ontology treatment in the medical domain in a comprehensive case study within the large-scale THESEUS MEDICO project. MEDICO addresses the need for advanced semantic technologies in medical image and patient data search. The objective is to enable a seamless integration of medical images and different user applications by providing direct access to image semantics. Semantic image retrieval should provide the basis for clinical decision support and computer-aided diagnosis. During the course of lymphoma diagnosis and continual treatment, image data is produced several times using different image modalities. After semantic annotation, the images need to be integrated with medical (textual) data repositories and ontologies. The authors build upon the three pillars of knowledge engineering, ontology mediation and alignment, and ontology population and learning to achieve the objectives of the MEDICO project.


Author(s):  
Paula M Mabee ◽  
Wasila M Dahdul ◽  
James P Balhoff ◽  
Hilmar Lapp ◽  
Prashanti Manda ◽  
...  

The study of how the observable features of organisms, i.e., their phenotypes, result from the complex interplay between genetics, development, and the environment is central to much research in biology. The varied language used in the description of phenotypes, however, impedes the large-scale and interdisciplinary analysis of phenotypes by computational methods. The Phenoscape project (www.phenoscape.org) has developed semantic annotation tools and a gene–phenotype knowledgebase, the Phenoscape KB, that uses machine reasoning to connect evolutionary phenotypes from the comparative literature to mutant phenotypes from model organisms. These semantically annotated data enable the linking of novel species phenotypes with the candidate genes that may underlie them. Semantic annotation of evolutionary phenotypes further enables previously difficult or novel analyses of comparative anatomy and evolution. These include generating large, synthetic character matrices of presence/absence phenotypes based on inference, and searching for taxa and genes with similar variation profiles using semantic similarity. Phenoscape is further extending these tools to enable users to automatically generate synthetic supermatrices for diverse character types, and to use the domain knowledge encoded in ontologies for evolutionary trait analysis. Curating the annotated phenotypes necessary for this research requires significant human curator effort, although semi-automated natural language processing tools promise to expedite the curation of free text. As semantic tools and methods are developed for the biodiversity sciences, new insights are anticipated from the increasingly connected stores of interoperable phenotypic and genetic data.
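The two analyses mentioned above, inferred presence/absence characters and semantic-similarity search, both reduce to reasoning over the ontology's subsumption hierarchy. A minimal sketch with a toy anatomy ontology and invented annotation profiles (not Phenoscape KB data, nor its actual similarity metric):

```python
# Toy anatomy ontology as a child -> parent map (invented for illustration).
PARENT = {"pectoral fin": "paired fin", "pelvic fin": "paired fin",
          "paired fin": "fin", "dorsal fin": "fin", "fin": "appendage"}

def ancestors(term):
    """Return the term plus all of its ontology ancestors (subsumption closure)."""
    out = {term}
    while term in PARENT:
        term = PARENT[term]
        out.add(term)
    return out

def closure(profile):
    """Expand an annotation profile to every term it implies."""
    return set().union(*(ancestors(t) for t in profile))

# Presence/absence inference: an annotation to 'pectoral fin' implies
# presence of every subsuming structure, e.g. 'fin' and 'appendage'.
assert "fin" in ancestors("pectoral fin")

def semantic_similarity(p1, p2):
    """Jaccard similarity of ancestor-expanded annotation profiles."""
    a, b = closure(p1), closure(p2)
    return len(a & b) / len(a | b)

taxon_profile = {"pectoral fin", "dorsal fin"}   # hypothetical taxon annotations
gene_profile = {"pelvic fin"}                    # hypothetical mutant-gene annotations
print(semantic_similarity(taxon_profile, gene_profile))
```

The profiles share no terms directly, yet score above zero because both imply paired-fin and fin structures, which is exactly why subsumption reasoning makes such searches possible.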


F1000Research ◽  
2017 ◽  
Vol 6 ◽  
pp. 686 ◽  
Author(s):  
Ferdinando Villa ◽  
Stefano Balbi ◽  
Ioannis N. Athanasiadis ◽  
Caterina Caracciolo

Correct and reliable linkage of independently produced information is a requirement for sophisticated applications and processing workflows. These can ultimately help address the challenges posed by complex systems (such as socio-ecological systems), whose many components can only be described through independently developed data and model products. We discuss the first outcomes of an investigation into the conceptual and methodological aspects of semantic annotation of data and models, aimed at enabling a high standard of interoperability of information. The results, operationalized in the context of a long-term, active, large-scale project on ecosystem services assessment, include: a definition of interoperability based on semantics and scale; a conceptual foundation for the phenomenology underlying scientific observations, aimed at guiding the practice of semantic annotation in domain communities; and a dedicated language and software infrastructure that operationalizes the findings and allows practitioners to reap the benefits of data and model interoperability. The work presented is the first detailed description of almost a decade of work with communities active in socio-ecological system modeling. After defining the boundaries of possible interoperability based on the understanding of scale, we discuss examples of the practical use of the findings to obtain consistent, interoperable and machine-ready semantic specifications that can integrate semantics across diverse domains and disciplines.
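The idea of interoperability defined over both semantics and scale can be illustrated with a small sketch; the concept URIs, resolutions, and compatibility threshold below are invented, and this is not the project's actual language or infrastructure:

```python
from dataclasses import dataclass

@dataclass
class Annotation:
    """Semantic annotation of a data or model product: what it observes,
    and at what scale. All values here are invented for illustration."""
    concept: str            # observed phenomenon, as an ontology concept
    spatial_res_m: float    # grid resolution in metres
    temporal_res_days: int  # time step in days

def interoperable(a: Annotation, b: Annotation, max_ratio=4.0):
    """Two products can be linked only if they observe the same concept and
    their spatial scales are close enough that resampling the finer one is
    defensible (a hypothetical threshold stands in for a real scale policy)."""
    if a.concept != b.concept:
        return False
    ratio = max(a.spatial_res_m, b.spatial_res_m) / min(a.spatial_res_m, b.spatial_res_m)
    return ratio <= max_ratio

rain_model = Annotation("ex:Precipitation", 1000.0, 1)
rain_obs = Annotation("ex:Precipitation", 250.0, 1)
lulc_map = Annotation("ex:LandCoverType", 30.0, 365)

print(interoperable(rain_model, rain_obs))  # same concept, modest resolution gap
print(interoperable(rain_model, lulc_map))  # different concepts: never linkable
```

The point of the sketch is the two-part test: semantic identity is necessary but not sufficient, because two datasets about the same phenomenon at wildly different scales still cannot be meaningfully combined.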


Author(s):  
DEJAN GJORGJEVIKJ ◽  
GJORGJI MADJAROV ◽  
SAŠO DŽEROSKI

Multi-label learning (MLL) problems abound in many areas, including text categorization, protein function classification, and semantic annotation of multimedia. An issue that severely limits the applicability of many current machine learning approaches to MLL is the large scale of such problems, which has a strong impact on the computational complexity of learning. This is especially pronounced for approaches that transform MLL problems into a set of binary classification problems, for which Support Vector Machines (SVMs) are used. On the other hand, the most efficient approaches to MLL, based on decision trees, have clearly lower predictive performance. We propose a hybrid decision tree architecture, where the leaves do not give multi-label predictions directly, but rather utilize local SVM-based classifiers that do. A binary relevance architecture is employed in the leaves, where a binary SVM classifier is built for each of the labels relevant to that particular leaf. We use a broad range of multi-label datasets with a variety of evaluation measures to evaluate the proposed method against related and state-of-the-art methods, in terms of both predictive performance and time complexity. On almost every large classification problem, our hybrid architecture outperforms the competing approaches in predictive performance, while its computational efficiency is significantly improved as a result of the integrated decision tree.
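The architecture, a tree that routes an instance to a leaf, where per-label binary classifiers make the multi-label prediction, can be miniaturized as follows. A single hand-picked split stands in for the learned tree and a nearest-centroid rule stands in for the per-label SVMs; the data, features, and labels are invented:

```python
from statistics import mean
from math import dist

# Toy training set: (feature vector, set of relevant labels).
train = [([0.1, 0.2], {"A"}), ([0.2, 0.8], {"A", "B"}), ([0.3, 0.9], {"B"}),
         ([0.9, 0.1], {"C"}), ([0.8, 0.7], {"C"}), ([0.6, 0.9], set())]

SPLIT = 0.5  # decision-tree test: x[0] <= SPLIT routes left, else right

def centroid(points):
    return [mean(c) for c in zip(*points)]

def fit_leaf(instances):
    """Binary relevance in a leaf: one binary classifier per label that is
    relevant there (here, a positive/negative centroid pair per label)."""
    leaf = {}
    for lab in set().union(*(y for _, y in instances)):
        pos = [x for x, y in instances if lab in y]
        neg = [x for x, y in instances if lab not in y]
        if pos and neg:
            leaf[lab] = (centroid(pos), centroid(neg))
    return leaf

leaves = {side: fit_leaf([(x, y) for x, y in train
                          if (x[0] <= SPLIT) == (side == "L")])
          for side in ("L", "R")}

def predict(x):
    """Route to a leaf, then emit every label whose local classifier fires."""
    leaf = leaves["L" if x[0] <= SPLIT else "R"]
    return {lab for lab, (p, n) in leaf.items() if dist(x, p) < dist(x, n)}

print(predict([0.15, 0.3]), predict([0.85, 0.5]))
```

The efficiency argument carries over directly: each leaf only ever trains classifiers for the labels actually seen there, so the expensive per-label work is confined to small, homogeneous subsets of the data.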


2013 ◽  
Vol 07 (03) ◽  
pp. 257-290 ◽  
Author(s):  
KE HAO ◽  
PHILLIP C-Y SHEU ◽  
HIROSHI YAMAGUCHI

This paper addresses semantic search of Web services using natural language processing. First we survey various existing approaches, focusing on the fact that the high cost of current semantic annotation frameworks results in limited use of semantic search in large-scale applications. We then propose a service search framework based on the vector space model that combines the traditional frequency-weighted term-document matrix with syntactic information extracted from a lexical database and a dependency grammar parser. In particular, instead of using terms as the rows of the term-document matrix, we propose using synsets from WordNet, both to distinguish different meanings of a word in different contexts and to cluster different words with similar meanings. Based on the characteristics of Web service descriptions, we also propose an approach to identifying semantically important terms and adjusting their weights. Our experiments show that our approach achieves its goal well.
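The synset-row idea can be sketched as follows; the synset map is a toy stand-in for WordNet lookup, and the plain frequency weighting omits the paper's dependency-grammar and term-importance components:

```python
from math import sqrt

# Toy synonym map standing in for WordNet: synonyms ("car", "auto") collapse
# onto one synset row, so they match even with zero surface-term overlap.
SYNSET = {"car": "vehicle.n.01", "auto": "vehicle.n.01",
          "automobile": "vehicle.n.01", "rent": "rent.v.01",
          "hire": "rent.v.01", "price": "price.n.01"}

def vectorize(text):
    """Frequency-weighted synset vector for one service description."""
    vec = {}
    for term in text.lower().split():
        syn = SYNSET.get(term, term)   # unknown terms keep their surface form
        vec[syn] = vec.get(syn, 0) + 1
    return vec

def cosine(a, b):
    """Cosine similarity between two sparse vectors (dicts)."""
    dot = sum(a[k] * b.get(k, 0) for k in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

service = vectorize("rent a car check car price")
query = vectorize("hire an automobile")
print(round(cosine(query, service), 3))
```

A purely term-based matrix would score this query and description as unrelated; sharing synset rows is what recovers the match.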

