Processing Tools for Greek and Other Languages of the Christian Middle East

2018
Vol Special Issue on... (Project presentations)
Author(s):  
Bastien Kindt

This paper presents computer tools and linguistic resources of the GREgORI project. These developments allow automated processing of texts written in the main languages of the Christian Middle East, such as Greek, Arabic, Syriac, Armenian and Georgian. The main goal is to provide scholars with tools (lemmatized indexes and concordances) that make corpus-based linguistic information available. The paper focuses on questions of text processing, lemmatization, information retrieval, and bitext alignment.
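As a rough illustration of what a lemmatized index and concordance involve, the sketch below maps surface forms to lemmas through a toy lexicon and emits keyword-in-context lines. The transliterated Greek forms and the lexicon are invented for the example, not GREgORI's actual resources.

```python
from collections import defaultdict

# Toy lexicon mapping surface forms to lemmas (illustrative only).
LEXICON = {"logoi": "logos", "logon": "logos", "anthropou": "anthropos"}

def lemmatize(token):
    return LEXICON.get(token, token)

def build_index(tokens):
    """Map each lemma to the positions of its surface forms."""
    index = defaultdict(list)
    for i, tok in enumerate(tokens):
        index[lemmatize(tok)].append(i)
    return index

def concordance(tokens, lemma, window=2):
    """Return keyword-in-context lines for every occurrence of a lemma."""
    index = build_index(tokens)
    lines = []
    for i in index.get(lemma, []):
        left = " ".join(tokens[max(0, i - window):i])
        right = " ".join(tokens[i + 1:i + 1 + window])
        lines.append(f"{left} [{tokens[i]}] {right}".strip())
    return lines

tokens = ["hoi", "logoi", "tou", "anthropou", "kai", "ton", "logon"]
print(concordance(tokens, "logos"))
```

The index groups all inflected forms under one lemma, which is what makes a lemmatized concordance more useful than a plain word index for highly inflected languages.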

Author(s):  
Juncal Gutiérrez-Artacho
María-Dolores Olvera-Lobo

Within the sphere of the Web, the overload of information is more notable than in other contexts. Question answering systems (QAS) are presented as an alternative to traditional information retrieval (IR) systems, seeking to offer precise and understandable answers to factual questions instead of showing the user a list of documents related to a given search. Given that QAS represent a substantial advance in the improvement of IR, it becomes necessary to determine their effectiveness for the final user. With this aim, seven studies were undertaken to evaluate: a) in the first two, the linguistic resources and tools used in these systems for multilingual retrieval (Research 1, Research 2); and b) the performance and quality of the answers of the main monolingual and multilingual QAS of general and specialized domains on the Web in response to different types of questions and subjects, so that different evaluation means could be applied (Research 3, Research 4, Research 5, Research 6, Research 7).
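The user-oriented evaluations described above ultimately compare system answers against expected ones. A minimal sketch of such scoring, with invented questions and a simple exact-match criterion (not the studies' actual protocol), might look like:

```python
def score_answers(system_answers, gold_answers):
    """Fraction of questions whose answer matches the gold standard
    (case-insensitive exact match -- an assumed, simplistic criterion)."""
    correct = sum(1 for q, a in system_answers.items()
                  if a.strip().lower() == gold_answers[q].strip().lower())
    return correct / len(gold_answers)

# Invented evaluation data for illustration.
gold = {"q1": "Paris", "q2": "1492", "q3": "Cervantes"}
system = {"q1": "paris", "q2": "1493", "q3": "Cervantes"}
print(score_answers(system, gold))  # 2 of 3 answers match
```

Real QAS evaluations typically use more forgiving matching (answer patterns, human judgment) and metrics such as mean reciprocal rank rather than plain accuracy.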


Author(s):  
Bilel Elayeb
Ibrahim Bounhas
Oussama Ben Khiroun
Fabrice Evrard
Narjès Bellamine-BenSaoud

This paper presents a new possibilistic information retrieval system using semantic query expansion. The work investigates query expansion strategies based on external linguistic resources; in this case, the authors exploit the French dictionary “Le Grand Robert”. First, they model the dictionary as a graph and compute similarities between query terms by exploiting the circuits in the graph. Second, possibility theory is applied, taking advantage of a double relevance measure (possibility and necessity) between the articles of the dictionary and query terms. Third, these two approaches are combined using two different aggregation methods. The authors also benefit from an existing approach for reweighting query terms in the possibilistic matching model to improve the expansion process. To assess and compare the approaches, the authors performed experiments on the standard ‘LeMonde94’ test collection.
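The combination of possibility and necessity scores can be pictured with a small sketch. The linear aggregation, the threshold, and the candidate scores below are illustrative assumptions, not the paper's actual measures derived from “Le Grand Robert”.

```python
def aggregate(pi, n, alpha=0.5):
    """One common way to fuse a possibility score (pi) and a necessity
    score (n) into a single relevance value: a weighted average."""
    return alpha * pi + (1 - alpha) * n

def expand(query_terms, candidates, threshold=0.6):
    """Keep candidate expansion terms whose aggregated relevance to the
    query passes a threshold (threshold value is an assumption)."""
    expanded = list(query_terms)
    for term, (pi, n) in candidates.items():
        if aggregate(pi, n) >= threshold:
            expanded.append(term)
    return expanded

# Invented (possibility, necessity) scores for French candidate terms.
candidates = {"automobile": (0.9, 0.7), "vehicule": (0.8, 0.5), "roue": (0.4, 0.1)}
print(expand(["voiture"], candidates))
```

Necessity is the more demanding of the two measures, so requiring both to contribute (rather than possibility alone) filters out weakly related terms like "roue" above.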



Author(s):  
Francisco M. Couto
Mário J. Silva
Vivian Lee
Emily Dimmer
Evelyn Camon
...  

Molecular biology research projects have produced vast amounts of data, part of which has been preserved in a variety of public databases. However, a large portion of the data contains a significant number of errors and therefore requires careful verification by curators, a painful and costly task, before valid conclusions can be derived from it. On the other hand, research in biomedical information retrieval and information extraction is now delivering text mining solutions that can help curators work more efficiently and deliver better data resources. Over the past decades, automatic text processing systems have successfully exploited the biomedical scientific literature to reduce researchers’ effort to keep up to date, but many of these systems still rely on domain knowledge that is integrated manually, leading to unnecessary overheads and restrictions on their use. A more efficient approach would acquire the domain knowledge automatically from publicly available biological sources, such as BioOntologies, rather than using manually inserted domain knowledge. An example of this approach is GOAnnotator, a tool that assists the verification of uncurated protein annotations. It provided curators with correct evidence text at 93% precision, a promising result. GOAnnotator was implemented as a web tool that is freely available at http://xldb.di.fc.ul.pt/rebil/tools/goa/.
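The evidence-verification step can be sketched as a simple co-mention search: for each uncurated (protein, GO term) pair, collect the literature sentences that mention both, for a curator to confirm. The protein names, term strings, and substring-matching rule below are illustrative assumptions; GOAnnotator's real matching relies on similarity measures over BioOntologies rather than literal string lookup.

```python
def find_evidence(annotations, sentences):
    """Map each (protein, GO term) pair to sentences mentioning both.
    Naive case-insensitive substring matching, for illustration only."""
    evidence = {}
    for protein, go_term in annotations:
        hits = [s for s in sentences
                if protein.lower() in s.lower() and go_term.lower() in s.lower()]
        evidence[(protein, go_term)] = hits
    return evidence

# Invented sentences and annotations.
sentences = [
    "P53 is involved in apoptosis regulation.",
    "BRCA1 localizes to the nucleus.",
]
ev = find_evidence([("P53", "apoptosis"), ("BRCA1", "DNA repair")], sentences)
print(ev)
```

Pairs with no supporting sentence (like the second one here) are exactly the cases a curator would flag as lacking evidence.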


1992
Vol 36 (4)
pp. 356-360
Author(s):  
Cortney G. Vargo
Clifford E. Brown
Sarah J. Swierenga

This study was designed to investigate whether computer-supported backtracking tools reduced navigation time over manual backtracking and to compare navigation times among a subset of four backtracking tools. Each tool was evaluated in the context of an experimental, hierarchical, direct-manipulation database. Trials consisted of an information retrieval task requiring subjects to answer multiple-choice questions about the contents of the database. The independent variables included the backtracking tool and the backtrack navigation task length. The dependent measures included navigation time, the frequency with which the computer tool was selected and used over manual backtracking (a table of contents), and questionnaire responses. Backtracking with any of the four computer-supported tools resulted in significantly reduced navigation time over manual backtracking using the table of contents. When provided with a history list, subjects had significantly shorter navigation times when backtracking at the higher of two levels in the database hierarchy. There were no differences between computer tools in rated efficiency, ease of use, or objective or subjective preference measures.
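A history list of the kind compared in the study can be modeled as a simple visited-node structure. This sketch (class name and truncation behavior are assumptions, not taken from the paper) shows the direct-jump backtracking that manual table-of-contents navigation lacks.

```python
class HistoryList:
    """Minimal history-list backtracking tool: records visited nodes and
    lets the user jump straight back to any earlier one."""

    def __init__(self):
        self._visited = []

    def visit(self, node):
        self._visited.append(node)

    def history(self):
        """Most recent first, as a history list typically displays."""
        return list(reversed(self._visited))

    def backtrack_to(self, node):
        """Jump to the most recent occurrence of an earlier node,
        discarding the entries visited after it."""
        i = len(self._visited) - 1 - self._visited[::-1].index(node)
        self._visited = self._visited[:i + 1]
        return node

h = HistoryList()
for n in ["Home", "Ch1", "Sec1.2"]:
    h.visit(n)
h.backtrack_to("Ch1")
print(h.history())
```

The single-operation jump is the plausible source of the time savings the study measured: manual backtracking requires re-navigating the hierarchy one level at a time.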


2020
Author(s):  
Rianto Rianto
Achmad Benny Mutiara
Eri Prasetyo Wibowo
Paulus Insap Santosa

Stemming has long been used in data pre-processing for information retrieval, aiming to reduce affixed words to their root forms. However, few stemming methods exist for non-formal Indonesian text processing. The existing stemming methods have high accuracy for formal Indonesian but low accuracy for non-formal Indonesian; thus, a stemming method with high accuracy for non-formal Indonesian classifier models remains an open challenge. This study introduces a new stemming method to solve problems in non-formal Indonesian text data pre-processing. Furthermore, it aims to provide comprehensive research on improving the accuracy of text classifier models by strengthening the stemming method. Using the Support Vector Machine algorithm, a text classifier model is developed and its accuracy checked. The experimental evaluation tested 550 Indonesian datasets using two different stemming methods. The results show that with the proposed stemming method the text classifier model achieves higher accuracy than with the existing method, with scores of 0.85 and 0.73, respectively. In the future, the proposed stemming method can be used to develop Indonesian text classifier models for various purposes, including text clustering, summarization, hate speech detection, and other text processing applications.
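As a rough sketch of the kind of pre-processing step involved, the toy stemmer below first normalizes a few non-formal (slang) Indonesian variants to formal forms and then strips common affixes. The slang map and affix lists are tiny illustrations, not the paper's proposed method.

```python
# Illustrative slang-to-formal normalization (a handful of examples).
SLANG = {"gak": "tidak", "udah": "sudah", "bgt": "banget"}
# Common Indonesian affixes, longest-first so "meng" wins over "me".
PREFIXES = ("meng", "mem", "men", "me", "ber", "di", "ter", "ke", "pe")
SUFFIXES = ("kan", "an", "i", "nya")

def stem(word):
    """Normalize a non-formal variant, then strip at most one prefix
    and one suffix, keeping at least a 3-letter stem."""
    word = SLANG.get(word, word)
    for p in PREFIXES:
        if word.startswith(p) and len(word) - len(p) >= 3:
            word = word[len(p):]
            break
    for s in SUFFIXES:
        if word.endswith(s) and len(word) - len(s) >= 3:
            word = word[:-len(s)]
            break
    return word

print([stem(w) for w in ["membaca", "gak", "makanan"]])
```

Handling slang before affix stripping is the essential difference between formal-only stemmers and one suited to non-formal text: "gak" carries no affixes at all, so a purely morphological stemmer would leave it unnormalized.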


Music is the combination of melody, linguistic information and the singer’s mental realm. As the popularity of music increases, the choice of songs also varies with listeners’ mental conditions, which range from supreme bliss to melancholy strain depending on the musical notes. Most people prefer songs that satisfy their current state of mind. Pragmatic analysis of music by computer is a difficult task, as emotion is very complex and camouflages the real situation. Hence, this paper attempts to classify songs based on musical features, which helps to classify emotion more easily. Music feature extraction is done using the Music Information Retrieval (MIR) toolbox. The dataset consists of 100 Hindi songs of 30-second clips; the emotions are then classified with the Naïve Bayes classification method using the Weka API.
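The classification step can be illustrated with a from-scratch Gaussian Naive Bayes over two invented audio features (tempo and brightness). The feature values and labels below are assumptions for the sketch; the actual pipeline extracts real features with the MIR toolbox and classifies via the Weka API.

```python
import math
from collections import defaultdict

def fit(samples):
    """samples: list of (feature_tuple, label).
    Returns per-class feature means, variances, and priors."""
    by_label = defaultdict(list)
    for x, y in samples:
        by_label[y].append(x)
    model = {}
    for y, xs in by_label.items():
        n = len(xs)
        means = [sum(col) / n for col in zip(*xs)]
        varis = [sum((v - m) ** 2 for v in col) / n + 1e-6  # smoothed
                 for col, m in zip(zip(*xs), means)]
        model[y] = (means, varis, n / len(samples))
    return model

def predict(model, x):
    """Pick the class maximizing the Gaussian log-likelihood plus log-prior."""
    def log_like(means, varis, prior):
        ll = math.log(prior)
        for v, m, s2 in zip(x, means, varis):
            ll += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return ll
    return max(model, key=lambda y: log_like(*model[y]))

# Invented training clips: (tempo in BPM, brightness in [0, 1]) -> emotion.
train = [((120, 0.8), "happy"), ((130, 0.9), "happy"),
         ((60, 0.2), "sad"), ((70, 0.3), "sad")]
model = fit(train)
print(predict(model, (125, 0.85)))
```

Naive Bayes assumes the features are conditionally independent given the emotion label, which is why it needs only per-class means and variances rather than a full covariance model.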

