METHODS OF SEMANTIC ANALYSIS IN ANNOTATED GENERALIZATION OF TEXT DOCUMENTS

2017 ◽  
Vol 2 (1) ◽  
pp. 53-58
Author(s):  
Ocherklevich O. ◽  
Ihnatovych A.

The article is devoted to the use of semantic analysis in the generalization of text documents. The features of the most widespread methods for generalizing text documents are analysed, and the quality of their results is evaluated. Features of the improved method of annotative generalization of text documents, which uses the principles of latent semantic analysis and elements of fuzzy logic to identify semantically important sentences, are presented. It is proposed to use a new approach to evaluating the effectiveness of generalization, based on elements of fuzzy logic and a statistical indicator used to assess the importance of words in the context and class of the document, which makes it possible to determine the correspondence between the original document and its summary. The results of verifying the proposed tools are presented and confirm their effectiveness. Keywords: text document, annotation generalization, semantic analysis, fuzzy logic, evaluation, efficiency

Author(s):  
Ammar Kamal Abasi ◽  
Ahamad Tajudin Khader ◽  
Mohammed Azmi Al-Betar ◽  
Syibrah Naim ◽  
Mohammed A. Awadallah ◽  
...  

In this study, a multi-verse optimizer (MVO) is utilised for the text document clustering (TDC) problem. TDC is treated as a discrete optimization problem, and an objective function based on the Euclidean distance is applied as the similarity measure. TDC is tackled by the division of the documents into clusters; documents belonging to the same cluster are similar, whereas those belonging to different clusters are dissimilar. MVO, which is a recent metaheuristic optimization algorithm established for continuous optimization problems, can intelligently navigate different areas in the search space and search deeply in each area using a particular learning mechanism. The proposed algorithm is called MVOTDC, and it adopts the convergence behaviour of MVO operators to deal with discrete, rather than continuous, optimization problems. For evaluating MVOTDC, a comprehensive comparative study is conducted on six text document datasets with various numbers of documents and clusters. The quality of the final results is assessed using precision, recall, F-measure, entropy accuracy, and purity measures. Experimental results reveal that the proposed method performs competitively in comparison with state-of-the-art algorithms. Statistical analysis is also conducted and shows that MVOTDC can produce significant results in comparison with three well-established methods.
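The abstract does not give the exact objective function, so the following is only a minimal sketch of one common Euclidean-distance objective for document clustering: the average distance of documents to their cluster centroid. The function names, the toy two-dimensional document vectors, and the encoding of a candidate solution as a list of integer cluster labels are all assumptions made for illustration, not details taken from MVOTDC.

```python
import math

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def clustering_cost(docs, labels, k):
    """Mean document-to-centroid distance, averaged over non-empty clusters.

    `labels[i]` is the (discrete) cluster index assigned to `docs[i]`.
    Lower values mean tighter clusters. This is an assumed objective,
    not necessarily the one used in the paper.
    """
    total = 0.0
    used = 0
    for c in range(k):
        members = [d for d, lab in zip(docs, labels) if lab == c]
        if not members:
            continue  # empty clusters contribute nothing
        centre = centroid(members)
        total += sum(euclidean(d, centre) for d in members) / len(members)
        used += 1
    return total / used if used else float("inf")

# Toy document vectors (hypothetical): two obvious groups.
docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
good = clustering_cost(docs, [0, 0, 1, 1], k=2)  # groups similar documents
bad = clustering_cost(docs, [0, 1, 0, 1], k=2)   # mixes the two groups
print(good < bad)
```

Because a candidate solution here is a vector of discrete labels, a metaheuristic such as MVO would search over label assignments and use a cost like this to compare them.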


Author(s):  
Mónica Rebeca Franco Pombo ◽  
María del Mar Fernández Martínez ◽  
Antonio Luque de la Rosa ◽  
Rafaela Gutiérrez Cáceres

The public policy of Ecuador has placed the improvement of educational quality as one of the main objectives of government management. The climate-school constructs, organizational climate, work climate, and institutional climate, are used in different environments to accentuate the importance of the relationship established between a management environment factor and the quality of results in organizations. Throughout this study, the objective has been to analyze the variations in the relationship among the actors, and the levels of trust between the actors of the school organizational climate, according to the socioeconomic level, the role played in the educational community, or the type of socio-educational center. The study follows a new approach of an experimental, descriptive-comparative investigation, using techniques such as the survey, the interview, and the discussion group, adjusting to the mixed methodology according to the objectives and sense of the research proposed. In consideration of the results, we will appreciate that the relationships between the actors and the confidence levels of the actors show mostly positive indicators based on the responses of the participants in the study. However, the percentage of negative perceptions (24.34%) is a factor to consider, since it might suggest that these perceptions underlie indicators of distrust that should be taken into account for any future interventions. In conclusion, the schools participating in this study have built a mostly positive school organizational climate, which generates favorable spaces for innovation and change processes.


Informatica ◽  
2022 ◽  
pp. 1-22
Author(s):  
Pavel Stefanovič ◽  
Olga Kurasova

In this paper, a new approach has been proposed for multi-label text data class verification and adjustment. The approach helps to make semi-automated revisions of class assignments to improve the quality of the data. The data quality significantly influences the accuracy of the created models, for example, in classification tasks. It can also be useful for other data analysis tasks. The proposed approach is based on the combination of the usage of the text similarity measure and two methods: latent semantic analysis and self-organizing map. First, the text data must be pre-processed by selecting various filters to clean the data from unnecessary and irrelevant information. Latent semantic analysis has been selected to reduce the dimensionality of the obtained vectors that correspond to each text from the analysed data. The cosine similarity distance has been used to determine which of the multi-label text data classes should be changed or adjusted. The self-organizing map has been selected as the key method to detect similarity between text data and make decisions for a new class assignment. The experimental investigation has been performed using the newly collected multi-label text data. Financial news data in the Lithuanian language have been collected from four public websites and classified by experts into ten classes manually. Various parameters of the methods have been analysed, and the influence on the final results has been estimated. The final results are validated by experts. The research proved that the proposed approach could be helpful to verify and adjust multi-label text data classes. 82% of the correct assignments are obtained when the data dimensionality is reduced to 40 using the latent semantic analysis, and the self-organizing map size is reduced from 40 to 5 by step 5.
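As a rough illustration of how a cosine similarity measure can flag a questionable class assignment, the sketch below compares a document vector with one representative vector per class and marks the document for review when its assigned class is not the most similar one. This is a simplified nearest-centroid stand-in, not the paper's pipeline: the self-organizing map step is omitted, the short vectors merely stand in for LSA-reduced document vectors, and the class names and function names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def suggest_class(doc_vec, class_centroids):
    """Return the class whose representative vector is most similar to doc_vec."""
    return max(
        class_centroids,
        key=lambda name: cosine_similarity(doc_vec, class_centroids[name]),
    )

# Hypothetical class representatives in a reduced 3-dimensional space.
centroids = {"banking": [0.9, 0.1, 0.0], "energy": [0.0, 0.2, 0.9]}
doc = [0.1, 0.3, 0.8]   # hypothetical reduced document vector
assigned = "banking"    # label currently stored with the document

suggested = suggest_class(doc, centroids)
needs_review = suggested != assigned  # candidate for expert re-checking
print(suggested, needs_review)
```

In a semi-automated workflow of the kind the abstract describes, a flag like `needs_review` would only nominate a document for an expert to confirm or reject.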


2020 ◽  
Vol 13 (3) ◽  
pp. 142-155
Author(s):  
Youness Madani ◽  
Mohammed Erritali ◽  
Jamaa Bengourram ◽  
Francoise Sailhan

Sentiment analysis has become an important field in scientific research in recent years. The goal is to extract opinions and sentiments from written text using artificial intelligence algorithms. In this article, we propose a new approach for classifying Twitter data into classes (positive, negative, and neutral). The proposed method is based on two approaches, a dictionary-based approach using the sentimental dictionary SentiWordNet, and an approach based on the fuzzy logic system (fuzzification, rule inference, and defuzzification). Experimental results show that our approach outperforms some other approaches in the literature and that by using the fuzzy logic we improve the quality of the classification.
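The three fuzzy stages named above (fuzzification, rule inference, defuzzification) can be sketched as follows. Everything specific here is an assumption for illustration: the inputs are taken to be a positive and a negative score in [0, 1] of the kind a sentiment lexicon might supply, and the linear membership functions, the three rules, and the ±0.25 decision thresholds are hypothetical choices, not the authors' system.

```python
def low(x):
    """Membership in the fuzzy set 'low' for a score in [0, 1]."""
    return max(0.0, min(1.0, 1.0 - x))

def high(x):
    """Membership in the fuzzy set 'high' for a score in [0, 1]."""
    return max(0.0, min(1.0, x))

def classify(pos_score, neg_score):
    """Classify a text from its positive/negative scores with a tiny fuzzy system."""
    # Rule inference (min acts as fuzzy AND, max as fuzzy OR).
    positive = min(high(pos_score), low(neg_score))
    negative = min(low(pos_score), high(neg_score))
    neutral = max(min(low(pos_score), low(neg_score)),
                  min(high(pos_score), high(neg_score)))

    # Defuzzification: weighted average of output singletons -1, 0, +1.
    strength = positive + negative + neutral
    if strength == 0.0:
        return "neutral", 0.0
    crisp = (positive * 1.0 + neutral * 0.0 + negative * -1.0) / strength

    # Hypothetical decision thresholds on the crisp value.
    if crisp > 0.25:
        return "positive", crisp
    if crisp < -0.25:
        return "negative", crisp
    return "neutral", crisp

print(classify(0.8, 0.1)[0])
print(classify(0.1, 0.7)[0])
print(classify(0.5, 0.5)[0])
```

Returning the crisp value alongside the label makes it easy to see how close a text sits to a class boundary.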


2020 ◽  
Vol 26 ◽  
pp. 3-9
Author(s):  
Enrico Fattinnanzi

Following the pandemic, investments in the construction sector are destined to reach exceptional levels, with noteworthy effects on economic production, all while providing a very real opportunity to carry out projects that could resolve a good many of the critical problems currently affecting the towns, cities and surrounding territory of Italy: problems such as accessibility, the distribution of public resources and services, and seismic and hydrogeological safety; foremost among these are problems plaguing structural functions, as well as the lack of social and economic integration in outlying urban areas. And yet there is a risk that pursuit of these objectives could be seriously hampered by distressing levels of inefficiency in the overall organisation of the processes of intervention, as well as the unwarranted influence of special interests, including, in some cases, infiltration by organised crime. The resulting situation presents an unacceptable ratio of resources allocated to the quality of results. This paper illustrates why significantly different forms of governance should be adopted, together with a totally new, innovative approach to organising the processes of intervention. A reference found to be especially useful in formulating the considerations developed in the paper was a critical analysis of the results - ultimately judged to be positive - of the project to replace the Morandi Bridge in Genoa, though mention was also made of the factors that render a generalised application of this example to future investments unfeasible. Instead we have focussed on certain aspects that, as we see it, may prove useful in arriving at a new approach to managing project programs: first of all, the establishment, at the heart of any project, and especially those that are particularly demanding in terms of size and/or quality, of a strong management function that, being in possession of all the necessary powers and know-how, is responsible for successful performance of the work, meaning proper use of the available resources, compliance with deadlines and pursuit of all the objectives and standards of performance that gave rise to investment in the first place. To this end, planning quality is held to be of the utmost importance, and so planning decisions should be supported by effective evaluation procedures. And such procedures, the paper argues, prove all the more effective when they support for each and every one of the phases of decision-making involved in planning, covering governance of the entire process, up to and including the actual execution of the project.

You delight in laying down laws, Yet you delight more in breaking them. Like children playing by the ocean who build sand-towers with constancy and then destroy them with laughter... But while you build your sand-towers the ocean brings more sand to the shore. (From The Prophet, Khalil Gibran)


2012 ◽  
Vol 433-440 ◽  
pp. 1555-1560
Author(s):  
Dang En Xie ◽  
Hai Na Hu ◽  
Zhi Li Zhang

In this paper, we put forward an improved non-photorealistic rendering method for generating a colored pencil drawing from a digital image. First, to make sure the result retains the original color information, we use the original pixel value instead of the black dot which is generated by the traditional white-noise generating method. Second, we add a ratio to the Kirsch operator so that it suits different images with different levels of detail. Third, we present a new approach, which is extracted from the luminance of the original image, to determine the stroke orientation. Based on our methods, the quality of traditional pencil illustration can be guaranteed to a certain extent, and an effective and convenient tool is provided that lets even users without professional training generate drawings in the same style as those of artists and illustrators.
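For readers unfamiliar with the Kirsch operator, the sketch below shows the standard, unmodified form: eight 3x3 compass kernels, each a 45-degree rotation of the previous one, with the edge strength at a pixel taken as the maximum response. The ratio the authors add is not specified in the abstract and is therefore not modelled here; the function names and the list-of-lists grayscale image format are assumptions for illustration.

```python
# Base (north-facing) Kirsch kernel; the other seven are 45-degree rotations.
BASE = [[5, 5, 5],
        [-3, 0, -3],
        [-3, -3, -3]]

# Clockwise order of the eight outer cells of a 3x3 kernel.
RING = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]

def kirsch_kernels():
    """Build the eight compass kernels by rotating the outer ring of BASE."""
    values = [BASE[r][c] for r, c in RING]
    kernels = []
    for shift in range(8):
        kernel = [[0] * 3 for _ in range(3)]  # centre weight stays 0
        for i, (r, c) in enumerate(RING):
            kernel[r][c] = values[(i - shift) % 8]
        kernels.append(kernel)
    return kernels

def kirsch_edges(image):
    """Maximum kernel response at each interior pixel; border pixels are left at 0."""
    height, width = len(image), len(image[0])
    kernels = kirsch_kernels()
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            out[y][x] = max(
                sum(k[dy][dx] * image[y + dy - 1][x + dx - 1]
                    for dy in range(3) for dx in range(3))
                for k in kernels
            )
    return out

# A flat image has no edges; a vertical step produces a strong response.
flat = [[10] * 4 for _ in range(4)]
step = [[0, 0, 10, 10] for _ in range(4)]
print(kirsch_edges(flat)[1][1], kirsch_edges(step)[1][1])
```

Each kernel's weights sum to zero, which is why uniform regions give no response and only intensity changes are picked up.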


Author(s):  
Raja K. ◽  
Kanagavalli V. R. ◽  
Nizar Banu P. K. ◽  
Kannan K.

In recent decades, all the documents maintained by the industries are getting transformed into soft copies, either as structured documents or as e-copies. In text document processing, there are a number of ways available to extract the raw data. As the accuracy in finding the spatial data is crucial, this domain invites various research solutions that provide high accuracy. In this article, the Fuzzy Extraction, Resolving, and Clustering (FERC) architecture is proposed, which uses fuzzy logic techniques to identify and cluster uncertain textual spatial references. When the text corpus is queried with a spatial-keyword, FERC returns a set of relevant documents sorted in view of the fuzzy pertinence score. Any two documents may be compared in light of the spatial references that exist in them and their fuzzy similarity score is presented. This enables finding the degree to which the two documents speak about a specified location. The proposed architecture provides a better result set to the user, unlike a Boolean search where the document is either rated relevant or irrelevant.


2018 ◽  
Vol 16 (2) ◽  
pp. 107-119
Author(s):  
Supavit KONGWUDHIKUNAKORN ◽  
Kitsana WAIYAMAI

This paper presents a method for clustering short text documents, such as instant messages, SMS, or news headlines. Vocabularies in the texts are expanded using external knowledge sources and represented by a Distributed Word Representation. Clustering is done using the K-means algorithm with Word Mover's Distance as the distance metric. Experiments were done to compare the clustering quality of this method, and several leading methods, using large datasets from BBC headlines, SearchSnippets, StackExchange, and Twitter. For all datasets, the proposed algorithm produced document clusters with higher accuracy, precision, F1-score, and Adjusted Rand Index. We also observe that cluster description can be inferred from keywords represented in each cluster.


Author(s):  
Russell L. Steere ◽  
Eric F. Erbe ◽  
J. Michael Moseley

We have designed and built an electronic device which compares the resistance of a defined area of vacuum evaporated material with a variable resistor. When the two resistances are matched, the device automatically disconnects the primary side of the substrate transformer and stops further evaporation. This approach to controlled evaporation in conjunction with the modified guns and evaporation source permits reliably reproducible multiple Pt shadow films from a single Pt wrapped carbon point source. The reproducibility from consecutive C point sources is also reliable. Furthermore, the device we have developed permits us to select a predetermined resistance so that low contrast high-resolution shadows, heavy high contrast shadows, or any grade in between can be selected at will. The reproducibility and quality of results are demonstrated in Figures 1-4 which represent evaporations at various settings of the variable resistor.

