Applying Clustering and Topic Modeling to Automatic Analysis of Citizens’ Comments in E-Government

Author(s):  
Gunay Y. Iskandarli ◽  

The paper proposes an approach to analyzing citizens' comments in e-government using topic modeling and clustering algorithms. The main purpose of the proposed approach is to determine which topics the citizens' comments written in the e-government environment are about and to improve the quality of e-services. In the proposed approach, citizens' comments are first clustered and then topics are extracted from each cluster; thus, we can determine which topics citizens discuss. However, the use of clustering and topic modeling methods raises some problems, including the size of the document vectors and the assignment of semantically related documents to different clusters. To address this, the approach uses the semantic similarity of words to reduce the vector size: from each group of semantically similar words, only one is kept and the others are discarded, so the size of the vector is reduced. The documents are then clustered and topics are extracted from each cluster. The proposed method can significantly reduce the size of a large set of documents, save time spent analyzing this data, and improve the quality of the clustering and LDA algorithms.
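The pipeline described above can be sketched as follows: merge semantically similar words to shrink the vector space, cluster the comments, then run LDA inside each cluster. This is a minimal illustration, not the authors' implementation; the comments and the synonym map are toy stand-ins for a real corpus and a real semantic-similarity resource such as WordNet or word embeddings.

```python
# Sketch: synonym merging -> clustering -> per-cluster LDA.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.cluster import KMeans
from sklearn.decomposition import LatentDirichletAllocation

comments = [
    "the road has potholes and needs repair",
    "please fix the damaged street surface",
    "garbage collection was late this week",
    "trash pickup did not happen on time",
]

# Toy semantic merge: keep one representative word per synonym group,
# discarding the others, which shrinks the vocabulary (vector size).
synonyms = {"street": "road", "trash": "garbage", "pickup": "collection"}
normalized = [" ".join(synonyms.get(w, w) for w in c.split()) for c in comments]

X = CountVectorizer(stop_words="english").fit_transform(normalized)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Extract topics separately inside each cluster.
for k in range(2):
    cluster_docs = X[labels == k]
    lda = LatentDirichletAllocation(n_components=1, random_state=0).fit(cluster_docs)
    print("cluster", k, "topic-word matrix shape:", lda.components_.shape)
```

With real data, the synonym map would be built automatically from a word-similarity measure rather than hand-written.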

2021 ◽  
pp. 174569162096679
Author(s):  
Ivan Grahek ◽  
Mark Schaller ◽  
Jennifer L. Tackett

Discussions about the replicability of psychological studies have primarily focused on improving research methods and practices, with less attention paid to the role of well-specified theories in facilitating the production of reliable empirical results. The field is currently in need of clearly articulated steps to theory specification and development, particularly regarding frameworks that may generalize across different fields of psychology. Here we focus on two approaches to theory specification and development that are typically associated with distinct research traditions: computational modeling and construct validation. We outline the points of convergence and divergence between them to illuminate the anatomy of a scientific theory in psychology—what a well-specified theory should contain and how it should be interrogated and revised through iterative theory-development processes. We propose how these two approaches can be used in complementary ways to increase the quality of explanations and the precision of predictions offered by psychological theories.


2021 ◽  
Author(s):  
Vladimir Nikolaevich Azarov ◽  
◽  
Alexander Valentinovich Andreychikov ◽  
Valery Prokhorovich Mayboroda ◽  
Andrey Valentinovich Titov ◽  
...  

This textbook considers approaches to the design and methodology of applying information technologies and informatics tools to the digital transformation of the economy and to assuring the quality of products and services. It covers the conceptual foundations of infocommunication technologies, their areas of application, and the main trends in their development. A systems approach to the digital transformation of enterprises is examined, based on the methodology of engineering and re-engineering an organization's business processes when implementing IS and IT, and on the transition from organizational strategy to IT strategy and digital transformation. It also addresses building the organizational and IT infrastructure of industrial and financial-sector enterprises of various levels that are interested in an effectively functioning information-technology system for business support. Management and analysis tasks are presented together with the management information systems that support them and executive decision-support systems, and the sequence of enterprise informatization is considered. Trends in the development of standards for production management methods (MRP, MRP II, and ERP), management technologies (CRM and CSRP), and the corresponding systems are set out. Brief foundations of internet/intranet technology and its application in business management are given. The book also considers organizational issues of data and network security, the tasks and types of information-systems audit, and questions of methodology and digital project management.


2011 ◽  
pp. 24-32 ◽  
Author(s):  
Nicoleta Rogovschi ◽  
Mustapha Lebbah ◽  
Younès Bennani

Most traditional clustering algorithms are limited to handling data sets that contain either continuous or categorical variables. However, data sets with mixed types of variables are common in the data mining field. In this paper we introduce a weighted self-organizing map for clustering, analyzing, and visualizing mixed (continuous/binary) data. The weights and prototypes are learned simultaneously, ensuring optimized data clustering. The higher a variable's weight, the more the clustering algorithm takes into account the information carried by that variable. The learning of these topological maps is combined with a weighting process over the different variables, computing weights that influence the quality of the clustering. We illustrate the power of this method with data sets taken from a public data set repository: a handwritten digit data set, the Zoo data set, and three other mixed data sets. The results show good quality of the topological ordering and homogeneous clustering.
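The core idea of variable weighting can be illustrated with a bare 1-D self-organizing map whose distance function scales each variable by a weight, so high-weight variables influence unit assignment more. This is a toy sketch under assumed fixed weights, far simpler than the paper's model, which learns weights and prototypes simultaneously and handles mixed continuous/binary data.

```python
# Toy weighted SOM: per-variable weights enter the distance used to
# pick the best-matching unit, so weighted variables dominate clustering.
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((100, 3))          # 100 observations, 3 variables
weights = np.array([2.0, 1.0, 0.1])  # variable 0 dominates, variable 2 nearly ignored

n_units = 4
prototypes = rng.random((n_units, 3))
n_epochs = 20
for epoch in range(n_epochs):
    lr = 0.5 * (1 - epoch / n_epochs)  # decaying learning rate
    for x in data:
        # Best-matching unit under the weighted squared distance.
        d = np.sum(weights * (prototypes - x) ** 2, axis=1)
        bmu = int(np.argmin(d))
        # Move only the winning prototype toward the observation.
        prototypes[bmu] += lr * (x - prototypes[bmu])

print(prototypes.round(2))
```

A full SOM would also update the topological neighbors of the winner; that step is omitted here to keep the weighting idea in focus.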


2013 ◽  
Vol 5 (2) ◽  
pp. 136-143 ◽  
Author(s):  
Astha Mehra ◽  
Sanjay Kumar Dubey

In today’s world data is produced every day at a phenomenal rate, and we are required to store this ever-growing data on an almost daily basis. Although our ability to store this huge volume of data has grown, the problem arises when users expect sophisticated information from it. This can be achieved by uncovering the hidden information in the raw data, which is the purpose of data mining. Data mining, or knowledge discovery, is the computer-assisted process of digging through and analyzing enormous sets of data and then extracting their meaning. The raw and unlabeled data present in large databases can be classified initially in an unsupervised manner using cluster analysis. Clustering is the process of finding groups of objects such that the objects in a group are similar to one another and dissimilar from the objects in other groups; these groups are known as clusters. In other words, clustering organizes data objects into groups whose members share some similarity. Applications of clustering include marketing (finding groups of customers with similar behavior), biology (classifying plants and animals by their features), data analysis, earthquake studies (observing epicenters to identify dangerous zones), and the web (document classification). The outcome and efficiency of the clustering process are generally assessed through various clustering algorithms. The aim of this paper is to compare two important clustering algorithms, namely the centroid-based K-means and X-means. The performance of the algorithms is evaluated over different program executions on the same input dataset, and is analyzed and compared on the basis of the quality of the clustering outputs, the number of iterations, and the cut-off factors.
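The kind of comparison described above can be sketched by running K-means several times on the same data and recording the number of iterations and a clustering-quality score (inertia) per run. X-means, which additionally selects the number of clusters via a splitting criterion such as BIC, has no scikit-learn implementation, so this sketch covers only the K-means side; the two-blob dataset is an assumed toy input, not the paper's.

```python
# Repeated K-means runs on one dataset, comparing iterations and quality.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two well-separated Gaussian blobs as a toy input dataset.
data = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(8, 1, (50, 2))])

for seed in range(3):
    # n_init=1 so each execution reflects a single random initialization.
    km = KMeans(n_clusters=2, n_init=1, random_state=seed).fit(data)
    print(f"run {seed}: iterations={km.n_iter_}, inertia={km.inertia_:.1f}")
```

Lower inertia at convergence indicates tighter clusters; differing iteration counts across seeds show the sensitivity to initialization that such comparisons measure.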


2015 ◽  
Vol 16 (SE) ◽  
pp. 133-138
Author(s):  
Mohammad Eiman Jamnezhad ◽  
Reza Fattahi

Clustering is one of the most significant research areas in the field of data mining and is considered an important tool in the fast-developing information-explosion era. Clustering systems are used more and more often in text mining, especially for analyzing texts and extracting the knowledge they contain. Data are grouped into clusters in such a way that data in the same group are similar while those in other groups are dissimilar; clustering aims to maximize intra-class similarity and inter-class dissimilarity. Clustering is useful for obtaining interesting patterns and structures from a large set of data, and it can be applied in many areas, namely DNA analysis, marketing studies, web documents, and classification. This paper aims to study and compare three text-document clustering algorithms, namely k-means, k-medoids, and SOM, by means of the F-measure.
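A common form of the clustering F-measure used for such comparisons works as follows: for each true class, take the best F1 score over all clusters, then average weighted by class size. This is a minimal sketch of that metric under assumed toy labels, not the paper's evaluation code.

```python
# Class-weighted best-match F-measure for evaluating a clustering
# against reference labels.
import numpy as np

def clustering_f_measure(true_labels, cluster_labels):
    true_labels = np.asarray(true_labels)
    cluster_labels = np.asarray(cluster_labels)
    n = len(true_labels)
    total = 0.0
    for cls in np.unique(true_labels):
        in_cls = true_labels == cls
        best = 0.0
        for clu in np.unique(cluster_labels):
            in_clu = cluster_labels == clu
            overlap = np.sum(in_cls & in_clu)
            if overlap == 0:
                continue
            p = overlap / np.sum(in_clu)  # precision of cluster for class
            r = overlap / np.sum(in_cls)  # recall of cluster for class
            best = max(best, 2 * p * r / (p + r))
        # Weight each class's best F1 by its share of the data.
        total += np.sum(in_cls) / n * best
    return total

# A perfect clustering scores 1.0 regardless of how clusters are named.
print(clustering_f_measure([0, 0, 1, 1], [1, 1, 0, 0]))  # -> 1.0
```

Because the metric matches each class to its best cluster, it is invariant to cluster label permutation, which is what makes it suitable for comparing k-means, k-medoids, and SOM outputs.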


2021 ◽  
Vol 9 (3A) ◽  
Author(s):  
Adnan M. Shah ◽  
◽  
Xiangbin Yan ◽  
Samia tariq ◽  
Syed Asad A. Shah ◽  
...  

Emerging voices of patients, in the form of opinions and expectations about the quality of care, can improve healthcare service quality. A large volume of patients’ opinions is available online as online doctor reviews (ODRs) that can be accessed and analyzed to improve patients’ perceptions. This paper aims to explore COVID-19-related conversations, complaints, and sentiments using ODRs posted by users of a physician-rating website. We analyzed 96,234 ODRs of 5,621 physicians from a prominent health-rating website in the United Kingdom (Iwantgreatcare.org) in three time slices (from February 1 to October 31, 2020). We employed a machine learning approach, dynamic topic modeling, to identify prominent bigrams, salient topics and labels, sentiments embedded in reviews and topics, and patient-perceived root causes, together with a strengths, weaknesses, opportunities, and threats (SWOT) analysis for healthcare organizations. The method finds a total of 30 latent topics, with 10 topics in each time slice. The study identified new discussion topics about COVID-19 emerging from time slice 1 to time slice 3, such as news about the COVID-19 pandemic, resistance to the lockdown, the quarantine process and quarantine centers at different locations, and vaccine development and treatment to stop the spread of the virus. Sentiment analysis reveals that fear of the novel pathogen prevails across all topics. Based on the SWOT analysis, our findings offer guidance for doctors, hospitals, and government officials to enhance patient satisfaction and minimize dissatisfaction by meeting patients’ needs and improving the quality of care during the COVID-19 crisis.
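The "30 topics, 10 per time slice" setup can be sketched by fitting a separate topic model per slice, which is one simple way to approximate dynamic topic modeling. The review texts below are invented placeholders (the original analysis used 96,234 reviews split by posting date), and the per-slice topic count is reduced from 10 to 2 to keep the toy corpus viable.

```python
# Per-time-slice LDA as a minimal stand-in for dynamic topic modeling.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

slices = {
    "slice1": ["long wait for covid test", "doctor explained quarantine rules"],
    "slice2": ["lockdown delayed my appointment", "virtual consultation worked well"],
    "slice3": ["vaccine questions answered clearly", "staff followed safety protocol"],
}

n_topics_per_slice = 2  # the study used 10 per slice
models = {}
for name, docs in slices.items():
    X = CountVectorizer().fit_transform(docs)
    models[name] = LatentDirichletAllocation(
        n_components=n_topics_per_slice, random_state=0
    ).fit(X)

# 2 topics x 3 slices = 6 latent topics in this sketch (30 in the study).
total_topics = sum(m.n_components for m in models.values())
print("total latent topics:", total_topics)
```

Comparing the top words of corresponding topics across slices is what reveals newly emerging themes, such as the shift toward vaccine-related discussion in later slices.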


Author(s):  
M.O. Kaptakov

In this work, the mechanical properties of composite samples prepared with a conventional and a nanomodified matrix were studied. The thickness of the monolayers in the samples was 0.2 μm. Experiments showed that adding fullerene soot as a nanomodifier increased the mechanical properties of the samples along the reinforcement direction. At the same time, improved contact between the matrix and the fibers was observed in the samples with the nanomodifier: on the fracture surface, the nanomodified matrix envelops the fibers, while the conventional matrix completely exfoliates. The observed changes in composite strength can be associated, among other things, with a change in the level of residual stresses arising in the composites upon nanomodification. Analytical and numerical modeling methods are used to explain these effects.


2018 ◽  
Vol 111 (1) ◽  
pp. 97-112
Author(s):  
Tim vor der Brück

Abstract Rule-based natural language generation denotes the process of converting a semantic input structure into a surface representation by means of a grammar. In the following, we assume that this grammar is handcrafted rather than automatically created, for instance by a deep neural network. Such a grammar might comprise a large set of rules, and a single error in these rules can have a large impact on the quality of the generated sentences, potentially causing even a complete failure of the entire generation process. Searching for errors in these rules can be quite tedious and time-consuming due to potentially complex and recursive dependencies. This work proposes a statistical approach to recognizing errors and providing suggestions for correcting certain kinds of errors by cross-checking the grammar with the semantic input structure. The basic assumption is the correctness of the latter, which is usually a valid hypothesis because these input structures are often automatically created. Our evaluation reveals that in many cases an automatic error detection and correction is indeed possible.


Author(s):  
Besma Khalfi ◽  
Cyril De Runz ◽  
Herman Akdag

When analyzing spatial issues, geographers are often confronted with problems concerning the uncertainty of the available information. These problems may affect the geometric or semantic quality of objects, and as a result only low precision can be assumed. It is therefore necessary to develop representation and modeling methods suited to the imprecise nature of geographic data, which recently led to the proposal of F-Perceptory for modeling fuzzy geographic data. The model described in Zoghlami et al. (2011) has some limitations: F-Perceptory does not manage fuzzy composite geographic objects. This paper proposes to enhance the approach by managing this type of object in the modeling and in its transformation to UML. At the technical level, commonly used object-modeling tools do not take fuzzy data into account, so the authors propose new functional modules integrated into an existing CASE tool.

