Term-Community-Based Topic Detection with Variable Resolution

Information ◽  
2021 ◽  
Vol 12 (6) ◽  
pp. 221
Author(s):  
Andreas Hamm ◽  
Simon Odrowski

Network-based procedures for topic detection in huge text collections offer an intuitive alternative to probabilistic topic models. We present in detail a method that is especially designed with the requirements of domain experts in mind. Like similar methods, it employs community detection in term co-occurrence graphs, but it is enhanced by including a resolution parameter that can be used for changing the targeted topic granularity. We also establish a term ranking and use semantic word-embedding for presenting term communities in a way that facilitates their interpretation. We demonstrate the application of our method with a widely used corpus of general news articles and show the results of detailed social-sciences expert evaluations of detected topics at various resolutions. A comparison with topics detected by Latent Dirichlet Allocation is also included. Finally, we discuss factors that influence topic interpretation.
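
As an illustration of the first step the abstract describes, a term co-occurrence graph can be built from a tokenized corpus. The following is a minimal sketch, not the authors' implementation: the function name `cooccurrence_graph`, the choice of whole documents as the co-occurrence window, the whitespace tokenizer, and the toy corpus are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_graph(docs, min_weight=1):
    """Return {(term_a, term_b): weight} for an undirected term graph.

    The weight of an edge is the number of documents in which both terms
    appear. Each pair is stored once, with the terms in alphabetical order.
    """
    edges = Counter()
    for doc in docs:
        # Hypothetical tokenizer: lowercase and split on whitespace.
        terms = sorted(set(doc.lower().split()))
        for a, b in combinations(terms, 2):
            edges[(a, b)] += 1
    # Drop weak edges below the (assumed) minimum weight.
    return {pair: w for pair, w in edges.items() if w >= min_weight}


# Toy corpus, invented for this sketch.
docs = [
    "election vote party",
    "election party campaign",
    "vaccine virus health",
    "virus health hospital",
]
graph = cooccurrence_graph(docs)
```

Community detection with a resolution parameter would then run on these weighted edges. Which algorithm, term weighting, and ranking the authors use is described in the paper itself, not in this sketch.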

Social Change ◽  
2021 ◽  
Vol 51 (4) ◽  
pp. 475-482
Author(s):  
Zoya Hasan

The recent spread of the delta variant of COVID-19 in many countries, though uneven, has once again set alarm bells ringing throughout the world. Nearly two years have passed since the onset of this pandemic: vaccines have been developed and vaccination is underway, but the end of the campaign against the pandemic is nowhere in sight. This drive has merely attempted to adjust and readjust, with or without success, to the various fresh challenges that have kept emerging from time to time. Both the pandemic’s persistence and its handling by governments have had implications for citizens’/peoples’ rights as well as for the systems that were in place before the pandemic. In this symposium, domain experts investigate, with a sharp focus on India, the interface between the COVID-19 pandemic and democracy, health, education and social sciences. These contributions are notable for their nuanced and insightful examination of the impact of the pandemic on crucial social development issues, with special attention to the exacerbated plight of society’s marginalised sections. In India, as in several other countries, the COVID-19 pandemic has affected democracy. The health crisis came at a moment when India was already experiencing democratic backsliding. The pandemic came in handy in imposing greater restrictions on democratic rights, public discussion and political opposition. This note provides an analysis and commentary on how the government’s response to the COVID-19 pandemic impacted governance, at times undermining human rights and democratic processes, and posing a range of new challenges to democracy.


2021 ◽  
Author(s):  
Cheng Chen ◽  
Jesse Mullis ◽  
Beshoy Morkos

Risk management is vital to a product’s lifecycle. The current practice of reducing risks relies on domain experts or management tools to identify unexpected engineering changes, approaches that are prone to human error and laborious to carry out. This study presents a framework to contribute to requirements management by implementing a generative probabilistic model, the supervised latent Dirichlet allocation (LDA) with collapsed Gibbs sampling (CGS), to study the topic composition within three unlabeled and unstructured industrial requirements documents. As finding the preferred number of topics remains an open-ended question, a case study estimates an appropriate number of topics to represent each requirements document based on both perplexity and coherence values. Using human evaluations and interpretable visualizations, the results demonstrate different levels of design detail as the number of topics varies. Further, a relevance measurement provides the flexibility to improve the quality of topics. Designers can increase design efficiency by understanding, organizing, and analyzing high-volume requirements documents in confirmation management based on topics across different domains. With domain knowledge and purposeful interpretation of topics, designers can make informed decisions on product evolution and mitigate the risks of unexpected engineering changes.
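
The abstract does not say which coherence measure the study uses. As one hedged illustration of how a coherence value can be computed for a topic's top words, the sketch below implements the UMass coherence score, which sums log((D(w_m, w_l) + 1) / D(w_l)) over ordered word pairs, where D counts documents containing the given words. The function name `umass_coherence`, the whitespace tokenizer, and the toy corpus are assumptions made for the example.

```python
import math


def umass_coherence(top_words, docs):
    """UMass coherence of one topic, given its top words in ranked order.

    Assumes every word in top_words occurs in at least one document,
    otherwise the document frequency in the denominator would be zero.
    """
    # Hypothetical tokenizer: lowercase and split on whitespace.
    doc_sets = [set(doc.lower().split()) for doc in docs]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(all(w in s for w in words) for s in doc_sets)

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            joint = doc_freq(top_words[m], top_words[l])
            score += math.log((joint + 1) / doc_freq(top_words[l]))
    return score


# Toy corpus and topics, invented for this sketch.
docs = [
    "election vote party",
    "election party campaign",
    "vaccine virus health",
    "virus health hospital",
]
coherent = umass_coherence(["election", "party", "vote"], docs)
mixed = umass_coherence(["election", "virus", "vote"], docs)
```

Averaging such scores over all topics of a model, and repeating this for several candidate topic counts, gives one curve that can be read alongside perplexity when choosing the number of topics.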


MedEdPublish ◽  
2017 ◽  
Vol 6 (1) ◽  
Author(s):  
Leide Da Conceição Sanches ◽  
Leandro Rozin ◽  
Izabel Cristina Meister Martins Coelho ◽  
Patricia Helena Napolitano ◽  
Christiane Luiza Santos ◽  
...  

2019 ◽  
Vol 52 (9-10) ◽  
pp. 1289-1298 ◽  
Author(s):  
Lei Shi ◽  
Gang Cheng ◽  
Shang-ru Xie ◽  
Gang Xie

The aim of topic detection is to automatically identify events and hot topics in social networks and to continuously track known topics. Applying traditional methods such as Latent Dirichlet Allocation and Probabilistic Latent Semantic Analysis is difficult given the high dimensionality of massive event texts and the short-text sparsity problems of social networks. The sparse distribution of topics also leads to unclear topics. To address these challenges, we propose a novel word embedding topic model, named Cbow Topic Model (CTM), that combines a topic model with the continuous bag-of-words (Cbow) word embedding method for topic detection and summary in social networks. We cluster similar words in the target social network text dataset by introducing the classic Cbow word vectorization method, which can effectively learn the internal relationships between words and reduce the dimensionality of the input texts. We employ the topic model to model short texts, which effectively weakens the sparsity problem of social network texts. To detect and summarize topics, we propose a topic detection method that leverages similarity computation for social networks. We collected a Sina microblog dataset to conduct various experiments. The experimental results demonstrate that the CTM method is superior to the existing topic model method.
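
For readers unfamiliar with the continuous bag-of-words idea mentioned above, the model learns to predict a word from the words around it. The sketch below shows only how such (context, target) training examples can be generated from a token sequence. It is not the CTM method itself, and the function name `cbow_pairs`, the window size, and the sample tokens are assumptions made for the example.

```python
def cbow_pairs(tokens, window=2):
    """Return (context_words, target_word) examples for CBOW training.

    For each position, the context is up to `window` tokens on each side
    of the target. Positions with no context at all are skipped.
    """
    pairs = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        context = left + right
        if context:
            pairs.append((context, target))
    return pairs


# Sample tokens, invented for this sketch.
tokens = ["hot", "topic", "in", "microblog"]
examples = cbow_pairs(tokens, window=1)
```

A CBOW model would average the embeddings of each context and train them to predict the corresponding target, so that words used in similar contexts end up with similar vectors.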


2012 ◽  
Vol 22 (07) ◽  
pp. 1250171 ◽  
Author(s):  
CLARA GRANELL ◽  
SERGIO GÓMEZ ◽  
ALEX ARENAS

The analysis of the modular structure of networks is a major challenge in complex networks theory. The validity of the modular structure obtained is essential to confront the problem of the topology-functionality relationship. Recently, several authors have worked on the resolution limit of different community detection algorithms, which makes it impossible to detect natural modules when very different topological scales coexist in the network. Existing multiresolution methods are no panacea: they also fail in extreme situations. Here, we present a new hierarchical multiresolution scheme that works even when the network decomposition is very close to the resolution limit. The idea is to split the multiresolution method for optimal subgraphs of the network, focusing the analysis on each part independently. We also propose a new algorithm to reduce the computational cost of screening the mesoscale in search of the resolution parameter that best splits every subgraph. The hierarchical algorithm is able to solve a difficult benchmark proposed in [Lancichinetti & Fortunato, 2011], encouraging the further analysis of hierarchical methods based on the modularity quality function.
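
The resolution limit discussed above can be reproduced on a small synthetic graph. The sketch below is an illustration under stated assumptions, not the hierarchical scheme of the paper: it uses the common gamma-scaled form of modularity, Q = sum over communities of (internal edges / m) - gamma * (community degree / 2m)^2, as a stand-in resolution parameter, and a ring of 30 five-node cliques as the test graph. The helper names `ring_of_cliques` and `modularity` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def ring_of_cliques(n_cliques, clique_size):
    """Edges of a ring of cliques; nodes are (clique_index, member_index)."""
    edges = []
    for c in range(n_cliques):
        # Fully connect the members of clique c.
        for i, j in combinations(range(clique_size), 2):
            edges.append(((c, i), (c, j)))
        # One bridging edge to the next clique in the ring.
        nxt = (c + 1) % n_cliques
        edges.append(((c, 0), (nxt, clique_size - 1)))
    return edges


def modularity(edges, community, gamma=1.0):
    """Gamma-scaled modularity of a partition of an unweighted graph."""
    m = len(edges)
    internal = Counter()  # edges with both ends in the same community
    degree = Counter()    # total degree per community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - gamma * (degree[c] / (2 * m)) ** 2
               for c in degree)


edges = ring_of_cliques(30, 5)
nodes = {node for edge in edges for node in edge}
single = {node: node[0] for node in nodes}      # one community per clique
pairs = {node: node[0] // 2 for node in nodes}  # adjacent cliques merged

for gamma in (1.0, 2.0):
    print(gamma, modularity(edges, single, gamma), modularity(edges, pairs, gamma))
```

At gamma = 1 the partition that merges adjacent cliques scores higher than the one that keeps each clique separate, even though the cliques are the natural modules. At gamma = 2 the ordering reverses. A single global resolution value therefore has to be tuned to the scale of the modules, which is the difficulty that multiresolution and hierarchical schemes try to address.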


Author(s):  
Sanghee Kim ◽  
Rob H. Bracewell ◽  
Ken M. Wallace

Textual documents are the most common way of storing and distributing information within organizations. Extracting useful information from large text collections is therefore the goal of every organization that would like to take advantage of the experience encapsulated in those texts. Entering data using a free text style is easy, as it does not require any special training. However, unstructured texts pose a major challenge for automatic extraction and retrieval systems. Generally, deep text analysis using advanced and complex linguistic processing is necessary, which involves computational linguistic experts and domain experts. Linguistic experts are rare in engineering organizations, which thus find it difficult to apply and exploit such advanced extraction techniques. It is therefore desirable to minimize the extensive involvement of linguistic experts by learning extraction patterns automatically from example texts. In doing so, the analysis of given texts is necessary in order to identify the scope and suitable automatic methods. Focusing on causality reasoning in the field of fault diagnosis, the results of experimenting with an automatic causality extraction method using shallow linguistic processing are presented.


2010 ◽  
Vol 24 (1) ◽  
pp. 56-63 ◽  
Author(s):  
Louis D. Burgio

In this article the author first attempts to disentangle a number of issues in translational science from a social science perspective. As expected in a fledgling field of study being approached from various disciplines, there are marked differences in the research literature on terminology, definition of terms, and conceptualization of staging of clinical research from the pilot phase to widespread dissemination in the community. The author asserts that translational efforts in the social sciences are at a crossroads, and their greatest challenge involves the movement of interventions gleaned from clinical trials to community settings. Four strategies for reaching this goal are discussed: the use of methods derived from health services research, a yet-to-be-developed strategy where decisions to modify aspects of an intervention derived from a clinical trial are triggered by data-based criteria, community-based participatory action research (CBPR), and a hybrid system wherein methods from CBPR and traditional experimental procedures are combined to achieve translation. The author ends on an optimistic note, emphasizing the impressive advances in the area over the existing barriers and calling for a unified interdisciplinary science of translation.


Author(s):  
Wei Wang ◽  
Payam M. Barnaghi ◽  
Andrzej Bargiela

The problem of learning concept hierarchies and terminological ontologies can be divided into two sub-tasks: concept extraction and relation learning. The authors of this chapter describe a novel approach to learning relations automatically from an unstructured text corpus based on probabilistic topic models. The authors provide a definition (Information Theory Principle for Concept Relationship) and a quantitative measure for establishing “broader” (or “narrower”) and “related” relations between concepts. They present a relation learning algorithm to automatically interconnect concepts into concept hierarchies and terminological ontologies with the probabilistic topic models learned. In this experiment, around 7,000 ontology statements expressed in terms of “broader” and “related” relations are generated using different combinations of model parameters. The ontology statements are evaluated by domain experts, and the results show that the highest precision of the learned ontologies is around 86.6% and that the structures of the learned ontologies remain stable when the values of the parameters are changed in the ontology learning algorithm.

