Semantic Network Analysis Pipeline—Interactive Text Mining Framework for Exploration of Semantic Flows in Large Corpus of Text

2019 ◽ Vol 9 (24) ◽ pp. 5302
Author(s): Martin Cenek ◽ Rowan Bulkow ◽ Eric Pak ◽ Levi Oyster ◽ Boyd Ching ◽ ...

Historical topic modeling and semantic concept exploration in a large corpus of unstructured text remains a hard, open problem. Despite advances in natural language processing tools, statistical linguistic models, graph theory, and visualization, there is no framework that combines these piecewise tools under one roof. We designed and constructed the Semantic Network Analysis Pipeline (SNAP), available as an open-source web service, which implements the workflow a data scientist needs to explore historical semantic concepts in a text corpus. We define a graph-theoretic notion of a semantic concept as a flow of closely related tokens through the corpus of text. The modular workflow pipeline processes text using natural language processing tools, applies statistical content narrowing, creates semantic networks from lexical token chaining, performs social network analysis of the token networks, and creates a 3D visualization of the semantic concept flows through the corpus for interactive concept exploration. Finally, we illustrate the framework’s utility by extracting information from three text corpora: Herman Melville’s novel Moby Dick, the transcript of the 2015–2016 United States (U.S.) Senate Hearings on Environment and Public Works, and the Australian Broadcasting Corporation’s short news articles on rural and science topics.

2019
Author(s): Alexander P. Christensen ◽ Yoed Kenett

To date, the application of semantic network methodologies to study cognitive processes in psychological phenomena has been limited in scope. One barrier to broader application is the lack of resources for researchers unfamiliar with the approach. Another barrier, for both the unfamiliar and the knowledgeable researcher, is the tedious and laborious preprocessing of semantic data. In this article, we aim to minimize these barriers by offering a comprehensive semantic network analysis pipeline (preprocessing, estimating, and analyzing networks) and an associated R tutorial that uses a suite of R packages to accommodate this pipeline. Two of these packages, SemNetDictionaries and SemNetCleaner, promote an efficient, reproducible, and transparent approach to preprocessing verbal fluency data. The third package, SemNeT, provides methods and measures for analyzing and statistically comparing semantic networks via a point-and-click graphical user interface. Using real-world data, we present a start-to-finish pipeline from raw data to semantic network analysis results. This article aims to provide resources for researchers, both unfamiliar and knowledgeable, that reduce some of the barriers to conducting semantic network analysis.


2016 ◽ Vol 5 (2) ◽ pp. 27-41
Author(s): Kyung Sik Kim ◽ Bo Ram Hyun ◽ Byung Kook Lee ◽ Mi Ran Jang
