A New Method for Identifying Key and Common Themes Based on Text Mining: An Example in the Field of Urban Expansion

2021 ◽  
Vol 2021 ◽  
pp. 1-14
Author(s):  
Yanwei Zhang ◽  
Xinhai Lu ◽  
Chaoran Lin ◽  
Feng Wu ◽  
Jinqiu Li

Urban land use is a core area of multidisciplinary research that involves geography, land science, and urban planning. With the rapid progress of global urbanization, urban expansion has become a research focus in recent years. Therefore, how to scientifically and accurately identify key and common themes in the urban expansion literature has become crucial for scientific research institutions in various countries. This paper proposes a new framework for identifying such themes based on an analysis of scientific literature and by using text mining and thematic evolutionary analysis. First, the latent Dirichlet allocation algorithm is used to capture the thematic clustering of scientific literature. Second, the key degree of the thematic node in the thematic evolution transfer network is used to represent the key feature of a theme, and the PageRank algorithm is employed to measure the critical score of this theme. When recognizing common themes, the common features of various themes are digitized and mapped to a specially selected quadratic function to measure the degree of commonness. Finally, the hidden Markov model is used to build a thematic prediction model. This method can efficiently identify key and common themes from the literature and provide theoretical and technical support for future research in related fields.
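
The key-theme scoring step can be pictured with a small sketch. The abstract does not describe how the thematic evolution transfer network is built or which PageRank settings are used, so the theme names, transfer weights, and the damping factor of 0.85 below are illustrative assumptions, not the authors' implementation. The code runs a plain power-iteration PageRank over a weighted directed graph in which an edge means "theme A evolves into theme B".

```python
def pagerank(edges, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank on a weighted directed graph.

    edges maps each source node to a dict {target: weight}.
    Returns a dict mapping every node to its score (scores sum to 1).
    """
    nodes = sorted(set(edges) | {t for targets in edges.values() for t in targets})
    n = len(nodes)
    # Total outgoing weight per node; nodes with zero outgoing weight are "dangling".
    out_weight = {v: sum(edges.get(v, {}).values()) for v in nodes}
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Score held by dangling nodes is redistributed uniformly.
        dangling = sum(rank[v] for v in nodes if out_weight[v] == 0)
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for src, targets in edges.items():
            if out_weight[src] == 0:
                continue
            for dst, w in targets.items():
                new[dst] += damping * rank[src] * w / out_weight[src]
        converged = sum(abs(new[v] - rank[v]) for v in nodes) < tol
        rank = new
        if converged:
            break
    return rank


# Hypothetical theme-transfer network (names and weights are made up).
transfers = {
    "land_use": {"sprawl": 3.0},
    "policy": {"sprawl": 2.0},
    "ecology": {"sprawl": 1.0},
    "sprawl": {"policy": 1.0},
}
scores = pagerank(transfers)
key_theme = max(scores, key=scores.get)
```

In this toy network the theme that receives the most incoming transfers ends up with the highest score, which matches the intuition that a key theme is one that many other themes evolve into.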

2018 ◽  
Vol 10 (10) ◽  
pp. 3522 ◽  
Author(s):  
Sung-Ho Shin ◽  
Oh Kyoung Kwon ◽  
Xiao Ruan ◽  
Prem Chhetri ◽  
Paul Tae-Woo Lee ◽  
...  

Since the world’s first Earth Summit in Rio de Janeiro in 1992, sustainability has become a focal point of significant debate for industry, government, and international organizations. As a result, research on the sustainability of maritime logistics is on the rise, yet it remains fragmented in terms of conceptual development, empirical testing and validation, and theory building. The aim of this paper is therefore two-fold: first, to present a literature review of key journal articles in the field of maritime studies published between 1993 and 2017 using topic modelling; and second, to provide future research directions with respect to major topics, themes and co-authorship patterns. Sustainability issues are mapped and consolidated by applying a generative probabilistic text-mining technique, latent Dirichlet allocation (LDA), to discover latent structure and relationships in the text document data. Moreover, bibliometric analysis is conducted to visualize the landscape of sustainability research. Based on the results, a new intellectual structure of sustainability research is created, the underlying themes are identified, key trends and patterns are extracted and future research development trajectories are mapped for the field of maritime studies.


2021 ◽  
Vol 27 (3) ◽  
pp. 200-213
Author(s):  
Yuen Chi Phang ◽  
Azleena Mohd Kassim ◽  
Ernest Mangantig

Objectives: The main aim of this study was to use text mining on social media to analyze information and gain insight into the health-related concerns of thalassemia patients, thalassemia carriers, and their caregivers. Methods: Posts from two Facebook groups whose members consisted of thalassemia patients, thalassemia carriers, and caregivers in Malaysia were extracted using the Data Miner tool. In this study, a new framework known as Malay-English social media text pre-processing was proposed for pre-processing the noisy mixed-language (Malay-English) text of social media posts. Topic modeling was used to identify hidden topics within posts shared among members. Three different topic models (latent Dirichlet allocation (LDA) in GenSim, LDA in MALLET, and latent semantic analysis) were applied to the dataset with and without stemming using Python. Results: LDA in MALLET without stemming was found to be the best topic model for this dataset. Eight topics were identified within the posts shared by members. Of those eight topics, four were newly discovered by this study, and four others corresponded to the findings of previous studies that used an interview approach. Conclusions: Topic 2 (the challenges faced by thalassemia patients) was found to be the topic with the highest attention and engagement. Healthcare practitioners and other concerned parties should make an effort to build a stronger support system related to this issue for those affected by thalassemia.


2019 ◽  
Vol 53 (1) ◽  
pp. 79-83
Author(s):  
Kim Quaile Hill

A growing body of research investigates the factors that enhance the research productivity and creativity of political scientists. This work provides a foundation for future research, but it has not addressed some of the most promising causal hypotheses in the general scientific literature on this topic. This article explicates the latter hypotheses, a typology of scientific career paths that distinguishes how scientific careers vary over time with respect to creative ambitions and achievements, and a research agenda based on the preceding components for investigation of the publication success of political scientists.


2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Chris Bauer ◽  
Ralf Herwig ◽  
Matthias Lienhard ◽  
Paul Prasse ◽  
Tobias Scheffer ◽  
...  

Background: There is a huge body of scientific literature describing the relation between tumor types and anti-cancer drugs. The vast amount of scientific literature makes it impossible for researchers and physicians to extract all relevant information manually. Methods: In order to cope with the large amount of literature, we applied an automated text mining approach to assess the relations between the 30 most frequent cancer types and 270 anti-cancer drugs. We applied two different approaches: classical text mining based on named entity recognition, and an AI-based approach employing word embeddings. The consistency of the literature mining results was validated with three independent methods: first, using data from FDA approvals; second, using experimentally measured IC-50 cell line data; and third, using clinical patient survival data. Results: We demonstrated that the automated text mining was able to successfully assess the relation between cancer types and anti-cancer drugs. All validation methods showed a good correspondence between the results from literature mining and independent confirmatory approaches. The relations between the most frequent cancer types and the drugs employed for their treatment were visualized in a large heatmap. All results are accessible in an interactive web-based knowledge base using the following link: https://knowledgebase.microdiscovery.de/heatmap. Conclusions: Our approach is able to assess the relations between compounds and cancer types in an automated manner. Both cancer types and compounds could be grouped into different clusters. Researchers can use the interactive knowledge base to inspect the presented results and follow their own research questions, for example the identification of novel indication areas for known drugs.
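
As a rough illustration of the classical, entity-based route, the sketch below counts how often a cancer type and a drug are mentioned in the same sentence. It is a simplification under stated assumptions: plain dictionary matching stands in for the named entity recognition used in the study, and the two small lexicons and the four sentences are made-up toy inputs, not data from the paper. A table of such pair counts is the kind of input a cancer-by-drug heatmap could be drawn from.

```python
import re
from collections import Counter
from itertools import product

# Toy lexicons (hypothetical; the study covers 30 cancer types and 270 drugs).
CANCERS = {"breast cancer", "leukemia"}
DRUGS = {"tamoxifen", "imatinib"}


def find_terms(sentence, lexicon):
    """Return the lexicon terms that occur as whole words in the sentence."""
    text = sentence.lower()
    return {
        term for term in lexicon
        if re.search(r"\b" + re.escape(term) + r"\b", text)
    }


def cooccurrence(sentences):
    """Count (cancer, drug) pairs mentioned together in one sentence."""
    counts = Counter()
    for sentence in sentences:
        cancers = find_terms(sentence, CANCERS)
        drugs = find_terms(sentence, DRUGS)
        for pair in product(cancers, drugs):
            counts[pair] += 1
    return counts


# Made-up example sentences, for illustration only.
toy_sentences = [
    "Tamoxifen is widely used in breast cancer.",
    "Imatinib changed the treatment of leukemia.",
    "Breast cancer patients received tamoxifen.",
    "Leukemia was not discussed here.",
]
counts = cooccurrence(toy_sentences)
```

Sentence-level co-occurrence is a deliberately crude relation signal; it cannot tell "drug X treats cancer Y" from "drug X failed in cancer Y", which is one reason independent validation of the kind described above matters.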


1982 ◽  
Vol 47 (2) ◽  
pp. 160-164
Author(s):  
Glenn L. Falkowski ◽  
Arthur M. Guilford ◽  
Jack Sandler

Utilizing airflow therapy, Schwartz (1976) has claimed an 89% success rate with stutterers following treatment and an 83% success rate at one-year follow-up. Such claims have yet to be documented in the scientific literature. The purposes of this study were: (a) to investigate the effectiveness of a modified version of airflow therapy; and (b) to examine the relative importance of its two main components: passive airflow and elongation of the first vowel spoken. The speech of two adult male stutterers with a lengthy history of stuttering was assessed with spontaneous speaking and reading tasks. Results indicated marked improvement in both subjects' speech on the reading task, which was maintained at follow-up 10 weeks later. For spontaneous speech, results were generally weaker and less durable. Effects of the two treatment components were cumulative and did not allow determination of any differential effectiveness between components. Implications of these findings were considered and directions for future research discussed.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Pei Xu ◽  
Joonghee Lee ◽  
James R. Barth ◽  
Robert Glenn Richey

Purpose: This paper discusses how the features of blockchain technology impact supply chain transparency through the lens of the information security triad (confidentiality, integrity and availability). Ultimately, propositions are developed to encourage future research in supply chain applications of blockchain technology. Design/methodology/approach: Propositions are developed based on a synthesis of the information security and supply chain transparency literature. Findings from text mining of Twitter data and a discussion of three major blockchain use cases support the development of the propositions. Findings: The authors note that confidentiality limits supply chain transparency, which causes tension between transparency and security. Integrity and availability promote supply chain transparency. Blockchain features can preserve security and increase transparency at the same time, despite the tension between confidentiality and transparency. Research limitations/implications: The research was conducted at a time when most blockchain applications were still in pilot stages. The propositions developed should therefore be revisited as blockchain applications become more widely adopted and mature. Originality/value: This study is among the first to examine the way blockchain technology eases the tension between supply chain transparency and security. Unlike other studies that have suggested only positive impacts of blockchain technology on transparency, this study demonstrates that blockchain features can influence transparency both positively and negatively.


2020 ◽  
Author(s):  
Hisao Ishibuchi ◽  
Lie Meng Pang ◽  
Ke Shang

This paper proposes a new framework for the design of evolutionary multi-objective optimization (EMO) algorithms. The main characteristic feature of the proposed framework is that the optimization result of an EMO algorithm is not the final population but a subset of the examined solutions during its execution. As a post-processing procedure, a pre-specified number of solutions are selected from an unbounded external archive where all the examined solutions are stored. In the proposed framework, the final population does not have to be a good solution set. The point of the algorithm design is to examine a wide variety of solutions over the entire Pareto front and to select well-distributed solutions from the archive. In this paper, first we explain difficulties in the design of EMO algorithms in the existing two frameworks: non-elitist and elitist. Next, we propose the new framework of EMO algorithms. Then we demonstrate advantages of the proposed framework over the existing ones through computational experiments. Finally we suggest some interesting and promising future research topics.
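
The post-processing idea, selecting a fixed number of solutions from an archive of everything examined, can be sketched as follows. The abstract does not say which subset selection rule is used, so the greedy farthest-point rule below is a stand-in chosen for brevity, and the archive contents are invented two-objective minimization points. The sketch first discards dominated solutions and then picks `k` well-spread points from what remains.

```python
import math


def dominates(a, b):
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def nondominated(archive):
    """Keep only the archive members that no other member dominates."""
    return [p for p in archive if not any(dominates(q, p) for q in archive)]


def select_spread(points, k):
    """Greedy farthest-point selection (an assumed stand-in rule).

    Starts from the lexicographically smallest point, then repeatedly adds
    the point whose distance to its nearest chosen point is largest.
    """
    if k >= len(points):
        return list(points)
    chosen = [min(points)]
    while len(chosen) < k:
        candidates = (p for p in points if p not in chosen)
        best = max(candidates, key=lambda p: min(math.dist(p, c) for c in chosen))
        chosen.append(best)
    return chosen


# Invented archive of examined solutions (objective vectors, minimization).
archive = [
    (0.0, 1.0), (0.1, 0.8), (0.5, 0.5), (0.6, 0.6),
    (0.9, 0.1), (1.0, 0.0), (1.0, 1.0),
]
front = nondominated(archive)
subset = select_spread(front, 3)
```

Because selection happens over the whole archive, a good solution found early is still available at the end even if the search population later drifted away from it, which is the point the abstract makes about the final population not needing to be a good solution set.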


2017 ◽  
Author(s):  
Morgan N. Price ◽  
Adam P. Arkin

Large-scale genome sequencing has identified millions of protein-coding genes whose function is unknown. Many of these proteins are similar to characterized proteins from other organisms, but much of this information is missing from annotation databases and is hidden in the scientific literature. To make this information accessible, PaperBLAST uses EuropePMC to search the full text of scientific articles for references to genes. PaperBLAST also takes advantage of curated resources that link protein sequences to scientific articles (Swiss-Prot, GeneRIF, and EcoCyc). PaperBLAST’s database includes over 700,000 scientific articles that mention over 400,000 different proteins. Given a protein of interest, PaperBLAST quickly finds similar proteins that are discussed in the literature and presents snippets of text from relevant articles or from the curators. PaperBLAST is available at http://papers.genomics.lbl.gov/.


2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Irene R. Faber ◽  
Till Koopmann ◽  
Dirk Büsch ◽  
Jörg Schorer

Background: The assessment of technical skills as part of a multidimensional approach for talent identification and development in sports seems promising, especially in a technique-based sport like table tennis. However, current instruments mostly focus on a single element of technical skills, mainly use quantitative outcomes, and/or are not developed for talent purposes. Practice would benefit from a new instrument using a more ecologically valid approach. Thus, the purpose of this study was to identify the essential elements of technical skills in young table tennis players and to establish a first tool, using a multi-methods study design that included an archive search for professional literature, a systematic search for scientific literature, and ten in-depth interviews with expert coaches. Results: The approach taken ensured that empirical findings were combined with knowledge and experience from the practical field and with detailed explications by high-level expert coaches. The literature searches yielded 23 professional and 21 scientific articles, while data saturation was reached through all ten interviews. The triangulation process resulted in two general (i.e., individuality, interconnection between elements) and five specific (i.e., bat grip, ready position, footwork/body positioning, service, stroke) elements of technical skills in young table tennis players. In addition, criteria for both flawed and excellent executions were identified for each of the five specific elements. Finally, these results were used to create an observation sheet usable for an assessment during competition. Conclusions: This study revealed the crucial elements of technical skills that should be taken into account when assessing sport-specific technical skills of youth table tennis players (8–12 years). Moreover, it provided concise descriptions of what is considered to be a flawed or an excellent execution of technical skills. Based on these findings, a first observation sheet, the Oldenburg observation sheet for Table Tennis Technique (O3T), was created to be used for the assessment of the current technical skill level within a competitive context at the early stage of a table tennis player’s career. Future research should focus on its measurement properties and its value within a multidimensional assessment for talent purposes.

