Emerging Pattern Mining To Aid Toxicological Knowledge Discovery

2014 ◽  
Vol 54 (7) ◽  
pp. 1864-1879 ◽  
Author(s):  
Richard Sherhod ◽  
Philip N. Judson ◽  
Thierry Hanser ◽  
Jonathan D. Vessey ◽  
Samuel J. Webb ◽  
...  
2012 ◽  
Vol 52 (11) ◽  
pp. 3074-3087 ◽  
Author(s):  
Richard Sherhod ◽  
Valerie J. Gillet ◽  
Philip N. Judson ◽  
Jonathan D. Vessey

Author(s):  
Mohammad Al Hasan

Research on mining interesting patterns from transactions or scientific datasets has matured over the last two decades. Numerous algorithms now exist to mine patterns of varying complexity, such as sets, sequences, trees, and graphs. Collectively, they are referred to as Frequent Pattern Mining (FPM) algorithms. FPM is useful in most prominent knowledge discovery tasks, such as classification, clustering, and outlier detection. Frequent patterns can also be used in database tasks, such as indexing and hashing when storing a large collection of patterns. However, the use of FPM in real-life knowledge discovery systems is considerably low relative to its potential. The prime reason is the lack of interpretability caused by the enormous size of the output set. For instance, a moderately sized graph dataset with merely a thousand graphs can produce millions of frequent graph patterns at a reasonable support value. This is expected, given the combinatorial search space of pattern mining. However, classification, clustering, and similar knowledge discovery tasks should not use that many patterns as their knowledge nuggets (features), as doing so would increase the time and memory complexity of the system. Moreover, it can degrade task quality because of the well-known “curse of dimensionality” effect. In recent years, researchers have therefore felt the need to summarize the output set of FPM algorithms so that the summary set is small, non-redundant, and discriminative. There are different summarization techniques: lossless, profile-based, cluster-based, statistical, etc. This article gives an overview of the main concepts behind these summarization techniques, with a comparative discussion of their strengths, weaknesses, applicability, and computation cost.
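As a rough illustration of lossless summarization, the sketch below enumerates frequent itemsets by brute force and then keeps only the closed ones, that is, itemsets with no proper superset of equal support. Closed itemsets are one common lossless summary; the abstract does not say which lossless technique the article covers, so this choice, the function names, and the toy transactions are all assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration: map each frequent itemset to its support count."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            count = sum(1 for t in transactions if itemset <= t)
            if count >= min_support:
                result[itemset] = count
                found = True
        if not found:
            # Anti-monotonicity: if no itemset of this size is frequent,
            # no larger itemset can be frequent either.
            break
    return result


def closed_itemsets(frequent):
    """Keep only itemsets that have no proper superset with the same support."""
    return {
        itemset: count
        for itemset, count in frequent.items()
        if not any(itemset < other and count == frequent[other] for other in frequent)
    }


# Hypothetical toy transactions, not taken from the article.
transactions = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "b", "c"}),
    frozenset({"a", "b"}),
    frozenset({"a", "d"}),
]

frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)
print(len(frequent), "frequent itemsets,", len(closed), "closed itemsets")
```

The summary is lossless in the sense that the support of any frequent itemset can be recovered as the largest support among the closed itemsets that contain it. Brute-force enumeration is only workable on toy data; it is used here to keep the sketch short.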


2009 ◽  
pp. 2405-2426 ◽  
Author(s):  
Vania Bogorny ◽  
Paulo Martins Engel ◽  
Luis Otavio Alavares

This chapter introduces the problem of mining frequent geographic patterns and spatial association rules from geographic databases. In the geographic domain, most discovered patterns are trivial, non-novel, and noninteresting: they simply represent natural geographic associations intrinsic to geographic data. A large number of natural geographic associations are explicitly represented in geographic database schemas and geo-ontologies, which have not been used so far in frequent geographic pattern mining. Therefore, this chapter presents a novel approach to extract patterns from geographic databases using geo-ontologies as prior knowledge. The main goal of this chapter is to show how the large amount of knowledge represented in geo-ontologies can be used to avoid extracting patterns that are already known to be noninteresting.
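The idea of using prior knowledge to suppress known associations can be sketched as a simple filter: any candidate itemset that contains a pair of items already recorded as a natural dependency is discarded. This is only a minimal sketch of the general idea, not the chapter's algorithm; the predicate names and the `known_dependencies` set are hypothetical, and in practice the dependencies would be derived from a geo-ontology or database schema.

```python
def contains_known_dependency(itemset, known_dependencies):
    """Return True if the itemset includes every item of some known dependency."""
    return any(dependency <= itemset for dependency in known_dependencies)


def filter_candidates(candidates, known_dependencies):
    """Drop candidate itemsets that merely restate a known geographic association."""
    return [
        itemset
        for itemset in candidates
        if not contains_known_dependency(itemset, known_dependencies)
    ]


# Hypothetical prior knowledge: these two spatial predicates always co-occur.
known_dependencies = {
    frozenset({"contains_GasStation", "touches_Street"}),
}

# Hypothetical candidate itemsets produced by a frequent pattern miner.
candidates = [
    frozenset({"contains_GasStation", "touches_Street"}),
    frozenset({"contains_GasStation", "touches_Street", "highCrime"}),
    frozenset({"contains_School", "highCrime"}),
]

kept = filter_candidates(candidates, known_dependencies)
print(kept)
```

Applying the filter during candidate generation rather than after mining would also shrink the search space, since supersets of a discarded pair never need to be counted.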


2015 ◽  
Vol 4 (1) ◽  
pp. 46-56 ◽  
Author(s):  
Laurence Coquin ◽  
Steven J. Canipa ◽  
William C. Drewe ◽  
Lilia Fisk ◽  
Valerie J. Gillet ◽  
...  

The discovered patterns are used to develop new structural alerts for mutagenicity in the Derek Nexus expert system.


2015 ◽  
Vol 46 ◽  
pp. 311-321 ◽  
Author(s):  
Gang Li ◽  
Rob Law ◽  
Huy Quan Vu ◽  
Jia Rong ◽  
Xinyuan (Roy) Zhao

2021 ◽  
Vol 13 (16) ◽  
pp. 8900
Author(s):  
Naeem Ahmed Mahoto ◽  
Asadullah Shaikh ◽  
Mana Saleh Al Reshan ◽  
Muhammad Ali Memon ◽  
Adel Sulaiman

The medical history of a patient is an essential piece of information for healthcare agencies, which keep records of patients. Because each person may have different medical complications, healthcare data remain sparse, high-dimensional, and possibly inconsistent. Discovering knowledge about patient behaviors from such data is not easily manageable, and it becomes a challenge for both physicians and healthcare agencies to discover knowledge from many electronic healthcare records. Data mining, as evidenced by the existing published literature, has proven effective in transforming large data collections into meaningful information and knowledge. This paper presents an overview of the data mining techniques used for knowledge discovery in medical records. Furthermore, based on real healthcare data, it demonstrates a case study of discovering knowledge with the help of three data mining techniques: (1) association analysis; (2) sequential pattern mining; (3) clustering. In particular, association analysis is used to extract frequent correlations among examinations undergone by patients with a specific disease, sequential pattern mining extracts frequent patterns of medical events, and clustering is used to find groups of similar patients. The discovered knowledge may enrich healthcare guidelines, improve their processes, and detect anomalous patient behavior with respect to the medical guidelines.
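To make the association analysis step concrete, the sketch below computes the two standard measures for an association rule, support and confidence, over sets of examinations. It is a minimal sketch under assumptions: the examination names and the four patient records are invented for illustration and do not come from the paper's dataset.

```python
def support(itemset, records):
    """Fraction of records that contain every item in the itemset."""
    return sum(1 for record in records if itemset <= record) / len(records)


def confidence(antecedent, consequent, records):
    """Estimated probability of the consequent given the antecedent."""
    return support(antecedent | consequent, records) / support(antecedent, records)


# Hypothetical examination sets, one per patient.
records = [
    frozenset({"glucose", "hba1c"}),
    frozenset({"glucose", "hba1c", "eye_exam"}),
    frozenset({"glucose"}),
    frozenset({"eye_exam"}),
]

rule_support = support(frozenset({"glucose", "hba1c"}), records)
rule_confidence = confidence(frozenset({"glucose"}), frozenset({"hba1c"}), records)
print(f"glucose -> hba1c: support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

A rule is usually reported only when both measures exceed user-chosen thresholds; the thresholds themselves depend on the dataset and are not specified here.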

