An Overview of Data Mining Techniques and Applications and its Future Scope

Author(s):  
Nithya C ◽  
Saravanan V

Data mining is a process that finds useful patterns in large amounts of data. The paper discusses a few data mining techniques and algorithms, along with some of the organizations that have adopted data mining technology to improve their businesses and obtained excellent results. Most data mining methods can handle different data types. Data mining may be defined as the science of extracting useful information from databases. It is also called knowledge discovery. Using a combination of machine learning, statistical analysis, modeling techniques and database technology, data mining finds patterns and subtle relationships in data and infers rules that allow the prediction of future outcomes.

Coronavirus Disease 2019 (COVID-19) has emerged as a serious health emergency worldwide. The symptoms of COVID-19 are undetectable at an early stage in most patients. It spreads from person to person very rapidly and, if not treated early, causes severe sickness and loss of life in a number of cases. Data mining techniques are commonly used in the medical sector for the detection and prediction of a variety of diseases and medical conditions. A number of researchers are also working towards predicting the possibility of COVID-19 infection in humans using machine learning techniques, specifically by applying data mining methods. In this paper, an extensive survey of the available literature on the prediction of COVID-19 infection and other diseases is presented. It also includes a survey of data mining techniques, models and various datasets.


2013 ◽  
Vol 694-697 ◽  
pp. 2317-2321
Author(s):  
Hui Wang

The goal of knowledge discovery is to extract hidden or useful unknown knowledge from databases, while the objective of knowledge hiding is to prevent certain confidential data or knowledge from being extracted through data mining techniques. This paper focuses on hiding sensitive association rules. The side effects of existing data mining technology are investigated. The problem of sensitive association rule hiding is described formally. Representative sanitizing strategies for sensitive association rule hiding are discussed.
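To make the idea of sanitizing concrete, here is a minimal sketch of one common family of strategies: lowering the confidence of a sensitive rule by deleting an item from supporting transactions. It is an illustration under stated assumptions, not the method of the paper above; the function names (`hide_rule`, `confidence`), the greedy choice of which item to delete, and the toy transactions are all hypothetical.

```python
from typing import List, Set


def count(db: List[Set[str]], itemset: Set[str]) -> int:
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in db if itemset <= t)


def support(db: List[Set[str]], itemset: Set[str]) -> float:
    """Fraction of transactions that contain itemset."""
    return count(db, itemset) / len(db)


def confidence(db: List[Set[str]], antecedent: Set[str], consequent: Set[str]) -> float:
    """Confidence of the rule antecedent -> consequent."""
    base = count(db, antecedent)
    return count(db, antecedent | consequent) / base if base else 0.0


def hide_rule(db, antecedent, consequent, min_conf):
    """Return a sanitized copy of db in which the rule's confidence is below min_conf.

    Assumption: we greedily delete one consequent item from supporting
    transactions, in order, until the threshold is no longer met.
    """
    sanitized = [set(t) for t in db]      # never modify the original data
    rule_items = antecedent | consequent
    victim = sorted(consequent)[0]        # hypothetical choice of item to delete
    for t in sanitized:
        if confidence(sanitized, antecedent, consequent) < min_conf:
            break
        if rule_items <= t:
            t.discard(victim)
    return sanitized


# Toy transactions (invented for illustration only).
db = [
    {"bread", "milk"},
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread"},
    {"milk"},
]
clean = hide_rule(db, {"bread"}, {"milk"}, min_conf=0.5)
```

Deleting items also changes the support of every other itemset that contained them, which is the kind of side effect a sanitizing strategy has to keep small.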


Author(s):  
Chetna Gupta ◽  
Surbhi Singhal ◽  
Astha Kumari

This study addresses the problem of effectively searching for and selecting relevant requirements for reuse that meet stakeholders' objectives, using knowledge discovery and data mining techniques maintained over a cloud platform. Knowledge extraction of similar requirement(s) is performed on data and meta-data stored in a central repository using a novel intersective way method (i-way), which uses the intersection of the results of two machine learning algorithms, namely K-nearest neighbors (KNN) and term frequency-inverse document frequency (TF-IDF). I-way is a two-level extraction framework that represents a win-win situation by considering the intersective results of two different approaches to ensure that selection progresses towards the desired requirement for reuse consideration. The validity and effectiveness of the results of the proposed framework are evaluated on a requirements dataset; the results show that the proposed approach can significantly help in reducing effort by selecting similar requirements of interest for reuse.
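The intersection idea can be sketched roughly as follows. The abstract does not say what each algorithm operates on, so this example makes loud assumptions: TF-IDF cosine similarity ranks the requirement text, KNN ranks hypothetical numeric meta-data vectors by Euclidean distance, and only requirements returned by both are kept. All function names, the smoothed IDF formula, and the sample data are illustrative, not taken from the study.

```python
import math
from collections import Counter


def tfidf_top_k(query, docs, k):
    """Indices of the k documents most similar to query by TF-IDF cosine."""
    corpus = [d.lower().split() for d in docs + [query]]
    n = len(corpus)
    df = Counter(term for toks in corpus for term in set(toks))

    def vec(toks):
        tf = Counter(toks)
        # Smoothed IDF; an assumption, other weightings are equally valid.
        return {t: (c / len(toks)) * (math.log((1 + n) / (1 + df[t])) + 1)
                for t, c in tf.items()}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm = (math.sqrt(sum(w * w for w in a.values()))
                * math.sqrt(sum(w * w for w in b.values())))
        return dot / norm if norm else 0.0

    q = vec(corpus[-1])
    scores = [(cosine(q, vec(toks)), i) for i, toks in enumerate(corpus[:-1])]
    ranked = sorted(scores, key=lambda s: (-s[0], s[1]))  # ties broken by index
    return {i for _, i in ranked[:k]}


def knn_top_k(query_features, features, k):
    """Indices of the k nearest meta-data vectors by Euclidean distance."""
    ranked = sorted(range(len(features)),
                    key=lambda i: (math.dist(query_features, features[i]), i))
    return set(ranked[:k])


def i_way(query, query_features, docs, features, k):
    """Keep only the requirements that both methods select."""
    return tfidf_top_k(query, docs, k) & knn_top_k(query_features, features, k)


# Hypothetical repository: requirement text plus (priority, effort) meta-data.
docs = [
    "user shall login with password",
    "system shall export report as pdf",
    "user shall reset password by email",
    "system shall log audit events",
]
features = [(1.0, 2.0), (5.0, 8.0), (4.0, 9.0), (1.5, 2.5)]
selected = i_way("user login password", (1.0, 2.0), docs, features, k=2)
```

Taking the intersection trades recall for precision: a requirement is proposed for reuse only when two independent views of similarity agree.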


2021 ◽  
Vol 1088 (1) ◽  
pp. 012035
Author(s):  
Mulyawan ◽  
Agus Bahtiar ◽  
Githera Dwilestari ◽  
Fadhil Muhammad Basysyar ◽  
Nana Suarna

2021 ◽  
pp. 097215092098485
Author(s):  
Sonika Gupta ◽  
Sushil Kumar Mehta

Data mining techniques have proven quite effective not only in detecting financial statement frauds but also in discovering other financial crimes, such as credit card frauds, loan and security frauds, corporate frauds, bank and insurance frauds, etc. Data mining classification techniques have, in recent years, been accepted as one of the most credible methodologies for the detection of symptoms of financial statement frauds through scanning the published financial statements of companies. The retrieved literature that has used data mining classification techniques can be broadly categorized, on the basis of the type of technique applied, into statistical techniques and machine learning techniques. The biggest challenge in executing the classification process using data mining techniques lies in collecting the data sample of fraudulent companies and mapping the sample of fraudulent companies against non-fraudulent companies. In this article, a systematic literature review (SLR) of studies from the area of financial statement fraud detection has been conducted. The review has considered research articles published between 1995 and 2020. Further, a meta-analysis has been performed to establish the effect of data sample mapping of fraudulent companies against non-fraudulent companies on the classification methods by comparing the overall classification accuracy reported in the literature. The retrieved literature indicates that a fraudulent sample can either be equally paired with a non-fraudulent sample (1:1 data mapping) or be unequally mapped using a 1:many ratio to increase the sample size proportionally. Based on the meta-analysis of the research articles, it can be concluded that machine learning approaches, in comparison to statistical approaches, can achieve better classification accuracy, particularly when the availability of sample data is low. High classification accuracy can be obtained with even a 1:1 mapping data set using machine learning classification approaches.
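The difference between 1:1 and 1:many data mapping can be illustrated with a small sketch. The abstract does not describe how non-fraudulent companies are chosen for each fraudulent one, so this example simply draws them at random with a fixed seed; the function name `build_mapped_sample`, the company identifiers, and the random selection rule are all assumptions made for illustration.

```python
import random


def build_mapped_sample(fraud, non_fraud, ratio=1, seed=0):
    """Build a labelled sample with `ratio` non-fraudulent companies per fraudulent one.

    Returns a list of (company, label) pairs, label 1 = fraudulent, 0 = not.
    Assumption: controls are drawn at random without replacement.
    """
    needed = len(fraud) * ratio
    if needed > len(non_fraud):
        raise ValueError("not enough non-fraudulent companies for this ratio")
    rng = random.Random(seed)             # fixed seed keeps the sample reproducible
    controls = rng.sample(non_fraud, needed)
    return [(c, 1) for c in fraud] + [(c, 0) for c in controls]


# Hypothetical company identifiers.
fraud = ["F1", "F2", "F3"]
non_fraud = [f"N{i}" for i in range(1, 11)]

one_to_one = build_mapped_sample(fraud, non_fraud, ratio=1)   # 3 + 3 companies
one_to_many = build_mapped_sample(fraud, non_fraud, ratio=3)  # 3 + 9 companies
```

A 1:many sample is larger but imbalanced, so plain accuracy on it should be read with the class ratio in mind.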


Author(s):  
Shadi Aljawarneh ◽  
Aurea Anguera ◽  
John William Atwood ◽  
Juan A. Lara ◽  
David Lizcano

Nowadays, large amounts of data are generated in the medical domain. Various physiological signals generated from different organs can be recorded to extract interesting information about patients' health. The analysis of physiological signals is a hard task that requires the use of specific approaches such as the Knowledge Discovery in Databases process. The application of such a process in the domain of medicine has a series of implications and difficulties, especially regarding the application of data mining techniques to data, mainly time series, gathered from medical examinations of patients. The goal of this paper is to describe the lessons learned and the experience gathered by the authors applying data mining techniques to real medical patient data including time series. In this research, we carried out an exhaustive case study working on data from two medical fields: stabilometry (15 professional basketball players, 18 elite ice skaters) and electroencephalography (100 healthy patients, 100 epileptic patients). We applied a previously proposed knowledge discovery framework for classification purposes, obtaining good results in terms of classification accuracy (greater than 99% in both fields). The good results obtained in our research are the groundwork for the lessons learned and recommendations made in this position paper, which is intended as a guide for experts who have to face similar medical data mining projects.


Author(s):  
Feyza Gürbüz ◽  
Fatma Gökçe Önen

The previous decades have witnessed major change within the Information Systems (IS) environment, with a corresponding emphasis on the importance of specifying timely and accurate information strategies. Currently, there is an increasing interest in data mining and information systems optimization. As a result, data mining for the optimization of information systems has become a new and growing research area. This chapter surveys the application of data mining to the optimization of information systems. These systems have different data sources and, accordingly, different objectives for knowledge discovery. After the preprocessing stage, data mining techniques can be applied to the data suited to the objective of the information system. These techniques are prediction, classification, association rule mining, statistics and visualization, clustering and outlier detection.


Author(s):  
Benard Magara Maake ◽  
Sunday O. Ojo ◽  
Tranos Zuva

In this chapter, the authors give an overview of the main data mining techniques that are utilized in the context of research paper recommender systems. These techniques refer to mathematical models and tools that are utilized in discovering patterns in data. Data mining is a term used to describe a collection of techniques that infer recommendation rules and build models from research paper datasets. The authors briefly describe how research paper recommender systems' data is processed, analyzed, and then, finally, interpreted using these techniques. They review different distance measures, sampling techniques, and dimensionality reduction methods employed in computing research paper recommendations. They also review the various clustering, classification, and association rule-mining methods employed to mine for hidden information. Finally, they highlight the major data mining issues that are affecting research paper recommender systems.
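As a small illustration of how a distance measure can drive a recommendation, the sketch below ranks candidate papers by Jaccard distance between keyword sets. The chapter does not say which measures it reviews in detail, so the choice of Jaccard distance, the function names, and the paper identifiers and keywords are assumptions for illustration only.

```python
def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; defined as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1 - len(a & b) / len(union)


def recommend(target_keywords, candidates, k):
    """Return the ids of the k candidate papers closest to the target.

    `candidates` maps a paper id to its keyword set; ties are broken by id.
    """
    ranked = sorted(
        candidates,
        key=lambda pid: (jaccard_distance(target_keywords, candidates[pid]), pid),
    )
    return ranked[:k]


# Hypothetical papers and keywords.
target = {"data mining", "clustering", "recommender"}
candidates = {
    "p1": {"data mining", "clustering"},
    "p2": {"databases"},
    "p3": {"recommender", "clustering", "classification"},
}
top = recommend(target, candidates, k=2)
```

Set-based measures like this one ignore how often a keyword occurs; weighted measures such as cosine similarity over term vectors are an alternative when frequencies matter.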

