An Overview of Data Mining Techniques and Applications and its Future Scope

International Journal of Scientific Research in Science Engineering and Technology ◽ 10.32628/ijsrset1841079 ◽ 2018 ◽ pp. 16-24

Author(s): Nithya C ◽ Saravanan V

Keyword(s): Machine Learning ◽ Data Mining ◽ Statistical Analysis ◽ Knowledge Discovery ◽ Mining Technology ◽ Data Mining Techniques ◽ Modeling Techniques ◽ Database Technology ◽ Mining Methods

Data mining is a process that finds useful patterns in large amounts of data. The paper discusses a few data mining techniques and algorithms, along with some of the organizations that have adopted data mining technology to improve their businesses and reported excellent results. Most data mining methods can handle different types of information. Data mining may be defined as the science of extracting useful information from databases; it is also called knowledge discovery. Using a combination of machine learning, statistical analysis, modeling techniques and database technology, data mining finds patterns and subtle relationships in data and infers rules that allow future outcomes to be predicted.

A Survey on Data Mining Techniques for COVID Prediction

International Journal of Emerging Trends in Engineering Research ◽ 10.30534/ijeter/2021/02982021 ◽ 2021 ◽ Vol 9 (8) ◽ pp. 1051-1056

Keyword(s): Machine Learning ◽ Data Mining ◽ Virus Disease ◽ Early Stage ◽ Machine Learning Techniques ◽ Loss Of Life ◽ Data Mining Techniques ◽ Medical Sector ◽ Learning Techniques ◽ Mining Methods

Coronavirus Disease 2019 (COVID-19) has emerged as a serious health emergency worldwide. In most patients, the symptoms of COVID-19 are undetectable at an early stage. The disease spreads from person to person very rapidly and, if not treated early, causes severe sickness and loss of life in a number of cases. Data mining techniques are commonly used in the medical sector for the detection and prediction of a variety of diseases and medical conditions. A number of researchers are also working on predicting the possibility of COVID-19 infection in humans using machine learning techniques, specifically by applying data mining methods. This paper presents an extensive survey of the available literature on the prediction of COVID-19 infection and other diseases. It also includes a survey of data mining techniques, models and various datasets.

Hiding Sensitive Association Rules by Sanitizing

Advanced Materials Research ◽ 10.4028/www.scientific.net/amr.694-697.2317 ◽ 2013 ◽ Vol 694-697 ◽ pp. 2317-2321

Author(s): Hui Wang

Keyword(s): Data Mining ◽ Side Effects ◽ Knowledge Discovery ◽ Association Rules ◽ Association Rule ◽ Mining Technology ◽ Data Mining Techniques ◽ Confidential Data ◽ Knowledge Hiding ◽ Existing Data

The goal of knowledge discovery is to extract hidden or useful unknown knowledge from databases, while the objective of knowledge hiding is to prevent certain confidential data or knowledge from being extracted through data mining techniques. This paper focuses on hiding sensitive association rules. The side effects of existing data mining technology are investigated, the problem of sensitive association rule hiding is described formally, and representative sanitizing strategies for sensitive association rule hiding are discussed.
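
As a rough illustration of what sanitizing can mean for association rules, the sketch below measures the confidence of a rule over a toy transaction set and then removes one item from supporting transactions until the confidence drops below a threshold. The transactions, the threshold, and the names `count`, `confidence` and `sanitize` are assumptions made for illustration only; the abstract does not describe the paper's own strategies, which may differ.

```python
from typing import FrozenSet, List, Set

def count(db: List[Set[str]], items: FrozenSet[str]) -> int:
    """Number of transactions that contain every item in `items`."""
    return sum(1 for t in db if items <= t)

def confidence(db: List[Set[str]], lhs: FrozenSet[str], rhs: FrozenSet[str]) -> float:
    """Confidence of lhs -> rhs: count(lhs and rhs) / count(lhs)."""
    n_lhs = count(db, lhs)
    return count(db, lhs | rhs) / n_lhs if n_lhs else 0.0

def sanitize(db: List[Set[str]], lhs: FrozenSet[str], rhs: FrozenSet[str],
             min_conf: float) -> List[Set[str]]:
    """Return a modified copy of db in which lhs -> rhs falls below min_conf.

    Hypothetical strategy: drop one right-hand-side item from supporting
    transactions, one transaction at a time, until the rule is hidden.
    """
    out = [set(t) for t in db]          # leave the original database untouched
    victim = sorted(rhs)[0]             # item to delete (arbitrary but deterministic)
    for t in out:
        if confidence(out, lhs, rhs) < min_conf:
            break
        if (lhs | rhs) <= t:
            t.discard(victim)
    return out

# Toy data (assumed): the rule {bread} -> {milk} is treated as sensitive.
db = [{"bread", "milk"}, {"bread", "milk"}, {"bread", "milk"}, {"bread"}, {"milk"}]
lhs, rhs = frozenset({"bread"}), frozenset({"milk"})
clean = sanitize(db, lhs, rhs, min_conf=0.5)
```

In this toy run the rule's confidence falls from 0.75 to 0.25, but the number of transactions containing milk also falls from 4 to 2. That loss of non-sensitive information is the kind of side effect a sanitizing strategy has to weigh.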

I-Way

Advances in Systems Analysis, Software Engineering, and High Performance Computing - Crowdsourcing and Probabilistic Decision-Making in Software Engineering ◽ 10.4018/978-1-5225-9659-2.ch002 ◽ 2020 ◽ pp. 23-34

Author(s): Chetna Gupta ◽ Surbhi Singhal ◽ Astha Kumari

Keyword(s): Machine Learning ◽ Data Mining ◽ Knowledge Discovery ◽ Learning Algorithm ◽ Nearest Neighbors ◽ Machine Learning Algorithm ◽ K Nearest Neighbors ◽ Inverse Document Frequency ◽ Data Mining Techniques ◽ Document Frequency

This study addresses the problem of effectively searching for and selecting relevant requirements for reuse that meet stakeholders' objectives, using knowledge discovery and data mining techniques maintained over a cloud platform. Knowledge extraction of similar requirement(s) is performed on data and metadata stored in a central repository using a novel intersective way method (i-way), which uses the intersection of the results of two machine learning algorithms, namely K-nearest neighbors (KNN) and term frequency-inverse document frequency (TF-IDF). I-way is a two-level extraction framework that represents a win-win situation: it considers the intersective results of two different approaches to ensure that selection is progressing towards the desired requirement for reuse. The validity and effectiveness of the proposed framework's results are evaluated on a requirement dataset; the results show that the proposed approach can significantly reduce effort by selecting similar requirements of interest for reuse.
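
The idea of intersecting two retrieval results can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' framework: TF-IDF with cosine similarity ranks requirement texts, a plain nearest-neighbor search ranks hypothetical numeric metadata vectors, and only requirements returned by both are kept. The function names, the smoothed IDF formula, the metadata encoding and the sample requirements are all invented for the example.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence, Set

def tfidf_vectors(texts: List[str]) -> List[Dict[str, float]]:
    """Return one sparse TF-IDF vector (term -> weight) per text."""
    tokenized = [t.lower().split() for t in texts]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    return [
        {term: (cnt / len(toks)) * (math.log((1 + n) / (1 + df[term])) + 1.0)
         for term, cnt in Counter(toks).items()}
        for toks in tokenized
    ]

def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0

def top_by_tfidf(query: str, texts: List[str], k: int) -> Set[int]:
    """Indices of the k texts most similar to the query under TF-IDF."""
    vectors = tfidf_vectors(texts + [query])   # the query shares the corpus IDF
    q = vectors[-1]
    ranked = sorted(range(len(texts)), key=lambda i: cosine(q, vectors[i]), reverse=True)
    return set(ranked[:k])

def knn(query_meta: Sequence[float], metas: List[Sequence[float]], k: int) -> Set[int]:
    """Indices of the k metadata vectors nearest to the query (Euclidean)."""
    ranked = sorted(range(len(metas)), key=lambda i: math.dist(query_meta, metas[i]))
    return set(ranked[:k])

def i_way(query: str, query_meta: Sequence[float], texts: List[str],
          metas: List[Sequence[float]], k: int = 2) -> Set[int]:
    """Keep only the requirements that both rankings agree on."""
    return top_by_tfidf(query, texts, k) & knn(query_meta, metas, k)

# Hypothetical repository: requirement text plus a 2-D metadata vector each.
texts = ["user login with password", "user login with fingerprint",
         "export report as pdf", "reset user password by email"]
metas = [(1.0, 0.0), (5.0, 5.0), (1.0, 1.0), (4.0, 4.0)]
selected = i_way("user password login", (1.0, 0.5), texts, metas, k=2)
```

Intersecting the two result sets trades recall for precision: a requirement is suggested for reuse only when its text and its metadata both resemble the query.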

Data mining techniques with machine learning algorithm to predict patients of heart disease

IOP Conference Series Materials Science and Engineering ◽ 10.1088/1757-899x/1088/1/012035 ◽ 2021 ◽ Vol 1088 (1) ◽ pp. 012035

Author(s): Mulyawan ◽ Agus Bahtiar ◽ Githera Dwilestari ◽ Fadhil Muhammad Basysyar ◽ Nana Suarna

Keyword(s): Machine Learning ◽ Data Mining ◽ Heart Disease ◽ Learning Algorithm ◽ Machine Learning Algorithm ◽ Data Mining Techniques

Data Mining-based Financial Statement Fraud Detection: Systematic Literature Review and Meta-analysis to Estimate Data Sample Mapping of Fraudulent Companies Against Non-fraudulent Companies

Global Business Review ◽ 10.1177/0972150920984857 ◽ 2021 ◽ pp. 097215092098485

Author(s): Sonika Gupta ◽ Sushil Kumar Mehta

Keyword(s): Machine Learning ◽ Data Mining ◽ Literature Review ◽ Systematic Literature Review ◽ Classification Accuracy ◽ Meta Analysis ◽ Financial Statement ◽ Research Articles ◽ Financial Statement Fraud ◽ Data Mining Techniques

Data mining techniques have proven quite effective not only in detecting financial statement frauds but also in discovering other financial crimes, such as credit card frauds, loan and security frauds, corporate frauds, bank and insurance frauds, etc. Classification of data mining techniques, in recent years, has been accepted as one of the most credible methodologies for the detection of symptoms of financial statement frauds through scanning the published financial statements of companies. The retrieved literature that has used data mining classification techniques can be broadly categorized on the basis of the type of technique applied, as statistical techniques and machine learning techniques. The biggest challenge in executing the classification process using data mining techniques lies in collecting the data sample of fraudulent companies and mapping the sample of fraudulent companies against non-fraudulent companies. In this article, a systematic literature review (SLR) of studies from the area of financial statement fraud detection has been conducted. The review has considered research articles published between 1995 and 2020. Further, a meta-analysis has been performed to establish the effect of data sample mapping of fraudulent companies against non-fraudulent companies on the classification methods through comparing the overall classification accuracy reported in the literature. The retrieved literature indicates that a fraudulent sample can either be equally paired with non-fraudulent sample (1:1 data mapping) or be unequally mapped using 1:many ratio to increase the sample size proportionally. Based on the meta-analysis of the research articles, it can be concluded that machine learning approaches, in comparison to statistical approaches, can achieve better classification accuracy, particularly when the availability of sample data is low. High classification accuracy can be obtained with even a 1:1 mapping data set using machine learning classification approaches.
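
The difference between 1:1 and 1:many data mapping can be made concrete with a short sketch. It assumes the simplest possible pairing, a seeded random draw of non-fraudulent firms without replacement. The reviewed studies may match on other criteria that the abstract does not specify, and `map_samples` and the firm labels are hypothetical.

```python
import random
from typing import Dict, List

def map_samples(fraud: List[str], non_fraud: List[str],
                ratio: int = 1, seed: int = 0) -> Dict[str, List[str]]:
    """Pair each fraudulent firm with `ratio` distinct non-fraudulent firms.

    ratio=1 gives a 1:1 mapping; ratio>1 gives a 1:many mapping.
    """
    needed = ratio * len(fraud)
    if needed > len(non_fraud):
        raise ValueError("not enough non-fraudulent firms for this ratio")
    # Draw without replacement so no control firm is used twice.
    pool = random.Random(seed).sample(non_fraud, needed)
    return {f: pool[i * ratio:(i + 1) * ratio] for i, f in enumerate(fraud)}

# Hypothetical firm identifiers.
fraud_firms = ["F1", "F2"]
control_firms = ["N1", "N2", "N3", "N4", "N5", "N6"]
one_to_one = map_samples(fraud_firms, control_firms, ratio=1)
one_to_many = map_samples(fraud_firms, control_firms, ratio=3)
```

With two fraudulent firms, the 1:1 mapping yields a balanced sample of four firms, while the 1:3 mapping yields eight firms of which only a quarter are fraudulent.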

Particularities of data mining in medicine: lessons learned from patient medical time series data analysis

EURASIP Journal on Wireless Communications and Networking ◽ 10.1186/s13638-019-1582-2 ◽ 2019 ◽ Vol 2019 (1) ◽ Cited By ~ 2

Author(s): Shadi Aljawarneh ◽ Aurea Anguera ◽ John William Atwood ◽ Juan A. Lara ◽ David Lizcano

Keyword(s): Data Mining ◽ Time Series ◽ Knowledge Discovery ◽ Time Series Data ◽ Medical Patient ◽ Lessons Learned ◽ Physiological Signals ◽ Knowledge Discovery In Databases ◽ Series Data ◽ Data Mining Techniques

Nowadays, large amounts of data are generated in the medical domain. Various physiological signals generated from different organs can be recorded to extract interesting information about patients’ health. The analysis of physiological signals is a hard task that requires the use of specific approaches such as the Knowledge Discovery in Databases process. The application of such a process in the domain of medicine has a series of implications and difficulties, especially regarding the application of data mining techniques to data, mainly time series, gathered from medical examinations of patients. The goal of this paper is to describe the lessons learned and the experience gathered by the authors in applying data mining techniques to real medical patient data including time series. In this research, we carried out an exhaustive case study working on data from two medical fields: stabilometry (15 professional basketball players, 18 elite ice skaters) and electroencephalography (100 healthy patients, 100 epileptic patients). We applied a previously proposed knowledge discovery framework for classification purposes, obtaining good results in terms of classification accuracy (greater than 99% in both fields). The good results obtained in our research are the groundwork for the lessons learned and recommendations made in this position paper, which is intended as a guide for experts who have to face similar medical data mining projects.

Enhanced Machine Learning and Data Mining Methods for Analysing Large Hybrid Electric Vehicle Fleets based on Load Spectrum Data

10.1007/978-3-658-20367-2 ◽ 2018 ◽ Cited By ~ 2

Author(s): Philipp Bergmeir

Keyword(s): Machine Learning ◽ Data Mining ◽ Electric Vehicle ◽ Hybrid Electric Vehicle ◽ Load Spectrum ◽ Mining Methods ◽ Hybrid Electric ◽ Spectrum Data

Business Intelligence using Machine Learning and Data Mining techniques - An analysis

2018 Second International Conference on Electronics, Communication and Aerospace Technology (ICECA) ◽ 10.1109/iceca.2018.8474847 ◽ 2018 ◽ Cited By ~ 2

Author(s): Ruchi Sharma ◽ Pravin Srinath

Keyword(s): Machine Learning ◽ Data Mining ◽ Business Intelligence ◽ Data Mining Techniques

Informational Data Mining

Enterprise Business Modeling, Optimization Techniques, and Flexible Information Systems ◽ 10.4018/978-1-4666-3946-1.ch005 ◽ 2013 ◽ pp. 58-65

Author(s): Feyza Gürbüz ◽ Fatma Gökçe Önen

Keyword(s): Data Mining ◽ Information Systems ◽ Knowledge Discovery ◽ Major Change ◽ Research Community ◽ Data Sources ◽ Accurate Information ◽ Rule Mining ◽ Data Mining Techniques ◽ Information Strategies

The previous decades have witnessed major change within the Information Systems (IS) environment, with a corresponding emphasis on the importance of specifying timely and accurate information strategies. Currently, there is increasing interest in data mining and information systems optimization, which has made data mining for the optimization of information systems a new and growing research community. This chapter surveys the application of data mining to the optimization of information systems. These systems have different data sources and, accordingly, different objectives for knowledge discovery. After the preprocessing stage, data mining techniques can be applied to the data that suits the objective of the information system. These techniques are prediction, classification, association rule mining, statistics and visualization, clustering and outlier detection.

A Survey on Data Mining Techniques in Research Paper Recommender Systems

Advances in Library and Information Science - Research Data Access and Management in Modern Libraries ◽ 10.4018/978-1-5225-8437-7.ch006 ◽ 2019 ◽ pp. 119-143 ◽ Cited By ~ 1

Author(s): Benard Magara Maake ◽ Sunday O. Ojo ◽ Tranos Zuva

Keyword(s): Data Mining ◽ Recommender Systems ◽ Research Paper ◽ Distance Measures ◽ Sampling Techniques ◽ Hidden Information ◽ Data Mining Techniques ◽ Reduction Methods ◽ Mining Methods ◽ Research Paper Recommender Systems

In this chapter, the authors give an overview of the main data mining techniques that are utilized in the context of research paper recommender systems. These techniques refer to mathematical models and tools that are utilized in discovering patterns in data. Data mining is a term used to describe a collection of techniques that infer recommendation rules and build models from research paper datasets. The authors briefly describe how research paper recommender systems' data is processed, analyzed, and then, finally, interpreted using these techniques. They review different distance measures, sampling techniques, and dimensionality reduction methods employed in computing research paper recommendations. They also review the various clustering, classification, and association rule-mining methods employed to mine for hidden information. Finally, they highlight the major data mining issues that are affecting research paper recommender systems.
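
To show how a distance measure can drive a research paper recommendation, the sketch below ranks candidate papers by the Jaccard distance between keyword sets. Jaccard is only one of many possible measures and is chosen here as an assumption, since the abstract does not name the measures the chapter reviews. The function names and the keyword data are likewise hypothetical.

```python
from typing import Dict, List, Set

def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """1 - |a ∩ b| / |a ∪ b|; defined as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def recommend(target: Set[str], candidates: Dict[str, Set[str]], k: int = 2) -> List[str]:
    """Return the names of the k candidates closest to the target keyword set."""
    ranked = sorted(candidates, key=lambda name: jaccard_distance(target, candidates[name]))
    return ranked[:k]

# Hypothetical keyword sets for a reader's paper and three candidates.
reader = {"data mining", "clustering", "recommender systems"}
papers = {
    "A": {"data mining", "clustering"},
    "B": {"recommender systems", "data mining", "clustering", "classification"},
    "C": {"fluid dynamics"},
}
suggestions = recommend(reader, papers, k=2)
```

Here paper B ranks ahead of paper A because it shares all three of the reader's keywords, even though it also carries one extra keyword.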
