Parallel clustering of large data set on Hadoop using data mining techniques

An Optimistic Data Mining Approach for Handling Large Data Set using Data Partitioning Techniques

International Journal of Computer Applications ◽

10.5120/2930-3878 ◽

2011 ◽

Vol 24 (3) ◽

pp. 29-33 ◽

Cited By ~ 3

Author(s):

Dipak V. Patil ◽

R. S. Bichkar

Keyword(s):

Data Mining ◽

Large Data ◽

Data Partitioning ◽

Data Set ◽

Data Mining Approach ◽

Large Data Set ◽

Using Data

A Hybrid Method for Prediction and Assessment Efficiency of Decision Making Units

International Journal of Decision Support System Technology ◽

10.4018/jdsst.2013010104 ◽

2013 ◽

Vol 5 (1) ◽

pp. 66-83 ◽

Cited By ~ 1

Author(s):

Iman Rahimi ◽

Reza Behmanesh ◽

Rosnah Mohd. Yusuff

Keyword(s):

Data Mining ◽

Decision Making ◽

Decision Rules ◽

Large Data ◽

Poultry Meat ◽

Small Data ◽

Data Set ◽

Data Mining Techniques ◽

Decision Making Units

This article evaluates and assesses the efficiency of poultry meat farms, as a case study, using a new method. The poultry farm industry is one of the most important sub-sectors in comparison to others. The purpose of the study is to predict and assess the efficiency of poultry farms as decision making units (DMUs). Although several methods have been proposed for solving this problem, the authors need a methodology that discriminates performance strongly. Their methodology comprises data envelopment analysis and data mining techniques such as artificial neural network (ANN), decision tree (DT), and cluster analysis (CA). Data for the analysis were collected from 22 poultry companies in Iran. Because the data set is small, whereas data mining techniques normally call for a large data set, the authors employed the k-fold cross validation method to validate their model. Assessing the efficiency of each DMU, clustering the DMUs, applying the model, and presenting decision rules results in a precise and accurate optimizing technique.
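
The k-fold cross validation mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: only the sample count of 22 comes from the abstract, while the choice of k = 5, the shuffling seed, and the function name `k_fold_indices` are assumptions made for illustration. The sketch only produces the train/test index splits; fitting the ANN or decision tree on each split is left out.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# 22 samples as in the abstract; k = 5 is an assumed value.
folds = list(k_fold_indices(n_samples=22, k=5))
for train_idx, test_idx in folds:
    print(len(train_idx), "training samples,", len(test_idx), "held out")
```

Every sample is held out exactly once, so each of the 22 observations contributes to both training and validation, which is why the technique suits small data sets.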

Data mining for energy analysis of a large data set of flats

Proceedings of the Institution of Civil Engineers - Engineering Sustainability ◽

10.1680/jensu.15.00051 ◽

2017 ◽

Vol 170 (1) ◽

pp. 3-18 ◽

Cited By ~ 5

Author(s):

Alfonso Capozzoli ◽

Gianluca Serale ◽

Marco Savino Piscitelli ◽

Daniele Grassi

Keyword(s):

Data Mining ◽

Energy Analysis ◽

Large Data ◽

Data Set ◽

Large Data Set

Identifying Non-Performing Students in Higher Educational Institutions Using Data Mining Techniques

International Journal of Information System Modeling and Design ◽

10.4018/ijismd.2021010105 ◽

2021 ◽

Vol 12 (1) ◽

pp. 94-110

Author(s):

Deepti Aggarwal ◽

Sonu Mittal ◽

Vikram Bali

Keyword(s):

Data Mining ◽

Uttar Pradesh ◽

Drop Out ◽

Primary Data ◽

Educational Institutions ◽

Technical Institute ◽

Data Set ◽

Data Mining Techniques ◽

Preventive Actions ◽

Using Data

Educational institutes are focusing on improving the performance of students by using several data mining techniques. Since the number of students who drop out increases every year, being able to predict whether a student will complete the course makes it possible to take preventive actions beforehand. The primary data set used for modelling was taken from a reputed technical institute of Uttar Pradesh and consists of data on 6,807 students with 20 academic and non-academic attributes. The most relevant attributes are extracted using the CorrelationAttributeEval technique (in WEKA) with the Ranker search method, which ranks the attributes according to their evaluation. The synthetic minority oversampling technique (SMOTE) filter is applied to deal with the skewed data set. Models are built from eight classifiers, which are analysed to find the most appropriate model for classifying whether a student will complete the course or withdraw his/her admission.
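
The idea behind SMOTE, as named in the abstract above, can be sketched as follows. This is a simplified illustration, not WEKA's SMOTE filter or the authors' configuration: the toy two-attribute points, the neighbour count `k=3`, the Euclidean distance, and the function name `smote` are all assumptions. Each synthetic sample is placed at a random position on the line segment between a minority-class point and one of its nearest minority-class neighbours.

```python
import math
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Create synthetic minority samples by interpolating toward near neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself).
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b)
                               for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical minority-class points with two numeric attributes.
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote(minority, n_synthetic=6)
print(len(new_points), "synthetic samples generated")
```

Because every synthetic point is a blend of two real minority points, the oversampled class stays inside the region the original minority samples occupy rather than simply duplicating existing rows.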

Exploration of Healthcare Using Data Mining Techniques

Big Data Management and the Internet of Things for Improved Health Systems - Advances in Healthcare Information Systems and Administration ◽

10.4018/978-1-5225-5222-2.ch014 ◽

2018 ◽

pp. 243-259 ◽

Cited By ~ 1

Author(s):

Anindita Desarkar ◽

Ajanta Das

Keyword(s):

Data Mining ◽

Patient Care ◽

Knowledge Discovery ◽

Healthcare Cost ◽

Healthcare Sector ◽

Huge Amount ◽

Data Set ◽

Data Mining Techniques ◽

Using Data ◽

Tools And Techniques

A huge amount of data is generated from healthcare transactions, and these data are complex, voluminous and heterogeneous in nature. This large data set can serve as an ideal store to be analyzed for knowledge discovery as well as for various future predictions. Data mining is therefore becoming increasingly popular, as it offers a set of innovative tools and techniques to handle this kind of data set, whereas traditional methods have limitations in doing so. In summary, providing better patient care and reducing healthcare cost are the two major goals of applying data mining in healthcare. Initially, this chapter explores the various types of eHealth data and their characteristics. Subsequently, it explores various domains in the healthcare sector and shows how data mining plays a major role in those domains. Finally, it describes a few common data mining techniques and their applications in the eHealth domain.

A Study of the Applications of Data Mining Techniques in Higher Education

International Journal of Computer and Communication Technology ◽

10.47893/ijcct.2015.1263 ◽

2015 ◽

pp. 1-4

Author(s):

SUSHIL VERMA ◽

R. S. THAKUR ◽

SHAILESH JALORI

Keyword(s):

Higher Education ◽

Data Mining ◽

Business Processes ◽

Large Data ◽

Educational Institutions ◽

Data Set ◽

Data Mining Techniques ◽

Higher Education System ◽

Student’s Performance ◽

Quality In Higher Education

Data mining is used to extract meaningful information and to develop significant relationships among variables stored in large data sets. A few years ago, the information flow in the education field was relatively simple and the application of technology was limited. However, as we progress into a more integrated world where technology has become an integral part of business processes, the transfer of information has become more complicated. Today, one of the biggest challenges that educational institutions face is the explosive growth of educational data and how to use this data to improve the quality of managerial decisions and students’ performance. The main objective of higher education institutions is to provide quality education to their students. One way to achieve the highest level of quality in the higher education system is by discovering knowledge for prediction regarding enrolment of students in a particular course, alienation of the traditional classroom teaching model, detection of unfair means used in online examinations, detection of abnormal values in the result sheets of students, and prediction of students’ performance. The paper aims to propose the use of data mining techniques to improve the efficiency of higher educational institutions. If data mining techniques such as clustering, decision tree and association can be applied to higher education processes, they can help improve students’ performance.

Evaluating a guest satisfaction model through data mining

International Journal of Contemporary Hospitality Management ◽

10.1108/ijchm-03-2019-0280 ◽

2019 ◽

Vol 32 (4) ◽

pp. 1523-1538 ◽

Cited By ~ 2

Author(s):

Sérgio Moro ◽

Joaquim Esmerado ◽

Pedro Ramos ◽

Bráulio Alturas

Keyword(s):

Data Mining ◽

Large Data ◽

Online Reviews ◽

Theory And Practice ◽

Target Feature ◽

Data Set ◽

Content Type ◽

Large Data Set ◽

Guest Satisfaction ◽

Satisfaction Model

Purpose: This paper aims to propose a data mining approach to evaluate a conceptual model in tourism, encompassing a large data set characterized by dimensions grounded in existing literature. Design/methodology/approach: The approach is tested using a guest satisfaction model encompassing nine dimensions. A large data set of 84k online reviews and 31 features was collected from TripAdvisor. The review score granted was considered a proxy of guest satisfaction and was defined as the target feature to model. A sequence of data understanding and preparation tasks led to a tuned set of 60k reviews and 29 input features, which were used for training the data mining model. Finally, data-based sensitivity analysis was adopted to understand which dimensions most influence guest satisfaction. Findings: Users’ previous experience with the online platform, individual preferences, and hotel prestige were the most relevant dimensions concerning guests’ satisfaction. Conversely, homogeneous characteristics among the Las Vegas hotels, such as hotel size, were found to be of little relevance to satisfaction. Originality/value: This study intends to set a baseline for easier adoption of data mining to evaluate conceptual models through a scalable approach, helping to bridge theory and practice, which is especially relevant when dealing with Big Data sources such as social media. Thus, the steps undertaken during the study are detailed to facilitate replication with other models.

A Survey of using Data Mining Techniques for Soil Fertility

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i2.7.11096 ◽

2018 ◽

Vol 7 (2.7) ◽

pp. 917

Author(s):

Madhuri Kommineni ◽

Someswari Perla ◽

Divya Bharathi Yedla

Keyword(s):

Data Mining ◽

Soil Fertility ◽

Weather Forecasting ◽

Large Data ◽

Soil Biology ◽

Weed Detection ◽

Nutrient Analysis ◽

Data Mining Techniques ◽

Development Data ◽

Using Data

Data mining is a technique that works on large data sets to extract information for prediction and for the discovery of hidden patterns. Data mining is applicable to various areas such as healthcare, insurance, marketing, retail, communication, and agriculture. Agriculture is the backbone of a country’s economy and an important source of livelihood. Agriculture mainly depends on climate, topography, soil, and biology. Agricultural mining is a technology that can bring knowledge to agricultural development. Data mining in agriculture plays a role in weather forecasting, yield prediction, soil fertility, fertilizer usage, fruit grading, plant disease, and weed detection. The current study presents the different data mining techniques and their role in the context of soil fertility and nutrient analysis.

P1-204: The Retrospective Analysis of a Large Data-Set Using Data-Monitoring Algorithms: What are the Logical Relationships between the ADAS-Cog and MMSE?

Alzheimer's & Dementia ◽

10.1016/j.jalz.2011.05.483 ◽

2011 ◽

Vol 7 ◽

pp. S177-S177

Author(s):

Christian Yavorsky

Keyword(s):

Retrospective Analysis ◽

Large Data ◽

Data Monitoring ◽

Data Set ◽

Large Data Set ◽

Using Data

Analysis of Crop Yield Prediction of Kharif & Rabi Jowar Crops Using Data Mining Techniques

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse.v7i11.468 ◽

2017 ◽

Vol 7 (11) ◽

pp. 79

Author(s):

Sujata Mulik

Keyword(s):

Data Mining ◽

Crop Yield ◽

Crop Production ◽

Climatic Factors ◽

Crop Productivity ◽

Yield Prediction ◽

Data Mining Techniques ◽

Agriculture Sector ◽

Using Data ◽

Rabi Crops

The agriculture sector in India is facing a severe problem in maximizing crop productivity. More than 60 percent of the crop still depends on climatic factors such as rainfall, temperature, and humidity. This paper discusses the use of various data mining applications in the agriculture sector. Data mining is used to solve various problems in the agriculture sector, including yield prediction. Yield prediction is a major problem that remains to be solved based on available data, and data mining techniques are a good choice for this purpose. Different data mining techniques are used and evaluated in agriculture for estimating the future year's crop production. In this paper we have focused on predicting the crop yield productivity of kharif and rabi crops.
