Stratification-Based Outlier Detection over the Deep Web

2016 ◽  
Vol 2016 ◽  
pp. 1-13 ◽  
Author(s):  
Xuefeng Xian ◽  
Pengpeng Zhao ◽  
Victor S. Sheng ◽  
Ligang Fang ◽  
Caidong Gu ◽  
...  

For many applications, finding rare instances or outliers can be more interesting than finding common patterns. Existing work on outlier detection has not considered the context of the deep web. In this paper, we argue that, in many scenarios, it is more meaningful to detect outliers over the deep web. In the context of the deep web, users must submit queries through a query interface to retrieve the corresponding data, so traditional data mining methods cannot be applied directly. The primary contribution of this paper is a new data mining method for outlier detection over the deep web. In our approach, the query space of a deep web data source is stratified based on a pilot sample. Neighborhood sampling and uncertainty sampling are then developed on top of this stratification with the goal of improving recall and precision. Finally, a careful performance evaluation confirms that our approach can effectively detect outliers in the deep web.
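The abstract does not spell out how the query space is stratified, so the following is only a generic sketch of the idea: pilot records are grouped into strata by one query attribute, and a further query budget is split across strata with Neyman allocation (stratum size times standard deviation). Neyman allocation is a standard sampling technique swapped in here for illustration and is not necessarily the authors' method. The record fields `category` and `price`, the pilot data, and the function names are all hypothetical.

```python
from collections import defaultdict
from statistics import pstdev

def stratify(pilot, key):
    """Group pilot records into strata by one query-attribute value."""
    strata = defaultdict(list)
    for record in pilot:
        strata[record[key]].append(record)
    return dict(strata)

def neyman_allocation(strata, value, budget):
    """Split a query budget across strata in proportion to size * std dev.

    Rounding means the shares may not sum exactly to the budget.
    """
    weights = {
        name: len(rows) * pstdev([r[value] for r in rows])
        for name, rows in strata.items()
    }
    total = sum(weights.values())
    if total == 0:
        # No observed variation in any stratum: fall back to an even split.
        return {name: budget // len(strata) for name in strata}
    return {name: round(budget * w / total) for name, w in weights.items()}

# Hypothetical pilot sample: each record is one row returned by a query.
pilot = [
    {"category": "laptops", "price": 900},
    {"category": "laptops", "price": 950},
    {"category": "laptops", "price": 4000},
    {"category": "phones", "price": 500},
    {"category": "phones", "price": 520},
]
strata = stratify(pilot, "category")
allocation = neyman_allocation(strata, "price", budget=100)
```

In this toy data the `laptops` stratum contains one extreme price, so it has far higher variance and receives almost all of the remaining budget. That matches the intuition that strata likely to hide outliers deserve more queries.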

2011 ◽  
Vol 2 (2) ◽  
pp. 1-21 ◽  
Author(s):  
Nenad Jukic ◽  
Svetlozar Nestorov ◽  
Miguel Velasco ◽  
Jami Eddington

Association rules mining is one of the most successfully applied data mining methods in today’s business settings (e.g., Amazon or Netflix recommendations to customers). Qualified association rules mining is an extension of this method that uncovers previously unknown correlations that manifest themselves only under certain circumstances (e.g., on a particular day of the week). Its goal is to improve the results of business actions, e.g., turning an underperforming campaign (spread too thin over the entire audience) into a highly targeted campaign that delivers results. Such correlations have not been easily reachable with standard data mining tools so far. This paper describes a method for the straightforward discovery of qualified association rules and demonstrates qualified association rules mining on an actual corporate data set. The data set is a subset of a corporate data warehouse for Sam’s Club, a division of Wal-Mart Stores, Inc. The experiments described in this paper illustrate how qualified association rules supplement standard association rules mining and provide additional information that can be used to better target corporate actions.
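As a rough illustration of the idea of a qualified rule, the sketch below computes the support and confidence of a rule over all transactions and then again over only the transactions that match a qualifier (here, a day of the week). This is not the paper's discovery method, only a minimal example of why a rule can look weak overall and strong under a condition. The transactions, item names, and the `rule_stats` helper are hypothetical.

```python
def rule_stats(transactions, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    n = len(transactions)
    # Transactions containing every antecedent item.
    with_a = [t for t in transactions if antecedent <= t["items"]]
    # Of those, the ones that also contain every consequent item.
    with_both = [t for t in with_a if consequent <= t["items"]]
    support = len(with_both) / n if n else 0.0
    confidence = len(with_both) / len(with_a) if with_a else 0.0
    return support, confidence

# Hypothetical market-basket data with a "day" qualifier.
transactions = [
    {"day": "Sat", "items": {"chips", "soda"}},
    {"day": "Sat", "items": {"chips", "soda", "bread"}},
    {"day": "Mon", "items": {"chips"}},
    {"day": "Mon", "items": {"chips", "bread"}},
    {"day": "Mon", "items": {"soda"}},
]

# Standard rule: chips -> soda over all transactions.
overall = rule_stats(transactions, {"chips"}, {"soda"})

# Qualified rule: chips -> soda, restricted to Saturday transactions.
saturday_only = [t for t in transactions if t["day"] == "Sat"]
saturday = rule_stats(saturday_only, {"chips"}, {"soda"})
```

In this toy data the rule has a confidence of 0.5 overall but 1.0 on Saturdays, which is the kind of condition-dependent correlation the abstract describes.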


Data from medical applications on the internet contain sensitive information. Several methods exist to protect the privacy of such data. Most privacy-preserving data mining methods assume that quasi-identifiers (QIDs) are separate from the multiple sensitive attributes. In reality, however, attributes in a dataset can have the features of both QIDs and sensitive data. In this paper, a privacy model, namely (vi…vj)-diversity, is proposed. The proposed anonymization algorithm works for databases containing numerous sensitive QIDs. A real dataset is used for performance evaluation. Our system reduces information loss even for a large number of attributes, and the values of sensitive QIDs are protected.


2019 ◽  
Vol 281 ◽  
pp. 05003
Author(s):  
Reem Razzaq Abdul Hussein ◽  
Dr. Muayad Sadik Croock ◽  
Dr. Salih Mahdi Al-Qaraawi

Data-mining methods, which can be optimized in different ways, are applied in crime detection. In this work, the decision tree algorithm is used for classification, and its structure is optimized with a smart method. The method is applied to two datasets of criminals, one from Iraq and one from India. The goal of the proposed method is to identify criminals using a mining method based on smart search. This contribution yields better results than those provided by traditional mining methods by controlling the size of the tree through a decrease in leaf size.


2019 ◽  
Vol 8 (2) ◽  
pp. 5766-5774

In today's market, banks face cut-throat competition and struggle hard to gain a competitive advantage over each other. The banking industry has undergone tremendous changes in the way business is conducted. Banks have realized the need for data mining techniques, which are helpful tools to gather, store, and capture data and convert it into knowledge. The application of data mining enhances the performance of the telemarketing process in the banking industry. It also provides insight into how these techniques can be used effectively in the banking industry to make the decision-making process easier and more productive. This work describes a data mining approach to extract valuable knowledge and information from bank telemarketing campaign data. The potential of five data mining methods was explored for forecasting term deposit subscription. The performance of these techniques was evaluated on fourteen different classifier parameters. The best overall performance was achieved by the J48 decision tree, which correctly classified 91.2% of instances, with sensitivity, specificity, and the lowest error rate of 53.8%, 95.9%, and 8.8%, respectively.
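For readers unfamiliar with the measures reported above, the following sketch shows how accuracy, sensitivity, specificity, and error rate are conventionally derived from a binary confusion matrix. The counts used here are hypothetical and are not taken from the paper; the function name `classifier_metrics` is also an assumption for illustration.

```python
def classifier_metrics(tp, fn, tn, fp):
    """Compute standard binary-classification measures from confusion counts.

    tp/fn: positives (e.g., subscribers) predicted correctly/incorrectly.
    tn/fp: negatives (e.g., non-subscribers) predicted correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    return {
        "accuracy": accuracy,            # share of all instances classified correctly
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "error_rate": 1 - accuracy,      # share of all instances misclassified
    }

# Hypothetical counts, chosen only to keep the arithmetic easy to follow.
metrics = classifier_metrics(tp=40, fn=10, tn=45, fp=5)
```

Accuracy and error rate always sum to 1, which is consistent with the reported 91.2% and 8.8%. A sensitivity much lower than the specificity, as reported for J48, typically arises when the positive class (here, subscribers) is much rarer than the negative class.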


Author(s):  
Ari Supriadi ◽  
Poningsih P ◽  
Hendry Qurniawan

Customer satisfaction is the most important factor in assessing the level of management and services provided by a bank to its customers. The existence of banking services benefits society, especially in the economic sector, where economic actors are freer to carry out economic activities to support their livelihoods. Data mining is the analysis of large amounts of observed data to find relationships that were not known beforehand. Data processed by a data mining method produce new knowledge from old data, and the results can be used to inform future decisions. The C4.5 algorithm is used to predict which aspects are most dominant in customer satisfaction. The data for this research were collected through a questionnaire filled out by customers of Bank Syariah Mandiri in Pematangsiantar City. The data are processed by calculating the entropy value and the gain value. The final result, in the form of a decision tree, is expected to serve as input to Bank Syariah Mandiri in Pematangsiantar City for maintaining and improving the quality of its services so that customers remain satisfied.
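The entropy and gain calculations mentioned above can be sketched as follows. This is a minimal illustration of entropy and information gain for categorical attributes, not the study's implementation; note that C4.5 normally ranks splits by gain ratio, which further divides the gain by the split information. The questionnaire rows and the attribute names `service`, `queue`, and `satisfied` are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Entropy of the target minus its weighted entropy after splitting on attribute."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder

# Hypothetical questionnaire answers.
rows = [
    {"service": "good", "queue": "short", "satisfied": "yes"},
    {"service": "good", "queue": "long", "satisfied": "yes"},
    {"service": "poor", "queue": "short", "satisfied": "no"},
    {"service": "poor", "queue": "long", "satisfied": "no"},
]
gain_service = information_gain(rows, "service", "satisfied")
gain_queue = information_gain(rows, "queue", "satisfied")
```

In this toy data, `service` separates satisfied from unsatisfied customers perfectly (a gain of 1 bit), while `queue` carries no information (a gain of 0), so a tree builder would place `service` at the root.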


Author(s):  
I.M. Burykin ◽  
G.N. Aleeva ◽  
R.Kh. Khafizianova ◽  
...  