Discovery of Characteristic Sequential Patterns Based on Two Types of Constraints

Shigeaki Sakurai

doi:10.4018/ijeach.2019010105

Discovery of Characteristic Sequential Patterns Based on Two Types of Constraints

International Journal of Extreme Automation and Connectivity in Healthcare ◽

10.4018/ijeach.2019010105 ◽

2019 ◽

Vol 1 (1) ◽

pp. 40-54

Author(s):

Shigeaki Sakurai

Keyword(s):

Pattern Discovery ◽

Evaluation Criteria ◽

Background Knowledge ◽

Structured Data ◽

Sequential Patterns ◽

Time Constraints ◽

Sequential Data ◽

Healthcare Data ◽

Discovery Method ◽

Attribute Value

This article proposes a method for discovering characteristic sequential patterns from sequential data by using background knowledge. In tabular structured data, each item is composed of an attribute and an attribute value. The article focuses on two types of constraints that describe background knowledge. The first is time constraints, which flexibly describe time-related relationships between items. The second is item constraints, which select the items that may be included in sequential patterns. Together, these constraints represent background knowledge that reflects the interests of analysts, so analysts can easily discover sequential patterns that coincide with their interests as characteristic sequential patterns. Lastly, the article verifies the effect of the pattern discovery method, which is based on both the evaluation criteria of sequential patterns and the background knowledge. The method can be applied to the analysis of healthcare data.
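To make the two constraint types concrete, the following is a minimal sketch and not the article's algorithm. It assumes one particular time constraint (a maximum gap between consecutive matched items) and one particular item constraint (a set of attributes that a pattern must mention). The names `occurs_within_gap`, `satisfies_item_constraint`, `max_gap`, and the sample records are hypothetical.

```python
from typing import List, Set, Tuple

Item = Tuple[str, str]       # (attribute, attribute value)
Event = Tuple[float, Item]   # (timestamp, item)


def occurs_within_gap(sequence: List[Event], pattern: List[Item], max_gap: float) -> bool:
    """Return True if `pattern` occurs in order in `sequence` with at most
    `max_gap` time units between consecutive matched items.
    Assumes `sequence` is sorted by timestamp."""
    def search(start: int, k: int, prev_time: float) -> bool:
        if k == len(pattern):
            return True
        for i in range(start, len(sequence)):
            t, item = sequence[i]
            if k > 0 and t - prev_time > max_gap:
                break  # all later events are even further from the previous match
            if item == pattern[k] and search(i + 1, k + 1, t):
                return True
        return False

    return search(0, 0, 0.0)


def satisfies_item_constraint(pattern: List[Item], required_attributes: Set[str]) -> bool:
    """Assumed item constraint: the pattern must mention every required attribute."""
    return required_attributes <= {attribute for attribute, _ in pattern}


# Hypothetical healthcare-style sequence: (day, (attribute, value)).
visits = [
    (0, ("blood_pressure", "high")),
    (2, ("medication", "A")),
    (10, ("blood_pressure", "normal")),
]
pattern = [("medication", "A"), ("blood_pressure", "normal")]

print(occurs_within_gap(visits, pattern, max_gap=10))
print(occurs_within_gap(visits, pattern, max_gap=5))
print(satisfies_item_constraint(pattern, {"medication"}))
```

The search backtracks instead of matching greedily because, under a maximum-gap constraint, the earliest occurrence of an item is not always the one that lets the rest of the pattern match.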


Introduction of Item Constraints to Discover Characteristic Sequential Patterns

Emerging Perspectives in Big Data Warehousing - Advances in Data Mining and Database Management ◽

10.4018/978-1-5225-5516-2.ch011 ◽

2019 ◽

pp. 279-292

Author(s):

Shigeaki Sakurai

Keyword(s):

Background Knowledge ◽

Structured Data ◽

Sequential Patterns ◽

Sequential Data ◽

Special Case ◽

Attribute Value

This chapter introduces a method that discovers characteristic sequential patterns from sequential data based on background knowledge. The sequential data is composed of rows of items. The chapter focuses on sequential data based on tabular structured data; that is, each item is composed of an attribute and an attribute value. It also focuses on item constraints as a way to describe the background knowledge. The constraints describe the combination of items included in sequential patterns and can represent the interests of analysts. Analysts can therefore easily discover sequential patterns that coincide with their interests as characteristic sequential patterns. In addition, the chapter focuses on a special case of the item constraints in which the constraint is placed on the last item of the sequential patterns. The discovered patterns can be used to analyze causes and reasons, and can predict the last item when a sub-sequence is given. The chapter introduces the property of the item constraints for the last item.
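As an illustration of the last-item special case, here is a small sketch under stated assumptions. It is not the chapter's method: the pattern table, its support counts, and the rule of predicting with the highest-support matching pattern are all hypothetical choices made for the example, as are the function names.

```python
from typing import Dict, Iterable, Optional, Sequence, Tuple

Item = Tuple[str, str]        # (attribute, attribute value)
Pattern = Tuple[Item, ...]


def is_subsequence(short: Sequence[Item], long: Iterable[Item]) -> bool:
    """True if the items of `short` appear in `long` in the same order."""
    remaining = iter(long)
    return all(item in remaining for item in short)


def keep_last_item_patterns(patterns: Dict[Pattern, int], target_attribute: str) -> Dict[Pattern, int]:
    """Keep only patterns whose last item has the target attribute."""
    return {p: s for p, s in patterns.items() if p and p[-1][0] == target_attribute}


def predict_last_item(patterns: Dict[Pattern, int], observed: Sequence[Item]) -> Optional[Item]:
    """Predict a last item from the highest-support pattern whose
    leading items are contained in `observed` (an assumed tie-free rule)."""
    best, best_support = None, -1
    for pattern, support in patterns.items():
        prefix = pattern[:-1]
        if prefix and is_subsequence(prefix, observed) and support > best_support:
            best, best_support = pattern[-1], support
    return best


# Hypothetical discovered patterns mapped to their support counts.
patterns = {
    (("symptom", "cough"), ("diagnosis", "cold")): 5,
    (("symptom", "cough"), ("symptom", "fever"), ("diagnosis", "flu")): 8,
    (("symptom", "cough"), ("symptom", "fever")): 12,
}
constrained = keep_last_item_patterns(patterns, "diagnosis")
observed = [("symptom", "cough"), ("symptom", "fever")]
print(predict_last_item(constrained, observed))
```

Constraining the last item turns each retained pattern into something that reads like a rule: the leading items act as a condition and the last item as the outcome.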


Sequential Pattern Mining from Sequential Data

Handbook of Research on Innovations in Database Technologies and Applications ◽

10.4018/978-1-60566-242-8.ch067 ◽

2009 ◽

pp. 622-631

Author(s):

Shigeaki Sakurai

Keyword(s):

Pattern Mining ◽

Pattern Discovery ◽

Sequential Pattern ◽

The Other ◽

Sequential Patterns ◽

Sequential Data ◽

Frequent Patterns ◽

New Knowledge ◽

Discovery Method ◽

Time Information

Owing to the progress of computer and network environments, it is easy to collect data with time information such as daily business reports, weblog data, and physiological information. This is the context in which methods of analyzing data with time information have been studied. This chapter focuses on a sequential pattern discovery method from discrete sequential data. The methods proposed by Pei et al. (2001), Srikant & Agrawal (1996), and Zaki (2001) efficiently discover the frequent patterns as characteristic patterns. However, the discovered patterns do not always correspond to the interests of analysts, because the patterns are common and are not a source of new knowledge for the analysts. The problem has been pointed out in connection with the discovery of association rules. Blanchard et al. (2005), Brin et al. (1997), Silberschatz et al. (1996), and Suzuki et al. (2005) propose other criteria in order to discover other kinds of characteristic patterns. The patterns discovered by these criteria are not always frequent but are characteristic from particular viewpoints. The criteria may be applicable to discovery methods of sequential patterns. However, these criteria do not satisfy the Apriori property, so it is difficult for methods based on them to discover the patterns efficiently. On the other hand, methods that use the background knowledge of analysts have been proposed in order to discover sequential patterns corresponding to the interests of analysts (Garofalakis et al., 1999; Pei et al., 2002; Sakurai et al., 2008b; Yen, 2005).
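The Apriori property mentioned above can be illustrated with a short sketch. It counts support as the number of sequences that contain a pattern as a subsequence, which is the usual frequency criterion, and shows that extending a pattern never increases its support. The toy database and function names are assumptions made for the example and do not come from the chapter.

```python
from typing import Iterable, List, Sequence


def is_subsequence(pattern: Sequence[str], sequence: Iterable[str]) -> bool:
    """True if the items of `pattern` appear in `sequence` in the same order."""
    remaining = iter(sequence)
    return all(item in remaining for item in pattern)


def support(pattern: Sequence[str], database: List[List[str]]) -> int:
    """Number of sequences in `database` that contain `pattern`."""
    return sum(is_subsequence(pattern, sequence) for sequence in database)


# Hypothetical toy database of four discrete sequences.
database = [list("abc"), list("acb"), list("ab"), list("bc")]

for candidate in (["a"], ["a", "b"], ["a", "b", "c"]):
    print(candidate, support(candidate, database))
```

Because support can only stay the same or drop as a pattern grows, a miner can discard every extension of an infrequent pattern without counting it. Criteria that lack this property give no such pruning rule, which is the efficiency problem the abstract points to.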


A Discovery Method of Attractive Rules from the Tabular Structured Data

Intelligent Data Analysis for Real-Life Applications ◽

10.4018/978-1-4666-1806-0.ch001 ◽

2012 ◽

pp. 1-17

Author(s):

Shigeaki Sakurai

Keyword(s):

Data Collection ◽

Data Storage ◽

Missing Values ◽

Evaluation Method ◽

Evaluation Criteria ◽

Research Field ◽

Structured Data ◽

Frequent Patterns ◽

Retail Business ◽

Discovery Method

This chapter introduces a method for discovering attractive rules from tabular structured data. The data is a set of examples composed of attributes and their attribute values. The method belongs to the research field of discovering frequent patterns from transactions composed of items. In the retail business, for example, a transaction is a receipt and an item is a sales item. The method focuses on relationships between the attributes and the attribute values in order to efficiently discover patterns based on their frequencies from the tabular structured data. The method also needs to deal with missing values, because some attribute values are missing owing to problems in data collection and data storage. Thus, this chapter introduces a method for dealing with the missing values. The method defines two evaluation criteria related to the patterns and discovers the patterns by means of a two-step evaluation method. In addition, this chapter introduces evaluation criteria for the attractive rules in order to discover the rules from the patterns.


Applications of Pattern Discovery Using Sequential Data Mining

Pattern Discovery Using Sequence Data Mining ◽

10.4018/978-1-61350-056-9.ch001 ◽

2012 ◽

pp. 1-23 ◽

Cited By ~ 8

Author(s):

Manish Gupta ◽

Jiawei Han

Keyword(s):

Data Mining ◽

Text Mining ◽

Intrusion Detection ◽

Pattern Mining ◽

Pattern Discovery ◽

Sequential Pattern Mining ◽

Web Usage Mining ◽

Sequential Pattern ◽

Sequential Data ◽

Mining Methods

Sequential pattern mining methods have been found to be applicable in a large number of domains. Sequential data is omnipresent. Sequential pattern mining methods have been used to analyze this data and identify patterns. Such patterns have been used to implement efficient systems that can recommend based on previously observed patterns, help in making predictions, improve usability of systems, detect events, and in general help in making strategic product decisions. In this chapter, we discuss the applications of sequential data mining in a variety of domains like healthcare, education, Web usage mining, text mining, bioinformatics, telecommunications, intrusion detection, et cetera. We conclude with a summary of the work.


Data Pattern Tutor for AprioriAll and PrefixSpan

Encyclopedia of Data Warehousing and Mining, Second Edition ◽

10.4018/978-1-60566-010-3.ch083 ◽

2011 ◽

pp. 531-537

Author(s):

Mohammed Alshalalfa

Keyword(s):

Data Mining ◽

Computer Literacy ◽

User Study ◽

Sequential Patterns ◽

Sequential Data ◽

Educational Value ◽

Data Mining Algorithms ◽

New Meanings ◽

Mining Works ◽

Mining Algorithms

Data mining can be described as data processing using sophisticated data search capabilities and statistical algorithms to discover patterns and correlations in large pre-existing databases (Agrawal & Srikant 1995; Zhao & Sourav 2003). From these patterns, new and important information can be obtained that will lead to the discovery of new meanings which can then be translated into enhancements in many current fields. In this paper, we focus on the usability of sequential data mining algorithms. Based on a conducted user study, many of these algorithms are difficult to comprehend. Our goal is to make an interface that acts as a “tutor” to help the users understand better how data mining works. We consider two of the algorithms more commonly used by our students for discovering sequential patterns, namely the AprioriAll and the PrefixSpan algorithms. We hope to generate some educational value, such that the tool could be used as a teaching aid for comprehending data mining algorithms. We concentrated our effort to develop the user interface to be easy to use by naïve end users with minimum computer literacy; the interface is intended to be used by beginners. This will help in having a wider audience and users for the developed tool.


Traversal Pattern Mining in Web Usage Data

Data Warehousing and Mining ◽

10.4018/978-1-59904-951-9.ch119 ◽

2008 ◽

pp. 2004-2021

Author(s):

Jenq-Foung Yao ◽

Yongqiao Xiao

Keyword(s):

Pattern Mining ◽

Pattern Discovery ◽

Web Usage Mining ◽

Sequential Patterns ◽

Web Usage ◽

Web Logs ◽

Frequent Episodes ◽

Browsing Behavior ◽

The Web ◽

Usage Data

Web usage mining aims to discover useful patterns in web usage data; these patterns provide useful information about users' browsing behavior. This chapter examines different types of web usage traversal patterns and the related techniques used to uncover them, including Association Rules, Sequential Patterns, Frequent Episodes, Maximal Frequent Forward Sequences, and Maximal Frequent Sequences. The preprocessing of the web logs is described as a necessary step for pattern discovery. Some important issues, such as privacy and sessionization, are raised, and possible solutions are also discussed.
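Sessionization, one of the preprocessing issues named above, can be sketched as follows. This example assumes a simple inactivity-timeout heuristic with a 30-minute threshold, which is a common convention rather than something the chapter specifies. The function name, the timeout value, and the sample hits are hypothetical.

```python
from typing import List, Tuple


def sessionize(hits: List[Tuple[int, str]], timeout: int = 1800) -> List[List[str]]:
    """Split one user's (timestamp_seconds, page) hits into sessions.
    A new session starts when the gap since the previous hit exceeds `timeout`."""
    sessions: List[List[str]] = []
    last_time = None
    for timestamp, page in sorted(hits):
        if last_time is None or timestamp - last_time > timeout:
            sessions.append([])  # start a new session
        sessions[-1].append(page)
        last_time = timestamp
    return sessions


# Hypothetical hits for a single user, in seconds since the first request.
hits = [(0, "/home"), (60, "/products"), (4000, "/home"), (4100, "/cart")]
print(sessionize(hits))
```

Each resulting session is an ordered list of pages, which is the kind of sequence that traversal pattern mining techniques take as input.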


Methodologies and Techniques of Web Usage Mining

Advances in Data Mining and Database Management - Web Usage Mining Techniques and Applications Across Industries ◽

10.4018/978-1-5225-0613-3.ch011 ◽

2017 ◽

pp. 275-296

Author(s):

T. Venkat Narayana Rao ◽

D. Hiranmayi

Keyword(s):

Web Mining ◽

Pattern Analysis ◽

Pattern Discovery ◽

Secondary Data ◽

Web Usage Mining ◽

Sequential Patterns ◽

Useful Knowledge ◽

Web Usage ◽

Automatic Discovery ◽

Collection Data

Web usage mining attempts to discover useful knowledge from the secondary data obtained from the interactions of users with the Web. It is the type of Web mining activity that involves the automatic discovery of what users are looking for on the Internet. This chapter explains the methodology of Web usage mining in detail: data collection, data preprocessing, knowledge discovery, and pattern analysis. It describes the different Web usage mining techniques used for knowledge and pattern discovery: statistical analysis, sequential patterns, classification, association rule mining, clustering, and dependency modeling. Pattern analysis is needed to filter out uninteresting rules or patterns from the set found in the pattern discovery phase.


Efficient mining of sequential patterns with time constraints by delimited pattern growth

Knowledge and Information Systems ◽

10.1007/s10115-004-0182-5 ◽

2005 ◽

Vol 7 (4) ◽

pp. 499-514 ◽

Cited By ~ 20

Author(s):

Ming-Yen Lin ◽

Suh-Yin Lee

Keyword(s):

Sequential Patterns ◽

Time Constraints ◽

Pattern Growth


Pre-processing time constraints for efficiently mining generalized sequential patterns

Proceedings. 11th International Symposium on Temporal Representation and Reasoning, 2004. TIME 2004. ◽

10.1109/time.2004.1314424 ◽

2004 ◽

Cited By ~ 14

Author(s):

F. Masseglia ◽

P. Poncelet ◽

M. Teisseire

Keyword(s):

Processing Time ◽

Sequential Patterns ◽

Time Constraints


Mining of Sequential Patterns using Directed Graphs

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.k2242.0981119 ◽

2019 ◽

Vol 8 (11) ◽

pp. 4002-4007

Keyword(s):

Pattern Mining ◽

Directed Graphs ◽

Real Life ◽

Sequential Pattern Mining ◽

Sequential Pattern ◽

Sequential Patterns ◽

Sequential Data ◽

Sequence Database ◽

Directed Paths ◽

Digraph Model

Sequential pattern mining is one of the important functionalities of data mining. It analyzes a sequential database and discovers sequential patterns, focusing on extracting interesting subsequences from a set of sequences. Various factors such as rate of occurrence, length, and profit are used to define the interestingness of a subsequence derived from the sequence database. Sequential pattern mining has abundant real-life applications, since sequential data is naturally encoded as sequences of symbols in many fields such as bioinformatics, e-learning, market basket analysis, texts, and webpage click-stream analysis. A wide variety of efficient algorithms, such as PrefixSpan, GSP, and FreeSpan, have been proposed during the past few years. In this paper we propose a data model for organizing the sequential database, which consists of a directed graph DGS (cycles and multiple edges are allowed) and an organization of directed paths in DGS that represents sequential data for discovering sequential patterns from a sequence database. Efficient algorithms for constructing the digraph model (DGS), extracting all sequential patterns, and mining association rules are proposed. A number of theoretical parameters of the digraph model are also introduced, which lead to a better understanding of the problem.
