Combining Text Mining and Data Mining for Bug Report Classification

Journal of Software Evolution and Process ◽

10.1002/smr.1770 ◽

2016 ◽

Vol 28 (3) ◽

pp. 150-176 ◽

Cited By ~ 36

Author(s):

Yu Zhou ◽

Yanxiang Tong ◽

Ruihang Gu ◽

Harald Gall

Keyword(s):

Data Mining ◽

Text Mining ◽

Bug Report

Plagiarism Detection Process using Data Mining Techniques

International Journal of Recent Contributions from Engineering Science & IT (iJES) ◽

10.3991/ijes.v5i4.7869 ◽

2017 ◽

Vol 5 (4) ◽

pp. 68

Author(s):

Mahwish Abid ◽

Muhammad Usman ◽

Muhammad Waleed Ashraf

Keyword(s):

Data Mining ◽

Text Mining ◽

Computer Systems ◽

Plagiarism Detection ◽

Data Mining Techniques ◽

Detection Process ◽

Using Data ◽

Day By Day

Because technology is advancing rapidly and computer systems are used far more widely than in the past, plagiarism is increasing day by day. Plagiarism is the wrongful appropriation of someone else’s work. Manual detection of plagiarism is difficult, so the process should be automated. Various tools can be used for plagiarism detection: some work on intrinsic plagiarism, while others work on extrinsic plagiarism. Data mining is a field that can help detect plagiarism and improve the efficiency of the process. Different data mining techniques can be used to detect plagiarism, including text mining, clustering, and bi-grams, tri-grams, and n-grams.

Use of text mining techniques for unsupervised organization of digital procedural acts

Revista de Informática Teórica e Aplicada ◽

10.22456/2175-2745.83581 ◽

2018 ◽

Vol 25 (4) ◽

pp. 74

Author(s):

Alfredo Silveira Araújo Neto ◽

Marcos Negreiros

Keyword(s):

Data Mining ◽

Text Mining ◽

Text Documents ◽

Digital Format ◽

Large Databases ◽

Context Data ◽

Self Discovery ◽

The Many ◽

Many Sources ◽

And Storage

Rapid advances in technologies for capturing and storing data in digital format have allowed organizations to accumulate an extremely high volume of information, most of it unstructured data in the form of text. However, retrieving useful information from these large repositories has proven very challenging. In this context, data mining is presented as a self-discovery process that acts on large databases and enables knowledge extraction from raw text documents. Among the many sources of textual documents are the electronic diaries of justice, which are intended to officially publish all acts of the Judiciary. Although publication in digital form has brought improvements by removing imperfections associated with dissemination in print, applying data mining methods could make analysis of their contents faster. Accordingly, this article presents a tool capable of automatically grouping and categorizing digital procedural acts, based on an evaluation of text mining techniques applied to the task of determining groups. In addition, the strategy for defining group descriptors, which is usually based on the most frequent words in the documents, was evaluated and remodeled to use the most regularly identified concepts in the texts instead of words.

Integrating text mining, data mining, and network analysis for identifying genetic breast cancer trends

BMC Research Notes ◽

10.1186/s13104-016-2023-5 ◽

2016 ◽

Vol 9 (1) ◽

Cited By ~ 14

Author(s):

Gabriela Jurca ◽

Omar Addam ◽

Alper Aksac ◽

Shang Gao ◽

Tansel Özyer ◽

...

Keyword(s):

Breast Cancer ◽

Data Mining ◽

Network Analysis ◽

Text Mining ◽

Cancer Trends

Tools in the field of Data Mining and Text Mining Applications

International Journal of Advance Engineering and Research Development ◽

10.21090/ijaerd.18118 ◽

2017 ◽

Vol 4 (12) ◽

Keyword(s):

Data Mining ◽

Text Mining

Detection of Non-Technical Losses

Advances in Secure Computing, Internet Services, and Applications - Advances in Information Security, Privacy, and Ethics ◽

10.4018/978-1-4666-4940-8.ch008 ◽

2014 ◽

pp. 140-164 ◽

Cited By ~ 1

Author(s):

Juan I. Guerrero ◽

Íñigo Monedero ◽

Félix Biscarri ◽

Jesús Biscarri ◽

Rocío Millán ◽

...

Keyword(s):

Data Mining ◽

Neural Networks ◽

Pattern Recognition ◽

Text Mining ◽

Test Phase ◽

Automated System ◽

Statistical Techniques ◽

Research Fields ◽

Power Utilities ◽

The University

The MIDAS project began in 2006 as a collaboration between Endesa, Sadiel, and the University of Seville. The objective of the MIDAS project is the detection of Non-Technical Losses (NTLs) in power utilities. NTLs represent non-billed energy due to faults or illegal manipulations in clients’ facilities. Initially, the research lines studied the application of data mining and neural network techniques. After several studies, the work expanded to other research fields: expert systems, text mining, statistical techniques, pattern recognition, etc. These techniques have provided an automated system for detecting NTLs in company databases. This system is in the test phase and is being applied to real cases in company databases.

Applications of Pattern Discovery Using Sequential Data Mining

Pattern Discovery Using Sequence Data Mining ◽

10.4018/978-1-61350-056-9.ch001 ◽

2012 ◽

pp. 1-23 ◽

Cited By ~ 8

Author(s):

Manish Gupta ◽

Jiawei Han

Keyword(s):

Data Mining ◽

Text Mining ◽

Intrusion Detection ◽

Pattern Mining ◽

Pattern Discovery ◽

Sequential Pattern Mining ◽

Web Usage Mining ◽

Sequential Pattern ◽

Sequential Data ◽

Mining Methods

Sequential pattern mining methods have been found to be applicable in a large number of domains. Sequential data is omnipresent. Sequential pattern mining methods have been used to analyze this data and identify patterns. Such patterns have been used to implement efficient systems that can recommend based on previously observed patterns, help in making predictions, improve usability of systems, detect events, and in general help in making strategic product decisions. In this chapter, we discuss the applications of sequential data mining in a variety of domains like healthcare, education, Web usage mining, text mining, bioinformatics, telecommunications, intrusion detection, et cetera. We conclude with a summary of the work.

Analytical Competition for Managing Customer Relations

Encyclopedia of Data Warehousing and Mining, Second Edition ◽

10.4018/978-1-60566-010-3.ch005 ◽

2011 ◽

pp. 25-30 ◽

Cited By ~ 1

Author(s):

Dan Zhu

Keyword(s):

Data Mining ◽

World Wide Web ◽

Text Mining ◽

World Wide ◽

Customer Relationship ◽

Service Access ◽

Web Content ◽

The World ◽

Content Mining ◽

Automated Tools

With the advent of technology, information is available in abundance on the World Wide Web. In order to have appropriate and useful information, users must increasingly use techniques and automated tools to search, extract, filter, analyze, and evaluate desired information and resources. Data mining can be defined as the extraction of implicit, previously unknown, and potentially useful information from large databases. Text mining, on the other hand, is the process of extracting information from unstructured text. A standard text mining approach involves text categorization, text clustering, concept extraction, production of granular taxonomies, sentiment analysis, document summarization, and modeling (Fan et al., 2006). Furthermore, Web mining is the discovery and analysis of useful information using the World Wide Web (Berry, 2002; Mobasher, 2007). This broad definition encompasses “web content mining,” the automated search for resources and retrieval of information from millions of websites and online databases, as well as “web usage mining,” the discovery and analysis of users’ website navigation and online service access patterns. Companies are investing significant amounts of time and money in creating, developing, and enhancing individualized customer relationships, a process called customer relationship management or CRM. Based on a report by the Aberdeen Group, worldwide CRM spending reached close to $20 billion by 2006. Today, to improve the customer relationship, most companies collect and refine massive amounts of data available through their customers. To increase the value of current information resources, data mining techniques can be rapidly implemented on existing software and hardware platforms, and integrated with new products and systems (Wang et al., 2008). If implemented on high-performance client/server or parallel processing computers, data mining tools can analyze enormous databases to answer customer-centric questions such as, “Which clients have the highest likelihood of responding to my next promotional mailing, and why?” This paper provides a basic introduction to data mining and other related technologies and their applications in CRM.

An Application of Text Mining to Capture and Analyze eWOM

Advances in Marketing, Customer Relationship Management, and E-Services - Capturing, Analyzing, and Managing Word-of-Mouth in the Digital Marketplace ◽

10.4018/978-1-4666-9449-1.ch010 ◽

2016 ◽

pp. 168-186 ◽

Cited By ~ 2

Author(s):

Taşkın Dirsehan

Keyword(s):

Data Mining ◽

Text Mining ◽

Customer Relationship ◽

Competitive Advantages ◽

Strategic Decisions ◽

Data Mining Tool ◽

Mining Tool ◽

The Moment ◽

Mining Tools

The marketing concept has progressed through different phases of evolution. At present, customer relationship management is considered the latest era of marketing development. The main purpose of this approach is to build long-term, profitable relationships with customers, so companies should know their customers better. This knowledge can be created through deeper analysis of companies' data with data mining tools. Companies that are able to use data mining tools will gain strong competitive advantages for their strategic decisions. The hotel industry is selected in this study because it provides a warehouse of customer comments from which valuable knowledge can be obtained if text mining is used appropriately as a data mining tool. Thus, this study attempts to explain the stages of text mining using RapidMiner. As a result, different approaches based on customer satisfaction and dissatisfaction are discussed to build competitive advantages.

Text Mining in the Context of Business Intelligence

Encyclopedia of Information Science and Technology, First Edition ◽

10.4018/978-1-59140-553-5.ch496 ◽

2005 ◽

pp. 2793-2798 ◽

Cited By ~ 1

Author(s):

Hércules Antonio do Prado ◽

José Palazzo Moreira de Oliveira ◽

Edilson Ferneda ◽

Leandro Krug Wives ◽

Edilberto Magalhães Silva ◽

...

Keyword(s):

Data Mining ◽

Text Mining ◽

Knowledge Discovery ◽

Business Intelligence ◽

External Environment ◽

Organizational Processes ◽

External Monitoring ◽

New Applications ◽

Textual Form ◽

Organizational Problems

Information about the external environment and organizational processes is among the most worthwhile inputs for business intelligence (BI). Nowadays, companies have plenty of information in structured or textual form, either from external monitoring or from corporate systems. In recent years, the structured part of this information stock has been massively explored by means of data-mining (DM) techniques (Wang, 2003), generating models that enable analysts to gain insights into solutions for organizational problems. On the text-mining (TM) side, the pace of new application development has been slower. In an informal poll carried out in 2002 (Kdnuggets), just 4% of knowledge-discovery-from-databases (KDD) practitioners were applying TM techniques. This fact is as intriguing as it is surprising if one considers that 80% of all information available in an organization comes in textual form (Tan, 1999).
