EFFICIENT PREPROCESSING FOR WEB LOG COMPRESSION

2014 ◽  
pp. 35-42
Author(s):  
Sebastian Deorowicz ◽  
Szymon Grabowski

Web log files, which store user activity on a server, may grow at a pace of hundreds of megabytes a day, or even more, on popular sites. They are usually archived, as this enables further analysis, e.g., for detecting attacks or other server-abuse patterns. In this work we present a specialized lossless Apache web log preprocessor and test it in combination with several popular general-purpose compressors. Our method works on the individual fields of the log data (each storing information such as the client’s IP address, date/time, requested file or query, download size in bytes, etc.) and uses compression techniques such as finding and extracting common prefixes and suffixes, dictionary-based phrase-sequence substitution, move-to-front coding, and more. The test results show that the proposed transform improves the average compression ratio 2.70 times for gzip and 1.86 times for bzip2.
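Move-to-front coding, one of the techniques the preprocessor combines, can be sketched as follows; this is an illustrative byte-level implementation, not the authors' field-specific code:

```python
def mtf_encode(data: bytes) -> list:
    """Move-to-front encoding: recently seen symbols map to small
    indices, which back-end entropy coders can exploit."""
    alphabet = list(range(256))
    out = []
    for b in data:
        idx = alphabet.index(b)
        out.append(idx)
        alphabet.pop(idx)
        alphabet.insert(0, b)  # move the symbol to the front
    return out

def mtf_decode(codes: list) -> bytes:
    """Inverse transform: replay the same alphabet updates."""
    alphabet = list(range(256))
    out = bytearray()
    for idx in codes:
        b = alphabet.pop(idx)
        out.append(b)
        alphabet.insert(0, b)
    return bytes(out)
```

After MTF, runs of repeated symbols become runs of zeros, which general-purpose compressors such as gzip or bzip2 encode cheaply.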

Author(s):  
Dheeraj Ahuja

Today, we spend much of our time online using some form of digital technology, such as search engines, news portals, or social media sites. Our online presence keeps us engaged for much of the day and provides a wealth of information about Internet users. The web is growing rapidly: about a million pages are added every day. Due to this massive use of the network, web log files grow at an ever faster rate and reach enormous sizes. Web usage mining applies mining techniques to log data to extract patterns of user behaviour, which are used in applications such as site design support, e-commerce, service modification, prefetching, etc. In this paper, we propose a tool that site owners can use to collect web log data on their websites and then track user interactions, which helps with targeted communication.
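As a rough sketch of the kind of collection such a tool performs, the snippet below parses one line of the Apache common log format into per-request fields; the regular expression and example line are illustrative assumptions, not the proposed tool itself:

```python
import re

# Hypothetical pattern for an Apache common-log-format line:
# ip identd user [time] "method path protocol" status size
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_line(line):
    """Return a dict of request fields, or None if the line is malformed."""
    m = LOG_RE.match(line)
    return m.groupdict() if m else None

line = ('192.0.2.1 - - [10/Oct/2023:13:55:36 +0000] '
        '"GET /index.html HTTP/1.1" 200 2326')
rec = parse_line(line)
```

Fields extracted this way (IP, timestamp, requested path, status, size) are the raw material for the usage-mining steps the paper describes.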


2018 ◽  
Vol 7 (2.12) ◽  
pp. 171
Author(s):  
Jae Kyeong Lee ◽  
Mi Hwan Hyun ◽  
Dong Gu Shin

Background/Objectives: To measure occupancy using a transition probability matrix as a data-analysis method for predicting future web-usage requirements. With this study, executives facing business challenges can improve management decision-making and be provided with quantified evidence.
Methods/Statistical analysis: A transition matrix and a transition probability matrix are estimated from web log data by counting the frequencies of users’ webpage-use patterns. Occupancy is then forecast with a Markov chain model.
Findings: Data analysis for web-log-based marketing mostly focuses on increasing traffic and improving transition rates. However, general-purpose tools such as Google Analytics provide diverse web log data. Under an independence assumption on users’ page transitions, occupancy can easily be estimated from the transition matrix. As a result, we obtained slightly different results from the usual method, which reports only frequencies. In particular, rather than making business decisions on absolute frequencies, we were able to identify the top-priority services through relative percentage values.
Improvements/Applications: Occupancy prediction with a transition matrix forecasts the future from past information. It differs from common marketing techniques, however, in that it is estimated probabilistically, and the probability model enables more accurate prediction.
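The core computation can be sketched as follows; the page names, session data, and the reading of occupancy as the stationary distribution of the chain are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

# Hypothetical sessions: sequences of pages visited by users.
pages = ["home", "search", "product"]
sessions = [
    ["home", "search", "product", "home"],
    ["home", "product", "product"],
    ["search", "product", "home"],
]

idx = {p: i for i, p in enumerate(pages)}
counts = np.zeros((3, 3))
for s in sessions:
    for a, b in zip(s, s[1:]):          # count page-to-page transitions
        counts[idx[a], idx[b]] += 1

# Row-normalise counts into the transition probability matrix P.
P = counts / counts.sum(axis=1, keepdims=True)

# Occupancy read as the stationary distribution pi with pi P = pi,
# i.e. the left eigenvector of P for eigenvalue 1.
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmax(np.real(vals))])
pi = pi / pi.sum()
```

The relative shares in `pi` are what distinguish this view from raw frequency counts: a page with modest absolute traffic can still carry high long-run occupancy.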


2013 ◽  
Vol 765-767 ◽  
pp. 1092-1097
Author(s):  
Yi Ting Zhang ◽  
Bin Wang ◽  
Zhi Hui Zhang

In order to manage the log information of Windows servers, Linux servers, network devices, and security devices in a unified way, so that log data can be queried, analyzed, and audited conveniently, a scheme is proposed in which the log data of a variety of power-system information devices are converted into a unified relational model and integrated into a database. The data-parsing module uses a Windows Workflow procedure to select, clean, and merge the massive log data. The database is created and operated on the Microsoft SQL Server 2005 development platform. All of the log files are converted into a unified format and saved in centralized storage. Experiments and test results show that the module processes and integrates data efficiently and greatly increases the proportion of valid data, providing support for efficient log auditing and fault diagnosis in the future.
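A minimal sketch of the unification step, assuming two hypothetical device formats and an invented four-column schema (timestamp, source, device_type, message); the paper's actual parsers and relational model are not given:

```python
import csv
import io

def parse_syslog(line):
    """Parse a simplified syslog-style server line into the unified schema."""
    date, time, host, rest = line.split(" ", 3)
    return {"timestamp": f"{date} {time}", "source": host,
            "device_type": "linux", "message": rest}

def parse_firewall_csv(line):
    """Parse a hypothetical CSV-exported firewall event into the same schema."""
    ts, host, action, detail = next(csv.reader(io.StringIO(line)))
    return {"timestamp": ts, "source": host,
            "device_type": "firewall", "message": f"{action}: {detail}"}

records = [
    parse_syslog("2013-05-01 12:00:01 srv01 sshd: failed login for root"),
    parse_firewall_csv("2013-05-01 12:00:05,fw02,DROP,tcp 10.0.0.9:443"),
]
# Every record now shares one schema and could be bulk-inserted
# into a single relational table for querying and auditing.
```

One parser per device format, all converging on a single record shape, is the design choice that makes centralized storage and cross-device auditing straightforward.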


2012 ◽  
Vol 3 (4) ◽  
pp. 92-94
Author(s):  
SUJATHA PADMAKUMAR ◽  
Dr. PUNITHAVALLI ◽  
Dr. RANJITH

2019 ◽  
Vol 161 ◽  
pp. 493-501
Author(s):  
Suleiman Alsaif ◽  
Alice S Li ◽  
Ben Soh ◽  
Sara Alraddady

2017 ◽  
Vol 1 (6) ◽  
pp. 477-482
Author(s):  
K. Srinivasa Rao ◽  
Dr. A. Ramesh Babu ◽  
Dr. M. Krishna Murthy ◽  
Keyword(s):  
Log Data ◽  

2013 ◽  
Vol 80 (17) ◽  
pp. 41-43 ◽  
Author(s):  
Jagriti Chand ◽  
Abhishek Singh Chauhan ◽  
Ashish Kumar Shrivastava
Keyword(s):  
Log Data ◽  
Web Log ◽  
