One-Pass Inconsistency Detection Algorithms for Big Data

IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 22377-22394
Author(s):  
Meifan Zhang ◽  
Hongzhi Wang ◽  
Jianzhong Li ◽  
Hong Gao
2019 ◽  
Vol 17 (2) ◽  
pp. 272-280
Author(s):  
Adeel Hashmi ◽  
Tanvir Ahmad

Anomaly (outlier) detection is the process of finding abnormal data points in a dataset or data stream. Most anomaly detection algorithms require parameters to be set, and these parameters significantly affect the algorithm's performance. They are generally set by trial and error, so performance suffers when default or random values are used. In this paper, the authors propose a self-optimizing algorithm for anomaly detection based on the firefly meta-heuristic, named the Firefly Algorithm for Anomaly Detection (FAAD). The proposed solution is a non-clustering, unsupervised learning approach to anomaly detection. The algorithm is implemented on Apache Spark for scalability, so the solution can also handle big data. Experiments were conducted on various datasets, and the results show that the proposed solution is much more accurate than the standard anomaly detection algorithms.
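To illustrate the kind of meta-heuristic the abstract refers to, the following is a minimal sketch of a generic firefly search, not the FAAD algorithm itself, whose details are not given here. The function name `firefly_minimize`, its parameters (`alpha`, `beta0`, `gamma`), and the toy objective are all assumptions chosen for illustration; in a parameter-tuning setting the objective would be some error measure of the detector.

```python
import math
import random


def firefly_minimize(objective, bounds, n_fireflies=15, n_iter=50,
                     alpha=0.2, beta0=1.0, gamma=1.0, seed=0):
    """Generic firefly search that minimizes `objective` inside box `bounds`.

    Hypothetical helper for illustration only; this is not FAAD.
    """
    rng = random.Random(seed)
    # Random initial positions, one list of coordinates per firefly.
    swarm = [[rng.uniform(lo, hi) for lo, hi in bounds]
             for _ in range(n_fireflies)]
    cost = [objective(x) for x in swarm]
    best_i = min(range(n_fireflies), key=cost.__getitem__)
    best_x, best_cost = list(swarm[best_i]), cost[best_i]

    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] < cost[i]:  # j is "brighter", so i moves toward j
                    r2 = sum((a - b) ** 2
                             for a, b in zip(swarm[i], swarm[j]))
                    beta = beta0 * math.exp(-gamma * r2)  # attractiveness
                    swarm[i] = [
                        # attraction step plus a small random step, clipped
                        min(hi, max(lo, a + beta * (b - a)
                                    + alpha * (rng.random() - 0.5)))
                        for a, b, (lo, hi)
                        in zip(swarm[i], swarm[j], bounds)
                    ]
                    cost[i] = objective(swarm[i])
                    if cost[i] < best_cost:
                        best_x, best_cost = list(swarm[i]), cost[i]
        alpha *= 0.97  # gradually reduce the random step size
    return best_x, best_cost


def toy_objective(x):
    """Stand-in for a detector's error as a function of two parameters."""
    return (x[0] - 3.0) ** 2 + (x[1] + 1.0) ** 2


best_params, best_error = firefly_minimize(toy_objective,
                                           [(-5.0, 5.0), (-5.0, 5.0)])
```

The sketch tracks the best position seen so far, so the returned cost is never worse than the best initial candidate. How well it converges depends on the assumed settings of `alpha`, `beta0`, and `gamma`.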


Author(s):  
Dhanya Sudhakaran ◽  
Shini Renjith

Community detection is a common problem in graph and big data analytics. It consists of finding groups of densely connected nodes that have few connections to nodes outside the group. In particular, identifying communities in large-scale networks is an important task in many scientific domains. Community detection algorithms in the literature prove to be less efficient because they generate communities with noisy interactions. To address this limitation, there is a need for a system that identifies the best community among multi-dimensional networks based on relevant selection criteria and the dimensionality of entities, thereby eliminating noisy interactions in a real-time environment.
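The notion of "densely connected inside, sparsely connected outside" can be made concrete with a quality score. The sketch below uses Newman modularity, a standard measure that this abstract does not itself mention, so treat it as a swapped-in illustration. It assumes an undirected simple graph without self-loops; the function name `modularity` and the toy graph are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity of a partition of an undirected simple graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping every node to a community label.
    """
    m = len(edges)
    internal = defaultdict(int)    # edges with both ends in the community
    degree_sum = defaultdict(int)  # total degree of nodes in the community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    # Q = sum over communities of (internal fraction - expected fraction)
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Toy graph: two triangles joined by the single edge (2, 3).
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
by_triangle = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
alternating = {0: "a", 1: "b", 2: "a", 3: "b", 4: "a", 5: "b"}

q_good = modularity(toy_edges, by_triangle)
q_bad = modularity(toy_edges, alternating)
```

On this toy graph the partition that follows the two triangles scores higher than the alternating one, which matches the intuition in the definition above.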


2020 ◽  
Vol 5 (1) ◽  
pp. 1
Author(s):  
Omar Alghushairy ◽  
Raed Alsini ◽  
Terence Soule ◽  
Xiaogang Ma

Outlier detection is a statistical procedure that aims to find suspicious events or items that are different from the normal form of a dataset. It has drawn considerable interest in the field of data mining and machine learning. Outlier detection is important in many applications, including fraud detection in credit card transactions and network intrusion detection. There are two general types of outlier detection: global and local. Global outliers fall outside the normal range for an entire dataset, whereas local outliers may fall within the normal range for the entire dataset, but outside the normal range for the surrounding data points. This paper addresses local outlier detection. The best-known technique for local outlier detection is the Local Outlier Factor (LOF), a density-based technique. There are many LOF algorithms for a static data environment; however, these algorithms cannot be applied directly to data streams, which are an important type of big data. In general, local outlier detection algorithms for data streams are still deficient and better algorithms need to be developed that can effectively analyze the high velocity of data streams to detect local outliers. This paper presents a literature review of local outlier detection algorithms in static and stream environments, with an emphasis on LOF algorithms. It collects and categorizes existing local outlier detection algorithms and analyzes their characteristics. Furthermore, the paper discusses the advantages and limitations of those algorithms and proposes several promising directions for developing improved local outlier detection methods for data streams.
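The difference between global and local outliers can be seen in a small brute-force sketch of the LOF score for static data. This is a simplified illustration under stated assumptions, not code from the paper: it uses exactly `k` neighbours per point (ignoring distance ties), assumes no duplicate points, and runs in quadratic time, so it is unsuitable for streams. The function name `local_outlier_factor` and the sample points are hypothetical.

```python
import math


def local_outlier_factor(points, k):
    """Return a LOF score per point (brute force, static data only)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # k nearest neighbours of each point (excluding itself) and k-distance.
    neighbors, k_dist = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: dist[i][j])
        neighbors.append(order[:k])
        k_dist.append(dist[i][order[k - 1]])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: average neighbour density divided by the point's own density.
    return [sum(lrd[j] for j in neighbors[i]) / (k * lrd[i])
            for i in range(n)]


dense = [(0.0, 0.0), (0.0, 0.1), (0.1, 0.0), (0.1, 0.1)]
sparse = [(5.0, 5.0), (5.0, 6.0), (6.0, 5.0), (6.0, 6.0)]
candidate = (0.6, 0.6)  # inside the global range, but far from its dense neighbours

scores = local_outlier_factor(dense + sparse + [candidate], k=3)
```

Scores near 1 indicate a point whose density matches that of its neighbours, while scores well above 1 indicate a local outlier. In this sample the candidate point lies within the overall range of the data yet receives the highest score, because its nearest neighbours belong to a much denser cluster.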


Author(s):  
Alireza Vahdatpour ◽  
Cynthia A. Lucero-Obusan ◽  
Chris Lee ◽  
Gina Oda ◽  
Patricia Schirmer ◽  
...  

We evaluated the specificity of Praedico Biosurveillance, a next-generation biosurveillance application leveraging multiple detection algorithms, big data, and machine learning, for VA outpatient syndromic surveillance alerting during the period of June 2014 through May 2015, and compared it to the Electronic Surveillance System for the Early Notification of Community-Based Epidemics (ESSENCE). Praedico™ Biosurveillance generated significantly fewer alerts than ESSENCE across all major syndromes and demonstrated higher sensitivity to seasons (i.e., ILI activity in winter). Reducing alerting fatigue would enhance the specificity of computer-generated alerts, promoting more usage and gradual improvement in the algorithm's output.


Author(s):  
Sobin C. C. ◽  
Vaskar Raychoudhury ◽  
Snehanshu Saha

The amount of data generated by online social networks such as Facebook and Twitter has recently grown enormously. Extracting useful information, such as community structure, from such large networks is very important in many applications. A community is a collection of nodes with dense internal connections and sparse external connections. Community detection algorithms aim to group nodes into different communities by extracting similarities and social relations between nodes. Although many community detection algorithms exist in the literature, they are not scalable enough to handle the large volumes of data generated by many of today's big data applications. Researchers are therefore focusing on developing parallel community detection algorithms that can handle networks consisting of millions of edges and vertices. In this article, we present a comprehensive survey of parallel community detection algorithms, which is the first survey in this domain, although multiple papers in the literature cover sequential community detection algorithms.


Community detection is a current research problem in the Big Data era, related to the huge volume, variety, and velocity of data. Big data refers to data for which conventional processing, storage, and retrieval fail and which therefore require advanced tools. Community detection is an important tool in the analysis of complex networks. Community detection, or community mining, is a technique used to find the same type of relations within a particular group. It is also known as graph clustering. This paper represents big data in the form of graphs and detects communities using graph algorithms such as METIS, spectral partitioning, hierarchical clustering, Markov clustering, and a genetic-algorithm-based community detection algorithm. Community detection is widely used in disease detection, drug formation, and species clustering. It can also be used on social networking sites to help control crime by detecting communities of bad actors.


ASHA Leader ◽  
2013 ◽  
Vol 18 (2) ◽  
pp. 59-59
Keyword(s):  

Find Out About 'Big Data' to Track Outcomes

