Anomaly detection in the Zwicky Transient Facility DR3

Monthly Notices of the Royal Astronomical Society ◽

10.1093/mnras/stab316 ◽

2021 ◽

Author(s):

K L Malanchev ◽

M V Pruzhinskaya ◽

V S Korolev ◽

P D Aleo ◽

M V Kornilov ◽

...

Keyword(s):

Machine Learning ◽

Anomaly Detection ◽

Domain Knowledge ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Light Curves ◽

Image Subtraction ◽

Scientific Application ◽

Public Data ◽

Expert Analysis

Abstract We present results from applying the SNAD anomaly detection pipeline to the third public data release of the Zwicky Transient Facility (ZTF DR3). The pipeline is composed of three stages: feature extraction, outlier search with machine learning algorithms, and anomaly identification with follow-up by human experts. Our analysis concentrates on three ZTF fields, comprising more than 2.25 million objects. A set of four automatic learning algorithms was used to identify 277 outliers, which were subsequently scrutinised by an expert. Of these, 188 (68%) were found to be bogus light curves, including effects from the image subtraction pipeline as well as overlaps between a star and a known asteroid; 66 (24%) were previously reported sources; and 23 (8%) correspond to non-catalogued objects. The latter two groups are of potential scientific interest (e.g. 1 spectroscopically confirmed RS Canum Venaticorum star, 4 supernova candidates, 1 red dwarf flare). Moreover, using results from the expert analysis, we were able to identify a simple two-dimensional relation that can help filter potentially bogus light curves in future studies. We provide a complete list of objects with potential scientific application so they can be further scrutinised by the community. These results confirm the importance of combining automatic machine learning algorithms with domain knowledge in the construction of recommendation systems for astronomy. Our code is publicly available.
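The three-stage structure described in this abstract (feature extraction, outlier search, expert follow-up) can be illustrated with a minimal sketch. The abstract does not name the four algorithms or the features used, so everything here is an assumption for illustration: the two toy features, the robust z-score (median and MAD) used as a stand-in outlier score, and the function names `extract_features`, `robust_scores` and `top_outliers`. It is not the SNAD pipeline.

```python
import statistics


def extract_features(magnitudes):
    """Toy features for one light curve: half-amplitude and standard deviation."""
    amplitude = (max(magnitudes) - min(magnitudes)) / 2.0
    spread = statistics.pstdev(magnitudes)
    return (amplitude, spread)


def robust_scores(features):
    """Score each object by its largest per-feature |x - median| / MAD."""
    n_features = len(features[0])
    scores = [0.0] * len(features)
    for j in range(n_features):
        column = [f[j] for f in features]
        med = statistics.median(column)
        # Guard against a zero MAD when most objects share the same value.
        mad = statistics.median([abs(x - med) for x in column]) or 1e-12
        for i, x in enumerate(column):
            scores[i] = max(scores[i], abs(x - med) / mad)
    return scores


def top_outliers(light_curves, n_candidates):
    """Return the names of the highest-scoring objects for expert review."""
    names = list(light_curves)
    features = [extract_features(light_curves[name]) for name in names]
    scores = robust_scores(features)
    ranked = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:n_candidates]]


# Hypothetical magnitudes: three quiet objects and one with a sharp brightening.
curves = {
    "obj_a": [15.0, 15.1, 14.9, 15.0],
    "obj_b": [16.0, 16.1, 15.9, 16.0],
    "obj_c": [14.0, 14.1, 13.9, 14.05],
    "obj_flare": [15.0, 15.0, 12.0, 15.0],
}
print(top_outliers(curves, 1))
```

In a setting like the one the abstract describes, the ranked list would then go to a human expert, who separates bogus light curves from objects of scientific interest.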


Anomaly Detection in Market Data Structures Via Machine Learning Algorithms

SSRN Electronic Journal ◽

10.2139/ssrn.3516028 ◽

2020 ◽

Author(s):

Dirk Röder ◽

Henning Mueller

Keyword(s):

Machine Learning ◽

Anomaly Detection ◽

Data Structures ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Market Data


Anomaly Detection Technique for Intrusion Detection in SDN Environment using Continuous Data Stream Machine Learning Algorithms

2021 IEEE International Systems Conference (SysCon) ◽

10.1109/syscon48628.2021.9447092 ◽

2021 ◽

Author(s):

Admilson de Ribamar Lima Ribeiro ◽

Reneilson Yves Carvalho Santos ◽

Anderson Clayton Alves Nascimento

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Anomaly Detection ◽

Data Stream ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Detection Technique ◽

Continuous Data


Detecting TCP Flood DDoS Attack by Anomaly Detection based on Machine Learning Algorithms

10.1109/ubmk52708.2021.9558989 ◽

2021 ◽

Author(s):

Berkay Ozcam ◽

H. Hakan Kilinc ◽

Abdul Halim Zaim

Keyword(s):

Machine Learning ◽

Anomaly Detection ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Ddos Attack


A fault sensitivity analysis for anomaly detection in water distribution systems using Machine Learning algorithms

2018 IEEE 14th International Conference on Intelligent Computer Communication and Processing (ICCP) ◽

10.1109/iccp.2018.8516643 ◽

2018 ◽

Author(s):

Alexandru Predescu ◽

Mariana Mocanu ◽

Ciprian Lupu

Keyword(s):

Machine Learning ◽

Sensitivity Analysis ◽

Anomaly Detection ◽

Distribution Systems ◽

Water Distribution ◽

Learning Algorithms ◽

Water Distribution Systems ◽

Machine Learning Algorithms ◽

Fault Sensitivity


Predicting Health Material Cognitive Accessibility Using Multidimensional Semantic Features and Readability Tools as Predicators (Preprint)

10.2196/preprints.29175 ◽

2021 ◽

Author(s):

Meng Ji ◽

Yanmeng Liu ◽

Tianyong Hao

Keyword(s):

Machine Learning ◽

Health Education ◽

Health Information ◽

Domain Knowledge ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Semantic Features ◽

Integrated Models ◽

Advanced Education ◽

Cognitive Accessibility

BACKGROUND Much current research on the understandability of health information uses medical readability formulas (MRF) to assess the cognitive difficulty of health education resources. This rests on an implicit assumption that medical domain knowledge, represented by uncommon words or jargon, forms the sole barrier to health information access among the public. Our study challenged this by showing that, for readers from non-English-speaking backgrounds with higher educational attainment, semantic features of English health texts rather than medical jargon can explain the lack of cognitive access to health materials among readers who understand health terms well yet have limited exposure to English health education materials.

OBJECTIVE Our study explored combining MRF and multidimensional semantic features (MSF) to develop machine learning algorithms that predict the actual level of cognitive accessibility of English health materials on health risks and diseases for specific populations. We compared algorithms for evaluating the cognitive accessibility of specialised health information for non-native English speakers with advanced education levels yet very limited exposure to English health education environments.

METHODS We used 108 semantic features to measure the content complexity and accessibility of original English resources. Using 1000 English health texts collected from international health organization websites and rated by international tertiary students, we compared machine learning algorithms (decision tree, SVM, discriminant analysis, ensemble tree and logistic regression) after automatic hyperparameter optimization (a grid search for the combination of hyperparameters with minimal classification error). We applied 10-fold cross-validation on the whole dataset for model training and testing, and calculated the AUC, sensitivity, specificity, and accuracy as measures of model performance.

RESULTS Using two sets of predictor features, the widely tested MRF and the MSF proposed in our study, we developed and compared three sets of machine learning algorithms: the first used MRF only as predictors, the second used MSF only, and the last used both MRF and MSF as integrated models. The results showed that the integrated models outperformed the other two sets in terms of AUC, sensitivity, accuracy, and specificity.

CONCLUSIONS Our study showed that cognitive accessibility of English health texts is not limited to word length and sentence length as conventionally measured by MRF. We compared machine learning algorithms combining MRF and MSF to explore the cognitive accessibility of health information from syntactic and semantic perspectives. The results showed the strength of integrated models in terms of statistically increased AUC, sensitivity, and accuracy in predicting health resource accessibility for the target readership, indicating that both MRF and MSF contribute to the comprehension of health information, and that for readers with advanced education, semantic features outweigh syntax and domain knowledge.
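The evaluation protocol named in this abstract (10-fold cross-validation scored by AUC, sensitivity, specificity and accuracy) can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the authors' code: the function names are hypothetical, the folds are contiguous and unshuffled, and the AUC is computed by pairwise comparison of positive and negative scores.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists; fold sizes differ by at most one.

    Folds are contiguous and unshuffled; shuffle the data first if order matters.
    """
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity and accuracy for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
    }


def auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and classifier scores for six texts.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
predictions = [1 if s >= 0.5 else 0 for s in scores]
print(confusion_metrics(labels, predictions), auc(labels, scores))
```

With 1000 texts and `k=10`, each fold would hold out 100 texts for testing and train on the remaining 900; a grid search would repeat this loop once per hyperparameter combination and keep the combination with the lowest error.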


Intelligent optimization and machine learning algorithms for structural anomaly detection using seismic signals

Mechanical Systems and Signal Processing ◽

10.1016/j.ymssp.2019.106250 ◽

2019 ◽

Vol 133 ◽

pp. 106250

Author(s):

Maximilian Trapp ◽

Can Bogoclu ◽

Tamara Nestorović ◽

Dirk Roos

Keyword(s):

Machine Learning ◽

Anomaly Detection ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Seismic Signals ◽

Structural Anomaly ◽

Intelligent Optimization


Evaluation of Machine Learning Algorithms for Anomaly Detection in Industrial Networks

2019 IEEE International Symposium on Measurements & Networking (M&N) ◽

10.1109/iwmn.2019.8805036 ◽

2019 ◽

Cited By ~ 2

Author(s):

Giuseppe Bernieri ◽

Mauro Conti ◽

Federico Turrin

Keyword(s):

Machine Learning ◽

Anomaly Detection ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Industrial Networks


Adaptive Anomaly Detection Framework Model Objects in Cyberspace

Applied Bionics and Biomechanics ◽

10.1155/2020/6660489 ◽

2020 ◽

Vol 2020 ◽

pp. 1-14

Author(s):

Hasan Alkahtani ◽

Theyazn H. H. Aldhyani ◽

Mohammed Al-Yaari

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Network Security ◽

Anomaly Detection ◽

Denial Of Service ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Support Vector ◽

Ddos Attacks ◽

Framework Model

Telecommunication has registered strong and rapid growth in the past decade. As a result, monitoring computers and networks has become too complicated for network administrators, and network security is one of the most serious challenges facing the network security community. Because e-banking, e-commerce, and business data are shared over computer networks, these data may face threats from intrusion. The purpose of this research is to propose a methodology that leads to a high level of sustainable protection against cyberattacks. In particular, an adaptive anomaly detection framework model was developed using deep and machine learning algorithms to manage automatically configured application-level firewalls. Standard network datasets were used to evaluate the proposed model, which is designed to improve the cybersecurity system. A deep learning model based on a Long Short-Term Memory Recurrent Neural Network (LSTM-RNN) and two machine learning algorithms, Support Vector Machine (SVM) and K-Nearest Neighbor (K-NN), were implemented to classify Denial-of-Service (DoS) and Distributed Denial-of-Service (DDoS) attacks. The information gain method was applied to select the relevant features from the network dataset. These features were important for improving the classification algorithms. The system was used to classify DoS and DDoS attacks in four standard datasets, namely KDD Cup’99, NSL-KDD, ISCX, and ICI-ID2017. The empirical results indicate that the deep learning approach based on the LSTM-RNN algorithm obtained the highest accuracy. The proposed system based on the LSTM-RNN algorithm produced the highest testing accuracy rates of 99.51% and 99.91% with respect to the KDD Cup’99, NSL-KDD, ISCX, and ICI-ID2017 datasets, respectively. A comparative analysis of results between the machine learning algorithms, namely SVM and K-NN, and the deep learning algorithm based on the LSTM-RNN model is presented. Finally, it is concluded that the LSTM-RNN model is efficient and effective in improving the cybersecurity system for anomaly-based detection.
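The information gain method mentioned in this abstract ranks a feature by how much knowing its value reduces the entropy of the class label. The sketch below shows the standard calculation for discrete features. The feature names (`flag`, `protocol`), their values and the helper `rank_features` are hypothetical examples, not taken from the paper or its datasets.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy conditioned on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(table, labels):
    """Return feature names sorted from most to least informative."""
    gains = {name: information_gain(values, labels) for name, values in table.items()}
    return sorted(gains, key=gains.get, reverse=True)


# Hypothetical connection records: 'flag' separates the classes, 'protocol' does not.
labels = ["dos", "dos", "normal", "normal"]
table = {
    "protocol": ["tcp", "udp", "tcp", "udp"],
    "flag": ["S0", "S0", "SF", "SF"],
}
print(rank_features(table, labels))  # → ['flag', 'protocol']
```

A feature selection step of this kind would keep the top-ranked features and pass only those to the classifier. Continuous features would need to be discretised first, which this sketch does not do.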


Machine learning-based anomaly detection via integration of manufacturing, inspection and after-sales service data

Industrial Management & Data Systems ◽

10.1108/imds-06-2016-0195 ◽

2017 ◽

Vol 117 (5) ◽

pp. 927-945 ◽

Cited By ~ 12

Author(s):

Taehoon Ko ◽

Je Hyuk Lee ◽

Hyunchang Cho ◽

Sungzoon Cho ◽

Wounjoo Lee ◽

...

Keyword(s):

Machine Learning ◽

Quality Management ◽

Data Integration ◽

Anomaly Detection ◽

Learning Algorithms ◽

Perceived Quality ◽

Machine Learning Algorithms ◽

Series Data ◽

Content Type ◽

Service Data

Purpose Quality management of products is an important part of the manufacturing process. One way to manage and assure product quality is to use machine learning algorithms based on the relationships among various process steps. The purpose of this paper is to integrate manufacturing, inspection and after-sales service data to make full use of machine learning algorithms for estimating product quality in a supervised fashion. The proposed frameworks and methods are applied to actual data associated with heavy machinery engines.

Design/methodology/approach By following Lenzerini’s formula, manufacturing, inspection and after-sales service data from various sources are integrated. The after-sales service data are used to label each engine as normal or abnormal. In this study, one-class classification algorithms are used because of the class imbalance problem. To address the multi-dimensionality of time series data, the symbolic aggregate approximation algorithm is used for data segmentation. Then, a binary genetic algorithm-based wrapper approach is applied to the segmented data to find the optimal feature subset.

Findings By employing machine learning-based anomaly detection models, an anomaly score for each engine is calculated. Experimental results show that the proposed method can detect defective engines with a high probability before they are shipped.

Originality/value Through data integration, the actual customer-perceived quality from after-sales service is linked to data from the manufacturing and inspection processes. In terms of business application, data integration and machine learning-based anomaly detection can help manufacturers establish quality management policies that reflect the actual customer-perceived quality by predicting defective engines.
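The symbolic aggregate approximation (SAX) step named in this abstract can be sketched as follows. This is the textbook form of SAX (z-normalise, average over equal segments, map each average to a letter using breakpoints that split the standard normal distribution into equally likely bins). The abstract does not give the segment count or alphabet size, so `n_segments` and `alphabet` here are illustrative assumptions, and the function name `sax` is hypothetical.

```python
import statistics


def sax(series, n_segments, alphabet="abcd"):
    """Convert a numeric series to a SAX word of length n_segments.

    Assumes len(series) is a multiple of n_segments; trailing samples
    beyond the last full segment are ignored.
    """
    mean = statistics.fmean(series)
    std = statistics.pstdev(series) or 1.0  # constant series: avoid dividing by zero
    z = [(x - mean) / std for x in series]

    # Piecewise aggregate approximation: one mean per equal-length segment.
    seg_len = len(z) // n_segments
    paa = [statistics.fmean(z[i * seg_len:(i + 1) * seg_len]) for i in range(n_segments)]

    # Breakpoints cut the standard normal into len(alphabet) equiprobable bins.
    normal = statistics.NormalDist()
    breakpoints = [normal.inv_cdf(k / len(alphabet)) for k in range(1, len(alphabet))]

    # Each segment mean maps to the letter of the bin it falls in.
    return "".join(alphabet[sum(v > b for b in breakpoints)] for v in paa)


# Hypothetical sensor trace: a low plateau followed by a high plateau.
print(sax([0, 0, 0, 0, 10, 10, 10, 10], n_segments=2))  # → ad
```

The resulting words give a compact, discrete description of each time series, which a later feature selection step (a genetic algorithm-based wrapper in the paper) could then search over.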


Application and evaluation of selected machine learning algorithms in anomaly detection module for SOC

Developments of Artificial Intelligence Technologies in Computation and Robotics ◽

10.1142/9789811223334_0117 ◽

2020 ◽

Author(s):

A. Warzyński ◽

P. Bienias ◽

G. Kołaczek

Keyword(s):

Machine Learning ◽

Anomaly Detection ◽

Learning Algorithms ◽

Machine Learning Algorithms
