Evaluating GAN-Based Image Augmentation for Threat Detection in Large-Scale Xray Security Images

Joanna Kazzandra Dumagpi; Yong-Jin Jeong

doi:10.3390/app11010036

Applied Sciences ◽

10.3390/app11010036 ◽

2020 ◽

Vol 11 (1) ◽

pp. 36

Author(s):

Joanna Kazzandra Dumagpi ◽

Yong-Jin Jeong

Keyword(s):

Computer Vision ◽

Large Scale ◽

False Positive Rate ◽

Image Synthesis ◽

True Positive Rate ◽

Generative Adversarial Networks ◽

X Ray ◽

Imbalance Problem ◽

High True Positive Rate ◽

Positive Rate

The inherent imbalance in the data distribution of X-ray security images is one of the most challenging aspects of computer vision algorithms applied in this domain. Most prior studies in this field have ignored this aspect, limiting their application in practical settings. This paper investigates the effect of employing Generative Adversarial Network (GAN)-based image augmentation, or image synthesis, in improving the performance of computer vision algorithms on an imbalanced X-ray dataset. We used Deep Convolutional GAN (DCGAN) to generate new X-ray images of threat objects and Cycle-GAN to translate camera images of threat objects to X-ray images. We synthesized new X-ray security images by combining threat objects with background X-ray images, which are used to augment the dataset. Then, we trained various Faster R-CNN (Region-Based Convolutional Neural Network) models using different augmentation approaches and evaluated their performance on a large-scale practical X-ray image dataset. Experimental results show that image synthesis is an effective approach to combating the imbalance problem, significantly reducing the false-positive rate (FPR) by up to 15.3%. The FPR is further improved by up to 19.9% by combining image synthesis and conventional image augmentation. Meanwhile, a relatively high true positive rate (TPR) of about 94% was maintained regardless of the augmentation method used.


PF : Website Fingerprinting Attack Using Probabilistic Topic Model

Security and Communication Networks ◽

10.1155/2021/3265300 ◽

2021 ◽

Vol 2021 ◽

pp. 1-17

Author(s):

Hongcheng Zou ◽

Ziling Wei ◽

Jinshu Su ◽

Baokang Zhao ◽

Yusheng Xia ◽

...

Keyword(s):

Topic Model ◽

False Positive Rate ◽

True Positive Rate ◽

The Other ◽

Open World ◽

Closed World ◽

Probabilistic Topic Model ◽

High True Positive Rate ◽

Positive Rate ◽

Fingerprinting Attack

A website fingerprinting (WFP) attack identifies the websites a user is browsing even under the protection of privacy-enhancing technologies (PETs). Previous studies demonstrate that most machine-learning attacks need multiple types of features as input, which requires a large amount of feature engineering work. We show an alternative: Probabilistic Fingerprinting (PF), a new website fingerprinting attack that leverages only one type of feature. These features are produced by a mathematical model, PWFP, that combines a probabilistic topic model with WFP for the first time, based on the finding that a plain text and the sequence file generated from a traffic instance are essentially the same. Experimental results show that the proposed new features are more discriminative than the existing features. In a closed-world setting, PF attains higher accuracy (up to 99.79%) than prior attacks on various datasets gathered in the scenarios of Shadowsocks, SSH, and TLS, respectively. Moreover, even when the number of training instances drops to as few as 4, PF still reaches an accuracy above 90%. In the more realistic open-world setting, PF attains a high true positive rate (TPR) and Bayes detection rate (BDR), and a low false positive rate (FPR) in all evaluations, outperforming the other attacks. These results highlight that it is meaningful and possible to explore new features to improve the accuracy of WFP attacks.
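The abstract above reports a Bayes detection rate (BDR) alongside TPR and FPR but does not define it. A minimal sketch follows, assuming the usual definition from the intrusion detection literature: the probability that a trace flagged as monitored really is monitored, given the base rate of monitored traffic. The function name and all numbers are hypothetical and are not taken from the paper.

```python
def bayes_detection_rate(tpr: float, fpr: float, prior_monitored: float) -> float:
    """Return P(trace is monitored | classifier flags it as monitored).

    Assumed definition (not stated in the abstract):
        BDR = TPR * P(M) / (TPR * P(M) + FPR * (1 - P(M)))
    where P(M) is the base rate of monitored traces in the open world.
    """
    flagged_and_monitored = tpr * prior_monitored
    flagged_and_unmonitored = fpr * (1.0 - prior_monitored)
    total_flagged = flagged_and_monitored + flagged_and_unmonitored
    if total_flagged == 0.0:
        return 0.0  # nothing is ever flagged, so the rate is undefined; report 0
    return flagged_and_monitored / total_flagged


# Hypothetical operating point: the same classifier at two different base rates.
rare = bayes_detection_rate(tpr=0.95, fpr=0.01, prior_monitored=0.01)
common = bayes_detection_rate(tpr=0.95, fpr=0.01, prior_monitored=0.5)
print(f"BDR when 1% of traffic is monitored:  {rare:.3f}")
print(f"BDR when 50% of traffic is monitored: {common:.3f}")
```

Under this definition, a low FPR matters far more than a high TPR when monitored sites are rare, which is why open-world evaluations tend to report BDR in addition to TPR and FPR.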


Autoencoder-Based Anomaly Detection in Industrial X-ray Images

10.1115/qnde2021-74428 ◽

2021 ◽

Author(s):

Erik Lindgren ◽

Christopher Zach

Keyword(s):

False Positive Rate ◽

Image Interpretation ◽

True Positive Rate ◽

Image Model ◽

Confidence Estimation ◽

X Ray ◽

Automatic Interpretation ◽

Positive Rate ◽

Public Dataset ◽

Fusion Welds

Within many quality-critical industries, e.g., the aerospace industry, industrial X-ray inspection is an essential as well as resource-intensive part of quality control. Within such industries, X-ray image interpretation is typically still done by humans; increasing the automation of interpretation would therefore be of great value. We claim that safe automatic interpretation of industrial X-ray images requires a robust confidence estimation with respect to out-of-distribution (OOD) data. In this work, we explored whether such a confidence estimation can be achieved by comparing input images with a model of the accepted images. For the image model, we derived an autoencoder, which we trained unsupervised on a public dataset of X-ray images of metal fusion welds. We achieved a true positive rate of 80–90% at a 4% false positive rate, and correctly detected an OOD data example as an anomaly.
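One common way to turn an autoencoder into an anomaly detector with a fixed false positive rate is to score each image by its reconstruction error and choose the threshold from scores of accepted (normal) images. The sketch below illustrates that idea only; it assumes mean squared error as the score and a simple empirical quantile for the threshold, neither of which is confirmed by the abstract. All function names are hypothetical.

```python
import math
from typing import Sequence


def reconstruction_error(image: Sequence[float], reconstruction: Sequence[float]) -> float:
    """Mean squared error between a flattened image and its reconstruction."""
    if len(image) != len(reconstruction) or not image:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(image, reconstruction)) / len(image)


def threshold_for_fpr(normal_scores: Sequence[float], target_fpr: float) -> float:
    """Pick a threshold so that at most target_fpr of the normal scores exceed it."""
    if not normal_scores:
        raise ValueError("need at least one normal score")
    ordered = sorted(normal_scores)
    n = len(ordered)
    # Number of normal samples we tolerate above the threshold (false positives).
    allowed = math.floor(target_fpr * n + 1e-9)
    index = max(n - allowed - 1, 0)
    return ordered[index]


def is_anomaly(score: float, threshold: float) -> bool:
    """Flag an image whose reconstruction error is strictly above the threshold."""
    return score > threshold


# Hypothetical validation scores from accepted images: 0.0, 1.0, ..., 99.0.
normal_scores = [float(i) for i in range(100)]
threshold = threshold_for_fpr(normal_scores, target_fpr=0.04)
false_positives = sum(is_anomaly(s, threshold) for s in normal_scores)
print(threshold, false_positives / len(normal_scores))
```

The true positive rate at that threshold would then be measured on a separate set of known defective or out-of-distribution images.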


Cyber Situation Comprehension for IoT Systems based on APT Alerts and Logs Correlation

Sensors ◽

10.3390/s19184045 ◽

2019 ◽

Vol 19 (18) ◽

pp. 4045 ◽

Cited By ~ 1

Author(s):

Xiang Cheng ◽

Jiale Zhang ◽

Bing Chen

Keyword(s):

Data Transmission ◽

False Positive Rate ◽

Correlation Method ◽

True Positive Rate ◽

True Positive ◽

Advanced Persistent Threat ◽

Transmission Cost ◽

Large Numbers ◽

High True Positive Rate ◽

Positive Rate

With the emergence of Advanced Persistent Threat (APT) attacks, many Internet of Things (IoT) systems face large numbers of potential threats characterized by concealment, permeability, and pertinence. However, existing methods and technologies cannot provide comprehensive and prompt recognition of latent APT attack activities in IoT systems. To address this problem, we propose an APT Alerts and Logs Correlation Method, named APTALCM, and a framework for deploying APTALCM on an IoT system, in which an edge computing architecture is used to achieve cyber situation comprehension without excessive data transmission cost. Specifically, we first present a cyber situation ontology for modeling the concepts and properties needed to formalize APT attack activities in IoT systems. Then, we introduce a cyber situation instance similarity measurement method based on the SimRank mechanism for APT alert and log correlation. Building on instance similarity, we further propose an APT alert instance correlation method to reconstruct APT attack scenarios and an APT log instance correlation method to detect log instance communities. By combining these methods, APTALCM can accomplish cyber situation comprehension effectively by recognizing APT attack intentions in IoT systems. Extensive experimental results demonstrate that the two kernel modules of APTALCM, i.e., the Alert Instance Correlation Module (AICM) and the Log Instance Correlation Module (LICM), achieve both a high true-positive rate and a low false-positive rate.


MCS-RF: mobile crowdsensing–based air quality estimation with random forest

International Journal of Distributed Sensor Networks ◽

10.1177/1550147718804702 ◽

2018 ◽

Vol 14 (10) ◽

pp. 155014771880470 ◽

Cited By ~ 1

Author(s):

Cheng Feng ◽

Ye Tian ◽

Xiangyang Gong ◽

Xirong Que ◽

Wendong Wang

Keyword(s):

Random Forest ◽

Urban Areas ◽

Large Scale ◽

Characteristic Curve ◽

False Positive Rate ◽

Estimation Method ◽

True Positive Rate ◽

Mobile Crowdsensing ◽

Fine Grained ◽

Positive Rate

It is a great challenge to offer a fine-grained and accurate PM2.5 monitoring service in urban areas because the required facilities are very expensive and large. Since PM2.5 has a significant scattering effect on visible light, large-scale user-contributed image data collected through mobile crowdsensing bring a new opportunity for understanding urban PM2.5. In this article, we propose a fine-grained PM2.5 estimation method based on random forest, using data announced by meteorological departments and data collected from smartphone users without any PM2.5 measurement devices. We design and implement a platform to collect data in the real world, including images provided by users. By combining online learning and offline learning, the method based on random forest performs well in terms of time complexity and accuracy. We compare our method with two kinds of baselines: subsets of the whole data sets and six classical models (such as logistic regression and naive Bayes). Six kinds of evaluation indexes (precision, recall, true-positive rate, false-positive rate, F-measure, and receiver operating characteristic curve area) are used in the evaluation. The experimental results show that our method achieves high accuracy (precision: 0.875, recall: 0.872) on PM2.5 estimation, outperforming the other methods.


Computer-aided COVID-19 diagnosis and a comparison of deep learners using augmented CXRs

Journal of X-Ray Science and Technology ◽

10.3233/xst-211047 ◽

2021 ◽

pp. 1-21

Author(s):

Asma Naseer ◽

Maria Tamoor ◽

Arifah Azhar

Keyword(s):

Neural Network ◽

Short Term Memory ◽

False Positive Rate ◽

False Negative ◽

True Positive Rate ◽

X Rays ◽

Infected People ◽

Computer Aided ◽

High True Positive Rate ◽

Positive Rate

Background: Coronavirus Disease 2019 (COVID-19) is a contagious respiratory tract infection caused by a newly discovered coronavirus. Its death toll is high, and early diagnosis remains a major problem. Infected people show a variety of symptoms such as fatigue, fever, loss of taste, dry cough, etc. Some other symptoms can be identified visually in radiographs. Therefore, chest X-rays (CXRs) play a key role in the diagnosis of COVID-19. Methods: In this study, we use chest X-ray images to develop a computer-aided diagnosis (CAD) system for the disease. These images are used to train two deep networks: a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network, which is an artificial Recurrent Neural Network (RNN). The proposed study involves three phases. First, the CNN model is trained on raw CXR images. Next, it is trained on pre-processed CXR images, and finally enhanced CXR images are used for CNN training. Geometric transformations, color transformations, image enhancement, and noise injection techniques are used for augmentation. From augmentation, we get 3,220 augmented CXRs as training datasets. In the final phase, the CNN is used to extract features of the CXR imagery, which are fed to the LSTM model. The performance of the four trained models is evaluated using several metrics, including accuracy, specificity, sensitivity, false-positive rate, and the receiver operating characteristic (ROC) curve. Results: We compare our results with other benchmark CNN models. Our proposed CNN-LSTM model achieves higher accuracy (99.02%) than the other state-of-the-art models. Our method of improving the input helped the CNN model produce a very high true positive rate (TPR of 1) and no false-negative results, whereas false negatives were a major problem when using raw CXR images. Conclusions: After performing different experiments, we conclude that image pre-processing and augmentation remarkably improve the results of CNN-based models. This will support better early detection of the disease, which will eventually reduce the mortality rate of COVID-19.


Identification of and Correction for Publication Bias: Comment

10.31222/osf.io/dh87m ◽

2019 ◽

Author(s):

Amanda Kvarven ◽

Eirik Strømland ◽

Magnus Johannesson

Keyword(s):

Publication Bias ◽

False Positive ◽

Large Scale ◽

Meta Analysis ◽

False Positive Rate ◽

Effect Sizes ◽

Replication Studies ◽

Moderate Reduction ◽

Positive Rate ◽

Meta Analyses

Andrews & Kasy (2019) propose an approach for adjusting effect sizes in meta-analysis for publication bias. We use the Andrews-Kasy estimator to adjust the result of 15 meta-analyses and compare the adjusted results to 15 large-scale multiple labs replication studies estimating the same effects. The pre-registered replications provide precisely estimated effect sizes, which do not suffer from publication bias. The Andrews-Kasy approach leads to a moderate reduction of the inflated effect sizes in the meta-analyses. However, the approach still overestimates effect sizes by a factor of about two or more and has an estimated false positive rate of between 57% and 100%.


PRATD: A Phased Remote Access Trojan Detection Method with Double-Sided Features

Electronics ◽

10.3390/electronics9111894 ◽

2020 ◽

Vol 9 (11) ◽

pp. 1894

Author(s):

Chun Guo ◽

Zihua Song ◽

Yuan Ping ◽

Guowei Shen ◽

Yuhei Cui ◽

...

Keyword(s):

False Positive ◽

Detection Method ◽

False Positive Rate ◽

True Positive Rate ◽

Remote Access ◽

Detection Methods ◽

Security Threats ◽

True Positive ◽

Trojan Detection ◽

Positive Rate

The Remote Access Trojan (RAT) is one of the most serious security threats that organizations face today. At present, the two major RAT detection approaches are host-based and network-based detection. To combine their complementary strengths, this article proposes a phased RAT detection method that uses double-sided features (PRATD). In PRATD, both host-side and network-side features are combined to build detection models, which helps distinguish RATs from benign programs because RATs not only generate traffic on the network but also leave traces on the host at run time. In addition, PRATD trains two different detection models for the two runtime states of RATs to improve the True Positive Rate (TPR). Experiments on network and host records collected from five kinds of benign programs and 20 well-known RATs show that PRATD can effectively detect RATs: it achieves a TPR as high as 93.609% with a False Positive Rate (FPR) as low as 0.407% for known RATs, and a TPR of 81.928% with an FPR of 0.185% for unknown RATs, which suggests it is a competitive candidate for RAT detection.


Ascertaining an efficient eligibility cut-off for extended Medicare items for eating disorders

Australasian Psychiatry ◽

10.1177/10398562211028632 ◽

2021 ◽

pp. 103985622110286

Author(s):

Tracey Wade ◽

Jamie-Lee Pennesi ◽

Yuan Zhou

Keyword(s):

Eating Disorders ◽

Eating Disorder ◽

False Positive Rate ◽

Area Under The Curve ◽

Rate Sensitivity ◽

True Positive Rate ◽

Eating Disorder Examination Questionnaire ◽

Eating Disorder Examination ◽

Positive Rate ◽

The Relationship

Objective: Currently, eligibility for expanded Medicare items for eating disorders (excluding anorexia nervosa) requires a score ⩾ 3 on the 22-item Eating Disorder Examination-Questionnaire (EDE-Q). We compared these EDE-Q “cases” with continuous scores on a validated 7-item version of the EDE-Q (EDE-Q7) to identify an EDE-Q7 cut-off commensurate with 3 on the EDE-Q. Methods: We utilised EDE-Q scores of female university students (N = 337) at risk of developing an eating disorder. We used a receiver operating characteristic (ROC) curve to assess the relationship between the true-positive rate (sensitivity) and the false-positive rate (1 − specificity) of cases ⩾ 3. Results: The area under the curve showed outstanding discrimination of 0.94 (95% CI: 0.92–0.97). We examined two specific cut-off points on the EDE-Q7, which included 100% and 87% of true cases, respectively. Conclusion: Given that the EDE-Q cut-off for Medicare is used in conjunction with other criteria, we suggest using the more permissive EDE-Q7 cut-off (⩾ 2.5) to replace the EDE-Q cut-off (⩾ 3) in eligibility assessments.
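Each point on an ROC curve corresponds to one candidate cut-off: the share of true cases at or above it (sensitivity) against the share of non-cases at or above it (1 − specificity). A minimal sketch of that calculation follows. The scores and case labels are invented for illustration and are not data from the study; `roc_point` is a hypothetical helper name.

```python
from typing import Sequence, Tuple


def roc_point(scores: Sequence[float], is_case: Sequence[bool], cutoff: float) -> Tuple[float, float]:
    """Return (sensitivity, false_positive_rate) when 'score >= cutoff' predicts a case."""
    if len(scores) != len(is_case):
        raise ValueError("scores and labels must have the same length")
    true_pos = sum(1 for s, c in zip(scores, is_case) if c and s >= cutoff)
    false_pos = sum(1 for s, c in zip(scores, is_case) if not c and s >= cutoff)
    n_cases = sum(1 for c in is_case if c)
    n_non_cases = len(is_case) - n_cases
    if n_cases == 0 or n_non_cases == 0:
        raise ValueError("need at least one case and one non-case")
    return true_pos / n_cases, false_pos / n_non_cases


# Hypothetical short-form scores; is_case marks respondents at or above the
# reference cut-off on the long-form questionnaire.
short_form_scores = [1.0, 2.0, 2.5, 3.0, 4.0]
is_case = [False, False, True, True, True]

for cutoff in (2.0, 2.5, 3.0):
    sensitivity, fpr = roc_point(short_form_scores, is_case, cutoff)
    print(f"cut-off {cutoff}: sensitivity={sensitivity:.2f}, FPR={fpr:.2f}")
```

A more permissive (lower) cut-off captures more true cases at the cost of more false positives, which is the trade-off the abstract weighs when recommending ⩾ 2.5.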


Improving Ecological Inference by Predicting Individual Ethnicity from Voter Registration Records

Political Analysis ◽

10.1093/pan/mpw001 ◽

2016 ◽

Vol 24 (2) ◽

pp. 263-272 ◽

Cited By ~ 29

Author(s):

Kosuke Imai ◽

Kabir Khanna

Keyword(s):

Mean Squared Error ◽

False Positive Rate ◽

True Positive Rate ◽

Voter Registration ◽

Racial Groups ◽

Ecological Inference ◽

Inference Problem ◽

Individual Level ◽

Positive Rate ◽

Election Results

In both political behavior research and voting rights litigation, turnout and vote choice for different racial groups are often inferred using aggregate election results and racial composition. Over the past several decades, many statistical methods have been proposed to address this ecological inference problem. We propose an alternative method to reduce aggregation bias by predicting individual-level ethnicity from voter registration records. Building on the existing methodological literature, we use Bayes's rule to combine the Census Bureau's Surname List with various information from geocoded voter registration records. We evaluate the performance of the proposed methodology using approximately nine million voter registration records from Florida, where self-reported ethnicity is available. We find that it is possible to reduce the false positive rate among Black and Latino voters to 6% and 3%, respectively, while maintaining the true positive rate above 80%. Moreover, we use our predictions to estimate turnout by race and find that our estimates yield substantially less bias and root mean squared error than standard ecological inference estimates. We provide open-source software to implement the proposed methodology.
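The Bayes's rule step described above can be illustrated with a small sketch. It assumes that, given a person's group, location is conditionally independent of surname, so that P(group | surname, location) is proportional to P(group | surname) × P(location | group). That independence assumption, the group labels, and every probability below are illustrative only and are not taken from the paper or from Census data.

```python
from typing import Dict


def posterior_group(
    p_group_given_surname: Dict[str, float],
    p_location_given_group: Dict[str, float],
) -> Dict[str, float]:
    """Combine surname-based and location-based evidence with Bayes's rule.

    Assumes conditional independence of location and surname given group:
        P(g | surname, location) is proportional to P(g | surname) * P(location | g)
    """
    unnormalised = {
        group: p_group_given_surname[group] * p_location_given_group[group]
        for group in p_group_given_surname
    }
    total = sum(unnormalised.values())
    if total == 0.0:
        raise ValueError("evidence assigns zero probability to every group")
    return {group: value / total for group, value in unnormalised.items()}


# Hypothetical inputs for one voter: surname alone is uninformative,
# but the voter's census block is three times as likely under group "B".
surname_prior = {"A": 0.5, "B": 0.5}
location_likelihood = {"A": 0.2, "B": 0.6}

posterior = posterior_group(surname_prior, location_likelihood)
print(posterior)
```

A voter would then be assigned to a group, or weighted across groups, according to these posterior probabilities before estimating turnout by race.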


Watch For Failing Objects: What Inappropriate Compliance Reveals About Shared Mental Models In Autonomous Cars

Proceedings of the Human Factors and Ergonomics Society Annual Meeting ◽

10.1177/1071181321651081 ◽

2021 ◽

Vol 65 (1) ◽

pp. 643-647

Author(s):

Yosef S. Razin ◽

Jack Gale ◽

Jiaojiao Fan ◽

Jaznae’ Smith ◽

Karen M. Feigh

Keyword(s):

Mental Models ◽

Mental Model ◽

False Positive Rate ◽

Ground Truth ◽

True Positive Rate ◽

Shared Mental Models ◽

Shared Mental Model ◽

Autonomous Cars ◽

Positive Rate ◽

Dispositional Factors

This paper evaluates Banks et al.’s Human-AI Shared Mental Model theory by examining how a self-driving vehicle’s hazard assessment facilitates shared mental models. Participants were asked to affirm the vehicle’s assessment of road objects as either hazards or mistakes in real time as behavioral and subjective measures were collected. The baseline performance of the AI was purposefully low (<50%) to examine how the human’s shared mental model might lead to inappropriate compliance. Results indicated that while the participant true positive rate was high, overall performance was reduced by the large false positive rate, indicating that participants were indeed being influenced by the AI’s faulty assessments, despite full transparency as to the ground truth. Both performance and compliance were directly affected by frustration, mental, and even physical demands. Dispositional factors such as faith in other people’s cooperativeness and in technology companies were also significant. Thus, our findings strongly supported the theory that shared mental models play a measurable role in performance and compliance, in a complex interplay with trust.
