Beyond generalization: Enhancing accurate interpretation of flexible models

Mapping Intimacies ◽

10.1101/808261 ◽

2019 ◽

Author(s):

Mikhail Genkin ◽

Tatiana A. Engel

Keyword(s):

Machine Learning ◽

Neural Dynamics ◽

Correct Interpretation ◽

Good Data ◽

Accurate Interpretation ◽

Data Prediction ◽

Alternative Approach ◽

Correct Hypothesis ◽

Model Features ◽

Flexible Models

Abstract: Machine learning optimizes flexible models to predict data. In scientific applications, there is a rising interest in interpreting these flexible models to derive hypotheses from data. However, it is unknown whether good data prediction guarantees accurate interpretation of flexible models. We test this connection using a flexible, yet intrinsically interpretable framework for modeling neural dynamics. We find that many models discovered during optimization predict data equally well, yet they fail to match the correct hypothesis. We develop an alternative approach that identifies models with correct interpretation by comparing model features across data samples to separate true features from noise. Our results reveal that good predictions cannot substitute for accurate interpretation of flexible models and offer a principled approach to identify models with correct interpretation.

Improving Power Grid Monitoring Data Quality: An Efficient Machine Learning Framework for Missing Data Prediction

2015 IEEE 17th International Conference on High Performance Computing and Communications, 2015 IEEE 7th International Symposium on Cyberspace Safety and Security, and 2015 IEEE 12th International Conference on Embedded Software and Systems ◽

10.1109/hpcc-css-icess.2015.16 ◽

2015 ◽

Cited By ~ 10

Author(s):

Weiwei Shi ◽

Yongxin Zhu ◽

Jinkui Zhang ◽

Xiang Tao ◽

Gehao Sheng ◽

...

Keyword(s):

Machine Learning ◽

Missing Data ◽

Data Quality ◽

Power Grid ◽

Monitoring Data ◽

Learning Framework ◽

Data Prediction ◽

Grid Monitoring ◽

Efficient Machine ◽

Missing Data Prediction

Chain-based machine learning for full PVT data prediction

Journal of Petroleum Science and Engineering ◽

10.1016/j.petrol.2021.109658 ◽

2021 ◽

pp. 109658

Author(s):

Kassem Ghorayeb ◽

Arwa Ahmed Mawlod ◽

Alaa Maarouf ◽

Qazi Sami ◽

Nour El Droubi ◽

...

Keyword(s):

Machine Learning ◽

Pvt Data ◽

Data Prediction

Audit Fraud Data Prediction Using Machine Learning Algorithms

Algorithms for Intelligent Systems - Proceedings of International Conference on Communication and Computational Technologies ◽

10.1007/978-981-15-5077-5_38 ◽

2020 ◽

pp. 413-419

Author(s):

Ankita Sharma ◽

Amit Sinhal ◽

Manish Tiwari ◽

Mayank Patel

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Data Prediction

Big Data Prediction in Location-Aware Wireless Caching: A Machine Learning Approach

2019 IEEE Global Communications Conference (GLOBECOM) ◽

10.1109/globecom38437.2019.9014068 ◽

2019 ◽

Author(s):

Yunzhe Qi ◽

Zhong Yang ◽

Zhijin Qin ◽

Yuanwei Liu ◽

Yue Chen

Keyword(s):

Machine Learning ◽

Big Data ◽

Learning Approach ◽

Location Aware ◽

Data Prediction ◽

Machine Learning Approach

Stock Market Data Prediction Using Machine Learning Techniques

Advances in Intelligent Systems and Computing - Information Technology and Systems ◽

10.1007/978-3-030-11890-7_52 ◽

2019 ◽

pp. 539-547

Author(s):

Edgar P. Torres P. ◽

Myriam Hernández-Álvarez ◽

Edgar A. Torres Hernández ◽

Sang Guun Yoo

Keyword(s):

Machine Learning ◽

Stock Market ◽

Machine Learning Techniques ◽

Market Data ◽

Data Prediction ◽

Learning Techniques

Detecting Emotions in English and Arabic Tweets

Information ◽

10.3390/info10030098 ◽

2019 ◽

Vol 10 (3) ◽

pp. 98 ◽

Cited By ~ 4

Author(s):

Tariq Ahmad ◽

Allan Ramsay ◽

Hanady Ahmed

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Deep Neural Networks ◽

State Of The Art ◽

Learning Algorithms ◽

General Purpose ◽

Machine Learning Algorithms ◽

Current State ◽

Optimal Thresholds ◽

Alternative Approach

Assigning sentiment labels to documents is, at first sight, a standard multi-label classification task. Many approaches have been used for this task, but the current state-of-the-art solutions use deep neural networks (DNNs). As such, it seems likely that standard machine learning algorithms, such as these, will provide an effective approach. We describe an alternative approach, involving the use of probabilities to construct a weighted lexicon of sentiment terms, then modifying the lexicon and calculating optimal thresholds for each class. We show that this approach outperforms the use of DNNs and other standard algorithms. We believe that DNNs are not a universal panacea and that paying attention to the nature of the data that you are trying to learn from can be more important than trying out ever more powerful general purpose machine learning algorithms.
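The lexicon idea in the abstract above can be illustrated with a minimal sketch. It weights each word by an estimate of P(label | word) and then picks a per-class decision threshold by maximizing F1 on the training documents. The weighting formula, the F1 criterion, the function names (`build_lexicon`, `score`, `best_threshold`), and the toy data are all assumptions made for illustration; the abstract does not specify these details, and this is not the authors' implementation.

```python
from collections import Counter, defaultdict

def build_lexicon(docs, labels_per_doc):
    """Weight each word by an estimate of P(label | word) from document counts."""
    word_count = Counter()
    word_label_count = defaultdict(Counter)
    for tokens, labels in zip(docs, labels_per_doc):
        for w in set(tokens):              # count each word once per document
            word_count[w] += 1
            for lab in labels:
                word_label_count[lab][w] += 1
    return {lab: {w: c / word_count[w] for w, c in counts.items()}
            for lab, counts in word_label_count.items()}

def score(tokens, weights):
    """Average lexicon weight of the tokens (0.0 for an empty document)."""
    if not tokens:
        return 0.0
    return sum(weights.get(w, 0.0) for w in tokens) / len(tokens)

def f1(pred, gold):
    """F1 score for two equal-length lists of booleans."""
    tp = sum(p and g for p, g in zip(pred, gold))
    fp = sum(p and not g for p, g in zip(pred, gold))
    fn = sum(g and not p for p, g in zip(pred, gold))
    return 0.0 if tp == 0 else 2 * tp / (2 * tp + fp + fn)

def best_threshold(scores, gold):
    """Among the observed scores, return the threshold with the highest F1."""
    return max(sorted(set(scores)),
               key=lambda t: f1([s >= t for s in scores], gold))

# Hypothetical toy data: tokenized documents and their (multi-)labels.
docs = [["happy", "great", "day"], ["sad", "bad", "day"],
        ["great", "joy"], ["bad", "angry"]]
labels = [{"joy"}, {"sadness"}, {"joy"}, {"sadness", "anger"}]

lexicon = build_lexicon(docs, labels)
joy_scores = [score(d, lexicon["joy"]) for d in docs]
joy_gold = ["joy" in labs for labs in labels]
joy_threshold = best_threshold(joy_scores, joy_gold)
```

A document is then assigned a label whenever its score for that label meets the label's threshold, so each class gets its own cut-off rather than a shared one.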

Machine Learning-Based Detection of Graphene Defects with Atomic Precision

Nano-Micro Letters ◽

10.1007/s40820-020-00519-w ◽

2020 ◽

Vol 12 (1) ◽

Author(s):

Bowen Zheng ◽

Grace X. Gu

Keyword(s):

Machine Learning ◽

Material Design ◽

Thermal Vibration ◽

Single Atom ◽

Complex Data ◽

Alternative Approach ◽

Atomic Precision ◽

Data Driven Approach ◽

New Material ◽

Domain Discretization

Abstract: Defects in graphene can profoundly impact its extraordinary properties, ultimately influencing the performance of graphene-based nanodevices. Methods to detect defects with atomic resolution in graphene can be technically demanding and involve complex sample preparations. An alternative approach is to observe the thermal vibration properties of the graphene sheet, which reflect defect information but in an implicit fashion. Machine learning, an emerging data-driven approach that offers solutions to learning hidden patterns from complex data, has been extensively applied in material design and discovery problems. In this paper, we propose a machine learning-based approach to detect graphene defects by discovering the hidden correlation between defect locations and thermal vibration features. Two prediction strategies are developed: an atom-based method which constructs data by atom indices, and a domain-based method which constructs data by domain discretization. Results show that while the atom-based method is capable of detecting a single-atom vacancy, the domain-based method can detect an unknown number of multiple vacancies up to atomic precision. Both methods can achieve approximately 90% prediction accuracy on the data reserved for testing, indicating a promising extrapolation to unseen future graphene configurations. The proposed strategy offers promising solutions for the non-destructive evaluation of nanomaterials and accelerates new material discoveries.

An alternative approach for machine learning seismic interpretation and its application in Daqing Oilfield

10.1190/segam2018-2989898.1 ◽

2018 ◽

Author(s):

Yile Ao ◽

Hongqi Li ◽

Zhongguo Yang ◽

Liping Zhu

Keyword(s):

Machine Learning ◽

Seismic Interpretation ◽

Alternative Approach

Obfuscated computer virus detection using machine learning algorithm

Bulletin of Electrical Engineering and Informatics ◽

10.11591/eei.v8i4.1584 ◽

2019 ◽

Vol 8 (4) ◽

Author(s):

Tan Hui Xin ◽

Ismahani Ismail ◽

Ban Mohammed Khammas

Keyword(s):

Machine Learning ◽

Learning Algorithm ◽

Virus Detection ◽

Computer Virus ◽

Machine Learning Technique ◽

Memory Space ◽

Alternative Approach ◽

Machine Learning Approach ◽

Learning Technique ◽

String Feature

Computer virus attacks are becoming increasingly advanced. A new obfuscated computer virus created by virus writers automatically generates a new form of itself on every iteration and download. This constantly evolving virus poses a significant threat to the information security of computer users, organizations, and even governments. However, the signature-based detection technique used by conventional antivirus software on the market fails to identify it because signatures are unavailable. This research proposes an alternative to the traditional signature-based detection method and investigates the use of machine learning techniques for obfuscated computer virus detection. In this work, text strings extracted from virus program code are used as features to build a classifier model that can correctly classify obfuscated virus files. Text string features are used because they are informative and potentially require only a small amount of memory. Results show that unknown files can be correctly classified with 99.5% accuracy using an SMO classifier model. Thus, it is believed that current computer virus defenses can be strengthened through a machine learning approach.
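The text-string features mentioned in the abstract above could be extracted along the following lines. This is a minimal sketch under stated assumptions: the minimum run length of 4 printable ASCII bytes, the fixed vocabulary, and the names `extract_strings` and `string_features` are hypothetical choices for illustration, and the abstract does not describe the authors' actual extraction procedure.

```python
import re
from collections import Counter

# Assumption: a "text string" is a run of at least 4 printable ASCII bytes.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")

def extract_strings(blob):
    """Return every printable-ASCII run found in a bytes object."""
    return [m.decode("ascii") for m in PRINTABLE_RUN.findall(blob)]

def string_features(blob, vocabulary):
    """Count how often each vocabulary string occurs among the extracted runs."""
    counts = Counter(extract_strings(blob))
    return [counts[s] for s in vocabulary]

# Hypothetical sample bytes and vocabulary, for illustration only.
sample = b"\x00\x01CreateFileA\x00\xffab\x00WriteFile\x90\x90CreateFileA\x00"
vocabulary = ["CreateFileA", "WriteFile", "RegOpenKey"]
features = string_features(sample, vocabulary)
```

The resulting count vectors are small and could be passed to any standard classifier; the choice of classifier is left open here.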

LONGITUDINAL DATA PREDICTION IN EHR: COMPARISON OF GLMM AND MACHINE LEARNING METHODS

10.23860/thesis-cao-wenqiu-2019 ◽

2019 ◽

Author(s):

Wenqiu Cao

Keyword(s):

Machine Learning ◽

Longitudinal Data ◽

Learning Methods ◽

Machine Learning Methods ◽

Data Prediction
