Beyond generalization: Enhancing accurate interpretation of flexible models

2019 ◽  
Author(s):  
Mikhail Genkin ◽  
Tatiana A. Engel

ABSTRACT: Machine learning optimizes flexible models to predict data. In scientific applications, there is a rising interest in interpreting these flexible models to derive hypotheses from data. However, it is unknown whether good data prediction guarantees accurate interpretation of flexible models. We test this connection using a flexible, yet intrinsically interpretable framework for modeling neural dynamics. We find that many models discovered during optimization predict data equally well, yet they fail to match the correct hypothesis. We develop an alternative approach that identifies models with correct interpretation by comparing model features across data samples to separate true features from noise. Our results reveal that good predictions cannot substitute for accurate interpretation of flexible models and offer a principled approach to identify models with correct interpretation.

Author(s):  
Kassem Ghorayeb ◽  
Arwa Ahmed Mawlod ◽  
Alaa Maarouf ◽  
Qazi Sami ◽  
Nour El Droubi ◽  
...  

Information ◽  
2019 ◽  
Vol 10 (3) ◽  
pp. 98 ◽  
Author(s):  
Tariq Ahmad ◽  
Allan Ramsay ◽  
Hanady Ahmed

Assigning sentiment labels to documents is, at first sight, a standard multi-label classification task. Many approaches have been used for this task, but the current state-of-the-art solutions use deep neural networks (DNNs). As such, it seems likely that standard machine learning algorithms, such as these, will provide an effective approach. We describe an alternative approach, involving the use of probabilities to construct a weighted lexicon of sentiment terms, then modifying the lexicon and calculating optimal thresholds for each class. We show that this approach outperforms the use of DNNs and other standard algorithms. We believe that DNNs are not a universal panacea and that paying attention to the nature of the data that you are trying to learn from can be more important than trying out ever more powerful general purpose machine learning algorithms.
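The abstract above does not give the authors' formulas, so the following is only a minimal sketch of the general idea of a probability-weighted lexicon with one threshold per class. Everything in it is an assumption made for illustration: the weight of a term for a class is taken to be the fraction of documents containing the term that carry that class, a document is scored by the mean weight of its tokens, and each threshold is chosen to maximize F1 on the training data. The function names (`build_lexicon`, `score`, `best_threshold`, `fit`, `predict`) are hypothetical, and the lexicon-modification step mentioned in the abstract is not shown.

```python
from collections import Counter, defaultdict


def build_lexicon(docs, labels):
    """Assumed weighting: share of a token's documents that carry each label."""
    doc_freq = Counter()
    label_freq = defaultdict(Counter)
    for tokens, doc_labels in zip(docs, labels):
        for tok in set(tokens):
            doc_freq[tok] += 1
            for lab in doc_labels:
                label_freq[lab][tok] += 1
    return {lab: {tok: n / doc_freq[tok] for tok, n in counts.items()}
            for lab, counts in label_freq.items()}


def score(tokens, weights):
    """Mean lexicon weight over the document's tokens (0.0 if empty)."""
    if not tokens:
        return 0.0
    return sum(weights.get(tok, 0.0) for tok in tokens) / len(tokens)


def f1(pred, gold):
    """F1 score for two parallel lists of booleans."""
    tp = sum(p and g for p, g in zip(pred, gold))
    fp = sum(p and not g for p, g in zip(pred, gold))
    fn = sum(g and not p for p, g in zip(pred, gold))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def best_threshold(scores, gold):
    """Try each observed score as a threshold; keep the one with the best F1."""
    return max(sorted(set(scores)),
               key=lambda t: f1([s >= t for s in scores], gold))


def fit(docs, labels):
    """Build the lexicon, then tune one threshold per class on the same data."""
    lexicon = build_lexicon(docs, labels)
    thresholds = {}
    for lab, weights in lexicon.items():
        scores = [score(d, weights) for d in docs]
        gold = [lab in doc_labels for doc_labels in labels]
        thresholds[lab] = best_threshold(scores, gold)
    return lexicon, thresholds


def predict(tokens, lexicon, thresholds):
    """Return every class whose score reaches that class's threshold."""
    return {lab for lab, weights in lexicon.items()
            if score(tokens, weights) >= thresholds[lab]}


# Toy, made-up training data: tokenized documents with label sets.
docs = [["happy", "joy"], ["sad", "tears"], ["happy", "sad"]]
labels = [{"joy"}, {"sadness"}, {"joy", "sadness"}]
lexicon, thresholds = fit(docs, labels)
```

Because each class gets its own threshold, a document can receive zero, one, or several labels, which is what makes this a multi-label scheme rather than a single argmax over classes.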


2020 ◽  
Vol 12 (1) ◽  
Author(s):  
Bowen Zheng ◽  
Grace X. Gu

Abstract: Defects in graphene can profoundly impact its extraordinary properties, ultimately influencing the performances of graphene-based nanodevices. Methods to detect defects with atomic resolution in graphene can be technically demanding and involve complex sample preparations. An alternative approach is to observe the thermal vibration properties of the graphene sheet, which reflects defect information but in an implicit fashion. Machine learning, an emerging data-driven approach that offers solutions to learning hidden patterns from complex data, has been extensively applied in material design and discovery problems. In this paper, we propose a machine learning-based approach to detect graphene defects by discovering the hidden correlation between defect locations and thermal vibration features. Two prediction strategies are developed: an atom-based method which constructs data by atom indices, and a domain-based method which constructs data by domain discretization. Results show that while the atom-based method is capable of detecting a single-atom vacancy, the domain-based method can detect an unknown number of multiple vacancies up to atomic precision. Both methods can achieve approximately a 90% prediction accuracy on the reserved data for testing, indicating a promising extrapolation into unseen future graphene configurations. The proposed strategy offers promising solutions for the non-destructive evaluation of nanomaterials and accelerates new material discoveries.


Author(s):  
Tan Hui Xin ◽  
Ismahani Ismail ◽  
Ban Mohammed Khammas

Computer virus attacks are becoming increasingly advanced. Obfuscated computer viruses created by virus writers automatically generate a new form of themselves on every iteration and download. These constantly evolving viruses pose a significant threat to the information security of computer users, organizations, and even governments. The signature-based detection technique used by conventional antivirus software on the market fails to identify them because no signatures are available. This research proposes an alternative to the traditional signature-based detection method and investigates the use of machine learning for obfuscated computer virus detection. In this work, text strings extracted from virus program code are used as features to build a classifier model that can correctly classify obfuscated virus files. Text string features are used because they are informative and potentially require only a small amount of memory. Results show that unknown files can be correctly classified with 99.5% accuracy using an SMO classifier model. Thus, it is believed that current computer virus defense can be strengthened through a machine learning approach.
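The abstract above does not say how the text strings are extracted, so the following is only a minimal sketch of one common way to do it. It assumes that a "text string" is a run of at least four printable ASCII bytes (similar in spirit to the Unix `strings` utility) and that each file is encoded as a binary presence vector over a fixed vocabulary. The names `extract_strings` and `to_feature_vector` are hypothetical, the sample bytes are made up, and the SMO classifier itself is not shown.

```python
import re


def extract_strings(data: bytes, min_len: int = 4):
    """Return runs of printable ASCII bytes that are at least min_len long."""
    # Assumption: "printable" means bytes 0x20 through 0x7e.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]


def to_feature_vector(strings, vocabulary):
    """Binary vector: 1 if the vocabulary term occurs among the strings."""
    present = set(strings)
    return [1 if term in present else 0 for term in vocabulary]


# Made-up bytes standing in for a program file.
sample = b"\x00\x01CreateFileA\x00ab\x00kernel32.dll\xff"
found = extract_strings(sample)

# Hypothetical vocabulary; in practice it would come from the training files.
vocabulary = ["kernel32.dll", "WriteFile", "CreateFileA"]
vector = to_feature_vector(found, vocabulary)
```

Vectors of this kind could then be passed to any standard classifier; a presence vector stores one value per vocabulary term, which is in keeping with the abstract's remark about small memory use.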

