Improving Data Science Projects by Enriching Analytical Models with Domain Knowledge

Author(s):  
Heng Zhang ◽  
Utpal Roy ◽  
Jeffrey Saltz
Processes ◽  
2019 ◽  
Vol 7 (11) ◽  
pp. 830 ◽  
Author(s):  
Thomas A. Duever

With the increasing availability of large amounts of data, methods that fall under the term data science are becoming important assets for chemical engineers. Broadly speaking, methods are needed to carry out three tasks: data management, statistical and machine learning, and data visualization. While claims have been made that data science is essentially statistics, consideration of these three tasks makes it clear that it is broader than statistics alone; furthermore, statistical methods from a data-poor era are likely insufficient. While there have been many successful applications of data science methodologies, many challenges remain to be addressed. For example, just because a dataset is large does not necessarily mean it is meaningful or information rich. From an organizational point of view, a lack of domain knowledge and a lack of a trained workforce, among other issues, are cited as barriers to the successful implementation of data science within an organization. Many of the methodologies employed in data science are familiar to chemical engineers; however, it is generally the case that not all the methods required to carry out data science projects are covered in an undergraduate chemical engineering program. One option to address this is to adjust the curriculum by modifying existing courses and introducing electives. Other options include the introduction of a data science minor, a postgraduate certificate, or a Master’s program in data science.


2021 ◽  
pp. 095001702097730
Author(s):  
Netta Avnoon

Drawing on theories from the sociology of work and the sociology of culture, this article argues that members of nascent technical occupations construct their professional identity and claim status through an omnivorous approach to skills acquisition. Based on a discursive analysis of 56 semi-structured in-depth interviews with data scientists, data science professors and managers in Israel, it was found that data scientists mobilise the following five resources to construct their identity: (1) ability to bridge the gap between scientist’s and engineer’s identities; (2) multiplicity of theories; (3) intensive self-learning; (4) bridging technical and social skills; and (5) acquiring domain knowledge easily. These resources diverge from former generalist-specialist identity tensions described in the literature as they attribute a higher status to the generalist-omnivore and a lower one to the specialist-snob.


2021 ◽  
Author(s):  
MUTHU RAM ELENCHEZHIAN ◽  
VAMSEE VADLAMUDI ◽  
RASSEL RAIHAN ◽  
KENNETH REIFSNIDER

Our community has developed extensive knowledge of the damage tolerance and durability of composites over the past few decades through various experimental and computational efforts. Several methods have been used to understand damage behavior and thereby predict material states such as residual strength (damage tolerance) and life (durability) of these material systems. Electrochemical Impedance Spectroscopy (EIS) and Broadband Dielectric Spectroscopy (BbDS) are two such methods, which have been proven to identify damage states in composites. Our previous work has shown that the BbDS method can serve as a precursor for identifying damage levels, indicating the beginning of the end of life of the material. As a change in the material state variable is triggered by damage development, the rate of change of these states indicates the rate of damage interaction and can effectively predict impending failure. The Data-Driven Discovery of Models (D3M) [1] aims to develop model discovery systems, enabling users with domain knowledge but no data science background to create empirical models of real, complex processes. These D3M methods have been developed extensively over the years in various applications, and their implementation for real-time prediction of complex parameters such as material states in composites needs to be trusted based on physics and domain knowledge. In this research work, we propose the use of data-driven methods combined with BbDS and progressive damage analysis to identify and hence predict material states in composites subjected to fatigue loads.


2018 ◽  
Vol 1 (1) ◽  
pp. 139-156 ◽  
Author(s):  
Wen-wen Tung ◽  
Ashrith Barthur ◽  
Matthew C. Bowers ◽  
Yuying Song ◽  
John Gerth ◽  
...  

Author(s):  
P. Priakanth ◽  
S. Gopikrishnan

The idea of an intelligent, independent learning machine has fascinated humans for decades. The philosophy behind machine learning is to automate the creation of analytical models in order to enable algorithms to learn continuously with the help of available data. Since IoT will be among the major sources of new data, data science will make a great contribution to making IoT applications more intelligent. Machine learning can be applied in cases where the desired outcome is known (guided learning), where the data is not known beforehand (unguided learning), or where the learning is the result of interaction between a model and the environment (reinforcement learning). This chapter answers the questions: How could machine learning algorithms be applied to IoT smart data? What is the taxonomy of machine learning algorithms that can be adopted in IoT? And what are the characteristics of real-world IoT data that require data analytics?


Author(s):  
Vineet Raina ◽  
Srinath Krishnamurthy
