Improving Data Science Projects by Enriching Analytical Models with Domain Knowledge

Data Science in the Chemical Engineering Curriculum

Processes ◽

10.3390/pr7110830 ◽

2019 ◽

Vol 7 (11) ◽

pp. 830 ◽

Cited By ~ 1

Author(s):

Thomas A. Duever

Keyword(s):

Domain Knowledge ◽

Data Science ◽

Chemical Engineering ◽

Point Of View ◽

Successful Implementation ◽

Master's Program ◽

Engineering Curriculum ◽

Engineering Program ◽

Science Projects ◽

Term Data

With the increasing availability of large amounts of data, methods that fall under the term data science are becoming important assets for chemical engineers. Broadly speaking, methods are needed to carry out three tasks: data management, statistical and machine learning, and data visualization. While claims have been made that data science is essentially statistics, consideration of these three tasks makes it clear that it is broader than statistics alone; furthermore, statistical methods from a data-poor era are likely insufficient. While there have been many successful applications of data science methodologies, many challenges remain. For example, a large dataset is not necessarily meaningful or information rich. From an organizational point of view, a lack of domain knowledge and a lack of a trained workforce, among other issues, are cited as barriers to the successful implementation of data science within an organization. Many of the methodologies employed in data science are familiar to chemical engineers; however, it is generally the case that not all the methods required to carry out data science projects are covered in an undergraduate chemical engineering program. One option to address this is to adjust the curriculum by modifying existing courses and introducing electives. Other options include the introduction of a data science minor, a postgraduate certificate, or a Master’s program in data science.


Data Scientists’ Identity Work: Omnivorous Symbolic Boundaries in Skills Acquisition

Work Employment and Society ◽

10.1177/0950017020977306 ◽

2021 ◽

pp. 095001702097730

Author(s):

Netta Avnoon

Keyword(s):

Professional Identity ◽

Domain Knowledge ◽

Data Science ◽

Identity Work ◽

Skills Acquisition ◽

Sociology Of Culture ◽

Sociology Of Work ◽

Discursive Analysis ◽

Self Learning ◽

Depth Interviews

Drawing on theories from the sociology of work and the sociology of culture, this article argues that members of nascent technical occupations construct their professional identity and claim status through an omnivorous approach to skills acquisition. Based on a discursive analysis of 56 semi-structured in-depth interviews with data scientists, data science professors and managers in Israel, it was found that data scientists mobilise the following five resources to construct their identity: (1) ability to bridge the gap between scientist’s and engineer’s identities; (2) multiplicity of theories; (3) intensive self-learning; (4) bridging technical and social skills; and (5) acquiring domain knowledge easily. These resources diverge from former generalist-specialist identity tensions described in the literature as they attribute a higher status to the generalist-omnivore and a lower one to the specialist-snob.


Data-Driven Discovery of Material States in Composites Under Fatigue Loads

10.12783/asc36/35783 ◽

2021 ◽

Author(s):

Muthu Ram Elenchezhian ◽

Vamsee Vadlamudi ◽

Rassel Raihan ◽

Kenneth Reifsnider

Keyword(s):

Domain Knowledge ◽

Data Science ◽

Damage Tolerance ◽

Electrochemical Impedance ◽

Research Work ◽

Rate Of Change ◽

Data Driven ◽

Develop Model ◽

Damage Development ◽

Fatigue Loads

Our community has developed extensive knowledge of the damage tolerance and durability of composites over the past few decades through various experimental and computational efforts. Several methods have been used to understand damage behavior and hence predict material states such as residual strength (damage tolerance) and life (durability) of these material systems. Electrochemical Impedance Spectroscopy (EIS) and Broadband Dielectric Spectroscopy (BbDS) are two such methods, which have been proven to identify damage states in composites. Our previous work showed that the BbDS method can serve as a precursor for identifying damage levels, indicating the beginning of the end of life of the material. Because a change in a material state variable is triggered by damage development, the rate of change of these states indicates the rate of damage interaction and can effectively predict impending failure. The Data-Driven Discovery of Models (D3M) [1] aims to develop model discovery systems, enabling users with domain knowledge but no data science background to create empirical models of real, complex processes. These D3M methods have been developed extensively over the years in various applications, and their implementation for real-time prediction of complex parameters such as material states in composites must be made trustworthy on the basis of physics and domain knowledge. In this research work, we propose the use of data-driven methods combined with BbDS and progressive damage analysis to identify and hence predict material states in composites subjected to fatigue loads.


The Need for an Enterprise Risk Management Framework for Big Data Science Projects

Proceedings of the 9th International Conference on Data Science, Technology and Applications ◽

10.5220/0009874502680274 ◽

2020 ◽

Author(s):

Jeffrey Saltz ◽

Sucheta Lahiri

Keyword(s):

Risk Management ◽

Big Data ◽

Data Science ◽

Enterprise Risk Management ◽

Management Framework ◽

Risk Management Framework ◽

Science Projects ◽

Enterprise Risk


Divide and recombine (D&R) data science projects for deep analysis of big data and high computational complexity

Japanese Journal of Statistics and Data Science ◽

10.1007/s42081-018-0008-4 ◽

2018 ◽

Vol 1 (1) ◽

pp. 139-156 ◽

Cited By ~ 1

Author(s):

Wen-wen Tung ◽

Ashrith Barthur ◽

Matthew C. Bowers ◽

Yuying Song ◽

John Gerth ◽

...

Keyword(s):

Big Data ◽

Computational Complexity ◽

Data Science ◽

Science Projects ◽

High Computational Complexity


Best Practices in Structuring Data Science Projects

Advances in Intelligent Systems and Computing - Information Systems Architecture and Technology: Proceedings of 39th International Conference on Information Systems Architecture and Technology – ISAT 2018 ◽

10.1007/978-3-319-99993-7_31 ◽

2018 ◽

pp. 348-357

Author(s):

Jedrzej Rybicki

Keyword(s):

Best Practices ◽

Data Science ◽

Science Projects


Machine Learning Techniques for Internet of Things

Advances in Systems Analysis, Software Engineering, and High Performance Computing - Integrating the Internet of Things Into Software Engineering Practices ◽

10.4018/978-1-5225-7790-4.ch008 ◽

2019 ◽

pp. 160-180

Author(s):

P. Priakanth ◽

S. Gopikrishnan

Keyword(s):

Machine Learning ◽

Data Science ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Independent Learning ◽

Machine Learning Techniques ◽

Analytical Models ◽

Guided Learning ◽

Learning Techniques ◽

Learning Machine

The idea of an intelligent, independent learning machine has fascinated humans for decades. The philosophy behind machine learning is to automate the creation of analytical models so that algorithms can learn continuously from available data. Since IoT will be among the major sources of new data, data science will contribute greatly to making IoT applications more intelligent. Machine learning can be applied in cases where the desired outcome is known (guided learning), where the data is not known beforehand (unguided learning), or where learning results from interaction between a model and the environment (reinforcement learning). This chapter answers the following questions: How could machine learning algorithms be applied to IoT smart data? What is the taxonomy of machine learning algorithms that can be adopted in IoT? And what are the real-world characteristics of IoT data that require data analytics?


Machine Learning Techniques for Internet of Things

Research Anthology on Artificial Intelligence Applications in Security ◽

10.4018/978-1-7998-7705-9.ch067 ◽

2021 ◽

pp. 1490-1506

Author(s):

P. Priakanth ◽

S. Gopikrishnan

Keyword(s):

Machine Learning ◽

Data Science ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Independent Learning ◽

Machine Learning Techniques ◽

Analytical Models ◽

Guided Learning ◽

Learning Techniques ◽

Learning Machine

The idea of an intelligent, independent learning machine has fascinated humans for decades. The philosophy behind machine learning is to automate the creation of analytical models so that algorithms can learn continuously from available data. Since IoT will be among the major sources of new data, data science will contribute greatly to making IoT applications more intelligent. Machine learning can be applied in cases where the desired outcome is known (guided learning), where the data is not known beforehand (unguided learning), or where learning results from interaction between a model and the environment (reinforcement learning). This chapter answers the following questions: How could machine learning algorithms be applied to IoT smart data? What is the taxonomy of machine learning algorithms that can be adopted in IoT? And what are the real-world characteristics of IoT data that require data analytics?
