Predicting Multi-Disciplinary Design Performance Utilizing Automated Topic Discovery

Author(s):  
Zachary Ball ◽  
Kemper Lewis

Abstract Increasing the complexity of engineering design projects expands the diversity of required topic knowledge. Multi-disciplinary design processes require expertise from multiple fields of study. In the context of mass collaboration within engineering design, positioning key members within multi-disciplinary teams is of great importance. Determining how each discipline impacts the overall design process requires an understanding of the mapping between competency and performance. This work explores this mapping through the use of predictive models composed of various regression algorithms. The design performance of students working on their capstone design projects is analyzed, and the relationship between individual competencies and overall project performance is examined. Each competency and project is represented as a distribution of topic knowledge to produce the performance metrics. Following the automated topic extraction of the textual data, the regression algorithms are applied. Three topic models and five prediction models are compared for their prediction accuracy. From this analysis it was found that representing both input and output variables as a distribution of topics while performing a support vector regression provided the most accurate mapping between ability and performance.

2020 ◽  
Vol 142 (12) ◽  
Author(s):  
Zachary Ball ◽  
Kemper Lewis

Abstract Increasingly complex engineering design challenges require the diversification of knowledge on design teams. In the context of open innovation, positioning key members within these teams or groups based on their estimated abilities leads to more impactful results since mass collaboration is fundamentally a sociotechnical system. Determining how each individual influences the overall design process requires an understanding of the predicted mapping between their technical competency and performance. This work explores this relationship through the use of predictive models composed of various algorithms. Using a dataset composed of documents related to the design performance of students working on their capstone design project, in combination with textual descriptors representing individual technical aptitudes, correlations are explored as a method to predict overall project development performance. Each technical competency and project is represented as a distribution of topic knowledge to produce the performance metrics, which are referred to as topic competencies, since topic representations increase the ability to decompose and identify human-centric performance measures. Three methods of topic identification and five prediction models are compared based on their prediction accuracy. From this analysis, it is found that representing input variables as topic distributions and the resulting performance as a single indicator while using support vector regression provided the most accurate mapping between ability and performance. With these findings, complex open innovation projects will benefit from increased knowledge of individual ability and how that correlates to predicted performance.


2021 ◽  
Vol 10 (4) ◽  
pp. 199
Author(s):  
Francisco M. Bellas Aláez ◽  
Jesus M. Torres Palenzuela ◽  
Evangelos Spyrakos ◽  
Luis González Vilas

This work presents new prediction models based on recent developments in machine learning methods, such as Random Forest (RF) and AdaBoost, and compares them with more classical approaches, i.e., support vector machines (SVMs) and neural networks (NNs). The models predict Pseudo-nitzschia spp. blooms in the Galician Rias Baixas. This work builds on a previous study by the authors (doi.org/10.1016/j.pocean.2014.03.003) but uses an extended database (from 2002 to 2012) and new algorithms. Our results show that RF and AdaBoost provide better prediction results compared to SVMs and NNs, as they show improved performance metrics and a better balance between sensitivity and specificity. Classical machine learning approaches show higher sensitivities, but at a cost of lower specificity and higher percentages of false alarms (lower precision). These results seem to indicate a greater adaptation of new algorithms (RF and AdaBoost) to unbalanced datasets. Our models could be operationally implemented to establish a short-term prediction system.
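The trade-off described above between sensitivity, specificity, and false alarms can be illustrated with a minimal sketch. The function names and the confusion counts below are hypothetical and are not taken from the paper; the false alarm ratio is assumed here to be FP / (TP + FP), which is 1 minus precision.

```python
import math


def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def binary_metrics(tp, fp, tn, fn):
    """Compute common binary classification metrics from confusion counts."""
    return {
        "sensitivity": _ratio(tp, tp + fn),        # share of real events detected
        "specificity": _ratio(tn, tn + fp),        # share of non-events correctly rejected
        "precision": _ratio(tp, tp + fp),          # share of alarms that were real events
        "false_alarm_ratio": _ratio(fp, tp + fp),  # assumed definition: 1 - precision
    }


# Hypothetical counts for an unbalanced dataset (10 events among 102 samples).
metrics = binary_metrics(tp=8, fp=12, tn=80, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the classifier detects most events (high sensitivity) yet most of its alarms are false (low precision), which is the pattern the abstract attributes to the classical approaches on unbalanced data.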


2021 ◽  
Author(s):  
Sridevi S ◽  
Jeevaa Katiravan

Abstract Scientific workflows are attracting growing attention in sophisticated large-scale scientific problem-solving environments. Even a single task failure in a workflow-based application can drastically affect the reliability of the overall system because of the dependencies between tasks. Hence, proactive measures, rather than reactive fault-tolerant approaches, are vital in scientific workflows. This work explores an Exotic Intelligent Water Drops - Support Vector Regression-based approach for task failure prognostication, which facilitates proactive fault tolerance in scientific workflow applications. The failure prediction models in this study are implemented through SVR-based machine learning approaches, their accuracy is optimized by IWDA, and various performance metrics are evaluated. The experimental results show that the proposed approach performs better than the other existing techniques.


Sensors ◽  
2020 ◽  
Vol 20 (6) ◽  
pp. 1692 ◽  
Author(s):  
Iván Silva ◽  
José Eugenio Naranjo

Identifying driving styles using classification models with in-vehicle data can provide automated feedback to drivers on their driving behavior, particularly on whether they are driving safely. Although several classification models have been developed for this purpose, there is no consensus on which classifier performs better at identifying driving styles. Therefore, more research is needed to evaluate classification models by comparing performance metrics. In this paper, a data-driven machine-learning methodology for classifying driving styles is introduced. This methodology is grounded in well-established machine-learning (ML) methods and literature related to driving-styles research. The methodology is illustrated through a study involving data collected from 50 drivers from two different cities in a naturalistic setting. Five features were extracted from the raw data. Fifteen experts were involved in the data labeling to derive the ground truth of the dataset. The dataset was fed to five different models (Support Vector Machines (SVM), Artificial Neural Networks (ANN), fuzzy logic, k-Nearest Neighbor (kNN), and Random Forests (RF)). These models were evaluated in terms of a set of performance metrics and statistical tests. The experimental results from performance metrics showed that SVM outperformed the other four models, achieving an average accuracy of 0.96, F1-Score of 0.9595, Area Under the Curve (AUC) of 0.9730, and Kappa of 0.9375. In addition, Wilcoxon tests indicated that ANN predicts differently from the other four models. These promising results demonstrate that the proposed methodology may support researchers in making informed decisions about which ML model performs better for driving-styles classification.
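As a rough illustration of one of the reported metrics, the sketch below computes a kappa statistic from a confusion matrix. It assumes the "Kappa" in the abstract refers to Cohen's kappa, which the abstract does not state; the function name and the example matrix are hypothetical and unrelated to the study's data.

```python
def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: truth, columns: prediction)."""
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total

    # Agreement expected by chance, from the row and column marginals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n_classes)) for j in range(n_classes)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical two-class example: 90 of 100 samples classified correctly.
example = [[45, 5], [5, 45]]
print(round(cohen_kappa(example), 4))
```

Unlike raw accuracy, kappa discounts the agreement that would be expected by chance, which is why it is often reported alongside accuracy and F1.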


2020 ◽  
Vol 39 (5) ◽  
pp. 6073-6087
Author(s):  
Meltem Yontar ◽  
Özge Hüsniye Namli ◽  
Seda Yanik

Customer behavior prediction has recently been gaining importance in the banking sector, as in other sectors. This study proposes a model to predict whether credit card users will pay their debts. Using the proposed model, potential non-payment risks can be predicted and necessary actions can be taken in time. To predict customers’ payment status for the next month, we use Artificial Neural Network (ANN), Support Vector Machine (SVM), Classification and Regression Tree (CART) and C4.5, which are widely used artificial intelligence and decision tree algorithms. Our dataset includes 10713 customers’ records obtained from a well-known bank in Taiwan. These records consist of customer information such as the amount of credit, gender, education level, marital status, age, past payment records, invoice amount and amount of credit card payments. We apply cross validation and hold-out methods to divide our dataset into training and test sets. Then we evaluate the algorithms with the proposed performance metrics. We also optimize the parameters of the algorithms to improve prediction performance. The results show that the model built with the CART algorithm, one of the decision tree algorithms, predicts the customers’ payment status for the next month with high accuracy (about 86%). When the algorithm parameters are optimized, classification accuracy and performance increase.


Geosciences ◽  
2019 ◽  
Vol 9 (12) ◽  
pp. 504
Author(s):  
Josephine Morgenroth ◽  
Usman T. Khan ◽  
Matthew A. Perras

Machine learning methods for data processing are gaining momentum in many geoscience industries. This includes the mining industry, where machine learning is primarily being applied to autonomously driven vehicles such as haul trucks, and to ore body and resource delineation. However, the development of machine learning applications in rock engineering literature is relatively recent, despite these methods being widely used and generally accepted for decades in other risk assessment-type design areas, such as flood forecasting. Operating mines and underground infrastructure projects collect more instrumentation data than ever before; however, only a small fraction of the useful information is typically extracted for rock engineering design, and there is often insufficient time to investigate complex rock mass phenomena in detail. This paper presents a summary of current practice in rock engineering design, as well as a review of literature and methods at the intersection of machine learning and rock engineering. It identifies gaps, such as standards for architecture, input selection and performance metrics, and areas for future work. These gaps present an opportunity to define a framework for integrating machine learning into conventional rock engineering design methodologies to make them more rigorous and reliable in predicting probable underlying physical mechanics and phenomena.


Author(s):  
Dana Bani-Hani ◽  
Pruthak Patel ◽  
Tasneem Alshaikh

Diabetes is a serious, chronic disease whose number of cases and prevalence have risen over the past few decades. It can lead to serious complications and can increase the overall risk of dying prematurely. As the use of machine learning in medicine has increased substantially, data-oriented prediction models have become effective tools that support medical decision-making and diagnosis. This research introduces the Recursive General Regression Neural Network Oracle (R-GRNN Oracle) and applies it to the Pima Indians Diabetes dataset for the prediction and diagnosis of diabetes. The R-GRNN Oracle (Bani-Hani, 2017) is an enhancement to the GRNN Oracle developed by Masters et al. in 1998, in which the recursive model is composed of two oracles, one within the other. Several classifiers, along with the R-GRNN Oracle and the GRNN Oracle, are applied to the dataset: Support Vector Machine (SVM), Multilayer Perceptron (MLP), Probabilistic Neural Network (PNN), Gaussian Naïve Bayes (GNB), K-Nearest Neighbor (KNN), and Random Forest (RF). Genetic Algorithm (GA) was used for feature selection as well as the hyperparameter optimization of SVM and MLP, and Grid Search (GS) was used to optimize the hyperparameters of KNN and RF. The performance metrics accuracy, AUC, sensitivity, and specificity were recorded for each classifier.


Extensive research has been carried out on the prediction of diesel engine performance. Machine learning techniques such as support vector regression make performance predictions simpler. Support vector regression is a regression algorithm that tries to fit the best line while keeping the error within a threshold value. In this paper, a detailed study of diesel engine performance using support vector regression is presented, and performance metrics such as brake thermal efficiency and accuracy are explored. The findings indicate that support vector regression is an efficient technique for predicting diesel engine performance, and comparison against the actual performance shows high accuracy. For engine performance, the support vector machine helps reduce the time and cost of testing.
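The threshold idea mentioned above corresponds, in the standard formulation of support vector regression, to the epsilon-insensitive loss: errors smaller than a threshold epsilon are ignored, and only the excess beyond it is penalized. The sketch below is a minimal illustration of that loss; the function name, the sample values, and the choice of epsilon are assumptions for illustration and do not come from the paper.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Mean epsilon-insensitive loss: errors within +/- epsilon cost nothing."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    # Only the part of each absolute error that exceeds epsilon is penalized.
    losses = [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]
    return sum(losses) / len(losses)


# Hypothetical measured vs. predicted values (arbitrary units).
measured = [1.0, 2.0]
predicted = [1.05, 2.5]
print(epsilon_insensitive_loss(measured, predicted, epsilon=0.1))
```

In the example, the first prediction is off by 0.05, which is inside the threshold and contributes no loss, while the second is off by 0.5 and contributes only the 0.4 that exceeds the threshold.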


Author(s):  
Yousef O. Sharrab ◽  
Mohammad Alsmirat ◽  
Bilal Hawashin ◽  
Nabil Sarhan

Advances in the prediction models used in a variety of fields are a result of the contribution of machine learning approaches. Utilizing such modeling in feature engineering is essential. In this research, we show how to utilize machine learning to save time in research experiments, saving more than five thousand hours of measuring the energy consumption of encoding recordings. Since measuring the energy consumption must be done by humans and since we require more than eleven thousand experiments to cover all the combinations of video sequences, video bit rates, and video encoding settings, we utilize machine learning to model the energy consumption using linear regression. The VP8 codec has been offered by Google as an open video encoder in an effort to replace the popular MPEG-4 Part 10, known as the H.264/AVC video encoder standard. This research models energy consumption and describes the major differences between the H.264/AVC and VP8 encoders in terms of energy consumption and performance through experiments that are based on machine learning modeling. Twenty-nine raw video sequences are used, offering a wide range of resolutions and contents, with the frame sizes ranging from QCIF (176x144) to 2160p (3840x2160). For fairness in the comparison analysis, we use seven settings in the VP8 encoder and fifteen types of tuning in H.264/AVC. The settings cover various video qualities. The performance metrics include video quality, encoding time, and encoding energy consumption.

