Machine Learning Regression Analysis of EDX 2012-13 Data for Identifying the Auditors Use Case

Mark Mueller; Greg Weber

doi:10.5121/ijite.2017.6301

A machine learning approach to estimate product costs in the early product design phase: a use case from the automotive industry

Procedia CIRP ◽

10.1016/j.procir.2021.05.137 ◽

2021 ◽

Vol 100 ◽

pp. 643-648

Author(s):

Frank Bodendorf ◽

Jörg Franke

Keyword(s):

Machine Learning ◽

Product Design ◽

Automotive Industry ◽

Learning Approach ◽

Use Case ◽

Design Phase ◽

Machine Learning Approach ◽

Product Costs

Content Controlled Spectral Indices for Detection of Hydrothermal Alteration Minerals Based on Machine Learning and Lasso-Logistic Regression Analysis

IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing ◽

10.1109/jstars.2021.3095926 ◽

2021 ◽

Vol 14 ◽

pp. 7435-7447

Author(s):

Kyuhun Shim ◽

Jaehyung Yu ◽

Lei Wang ◽

Sangin Lee ◽

Sang-Mo Koh ◽

...

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Regression Analysis ◽

Hydrothermal Alteration ◽

Logistic Regression Analysis ◽

Spectral Indices ◽

Alteration Minerals

Software Effort Estimation from Use Case Diagrams Using Nonlinear Regression Analysis

2020 IEEE Canadian Conference on Electrical and Computer Engineering (CCECE) ◽

10.1109/ccece47787.2020.9255712 ◽

2020 ◽

Author(s):

Ali Bou Nassif ◽

Manar AbuTalib ◽

Luiz Fernando Capretz

Keyword(s):

Regression Analysis ◽

Nonlinear Regression ◽

Use Case ◽

Effort Estimation ◽

Nonlinear Regression Analysis ◽

Software Effort Estimation

Construction of a quality model for machine learning systems

Software Quality Journal ◽

10.1007/s11219-021-09557-y ◽

2021 ◽

Author(s):

Julien Siebert ◽

Lisa Joeckel ◽

Jens Heidrich ◽

Adam Trendowicz ◽

Koji Nakamichi ◽

...

Keyword(s):

Machine Learning ◽

Lessons Learned ◽

Training Data ◽

Use Case ◽

Construction Process ◽

Quality Model ◽

Quality Models ◽

Quality Properties ◽

Reference Quality ◽

Industrial Use

Abstract: Nowadays, systems containing components based on machine learning (ML) methods are becoming more widespread. In order to ensure the intended behavior of a software system, there are standards that define necessary qualities of the system and its components (such as ISO/IEC 25010). Due to the different nature of ML, we have to re-interpret existing qualities for ML systems or add new ones (such as trustworthiness). We have to be very precise about which quality property is relevant for which entity of interest (such as completeness of training data or correctness of trained model), and how to objectively evaluate adherence to quality requirements. In this article, we present how to systematically construct quality models for ML systems based on an industrial use case. This quality model enables practitioners to specify and assess qualities for ML systems objectively. In addition to the overall construction process described, the main outcomes include a meta-model for specifying quality models for ML systems, reference elements regarding relevant views, entities, quality properties, and measures for ML systems based on existing research, an example instantiation of a quality model for a concrete industrial use case, and lessons learned from applying the construction process. We found that it is crucial to follow a systematic process in order to come up with measurable quality properties that can be evaluated in practice. In the future, we want to learn how the term quality differs between different types of ML systems and come up with reference quality models for evaluating qualities of ML systems.

Machine learning based Synthetic Data Generation using Iterative Regression Analysis

2020 4th International Conference on Electronics, Communication and Aerospace Technology (ICECA) ◽

10.1109/iceca49313.2020.9297491 ◽

2020 ◽

Author(s):

Sanskar Shah ◽

Darshan Gandhi ◽

Jil Kothari

Keyword(s):

Machine Learning ◽

Regression Analysis ◽

Synthetic Data ◽

Data Generation ◽

Synthetic Data Generation

Machine Learning Algorithms Predict Body Mass Index Using Nonlinear Trimodal Regression Analysis from Computed Tomography Scans

IFMBE Proceedings - XV Mediterranean Conference on Medical and Biological Engineering and Computing – MEDICON 2019 ◽

10.1007/978-3-030-31635-8_100 ◽

2019 ◽

pp. 839-846 ◽

Cited By ~ 2

Author(s):

Marco Recenti ◽

Carlo Ricciardi ◽

Magnus Gíslason ◽

Kyle Edmunds ◽

Ugo Carraro ◽

...

Keyword(s):

Machine Learning ◽

Computed Tomography ◽

Body Mass Index ◽

Regression Analysis ◽

Body Mass ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Computed Tomography Scans

Phybrata Sensors and Machine Learning for Enhanced Neurophysiological Diagnosis and Treatment

Sensors ◽

10.3390/s21217417 ◽

2021 ◽

Vol 21 (21) ◽

pp. 7417

Author(s):

Alex J. Hope ◽

Utkarsh Vashisth ◽

Matthew J. Parker ◽

Andreas B. Ralston ◽

Joshua M. Roper ◽

...

Keyword(s):

Machine Learning ◽

Time Series ◽

Random Forest ◽

Binary Classification ◽

Classification Performance ◽

Support Vector ◽

Use Case ◽

Signal Features ◽

Test Population

Concussion injuries remain a significant public health challenge. A significant unmet clinical need remains for tools that allow related physiological impairments and longer-term health risks to be identified earlier, better quantified, and more easily monitored over time. We address this challenge by combining a head-mounted wearable inertial motion unit (IMU)-based physiological vibration acceleration (“phybrata”) sensor and several candidate machine learning (ML) models. The performance of this solution is assessed for both binary classification of concussion patients and multiclass predictions of specific concussion-related neurophysiological impairments. Results are compared with previously reported approaches to ML-based concussion diagnostics. Using phybrata data from a previously reported concussion study population, four different machine learning models (Support Vector Machine, Random Forest Classifier, Extreme Gradient Boost, and Convolutional Neural Network) are first investigated for binary classification of the test population as healthy vs. concussion (Use Case 1). Results are compared for two different data preprocessing pipelines, Time-Series Averaging (TSA) and Non-Time-Series Feature Extraction (NTS). Next, the three best-performing NTS models are compared in terms of their multiclass prediction performance for specific concussion-related impairments: vestibular, neurological, both (Use Case 2). For Use Case 1, the NTS model approach outperformed the TSA approach, with the two best algorithms achieving an F1 score of 0.94. For Use Case 2, the NTS Random Forest model achieved the best performance in the testing set, with an F1 score of 0.90, and identified a wider range of relevant phybrata signal features that contributed to impairment classification compared with manual feature inspection and statistical data analysis. 
The overall classification performance achieved in the present work exceeds previously reported approaches to ML-based concussion diagnostics using other data sources and ML models. This study also demonstrates the first combination of a wearable IMU-based sensor and ML model that enables both binary classification of concussion patients and multiclass predictions of specific concussion-related neurophysiological impairments.
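
The F1 scores quoted in this abstract combine precision and recall into a single number. As a minimal sketch of how such a score is computed for the binary case (healthy vs. concussion), the function below uses the standard definition F1 = 2TP / (2TP + FP + FN). The function name `binary_f1` and the label vectors are hypothetical illustrations, not the study's data or code, and the abstract does not say how the multiclass F1 in Use Case 2 was averaged.

```python
def binary_f1(y_true, y_pred, positive=1):
    """F1 score for one positive class: 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    # Convention: return 0.0 when there are no positives and no positive predictions.
    return 2 * tp / denom if denom else 0.0


# Made-up labels: 1 = concussion, 0 = healthy (illustrative only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

# Here TP = 3, FP = 1, FN = 1, so F1 = 6 / 8.
print(binary_f1(y_true, y_pred))  # 0.75
```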

Features of machine learning in the study of the main factors of development of countries of the world

SHS Web of Conferences ◽

10.1051/shsconf/202111002006 ◽

2021 ◽

Vol 110 ◽

pp. 02006

Author(s):

Ludmila Borisova ◽

Galina Zhukova ◽

Anna Kuznetsova ◽

Julie Martin

Keyword(s):

Machine Learning ◽

Regression Analysis ◽

Mortality Rate ◽

Infant Mortality ◽

Life Expectancy ◽

Infant Mortality Rate ◽

Live Births ◽

Goods And Services ◽

Machine Learning Methods ◽

The World

The paper analyzes socio-economic and demographic indicators of life expectancy in the countries of the world, using methods of regression analysis and machine learning. Statistically significant indicators that affect life expectancy around the world have been identified. When the data were analyzed using machine learning methods, 13 of the 14 analyzed indicators were statistically significant. In addition to those selected in the regression analysis, three further indicators were significant: the under-five infant mortality rate (per 1,000 live births), the Net Barter Terms of Trade Index (2000 = 100), and imports of goods and services (in % of GDP) (in the regression analysis, only the infant death rate was significant). For the EU, CIS and South-East Asian countries, the under-five infant mortality rate (per 1,000 live births) is significantly lower than the boundary value set in the study for all countries (4.65 vs. 34.9); the birth rate decreases from 2.785 to 1.85; exports of goods and services rise sharply, from 23.17 to 80.59; imports of goods and services halve; and population growth drops from 2.105 to 0.85. The statistical analysis performed strongly supports the use of machine learning methods for identifying statistically significant relationships between the various indicators that characterize the development of countries when there are gaps in the data.

Cycling performance prediction based on cadence analysis by using multiple regression

Journal of Physics Conference Series ◽

10.1088/1742-6596/2107/1/012058 ◽

2021 ◽

Vol 2107 (1) ◽

pp. 012058

Author(s):

Sukhairi Sudin ◽

Azizi Naim Abdul Aziz ◽

Fathinul Syahir Ahmad Saad ◽

Nurul Syahirah Khalid ◽

Ismail Ishaq Ibrahim

Keyword(s):

Machine Learning ◽

Heart Rate ◽

Regression Analysis ◽

Linear Regression ◽

Linear Relationship ◽

Multiple Regression ◽

Cycling Performance ◽

Continuous Output ◽

Independent Variable ◽

Prediction Problems

Abstract: This project examined the influence of cadence, speed, heart rate and power on cycling performance, using a Garmin Edge 1000. Any change in cadence affects the speed, heart rate and power of a novice cyclist, and the pattern of these changes was observed through mobile devices running the Garmin Connect application. All results were recorded for the next task, which analyses the collected data using regression analysis, a machine learning algorithm. Regression analysis is a statistical method for modelling the relationship between one or more independent variables and a dependent (target) variable, and it is used to answer prediction problems of this type in machine learning. Regression is a supervised learning technique that helps uncover correlations between variables and allows a continuous output variable to be predicted from one or more predictor variables. The dataset captures a total of forty days' worth of events. Cadence acts as the dependent variable (y), while speed, heart rate and power act as the independent variables (x) in predicting cycling performance. Simple linear regression is linear regression with only one input variable (x); when there are several input variables, it is referred to as multiple linear regression. The research uses a linear regression technique to predict cycling performance based on cadence analysis. The linear regression algorithm models a linear relationship between a dependent variable (y) and one or more independent variables (x), hence the name. Because the relationship is linear, the model describes how the value of the dependent variable changes as the value of an independent variable changes. This analysis uses the Mean Squared Error (MSE) cost function for linear regression, which is the average of the squared errors between predicted and actual values. The R-squared value was also recorded in this project. A low R-squared value means that the independent variables explain little of the variance in the dependent variable: regardless of variable importance, the chosen independent variables, although meaningful, do not account for much of that variance. With multiple regression, the R-squared value in this project is above 0.7. Because the project is based on human behaviour, where R-squared values rarely exceed 0.3, this value is considered acceptable.
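
The multiple linear regression, MSE and R-squared described in this abstract can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' code or dataset: the sample size, the coefficients in `true_coefs`, the noise level and the value ranges are all assumptions chosen only so that the example runs. Following the abstract, cadence is treated as the dependent variable (y), with speed, heart rate and power as the independent variables (x).

```python
import numpy as np


def fit_linear_regression(X, y):
    """Ordinary least squares fit; returns (intercept, coefficients)."""
    # Prepend a column of ones so the first fitted parameter is the intercept.
    X_design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X_design, y, rcond=None)
    return beta[0], beta[1:]


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between actual and predicted values."""
    return float(np.mean((y_true - y_pred) ** 2))


def r_squared(y_true, y_pred):
    """Share of the variance in y_true explained by the predictions."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Synthetic stand-in data (assumed ranges, not real measurements).
rng = np.random.default_rng(0)
n = 200
speed = rng.uniform(15, 35, n)        # km/h
heart_rate = rng.uniform(90, 180, n)  # beats per minute
power = rng.uniform(80, 300, n)       # watts
X = np.column_stack([speed, heart_rate, power])

# Hypothetical linear relationship plus Gaussian noise.
true_coefs = np.array([1.2, 0.15, 0.05])
cadence = 30.0 + X @ true_coefs + rng.normal(0.0, 3.0, n)

intercept, coefs = fit_linear_regression(X, cadence)
predicted = intercept + X @ coefs

mse = mean_squared_error(cadence, predicted)
r2 = r_squared(cadence, predicted)
print(f"intercept={intercept:.2f}, coefs={np.round(coefs, 3)}")
print(f"MSE={mse:.2f}, R^2={r2:.3f}")
```

The metrics here are computed on the same data used for fitting, which keeps the sketch short; evaluating on a held-out split would give a more honest estimate of predictive performance.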

Use Case: Machine Learning in OBIEE 12c

Oracle Business Intelligence with Machine Learning ◽

10.1007/978-1-4842-3255-2_5 ◽

2017 ◽

pp. 107-133

Author(s):

Rosendo Abellera ◽

Lakshman Bulusu

Keyword(s):

Machine Learning ◽

Use Case
