Machine Learning Classification Techniques for Detecting the Impact of Human Resources Outcomes on Commercial Banks Performance

Applied Computational Intelligence and Soft Computing ◽

10.1155/2021/7747907 ◽

2021 ◽

Vol 2021 ◽

pp. 1-16

Author(s):

Sulaiman O. Atiku ◽

Ibidun C. Obagbuwa

Keyword(s):

Machine Learning ◽

Human Resources ◽

Organizational Performance ◽

Commercial Banks ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Bank Performance ◽

Machine Learning Classification ◽

Employee Attitude ◽

The Impact

The banking industry is a highly competitive and dynamic market in which organizational performance is paramount. Different indicators can be used to measure organizational performance and sustain competitive advantage in a global marketplace. These performance indicators are usually achieved through human resources, which are the core element in sustaining the organization in a highly competitive marketplace. It is therefore essential to manage human resources strategically and to align human resource strategies with organizational strategies. We adopted a survey research design using a quantitative approach, distributing a structured questionnaire to 305 respondents utilizing efficient sampling techniques. Predicting bank performance is crucial, since poor performance can result in serious problems for the bank and society, such as bankruptcy and a negative influence on the country’s economy. Most researchers in the past used traditional statistics to build prediction models; however, owing to the efficiency of machine learning algorithms, many researchers now apply them in various fields, including performance prediction systems. In this study, eight machine learning algorithms were used to build models that predict the prospective performance of commercial banks in Nigeria from human resources outcomes (employee skills, attitude, and behavior), using Python with machine learning libraries and packages. The results show that human resources outcomes are crucial in achieving organizational performance, and the models built from the eight classifiers predict the bank performance as superior, with accuracies of 74–81%. Feature importance was computed with Scikit-learn to show the relative contribution of each feature to the prediction, and employee attitude is rated far above the other features. Nigeria’s banking industry should focus more on employee attitude so that performance can improve from the current superior class to the outstanding class.


Forecasting US movies box office performances in Turkey using machine learning algorithms

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-189120 ◽

2020 ◽

Vol 39 (5) ◽

pp. 6579-6590

Author(s):

Sandy Çağlıyor ◽

Başar Öztayşi ◽

Selime Sezgin

Keyword(s):

Machine Learning ◽

Global Economy ◽

Learning Algorithms ◽

Forecast Model ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

High Stakes ◽

Box Office ◽

Industry Forecast ◽

The Impact

The motion picture industry is one of the largest industries worldwide and has significant importance in the global economy. Considering the high stakes and high risks in the industry, forecast models and decision support systems are gaining importance. Several attempts have been made to estimate the theatrical performance of a movie before or at the early stages of its release. Nevertheless, these models are mostly used for predicting domestic performance, and the industry still struggles to predict box office performance in overseas markets. In this study, the aim is to design a forecast model using different machine learning algorithms to estimate the theatrical success of US movies in Turkey. A dataset of 1559 movies is constructed from various sources. First, the independent variables are grouped as pre-release, distributor type, and international distribution, based on their characteristics. The number of attendances is discretized into three classes. Four popular machine learning algorithms (artificial neural networks, decision tree regression, gradient boosting trees, and random forest) are employed, and the impact of each variable group is observed by comparing the performance of the models. Then the number of target classes is increased to five and eight, and the results are compared with previously developed models in the literature.
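
The discretization step mentioned in this abstract can be sketched as follows. This is a minimal illustration that assumes equal-frequency (tercile) binning; the abstract does not state which cut points the authors used, and the attendance figures and function names below are hypothetical.

```python
def tercile_cut_points(values):
    """Return two cut points that split the sorted values into three equal-sized groups."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered[n // 3], ordered[(2 * n) // 3]


def discretize(value, cut_points):
    """Map a value to class 0 (low), 1 (medium), or 2 (high)."""
    low_cut, high_cut = cut_points
    if value < low_cut:
        return 0
    if value < high_cut:
        return 1
    return 2


# Hypothetical attendance figures, not taken from the paper's dataset.
attendance = [1200, 560000, 34000, 8900, 150000, 2300000, 47000, 610, 98000]
cuts = tercile_cut_points(attendance)
classes = [discretize(a, cuts) for a in attendance]
print(cuts, classes)
```

Equal-frequency bins keep the three classes balanced, which is one common reason to prefer them over equal-width bins when attendance is heavily skewed. Increasing the number of target classes to five or eight would only require more cut points.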


Identification of Target Chicken Populations by Machine Learning Models Using the Minimum Number of SNPs

Animals ◽

10.3390/ani11010241 ◽

2021 ◽

Vol 11 (1) ◽

pp. 241

Author(s):

Dongwon Seo ◽

Sunghyun Cho ◽

Prabuddha Manjula ◽

Nuri Choi ◽

Young-Kuk Kim ◽

...

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Fixation Index ◽

Machine Learning Classification ◽

Genetic Components ◽

Marker Combination ◽

A Genome ◽

Minimum Number ◽

Native Chickens

A marker combination capable of classifying a specific chicken population could improve commercial value by increasing consumer confidence with respect to the origin of the population. This would facilitate the protection of native genetic resources in the market of each country. In this study, a total of 283 samples from 20 lines, which consisted of Korean native chickens, commercial native chickens, and commercial broilers with a layer population, were analyzed to determine the optimal marker combination comprising the minimum number of markers, using a 600 k high-density single nucleotide polymorphism (SNP) array. Machine learning algorithms, a genome-wide association study (GWAS), linkage disequilibrium (LD) analysis, and principal component analysis (PCA) were used to distinguish a target (case) group for comparison with control chicken groups. In the process of marker selection, a total of 47,303 SNPs were used for classifying chicken populations; 96 LD-pruned SNPs (50 SNPs per LD block) served as the best marker combination for target chicken classification. Moreover, 36, 44, and 8 SNPs were selected as the minimum numbers of markers by the AdaBoost (AB), Random Forest (RF), and Decision Tree (DT) machine learning classification models, which had accuracy rates of 99.6%, 98.0%, and 97.9%, respectively. The selected marker combinations increased the genetic distance and fixation index (Fst) values between the case and control groups, and they reduced the number of genetic components required, confirming that efficient classification of the groups was possible using a small number of marker sets. In a verification study including additional chicken breeds and samples (12 lines and 182 samples), the accuracy did not significantly change, and the target chicken group could be clearly distinguished from the other populations. The GWAS, PCA, and machine learning algorithms used in this study can be applied efficiently to determine the optimal marker combination, with the minimum number of markers, that can distinguish the target population among a large number of SNP markers.
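
To make the fixation index mentioned above concrete, the sketch below computes Fst for a single biallelic SNP from the allele frequencies in a case group and a control group, using the heterozygosity-based definition Fst = (H_T - H_S) / H_T. It assumes two equally weighted populations; the abstract does not say which Fst estimator the authors used, and the frequencies and function names are hypothetical.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) for a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_two_populations(p_case, p_control):
    """Fst = (H_T - H_S) / H_T for two equally weighted populations (an assumption)."""
    p_mean = (p_case + p_control) / 2.0
    h_total = expected_heterozygosity(p_mean)
    if h_total == 0.0:
        # The allele is fixed in both groups, so there is no variation to partition.
        return 0.0
    h_within = (expected_heterozygosity(p_case)
                + expected_heterozygosity(p_control)) / 2.0
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies for one SNP in the case and control groups.
print(fst_two_populations(0.9, 0.1))  # strongly differentiated marker
print(fst_two_populations(0.3, 0.3))  # identical frequencies, no differentiation
```

A marker whose allele frequencies differ sharply between groups yields a high Fst, which is consistent with the abstract's observation that the selected marker combinations increased Fst between case and control groups.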


Analyze the impact of the epidemic on New York taxis by machine learning algorithms and recommendations for optimal prediction algorithms

10.1145/3475851.3475861 ◽

2021 ◽

Author(s):

Zheng Liu ◽

Xinjing Xia ◽

Haipeng Zhang ◽

Zihui Xie

Keyword(s):

Machine Learning ◽

New York ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Optimal Prediction ◽

Prediction Algorithms ◽

The Impact


An analytical survey on the role of machine learning algorithms in case of intrusion detection

ACCENTS Transactions on Information Security ◽

10.19101/tis.2020.517002 ◽

2020 ◽

Vol 5 (19) ◽

pp. 32-35

Author(s):

Anand Vijay ◽

Kailash Patidar ◽

Manoj Yadav ◽

Rishi Kushwah

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Handling Mechanism ◽

The Impact

This paper presents and discusses an analytical survey of the role of machine learning algorithms in intrusion detection. It covers the analytical aspects of developing an efficient intrusion detection system (IDS). Related work on the development of such systems is presented in terms of computational methods: data mining, artificial intelligence, and machine learning. These methods are discussed along with the attack parameters and attack types. The paper also elaborates on the impact of different attacks and handling mechanisms, based on previous papers.


Teleconsultations between Patients and Healthcare Professionals in Primary Care in Catalonia: the Evaluation of Text Classification Algorithms Using Machine Learning

10.20944/preprints201912.0220.v1 ◽

2019 ◽

Author(s):

Francesc López Seguí ◽

Ricardo Ander Egg Aguilar ◽

Gabriel de Maeztu ◽

Anna García-Altés ◽

Francesc García Cuyàs ◽

...

Keyword(s):

Machine Learning ◽

Primary Care ◽

Text Classification ◽

Learning Strategy ◽

Care Service ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Face To Face ◽

Classification Tool ◽

The Impact

Background: the primary care service in Catalonia has operated an asynchronous teleconsulting service between GPs and patients since 2015 (eConsulta), which has generated some 500,000 messages. New developments in big data analysis tools, particularly those involving natural language, can be used to accurately and systematically evaluate the impact of the service. Objective: the study was intended to examine the predictive potential of eConsulta messages through different combinations of vector representation of text and machine learning algorithms and to evaluate their performance. Methodology: 20 machine learning algorithms (based on 5 types of algorithms and 4 text representation techniques) were trained using a sample of 3,559 messages (169,102 words) corresponding to 2,268 teleconsultations (1.57 messages per teleconsultation) in order to predict the three variables of interest (avoiding the need for a face-to-face visit, increased demand, and type of use of the teleconsultation). The performance of the various combinations was measured in terms of precision, sensitivity, F-value, and the ROC curve. Results: the best-trained algorithms are generally effective, proving more robust when approximating the two binary variables "avoiding the need for a face-to-face visit" and "increased demand" (precision = 0.98 and 0.97, respectively) than the variable "type of query" (precision = 0.48). Conclusion: to the best of our knowledge, this study is the first to investigate a machine learning strategy for text classification using primary care teleconsultation datasets. The study illustrates the possible capacities of text analysis using artificial intelligence. The development of a robust text classification tool could be feasible by validating it with more data, making it potentially more useful for decision support for health professionals.
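
The precision, sensitivity, and F-value reported in this abstract follow standard definitions, sketched below for a binary variable. The labels are hypothetical and the function name is illustrative; this is not the authors' evaluation code.

```python
def binary_metrics(y_true, y_pred):
    """Return (precision, sensitivity, F1) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # also called recall
    denominator = precision + sensitivity
    f1 = 2.0 * precision * sensitivity / denominator if denominator else 0.0
    return precision, sensitivity, f1


# Hypothetical labels: 1 = "face-to-face visit avoided", 0 = otherwise.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
precision, sensitivity, f1 = binary_metrics(y_true, y_pred)
print(precision, sensitivity, f1)
```

For a multi-class variable such as "type of query", these quantities are typically computed per class and then averaged, which is one reason scores can be much lower than for the binary variables.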


PSIX-15 Assessment of machine learning algorithms for prediction of Aleutian disease in American mink

Journal of Animal Science ◽

10.1093/jas/skab235.484 ◽

2021 ◽

Vol 99 (Supplement_3) ◽

pp. 264-265

Author(s):

Duy Ngoc Do ◽

Guoyu Hu ◽

Younes Miar

Keyword(s):

Machine Learning ◽

Random Forest ◽

Linear Models ◽

American Mink ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Training Data ◽

Enzyme Linked Immunosorbent Assay ◽

Linear Discriminant ◽

Machine Learning Classification

American mink (Neovison vison) is the major source of fur for the fur industry worldwide, and Aleutian disease (AD) causes severe financial losses to the mink industry. Different methods have been used to diagnose AD in mink, but a combination of several methods may be the most appropriate approach for selecting AD-resilient mink. The iodine agglutination test (IAT) and counterimmunoelectrophoresis (CIEP) are commonly employed in a test-and-remove strategy, while enzyme-linked immunosorbent assay (ELISA) and packed-cell volume (PCV) methods are complementary. However, using multiple methods is expensive and therefore hinders the correct use of AD tests in selection. This research assessed AD classification based on machine learning algorithms. A total of 1,830 individuals were tested for AD using these tests on an AD-positive mink farm (Canadian Centre for Fur Animal Research, NS, Canada). The accuracy of classifying CIEP results was evaluated from sex information and IAT, ELISA, and PCV test results, using seven machine learning classification algorithms (Random Forest, Artificial Neural Networks, C50Tree, Naive Bayes, Generalized Linear Models, Boost, and Linear Discriminant Analysis) implemented with the Caret package in R. The accuracy of prediction varied among the methods. Overall, Random Forest was the best-performing algorithm for the current dataset, with an accuracy of 0.89 in the training data and 0.94 in the testing data. Our work demonstrated the utility and relative ease of using machine learning algorithms to assess the CIEP information, consequently reducing the cost of AD tests. However, further work requires including production and reproduction information in the models and extending phenotypic collection to increase the accuracy of the current methods.


Comparison of Machine Learning Algorithms in the Interpolation and Extrapolation of Flame Describing Functions

Volume 4B: Combustion, Fuels, and Emissions ◽

10.1115/gt2019-91319 ◽

2019 ◽

Author(s):

Michael McCartney ◽

Matthias Haeringer ◽

Wolfgang Polifke

Keyword(s):

Machine Learning ◽

Gaussian Processes ◽

Spline Interpolation ◽

Learning Algorithms ◽

Predictive Performance ◽

Machine Learning Algorithms ◽

Test Time ◽

Minimal Amount ◽

Data Points ◽

The Impact

This paper examines and compares the performance of commonly used machine learning algorithms in interpolating and extrapolating FDFs, based on experimental and simulation data. Algorithm performance is evaluated by interpolating and extrapolating FDFs, and the impact of errors on the limit cycle amplitudes is then evaluated using the xFDF framework. The best algorithms in interpolation and extrapolation were found to be the widely used cubic spline interpolation and the Gaussian Processes regressor. The data itself was found to be an important factor in defining the predictive performance of a model; therefore, a method of optimally selecting data points at test time using Gaussian Processes was demonstrated. The aim is to allow a minimal number of data points to be collected while still providing enough information to model the FDF accurately. The extrapolation performance was shown to decay very quickly with distance from the domain, so emphasis should be put on selecting measurement points that expand the covered domain. Gaussian Processes also give an indication of confidence in their predictions and are used to carry out uncertainty quantification in order to understand model sensitivities. This was demonstrated through application to the xFDF framework.


The impact of Negative to Positive Training Dataset Ratio on Atrial Fibrillation Classification Machine Learning Algorithms Performance

Journal of Physics Conference Series ◽

10.1088/1742-6596/1500/1/012131 ◽

2020 ◽

Vol 1500 ◽

pp. 012131

Author(s):

Firdaus ◽

Andre Herviant Juliano ◽

Naufal Rachmatullah ◽

Sarifah Putri Rafflesia ◽

Dinna Yunika Hardiyanti ◽

...

Keyword(s):

Machine Learning ◽

Atrial Fibrillation ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Training Dataset ◽

The Impact


Performance of three machine learning algorithms for predicting soil organic carbon in German agricultural soil

10.5194/soil-2021-107 ◽

2021 ◽

Author(s):

Ali Sakhaee ◽

Anika Gebauer ◽

Mareike Ließ ◽

Axel Don

Keyword(s):

Machine Learning ◽

Organic Carbon ◽

Soil Organic Carbon ◽

Agricultural Soil ◽

Learning Algorithms ◽

Model Performance ◽

Machine Learning Algorithms ◽

Support Vector ◽

Organic Soils ◽

The Impact

Soil organic carbon (SOC), as the largest terrestrial carbon pool, has the potential to influence climate change and mitigation, and consequently SOC monitoring is important in the frameworks of different international treaties. There is therefore a need for high resolution SOC maps. Machine learning (ML) offers new opportunities to produce them due to its capability for data mining of large datasets. The aim of this study, therefore, was to test three commonly used algorithms in digital soil mapping – random forest (RF), boosted regression trees (BRT) and support vector machine for regression (SVR) – on the first German Agricultural Soil Inventory to model agricultural topsoil SOC content. Nested cross-validation was implemented for model evaluation and parameter tuning. Moreover, grid search and a differential evolution algorithm were applied to ensure that each algorithm was tuned and optimised suitably. The SOC content of the German Agricultural Soil Inventory was highly variable, ranging from 4 g kg−1 to 480 g kg−1. However, only 4 % of all soils contained more than 87 g kg−1 SOC and were considered organic or degraded organic soils. The results show that SVR provided the best performance, with an RMSE of 32 g kg−1, when the algorithms were trained on the full dataset. However, the average RMSE of all algorithms decreased by 34 % when mineral and organic soils were modeled separately, with the best result from SVR with an RMSE of 21 g kg−1. Model performance is often limited by the size and quality of the available soil dataset for calibration and validation. Therefore, the impact of enlarging the training data was tested by including 1223 data points from the European Land Use/Land Cover Area Frame Survey for agricultural sites in Germany. The model performance improved by at most 1 % for mineral soils and 2 % for organic soils. Despite the capability of machine learning algorithms in general, and SVR in particular, in modelling SOC on a national scale, the study showed that the most important step for improving model performance was the separate modelling of mineral and organic soils.
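
The effect of modelling mineral and organic soils separately can be illustrated with a small RMSE calculation. The sketch below is not the authors' pipeline: a constant mean predictor stands in for RF, BRT, and SVR, the SOC values are hypothetical, and only the 87 g kg−1 threshold comes from the abstract.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mean(values):
    return sum(values) / len(values)


ORGANIC_THRESHOLD = 87.0  # g/kg, the cut-off quoted in the abstract

# Hypothetical SOC contents in g/kg, not taken from the soil inventory.
soc = [10.0, 20.0, 30.0, 200.0, 300.0]

# Pooled baseline: one constant prediction for every sample.
pooled_rmse = rmse(soc, [mean(soc)] * len(soc))

# Separate baselines: one constant prediction per soil group.
mineral = [v for v in soc if v <= ORGANIC_THRESHOLD]
organic = [v for v in soc if v > ORGANIC_THRESHOLD]
mineral_rmse = rmse(mineral, [mean(mineral)] * len(mineral))
organic_rmse = rmse(organic, [mean(organic)] * len(organic))

print(pooled_rmse, mineral_rmse, organic_rmse)
```

Because a few organic soils have very large SOC values, a single pooled model pays a large squared-error penalty on both groups; splitting the data lets each model fit a much narrower range.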


The Impact of Selecting a Validation Method in Machine Learning on Predicting Basketball Game Outcomes

Symmetry ◽

10.3390/sym12030431 ◽

2020 ◽

Vol 12 (3) ◽

pp. 431 ◽

Cited By ~ 1

Author(s):

Tomislav Horvat ◽

Ladislav Havaš ◽

Dunja Srpak

Keyword(s):

Machine Learning ◽

Cross Validation ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Test Validation ◽

Sporting Events ◽

Validation Method ◽

Validation Methods ◽

Independent Events ◽

The Impact

Interest in sports predictions, as well as the public availability of large amounts of structured and unstructured data, is increasing every day. As sporting events are not completely independent events but are characterized by the influence of the human factor, adequate selection of the analysis process is very important. In this paper, seven different classification machine learning algorithms are used and validated with two validation methods: Train&Test and cross-validation. The validation methods were analyzed and critically reviewed, and the obtained results were analyzed and compared. Among the algorithms used, the best average prediction results were obtained with the nearest neighbors algorithm and the worst with decision trees. The cross-validation method obtained better results than the Train&Test validation method. The prediction results of the Train&Test validation method using disjoint datasets and up-to-date data were also compared; better results were obtained with up-to-date data. In addition, directions for future research are explained.
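
The two validation methods compared in this abstract differ in how samples are assigned to training and test sets. The sketch below generates the index splits for each, assuming samples are already ordered (for example, games by date) and that no shuffling is applied; the function names and sizes are hypothetical and do not reproduce the authors' setup.

```python
def train_test_split_indices(n_samples, test_fraction):
    """Single holdout split: the last test_fraction of the samples form the test set."""
    n_test = int(round(n_samples * test_fraction))
    split = n_samples - n_test
    indices = list(range(n_samples))
    return indices[:split], indices[split:]


def k_fold_indices(n_samples, k):
    """k-fold cross-validation: each sample lands in the test set exactly once."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds


train_idx, test_idx = train_test_split_indices(10, 0.3)
folds = k_fold_indices(10, 3)
print(train_idx, test_idx)
for fold_train, fold_test in folds:
    print(fold_train, fold_test)
```

With a single holdout split, the reported score depends on one particular test set, whereas cross-validation averages over k test sets. Note that unshuffled k-fold lets a model train on games played after those it is tested on, which matters for time-ordered sports data.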
