Entrepreneurial Competence: Using Machine Learning to Classify Entrepreneurs

Clariandys Rivera-Kempis; Leobardo Valera; Miguel A. Sastre-Castillo

doi:10.3390/su13158252

Sustainability ◽

10.3390/su13158252 ◽

2021 ◽

Vol 13 (15) ◽

pp. 8252

Author(s):

Clariandys Rivera-Kempis ◽

Leobardo Valera ◽

Miguel A. Sastre-Castillo

Keyword(s):

Machine Learning ◽

Latin American ◽

Principal Component ◽

Data Sets ◽

Linear Discriminant ◽

Machine Learning Approach ◽

Competency Based ◽

Numerous Data ◽

The Individual ◽

Gaussian Regression

Competencies are behaviors that some people master better than others, which makes them more effective in a given situation. Considering that entrepreneurship translates into behaviors, the competency-based approach expresses attributes necessary in the generation of such behaviors with greater precision. By virtue of the dynamic and complicated nature of entrepreneurial phenomena and, especially, of the numerous data sets and variables that accompany the entrepreneur, it has become increasingly difficult to characterize it. In this study, we use predictive analysis from the machine learning approach (unsupervised learning) in order to determine if the individual is an entrepreneur, based on measures of 20 attributes of entrepreneurial competence relative to classification and ranking. We investigated this relationship using a sample of 6649 individuals from the Latin American context and a series of algorithms that include the following: logistic regression, principal component analysis, ranking and classification of data using the Ward method, linear discriminant analysis, and Gaussian regression among others.

Mol2vec: Unsupervised Machine Learning Approach with Chemical Intuition

10.26434/chemrxiv.5513581.v1 ◽

2017 ◽

Author(s):

Sabrina Jaeger ◽

Simone Fulle ◽

Samo Turk

Keyword(s):

Machine Learning ◽

Language Processing ◽

Supervised Machine Learning ◽

Learning Approach ◽

Learning Approaches ◽

Unsupervised Machine Learning ◽

Feature Representations ◽

Machine Learning Approach ◽

The Individual ◽

Vector Representations

Inspired by natural language processing techniques, we here introduce Mol2vec, which is an unsupervised machine learning approach to learn vector representations of molecular substructures. Similarly to the Word2vec models, where vectors of closely related words are in close proximity in the vector space, Mol2vec learns vector representations of molecular substructures that point in similar directions for chemically related substructures. Compounds can finally be encoded as vectors by summing up the vectors of the individual substructures and, for instance, fed into supervised machine learning approaches to predict compound properties. The underlying substructure vector embeddings are obtained by training an unsupervised machine learning approach on a so-called corpus of compounds that consists of all available chemical matter. The resulting Mol2vec model is pre-trained once, yields dense vector representations, and overcomes drawbacks of common compound feature representations such as sparseness and bit collisions. The prediction capabilities are demonstrated on several compound property and bioactivity data sets and compared with results obtained for Morgan fingerprints as a reference compound representation. Mol2vec can be easily combined with ProtVec, which employs the same Word2vec concept on protein sequences, resulting in a proteochemometric approach that is alignment independent and can thus also be easily used for proteins with low sequence similarities.
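
The encoding step described above (a compound vector obtained by summing the vectors of its substructures) can be illustrated with a minimal sketch. The substructure names, the three-dimensional embedding values, and the helper `compound_vector` are hypothetical and chosen only for illustration; they are not taken from the Mol2vec implementation, whose embeddings are learned from a compound corpus. Skipping unknown substructures is likewise an assumption of this sketch.

```python
# Hypothetical 3-dimensional substructure embeddings (illustrative values only).
# Real embeddings would be learned from a corpus and have many more dimensions.
EMBEDDINGS = {
    "sub_a": [0.5, -1.0, 2.0],
    "sub_b": [1.5, 0.0, -1.0],
    "sub_c": [-0.5, 0.5, 0.5],
}


def compound_vector(substructures, embeddings, dim=3):
    """Encode a compound as the element-wise sum of its substructure vectors."""
    total = [0.0] * dim
    for name in substructures:
        vec = embeddings.get(name)
        if vec is None:
            # Assumption of this sketch: unknown substructures are skipped.
            continue
        total = [t + v for t, v in zip(total, vec)]
    return total


# A compound in which "sub_a" occurs twice and "sub_b" once.
vector = compound_vector(["sub_a", "sub_b", "sub_a"], EMBEDDINGS)
print(vector)
```

The resulting fixed-length vector could then serve as the input features of any supervised model, which is the use the abstract mentions.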

A machine learning approach for the factorization of psychometric data with application to the Delis Kaplan Executive Function System

Scientific Reports ◽

10.1038/s41598-021-96342-3 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

J. A. Camilleri ◽

S. B. Eickhoff ◽

S. Weis ◽

J. Chen ◽

J. Amunts ◽

...

Keyword(s):

Machine Learning ◽

Executive Function ◽

Executive Functioning ◽

Factor Model ◽

Five Factor Model ◽

Principal Component ◽

Function System ◽

Learning Approach ◽

Psychometric Data ◽

Machine Learning Approach

While a replicability crisis has shaken psychological sciences, the replicability of multivariate approaches for psychometric data factorization has received little attention. In particular, Exploratory Factor Analysis (EFA) is frequently promoted as the gold standard in psychological sciences. However, the application of EFA to executive functioning, a core concept in psychology and cognitive neuroscience, has led to divergent conceptual models. This heterogeneity severely limits the generalizability and replicability of findings. To tackle this issue, in this study, we propose to capitalize on a machine learning approach, OPNMF (Orthonormal Projective Non-Negative Factorization), and leverage internal cross-validation to promote generalizability to an independent dataset. We examined its application on the scores of 334 adults on the Delis–Kaplan Executive Function System (D-KEFS), while comparing to standard EFA and Principal Component Analysis (PCA). We further evaluated the replicability of the derived factorization across specific gender and age subsamples. Overall, OPNMF and PCA both converge towards a two-factor model as the best data-fit model. The derived factorization suggests a division between low-level and high-level executive functioning measures, a model further supported in subsamples. In contrast, EFA highlighted a five-factor model which reflects the segregation of the D-KEFS battery into its main tasks while still clustering higher-level tasks together. However, this model was poorly supported in the subsamples. Thus, the parsimonious two-factor model revealed by OPNMF encompasses the more complex factorization yielded by EFA while enjoying higher generalizability. Hence, OPNMF provides a conceptually meaningful, technically robust, and generalizable factorization for psychometric tools.

Improving Reliability Estimation for Individual Numeric Predictions: A Machine Learning Approach

INFORMS Journal on Computing ◽

10.1287/ijoc.2020.1019 ◽

2021 ◽

Author(s):

Gediminas Adomavicius ◽

Yaqiong Wang

Keyword(s):

Machine Learning ◽

General Purpose ◽

Reliability Estimation ◽

Machine Learning Techniques ◽

Data Sets ◽

Real World Data ◽

Learning Techniques ◽

Reliability Indicator ◽

Machine Learning Approach ◽

Prediction Reliability

Numerical predictive modeling is widely used in different application domains. Although many modeling techniques have been proposed, and a number of different aggregate accuracy metrics exist for evaluating the overall performance of predictive models, other important aspects, such as the reliability (or confidence and uncertainty) of individual predictions, have been underexplored. We propose to use estimated absolute prediction error as the indicator of individual prediction reliability, which has the benefits of being intuitive and providing highly interpretable information to decision makers, as well as allowing for more precise evaluation of reliability estimation quality. As importantly, the proposed reliability indicator allows the reframing of reliability estimation itself as a canonical numeric prediction problem, which makes the proposed approach general-purpose (i.e., it can work in conjunction with any outcome prediction model), alleviates the need for distributional assumptions, and enables the use of advanced, state-of-the-art machine learning techniques to learn individual prediction reliability patterns directly from data. Extensive experimental results on multiple real-world data sets show that the proposed machine learning-based approach can significantly improve individual prediction reliability estimation as compared with a number of baselines from prior work, especially in more complex predictive scenarios.
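
The reframing described above, in which the absolute prediction error of an outcome model becomes the target of a second numeric prediction model, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the mean-value outcome model, the nearest-neighbor reliability model, the toy data, and the helper names `fit_mean_model` and `knn_regressor` are all hypothetical, and the errors are computed on the training data only to keep the sketch short.

```python
def fit_mean_model(ys):
    """Hypothetical outcome model: always predicts the mean training target."""
    mean = sum(ys) / len(ys)
    return lambda x: mean


def knn_regressor(xs, targets, k=2):
    """Hypothetical reliability model: average target of the k nearest inputs."""
    def predict(x):
        nearest = sorted(zip(xs, targets), key=lambda p: abs(p[0] - x))[:k]
        return sum(t for _, t in nearest) / len(nearest)
    return predict


# Toy one-dimensional data (illustrative values only).
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [2.0, 4.0, 6.0, 8.0]

# Step 1: fit any outcome prediction model.
outcome_model = fit_mean_model(train_y)

# Step 2: compute the absolute prediction error of that model per instance.
abs_errors = [abs(y - outcome_model(x)) for x, y in zip(train_x, train_y)]

# Step 3: treat the absolute errors as the target of a second numeric model.
reliability_model = knn_regressor(train_x, abs_errors, k=2)

# Estimated absolute error (lower means more reliable) for a new input.
estimated_error = reliability_model(1.2)
print(abs_errors, estimated_error)
```

Because the second model only needs inputs and observed absolute errors, it can in principle sit on top of any outcome model, which is the general-purpose property the abstract emphasizes.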

A Machine Learning Approach to Coreference Resolution of Noun Phrases

Computational Linguistics ◽

10.1162/089120101753342653 ◽

2001 ◽

Vol 27 (4) ◽

pp. 521-544 ◽

Cited By ~ 287

Author(s):

Wee Meng Soon ◽

Hwee Tou Ng ◽

Daniel Chung Yong Lim

Keyword(s):

Machine Learning ◽

Noun Phrase ◽

State Of The Art ◽

Noun Phrases ◽

Learning Approach ◽

Data Sets ◽

Coreference Resolution ◽

Machine Learning Approach

In this paper, we present a learning approach to coreference resolution of noun phrases in unrestricted text. The approach learns from a small, annotated corpus and the task includes resolving not just a certain type of noun phrase (e.g., pronouns) but rather general noun phrases. It also does not restrict the entity types of the noun phrases; that is, coreference is assigned whether they are of “organization,” “person,” or other types. We evaluate our approach on common data sets (namely, the MUC-6 and MUC-7 coreference corpora) and obtain encouraging results, indicating that on the general noun phrase coreference task, the learning approach holds promise and achieves accuracy comparable to that of nonlearning approaches. Our system is the first learning-based system that offers performance comparable to that of state-of-the-art nonlearning systems on these data sets.

A machine learning approach to medical data identification through principal component analysis

Big Data III: Learning, Analytics, and Applications ◽

10.1117/12.2586038 ◽

2021 ◽

Author(s):

Lorenzo E. Jaques ◽

Arthur C. Depoian ◽

Dong Xie ◽

Colleen P. Bailey ◽

Parthasarathy Guturu

Keyword(s):

Machine Learning ◽

Principal Component Analysis ◽

Principal Component ◽

Component Analysis ◽

Medical Data ◽

Learning Approach ◽

Machine Learning Approach

A Study on the Psychological Analysis System Using Machine Learning

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i3.33.18591 ◽

2018 ◽

Vol 7 (3.33) ◽

pp. 128

Author(s):

Ki Young Lee ◽

Kyu Ho Kim ◽

Jeong Jin Kang ◽

Sung Jai Choi ◽

Yong Soon Im ◽

...

Keyword(s):

Machine Learning ◽

Nearest Neighbor ◽

Smart Phone ◽

Principal Component ◽

Digital Data ◽

Expression Recognition ◽

K Nearest Neighbor ◽

Linear Discriminant ◽

Psychological Analysis ◽

Analysis System

Real-time facial expression recognition and analysis technology has recently been drawing attention in the areas of computer vision, computer graphics, and HCI. Recognition of a user’s emotion on the basis of video and voice is drawing particular interest. The technology may help managers of households or hospitals. In the present study, video and voice were converted into digital data through MATLAB by using the PCA (Principal Component Analysis), LDA (Linear Discriminant Analysis), and KNN (K Nearest Neighbor) algorithms to analyze emotions through machine learning. The manager of the psychological analysis counseling system may understand a user’s emotion in a smart phone environment. The system presented in this study may help the manager to have a smooth conversation or develop a smooth relationship with a user on the basis of the provided psychological analysis results.

Contactless Li-Ion Battery Voltage Detection by Using Walabot and Machine Learning

Volume 9: 15th IEEE/ASME International Conference on Mechatronic and Embedded Systems and Applications ◽

10.1115/detc2019-97668 ◽

2019 ◽

Author(s):

Yanan Wang ◽

Haoyu Niu ◽

Tiebiao Zhao ◽

Xiaozhong Liao ◽

Lei Dong ◽

...

Keyword(s):

Machine Learning ◽

Gradient Descent ◽

Learning Algorithm ◽

Three Dimensional ◽

Lithium Ion ◽

Principal Component ◽

Stochastic Gradient Descent ◽

Li Ion Battery ◽

Linear Discriminant ◽

Li Ion

This paper proposes a contactless voltage classification method for lithium-ion batteries (LIBs). With a three-dimensional radio-frequency-based sensor called Walabot, voltage data of LIBs can be collected in a contactless way. Three machine learning algorithms, namely principal component analysis (PCA), linear discriminant analysis (LDA), and stochastic gradient descent (SGD) classifiers, are then employed for data processing. Experiments and comparisons have been conducted to verify the proposed method. The colormaps of the results and the prediction accuracy show that LDA may be most suitable for LIB voltage classification.

A Machine Learning Approach for Efficient Selection of Enzyme Concentrations and Its Application for Flux Optimization

Catalysts ◽

10.3390/catal10030291 ◽

2020 ◽

Vol 10 (3) ◽

pp. 291 ◽

Cited By ~ 1

Author(s):

Anamya Ajjolli Nagaraja ◽

Philippe Charton ◽

Xavier F. Cadet ◽

Nicolas Fontaine ◽

Mathieu Delsaut ◽

...

Keyword(s):

Machine Learning ◽

Glass Ceiling ◽

Principal Component ◽

Enzyme Concentration ◽

Learning Approach ◽

Neural Network Approach ◽

Free System ◽

Machine Learning Approach ◽

Selection Of

The metabolic engineering of pathways has been used extensively to produce molecules of interest on an industrial scale. Methods like gene regulation or substrate channeling helped to improve the desired product yield. Cell-free systems are used to overcome the weaknesses of engineered strains. One of the challenges in a cell-free system is selecting the optimized enzyme concentration for optimal yield. Here, a machine learning approach is used to select the enzyme concentration for the upper part of glycolysis. The artificial neural network approach (ANN) is known to be inefficient in extrapolating predictions outside the box: high predicted values will bump into a sort of “glass ceiling”. In order to explore this “glass ceiling” space, we developed a new methodology named glass ceiling ANN (GC-ANN). Principal component analysis (PCA) and data classification methods are used to derive a rule for a high flux, and ANN to predict the flux through the pathway using the input data of 121 balances of four enzymes in the upper part of glycolysis. The outcomes of this study are i. in silico selection of optimum enzyme concentrations for a maximum flux through the pathway and ii. experimental in vitro validation of the “out-of-the-box” fluxes predicted using this new approach. Surprisingly, flux improvements of up to 63% were obtained. Gratifyingly, these improvements are coupled with a cost decrease of up to 25% for the assay.

Comparative Machine Learning Approach in Dementia Patient Classification using Principal Component Analysis

Proceedings of the 12th International Conference on Agents and Artificial Intelligence ◽

10.5220/0009096907800784 ◽

2020 ◽

Cited By ~ 1

Author(s):

Gopi Battineni ◽

Nalini Chintalapudi ◽

Francesco Amenta

Keyword(s):

Machine Learning ◽

Principal Component Analysis ◽

Principal Component ◽

Component Analysis ◽

Dementia Patient ◽

Learning Approach ◽

Patient Classification ◽

Machine Learning Approach

Performance Improvement in the Pattern Classification of Nominal Data Sets Applying Multiple Correspondence Analysis

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.670-671.1482 ◽

2014 ◽

Vol 670-671 ◽

pp. 1482-1487

Author(s):

Rodrigo Clemente Thom de Souza ◽

Maria Teresinha Arns Steiner ◽

Leandro dos Santos Coelho

Keyword(s):

Correspondence Analysis ◽

Multiple Correspondence Analysis ◽

Principal Component ◽

Knowledge Discovery In Databases ◽

Data Sets ◽

Data Types ◽

Nominal Data ◽

Linear Discriminant ◽

Whole Process ◽

Geometric Data Analysis

Classification is a supervised learning problem used to discriminate data instances into different classes. The solution to this problem is obtained through algorithms (classifiers) that look for patterns of relationships between classes in known cases, using these relationships to classify unknown cases. The performance of the classifiers depends substantially on the data types. In order to give proper treatment to nominal data, this paper shows that applying prior transformations can substantially improve the performance of classifiers, bringing significant benefits to the result of the whole process of Knowledge Discovery in Databases (KDD). This paper uses three different data sets with nominal data and two well-known classifiers: Linear Discriminant Analysis (LDA) and Naïve Bayes (NB). For data transformation, the paper applies an approach called Geometric Data Analysis (GDA). The GDA techniques compared in this paper are the traditional Principal Component Analysis (PCA) and the underexplored Multiple Correspondence Analysis (MCA). The results confirm the capability of the GDA transformation to improve classification accuracy and attest to the superiority of MCA over its precursor, PCA, when applied to nominal data.
