Kernel Regularized Least Squares: Reducing Misspecification Bias with a Flexible and Interpretable Machine Learning Approach

2014 ◽  
Vol 22 (2) ◽  
pp. 143-168 ◽  
Author(s):  
Jens Hainmueller ◽  
Chad Hazlett

We propose the use of Kernel Regularized Least Squares (KRLS) for social science modeling and inference problems. KRLS borrows from machine learning methods designed to solve regression and classification problems without relying on linearity or additivity assumptions. The method constructs a flexible hypothesis space that uses kernels as radial basis functions and finds the best-fitting surface in this space by minimizing a complexity-penalized least squares problem. We argue that the method is well-suited for social science inquiry because it avoids strong parametric assumptions, yet allows interpretation in ways analogous to generalized linear models while also permitting more complex interpretation to examine nonlinearities, interactions, and heterogeneous effects. We also extend the method in several directions to make it more effective for social inquiry, by (1) deriving estimators for the pointwise marginal effects and their variances, (2) establishing unbiasedness, consistency, and asymptotic normality of the KRLS estimator under fairly general conditions, (3) proposing a simple automated rule for choosing the kernel bandwidth, and (4) providing companion software. We illustrate the use of the method through simulations and empirical examples.
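As an illustrative sketch (not the authors' companion software), the KRLS estimator with a Gaussian kernel reduces to solving a ridge-type linear system in the kernel matrix; the default bandwidth below (σ² set to the number of covariates) is an assumption for the example:

```python
import numpy as np

def krls_fit(X, y, lam=0.5, sigma2=None):
    """KRLS with a Gaussian kernel: choose coefficients c minimizing
    ||y - K c||^2 + lam * c' K c, whose solution is c = (K + lam I)^{-1} y."""
    n, d = X.shape
    if sigma2 is None:
        sigma2 = d  # simple dimensionality-based default (assumption)
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-sq / sigma2)
    c = np.linalg.solve(K + lam * np.eye(n), y)
    return c, sigma2

def krls_predict(Xtrain, c, sigma2, Xnew):
    """Fitted surface f(x) = sum_i c_i * k(x, x_i)."""
    sq = ((Xnew[:, None, :] - Xtrain[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / sigma2) @ c
```

Because the fitted surface is a weighted sum of kernels centered at the observations, pointwise marginal effects (as derived in the paper) follow by differentiating this sum analytically.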

Stats ◽  
2020 ◽  
Vol 3 (3) ◽  
pp. 396-411
Author(s):  
Charles Condevaux

Predicting the outcome of a case from a set of factual data is a common goal in legal knowledge discovery. In practice, this task is often difficult due to the scarcity of labeled datasets. Additionally, processing long documents often leads to sparse data, which adds another layer of complexity. This paper presents a study focused on the French-language decisions of the European Court of Human Rights (ECtHR), for which we build various classification tasks. The first of these tasks is to predict, from the extracted facts, the potential violation of an article of the Convention. A multiclass problem is also created, with the objective of determining whether an article is relevant to plead given some circumstances. We solve these tasks by comparing simple linear models to an attention-based neural network. We also take advantage of a modified partial least squares algorithm, integrated into the aforementioned models, which deals effectively with classification problems and scales to the sparse inputs typical of natural language tasks.
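A plain one-label PLS (NIPALS-style) component extraction — not the authors' modified variant — can be sketched as follows; each component is a direction of maximal covariance with the label, which makes the scores usable as a low-dimensional input to a downstream classifier:

```python
import numpy as np

def pls_components(X, y, n_comp=2):
    """Plain PLS1 via deflation (illustrative sketch, not the paper's variant).
    Returns the (n_samples, n_comp) matrix of component scores."""
    X = X - X.mean(0)          # center predictors (copies, caller's arrays untouched)
    y = y - y.mean()           # center response
    scores = []
    for _ in range(n_comp):
        w = X.T @ y            # covariance-maximizing weight vector
        w /= np.linalg.norm(w)
        t = X @ w              # component score
        p = X.T @ t / (t @ t)  # loading
        X = X - np.outer(t, p) # deflate X
        y = y - t * (t @ y) / (t @ t)  # deflate y
        scores.append(t)
    return np.column_stack(scores)
```

With sparse bag-of-words inputs, the same loop applies unchanged if `X` is stored sparsely and the outer-product deflation is handled carefully; the dense version above is the minimal sketch.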


Author(s):  
Tapio Pahikkala ◽  
Antti Airola ◽  
Thomas Canhao Xu ◽  
Pasi Liljeberg ◽  
Hannu Tenhunen ◽  
...  

The authors introduce a machine learning approach based on a parallel online regularized least-squares learning algorithm for parallel embedded hardware platforms. The system is suitable for use in real-time adaptive systems. Firstly, the system can learn in an online fashion, a property required in real-life applications of embedded machine learning systems. Secondly, to guarantee real-time response on embedded multi-core computer architectures, the learning system is parallelized and able to operate with a limited amount of computational and memory resources. Thirdly, the system can predict several labels simultaneously. The authors evaluate the performance of the algorithm from three different perspectives. Prediction performance is evaluated on a hand-written digit recognition task. Computational speed is measured from 1 to 4 threads on a quad-core platform. As a promising unconventional multi-core architecture, a Network-on-Chip (NoC) platform is also studied: the authors construct a NoC consisting of a 4×4 mesh and implement the machine learning algorithm on this platform with up to 16 threads. It is shown that memory consumption and cache efficiency can be considerably improved by optimizing the cache behavior of the system. The authors' results provide a guideline for designing future embedded multi-core machine learning devices.
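The core of an online regularized least-squares learner can be sketched with a Sherman–Morrison rank-one update, which keeps the per-sample cost quadratic in the number of features and handles several labels at once (an illustrative single-threaded sketch, not the authors' parallel implementation):

```python
import numpy as np

class OnlineRLS:
    """Online regularized least squares for multiple labels.
    Maintains Ainv = (X'X + lam*I)^{-1} via Sherman-Morrison updates
    and B = X'Y, so the ridge weights are always Ainv @ B."""
    def __init__(self, n_features, n_labels, lam=1.0):
        self.Ainv = np.eye(n_features) / lam
        self.B = np.zeros((n_features, n_labels))

    def partial_fit(self, x, y):
        # Rank-one update of the inverse: A <- A + x x'
        Ax = self.Ainv @ x
        self.Ainv -= np.outer(Ax, Ax) / (1.0 + x @ Ax)
        self.B += np.outer(x, y)   # accumulate X'Y for all labels at once

    def predict(self, x):
        return x @ (self.Ainv @ self.B)
```

Because each update touches only the feature-by-feature inverse and the label accumulator, memory use is fixed regardless of how many samples stream through — the property the abstract highlights for resource-limited embedded platforms.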


Author(s):  
Anantvir Singh Romana

Accurate diagnostic detection of disease in a patient is critical: it may alter the subsequent treatment and increase the chances of survival. Machine learning techniques have been instrumental in disease detection and are currently used in various classification problems due to their accurate prediction performance. Different techniques can yield different accuracies, so it is important to use the method best suited to the problem at hand. This research provides a comparative analysis of Support Vector Machine, Naïve Bayes, J48 decision tree, and neural network classifiers on breast cancer and diabetes datasets.
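A minimal version of such a comparison can be sketched with scikit-learn (assumed available); note that scikit-learn's decision tree is CART, used here only as a stand-in for J48/C4.5:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier

X, y = load_breast_cancer(return_X_y=True)
models = {
    "SVM": SVC(),
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),  # CART stand-in for J48
    "Neural Net": MLPClassifier(max_iter=2000, random_state=0),
}
# 5-fold cross-validated accuracy, with feature scaling in-pipeline
scores = {name: cross_val_score(make_pipeline(StandardScaler(), m), X, y, cv=5).mean()
          for name, m in models.items()}
```

Cross-validation rather than a single train/test split is the key design choice here: with small medical datasets, a single split can easily reverse the ranking of two classifiers.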


1995 ◽  
Vol 60 (11) ◽  
pp. 1815-1829 ◽  
Author(s):  
Jaromír Jakeš

The problem of finding a relaxation time spectrum best fitting dynamic moduli data in the least-squares sense is shown to be well-posed and to yield a discrete spectrum, provided the data cannot be fitted exactly, i.e., without any deviation between the data and the calculated values. Properties of the resulting spectrum are discussed. Examples of discrete spectra obtained from simulated and experimental literature data on polymers are given. The problem of smoothing discrete spectra when continuous ones are expected is discussed. A detailed study of an integral transform inversion under the non-negativity constraint is given in the Appendix.
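On a fixed grid of candidate relaxation times, the least-squares fit under the non-negativity constraint can be sketched as a non-negative least-squares problem over Maxwell modes (an illustrative reconstruction assuming SciPy, not the paper's algorithm):

```python
import numpy as np
from scipy.optimize import nnls

def fit_discrete_spectrum(omega, G1, G2, taus):
    """Fit non-negative relaxation strengths g_k on a fixed tau grid to
    dynamic moduli G'(w), G''(w), assuming a sum of Maxwell modes:
        G'(w)  = sum_k g_k (w tau_k)^2 / (1 + (w tau_k)^2)
        G''(w) = sum_k g_k (w tau_k)   / (1 + (w tau_k)^2)"""
    wt = omega[:, None] * taus[None, :]
    A = np.vstack([wt**2 / (1 + wt**2),   # G' kernel rows
                   wt / (1 + wt**2)])     # G'' kernel rows
    b = np.concatenate([G1, G2])
    g, resid = nnls(A, b)                 # non-negative least squares
    return g, resid
```

Consistent with the abstract's point, when the data can be fitted exactly the strengths concentrate on a few grid points (a discrete spectrum), while noisy data leave a nonzero residual.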


2021 ◽  
pp. 1-36
Author(s):  
Henry Prakken ◽  
Rosa Ratsma

This paper proposes a formal top-level model of explaining the outputs of machine-learning-based decision-making applications and evaluates it experimentally with three data sets. The model draws on AI & law research on argumentation with cases, which models how lawyers draw analogies to past cases and discuss their relevant similarities and differences in terms of relevant factors and dimensions in the problem domain. A case-based approach is natural since the input data of machine-learning applications can be seen as cases. While the approach is motivated by legal decision making, it also applies to other kinds of decision making, such as commercial decisions about loan applications or employee hiring, as long as the outcome is binary and the input conforms to this paper’s factor- or dimension format. The model is top-level in that it can be extended with more refined accounts of similarities and differences between cases. It is shown to overcome several limitations of similar argumentation-based explanation models, which only have binary features and do not represent the tendency of features towards particular outcomes. The results of the experimental evaluation studies indicate that the model may be feasible in practice, but that further development and experimentation is needed to confirm its usefulness as an explanation model. Main challenges here are selecting from a large number of possible explanations, reducing the number of features in the explanations and adding more meaningful information to them. It also remains to be investigated how suitable our approach is for explaining non-linear models.
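A toy sketch of the factor-based core idea — explaining a predicted outcome by citing the most similar precedent with that outcome and listing shared and distinguishing factors — might look like this (the data shapes and the similarity rule are illustrative, not the paper's formal model):

```python
def explain(query_factors, precedents, predicted_outcome):
    """Pick the precedent with the predicted outcome that shares the most
    factors with the query case, and report similarities and differences."""
    candidates = [c for c in precedents if c["outcome"] == predicted_outcome]
    best = max(candidates, key=lambda c: len(c["factors"] & query_factors))
    return {
        "precedent": best["name"],
        "shared": sorted(best["factors"] & query_factors),
        "precedent_only": sorted(best["factors"] - query_factors),  # distinctions
        "query_only": sorted(query_factors - best["factors"]),      # distinctions
    }
```

The paper's model goes well beyond this counting rule — e.g., dimensions and factor tendencies toward outcomes — but the output shape (a citing case plus similarities and differences) is the same.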


2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Helder Sebastião ◽  
Pedro Godinho

This study examines the predictability of three major cryptocurrencies—bitcoin, ethereum, and litecoin—and the profitability of trading strategies devised upon machine learning techniques (e.g., linear models, random forests, and support vector machines). The models are validated in a period characterized by unprecedented turmoil and tested in a period of bear markets, allowing the assessment of whether the predictions are good even when the market direction changes between the validation and test periods. The classification and regression methods use attributes from trading and network activity for the period from August 15, 2015 to March 03, 2019, with the test sample beginning on April 13, 2018. For the test period, five out of 18 individual models have success rates of less than 50%. The trading strategies are built on model assembling. The ensemble assuming that five models produce identical signals (Ensemble 5) achieves the best performance for ethereum and litecoin, with annualized Sharpe ratios of 80.17% and 91.35% and annualized returns (after proportional round-trip trading costs of 0.5%) of 9.62% and 5.73%, respectively. These positive results support the claim that machine learning provides robust techniques for exploring the predictability of cryptocurrencies and for devising profitable trading strategies in these markets, even under adverse market conditions.
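A "k models agree" ensemble rule in the spirit of Ensemble 5 can be sketched as follows (the −1/0/+1 signal encoding and tie-handling are assumptions for the example):

```python
import numpy as np

def ensemble_signal(signals, k=5):
    """signals: (n_models, n_days) array of -1/0/+1 daily model signals.
    Go long (+1) when at least k models say +1, short (-1) when at least k
    say -1, stay out (0) otherwise (and on the rare ambiguous day)."""
    longs = (signals == 1).sum(axis=0)
    shorts = (signals == -1).sum(axis=0)
    out = np.zeros(signals.shape[1], dtype=int)
    out[longs >= k] = 1
    out[shorts >= k] = -1
    out[(longs >= k) & (shorts >= k)] = 0  # conflicting agreement: no trade
    return out
```

Requiring agreement trades less often than any single model, which is one way such an ensemble can survive round-trip trading costs that erode the returns of more active strategies.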

