Regression-Based Methods for Finding Coupled Patterns

Michael K. Tippett; Timothy DelSole; Simon J. Mason; Anthony G. Barnston

doi:10.1175/2008jcli2150.1

Journal of Climate ◽

10.1175/2008jcli2150.1 ◽

2008 ◽

Vol 21 (17) ◽

pp. 4384-4398 ◽

Cited By ~ 32

Author(s):

Michael K. Tippett ◽

Timothy DelSole ◽

Simon J. Mason ◽

Anthony G. Barnston

Keyword(s):

Least Squares ◽

Linear Prediction ◽

Prediction Models ◽

Principal Component Regression ◽

Principal Component ◽

Multivariate Statistical ◽

Maximum Covariance Analysis ◽

Least Squares Estimate ◽

Common Framework ◽

Value Decomposition

Abstract There are a variety of multivariate statistical methods for analyzing the relations between two datasets. Two commonly used methods are canonical correlation analysis (CCA) and maximum covariance analysis (MCA), which find the projections of the data onto coupled patterns with maximum correlation and covariance, respectively. These projections are often used in linear prediction models. Redundancy analysis and principal predictor analysis construct projections that maximize the explained variance and the sum of squared correlations of regression models. This paper shows that the above pattern methods are equivalent to different diagonalizations of the regression between the two datasets. The different diagonalizations are computed using the singular value decomposition of the regression matrix developed using data that are suitably transformed for each method. This common framework for the pattern methods permits easy comparison of their properties. Principal component regression is shown to be a special case of CCA-based regression. A commonly used linear prediction model constructed from MCA patterns does not give a least squares estimate since correlations among MCA predictors are neglected. A variation, denoted least squares estimate (LSE)-MCA, is suggested that uses the same patterns but minimizes squared error. Since the different pattern methods correspond to diagonalizations of the same regression matrix, they all produce the same regression model when a complete set of patterns is used. Different prediction models are obtained when an incomplete set of patterns is used, with each method optimizing different properties of the regression. Some key points are illustrated in two idealized examples, and the methods are applied to statistical downscaling of rainfall over the northeast of Brazil.
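The claim that the pattern methods are different diagonalizations of one regression can be checked numerically. The sketch below is illustrative only and is not the authors' code. It assumes synthetic Gaussian data, takes CCA to correspond to whitening both datasets, and takes redundancy analysis to correspond to whitening only the predictors (an assumption about the transformation, which the abstract does not spell out). The function names `inv_sqrt` and `pattern_regression` are hypothetical.

```python
import numpy as np

def inv_sqrt(C):
    """Symmetric inverse square root of a positive-definite matrix."""
    w, V = np.linalg.eigh(C)
    return V @ np.diag(w ** -0.5) @ V.T

def pattern_regression(X, Y, k, whiten_y):
    """Rank-k regression of Y on X from an SVD of the regression matrix
    between whitened X and (optionally whitened) Y."""
    n = X.shape[0]
    Cxx, Cyy, Cxy = X.T @ X / n, Y.T @ Y / n, X.T @ Y / n
    Wx = inv_sqrt(Cxx)
    Wy = inv_sqrt(Cyy) if whiten_y else np.eye(Y.shape[1])
    # Regression matrix between the transformed datasets
    M = Wx @ Cxy @ Wy
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Keep only the k leading coupled patterns
    M_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]
    # Map the truncated regression back to the original variables
    B_k = Wx @ M_k @ np.linalg.inv(Wy)
    return B_k, s

# Synthetic data (assumption: purely illustrative, not from the paper)
rng = np.random.default_rng(0)
n, p, q = 200, 4, 3
X = rng.standard_normal((n, p))
Y = X @ rng.standard_normal((p, q)) + 0.5 * rng.standard_normal((n, q))
Y *= np.array([1.0, 3.0, 0.2])  # unequal variances so the methods differ
X -= X.mean(axis=0)
Y -= Y.mean(axis=0)

B_ols = np.linalg.lstsq(X, Y, rcond=None)[0]
r = min(p, q)
B_cca_full, s_cca = pattern_regression(X, Y, r, whiten_y=True)
B_rda_full, _ = pattern_regression(X, Y, r, whiten_y=False)
B_cca_1, _ = pattern_regression(X, Y, 1, whiten_y=True)
B_rda_1, _ = pattern_regression(X, Y, 1, whiten_y=False)

print("full sets agree with OLS:",
      np.allclose(B_cca_full, B_ols), np.allclose(B_rda_full, B_ols))
print("rank-1 models agree with each other:", np.allclose(B_cca_1, B_rda_1))
```

With a complete set of patterns both variants reproduce the ordinary least squares regression, while the truncated models differ, which is the behavior the abstract describes. The singular values in the fully whitened case are the sample canonical correlations.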

Application of Derivative, Derivative Ratio, and Multivariate Spectral Analysis and Thin-Layer Chromatography-Densitometry for Determination of a Ternary Mixture Containing Drotaverine Hydrochloride, Caffeine, and Paracetamol

Journal of AOAC International ◽

10.1093/jaoac/90.2.391 ◽

2007 ◽

Vol 90 (2) ◽

pp. 391-404 ◽

Cited By ~ 9

Author(s):

Fadia H Metwally ◽

Yasser S El-Saharty ◽

Mohamed Refaat ◽

Sonia Z El-Khateeb

Keyword(s):

Thin Layer ◽

Least Squares ◽

Ternary Mixture ◽

Principal Component Regression ◽

Pharmaceutical Preparations ◽

Principal Component ◽

Standard Addition ◽

Drotaverine Hydrochloride ◽

Assay Precision

Abstract New selective, precise, and accurate methods are described for the determination of a ternary mixture containing drotaverine hydrochloride (I), caffeine (II), and paracetamol (III). The first method uses the first (D1) and third (D3) derivative spectrophotometry at 331 and 315 nm for the determination of (I) and (III), respectively, without interference from (II). The second method depends on the simultaneous use of the first derivative of the ratio spectra (DD1) with measurement at 312.4 nm for determination of (I) using the spectrum of 40 μg/mL (III) as a divisor or measurement at 286.4 and 304 nm after using the spectrum of 4 μg/mL (I) as a divisor for the determination of (II) and (III), respectively. In the third method, the predictive abilities of the classical least-squares, principal component regression, and partial least-squares were examined for the simultaneous determination of the ternary mixture. The last method depends on thin-layer chromatography-densitometry after separation of the mixture on silica gel plates using ethyl acetate-chloroform-methanol (16 + 3 + 1, v/v/v) as the mobile phase. The spots were scanned at 281, 272, and 248 nm for the determination of (I), (II), and (III), respectively. Regression analysis showed good correlation in the selected ranges with excellent percentage recoveries. The chemical variables affecting the analytical performance of the methodology were studied and optimized. The methods showed no significant interferences from excipients. Intraday and interday assay precision and accuracy values were within regulatory limits. The suggested procedures were checked using laboratory-prepared mixtures and were successfully applied for the analysis of their pharmaceutical preparations. The validity of the proposed methods was further assessed by applying a standard addition technique. The results obtained by applying the proposed methods were statistically analyzed and compared with those obtained by the manufacturer's method.

Multivariate Analysis as a Tool for Quantification of Conformational Transitions in DNA Thin Films

Applied Sciences ◽

10.3390/app11135895 ◽

2021 ◽

Vol 11 (13) ◽

pp. 5895

Author(s):

Kristina Serec ◽

Sanja Dolanski Babić

Keyword(s):

Thin Films ◽

Learning Algorithm ◽

Principal Component Regression ◽

Principal Component ◽

Conformational Transitions ◽

Cancer Diagnostics ◽

Dna Conformation ◽

Support Vector ◽

Multivariate Statistical ◽

The Impact

The double-stranded B-form and A-form have long been considered the two most important native forms of DNA, each with its own distinct biological roles and hence the focus of many areas of study, from cellular functions to cancer diagnostics and drug treatment. Due to the heterogeneity and sensitivity of the secondary structure of DNA, there is a need for tools capable of rapid and reliable quantification of DNA conformation in diverse environments. In this work, the second paper in the series that addresses conformational transitions in DNA thin films utilizing FTIR spectroscopy, we exploit popular chemometric methods, namely principal component analysis (PCA), the support vector machine (SVM) learning algorithm, and principal component regression (PCR), to quantify and categorize DNA conformation in thin films of different hydrated states. By complementing the FTIR technique with multivariate statistical methods, we demonstrate the ability of our sample preparation and automated spectral analysis protocol to rapidly and efficiently determine conformation in DNA thin films based on the vibrational signatures in the 1800–935 cm⁻¹ range. Furthermore, we assess the impact of small hydration-related changes in FTIR spectra on automated DNA conformation detection and how to avoid discrepancies by careful sampling.

A Novel Mutual Information and Partial Least Squares Approach for Quality-Related and Quality-Unrelated Fault Detection

Processes ◽

10.3390/pr9010166 ◽

2021 ◽

Vol 9 (1) ◽

pp. 166

Author(s):

Majed Aljunaid ◽

Yang Tao ◽

Hongbo Shi

Keyword(s):

Fault Detection ◽

Mutual Information ◽

Least Squares ◽

Partial Least Squares ◽

Principal Component ◽

False Alarms ◽

Total Quality ◽

Process Variables ◽

Tennessee Eastman Process ◽

Value Decomposition

Partial least squares (PLS) and linear regression methods are widely used for quality-related fault detection in industrial processes. Standard PLS decomposes the process variables into principal and residual parts. However, the principal part still contains many components unrelated to quality, and if these components are not removed they can cause many false alarms. Moreover, although these components do not affect product quality, they have a great impact on process safety and carry information about other faults, so removing and discarding them reduces the detection rate for faults unrelated to quality. To overcome these drawbacks of standard PLS, a novel method, MI-PLS (mutual information PLS), is proposed in this paper. The proposed MI-PLS algorithm utilizes mutual information to divide the process variables into selected and residual components, and then uses singular value decomposition (SVD) to further decompose the selected part into quality-related and quality-unrelated components, subsequently constructing quality-related monitoring statistics. To ensure that there is no information loss and that the proposed MI-PLS can be used in quality-related and quality-unrelated fault detection, a principal component analysis (PCA) model is fitted to the residual component to obtain its score matrix, which is combined with the quality-unrelated part to obtain the total quality-unrelated monitoring statistics. Finally, the proposed method is applied to a numerical example and the Tennessee Eastman process. The proposed MI-PLS has a lower computational load and more robust performance than T-PLS and PCR.

Functional Linear Regression

10.1093/oxfordhb/9780199568444.013.2 ◽

2018 ◽

Author(s):

Hervé Cardot ◽

Pascal Sarda

Keyword(s):

Linear Regression ◽

Least Squares ◽

Linear Models ◽

Estimation Error ◽

Asymptotic Properties ◽

Principal Component Regression ◽

Principal Component ◽

Penalized Least Squares ◽

Open Problems ◽

Functional Linear Regression

This article presents a selected bibliography on functional linear regression (FLR) and highlights the key contributions from both applied and theoretical points of view. It first defines FLR in the case of a scalar response and shows how the model can be extended to the case of a functional response. It then considers two kinds of estimation procedures for the slope parameter: projection-based estimators, in which regularization is performed through dimension reduction, such as functional principal component regression, and penalized least squares estimators, which solve a penalized least squares minimization problem. The article proceeds by discussing the main asymptotic properties, separating results on mean square prediction error from results on L2 estimation error. It also describes some related models, including generalized functional linear models and FLR on quantiles, and concludes with a complementary bibliography and some open problems.

Continuum Power CCA: A Unified Approach for Isolating Coupled Modes

Journal of Climate ◽

10.1175/jcli-d-14-00451.1 ◽

2015 ◽

Vol 28 (3) ◽

pp. 1016-1030 ◽

Cited By ~ 2

Author(s):

Erik Swenson

Keyword(s):

Signal To Noise Ratio ◽

Full Range ◽

Synthetic Data ◽

Principal Component Regression ◽

Principal Component ◽

Accurate Estimate ◽

Unified Approach ◽

Coupled Modes ◽

Multivariate Statistical ◽

Sample Covariance

Abstract Various multivariate statistical methods exist for analyzing covariance and isolating linear relationships between datasets. The most popular linear methods are based on singular value decomposition (SVD) and include canonical correlation analysis (CCA), maximum covariance analysis (MCA), and redundancy analysis (RDA). In this study, continuum power CCA (CPCCA) is introduced as one extension of continuum power regression for isolating pairs of coupled patterns whose temporal variation maximizes the squared covariance between partially whitened variables. Similar to the whitening transformation, the partial whitening transformation acts to decorrelate individual variables but only to a partial degree with the added benefit of preconditioning sample covariance matrices prior to inversion, providing a more accurate estimate of the population covariance. CPCCA is a unified approach in the sense that the full range of solutions bridges CCA, MCA, RDA, and principal component regression (PCR). Recommended CPCCA solutions include a regularization for CCA, a variance bias correction for MCA, and a regularization for RDA. Applied to synthetic data samples, such solutions yield relatively higher skill in isolating known coupled modes embedded in noise. Provided with some crude prior expectation of the signal-to-noise ratio, the use of asymmetric CPCCA solutions may be justifiable and beneficial. An objective parameter choice is offered for regularization with CPCCA based on the covariance estimate of O. Ledoit and M. Wolf, and the results are quite robust. CPCCA is encouraged for a range of applications.
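The idea of whitening "only to a partial degree" can be sketched with a power of the sample covariance matrix. The parameterization below, an exponent of (alpha - 1)/2 with alpha = 0 giving full whitening and alpha = 1 leaving the data untouched, is an assumption made for illustration and is not taken from the abstract. The helper name `partial_whitener` and the synthetic data are likewise hypothetical.

```python
import numpy as np

def partial_whitener(C, alpha):
    """Return C**((alpha - 1) / 2) for a symmetric positive-definite C.
    alpha = 0 whitens fully; alpha = 1 is the identity (assumed convention)."""
    w, V = np.linalg.eigh(C)
    return V @ np.diag(w ** ((alpha - 1.0) / 2.0)) @ V.T

# Synthetic correlated data with very unequal variances (illustrative only)
rng = np.random.default_rng(1)
mix = np.array([[2.0, 0.5, 0.0],
                [0.0, 1.0, 0.3],
                [0.0, 0.0, 0.2]])
X = rng.standard_normal((500, 3)) @ mix
X -= X.mean(axis=0)
n = X.shape[0]
C = X.T @ X / n

covs, conds = {}, {}
for alpha in (0.0, 0.5, 1.0):
    Xp = X @ partial_whitener(C, alpha)   # partially whitened data
    covs[alpha] = Xp.T @ Xp / n           # equals C**alpha up to rounding
    conds[alpha] = np.linalg.cond(covs[alpha])
    print(f"alpha={alpha}: condition number = {conds[alpha]:.3g}")
```

Under this convention the covariance of the transformed data is the original covariance raised to the power alpha, so intermediate values of alpha give a covariance matrix that is better conditioned than the original but not yet the identity.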

The Design of Weather Index Insurance Using Principal Component Regression and Partial Least Squares Regression: The Case of Forage Crops

North American Actuarial Journal ◽

10.1080/10920277.2019.1669055 ◽

2020 ◽

Vol 24 (3) ◽

pp. 355-369

Author(s):

Milton Boyd ◽

Brock Porth ◽

Lysa Porth ◽

Ken Seng Tan ◽

Shuo Wang ◽

...

Keyword(s):

Least Squares ◽

Partial Least Squares ◽

Partial Least Squares Regression ◽

Principal Component Regression ◽

Principal Component ◽

Index Insurance ◽

Least Squares Regression ◽

Forage Crops ◽

Weather Index ◽

Weather Index Insurance

Development and Validation of Chemometric-Assisted Spectrophotometric Methods for Simultaneous Determination of Phenylephrine Hydrochloride and Ketorolac Tromethamine in Binary Combinations

Journal of AOAC International ◽

10.5740/jaoacint.16-0106 ◽

2016 ◽

Vol 99 (5) ◽

pp. 1247-1251 ◽

Cited By ~ 6

Author(s):

Hamed M Elfatatry ◽

Mokhtar M Mabrouk ◽

Sherin F Hammad ◽

Fotouh R Mansour ◽

Amira H Kamal ◽

...

Keyword(s):

Least Squares ◽

Simultaneous Determination ◽

Principal Component Regression ◽

Principal Component ◽

Data Matrix ◽

Ketorolac Tromethamine ◽

Concentration Data ◽

Phenylephrine Hydrochloride ◽

Spectrophotometric Methods

Abstract The present work describes new spectrophotometric methods for the simultaneous determination of phenylephrine hydrochloride and ketorolac tromethamine in their synthetic mixtures. The applied chemometric techniques are multivariate methods including classical least squares, principal component regression, and partial least squares. In these techniques, the concentration data matrix was prepared by using the synthetic mixtures containing these drugs dissolved in distilled water. The absorbance data matrix corresponding to the concentration data was obtained by measuring the absorbances at 16 wavelengths in the range 244–274 nm at 2 nm intervals in the zero-order spectra. The spectrophotometric procedures do not require any separation steps. The accuracy, precision, and linearity ranges of the methods have been determined, and the methods were validated by analyzing synthetic mixtures containing the studied drugs. The developed methods were successfully applied to the synthetic mixtures and the results were compared to those obtained by a reported HPLC method.
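A minimal sketch of how principal component regression maps an absorbance matrix to concentrations on the wavelength grid described above (244 to 274 nm in 2 nm steps, 16 points). Everything except the grid is an assumption for illustration: the Gaussian band shapes, the concentration ranges, the noise level, and the names `band`, `K`, and `predict` are hypothetical and do not come from the paper.

```python
import numpy as np

# Wavelength grid from the abstract: 244, 246, ..., 274 nm
wavelengths = np.arange(244, 275, 2)

def band(center, width):
    """Hypothetical Gaussian absorption band sampled on the grid."""
    return np.exp(-0.5 * ((wavelengths - center) / width) ** 2)

# Assumed pure-component spectra for two analytes (2 x 16)
K = np.vstack([band(250.0, 8.0), band(268.0, 10.0)])

# Synthetic calibration set: absorbance = concentration @ spectra + noise
rng = np.random.default_rng(2)
C_train = rng.uniform(1.0, 10.0, size=(20, 2))
A_train = C_train @ K + 0.001 * rng.standard_normal((20, wavelengths.size))

# PCR step 1: principal components of the centered absorbance matrix
A_mean, C_mean = A_train.mean(axis=0), C_train.mean(axis=0)
U, s, Vt = np.linalg.svd(A_train - A_mean, full_matrices=False)
k = 2                                   # number of components retained
T = (A_train - A_mean) @ Vt[:k].T       # scores on the first k components

# PCR step 2: least squares regression of concentrations on the scores
Q = np.linalg.lstsq(T, C_train - C_mean, rcond=None)[0]

def predict(A):
    """Predict concentrations from absorbance rows measured on the grid."""
    return (A - A_mean) @ Vt[:k].T @ Q + C_mean

C_test = np.array([[3.0, 7.0], [8.0, 2.0]])
C_pred = predict(C_test @ K)
print(np.round(C_pred, 3))
```

Retaining two components matches the two analytes in this toy setup; with real spectra the number of components would normally be chosen by validation rather than fixed in advance.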

Simultaneous HPLC-DAD determination of pseudoephedrine HCl, sodium benzoate, sunset yellow, and methyl paraben in syrup preparation by use of partial least squares and principal component regression

Journal of Liquid Chromatography & Related Technologies ◽

10.1080/10826076.2019.1647543 ◽

2019 ◽

Vol 42 (19-20) ◽

pp. 648-653 ◽

Cited By ~ 2

Author(s):

Özlem Aksu Dönmez ◽

Şule Dinç-Zor ◽

Bürge Aşçı ◽

Ecem Şen

Keyword(s):

Least Squares ◽

Partial Least Squares ◽

Sodium Benzoate ◽

Principal Component Regression ◽

Principal Component ◽

Sunset Yellow ◽

Methyl Paraben

KINETIC SIMULTANEOUS DETERMINATION OF Fe(II) AND Fe(III) USING PARTIAL LEAST SQUARES (PLS) AND PRINCIPAL COMPONENT REGRESSION (PCR) CALIBRATION METHODS

Analytical Letters ◽

10.1081/al-120002685 ◽

2002 ◽

Vol 35 (3) ◽

pp. 533-544 ◽

Cited By ~ 22

Author(s):

J. Ghasemi ◽

R. Amini ◽

A. Niazi

Keyword(s):

Least Squares ◽

Simultaneous Determination ◽

Partial Least Squares ◽

Principal Component Regression ◽

Principal Component ◽

Calibration Methods

Detection of Cyanuric Acid and Melamine in Infant Formula Powders by Mid-FTIR Spectroscopy and Multivariate Analysis

Journal of Food Quality ◽

10.1155/2018/7926768 ◽

2018 ◽

Vol 2018 ◽

pp. 1-7 ◽

Cited By ~ 5

Author(s):

Edwin García-Miguel ◽

Ofelia Gabriela Meza-Márquez ◽

Guillermo Osorio-Revilla ◽

Darío Iker Téllez-Medina ◽

Cristian Jiménez-Martínez ◽

...

Keyword(s):

Multivariate Analysis ◽

Ftir Spectroscopy ◽

Least Squares ◽

Infant Formula ◽

Cyanuric Acid ◽

Principal Component Regression ◽

Predictive Ability ◽

Principal Component ◽

Infant Formulas ◽

Chemometric Methods

Chemometric methods using mid-FTIR spectroscopy were developed to reduce the time needed to analyze melamine and cyanuric acid in infant formulas. Chemometric models were constructed using the Partial Least Squares (PLS1, PLS2) and Principal Component Regression (PCR) algorithms to correlate the IR signal with the levels of melamine or cyanuric acid in the infant formula samples. The best correlations were obtained using PLS1 (R2: 0.9998, SEC: 0.0793, and SEP: 0.5545 for melamine and R2: 0.9997, SEC: 0.1074, and SEP: 0.5021 for cyanuric acid). The SIMCA model was also studied to distinguish between adulterated and nonadulterated samples, giving optimum discrimination and good interclass distances between samples. The chemometric models showed good predictive ability for melamine and cyanuric acid concentrations in infant formulas, indicating that this is a rapid and accurate technique for the identification and quantification of these adulterants in infant formulas.
