Principal Component Regression with Chemical Shift Increments. I. p-Disubstituted Benzenes and 2-Naphthyl Derivatives

1996 ◽  
Vol 61 (5) ◽  
pp. 713-725 ◽  
Author(s):  
Miroslav Holík

Prediction of 13C substituent chemical shifts in 14 series of para-disubstituted benzenes and in 2-substituted naphthalenes was based on principal component regression with chemical shift increments for the ipso, ortho, meta and para positions of monosubstituted benzenes. The mean-centered matrix of shift increments was subjected to singular value decomposition, and principal component regression was used for the projection of the investigated substituent chemical shifts and for the calculation of regression coefficients. The residual standard deviation between experimental and fitted values in para-disubstituted benzenes was in agreement with the absolute values of the "electron demand" of the substituents. Inspection of the regression parameters revealed that, for the prediction of chemical shifts in 2-substituted naphthalenes, a combination of chemical shift increments was better than the use of single increments. It is believed that the presented procedure is general and can be used for other aromatic or heteroaromatic systems.
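The procedure described in this abstract (mean-centering, singular value decomposition, regression on the leading components) can be sketched roughly as follows. This is a minimal illustration and not the author's code: the function names `pcr_fit` and `pcr_predict` are hypothetical, and the matrix `X` holds random synthetic numbers standing in for the ipso, ortho, meta and para increments, not measured shifts.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Principal component regression via SVD of the mean-centered X."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean                                   # mean-center the increments
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    k = n_components
    scores = U[:, :k] * s[:k]                         # projections onto the components
    gamma = (scores.T @ (y - y_mean)) / s[:k] ** 2    # regression on the scores
    beta = Vt[:k].T @ gamma                           # back to the original variables
    return beta, x_mean, y_mean

def pcr_predict(X, beta, x_mean, y_mean):
    return (X - x_mean) @ beta + y_mean

# Synthetic stand-in data (assumption): 12 substituents x 4 increments.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 4))
true_beta = np.array([1.0, 0.5, -0.3, 0.2])
y = X @ true_beta + 3.0                               # noise-free response

beta, x_mean, y_mean = pcr_fit(X, y, n_components=4)
y_fit = pcr_predict(X, beta, x_mean, y_mean)

# Residual standard deviation between "experimental" and fitted values.
residual_sd = np.sqrt(np.sum((y - y_fit) ** 2) / (len(y) - 4 - 1))
```

With all four components retained and noise-free data, the fit reproduces the generating coefficients. Keeping fewer components trades some fit for stability when the increments are strongly correlated.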

2008 ◽  
Vol 21 (17) ◽  
pp. 4384-4398 ◽  
Author(s):  
Michael K. Tippett ◽  
Timothy DelSole ◽  
Simon J. Mason ◽  
Anthony G. Barnston

Abstract There are a variety of multivariate statistical methods for analyzing the relations between two datasets. Two commonly used methods are canonical correlation analysis (CCA) and maximum covariance analysis (MCA), which find the projections of the data onto coupled patterns with maximum correlation and covariance, respectively. These projections are often used in linear prediction models. Redundancy analysis and principal predictor analysis construct projections that maximize the explained variance and the sum of squared correlations of regression models. This paper shows that the above pattern methods are equivalent to different diagonalizations of the regression between the two datasets. The different diagonalizations are computed using the singular value decomposition of the regression matrix developed using data that are suitably transformed for each method. This common framework for the pattern methods permits easy comparison of their properties. Principal component regression is shown to be a special case of CCA-based regression. A commonly used linear prediction model constructed from MCA patterns does not give a least squares estimate since correlations among MCA predictors are neglected. A variation, denoted least squares estimate (LSE)-MCA, is suggested that uses the same patterns but minimizes squared error. Since the different pattern methods correspond to diagonalizations of the same regression matrix, they all produce the same regression model when a complete set of patterns is used. Different prediction models are obtained when an incomplete set of patterns is used, with each method optimizing different properties of the regression. Some key points are illustrated in two idealized examples, and the methods are applied to statistical downscaling of rainfall over the northeast of Brazil.
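As a rough illustration of the maximum covariance analysis mentioned here, the leading coupled patterns can be obtained from the singular value decomposition of the cross-covariance matrix of the two centered datasets. The sketch below uses synthetic data and hypothetical variable names. It covers plain MCA only, not the LSE-MCA variant or the data transformations the paper applies for the other methods.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Synthetic predictor and predictand fields (assumption, not real data).
X = rng.normal(size=(n, 5))
Y = X[:, :3] @ rng.normal(size=(3, 4)) + 0.5 * rng.normal(size=(n, 4))

Xc = X - X.mean(axis=0)
Yc = Y - Y.mean(axis=0)

# Cross-covariance matrix between the two datasets (5 x 4).
C_xy = Xc.T @ Yc / (n - 1)

# MCA patterns are the singular vectors of the cross-covariance matrix.
U, s, Vt = np.linalg.svd(C_xy, full_matrices=False)

# Project each dataset onto its leading pattern.
a = Xc @ U[:, 0]
b = Yc @ Vt[0]

# The covariance of the two projections equals the leading singular value.
cov_ab = a @ b / (n - 1)
```

The equality between `cov_ab` and `s[0]` is what "maximum covariance" refers to: no other pair of unit-length patterns gives projections with a larger covariance.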


2017 ◽  
Vol 84 (1) ◽  
Author(s):  
Johannes Kiefer ◽  
Andreas Bösmann ◽  
Peter Wasserscheid

Abstract In the past two decades, ionic liquids have found many applications as solvents for complex solutes. Prominent examples are the dissolution of biomass and carbohydrates as well as catalytically active substances. The chemical analysis of such solutions, however, is still a challenge due to the molecular complexity. In the present work, the use of infrared spectroscopy for quantifying the concentration of different solutes dissolved in an imidazolium-based ionic liquid is investigated. Binary solutions of glucose, cellobiose, and Wilkinson's catalyst in 1-ethyl-3-methylimidazolium acetate are studied as examples. For this purpose, different chemometric approaches (principal component analysis (PCA), partial least-squares regression (PLSR), and principal component regression (PCR)) for analyzing the spectra are tested. Principal component analysis was found to be suitable for classifying the different solutions. Both regression techniques were capable of deriving accurate concentration values. The performance of PLSR was slightly better than that of PCR for the same number of components.
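To make the comparison between PCR and PLSR at equal numbers of components concrete, the sketch below implements both with NumPy and prints the training residual sum of squares for each. It is an assumption-laden illustration: the data are random numbers rather than infrared spectra, the function names are hypothetical, and the PLS routine is a basic single-response NIPALS variant. It compares training fit only and is not a validated prediction study.

```python
import numpy as np

def pcr_coefficients(Xc, yc, k):
    """PCR coefficients from the first k singular vectors of centered X."""
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k].T @ ((U[:, :k].T @ yc) / s[:k])

def pls1_coefficients(Xc, yc, k):
    """PLS1 (single response) coefficients via NIPALS with deflation."""
    Xk, yk = Xc.copy(), yc.copy()
    W, P, q = [], [], []
    for _ in range(k):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)          # weight vector
        t = Xk @ w                      # scores
        tt = t @ t
        p = Xk.T @ t / tt               # X loadings
        c = (yk @ t) / tt               # y loading
        Xk = Xk - np.outer(t, p)        # deflate X
        yk = yk - c * t                 # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)

# Synthetic stand-in for spectra (assumption): 30 samples x 5 variables.
rng = np.random.default_rng(2)
X = rng.normal(size=(30, 5))
y = X @ np.array([2.0, -1.0, 0.5, 0.0, 1.5]) + rng.normal(scale=0.2, size=30)
Xc, yc = X - X.mean(axis=0), y - y.mean()

def rss(beta):
    r = yc - Xc @ beta
    return float(r @ r)

for k in range(1, 6):
    print(k, rss(pcr_coefficients(Xc, yc, k)), rss(pls1_coefficients(Xc, yc, k)))
```

When all five components are kept, both methods coincide with ordinary least squares, so any difference between them appears only for truncated models.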


1979 ◽  
Vol 32 (7) ◽  
pp. 1511 ◽  
Author(s):  
HM Hugel ◽  
DP Kelly ◽  
RJ Spear ◽  
J Bromilow ◽  
RTC Brownlee ◽  
...  

13C n.m.r. spectra have been obtained of a large range of 1(X),4(Y)-disubstituted benzenes in which X has been varied over a range of 25 substituents from NMe2 to +CHMe for each of the compounds where Y = H, OMe, Me, F, Cl, Br and CF3. The ipso-substituent chemical shifts (ipso-SCS) for each of the latter (that is, the change in chemical shift (Δδ) of C4 by replacement of H by Y) have been shown to vary dramatically with the electron demand of the X substituent as measured by δ(C4). When plotted against δ(C4), the ipso-SCS of F and OMe both decrease linearly with increasing electron demand whilst those of Br and Cl show linear increases. Those of Me and CF3 show discontinuities which indicate changes in the mechanism of interaction of these groups with the attached ipso-carbon. The variations in the ipso-SCS with electron demand of X are considered to be due to Y-induced variations in the sensitivity of the ipso-carbon to the effect of the para (X) substituent and not to through-conjugation effects. The results clearly show the fallacy of assuming that 13C substituent effects are constant.


Author(s):  
Shuichi Kawano

Abstract Principal component regression (PCR) is a two-stage procedure: the first stage performs principal component analysis (PCA) and the second stage builds a regression model whose explanatory variables are the principal components obtained in the first stage. Since PCA is performed using only explanatory variables, the principal components have no information about the response variable. To address this problem, we present a one-stage procedure for PCR based on a singular value decomposition approach. Our approach is based upon two loss functions, which are a regression loss and a PCA loss from the singular value decomposition, with sparse regularization. The proposed method enables us to obtain principal component loadings that include information about both explanatory variables and a response variable. An estimation algorithm is developed by using the alternating direction method of multipliers. We conduct numerical studies to show the effectiveness of the proposed method.


2000 ◽  
Vol 65 (1) ◽  
pp. 106-116 ◽  
Author(s):  
Jiří Kulhánek ◽  
Oldřich Pytela ◽  
Antonín Lyčka

The 13C chemical shifts of the carboxyl carbon atoms have been measured for all the 2-, 3-, and 4-substituted benzoic acids with H, CH3, CH3O, F, Cl, Br, I, and NO2 substituents, as well as for all 3,4-, 3,5-, and 2,6-disubstituted benzoic acids with combinations of CH3, CH3O, Cl (or Br), and NO2 substituents, and for symmetrically 2,6-disubstituted derivatives with Et, EtO, PrO, i-PrO, and BuO substituents. The chemical shifts of the carboxylic group carbon atoms of the 3- and 4-substituted derivatives correlate only with the substituent constants σI. For the 2-substituted derivatives, a dependence was found only on σI and on the υ constant describing steric effects (s = 0.122, R = 0.996, without the CH3 derivative, which has a distinct anisotropic effect). The substituent effects on the carboxylic carbon chemical shift are additive for 3,4-, 3,5-, and 2,6-substituents, and the 2,6-disubstituted derivatives show a linear synergic effect of substituents, evidently due to steric hindrance to resonance. Application of principal component analysis to the data matrix comprising all the combinations of mono- and disubstitution with the above-mentioned substituents has shown an identical substituent effect from all the positions on the chemical shift, described by one latent variable; steric effects and the anisotropic behaviour of methyl at the 2 and 2,6 positions are predominantly described by the second latent variable (with a total explained variability of 99.5%). Comparison of substituent effects on the chemical shift of the carboxylic carbon with those on the dissociation constant measured in the same solvent has confirmed the anisotropy due to the ortho methyl group; the ortho halogen substituents in monosubstituted derivatives also have a different effect. The dependence of chemical shift on pKa was not very close for the derivatives studied (s = 1.005, R = 0.690). The inclusion of the anisotropy of the ortho alkyl group by means of an indicator variable improved the correlation (s = 0.533, R = 0.925), and omitting the 2-F, 2-Cl, 2-Br, and 2-I substituents gave a regression without deviating points (s = 0.352, R = 0.968).
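The effect of adding an indicator variable to a regression, as reported above for the ortho alkyl group, can be illustrated with a small least-squares sketch. All numbers below are invented for illustration (the pKa values, the shift offsets, and the choice of which points are flagged), and `fit_linear` is a hypothetical helper. Only the mechanics of the comparison follow the abstract.

```python
import numpy as np

def fit_linear(columns, y):
    """Least-squares fit with intercept; returns coefficients, s, and R."""
    A = np.column_stack([np.ones(len(y))] + list(columns))
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    fitted = A @ coef
    resid = y - fitted
    dof = len(y) - A.shape[1]
    s = np.sqrt(resid @ resid / dof)        # residual standard deviation
    r = np.corrcoef(fitted, y)[0, 1]        # multiple correlation coefficient
    return coef, s, r

# Invented data (assumption): 20 derivatives, the first 5 flagged as ortho alkyl.
rng = np.random.default_rng(3)
pka = rng.uniform(2.0, 5.0, size=20)
ortho_alkyl = np.zeros(20)
ortho_alkyl[:5] = 1.0
shift = 166.0 + 0.8 * pka + 1.5 * ortho_alkyl + rng.normal(scale=0.1, size=20)

# Regression on pKa alone, then with the indicator variable added.
_, s_plain, r_plain = fit_linear([pka], shift)
_, s_ind, r_ind = fit_linear([pka, ortho_alkyl], shift)
print(f"pKa only:        s = {s_plain:.3f}, R = {r_plain:.3f}")
print(f"pKa + indicator: s = {s_ind:.3f}, R = {r_ind:.3f}")
```

The indicator column absorbs the constant offset of the flagged group, so the residual standard deviation drops and R rises, which mirrors the direction of the change reported in the abstract.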


2001 ◽  
Author(s):  
Ιωάννης Πέττας

This Ph.D. thesis describes some of the most recent applications of chemometrics to simultaneous multicomponent determination. The main goal of this work is to examine the behavior of chemometric models in chemical systems that are often used in analytical chemistry, and to extract information from the kinetic characteristics of the analytes. Various chemometric techniques have been developed and applied using commercial or laboratory-made algorithms (Classical Least Squares, Inverse Least Squares, Partial Least Squares, and Principal Component Regression). These techniques were applied to data sets from a number of "real life" chemical systems, and the results were statistically analyzed, evaluated, and/or compared. The most important aspect of this thesis is not so much the mathematical processes, or the development of new algorithms for well-known and well-investigated chemometric techniques, as the successful application of the latter in many cases where traditional methods fail to serve analytical chemistry. In addition, a new field of kinetics is shown to be a powerful analytical tool, equal to or better than well-established analytical methods.


2019 ◽  
Vol 8 (1) ◽  
Author(s):  
Khairunnisa Khairunnisa ◽  
Rizka Pitri ◽  
Victor P Butar-Butar ◽  
Agus M Soleh

This research used CFSRv2 data as general circulation model output. CFSRv2 includes several highly correlated variables, so principal component regression (PCR) and partial least squares (PLS) were used to address the multicollinearity in the CFSRv2 data. The research aims to determine the better model, PCR or PLS, for estimating rainfall at the Bandung geophysical station, Bogor climatology station, Citeko meteorological station, and Jatiwangi meteorological station by comparing RMSEP and correlation values. The domain sizes used were 3×3, 4×4, 5×5, 6×6, 7×7, 8×8, 9×9, and 11×11, located between (-4°) N and (-9°) S and between 105° E and 110° E, with a grid size of 0.5×0.5. The PLS model was the better model for statistical downscaling in this research because it obtained the lower RMSEP value and the higher correlation value. The best domains and RMSEP values for the Bandung geophysical station, Bogor climatology station, Citeko meteorological station, and Jatiwangi meteorological station were 9 × 9 with 100.06, 6 × 6 with 194.3, 8 × 8 with 117.6, and 6 × 6 with 108.2, respectively.
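The two comparison criteria used here, RMSEP and the correlation between observed and predicted values, can be computed as in the following sketch. The rainfall numbers and the labels `model_a` and `model_b` are made up for illustration and do not come from the study.

```python
import math

def rmsep(observed, predicted):
    """Root mean squared error of prediction."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Invented monthly rainfall values (assumption), in arbitrary units.
observed = [120.0, 80.0, 200.0, 150.0]
predictions = {
    "model_a": [110.0, 90.0, 190.0, 160.0],
    "model_b": [100.0, 120.0, 150.0, 170.0],
}

for name, pred in predictions.items():
    print(name, rmsep(observed, pred), pearson_r(observed, pred))

# Pick the model with the lowest RMSEP.
best = min(predictions, key=lambda name: rmsep(observed, predictions[name]))
```

A lower RMSEP together with a higher correlation is the selection rule the abstract describes. When the two criteria disagree, a tie-breaking rule would have to be stated separately.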


2007 ◽  
Vol 90 (2) ◽  
pp. 391-404 ◽  
Author(s):  
Fadia H Metwally ◽  
Yasser S El-Saharty ◽  
Mohamed Refaat ◽  
Sonia Z El-Khateeb

Abstract New selective, precise, and accurate methods are described for the determination of a ternary mixture containing drotaverine hydrochloride (I), caffeine (II), and paracetamol (III). The first method uses the first (D1) and third (D3) derivative spectrophotometry at 331 and 315 nm for the determination of (I) and (III), respectively, without interference from (II). The second method depends on the simultaneous use of the first derivative of the ratio spectra (DD1) with measurement at 312.4 nm for determination of (I) using the spectrum of 40 μg/mL (III) as a divisor or measurement at 286.4 and 304 nm after using the spectrum of 4 μg/mL (I) as a divisor for the determination of (II) and (III), respectively. In the third method, the predictive abilities of the classical least-squares, principal component regression, and partial least-squares were examined for the simultaneous determination of the ternary mixture. The last method depends on thin-layer chromatography-densitometry after separation of the mixture on silica gel plates using ethyl acetate-chloroform-methanol (16 + 3 + 1, v/v/v) as the mobile phase. The spots were scanned at 281, 272, and 248 nm for the determination of (I), (II), and (III), respectively. Regression analysis showed good correlation in the selected ranges with excellent percentage recoveries. The chemical variables affecting the analytical performance of the methodology were studied and optimized. The methods showed no significant interferences from excipients. Intraday and interday assay precision and accuracy values were within regulatory limits. The suggested procedures were checked using laboratory-prepared mixtures and were successfully applied for the analysis of their pharmaceutical preparations. The validity of the proposed methods was further assessed by applying a standard addition technique. The results obtained by applying the proposed methods were statistically analyzed and compared with those obtained by the manufacturer's method.
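Of the multivariate methods named in this abstract, classical least squares is the simplest to sketch: absorbances are modeled as concentrations times a matrix of pure-component sensitivities, which is estimated from calibration mixtures and then inverted for an unknown. The example below is a noise-free illustration with random synthetic sensitivities. The wavelengths, concentrations, and variable names are assumptions and not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic pure-component sensitivities (assumption): 3 analytes x 8 wavelengths.
K_true = rng.uniform(0.01, 0.1, size=(3, 8))

# Calibration set: 10 laboratory-prepared mixtures with known concentrations.
C_cal = rng.uniform(1.0, 20.0, size=(10, 3))
A_cal = C_cal @ K_true                        # Beer-Lambert additivity, no noise

# Calibration step: estimate the sensitivity matrix from A = C K.
K_hat, *_ = np.linalg.lstsq(C_cal, A_cal, rcond=None)

# Prediction step: solve a = c K for the concentrations of an unknown mixture.
c_unknown = np.array([4.0, 10.0, 15.0])
a_unknown = c_unknown @ K_true
c_hat, *_ = np.linalg.lstsq(K_hat.T, a_unknown, rcond=None)

# Percentage recovery of each component.
recovery = 100.0 * c_hat / c_unknown
print(recovery)
```

Classical least squares requires the concentrations of all absorbing species in the calibration mixtures to be known. PCR and PLS relax that requirement, which is one common reason for comparing the three.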

