Smoothing Noisy Data Using Dynamic Programming and Generalized Cross-Validation

1988 ◽  
Vol 110 (1) ◽  
pp. 37-41 ◽  
Author(s):  
C. R. Dohrmann ◽  
H. R. Busby ◽  
D. M. Trujillo

Smoothing and differentiation of noisy data using spline functions requires the selection of an unknown smoothing parameter. The method of generalized cross-validation provides an excellent estimate of the smoothing parameter from the data itself even when the amount of noise associated with the data is unknown. In the present model only a single smoothing parameter must be obtained, but in a more general context the number may be larger. In an earlier work, smoothing of the data was accomplished by solving a minimization problem using the technique of dynamic programming. This paper shows how the computations required by generalized cross-validation can be performed as a simple extension of the dynamic programming formulas. The results of numerical experiments are also included.
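As a rough illustration of how generalized cross-validation picks a smoothing parameter from the data alone, the sketch below uses a discrete second-difference penalty in place of the spline and dynamic programming formulation of the paper. The function names, the penalty, and the dense-matrix solve are all assumptions made for illustration; they are not taken from the authors' method.

```python
import numpy as np

def second_difference_matrix(n):
    """(n-2) x n matrix D with (D y)[i] = y[i] - 2*y[i+1] + y[i+2]."""
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    return D

def gcv_score(y, lam):
    """GCV(lam) = n * ||(I - A) y||^2 / trace(I - A)^2, with A = (I + lam D'D)^-1."""
    n = len(y)
    D = second_difference_matrix(n)
    A = np.linalg.inv(np.eye(n) + lam * D.T @ D)  # influence matrix of the smoother
    resid = y - A @ y
    return n * float(resid @ resid) / (n - np.trace(A)) ** 2

def smooth_with_gcv(y, lambdas):
    """Evaluate GCV on a grid of candidates and smooth with the minimizer."""
    scores = np.array([gcv_score(y, lam) for lam in lambdas])
    best = lambdas[int(np.argmin(scores))]
    n = len(y)
    D = second_difference_matrix(n)
    fitted = np.linalg.solve(np.eye(n) + best * D.T @ D, y)
    return best, fitted, scores
```

The noise level never enters the computation: only the data `y` and the candidate grid `lambdas` are needed, which is the property the abstract highlights.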

1994 ◽  
Vol 116 (4) ◽  
pp. 528-531 ◽  
Author(s):  
Antony J. Hodgson

Dynamic programming techniques are useful in smoothing and differentiating noisy data signals according to an optimization criterion and the results are generally quite robust to noise spectra different from that assumed in the construction of the filter. If the noise properties are sufficiently different, however, the generalized cross-validation function used in the optimization can exhibit either multiple minima or no minima other than that corresponding to an insignificant amount of smoothing; in these cases, the smoothing parameter desired by the user typically does not lie at the global minimum of the generalized cross-validation function, but at some other point on the curve which can be identified heuristically. I present two cases to demonstrate this phenomenon and describe what measures one can take to ensure that the desired smoothing parameter is obtained.
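One simple way to inspect a generalized cross-validation curve for several minima is to sample it on a grid of smoothing parameters and list every interior local minimum rather than only the global one. The helper below is a hypothetical sketch of that inspection step; it is not the heuristic described in the paper, and the name `local_minima` is an assumption.

```python
import numpy as np

def local_minima(lambdas, scores):
    """Return (lambda, score) pairs at strict interior local minima of a sampled curve."""
    scores = np.asarray(scores, dtype=float)
    idx = [i for i in range(1, len(scores) - 1)
           if scores[i] < scores[i - 1] and scores[i] < scores[i + 1]]
    return [(float(lambdas[i]), float(scores[i])) for i in idx]
```

An empty result means the sampled curve has no interior minimum, and two or more entries flag a curve where the global minimum may not be the parameter the user wants.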


Author(s):  
Syafruddin Side ◽  
Wahidah Sanusi ◽  
Mustati'atul Waidah Maksum

Semiparametric regression is a regression model that contains both parametric and nonparametric components. This study uses a spline semiparametric regression model for longitudinal data, with a case study of patients with Dengue Hemorrhagic Fever (DHF) at Hasanuddin University Hospital, Makassar, from January to March 2018. The best model is estimated by selecting the optimal knots, that is, those giving the minimum Generalized Cross Validation (GCV) and Mean Square Error (MSE). The parametric components in this study are hemoglobin (g/dL) and age (years), with body temperature ( ) and platelets ( ) as nonparametric components. The minimum GCV value of 221.67745153 is attained at the knots 14.552, 14.987, and 15.096, with an MSE of 199.1032 and a coefficient of determination of 75.3%, obtained from a linear spline semiparametric regression model with three knots.
Keywords: semiparametric regression, spline, knot, Generalized Cross Validation, Dengue Hemorrhagic Fever.
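Knot selection by minimum GCV can be sketched for a single predictor as follows. The example assumes a truncated power basis for the linear spline and the common form GCV = MSE / (1 - trace(H)/n)^2, where H is the hat matrix. The function names and the exhaustive search over candidate knots are illustrative assumptions, and the parametric covariates of a full semiparametric model are left out.

```python
import numpy as np
from itertools import combinations

def linear_spline_basis(x, knots):
    """Columns: 1, x, and (x - k)_+ for each knot k (truncated power basis)."""
    cols = [np.ones_like(x), x] + [np.maximum(x - k, 0.0) for k in knots]
    return np.column_stack(cols)

def gcv_for_knots(x, y, knots):
    """GCV = MSE / (1 - trace(H)/n)^2 for the least-squares linear spline fit."""
    X = linear_spline_basis(x, knots)
    H = X @ np.linalg.pinv(X)  # hat matrix mapping y to fitted values
    n = len(y)
    mse = float(np.mean((y - H @ y) ** 2))
    return mse / (1.0 - np.trace(H) / n) ** 2

def best_knots(x, y, candidates, n_knots):
    """Try every combination of n_knots candidate knots and keep the GCV minimizer."""
    return min(combinations(candidates, n_knots),
               key=lambda ks: gcv_for_knots(x, y, ks))
```

Calling `best_knots(x, y, candidates, 3)` would mirror the three-knot model reported in the abstract, with `candidates` being whatever grid of trial knot locations the analyst chooses.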


2019 ◽  
Vol 1 (1) ◽  
pp. 11 ◽  
Author(s):  
Bidayani Bidayani ◽  
Mustika Hadijati ◽  
Nurul Fitriyani

This study aimed to determine a semiparametric spline regression model for the factors that influence rice production in East Lombok District in 2014 and to identify which of those factors matter. The method used was semiparametric spline regression, with the optimum knot points selected using Generalized Cross Validation. The results indicate that the variable that significantly affects rice production is the height of the area above sea level, with a coefficient of determination of 99.71% and an RMSEP value of 41.65.


2021 ◽  
Vol 174 (1) ◽  
Author(s):  
Amirlan Seksenbayev

We study two closely related problems in the online selection of an increasing subsequence. In the first problem, introduced by Samuels and Steele (Ann. Probab. 9(6):937–947, 1981), the objective is to maximise the length of a subsequence selected by a nonanticipating strategy from a random sample of given size $n$. In the dual problem, recently studied by Arlotto et al. (Random Struct. Algorithms 49:235–252, 2016), the objective is to minimise the expected time needed to choose an increasing subsequence of given length $k$ from a sequence of infinite length. Developing a method based on the monotonicity of the dynamic programming equation, we derive the two-term asymptotic expansions for the optimal values, with $O(1)$ remainder in the first problem and $O(k)$ in the second. Settling a conjecture in Arlotto et al. (Random Struct. Algorithms 52:41–53, 2018), we also design selection strategies to achieve optimality within these bounds, that are, in a sense, best possible.
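For the first problem, the dynamic programming equation can be explored numerically. The sketch below assumes observations that are independent and uniform on [0, 1] and writes v_m(s) for the optimal expected number of further selections when m observations remain and the last selected value is s. Under that assumption v_0(s) = 0 and v_m(s) = s v_{m-1}(s) + ∫_s^1 max(v_{m-1}(s), 1 + v_{m-1}(x)) dx. The grid discretization, the trapezoid rule, and the function name are illustrative choices and are not the asymptotic method of the paper.

```python
import numpy as np

def optimal_expected_length(n, grid=401):
    """Approximate v_n(0) by iterating the Bellman recursion on a grid over s."""
    s = np.linspace(0.0, 1.0, grid)
    h = s[1] - s[0]
    v = np.zeros(grid)  # v_0(s) = 0: no observations left
    for _ in range(n):
        new_v = np.empty(grid)
        for i in range(grid):
            # For x in [s_i, 1]: either reject x (keep v[i]) or accept it (1 + v at x).
            g = np.maximum(v[i], 1.0 + v[i:])
            integral = h * (g.sum() - 0.5 * (g[0] + g[-1]))  # trapezoid rule
            # With probability s_i the observation falls below s_i and must be skipped.
            new_v[i] = s[i] * v[i] + integral
        v = new_v
    return float(v[0])
```

For n = 1 the recursion returns 1 and for n = 2 it returns 1.5, matching the direct calculation 1 + P(X_2 > X_1). Larger n can be compared against the classical leading term √(2n).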

