Estimating cavity tree abundance using Nearest Neighbor Imputation methods for western Oregon and Washington forests

Silva Fennica, 2008, Vol 42 (3)
Author(s): Hailemariam Temesgen, Tara Barrett, Greg Latta

2009, Vol 39 (9), pp. 1749-1765
Author(s): Bianca N.I. Eskelson, Hailemariam Temesgen, Tara M. Barrett

Cavity tree and snag abundance data are highly variable and contain many zero observations. We predict cavity tree and snag abundance from variables that are readily available from forest cover maps or remotely sensed data using negative binomial (NB), zero-inflated NB, and zero-altered NB (ZANB) regression models as well as nearest neighbor (NN) imputation methods. The models were developed and fit to data collected by the Forest Inventory and Analysis program of the US Forest Service in Washington, Oregon, and California. For predicting cavity tree and snag abundance per stand, all three NB regression models performed better in terms of mean square prediction error than the NN imputation methods. The most similar neighbor imputation, however, outperformed the NB regression models in predicting overall cavity tree and snag abundance.
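
As a hedged illustration of how the zero-inflated and zero-altered (hurdle) variants differ from the plain NB model, the sketch below writes out the three probability mass functions in their standard textbook form. The parameter names (`mu`, `k`, `pi_zero`) are assumptions for illustration and are not taken from the paper, which additionally relates such parameters to stand-level covariates.

```python
from math import exp, lgamma, log


def nb_pmf(y, mu, k):
    """Negative binomial P(Y = y) with mean mu and dispersion (size) k."""
    log_p = (
        lgamma(y + k) - lgamma(k) - lgamma(y + 1)
        + k * log(k / (k + mu))
        + y * log(mu / (k + mu))
    )
    return exp(log_p)


def zinb_pmf(y, mu, k, pi_zero):
    """Zero-inflated NB: extra zeros are mixed in with probability pi_zero,
    so a zero can come from either the inflation part or the NB part."""
    base = (1.0 - pi_zero) * nb_pmf(y, mu, k)
    return pi_zero + base if y == 0 else base


def zanb_pmf(y, mu, k, pi_zero):
    """Zero-altered (hurdle) NB: all zeros come from the hurdle part;
    positive counts follow an NB truncated at zero."""
    if y == 0:
        return pi_zero
    return (1.0 - pi_zero) * nb_pmf(y, mu, k) / (1.0 - nb_pmf(0, mu, k))
```

In the zero-inflated form the probability of a zero always exceeds `pi_zero`, whereas in the zero-altered form it equals `pi_zero` exactly; that is the practical distinction between the two models named in the abstract.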


2009, Vol 113 (3), pp. 546-553
Author(s): Andreas Barth, Jörgen Wallerman, Göran Ståhl

2011, Vol 5 (2A), pp. 824-842
Author(s): Jae Kwang Kim, Wayne A. Fuller, William R. Bell

2020, Vol 2019 (1), pp. 275-285
Author(s): Iman Jihad Fadillah, Siti Muchlisoh

Completeness is one of the characteristics of high-quality statistical data. In censuses and surveys, however, missing or incomplete data (missing values) are common, and the Indonesian Socioeconomic Survey (Susenas) is no exception. Missing values can cause a range of problems and must therefore be handled. Imputation is a frequently used way of handling them, and several imputation methods have been developed. Hot-deck Imputation and K-Nearest Neighbor Imputation (KNNI) are two such methods. Both use predictor variables to carry out the imputation and do not require complicated assumptions. Because the two methods differ in their algorithms and in how they treat missing values, they can produce different estimates. This study compares Hot-deck Imputation and KNNI for handling missing values. The comparison assesses estimator accuracy using RMSE and MAPE, and also measures computational performance as the running time of the imputation process. Applying both methods to the March 2017 Susenas data shows that KNNI yields more accurate estimates than Hot-deck Imputation, whereas Hot-deck Imputation has better computational performance than KNNI.
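
The comparison in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the use of Euclidean distance, the averaging of the k nearest donors, and the choice of the first predictor as the hot-deck imputation class are all hypothetical.

```python
import math
import random


def knn_impute(target_x, donors, k=3):
    """Impute a value as the mean response of the k donors whose
    predictors are closest (Euclidean distance) to target_x.
    donors is a list of (predictor_tuple, response) pairs."""
    ranked = sorted(donors, key=lambda d: math.dist(d[0], target_x))
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / len(nearest)


def hot_deck_impute(target_x, donors, rng):
    """Random hot-deck: copy the response of a randomly drawn donor from
    the same imputation class (assumed here to be the first predictor).
    Falls back to all donors if the class is empty."""
    same_class = [y for x, y in donors if x[0] == target_x[0]]
    pool = same_class or [y for _, y in donors]
    return rng.choice(pool)


def rmse(actual, predicted):
    """Root mean square error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    n = len(actual)
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / n
```

A typical evaluation would mask known responses, impute them with each method, and compare `rmse` and `mape` against the masked truth, timing each call (for example with `time.perf_counter`) to obtain the running-time comparison.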


2021, Vol 8 (3), pp. 215-226
Author(s): Parisa Saeipourdizaj, Parvin Sarbakhsh, Akbar Gholampour

Background: In air quality studies, missing data are very common owing to causes such as machine failure or human error. The approach used to deal with such missing data can affect the results of the analysis. The main aim of this study was to review the types of missing-data mechanisms and imputation methods, apply some of these methods to impute missing PM10 and O3 values in Tabriz, and compare their efficiency. Methods: The mean, EM algorithm, regression, classification and regression tree, predictive mean matching (PMM), interpolation, moving average, and K-nearest neighbor (KNN) methods were used. PMM was investigated by considering the spatial and temporal dependencies in the model. Missing data were randomly simulated with 10, 20, and 30% missing values. The efficiency of the methods was compared using the coefficient of determination (R²), mean absolute error (MAE), and root mean square error (RMSE). Results: Based on the results for all indicators, interpolation, moving average, and KNN had the best performance, in that order. PMM did not perform well either with or without spatio-temporal information. Conclusion: Because pollution data depend on preceding and following observations, methods whose computation is based on preceding and following observations performed better than the others; these methods are therefore recommended for pollutant data.
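
To make the two best-performing time-series methods concrete, the following is a minimal sketch of gap filling by linear interpolation and by a centered moving average, together with the MAE and RMSE measures named in the abstract. The function names, the `window` parameter, and the handling of gaps at the series edges are assumptions for illustration, not details from the study.

```python
import math


def linear_interpolate(series):
    """Fill None gaps by linear interpolation between the nearest observed
    values; gaps at either edge copy the nearest observed value."""
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((j for j in known if j < i), default=None)
        right = min((j for j in known if j > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


def moving_average_impute(series, window=2):
    """Fill each gap with the mean of the observed values lying within
    `window` steps before and after it (left as None if there are none)."""
    filled = list(series)
    for i, v in enumerate(series):
        if v is None:
            lo, hi = max(0, i - window), min(len(series), i + window + 1)
            neighbours = [x for x in series[lo:hi] if x is not None]
            filled[i] = sum(neighbours) / len(neighbours) if neighbours else None
    return filled


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)
```

Both fillers use only the observations immediately before and after a gap, which is the property the abstract's conclusion credits for their performance on pollutant series.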

