An Algorithm for Density Enrichment of Sparse Collaborative Filtering Datasets Using Robust Predictions as Derived Ratings

Algorithms ◽

10.3390/a13070174 ◽

2020 ◽

Vol 13 (7) ◽

pp. 174

Author(s):

Dionisis Margaris ◽

Dimitris Spiliotopoulos ◽

Gregory Karagiorgos ◽

Costas Vassilakis

Keyword(s):

Collaborative Filtering ◽

Mean Absolute Error ◽

Pearson Correlation ◽

Absolute Error ◽

Mean Square ◽

Error Metrics ◽

Target User ◽

The Mean ◽

Rating Prediction ◽

Sparsity Problem

Collaborative filtering algorithms formulate personalized recommendations for a user in two steps: they first analyse already entered ratings to identify other users with tastes similar to the target user's (termed near neighbours), and then use the opinions of those near neighbours to predict which items the target user would like. In sparse datasets, however, too few near neighbours can be identified, resulting in low-accuracy predictions or even a total inability to formulate personalized predictions. This paper addresses the sparsity problem by presenting an algorithm that uses robust predictions, that is, predictions deemed highly likely to be accurate, as derived ratings. The density of sparse datasets thus increases, and improved rating prediction coverage and accuracy are achieved. The proposed algorithm, termed CFDR, is extensively evaluated using (1) seven widely used collaborative filtering datasets, (2) the two most widely used correlation metrics in collaborative filtering research, namely the Pearson correlation coefficient and the cosine similarity, and (3) the two most widely used error metrics in collaborative filtering, namely the mean absolute error and the root mean square error. The evaluation results show that, by successfully increasing the density of the datasets, the capacity of collaborative filtering systems to formulate personalized and accurate recommendations is considerably improved.
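The near-neighbour step described in this abstract can be sketched as follows. This is a minimal illustration of generic user-based collaborative filtering with the Pearson correlation coefficient, not the CFDR algorithm itself: the function names, the `min_sim` threshold, and the toy ratings are all hypothetical, and the abstract does not state the criteria CFDR uses to deem a prediction robust.

```python
from math import sqrt

def pearson(ratings_a, ratings_b):
    """Pearson correlation over the items that both users have rated."""
    common = set(ratings_a) & set(ratings_b)
    if len(common) < 2:
        return 0.0  # too little overlap to measure similarity
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)
    num = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    den = (sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common))
           * sqrt(sum((ratings_b[i] - mean_b) ** 2 for i in common)))
    return num / den if den else 0.0

def predict(ratings, user, item, min_sim=0.0):
    """Mean-centred weighted average of near neighbours' ratings.

    Returns None when no near neighbour has rated the item, which is
    the coverage failure that sparse datasets suffer from.
    """
    mean_user = sum(ratings[user].values()) / len(ratings[user])
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = pearson(ratings[user], other_ratings)
        if sim > min_sim:  # assumption: only positively correlated users count
            mean_other = sum(other_ratings.values()) / len(other_ratings)
            num += sim * (other_ratings[item] - mean_other)
            den += sim
    return None if den == 0 else mean_user + num / den

# Toy data, invented for illustration only.
ratings = {
    "u1": {"a": 5, "b": 3, "c": 4},
    "u2": {"a": 4, "b": 2, "c": 3, "d": 4},
    "u3": {"a": 1, "b": 5, "d": 2},
    "u4": {"e": 3},
}
print(predict(ratings, "u1", "d"))  # u2 is the only near neighbour of u1
print(predict(ratings, "u4", "a"))  # u4 shares no items with anyone
```

In this toy dataset the second call returns `None`: user `u4` has no near neighbours at all, which illustrates why adding derived ratings to a sparse dataset can raise prediction coverage.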

Explanation Plus Prediction—The Logical Focus of Project Management Research

Project Management Journal ◽

10.1177/8756972821999945 ◽

2021 ◽

pp. 875697282199994

Author(s):

Joseph F. Hair ◽

Marko Sarstedt

Keyword(s):

Project Management ◽

Statistical Models ◽

Predictive Power ◽

Mean Absolute Error ◽

Explanatory Power ◽

Absolute Error ◽

Model Parameters ◽

Mean Square ◽

Management Research ◽

The Mean

Most project management research focuses almost exclusively on explanatory analyses. Evaluation of the explanatory power of statistical models is generally based on F-type statistics and the R² metric, followed by an assessment of the model parameters (e.g., beta coefficients) in terms of their significance, size, and direction. However, these measures are not indicative of a model’s predictive power, which is central for deriving managerial recommendations. We recommend that project management researchers routinely use additional metrics, such as the mean absolute error or the root mean square error, to accurately quantify their statistical models’ predictive power.
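For reference, the two error metrics named in this abstract are conventionally defined as below. The notation is chosen here and is not taken from the paper: $y_i$ is the observed value, $\hat{y}_i$ the model's prediction, and $n$ the number of observations being predicted.

```latex
\[
\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|,
\qquad
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 }.
\]
```

Both are expressed in the units of the predicted variable, and $\mathrm{MAE} \le \mathrm{RMSE}$ always holds, with equality only when every absolute error is the same size. RMSE weights large errors more heavily because the errors are squared before averaging.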

Direct Calculation of Thermodynamic Wet-Bulb Temperature as a Function of Pressure and Elevation

Journal of Atmospheric and Oceanic Technology ◽

10.1175/jtech-d-12-00191.1 ◽

2013 ◽

Vol 30 (8) ◽

pp. 1757-1765 ◽

Cited By ~ 10

Author(s):

Sayed-Hossein Sadeghi ◽

Troy R. Peters ◽

Douglas R. Cobos ◽

Henry W. Loescher ◽

Colin S. Campbell

Keyword(s):

Air Temperature ◽

Mean Absolute Error ◽

Direct Calculation ◽

Ambient Air ◽

Absolute Error ◽

Mean Square ◽

Air Temperatures ◽

Order Polynomial ◽

The Mean ◽

Very High

A simple analytical method was developed for directly calculating the thermodynamic wet-bulb temperature from air temperature and vapor pressure (or relative humidity) at elevations up to 4500 m above MSL. This methodology was based on the fact that the wet-bulb temperature can be closely approximated by a second-order polynomial in both the positive and negative ranges of ambient air temperature. The method in this study builds upon this understanding and provides results for the negative range of air temperatures (−17° to 0°C), so that the maximum observed error in this area is equal to or smaller than −0.17°C. For temperatures ≥0°C, wet-bulb temperature accuracy was ±0.65°C, and larger errors corresponded to very high temperatures (Ta ≥ 39°C) and/or very high or low relative humidities (5% < RH < 10% or RH > 98%). The mean absolute error and the root-mean-square error were 0.15° and 0.2°C, respectively.

Accuracy Assessment of TanDEM-X 90 and CartoDEM Using ICESat-2 Datasets for Plain Regions of Ratlam City and Surroundings

Engineering Proceedings ◽

10.3390/ecsa-8-11441 ◽

2021 ◽

Vol 10 (1) ◽

pp. 59

Author(s):

Unnati Yadav ◽

Ashutosh Bhardwaj

Keyword(s):

Mean Absolute Error ◽

Accuracy Assessment ◽

Absolute Error ◽

Terrain Analysis ◽

Mean Square ◽

High Quality ◽

Ice Cloud ◽

The Mean ◽

Cut And Fill ◽

Statistical Results

The spaceborne LiDAR dataset from the Ice, Cloud, and Land Elevation Satellite (ICESat-2) provides highly accurate measurements of heights for the Earth’s surface, which helps in terrain analysis, visualization, and decision making for many applications. TanDEM-X 90 (90 m) and CartoDEM V3R1 (30 m) are among the high-quality, openly accessible DEM datasets for the plain regions in India. These two DEMs are validated against the ICESat-2 elevation datasets for the relatively plain areas of Ratlam City and its surroundings. The mean error (ME), mean absolute error (MAE), and root mean square error (RMSE) of TanDEM-X 90 DEM are 1.35 m, 1.48 m, and 2.19 m, respectively. The computed ME, MAE, and RMSE for CartoDEM V3R1 are 3.05 m, 3.18 m, and 3.82 m, respectively. The statistical results reveal that TanDEM-X 90 performs better in plain areas than CartoDEM V3R1. The study further indicates that these DEMs and spaceborne LiDAR datasets can be useful for planning various works requiring height as an important parameter, such as the layout of pipelines or cut and fill calculations for various construction activities. The TanDEM-X 90 can assist planners in quick assessments of the terrain for infrastructural developments, which otherwise need time-consuming traditional surveys using a theodolite or a total station.
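The three statistics reported in this abstract can be computed as in the following sketch. It assumes the error at each point is the DEM height minus the reference (ICESat-2) height; the abstract does not state the sign convention. The function name and the sample heights are invented for illustration and are not data from the study.

```python
from math import sqrt

def elevation_error_stats(dem_heights, reference_heights):
    """Return (ME, MAE, RMSE) of DEM heights against reference heights."""
    if len(dem_heights) != len(reference_heights) or not dem_heights:
        raise ValueError("inputs must be non-empty and of equal length")
    # Assumed sign convention: positive error means the DEM is too high.
    diffs = [d - r for d, r in zip(dem_heights, reference_heights)]
    n = len(diffs)
    me = sum(diffs) / n                          # signed bias
    mae = sum(abs(x) for x in diffs) / n         # average error magnitude
    rmse = sqrt(sum(x * x for x in diffs) / n)   # penalises large errors more
    return me, mae, rmse

# Hypothetical heights in metres at four co-located points.
dem = [101.0, 99.0, 103.0, 100.0]
ref = [100.0, 100.0, 100.0, 100.0]
me, mae, rmse = elevation_error_stats(dem, ref)
print(me, mae, rmse)
```

An ME close to the MAE, as in the reported figures (1.35 m versus 1.48 m for TanDEM-X 90), indicates that most errors share the same sign, so the error is mainly a systematic offset rather than random scatter.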

Analysis of the Mean Absolute Error (MAE) and the Root Mean Square Error (RMSE) in Assessing Rounding Model

IOP Conference Series Materials Science and Engineering ◽

10.1088/1757-899x/324/1/012049 ◽

2018 ◽

Vol 324 ◽

pp. 012049 ◽

Cited By ~ 25

Author(s):

Weijie Wang ◽

Yanmin Lu

Keyword(s):

Root Mean Square Error ◽

Mean Square Error ◽

Root Mean Square ◽

Mean Absolute Error ◽

Absolute Error ◽

Mean Square ◽

The Mean

Validation of the procedure: Quantification of the degradation index of Photovoltaic Grid Connection Systems

Journal La Multiapp ◽

10.37899/journallamultiapp.v2i5.492 ◽

2021 ◽

Vol 2 (5) ◽

pp. 8-13

Author(s):

Proenza Y. Roger ◽

Camejo C. José Emilio ◽

Ramos H. Rubén

Keyword(s):

Mean Absolute Error ◽

Absolute Error ◽

Coefficient Of Determination ◽

Mean Square ◽

Degradation Index ◽

Statistical Parameters ◽

Grid Connection ◽

The Mean

The results of validating the procedure “Quantification of the degradation index of Photovoltaic Grid Connection Systems” are presented. The statistical parameters corroborate its accuracy: a coefficient of determination of 0.9896, a root mean square percentage error (RMSPE) of 1.498%, and a mean absolute percentage error (MAPE) of 1.15%, evidencing the precision of the procedure.

Comparison of different approaches to estimate bark volume of industrial wood at disc and log scale

Scientific Reports ◽

10.1038/s41598-021-95188-z ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Ferréol Berendt ◽

Felipe de Miguel-Diez ◽

Evelyn Wallor ◽

Lubomir Blasko ◽

Tobias Cremer

Keyword(s):

Mean Absolute Error ◽

Absolute Error ◽

Solid Wood ◽

Volume Estimation ◽

Coefficient Of Determination ◽

Mean Square ◽

Bark Thickness ◽

Wood Industry ◽

The Mean ◽

Wood Content

Within the wood supply chain, the measurement of roundwood plays a key role due to its high economic impact. While the wood industry mainly processes the solid wood, the bark mostly remains as an industrial by-product. In Central Europe, it is common that the wood is sold over bark but that the price is calculated on a timber volume under bark. However, logs are often measured as stacks and, thus, the volume includes not only the solid wood content but also the bark portion. Mostly, the deduction factors used to estimate the solid wood content are based on bark thickness. The aim of this study was to compare the estimation of bark volume from scaling formulae with the real bark volume, obtained by the xylometric technique. Moreover, the measurements were performed using logs under practice conditions and using discs under laboratory conditions. The mean bark volume was 6.9 dm³ and 26.4 cm³ for the Norway spruce logs and the Scots pine discs, respectively. Whereas the results showed good performance regarding the root mean square error, the coefficient of determination (R²), and the mean absolute error for the estimation of the total volume of discs and logs (over bark), the performance was much lower for the bark volume estimations only.

Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance

Climate Research ◽

10.3354/cr030079 ◽

2005 ◽

Vol 30 ◽

pp. 79-82 ◽

Cited By ~ 1408

Author(s):

CJ Willmott ◽

K Matsuura

Keyword(s):

Root Mean Square Error ◽

Mean Square Error ◽

Root Mean Square ◽

Mean Absolute Error ◽

Model Performance ◽

Absolute Error ◽

Mean Square ◽

Average Model ◽

The Mean

Correlation of Ultrasonographic Estimation of Fetal Weight with Actual Birth Weight as Seen in a Private Specialist Hospital in South East Nigeria

International Journal of Reproductive Medicine ◽

10.1155/2019/3693797 ◽

2019 ◽

Vol 2019 ◽

pp. 1-4

Author(s):

Chisolum Ogechukwu Okafor ◽

Charles Ikechukwu Okafor ◽

Ikechukwu Innocent Mbachu ◽

Izuchukwu Christian Obionwu ◽

Michael Echeta Aronu

Keyword(s):

Birth Weight ◽

Mean Absolute Error ◽

Pearson Correlation ◽

Absolute Error ◽

Fetal Weight ◽

Percentage Error ◽

Mean Error ◽

The Mean ◽

Route Of Delivery ◽

Private Specialist

Background. Ultrasound estimation of fetal weight at term provides vital information for skilled birth attendants to make decisions on the best possible route of delivery of the fetus. This is more pertinent in a setting where women book late for antenatal care. Aim and Objectives. The study evaluated the accuracy of estimation of fetal weight with an ultrasound machine at term. Methods. This was a cross-sectional study conducted at a private specialist hospital in Nigeria. A coded questionnaire was used to retrieve relevant information, which included the last menstrual period, gestational age, parity, and birth weight. Other information obtained included the ultrasound-delivery interval, maternal weight, and route of delivery. Ultrasound was used to estimate the fetal weight. The actual birth weight was determined using a digital baby weighing scale. The data were entered into Microsoft Excel and analyzed using STATA version 14. Statistical significance was considered at p-values less than 0.05. Measures of accuracy evaluated in the statistical analysis included mean error, mean absolute error, mean percentage error, and mean absolute percentage error. Pearson correlation was computed between the estimated ultrasound fetal weight and the actual birth weight. The proportion of estimates within ±10% of actual birth weight was also determined. Results. A total of 170 pregnant women participated in the study. The mean maternal age was 30.77 years ± 5.54. The mean birth weight was 3.47 kg ± 0.47, while the mean estimated ultrasound weight was 3.43 kg ± 0.8. There was a positive correlation between the ultrasound estimated weight and the actual birth weight. The mean ultrasound scan to delivery interval was 0.8 days (with a range of 0–2 days). The study recorded a mean error of estimation of 41.17 grams and a mean absolute error of 258.22 grams. The mean percentage error was 0.65%, while the mean absolute percentage error of estimation was 7.56%. About 72.54% of the estimated weights were within 10% of the actual birth weight. Conclusion. The ultrasound estimated fetal weight correlated with the actual birth weight. Ultrasound estimation of fetal weight should be done when indicated to aid the clinician in making decisions concerning routes of delivery.
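The accuracy measures listed in the Methods (mean error, mean absolute error, mean percentage error, mean absolute percentage error, and the share of estimates within ±10% of the actual weight) can be sketched as follows. The sign convention (estimated minus actual), the function and key names, and the sample weights are assumptions made for illustration; they are not taken from the study, which used STATA.

```python
def weight_estimate_accuracy(estimated, actual, tolerance_pct=10.0):
    """Accuracy measures for estimated versus actual weights (same units)."""
    if len(estimated) != len(actual) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    # Assumed sign convention: positive error means overestimation.
    errors = [e - a for e, a in zip(estimated, actual)]
    pct_errors = [100.0 * err / a for err, a in zip(errors, actual)]
    within = sum(1 for p in pct_errors if abs(p) <= tolerance_pct)
    return {
        "mean_error": sum(errors) / n,
        "mean_absolute_error": sum(abs(x) for x in errors) / n,
        "mean_percentage_error": sum(pct_errors) / n,
        "mean_absolute_percentage_error": sum(abs(p) for p in pct_errors) / n,
        "share_within_tolerance_pct": 100.0 * within / n,
    }

# Hypothetical weights in grams for three births.
stats = weight_estimate_accuracy([3000, 3800, 4600], [3000, 4000, 4000])
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Signed measures (mean error, mean percentage error) let over- and underestimates cancel, so they show bias, whereas the absolute measures show typical error size. That is why a small mean percentage error can coexist with a much larger mean absolute percentage error.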

Augmenting Black Sheep Neighbour Importance for Enhancing Rating Prediction Accuracy in Collaborative Filtering

Applied Sciences ◽

10.3390/app11188369 ◽

2021 ◽

Vol 11 (18) ◽

pp. 8369

Author(s):

Dionisis Margaris ◽

Dimitris Spiliotopoulos ◽

Costas Vassilakis

Keyword(s):

Collaborative Filtering ◽

Prediction Error ◽

Prediction Accuracy ◽

Supplementary Information ◽

Similarity Metrics ◽

Sources Of Information ◽

Black Sheep ◽

Error Metrics ◽

Target User ◽

Rating Prediction

In this work, an algorithm for enhancing the rating prediction accuracy in collaborative filtering is presented, which needs no supplementary information and utilises only the users’ ratings on items. This accuracy enhancement is achieved by augmenting the importance of the opinions of ‘black sheep near neighbours’, which are pairs of near neighbours whose shared opinion on an item deviates from the dominant community opinion on the same item. The presented work substantiates that the weights of near neighbours can be adjusted, based on the degree to which the target user and the near neighbour deviate from the dominant ratings for each item. This concept can be utilised in various other CF algorithms. The experimental evaluation was conducted on six datasets broadly used in CF research, using two user similarity metrics and two rating prediction error metrics. The results show that the proposed technique increases rating prediction accuracy both when used independently and when combined with other CF algorithms. The proposed algorithm is designed to work without requiring any supplementary sources of information, such as user relations in social networks and detailed item descriptions. These findings indicate both the efficacy and the applicability of the proposed work.

Implementation of the Item-Based Collaborative Filtering Method for Providing Recommendations to Prospective Buyers of Smartphone Accessories

Eksplora Informatika ◽

10.30864/eksplora.v9i1.244 ◽

2019 ◽

Vol 9 (1) ◽

pp. 17-27

Author(s):

Bondan Prasetyo ◽

Hanny Haryanto ◽

Setia Astuti ◽

Erna Zuni Astuti ◽

Yuniarsi Rahayu

Keyword(s):

Collaborative Filtering ◽

Hybrid Method ◽

Mean Absolute Error ◽

Pearson Correlation ◽

Weighted Average ◽

Absolute Error ◽

Content Based Filtering

Flazzstore is a store that sells smartphone cases. It offers many different products in many different themes, which makes it difficult for some users to decide which product to choose. A recommender system is needed that can recommend products to users, making it easier for them to choose the product they will buy. This study uses the Item-Based Collaborative Filtering method, which looks for the similarity between one item and other items. The system retrieves the ratings of each item and computes similarity values using the Pearson correlation-based similarity equation. The similarity values are then used to compute a predicted value for each product using the weighted average of deviation equation. Before the predictions are recommended to customers, the Mean Absolute Error (MAE) is computed as the difference between the actual rating and the prediction, and the results are then sorted from smallest to largest for recommendation to the user. The results show a small average MAE of 0.572039; however, execution takes a fairly long time, namely 6.4 seconds. Future research could combine the content-based filtering and collaborative filtering approaches, known as the Item Based Clustering Hybrid Method (ICHM), to obtain better results and shorten the time required.
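The pipeline this abstract describes (Pearson similarity between items, then a weighted average of deviation) can be sketched as below. This is one common textbook form of the prediction formula, assumed here because the abstract does not give the equations. The function names and the ratings are hypothetical and are not Flazzstore data.

```python
from math import sqrt

def item_mean(ratings, item):
    """Average rating an item received across all users who rated it."""
    values = [r[item] for r in ratings.values() if item in r]
    return sum(values) / len(values)

def item_pearson(ratings, item_i, item_j):
    """Pearson correlation between two items over users who rated both."""
    users = [u for u, r in ratings.items() if item_i in r and item_j in r]
    if len(users) < 2:
        return 0.0
    mean_i = sum(ratings[u][item_i] for u in users) / len(users)
    mean_j = sum(ratings[u][item_j] for u in users) / len(users)
    num = sum((ratings[u][item_i] - mean_i) * (ratings[u][item_j] - mean_j)
              for u in users)
    den = (sqrt(sum((ratings[u][item_i] - mean_i) ** 2 for u in users))
           * sqrt(sum((ratings[u][item_j] - mean_j) ** 2 for u in users)))
    return num / den if den else 0.0

def predict_item_based(ratings, user, target):
    """Assumed form: target item's mean plus the similarity-weighted
    average of the user's deviations from each rated item's mean."""
    num = den = 0.0
    for item, rating in ratings[user].items():
        sim = item_pearson(ratings, item, target)
        if sim == 0.0:
            continue
        num += sim * (rating - item_mean(ratings, item))
        den += abs(sim)
    if den == 0:
        return None  # no rated item is comparable to the target
    return item_mean(ratings, target) + num / den

# Hypothetical ratings on a 1-5 scale.
shop_ratings = {
    "ann": {"case_a": 5, "case_b": 4},
    "ben": {"case_a": 2, "case_b": 1, "case_c": 2},
    "cat": {"case_a": 4, "case_b": 3, "case_c": 4},
    "dan": {"case_a": 1, "case_c": 1},
}
prediction = predict_item_based(shop_ratings, "ann", "case_c")
print(prediction)
```

Item-item similarities depend only on the rating matrix, so a production system could precompute and cache them instead of recomputing them for every prediction. Whether that would address the execution time reported in the abstract is not something the abstract allows us to conclude.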
