Model Description of Similarity-Based Recommendation Systems

Entropy ◽  
2019 ◽  
Vol 21 (7) ◽  
pp. 702
Author(s):  
Takafumi Kanamori ◽  
Naoya Osugi

The quality of online services highly depends on the accuracy of the recommendations they can provide to users. Researchers have proposed various similarity measures, based on the assumption that similar people like or dislike similar items or people, in order to improve the accuracy of their services. Additionally, statistical models, such as stochastic block models, have been used to understand network structures. In this paper, we discuss the relationship between similarity-based methods and statistical models using Bernoulli mixture models and the expectation-maximization (EM) algorithm. The Bernoulli mixture model naturally leads to a completely positive matrix as the similarity matrix. We prove that most of the commonly used similarity measures yield completely positive similarity matrices. Based on this relationship, we propose an algorithm to transform a similarity matrix into a Bernoulli mixture model. Such a correspondence provides a statistical interpretation of similarity-based methods. Using this algorithm, we conduct numerical experiments on synthetic data and real-world data provided by an online dating site, and report the efficiency of the recommendation system based on Bernoulli mixture models.
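As an illustration of the correspondence the abstract describes, the sketch below fits a Bernoulli mixture to a binary user-item matrix with EM and builds a similarity matrix from the responsibilities. The toy data and all names are illustrative assumptions, not the authors' algorithm; note that S = R Rᵀ with nonnegative R is completely positive by construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def em_bernoulli_mixture(X, k, n_iter=50):
    """Fit a k-component Bernoulli mixture to binary data X (n x d) via EM."""
    n, d = X.shape
    pi = np.full(k, 1.0 / k)                   # mixing weights
    mu = rng.uniform(0.25, 0.75, size=(k, d))  # Bernoulli parameters per component
    for _ in range(n_iter):
        # E-step: responsibilities r[i, j] = P(component j | x_i)
        log_p = X @ np.log(mu).T + (1 - X) @ np.log(1 - mu).T + np.log(pi)
        log_p -= log_p.max(axis=1, keepdims=True)
        r = np.exp(log_p)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: update weights and Bernoulli parameters
        nk = r.sum(axis=0)
        pi = nk / n
        mu = np.clip((r.T @ X) / nk[:, None], 1e-6, 1 - 1e-6)
    return pi, mu, r

# Toy data: two blocks of users with opposite tastes over 6 items.
X = np.vstack([rng.random((20, 6)) < [0.9, 0.9, 0.9, 0.1, 0.1, 0.1],
               rng.random((20, 6)) < [0.1, 0.1, 0.1, 0.9, 0.9, 0.9]]).astype(float)
pi, mu, r = em_bernoulli_mixture(X, k=2)

# Similarity from responsibilities: S = R R^T is completely positive,
# since R is an explicit nonnegative factorization.
S = r @ r.T
```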

2012 ◽  
Vol 2012 ◽  
pp. 1-25 ◽  
Author(s):  
Gonzalo Vegas-Sanchez-Ferrero ◽  
Santiago Aja-Fernandez ◽  
Cesar Palencia ◽  
Marcos Martin-Fernandez

Several statistical models have been proposed in the literature to describe the behavior of speckle. Among them, the Nakagami distribution has proven to characterize the speckle behavior in tissues very accurately. However, it fails to describe the heavier tails caused by the impulsive response of the speckle. The Generalized Gamma (GG) distribution (which also generalizes the Nakagami distribution) was proposed to overcome these limitations. Despite the advantages of this distribution in terms of goodness of fit, its main drawback is the lack of closed-form maximum likelihood (ML) estimates, which makes the calculation of its parameters difficult and unattractive. In this work, we propose (1) a simple but robust methodology to estimate the ML parameters of GG distributions and (2) a Generalized Gamma Mixture Model (GGMM). These mixture models are of great value in ultrasound imaging when the received signal arises from tissues of different natures. We show that a better speckle characterization is achieved when using the GG distribution and the GGMM rather than other state-of-the-art distributions and mixture models. Results showed the superior performance of the GG distribution in characterizing the speckle of blood and myocardial tissue in ultrasonic images.
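Since SciPy ships a Generalized Gamma family, a hedged sketch of numerical ML fitting (not the authors' estimator; the true parameters below are arbitrary illustrative values) might look like:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Draw "speckle amplitude" samples from a known Generalized Gamma law.
a_true, c_true, scale_true = 2.0, 1.5, 3.0
samples = stats.gengamma.rvs(a_true, c_true, scale=scale_true,
                             size=10_000, random_state=rng)

# Numerical ML fit; loc is pinned at 0 because amplitudes are nonnegative.
a_hat, c_hat, loc_hat, scale_hat = stats.gengamma.fit(samples, floc=0)
```

The shape parameters a and c trade off against each other, which is one reason closed-form ML estimates are unavailable and numerical fitting can be delicate.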


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Xin Wang ◽  
Xinzheng Niu ◽  
Jiahui Zhu ◽  
Zuoyan Liu

Nowadays, large volumes of multimodal data have been collected for analysis. An important type of data is trajectory data, which contains both time and space information. Trajectory analysis and clustering are essential to learning the patterns of moving objects. Computing trajectory similarity is a key aspect of trajectory analysis, but it is very time-consuming. To address this issue, this paper presents an improved branch-and-bound strategy based on time slice segmentation, which reduces the time to obtain the similarity matrix by decreasing the number of distance calculations required to compute similarity. The similarity matrix is then transformed into a trajectory graph, and a community detection algorithm is applied to it for clustering. Extensive experiments were conducted to compare the proposed algorithms with existing similarity measures and clustering algorithms. Results show that the proposed method can effectively mine trajectory cluster information from spatiotemporal trajectories.
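A simplified sketch of the pruning idea (illustrative only, not the paper's exact branch-and-bound): for time-aligned trajectories split into equal-size time slices, the per-slice centroid distance never overestimates the exact mean pointwise distance (by Jensen's inequality), so pairs whose cheap bound already exceeds a threshold can skip the full calculation.

```python
import numpy as np

def mean_dist(P, Q):
    """Exact mean pointwise distance between time-aligned trajectories."""
    return np.linalg.norm(P - Q, axis=1).mean()

def pruned_dist(P, Q, n_slices=4, threshold=1.0):
    """Cheap lower bound first: per-slice centroid distances.
    Since ||mean(p) - mean(q)|| <= mean ||p - q|| (Jensen), the bound
    never overestimates (assuming equal-size slices), so pruning is safe."""
    slices = np.array_split(np.arange(len(P)), n_slices)
    bound = np.mean([np.linalg.norm(P[s].mean(0) - Q[s].mean(0))
                     for s in slices])
    if bound > threshold:
        return None  # pruned: pair is certainly more distant than threshold
    return mean_dist(P, Q)
```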


2019 ◽  
Vol 48 (4) ◽  
pp. 682-693
Author(s):  
Bo Zheng ◽  
Jinsong Hu

Matrix Factorization (MF) is one of the most intuitive and effective methods in the Recommendation System domain. It projects sparse (user, item) interactions into dense feature products, which endows the MF model with strong generality. To enrich these interactions, recent works use auxiliary information about users and items. Despite their effectiveness, a weakness remains: almost all of these methods simply add the feature of the auxiliary information, in the dense latent space, to the feature of the user or item. In this work, we propose a novel model named AANMF, short for Attribute-aware Attentional Neural Matrix Factorization. AANMF combines two main parts: a neural-network-based factorization architecture for modeling the inner product, and an attention-mechanism-based attribute processing cell for attribute handling. Extensive experiments on two real-world data sets demonstrate the robust and superior performance of our model. Notably, we show that our model can handle the attributes of users and items more reasonably. Our implementation of AANMF is publicly available at https://github.com/Holy-Shine/AANMF.
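A schematic, hypothetical forward pass conveying the attention idea (not AANMF itself; all names and dimensions are assumptions): attributes are attention-weighted with respect to the user before being merged with the item embedding, rather than naively summed.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def score(user_vec, item_vec, attr_vecs):
    """Schematic forward pass: attend over the item's attribute embeddings,
    merge the attended summary with the item embedding, then take an
    inner product with the user embedding (sigmoid-squashed)."""
    # Attention weights: relevance of each attribute to this user.
    w = softmax(attr_vecs @ user_vec)
    attended = w @ attr_vecs          # user-specific attribute summary
    merged = item_vec + attended      # attributes weighted *before* merging
    return 1.0 / (1.0 + np.exp(-(user_vec @ merged)))

# Tiny demo: the user cares about latent dimension 0, and attribute 0
# is aligned with that dimension, so it dominates the attention weights.
user = np.zeros(8); user[0] = 2.0
item = np.zeros(8); item[1] = 1.0
attrs = np.eye(8)[:2]                 # two one-hot attribute embeddings
prob = score(user, item, attrs)
```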


2006 ◽  
Vol 36 (2) ◽  
pp. 573-588 ◽  
Author(s):  
John W. Lau ◽  
Tak Kuen Siu ◽  
Hailiang Yang

We employ a class of Bayesian infinite mixture models, first introduced by Lo (1984), to determine the credibility premium for a non-homogeneous insurance portfolio. Bayesian infinite mixture models provide much flexibility in the specification of the claim distribution. We employ a sampling scheme based on the weighted Chinese restaurant process introduced in Lo et al. (1996) to estimate a Bayesian infinite mixture model from the claim data. The Bayesian sampling scheme also provides a systematic way to cluster the claim data, which can offer some insights into the risk characteristics of the policyholders. The estimated credibility premium from the Bayesian infinite mixture model can be written as a linear combination of the prior estimate and the sample mean of the claim data. Estimation results for the Bayesian mixture credibility premiums are presented.
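The linear-combination form mentioned in the abstract can be illustrated with a Buhlmann-style sketch; the weight z = n / (n + k) is a standard credibility formula assumed here for illustration, not the paper's Bayesian estimator.

```python
import numpy as np

def credibility_premium(claims, prior_mean, k):
    """Linear credibility form: z * sample_mean + (1 - z) * prior_mean,
    with Buhlmann-style weight z = n / (n + k). More claim experience
    (larger n) shifts weight from the prior to the observed mean."""
    n = len(claims)
    z = n / (n + k)
    return z * np.mean(claims) + (1 - z) * prior_mean
```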


Fuzzy Systems ◽  
2017 ◽  
pp. 573-608
Author(s):  
Mahfuzur Rahman Siddiquee ◽  
Naimul Haider ◽  
Rashedur M. Rahman

One of the most prominent features that social networks and e-commerce sites now provide is item recommendation. However, the recommendation task is challenging, as a high degree of accuracy is required. This paper analyzes the improvement in movie recommendation using a Fuzzy Inference System (FIS) and an Adaptive Neuro-Fuzzy Inference System (ANFIS). Two similarity measures have been used: one that takes into account similar users' choices, and another that matches the genres of similar movies rated by the user. For similarity calculation, four different techniques are used, namely Euclidean Distance, Manhattan Distance, Pearson Correlation, and Cosine Similarity. The FIS and ANFIS systems are used in decision making. The experiments have been carried out on the MovieLens dataset, and a comparative performance analysis is reported. Experimental results demonstrate that ANFIS outperforms FIS in most cases when the Pearson Correlation metric is used for similarity calculation.
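Minimal reference implementations of the four similarity measures; mapping the two distances to (0, 1] similarities via 1 / (1 + d) is a common convention assumed here, not necessarily the paper's exact choice.

```python
import numpy as np

def euclidean_sim(a, b):
    """Euclidean distance mapped to a (0, 1] similarity."""
    return 1.0 / (1.0 + np.linalg.norm(a - b))

def manhattan_sim(a, b):
    """Manhattan (L1) distance mapped to a (0, 1] similarity."""
    return 1.0 / (1.0 + np.abs(a - b).sum())

def pearson_sim(a, b):
    """Pearson correlation: cosine similarity of the centered vectors."""
    a, b = a - a.mean(), b - b.mean()
    return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

def cosine_sim(a, b):
    """Cosine of the angle between the raw rating vectors."""
    return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
```

Pearson is invariant to shifting and scaling a user's ratings, while cosine is invariant only to scaling, which is why the two can rank neighbors differently.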


1994 ◽  
Vol 31 (1) ◽  
pp. 128-136 ◽  
Author(s):  
Sachin Gupta ◽  
Pradeep K. Chintagunta

The authors propose an extension of the logit-mixture model that defines prior segment-membership probabilities as functions of concomitant (demographic) variables. Using this approach, it is possible to describe how membership in each of the segments (each characterized by a specific profile of brand preferences and marketing-variable sensitivities) is related to household demographic characteristics. An empirical application of the methodology is provided using A.C. Nielsen scanner panel data on catsup. The authors provide a comparison with the results obtained using the extant methodology in estimation and validation samples of households.
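The concomitant-variable idea can be sketched as a multinomial logit over demographics; symbols and coefficient values below are illustrative assumptions.

```python
import numpy as np

def segment_probs(z, Gamma):
    """Prior segment-membership probabilities as a multinomial logit of
    concomitant (demographic) variables z:
        P(s | z) = exp(gamma_s' z) / sum_r exp(gamma_r' z),
    where row s of Gamma holds segment s's coefficient vector gamma_s."""
    u = Gamma @ z
    u -= u.max()            # numerical stability
    e = np.exp(u)
    return e / e.sum()
```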


2020 ◽  
Author(s):  
Adrian Odenweller ◽  
Reik Donner

The quantification of synchronization phenomena of extreme events has recently aroused a great deal of interest in various disciplines. Climatological studies commonly draw on spatially embedded climate networks in conjunction with nonlinear time series analysis. Among the multitude of similarity measures available to construct climate networks, Event Synchronization (ES) and Event Coincidence Analysis (ECA) stand out as two conceptually and computationally simple nonlinear methods. While ES defines synchrony in a data-adaptive local way that does not distinguish between different time scales, ECA requires the selection of a specific time scale for synchrony detection.

Herein, we provide evidence that, due to its parameter-free structure, ES has structural difficulties in disentangling synchrony from serial dependency, whereas ECA is less prone to such biases. We use coupled autoregressive processes to numerically study the sensitivity of results from both methods to changes of coupling and autoregressive parameters. This reveals that ES has difficulties detecting synchronies if events tend to occur temporally clustered, which can be expected from climate time series with extreme events exceeding certain percentiles.

These conceptual concerns are not only reproducible in numerical simulations, but also have implications for real-world data. We construct a climate network from satellite-based precipitation data of the Tropical Rainfall Measuring Mission (TRMM) for the Indian Summer Monsoon, thereby reproducing results of previously published studies. We demonstrate that there is an undesirable link between the fraction of events on subsequent days and the degree density at each grid point of the climate network. This indicates that the explanatory power of ES climate networks might be hampered, since trivial local properties of the underlying time series significantly predetermine the final network structure; this holds especially true for areas that had previously been reported as important for governing monsoon dynamics at large spatial scales. In contrast, ECA does not appear to be as vulnerable to these biases and additionally allows one to trace the spatiotemporal propagation of synchrony in climate networks.

Our analysis rests on corrected versions of both methods that alleviate different normalization problems of the original definitions, which is especially important for short time series. Our findings suggest that careful event detection and diligent preprocessing are recommended when applying ES, while this is less crucial for ECA. Results obtained from ES climate networks therefore need to be interpreted with caution.
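A minimal sketch of an ECA-style precursor coincidence rate (illustrative only; it omits the corrected normalizations the authors describe): the time scale enters explicitly through the coincidence window delta_t.

```python
import numpy as np

def eca_precursor_rate(events_a, events_b, delta_t):
    """Fraction of events in binary series A that are preceded (within
    delta_t time steps, inclusive) by at least one event in series B.
    This is the basic ECA precursor coincidence rate."""
    t_a = np.flatnonzero(events_a)
    t_b = np.flatnonzero(events_b)
    if len(t_a) == 0:
        return 0.0
    hits = sum(np.any((t_b <= t) & (t_b >= t - delta_t)) for t in t_a)
    return hits / len(t_a)
```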


Author(s):  
Jinpeng Chen ◽  
Yu Liu ◽  
Deyi Li

The recommender systems community is paying great attention to diversity as a key quality beyond accuracy in real recommendation scenarios. Multifarious diversity-increasing approaches have been developed in the related literature to enhance recommendation diversity while making personalized recommendations to users. In this work, we present the Gaussian Cloud Recommendation Algorithm (GCRA), a novel method designed to balance accuracy and diversity in personalized top-N recommendation lists in order to capture the user's complete spectrum of tastes. Our proposed algorithm does not require semantic information. Meanwhile, we propose a unified framework that extends traditional CF algorithms with GCRA to improve recommendation performance. Our work builds upon prior research on recommender systems. We show that, though detrimental to average accuracy, our method can capture the user's complete spectrum of interests. Systematic experiments on three real-world data sets have demonstrated the effectiveness of our proposed approach in achieving both accuracy and diversity.
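A generic accuracy/diversity re-ranking sketch, using an MMR-style greedy heuristic shown only to illustrate the trade-off the abstract targets; it is not GCRA.

```python
import numpy as np

def diversified_topn(scores, item_feats, n, lam=0.7):
    """Greedy MMR-style re-ranking: trade predicted score (accuracy)
    against maximum cosine similarity to already-picked items (diversity).
    lam = 1.0 recovers the plain accuracy-ranked top-N list."""
    def cos(u, v):
        return (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
    remaining = list(range(len(scores)))
    picked = []
    while remaining and len(picked) < n:
        def mmr(i):
            div = 0.0 if not picked else max(cos(item_feats[i], item_feats[j])
                                             for j in picked)
            return lam * scores[i] - (1 - lam) * div
        best = max(remaining, key=mmr)
        picked.append(best)
        remaining.remove(best)
    return picked
```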


Author(s):  
HUI ZHANG ◽  
Q. M. JONATHAN WU ◽  
THANH MINH NGUYEN

In this paper, we propose a novel algorithm for feature selection and model detection using Student's t-distribution based on the variational Bayesian (VB) approach. First, our method is based on the Student's t-mixture model (SMM), which has heavier tails than the Gaussian distribution and is therefore less sensitive to outliers and to small numbers of data points, yielding more reliable estimates of the number of components. Second, the number of components, the local feature saliency, and the parameters of the mixture model are estimated simultaneously by variational Bayesian learning. Experimental results using synthetic and real data demonstrate the improved robustness of our approach.
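The robustness conferred by Student's t heavier tails can be illustrated with a one-component EM for its location, where outliers are automatically down-weighted; this is a generic sketch, not the paper's variational algorithm.

```python
import numpy as np

def t_location(x, nu=3.0, n_iter=50):
    """EM for the location of a one-component Student's t with nu degrees
    of freedom: each point gets weight w_i = (nu + 1) / (nu + r_i^2),
    so gross outliers are down-weighted and the estimate is far less
    sensitive to heavy tails than the Gaussian MLE (the sample mean)."""
    mu, sigma2 = np.median(x), np.var(x)
    for _ in range(n_iter):
        r2 = (x - mu) ** 2 / sigma2
        w = (nu + 1.0) / (nu + r2)
        mu = np.sum(w * x) / np.sum(w)
        sigma2 = np.sum(w * (x - mu) ** 2) / len(x)
    return mu
```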

