CRPClustering: An R Package for Bayesian Nonparametric Chinese Restaurant Process Clustering with Entropy

10.7287/peerj.preprints.26533v1 ◽

2018 ◽

Author(s):

Masashi Okada

Keyword(s):

Dirichlet Process ◽

Scientific Method ◽

R Package ◽

Bayesian Nonparametrics ◽

Bayesian Nonparametric ◽

Number Of Clusters ◽

Chinese Restaurant Process ◽

Chinese Restaurant

Clustering is a scientific method for finding groups in data, and many related methods have been studied for a long time. Bayesian nonparametrics is a branch of statistics that can handle models with infinitely many parameters. The Chinese restaurant process is one way to construct the Dirichlet process. Clustering based on the Chinese restaurant process does not require the number of clusters to be chosen in advance; the algorithm adjusts it automatically. In addition to the clusters themselves, this package calculates entropy as a measure of the ambiguity of the clusters.
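As an illustration of why the number of clusters need not be fixed in advance, here is a minimal Python sketch of the Chinese restaurant process prior. It is not the CRPClustering package's implementation; the function name `sample_crp` and the parameter name `alpha` (the concentration parameter) are assumptions made for this example.

```python
import random

def sample_crp(n_customers, alpha, seed=0):
    """Draw table assignments from a Chinese restaurant process.

    alpha > 0 is the concentration parameter: larger values open new tables more often.
    """
    rng = random.Random(seed)
    assignments = []  # assignments[i] = table index of customer i
    table_sizes = []  # table_sizes[k] = number of customers at table k
    for _ in range(n_customers):
        # An existing table k is chosen with weight table_sizes[k];
        # a new table is opened with weight alpha.
        weights = table_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(table_sizes):
            table_sizes.append(1)  # open a new table
        else:
            table_sizes[k] += 1    # join an existing table
        assignments.append(k)
    return assignments, table_sizes

assignments, table_sizes = sample_crp(100, alpha=1.0)
print(len(table_sizes), table_sizes)
```

The number of occupied tables is an outcome of the draw rather than an input, which is the property the abstract refers to.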

CRPClustering: An R Package for Bayesian Nonparametric Chinese Restaurant Process Clustering with Entropy

10.7287/peerj.preprints.26533v2 ◽

2018 ◽

Author(s):

Masashi Okada

Keyword(s):

Dirichlet Process ◽

Scientific Method ◽

R Package ◽

Bayesian Nonparametrics ◽

Bayesian Nonparametric ◽

Number Of Clusters ◽

Chinese Restaurant Process ◽

Chinese Restaurant

Clustering is a scientific method for finding groups in data, and many related methods have been studied for a long time. Bayesian nonparametrics is a branch of statistics that can handle models with infinitely many parameters. The Chinese restaurant process is one way to construct the Dirichlet process. Clustering based on the Chinese restaurant process does not require the number of clusters to be chosen in advance; the algorithm adjusts it automatically. In addition to the clusters themselves, this package calculates entropy as a measure of the ambiguity of the clusters.
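The abstract does not say how the package defines entropy. One common choice, assumed here purely for illustration, is the Shannon entropy of the empirical cluster proportions; the function name `cluster_entropy` is hypothetical.

```python
import math
from collections import Counter

def cluster_entropy(labels):
    """Shannon entropy (in nats) of the cluster proportions implied by labels."""
    counts = Counter(labels)
    n = len(labels)
    # Sum of -p * log(p) over clusters, where p is the share of points in a cluster.
    return -sum((c / n) * math.log(c / n) for c in counts.values())

print(cluster_entropy([0, 0, 1, 1, 2]))
```

Under this definition a single cluster gives zero entropy, and the value grows as points spread more evenly over more clusters.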

A simple proof of Pitman–Yor’s Chinese restaurant process from its stick-breaking representation

Dependence Modeling ◽

10.1515/demo-2019-0003 ◽

2019 ◽

Vol 7 (1) ◽

pp. 45-52

Author(s):

Caroline Lawless ◽

Julyan Arbel

Keyword(s):

Dirichlet Process ◽

Elementary Proof ◽

Measure Theory ◽

Random Measure ◽

Bayesian Nonparametrics ◽

Practical Implementation ◽

Flexible Control ◽

Chinese Restaurant Process ◽

Chinese Restaurant ◽

Long Time

For a long time, the Dirichlet process has been the gold standard discrete random measure in Bayesian nonparametrics. The Pitman-Yor process provides a simple and mathematically tractable generalization, allowing for a very flexible control of the clustering behaviour. Two commonly used representations of the Pitman-Yor process are the stick-breaking process and the Chinese restaurant process. The former is a constructive representation of the process which turns out very handy for practical implementation, while the latter describes the partition distribution induced. Obtaining one from the other is usually done indirectly with use of measure theory. In contrast, we propose here an elementary proof of Pitman-Yor’s Chinese Restaurant process from its stick-breaking representation.
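For orientation, the two representations named in the abstract are usually written as follows. The notation is an assumption of this note, not taken from the paper: $d \in [0,1)$ is the discount, $\theta > -d$ is the strength, and $n_1, \dots, n_K$ are the sizes of the $K$ tables occupied after $n$ customers.

```latex
% Stick-breaking representation: independent Beta sticks define the weights.
V_k \sim \mathrm{Beta}(1 - d,\ \theta + k d), \qquad
p_k = V_k \prod_{j < k} (1 - V_j), \qquad k = 1, 2, \dots

% Chinese restaurant process: seating rule for customer n + 1.
\Pr(\text{join table } j) = \frac{n_j - d}{\theta + n}, \qquad
\Pr(\text{open a new table}) = \frac{\theta + K d}{\theta + n}.
```

Setting $d = 0$ recovers the Dirichlet process with concentration $\theta$.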

A Distance-Dependent Chinese Restaurant Process Based Method for Event Detection on Social Media

Inventions ◽

10.3390/inventions3040080 ◽

2018 ◽

Vol 3 (4) ◽

pp. 80 ◽

Cited By ~ 1

Author(s):

Georgios Palaiokrassas ◽

Athanasios Voulodimos ◽

Antonios Litke ◽

Athanasios Papaoikonomou ◽

Theodora Varvarigou

Keyword(s):

Social Media ◽

Event Detection ◽

Dirichlet Process ◽

Chinese Restaurant Process ◽

Chinese Restaurant ◽

Social Events ◽

The Social ◽

Social Event Detection ◽

Clustering Approach ◽

Processing Steps

In this paper, we propose a method for event detection on social media, which aims at clustering media items into groups of events based on their textual information as well as available metadata. Our approach is based on the distance-dependent Chinese Restaurant Process (ddCRP), a clustering approach resembling the Dirichlet process algorithm. Furthermore, we scrutinize the effectiveness of a series of pre-processing steps in improving the detection performance. We experimentally evaluated our method using the Social Event Detection (SED) dataset of the MediaEval 2013 benchmarking workshop, which pertains to the discovery of social events and their grouping in event-specific clusters. The obtained results indicate that the proposed method attains very good performance rates compared to existing approaches.
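A minimal Python sketch of the ddCRP prior may help show how it differs from the ordinary Chinese restaurant process: each item links to another item with weight given by a decay function of their distance, or to itself with weight alpha, and clusters are the connected components of the links. This is not the authors' method or code; the names `sample_ddcrp`, `clusters_from_links`, `alpha` and `decay`, and the toy one-dimensional data, are assumptions for illustration.

```python
import math
import random

def sample_ddcrp(distances, alpha, decay, seed=0):
    """Draw one link per item: item i links to j with weight decay(distances[i][j]),
    or to itself with weight alpha."""
    rng = random.Random(seed)
    n = len(distances)
    links = []
    for i in range(n):
        weights = [alpha if j == i else decay(distances[i][j]) for j in range(n)]
        links.append(rng.choices(range(n), weights=weights)[0])
    return links

def clusters_from_links(links):
    """Label the connected components of the link graph using union-find."""
    parent = list(range(len(links)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in enumerate(links):
        parent[find(i)] = find(j)
    relabel = {}
    # Components are numbered in order of first appearance.
    return [relabel.setdefault(find(i), len(relabel)) for i in range(len(links))]

# Toy data: five points on a line, with distance = absolute difference.
points = [0.0, 0.1, 0.2, 5.0, 5.1]
distances = [[abs(a - b) for b in points] for a in points]
links = sample_ddcrp(distances, alpha=0.5, decay=lambda d: math.exp(-d))
print(links, clusters_from_links(links))
```

Because links depend on pairwise distances, nearby items are more likely to end up in the same component, which is how metadata such as time or location could be brought into the clustering.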

Robust Cell Image Segmentation via Improved Markov Random Field Based on a Chinese Restaurant Process Model

Journal of Advanced Computational Intelligence and Intelligent Informatics ◽

10.20965/jaciii.2020.p0963 ◽

2020 ◽

Vol 24 (7) ◽

pp. 963-971

Author(s):

Dongming Li ◽

Changming Sun ◽

Su Wei ◽

Yue Yu ◽

Jinhua Yang ◽

...

Keyword(s):

Image Segmentation ◽

Random Field ◽

Markov Random Field ◽

Process Model ◽

Mucosal Cell ◽

Number Of Clusters ◽

Chinese Restaurant Process ◽

Cell Image ◽

Chinese Restaurant ◽

Markov Random

In this paper, a segmentation method for cell images using a Markov random field (MRF) based on a Chinese restaurant process model (CRPM) is proposed. First, we preprocess the cell images, and then we focus on cell image segmentation using the MRF based on a CRPM under a maximum a posteriori (MAP) criterion. The CRPM can be used to estimate the number of clusters in advance, adjusting the number of clusters automatically according to the size of the data. Finally, the conditional iteration mode (CIM) method is used to implement the MRF-based cell image segmentation process. To validate our proposed method, segmentation experiments are performed on oral mucosal cell images. The segmentation results were compared with those of other methods, using precision, Dice, and mean square error (MSE) as the objective evaluation criteria. The experimental results show that our method produces accurate cell image segmentation results and can effectively improve segmentation for the nucleus, binuclear cell, and micronucleus cell. This work will play an important role in cell image recognition and analysis.
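The abstract does not define its three evaluation criteria. The sketch below uses the standard definitions on flattened binary masks (1 = foreground, 0 = background), which is an assumption; the function names are illustrative and not from the paper.

```python
def precision(pred, truth):
    """Share of predicted foreground pixels that are truly foreground."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    return tp / (tp + fp) if tp + fp else 0.0

def dice(pred, truth):
    """Dice coefficient: 2 * |A and B| / (|A| + |B|)."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    return 2 * tp / total if total else 1.0  # two empty masks agree perfectly

def mse(pred, truth):
    """Mean squared error between the two masks."""
    return sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(pred)

pred = [1, 1, 0, 0]
truth = [1, 0, 1, 0]
print(precision(pred, truth), dice(pred, truth), mse(pred, truth))
```

Higher precision and Dice and lower MSE indicate closer agreement with the reference segmentation.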

Novel trajectory clustering method based on distance dependent Chinese restaurant process

PeerJ Computer Science ◽

10.7717/peerj-cs.206 ◽

2019 ◽

Vol 5 ◽

pp. e206

Author(s):

Reza Arfa ◽

Rubiyah Yusof ◽

Parvaneh Shabanzadeh

Keyword(s):

Trajectory Analysis ◽

Traffic Monitoring ◽

Transport Systems ◽

Trajectory Clustering ◽

Number Of Clusters ◽

Chinese Restaurant Process ◽

Chinese Restaurant ◽

Wide Range ◽

Path Modelling ◽

Traditional Approaches

Trajectory clustering and path modelling are two core tasks in intelligent transport systems with a wide range of applications, from modeling drivers’ behavior to traffic monitoring of road intersections. Traditional trajectory analysis considers them as separate tasks, where the system first clusters the trajectories into a known number of clusters and then the path taken in each cluster is modelled. However, such a hierarchy does not allow the knowledge of the path model to be used to improve the performance of trajectory clustering. Based on the distance dependent Chinese restaurant process (DDCRP), a trajectory analysis system that simultaneously performs trajectory clustering and path modelling was proposed. Unlike most traditional approaches where the number of clusters should be known, the proposed method decides the number of clusters automatically. The proposed algorithm was tested on two publicly available trajectory datasets, and the experimental results recorded better performance and considerable improvement in both datasets for the task of trajectory clustering compared to traditional approaches. The study proved that the proposed method is an appropriate candidate to be used for trajectory clustering and path modelling.

Dirichlet Process, Ewens Sampling Formula, and Chinese Restaurant Process

Statistics Based on Dirichlet Processes and Related Topics - SpringerBriefs in Statistics ◽

10.1007/978-981-15-6975-3_2 ◽

2020 ◽

pp. 7-28

Author(s):

Hajime Yamato

Keyword(s):

Dirichlet Process ◽

Ewens Sampling Formula ◽

Chinese Restaurant Process ◽

Sampling Formula ◽

Chinese Restaurant

The nested Chinese restaurant process and Bayesian nonparametric inference of topic hierarchies

Journal of the ACM ◽

10.1145/1667053.1667056 ◽

2010 ◽

Vol 57 (2) ◽

pp. 1-30 ◽

Cited By ~ 237

Author(s):

David M. Blei ◽

Thomas L. Griffiths ◽

Michael I. Jordan

Keyword(s):

Nonparametric Inference ◽

Bayesian Nonparametric ◽

Chinese Restaurant Process ◽

Chinese Restaurant

Estimating the Number of Clusters via Proportional Chinese Restaurant Process

2020 The 3rd International Conference on Machine Learning and Machine Intelligence ◽

10.1145/3426826.3426840 ◽

2020 ◽

Author(s):

Yingying Wen ◽

Hangjin Jiang ◽

Jianwei Yin

Keyword(s):

Number Of Clusters ◽

Chinese Restaurant Process ◽

Chinese Restaurant

Risk, Return and Volatility Feedback: A Bayesian Nonparametric Analysis

Journal of Risk and Financial Management ◽

10.3390/jrfm11030052 ◽

2018 ◽

Vol 11 (3) ◽

pp. 52 ◽

Cited By ~ 2

Author(s):

Mark Jensen ◽

John Maheu

Keyword(s):

Dirichlet Process ◽

Joint Distribution ◽

Nonparametric Model ◽

Bayesian Nonparametric ◽

Excess Returns ◽

Dirichlet Process Prior ◽

Stock Market Returns ◽

Conditional Mean ◽

Risk Return ◽

Volatility Feedback

In this paper, we let the data speak for itself about the existence of volatility feedback and the often-debated risk–return relationship. We do this by modeling the contemporaneous relationship between market excess returns and log-realized variances with a nonparametric, infinitely-ordered, mixture representation of the observables’ joint distribution. Our nonparametric estimator allows for deviation from conditional Gaussianity through non-zero, higher-order moments, such as asymmetric, fat-tailed behavior, along with smooth, nonlinear, risk–return relationships. We use the parsimonious and relatively uninformative Bayesian Dirichlet process prior to overcome the problem of having too many unknowns and not enough observations. Applying our Bayesian nonparametric model to more than a century’s worth of monthly US stock market returns and realized variances, we find strong, robust evidence of volatility feedback. Once volatility feedback is accounted for, we find an unambiguous positive, nonlinear, relationship between expected excess returns and expected log-realized variance. In addition to the conditional mean, volatility feedback impacts the entire joint distribution.
