beta mixture model
Recently Published Documents


TOTAL DOCUMENTS

20
(FIVE YEARS 2)

H-INDEX

5
(FIVE YEARS 0)

2021 ◽  
Vol 16 ◽  
Author(s):  
Zhaoyang Liu ◽  
Hongsheng Yin ◽  
Shutao Chen ◽  
Hui Liu ◽  
Jia Meng ◽  
...  

Background: m6A methylation is a ubiquitous post-transcriptional modification in mammals. MeRIP-seq technology makes it possible to acquire transcriptome-wide m6A data under different conditions. Enzyme-specific regulation gives rise to co-methylation modules in m6A methylation level data. Mining these co-methylation modules can therefore help to unveil the mechanism of m6A methylation modification and its role in the occurrence and development of complex diseases such as cancer. Objective: To develop a clustering algorithm that can effectively mine m6A co-methylation modules. Method: In this study, a novel beta mixture model-based clustering algorithm named MBMM was proposed. It is based on the EM framework and uses method-of-moments estimation in the M-step for parameter estimation, in order to handle high-dimensional, small-sample m6A data. Simulation studies were used to evaluate the clustering performance of the proposed algorithm, which was then applied to mine co-methylation modules from real data. Biological significance correlation analysis was employed to explore whether the clustering results are co-methylation modules. Results and Conclusion: Simulation studies demonstrated that MBMM outperformed other clustering algorithms. In real data, seven co-methylation modules were found by MBMM. Pathway-specific analysis of six m6A-related pathways showed that six co-methylation modules were enriched in these pathways and differed from one another. Substrate-specific analysis of five enzymes revealed that the seven co-methylation modules showed varying degrees of enrichment. Gene Ontology enrichment analysis indicated that these modules may be regulated by enzymes while having potential functional specificity.
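The abstract above describes EM for a beta mixture with a method-of-moments M-step. The following is a minimal one-dimensional sketch of that general idea, not the authors' MBMM implementation. The function names, the quantile-based initialization, and the clipping constants are assumptions made for illustration, and the high-dimensional setting of the paper is not covered.

```python
import math
import random


def beta_logpdf(x, a, b):
    """Log density of Beta(a, b) at x, for 0 < x < 1."""
    return (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
            + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def fit_beta_mixture(xs, k=2, n_iter=100, eps=1e-6):
    """EM for a 1-D beta mixture with a weighted method-of-moments M-step.

    Hypothetical helper for illustration only.
    """
    # Keep data strictly inside (0, 1) so the log density is finite.
    xs = [min(max(x, eps), 1 - eps) for x in xs]
    n = len(xs)

    # Assumed initialization: hard-assign sorted data to k equal-size chunks.
    order = sorted(range(n), key=lambda i: xs[i])
    resp = [[0.0] * k for _ in range(n)]
    for rank, i in enumerate(order):
        resp[i][min(rank * k // n, k - 1)] = 1.0

    weights = [1.0 / k] * k
    params = [(1.0, 1.0)] * k

    for _ in range(n_iter):
        # M-step: match the weighted mean and variance of each component.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            mean = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, xs)) / nj
            # Clamp the variance so both shape parameters stay positive.
            var = min(max(var, 1e-8), mean * (1 - mean) * 0.999)
            common = mean * (1 - mean) / var - 1
            params[j] = (mean * common, (1 - mean) * common)
            weights[j] = nj / n

        # E-step: posterior responsibilities, normalized in log space.
        for i, x in enumerate(xs):
            logs = [math.log(weights[j]) + beta_logpdf(x, *params[j])
                    for j in range(k)]
            top = max(logs)
            probs = [math.exp(v - top) for v in logs]
            total = sum(probs)
            resp[i] = [p / total for p in probs]

    return weights, params, resp


if __name__ == "__main__":
    random.seed(0)
    # Synthetic data from two beta components (not real m6A data).
    data = ([random.betavariate(2, 8) for _ in range(300)]
            + [random.betavariate(8, 2) for _ in range(300)])
    weights, params, _ = fit_beta_mixture(data, k=2)
    print(weights, params)
```

The method-of-moments update has a closed form, which avoids the numerical optimization that a maximum-likelihood M-step for beta parameters would need. That is one plausible reason to prefer it when samples are few.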


2019 ◽  
Author(s):  
Bowen Liu ◽  
Xiaofei Yang ◽  
Tingjie Wang ◽  
Jiadong Lin ◽  
Yongyong Kang ◽  
...  

Motivation: Tumor purity is a fundamental property of each cancer sample and affects downstream investigations. Current tumor purity estimation methods either require a matched normal sample or report moderately high tumor purity even on normal samples. It is critical to develop a novel computational approach that estimates tumor purity with sufficient precision from tumor-only samples. Results: In this study, we developed MEpurity, a beta mixture model-based algorithm, to estimate tumor purity from tumor-only Illumina Infinium 450k methylation microarray data. We applied MEpurity to both The Cancer Genome Atlas (TCGA) cancer data and cancer cell line data, demonstrating that MEpurity reports low tumor purity on normal samples and results comparable with other state-of-the-art methods on tumor samples. Availability and implementation: MEpurity is a C++ program available at https://github.com/xjtu-omics/MEpurity. Supplementary information: Supplementary data are available at Bioinformatics online.


Risks ◽  
2019 ◽  
Vol 7 (1) ◽  
pp. 19 ◽  
Author(s):  
Hui Ye ◽  
Anthony Bellotti

Based on a rich dataset of recoveries donated by a debt collection business, recovery rates for non-performing loans taken from a single European country are modelled using linear regression, linear regression with Lasso, beta regression and inflated beta regression. We also propose a two-stage model: a beta mixture model combined with a logistic regression model. The proposed model allowed us to model the multimodal distribution we found for these recovery rates. All models were built using loan characteristics, default data and collections data prior to purchase by the debt collection business. The intended use of the models was to estimate future recovery rates for improved risk assessment, capital requirement calculations and bad debt management. The models were compared using a range of quantitative performance measures under K-fold cross-validation. Among all the models, we found that the proposed two-stage beta mixture model performed best.


Author(s):  
Anthony Bellotti ◽  
Hui Ye

Based on a rich data set of recoveries donated by a debt collection business, recovery rates for non-performing loans taken from a single European country are modelled using linear regression, linear regression with Lasso, beta regression and inflated beta regression. We also propose a two-stage model: a beta mixture model combined with a logistic regression model. The proposed model allows us to model the multimodal distribution we find for these recovery rates. All models are built using loan characteristics, default data and collections data prior to purchase by the debt collection business. The intended use of the models is to estimate future recovery rates for improved risk assessment, capital requirement calculations and bad debt management. The models are compared using a range of quantitative performance measures under K-fold cross-validation. Among all the models, we find that the proposed two-stage beta mixture model performs best.


IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 28383-28391 ◽  
Author(s):  
Daniel Alejandro Hernandez-Contreras ◽  
Hayde Peregrina-Barreto ◽  
Jose De Jesus Rangel-Magdaleno ◽  
Felipe Orihuela-Espina
