R-universe - mqbssppe (Panagiotis Papastamoulis)

label.switching - Relabelling MCMC Outputs of Mixture Models

The Bayesian estimation of mixture models (and more general hidden Markov models) suffers from the label switching phenomenon, making the MCMC output non-identifiable. This package can be used in order to deal with this problem using various relabelling algorithms.

Last updated

3.64 score 1 stars 13 dependents 113 scripts 957 downloads

fabMix - Overfitting Bayesian Mixtures of Factor Analyzers with Parsimonious Covariance and Unknown Number of Components

Model-based clustering of multivariate continuous data using Bayesian mixtures of factor analyzers (Papastamoulis (2019) <DOI:10.1007/s11222-019-09891-z> (2018) <DOI:10.1016/j.csda.2018.03.007>). The number of clusters is estimated using overfitting mixture models (Rousseau and Mengersen (2011) <DOI:10.1111/j.1467-9868.2011.00781.x>): suitable prior assumptions ensure that asymptotically the extra components will have zero posterior weight, therefore, the inference is based on the ``alive'' components. A Gibbs sampler is implemented in order to (approximately) sample from the posterior distribution of the overfitting mixture. A prior parallel tempering scheme is also available, which allows to run multiple parallel chains with different prior distributions on the mixture weights. These chains run in parallel and can swap states using a Metropolis-Hastings move. Eight different parameterizations give rise to parsimonious representations of the covariance per cluster (following Mc Nicholas and Murphy (2008) <DOI:10.1007/s11222-008-9056-0>). The model parameterization and number of factors is selected according to the Bayesian Information Criterion. Identifiability issues related to label switching are dealt by post-processing the simulated output with the Equivalence Classes Representatives algorithm (Papastamoulis and Iliopoulos (2010) <DOI:10.1198/jcgs.2010.09008>, Papastamoulis (2016) <DOI:10.18637/jss.v069.c01>).

Last updated

openblascppopenmp

2.09 score 1 dependents 41 scripts 363 downloads

beast - Bayesian Estimation of Change-Points in the Slope of Multivariate Time-Series

Assume that a temporal process is composed of contiguous segments with differing slopes and replicated noise-corrupted time series measurements are observed. The unknown mean of the data generating process is modelled as a piecewise linear function of time with an unknown number of change-points. The package infers the joint posterior distribution of the number and position of change-points as well as the unknown mean parameters per time-series by MCMC sampling. A-priori, the proposed model uses an overfitting number of mean parameters but, conditionally on a set of change-points, only a subset of them influences the likelihood. An exponentially decreasing prior distribution on the number of change-points gives rise to a posterior distribution concentrating on sparse representations of the underlying sequence, but also available is the Poisson distribution. See Papastamoulis et al (2019) <doi:10.1515/ijb-2018-0052> for a detailed presentation of the method.

Last updated

2.00 score 3 scripts 325 downloads

poisson.glm.mix - Fit High Dimensional Mixtures of Poisson GLMs

Mixtures of Poisson Generalized Linear Models for high dimensional count data clustering. The (multivariate) responses can be partitioned into set of blocks. Three different parameterizations of the linear predictor are considered. The models are estimated according to the EM algorithm with an efficient initialization scheme <doi:10.1016/j.csda.2014.07.005>.

Last updated

1.52 score 1 dependents 11 scripts 224 downloads

factor.switching - Post-Processing MCMC Outputs of Bayesian Factor Analytic Models

A well known identifiability issue in factor analytic models is the invariance with respect to orthogonal transformations. This problem burdens the inference under a Bayesian setup, where Markov chain Monte Carlo (MCMC) methods are used to generate samples from the posterior distribution. The package applies a series of rotation, sign and permutation transformations (Papastamoulis and Ntzoufras (2022) <DOI:10.1007/s11222-022-10084-4>) into raw MCMC samples of factor loadings, which are provided by the user. The post-processed output is identifiable and can be used for MCMC inference on any parametric function of factor loadings. Comparison of multiple MCMC chains is also possible.

Last updated

1.48 score 1 stars 1 dependents 5 scripts 335 downloads

BayesBinMix - Bayesian Estimation of Mixtures of Multivariate Bernoulli Distributions

Fully Bayesian inference for estimating the number of clusters and related parameters to heterogeneous binary data.

Last updated

1.32 score 21 scripts 370 downloads

bayesCureRateModel - Bayesian Cure Rate Modeling for Time-to-Event Data

A fully Bayesian approach in order to estimate a general family of cure rate models under the presence of covariates, see Papastamoulis and Milienos (2024) <doi:10.1007/s11749-024-00942-w> and Papastamoulis and Milienos (2024b) <doi:10.48550/arXiv.2409.10221>. The promotion time can be modelled (a) parametrically using typical distributional assumptions for time to event data (including the Weibull, Exponential, Gompertz, log-Logistic distributions), or (b) semiparametrically using finite mixtures of distributions. In both cases, user-defined families of distributions are allowed under some specific requirements. Posterior inference is carried out by constructing a Metropolis-coupled Markov chain Monte Carlo (MCMC) sampler, which combines Gibbs sampling for the latent cure indicators and Metropolis-Hastings steps with Langevin diffusion dynamics for parameter updates. The main MCMC algorithm is embedded within a parallel tempering scheme by considering heated versions of the target posterior distribution.

Last updated

openblascpp

1.30 score 1 scripts 434 downloads

multinomialLogitMix - Clustering Multinomial Count Data under the Presence of Covariates

Methods for model-based clustering of multinomial counts under the presence of covariates using mixtures of multinomial logit models, as implemented in Papastamoulis (2023) <DOI:10.1007/s11634-023-00547-5>. These models are estimated under a frequentist as well as a Bayesian setup using the Expectation-Maximization algorithm and Markov chain Monte Carlo sampling (MCMC), respectively. The (unknown) number of clusters is selected according to the Integrated Completed Likelihood criterion (for the frequentist model), and estimating the number of non-empty components using overfitting mixture models after imposing suitable sparse prior assumptions on the mixing proportions (in the Bayesian case), see Rousseau and Mengersen (2011) <DOI:10.1111/j.1467-9868.2011.00781.x>. In the latter case, various MCMC chains run in parallel and are allowed to switch states. The final MCMC output is suitably post-processed in order to undo label switching using the Equivalence Classes Representatives (ECR) algorithm, as described in Papastamoulis (2016) <DOI:10.18637/jss.v069.c01>.

Last updated

cppopenmp

1.00 score 308 downloads