Dynamic Expectation-Maximization algorithms for Mixed-type Data - Centre de mathématiques appliquées (CMAP)
Preprints, Working Papers, ... Year : 2024

Dynamic Expectation-Maximization algorithms for Mixed-type Data


Modelling mixed-type data remains difficult because of the heterogeneity of the data encountered. When clustering is the objective, many existing methods perform well, but they make model inference and a posteriori exploitation of the results difficult, if not impossible. In this article we propose methodological developments of mixture models designed for mixed-type data. The component distributions of the continuous attributes can be Gaussian, Student or Shifted Asymmetric Laplace. Categorical or discrete attributes, assumed independent conditionally on class membership, can follow Bernoulli, Multinomial or Poisson distributions. The number of classes and the parameters are estimated jointly by EM-like algorithms that we have adapted to this setting. We show that our dynamic algorithms recover the true number of classes and correctly estimate the parameters of both the discrete and the continuous distributions. We also highlight the benefit of regularization for improving performance when the sample size is insufficient for the complexity of the model. Finally, our models are tested on real datasets from the literature, confirming that the objective of jointly estimating the number of components and the model parameters is achieved.
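To make the model structure concrete, the following is a minimal sketch of a plain EM loop for one of the simplest configurations mentioned above: Gaussian continuous attributes and Bernoulli binary attributes, independent conditionally on the class. It is not the authors' dynamic algorithm: the number of classes K is fixed here rather than estimated, the Gaussian covariance is assumed diagonal, and no regularization is applied. The names em_mixed, log_component_density, Xc and Xb are hypothetical and chosen only for illustration.

```python
import numpy as np

def log_component_density(Xc, Xb, mean, var, p):
    """Log-density of one class: diagonal Gaussian on the continuous
    columns Xc times independent Bernoulli on the binary columns Xb."""
    log_gauss = -0.5 * np.sum(np.log(2 * np.pi * var) + (Xc - mean) ** 2 / var, axis=1)
    log_bern = np.sum(Xb * np.log(p) + (1 - Xb) * np.log1p(-p), axis=1)
    return log_gauss + log_bern

def em_mixed(Xc, Xb, K, n_iter=50, seed=0, eps=1e-6):
    """Plain EM with a fixed number of classes K (an assumption of this sketch)."""
    rng = np.random.default_rng(seed)
    n = Xc.shape[0]
    # Random soft initialisation of the responsibilities (rows sum to 1).
    resp = rng.dirichlet(np.ones(K), size=n)
    for _ in range(n_iter):
        # M-step: weighted updates of proportions and class parameters.
        nk = resp.sum(axis=0) + eps
        weights = nk / nk.sum()
        means = (resp.T @ Xc) / nk[:, None]
        variances = np.maximum((resp.T @ Xc ** 2) / nk[:, None] - means ** 2, eps)
        probs = np.clip((resp.T @ Xb) / nk[:, None], eps, 1 - eps)
        # E-step: posterior class probabilities, computed in log space.
        log_r = np.stack(
            [np.log(weights[k])
             + log_component_density(Xc, Xb, means[k], variances[k], probs[k])
             for k in range(K)],
            axis=1,
        )
        log_r -= log_r.max(axis=1, keepdims=True)  # guard against underflow
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)
    return weights, means, variances, probs, resp
```

Because the attributes are conditionally independent given the class, the log-density of a class is simply the sum of the continuous and discrete log-densities, which is what makes it straightforward to swap in other families (for example Poisson counts) in such a sketch.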
Main file
paper_DEM_MD_3105.pdf (1.13 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-04510689 , version 1 (19-03-2024)
hal-04510689 , version 2 (31-05-2024)



  • HAL Id : hal-04510689 , version 2


Solange Pruilh, Stéphanie Allassonnière. Dynamic Expectation-Maximization algorithms for Mixed-type Data. 2024. ⟨hal-04510689v2⟩