Filling the gaps: Gaussian mixture models from noisy, truncated or incomplete samples [IMA]

http://arxiv.org/abs/1611.05806


We extend the common mixtures-of-Gaussians density estimation approach to account for known sample incompleteness by simultaneous imputation from the current model. The method, called GMMis, generalizes existing Expectation-Maximization techniques for truncated data to arbitrary truncation geometries and probabilistic rejection. It can incorporate a uniform background distribution as well as independent multivariate normal measurement errors for each of the observed samples, and it recovers an estimate of the error-free distribution from which both observed and unobserved samples are drawn. We compare GMMis to the standard Gaussian mixture model for simple test cases with different types of incompleteness, and apply it to observational data from the NASA Chandra X-ray telescope. The Python code can perform density estimation with millions of samples and thousands of model components, and is released as an open-source package at https://github.com/pmelchior/pyGMMis
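The idea of imputing unobserved samples from the current model can be illustrated with a minimal one-dimensional sketch. This is not the pyGMMis implementation or its API: it uses only NumPy, and it ignores the background and measurement-error terms. The function names (`sample_gmm`, `em_step`, `fit_with_imputation`), the selection function `selected`, and the cut at x < 0.5 are all hypothetical choices made for illustration. In each iteration, the sketch draws samples from the current model, keeps those that the known selection function would have rejected, and runs a standard EM update on the observed and imputed samples together.

```python
import numpy as np

def sample_gmm(rng, weights, means, sigmas, n):
    """Draw n samples from a 1D Gaussian mixture."""
    comp = rng.choice(len(weights), size=n, p=weights)
    return rng.normal(means[comp], sigmas[comp])

def em_step(x, weights, means, sigmas):
    """One standard EM update for a 1D Gaussian mixture."""
    # E-step: log of weight * Gaussian density per sample and component
    diff = x[:, None] - means[None, :]
    logp = (np.log(weights) - np.log(sigmas * np.sqrt(2 * np.pi))
            - 0.5 * (diff / sigmas) ** 2)
    logp -= logp.max(axis=1, keepdims=True)  # numerical stability
    resp = np.exp(logp)
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: weighted moments
    nk = resp.sum(axis=0)
    new_weights = nk / len(x)
    new_means = (resp * x[:, None]).sum(axis=0) / nk
    var = (resp * (x[:, None] - new_means) ** 2).sum(axis=0) / nk
    return new_weights, new_means, np.sqrt(np.maximum(var, 1e-6))

def fit_with_imputation(x_obs, selected, k=1, n_iter=100, seed=0):
    """EM with imputation of samples removed by a known selection.

    `selected(x)` is an assumed helper returning a boolean array that is
    True where a sample at x would have been observed.
    """
    rng = np.random.default_rng(seed)
    weights = np.full(k, 1.0 / k)
    means = rng.choice(x_obs, size=k, replace=False)
    sigmas = np.full(k, x_obs.std())
    for _ in range(n_iter):
        # Estimate the observed fraction under the current model
        probe = sample_gmm(rng, weights, means, sigmas, 10_000)
        f_obs = max(selected(probe).mean(), 0.05)  # floor avoids huge draws
        # Draw a full sample whose observed part matches len(x_obs) on
        # average, and keep only the part the selection would have removed
        n_draw = int(round(len(x_obs) / f_obs))
        draws = sample_gmm(rng, weights, means, sigmas, n_draw)
        x_imputed = draws[~selected(draws)]
        x_all = np.concatenate([x_obs, x_imputed])
        weights, means, sigmas = em_step(x_all, weights, means, sigmas)
    return weights, means, sigmas

# Toy example: a standard normal truncated at x < 0.5 (hypothetical cut)
rng = np.random.default_rng(42)
x_true = rng.normal(0.0, 1.0, size=8000)
selected = lambda x: x < 0.5
x_obs = x_true[selected(x_true)]

naive_mean = x_obs.mean()  # biased low by the truncation
w_fit, mu_fit, sigma_fit = fit_with_imputation(x_obs, selected, k=1)
print("naive mean:", naive_mean, "imputed-fit mean:", mu_fit[0])
```

Because the imputed samples are redrawn at every iteration, the fitted parameters fluctuate slightly around their converged values instead of settling exactly. The sketch uses a hard cut, so probabilistic rejection would need a selection function that returns acceptance probabilities rather than booleans.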

P. Melchior and A. Goulding
Fri, 18 Nov 16

Comments: 12 pages, 6 figures, submitted to Computational Statistics & Data Analysis