Bayesian Methods in Cosmology [CEA]

These notes present an overview of Bayesian statistics, covering the underlying concepts and the application methodology that will be useful to astronomers seeking to analyse and interpret a wide variety of data about the Universe. The level starts from elementary notions, without assuming any previous knowledge of statistical methods, and then progresses to more advanced, research-level topics. After an introduction to the importance of statistical inference for the physical sciences, elementary notions of probability theory and inference are introduced and explained. Bayesian methods are then presented, starting from the meaning of Bayes' theorem and its use as an inferential engine, including a discussion of priors and posterior distributions. Numerical methods for generating samples from arbitrary posteriors (including Markov Chain Monte Carlo and Nested Sampling) are then covered. The last section deals with Bayesian model selection and how it is used to assess the performance of models, and contrasts it with the classical p-value approach. A series of exercises of various levels of difficulty is designed to further the understanding of the theoretical material; fully worked-out solutions are included for most of them.
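
As a small illustration of the sampling methods mentioned above, here is a minimal random-walk Metropolis sketch. It is not taken from the lecture notes: the toy target (a standard normal posterior), the function names, the step size and the seed are all assumptions chosen for illustration.

```python
import math
import random


def log_posterior(theta):
    # Assumed toy target: unnormalised log-density of a standard normal.
    return -0.5 * theta * theta


def metropolis(log_post, start, step, n_samples, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional target."""
    rng = random.Random(seed)
    theta = start
    logp = log_post(theta)
    chain = []
    for _ in range(n_samples):
        # Symmetric Gaussian proposal centred on the current point.
        proposal = theta + rng.gauss(0.0, step)
        logp_prop = log_post(proposal)
        # Accept with probability min(1, p(proposal) / p(theta)).
        if rng.random() < math.exp(min(0.0, logp_prop - logp)):
            theta, logp = proposal, logp_prop
        # A rejected proposal repeats the current point in the chain.
        chain.append(theta)
    return chain


chain = metropolis(log_posterior, start=0.0, step=2.0, n_samples=20000)
mean = sum(chain) / len(chain)
var = sum((t - mean) ** 2 for t in chain) / len(chain)
print(f"posterior mean ~ {mean:.3f}, variance ~ {var:.3f}")
```

For this target the sample mean and variance should settle near 0 and 1. Only the unnormalised posterior is needed, which is why such samplers are useful when the normalising evidence is hard to compute.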

Read this paper on arXiv…

R. Trotta
Mon, 9 Jan 17

Comments: 86 pages, 16 figures. Lecture notes for the 44th Saas Fee Advanced Course on Astronomy and Astrophysics, “Cosmology with wide-field surveys” (March 2014), to be published by Springer. Comments welcome

Accelerating cross-validation with total variation and its application to super-resolution imaging [CL]

We develop an approximation formula for the cross-validation error (CVE) of a sparse linear regression penalized by $\ell_1$-norm and total variation terms. The formula is based on a perturbative expansion that exploits the large size of both the data dimensionality and the model. The developed formula allows us to reduce the necessary computational cost of the CVE evaluation significantly. The practicality of the formula is tested through application to simulated black-hole image reconstruction on the event-horizon scale with super resolution. The results demonstrate that our approximation reproduces, with reasonably good precision, the CVE values obtained by carrying out cross-validation explicitly.

Read this paper on arXiv…

T. Obuchi, S. Ikeda, K. Akiyama, et al.
Wed, 23 Nov 16

Comments: 5 pages, 1 figure

Filling the gaps: Gaussian mixture models from noisy, truncated or incomplete samples [IMA]

We extend the common mixtures-of-Gaussians density estimation approach to account for a known sample incompleteness by simultaneous imputation from the current model. The method, called GMMis, generalizes existing Expectation-Maximization techniques for truncated data to arbitrary truncation geometries and probabilistic rejection. It can incorporate a uniform background distribution as well as independent multivariate normal measurement errors for each of the observed samples, and recovers an estimate of the error-free distribution from which both observed and unobserved samples are drawn. We compare GMMis to the standard Gaussian mixture model for simple test cases with different types of incompleteness, and apply it to observational data from the NASA Chandra X-ray telescope. The Python code is capable of performing density estimation with millions of samples and thousands of model components and is released as an open-source package at

Read this paper on arXiv…

P. Melchior and A. Goulding
Fri, 18 Nov 16

Comments: 12 pages, 6 figures, submitted to Computational Statistics & Data Analysis

Bayes Factors via Savage-Dickey Supermodels [IMA]

We outline a new method to compute the Bayes Factor for model selection which bypasses the Bayesian Evidence. Our method combines multiple models into a single, nested, Supermodel using one or more hyperparameters. Since the models are now nested, the Bayes Factors between the models can be efficiently computed using the Savage-Dickey Density Ratio (SDDR). In this way model selection becomes a problem of parameter estimation. We consider two ways of constructing the supermodel in detail: one based on combined models, and a second based on combined likelihoods. We report on these two approaches for a Gaussian linear model, for which the Bayesian evidence can be calculated analytically, and for a toy nonlinear problem. Unlike the combined-model approach, where a standard Markov Chain Monte Carlo (MCMC) sampler struggles, the combined-likelihood approach fares much better in providing a reliable estimate of the log-Bayes Factor. This scheme potentially opens the way to computationally efficient ways to compute Bayes Factors in high dimensions that exploit the good scaling properties of MCMC, as compared to methods such as nested sampling that fail for high dimensions.
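
To make the Savage-Dickey Density Ratio concrete, here is a minimal sketch for the simplest nested case. It does not reproduce the paper's supermodel construction. The setup is an assumption for illustration: Gaussian data with known noise `sigma`, a simpler model M0 that fixes the mean at zero, and an extended model M1 with a zero-centred Gaussian prior of width `tau` on the mean. All function names are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    # Density of a univariate normal with the given mean and variance.
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def sddr_bayes_factor(ybar, n, sigma, tau):
    """B01 via Savage-Dickey: posterior over prior density at mu = 0 under M1."""
    s2 = sigma ** 2 / n                      # variance of the sample mean
    post_var = 1.0 / (1.0 / s2 + 1.0 / tau ** 2)
    post_mean = post_var * ybar / s2         # conjugate Gaussian update
    return normal_pdf(0.0, post_mean, post_var) / normal_pdf(0.0, 0.0, tau ** 2)


def evidence_ratio(ybar, n, sigma, tau):
    """B01 computed directly as the ratio of the two analytic evidences."""
    s2 = sigma ** 2 / n
    return normal_pdf(ybar, 0.0, s2) / normal_pdf(ybar, 0.0, s2 + tau ** 2)


b_sddr = sddr_bayes_factor(ybar=0.3, n=25, sigma=1.0, tau=2.0)
b_direct = evidence_ratio(ybar=0.3, n=25, sigma=1.0, tau=2.0)
print(b_sddr, b_direct)  # the two numbers should agree
```

In this conjugate case the posterior is known in closed form. In a realistic problem the posterior density at the nested point would be estimated from MCMC samples, which is how the ratio turns model selection into parameter estimation.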

Read this paper on arXiv…

A. Mootoovaloo, B. Bassett and M. Kunz
Fri, 9 Sep 16

Comments: 24 pages, 11 Figures

Generalisations of Fisher Matrices [CEA]

Fisher matrices play an important role in experimental design and in data analysis. Their primary role is to make predictions for the inference of model parameters – both their errors and covariances. In this short review, I outline a number of extensions to the simple Fisher matrix formalism, covering a number of recent developments in the field. These are: (a) situations where the data (in the form of (x,y) pairs) have errors in both x and y; (b) modifications to parameter inference in the presence of systematic errors, or through fixing the values of some model parameters; (c) Derivative Approximation for LIkelihoods (DALI) – higher-order expansions of the likelihood surface, going beyond the Gaussian shape approximation; (d) extensions of the Fisher-like formalism, to treat model selection problems with Bayesian evidence.
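
For readers new to the basic formalism that the review extends, here is a minimal sketch of a Fisher forecast. The model is an assumption chosen for illustration: a straight line y = a + b x with independent Gaussian errors in y only. The function names are hypothetical, and none of the extensions (a) to (d) are implemented.

```python
import math


def fisher_matrix_line(xs, sigmas):
    """Fisher matrix for (a, b) in y = a + b*x with Gaussian errors sigma_i on y."""
    f_aa = f_ab = f_bb = 0.0
    for x, s in zip(xs, sigmas):
        w = 1.0 / s ** 2          # inverse-variance weight of this data point
        f_aa += w                 # (dy/da)^2 = 1
        f_ab += w * x             # (dy/da)(dy/db) = x
        f_bb += w * x * x         # (dy/db)^2 = x^2
    return [[f_aa, f_ab], [f_ab, f_bb]]


def invert_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def marginal_errors(xs, sigmas):
    # Forecast errors with the other parameter marginalised: sqrt(diag(F^-1)).
    cov = invert_2x2(fisher_matrix_line(xs, sigmas))
    return math.sqrt(cov[0][0]), math.sqrt(cov[1][1])


def conditional_error_a(xs, sigmas):
    # Forecast error on a when b is held fixed: 1 / sqrt(F_aa).
    return 1.0 / math.sqrt(fisher_matrix_line(xs, sigmas)[0][0])


xs, sigmas = [0.0, 1.0, 2.0], [1.0, 1.0, 1.0]
print(marginal_errors(xs, sigmas), conditional_error_a(xs, sigmas))
```

The marginalised error on a is never smaller than the error obtained with b fixed, which is the simplest instance of how fixing parameters changes the inference.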

Read this paper on arXiv…

A. Heavens
Wed, 22 Jun 16

Comments: Invited review article for Entropy special issue on ‘Applications of Fisher Information in Sciences’. Accepted version

Looking for a Needle in a Haystack? Look Elsewhere! A statistical comparison of approximate global p-values [CL]

The search for new significant peaks over an energy spectrum often involves a statistical multiple hypothesis testing problem. Separate tests of hypothesis are conducted at different locations producing an ensemble of local p-values, the smallest of which is reported as evidence for the new resonance. Unfortunately, controlling the false detection rate (type I error rate) of such procedures may lead to excessively stringent acceptance criteria. In the recent physics literature, two promising statistical tools have been proposed to overcome these limitations. In 2005, a method to “find needles in haystacks” was introduced by Pilla et al. [1], and a second method was later proposed by Gross and Vitells [2] in the context of the “look elsewhere effect” and trial factors. We show that, for relatively small sample sizes, the former leads to an artificial inflation of statistical power that stems from an increase in the false detection rate, whereas the two methods exhibit similar performance for large sample sizes. Finally, we provide general guidelines to select between statistical procedures for signal detection with respect to the specifics of the physics problem under investigation.
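
The gap between local and global p-values can be illustrated with the two textbook corrections below. This is a baseline sketch only: it implements neither the method of Pilla et al. nor that of Gross and Vitells, and the function names and the assumption of independent tests are mine.

```python
def sidak_global_p(p_local_min, n_tests):
    """Global p-value for the smallest of n independent local p-values."""
    return 1.0 - (1.0 - p_local_min) ** n_tests


def bonferroni_global_p(p_local_min, n_tests):
    """Bonferroni upper bound on the global p-value (valid under any dependence)."""
    return min(1.0, n_tests * p_local_min)


# A local p-value of 1e-4 found after scanning 100 search locations.
p_local, n_locations = 1e-4, 100
print(sidak_global_p(p_local, n_locations))       # roughly 1e-2
print(bonferroni_global_p(p_local, n_locations))  # the bound n * p, here about 1e-2
```

Tests at neighbouring locations of a spectrum are usually correlated, so corrections of this kind tend to be conservative, which is the stringency problem the abstract refers to.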

Read this paper on arXiv…

S. Algeri, J. Conrad, D. Dyk, et al.
Fri, 12 Feb 16

Comments: Submitted to EPJ C

Preprocessing Solar Images while Preserving their Latent Structure [IMA]

Telescopes such as the Atmospheric Imaging Assembly aboard the Solar Dynamics Observatory, a NASA satellite, collect massive streams of high resolution images of the Sun through multiple wavelength filters. Reconstructing pixel-by-pixel thermal properties based on these images can be framed as an ill-posed inverse problem with Poisson noise, but this reconstruction is computationally expensive and there is disagreement among researchers about what regularization or prior assumptions are most appropriate. This article presents an image segmentation framework for preprocessing such images in order to reduce the data volume while preserving as much thermal information as possible for later downstream analyses. The resulting segmented images reflect thermal properties but do not depend on solving the ill-posed inverse problem. This allows users to avoid the Poisson inverse problem altogether or to tackle it on each of $\sim$10 segments rather than on each of $\sim$10$^7$ pixels, reducing computing time by a factor of $\sim$10$^6$. We employ a parametric class of dissimilarities that can be expressed as cosine dissimilarity functions or Hellinger distances between nonlinearly transformed vectors of multi-passband observations in each pixel. We develop a decision theoretic framework for choosing the dissimilarity that minimizes the expected loss that arises when estimating identifiable thermal properties based on segmented images rather than on a pixel-by-pixel basis. We also examine the efficacy of different dissimilarities for recovering clusters in the underlying thermal properties. The expected losses are computed under scientifically motivated prior distributions. Two simulation studies guide our choices of dissimilarity function. We illustrate our method by segmenting images of a coronal hole observed on 26 February 2015.
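
The two dissimilarity families named above are closely related, as the sketch below shows for plain non-negative intensity vectors. It does not reproduce the paper's parametric class or its nonlinear transformations; the function names and the example pixel values are hypothetical.

```python
import math


def cosine_dissimilarity(u, v):
    """One minus the cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def hellinger_distance(u, v):
    """Hellinger distance between two non-negative vectors, each normalised to sum to 1."""
    total_u, total_v = sum(u), sum(v)
    # Bhattacharyya coefficient of the two normalised vectors.
    bc = sum(math.sqrt((a / total_u) * (b / total_v)) for a, b in zip(u, v))
    return math.sqrt(max(0.0, 1.0 - bc))


# Hypothetical intensities of two pixels in three passbands.
pixel_a = [120.0, 40.0, 15.0]
pixel_b = [90.0, 70.0, 30.0]

h = hellinger_distance(pixel_a, pixel_b)
c = cosine_dissimilarity([math.sqrt(x) for x in pixel_a],
                         [math.sqrt(x) for x in pixel_b])
print(h ** 2, c)  # squared Hellinger equals cosine dissimilarity of square roots
```

Both measures ignore the overall brightness of a pixel and compare only the relative pattern across passbands, which is one reason they suit segmentation by thermal properties.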

Read this paper on arXiv…

N. Stein, D. Dyk and V. Kashyap
Tue, 15 Dec 15

Comments: N/A