A Sparse Gaussian Process Framework for Photometric Redshift Estimation [IMA]


Accurate photometric redshifts are a lynchpin for many future experiments to pin down the cosmological model and for studies of galaxy evolution. In this study, a novel sparse regression framework for photometric redshift estimation is presented. Data from a simulated survey were used to train and test the proposed models. We show that approaches which include careful data preparation and model design offer a significant improvement in comparison with several competing machine learning algorithms. Standard implementations of most regression algorithms take the minimization of the sum of squared errors as their objective. For redshift inference, however, this induces a bias in the posterior mean of the output distribution, which can be problematic. In this paper we directly minimize $\Delta z = (z_\textrm{s} - z_\textrm{p})/(1+z_\textrm{s})$ and address the bias problem via a distribution-based weighting scheme, incorporated as part of the optimization objective. The results are compared with other machine learning algorithms in the field such as Artificial Neural Networks (ANNs), Gaussian Processes (GPs) and sparse GPs. The proposed framework reaches a mean absolute $\Delta z = 0.002(1+z_\textrm{s})$, with a maximum absolute error of 0.0432, over the redshift range of $0.2 \le z_\textrm{s} \le 2$, a factor of three improvement over standard ANNs used in the literature. We also investigate how the relative size of the training set affects the photometric redshift accuracy. We find that increasing the training set beyond 30 per cent of the total sample size provides little additional constraint on the photometric redshifts, and note that our GP formalism strongly outperforms ANNs in the sparse data regime.
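
As an illustration of the error metric quoted in the abstract, the following is a minimal sketch of how the normalized residual $\Delta z = (z_\textrm{s} - z_\textrm{p})/(1+z_\textrm{s})$ and its mean and maximum absolute values could be computed. The function names and the sample redshifts are hypothetical and are not taken from the paper; the sketch covers only the metric, not the authors' sparse GP model or their weighting scheme.

```python
def normalized_residuals(z_spec, z_phot):
    """Return (z_s - z_p) / (1 + z_s) for each pair of redshifts.

    z_spec: spectroscopic (true) redshifts, z_s
    z_phot: photometric (predicted) redshifts, z_p
    """
    if len(z_spec) != len(z_phot):
        raise ValueError("z_spec and z_phot must have the same length")
    return [(zs - zp) / (1.0 + zs) for zs, zp in zip(z_spec, z_phot)]


def summarize(z_spec, z_phot):
    """Mean and maximum absolute normalized residual over a sample."""
    abs_dz = [abs(dz) for dz in normalized_residuals(z_spec, z_phot)]
    return {
        "mean_abs": sum(abs_dz) / len(abs_dz),
        "max_abs": max(abs_dz),
    }


# Hypothetical toy sample (illustrative values only, not data from the paper).
z_spec = [0.2, 1.0, 2.0]
z_phot = [0.26, 1.0, 1.7]
stats = summarize(z_spec, z_phot)
print(stats)
```

Dividing by $1+z_\textrm{s}$ makes errors at different redshifts comparable, which is why an objective built on this quantity differs from a plain sum of squared errors in $z$.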

Read this paper on arXiv…

I. Almosallam, S. Lindsay, M. Jarvis, et al.
Thu, 21 May 15

Comments: N/A