New international projects, such as the Euclid space telescope, are ushering in the era of Big Data for cosmologists. Our questions about dark matter and dark energy, which together account for 95% of the content of our Universe, raise new algorithmic, computational and theoretical challenges. A further challenge concerns reproducible research, a fundamental concept for the verification and credibility of published results.
The CEA's Direction de la recherche fondamentale is launching the COSMIC project, born from bringing together two data-processing teams located at the Institut des sciences du vivant Frédéric-Joliot (NeuroSpin) and at CEA-Irfu (CosmoStat). The data-acquisition mechanisms used in radio astronomy and in MRI share strong similarities: the mathematical models involved are based on the principles of sparsity and compressed sensing, both derived from harmonic analysis.
With the increasing number of deep multi-wavelength galaxy surveys, the spectral energy distribution (SED) of galaxies has become an invaluable tool for studying the formation and evolution of their structures. In this context, standard analysis relies on simple spectro-photometric selection criteria based on a few SED colors. While this fully supervised classification has already yielded clear achievements, it is not optimal for extracting the relevant information from the data. In this article, we propose to employ very recent advances in machine learning, and more precisely in feature learning, to derive a data-driven diagram. We show that the proposed approach, based on denoising autoencoders, recovers the bi-modality in the galaxy population in an unsupervised manner, without using any prior knowledge of galaxy SED classification. The technique is compared to principal component analysis (PCA) and to standard color/color representations. In addition, preliminary results illustrate that it captures additional physically meaningful information, such as redshift dependence, galaxy mass evolution and variation in the specific star formation rate (sSFR). PCA also yields an unsupervised representation with physical properties, such as mass and sSFR, although it separates other characteristics (bimodality, redshift evolution) less well than denoising autoencoders.
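As a rough illustration of the idea (not the pipeline used in the article), the sketch below trains a tiny one-hidden-layer denoising autoencoder on synthetic two-population "SED" data and compares its 2-d representation with a PCA projection. All sizes, noise levels and the learning rate are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "SED magnitudes": two galaxy populations in 5 bands (toy data).
n, d, h = 500, 5, 2
centers = np.array([[1.0, 0.8, 0.6, 0.4, 0.2],
                    [0.2, 0.4, 0.6, 0.8, 1.0]])
labels = rng.integers(0, 2, n)
X = centers[labels] + 0.05 * rng.standard_normal((n, d))

# --- PCA baseline: project onto the top-2 principal components.
Xc = X - X.mean(0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
Z_pca = Xc @ Vt[:h].T

# --- Denoising autoencoder: one tanh hidden layer, trained to map a
#     noise-corrupted input back to the clean input (squared loss, SGD).
W1 = 0.1 * rng.standard_normal((d, h)); b1 = np.zeros(h)
W2 = 0.1 * rng.standard_normal((h, d)); b2 = np.zeros(d)
lr = 0.1
for epoch in range(2000):
    Xn = X + 0.1 * rng.standard_normal(X.shape)   # corrupt the input
    Hh = np.tanh(Xn @ W1 + b1)                    # encode
    Y = Hh @ W2 + b2                              # decode
    G = 2.0 * (Y - X) / n                         # dLoss/dY
    gW2 = Hh.T @ G; gb2 = G.sum(0)
    GH = (G @ W2.T) * (1 - Hh ** 2)               # backprop through tanh
    gW1 = Xn.T @ GH; gb1 = GH.sum(0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

Z_dae = np.tanh(X @ W1 + b1)  # learned 2-d representation
```

Both 2-d representations should separate the two synthetic populations, mirroring (in a much simplified setting) the bimodality discussed in the article.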
Context: in astronomy, observing large fractions of the sky within a reasonable amount of time implies using large field-of-view (FOV) optical instruments that typically have a spatially varying Point Spread Function (PSF). Depending on the scientific goals, galaxy images need to be corrected for the PSF, yet no direct measurement of the PSF is available at those positions. Aims: given a set of PSFs observed at random locations, we want to estimate the PSFs at galaxy locations in order to correct shape measurements. Contributions: we propose an interpolation framework based on Sliced Optimal Transport. A non-linear dimension reduction is first performed based on local pairwise approximated Wasserstein distances. A low-dimensional representation of the unknown PSFs is then estimated, which in turn is used to derive representations of those PSFs in the Wasserstein metric. Finally, the interpolated PSFs are calculated as approximated Wasserstein barycenters. Results: the proposed method was tested on simulated monochromatic PSFs of the Euclid space mission telescope (to be launched in 2020). It achieves remarkable accuracy in terms of pixel values and shape compared to standard methods such as Inverse Distance Weighting or Radial Basis Function interpolation.
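The core ingredient, the sliced Wasserstein distance, can be sketched in a few lines of numpy: project the pixel mass of each image onto random directions and compare the resulting 1-d quantile functions. This is a generic illustration of the metric, not the method's implementation; the number of projections and quantile levels are arbitrary choices.

```python
import numpy as np

def _wquantile(x, w, levels):
    """Quantile function of a weighted 1-d point distribution."""
    order = np.argsort(x)
    x, w = x[order], w[order]
    cdf = np.cumsum(w) - 0.5 * w          # midpoint cumulative weights
    return np.interp(levels, cdf, x)

def sliced_w2(img_a, img_b, n_proj=50, n_quant=200, seed=0):
    """Approximate sliced 2-Wasserstein distance between two images,
    each viewed as a probability distribution over pixel coordinates."""
    rng = np.random.default_rng(seed)
    ny, nx = img_a.shape
    ys, xs = np.mgrid[0:ny, 0:nx]
    coords = np.stack([xs.ravel(), ys.ravel()], 1).astype(float)
    wa = img_a.ravel() / img_a.sum()
    wb = img_b.ravel() / img_b.sum()
    levels = (np.arange(n_quant) + 0.5) / n_quant
    total = 0.0
    for _ in range(n_proj):
        theta = rng.uniform(0.0, np.pi)
        p = coords @ np.array([np.cos(theta), np.sin(theta)])
        qa = _wquantile(p, wa, levels)    # 1-d quantiles of each
        qb = _wquantile(p, wb, levels)    # projected distribution
        total += np.mean((qa - qb) ** 2)  # squared 1-d W2 distance
    return np.sqrt(total / n_proj)
```

For two identical Gaussian spots shifted by a few pixels, the distance grows with the shift and is zero for identical images, which is the geometric behaviour that makes transport metrics attractive for PSF interpolation.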
Blind Source Separation (BSS) is a challenging matrix factorization problem that plays a central role in multichannel imaging science. In a large number of applications, such as astrophysics, current unmixing methods are limited since real-world mixtures are generally affected by extra instrumental effects like blurring. Therefore, BSS has to be solved jointly with a deconvolution problem, which requires tackling a new inverse problem: deconvolution BSS (DBSS). In this article, we introduce an innovative DBSS approach, called DecGMCA, based on sparse signal modeling and an efficient alternating projected least-squares algorithm. Numerical results demonstrate that the DecGMCA algorithm performs very well on simulations. They further highlight the importance of jointly solving BSS and deconvolution instead of considering these two problems independently. Furthermore, the performance of the proposed DecGMCA algorithm is demonstrated on simulated radio-interferometric data.
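The flavour of such an alternating sparse scheme can be sketched on the plain BSS problem (without the deconvolution step, which is the extra difficulty DecGMCA addresses). This is a generic illustration, not the DecGMCA implementation; the sparsity threshold and problem sizes are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic mixtures: 2 sparse sources observed in 4 channels, Y = A S + noise.
t, ns, nc = 400, 2, 4
S_true = rng.standard_normal((ns, t)) * (rng.random((ns, t)) < 0.05)
A_true = rng.standard_normal((nc, ns))
Y = A_true @ S_true + 0.01 * rng.standard_normal((nc, t))

soft = lambda v, lam: np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

A = rng.standard_normal((nc, ns))          # random initial mixing matrix
for _ in range(50):
    # Sources: least-squares fit given A, then soft threshold (sparsity prior).
    S = soft(np.linalg.lstsq(A, Y, rcond=None)[0], 0.05)
    # Mixing matrix: least-squares fit given the current sources.
    A = np.linalg.lstsq(S.T, Y.T, rcond=None)[0].T
    # Remove the scale indeterminacy: unit-norm columns of A,
    # compensated in S so the product A S is unchanged.
    norms = np.maximum(np.linalg.norm(A, axis=0), 1e-12)
    A /= norms
    S *= norms[:, None]
```

After a few tens of iterations the factorization A S reproduces the data up to the noise level; DecGMCA interleaves a deconvolution projection into this kind of loop.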
Removing the aberrations introduced by the Point Spread Function (PSF) is a fundamental aspect of astronomical image processing. The presence of noise in observed images makes deconvolution a nontrivial task that necessitates the use of regularisation. This task is particularly difficult when the PSF varies spatially, as is the case for the Euclid telescope. New surveys will provide images containing thousands of galaxies, and the deconvolution regularisation problem can be considered from a completely new perspective. In fact, one can assume that galaxy images live in a low-dimensional space, so that a matrix of such images is approximately low rank. This work introduces the use of low-rank matrix approximation as a regularisation prior for galaxy image deconvolution and compares its performance with a standard sparse regularisation technique. This new approach leads to a natural way to handle a spatially variant PSF. Deconvolution is performed using a Python code that implements a primal-dual splitting algorithm. The data set considered is a sample of 10,000 space-based galaxy images convolved with a known spatially varying Euclid-like PSF and including various levels of additive Gaussian noise. Performance is assessed by examining the deconvolved galaxy image pixels and shapes. The results demonstrate that for small samples of galaxies sparsity performs better in terms of pixel and shape recovery, while for larger samples it is possible to obtain more accurate estimates of the galaxy shapes using the low-rank approximation.
Point Spread Function
The Point Spread Function or PSF of an imaging system (also referred to as the impulse response) describes how the system responds to a point (unextended) source. In astrophysics, stars or quasars are often used to measure the PSF of an instrument as in ideal conditions their light would occupy a single pixel on a CCD. Telescopes, however, diffract the incoming photons which limits the maximum resolution achievable. In reality, the images obtained from telescopes include aberrations from various sources such as:
- The atmosphere (for ground-based instruments)
- Jitter (for space-based instruments)
- Imperfections in the optical system
- Charge spread of the detectors
In order to recover the true image properties it is necessary to remove PSF effects from observations. If the PSF is known (which is certainly not trivial) one can attempt to deconvolve the PSF from the image. In the absence of noise this is simple. We can model the observed image as follows

$$y = Hx$$

where $x$ is the true image and $H$ is an operator that represents the convolution with the PSF. Thus, to recover the true image, one would simply invert the operator as follows

$$\hat{x} = H^{-1}y$$
Unfortunately, the images we observe also contain noise (e.g. from the CCD readout), which complicates the problem:

$$y = Hx + n$$

This problem is ill-posed, as even the tiniest amount of noise will have a large impact on the result of the inversion. Therefore, to obtain a stable and unique solution, it is necessary to regularise the problem by adding prior knowledge of the true images.
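A minimal 1-d numpy experiment makes the ill-posedness concrete: naive Fourier-domain division recovers a noiseless observation almost perfectly, while noise at the 1e-6 level destroys the estimate. The PSF width and noise level here are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)

n = 128
x = np.zeros(n); x[40:60] = 1.0                          # "true" 1-d image
g = np.exp(-0.5 * ((np.arange(n) - n // 2) / 2.0) ** 2)
g /= g.sum()                                             # Gaussian PSF
H = np.fft.fft(np.fft.ifftshift(g))                      # transfer function
y_clean = np.fft.ifft(H * np.fft.fft(x)).real            # noiseless observation
y_noisy = y_clean + 1e-6 * rng.standard_normal(n)        # add tiny noise

# Naive deconvolution: divide by the transfer function in Fourier space.
x_clean = np.fft.ifft(np.fft.fft(y_clean) / H).real
x_noisy = np.fft.ifft(np.fft.fft(y_noisy) / H).real

err_clean = np.max(np.abs(x_clean - x))   # tiny: inversion works without noise
err_noisy = np.max(np.abs(x_noisy - x))   # huge: noise / tiny |H| blows up
```

The division amplifies the noise by the inverse of the transfer function, which is minuscule at high frequencies for any realistic PSF; this is exactly why regularisation is needed.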
One way to regularise the problem is using sparsity. The concept of sparsity is quite simple. If we know that there is a representation of $x$ that is sparse (i.e. most of the coefficients are zeros) then we can force our deconvolved observation to be sparse in the same domain. In practice we aim to minimise a problem of the following form

$$\min_x \tfrac{1}{2}\|y - Hx\|_2^2 + \lambda\|\Phi x\|_1$$

where $\Phi$ is a matrix that transforms $x$ to the sparse domain and $\lambda$ is a regularisation control parameter.
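A minimal sketch of this kind of sparse recovery is the ISTA iteration below, applied to a 1-d toy deconvolution with $\Phi$ taken as the identity (i.e. the signal itself is assumed sparse). The PSF width, $\lambda$ and iteration count are illustrative, and this is not the primal-dual algorithm used in the paper.

```python
import numpy as np

rng = np.random.default_rng(3)

n = 128
x = np.zeros(n)
x[[20, 50, 90]] = [1.0, -0.8, 1.2]               # sparse "true" signal
g = np.exp(-0.5 * ((np.arange(n) - n // 2) / 2.0) ** 2)
g /= g.sum()                                     # Gaussian PSF
H = np.fft.fft(np.fft.ifftshift(g))
conv = lambda v: np.fft.ifft(H * np.fft.fft(v)).real            # H v
corr = lambda v: np.fft.ifft(np.conj(H) * np.fft.fft(v)).real   # H^T v
y = conv(x) + 0.005 * rng.standard_normal(n)     # noisy observation

# ISTA: gradient step on 0.5 ||y - Hx||^2, then soft thresholding.
lam, step = 0.01, 1.0          # step <= 1/max|H|^2 = 1 here (DC gain 1)
xk = np.zeros(n)
for _ in range(500):
    grad = corr(conv(xk) - y)
    xk = xk - step * grad
    xk = np.sign(xk) * np.maximum(np.abs(xk) - step * lam, 0.0)
```

The soft-thresholding step is the proximal operator of the $\ell_1$ penalty; the recovered signal concentrates back onto the spike positions despite the blur and noise.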
Another way to regularise the problem is to assume that all of the images one aims to deconvolve live on an underlying low-rank manifold. In other words, if we have a sample of galaxy images we wish to deconvolve then we can construct a matrix $X$ where each column is a vector of galaxy pixel coefficients. If many of these galaxies have similar properties then we know that $X$ will have a smaller rank than if the images were all very different. We can use this knowledge to regularise the deconvolution problem in the following way

$$\min_X \tfrac{1}{2}\|Y - \mathcal{H}(X)\|_F^2 + \lambda\|X\|_*$$

where $Y$ is the matrix of observed images, $\mathcal{H}$ applies the PSF convolution to each image, and $\|X\|_*$ is the nuclear norm of $X$ (the sum of its singular values), which promotes low-rank solutions.
In the paper I implement both of these regularisation techniques and compare how well they perform at deconvolving a sample of 10,000 Euclid-like galaxy images. The results show that, for the data used, sparsity does a better job at recovering the image pixels, while the low-rank approximation does a better job at recovering the galaxy shapes (provided enough galaxies are used).
SF_DECONVOLVE is a Python code designed for PSF deconvolution using a low-rank approximation and sparsity. The code can handle a fixed PSF for the entire field or a stack of PSFs for each galaxy position.
A key challenge in cosmological research is how to extract the most important information from satellite imagery and radio signals. The difficulty lies in the systematic processing of extremely noisy data for studying how stars and galaxies evolve through time. This is critical for astrophysicists in their effort to gain insights into cosmological processes such as the characterisation of dark matter in the Universe. Helping scientists find their way through this data maze is DEDALE, an interdisciplinary project that intends to develop the next generation of data analysis methods for the new era of big data in astrophysics and compressed sensing.
Today marks the release of the first papers to result from the XXL survey, the largest survey of galaxy clusters ever undertaken with ESA's XMM-Newton X-ray observatory. The gargantuan clusters of galaxies surveyed are key features of the large-scale structure of the Universe and to better understand them is to better understand this structure and the circumstances that led to its evolution. The first results from the survey, published in a special issue of Astronomy and Astrophysics, hint at the answers and surprises that are captured in this unique bank of data and reveal the true potential of the survey.
In a review article in Reports on Progress in Physics, Martin Kilbinger of the Astrophysics Department - AIM Laboratory at CEA-IRFU presents a comprehensive assessment of the results obtained from observations of cosmic shear over the last 15 years. The cosmic shear effect was first measured in 2000. It is a distortion of the images of galaxies caused by the gravity of intervening clumps of matter. It makes it possible to map dark matter and also to determine how dark energy affects the cosmic web. The article highlights the most important challenges in turning cosmic shear into an accurate tool for cosmology. So far, dark matter has been mapped for only a tiny fraction of the sky. Future observations, such as those of the upcoming Euclid space mission, will cover most accessible regions of the sky. The review presents the progress expected from these future missions for our understanding of the cosmos.
The Universe was born 13.8 billion years ago, as a singularity that instantly evolved into a hot, opaque fog of hydrogen nuclei and electrons. For more than 300,000 years this plasma expanded, through inflation, but the grains of light it emitted, the photons, were immediately reabsorbed by particles of matter. The Universe was then a genuine pea soup. Then came the moment, in the year 380,000 after the Big Bang, when it had become sufficiently dilated and cooled for the photons to "free" themselves: the cosmos became transparent, and the first light burst forth. It is an image of this very first light, known as the cosmic microwave background (see box), that researchers at the École polytechnique fédérale de Lausanne (EPFL) and CEA-Irfu have published. Of exceptional precision, it was reconstructed from data recorded by the WMAP and Planck space telescopes, using highly advanced mathematical methods.