Cosmostat Day on Machine Learning in Astrophysics
Date: January the 26th, 2018
Organizer: Joana Frontera-Pons <email@example.com>
CEA Saclay is around 23 km south of Paris. The astrophysics division (DAp) is located at the CEA site at Orme des Merisiers, which is around 1 km south of the main CEA campus. See http://www.cosmostat.org/link/how-to-get-to-sap/ for detailed directions.
On January 26th, 2018, we are organizing the third day on machine learning in astrophysics at DAp, CEA Saclay.
All talks are taking place at DAp, Salle Galilée (Building 713)
10:00 - 10:45h. Artificial Intelligence: Past, present and future - Marc Duranton (CEA Saclay)
10:45 - 11:15h. Astronomical image reconstruction with convolutional neural networks - Rémi Flamary (Université Nice-Sophia Antipolis)
11:15 - 11:45h. CNN based strong gravitational Lens finder for the Euclid pipeline - Christoph Ernst René Schäfer (Laboratory of Astrophysics, EPFL)
12:00 - 13:30h. Lunch
13:30 - 14:00h. Optimize training samples for future supernova surveys using Active Learning - Emille Ishida (Laboratoire de Physique de Clermont)
14:00 - 14:30h. Regularization via proximal methods - Silvia Villa (Politecnico di Milano)
14:30 - 15:00h. Deep Learning for Physical Processes: Incorporating Prior Scientific Knowledge - Arthur Pajot (LIP6)
15:00 - 15:30h. Wasserstein Dictionary Learning - Morgan Schmitz (CEA Saclay - CosmoStat)
15:30 - 16:00h. Coffee break
16:00 - 17:00h. Round table
Artificial Intelligence: Past, present and future
Marc Duranton (CEA Saclay)
There is considerable hype today around Deep Learning and its applications. This technology originated in the 1950s from a simplification of observations made by neurophysiologists and vision specialists who tried to understand how neurons interact with each other and how the brain is structured for vision. This talk will revisit the history of the connectionist approach, give a quick overview of how it works and of its current applications in various domains, and open a discussion on how bio-inspiration could lead to a new approach in computing science.
Astronomical image reconstruction with convolutional neural networks
Rémi Flamary (Université Nice-Sophia Antipolis)
State-of-the-art methods in astronomical image reconstruction rely on the resolution of a regularized or constrained optimization problem. Solving this problem can be computationally intensive, especially with large images. In this work we investigate the use of convolutional neural networks for image reconstruction in astronomy. With neural networks, the computationally intensive task is the training step, but the prediction step has a fixed complexity per pixel, i.e. a linear complexity. Numerical experiments for fixed PSF and varying PSF in large fields of view show that CNNs are computationally efficient and competitive with optimization-based methods, in addition to being interpretable.
CNN based strong gravitational Lens finder for the Euclid pipeline
Christoph Ernst René Schäfer (Laboratory of Astrophysics, EPFL)
Within the Euclid survey, 10^5 new strong gravitational lenses are expected to be found within 35% of the observable sky. Identifying these objects in a reasonable amount of time necessitates the development of powerful machine-learning-based classifiers. One option for the Euclid pipeline is CNN-based classifiers, which performed admirably during the last Galaxy-Galaxy Strong Lensing Challenge. This talk will first showcase the potential of CNNs for this particular task and second expose some of the issues that CNNs still have to overcome.
Optimize training samples for future supernova surveys using Active Learning
Emille Ishida (Laboratoire de Physique de Clermont)
The full exploitation of the next generation of large-scale photometric supernova surveys depends heavily on our ability to provide a reliable early-epoch classification based solely on photometric data. In preparation for this scenario, there have been many attempts to apply different machine learning algorithms to the supernova photometric classification problem. Although different methods present different degrees of success, textbook machine learning methods fail to address the crucial issue of the lack of representativeness between the spectroscopic (training) and photometric (target) samples. In this talk I will show how Active Learning (or optimal experiment design) can be used as a tool for optimizing the construction of spectroscopic samples for classification purposes. I will present results on how designing spectroscopic samples from the beginning of the survey can achieve optimal classification results with a much lower number of spectra than the currently adopted strategy.
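To make the idea concrete, here is a schematic pool-based active-learning loop (my own toy sketch, not the speaker's code or data): a simple nearest-centroid classifier repeatedly queries the labels of the pool objects it is least certain about, the queries playing the role of follow-up spectroscopy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two overlapping "photometric" classes in a 2-D feature space (synthetic)
X = np.vstack([rng.normal(0.0, 1.0, (500, 2)), rng.normal(2.5, 1.0, (500, 2))])
y = np.r_[np.zeros(500), np.ones(500)]

labelled = [0, 1, 500, 501]                  # tiny initial "spectroscopic" sample
pool = [i for i in range(1000) if i not in labelled]

def predict(Xq, Xl, yl):
    """Nearest-centroid labels plus a confidence score (distance margin)."""
    c0, c1 = Xl[yl == 0].mean(axis=0), Xl[yl == 1].mean(axis=0)
    d0 = np.linalg.norm(Xq - c0, axis=1)
    d1 = np.linalg.norm(Xq - c1, axis=1)
    return (d1 < d0).astype(float), np.abs(d0 - d1)

for _ in range(5):                           # five rounds of follow-up queries
    _, conf = predict(X[pool], X[labelled], y[labelled])
    ask = [pool[i] for i in np.argsort(conf)[:10]]    # least confident objects
    labelled += ask
    pool = [i for i in pool if i not in ask]

pred, _ = predict(X, X[labelled], y[labelled])
acc = (pred == y).mean()
```

With only 54 labelled objects chosen this way, the classifier already covers the decision boundary region that random labelling would sample poorly.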
Regularization via proximal methods
Silvia Villa (Politecnico di Milano)
In the context of linear inverse problems, I will discuss iterative regularization methods that accommodate large classes of data-fit terms and regularizers. In particular, I will investigate the regularization properties of first-order proximal splitting optimization techniques. Such methods are appealing since their computational complexity is tailored to the estimation accuracy allowed by the data, as I will show theoretically and numerically.
Deep Learning for Physical Processes: Incorporating Prior Scientific Knowledge
Arthur Pajot (LIP6)
We consider the use of Deep Learning methods for modeling complex phenomena like those occurring in natural physical processes. With the large amount of data gathered on these phenomena, the data-intensive paradigm could begin to challenge more traditional approaches elaborated over the years in fields like mathematics or physics. However, despite considerable successes in a variety of application domains, the machine learning field is not yet ready to handle the level of complexity required by such problems. Using an example application, namely Sea Surface Temperature Prediction, we show how general background knowledge gained from physics could be used as a guideline for designing efficient Deep Learning models.
Wasserstein Dictionary Learning
Morgan Schmitz (CEA Saclay - CosmoStat)
Optimal Transport theory enables the definition of a distance on the set of measures over any given space. This Wasserstein distance naturally accounts for geometric warping between measures (including, but not limited to, images). We introduce a new Optimal Transport-based representation learning method in close analogy with the usual Dictionary Learning problem, which typically relies on a matrix dot-product between the learned dictionary and the codes making up the new representation; the relationship between atoms and data is thus ultimately linear.
We instead use automatic differentiation to derive gradients of the Wasserstein barycenter operator, and we learn a set of atoms and barycentric weights from the data in an unsupervised fashion. Since our data is reconstructed as Wasserstein barycenters of our learned atoms, we can make full use of the attractive properties of the Optimal Transport geometry. In particular, our representation allows for non-linear relationships between atoms and data.
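As a small illustration of the underlying operator (my own numerical sketch, not the talk's code), an entropy-regularised Wasserstein barycenter of two 1-D densities can be computed with iterative Bregman projections; the grid, kernel and parameter values below are illustrative choices.

```python
import numpy as np

n = 100
x = np.linspace(0.0, 1.0, n)
C = (x[:, None] - x[None, :]) ** 2            # squared-distance ground cost
eps = 2e-3                                    # entropic regularisation strength
K = np.exp(-C / eps)                          # Gibbs kernel

def gaussian(mu, sig):
    p = np.exp(-(x - mu) ** 2 / (2 * sig ** 2))
    return p / p.sum()

P = [gaussian(0.2, 0.05), gaussian(0.8, 0.05)]   # two input "atoms"
w = [0.5, 0.5]                                   # barycentric weights
v = [np.ones(n), np.ones(n)]

for _ in range(200):                             # Bregman projection iterations
    u = [P[k] / (K @ v[k]) for k in range(2)]
    b = np.exp(sum(w[k] * np.log(K.T @ u[k]) for k in range(2)))
    v = [b / (K.T @ u[k]) for k in range(2)]

mean = (b * x).sum()   # the barycenter sits midway between the two inputs
```

Unlike a Euclidean average of the two densities (which would be bimodal), the Wasserstein barycenter is a single bump that interpolates their positions, which is the non-linear atom/data relationship exploited in the talk.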
Previous Cosmostat Days on Machine Learning in Astrophysics :
- 2017 : http://www.cosmostat.org/events/past-events/learning-in-astrophysics
- 2016 : http://www.cosmostat.org/events/past-events/machine_learning_day
CosmoStat postdoc Kostas Themelis has been awarded an Enhanced Eurotalents grant to work on “PRobabilistic cOmponent separaTion tEchniqUes with application to aStrophysical data analysis - PROTEUS”!
Be sure to keep an eye out for exciting new developments in this field!
The new international projects, such as the Euclid space telescope, are ushering in the era of Big Data for cosmologists. Our questions about dark matter and dark energy, which together account for 95% of the content of our Universe, throw up new algorithmic, computational and theoretical challenges. A further challenge concerns reproducible research, a concept fundamental to the verification and credibility of published results.
The Direction of Fundamental Research at CEA is launching the COSMIC project, born from bringing together two data-processing teams based at the Frédéric Joliot Institute for Life Sciences (NeuroSpin) and at CEA-Irfu (CosmoStat). Data acquisition mechanisms in radio astronomy and in MRI share similarities: the mathematical models used in both are based on the principles of sparsity and compressed sensing, derived from harmonic analysis.
With the increasing number of deep multi-wavelength galaxy surveys, the spectral energy distribution (SED) of galaxies has become an invaluable tool for studying the formation of their structures and their evolution. In this context, standard analysis relies on simple spectro-photometric selection criteria based on a few SED colors. While this fully supervised classification has already yielded clear achievements, it is not optimal for extracting relevant information from the data. In this article, we propose to employ very recent advances in machine learning, and more precisely in feature learning, to derive a data-driven diagram. We show that the proposed approach based on denoising autoencoders recovers the bi-modality in the galaxy population in an unsupervised manner, without using any prior knowledge of galaxy SED classification. This technique is compared to principal component analysis (PCA) and to standard color/color representations. In addition, preliminary results illustrate that it captures extra physically meaningful information, such as redshift dependence, galaxy mass evolution and variation of the specific star formation rate. PCA also yields an unsupervised representation with physical properties, such as mass and sSFR, although this representation separates other characteristics (bimodality, redshift evolution) less clearly than denoising autoencoders.
Context: in astronomy, observing large fractions of the sky within a reasonable amount of time implies using large field-of-view (FOV) optical instruments that typically have a spatially varying Point Spread Function (PSF). Depending on the scientific goals, galaxy images need to be corrected for the PSF, whereas no direct measurement of the PSF is available. Aims: given a set of PSFs observed at random locations, we want to estimate the PSFs at galaxy locations in order to correct shape measurements. Contributions: we propose an interpolation framework based on Sliced Optimal Transport. A non-linear dimension reduction is first performed based on local pairwise approximated Wasserstein distances. A low-dimensional representation of the unknown PSFs is then estimated, which in turn is used to derive representations of those PSFs in the Wasserstein metric. Finally, the interpolated PSFs are calculated as approximated Wasserstein barycenters. Results: the proposed method was tested on simulated monochromatic PSFs of the Euclid space mission telescope (to be launched in 2020). It achieves remarkable accuracy in terms of pixel values and shape compared to standard methods such as Inverse Distance Weighting or Radial Basis Function based interpolation.
Blind Source Separation (BSS) is a challenging matrix factorization problem that plays a central role in multichannel imaging science. In a large number of applications, such as astrophysics, current unmixing methods are limited since real-world mixtures are generally affected by extra instrumental effects like blurring. Therefore, BSS has to be solved jointly with a deconvolution problem, which requires tackling a new inverse problem: deconvolution BSS (DBSS). In this article, we introduce an innovative DBSS approach, called DecGMCA, based on sparse signal modeling and an efficient alternative projected least square algorithm. Numerical results demonstrate that the DecGMCA algorithm performs very well on simulations. It further highlights the importance of jointly solving BSS and deconvolution instead of considering these two problems independently. Furthermore, the performance of the proposed DecGMCA algorithm is demonstrated on simulated radio-interferometric data.
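As a schematic illustration of the kind of problem being solved (a toy alternating least-squares loop with a sparsity proximal step, emphatically not the DecGMCA algorithm itself, and without the deconvolution part), consider unmixing X = AS with sparse sources:

```python
import numpy as np

rng = np.random.default_rng(3)
m, k, t = 8, 2, 400                      # channels, sources, samples

S_true = np.zeros((k, t))                # sparse sources: a few random spikes
S_true[rng.integers(0, k, 40), rng.integers(0, t, 40)] = rng.normal(0, 1, 40)
A_true = rng.standard_normal((m, k))     # mixing matrix
X = A_true @ S_true                      # observed mixtures (noise-free toy)

A = rng.standard_normal((m, k))          # random initialisation
for _ in range(50):
    S = np.linalg.lstsq(A, X, rcond=None)[0]                # sources by LS
    S = np.sign(S) * np.maximum(np.abs(S) - 0.01, 0)        # sparsity prox
    A = np.linalg.lstsq(S.T, X.T, rcond=None)[0].T          # update mixing
    A /= np.linalg.norm(A, axis=0)                          # fix scale ambiguity
S = np.linalg.lstsq(A, X, rcond=None)[0]

rel_err = np.linalg.norm(X - A @ S) / np.linalg.norm(X)
```

In DecGMCA each channel is additionally blurred by its own PSF, which is why the source update must be solved jointly with a deconvolution step rather than by plain least squares as above.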
Removing the aberrations introduced by the Point Spread Function (PSF) is a fundamental aspect of astronomical image processing. The presence of noise in observed images makes deconvolution a nontrivial task that necessitates the use of regularisation. This task is particularly difficult when the PSF varies spatially, as is the case for the Euclid telescope. New surveys will provide images containing thousands of galaxies, and the deconvolution regularisation problem can be considered from a completely new perspective. In fact, one can assume that galaxies belong to a low-dimensional space. This work introduces the use of the low-rank matrix approximation as a regularisation prior for galaxy image deconvolution and compares its performance with a standard sparse regularisation technique. This new approach leads to a natural way to handle a spatially variant PSF. Deconvolution is performed using a Python code that implements a primal-dual splitting algorithm. The data set considered is a sample of 10 000 space-based galaxy images convolved with a known spatially varying Euclid-like PSF and including various levels of Gaussian additive noise. Performance is assessed by examining the deconvolved galaxy image pixels and shapes. The results demonstrate that for small samples of galaxies sparsity performs better in terms of pixel and shape recovery, while for larger samples it is possible to obtain more accurate estimates of the galaxy shapes using the low-rank approximation.
Point Spread Function
The Point Spread Function or PSF of an imaging system (also referred to as the impulse response) describes how the system responds to a point (unextended) source. In astrophysics, stars or quasars are often used to measure the PSF of an instrument as in ideal conditions their light would occupy a single pixel on a CCD. Telescopes, however, diffract the incoming photons which limits the maximum resolution achievable. In reality, the images obtained from telescopes include aberrations from various sources such as:
- The atmosphere (for ground based instruments)
- Jitter (for space based instruments)
- Imperfections in the optical system
- Charge spread of the detectors
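As a toy numerical illustration of this spreading effect (my own sketch, assuming a simple Gaussian PSF rather than a realistic instrument model):

```python
import numpy as np

n = 64
x = np.zeros((n, n))
x[n // 2, n // 2] = 1.0                       # ideal "star": one bright pixel

# Assumed Gaussian PSF (sigma = 2 pixels), normalised to preserve flux
yy, xx = np.mgrid[:n, :n]
psf = np.exp(-((xx - n // 2) ** 2 + (yy - n // 2) ** 2) / (2 * 2.0 ** 2))
psf /= psf.sum()

# Circular convolution via the FFT: the observed image is the star blurred by the PSF
y = np.real(np.fft.ifft2(np.fft.fft2(x) * np.fft.fft2(np.fft.ifftshift(psf))))

# The total flux is unchanged, but the light is spread over many pixels
print(y.sum(), y.max())
```

The single bright pixel ends up spread over a patch of neighbouring pixels, exactly as a star image is spread on a real detector.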
In order to recover the true image properties it is necessary to remove PSF effects from observations. If the PSF is known (which is certainly not trivial) one can attempt to deconvolve the PSF from the image. In the absence of noise this is simple. We can model the observed image as follows

y = Hx

where x is the true image, y is the observed image and H is an operator that represents convolution with the PSF. Thus, to recover the true image, one would simply invert the operator as follows

x̂ = H⁻¹y
Unfortunately, the images we observe also contain noise (e.g. from the CCD readout), and this complicates the problem. The model becomes

y = Hx + n

where n is the noise. This problem is ill-posed: even the tiniest amount of noise will have a large impact on the result of the inversion. Therefore, to obtain a stable and unique solution, it is necessary to regularise the problem by adding prior knowledge of the true images.
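To see this ill-posedness in practice, here is a small numpy experiment (my own sketch, again with an assumed Gaussian PSF): naive Fourier-domain inversion is near-exact on noise-free data, but noise at the level of one part in a million already destroys the result.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64
x = np.zeros((n, n))
x[20:44, 20:44] = 1.0                          # simple extended "galaxy"

yy, xx = np.mgrid[:n, :n]
psf = np.exp(-((xx - n // 2) ** 2 + (yy - n // 2) ** 2) / (2 * 1.5 ** 2))
psf /= psf.sum()
H = np.fft.fft2(np.fft.ifftshift(psf))         # transfer function of the PSF

y = np.real(np.fft.ifft2(np.fft.fft2(x) * H))  # noise-free observation
y_noisy = y + 1e-6 * rng.standard_normal((n, n))

# Naive inversion: divide by the transfer function in Fourier space
deconv = lambda img: np.real(np.fft.ifft2(np.fft.fft2(img) / H))
err_clean = np.linalg.norm(deconv(y) - x) / np.linalg.norm(x)
err_noisy = np.linalg.norm(deconv(y_noisy) - x) / np.linalg.norm(x)
print(err_clean, err_noisy)   # the noisy inversion is dominated by amplified noise
```

The division by near-zero values of the transfer function at high spatial frequencies is what amplifies the noise, and is precisely what regularisation is meant to tame.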
One way to regularise the problem is using sparsity. The concept of sparsity is quite simple: if we know that there is a representation of x that is sparse (i.e. most of the coefficients are zero) then we can force our deconvolved observation to be sparse in the same domain. In practice we aim to minimise a problem of the following form

argmin_x (1/2)‖y − Hx‖₂² + λ‖Φx‖₁

where Φ is a matrix that transforms x to the sparse domain and λ is a regularisation control parameter.
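As a minimal sketch of this approach (my own toy code, not the article's implementation), a problem of this form can be minimised with the iterative soft-thresholding algorithm (ISTA). For simplicity I take the sparsifying transform to be the identity, i.e. a star field that is sparse in the direct domain:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 64
x_true = np.zeros((n, n))
x_true[rng.integers(0, n, 20), rng.integers(0, n, 20)] = 1.0   # 20 "stars"

yy, xx = np.mgrid[:n, :n]
psf = np.exp(-((xx - n // 2) ** 2 + (yy - n // 2) ** 2) / (2 * 2.0 ** 2))
psf /= psf.sum()
H = np.fft.fft2(np.fft.ifftshift(psf))

conv = lambda img, F: np.real(np.fft.ifft2(np.fft.fft2(img) * F))
y = conv(x_true, H) + 1e-3 * rng.standard_normal((n, n))

lam, step = 1e-3, 1.0      # step <= 1/L, and L = max |H|^2 = 1 for this PSF
x = np.zeros((n, n))
for _ in range(200):
    grad = conv(conv(x, H) - y, np.conj(H))    # gradient of the data fit
    z = x - step * grad
    x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0)   # soft threshold

err = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
```

Each iteration takes a gradient step on the data-fit term and then soft-thresholds, which is exactly the proximal operator of the λ‖·‖₁ penalty.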
Another way to regularise the problem is to assume that all of the images one aims to deconvolve live on an underlying low-rank manifold. In other words, if we have a sample of galaxy images we wish to deconvolve, we can construct a matrix X where each column is a vector of galaxy pixel coefficients. If many of these galaxies have similar properties then we know that X will have a smaller rank than if the images were all very different. We can use this knowledge to regularise the deconvolution problem in the following way

argmin_X (1/2)‖Y − H(X)‖₂² + λ‖X‖_*

where Y is the matrix of observed images, H applies the PSF convolution to each column, and ‖X‖_* is the nuclear norm of X (the sum of its singular values), which promotes low-rank solutions.
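The proximal operator of the nuclear norm, which drives this kind of low-rank regularisation, simply soft-thresholds the singular values. A self-contained toy example (my own sketch, with made-up data, denoising rather than full deconvolution):

```python
import numpy as np

rng = np.random.default_rng(2)

# 100 noisy "galaxy" vectors of 256 pixels that secretly span a rank-3 space
X_clean = rng.standard_normal((256, 3)) @ rng.standard_normal((3, 100))
X_noisy = X_clean + 0.5 * rng.standard_normal((256, 100))

def svt(M, tau):
    """Singular value thresholding: the prox of tau times the nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

X_hat = svt(X_noisy, tau=20.0)
err_noisy = np.linalg.norm(X_noisy - X_clean)
err_hat = np.linalg.norm(X_hat - X_clean)    # thresholding removes most noise
```

Because the signal singular values are much larger than those of the noise, the threshold kills the noise directions while keeping the shared low-rank structure of the sample.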
In the paper I implement both of these regularisation techniques and compare how well they perform at deconvolving a sample of 10,000 Euclid-like galaxy images. The results show that, for the data used, sparsity does a better job at recovering the image pixels, while the low-rank approximation does a better job at recovering the galaxy shapes (provided enough galaxies are used).
SF_DECONVOLVE is a Python code designed for PSF deconvolution using a low-rank approximation and sparsity. The code can handle a fixed PSF for the entire field or a stack of PSFs for each galaxy position.
A key challenge in cosmological research is how to extract the most important information from satellite imagery and radio signals. The difficulty lies in the systematic processing of extremely noisy data for studying how stars and galaxies evolve through time. This is critical for astrophysicists in their effort to gain insights into cosmological processes such as the characterisation of dark matter in the Universe. Helping scientists find their way through this data maze is DEDALE, an interdisciplinary project that intends to develop the next generation of data analysis methods for the new era of big data in astrophysics and compressed sensing.