On January 17th, 2020, we are organizing the 5th day on machine learning in astrophysics at DAp, CEA Saclay.

**Program:**

All talks are taking place at DAp, Salle Galilée (Building 713)

10:00 - 10:15h. Welcome and coffee

10:15 - 10:45h. Parameter inference using neural networks - **Tom Charnock** (Institut d'Astrophysique de Paris)

10:45 - 11:15h. Detection and characterisation of solar-type stars with machine learning - **Lisa Bugnet** (DAp, CEA Paris-Saclay)

11:15 - 11:45h. DeepMass: Deep learning dark matter map reconstructions with Dark Energy Survey data - **Niall Jeffrey** (ENS)

12:00 - 13:30h. Lunch

13:30 - 14:00h. Hybrid physical-deep learning models for astronomical image processing - **François Lanusse** (Berkeley Center for Cosmological Physics and CosmoStat, CEA Paris-Saclay)

14:00 - 14:30h. A flexible EM-like clustering algorithm for noisy data - **Violeta Roizman** (L2S, CentraleSupélec)

14:30 - 15:00h. Regularizing Optimal Transport Using Regularity Theory - **François-Pierre Paty** (CREST, ENSAE)

15:00 - 15:30h. Deep Learning @ Safran for Image Processing - **Arnaud Woiselle** (Safran Electronics and Defense)

15:30 - 16:00h. End of the day

### Parameter inference using neural networks

**Tom Charnock (Institut d'Astrophysique de Paris)**

Neural networks with large training sets are currently providing tighter constraints on cosmological and astrophysical parameters than ever before. However, in their current form, these neural networks are unable to give true Bayesian inference of such model parameters. I will describe why this is true and present two methods by which the information extracting power of neural networks can be built into the necessary robust statistical framework to perform trustworthy inference, whilst at the same time massively reducing the quantity of training data required.

### Detection and characterisation of solar-type stars with machine learning

**Lisa Bugnet (DAp, CEA Paris-Saclay)**

Stellar astrophysics was strengthened in the 1970s by the discovery of stellar oscillations due to acoustic waves inside the Sun. These waves evolving inside solar-type stars carry information about the composition and dynamics of the surrounding plasma, and are thus very valuable for understanding stellar internal and surface physical processes. With classical asteroseismology we are able to extract very precise and accurate masses, radii, and ages of oscillating stars, which are key parameters for understanding stellar evolution.

However, classical methods of asteroseismology are time-consuming and can only be applied to stars showing a large enough oscillation signal. In the context of the hundreds of thousands of stars observed by the Transiting Exoplanet Survey Satellite (TESS), the stellar community has to adapt the methodologies previously built for the study of the few tens of thousands of stars observed with much better resolution by the Kepler satellite. Our method exploits Random Forest machine learning algorithms that aim to automatically 1) classify and 2) characterize any stellar pulsator from global non-seismic parameters. We also present a recent result based on neural networks on the automatic detection of peculiar solar-type pulsators that have a surprisingly low dipolar-oscillation amplitude, the signature of an unknown physical process affecting oscillation modes inside the core.
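
The classification step described above can be sketched with a random forest on global stellar parameters. This is a toy illustration, not the authors' pipeline: the features, labels, and data below are all invented for the example.

```python
# Toy sketch: classifying pulsator types from global non-seismic
# parameters with a random forest. Features (Teff, magnitude) and the
# labeling rule are invented purely for illustration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 500
teff = rng.uniform(4500, 7000, n)   # effective temperature (K), invented
mag = rng.uniform(8, 14, n)         # photometric magnitude, invented
X = np.column_stack([teff, mag])
# Invented ground truth: cooler stars labeled as solar-type pulsators.
y = (teff < 6000).astype(int)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X[:400], y[:400])           # train on the first 400 stars
acc = clf.score(X[400:], y[400:])   # evaluate on the held-out 100
```

In the real setting the labels would come from seismically confirmed detections, and the features from survey catalogues rather than random draws.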

### DeepMass: Deep learning dark matter map reconstructions with Dark Energy Survey data

**Niall Jeffrey (ENS)**

I will present the first reconstruction of dark matter maps from weak lensing observational data using deep learning. We train a convolutional neural network (CNN) with a U-Net-based architecture on over 3.6×10^5 simulated data realisations with non-Gaussian shape noise and with cosmological parameters varying over a broad prior distribution. We interpret our newly created DES SV map as an approximation of the mean of the posterior P(κ|γ) of the convergence given the observed shear. The DeepMass method is substantially more accurate than existing mass-mapping methods on a validation set of 8000 simulated DES SV data realisations. With higher galaxy density in future weak lensing data unveiling more non-linear scales, it is likely that deep learning will be a leading approach for mass mapping with Euclid and LSST.
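
The posterior-mean interpretation rests on a standard fact: a model trained with mean-squared-error loss converges to the conditional mean of the target given the input. A minimal 1-D toy (my own, unrelated to the DeepMass data) makes this concrete: for Gaussian signal and noise, the MSE-optimal predictor is the analytic Wiener factor times the observation.

```python
# Toy check: least-squares (MSE) fitting recovers the posterior mean.
# Here k ~ N(0, s_k^2) plays the convergence and g = k + noise the
# observed shear, so E[k|g] = s_k^2 / (s_k^2 + s_n^2) * g.
import numpy as np

rng = np.random.default_rng(1)
s_k, s_n = 1.0, 0.5
k = rng.normal(0, s_k, 100_000)        # true signal ("convergence")
g = k + rng.normal(0, s_n, k.size)     # noisy observation ("shear")

# MSE-optimal linear coefficient from least squares.
w = (g @ k) / (g @ g)
# Analytic posterior-mean (Wiener) factor it should approach.
wiener = s_k**2 / (s_k**2 + s_n**2)
```

DeepMass replaces this linear map with a U-Net and the Gaussian toy with simulations, but the loss-function logic is the same.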

### Hybrid physical-deep learning models for astronomical image processing

**François Lanusse (Berkeley Center for Cosmological Physics and CosmoStat, CEA Paris-Saclay)**

The upcoming generation of wide-field optical surveys which includes LSST will aim to shed some much needed light on the physical nature of dark energy and dark matter by mapping the Universe in great detail and on an unprecedented scale. However, with the increase in data quality also comes a significant increase in data complexity, bringing new and outstanding challenges at all levels of the scientific analysis.

In this talk, I will illustrate how deep generative models, combined with physical modeling, can be used to address some of these challenges at the image processing level, specifically by providing data-driven priors of galaxy morphology.

I will first describe how to build such generative models from corrupted and heterogeneous data, i.e. when the training set contains varying observing conditions (in terms of noise, seeing, or even instruments). This is a necessary step for practical applications, made possible by a hybrid modeling of the generation process, using deep neural networks to model the underlying distribution of galaxy morphologies, complemented by a physical model of the noise and instrumental response. Sampling from these models produces realistic galaxy light profiles, which can then be used in survey emulation, for the purpose of validating and/or calibrating data reduction pipelines.

Even more interestingly, these models can be used as priors on galaxy morphologies within standard Bayesian inference techniques to solve astronomical inverse problems, ranging from deconvolution to deblending of galaxy images. I will present how combining these deep morphology priors with a physical forward model of observed blended scenes allows us to address the galaxy deblending problem in a physically motivated and interpretable way.
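
The general skeleton of such prior-plus-forward-model inference can be sketched in a few lines. This is a generic toy, not the speaker's model: a 1-D blur stands in for the physical forward operator, and a simple quadratic smoothness term stands in for the learned deep morphology prior.

```python
# Generic MAP skeleton for y = A x + n: gradient descent on
# data-fidelity + prior. The blur kernel and smoothness prior are
# placeholders; a deep generative prior would replace the latter.
import numpy as np

rng = np.random.default_rng(4)
n = 64
x_true = np.zeros(n)
x_true[20:28] = 1.0                              # toy 1-D "galaxy" profile
kernel = np.exp(-0.5 * (np.arange(-3, 4) / 1.5) ** 2)
kernel /= kernel.sum()
A = lambda v: np.convolve(v, kernel, mode="same")  # physical forward model
y = A(x_true) + rng.normal(0, 0.01, n)             # observed blurred data

lam, step = 0.01, 0.5
x = np.zeros(n)
for _ in range(500):
    # Symmetric kernel, so A also serves as its (approximate) adjoint.
    grad_data = A(A(x) - y)
    # Gradient of the quadratic smoothness prior (discrete Laplacian).
    grad_prior = lam * (2 * x - np.roll(x, 1) - np.roll(x, -1))
    x -= step * (grad_data + grad_prior)
```

Swapping the quadratic prior for a learned density over galaxy morphologies is what makes the hybrid approach described in the talk powerful, while the forward model keeps the inference physically interpretable.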

### A flexible EM-like clustering algorithm for noisy data

**Violeta Roizman (L2S, CentraleSupélec)**

Though very popular, the EM algorithm is well known to suffer from non-Gaussian distribution shapes and outliers. This talk will present a flexible EM-like clustering algorithm that can deal with noise and outliers in diverse data sets. This flexibility is due to extra scale parameters that allow us to accommodate heavier-tailed distributions and outliers without significantly losing efficiency in various classical scenarios. I will show experiments comparing it to other clustering methods, such as k-means, EM and spectral clustering, on both synthetic and real data sets. I will conclude with an application of our algorithm to image segmentation.
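
To give a flavor of how extra scale parameters robustify EM (this is a textbook Student-t mixture EM in 1-D, not the speaker's algorithm; all settings are invented), note that each point receives a weight that shrinks when the point sits far from a component, so gross outliers barely move the estimated centers:

```python
# Toy EM for a two-component Student-t mixture in 1-D. The per-point
# weights u downweight outliers, illustrating the role of extra scale
# parameters; this is NOT the talk's algorithm.
import numpy as np
from scipy.stats import t as student_t

rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(-3, 1, 200),
                    rng.normal(3, 1, 200),
                    rng.uniform(-30, 30, 20)])   # gross outliers
nu = 3.0                                          # heavy-tailed dof
mu = np.array([-1.0, 1.0])                        # crude initialization
sig = np.array([2.0, 2.0])
pi = np.array([0.5, 0.5])

for _ in range(50):
    # E-step: responsibilities under each t component.
    pdf = np.stack([pi[k] * student_t.pdf(x, nu, loc=mu[k], scale=sig[k])
                    for k in range(2)])
    r = pdf / pdf.sum(axis=0)
    # Robustness weights: small for points far from a component.
    u = (nu + 1) / (nu + ((x - mu[:, None]) / sig[:, None]) ** 2)
    # M-step: weighted updates of means, scales, and mixing weights.
    for k in range(2):
        w = r[k] * u[k]
        mu[k] = (w * x).sum() / w.sum()
        sig[k] = np.sqrt((w * (x - mu[k]) ** 2).sum() / r[k].sum())
    pi = r.sum(axis=1) / r.sum()
```

With a Gaussian mixture the uniform outliers would drag the means and inflate the variances; here the weights `u` suppress them.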

### Regularizing Optimal Transport Using Regularity Theory

**François-Pierre Paty (CREST, ENSAE)**

Optimal transport (OT) dates back to the end of the 18th century, when the French mathematician Gaspard Monge proposed to solve the problem of déblais and remblais. In the last few years, OT has also found new applications in statistics and machine learning as a way to analyze and compare data. Both in practice and for statistical reasons, OT needs to be regularized. In this talk, I will present a new regularization of OT that leverages the regularity of the Monge map. Instead of considering regularity as a property that can be proved under suitable assumptions, we consider regularity as a condition that must be enforced when estimating OT. This further allows us to transport out-of-sample points, as well as to define a new estimator of the 2-Wasserstein distance between arbitrary measures. (Based on joint work with Alexandre d'Aspremont and Marco Cuturi.)
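
As background to what is being regularized: between two equal-size point clouds, the unregularized discrete Monge problem reduces to a linear assignment on the pairwise squared-distance cost, giving a plug-in estimate of the squared 2-Wasserstein distance. This sketch shows only that plain estimator; the talk's regularity-based approach is a different, richer construction.

```python
# Discrete Monge problem as a linear assignment: the OT map between two
# empirical measures of equal size is an optimal matching under the
# squared-distance cost. Point clouds here are arbitrary toy data.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(3)
x = rng.normal(0, 1, (50, 2))    # samples from the source measure
y = rng.normal(2, 1, (50, 2))    # samples from the target measure

cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)  # squared distances
row, col = linear_sum_assignment(cost)                  # optimal matching
w2_sq = cost[row, col].mean()    # plug-in squared 2-Wasserstein estimate
```

This plug-in estimator cannot transport out-of-sample points and suffers from the curse of dimensionality, which is precisely the kind of limitation the regularity-enforcing estimator in the talk addresses.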

### Deep Learning @ Safran for Image Processing

**Arnaud Woiselle (Safran Electronics and Defense)**

Deep learning has for many years been the natural tool in computer vision for nearly all high-level tasks, such as object detection and classification, and is now state of the art in most image processing (restoration) tasks, such as deblurring or super-resolution. Safran has investigated these methods for a large variety of problems, focusing on a small number of network structures, due to electronics constraints for future implementation, and has transferred them to real-life noisy and blurry data, both in the visible and the infrared. I will show results from many applications, and conclude with some tips and take-away messages on what matters when applying deep learning to a given task.

**Previous CosmoStat Days on Machine Learning in Astrophysics:**