Position: | PhD |

Deadline: | 15/04/2019 |

Contact: | Jean-Luc Starck, Joana Frontera-Pons and Arnaud Woiselle |

Details about this position are provided in the following **PDF**

Posted by Joana Frontera-Pons

Date: **January the 24th, 2019**

Organizer: **Joana Frontera-Pons <joana.frontera-pons@cea.fr>**

Venue:

*Local information*

*CEA Saclay is around 23 km South of Paris. The astrophysics division (DAp) is located at the CEA site at Orme des Merisiers, which is around 1 km South of the main CEA campus. See http://www.cosmostat.org/contact for detailed information on how to arrive.*

On January the 24th, 2019, we organize the fourth day on machine learning in astrophysics at DAp, CEA Saclay.

All talks are taking place at DAp, Salle Galilée (Building 713)

14:00 - 14:30h. Machine Learning in High Energy Physics : trends and successes - **David Rousseau** (LAL)

14:30 - 15:00h. Learning recurring patterns in large signals with convolutional dictionary learning - **Thomas Moreau** (Parietal team - INRIA Saclay)

15:00 - 15:30h. Distinguishing standard and modified gravity cosmologies with machine learning - **Austin Peel** (CEA Saclay - CosmoStat)

15:30 - 16:00h. Coffee break

16:00 - 16:30h. The ASAP algorithm for nonsmooth nonconvex optimization. Applications in imagery - **Pauline Tan** (LJLL - Sorbonne Université)

16:30 - 17:00h. Deep Learning for Blended Source Identification in Galaxy Survey Data - **Samuel Farrens** (CEA Saclay - CosmoStat)

**David Rousseau** **(LAL)**

Machine Learning was used to some extent in HEP in the nineties, then at the Tevatron and more recently at the LHC (mostly Boosted Decision Trees). However, with the birth of the internet giants at the turn of the century, there has been an explosion of Machine Learning tools in industry. Over the last few years, a collective effort has been under way to bring state-of-the-art Machine Learning tools to High Energy Physics.

This talk will give a tour d’horizon of Machine Learning in HEP: a review of tools; examples of applications, some in current use and some for a (possibly distant) future (e.g. deep learning, image vision, GAN); and recent and future HEP ML Kaggle competitions. I’ll conclude with the key points for setting up frameworks for High Energy Physics and Machine Learning collaborations.

**Thomas Moreau** **(Parietal team - INRIA Saclay)**

Convolutional dictionary learning has become a popular tool in image processing for denoising or inpainting. This technique extends dictionary learning to learn adapted bases that are shift-invariant. This talk will discuss how this technique can also be used in the context of large multivariate signals to learn and localize recurring patterns. I will discuss both computational aspects, with efficient iterative and distributed convolutional sparse coding algorithms, and a novel rank-1 constraint for the learned atoms. This constraint, inspired by the underlying physical model for neurological signals, is then used to highlight relevant structure in MEG signals.
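As a rough illustration of the signal model behind convolutional sparse coding (not the speaker's implementation), the sketch below builds a 1-D signal as a sum of short atoms convolved with sparse activation vectors. The atoms, the signal length and the activation positions are all hypothetical choices made for the example; in practice the atoms and activations would be learned from data.

```python
import numpy as np

def csc_reconstruct(atoms, activations):
    """Return sum_k (z_k * d_k), where * is a full 1-D convolution."""
    return sum(np.convolve(z, d, mode="full") for d, z in zip(atoms, activations))

n_times, atom_len = 200, 16

# Two hypothetical atoms (short waveforms); real atoms would be learned.
t = np.linspace(0.0, 1.0, atom_len)
atoms = [np.sin(2 * np.pi * 3 * t) * np.hanning(atom_len), np.hanning(atom_len)]

# Sparse activations: zero almost everywhere, non-zero where a pattern occurs.
activations = [np.zeros(n_times - atom_len + 1) for _ in atoms]
activations[0][[20, 120]] = [1.0, 0.5]   # atom 0 occurs twice
activations[1][70] = 2.0                 # atom 1 occurs once

signal = csc_reconstruct(atoms, activations)

# Localizing a pattern amounts to reading off the non-zero activations.
onsets = [np.flatnonzero(z) for z in activations]
```

Because each atom is shifted along the signal by its activation vector, the same waveform can be reused at any position, which is what makes the representation shift-invariant.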

**Austin Peel** **(CEA Saclay - CosmoStat)**

Modified gravity models that include massive neutrinos can mimic the standard concordance model in terms of weak-lensing observables. The inability to distinguish between these cases could limit our understanding of the origin of cosmic acceleration and of the fundamental nature of gravity. I will present a neural network we have designed to classify such cosmological scenarios based on the weak-lensing maps they generate. I will discuss the network's performance on both clean and noisy data, as well as how results compare to conventional statistical approaches.

**Pauline Tan** **(LJLL - Sorbonne Université)**

In this talk, I will address the challenging problem of optimizing nonsmooth and nonconvex objective functions. Such problems are increasingly encountered in applications, especially when tackling joint estimation problems. I will propose a novel algorithm and demonstrate its convergence properties. Finally, three real applications to industrial imaging problems will be presented.

**Samuel Farrens** **(CEA Saclay - CosmoStat)**

Weak gravitational lensing is a powerful probe of cosmology that will be employed by upcoming surveys such as Euclid and LSST to map the distribution of dark matter in the Universe. The technique, however, requires precise measurements of galaxy shapes over large areas. The chance alignment of galaxies along the line of sight, i.e. blending of images, can lead to biased shape measurements that propagate to parameter estimates. Machine learning techniques can provide an automated and robust way of dealing with these blended sources. In this talk I will discuss how machine learning can be used to classify sources identified in survey data as blended or not and show some preliminary results for CFIS simulations. I will then present some plans for future developments making use of multi-class classification and segmentation.

Posted by Joana Frontera-Pons

Date: **January the 26th, 2018**

Organizer: **Joana Frontera-Pons <joana.frontera-pons@cea.fr>**

Venue:

*Local information*

*CEA Saclay is around 23 km South of Paris. The astrophysics division (DAp) is located at the CEA site at Orme des Merisiers, which is around 1 km South of the main CEA campus. See http://www.cosmostat.org/link/how-to-get-to-sap/ for detailed information on how to arrive.*

On January the 26th, 2018, we organize the third day on machine learning in astrophysics at DAp, CEA Saclay.

All talks are taking place at DAp, Salle Galilée (Building 713)

10:00 - 10:45h. Artificial Intelligence: Past, present and future - **Marc Duranton** (CEA Saclay)

10:45 - 11:15h. Astronomical image reconstruction with convolutional neural networks - **Rémi Flamary** (Université Nice-Sophia Antipolis)

11:15 - 11:45h. CNN based strong gravitational Lens finder for the Euclid pipeline - **Christoph Ernst René Schäfer** (Laboratory of Astrophysics, EPFL)

12:00 - 13:30h. Lunch

13:30 - 14:00h. Optimize training samples for future supernova surveys using Active Learning - **Emille Ishida** (Laboratoire de Physique de Clermont)

14:00 - 14:30h. Regularization via proximal methods - **Silvia Villa** (Politecnico di Milano)

14:30 - 15:00h. Deep Learning for Physical Processes: Incorporating Prior Scientific Knowledge - **Arthur Pajot** (LIP6)

15:00 - 15:30h. Wasserstein dictionary Learning - **Morgan Schmitz** (CEA Saclay - CosmoStat)

15:30 - 16:00h. Coffee break

16:00 - 17:00h. Round table

**Marc Duranton** **(CEA Saclay)**

There is a lot of hype today around Deep Learning and its applications. This technology originated in the 1950s as a simplification of observations made by neurophysiologists and vision specialists who were trying to understand how neurons interact with each other and how the brain is structured for vision. This talk will revisit the history of the connectionist approach and give a quick overview of how it works and of its current applications in various domains. It will also open discussions on how bio-inspiration could lead to a new approach in computing science.

**Rémi Flamary** **(Université Nice-Sophia Antipolis)**

State-of-the-art methods in astronomical image reconstruction rely on the resolution of a regularized or constrained optimization problem. Solving this problem can be computationally intensive, especially with large images. In this work, we investigate the use of convolutional neural networks for image reconstruction in astronomy. With neural networks, the computationally intensive task is the training step, but the prediction step has a fixed complexity per pixel, i.e. a linear complexity. Numerical experiments for fixed PSF and varying PSF in large fields of view show that CNNs are computationally efficient and competitive with optimization-based methods, in addition to being interpretable.

**Christoph Ernst René Schäfer** **(Laboratory of Astrophysics, EPFL)**

Within the Euclid survey, 10^5 new strong gravitational lenses are expected to be found within 35% of the observable sky. Identifying these objects in a reasonable amount of time necessitates the development of powerful machine-learning-based classifiers. One option for the Euclid pipeline is CNN-based classifiers, which performed admirably during the last Galaxy-Galaxy Strong Lensing Challenge. This talk will first showcase the potential of CNNs for this particular task and then expose some of the issues that CNNs still have to overcome.

**Emille Ishida** **(Laboratoire de Physique de Clermont)**

The full exploitation of the next generation of large-scale photometric supernova surveys depends heavily on our ability to provide a reliable early-epoch classification based solely on photometric data. In preparation for this scenario, there have been many attempts to apply different machine learning algorithms to the supernova photometric classification problem. Although different methods show different degrees of success, textbook machine learning methods fail to address the crucial issue of lack of representativeness between spectroscopic (training) and photometric (target) samples. In this talk I will show how Active Learning (or optimal experiment design) can be used as a tool for optimizing the construction of spectroscopic samples for classification purposes. I will present results on how the design of spectroscopic samples from the beginning of the survey can achieve optimal classification results with a much lower number of spectra than the currently adopted strategy.

**Silvia Villa** **(Politecnico di Milano)**

In the context of linear inverse problems, I will discuss iterative regularization methods that accommodate large classes of data-fit terms and regularizers. In particular, I will investigate the regularization properties of first-order proximal splitting optimization techniques. Such methods are appealing since their computational complexity is tailored to the estimation accuracy allowed by the data, as I will show theoretically and numerically.
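To make "first-order proximal splitting" concrete, here is a minimal sketch of the standard proximal gradient (forward-backward, also known as ISTA) iteration for a least-squares data-fit term with an l1 regularizer. It is a textbook instance of the family of methods mentioned above, not the specific scheme presented in the talk; the function names, the choice of regularizer and the parameters `lam` and `n_iter` are assumptions made for the example.

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1 (component-wise shrinkage)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def ista(A, y, lam, n_iter=200):
    """Proximal gradient iterations for 0.5 * ||A x - y||^2 + lam * ||x||_1."""
    # Step size 1/L, where L = ||A||_2^2 is the Lipschitz constant of the gradient.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)                          # forward (gradient) step
        x = soft_threshold(x - step * grad, step * lam)   # backward (proximal) step
    return x
```

In an iterative regularization setting, the number of iterations `n_iter` itself plays the role of a regularization parameter, so stopping early trades accuracy against computation.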

**Arthur Pajot** **(LIP6)**

We consider the use of Deep Learning methods for modeling complex phenomena like those occurring in natural physical processes. With the large amount of data gathered on these phenomena, the data-intensive paradigm could begin to challenge more traditional approaches elaborated over the years in fields like maths or physics. However, despite considerable successes in a variety of application domains, the machine learning field is not yet ready to handle the level of complexity required by such problems. Using an example application, namely Sea Surface Temperature Prediction, we show how general background knowledge gained from physics could be used as a guideline for designing efficient Deep Learning models.

**Morgan Schmitz** **(CEA Saclay - CosmoStat)**

Optimal Transport theory enables the definition of a distance across the set of measures on any given space. This Wasserstein distance naturally accounts for geometric warping between measures (including, but not limited to, images). We introduce a new, Optimal Transport-based representation learning method in close analogy with the usual Dictionary Learning problem. The usual approach typically relies on a matrix product between the learned dictionary and the codes making up the new representation. The relationship between atoms and data is thus ultimately linear.

We instead use automatic differentiation to derive gradients of the Wasserstein barycenter operator, and we learn a set of atoms and barycentric weights from the data in an unsupervised fashion. Since our data is reconstructed as Wasserstein barycenters of our learned atoms, we can make full use of the attractive properties of the Optimal Transport geometry. In particular, our representation allows for non-linear relationships between atoms and data.
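The contrast between the two formulations can be written schematically as follows. The notation is illustrative and not taken from the abstract: $X = (x_1, \dots, x_N)$ denotes the data, $D = (d_1, \dots, d_S)$ the atoms, $\Lambda = (\lambda_1, \dots, \lambda_N)$ the codes or barycentric weights, $W$ a Wasserstein distance, $\mathcal{L}$ an unspecified dissimilarity between a datum and its reconstruction, and $\Sigma_S$ the probability simplex.

```latex
% Usual (linear) dictionary learning: data reconstructed as D \Lambda
\min_{D,\,\Lambda} \; \lVert X - D\Lambda \rVert_F^2

% Wasserstein dictionary learning (schematic): data reconstructed as barycenters
\min_{D,\,\Lambda} \; \sum_{i=1}^{N} \mathcal{L}\bigl(x_i,\, P(D, \lambda_i)\bigr),
\qquad
P(D, \lambda) = \operatorname*{arg\,min}_{u} \; \sum_{s=1}^{S} \lambda_s \, W(d_s, u),
\qquad
\lambda \in \Sigma_S
```

In the second problem the reconstruction $P(D, \lambda)$ is itself the solution of an optimization problem, which is why gradients with respect to $D$ and $\lambda$ are obtained by differentiating through the barycenter computation.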

- 2017 : http://www.cosmostat.org/events/past-events/learning-in-astrophysics
- 2016 : http://www.cosmostat.org/events/past-events/machine_learning_day

Posted by Joana Frontera-Pons

With the increasing number of deep multi-wavelength galaxy surveys, the spectral energy distribution (SED) of galaxies has become an invaluable tool for studying the formation of their structures and their evolution. In this context, standard analysis relies on simple spectro-photometric selection criteria based on a few SED colors. Although this fully supervised classification has already yielded clear achievements, it is not optimal for extracting relevant information from the data. In this article, we propose to employ very recent advances in machine learning, and more precisely in feature learning, to derive a data-driven diagram. We show that the proposed approach, based on denoising autoencoders, recovers the bi-modality in the galaxy population in an unsupervised manner, without using any prior knowledge of galaxy SED classification. This technique has been compared to principal component analysis (PCA) and to standard color/color representations. In addition, preliminary results illustrate that it captures extra physically meaningful information, such as redshift dependence, galaxy mass evolution and variation in the specific star formation rate (sSFR). PCA also yields an unsupervised representation with physical properties, such as mass and sSFR, although this representation separates other characteristics (bimodality, redshift evolution) less well than denoising autoencoders do.
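For readers unfamiliar with the PCA baseline mentioned above, the following is a minimal sketch of PCA computed through a singular value decomposition. The input table is purely synthetic (random numbers standing in for 500 objects observed in 10 bands) and the function name `pca` is a hypothetical helper; nothing here reproduces the data or the pipeline of the article.

```python
import numpy as np

def pca(X, n_components):
    """Project the rows of X onto their leading principal components."""
    Xc = X - X.mean(axis=0)                       # center each feature
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]                # orthonormal directions
    scores = Xc @ components.T                    # low-dimensional representation
    explained = (S ** 2) / np.sum(S ** 2)         # fraction of variance per axis
    return scores, components, explained[:n_components]

rng = np.random.default_rng(0)

# Synthetic stand-in for an SED table: two hidden factors mixed into 10 bands.
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 10))
seds = latent @ mixing + 0.01 * rng.normal(size=(500, 10))

scores, components, explained = pca(seds, n_components=2)
```

PCA is restricted to linear projections of this kind, whereas an autoencoder learns a non-linear encoding, which is one plausible reason the two representations separate galaxy properties differently.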