CosmosClub: Antoine Labatie (11/04/2019)

Date: April 11th 2019, 11am

Speaker: Antoine Labatie

Title: Characterizing Well-behaved vs. Pathological Deep Neural Networks [paper]

Room: Kepler


We introduce a novel approach, requiring only mild assumptions, for the characterization of deep neural networks at initialization. Our approach applies both to fully-connected and convolutional networks and easily incorporates the commonly used techniques of batch normalization and skip-connections. Our key insight is to consider the evolution with depth of statistical moments of signal and noise, thereby characterizing the presence or the absence of pathologies in the hypothesis space encoded by the choice of hyperparameters. We establish: (i) for feedforward networks with and without batch normalization, depth multiplicativity inevitably leads to ill-behaved moments and pathologies; (ii) for residual networks with batch normalization, on the other hand, skip-connections induce power-law rather than exponential behaviour, leading to well-behaved moments and no pathology.

CosmosClub: Adrien Picquenot (14/03/2019)

Date: March 14th 2019, 11am

Speaker: Adrien Picquenot (CEA Saclay)

Title: Applying the GMCA to extended sources in X-Ray Astronomy

Room: Kepler


In high-energy astronomy, spectro-imaging instruments such as X-ray detectors allow investigation of the spatial and spectral properties of extended sources including galaxy clusters, galaxies, diffuse interstellar medium, supernova remnants and pulsar wind nebulae. In these sources, each physical component possesses a different spatial and spectral signature, but the components are entangled. Extracting the intrinsic spatial and spectral information of the individual components from this data is a challenging task. Current analysis methods in this field do not fully exploit the 2D-1D (x,y,E) nature of the data, as the spatial and spectral information are considered separately. Here we investigate the application of the GMCA, initially developed to extract an image of the Cosmic Microwave Background from Planck data, in an X-ray context. 
The performance of the GMCA on X-ray data is tested using Monte-Carlo simulations of supernova remnant toy models, designed to represent typical science cases. We find that the GMCA is able to separate highly entangled components in X-ray data even in high contrast scenarios, and can extract with high accuracy the spectrum and map of each physical component. A modification of the algorithm is proposed in order to improve the spectral fidelity in the case of strongly overlapping spatial components, and we investigate a resampling method to derive realistic uncertainties associated with the results of the algorithm. Applying the modified algorithm to the deep Chandra observations of Cassiopeia A, we are able to produce detailed maps of the synchrotron emission at low energies (0.6-2.2 keV), and of the red/blue shifted distributions of a number of elements including Si and Fe K.
We also tested pGMCA, a new version of the GMCA that takes Poisson noise into account and is better suited to the X-ray nature of the data. A first application to the Perseus galaxy cluster shows impressive results, retrieving components that the original GMCA could not find.


CosmosClub: Clément Leloup (28/02/2019)

Date: February 28th 2019, 11am

Speaker: Clément Leloup (CEA Paris-Saclay, DPhP)

Title: Observational status of the Galileon model from cosmological data and gravitational waves [slides]

Room: Cassini


The Galileon model is a tensor-scalar theory of gravity which offers a theoretically viable explanation for the late-time acceleration of the Universe's expansion and recovers General Relativity in the strong field limit. The main goal is to establish the status of the model from cosmological observations. The multi-messenger observation of GW170817 and its consequences for the Galileon model will nevertheless be briefly discussed, since most allowed Galileon scenarios have a gravitational wave speed different from the speed of light.
Most constraints obtained so far on Galileon model parameters from cosmological data were derived for the limited subset of tracker solutions and reported tensions between the model and data. We present here an exploration of the general solution of the Galileon model, which is confronted with recent cosmological data.
We find that, while the general solution provides a good fit to CMB spectra, it fails to reproduce cosmological data when extending the comparison to BAO and SNIa data. Tensions remain if the models are extended with an additional free parameter, such as the sum of active neutrino masses or the normalization of the CMB lensing spectrum.



CosmosClub: Tobias Liaudat (11/02/2019)

Date: February 11th 2019, 11am

Speaker: Tobias Liaudat (ENSAE ParisTech)

Title: Optimal Transport for Signed Measures [slides]

Room: Cassini


Optimal transport has become a mathematical gem at the interface of probability, analysis and optimization. The theory, initiated by Monge and extended by Kantorovich, has long been developed by the mathematical community and has found applications in several fields, such as differential geometry, PDEs and gradient flows, to name a few.

Lately, it has begun to make its way into the machine learning and data processing community. Optimal transport can be used to define a distance that is very useful for comparing histograms or point clouds, a typical scenario in present-day applications. Some breakthrough contributions, such as entropic regularization, made it possible to convexify and efficiently solve the transport problem, opening the door to many applications such as Wasserstein barycenters and dictionary learning.

Nevertheless, Optimal Transport has not yet fully entered the signal processing community. One of the obstacles is that the theory is well developed in the space of nonnegative measures, but very little work has been done to extend it to signed measures. Taking a machine learning point of view, this presentation will deal with some theoretical aspects of an Optimal Transport based "distance" for signed measures that can be useful for future applications such as Blind Source Separation. An algorithm for its efficient computation will be presented as well.

Journal Club#2: DES cosmological constraints, Stochastic PALM and email signature

Date: February 7th 2019, 11am

Presenters: Fadi, Kostas & Martin

Room: Cassini

Journal Club#1: Genetic Algorithms, Adaptive Moments and Latex drawings

CosmosClub: Pol del Aguila Pla (14/01/2019)

Date: January 14th 2019, 10am

Speaker: Pol del Aguila Pla (KTH Royal Institute of Technology)

Title: Cell detection by functional inverse diffusion and non-negative group sparsity - Biology, physics, math and engineering [slides]

Room: Kepler


Image-based immunoassays are used every day across the world to develop new drugs, diagnose diseases, and research the workings of the human body. Since August, some of these are analyzed by technology that, at its core, has an algorithm included in my Ph.D. work. In this talk, I will outline the research project that led to this algorithm and go through the modeling and optimization results we present in [1] and [2]. This will include, among others, the modeling of complex biochemical assays as systems of partial differential equations, a linear-systems view on diffusion models, investigations in group-sparsity regularization in function spaces, and first-order methods for optimization problems with more than 25 million variables. To conclude the presentation, I will go through the new paths we have started to explore in connecting all this work to deep learning frameworks [3].

[1]: Pol del Aguila Pla and Joakim Jaldén, “Cell detection by functional inverse diffusion and non-negative group sparsity—Part I: Modeling and Inverse Problems”, IEEE Transactions on Signal Processing, vol. 66, no. 20, pp. 5407–5421, 2018
[2]: Pol del Aguila Pla and Joakim Jaldén, “Cell detection by functional inverse diffusion and non-negative group sparsity—Part II: Proximal optimization and Performance Evaluation”, IEEE Transactions on Signal Processing, vol. 66, no. 20, pp. 5422–5437, 2018
[3]: Pol del Aguila Pla, Vidit Saxena, and Joakim Jaldén, “SpotNet – Learned iterations for cell detection in image-based immunoassays”, Submitted to the 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019), arXiv: 1810.06132 [eess.SP]




Cosmostat Day on Machine Learning in Astrophysics

Date: January the 24th, 2019

Organizer: Joana Frontera-Pons


Local information

CEA Saclay is around 23 km South of Paris. The astrophysics division (DAp) is located at the CEA site at Orme des Merisiers, which is around 1 km South of the main CEA campus. See  for detailed information on how to arrive.

On January the 24th, 2019, we organize the fourth day on machine learning in astrophysics at DAp, CEA Saclay. 


All talks are taking place at DAp, Salle Galilée (Building 713)

14:00 - 14:30h. Machine Learning in High Energy Physics : trends and successes - David Rousseau (LAL)
14:30 - 15:00h. Learning recurring patterns in large signals with convolutional dictionary learning - Thomas Moreau (Parietal team - INRIA Saclay)
15:00 - 15:30h. Distinguishing standard and modified gravity cosmologies with machine learning - Austin Peel (CEA Saclay - CosmoStat)

15:30 - 16:00h. Coffee break

16:00 - 16:30h. The ASAP algorithm for nonsmooth nonconvex optimization. Applications in imagery - Pauline Tan (LJLL - Sorbonne Université)
16:30 - 17:00h. Deep Learning for Blended Source Identification in Galaxy Survey Data - Samuel Farrens (CEA Saclay - CosmoStat)

Machine Learning in High Energy Physics : trends and successes

David Rousseau (LAL)

Machine Learning was used to some extent in HEP in the nineties, then at the Tevatron and recently at the LHC (mostly Boosted Decision Trees). However, with the birth of the internet giants at the turn of the century, there has been an explosion of Machine Learning tools in industry. A collective effort has been under way for the last few years to bring state-of-the-art Machine Learning tools to High Energy Physics.
This talk will give a tour d’horizon of Machine Learning in HEP: a review of tools; examples of applications, some used currently, some in a (possibly distant) future (e.g. deep learning, image vision, GAN); recent and future HEP ML Kaggle competitions. I’ll conclude with the key points for setting up frameworks for High Energy Physics and Machine Learning collaborations.

Learning recurring patterns in large signals with convolutional dictionary learning

Thomas Moreau (Parietal team - INRIA Saclay)

Convolutional dictionary learning has become a popular tool in image processing for denoising or inpainting. This technique extends dictionary learning to learn adapted bases that are shift invariant. This talk will discuss how this technique can also be used in the context of large multivariate signals to learn and localize recurring patterns. I will discuss both computational aspects, with efficient iterative and distributed convolutional sparse coding algorithms, and a novel rank 1 constraint for the learned atoms. This constraint, inspired by the underlying physical model for neurological signals, is then used to highlight relevant structure in MEG signals.

Distinguishing standard and modified gravity cosmologies with machine learning

Austin Peel (CEA Saclay - CosmoStat)

Modified gravity models that include massive neutrinos can mimic the standard concordance model in terms of weak-lensing observables. The inability to distinguish between these cases could limit our understanding of the origin of cosmic acceleration and of the fundamental nature of gravity. I will present a neural network we have designed to classify such cosmological scenarios based on the weak-lensing maps they generate. I will discuss the network's performance on both clean and noisy data, as well as how results compare to conventional statistical approaches.

The ASAP algorithm for nonsmooth nonconvex optimization. Applications in imagery

Pauline Tan (LJLL - Sorbonne Université)

In this talk, I will address the challenging problem of optimizing nonsmooth and nonconvex objective functions. Such problems are increasingly encountered in applications, especially when tackling joint estimation problems. I will propose a novel algorithm and demonstrate its convergence properties. Finally, three real applications in industrial imagery problems will be presented.

Deep Learning for Blended Source Identification in Galaxy Survey Data

Samuel Farrens (CEA Saclay - CosmoStat)

Weak gravitational lensing is a powerful probe of cosmology that will be employed by upcoming surveys such as Euclid and LSST to map the distribution of dark matter in the Universe. The technique, however, requires precise measurements of galaxy shapes over large areas. The chance alignment of galaxies along the line of sight, i.e. blending of images, can lead to biased shape measurements that propagate to parameter estimates. Machine learning techniques can provide an automated and robust way of dealing with these blended sources. In this talk I will discuss how machine learning can be used to classify sources identified in survey data as blended or not and show some preliminary results for CFIS simulations. I will then present some plans for future developments making use of multi-class classification and segmentation.

 Previous Cosmostat Days on Machine Learning in Astrophysics :