CosmosClub: Tobias Liaudat (11/02/2019)

Date: February 11th 2019, 11am

Speaker: Tobias Liaudat (ENSAE ParisTech)

Title: Optimal Transport for Signed Measures

Room: Cassini


Optimal transport has become a mathematical gem at the interface of probability, analysis and optimization. The theory has long been developed by the mathematical community, started by Monge and continued by Kantorovich, and has found applications in several fields such as differential geometry, PDEs and gradient flows, to name a few.

Lately, it has begun to make its way into the machine learning and data processing communities. Optimal transport can be used to define a distance that is very useful for comparing histograms or point clouds, a typical scenario in today's applications. Some breakthrough contributions, like entropic regularization, have made it possible to convexify and efficiently solve the transport problem, opening the door to many applications such as Wasserstein barycenters and dictionary learning.

Nevertheless, Optimal Transport has not fully entered the signal processing community. One of the obstacles is that the theory is well developed for nonnegative measures, but very little work has been done to extend it to signed measures. From a machine learning point of view, this presentation will deal with some theoretical aspects of an Optimal Transport based "distance" for signed measures that can be useful for future applications such as Blind Source Separation. An algorithm for its efficient computation will be presented as well.
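
The entropic regularization mentioned in the abstract is commonly solved with Sinkhorn iterations. The following is a minimal sketch for ordinary nonnegative histograms only; it is not the signed-measure "distance" or the algorithm of the talk. The function name `sinkhorn`, the defaults `eps` and `n_iter`, and the toy histograms are assumptions made for illustration.

```python
import math


def sinkhorn(a, b, cost, eps=0.1, n_iter=500):
    """Entropy-regularized optimal transport between two histograms.

    a, b : nonnegative weights, each summing to 1
    cost : cost[i][j] is the cost of moving mass from bin i to bin j
    eps  : regularization strength (illustrative default, an assumption)
    Returns the transport plan P and the transport cost sum(P * cost).
    """
    n, m = len(a), len(b)
    # Gibbs kernel K = exp(-cost / eps)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    P = [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
    total = sum(P[i][j] * cost[i][j] for i in range(n) for j in range(m))
    return P, total


# Toy example: two histograms on the same 1-D grid, squared-distance cost.
grid = [0.0, 0.5, 1.0]
a = [0.5, 0.5, 0.0]
b = [0.0, 0.5, 0.5]
cost = [[(x - y) ** 2 for y in grid] for x in grid]
plan, value = sinkhorn(a, b, cost)
```

Smaller values of `eps` bring the result closer to the unregularized transport cost, at the price of slower convergence and possible numerical underflow in the kernel.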

Journal Club#2: DES cosmological constraints, Stochastic PALM and email signature

Date: February 7th 2019, 11am

Presenters: Fadi, Kostas & Martin

Room: Cassini

Journal Club#1: Genetic Algorithms, Adaptive Moments and Latex drawings

CosmosClub: Pol del Aguila Pla (14/01/2019)

Date: January 14th 2019, 10am

Speaker: Pol del Aguila Pla (KTH Royal Institute of Technology)

Title: Cell detection by functional inverse diffusion and non-negative group sparsity - Biology, physics, math and engineering [slides]

Room: Kepler


Image-based immunoassays are used every day across the world to develop new drugs, diagnose diseases, and research the workings of the human body. Since August, some of these have been analyzed by technology that, at its core, has an algorithm included in my Ph.D. work. In this talk, I will outline the research project that led to this algorithm and go through the modeling and optimization results we present in [1] and [2]. This will include, among others, the modeling of complex biochemical assays as systems of partial differential equations, a linear-systems view on diffusion models, investigations in group-sparsity regularization in function spaces, and first-order methods for optimization problems with more than 25 million variables. To conclude the presentation, I will go through the new paths we have started to explore in connecting all this work to deep learning frameworks [3].

[1]: Pol del Aguila Pla and Joakim Jaldén, “Cell detection by functional inverse diffusion and non-negative group sparsity—Part I: Modeling and Inverse Problems”, IEEE Transactions on Signal Processing, vol. 66, no. 20, pp. 5407–5421, 2018
[2]: Pol del Aguila Pla and Joakim Jaldén, “Cell detection by functional inverse diffusion and non-negative group sparsity—Part II: Proximal optimization and Performance Evaluation”, IEEE Transactions on Signal Processing, vol. 66, no. 20, pp. 5422–5437, 2018
[3]: Pol del Aguila Pla, Vidit Saxena, and Joakim Jaldén, “SpotNet – Learned iterations for cell detection in image-based immunoassays”, Submitted to the 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019), arXiv: 1810.06132 [eess.SP]




Cosmostat Day on Machine Learning in Astrophysics

Date: January the 24th, 2019

Organizer: Joana Frontera-Pons


Local information

CEA Saclay is around 23 km South of Paris. The astrophysics division (DAp) is located at the CEA site at Orme des Merisiers, which is around 1 km South of the main CEA campus. See  for detailed information on how to arrive.

On January the 24th, 2019, we organize the fourth day on machine learning in astrophysics at DAp, CEA Saclay. 


All talks are taking place at DAp, Salle Galilée (Building 713)

14:00 - 14:30h. Machine Learning in High Energy Physics : trends and successes -  David Rousseau (LAL)                             
14:30 - 15:00h. Learning recurring patterns in large signals with convolutional dictionary learning - Thomas Moreau (Parietal team - INRIA Saclay)
15:00 - 15:30h. Distinguishing standard and modified gravity cosmologies with machine learning -  Austin Peel (CEA Saclay - CosmoStat)

15:30 - 16:00h. Coffee break

16:00 - 16:30h. The ASAP algorithm for nonsmooth nonconvex optimization. Applications in imagery - Pauline Tan (LJLL - Sorbonne Université)
16:30 - 17:00h. Deep Learning for Blended Source Identification in Galaxy Survey Data - Samuel Farrens (CEA Saclay - CosmoStat)

Machine Learning in High Energy Physics : trends and successes

David Rousseau (LAL)

Machine Learning was used to some extent in HEP in the nineties, then at the Tevatron and more recently at the LHC (mostly Boosted Decision Trees). However, with the birth of the internet giants at the turn of the century, there has been an explosion of Machine Learning tools in industry. Over the last few years, a collective effort has been under way to bring state-of-the-art Machine Learning tools to High Energy Physics.
This talk will give a tour d’horizon of Machine Learning in HEP: a review of tools; examples of applications, some in current use and some for a (possibly distant) future (e.g. deep learning, image vision, GAN); and recent and future HEP ML Kaggle competitions. I’ll conclude with the key points for setting up frameworks for High Energy Physics and Machine Learning collaborations.

Learning recurring patterns in large signals with convolutional dictionary learning

Thomas Moreau (Parietal team - INRIA Saclay)

Convolutional dictionary learning has become a popular tool in image processing for denoising or inpainting. This technique extends dictionary learning to learn adapted bases that are shift-invariant. This talk will discuss how this technique can also be used in the context of large multivariate signals to learn and localize recurring patterns. I will discuss both computational aspects, with efficient iterative and distributed convolutional sparse coding algorithms, and a novel rank-1 constraint for the learned atoms. This constraint, inspired by the underlying physical model for neurological signals, is then used to highlight relevant structure in MEG signals.

Distinguishing standard and modified gravity cosmologies with machine learning

Austin Peel (CEA Saclay - CosmoStat)

Modified gravity models that include massive neutrinos can mimic the standard concordance model in terms of weak-lensing observables. The inability to distinguish between these cases could limit our understanding of the origin of cosmic acceleration and of the fundamental nature of gravity. I will present a neural network we have designed to classify such cosmological scenarios based on the weak-lensing maps they generate. I will discuss the network's performance on both clean and noisy data, as well as how results compare to conventional statistical approaches.

The ASAP algorithm for nonsmooth nonconvex optimization. Applications in imagery

Pauline Tan (LJLL - Sorbonne Université)

In this talk, I will address the challenging problem of optimizing nonsmooth and nonconvex objective functions. Such problems are increasingly encountered in applications, especially when tackling joint estimation problems. I will propose a novel algorithm and demonstrate its convergence properties. Finally, three real applications to industrial imaging problems will be presented.

Deep Learning for Blended Source Identification in Galaxy Survey Data

Samuel Farrens (CEA Saclay - CosmoStat)

Weak gravitational lensing is a powerful probe of cosmology that will be employed by upcoming surveys such as Euclid and LSST to map the distribution of dark matter in the Universe. The technique, however, requires precise measurements of galaxy shapes over large areas. The chance alignment of galaxies along the line of sight, i.e. blending of images, can lead to biased shape measurements that propagate to parameter estimates. Machine learning techniques can provide an automated and robust way of dealing with these blended sources. In this talk I will discuss how machine learning can be used to classify sources identified in survey data as blended or not and show some preliminary results for CFIS simulations. I will then present some plans for future developments making use of multi-class classification and segmentation.


CosmosClub: Sylvain Vanneste (16/11/2018)

Date: November 16th 2018, 2pm

Speaker: Sylvain Vanneste (LAL)

Title: Detecting CMB B-modes

Room: Kepler


The discovery of the Cosmic Microwave Background (CMB) by Penzias and Wilson in 1964 was an important confirmation of the Big Bang theory. The CMB is a background of photons emitted during the first instants of our Universe's history that still permeates it today. Since its discovery, numerous telescope, balloon-borne and satellite experiments, such as Planck, have produced precise measurements and temperature maps of the CMB, from which we have been able to deduce important information about our Universe.
However, a piece of the cosmological puzzle is still missing: inflation, a short period during which the Universe would have grown exponentially in size. Inflation is a theory introduced to solve several major cosmological questions, and which, to date, has only been verified indirectly.
The inflation phase, however, should produce a stochastic background of primordial gravitational waves that may have left an imprint on the CMB. More specifically, these gravitational waves would induce the so-called B-mode patterns on the polarisation maps of CMB photons. The precise measurement of the B-modes, still undetected to this day, represents the most powerful probe of inflationary physics.
The expected B-mode signal is, however, of low intensity, and many additional experimental difficulties arise when trying to measure it. Dust from our own galaxy partially masks the CMB, and many models have been developed to clean up galactic contamination. The extraction and analysis of the measured signal thus require the development of precise statistical algorithms. These must take into account the complexity of the data produced, such as residual galactic contamination, incomplete sky coverage, and statistical and instrumental errors.


CosmosClub: Jia Liu (07/11/2018)

Date: November 7th 2018, 2pm

Speaker: Jia Liu (Princeton University)

Title: Cosmology in the nonlinear regime with massive neutrinos [slides]

Room: Kepler


The non-zero mass of neutrinos suppresses the growth of cosmic structure on small scales. Since the level of suppression depends on the sum of the masses of the three active neutrino species, the evolution of large-scale structure is a promising tool to constrain the total mass of neutrinos and possibly shed light on the mass hierarchy. I will discuss recent progress and future prospects to constrain the neutrino mass sum with cosmology, with a focus on observables in the nonlinear regime.

CosmosClub: Chieh-An Lin (09/10/2018)

Date: October 9th 2018

Speaker: Chieh-An Lin (IfA, University of Edinburgh)

Title: Predicting weak-lensing covariance with a fast simulator

Weak lensing has been shown to be an outstanding tool to constrain cosmology. State-of-the-art studies have used the power spectrum and peak counts as estimators, and the combination of the two can break parameter degeneracies and maximize the information extracted.

To constrain cosmology with both estimators, understanding the joint covariance is crucial. However, calculating it analytically seems to be intractable for peaks, and the empirical approach with N-body simulations will become expensive as the size of lensing surveys increases.

I will present a fast solution to this problem. The proposed approach simulates lognormal fields and halo models to predict lensing signals. We compared the resulting joint covariance with the one from a large number of N-body simulations and found excellent agreement. In addition, our approach is orders of magnitude faster than N-body runs.
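
As a rough illustration of the lognormal fields mentioned in the abstract, the sketch below draws a zero-mean lognormal field by exponentiating Gaussian pixels. It is not the speaker's simulator: the pixels here are uncorrelated, there is no power spectrum or halo model, and the name `lognormal_field` and the parameters `n_pix`, `sigma` and `seed` are assumptions chosen for the example.

```python
import math
import random


def lognormal_field(n_pix, sigma, seed=0):
    """Zero-mean lognormal field built from uncorrelated Gaussian pixels.

    n_pix : number of pixels to draw
    sigma : standard deviation of the underlying Gaussian field
    seed  : seed of the local random generator, for reproducibility
    """
    rng = random.Random(seed)
    field = []
    for _ in range(n_pix):
        g = rng.gauss(0.0, sigma)
        # Subtracting sigma^2 / 2 gives E[exp(...)] = 1, so after removing 1
        # the field has zero mean and is bounded below by -1.
        field.append(math.exp(g - 0.5 * sigma ** 2) - 1.0)
    return field


kappa = lognormal_field(n_pix=20000, sigma=0.5)
mean_kappa = sum(kappa) / len(kappa)
```

A realistic field would first impose spatial correlations on the Gaussian pixels, for instance through a chosen power spectrum, before applying the same exponential transform.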