The workshop on Computational Intelligence in Remote Sensing and Astrophysics (CIRSA) aims at bringing together researchers from the environmental sciences, astrophysics and computer science communities in an effort to understand the potential and pitfalls of novel computational intelligence paradigms including machine learning and large-scale data processing.



EuroPython 2019

Date: July 8-14, 2019

Venue: Basel, CH



Twitter: @europython

Conference App will be announced on the blog.

EuroPython is an annual conference hosting ~1200 participants from academia and industry who are interested in the development and applications of the Python programming language. It is also a good opportunity for students and postdocs who wish to find a job outside academia.

For more info, contact: Valeria Pettorino


A Distributed Learning Architecture for Scientific Imaging Problems


Authors: A. Panousopoulou, S. Farrens, K. Fotiadou, A. Woiselle, G. Tsagkatakis, J-L. Starck,  P. Tsakalides
Journal: arXiv
Year: 2018
Download: ADS | arXiv


Current trends in scientific imaging are challenged by the emerging need of integrating sophisticated machine learning with Big Data analytics platforms. This work proposes an in-memory distributed learning architecture for enabling sophisticated learning and optimization techniques on scientific imaging problems, which are characterized by the combination of variant information from different origins. We apply the resulting, Spark-compliant, architecture on two emerging use cases from the scientific imaging domain, namely: (a) the space variant deconvolution of galaxy imaging surveys (astrophysics), (b) the super-resolution based on coupled dictionary training (remote sensing). We conduct evaluation studies considering relevant datasets, and the results report at least 60% improvement in time response against the conventional computing solutions. Ultimately, the offered discussion provides useful practical insights on the impact of key Spark tuning parameters on the speedup achieved, and the memory/disk footprint.
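The architecture itself targets Apache Spark, but the pattern it distributes — mapping a per-image learning kernel over partitions of the data set while keeping intermediate results in memory — can be sketched with Python's standard library standing in for Spark executors. The Wiener-style deconvolution and all names below are illustrative stand-ins, not the paper's actual space-variant implementation:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def wiener_deconvolve(image, psf, reg=1e-2):
    """Naive Fourier-domain deconvolution of one galaxy stamp
    (illustrative stand-in for a space-variant deconvolution kernel)."""
    psf_hat = np.fft.fft2(psf, s=image.shape)
    filt = np.conj(psf_hat) / (np.abs(psf_hat) ** 2 + reg)
    return np.fft.ifft2(np.fft.fft2(image) * filt).real

def deconvolve_survey(stamps, psfs, n_workers=4):
    """Map the per-stamp kernel over the data set in parallel -- the same
    map-style pattern an RDD.map() call spreads across Spark executors."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(wiener_deconvolve, stamps, psfs))
```

In Spark proper, `deconvolve_survey` would become a `map` over an RDD of (stamp, PSF) pairs, with the caching and partitioning choices the abstract's tuning discussion refers to.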

Cosmostat Day on Machine Learning in Astrophysics

Date: January 24th, 2019

Organizer: Joana Frontera-Pons


Local information

CEA Saclay is around 23 km South of Paris. The astrophysics division (DAp) is located at the CEA site at Orme des Merisiers, which is around 1 km South of the main CEA campus. See  for detailed information on how to arrive.

On January 24th, 2019, we are organizing the fourth CosmoStat Day on machine learning in astrophysics at DAp, CEA Saclay.


All talks are taking place at DAp, Salle Galilée (Building 713)

14:00 - 14:30h. Machine Learning in High Energy Physics: trends and successes - David Rousseau (LAL)
14:30 - 15:00h. Learning recurring patterns in large signals with convolutional dictionary learning - Thomas Moreau (Parietal team - INRIA Saclay)
15:00 - 15:30h. Distinguishing standard and modified gravity cosmologies with machine learning -  Austin Peel (CEA Saclay - CosmoStat)

15:30 - 16:00h. Coffee break

16:00 - 16:30h. The ASAP algorithm for nonsmooth nonconvex optimization. Applications in imagery - Pauline Tan (LJLL - Sorbonne Université)
16:30 - 17:00h. Deep Learning for Blended Source Identification in Galaxy Survey Data - Samuel Farrens (CEA Saclay - CosmoStat)

Machine Learning in High Energy Physics: trends and successes

David Rousseau (LAL)

Machine Learning was used to some extent in HEP in the nineties, then at the Tevatron and more recently at the LHC (mostly Boosted Decision Trees). However, with the birth of the internet giants at the turn of the century, there has been an explosion of Machine Learning tools in industry. A collective effort has been under way for the last few years to bring state-of-the-art Machine Learning tools to High Energy Physics.
This talk will give an overview of Machine Learning in HEP: a review of tools; examples of applications, some in current use, some foreseen for a (possibly distant) future (e.g. deep learning, image vision, GANs); and recent and future HEP ML Kaggle competitions. I'll conclude on the key points for setting up frameworks for collaboration between High Energy Physics and Machine Learning.

Learning recurring patterns in large signals with convolutional dictionary learning

Thomas Moreau (Parietal team - INRIA Saclay)

Convolutional dictionary learning has become a popular tool in image processing for denoising or inpainting. This technique extends dictionary learning to learn adapted bases that are shift-invariant. This talk will discuss how this technique can also be used in the context of large multivariate signals to learn and localize recurring patterns. I will discuss computational aspects, with efficient iterative and distributed convolutional sparse coding algorithms, as well as a novel rank-1 constraint for the learned atoms. This constraint, inspired by the underlying physical model for neurological signals, is then used to highlight relevant structure in MEG signals.

Distinguishing standard and modified gravity cosmologies with machine learning

Austin Peel (CEA Saclay - CosmoStat)

Modified gravity models that include massive neutrinos can mimic the standard concordance model in terms of weak-lensing observables. The inability to distinguish between these cases could limit our understanding of the origin of cosmic acceleration and of the fundamental nature of gravity. I will present a neural network we have designed to classify such cosmological scenarios based on the weak-lensing maps they generate. I will discuss the network's performance on both clean and noisy data, as well as how results compare to conventional statistical approaches.

The ASAP algorithm for nonsmooth nonconvex optimization. Applications in imagery

Pauline Tan (LJLL - Sorbonne Université)

In this talk, I will address the challenging problem of optimizing nonsmooth and nonconvex objective functions. Such problems are increasingly encountered in applications, especially when tackling joint estimation problems. I will propose a novel algorithm and demonstrate its convergence properties. Eventually, three actual applications in industrial imagery problems will be presented.

Deep Learning for Blended Source Identification in Galaxy Survey Data

Samuel Farrens (CEA Saclay - CosmoStat)

Weak gravitational lensing is a powerful probe of cosmology that will be employed by upcoming surveys such as Euclid and LSST to map the distribution of dark matter in the Universe. The technique, however, requires precise measurements of galaxy shapes over large areas. The chance alignment of galaxies along the line of sight, i.e. the blending of images, can lead to biased shape measurements that propagate to parameter estimates. Machine learning techniques can provide an automated and robust way of dealing with these blended sources. In this talk I will discuss how machine learning can be used to classify sources identified in survey data as blended or not, and show some preliminary results for CFIS simulations. I will then present plans for future developments making use of multi-class classification and segmentation.

Previous CosmoStat Days on Machine Learning in Astrophysics:

DEDALE: Mathematical Tools to Help Navigate the Big Data Maze

Managing the huge volumes and varying streams of Big Data digital information presents formidable analytical challenges to anyone wanting to make sense of it. Consider the mapping of space, where scientists collect, process and transmit giga-scale data sets to generate accurate visual representations of millions of galaxies. Or consider the vast information being generated by genomics and bioinformatics as genomes are mapped and new drugs discovered. And soon the Internet of Things will bring millions of interconnected information-sensing and transmitting devices.

Improving Weak Lensing Mass Map Reconstructions using Gaussian and Sparsity Priors: Application to DES SV


Authors: N. Jeffrey, F. B. Abdalla, O. Lahav, F. Lanusse, J.-L. Starck, et al.
Year: 01/2018
Download: ADS | arXiv


Mapping the underlying density field, including non-visible dark matter, using weak gravitational lensing measurements is now a standard tool in cosmology. Due to its importance to the science results of current and upcoming surveys, the quality of the convergence reconstruction methods should be well understood. We compare three different mass map reconstruction methods: Kaiser-Squires (KS), Wiener filter, and GLIMPSE. KS is a direct inversion method, taking no account of survey masks or noise. The Wiener filter is well motivated for Gaussian density fields in a Bayesian framework. The GLIMPSE method uses sparsity, with the aim of reconstructing non-linearities in the density field. We compare these methods with a series of tests on the public Dark Energy Survey (DES) Science Verification (SV) data and on realistic DES simulations. The Wiener filter and GLIMPSE methods offer substantial improvement on the standard smoothed KS with a range of metrics. For both the Wiener filter and GLIMPSE convergence reconstructions we present a 12% improvement in Pearson correlation with the underlying truth from simulations. To compare the mapping methods' abilities to find mass peaks, we measure the difference between peak counts from simulated ΛCDM shear catalogues and catalogues with no mass fluctuations. This is a standard data vector when inferring cosmology from peak statistics. The maximum signal-to-noise value of these peak statistic data vectors was increased by a factor of 3.5 for the Wiener filter and by a factor of 9 using GLIMPSE. With simulations we measure the reconstruction of the harmonic phases, showing that the concentration of the phase residuals is improved by 17% by GLIMPSE and 18% by the Wiener filter. We show that the correlation between the reconstructions from data and the foreground redMaPPer clusters is increased by 18% by the Wiener filter and by 32% by GLIMPSE.

Big Bang and Big Data

The new international projects, such as the Euclid space telescope, are ushering in the era of Big Data for cosmologists. Our questions about dark matter and dark energy, which on their own account for 95% of the content of our Universe, throw up new algorithmic, computational and theoretical challenges. A further challenge concerns reproducible research, a fundamental concept for the verification and credibility of published results.