10th Astronomical Data Analysis Summer School (ADA X)
Date: September 18, 2023 - September 22, 2023
Venue: Hersonissos, Crete, Greece
Website: http://ada10.cosmostat.org
Date: August 16 - August 27, 2021
Venue: Anglet, France
Website: https://ecole-euclid.cnrs.fr/2021-accueil/
Lecture "Weak gravitational lensing" (Le lentillage gravitationnel), cycle 2, Martin Kilbinger.
Find here links to the lecture notes, TD exercises, "tables rondes" topics, and other information.
Date: March the 5th, 2021
Organizer: Joana Frontera-Pons <joana.frontera-pons@cea.fr>
Venue: Remote conference. Zoom link: https://esade.zoom.us/j/88535176160?pwd=RzU1cHA5Z0xrWXkyN0x1a2tJSHZ1Zz09
On March the 5th, 2021, we organize the 6th day on machine learning in astrophysics at DAp, CEA Saclay.
All talks take place remotely.
13:30 - 13:40h. Welcome message
13:40 - 14:20h. Data-driven detection of multi-messenger transients - Iftach Sadeh (Deutsches Elektronen-Synchrotron)
14:20 - 15:00h. Deep Learning in Radio Astronomy - Vesna Lukic (Vrije Universiteit Brussel)
15:00 - 15:40h. Machine Learning for Galaxy Image Reconstruction with Problem Specific Loss - Fadi Nammour (CosmoStat - CEA Saclay)
15:40 - 16:00h. Coffee break with virtual croissants
16:00 - 16:40h. Anomaly detection with generative methods - Coloma Ballester (Universitat Pompeu Fabra)
16:40 - 17:20h. Deep learning for environmental sciences - Jan Dirk Wegner (ETH Zurich)
17:20 - 18:00h. Graph Neural Networks - Fernando Gama (University of California, Berkeley)
18:00 - 18:05h. End of the day
Iftach Sadeh (Deutsches Elektronen-Synchrotron)
The primary challenge in the study of explosive astrophysical transients is their detection and characterisation using multiple messengers. For this purpose, we have developed a new data-driven discovery framework, based on deep learning. We demonstrate its use for searches involving neutrinos, optical supernovae, and gamma rays. We show that we can match or substantially improve upon the performance of state-of-the-art techniques, while significantly minimising the dependence on modelling and on instrument characterisation. In particular, our approach is intended for near- and real-time analyses, which are essential for effective follow-up of detections. Our algorithm is designed to combine a range of instruments and types of input data, representing different messengers, physical regimes, and temporal scales. The methodology is optimised for agnostic searches of unexpected phenomena, and has the potential to substantially enhance their discovery prospects.
Vesna Lukic (Vrije Universiteit Brussel)
Machine learning techniques have proven to be increasingly useful in astronomical applications over the last few years, for example in image classification and time series analysis. A topic of current interest is the classification of radio galaxy morphologies, as it gives us insight into the nature of Active Galactic Nuclei and structure formation. Future surveys such as the Square Kilometre Array (SKA) will detect many millions of sources and will require the use of automated techniques. Convolutional neural networks are a machine learning technique that has been very successful in image classification, due to their ability to capture high-dimensional features in the data. We show the performance of simple convolutional network architectures in classifying radio sources from the Radio Galaxy Zoo. The use of pooling in such networks results in information losses which adversely affect the classification performance; Capsule networks, however, preserve this information with the use of dynamic routing. We compare a couple of convolutional neural network architectures against variations of Capsule network setups and evaluate their performance in replicating the classifications of radio galaxies detected by the Low Frequency Array (LOFAR). Finally, we also show how it is possible to use convolutional neural networks to find sources in radio surveys.
Fadi Nammour (CosmoStat - CEA Saclay)
Telescope images are corrupted by blur and noise. Generally, blur is represented by a convolution with a Point Spread Function and noise is modelled as Additive Gaussian Noise. Restoring galaxy images from the observations is an inverse problem that is ill-posed and specifically ill-conditioned. The majority of the standard reconstruction methods minimise the Mean Square Error to reconstruct images, without any guarantee that the shape of the objects contained in the data (e.g. galaxies) is preserved. Here we introduce a shape constraint, exhibit its properties and show how it preserves galaxy shapes when combined with Machine Learning reconstruction algorithms.
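The observation model described in this abstract (convolution with a Point Spread Function plus additive Gaussian noise) can be illustrated with a minimal one-dimensional sketch. The profile `galaxy`, the kernel `psf` and the noise level are hypothetical values chosen for illustration; this is not the speaker's code and it does not implement the shape constraint.

```python
import random


def convolve_full(signal, kernel):
    """Discrete 'full' convolution of two 1-D sequences."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def observe(image, psf, sigma, rng):
    """Forward model y = psf * image + n, with n ~ N(0, sigma^2) per pixel."""
    blurred = convolve_full(image, psf)
    return [b + rng.gauss(0.0, sigma) for b in blurred]


# Hypothetical 1-D "galaxy" profile and a normalised PSF (its taps sum to 1).
galaxy = [0.0, 1.0, 4.0, 1.0, 0.0]
psf = [0.25, 0.5, 0.25]

rng = random.Random(0)  # fixed seed so the noisy draw is reproducible
noiseless = observe(galaxy, psf, sigma=0.0, rng=rng)
noisy = observe(galaxy, psf, sigma=0.1, rng=rng)

print("blurred:", noiseless)
print("blurred + noise:", [round(v, 3) for v in noisy])
```

With a normalised PSF the blur conserves the total flux but lowers and widens the peak, which is why a reconstruction that only minimises the pixel-wise error can still distort the recovered shape.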
Coloma Ballester (Universitat Pompeu Fabra)
Anomaly detection is frequently approached as out-of-distribution or outlier detection. In this talk, a method for out-of-distribution detection will be discussed. It leverages the learning of the probability distribution of normal data through generative adversarial networks while simultaneously keeping track of the states of the learning to finally estimate an efficient anomaly detector.
Jan Dirk Wegner (ETH Zurich)
A multitude of different sensors is capturing massive amounts of geo-coded data with different spatial resolution, temporal frequency, viewpoint, and quality every day. Modelling functional relationships for applications is often hard and loses predictive power due to the high variance in sensor modality. Data-driven approaches, especially modern deep learning, come to the rescue and learn expressive models directly from (labeled) input data. In this talk, I will present deep learning methods to analyze geospatial data at large scale for two specific applications in the environmental sciences: biodiversity estimation and global vegetation height mapping.
Fernando Gama (University of California, Berkeley)
Graphs are generic models of signal structure that can help learning in several practical problems. To learn from graph data, we need scalable architectures that can be trained on moderate dataset sizes and that can be implemented in a distributed manner. In this talk, I will draw from graph signal processing to define graph convolutions, and use them to introduce graph neural networks (GNNs). I will prove that GNNs are permutation equivariant and stable to perturbations of the graph, properties that explain their scalability and transferability. I will also use these results to explain the advantages of GNNs over linear graph filters. I will then discuss the problem of learning decentralized controllers, and how GNNs naturally leverage the partial information structure inherent to distributed systems. Using flocking as an illustrative example, I will show that GNNs not only successfully learn distributed actions that coordinate the team but also transfer and scale to larger teams.
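The permutation equivariance mentioned in this abstract can be checked numerically on a single graph convolution, written here in the usual graph-signal-processing form y = sum_k h_k S^k x. This is a minimal sketch under that assumption; the adjacency matrix `S`, the signal `x`, the filter taps and the relabelling `perm` are hypothetical, and a full GNN would add pointwise nonlinearities and several layers.

```python
def matvec(S, x):
    """Multiply the square matrix S (list of rows) by the vector x."""
    return [sum(S[i][j] * x[j] for j in range(len(x))) for i in range(len(S))]


def graph_conv(S, x, taps):
    """Graph convolution y = sum_k taps[k] * S^k x."""
    y = [0.0] * len(x)
    z = list(x)  # holds S^k x, starting with k = 0
    for h in taps:
        y = [yi + h * zi for yi, zi in zip(y, z)]
        z = matvec(S, z)
    return y


def permute_graph(S, perm):
    """Relabel nodes so that new node i is old node perm[i]."""
    n = len(perm)
    return [[S[perm[i]][perm[j]] for j in range(n)] for i in range(n)]


def permute_signal(x, perm):
    """Apply the same relabelling to a signal defined on the nodes."""
    return [x[p] for p in perm]


# Hypothetical undirected 4-node graph, node signal and filter taps.
S = [[0, 1, 0, 0],
     [1, 0, 1, 1],
     [0, 1, 0, 1],
     [0, 1, 1, 0]]
x = [1.0, -2.0, 0.5, 3.0]
taps = [0.5, 0.3, 0.1]
perm = [2, 0, 3, 1]

y = graph_conv(S, x, taps)
y_perm = graph_conv(permute_graph(S, perm), permute_signal(x, perm), taps)

# Filtering the relabelled graph gives the relabelled output.
print(all(abs(a - b) < 1e-9 for a, b in zip(y_perm, permute_signal(y, perm))))
```

The same filter taps are reused on the relabelled graph, which is the sense in which the operation does not depend on how the nodes are numbered.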
Date: July 9th 2020, 10.00 a.m.
Speaker: Ariel Sánchez (MPE Garching / Max-Planck-Institut für extraterrestrische Physik)
Title: Let us bury the prehistoric h: arguments against using h^-1 Mpc units in observational cosmology
Room: Zoom Meeting (connection details will be updated soon)
Abstract
It is common to express cosmological measurements in units of h^-1 Mpc. Here, we review some of the complications that originate from this practice. A crucial problem caused by these units is related to the normalization of the matter power spectrum, which is commonly characterized in terms of the linear-theory rms mass fluctuation in spheres of radius 8 h^-1 Mpc, σ8. This parameter does not correctly capture the impact of h on the amplitude of density fluctuations. We show that the use of σ8 has caused critical misconceptions for both the so-called σ8 tension regarding the consistency between low-redshift probes and cosmic microwave background data, and the way in which growth-rate estimates inferred from redshift-space distortions are commonly expressed. We propose to abandon the use of h^-1 Mpc units in cosmology and to characterize the amplitude of the matter power spectrum in terms of σ12, defined as the mass fluctuation in spheres of radius 12 Mpc, whose value is similar to the standard σ8 for h∼0.67.
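The closing remark of this abstract follows from simple arithmetic: a radius of 8 h^-1 Mpc corresponds to a physical radius that depends on h and is close to 12 Mpc for h∼0.67. The sketch below only performs this unit conversion; the helper name `h_inv_mpc_to_mpc` and the sample values of h are assumptions chosen for illustration, and no power spectrum is computed.

```python
def h_inv_mpc_to_mpc(radius_h_inv_mpc, h):
    """Convert a length given in h^-1 Mpc to Mpc for a given value of h."""
    return radius_h_inv_mpc / h


# Illustrative values of h; the sphere defining sigma_8 changes size with h.
for h in (0.60, 0.67, 0.74):
    radius = h_inv_mpc_to_mpc(8.0, h)
    print(f"h = {h:.2f}: 8 h^-1 Mpc = {radius:.2f} Mpc")
```

For h = 0.67 the radius is about 11.94 Mpc, so σ8 and σ12 refer to nearly the same sphere, whereas for other values of h the sphere used for σ8 changes size while the 12 Mpc sphere does not.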
Date: July 2nd 2020, 10.00 a.m.
Speaker: Erwan Allys (ENS Paris / École Normale Supérieure, Laboratoire de Radioastronomie)
Title: The Wavelet Phase Harmonics, a new interpretable statistical description for analysis and synthesis of the LSS
Room: Zoom Meeting (connection details will be updated soon)
Abstract
The statistical characterization of non-Gaussian fields is a major problem in current astrophysics, and no method has clearly emerged up to now to do so. In this presentation, I will introduce the Wavelet Phase Harmonics (WPH), a low-dimensional and interpretable set of statistics that efficiently characterizes the couplings between scales in non-linear processes. This description, which was recently introduced in data science, is inspired by neural networks. Applying it to projected matter density fields from the Quijote N-body Large Scale Structure (LSS) simulations, I will show how the WPH are able to provide better constraints on five cosmological parameters than the joint power spectrum and bispectrum, as well as to produce new realistic statistical syntheses from a maximum-entropy model. These results open the path to the use of a new type of statistical description for non-Gaussian fields in astrophysics.
Date: March 18th 2020, 10.30am
Speaker: Florent Mertens (LERMA / Kapteyn Astronomical Institute)
Title: The challenges of observing the Epoch of Reionization and Cosmic Dawn
Room: Cassini
Abstract
Low-frequency observations of the redshifted 21-cm line promise to open a new window onto the first billion years of cosmic history, allowing us to directly study the astrophysical processes occurring during the Epoch of Reionization (EoR) and the Cosmic Dawn (CD). This exciting goal is challenged by the difficulty of extracting the feeble 21-cm signal buried under astrophysical foregrounds orders of magnitude brighter and contaminated by numerous instrumental systematics. Several experiments such as LOFAR, MWA, HERA, and NenuFAR are currently underway aiming at statistically detecting the 21-cm brightness temperature fluctuations from the EoR and CD. While no detection is yet in sight, considerable progress has been made recently. In this talk, I will review the many challenges faced by these difficult experiments and I will share the latest developments of the LOFAR Epoch of Reionization and NenuFAR Cosmic Dawn key science projects.
Date: February 20th 2020, 10.00 am
Room: Kepler
Speaker: Céline Gouin (IAS, COSMIX)
Title: Probing the azimuthal environment of galaxies around clusters. From cluster core to cosmic filaments
Abstract:
Galaxy clusters are connected at their peripheries to the large scale structures by cosmic filaments that funnel accreting material. Therefore, the vicinities of galaxy clusters are ideal places to quantify the geometry and topology of the cosmic web. These filamentary structures are studied to investigate both environment-driven galaxy evolution and the growth of massive structures. In this presentation, I probe angular features in the distribution of galaxies around clusters by performing harmonic decompositions in large photometric galaxy catalogues around low-z clusters. In the clusters’ outskirts, filamentary patterns are detected in harmonic space: massive clusters seem to have a larger number of connected filaments than low-mass ones. Our results also suggest a gradient of galaxy activity in filaments around clusters.
February 3 - 7, 2020
IAP - Institut d'Astrophysique de Paris, 98 bis, bd Arago, 75014 Paris
The preliminary schedule can be found here:
https://docs.google.com/document/d/1XHDepk3W4897GMqxABpo4vgubhm2LFVYVOCgyTqGS_I/edit
Slides (password-protected) are on redmine.
The meeting starts on Monday 3 February at 9:30.
Please add your name to the following list if you intend to participate. To access IAP, external people are required to indicate their name in advance of the meeting, and might have to show identification at the IAP front desk. There is no conference fee.
https://docs.google.com/document/d/17Hn8Z6LH54fJDbDY2uQPtZPauZotm6IsnNC4LbBcmII/edit
Martin Kilbinger <kilbinger@iap.fr>
Sandrine Codis <codis@iap.fr>
Date: December 4th 2019, 10.30am
Speaker: Irène Waldspurger (CEREMADE, Université Paris-Dauphine)
Title: Convex and non-convex algorithms for phase retrieval
Room: Cassini
Abstract
Phase retrieval problems consist in recovering elements of a complex vector space from the modulus of their scalar product with a fixed family of measurement vectors. Traditional reconstruction algorithms rely on simple local optimization heuristics. Although they can in principle, because of the non-convexity of the problem, get stuck in local optima, they are observed to work well in many situations.
In this talk, we will see which theoretical correctness guarantees one can establish, in a particular setting, for the most well-known such algorithm. We will also present a different family of algorithms, based on so-called convexification techniques, and describe its advantages and limitations.
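The measurement model stated at the start of this abstract, b_i = |<a_i, x>|, can be written down in a few lines. The sketch below only evaluates the measurements and shows that they are unchanged when x is multiplied by a global phase, so recovery is possible at best up to that phase. The vector `x`, the measurement vectors and the angle 0.7 are hypothetical; no reconstruction algorithm from the talk is implemented.

```python
import cmath


def measurements(x, vectors):
    """Return b_i = |<a_i, x>| for each measurement vector a_i."""
    return [
        abs(sum(a_j.conjugate() * x_j for a_j, x_j in zip(a, x)))
        for a in vectors
    ]


# Hypothetical unknown signal and measurement vectors in C^3.
x = [complex(1, 2), complex(0, -0.5), complex(3, 0)]
vectors = [
    [complex(1, 0), complex(0, 0), complex(0, 0)],
    [complex(1, 0), complex(0, 1), complex(0, 0)],
    [complex(0, 0), complex(1, 0), complex(1, 0)],
    [complex(0, 1), complex(1, 0), complex(-1, 0)],
]

# Multiply x by a unit-modulus factor exp(i * 0.7): a global phase shift.
rotated = [cmath.exp(0.7j) * x_j for x_j in x]

b = measurements(x, vectors)
b_rotated = measurements(rotated, vectors)
print(all(abs(u - v) < 1e-12 for u, v in zip(b, b_rotated)))
```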
Date: January the 17th, 2020
Organizer: Joana Frontera-Pons <joana.frontera-pons@cea.fr>
Venue:
Local information
CEA Saclay is around 23 km South of Paris. The astrophysics division (DAp) is located at the CEA site at Orme des Merisiers, which is around 1 km South of the main CEA campus. See http://www.cosmostat.org/contact for detailed information on how to arrive.
On January the 17th, 2020, we organize the 5th day on machine learning in astrophysics at DAp, CEA Saclay.
All talks are taking place at DAp, Salle Galilée (Building 713)
10:00 - 10:15h. Welcome and coffee
10:15 - 10:45h. Parameter inference using neural networks - Tom Charnock (Institut d'Astrophysique de Paris)
10:45 - 11:15h. Detection and characterisation of solar-type stars with machine learning - Lisa Bugnet (DAp, CEA Paris-Saclay)
11:15 - 11:45h. DeepMass: Deep learning dark matter map reconstructions with Dark Energy Survey data - Niall Jeffrey (ENS)
12:00 - 13:30h. Lunch
13:30 - 14:00h. Hybrid physical-deep learning models for astronomical image processing - François Lanusse (Berkeley Center for Cosmological Physics and CosmoStat CEA Paris Saclay)
14:00 - 14:30h. A flexible EM-like clustering algorithm for noisy data - Violeta Roizman (L2S, CentraleSupélec)
14:30 - 15:00h. Regularizing Optimal Transport Using Regularity Theory - François-Pierre Paty (CREST, ENSAE)
15:00 - 15:30h. Deep Learning @ Safran for Image Processing - Arnaud Woiselle (Safran Electronics and Defense)
15:30 - 16:00h. End of the day
Tom Charnock (Institut d'Astrophysique de Paris)
Neural networks with large training sets are currently providing tighter constraints on cosmological and astrophysical parameters than ever before. However, in their current form, these neural networks are unable to give true Bayesian inference of such model parameters. I will describe why this is true and present two methods by which the information extracting power of neural networks can be built into the necessary robust statistical framework to perform trustworthy inference, whilst at the same time massively reducing the quantity of training data required.
Lisa Bugnet (DAp, CEA Paris-Saclay)
Stellar astrophysics was strengthened in the 1970s by the discovery of stellar oscillations due to acoustic waves inside the Sun. These waves evolving inside solar-type stars contain information about the composition and dynamics of the surrounding plasma, and are thus very interesting for the understanding of stellar internal and surface physical processes. With classical asteroseismology we are able to extract very precise and accurate masses, radii, and ages of oscillating stars, which are key parameters for the understanding of stellar evolution.
However, classical methods of asteroseismology are time-consuming processes that can only be applied to stars showing a large enough oscillation signal. In the context of the hundreds of thousands of stars observed by the Transiting Exoplanet Survey Satellite (TESS), the stellar community has to adapt the methodologies previously built for the study of the few tens of thousands of stars observed with much better resolution by the Kepler satellite. Our method exploits Random Forest machine learning algorithms that aim at automatically 1) classifying and 2) characterizing any stellar pulsators from global non-seismic parameters. We also present a recent result based on neural networks on the automatic detection of peculiar solar-type pulsators that have a surprisingly low dipolar-oscillation amplitude, the signature of an unknown physical process affecting oscillation modes inside the core.
Niall Jeffrey (ENS)
I will present the first reconstruction of dark matter maps from weak lensing observational data using deep learning. We train a convolutional neural network (CNN) with a U-Net based architecture on over 3.6×10^5 simulated data realisations with non-Gaussian shape noise and with cosmological parameters varying over a broad prior distribution. We interpret our newly created DES SV map as an approximation of the posterior mean P(κ|γ) of the convergence given observed shear. The DeepMass method is substantially more accurate than existing mass-mapping methods on a validation set of 8000 simulated DES SV data realisations. With higher galaxy density in future weak lensing data unveiling more non-linear scales, it is likely that deep learning will be a leading approach for mass mapping with Euclid and LSST.
François Lanusse (Berkeley Center for Cosmological Physics and CosmoStat CEA Paris Saclay)
The upcoming generation of wide-field optical surveys which includes LSST will aim to shed some much needed light on the physical nature of dark energy and dark matter by mapping the Universe in great detail and on an unprecedented scale. However, with the increase in data quality also comes a significant increase in data complexity, bringing new and outstanding challenges at all levels of the scientific analysis.
In this talk, I will illustrate how deep generative models, combined with physical modeling, can be used to address some of these challenges at the image processing level, specifically by providing data-driven priors of galaxy morphology.
I will first describe how to build such generative models from corrupted and heterogeneous data, i.e. when the training set contains varying observing conditions (in terms of noise, seeing, or even instruments). This is a necessary step for practical applications, made possible by a hybrid modeling of the generation process, using deep neural networks to model the underlying distribution of galaxy morphologies, complemented by a physical model of the noise and instrumental response. Sampling from these models produces realistic galaxy light profiles, which can then be used in survey emulation, for the purpose of validating and/or calibrating data reduction pipelines.
Even more interestingly, these models can serve as priors on galaxy morphologies within standard Bayesian inference techniques to solve astronomical inverse problems ranging from deconvolution to deblending galaxy images. I will present how combining these deep morphology priors with a physical forward model of observed blended scenes allows us to address the galaxy deblending problem in a physically motivated and interpretable way.
Violeta Roizman (L2S, CentraleSupélec)
Though the EM algorithm is very popular, it is well known to suffer in the presence of non-Gaussian distribution shapes and outliers. This talk will present a flexible EM-like clustering algorithm that can deal with noise and outliers in diverse data sets. This flexibility is due to extra scale parameters that allow us to accommodate heavier-tailed distributions and outliers without significantly losing efficiency in various classical scenarios. I will show experiments where we compare it to other clustering methods such as k-means, EM and spectral clustering when applied to both synthetic data and real data sets. I will conclude with an application example of our algorithm used for image segmentation.
François-Pierre Paty (CREST, ENSAE)
Optimal transport (OT) dates back to the end of the 18th century, when French mathematician Gaspard Monge proposed to solve the problem of déblais and remblais. In the last few years, OT has also found new applications in statistics and machine learning as a way to analyze and compare data. Both in practice and for statistical reasons, OT needs to be regularized. In this talk, I will present a new regularization of OT leveraging regularity of the Monge map. Instead of considering regularity as a property that can be proved under suitable assumptions, we consider regularity as a condition that must be enforced when estimating OT. This further allows us to transport out-of-sample points, as well as define a new estimator of the 2-Wasserstein distance between arbitrary measures. (Based on joint work with Alexandre d'Aspremont and Marco Cuturi.)
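As background for the 2-Wasserstein distance mentioned in this abstract, the textbook one-dimensional case has a closed form: for two empirical measures with the same number of equally weighted points, the optimal (Monge) map pairs the sorted samples. The sketch below implements only this special case; the function name `wasserstein2_1d` and the sample values are hypothetical, and this is not the regularized estimator presented in the talk.

```python
import math


def wasserstein2_1d(xs, ys):
    """2-Wasserstein distance between two 1-D empirical measures.

    Assumes both samples have the same size and uniform weights, in which
    case the optimal transport plan matches the sorted points in order.
    """
    if len(xs) != len(ys):
        raise ValueError("samples must have the same number of points")
    pairs = zip(sorted(xs), sorted(ys))
    return math.sqrt(sum((a - b) ** 2 for a, b in pairs) / len(xs))


# Hypothetical samples on the real line.
source = [0.0, 1.0, 3.0]
target = [5.0, 2.0, 4.0]
print(wasserstein2_1d(source, target))

# Shifting every point by the same constant gives a distance equal to the shift.
shifted = [s + 2.0 for s in source]
print(wasserstein2_1d(source, shifted))
```

In higher dimensions no such sorting shortcut exists, which is one reason estimating OT from samples calls for regularization.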
Arnaud Woiselle (Safran Electronics and Defense)
Deep learning has for many years been the natural tool in computer vision for nearly all high-level tasks, such as object detection and classification, and is now state of the art in most image processing (restoration) tasks, such as deblurring or super-resolution. Safran looked into these methods for a large variety of problems, focusing on the use of a low number of network structures, due to electronics constraints for future implementation, and transferred them to real-life noisy and blurry data, both in the visible and the infrared. I will show the results in many applications, and conclude with some tips and take-away messages on what seems important when applying deep learning to a given task.