Recent advances in machine learning, and in deep learning in particular, have introduced sophisticated data analysis tools that are promising candidates for building unsupervised, data-driven representations. These statistical methods have already proved effective for supervised classification tasks in applications as diverse as computer vision, speech recognition and natural language processing, to name only a few. Machine learning techniques have also recently been advocated as a powerful tool for deriving useful features directly from the data. This aspect of learning, called representation learning, is expected to provide an efficient re-parameterization of the data in terms of more salient features. The main interest of the methods studied by the CosmoStat team is their ability to extract information in an unsupervised manner. In other words, the aim is to design features that efficiently unfold complex underlying structures in the data without relying on prior information or labelled examples.
Machine learning techniques have been applied to a variety of topics within the CosmoStat group including:
We have investigated the use of a recently introduced machine learning method, the denoising autoencoder (DAE), for unsupervised feature learning from galaxy spectral energy distributions (SEDs). In the spirit of SED colour diagrams, the proposed approach yields a new representation of galaxy SEDs. We have evaluated how well the resulting DAE diagram recovers the standard star-forming/quiescent galaxy bimodality. We also show that, according to the current understanding of autoencoders, the DAE yields a diagram that extracts astrophysically relevant information from the data that standard SED colour diagrams do not exhibit. This work therefore illustrates the interest of these methods for representing galaxy SEDs and paves the way for the design of more sophisticated models.
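To make the idea of a denoising autoencoder diagram concrete, here is a minimal NumPy sketch. It is not the CosmoStat implementation: the architecture (one tanh hidden layer with two units, linear decoder), the hyperparameters, the function names `train_dae` and `encode`, and the toy "SEDs" (straight lines with two families of slopes) are all assumptions made for illustration. The network receives a noise-corrupted input and is trained to reconstruct the clean one, and the two hidden activations play the role of the axes of a 2D diagram.

```python
import numpy as np

def train_dae(X, n_hidden=2, noise_std=0.1, lr=0.02, n_epochs=500, seed=0):
    """Train a one-hidden-layer denoising autoencoder by full-batch gradient descent.

    X is an array of shape (n_samples, n_features), assumed standardized.
    Returns the parameters and the reconstruction loss recorded at each epoch.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, d))
    b2 = np.zeros(d)
    losses = []
    for _ in range(n_epochs):
        X_noisy = X + rng.normal(0.0, noise_std, X.shape)  # corrupt the input
        H = np.tanh(X_noisy @ W1 + b1)                      # encode
        X_hat = H @ W2 + b2                                 # decode (linear)
        err = X_hat - X                                     # target is the CLEAN input
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the mean squared error.
        g_out = 2.0 * err / (n * d)
        g_W2 = H.T @ g_out
        g_b2 = g_out.sum(axis=0)
        g_H = (g_out @ W2.T) * (1.0 - H ** 2)               # tanh derivative
        g_W1 = X_noisy.T @ g_H
        g_b1 = g_H.sum(axis=0)
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        W2 -= lr * g_W2
        b2 -= lr * g_b2
    return (W1, b1, W2, b2), losses

def encode(X, params):
    """Map inputs to the 2D hidden representation (the 'diagram' coordinates)."""
    W1, b1, _, _ = params
    return np.tanh(X @ W1 + b1)

# Toy data (hypothetical, not real SEDs): 20 "bands", two families of slopes.
rng = np.random.default_rng(0)
bands = np.linspace(0.0, 1.0, 20)
slopes = np.concatenate([rng.normal(1.0, 0.2, 200), rng.normal(-1.0, 0.2, 200)])
X = 1.0 + slopes[:, None] * bands[None, :] + rng.normal(0.0, 0.05, (400, 20))
X = (X - X.mean(axis=0)) / X.std(axis=0)

params, losses = train_dae(X)
codes = encode(X, params)  # shape (400, 2): one point per object in the diagram
```

Plotting `codes[:, 0]` against `codes[:, 1]` gives the analogue of a colour diagram for this toy sample. A real application would use observed or simulated SEDs and, most likely, a deeper network trained with a dedicated library.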
Ongoing work is being carried out to investigate the possibility of identifying blended sources in survey images using machine learning techniques. Current methods often employ fixed thresholds to determine whether or not a given patch of the sky contains contributions from multiple sources. Machine learning may offer a more flexible approach that accounts for the diversity of objects in the field.
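The limitation of a fixed threshold can be illustrated with a small sketch. Everything here is hypothetical: the helper `count_sources`, the noise-free toy patch made of two overlapping Gaussian profiles, and the two threshold values. The function simply counts connected regions of pixels above the threshold, which is one simple way a threshold-based deblender might decide how many sources a patch contains; it does not reproduce any specific survey pipeline.

```python
from collections import deque
import numpy as np

def count_sources(patch, threshold):
    """Count 4-connected regions of pixels brighter than `threshold`."""
    mask = patch > threshold
    seen = np.zeros(mask.shape, dtype=bool)
    n_regions = 0
    for start in zip(*np.nonzero(mask)):
        if seen[start]:
            continue
        # New region found: flood-fill it so it is counted only once.
        n_regions += 1
        seen[start] = True
        queue = deque([start])
        while queue:
            y, x = queue.popleft()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                inside = 0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                if inside and mask[ny, nx] and not seen[ny, nx]:
                    seen[ny, nx] = True
                    queue.append((ny, nx))
    return n_regions

# Toy patch: two Gaussian profiles (sigma = 3 pixels) separated by 12 pixels.
yy, xx = np.mgrid[0:64, 0:64]

def gaussian(cx, cy, sigma=3.0):
    return np.exp(-((xx - cx) ** 2 + (yy - cy) ** 2) / (2.0 * sigma ** 2))

patch = gaussian(26, 32) + gaussian(38, 32)

n_low = count_sources(patch, threshold=0.1)   # the two profiles merge into one region
n_high = count_sources(patch, threshold=0.5)  # the two peaks separate
```

In this toy patch the answer depends entirely on the threshold: the lower value reports a single object, the higher one reports two. A fixed choice that works for one pair of sources can fail for another with different brightness, size or separation, which is the motivation for a more flexible, learned criterion.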
One of the main challenges in weak gravitational lensing is the accurate measurement of the shear signal from the shapes of galaxies. This measurement is usually biased by many factors, such as the shape estimation method, pixelization, model bias and image noise. The dependencies of shear bias are very complex and cannot be modelled with simple analytical approaches. We use denoising autoencoders to recover the dependencies of shear bias on many properties simultaneously, in order to infer the shear bias of individual galaxy images.
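For readers unfamiliar with the term, shear bias is commonly parametrized in the weak-lensing literature by a multiplicative term m and an additive term c, with g_obs = (1 + m) g_true + c. This convention is not stated in the text above and is brought in here as an assumption. The sketch below only shows what m and c mean, by fitting them with a straight line on simulated toy shears; the bias values, noise level and variable names are hypothetical, and this is not the learning-based method described above, which estimates the bias as a function of many galaxy properties.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy bias values, chosen for illustration only.
m_true, c_true = 0.02, 1e-3

# Simulated true shears and noisy "measured" shears following the linear bias model.
g_true = rng.uniform(-0.05, 0.05, 10_000)
g_obs = (1.0 + m_true) * g_true + c_true + rng.normal(0.0, 1e-3, g_true.size)

# Least-squares straight-line fit: slope = 1 + m, intercept = c.
slope, intercept = np.polyfit(g_true, g_obs, 1)
m_hat = slope - 1.0
c_hat = intercept
```

A single pair (m, c) fitted this way averages over the whole sample. The difficulty described above is that both terms vary with galaxy properties in ways that a simple analytical model does not capture.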