## PySAP: Python Sparse Data Analysis Package for Multidisciplinary Image Processing

Authors: S. Farrens, A. Grigis, L. El Gueddari, Z. Ramzi, Chaithya G. R., S. Starck, B. Sarthou, H. Cherkaoui, P. Ciuciu, J-L. Starck Journal: Astronomy and Computing Year: 2020 DOI: 10.1016/j.ascom.2020.100402 Download: ADS | arXiv

## Abstract

We present the open-source image processing software package PySAP (Python Sparse data Analysis Package) developed for the COmpressed Sensing for Magnetic resonance Imaging and Cosmology (COSMIC) project. This package provides a set of flexible tools that can be applied to a variety of compressed sensing and image reconstruction problems in various research domains. In particular, PySAP offers fast wavelet transforms and a range of integrated optimisation algorithms. In this paper we present the features available in PySAP and provide practical demonstrations on astrophysical and magnetic resonance imaging data.

PySAP Code

## Euclid preparation: VI. Verifying the Performance of Cosmic Shear Experiments

 Authors: Euclid Collaboration, P. Paykari, ..., S. Farrens, M. Kilbinger, V. Pettorino, S. Pires, J.-L. Starck, F. Sureau, et al. Journal: Astronomy and Astrophysics Year: 2020 DOI: 10.1051/0004-6361/201936980 Download: ADS | arXiv

## Abstract

Our aim is to quantify the impact of systematic effects on the inference of cosmological parameters from cosmic shear. We present an end-to-end approach that introduces sources of bias in a modelled weak lensing survey on a galaxy-by-galaxy level. Residual biases are propagated through a pipeline from galaxy properties (one end) through to cosmic shear power spectra and cosmological parameter estimates (the other end), to quantify how imperfect knowledge of the pipeline changes the maximum likelihood values of dark energy parameters. We quantify the impact of an imperfect correction for charge transfer inefficiency (CTI) and modelling uncertainties of the point spread function (PSF) for Euclid, and find that the biases introduced can be corrected to acceptable levels.

## Euclid preparation. V. Predicted yield of redshift 7 < z < 9 quasars from the wide survey

 Authors: Euclid Collaboration, R. Barnett, ..., S. Farrens, M. Kilbinger, V. Pettorino, F. Sureau, et al. Journal: Astronomy and Astrophysics Year: 2019 DOI: 10.1051/0004-6361/201936427 Download: ADS | arXiv

## Abstract

We provide predictions of the yield of $$7<z<9$$ quasars from the Euclid wide survey, updating the calculation presented in the Euclid Red Book in several ways. We account for revisions to the Euclid near-infrared filter wavelengths; we adopt steeper rates of decline of the quasar luminosity function (QLF; $$\Phi$$) with redshift, $$\Phi\propto10^{k(z-6)}$$, $$k=-0.72$$, and a further steeper rate of decline, $$k=-0.92$$; we use better models of the contaminating populations (MLT dwarfs and compact early-type galaxies); and we use an improved Bayesian selection method, compared to the colour cuts used for the Red Book calculation, allowing the identification of fainter quasars, down to $$J_{\mathrm{AB}}\sim23$$. Quasars at $$z>8$$ may be selected from Euclid OYJH photometry alone, but selection over the redshift interval $$7<z<8$$ is greatly improved by the addition of z-band data from, e.g., Pan-STARRS and LSST. We calculate predicted quasar yields for the assumed values of the rate of decline of the QLF beyond $$z=6$$. For the case that the decline of the QLF accelerates beyond $$z=6$$, with $$k=-0.92$$, Euclid should nevertheless find over 100 quasars with $$7.0<z<7.5$$, and ∼25 quasars beyond the current record of $$z=7.5$$, including ∼8 beyond $$z=8.0$$. The first Euclid quasars at $$z>7.5$$ should be found in the DR1 data release, expected in 2024. It will be possible to determine the bright-end slope of the QLF, $$7<z<8$$, $$M_{1450}<-25$$, using 8 m class telescopes to confirm candidates, but follow-up with JWST or E-ELT will be required to measure the faint-end slope. Contamination of the candidate lists is predicted to be modest even at $$J_{\mathrm{AB}}\sim23$$. The precision with which $$k$$ can be determined over $$7<z<8$$ depends on the value of $$k$$, but assuming $$k=-0.72$$ it can be measured to a 1σ uncertainty of 0.07.

## Abstract

Galaxy cluster counts in bins of mass and redshift have been shown to be a competitive probe for testing cosmological models. This method requires an efficient blind detection of clusters from surveys with a well-known selection function and robust mass estimates. The Euclid wide survey will cover 15 000 deg² of the sky in the optical and near-infrared bands, down to magnitude 24 in the H-band. The resulting data will make it possible to detect a large number of galaxy clusters spanning a wide range of masses up to redshift ∼2. This paper presents the final results of the Euclid Cluster Finder Challenge (CFC). The objective of these challenges was to select the cluster detection algorithms that best meet the requirements of the Euclid mission. The final CFC included six independent detection algorithms based on different techniques, such as photometric redshift tomography, optimal filtering, hierarchical methods, wavelets, and friends-of-friends algorithms. These algorithms were blindly applied to a mock galaxy catalogue with representative Euclid-like properties. The relative performance of the algorithms was assessed by matching the resulting detections to known clusters in the simulations. Several matching procedures were tested, making it possible to estimate the associated systematic effects on completeness to <3%. All the tested algorithms are very competitive in terms of performance, with three of them reaching >80% completeness for a mean purity of 80% down to masses of 10¹⁴ M⊙ and up to redshift z = 2. Based on these results, two algorithms were selected for implementation in the Euclid pipeline: the AMICO code, based on matched filtering, and the PZWav code, based on an adaptive wavelet approach.

## A Distributed Learning Architecture for Scientific Imaging Problems

 Authors: A. Panousopoulou, S. Farrens, K. Fotiadou, A. Woiselle, G. Tsagkatakis, J-L. Starck,  P. Tsakalides Journal: arXiv Year: 2018 Download: ADS | arXiv

## Abstract

Current trends in scientific imaging are challenged by the emerging need to integrate sophisticated machine learning with Big Data analytics platforms. This work proposes an in-memory distributed learning architecture for enabling sophisticated learning and optimization techniques on scientific imaging problems, which are characterized by the combination of variant information from different origins. We apply the resulting, Spark-compliant, architecture to two emerging use cases from the scientific imaging domain, namely: (a) the space-variant deconvolution of galaxy imaging surveys (astrophysics), and (b) super-resolution based on coupled dictionary training (remote sensing). We conduct evaluation studies on relevant datasets, and the results show at least a 60% improvement in response time compared with conventional computing solutions. Finally, the discussion provides useful practical insights into the impact of key Spark tuning parameters on the achieved speedup and on the memory/disk footprint.

## Abstract

Removing the aberrations introduced by the Point Spread Function (PSF) is a fundamental aspect of astronomical image processing. The presence of noise in observed images makes deconvolution a nontrivial task that necessitates the use of regularisation. This task is particularly difficult when the PSF varies spatially, as is the case for the Euclid telescope. New surveys will provide images containing thousands of galaxies, and the deconvolution regularisation problem can be considered from a completely new perspective. In fact, one can assume that galaxies belong to a low-dimensional space. This work introduces the use of the low-rank matrix approximation as a regularisation prior for galaxy image deconvolution and compares its performance with a standard sparse regularisation technique. This new approach leads to a natural way to handle a space-variant PSF. Deconvolution is performed using a Python code that implements a primal-dual splitting algorithm. The data set considered is a sample of 10 000 space-based galaxy images convolved with a known spatially varying Euclid-like PSF and including various levels of additive Gaussian noise. Performance is assessed by examining the deconvolved galaxy image pixels and shapes. The results demonstrate that for small samples of galaxies sparsity performs better in terms of pixel and shape recovery, while for larger samples of galaxies it is possible to obtain more accurate estimates of the galaxy shapes using the low-rank approximation.

## Summary

The Point Spread Function or PSF of an imaging system (also referred to as the impulse response) describes how the system responds to a point (unextended) source. In astrophysics, stars or quasars are often used to measure the PSF of an instrument as in ideal conditions their light would occupy a single pixel on a CCD. Telescopes, however, diffract the incoming photons which limits the maximum resolution achievable. In reality, the images obtained from telescopes include aberrations from various sources such as:

• The atmosphere (for ground based instruments)
• Jitter (for space based instruments)
• Imperfections in the optical system
• Charge spread of the detectors
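
As a toy sketch of this blurring (assuming, purely for illustration, an isotropic Gaussian PSF, whereas a real instrumental PSF combines all of the effects listed above):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# A point source occupying a single pixel on an idealised 65 x 65 detector.
image = np.zeros((65, 65))
image[32, 32] = 1.0

# Blur it with an isotropic Gaussian PSF. The width (in pixels) is an
# arbitrary illustrative choice, not taken from any particular instrument.
observed = gaussian_filter(image, sigma=2.5)

# The total flux is conserved but spread over many pixels, limiting resolution.
print(observed.max())            # peak value is now far below 1
print((observed > 1e-3).sum())   # number of pixels containing visible flux
```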

### Deconvolution

In order to recover the true image properties it is necessary to remove PSF effects from observations. If the PSF is known (which is certainly not trivial) one can attempt to deconvolve the PSF from the image. In the absence of noise this is simple. We can model the observed image $$\mathbf{y}$$ as follows

$$\mathbf{y}=\mathbf{Hx}$$

where $$\mathbf{x}$$ is the true image and $$\mathbf{H}$$ is an operator that represents the convolution with the PSF. Thus, to recover the true image, one would simply invert $$\mathbf{H}$$ as follows

$$\mathbf{x}=\mathbf{H}^{-1}\mathbf{y}$$

Unfortunately, the images we observe also contain noise (e.g. from the CCD readout) and this complicates the problem.

$$\mathbf{y}=\mathbf{Hx} + \mathbf{n}$$

This problem is ill-posed as even the tiniest amount of noise will have a large impact on the result of the operation. Therefore, to obtain a stable and unique solution, it is necessary to regularise the problem by adding additional prior knowledge of the true images.
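
This instability is easy to see numerically. The sketch below (a 1D toy with circular Gaussian blur, so that $$\mathbf{H}$$ is diagonal in Fourier space; the sizes and noise level are arbitrary illustrative choices) recovers $$\mathbf{x}$$ almost perfectly without noise, then fails badly once a tiny amount of noise is added:

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a 1D signal and a Gaussian blurring kernel (circular convolution).
n = 256
x = np.zeros(n)
x[100:120] = 1.0                      # "true" signal
h = np.exp(-0.5 * (np.arange(n) - n // 2) ** 2 / 4.0)
h /= h.sum()                          # normalised Gaussian kernel
H = np.fft.fft(np.fft.ifftshift(h))   # transfer function of the operator H

y_clean = np.fft.ifft(H * np.fft.fft(x)).real

# Noiseless case: x = H^{-1} y recovers the signal almost exactly.
x_rec = np.fft.ifft(np.fft.fft(y_clean) / H).real
print(np.abs(x_rec - x).max())        # tiny numerical error

# Add a small amount of noise: the same inversion blows up, because
# dividing by the near-zero Fourier coefficients of H amplifies the noise.
y_noisy = y_clean + 1e-6 * rng.standard_normal(n)
x_bad = np.fft.ifft(np.fft.fft(y_noisy) / H).real
print(np.abs(x_bad - x).max())        # huge error despite the tiny noise
```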

### Sparsity

One way to regularise the problem is using sparsity. The concept of sparsity is quite simple. If we know that there is a representation of $$\mathbf{x}$$ that is sparse (i.e. most of the coefficients are zero) then we can force our deconvolved observation $$\mathbf{\hat{x}}$$ to be sparse in the same domain. In practice we aim to solve a minimisation problem of the following form

$$\begin{aligned} & \underset{\mathbf{x}}{\text{argmin}} & \frac{1}{2}\|\mathbf{y}-\mathbf{H}\mathbf{x}\|_2^2 + \lambda\|\Phi(\mathbf{x})\|_1 & & \text{s.t.} & & \mathbf{x} \ge 0 \end{aligned}$$

where $$\Phi$$ is a matrix that transforms $$\mathbf{x}$$ to the sparse domain and $$\lambda$$ is a regularisation control parameter.
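
One standard way to minimise such an objective is iterative soft-thresholding (ISTA); the paper itself uses a primal-dual splitting algorithm, but ISTA makes the role of the $$\ell_1$$ proximal step easy to see. In this sketch $$\Phi$$ is taken to be the identity (the signal itself is assumed sparse), and $$\lambda$$ and the step size are arbitrary illustrative values:

```python
import numpy as np

rng = np.random.default_rng(1)

# Sparse 1D "true" signal blurred by a circular Gaussian kernel plus noise.
n = 128
x_true = np.zeros(n)
x_true[[20, 50, 90]] = [1.0, 2.0, 1.5]

h = np.exp(-0.5 * (np.arange(n) - n // 2) ** 2)
h /= h.sum()
H = np.fft.fft(np.fft.ifftshift(h))        # circular convolution operator

y = np.fft.ifft(H * np.fft.fft(x_true)).real + 0.01 * rng.standard_normal(n)

def soft_threshold(u, t):
    """Proximal operator of the l1 norm."""
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

lam, step = 0.01, 1.0   # illustrative values, not tuned for any real survey
x = np.zeros(n)
for _ in range(1000):
    # Gradient step on the data-fidelity term 0.5 ||y - Hx||^2 ...
    grad = np.fft.ifft(np.conj(H) * (H * np.fft.fft(x) - np.fft.fft(y))).real
    # ... then soft-thresholding (l1 prox) and the positivity constraint.
    x = np.maximum(soft_threshold(x - step * grad, step * lam), 0.0)

print((x > 0.2).sum())   # only a handful of significant coefficients survive
```

Note the step size is valid here because $$\|\mathbf{H}\|^2 \le 1$$ for a normalised kernel; a real implementation would estimate the Lipschitz constant explicitly.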

### Low-Rank Approximation

Another way to regularise the problem is to assume that all of the images one aims to deconvolve live on an underlying low-rank manifold. In other words, if we have a sample of galaxy images we wish to deconvolve, then we can construct a matrix $$\mathbf{X}$$ where each column is a vector of galaxy pixel coefficients. If many of these galaxies have similar properties, then we know that $$\mathbf{X}$$ will have a smaller rank than if the images were all very different. We can use this knowledge to regularise the deconvolution problem in the following way

$$\begin{aligned} & \underset{\mathbf{X}}{\text{argmin}} & \frac{1}{2}\|\mathbf{Y}-\mathcal{H}(\mathbf{X})\|_2^2 + \lambda\|\mathbf{X}\|_* & & \text{s.t.} & & \mathbf{X} \ge 0 \end{aligned}$$

where $$\|\mathbf{X}\|_*$$ is the nuclear norm of $$\mathbf{X}$$ (the sum of its singular values) and $$\mathcal{H}$$ applies the PSF convolution to each galaxy image.
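
The workhorse of this approach is the proximal operator of the nuclear norm: soft-thresholding of the singular values. A minimal sketch (the matrix sizes, rank, noise level and threshold are all arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(2)

def svd_threshold(X, t):
    """Singular value thresholding: the prox of t * (nuclear norm)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ (np.maximum(s - t, 0.0)[:, None] * Vt)

# Stack of 100 "galaxy images" (columns of 64 pixels each) that are noisy
# mixtures of only 3 underlying templates, so the clean matrix has rank 3.
templates = rng.standard_normal((64, 3))
weights = rng.standard_normal((3, 100))
X_clean = templates @ weights
X_noisy = X_clean + 0.1 * rng.standard_normal((64, 100))

# Thresholding the singular values suppresses the noise-dominated directions
# while keeping the strong directions shared by many columns.
X_denoised = svd_threshold(X_noisy, t=2.5)

print(np.linalg.matrix_rank(X_noisy))     # full rank, due to the noise
print(np.linalg.matrix_rank(X_denoised))  # back to the underlying rank
```

In the full deconvolution problem this prox step alternates with a gradient step on the data-fidelity term $$\frac{1}{2}\|\mathbf{Y}-\mathcal{H}(\mathbf{X})\|_2^2$$, exactly as the soft-thresholding step does in the sparse case.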

### Results

In the paper I implement both of these regularisation techniques and compare how well they perform at deconvolving a sample of 10,000 Euclid-like galaxy images. The results show that, for the data used, sparsity does a better job at recovering the image pixels, while the low-rank approximation does a better job at recovering the galaxy shapes (provided enough galaxies are used).

## Code

SF_DECONVOLVE is a Python code designed for PSF deconvolution using a low-rank approximation and sparsity. The code can handle a fixed PSF for the entire field or a stack of PSFs for each galaxy position.

## Friends-of-Friends Groups and Clusters in the 2SLAQ Catalogue

 Authors: S. Farrens, F.B. Abdalla, E.S. Cypriano, C. Sabiu, C. Blake Journal: MNRAS Year: 2011 Download: ADS | arXiv

## Abstract

We present a catalogue of galaxy groups and clusters selected using a friends-of-friends (FoF) algorithm with a dynamic linking length from the 2dF-SDSS LRG and QSO (2SLAQ) luminous red galaxy survey. The linking parameters for the code are chosen through an analysis of simulated 2SLAQ haloes. The resulting catalogue includes 313 clusters containing 1152 galaxies. The galaxy groups and clusters have an average velocity dispersion of ? km s⁻¹ and an average size of ? Mpc h⁻¹. Galaxies from regions of 1 deg² centred on the galaxy clusters were downloaded from the Sloan Digital Sky Survey Data Release 6. Investigating the photometric redshifts and cluster red sequence of these galaxies shows that the galaxy clusters detected with the FoF algorithm are reliable out to z ∼ 0.6. We estimate masses for the clusters using their velocity dispersions. These mass estimates are shown to be consistent with 2SLAQ mock halo masses. Further analysis of the simulation haloes shows that clipping out low-richness groups with large radii improves the purity of the catalogue from 52 to 88 per cent, while retaining a completeness of 94 per cent. Finally, we test the two-point correlation function of our cluster catalogue. We find a best-fitting power-law model, $$\xi(r) = (r/r_0)^{\gamma}$$, with parameters $$r_0 = 24 \pm 4$$ Mpc h⁻¹ and $$\gamma = -2.1 \pm 0.2$$, which are in agreement with other low-redshift cluster samples and consistent with a Λ cold dark matter universe.