Blind Source Separation

Blind source separation (BSS) is an effective mathematical method for analyzing data that are modeled as a linear combination of elementary sources or components. The underlying linear model used so far in BSS is the instantaneous linear mixture model:

\(\forall i,\quad x_i = \sum\limits_j a_{ij} s_j + n_i \qquad \mathbf{X} = \mathbf{A} \mathbf{S} + \mathbf{N}\)

Following this model, each observation is modeled as a linear combination of the same sources plus a noise/perturbation term. The goal of BSS is to jointly estimate the mixing matrix A and the sources S. This yields an inverse problem that is clearly ill-posed. The classical way to make this problem better posed is to assume further a priori information about the sources, the mixing matrix, or both.
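As a minimal sketch of the mixture model above, the following NumPy snippet simulates observations from sparse sources. The dimensions, sparsity level, noise level and the unit-norm convention for the columns of A are illustrative assumptions, not values taken from any particular data set.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): 4 observations, 2 sources, 1000 samples.
n_obs, n_src, n_samples = 4, 2, 1000

# Sparse sources: about 5% of the entries are active, with Gaussian amplitudes.
S = rng.standard_normal((n_src, n_samples)) * (rng.random((n_src, n_samples)) < 0.05)

# Mixing matrix with unit-norm columns (a common way to fix the scale ambiguity).
A = rng.standard_normal((n_obs, n_src))
A /= np.linalg.norm(A, axis=0, keepdims=True)

# Additive Gaussian noise, one realization per observation.
N = 0.01 * rng.standard_normal((n_obs, n_samples))

# Matrix form of the model: X = A S + N.
X = A @ S + N

# Row-wise form: x_i = sum_j a_ij s_j + n_i (checked here for one observation).
i = 1
x_i = sum(A[i, j] * S[j] for j in range(n_src)) + N[i]
```

Only X is available to a BSS algorithm; A, S and N are all unknown, which is why additional prior information is needed.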

Since the beginning of the 1990s, this problem has been extensively studied in statistics, yielding the well-known Independent Component Analysis (ICA) framework. At the beginning of the 21st century, fostered by advances in harmonic analysis and statistics, the sparsity of the sources was put forward as an efficient way to disentangle them.
Based on the concept of morphological diversity, the Generalized Morphological Component Analysis (GMCA) algorithm allows for the separation of sources that are sparse in any signal representation (orthonormal, redundant, etc.). It has been shown to be robust and performs very well on noisy data. In a nutshell, the GMCA algorithm tackles the following optimization problem:

\(\{\mathbf{A}, \mathbf{S}\} = \mathrm{argmin}_{\mathbf{A}, \mathbf{S}} \sum\limits_j \lambda_j \| s_j \mathbf{W} \|_1 + \| \mathbf{X} - \mathbf{A} \mathbf{S} \|_{F, \Sigma}^2\)

This problem is solved using iterative thresholding algorithms. An exhaustive description of the GMCA algorithm can be found in this review article.
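To illustrate what an iterative thresholding scheme for this kind of problem can look like, here is a heavily simplified sketch. It is not the reference GMCA implementation: it assumes that the sources are sparse in the sample domain (W is the identity), that the noise covariance is a multiple of the identity, and that the thresholds decrease linearly to zero. The function names `soft_threshold` and `gmca_sketch` and all parameter values are hypothetical.

```python
import numpy as np

def soft_threshold(z, lam):
    """Entrywise soft thresholding, the proximal operator of lam * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def gmca_sketch(X, n_src, n_iter=100, seed=0):
    """Alternate thresholded least-squares updates of S and A (illustrative only)."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((X.shape[0], n_src))
    A /= np.linalg.norm(A, axis=0, keepdims=True)
    S = np.zeros((n_src, X.shape[1]))
    for it in range(n_iter):
        # Source update: least-squares estimate followed by soft thresholding.
        S_ls = np.linalg.pinv(A) @ X
        # Per-source threshold, decreasing linearly to zero (assumed strategy).
        lam = (1.0 - (it + 1) / n_iter) * np.max(np.abs(S_ls), axis=1, keepdims=True)
        S = soft_threshold(S_ls, lam)
        # Mixing-matrix update: least squares on the sources that are non-zero.
        active = np.linalg.norm(S, axis=1) > 0
        if active.any():
            A[:, active] = X @ np.linalg.pinv(S[active])
            # Renormalize the columns to remove the scale ambiguity.
            norms = np.linalg.norm(A, axis=0, keepdims=True)
            A /= np.maximum(norms, 1e-12)
    return A, S

# Usage on a small synthetic, noiseless mixture of two sparse sources.
rng = np.random.default_rng(1)
S_true = rng.standard_normal((2, 200)) * (rng.random((2, 200)) < 0.1)
X_demo = rng.standard_normal((4, 2)) @ S_true
A_est, S_est = gmca_sketch(X_demo, n_src=2, n_iter=50)
```

Starting with a high threshold keeps only the largest coefficients, where a single source tends to dominate; lowering it progressively refines the estimates. As with any BSS method, the sources are recovered only up to a permutation and a scaling.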

Applications of BSS

Astrophysics and cosmology: the analysis of the Planck data has probably been one of the major applications of BSS in astrophysics. In this context, estimating one of the holy grails of cosmology, the Cosmic Microwave Background (CMB), requires BSS methods to separate the cosmological signal from the foregrounds. To that end, the L-GMCA algorithm has been specifically designed to estimate the CMB map from the Planck data. In contrast to the standard GMCA algorithm, L-GMCA builds upon a new modeling of multiscale data in which the linear mixture model is local rather than global and is applied at various scales. GMCA has also been used to estimate the Epoch of Reionization (EoR) cosmological signal (see [1], [2] and [3]).

Joint deconvolution and BSS: In radio interferometry, multispectral data are observed through so-called visibilities, which are composed of partial Fourier measurements. Tackling BSS problems in this setting therefore requires jointly solving a component separation problem and a compressed sensing reconstruction task. For that purpose, the decGMCA algorithm has been introduced to perform sparse component separation from Fourier measurements. Further details can be found in this article. The decGMCA code is available at this location. The method has also been extended in Carloni et al. (2020) to spherical data sets.
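The measurement model described above can be sketched on a one-dimensional toy example: the mixtures are Fourier transformed and only a subset of the Fourier coefficients is kept for each channel. The sizes, the random sampling masks and the 30% sampling rate are assumptions chosen for illustration; real interferometric coverage is structured, and this is not the decGMCA code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): 3 channels, 2 sources, 64 samples per signal.
n_obs, n_src, n_pix = 3, 2, 64

A = rng.standard_normal((n_obs, n_src))
S = rng.standard_normal((n_src, n_pix)) * (rng.random((n_src, n_pix)) < 0.1)

# One binary mask per channel: True where a Fourier coefficient is measured.
mask = rng.random((n_obs, n_pix)) < 0.3

# Forward model: mix the sources, Fourier transform each channel,
# then keep only the sampled frequencies.
X_hat = np.fft.fft(A @ S, axis=1)
Y = mask * X_hat

# Zero-filled inverse transform: a naive reconstruction of the mixtures.
X_dirty = np.fft.ifft(Y, axis=1)
```

Because each channel is missing most of its Fourier coefficients, inverting the transform channel by channel and then separating the sources is not sufficient, which motivates solving both problems jointly.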

Hyperspectral unmixing: The GMCA algorithm has been further extended to process hyperspectral data. In contrast to usual multispectral data, the number of observations can be large (~100) in the hyperspectral case. Furthermore, the columns of the mixing matrix A are generally related to the electromagnetic spectra of the sought-after components; these spectra generally exhibit regularities or structures that are sparsely represented in an adequate signal representation (e.g. wavelets). In the hyperspectral case, the HypGMCA algorithm enforces both the sparsity of the sources and the sparsity of the columns of the mixing matrix. A description of the HypGMCA algorithm as well as an illustration on Mars Express data are presented here.

Sparse non-negative matrix factorization (NMF) is a very active field in machine learning. However, properly enforcing sparsity together with non-negativity is non-trivial. GMCA has recently been extended to handle both constraints properly, thanks to recent advances in optimization and, more precisely, in proximal calculus. Further details can be found at this location.
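One standard building block from proximal calculus that combines the two constraints is the proximal operator of an l1 penalty plus a non-negativity constraint, which reduces to a one-sided soft thresholding. The sketch below shows this operator only; the function name `prox_nonneg_l1` is hypothetical, and the snippet does not claim to reproduce the scheme actually used in the non-negative extension of GMCA.

```python
import numpy as np

def prox_nonneg_l1(z, lam):
    """Proximal operator of lam * ||s||_1 + indicator(s >= 0).

    Shrinks every entry by lam and sets negative results to zero, so that
    sparsity and non-negativity are enforced in a single step.
    """
    return np.maximum(z - lam, 0.0)

z = np.array([-1.0, 0.5, 2.0])
s = prox_nonneg_l1(z, lam=1.0)  # array([0., 0., 1.])
```

Applying soft thresholding and then a separate projection onto the non-negative orthant gives the same result in this separable case, but that equivalence does not hold in general when the two constraints are expressed in different domains, which is one reason why a careful proximal treatment matters.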

Robust sparse BSS: While real-world data are often grossly corrupted, most BSS techniques give erroneous results in the presence of outliers. We propose two robust algorithms, coined rGMCA and rAMCA, that jointly estimate the sparse sources and the outliers without requiring any prior knowledge of the outliers. More precisely, they use an alternative weighted scheme to weaken the influence of the estimated outliers. The latest algorithm, rAMCA, which is based on AMCA, performs well in terms of accuracy and reliability in both the determined and over-determined cases. Further details can be found in this report.

Current research at LCS focuses on more complex data models (e.g. spectral variabilities, Poisson statistics, etc.) as well as on understanding how to design a reliable yet effective optimization framework for tackling sparse BSS problems. The latest advances can be found at this location.

Our sparse BSS software is available at this location.