Matthieu Puigt's PhD thesis defense
By Matthieu Puigt - 27/04/2009
Blind source separation methods based on time-frequency transforms.
Application to speech signals
Pierre Comon, I3S, CNRS, Sophia-Antipolis, president
Ali Mansour, E3I2, ENSIETA, Brest, reviewer
Eric Moreau, MS, Université de Toulon-ISITV, reviewer
Jean-Philippe Bernard, CESR, CNRS, Toulouse, member
Shahram Hosseini, LATT, Université de Toulouse, member
Yannick Deville, LATT, Université de Toulouse, thesis supervisor
Several time-frequency (TF) blind source separation (BSS) methods are proposed in this thesis. At the output of the systems considered, a contribution of each source is estimated using only the mixed signals. All the methods proposed in this manuscript find small TF zones where only one source is active and estimate the mixing parameters in these zones. These approaches are particularly well suited to non-stationary sources (speech, music).
We first studied and improved linear instantaneous methods based on variance or correlation criteria, which had previously been proposed by our team. They yield excellent performance for speech signals and can also separate spectra from astrophysical data. However, the nature of the mixtures that they can process limits their fields of application.
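To illustrate the general principle of a variance criterion, here is a minimal sketch. It is not the algorithm of the thesis. It assumes two noiseless linear instantaneous mixtures whose first-row mixing coefficients are normalized to 1, so that each source is characterized by a single real ratio. The function names (stft, estimate_mixing_ratios), the zone length, the thresholds and the synthetic test signals are all hypothetical choices made for this example. In a TF zone where only one source is active, the ratio of the two mixture transforms is constant, so its variance over the zone is close to zero; the mean ratio in such a zone then gives the mixing parameter of that source.

```python
import numpy as np

def stft(x, frame=256, hop=128):
    """Minimal STFT: rows are Hann-windowed time frames, columns are frequency bins."""
    window = np.hanning(frame)
    n_frames = 1 + (len(x) - frame) // hop
    frames = np.stack([x[i * hop:i * hop + frame] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def estimate_mixing_ratios(x1, x2, n_sources=2, zone_len=8, min_gap=0.1):
    """Estimate the ratios a_2j / a_1j from zones where X2 / X1 is nearly constant."""
    X1, X2 = stft(x1), stft(x2)
    n_zones = X1.shape[0] // zone_len
    # Group adjacent frames into zones: shape (zone, frame in zone, frequency bin).
    X1z = X1[:n_zones * zone_len].reshape(n_zones, zone_len, -1)
    X2z = X2[:n_zones * zone_len].reshape(n_zones, zone_len, -1)
    # Discard zones where the denominator is too small for a reliable ratio.
    floor = 1e-6 * np.abs(X1z).max()
    valid = np.abs(X1z).min(axis=1) > floor
    ratio = X2z / np.where(np.abs(X1z) > floor, X1z, 1.0)
    variance = np.var(ratio, axis=1)       # real and non-negative, even for complex input
    variance[~valid] = np.inf
    mean_ratio = ratio.mean(axis=1).real   # ratios are real for instantaneous mixtures
    found = []
    # Visit zones from lowest to highest variance; keep ratios not already found.
    for idx in np.argsort(variance, axis=None):
        if not np.isfinite(variance.flat[idx]):
            break
        candidate = float(mean_ratio.flat[idx])
        if all(abs(candidate - r) > min_gap for r in found):
            found.append(candidate)
        if len(found) == n_sources:
            break
    return sorted(found)

# Synthetic example (assumed data): each source is alone during part of the signal.
rng = np.random.default_rng(0)
n = 16000
s1 = rng.standard_normal(n)
s1[9600:] = 0.0                  # source 1 is silent at the end
s2 = rng.standard_normal(n)
s2[:6400] = 0.0                  # source 2 is silent at the beginning
x1 = s1 + s2                     # first mixture, coefficients normalized to 1
x2 = 0.6 * s1 + 1.7 * s2         # second mixture, true ratios are 0.6 and 1.7
estimated = estimate_mixing_ratios(x1, x2)
print(estimated)
```

Once the ratios are known, the mixing matrix can be inverted to recover a contribution of each source. With real speech, single-source zones are only approximately clean, so the variance threshold and the zone size become tuning choices rather than the fixed values assumed here.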
We then extended these approaches to more realistic mixtures. The first extensions consider attenuated and delayed mixtures of sources, which correspond to mixtures recorded in an anechoic chamber. They require less restrictive sparsity assumptions than some approaches previously proposed in the literature, while addressing the same type of mixtures. We have studied the contribution of clustering techniques to our approaches and have achieved good performance for mixtures of speech signals.
Lastly, a theoretical extension of these methods to general convolutive mixtures is described. It requires strong sparsity hypotheses, and the classical indeterminacies of frequency-domain BSS methods must be resolved.
Keywords: blind source separation, linear instantaneous mixtures, attenuated and delayed mixtures, convolutive mixtures, time-frequency analysis, non-stationary sources, short-time Fourier transform, sparsity, correlation, variance, clustering, speech, astrophysics.
Date: December 13, 2007
Time: 11:00
Place: Conference room, CESR, 9 Av du Colonel Roche, 31028 Toulouse, France