Linear Instantaneous TIme-Frequency Ratio Of Mixtures
By Matthieu Puigt - 27/10/2008
LI-TIFROM software
TIme-Frequency Ratio of Mixtures blind source separation method for Linear-Instantaneous mixtures
LI-TIFROM is a sparsity-based Blind Source Separation (BSS)
method. It is based on a time-frequency (TF) analysis:
- It first finds single-source TF zones, i.e. sets of adjacent TF windows where a single source is active.
- Then, in each of these zones, it estimates a column of the mixing matrix.
- Once all the columns of the mixing matrix have been estimated, the last step recovers the sources.
Keywords: blind source separation; sparsity; time-frequency analysis; stationary, non-stationary and/or dependent sources.
Please acknowledge the use of this software in any
publication with the sentence "The LI-TIFROM software is available at
http://www.ast.obs-mip.fr/bss-softwares" and cite the
following references:
- F. Abrard and Y. Deville, A time-frequency blind signal separation method applicable to underdetermined mixtures of dependent sources, Signal Processing, vol. 85, issue 7, pp. 1389-1403, July 2005.
- Y. Deville, M. Puigt, and B. Albouy, Time-frequency blind signal separation : extended methods, performance evaluation for speech sources, Proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN 2004), pp. 255-260, Budapest, Hungary, July 25-29, 2004.
- M. Puigt and Y. Deville, Time-frequency ratio-based blind separation methods for attenuated and time-delayed sources, Section 3 : "Summary of the TIFROM method for linear instantaneous mixtures", Mechanical Systems and Signal Processing, vol. 19, issue 6, pp. 1348-1379, November 2005.
Please send a copy of your publication to
Matthieu.Puigt@ast.obs-mip.fr and
Yannick.Deville@ast.obs-mip.fr.
Blind source separation (BSS) consists in estimating a set of
unknown sources from a set of observations resulting from
mixtures of these sources through unknown propagation channels.
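The mixing model that we consider here is the linear instantaneous
mixture:
$x(t) = A\,s(t)$,
where $s(t)$ is the vector of unknown sources, $x(t)$ the vector of observations and $A$ the unknown mixing matrix.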
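As a minimal illustration of this model, the MATLAB sketch below builds two observations from two synthetic sources. The signals, the sampling frequency and the matrix A are assumptions chosen for the example; they are not taken from the package or its demos.

```matlab
% Minimal sketch of a linear instantaneous mixture x = A*s.
% All signals and values below are illustrative assumptions.
fs = 8000;                          % sampling frequency in Hz (assumed)
t  = (0:fs-1) / fs;                 % one second of samples
s  = [sin(2*pi*440*t);              % source 1: 440 Hz tone
      sign(sin(2*pi*97*t))];        % source 2: 97 Hz square wave
A  = [1.0 0.9;                      % mixing matrix: one row per observation,
      0.8 1.0];                     % one column per source
x  = A * s;                         % each row of x is one observation
```

Each row of x is one observation, which is the layout expected for the first input argument of the package functions.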
Unlike most existing methods (based on such tools
as higher-order statistics or information theory), some approaches
proposed over the last decade rely on time-frequency representations.
Some of these methods assume that at most one source is
active in each atom of the analysis domain. By contrast, the
LI-TIFROM method only needs one tiny single-source
time-frequency zone per source in order to estimate the mixing
matrix, and allows several sources to be active in all other
zones.
This approach is based on the following stages:
- It first computes the Short-Time Fourier Transform (STFT) of the
observations.
- Detection stage: it finds Time-Frequency (TF) zones where a single source is active (called single-source zones).
- Identification stage: it estimates a column of the unknown mixing matrix in each of
the above zones and only keeps the "best" columns that are sufficiently "distant" from one another
("best" in the sense of the single-source quality measured in the detection stage).
- Reconstruction stage: it finally recovers the sources.
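The detection and identification stages can be sketched on a toy example as follows. This is not the package's implementation: the hand-rolled STFT, the sliding zones, the 0.1 "distance" threshold and the test signals are all assumptions made for illustration, and the columns are scaled so that their second entry equals 1.

```matlab
% Sketch of the detection and identification stages on a toy 2x2 mixture.
% Hand-rolled STFT; all signals and thresholds are illustrative assumptions.
n  = 0:4095;
s1 = sin(2*pi*0.05*n) .* (n < 3000);      % source 1, silent at the end
s2 = sin(2*pi*0.11*n) .* (n >= 1500);     % source 2, silent at the start
A  = [1.0 0.5; 0.6 1.0];                  % assumed "true" mixing matrix
x  = A * [s1; s2];

nwin = 128; hop = 32; nb_win = 10;        % 75% overlap, 10 windows per zone
w    = 0.5 - 0.5*cos(2*pi*(0:nwin-1)/nwin);   % periodic Hann window
nfrm = floor((size(x,2) - nwin)/hop) + 1;
X    = zeros(nwin/2+1, nfrm, 2);          % frequency x time x observation
for m = 1:nfrm
    idx = (m-1)*hop + (1:nwin);
    for i = 1:2
        F = fft(x(i,idx) .* w);
        X(:,m,i) = F(1:nwin/2+1).';
    end
end

alpha = X(:,:,1) ./ X(:,:,2);             % ratio of mixtures per TF window

% Detection: variance of alpha over each zone of nb_win adjacent windows.
nz = nfrm - nb_win + 1;
v  = zeros(size(alpha,1), nz);            % variance per (frequency, zone)
mu = zeros(size(alpha,1), nz);            % mean of alpha per (frequency, zone)
for z = 1:nz
    blk     = alpha(:, z:z+nb_win-1);
    mu(:,z) = mean(blk, 2);
    v(:,z)  = mean(abs(blk - mu(:,z)*ones(1,nb_win)).^2, 2);
end

% Identification: scan zones by increasing variance, keep "distant" columns.
[~, order] = sort(v(:));
B = [];
for k = order.'
    c = [real(mu(k)); 1];                 % column scaled so that row 2 is 1
    if isempty(B) || all(abs(B(1,:) - c(1)) > 0.1)
        B = [B, c];
    end
    if size(B,2) == 2, break; end
end
disp(B);                                  % estimated columns, up to order and scale
```

On this toy mixture the two retained columns should match the columns of A divided by their second entry, in some order.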
Two versions have been proposed (see Subsection
2.2 for more explanations) and details can be
found in:
- F. Abrard and Y. Deville, A time-frequency blind signal separation method applicable to underdetermined mixtures of dependent sources, Signal Processing, vol. 85, issue 7, pp. 1389-1403, July 2005.
- Y. Deville, M. Puigt, and B. Albouy, Time-frequency blind signal separation : extended methods, performance evaluation for speech sources, Proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN 2004), pp. 255-260, Budapest, Hungary, July 25-29, 2004.
- M. Puigt and Y. Deville, Time-frequency ratio-based blind separation methods for attenuated and time-delayed sources, Section 3 : "Summary of the TIFROM method for linear instantaneous mixtures", Mechanical Systems and Signal Processing, vol. 19, issue 6, pp. 1348-1379, November 2005.
Like many TF-BSS approaches, the LI-TIFROM
method requires "good" parameter values in order to achieve the separation. Depending on your data and the selected parameters, it may still fail or provide inaccurate results. In this case, please do not give up: send your comments to Matthieu.Puigt@ast.obs-mip.fr and
Yannick.Deville@ast.obs-mip.fr. We will do our best to help you.
The LI-TIFROM package is written in MATLAB,
requires the MATLAB Signal Processing Toolbox and has been tested on MATLAB 6.0, 7.0 and 7.5. This
program is released under the CeCILL license, so you can download the sources and modify them under the terms of this license.
2.2 Versions of LI-TIFROM
The first version of LI-TIFROM was proposed in the
early 2000s in:
- F. Abrard and Y. Deville, A time-frequency blind signal separation method applicable to underdetermined mixtures of dependent sources, Signal Processing, vol. 85, issue 7, pp. 1389-1403, July 2005.
The identification stage is based on the study of one ratio
of mixtures, $\alpha(t,\omega) = X_i(t,\omega) / X_j(t,\omega)$, where
$X_i(t,\omega)$ (resp. $X_j(t,\omega)$) is the TF transform of the observation $x_i(t)$ (resp. $x_j(t)$) and the indices $i$ and $j$ are chosen arbitrarily among the available signals. The variance of
$\alpha(t,\omega)$ over one TF
zone, composed of a series of adjacent TF windows, allows us to find single-source TF zones.
However, we showed that the variance of the ratio of observations in the above LI-TIFROM approach has an asymmetrical behaviour (see e.g. Y. Deville, M. Puigt, Temporal and time-frequency correlation-based blind source separation methods. Part I: Determined and underdetermined linear instantaneous mixtures, Signal Processing, vol. 87, issue 3, pp. 374-407, March 2007). As a consequence, we proposed a symmetrical version, hence its name LI-TIFROM-S, which takes into account the
ratio $\alpha(t,\omega)$ and its inverse $1/\alpha(t,\omega)$.
Moreover, when the number of observations is higher
than 2, we showed that it is better to consider all the ratios
$X_i(t,\omega)/X_j(t,\omega)$ and $X_j(t,\omega)/X_i(t,\omega)$
and to study the variance of the means of these ratios to
find single-source zones.
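A short derivation, in notation assumed here rather than taken from the package, shows why a low variance of a ratio of observations signals a single-source zone. Write each observation as $x_i(t) = \sum_k a_{ik} s_k(t)$, with STFTs $X_i(t,\omega)$ and $S_k(t,\omega)$, and call $\alpha(t,\omega)$ the ratio of two observations:

```latex
\alpha(t,\omega) = \frac{X_i(t,\omega)}{X_j(t,\omega)}
                 = \frac{\sum_k a_{ik}\, S_k(t,\omega)}{\sum_k a_{jk}\, S_k(t,\omega)},
\qquad
\text{and, if only } S_{k_0} \text{ is nonzero in the zone,}\quad
\alpha(t,\omega) = \frac{a_{i k_0}}{a_{j k_0}}, \quad
\frac{1}{\alpha(t,\omega)} = \frac{a_{j k_0}}{a_{i k_0}} .
```

In such a zone the ratio is constant, so its variance is zero, and its mean gives the ratio of two entries of the column of the mixing matrix associated with source $k_0$ (that is, the column itself up to a scale factor when there are two observations). The inverse ratio is constant as well, so either one can serve as a single-source test.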
All these improvements have been proposed and detailed in :
- Y. Deville, M. Puigt, and B. Albouy, Time-frequency blind
signal separation : extended methods, performance evaluation for
speech sources, Proceedings of the IEEE International Joint
Conference on Neural Networks (IJCNN 2004), pp. 255-260,
Budapest, Hungary, July 25-29, 2004.
- M. Puigt and Y. Deville, Time-frequency ratio-based blind separation methods for attenuated and time-delayed sources, Section 3 : "Summary of the TIFROM method for linear instantaneous mixtures", Mechanical Systems and Signal Processing, vol. 19, issue 6, pp. 1348-1379, November 2005.
Both versions of LI-TIFROM (i.e. the asymmetrical and
symmetrical ones) are available in this package.
As explained above, the package provides two versions of
LI-TIFROM, named li_tifrom and
li_tifrom_sym respectively. Both methods can estimate the
unknown mixing matrix in the determined (as many sources as observations) and
underdetermined (more sources than observations) cases. They use the same parameters:
[B,coord,compt,Ns_found,error,X,f,t] = ...
li_tifrom(x,N_sources,nb_samp_in_win,overlap,nb_win,fs) ;
[B,coord,compt,Ns_found,error,X,f,t] = ...
li_tifrom_sym(x,N_sources,nb_samp_in_win,overlap,nb_win,fs) ;
- Input parameters:
  - x: the 2D matrix of observations, where each row is an observation.
  - N_sources: the number of unknown sources.
  - nb_samp_in_win: the number of samples per temporal window in STFT computations. By default, this value is set to 128.
  - overlap: the temporal overlap between two time-adjacent TF windows, as a fraction of the window length. By default, this value is set to 0.75.
  - nb_win: the number of TF windows in TF zones. By default, this value is set to 10.
  - fs: the sampling frequency. By default, this value is set to 1.
- Output parameters:
  - B: the estimated mixing matrix.
  - coord: the 2D matrix of the coordinates of the TF zones used to find the columns of B.
  - compt: the number of TF zones used to estimate B.
  - Ns_found: the number of sources which have been found.
  - error: the parameter which indicates why the program stopped. Its possible values are 0, 1 and 2.
  - X: the 3D matrix of TF observations (i.e. the STFTs of the observations).
  - f: the vector of frequencies at which the STFTs are computed.
  - t: the vector of time samples at which the STFTs are computed.
For both methods, some input parameters are optional, as shown
below:
- fs is only needed if you want to plot the STFTs of the
observations, using the output parameters X, f and
t. When it is omitted, e.g.
[B,coord,compt,Ns_found,error,X,f,t] = ...
li_tifrom(x,N_sources,nb_samp_in_win,overlap,nb_win) ;
fs is set to 1.
- nb_win is the number of TF windows per analysis zone.
In our tests, a value of 10 worked well for the tested signals.
In the case below,
[B,coord,compt,Ns_found,error,X,f,t] = ...
li_tifrom(x,N_sources,nb_samp_in_win,overlap) ;
fs is set to 1 and nb_win is set to 10.
- overlap is the temporal overlap between two TF
windows in STFT computations. In our tests, a value of 75% worked
well for the tested signals. With the following
call,
[B,coord,compt,Ns_found,error,X,f,t] = ...
li_tifrom(x,N_sources,nb_samp_in_win) ;
fs, nb_win and overlap are set to 1, 10
and 0.75 respectively.
- Lastly, nb_samp_in_win is the number of samples per TF
window in STFT computations. In our tests with speech signals
sampled at 20 kHz, this value was best
set to 128 (or 256). With the call
[B,coord,compt,Ns_found,error,X,f,t] = li_tifrom(x,N_sources) ;
fs, nb_win, overlap and
nb_samp_in_win are set to 1, 10, 0.75 and 128 respectively.
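A possible driver script for the shortest call is sketched below. The test signals and the mixing matrix are assumptions made for the example, and the call to li_tifrom is guarded so that the script still runs when the package is not on the MATLAB path. The fifth output is stored in a variable named err (a name chosen here) so that MATLAB's built-in error function is not shadowed in the calling workspace.

```matlab
% Hypothetical driver script: builds a toy 2x2 mixture and calls li_tifrom
% with all optional parameters left to their documented defaults.
n = 0:8191;
s = [sin(2*pi*0.05*n) .* (n < 6000);      % assumed source 1
     sin(2*pi*0.11*n) .* (n >= 2000)];    % assumed source 2
A = [1.0 0.5; 0.6 1.0];                   % assumed mixing matrix
x = A * s;                                % one observation per row
N_sources = 2;

if exist('li_tifrom', 'file')
    % Omitted arguments take the documented defaults:
    % nb_samp_in_win = 128, overlap = 0.75, nb_win = 10, fs = 1.
    [B, coord, compt, Ns_found, err] = li_tifrom(x, N_sources);
    fprintf('Sources found: %d (stop code %d)\n', Ns_found, err);
    disp(B);
else
    disp('li_tifrom not found on the path; skipping the call.');
end
```

Replacing li_tifrom with li_tifrom_sym in this script runs the symmetrical version with the same arguments.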
As explained above, these methods need their input
parameters to be "correctly selected". If you do not obtain the
expected results, please inform us. We will do our best to help
you.
In this section, we present the four examples provided in the
LI-TIFROM package, covering several configurations. The
first three examples (Subsection 3.1) concern determined mixtures, i.e. cases where the
number of sources is equal to the number of observations.
The example in Subsection 3.2
illustrates the performance of the proposed approaches when there
are more sources than observations (underdetermined mixtures).
The sources are real English speech signals from the MULTEXT
database (see footnote 1).
All these signals have first been centered and normalized, so that
their highest absolute value is equal to 1. Temporal and TF
representations of these signals are shown in Figure
1.

Figure 1 : Temporal and TF representations of the sources.
Depending on whether the mixtures are determined or underdetermined,
we use different performance criteria:
- In all the examples of Subsection 3.1, we compute the
Signal-to-Interference Ratio (SIR) parameters (see footnote 2):
  - The input SIR (SIR^in) measures the "mixture rate": the more the sources are mixed, the lower its value.
  - The output SIR (SIR^out) measures the "separation quality".
  - The SIR Improvement (SIRI) is the improvement provided by the BSS
methods: SIRI = SIR^out - SIR^in, with all values expressed in dB.
- In the underdetermined case presented in Subsection 3.2,
recovering the sources by inverting the mixing matrix is not
possible. The approach proposed by our team consists in
considering an invertible submatrix of the estimated mixing matrix
and in separating the sources partially (see footnote 3). In the example provided in Subsection
3.2, we use a global criterion:
the Frobenius norm of the difference
between the theoretical and estimated mixing matrices.
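The SIR criteria can be sketched as follows for a determined mixture. The definition used here (power of the wanted source over power of the interfering sources, averaged over channels, for uncorrelated sources and without output permutation) is one common convention and may differ in detail from the one in the paper cited in footnote 2. The matrices A and B and the source powers are assumptions made for the example.

```matlab
% Sketch: SIR figures (in dB) from a mixing matrix A, an estimate B and
% the source powers. Assumes uncorrelated sources and no permutation.
A = [1.0 0.9; 0.8 1.0];            % assumed true mixing matrix
B = [1.0 0.9001; 0.8001 1.0];      % assumed estimate of A
P = [1.0; 1.0];                    % assumed source powers E{s_k^2}

G_in  = A;                         % observations: x = A * s
G_out = B \ A;                     % outputs: y = inv(B) * x = (B\A) * s

N = size(A, 1);
sir_in  = zeros(N, 1);
sir_out = zeros(N, 1);
for i = 1:N
    others     = [1:i-1, i+1:N];   % indices of the interfering sources
    sir_in(i)  = 10*log10(G_in(i,i)^2  * P(i) / sum(G_in(i,others).^2  .* P(others).'));
    sir_out(i) = 10*log10(G_out(i,i)^2 * P(i) / sum(G_out(i,others).^2 .* P(others).'));
end
SIR_in  = mean(sir_in);            % "mixture rate"
SIR_out = mean(sir_out);           % "separation quality"
SIRI    = SIR_out - SIR_in;        % improvement, in dB
fprintf('SIR^in: %.1f dB, SIR^out: %.1f dB, SIRI: %.1f dB\n', SIR_in, SIR_out, SIRI);
```

When the true sources are not available as a matrix A, the same ratios can be estimated from the signals themselves, at the cost of handling the scale and permutation of the outputs.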
3.1 Determined mixtures
In this subsection, we present three examples:
- demo1 shows the performance of the proposed
approaches with a "strong" mixture of real
speech signals (low input SIR).
Performance achieved by the LI-TIFROM methods :
SIR^in : 1.4 dB
"Asymmetrical" LI-TIFROM method :
SIR^out : 82.0 dB
SIRI : 80.6 dB
"Symmetrical" LI-TIFROM method :
SIR^out : 82.0 dB
SIRI : 80.6 dB
- demo2 shows the performance of the proposed
approaches with a "weak" mixture of the
same signals (high input SIR).
This case illustrates the improvement provided by the symmetrical
version of LI-TIFROM:
Performance achieved by the LI-TIFROM methods :
SIR^in : 30.0 dB
"Asymmetrical" LI-TIFROM method :
SIR^out : -34.4 dB
SIRI : -64.4 dB
"Symmetrical" LI-TIFROM method :
SIR^out : 68.3 dB
SIRI : 38.3 dB
- demo3 shows the performance of the proposed approaches
with more than two sources and observations
(the sparsity of the observations decreases as the number of
sources grows).
As expected, when the number of observations is above 2, the
LI-TIFROM-S approach yields better performance:
Performance achieved by the LI-TIFROM methods :
SIR^in : -3.6 dB
"Asymmetrical" LI-TIFROM method :
SIR^out : -64.6 dB
SIRI : -61.0 dB
"Symmetrical" LI-TIFROM method :
SIR^out : 43.0 dB
SIRI : 46.7 dB
3.2 Underdetermined mixtures
Unlike many classical BSS approaches,
LI-TIFROM(-S) can handle underdetermined mixtures by
taking advantage of the sparsity assumption. We show here the
performance of both approaches in this case. In demo4, the
sources are mixed into a smaller number of observations.
Performance achieved by the LI-TIFROM methods :
"Asymmetrical" LI-TIFROM method :
Frobenius norm : 1.6e-003
"Symmetrical" LI-TIFROM method :
Frobenius norm : 1.6e-003
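A Frobenius-norm criterion of this kind can be sketched as follows. Since a mixing matrix can only be estimated up to the order and scale of its columns, the sketch first normalizes each column by its last entry and then pairs the columns greedily. Both steps, and the two matrices, are assumptions made for the example; the convention used in the package may differ.

```matlab
% Sketch: Frobenius-norm error between a mixing matrix A and an estimate B,
% after removing the scale and permutation ambiguities.
% Both matrices are illustrative assumptions (2 observations, 3 sources).
A = [1.0 0.5 2.0;
     0.6 1.0 1.0];
B = [0.5001 4.0 1.6668;            % columns permuted and rescaled
     1.0    2.0 1.0   ];

% Scale: divide each column by its last entry so that the last row is 1.
An = A ./ (ones(size(A,1),1) * A(end,:));
Bn = B ./ (ones(size(B,1),1) * B(end,:));

% Permutation: pair each column of An with the closest unused column of Bn.
N    = size(An, 2);
perm = zeros(1, N);
free = true(1, N);
for k = 1:N
    d = sum(abs(Bn - An(:,k) * ones(1,N)).^2, 1);
    d(~free) = Inf;                % columns already paired are excluded
    [~, perm(k)] = min(d);
    free(perm(k)) = false;
end
err_fro = norm(An - Bn(:, perm), 'fro');
fprintf('Frobenius norm: %.1e\n', err_fro);
```

The greedy pairing is adequate when the estimated columns are close to the true ones; for poorer estimates an exhaustive search over permutations would be safer.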
The LI-TIFROM package is currently maintained by Matthieu
PUIGT and Yannick DEVILLE. Many thanks to Frédéric
ABRARD, who developed the first version of this software.
Footnotes
- 1: http://aune.lpl.univ-aix.fr/projects/MULTEXT/
- 2: These parameters are defined e.g. in: Y. Deville, M. Puigt, Temporal and time-frequency correlation-based blind source separation methods. Part I: Determined and underdetermined linear instantaneous mixtures, Signal Processing, vol. 87, issue 3, pp. 374-407, March 2007.
- 3: See: F. Abrard and Y. Deville, A time-frequency blind signal separation method applicable to underdetermined mixtures of dependent sources, Signal Processing, vol. 85, issue 7, pp. 1389-1403, July 2005.