Source localization method by using steering vector estimation based on on-line complex Gaussian mixture model

11257488 · 2022-02-22

Abstract

Provided is a source localization method for use in an apparatus that performs source localization, target sound source enhancement, or speech recognition. The source localization method uses input signals from a plurality of microphones and comprises the steps of: (a) obtaining a log likelihood function or an auxiliary function under the assumption that a target source signal mixed with noises satisfies a complex Gaussian mixture model (CGMM); (b) obtaining an equation for estimating parameter values of the log likelihood function or the auxiliary function so that a value of the log likelihood function or the auxiliary function is maximized recursively in each time frame; (c) estimating a covariance matrix recursively in each time frame; and (d) estimating a steering vector recursively by using the estimated covariance matrix, wherein the steering vector of the target sound source is estimated from the input signals.

Claims

1. A source localization method comprising: (a) receiving, by one or more processors, input signals from a plurality of microphones; (b) obtaining, by the one or more processors, a log likelihood function or an auxiliary function under the assumption that a target source signal mixed with noises of the input signals satisfies a CGMM model, (c) obtaining, by the one or more processors, an equation for estimating parameter values of the log likelihood function or the auxiliary function so that a value of the log likelihood function or the auxiliary function is maximized recursively in each time frame, (d) estimating, by the one or more processors, a covariance matrix recursively in each time frame, (e) estimating, by the one or more processors, a steering vector recursively by using the estimated covariance matrix, wherein the steering vector of the target sound source is estimated from the input signals, and wherein the step (e) is estimating a steering vector as a value of an eigenvector having a largest eigenvalue by eigenvector-decomposing the covariance matrix, and (f) performing, by the one or more processors, a source localization of a target sound from the input signals by performing a sound enhancement of the input signals using the steering vector.

2. The source localization method according to claim 1, wherein the step (d) is estimating a covariance matrix in each time frame and normalizing the covariance matrix.

3. The source localization method according to claim 1, wherein the log likelihood function or the auxiliary function includes a forgetting factor.

4. A non-transitory computer-readable storage medium recording a program that implements the source localization method according to claim 1 so as to be executable by a processor of the apparatus for performing a source localization, a target sound source enhancement or speech recognition.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

(1) FIG. 1 is a flowchart illustrating a steering vector estimation method according to a preferred embodiment of the present invention;

(2) FIG. 2 is a graph illustrating an MVDR result obtained by using the CHiME4 data in a simulation environment in the steering vector estimation method according to the present invention; and

(3) FIG. 3 is a graph illustrating an MVDR result obtained by using the CHiME4 data in a real environment in the steering vector estimation method according to the present invention.

DETAILED DESCRIPTION OF THE EMBODIMENTS

(4) Hereinafter, a real-time CGMM-based steering vector estimation method using recursive estimation according to a preferred embodiment of the present invention will be described in detail with reference to the accompanying drawings.

(5) In order to estimate the parameters recursively, a log likelihood function or an auxiliary function incorporating a forgetting factor γ is obtained as follows.

(6)
$$Q = \sum_{t=1}^{T} \gamma^{T-t} \sum_{f} \sum_{v} \lambda_{f,t}^{(v)} \log \left[ \kappa_f^{(v)}\, p\left(y_{f,t} \mid d_{f,t} = v, \Theta\right) \right]$$
[Equation 13]

$$p\left(y_{f,t} \mid d_{f,t} = v, \Theta\right) = \frac{\xi_{f,c}^{(v)}}{\det\left(\phi_{f,t}^{(v)} R_f^{(v)}\right)} \exp\left( -\left( y_{f,t}^{H} \left(\phi_{f,t}^{(v)} R_f^{(v)}\right)^{-1} y_{f,t} \right)^{c} \right)$$
[Equation 14]

(7) Herein, the probability density function of the signal is a complex generalized Gaussian distribution. The parameter $\kappa_f^{(v)}$ is a Gaussian mixture weight, which ensures that the mixture density integrates to 1, and the shape of the probability density function can be varied through the value of the parameter c. The parameter $\xi_{f,c}^{(v)}$ is a normalization constant chosen according to the value of c so that the complex generalized Gaussian density integrates to 1. In the present invention, the value of the parameter c is 1, and in this case the probability density function reduces to the complex Gaussian distribution.

(8)
$$Q = \sum_{t=1}^{T} \gamma^{T-t} \sum_{f} \sum_{v} \lambda_{f,t}^{(v)} \log \left[ \kappa_f^{(v)}\, N_c\left(y_{f,t};\, 0,\, \phi_{f,t}^{(v)} R_f^{(v)}\right) \right],$$
$$N_c\left(y_{f,t};\, 0,\, \phi_{f,t}^{(v)} R_f^{(v)}\right) = \frac{1}{\pi^{M} \det\left(\phi_{f,t}^{(v)} R_f^{(v)}\right)} \exp\left( - y_{f,t}^{H} \left(\phi_{f,t}^{(v)} R_f^{(v)}\right)^{-1} y_{f,t} \right)$$
[Equation 15]
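As a non-limiting illustration, the complex Gaussian density of Equation 15 may be evaluated in the log domain for one time-frequency bin. The sketch below assumes NumPy; the function name `log_complex_gaussian` and its argument layout are hypothetical and do not appear in the original disclosure.

```python
import numpy as np

def log_complex_gaussian(y, phi, R):
    """Log of N_c(y; 0, phi * R) from Equation 15 for one time-frequency bin.

    y   : (M,) complex observation vector y_{f,t}
    phi : positive scalar phi_{f,t}^{(v)}
    R   : (M, M) Hermitian positive-definite matrix R_f^{(v)}
    """
    M = y.shape[0]
    cov = phi * R
    # slogdet avoids overflow/underflow of det(.) for ill-scaled matrices.
    _, logdet = np.linalg.slogdet(cov)
    # Quadratic form y^H (phi R)^{-1} y, computed with a linear solve.
    quad = np.real(np.vdot(y, np.linalg.solve(cov, y)))
    return -M * np.log(np.pi) - logdet - quad
```

Working in the log domain is a design choice of this sketch: the posteriors of the mixture can then be formed from log values without underflow.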

(9) The update expression of the variables for maximizing the value of the log likelihood function or an auxiliary function is as follows.

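(10)
$$\frac{\partial Q}{\partial \phi_{f,t}^{(v)}} = \gamma^{T-t} \lambda_{f,t}^{(v)} \left[ -\frac{M}{\phi_{f,t}^{(v)}} + \frac{1}{\left(\phi_{f,t}^{(v)}\right)^{2}}\, y_{f,t}^{H} \left(R_f^{(v)}\right)^{-1} y_{f,t} \right] = 0$$
$$\phi_{f,t}^{(v)} \leftarrow \frac{1}{M}\, y_{f,t}^{H} \left(R_f^{(v)}\right)^{-1} y_{f,t}$$
$$\lambda_{f,t}^{(v)} \leftarrow \frac{\kappa_f^{(v)}\, p\left(y_{f,t} \mid d_{f,t} = v, \Theta\right)}{\sum_{v} \kappa_f^{(v)}\, p\left(y_{f,t} \mid d_{f,t} = v, \Theta\right)}$$
$$\kappa_f^{(v)} \leftarrow \frac{1}{\sum_{t} \gamma^{T-t}} \sum_{t} \gamma^{T-t} \lambda_{f,t}^{(v)}$$
[Equation 16]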
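By way of a hedged sketch, the three updates of Equation 16 may be written as follows for a single frequency bin. NumPy is assumed, and all function names are hypothetical. The running numerator/denominator form used for the mixture weight is an assumption of this sketch: it follows from splitting the forgetting-factor sums at the newest frame and is not stated in the original.

```python
import numpy as np

def update_phi(y, R_inv):
    """phi <- (1/M) y^H R^{-1} y  (first update of Equation 16)."""
    M = y.shape[0]
    return np.real(np.vdot(y, R_inv @ y)) / M

def posteriors(log_lik, kappa):
    """lambda^{(v)} <- kappa^{(v)} p^{(v)} / sum_v kappa^{(v)} p^{(v)}.

    log_lik : (V,) values of log p(y | d = v, Theta), one per class v
    kappa   : (V,) mixture weights
    """
    a = np.log(kappa) + log_lik
    a = a - np.max(a)            # log-sum-exp shift to avoid underflow
    w = np.exp(a)
    return w / np.sum(w)

def update_kappa(num_prev, den_prev, lam, gamma):
    """Recursive form of kappa <- sum_t gamma^{T-t} lambda_t / sum_t gamma^{T-t}.

    num_prev, den_prev : running numerator and denominator at frame T-1
    (assumed bookkeeping, not part of the original text).
    """
    num = gamma * num_prev + lam
    den = gamma * den_prev + 1.0
    return num, den, num / den
```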

(11) Similarly, the update equation for $R_f^{(v)}$ can be obtained by maximizing the log likelihood function or the auxiliary function, as follows.

(12)
$$\frac{\partial Q}{\partial \left(R_f^{(v)}\right)^{-1}} = \sum_{t} \gamma^{T-t} \lambda_{f,t}^{(v)} \left[ R_f^{(v)} - \frac{1}{\phi_{f,t}^{(v)}}\, y_{f,t}\, y_{f,t}^{H} \right] = 0$$
$$\left( \sum_{t} \gamma^{T-t} \lambda_{f,t}^{(v)} \right) R_f^{(v)} = \sum_{t} \gamma^{T-t} \lambda_{f,t}^{(v)} \frac{1}{\phi_{f,t}^{(v)}}\, y_{f,t}\, y_{f,t}^{H}$$
[Equation 17]

(13) For simplification, the following quantity is defined:

(14)
$$\left( \sum_{t=1}^{T} \gamma^{T-t} \lambda_{f,t}^{(v)} \right) R_f^{(v)} = \Gamma_f^{(v)}(T),$$
and thus, the following equation is obtained.

(15)
$$\Gamma_f^{(v)}(T) = \gamma \sum_{t=1}^{T-1} \gamma^{(T-1)-t} \lambda_{f,t}^{(v)} \frac{1}{\phi_{f,t}^{(v)}}\, y_{f,t}\, y_{f,t}^{H} + \lambda_{f,T}^{(v)} \frac{1}{\phi_{f,T}^{(v)}}\, y_{f,T}\, y_{f,T}^{H} = \gamma\, \Gamma_f^{(v)}(T-1) + \lambda_{f,T}^{(v)} \frac{1}{\phi_{f,T}^{(v)}}\, y_{f,T}\, y_{f,T}^{H}$$
[Equation 18]

(16) Therefore, an expression for recursively estimating the value of the parameter in each time frame can be obtained. Applying the matrix inversion lemma to the rank-one update of Equation 18, the estimation equation of $\left(\Gamma_f^{(v)}(T)\right)^{-1}$ can be expressed as follows.

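(17)
$$\left(\Gamma_f^{(v)}(T)\right)^{-1} = \gamma^{-1} \left[ \left(\Gamma_f^{(v)}(T-1)\right)^{-1} - \frac{\left(\Gamma_f^{(v)}(T-1)\right)^{-1} y_{f,T}\, y_{f,T}^{H} \left(\Gamma_f^{(v)}(T-1)\right)^{-1}}{\gamma\, \dfrac{\phi_{f,T}^{(v)}}{\lambda_{f,T}^{(v)}} + y_{f,T}^{H} \left(\Gamma_f^{(v)}(T-1)\right)^{-1} y_{f,T}} \right]$$
[Equation 19]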
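As a non-limiting illustration, one step of the inverse update of Equation 19 may be sketched as below. NumPy is assumed and the function name `update_gamma_inv` is hypothetical. The sketch assumes a strictly positive posterior, since the denominator divides by it.

```python
import numpy as np

def update_gamma_inv(G_inv_prev, y, lam, phi, gamma):
    """One step of Equation 19 for a single frequency bin and class v.

    G_inv_prev : (M, M) inverse accumulator (Gamma(T-1))^{-1}
    y          : (M,) complex observation y_{f,T}
    lam, phi   : posterior lambda_{f,T}^{(v)} (> 0) and scale phi_{f,T}^{(v)}
    gamma      : forgetting factor, 0 < gamma <= 1
    """
    Gy = G_inv_prev @ y                          # Gamma^{-1} y
    yG = y.conj() @ G_inv_prev                   # y^H Gamma^{-1}
    denom = gamma * phi / lam + np.vdot(y, Gy)   # scalar denominator
    return (G_inv_prev - np.outer(Gy, yG) / denom) / gamma
```

The update costs on the order of M squared operations per frame, whereas inverting the accumulator of Equation 18 directly would cost on the order of M cubed.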

(18) Therefore, the estimation equation of $\left(R_f^{(v)}(T)\right)^{-1}$ is as follows.

(19)
$$\left(R_f^{(v)}(T)\right)^{-1} = \left( \sum_{t} \gamma^{T-t} \lambda_{f,t}^{(v)} \right) \left(\Gamma_f^{(v)}(T)\right)^{-1}$$
[Equation 20]
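As a hedged aside, the scalar factor in Equation 20 need not be recomputed from all past frames. Writing $s_f^{(v)}(T)$ for the weighted sum (a symbol introduced here for illustration only, not used in the original), the same splitting of the sum at the newest frame that yields Equation 18 gives:

$$s_f^{(v)}(T) = \sum_{t=1}^{T} \gamma^{T-t} \lambda_{f,t}^{(v)} = \gamma\, s_f^{(v)}(T-1) + \lambda_{f,T}^{(v)}, \qquad \left(R_f^{(v)}(T)\right)^{-1} = s_f^{(v)}(T) \left(\Gamma_f^{(v)}(T)\right)^{-1}$$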

(20) In order to prevent the covariance matrix $R_f^{(v)}$ from becoming too large or too small to converge, the covariance matrix is normalized as follows. Herein, ⊙ indicates element-wise multiplication. The parameter ρ adjusts the value by which the unit (identity) matrix added to the diagonal is scaled; in the present invention, the parameter ρ is fixed as ρ=−1.

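(21)
$$R_f^{(v)} \leftarrow \frac{R_f^{(v)} + \left(R_f^{(v)}\right)^{H}}{2}, \qquad \eta = \operatorname{mean}\left( \operatorname{diag}\left(R_f^{(v)}\right) \odot \operatorname{diag}\left(R_f^{(v)}\right) \right),$$
$$R_f^{(v)} \leftarrow \frac{1}{\eta} R_f^{(v)} + 10^{\rho} I, \qquad R_f^{(v)} \leftarrow \frac{1}{\left(\det\left(R_f^{(v)}\right)\right)^{1/M}} R_f^{(v)}$$
[Equation 21]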
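By way of a non-limiting sketch, the four normalization steps of Equation 21 may be applied in sequence as follows. NumPy is assumed and the function name `normalize_covariance` is hypothetical. The sketch takes η as the mean of the element-wise product of the diagonal with itself, which is one reading of the equation, and assumes a positive semidefinite input so that the determinant after diagonal loading is positive.

```python
import numpy as np

def normalize_covariance(R, rho=-1.0):
    """Apply the four steps of Equation 21 to an (M, M) covariance matrix."""
    M = R.shape[0]
    R = 0.5 * (R + R.conj().T)                 # enforce Hermitian symmetry
    d = np.real(np.diag(R))                    # diagonal is real after symmetrizing
    eta = np.mean(d * d)                       # mean of diag(R) (.) diag(R)
    R = R / eta + (10.0 ** rho) * np.eye(M)    # rescale, then load the diagonal
    det = np.real(np.linalg.det(R))
    return R / det ** (1.0 / M)                # scale to unit determinant
```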

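(22) The covariance matrix to be estimated for the steering vector calculation can also be obtained recursively in each time frame, as follows. Herein, $\Gamma_f^{\prime(v)}$ denotes the accumulator formed without the $1/\phi_{f,t}^{(v)}$ weighting, as distinguished from $\Gamma_f^{(v)}$ of Equation 18.

(23)
$$\left( \sum_{t=1}^{T} \gamma^{T-t} \lambda_{f,t}^{(v)} \right) R_f^{(v)}(T) = \gamma \left( \sum_{t=1}^{T-1} \gamma^{(T-1)-t} \lambda_{f,t}^{(v)} \right) R_f^{(v)}(T-1) + \lambda_{f,T}^{(v)}\, y_{f,T}\, y_{f,T}^{H}$$
$$\Gamma_f^{\prime(v)}(T) = \gamma\, \Gamma_f^{\prime(v)}(T-1) + \lambda_{f,T}^{(v)}\, y_{f,T}\, y_{f,T}^{H}$$
$$R_f^{(v)}(T) = \frac{1}{\sum_{t=1}^{T} \gamma^{T-t} \lambda_{f,t}^{(v)}}\, \Gamma_f^{\prime(v)}(T)$$
$$R_f^{(x)} = R_f^{(x+n)} - R_f^{(n)}$$
[Equation 22]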
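As a non-limiting illustration, the recursive covariance estimate of Equation 22 may be sketched as below, with the accumulator and the weight sum both updated by the forgetting factor. NumPy is assumed, and the function names and the explicit running weight sum `s_prev` are hypothetical bookkeeping introduced for this sketch.

```python
import numpy as np

def update_covariance(G_prev, s_prev, y, lam, gamma):
    """One step of Equation 22 for a single frequency bin and class v.

    G_prev : (M, M) accumulator at frame T-1
    s_prev : scalar sum_t gamma^{(T-1)-t} lambda_t at frame T-1
    Returns the new accumulator, the new weight sum, and R(T).
    """
    G = gamma * G_prev + lam * np.outer(y, y.conj())
    s = gamma * s_prev + lam
    return G, s, G / s

def target_covariance(R_xn, R_n):
    """R^{(x)} = R^{(x+n)} - R^{(n)} (last line of Equation 22)."""
    return R_xn - R_n
```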

(24) The steering vector can be calculated as the eigenvector having the largest eigenvalue, obtained through eigendecomposition of the covariance matrix obtained as above. The parameter $\kappa_f^{(v)}$ can be fixed to a constant value to simplify the algorithm.
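By way of a hedged sketch, the principal eigenvector may be extracted with a Hermitian eigensolver. NumPy is assumed and the function name `steering_vector` is hypothetical. Note that an eigenvector is defined only up to a complex scale factor, so the returned vector has unit norm and an arbitrary phase.

```python
import numpy as np

def steering_vector(R_x):
    """Principal eigenvector of the (M, M) Hermitian covariance matrix R_x."""
    R_x = 0.5 * (R_x + R_x.conj().T)          # guard against round-off asymmetry
    eigvals, eigvecs = np.linalg.eigh(R_x)    # eigenvalues in ascending order
    return eigvecs[:, -1]                     # eigenvector of the largest eigenvalue
```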

(25) In addition, the initial values of $R_f^{(v)}$, $\left(\Gamma_f^{(v)}\right)^{-1}$, and $\Gamma_f^{\prime(v)}$ must be set when the algorithm starts. Various initialization methods are possible, and the following four methods are given as examples. $\alpha_f^{(v)}$, $\beta_f^{(v)}$, and $\eta_f^{(v)}$ are parameters acting as weighting factors that depend on the frequency and on v. In the present invention, the initialization is performed by using the equations described in the entire algorithm of FIG. 1.

(1) Method 1: This method is in accordance with Equation 23.
$$R_f^{(x+n)}(0) = \alpha_f^{(v)} \left( y_{f,1}\, y_{f,1}^{H} \right), \qquad R_f^{(n)}(0) = \alpha_f^{(v)} I$$
$$\left(\Gamma_f^{(x+n)}(0)\right)^{-1} = \beta_f^{(v)} \left( y_{f,1}\, y_{f,1}^{H} \right)^{-1}, \qquad \left(\Gamma_f^{(n)}(0)\right)^{-1} = \beta_f^{(v)} \left( y_{f,1}\, y_{f,1}^{H} \right)^{-1}$$
$$\Gamma_f^{\prime(x+n)}(0) = \eta_f^{(v)} \left( y_{f,1}\, y_{f,1}^{H} \right), \qquad \Gamma_f^{\prime(n)}(0) = \eta_f^{(v)} \left( y_{f,1}\, y_{f,1}^{H} \right)^{-1}$$
[Equation 23]

(26) (2) Method 2: This method is in accordance with Equation 24.
$$R_f^{(x+n)}(0) = \alpha_f^{(v)} \left( y_{f,1}\, y_{f,1}^{H} \right), \qquad R_f^{(n)}(0) = \alpha_f^{(v)} I$$
$$\left(\Gamma_f^{(x+n)}(0)\right)^{-1} = \beta_f^{(v)} \left( y_{f,1}\, y_{f,1}^{H} \right)^{-1}, \qquad \left(\Gamma_f^{(n)}(0)\right)^{-1} = \beta_f^{(v)} I$$
$$\Gamma_f^{\prime(x+n)}(0) = \eta_f^{(v)} \left( y_{f,1}\, y_{f,1}^{H} \right), \qquad \Gamma_f^{\prime(n)}(0) = \eta_f^{(v)} I$$
[Equation 24]

(27) (3) Method 3: This method takes only the diagonal elements of the initial values given by Methods 1 and 2 described above.
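As a hedged illustration of one possible reading of Method 3, the sketch below keeps only the diagonal of the first-frame outer product and inverts that diagonal element by element. The rank-one outer product of a single frame is singular for more than one microphone, whereas its diagonal part is invertible whenever every channel has nonzero power. NumPy is assumed; the function name `diagonal_init`, the scalar stand-ins `alpha` and `beta` for the weighting factors, and the floor `eps` are all hypothetical and not part of the original disclosure.

```python
import numpy as np

def diagonal_init(y1, alpha=1.0, beta=1.0, eps=1e-10):
    """Diagonal-only initial values built from the first frame y_{f,1}.

    alpha, beta : stand-ins for the weighting factors alpha_f^{(v)}, beta_f^{(v)}
    eps         : assumed floor (not in the source) to avoid division by zero
    """
    power = np.abs(y1) ** 2 + eps            # diagonal of y1 y1^H, floored
    R0 = alpha * np.diag(power)              # diagonal part of alpha * y1 y1^H
    G_inv0 = beta * np.diag(1.0 / power)     # inverse of that diagonal part
    return R0, G_inv0
```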

(28) (4) Method 4: This method uses the steering information of the target signal in a case where the target signal is known.

(29) The overall algorithm using the above results is illustrated in FIG. 1, where $N_t$ is the total number of time frames. FIG. 1 is a flowchart illustrating an algorithm for a steering vector estimation method according to a preferred embodiment of the present invention.

(30) Hereinafter, the MVDR beamforming which is a representative application example using the steering vector according to the present invention will be described in detail.

(31) The MVDR beamforming for noise removal is performed by using a steering vector $r_{f,t}$ and a noise covariance matrix $R_f^{(n)}$, both of which are recursively estimated. The MVDR beamforming estimates filter coefficients such that the gain of a signal arriving from the direction of the estimated steering vector is maintained at 1 while the gains of signals arriving from the other directions are reduced. This estimation can be expressed by Equation 25.

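(32)
$$\hat{\omega} = \underset{\omega}{\operatorname{argmin}} \left( \omega^{H} R_n\, \omega \right) \quad \text{s.t.} \quad \omega^{H} r = 1 \quad \rightarrow \quad \hat{\omega} = \frac{R_n^{-1} r}{r^{H} R_n^{-1} r}$$
[Equation 25]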
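As a non-limiting illustration, the closed-form solution of Equation 25 may be computed for one frequency bin as follows. NumPy is assumed, and the function names `mvdr_weights` and `apply_beamformer` are hypothetical. A linear solve is used in place of an explicit matrix inverse, which is a design choice of this sketch.

```python
import numpy as np

def mvdr_weights(R_n, r):
    """w = R_n^{-1} r / (r^H R_n^{-1} r)  (Equation 25).

    R_n : (M, M) Hermitian positive-definite noise covariance matrix
    r   : (M,) complex steering vector
    """
    Rinv_r = np.linalg.solve(R_n, r)
    return Rinv_r / np.vdot(r, Rinv_r)

def apply_beamformer(w, y):
    """Beamformer output w^H y for one time-frequency bin."""
    return np.vdot(w, y)
```

With these weights, a signal equal to the steering vector itself passes with unit gain, which is the distortionless constraint of Equation 25.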

(33) Therefore, when the MVDR beamforming is performed by using the recursively estimated steering vector and the recursively estimated covariance matrix of noise using the CGMM in each time frame, the noise can be effectively removed.

(34) The MVDR results obtained by using the CHiME4 data are as follows:

(35) FIG. 2 is a graph illustrating an MVDR result obtained by using the CHiME4 data in a simulation environment in the steering vector estimation method according to the present invention. FIG. 3 is a graph illustrating an MVDR result obtained by using the CHiME4 data in a real environment in the steering vector estimation method according to the present invention.

(36) While the present invention has been particularly illustrated and described with reference to exemplary embodiments thereof, it should be understood by the skilled in the art that the invention is not limited to the disclosed embodiments, but various modifications and applications not illustrated in the above description can be made without departing from the spirit of the invention. In addition, differences relating to the modifications and applications should be construed as being included within the scope of the invention as set forth in the appended claims.