ADAPTIVE DELAY DIVERSITY FILTER AND ECHO CANCELLATION APPARATUS AND METHOD USING THE SAME
20220053268 · 2022-02-17
Inventors
Cpc classification
G10K11/178
PHYSICS
G10L21/02
PHYSICS
International classification
Abstract
An adaptive filter and an echo cancellation apparatus and method using the same, utilize an adaptive delay diversity filter for coping with the situation in which variation in a time delay between a reference signal and a target signal of an adaptive filter is large. The adaptive delay diversity filter includes multiple sub-filters, each of which generates output through filtering by utilizing a signal, obtained by applying different time delay values to the reference signal, as a reference signal, and selects the best output value among generated output values of the multiple sub-filters, as an output value of the adaptive delay diversity filter. Further, the adaptive delay diversity filter includes a means for reducing a computational load.
Claims
1. An adaptive filter, comprising: a transmit signal x(n) transmitted from a transmitting end; an input signal y(n) collected at a receiving end; a desired signal included in the y(n); and an adaptive delay diversity filter for estimating the desired signal using the x(n) as a reference signal and using the y(n) as a target signal, wherein the adaptive delay diversity filter comprises a plurality of sub-filters, wherein the plurality of sub-filters is composed of M sub-filters (where M is a natural number greater than 1), wherein each sub-filter of the plurality of sub-filters is implemented as an adaptive filter, wherein each of the sub-filters uses the y(n) as a target signal, wherein each of the sub-filters uses signals x.sub.j(n) (j=1, . . . , M), obtained by time-delaying the x(n) using different delay values for respective sub-filters, as respective reference signals, wherein each of the signals x.sub.j(n) is configured to satisfy Equation 1, wherein Equation 1 is x.sub.j(n)=x(n−D.sub.j), j=1, . . . , M, where a time delay value D.sub.j is an integer satisfying 0≤D.sub.1<D.sub.2< . . . <D.sub.M, and wherein the adaptive delay diversity filter computes output values of respective sub-filters through filtering with respect to each input value, and selects one of the computed multiple output values as an output value of the adaptive delay diversity filter.
2. The adaptive filter of claim 1, wherein: the plurality of sub-filters is configured such that the time delay value D.sub.j used to derive the x.sub.j(n) (j=1, . . . , M) from the x(n) satisfies Equation 2, Equation 2 is D.sub.j=(j−1)D+D.sub.const, j=1, . . . , M, where D is a constant having a positive integer value and D.sub.const is a constant having an integer value, a time difference between reference signals used by two adjacent sub-filters based on Equation 2 corresponds to D samples, and filter lengths of respective sub-filters have an identical value.
3. The adaptive filter of claim 2, wherein: in a case where an adaptive filter algorithm for calculating filter coefficients per sample is applied to each sub-filter of the plurality of sub-filters, when there is an adaptive filter algorithm in which a formula v(n) represented only by the reference signal x(n) of the adaptive filter exists in a filter coefficient calculation formula present in the algorithm and in which v(n) is represented by a function f(⋅) of the x(n), as given by Equation 3, the Equation 3 is represented by v(n)=f(x(n)), the v(n) is one of a scalar, a vector, and a matrix, n is a sample number, when v(n), obtained when the adaptive filter algorithm is applied to each of the plurality of sub-filters, is represented by v.sub.j(n) (j=1, . . . , M), a value of v.sub.1(n) is computed only for a first sub-filter (j=1) among the plurality of sub-filters, and a value v.sub.j(n) (j=2, . . . , M) for remaining sub-filters other than the first sub-filter is obtained using a time delay relational expression in Equation 4, Equation 4 is v.sub.1(n)=v.sub.j−1(n−D), j=2, . . . , M, and D in Equation 4 is a value identical to D in Equation 2.
4. The adaptive filter of claim 2, wherein: when an adaptive filter algorithm for calculating filter coefficients per frame is applied to each sub-filter of the plurality of sub-filters, new N input data samples flow in the adaptive filter in each frame and are used to form a current frame, when D is an integer multiple of N and satisfies Equation 5, Equation 5 is D=BN, where B is a positive integer value, D in Equation 5 is a value identical to D in Equation 2, values corresponding to the current frame of the reference signal x(n) of the adaptive filter are collected and a value generated by including values of past frames of x(n) in the collected values is represented by X(m), m is a frame number, if there is an adaptive filter algorithm in which a formula v(m) represented only by the X(m) exists in a filter coefficient calculation formula present in the adaptive filter algorithm and in which v(m) is represented by a function f(⋅) of the X(m), as given by Equation 6, Equation 6 is represented by v(m)=f(X(m)), the v(m) is one of a scalar, a vector, and a matrix, if v(m), obtained when the adaptive filter algorithm is applied to each of the plurality of sub-filters, is represented by v.sub.j(m) (j=1, . . . , M), a value of v.sub.1(m) is computed only for a first sub-filter (j=1) among the plurality of sub-filters, and a value of v.sub.j(m) (j=2, . . . , M) for remaining sub-filters other than the first sub-filter is obtained using a time delay relational expression in Equation 7, and Equation 7 is v.sub.j(m)=v.sub.j−1(m−B), j=2, . . . , M.
5. The adaptive filter of claim 1, wherein the adaptive delay diversity filter is configured to: deactivate an active sub-filter according to a designated condition, and allow only active sub-filters among the plurality of sub-filters to participate in computation of output values and allow deactivated sub-filters to be excluded from computation of output values while computing output values using the plurality of sub-filters.
6. The adaptive filter of claim 5, wherein the adaptive delay diversity filter is configured to compute performance metric values for respective active sub-filters among the plurality of sub-filters, find a maximum value or a minimum value among the computed performance metric values, and thereafter perform an operation of deactivating a sub-filter having the maximum performance metric value or the minimum performance metric value.
7. The adaptive filter of claim 5, wherein the adaptive delay diversity filter is configured to perform an operation of deactivating a sub-filter according to the designated condition until at least one active sub-filter or a predefined number of active sub-filters among the plurality of sub-filters remains.
8. The adaptive filter of claim 5, wherein the adaptive delay diversity filter is configured to control deactivation of a sub-filter so that a number of sub-filters that perform adaptation among the plurality of sub-filters becomes less than or equal to a predefined number of sub-filters.
9. An electronic device, comprising: a speaker output signal x(n) collected at an audio output terminal; a microphone input signal y(n) collected at the microphone; an echo included in the y(n); and an echo canceller for cancelling the echo using the x(n) as a reference signal and using the y(n) as a target signal, wherein the echo canceller comprises a plurality of sub-filters, wherein the plurality of sub-filters is composed of M sub-filters (where M is a natural number greater than 1), wherein each sub filter of the plurality of sub-filters is implemented as an adaptive filter, wherein each of the sub-filters uses the y(n) as a target signal, wherein each of the sub-filters uses signals x.sub.j(n) (where j=1, . . . , M), obtained by time-delaying the x(n) using different delay values for respective sub-filters, as respective reference signals, wherein each of the signals x.sub.j(n) is configured to satisfy Equation 8, wherein Equation 8 is x.sub.j(n)=x(n−D.sub.j), j=1, . . . , M, where a time delay value D.sub.j is an integer satisfying 0≤D.sub.i<D.sub.2< . . . <D.sub.M, and wherein the echo canceller computes output values of respective sub-filters through filtering with respect to each input value, and selects one of the computed multiple output values as an output value of the echo canceller.
10. The electronic device of claim 9, wherein: the plurality of sub-filters is configured such that the time delay value D.sub.j used to derive the x.sub.j(n) (j=1, . . . , M) from the x(n) satisfies Equation 9, Equation 9 is D.sub.j=(j−1)D+D.sub.const, j=1, . . . , M, where D is a constant having a positive integer value and D.sub.const is a constant having an integer value, a time difference between reference signals used by two adjacent sub-filters based on Equation 9 corresponds to D samples, and filter lengths of respective sub-filters have an identical value.
11. The electronic device of claim 10, wherein: in a case where an adaptive filter algorithm for calculating filter coefficients per sample is applied to each sub-filter of the plurality of sub-filters, when there is an adaptive filter algorithm in which a formula v(n) represented only by the reference signal x(n) of the adaptive filter exists in a filter coefficient calculation formula present in the algorithm and in which v(n) is represented by a function f(⋅) of the x(n), as given by Equation 10, Equation 10 is represented by v(n)=f(x(n)), the v(n) is one of a scalar, a vector, and a matrix, n is a sample number, when v(n), obtained when the adaptive filter algorithm is applied to each of the plurality of sub-filters, is represented by v.sub.j(n) (j=1, . . . , M), a value of v.sub.1(n) is computed only for a first sub-filter (j=1) among the plurality of sub-filters, and a value of v.sub.j(n) (j=2, . . . , M) for remaining sub-filters other than the first sub-filter is obtained using a time delay relational expression in Equation 11, Equation 11 is v.sub.j(n)=v.sub.j−1(n−D), j=2, . . . , M, and D in Equation 11 is a value identical to D in Equation 9.
12. The electronic device of claim 10, wherein: when an adaptive filter algorithm for calculating filter coefficients per frame is applied to each sub-filter of the plurality of sub-filters, new N input data samples flow in the adaptive filter in each frame and are used to form a current frame, when D is an integer multiple of N and satisfies Equation 12, Equation 12 is D=BN, where B is a positive integer value, D in Equation 12 is a value identical to D in Equation 9, values corresponding to the current frame of the reference signal x(n) of the adaptive filter are collected and a value generated by including values of past frames of x(n) in the collected values is represented by X(m), m is a frame number, if there is an adaptive filter algorithm in which a formula v(m) represented only by the X(m) exists in a filter coefficient calculation formula present in the adaptive filter algorithm and in which v(m) is represented by a function f(⋅) of the X(m), as given by Equation 13, Equation 13 is represented by v(m)=f(X(m)), the v(m) is one of a scalar, a vector, and a matrix, if v(m), obtained when the adaptive filter algorithm is applied to each of the plurality of sub-filters, is represented by v.sub.j(m) (j=1, . . . , M), a value of v.sub.1(m) is computed only for a first sub-filter (j=1) among the plurality of sub-filters, and a value of v.sub.j(m) (j=2, . . . , M) for remaining sub-filters other than the first sub-filter is obtained using a time delay relational expression in Equation 14, and Equation 14 is v.sub.j(m)=v.sub.j−1(m−B), j=2, . . . , M.
13. The electronic device of claim 9, wherein the echo canceller is configured to: deactivate an active sub-filter according to a designated condition, and allow only active sub-filters among the plurality of sub-filters to participate in computation of output values and allow deactivated sub-filters to be excluded from computation of output values while computing output values using the plurality of sub-filters.
14. The electronic device of claim 13, wherein the echo canceller is configured to compute performance metric values for respective active sub-filters among the plurality of sub-filters, find a maximum value or a minimum value among the computed performance metric values, and thereafter perform an operation of deactivating a sub-filter having the maximum performance metric value or the minimum performance metric value.
15. The electronic device of claim 13, wherein the echo canceller is configured to perform an operation of deactivating a sub-filter according to the designated condition until at least one active sub-filter or a predefined number of active sub-filters among the plurality of sub-filters remains.
16. The electronic device of claim 13, wherein the echo canceller is configured to control deactivation of a sub-filter so that a number of sub-filters that perform adaptation among the plurality of sub-filters becomes less than or equal to a predefined number of sub-filters.
17. An echo cancellation method, comprising: collecting a speaker output signal x(n) at an audio output terminal; outputting the speaker output signal x(n) through the audio output terminal and thereafter collecting a microphone input signal y(n) through the microphone; and allowing an echo canceller to cancel an echo included in the y(n) using the x(n) as a reference signal and using the y(n) as a target signal, wherein the echo cancellation method further comprises: configuring the echo canceller using a plurality of sub-filters, wherein the plurality of sub-filters includes M sub-filters (where M is a natural number greater than 1), configuring each sub-filter of the plurality of sub-filters as an adaptive filter; allowing the sub-filters to obtain delayed signals x.sub.j(n) (j=1, . . . , M) by time-delaying the x(n) using different delay values for respective sub-filters, wherein obtaining the delayed signals x.sub.j(n) is configured to satisfy Equation 15, and Equation 15 is x.sub.j(n)=x(n−D.sub.j), j=1, . . . , M, where a time delay value D.sub.j is an integer satisfying 0≤D.sub.1<D.sub.2< . . . <D.sub.M; configuring the inputs of each sub-filter using the y(n) as a target signal and using each of the delayed signals x.sub.j(n) as a reference signal; and performing an output value computation operation of computing output values of respective sub-filters through filtering with respect to each input value, and selecting one of the computed multiple output values as an output value of the echo canceller.
18. The echo cancellation method of claim 17, wherein: obtaining the delayed signals x.sub.j(n) (j=1, . . . , M) comprises computing the time delay value D.sub.j used to derive the x.sub.j(n) from the x(n) using Equation 16, Equation 16 is D.sub.j=(j−1)D+D.sub.const, j=1, . . . , M, where D is a constant having a positive integer value and D.sub.const is a constant having an integer value, a time difference between reference signals used by two adjacent sub-filters based on Equation 16 corresponds to D samples, and configuring the inputs of each sub-filter comprises allowing filter lengths of respective sub-filters to have an identical value.
19. The echo cancellation method of claim 18, further comprising performing a filter coefficient calculation operation, wherein: in a case where an adaptive filter algorithm for calculating filter coefficients per sample is applied to each sub-filter of the plurality of sub-filters, when there is an adaptive filter algorithm in which a formula v(n) represented only by the reference signal x(n) of the adaptive filter exists in a filter coefficient calculation formula present in the algorithm and in which the v(n) is represented by a function f(⋅) of the x(n), as given by Equation 17, Equation 17 is represented by v(n)=f(x(n)), the v(n) is one of a scalar, a vector, and a matrix, n is a sample number, when v(n), obtained when the adaptive filter algorithm is applied to each of the plurality of sub-filters, is represented by v.sub.j(n) (j=1, . . . , M), a value of v.sub.1(n) is computed only for a first sub-filter (j=1) among the plurality of sub-filters, and a value of v.sub.j(n) (j=2, . . . , M) for remaining sub-filters other than the first sub-filter is obtained using a time delay relational expression in Equation 18, Equation 18 is v.sub.j(n)=v.sub.j−1(n−D), j=2, . . . , M, and D in Equation 18 is a value identical to D in Equation 16.
20. The echo cancellation method of claim 18, further comprising performing a filter coefficient calculation operation, wherein: when an adaptive filter algorithm for calculating filter coefficients per frame is applied to each sub-filter of the plurality of sub-filters, new N input data samples flow in the adaptive filter in each frame and are used to form a current frame, when D is an integer multiple of N and satisfies Equation 19, Equation 19 is D=BN, where B is a positive integer value, D in Equation 19 is a value identical to D in Equation 16, values corresponding to the current frame of the reference signal x(n) of the adaptive filter are collected and a value generated by including values of past frames of x(n) in the collected values is represented by X(m), m is a frame number, if there is an adaptive filter algorithm in which a formula v(m) represented only by the X(m) exists in a filter coefficient calculation formula present in the adaptive filter algorithm and in which v(m) is represented by a function f(⋅) of the X(m), as given by Equation 20, Equation 20 is represented by v(m)=f(X(m)), the v(m) is one of a scalar, a vector, and a matrix, if v(m), obtained when the adaptive filter algorithm is applied to each of the plurality of sub-filters, is represented by v.sub.j(m) (j=1, . . . , M), a value of v.sub.1(m) is computed only for a first sub-filter (j=1) among the plurality of sub-filters, and a value of v.sub.j(m) (j=2, . . . , M) for remaining sub-filters other than the first sub-filter is obtained using a time delay relational expression in Equation 21, and Equation 21 is v.sub.j(m)=v.sub.j−1(m−B), j=2, . . . , M.
21. The echo cancellation method of claim 17, wherein performing the output value computation operation comprises: performing a pruning operation of deactivating an active sub-filter according to a designated condition; and allowing only active sub-filters among the plurality of sub-filters to participate in computation of output values and allowing deactivated sub-filters to be excluded from a procedure for computing the output values while computing the output values using the plurality of sub-filters.
22. The echo cancellation method of claim 21, wherein performing the pruning operation comprises: computing performance metric values for respective active sub-filters among the plurality of sub-filters; finding a maximum value or a minimum value among the computed performance metric values; and deactivating a sub-filter having the maximum performance metric value or the minimum performance metric value.
23. The echo cancellation method of claim 21, wherein performing the pruning operation comprises deactivating a sub-filter according to the designated condition until at least one active sub-filter or a predefined number of active sub-filters among the plurality of sub-filters remains.
24. The echo cancellation method of claim 21, wherein performing the pruning operation comprises controlling deactivation of a sub-filter so that a number of sub-filters that perform adaptation among the plurality of sub-filters becomes less than or equal to a predefined number of sub-filters.
25. The adaptive filter of claim 1, wherein the adaptive delay diversity filter is configured to: compute an error signal e.sub.j(n) (j=1, . . . , M) between a target estimation signal ŷ.sub.j(n) (j=1, . . . , M) that is computed for each sub-filter through filtering and a target input signal y(n), compute performance metric values p.sub.j(n) (j=1, . . . , M) of respective sub-filters using the above signals, select a sub-filter corresponding to a maximum value or a minimum value from among the performance metric values {p.sub.1(n), . . . , p.sub.M(n)} as a best sub-filter, and select an output value of the best sub-filter as an output value of the adaptive delay diversity filter.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0028]
[0029]
[0030]
[0031]
[0032]
[0033]
[0034]
[0035]
[0036]
[0037]
[0038]
[0039]
[0040]
[0041]
[0042]
[0043]
[0044]
[0045]
[0046]
[0047]
[0048]
[0049]
DETAILED DESCRIPTION
[0050] In the following description, it should be noted that only parts required for understanding of embodiments of the present disclosure will be described, and explanation of other parts will be omitted so as to avoid making the gist of the present disclosure unclear.
[0051] It should be noted that the terms and words used in the specification and the claims, which will be described below, should not be construed as being limited to ordinary meanings or dictionary definitions, and an inventor can appropriately define the concepts of terms in order to best describe his or her invention. Meanwhile, the embodiments described in the present specification and the configurations illustrated in the drawings are merely preferable embodiments and do not exhaustively present the technical spirit of the present disclosure. Accordingly, it should be appreciated that there may be various equivalents and modifications that can replace the embodiments and the configurations at the time at which the present application is filed.
[0052] Hereinafter, embodiments of the present disclosure will be described in detail with reference to the accompanying drawings.
[0053]
[0054] In
[0055] The present disclosure relates to a method for efficiently estimating a system, which has an impulse response with large t.sub.r variation, through the use of an adaptive filter whenever an impulse response is measured. In particular, the present disclosure presents a method for yielding the same effect as that obtained when estimating the entire period of the impulse response by estimating a t.sub.d interval using a small sub-filter, instead of directly estimating the entire period of the impulse response.
[0056]
[0057] Referring to
[0058] The memory 110 may store various programs and various types of data related to the operation of the electronic device. For example, the memory 110 may store an operating system related to the operation of the electronic device 100, applications related to various user functions provided based on the electronic device 100, and various types of data related to execution of the applications. In an example, the memory 110 may store a music playback application related to music playback and a speech (voice) recognition application related to speech recognition. Further, the memory 110 may store applications for supporting various functions, such as a recording function, a video playback function, and a remote control function for an external electronic device. The memory 110 may store an echo canceller including an adaptive filter according to an embodiment of the present disclosure, and may provide the echo canceller to the audio-processing unit 130 under the control of the processor 120. Alternatively, the echo canceller may be mounted in the audio-processing unit 130 in an embedded manner. Alternatively, the echo canceller may be implemented as at least one hardware processor or at least one software module.
[0059] The processor 120 may perform various types of signal processing related to the operation of the electronic device 100, e.g., transfer and processing of audio data. The processor 120 may be implemented using multiple sub-processors, or may be implemented to allow one processor to process multiple functions. In an example, the audio-processing unit 130 may be integrated into the processor 120. Here, the audio-processing unit 130 may be understood to be a component of the processor 120. The processor 120 may be supplied with power from a battery included in the electronic device 100 or an external power source connected thereto, and may perform various user functions using the corresponding power. For example, the processor 120 may perform various types of processing, such as playing music based on user input that is received through the input unit 150 or preset scheduling information, searching for an external electronic device and establishing a communication channel therewith, playing video, or accessing a server and receiving and outputting content provided by the server. In the present disclosure, the processor 120 may manage the application of an echo cancellation function to audio data produced during a specific application execution procedure by controlling the audio-processing unit 130.
[0060] The speaker 141 may be connected to the audio-processing unit 130, and may convert received audio data into an audio signal and output the audio signal, under the control of the audio-processing unit 130.
[0061] The microphone 142 may be connected to the audio-processing unit 130, and may be activated under the control of the audio-processing unit 130, after which the microphone 142 may collect an external audio signal, convert the external audio signal into audio data, and transfer the audio data to the audio-processing unit 130.
[0062] The display unit 160 may output various screens related to the operation of the electronic device 100. For example, the display unit 160 may output various screens, such as a screen related to music playback, a screen related to a speech recognition function, and a video playback screen. As an example, the display unit 160 may include a touch function, and in which case the display unit 160 may be operated as a component of the input unit.
[0063] The communication circuit 170 may support at least one communication function of the electronic device 100. For example, the communication circuit 170 may establish a base station communication channel in relation to a connection to a server. Alternatively, the communication circuit 170 may establish a short-range communication channel with an external electronic device. In relation to this, the communication circuit 170 may include at least one of a network communication module and a short-range communication module. As an example, the communication circuit 170 may receive audio data (e.g., sound source information) provided by a server (e.g., a sound source provision server) under the control of the processor 120, and may output the audio data to the speaker 141 through the audio-processing unit 130. Alternatively, the communication circuit 170 may receive audio data from an additional electronic device connected through the short-range communication channel and transfer the audio data to the audio-processing unit 130, or may transmit played audio data to the additional electronic device under the control of the processor 120. Further, the communication circuit 170 may also transmit speech data required in order for the server to perform speech recognition to the server, under the control of the processor 120. Furthermore, the communication circuit 170 may transmit the results of speech recognition to the server or to an external electronic device under the control of the processor 120.
[0064] The input unit 150 may support an input function of the electronic device 100. For example, the input unit 150 may include a keypad, a key button or the like. Alternatively, the electronic device 100 may include the configuration of the microphone 142, which is capable of performing a voice command function, as the input unit 150. A user may request the execution or termination of various user functions provided by the electronic device 100 through the input unit 150. For example, the input unit 150 may collect an input signal for executing a music playback function, an input signal for playback control during music playback, an input signal for establishing or releasing a connection to the server, a voice command input signal related to a speech recognition function, etc., and may transfer the collected input signals to the processor 120.
[0065] The audio-processing unit 130 may support the execution of applications related to audio-data processing, and may control at least one of the speaker 141 and the microphone 142 to output audio data or collect an audio signal during the application execution support procedure. The audio-processing unit 130 may perform function processing depending on execution of a specific application under the control of the processor 120, or may be authorized by the processor 120 to control the execution and function processing of a specific application. As described above, the audio-processing unit 130 may be implemented as a partial component of the processor 120, or may be implemented as a component independent of the processor 120 to communicate with the processor 120, thus performing function processing depending on execution of a specific application. Due to this implementation scheme, the audio-processing unit 130 may be regarded as a single processor. The audio-processing unit 130 may process the functions to be performed depending on the type of application that is requested to be executed or execution of the application, and various tasks or various tasks related to processing of audio data by applications currently being executed. For example, the audio-processing unit 130 may perform control such that the speaker 141 is activated in response to a request received from a music playback application and such that various types of audio processing are performed on audio data to be played, and thereafter processed audio data is output through the speaker 141.
[0066] Further, when a recording function or an application for controlling an external electronic device is executed, the audio-processing unit 130 may support audio processing such that the microphone 142 is activated to collect user speech information and various types of audio processing are performed on audio signals collected through the microphone 142, after which the processed audio signals are stored in the memory 110 or are transmitted to an external electronic device through the communication circuit 170.
[0067] In particular, when audio signals are collected through the microphone 142 during an audio data output process, the audio-processing unit 130 according to the present disclosure may perform echo cancellation on the collected audio signals by performing echo cancellation using an adaptive delay diversity filter, and thereafter may perform speech recognition on the corresponding audio signals. For example, the adaptive delay diversity filter may be configured such that at least a portion thereof is implemented in the form of hardware or at least a portion thereof is implemented in the form of a software module. When at least a portion of the adaptive delay diversity filter is implemented as a software module, the present disclosure may include memory, which stores the software module forming at least the portion of the adaptive delay diversity filter and a hardware processor component, which accesses the memory and executes the operation of the adaptive delay diversity filter.
[0068] Such a speech recognition process may be performed using, for example, a speech recognition database (DB) stored in the memory 110 or a speech recognition DB stored in the server, and may be configured to process various user functions based on the results of speech recognition. Referring to
[0069] The application block 131 may include an audio output unit 131a, a speech recognizer 131b, and an echo canceller 131c.
[0070] The audio output unit 131a may collect audio data according to execution of a music playback application stored in the memory 110, and may transfer the collected audio data to a sound source playback processing block 132a. Alternatively, the audio output unit 131a may transfer audio data received from the outside through the communication circuit 170 to the sound source playback processing block 132a. The audio output unit 131a may be activated according to execution of the sound source application, and may perform processing of the collected audio data.
[0071] The speech recognizer 131b may be activated according to execution of a specific application, and may control the supply of power to the microphone 142. For example, when an application supporting a speech recognition function (e.g., a music playback application, a navigation program or the like) is executed, the speech recognizer 131b may be activated, and may control the supply of power to the microphone 142 to collect audio signals. The speech recognizer 131b may analyze audio data transferred from a sound source collection processing block 132b, and may transfer the results of analysis of the audio data to the processor 120. Alternatively, the speech recognizer 131b may control the execution of a user function (e.g., playback control for a music playback program, or the like) corresponding to the results of analysis of the audio data.
[0072] The echo canceller 131c may collect first audio data transferred from the audio output unit 131a to the sound source playback processing block 132a and second audio data transferred from the sound source collection processing block 132b to the speech recognizer 131b. The echo canceller 131c may estimate and subtract an echo contained in the second audio data using the collected first audio data, and may provide the result of the estimation and subtraction to the speech recognizer 131b. During this process, the echo canceller 131c may perform filtering by operating an adaptive filter composed of a plurality of sub-filters. During this process, the echo canceller 131c may set a value corresponding to the acoustic path length (e.g., a physical distance or an acoustic distance between the speaker and the microphone) of the electronic device 100 to the filter length value of a sub-filter, and may configure the adaptive delay diversity filter using a plurality of such sub-filters. Through the configuration of the adaptive delay diversity filter using the plurality of sub-filters, whenever new input is received, respective sub-filters may generate output values through filtering, and the best output value among the output values of the plurality of sub-filters may be selected as a final output value.
[0073] Also, the echo canceller 131c may allow respective sub-filters to share calculation values (e.g., an inverse matrix of a correlation matrix in the case of a Wiener filter) used to compute filter coefficients of the adaptive filter, thus reducing the computational load required by respective sub-filters to compute coefficients. The echo canceller 131c may finally select one sub-filter by gradually reducing the number of sub-filters that are in an active state using a pruning method, thus reducing a computational load.
[0074] The audio input/output processing block 132 may include at least one sound source playback processing block 132a and at least one sound source collection processing block 132b. The sound source playback processing block 132a may be disposed between the speaker 141 and the audio output unit 131a. The sound source playback processing block 132a may include one or more blocks for providing various acoustic effects required for sound source playback. The sound source collection processing block 132b may be disposed between the microphone 142 and the speech recognizer 131b.
[0075]
[0076] The impulse response of the echo path illustrated in
[0077] When the signal transmission environment is regarded as being extended to a general signal transmission environment, without being limited to an audio environment, the impulse response of a transmission path from the transmission of a signal from a transmitting end to the reception of the signal at a receiving end in the environment may be similarly generalized, as illustrated in
[0078]
[0079] Referring to
[0080]
[0081] Referring to
[0082] Here, the filter length formed by all of the sub-filters may be set to a length equal to or greater than the length of the echo path (e.g., the sum of the non-acoustic time delay t.sub.r and the acoustic time delay t.sub.d).
[0083]
[0084] Meanwhile, respective sub-filters forming the adaptive delay diversity filter are implemented in the form of an adaptive filter, may adopt and use an arbitrary adaptive filter algorithm, and may perform adaptation independently of each other.
[0085]
[0086] Referring to
[0087] At step 303, the echo canceller 131c computes the inverse matrix of a correlation matrix for the first sub-filter. This inverse matrix is used to compute filter coefficients through adaptation of the first sub-filter at step 309_1. Delayed values of the inverse matrix are used for adaptation of the second sub-filter, the third sub-filter, . . . , at steps 309_2, . . . , 309_M.
[0088] Next, the echo canceller 131c may perform an adaptation and filtering procedure on respective sub-filters. For example, the echo canceller 131c may allocate an ID for computation of the first sub-filter (e.g., sub-filter ID=1) at step 305_1, and may check whether the first sub-filter is in an active state may be checked at step 307_1. If the first sub-filter is in an inactive state, the echo canceller 131c may stop computation for the first sub-filter. If the first sub-filter is in an active state, the echo canceller 131c computes filter coefficients through adaptation using the inverse matrix, obtained at step 303, at step 309_1, and performs filtering at step 311_1.
[0089] The echo canceller 131c may perform the above-described ID allocation, checking of activation or deactivation, application of the correlation matrix, and filtering for other sub-filters in the same manner. In this operation, the echo canceller 131c may simultaneously perform computations for respective sub-filters in parallel. For example, the echo canceller 131c may perform allocation of an ID to the second sub-filter (e.g., sub-filter=2, at step 305_2), check activation or deactivation of the second sub-filter (e.g., active? at step 307_2), perform adaptation using the delayed values of the inverse matrix of the correlation matrix directly computed for the first sub-filter (at step 309_2) and perform filtering (at step 313_2) while performing processing of computation for the first sub-filter. Furthermore, the echo canceller 131c may perform allocation of an ID to an M-th sub-filter (e.g., sub-filter=M, at step 305_M), check activation or deactivation of the M-th sub-filter (e.g., active? at step 307_M), perform adaptation using the delayed values of the inverse matrix of the correlation matrix (at step 309_M) and perform filtering (at step 313_M) so as to perform computation for the M-th sub-filter.
[0090] Next, the echo canceller 131c may select the best filter at step 313. For example, the echo canceller 131c may select the sub-filter having the best result by comparing the results of filtering by the first to M-th sub-filters with each other.
[0091] Next, the echo canceller 131c may reduce a computational load related to the operation of sub-filters by performing pruning for reducing the total number of active sub-filters at step 315. With using the pruning function, as illustrated in
[0092] As described above, respective sub-filters of the adaptive delay diversity filter are implemented and operated as adaptive filters, and the adaptive delay diversity filter selects the output of the sub-filter having the best result as final output while the sub-filters compete with each other whenever input is received for each time period. At this time, the inverse matrix of the correlation matrix used to obtain filter coefficients may be computed only for the first sub-filter, rather than for all sub-filters, and values obtained by delaying the values computed for the first sub-filter may be used by second and subsequent sub-filters. This is intended to exemplify the case where a Wiener filter is used, and computations may be reduced using a time delay relational expression such as that shown in Equation 30 or Equation 31, even when other algorithms are used.
[0093] Henceforth, equations will be introduced to help understand the detailed structure of the adaptive delay diversity filter, wherein, in equations, symbols written in lowercase italics denote scalars, symbols written in bold lowercase italics denote vectors, and symbols written in bold uppercase italics denote matrixes. In the following equations, j is used as a sub-filter number, wherein j=1, 2, . . . , M may be given. M is the number of sub-filters present in the adaptive delay diversity filter. In the following equations, n is a sample number in the time domain, and m is the number of a frame used in an algorithm for calculating filter coefficients by collecting samples in frame (or block) units, or in an algorithm operating in the frequency domain.
[0094] Considering an echo canceller operating in an audio signal environment when a reference signal of the adaptive delay diversity filter is denoted by x(n) and a target signal thereof is indicated by y(n), x(n) may be a signal output from the speaker, and y(n) may be a signal input to the microphone 142, wherein y(n) may contain an echo, noise, user speech, or the like. Considering a normal signal transmission environment, x(n) may be a transmit signal that is transmitted from a transmitting end, and y(n) may be a received signal at a receiving end, and y(n) may contain a modified signal of x(n), noise, an external interference signal, or the like.
[0095] When the reference signal of each sub-filter included in the adaptive delay diversity filter is indicated by x.sub.j(n) (where j=1, . . . , M), x.sub.j(n) may have the form of a time-delayed signal of x(n), and may be represented by the following Equation 1.
x.sub.j(n)=x(n−D.sub.j),j=1, . . . ,M
[0096] Here, the time delay D.sub.j is an integer satisfying 0≤D.sub.i<D.sub.2< . . . <D.sub.M.
[0097] When the impulse response of the echo path is estimated using the adaptive delay diversity filter, respective sub-filters use delayed signals x.sub.j(n), obtained by applying different time delay values D.sub.j to x(n), as shown in Equation 1, as respective reference signals, and thus the respective sub-filters may be arranged to cover different partial time areas of the impulse response.
[0098] If the reference signal vector of the j-th sub-filter is noted by x.sub.j(n) assuming that the filter length of each sub-filter included in the adaptive delay diversity filter is L.sub.j (where j=1, . . . , M), x.sub.j(n) may be represented by the following Equation 2.
x.sub.j(n)=[x.sub.j(n),x.sub.j(n−1), . . . ,x.sub.j(n−L.sub.j+1)].sup.T [Equation 2]
[0099] For example, in the case where an adaptive filter algorithm operating in the time domain is used, an echo estimate signal or a target estimate signal ŷ.sub.j(n) and an error signal e.sub.j(n) of the j-th sub-filter may be computed using the following Equation 3.
e.sub.j(n)=y(n)−ŷ.sub.j(n)
ŷ.sub.j(n)=Σ.sub.i=0.sup.L.sup.
[0100] w.sub.j,i denotes an i-th value among filter coefficients of the j-th sub-filter, and the filter coefficient vector w.sub.j of the j-th sub-filter may be defined by the following Equation 4.
w.sub.j=[W.sub.j,0 w.sub.j,1 . . . w.sub.j,L.sub.
[0101] After the target estimate signal ŷ.sub.j(n) and the error signal e.sub.j(n) of each sub-filter are computed, the target estimate signal ŷ(n) and the error signal e(n) of the adaptive delay diversity filter may be computed using the following Equation (5).
{circumflex over (y)}(n)=best{ŷ.sub.1(n), . . . ,ŷ.sub.M(n)}
e(n)=best{e.sub.1(n), . . . ,e.sub.M(n)} [Equation 5]
[0102] As shown in Equation 5, the best value among the target estimate signals {ŷ.sub.1(n), . . . , ŷ.sub.M(n)} of the M sub-filters may be selected as the target estimate signal ŷ(n) of the adaptive delay diversity filter, and the best value among the error signals {e.sub.1(n), . . . , e.sub.M(n)} of the M sub-filters may be selected as the error signal e(n) of the adaptive delay diversity filter. Here, the object used as a criterion for selecting the best value is a performance metric p.sub.j(n) (where j=1, . . . , M). The performance metric p.sub.j(n) is a value indicating the estimation performance of each sub-filter, and may be designed in various forms. Each sub-filter may perform adaptation using every new input signal, and may then compute the performance metric p.sub.j(n), wherein the input signal y(n) and the target estimate signal ŷ.sub.j(n) or the error signal e.sub.j(n) of each sub-filter may be used to compute p.sub.j(n). The p.sub.j(n) may be used as a criterion for selecting the best sub-filter from among multiple sub-filters and the best output. In Equation 5, best{⋅} means the value obtained from the sub-filter having the best performance metric value among the performance metric values {p.sub.1(n), . . . , p.sub.M(n)} of the respective sub-filters. Here, the meaning of “best” indicates the case where the performance metric value is an extreme value (minimum value or maximum value), wherein the “best” performance metric value may mean the maximum performance metric value or the minimum performance metric value according to the designed form of the performance metric. That is, when the performance metric is designed, it may be designed such that, as the performance metric value is larger, the estimation performance of the corresponding sub-filter is judged to be better, or such that, as the performance metric value is smaller, the estimation performance of the corresponding sub-filter is judged to be better. For example, when the performance metric is designed as shown in the following Equation 6, the input signals y(n) are identical for all sub-filters, and thus it may be considered that the performance of the corresponding sub-filter is better as the squared value of the error signal, that is, |e.sub.j(n)|.sup.2, is smaller. Therefore, the sub-filter having the lowest performance metric value, among respective performance metric values computed for multiple sub-filters, may be judged to have better estimation performance than other sub-filters.
[0103] In Equation 6, p.sub.j(n) may be obtained by normalizing the power |e.sub.j(n)|.sup.2 of the error signal to the power |y(n)|.sup.2 of the input signal.
[0104] Meanwhile, when the performance metric is designed, as shown in Equation 7, a sub-filter having the highest performance metric value may be judged to have better estimation performance than other sub-filters.
[0105] The performance metric may be designed in various modified forms in addition to the previous two examples, and may be designed to use the average value of the current value and past values, as shown in the following Equation 8 or Equation 9.
[0106] In Equation 8 or Equation 9, E{⋅} may specify expectation, and may be the average value of values in certain time intervals.
[0107] When the equation for the performance metric is defined by Equation 6 or Equation 8, the best sub-filter ID, i.e., best sub_filter_id and the worst sub-filter ID, i.e., worst_sub_filter_id, are given as follows.
[0108] In Equation 10, best sub_filter_id is the ID of the sub-filter corresponding to p.sub.j(n), which is the minimum value among values {p.sub.1(n), . . . , p.sub.M(n)}, and worst_sub_filter_id may be the ID of the sub-filter corresponding to p.sub.j(n), which is the maximum value p.sub.j(n) among the values {p.sub.1(n), . . . , p.sub.M(n)}.
[0109] When a performance metric equation is defined by Equation 7 or Equation 9, the best sub-filter ID, that is, best sub_filter_id, and the worst sub-filter ID that is, worst_sub_filter_id, are given as follows.
[0110] In Equation 11, best sub_filter_id is the ID of the sub-filter corresponding to p.sub.j(n), which is the maximum value among the values {p.sub.1(n), . . . , p.sub.M(n)}, and worst_sub_filter_id is the ID of the sub-filter corresponding to p.sub.j(n), which is the minimum value among the values {p.sub.1(n), . . . , p.sub.M(n)}.
[0111] Since the performance metric value may be computed either for each sample or once for each interval including several samples, it will be abbreviated to p.sub.j after a time factor n is omitted, for convenience of description.
[0112] After best_sub_filter_id has been detected from Equation 10 or Equation 11, the target estimate signal and the error signal of the best sub-filter best_sub_filter_id are selected as the target estimate signal ŷ(n) and the error signal e(n) of the adaptive delay diversity filter. The following Equation 12 is an equation representing this meaning.
ŷ(n)=best{ŷ.sub.1(n), . . . ,ŷ.sub.M(n)}=ŷ.sub.best_sub_filter_id(n)
e(n)=best{e.sub.1(n), . . . ,e.sub.M(n)}=e.sub.best_sub_filter_id(n) [Equation 12]
[0113]
[0114] The echo canceller 131c may be implemented using the adaptive delay diversity filter, wherein, in the situation in which the adaptive delay diversity filter is operated, filter coefficients of all sub-filters are updated for each time period, after which the output of the sub-filter (e.g., the best filter or best sub-filter) having the best estimation performance among the sub-filters may be selected as the output of the adaptive delay diversity filter.
[0115] Referring to
[0116] Each sub-filter may generate the echo estimate signal ŷ.sub.j(n) by filtering the reference signal x.sub.j(n), as shown in Equation 3, and may generate the error signal e.sub.j(n) of each sub-filter by subtracting the estimate signal ŷ.sub.j(n) from the microphone input signal y(n).
[0117] When the error signals e.sub.j(n) (where j=1, . . . , M) of respective sub-filters are obtained, the echo canceller 131c may obtain the performance metric values p.sub.j(n) (where j=1, . . . , M) of respective sub-filters using e.sub.j(n) and the input signal y(n), select the ID of the sub-filter corresponding to the maximum value (or minimum value), among the p.sub.j(n) values of the multiple sub-filters, as the best filter ID best_sub_filter_id, and thereafter determine the output value e.sub.j=best_sub_filter_id(n) of the selected sub-filter to be the output value e(n) of the adaptive delay diversity filter. The echo canceller may perform a pruning operation of deactivating a sub-filter having a relatively worse performance metric value, among the multiple sub-filters, so as to reduce a computational load.
[0118] In the adaptive delay diversity filter, whenever a new input is received, the multiple sub-filters generate outputs through filtering, and the best output is selected from among the outputs and is used as the output of the adaptive delay diversity filter. In this way, from the concept in which the best output is selected as the final filter output from among the outputs of respective sub-filters while respective sub-filters cover different time delay intervals of an echo path, the term “adaptive delay diversity filter” has been adopted.
[0119] As described above, the adaptive delay diversity filter uses the sub-filters having a filter length set to be less than the total length t.sub.r+t.sub.d of the entire echo path, but a sub-filter for best-estimating the impulse response in the interval t.sub.d, which is an interval having a substantially significant meaning in the entire echo path, may be found. That is, because the filter length of the sub-filter of the adaptive delay diversity filter may be tightly set in conformity with the delay value t.sub.d occurring in the acoustic path, the convergence time may be shorter than that of the case where one large filter is used as in a conventional method, and error occurring in the results of the filter may be reduced by preventing unnecessary filter coefficients from participating in computation.
[0120] Because a computational load occurring when M sub-filters are always operating in the adaptive delay diversity filter is considerable, a method for allowing some of the values computed for the first sub-filter to be shared with other sub-filters is proposed as a method for reducing the computational load. Here, the value to be shared is the result of a calculation formula including portions associated only with the reference signal x(n) in the equation for calculating filter coefficients.
[0121] In the above Equation 1, when the time delay D.sub.j, used to obtain the reference signal x.sub.j(n) of each sub-filter from x(n), and the filter length L.sub.j of each sub-filter satisfy the following Equation 13 and Equation 14, respectively, the time delay relational expression in the following Equation 16 is established, and some values, computed through the calculation formula required for computation of filter coefficients, are shared between respective sub-filters using the time delay relational expression, and thus a computational load may be reduced.
D.sub.j=(j−1)D+D.sub.const,j=1, . . . ,M [Equation 13]
[0122] Here, D is a constant having a positive integer value, and D.sub.const is a constant having an integer value.
[0123] The above Equation 13 indicates that reference signals used between adjacent sub-filters are time-delayed at regular intervals of D.
[0124] The following Equation 14 indicates that filter lengths of all sub-filters forming the adaptive delay diversity filter are identical to each other.
L.sub.1=L.sub.2= . . . =L.sub.M=L [Equation 14]
[0125] Based on the above Equation 1 and Equation 13, the following Equation 15 is established.
x.sub.j(n)=x.sub.j−1(n−D),j=2, . . . ,M [Equation 15]
[0126] In Equation 15, x.sub.j(n) is the reference signal of the j-th sub-filter.
[0127] When the condition of Equation 14 is added to the above Equation 15, the time delay relational expression of the following Equation 16 is established.
x.sub.j(n)=x.sub.j−i(n−D),j=2, . . . ,M [Equation 16]
[0128] In Equation 16, x.sub.j(n) is the reference signal vector of the j-th sub-filter and is as defined in Equation 2.
[0129] Therefore, when the filter lengths of respective sub-filters have the same value (e.g., L), as shown in the above Equations 13 and 14, and all of the time differences between reference signals used by respective sub-filters are identical to D, the relational expression of Equation 30 or Equation 31, which will be described later, may be obtained based on the above Equation 16, and a computational load may be reduced using the relational expression.
[0130]
[0131] In
[0132]
[0133] For example, when the solution of each sub-filter is obtained based on the Wiener filter, w.sub.j(n) may be computed using the following Equation 17.
w.sub.j(n)=R.sub.j.sup.−1(n)p.sub.j(n)
p.sub.j(n)=Σ.sub.i=0.sup.nλ.sup.(n−i)y(i)x.sub.j(i)
R.sub.j(n)=Σ.sub.i=0.sup.nλ.sup.(n−i)x.sub.j(i)x.sub.j.sup.T(i) [Equation 17]
[0134] In the above Equation 17, λ is a forgetting factor.
[0135] In order to obtain w.sub.j(n) from Equation 17, R.sub.j.sup.−1(n) of each sub-filter must be computed, wherein only x.sub.j(n) defined in the above Equation 2 is used as a variable value in the calculation formula of R.sub.j(n) in the above Equation 17. As illustrated in the above Equation 15, the reference signal of each sub-filter is a time-delayed version of the reference signal of the preceding sub-filter, and thus a time delay relational expression, such as that shown in the following Equation 18, may be obtained for R.sub.j(n) and R.sub.j.sup.−1(n) based on the time delay relational expression of the above Equation 16.
R.sub.j(n)=R.sub.j−1(n),j=2, . . . ,M
R.sub.j.sup.−1(n)=R.sub.j−1.sup.−1(n),j=2, . . . ,M [Equation 18]
[0136] Therefore, R.sub.1.sup.−1(n) may be directly obtained through computation, and the values of R.sub.2.sup.−1(n), . . . , R.sub.M.sup.−1(n) may be obtained with reference to the past value of R.sub.1.sup.−1(n) without being computed. Therefore, because w.sub.1(n) may be obtained using R.sub.1.sup.−1(n), which is directly obtained through computation, and w.sub.2(n), . . . , w.sub.M(n) may be obtained with reference to the past value of R.sub.1.sup.−1(n), the filter coefficients of respective sub-filters from the second sub-filter, other than the first sub-filter, may be efficiently calculated through the structure illustrated in
[0137]
[0138] For example, when the filter coefficients of respective sub-filters are calculated using the Recursive Least Square (RLS) algorithm, a formula for updating the RLS algorithm for the j-th sub-filter is represented by the following Equation 19.
[0139] Here, k.sub.j(n) is the Kalman gain vector of the j-th sub-filter, and R.sub.j.sup.−1(n) may be the inverse matrix of the correlation matrix for the j-th sub-filter.
[0140] As shown in the above Equation 15, the reference signal of each sub-filter is a time-delayed version of the reference signal of the preceding sub-filter, and thus the following Equation 20 may be established.
x.sub.j(n)=x.sub.j−1(n−D),j=2, . . . M
R.sub.j.sup.−1(n)=R.sub.j−1.sup.−1(n−D),j=2, . . . M
k.sub.j(n)=k.sub.j−1(n−D),j=2, . . . M [Equation 20]
[0141] Accordingly, when the filter coefficient of the first sub-filter is obtained, k.sub.1(n) and R.sub.1.sup.−1(n) must be directly computed using a formula present in the update equation of the RLS algorithm, but, when the filter coefficients of second and subsequent sub-filters are obtained, the filter coefficients may be calculated using the delayed values of k.sub.1(n) and R.sub.1.sup.−1(n) calculated by the first sub-filter. That is, by means of the structure illustrated in the drawing, calculation of the Kalman gain vector and the inverse matrix of the correlation matrix is performed only once, rather than M times, for the M sub-filters, thus reducing a computational load.
[0142]
[0143] Even in an adaptive filter operating in the frequency domain, some calculations may be skipped using the time delay relationship, as in the case of the previous examples in which the two algorithms are used. The case where the Partitioned Block Frequency Domain Adaptive Filter (PBFDAF) algorithm is applied to each sub-filter is described by way of example. The PBFDAF adaptive filter algorithm is well known to those skilled in the art, and is implemented such that, when a filter having a filter length of L=P×N is partitioned into P filters, each of which has a partition size of N samples in the time domain, the original filter may be regarded as the sum of the partitioned filters. Accordingly, the estimate signal ŷ(n) of the filter may be represented by the total sum of the estimation values of the P filters, as represented by the following Equation 21.
ŷ(n)=Σ.sub.i=0.sup.L−1x(n−i)h.sub.i=Σ.sub.p=0.sup.P−1Σ.sub.i=0.sup.N−1x(n−Np−i)h.sub.Np+i=Σ.sub.p=0.sup.P−1(x.sup.(p)(n)).sup.Th.sup.(p)
x.sup.(p)(n)=[x(n−Np),x(n−Np−1), . . . x(n−Np−N+1)].sup.T
h.sup.(p)=[h.sub.Np,h.sub.Np+1, . . . h.sub.Np+N−1].sup.T [Equation 21]
[0144] An error signal e(n) in the time domain and an error signal vector e(m) in an m-th frame (or block) may be defined by the following Equation 22.
e(n)=y(n)−ŷ(n)
y(m)=[y(mN), . . . ,y(mN+N−1)]
e(m)=[e(mN), . . . ,e(mN+N−1)] [Equation 22]
[0145] Here, m is a frame number, and one frame may be composed of N input samples. When the equation is converted into the frequency domain and is then arranged so as to obtain the filter coefficient for minimizing the error signal e(n), the PBFDAF algorithm operating in the frequency domain may be derived. As is well known to those skilled in the art, the basic update formula of the PBFDAF adaptive filter algorithm is given by the following Equation 23.
X.sup.(p)(m)=diag{F[x(mN−Np−N), . . . ,x(mN−Np+N−1)]}
X(m)=[X.sup.(0)(m),X.sup.(1)(m), . . . ,X.sup.(P−1)(m)]
S(m)=λS(m−1)+(1−λ)X.sup.H(m)G.sub.1X(m)
K(m)=(1−λ)S.sup.−1(m)X.sup.H(m)
e(m)=y(m)−G.sub.1X(m)h(m−1)
h(m)=h(m−1)+G.sub.2K(m)e(m) [Equation 23]
[0146] Here, X(m) is generated using only the reference signal, and may be defined, as shown in Equation 23. F may be a Discrete Fourier Transform (DFT) matrix used to perform a DFT. In the above Equation 23, G.sub.1 and G.sub.2 are matrixes having constant values, which are matrixes related to the DFT matrix F, and may be defined by the following Equation 24.
[0147] In the above Equation, K(m) may be a Kalman gain in the frequency domain.
[0148] The underlined symbols of values e(m) and y(m) in the time domain indicate values in the frequency domain, and may be defined by the following Equation 25.
[0149] As described above, when the PBFDAF algorithm is adopted and used by the adaptive delay diversity filter, the PBFDAF algorithm may be applied to each sub-filter of the adaptive delay diversity filter. Here, assuming that the number of partition blocks of the adaptive filter using the PBFDAF algorithm is P, the number of new input samples required to configure one frame is N, and a time shift or time delay D between reference signals used by respective sub-filters forming the adaptive delay diversity filter is an integer multiple of N, D may be defined by the following Equation 26.
D=BN (unit of D: sample, and B is a positive integer) [Equation 26]
[0150] Here, D may be a time delay value between reference signals used by adjacent sub-filters, as defined in the above Equation 13, and B may be construed as a time delay value in frame (or block) units, and may typically have values within a range of 0<B≤P. When X(m) and K(m), obtained when the update equation for the PBFDAF algorithm of Equation 23 is applied to each sub-filter, are denoted by X.sub.j(m) and K.sub.j(m) (j=1, . . . , M), X(m) is a matrix including only values associated with DFT values of the reference signals x(n), without including additional variables, and reference signals by two adjacent sub-filters have the time delay relationship, such as that shown in Equation 15, and thus the matrix X.sub.j(m) may be represented by the following Equation 27.
X.sub.j(m)=X.sub.j−1(m−B),j=2, . . . M [Equation 27]
[0151] Here, m is a frame number, and B is the time delay value between reference signals used by adjacent sub-filters, and has a value in frame units. That is, since the reference signals used by the two adjacent sub-filters have a time delay relationship corresponding to B frames, X.sub.j(m) having values associated with the DFT values of the reference signals may have the time delay relationship corresponding to the B frames. In the update Equation 23, variables participating in the Kalman gain K(m) are also associated only with the reference signal x(n), and thus the following Equation 28 may be established.
K.sub.j(m)=K.sub.j−1(m−B),j=2, . . . M [Equation 28]
[0152] Therefore, the first sub-filter directly calculates X.sub.1(m) and K.sub.1(m) thereafter obtains an error signal vector e.sub.1(m) and a filter coefficient vector h.sub.1(m), and sub-filters from the second sub-filter may calculate e.sub.j(m) and h.sub.j(m) using delayed values of XI(m) and K.sub.1(m) calculated by the first sub-filter. That is, through the use of a structure such as that illustrated in
[0153] The forgoing examples of three types of algorithms are summarized below. When a formula v(n) represented only by the reference signals x(n) is present in the equation for calculating filter coefficients in a certain adaptive filter algorithm in the case where the certain adaptive filter algorithm is adopted and applied to each sub-filter of the adaptive delay diversity filter having the structure satisfying Equation 13 and Equation 14, the formula v(n) may be represented by a function f(⋅) of x(n), as given by the following Equation 29. Here, v(n) may be one of a scalar, a vector, and a matrix.
v(n)=f(x(n)) [Equation 29]
[0154] When v(n), obtained when the algorithm is applied to each sub-filter present in the adaptive delay diversity filter, is represented by v.sub.j(n) (where j=1, . . . , M), v.sub.1(n) of the first sub-filter (j=1) may be directly obtained through calculation, and v.sub.j(n) at j=2, . . . , M may be obtained using the time delay relational expression, such as that shown in the following Equation 30.
V.sub.j(n)=v.sub.j−1(n−D),j=2, . . . M [Equation 30]
[0155] In the above Equation 30, D is a value defined in Equation 13 and is a value indicating the time shift or the delay between the reference signals used by adjacent sub-filters, and n is also the value in sample units.
[0156] In an algorithm for updating filter coefficients in frame (or block) units, for example, an algorithm for updating filter coefficients in block units in the time domain, or an algorithm operating in the frequency domain, the following Equation 31 may be applied.
V.sub.j(m)=v.sub.j−1(m−B),j=2, . . . M [Equation 31]
[0157] In Equation 31, m is a frame number, and B is a value that is defined in Equation 26, indicates the time difference between reference signals used by adjacent sub-filters, and is the value in frame units.
[0158]
[0159] The adaptive delay diversity filter is composed of a total of M sub-filters, and in the initial stage, all sub-filters are initiated in an active state, switch to an inactive state through a pruning operation, and sub-filters in the inactive state are excluded from all computation procedures.
[0160] When a description is made with reference to the attached drawings, sub-filters in an active state (active sub-filters) may acquire (shift in) target signals and reference signals, which are used as input data, at step 601, and may check whether all reference signals are 0 at step 603. When data in which at least some of the reference signals are not ‘0’ is input, the corresponding sub-filter performs adaptation at step 605. Here, adaptation refers to updating of coefficients of sub-filters to minimize an estimation error, and the method of adaptation is determined depending on the type of the adaptive filter algorithm that is adopted by the corresponding sub-filter. At step 607, filtering (ŷ.sub.j(n)=w.sub.j.sup.Tx.sub.j(n), and e.sub.j(n)=y(n)−ŷ.sub.j(n)) may be performed using the filter coefficients obtained at step 605. At step 603, when the entire reference data is 0, the corresponding sub-filter may branch to step 609 where bypassing of the input signals (e.sub.j(n)=ŷ.sub.j(n)) may be performed.
[0161] In the adaptive delay diversity filter, only sub-filters in an active state, among M sub-filters, are operated for each time period, one sub-filter having the best performance metric value is selected as the best filter, and the output of the selected sub-filter is selected as the output of the adaptive delay diversity filter. This process is illustrated in
[0162]
[0163] Referring to
[0164] At steps 709 and 711, the echo canceller 131c saves the ID of the sub-filter having the best performance metric to best_sub_filter_id, and saves the output value of the sub-filter to e.sub.best(n). When comparisons between all sub-filters are completed, the value of e.sub.best(n) is determined to be the final filter output e(n) at step 717. At step 717, the value of n.sub.best[ ] is the value in which the frequency at which each of the sub-filters was selected as the best filter is recorded, and may be used later for a final filter selection procedure, as illustrated in
[0165] Meanwhile, at step 703, when the j-th sub-filter is not in an active state, the echo canceller 131c may skip subsequent steps, for example, steps 705, 707, 709, and 711.
[0166] In this way, whenever new input is received, the adaptive delay diversity filter allows multiple active sub-filters to generate respective outputs through filtering, selects the best output from among the outputs of the sub-filters, and uses the selected output as the final output of the adaptive delay diversity filter.
[0167]
[0168] As described above, the echo canceller 131c (or the audio-processing unit 130 or the processor 120) may reduce a computational load by performing computation related to a formula required for calculation of filter coefficients only once, rather than M times, during a process for calculating filter coefficients of M sub-filters, and may further decrease a computational load by additionally applying a pruning method. For example, in the adaptive delay diversity filter, a maximum of M sub-filters may simultaneously perform adaptation, in which case a required computational load is increased, and thus the echo canceller 131c checks the performance of each sub-filter for each time period, and removes a sub-filter, the performance of which is deteriorated, using a pruning method, with the result that a computational load may be reduced.
[0169] In relation to this, referring to
[0170] Next, at step 803, the echo canceller 131c may read the time t.sub.1 elapsed since the time at which the first sub-filter started adaptation to the current time.
[0171] Next at step 805, the echo canceller 131c may determine whether the elapsed time t.sub.1 is greater than the predefined pruning start time T.sub.p. If it is determined that the elapsed time t.sub.1 is greater than the pruning start time T.sub.p, the echo canceller 131c may call a program (or an algorithm or a function) for starting pruning at step 807. Meanwhile, at step 805, if it is determined that the elapsed time t.sub.1 is less than the predefined pruning start time T.sub.p, the echo canceller 131c may skip pruning at step 807.
[0172]
[0173] Referring to
[0174] Next, the echo canceller 131c may perform control such that the sub-filter corresponding to the worst filter ID is excluded from a computation procedure such as adaptation and filtering by deactivating the sub-filter corresponding to the worst filter ID at step 905. At step 901, when the situation in which one active filter needs to be deactivated does not occur, the echo canceller 131c may skip steps 903 and 905.
[0175]
[0176] The adaptive delay diversity filter used in the echo canceller 131c is composed of multiple sub-filters, wherein, when all of the multiple sub-filters perform adaptation to update filter coefficients, a computational load is greatly increased, and thus the peak computational load may be controlled by limiting the number of sub-filters that perform adaptation.
[0177] Referring to
[0178] Thereafter, at step 1005, the echo canceller 131c may count the number of filters n_adapt_filters that currently perform the adaptation operation.
[0179] Next, at step 1007, the echo canceller 131c may check whether the number of filters n_adapt_filters that perform the adaptation operation is greater than the number of filters N_allowed_filters that are allowed to perform simultaneous adaptation. If the number of filters n_adapt_filters that perform the adaptation operation is greater than the number of filters N_allowed_filters that are allowed to perform simultaneous adaptation, the echo canceller 131c may find the filter having the worst performance metric p.sub.j among the active filters at step 1009, and may save the ID of the corresponding filter to the worst filter ID worst_sub_filter_id.
[0180] At step 1011, the echo canceller 131c may deactivate the sub-filter corresponding to the worst filter ID, and may then exclude the corresponding sub-filter so that the sub-filter does not participate in computation any further. Meanwhile, when the limit condition set at step 1001 is not satisfied, the echo canceller 131c may skip the procedure from steps 1003 to 1011. In this way, the echo canceller 131c may perform filtering in the state in which a number of sub-filters less than or equal to the allowed number of sub-filters (N_allowed_filters), among the multiple sub-filters, are active.
[0181]
[0182] Referring to
[0183] When the performance of the marginal pruning operation is requested, the echo canceller 131c may read a predefined marginal threshold value P.sub.m arg in at step 1103. The marginal threshold value P.sub.m arg in may be stored in advance in the memory 110.
[0184] Next, the echo canceller 131c may find the best performance metric value p.sub.best among the performance metric values for active sub-filters at step 1105. Thereafter, the echo canceller 131c may find sub-filters satisfying the following Equation 32 at step 1107.
p.sub.j<p.sub.best−p.sub.margin,j=1, . . . ,M [Equation 32]
[0185] The echo canceller 131c may deactivate the sub-filters satisfying the above equation. The marginal threshold value may be a preset constant value, and the speed at which the sub-filters switch to an inactive state may be controlled by adjusting the marginal threshold value. Meanwhile, the electronic device may change the marginal threshold value depending on the system performance or may change the marginal threshold value depending on the frequency at which an error occurred in a specific application (e.g., a speech recognition function). For example, when the frequency of error occurrence is high, the marginal threshold value may rise (or fall).
[0186] Respective methods for limiting sub-filters, described above with reference to
[0187]
[0188] Referring to
[0189] Next, the echo canceller 131c may read the time t.sub.1 elapsed since the first sub-filter started adaptation to the current time point at step 1203, and may determine whether the elapsed time t.sub.1 is greater than the final filter selection time T.sub.f at step 1205. If it is determined that t.sub.1 is greater than T.sub.f, the echo canceller 131c may call a function (or a program or an algorithm) for final filter selection at step 1207. Meanwhile, if it is determined at step 1205 that t.sub.1 is less than T.sub.f, step 1207 may be skipped.
[0190]
[0191] Referring to
[0192] Next, the echo canceller 131c may select a filter having the best performance metric value p.sub.j, from among the active filters, and may save the ID of the selected filter as the best filter ID best_sub_filter_id at step 1303.
[0193] Next, the echo canceller 131c may perform filtering by operating only the sub-filter corresponding to the best filter ID, best_sub_filter_id and deactivating the remaining sub-filters other than the best sub-filter at step 1305.
[0194]
[0195] Referring to
[0196] Next, the echo canceller 131c may calculate accumulated values p.sub.acc,j of the performance metric values for the active filters at step 1403.
[0197] Then, the echo canceller 131c may select the best accumulated performance metric value best p.sub.acc,j from among the active filters, and may save the ID of a sub-filter having the selected best accumulated value to the best filter ID best_sub_filter_id at step 1405.
[0198] Next, the echo canceller 131c may perform filtering by operating only the sub-filter corresponding to the best filter ID, best_sub_filter_id and deactivating the remaining sub-filters other than the best sub-filter at step 1407.
[0199]
[0200] Referring to
[0201] Next, the echo canceller 131c may find the maximum value among the n.sub.best[j] values of all active sub-filters, and may save the ID of the sub-filter corresponding to the maximum value to the best filter ID best_sub_filter_id at step 1503.
[0202] Next, the echo canceller 131c may perform filtering by operating only the sub-filter corresponding to the best filter ID, best_sub_filter_id and deactivating all of the remaining sub-filters other than the best sub-filter at step 1505.
[0203] By means of the above-described pruning function, a structure for allowing the final winner to remain while respective sub-filters compete with each other may be formed, and a better output signal may be obtained using a low computational load.
[0204] Meanwhile, the method for limiting the number of sub-filters that perform adaptation so as to reduce a computational load includes additional methods other than the pruning method. For example, there is a method for reducing a computational load by allowing respective sub-filters to sequentially perform adaptation in time, instead of a scheme in which all sub-filters perform adaptation for each time period. This method may be used to reduce a computational load while accommodating the deterioration in performance that occurs as the number of times that each sub-filter performs adaptation is decreased.
[0205] As described above, the adaptive delay diversity filter included in the echo canceller 131c may be configured such that the number of sub-filters participating in adaptation does not exceed a predefined value, and may then manage a computational load so that the peak computational load does not exceed a certain level. Further, the adaptive delay diversity filter may have imposed thereon a computational load corresponding to one sub-filter by finally selecting one sub-filter through a pruning operation.
[0206] Due to the characteristics of the adaptive filter, if a target signal and a reference signal, which are input signals of the filter, are not time-aligned therebetween, and then time reversal in which the target signal is ahead of the reference signal in time occurs or if a time delay between the two signals is too large for the filter to estimate, then the filter may diverge or produce a large error signal value. Due to these characteristics, in the electronic device according to the present disclosure, as the pruning operation is continuously performed with the lapse of time, many of the sub-filters participating in adaptation have poor performance and may be quickly deactivated. For example, at the initial stage, when pieces of target and reference data, which are input for the adaptive filter, start to flow in the adaptive filter, respective sub-filters sequentially participate in adaptation, but the number of sub-filters that simultaneously participate in adaptation does not exceed a defined value due to the pruning function, wherein the sub-filters having poor performance are deactivated, and the number of active sub-filters is gradually decreased since the last sub-filter participates in adaptation, with the result that one sub-filter finally remains as a survivor. Alternatively, a predefined number of sub-filters may remain as survivors. Accordingly, the echo canceller 131c may manage a computational load so that, even if the number M of sub-filters forming the adaptive delay diversity filter is increased, the adaptive delay diversity filter has a computational load at a certain level or less, regardless of the number of sub-filters.
[0207] The present disclosure may effectively solve the situation in which variation in a time delay between speaker data and microphone data is large when it is desired to apply one echo canceller engine to all smartphone products. In particular, considering various connection cases, such as the case where the speaker and microphone of a smartphone are used as standalone, the case where a smartphone and a Bluetooth headset are connected to each other and are then used, or the case where a smartphone and Audio, Video, and Navigation (AVN) devices in a vehicle are used via Bluetooth connection, variation in the time delay may be further increased. In such an environment, the present disclosure may provide an adaptive processing method. For example, time delay values in the case where a smartphone is operated as standalone have a range from about several tens to 200 ms, but, in the case where the smartphone and the Audio, Video, and Navigation (AVN) devices in the vehicle are connected via Bluetooth, the speaker and the microphone of the vehicle are used, the time delay at this time has a range from about 500 ms to 1000 ms or more due to the addition of new transmission path. In order to effectively cancel echoes in all such cases, the adaptive filter of the echo canceller must be designed to cover a wide range of time delay. For this purpose, the present disclosure avoids setting the filter length to a large value as in conventional methods and sets the length of a filter to a value approximately corresponding to the length of an acoustic path, thus realizing a short convergence time and a small error value. Further, because values calculated through a calculation formula that is used when the filter coefficients of respective sub-filters are obtained can be shared between sub-filters using a time delay relational expression, the values are calculated only for the first sub-filter, and delayed versions of these values are used in all remaining sub-filters, and thus a computational load may be reduced. Furthermore, because the present disclosure can limit the number of filters participating in adaptation while removing sub-filters having deteriorated performance through a pruning method, rather than obtaining filter coefficient values of all sub-filters for each time period, the peak computational load may be maintained at a certain level or less even in the case where the number of sub-filters M used is increased to a very large value so as to cover large variation in time delay. Further, only one sub-filter may also be finally activated, and since then the same computational load as that when one sub-filter operates may be required. Furthermore, even in additional systems other than an audio system, when a signal transmission environment has large variation in a time delay between a target signal and a reference signal, the present disclosure may have a structure efficient for estimating of the impulse response of the transmission path or estimating of a target signal that is received after undergoing deformation.
[0208] Meanwhile, embodiments disclosed in the present specification and drawings are merely intended to present specific examples for better understanding of the present disclosure, and are not intended to limit the scope of the present disclosure. It should be understood that many variations and modifications of the basic inventive concept described herein will still fall within the spirit and scope of the present invention as defined in the appended claims and their equivalents.