HEARING DEVICE WITH OMNIDIRECTIONAL SENSITIVITY
20220369029 · 2022-11-17
Assignee
Inventors
Cpc classification
H04R5/04
ELECTRICITY
H04R2430/20
ELECTRICITY
International classification
Abstract
A method performed by a first hearing device comprising microphone(s) configured to generate a first input signal, a communication unit configured to receive a second input signal from a second hearing device, an output unit, and a processor, the method comprising: generating a first intermediate signal including or based on a first weighted combination of the first input signal and the second input signal, wherein the first weighted combination is based on a first gain value and/or a second gain value; and generating an output signal for the output unit based on the first intermediate signal; wherein one or both of the first gain value and the second gain value are determined in accordance with an objective of making a power of the first input signal and a power of the second input signal differ by a preset power level difference greater than 2 dB in the weighted combination.
Claims
1. A method performed by a first hearing device; the first hearing device comprising a first input unit including one or more microphones and being configured to generate a first input signal, a communication unit configured to receive a second input signal from a second hearing device, an output unit, and a processor coupled to the first input unit, the communication unit, and the output unit, the method comprising: determining a first gain value, a second gain value or both of the first gain value and the second gain value; generating a first intermediate signal including or based on a first weighted combination of the first input signal and the second input signal, wherein the first weighted combination is based on the first gain value, the second gain value, or both of the first gain value and the second gain value; and generating an output signal for the output unit based on the first intermediate signal; wherein one or both of the first gain value and the second gain value are determined in accordance with an objective of making a power of the first input signal and a power of the second input signal differ by a preset power level difference greater than 2 dB in the weighted combination.
2. The method according to claim 1, wherein the preset power level difference is greater than or equal to 3 dB, 4 dB, 5 dB or 6 dB in the weighted combination.
3. The method according to claim 1, wherein the preset power level difference is equal to or less than 6 dB, 8 dB, 10 dB or 12 dB in the weighted combination.
4. The method according to claim 1, wherein the generated first input signal has a higher power than that of the received second input signal, and wherein in the weighted combination, the power of the first input signal is higher than the power of the second input signal.
5. The method according to claim 1, wherein the received second input signal has a higher power than that of the generated first input signal, and wherein in the weighted combination, the power of the second input signal is higher than the power of the first input signal.
6. The method according to claim 1, further comprising: generating a second intermediate signal including or based on a second weighted combination of the first input signal and the second input signal in accordance with the first gain value and the second gain value, respectively; generating a third intermediate signal including or based on a third weighted combination of the first input signal and the second input signal in accordance with the second gain value and the first gain value, respectively; wherein the first intermediate signal is based on the second intermediate signal and the third intermediate signal in accordance with a first output value and a second output value based on a mixing function; wherein the mixing function transitions smoothly or in multiple steps between a first limit value and a second limit value as a function of a difference between the power of the first input signal and the power of the second input signal, or as a function of a ratio of the power of the first input signal and the power of the second input signal.
7. The method according to claim 1, further comprising: determining the power of the first input signal and determining the power of the second input signal; determining a highest power level (P.sub.max) based on the power of the first input signal and the power of the second input signal and based on an output value (gx) of a mixing function; determining a lowest power level (P.sub.min) based on the power of the first input signal and the power of the second input signal and based on a complementary output value (1−gx) of the mixing function; wherein the mixing function transitions smoothly or in multiple steps between a first limit value and a second limit value as a function of a difference between the power of the first input signal and the power of the second input signal, or as a function of a ratio of the power of the first input signal and the power of the second input signal.
8. The method according to claim 1, wherein the power of the first input signal is based on smoothed and squared values of the first input signal; and wherein the power of the second input signal is based on smoothed and squared values of the second input signal.
9. The method according to claim 1, wherein the first gain value satisfies the below equation:
10. The method according to claim 1, wherein the first gain value is determined based on the following equation:
11. The method according to claim 1, further comprising recurrently, at least at a first time and a second time, determining a current value (α.sub.n) of the first gain value, wherein the current value (α.sub.n) of the first gain value is determined iteratively in accordance with: an estimate of the first gain value satisfying the objective of making the power of the first input signal and the power of the second input signal differ by the preset power level difference greater than 2 dB in the weighted combination, and a previous value (α.sub.n−1) of the first gain value plus an iteration step value which is based on the estimate of first gain value and the previous value (α.sub.n−1).
12. The method according to claim 1, further comprising delaying one of the first input signal and the second input signal to delay the first input signal relative to the second input signal, or to delay the second input signal relative to the first input signal.
13. The method according to claim 1, further comprising recurrently determining the first gain value, the second gain value, or both of the first gain value and the second gain value, based on a non-instantaneous level of the first input signal and a non-instantaneous level of the second input signal.
14. The method according to claim 1, wherein the first gain value and the second gain value are recurrently determined, subject to a constraint that the first gain value and the second gain value sum to a predefined time-invariant value.
15. The method according to claim 1, further comprising processing the intermediate signal to perform a hearing loss compensation.
16. A hearing device, comprising: a first input unit including one or more microphones; a communication unit; an output unit comprising an output transducer; at least one processor coupled to the first input unit, the communication unit, and the output unit; and a memory storing at least one program, the at least one program including instructions for causing the at least one processor to perform the method of claim 1.
17. A computer readable storage medium storing at least one program, the at least one program comprising instructions, which, when executed by a processor of a hearing device, enable the hearing device to perform the method of claim 1.
Description
BRIEF DESCRIPTION OF THE FIGURES
[0131] A more detailed description follows below with reference to the drawing, in which:
[0132]
[0133]
[0134]
[0135]
[0136]
[0137]
DETAILED DESCRIPTION
[0138] Various embodiments are described hereinafter with reference to the figures. Like reference numerals refer to like elements throughout. Like elements will, thus, not be described in detail with respect to the description of each figure. It should also be noted that the figures are only intended to facilitate the description of the embodiments. They are not intended as an exhaustive description of the claimed invention or as a limitation on the scope of the claimed invention. In addition, an illustrated embodiment need not have all the aspects or advantages shown. An aspect or an advantage described in conjunction with a particular embodiment is not necessarily limited to that embodiment and can be practiced in any other embodiments even if not so illustrated, or if not so explicitly described.
[0139]
[0140] The communications unit 120 receives a second input signal, r, e.g. from the contralateral hearing device. The second input signal, r, may also be a time-domain signal, which may be designated r(t). At the contralateral device, the second signal r may be captured by an input unit corresponding to the first input unit 110.
[0141] For convenience, the first input signal, l, and the second input signal, r, are denoted an ipsilateral signal and a contralateral signal, respectively. In some examples, a first device, e.g. the ipsilateral device, is positioned and/or configured for being positioned at or in a left ear of a user. In some examples, a second device, e.g. a contralateral device, is positioned at or in a right ear of the user. The first device and the second device may have identical or similar processors. In some examples one of the processors is configured to operate as a master and another is configured to operate as a slave.
[0142] The first input signal, l, and the second input signal, r, are input to a processor 130 comprising a mixer unit 131. The mixer unit 131 may be based on gain units or filters as described in more detail herein and outputs an intermediate signal, v, e.g. designated v(t). The mixer unit 131 is configured to generate the intermediate signal, v, based on a first weighted combination of the first input signal (l) and the second input signal (r) in accordance with a first gain value, α, and a second gain value, ‘1−α’. The first gain value, α, and the second gain value, ‘1−α’, are determined in accordance with an objective of making the power of the first input signal, l, and the power of the second input signal, r, differ by a preset power level difference, d, greater than 2 dB when subjected to the weighing. This has been shown to increase fidelity of the monitor signal mentioned in the background section. In particular, it has been shown to reduce artefacts, such as comb filtering effects, in the intermediate signal. This is illustrated in
[0143] In some examples the mixer unit 131 outputs a single-channel intermediate signal v. In some examples, the single-channel intermediate signal is a monaural signal.
[0144] In some embodiments, the mixer unit 131 is based on filters, e.g. multi-tap FIR filters. Each of the input signals, l and r, may be filtered by a respective multi-tap FIR filter before the respectively filtered signals are combined, e.g. by summation.
[0145] The intermediate signal, v, output from the mixing unit 131 is input to the post-filter 132 which outputs a filtered intermediate signal, y. In some embodiments the post-filter 132 is integrated in the mixer 131. In some embodiments the post-filter 132 is omitted or at least temporarily dispensed with or by-passed.
[0146] In some embodiments, the intermediate signal, v, and/or the filtered intermediate signal, y, is input to a hearing loss compensation unit 133, which includes a prescribed compensation for a hearing loss of a user as it is known in the art. The hearing loss compensation unit 133 outputs a hearing-loss-compensated signal, z. In some embodiments, the hearing loss compensation unit 133 is omitted or by-passed.
[0147] The intermediate signal, v, and/or the filtered intermediate signal, y, and/or the hearing-loss-compensated signal, z, is input to an output unit 140, which may include a so-called ‘receiver’ or a loudspeaker 141 of the ipsilateral device for providing an acoustical signal to the user. In some embodiments one or more of the signals v, y and z are input to a second communications unit for transmission to a further device. The further device may be a contralateral device or an auxiliary device.
[0148] Although time-domain to frequency-domain transformation, e.g. short time Fourier transformation (STFT), and corresponding inverse transformations, e.g. short time inverse Fourier transformation (STIFT), may be used, such transformations are not shown here.
[0149] In some examples, the ipsilateral device 100 includes a further beamformer (not shown) configured with a focussed (high directionality) characteristic providing a further beamformed signal based on the microphones 112 and 113 and optionally additional microphones. The further beamformed signal may be transmitted to the contralateral device (not shown).
[0150] More details about the processing, in particular the processing performed by the mixing unit, are given below:
[0151]
[0152] P.sub.l, of the first input signal, l, and a power level, P.sub.r, of the second input signal, r. Secondly, the first processing unit 201 estimates a maximum power level, P.sub.max, and a minimum power level, P.sub.min. The estimation of the maximum power level and the minimum power level corresponds to:
P.sub.max=max(P.sub.l, P.sub.r)
P.sub.min=min(P.sub.l, P.sub.r)
[0153] Wherein max( ) and min( ) are functions selecting or estimating the maximum or minimum power based on the input (P.sub.l, P.sub.r) to the functions.
[0154] The estimation of the maximum power level and the minimum power level may be based on a continuously computed estimate rather than a (binary) decision.
[0155] This will be explained in more detail below.
[0156] The first processing unit 201 is also configured to output values, gx, of a mixing function and values, ‘1−gx’, of a complementary mixing function. The mixing function is a function based on e.g. the Sigmoid function or the inverse of the tangent function (the arctangent), sometimes denoted atan( ). In essence, the mixing function transitions smoothly or in multiple, discrete steps between a first limit value (e.g. ‘0’) and a second limit value (e.g. ‘1’) as a function of a difference between or a ratio of the power (P.sub.l) of the first input signal (l) and the power (P.sub.r) of the second input signal (r). An advantage is that estimation of the maximum power level and the minimum power level may be based on a continuously computed estimate rather than a (binary) decision. In some examples the mixing function is a piecewise linear function, e.g. with three or more linear segments.
[0157] The second processing unit 202 is configured to determine the first gain value (α) and the second gain value (1−α) based on the maximum power level, P.sub.max, and the minimum power level, P.sub.min.
[0158] Estimation of the first gain value, α, and the second gain value, ‘1−α’, may be based on the following expression, wherein g is the difference in gain corresponding to the preset power level difference, d:
Which, as desired, at least approximately satisfies the below expression, which is quadratic with respect to solving for α:
Thus, d=20·log.sub.10(1/g.sup.2). In one example, 1/g.sup.2=0.45 corresponds to a preset power level difference, d, approximately equal to 7 dB in magnitude, since 20·log.sub.10(0.45) is approximately −6.9 dB.
[0159] It should be noted, for the sake of completeness, that the above expression, which is quadratic with respect to solving for α, can be solved conventionally, but the solution would require stationary input signals l and r, which is not generally the case for hearing devices.
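For illustration only, the objective can be written out for fixed (stationary) power levels. The sketch below is not the expression referred to above; it assumes, as one possible reading, that α weights the stronger signal and ‘1−α’ the weaker one, and that the target is α²·P.sub.max exceeding (1−α)²·P.sub.min by d dB. The function names are hypothetical.

```python
import math


def gain_for_level_difference(p_max: float, p_min: float, d_db: float) -> float:
    """Return alpha so that alpha^2 * p_max is d_db dB above (1 - alpha)^2 * p_min.

    Assumption (not from the source): alpha weights the stronger signal and
    (1 - alpha) weights the weaker one.
    """
    if p_max <= 0.0 or p_min <= 0.0:
        raise ValueError("power levels must be positive")
    target_ratio = 10.0 ** (d_db / 10.0)          # dB -> linear power ratio
    s = math.sqrt(target_ratio * p_min / p_max)   # s equals alpha / (1 - alpha)
    return s / (1.0 + s)


def weighted_level_difference_db(alpha: float, p_max: float, p_min: float) -> float:
    """Level difference in dB between the two weighted components."""
    return 10.0 * math.log10((alpha ** 2 * p_max) / ((1.0 - alpha) ** 2 * p_min))


if __name__ == "__main__":
    a = gain_for_level_difference(p_max=1.0, p_min=1.0, d_db=6.0)
    print(a, weighted_level_difference_db(a, 1.0, 1.0))
```

Under this reading, equal input powers give α above 0.5, while inputs that already differ by exactly d dB give α equal to 0.5.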
[0160] The third processing unit 203 generates a value, α.sub.n, which iteratively converges towards the first gain value, α. Subscript ‘n’ designates a time-index. A value, β.sub.n, which correspondingly iteratively converges towards the second gain value, β, is simply computed therefrom as β.sub.n=1−α.sub.n. The third processing unit recurrently computes α.sub.n and β.sub.n, e.g. at predefined time intervals, e.g. one or more times per frame, wherein a frame comprises a predefined number of samples, e.g. 32, 64, 128 or another number of samples.
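A minimal sketch of such an iterative update is given below. It assumes a first-order update in which the iteration step is a fixed fraction of the distance between the current estimate of α and the previous value α.sub.n−1; the step size, the clamping to the range 0 to 1, and the function names are illustrative assumptions, not taken from the description.

```python
def update_gain(alpha_prev: float, alpha_estimate: float, step: float = 0.1) -> float:
    """One iteration: previous value plus a step towards the current estimate."""
    alpha_next = alpha_prev + step * (alpha_estimate - alpha_prev)
    return min(1.0, max(0.0, alpha_next))  # keep the gain in [0, 1] (assumption)


def track_gain(estimates, alpha_init: float = 0.5, step: float = 0.1):
    """Return (alpha_n, beta_n) pairs, e.g. one pair per frame."""
    alpha = alpha_init
    history = []
    for estimate in estimates:
        alpha = update_gain(alpha, estimate, step)
        history.append((alpha, 1.0 - alpha))  # beta_n = 1 - alpha_n
    return history


if __name__ == "__main__":
    for alpha_n, beta_n in track_gain([0.7] * 5):
        print(round(alpha_n, 4), round(beta_n, 4))
```

A smaller step gives a smoother but slower-moving gain; the choice would depend on how quickly the input levels change.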
[0161]
[0162] As shown, the first input signal, l, is input to two complementary units 310 and 320, which output respective intermediate signals, va and vb, to a unit 330, which mixes the intermediate signals, va and vb, into an intermediate signal v.
[0163] Thus, the fourth processing unit 300 provides mixing of the first input signal and the second input signal to output an intermediate signal v, which is also denoted a first intermediate signal, v. Despite being a mixer in itself, the fourth processing unit 300 includes the two complementary units 310 and 320, which are also mixers, and, further, the unit 330, which is also a mixer. The fourth processing unit 300 may thus be denoted a first mixer, the units 310 and 320 may be denoted second and third mixers, and the unit 330 may be denoted a fourth mixer. The second mixer 310 generates a second intermediate signal, va, including or based on a second weighted combination of the first input signal, l, and the second input signal, r, in accordance with the first gain value, α, and the second gain value, ‘1−α’, respectively. The third mixer 320 generates a third intermediate signal, vb, including or based on a third weighted combination of the first input signal, l, and the second input signal, r, in accordance with the second gain value, ‘1−α’, and the first gain value, α, respectively. The fourth mixer 330 generates the first intermediate signal, v, including or based on a fourth weighted combination of the second intermediate signal, va, and the third intermediate signal, vb, in accordance with a first output value, gx, and a second output value, ‘1−gx’, based on a mixing function. The mixing function serves to implement switching based on the maximum power level, P.sub.max, and the minimum power level, P.sub.min, which is smooth rather than hard, to reduce artefacts. The mixing function transitions smoothly or in multiple steps between a first limit value and a second limit value as a function of a difference between or a ratio of the power, P.sub.l, of the first input signal, l, and the power, P.sub.r, of the second input signal, r. For instance, the mixing function is the Sigmoid function with limit values ‘0’ and ‘1’. The Sigmoid function may be defined as follows:
wherein x=k·ln(R), wherein
wherein k is a number e.g. larger than 3, e.g. 4 to 10. The value of gx is gx=S(x). Other implementations can be defined.
[0164] In some aspects, for saving computational resources, the computation of S(x) may be cut off (forgone) for values of x exceeding or going below respective thresholds known to cause S(x) to assume values close to the limit values. The value gx may then be selected to assume the respective limit value or a value close to the respective limit value.
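The cut-off can be sketched as follows. The sketch assumes the standard logistic form S(x)=1/(1+e^(−x)) and assumes that R is the power ratio P.sub.l/P.sub.r; the threshold value of 10 and the handling of zero power are illustrative choices, and the function name is hypothetical.

```python
import math


def mixing_value(p_l: float, p_r: float, k: float = 4.0, cutoff: float = 10.0) -> float:
    """Return gx = S(k * ln(p_l / p_r)), skipping the exponential near the limits.

    Assumptions (not from the source): R = p_l / p_r, S is the logistic
    function, and |x| > cutoff is treated as having reached a limit value.
    """
    if p_l <= 0.0 and p_r <= 0.0:
        return 0.5                      # no information: weight both equally
    if p_r <= 0.0:
        return 1.0
    if p_l <= 0.0:
        return 0.0
    x = k * math.log(p_l / p_r)
    if x > cutoff:                      # S(x) is within about 5e-5 of 1 here
        return 1.0
    if x < -cutoff:                     # S(x) is within about 5e-5 of 0 here
        return 0.0
    return 1.0 / (1.0 + math.exp(-x))


if __name__ == "__main__":
    for ratio in (0.01, 0.5, 1.0, 2.0, 100.0):
        print(ratio, mixing_value(ratio, 1.0))
```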
[0165] The fourth processing unit 300 implements the below expression:
v(t)=gx*(α*l(t)+(1−α)*r(t−τ))+(1−gx)*(α*r(t−τ)+(1−α)*l(t))
Wherein the symbol ‘*’ designates multiplication in embodiments wherein α is implemented by a gain stage. The symbol ‘*’ may also designate a convolution operation in embodiments wherein α is implemented by a Finite Impulse Response, FIR, filter. For the sake of simplicity, the embodiment in
[0166] As shown, the second signal, r, is delayed by delay unit 301 by a time delay, τ. The delay unit 301 thus delays the second input signal, r, relative to the first input signal, l. The delay, τ, is in the range of 3 to 17 milliseconds, e.g. 5 to 15 milliseconds. In some embodiments the delay is omitted.
[0167] The unit 310, the second mixer, comprises a gain unit 311 and a gain unit 312, to provide respective signals α*l(t) and (1−α)*r(t−τ) which are input to an adder 313, which outputs signal va.
[0168] In a mirrored way, the unit 320, the third mixer, comprises a gain unit 322 and a gain unit 321, to provide respective signals α*r(t−τ) and (1−α)*l(t) which are input to an adder 323, which outputs signal vb.
[0169] The signals va and vb are input to the unit 330, the fourth mixer. The fourth mixer comprises a gain stage 331, which weighs the signal va in accordance with the value gx, and a gain stage 332, which weighs the signal vb in accordance with the complementary value ‘1−gx’ before the weighed signals are combined by adder 333 to provide the intermediate signal v. Thus, a smooth mixing can be implemented in a manner which is particularly suitable for a time-domain implementation. Although a time-domain implementation is preferred, it should be mentioned that the smooth mixing is also possible in a frequency domain implementation or short-time frequency domain implementation. However, for frequency domain or short-time frequency domain implementation better options may exist.
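The signal flow through the second, third and fourth mixers can be sketched sample by sample as below, for the gain-stage case where ‘*’ is plain multiplication. The delay is expressed in whole samples and zero-padded, and α and gx are held constant over the block; both are simplifying assumptions for illustration, and the function names are hypothetical.

```python
def delay(signal, tau: int):
    """Delay a block by tau samples, zero-padding the start (same length out)."""
    if tau <= 0:
        return list(signal)
    n = len(signal)
    return [0.0] * min(tau, n) + list(signal[: max(0, n - tau)])


def mix(l, r, alpha: float, gx: float, tau: int = 0):
    """v = gx * va + (1 - gx) * vb, with va and vb the mirrored weighted sums."""
    r_delayed = delay(r, tau)                        # delay unit 301
    out = []
    for l_t, r_t in zip(l, r_delayed):
        va = alpha * l_t + (1.0 - alpha) * r_t       # second mixer 310
        vb = alpha * r_t + (1.0 - alpha) * l_t       # third mixer 320
        out.append(gx * va + (1.0 - gx) * vb)        # fourth mixer 330
    return out


if __name__ == "__main__":
    print(mix([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0], alpha=0.7, gx=1.0, tau=1))
```

With gx equal to 1 the output reduces to va, and with gx equal to 0 it reduces to vb, so gx acts as a soft switch between the two mirrored weightings.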
[0170]
wherein k is a number e.g. larger than 3, at least for some embodiments.
[0171] The first processing unit receives the first input signal, l=l(t), and the second input signal r=r(t) and computes respective power levels, P.sub.l and P.sub.r. The power levels may be computed recursively to obtain a smooth power estimate. The power levels may be computed using the following expressions:
P.sub.l(n)=γ·P.sub.l(n−1)+(1−γ)·l(n)·l(n)
P.sub.r(n)=γ·P.sub.r(n−1)+(1−γ)·r(n)·r(n)
[0172] Wherein γ is a ‘forgetting factor’ reflecting how much a sum of previous values should be weighted over instantaneous values. Here, n designates a time index of individual samples of the signals or frames of samples of the signals. The power levels may be computed in other ways.
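The recursion can be sketched per sample as follows. The value of γ and the zero initial state are illustrative assumptions, and the function name is hypothetical.

```python
def smoothed_power(samples, gamma: float = 0.9, p_init: float = 0.0):
    """Recursive power estimate: p(n) = gamma * p(n-1) + (1 - gamma) * x(n)^2."""
    p = p_init
    out = []
    for x in samples:
        p = gamma * p + (1.0 - gamma) * x * x
        out.append(p)
    return out


if __name__ == "__main__":
    # A constant-amplitude input of 2.0 drives the estimate towards 4.0.
    estimate = smoothed_power([2.0] * 50)
    print(estimate[0], estimate[-1])
```

A value of γ closer to 1 weights past values more heavily and gives a smoother, slower estimate.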
[0173] Based on the computed respective power levels, P.sub.l and P.sub.r, values gx of the mixing function, S( ), which may be based on a Sigmoid function, are computed by unit 413. Correspondingly, complementary values, ‘1−gx’, are computed based on input from unit 413 in unit 414.
[0174] The respective power levels, P.sub.l and P.sub.r, are weighed in accordance with the values gx of the mixing function and the complementary value ‘1−gx’ by units 421 and 422, which may be mixers, multipliers or gain stages or a combination thereof.
[0175] A weighted sum is generated by an adder 423, which receives the respective power levels, P.sub.l and P.sub.r, weighed in accordance with the values gx of the mixing function and the complementary value ‘1−gx’. The weighted sum is an estimate of the maximum power level, P.sub.max=max(P.sub.l, P.sub.r). The estimate of P.sub.max is output by unit 420, which receives values of gx and ‘1−gx’ from unit 410.
[0176] Also based on values of gx and ‘1−gx’ from unit 410, albeit in a mirrored way, unit 430 outputs an estimate of the minimum power level, P.sub.min=min(P.sub.l, P.sub.r). A weighted sum is generated by an adder 433, which receives the respective power levels, P.sub.l and P.sub.r, weighed in accordance with the complementary value ‘1−gx’ and the value ‘gx’ of the mixing function, respectively.
[0177] In this way, the maximum and minimum power levels can be estimated sample-by-sample or frame-by-frame, while suppressing sudden changes, which may otherwise cause audible artefacts.
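The soft maximum and minimum can be sketched as two mirrored weighted sums. The sketch assumes a logistic mixing function driven by x=k·ln(P.sub.l/P.sub.r), with gx weighting P.sub.l in the maximum estimate; the floor on the powers and the clamp on x are numerical safeguards added for illustration, and the function name is hypothetical.

```python
import math


def soft_max_min(p_l: float, p_r: float, k: float = 4.0, eps: float = 1e-12):
    """Return smooth estimates (p_max, p_min) of max(p_l, p_r) and min(p_l, p_r)."""
    p_l = max(p_l, eps)                 # avoid log(0) (assumption)
    p_r = max(p_r, eps)
    x = k * math.log(p_l / p_r)
    x = max(-50.0, min(50.0, x))        # keep exp() well inside float range
    gx = 1.0 / (1.0 + math.exp(-x))     # close to 1 when p_l dominates
    p_max = gx * p_l + (1.0 - gx) * p_r         # adder 423
    p_min = (1.0 - gx) * p_l + gx * p_r         # adder 433, mirrored weights
    return p_max, p_min


if __name__ == "__main__":
    print(soft_max_min(4.0, 1.0))
    print(soft_max_min(1.0, 1.0))
```

Because the weights are complementary, the two estimates always sum to P.sub.l+P.sub.r, and they move continuously as the two power levels cross.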
[0178]
[0179] With respect to the hearing devices, 501 and 502, the right hearing device 502 (also denoted the ipsilateral device) may be configured to provide the monitor signal to the wearer and the left hearing device 501 (also designated the contralateral device) may be configured to provide the focussed signal to the wearer 510. The hearing devices, 501 and 502, are in communication via a wireless link 503.
[0180] The ipsilateral device 502, here at the right hand side of the wearer, receives the first input signal, l, and the second input signal, r, as described herein. These signals may have, approximately, omnidirectional characteristics 520 and 521, however effectively different from an omnidirectional characteristic due to a head shadow effect caused by the wearer's head.
[0181] The contralateral device 501, here at the left-hand side of the wearer, may be configured to provide the focussed signal to the wearer. The focussed signal may be based on monaural or binaural signals forming one or more focussed characteristics 522 and 523. The focussed characteristics may be fixed, e.g. at about 0 degrees, in front of the wearer, adaptive or controllable by the wearer. This is known in the art.
[0182] The first speaker 511 is on-axis, in front, of the wearer 510. Therefore, an acoustic speech signal from the first speaker 511 arrives, at least substantially, at the same time at both the ipsilateral device and the contralateral device, whereby the signals are captured simultaneously. In respect of the first speaker 511, signals l and r thus have equal strength. To suppress the comb effect, it has been observed that delaying the signals l and r relative to each other is effective. The delay is small enough to not be perceivable as an echo.
[0183] However, the second speaker 512 is off-axis, slightly to the right, of the wearer 510. When the second speaker 512 speaks, the claimed method suppresses the signal from the first target speaker 511, who is on-axis relative to the user, proportionally to the strength of the signal received, at the ipsilateral device and at the contralateral device, from the second speaker 512, who is off-axis relative to the user. Thereby, it is possible to forgo entering an omnidirectional mode while still being able to perceive the (speech) signal from the second speaker 512. Further, the power of the first input signal, l, and the power of the second input signal, r, are reproduced to differ by the preset power level difference, d, greater than 2 dB in the weighted combination to reduce the comb effect. The comb effect is described in more detail in connection with
[0184] In some situations, in the prior art, a determination that a signal is present, e.g. from speaker 512, may result in a listening device switching to a so-called omnidirectional mode, whereby noise sources 513 and 514 suddenly contribute to the sound presented to the user. The user of such a prior art listening device may then experience a significantly increased noise level, despite the sound level of the noise sources 513 and 514 being lower than the sound level of the target speaker 512.
[0185]
[0186] For comparison, a magnitude response, 603, is plotted for a signal from a front microphone (front mic) arranged towards the look direction. Correspondingly, a magnitude response, 602, is plotted for a signal from a rear microphone (rear mic) arranged away from the look direction.
[0187] Also, for comparison, a signal designated 601a and 601b is plotted for a mixer wherein the preset power level difference is about 0 dB and wherein the first gain value, α, and the second gain value, ‘1−α’, are kept fixed, e.g. at a value α=0.5.
[0188] It can be seen that the signal designated 601a and 601b at 601a exhibits a relatively large comb effect spanning a range of about 10 dB peak-to-peak in the frequency range of about 1000 Hz to about 4000-5000 Hz.
[0189] Comparatively, the intermediate signal, v, designated by reference numerals 604a and 604b and output from the mixer 131, exhibits a suppressed, relatively smaller comb effect spanning a range less than about 3-5 dB peak-to-peak in the frequency range of about 1000 Hz to about 4000-5000 Hz.
[0190] When one or both of the first gain value, α, and the second gain value, ‘1−α’, are determined in accordance with an objective of making the power of the first input signal, l, and the power of the second input signal, r, differ by a preset power level difference, d, greater than 2 dB in the weighted combination, the comb effect is reduced. Thus, artefacts in the intermediate signal are reduced and fidelity of the signal reproduced for the wearer can be improved.
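Why a level difference reduces the comb depth can be illustrated with an idealised model that is not part of the description: two identical copies of a signal, one delayed, summed with a level difference of d dB. The magnitude of the sum then swings between the sum and the difference of the two amplitudes. The sketch below computes that peak-to-peak ripple; real head-related signals are only partly correlated, so the figures it produces are not the measured responses discussed above. The function name is hypothetical.

```python
import math


def comb_ripple_db(d_db: float) -> float:
    """Peak-to-peak ripple (dB) when two identical, relatively delayed copies
    of a signal are summed with a level difference of d_db dB (idealised)."""
    if d_db <= 0.0:
        return math.inf                 # equal levels: full cancellation at notches
    rho = 10.0 ** (-d_db / 20.0)        # amplitude of weaker copy relative to stronger
    return 20.0 * math.log10((1.0 + rho) / (1.0 - rho))


if __name__ == "__main__":
    for d in (2.0, 3.0, 6.0, 12.0):
        print(d, round(comb_ripple_db(d), 2))
```

In this model the ripple shrinks monotonically as the level difference grows, which is consistent with the qualitative trend described above.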
[0191] There is also provided the following item:
[0192] 1. A method performed by a first hearing device (100); the first hearing device comprising a first input unit (110) including one or more microphones (112,113) and being configured to generate a first input signal (l), a communication unit (120) configured to receive a second input signal (r) from a second hearing device, an output unit (140); and a processor (130) coupled to the first input unit (110), the communication unit (120) and the output unit (140), the method comprising: [0193] determining a first gain value (α), a second gain value (1−α) or both of the first gain value (α) and the second gain value (1−α); [0194] generating a first intermediate signal (v) including or based on a first weighted combination of the first input signal (l) and the second input signal (r); wherein weighing into the weighted combination is based on the first gain value (α), the second gain value (1−α), or both of the first gain value (α) and the second gain value (1−α); and [0195] generating an output signal (z) for the output unit (140) based on the first intermediate signal; [0196] wherein one or both of the first gain value (α) and the second gain value (1−α) are determined in accordance with an objective of making the power of the first input signal (l) and the power of the second input signal (r) differ by a preset power level difference (d) greater than 2 dB in the weighted combination.
[0197] Embodiments of item 1 are set out in the claims in particular in the dependent claims 2-14 and in claims 15-16.
[0198] In some embodiments, with reference to item 1 above, the power of the first input signal (l) may be the power of the original first input signal. In other embodiments, the power of the first input signal (l) may be the power of the weighted first input signal. Also, in other embodiments in which the weighing is based on the first gain value, the power of the first input signal (l) may be the power of the gain-applied first input signal.
[0199] Similarly, in some embodiments, with reference to item 1 above, the power of the second input signal (r) may be the power of the original second input signal. In other embodiments, the power of the second input signal (r) may be the power of the weighted second input signal. Also, in other embodiments in which the weighing is based on the second gain value, the power of the second input signal (r) may be the power of the gain-applied second input signal.
[0200] Also, in some embodiments, with reference to item 1 above, the objective of making the power of the first input signal (l) and the power of the second input signal (r) differ by the preset power level difference (d) greater than 2 dB in the weighted combination, may apply when |P1−P2|<=6 dB, wherein P1 is the power of the generated first input signal, and P2 is the power of the received second input signal. In other embodiments, the objective may apply when |P1−P2|>=6 dB. In further embodiments, the objective may apply regardless of the value of |P1−P2|.
[0201] It should be appreciated that the method described herein can be implemented in different ways. However, some details may be appreciated.
[0202] In some examples, the monitor signal is generated with the aim to achieve a similar sensitivity as the binaural natural ear for surrounding, e.g. moving, sound sources, while the focus signal uses a beamformed signal.
[0203] In a time-domain implementation, the left and right signals are mixed to achieve at least an approximated ‘true’ omnidirectional characteristic, where the mixed signal is generated as follows:
v(t)=α*l(t)+(1−α)*r(t−τ)
[0204] Due to the head shadowing effect, the relative level between the left and right signals varies significantly as a sound source moves around the user. Further, it is desired to suppress the observed comb effect (also known as the comb filtering effect). Therefore, it is proposed to control the weighing of the signals l(t) and r(t) through the parameter α to improve the (true) omnidirectional sensitivity or Situational Awareness Index in cocktail party situations and alleviate the comb filtering effect.
[0205] The wearer's head has a little head shadow effect in low frequencies (below 500-1000 Hz) and there is no need to mix the left and right signals in low frequencies for true omnidirectional characteristic. The signals, signals l(t) and r(t) may therefore be split into a low-frequency band and a high-frequency band. Also, we can avoid the major cause of the comb filtering by skipping the mixing in the low-frequency band. This is because the human auditory system has a higher frequency resolution or narrow critical bands in low frequencies. That could make some audio sound a little harsh and sharp in anechoic chamber listening monaurally.
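The band split may be sketched as follows. This is an illustration under stated assumptions: the one-pole low-pass filter, the 1000 Hz crossover, the omission of the delay τ and the choice to pass the device's own (left) low band unmixed are all hypothetical and are not specified in the disclosure.

```python
import numpy as np

def split_bands(x, fs, fc=1000.0):
    """Split x into complementary low and high bands (low + high == x).

    Uses a simple one-pole low-pass filter as a stand-in crossover.
    """
    x = np.asarray(x, dtype=float)
    a = np.exp(-2.0 * np.pi * fc / fs)  # one-pole smoothing coefficient
    low = np.empty_like(x)
    state = 0.0
    for i, sample in enumerate(x):
        state = a * state + (1.0 - a) * sample
        low[i] = state
    return low, x - low

def mix_high_band_only(l, r, fs, alpha=0.5, fc=1000.0):
    """Mix left and right only in the high band; keep the left low band as is."""
    l_low, l_high = split_bands(l, fs, fc)
    _, r_high = split_bands(r, fs, fc)
    return l_low + alpha * l_high + (1.0 - alpha) * r_high
```

Because the two bands are complementary, identical left and right inputs pass through unchanged, which gives a quick sanity check of the split.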
[0206] In the high-frequency band, when the signals come from the front, the hearing aids receive the same signals, and combining the two signals could still result in some combs. The signals from off-axis sources will show a significant interaural level difference due to the head shadow effect, so the mixing of the two signals will show only a shallow comb effect.
[0207] Given the discussion above, the cross-correlation or the levels of the two signals play an important role in achieving a shallow comb filtering effect and the omnidirectional polar pattern. The introduction of delay is one way to reduce the cross-correlation for speech signals. More importantly, it is proposed to control the level difference between the two signals dynamically to achieve better omnidirectional sensitivity in the mixing.
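The effect of the level difference on the comb depth can be illustrated with a short calculation. Assuming, as a simplification, that the two mixed signals are identical apart from a delay, the magnitude response of the mix is |α + (1−α)e^(−jωτ)|, which swings between 1 at the peaks and |2α−1| at the notches. The function name below is hypothetical.

```python
import math

def comb_notch_depth_db(alpha):
    """Notch depth in dB, relative to the peaks, when a signal is mixed with
    a delayed copy of itself using weights alpha and (1 - alpha).

    Peak magnitude:  alpha + (1 - alpha) = 1
    Notch magnitude: |alpha - (1 - alpha)| = |2 * alpha - 1|
    """
    notch = abs(2.0 * alpha - 1.0)
    if notch == 0.0:
        return math.inf  # equal weights cancel completely at the notch
    return 20.0 * math.log10(1.0 / notch)
```

Under this simplification, equal weights (α=0.5) give complete cancellation at the notch frequencies, whereas unequal weights such as α=0.8 limit the notches to a few dB, which is what is meant by a shallow comb.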
[0208] The mixing parameter α is controlled adaptively.
[0209] For the mixing,
v(n)=α*l(n)+(1−α)*r(n−τ)
[0210] In general, α can be treated as an FIR filter and the symbol * indicates a convolution operation.
[0211] The powers of the signals P.sub.l and P.sub.r are calculated as:
and
[0212] A goal is to obtain the optimal α so that the power difference with a scaling constant g is minimal, i.e.
[0213] It is possible to solve for α adaptively with the gradient descent method as follows:
[0214] For a one-tap filter (gain stage), the mixing parameter can also be derived as follows. First, we compute the short-term, smoothed power of the signals as:
P.sub.l=forgetingFactor*P.sub.l+(1−forgetingFactor)*(l*l)
P.sub.r=forgetingFactor*P.sub.r+(1−forgetingFactor)*(r*r)
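The recursive power estimate above may be sketched as follows, assuming per-sample updates and an initial power estimate of zero. The function name and the `initial` argument are hypothetical.

```python
def smoothed_power(samples, forgetting_factor=0.7, initial=0.0):
    """Return the running short-term power estimate after each sample.

    Implements P = forgettingFactor * P + (1 - forgettingFactor) * (x * x).
    """
    p = initial
    history = []
    for x in samples:
        p = forgetting_factor * p + (1.0 - forgetting_factor) * (x * x)
        history.append(p)
    return history
```

For a constant input of amplitude 2 the estimate rises from 1.2 after the first sample towards the true power of 4. The same function would be applied separately to the left and right signals to obtain P.sub.l and P.sub.r.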
[0215] Then, we can pick a better signal between the left and right signals. Let us assume P.sub.l>P.sub.r, the level ratio in the mixing would be:
[0216] Our goal is to maintain the level ratio g as a constant for the source from any direction. Therefore,
[0217] In a dynamic acoustic scene, we adaptively update the mixing parameter α as follows:
α.sub.n=α.sub.n−1+stepSize*(α−α.sub.n−1)
[0218] The stepSize may be chosen to be 0.005 and the forgetingFactor may be around 0.7. When g is 0.25, the level difference between the mixed signals is about 6 dB. If P.sub.l==P.sub.r, α.sub.n will converge to α=0.8 and (1−α)=0.2. For default fixed mixing, we set α=0.5.
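The adaptive update can be sketched as follows. The closed-form target for α is an assumption: it takes the level ratio in the mix to be g=(1−α)*sqrt(P.sub.r)/(α*sqrt(P.sub.l)) and solves for α, which is consistent with the stated convergence to α=0.8 and (1−α)=0.2 for g=0.25 and equal powers, but the exact expression used in the disclosure is not reproduced in this text. All function names are hypothetical.

```python
import math

def target_alpha(p_l, p_r, g=0.25, eps=1e-12):
    """Target mixing parameter for the case P_l >= P_r.

    ASSUMPTION: g = (1 - alpha) * sqrt(P_r) / (alpha * sqrt(P_l)),
    solved for alpha. eps guards against zero power.
    """
    a_l = math.sqrt(p_l + eps)
    a_r = math.sqrt(p_r + eps)
    return a_r / (g * a_l + a_r)

def update_alpha(alpha_prev, alpha_target, step_size=0.005):
    """alpha_n = alpha_(n-1) + stepSize * (alpha - alpha_(n-1))."""
    return alpha_prev + step_size * (alpha_target - alpha_prev)

def run_updates(n_blocks, p_l, p_r, alpha0=0.5, g=0.25, step_size=0.005):
    """Apply the update n_blocks times with fixed powers, starting at alpha0."""
    alpha = alpha0
    for _ in range(n_blocks):
        alpha = update_alpha(alpha, target_alpha(p_l, p_r, g), step_size)
    return alpha
```

Starting from the default α=0.5 with equal powers, repeated updates with stepSize 0.005 move α gradually towards 0.8. The small step size makes the adaptation slow relative to the power smoothing, which avoids abrupt changes in the mix.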
[0219] In the above, we assumed P.sub.l>P.sub.r and that the parameter α is multiplied with the left signal. The converse applies when P.sub.r>P.sub.l, in which case α is multiplied with the right signal. To avoid a binary decision to determine the maximum and minimum:
[0220] We introduce a sigmoid function to make a soft decision as follows:
[0221] Thus, for R>>1, gx=0, and for R<<1, gx=1; k is a positive constant, k=4 to 10. The square root of R can be absorbed into k.
[0222] Therefore, P.sub.max=gx*P.sub.l+(1−gx)*P.sub.r and P.sub.min=gx*P.sub.r+(1−gx)*P.sub.l.
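The soft decision may be sketched as follows. The sigmoid used here, gx=1/(1+R^(k/2)) with R=P.sub.r/P.sub.l, is a hypothetical form chosen only because it matches the stated limits (gx→0 for R>>1, gx→1 for R<<1); the exact expression of the disclosure is not reproduced in this text. The function names and the clamping of the exponent are likewise assumptions.

```python
import math

def soft_selector(p_l, p_r, k=8.0, eps=1e-12):
    """Soft decision gx in (0, 1): near 1 when P_l dominates, near 0 when P_r does.

    ASSUMPTION: gx = 1 / (1 + R ** (k / 2)) with R = P_r / P_l,
    evaluated in the log domain and clamped to avoid overflow.
    """
    log_r = math.log((p_r + eps) / (p_l + eps))
    x = max(-50.0, min(50.0, 0.5 * k * log_r))
    return 1.0 / (1.0 + math.exp(x))

def soft_max_min(p_l, p_r, k=8.0):
    """P_max = gx*P_l + (1-gx)*P_r and P_min = gx*P_r + (1-gx)*P_l."""
    gx = soft_selector(p_l, p_r, k)
    p_max = gx * p_l + (1.0 - gx) * p_r
    p_min = gx * p_r + (1.0 - gx) * p_l
    return p_max, p_min
```

For equal powers the selector sits at 0.5 and both soft estimates equal the common power; as the powers diverge, the estimates approach the hard maximum and minimum without a discontinuous switch.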
[0223] In dynamical acoustic scenes, for each incoming block of signals, we adaptively update mixing parameter α to reach the target as follows:
α.sub.n=α.sub.n−1+stepSize*(α−α.sub.n−1)
[0224] The output is mixed as follows:
v(t)=gx*(α*l(t)+(1−α)*r(t−τ))+(1−gx)*(α*r(t−τ)+(1−α)*l(t))
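A minimal sketch of this output mixing is given below, assuming the right signal has already been delayed by τ and that α and gx are scalars for the current block. The function names are hypothetical.

```python
def soft_mix(l, r_delayed, alpha, gx):
    """v = gx*(alpha*l + (1-alpha)*r) + (1-gx)*(alpha*r + (1-alpha)*l).

    l, r_delayed : left sample(s) and already delayed right sample(s);
                   plain floats or NumPy arrays both work
    alpha        : mixing parameter applied to the stronger signal
    gx           : soft decision, 1 when left dominates, 0 when right dominates
    """
    left_dominant = alpha * l + (1.0 - alpha) * r_delayed
    right_dominant = alpha * r_delayed + (1.0 - alpha) * l
    return gx * left_dominant + (1.0 - gx) * right_dominant

def effective_left_weight(alpha, gx):
    """Total weight on the left signal; the right weight is one minus this."""
    return gx * alpha + (1.0 - gx) * (1.0 - alpha)
```

Expanding the expression shows that the weights on the left and right signals always sum to one, so gx only redistributes the weight between the two sides: gx=1 places α on the left signal, gx=0 places α on the right signal, and gx=0.5 yields an equal mix regardless of α.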
[0225] Thus, at least in some aspects, the present disclosure relates to methods of performing bilateral processing of respective microphone signals from a left ear hearing device and a right ear hearing device of a binaural hearing system and to corresponding binaural hearing systems. The binaural hearing system uses ear-to-ear wireless exchange or streaming of a plurality of monaural signals over a wireless communication link. The left ear or right ear head-wearable hearing device is configured to generate a bilaterally or monaurally beamformed signal with a high directivity index that may exhibit maximum sensitivity in a target direction, e.g. at the user's look direction, and reduced sensitivity at the respective ipsilateral sides of the left and right ear head-wearable hearing devices. The opposite ear head-wearable hearing device generates a bilateral omnidirectional microphone signal at the opposite ear by mixing a pair of the monaural signals wherein the bilateral omnidirectional microphone signal exhibits an omnidirectional response or polar pattern with a low directivity index and therefore substantially equal sensitivity for all sound incidence directions or azimuth angles around the user's head.
[0226] Generally, herein the term ‘on-axis’ refers to a direction, or ‘cone’ of directions, relative to one or both of the hearing devices, from which the signals are predominantly captured. That is, ‘on-axis’ refers to the focus area of one or more beamformer(s) or directional microphone(s). This focus area is usually, but not always, in front of the user's face, i.e. the ‘look direction’ of the user. In some aspects, one or both of the hearing devices capture the respective signals from a direction in front, on-axis, of the user. The term ‘off-axis’ refers to all other directions than the ‘on-axis’ directions relative to one or both of the hearing devices. The term ‘target sound source’ or ‘target source’ refers to any sound signal source which produces an acoustic signal of interest e.g. from a human speaker. A ‘noise source’ refers to any undesired sound source which is not a ‘target source’. For instance, a noise source may be the combined acoustic signal from many people talking at the same time, machine sounds, vehicle traffic sounds etc.
[0227] The term ‘reproduced signal’ refers to a signal which is presented to the user of the hearing device e.g. via a small loudspeaker, denoted a ‘receiver’ in the field of hearing devices. The ‘reproduced signal’ may include a compensation for a hearing loss or the ‘reproduced signal’ may be a signal with or without compensation for a hearing loss. The wording ‘strength’ of a signal refers to a non-instantaneous level of the signal e.g. proportional to a one-norm (1-norm) or a two-norm (2-norm) or a power (e.g. power of two) of the signal.
[0228] The term ‘ipsilateral hearing device’ or ‘ipsilateral device’ refers to one device, worn at one side of a user's head e.g. on a left side, whereas a ‘contralateral hearing device’ or ‘contralateral device’ refers to another device, worn at the other side of a user's head e.g. on the right side. The ‘ipsilateral hearing device’ or ‘ipsilateral device’ may be operated together with a contralateral device, which is configured in the same way as the ipsilateral device or in another way. In some aspects, the ‘ipsilateral hearing device’ or ‘ipsilateral device’ is an electronic listening device configured to compensate for a hearing loss. In some aspects the electronic listening device is configured without compensation for a hearing loss. A hearing device may be configured to one or more of: protect against loud sound levels in the surroundings, playback of audio, communicate as a headset for telecommunication, and to compensate for a hearing loss.
[0229] Also, as used in this specification, the term “first input signal” may refer to the original first input signal, a weighted version of the first input signal, or a gain-applied first input signal, depending on the context. For example, the term “the generated first input signal” indicates that the first input signal is the original signal. As another example, the term “the first input signal in the weighted combination” (or any of other similar terms) may indicate that the first input signal is the original first input signal if the first input signal is not weighted or is weighted by a factor of 1 in the weighted combination, or may indicate that the first input signal is a weighted first input signal if it is multiplied by a weight factor in the weighted combination, or may indicate that the first input signal is a gain-applied first input signal if the original first input signal is adjusted by a gain factor in the weighted combination (in which case, the weight may be the gain factor or may be based on the gain factor).
[0230] Similarly, as used in this specification, the term “second input signal” may refer to the original second input signal, a weighted version of the second input signal, or a gain-applied second input signal, depending on the context. For example, the term “the received second input signal” indicates that the second input signal is the original signal as received by a device. As another example, the term “the second input signal in the weighted combination” (or any of other similar terms) may indicate that the second input signal is the original second input signal if the second input signal is not weighted or is weighted by a factor of 1 in the weighted combination, or may indicate that the second input signal is a weighted second input signal if it is multiplied by a weight factor in the weighted combination, or may indicate that the second input signal is a gain-applied second input signal if the original second input signal is adjusted by a gain factor in the weighted combination (in which case, the weight may be the gain factor or may be based on the gain factor).
[0231] Herein the term ‘characteristic’ e.g. in omnidirectional characteristic corresponds to the term ‘sensitivity’, e.g. in omnidirectional sensitivity.