BINAURAL HEARING AID SYSTEM AND A HEARING AID COMPRISING OWN VOICE ESTIMATION

Abstract

A binaural hearing aid system includes first and second hearing aids configured to be worn by a user at or in respective first and second ears of the user, each of the first and second hearing aids including: at least one input transducer configured to pick up a sound at the at least one input transducer and to convert the sound to at least one electric input signal representative of the sound, the sound at the at least one input transducer including a mixture of a target signal and noise; a controller for evaluating the sound at the at least one input transducer and providing a control signal indicative of a property of the sound; a transceiver configured to establish a communication link between the first and second hearing aids allowing the exchange of the control signal between the first and second hearing aids; a transmitter for establishing an audio link for transmitting the at least one electric input signal, or a processed version thereof, to another device. The controller is configured to: transmit the locally provided control signal to, and receive a corresponding remotely provided control signal from the opposite hearing aid via the communication link, and to compare the locally provided and the remotely provided control signals and to provide a comparison control signal in dependence thereof, and to transmit the at least one electric input signal, or a processed version thereof, to the another device via the audio link in dependence of the comparison control signal.

Claims

1. A binaural hearing aid system comprising first and second hearing aids configured to be worn by a user at or in respective first and second ears of the user, each of the first and second hearing aids comprising at least one input transducer configured to pick up a sound at said at least one input transducer and to convert the sound to at least one electric input signal representative of said sound, the sound at said at least one input transducer comprising a mixture of a target signal and noise, a controller for evaluating said sound at said at least one input transducer and providing a control signal indicative of a property of said sound, a transceiver configured to establish a communication link between the first and second hearing aids allowing the exchange of said control signal between the first and second hearing aids, a transmitter for establishing an audio link for transmitting said at least one electric input signal, or a processed version thereof, to another device, wherein said controller is configured to transmit said locally provided control signal to, and receive a corresponding remotely provided control signal from the opposite hearing aid via said communication link, and to compare said locally provided and said remotely provided control signals and to provide a comparison control signal in dependence thereof, and to transmit said at least one electric input signal, or a processed version thereof, to said another device via said audio link in dependence of said comparison control signal.

2. A binaural hearing aid system according to claim 1 wherein said at least one input transducer comprises at least two input transducers providing at least two electric input signals.

3. A binaural hearing aid system according to claim 2, wherein the first and second hearing aids each comprise a beamformer filter connected to said at least two input transducers.

4. A binaural hearing aid system according to claim 3 wherein the beamformer filter comprises an own voice beamformer configured to provide an estimate of the user's own voice based on said at least two electric input signals.

5. A binaural hearing aid system according to claim 1, wherein the property of said sound comprises a signal-to-noise ratio.

6. A binaural hearing aid system according to claim 1, wherein the property of said sound comprises a noise level estimate, or a level estimate of the at least one electric input signal.

7. A binaural hearing aid system according to claim 1 wherein the property of said sound comprises a speech intelligibility estimate.

8. A binaural hearing aid system according to claim 1, wherein the property of said sound comprises a feedback estimate.

9. A binaural hearing aid system according to claim 4, wherein the beamformer filter further comprises an environment beamformer configured to provide an estimate of a target signal in the environment.

10. A binaural hearing aid system configured to operate in at least two modes, a normal mode, wherein the estimate of the target signal in the environment has first priority and an own voice mode, wherein the estimate of the user's own voice has first priority.

11. A binaural hearing aid system according to claim 1, wherein the first and second hearing aids each are constituted by or comprises an air-conduction type hearing aid, a bone-conduction type hearing aid, a cochlear implant type hearing aid, or a combination thereof.

12. A hearing aid configured to be worn by a user at or in an ear of the user, the hearing aid comprising at least one input transducer configured to pick up a sound at said at least one input transducer and to convert the sound to at least one electric input signal representative of said sound, the sound at said at least one input transducer comprising a mixture of a target signal and noise, a controller for evaluating said sound at said at least one input transducer and providing a control signal indicative of a property of said sound, a transceiver configured to establish a communication link to a contra-lateral hearing aid of a binaural hearing aid system allowing the exchange of said control signal between the two hearing aids, a transmitter for establishing an audio link for transmitting said at least one electric input signal, or a processed version thereof, to another device, wherein said controller is configured to transmit said locally provided control signal to, and receive a corresponding remotely provided control signal from said contra-lateral hearing aid via said communication link, and to compare said locally provided and said remotely provided control signals and to provide a comparison control signal in dependence thereof, and to transmit said at least one electric input signal, or a processed version thereof, to said another device via said audio link in dependence of said comparison control signal.

13. A method of operating a hearing aid configured to be worn at or in an ear of a user, the method comprising converting sound to at least one electric input signal representative of said sound, the sound at said at least one input transducer comprising a mixture of a target signal and noise; evaluating said sound at said at least one input transducer and providing a control signal indicative of a property of said sound; establishing a communication link to a contra-lateral hearing aid of a binaural hearing aid system allowing the exchange of said control signal between the two hearing aids; establishing an audio link for transmitting said at least one electric input signal, or a processed version thereof, to another device; transmitting said locally provided control signal to, and receiving a corresponding remotely provided control signal from said contra-lateral hearing aid via said communication link, and comparing said locally provided and said remotely provided control signals and providing a comparison control signal in dependence thereof, and transmitting said at least one electric input signal, or a processed version thereof, to said another device via said audio link in dependence of said comparison control signal.

14. Use of a hearing aid as claimed in claim 12 in a binaural hearing aid system.

15. A hearing aid configured to be worn by a user at or in an ear of the user is provided by the present disclosure, the hearing aid comprising at least two input transducers configured to pick up a sound at said at least two input transducers and to convert the sound to respective at least two electric input signals representative of said sound, a first filter for filtering said at least two electric input signals and providing a first filtered signal, an output transducer for converting said first filtered signal, or a signal derived therefrom, to stimuli perceivable by the user as sound, a second filter for filtering said at least two electric input signals and providing a second filtered signal comprising a current estimate of the user's own voice, a transceiver for establishing an audio link to an external communication device (e.g. a telephone), a controller configured to allow the hearing aid to operate in at least two modes, a communication mode wherein said audio link to said external communication device is established, and at least one non-communication mode, wherein each of the first and second filters are configured to operate in a more power consuming and a less power consuming mode in dependence of said controller, and wherein said controller, when said hearing aid is in said communication mode, is configured to set said first filter in said less power consuming mode, and set said second filter in said more power consuming mode.

16. A binaural hearing aid system according to claim 2, wherein the property of said sound comprises a signal-to-noise ratio.

17. A binaural hearing aid system according to claim 3, wherein the property of said sound comprises a signal-to-noise ratio.

18. A binaural hearing aid system according to claim 4, wherein the property of said sound comprises a signal-to-noise ratio.

19. A binaural hearing aid system according to claim 2, wherein the property of said sound comprises a noise level estimate, or a level estimate of the at least one electric input signal.

20. A binaural hearing aid system according to claim 3, wherein the property of said sound comprises a noise level estimate, or a level estimate of the at least one electric input signal.

Description

BRIEF DESCRIPTION OF DRAWINGS

[0150] The aspects of the disclosure may be best understood from the following detailed description taken in conjunction with the accompanying figures. The figures are schematic and simplified for clarity, and they just show details to improve the understanding of the claims, while other details are left out. Throughout, the same reference numerals are used for identical or corresponding parts. The individual features of each aspect may each be combined with any or all features of the other aspects. These and other aspects, features and/or technical effect will be apparent from and elucidated with reference to the illustrations described hereinafter in which:

[0151] FIG. 1 shows a hearing aid according to the present disclosure in a setup configured for facilitating a telephone conversation,

[0152] FIG. 2 illustrates a hearing aid user wearing a binaural hearing aid system according to the present disclosure in a first mode of a telephone conversation conducted in asymmetrically distributed background noise,

[0153] FIG. 3 illustrates a hearing aid user wearing a binaural hearing aid system according to the present disclosure in a second mode of a telephone conversation conducted in asymmetrically distributed background noise,

[0154] FIG. 4 shows a first embodiment of binaural hearing aid system comprising first and second hearing aids according to the present disclosure in a telephone mode, where a telephone conversation is conducted with a remotely located person,

[0155] FIG. 5 shows a second embodiment of binaural hearing aid system comprising first and second hearing aids according to the present disclosure in a telephone mode, where a telephone conversation is conducted with a remotely located person,

[0156] FIG. 6 shows an adaptive (own voice) beamformer configuration, wherein the adaptive beamformer in the k′th frequency sub-band Ŝ.sub.ov(k) is created by subtracting a (e.g. fixed) target cancelling beamformer C.sub.2(k) scaled by the adaptation factor β(k) from an (e.g. fixed) omni-directional beamformer C.sub.1(k),

[0157] FIG. 7 shows an adaptive (own voice) beamformer configuration similar to the one shown in FIG. 6, where the adaptive beampattern Ŝ.sub.ov(k) is created by subtracting a target cancelling beamformer C.sub.2(k) scaled by the adaptation factor β(k) from another fixed beampattern C.sub.1(k),

[0158] FIG. 8 shows a hearing device in a telephone configuration,

[0159] FIG. 9A and FIG. 9B illustrate a scheme for managing processing in a hearing device depending on its mode of operation,

[0160] FIG. 9A illustrating a normal mode of operation,

[0161] FIG. 9B illustrating a telephone mode of operation.

[0162] FIG. 10A shows a binaural hearing aid system comprising first and second hearing aids, where the binaural audio signals are combined, and

[0163] FIG. 10B shows a further binaural hearing aid system comprising first and second hearing aids, where the binaural audio signals are combined.

[0164] The figures are schematic and simplified for clarity, and they just show details which are essential to the understanding of the disclosure, while other details are left out. Throughout, the same reference signs are used for identical or corresponding parts.

[0165] Further scope of applicability of the present disclosure will become apparent from the detailed description given hereinafter. However, it should be understood that the detailed description and specific examples, while indicating preferred embodiments of the disclosure, are given by way of illustration only. Other embodiments may become apparent to those skilled in the art from the following detailed description.

DETAILED DESCRIPTION OF EMBODIMENTS

[0166] The detailed description set forth below in connection with the appended drawings is intended as a description of various configurations. The detailed description includes specific details for the purpose of providing a thorough understanding of various concepts. However, it will be apparent to those skilled in the art that these concepts may be practiced without these specific details. Several aspects of the apparatus and methods are described by various blocks, functional units, modules, components, circuits, steps, processes, algorithms, etc. (collectively referred to as “elements”). Depending upon particular application, design constraints or other reasons, these elements may be implemented using electronic hardware, computer program, or any combination thereof.

[0167] The electronic hardware may include micro-electronic-mechanical systems (MEMS), integrated circuits (e.g. application specific), microprocessors, microcontrollers, digital signal processors (DSPs), field programmable gate arrays (FPGAs), programmable logic devices (PLDs), gated logic, discrete hardware circuits, printed circuit boards (PCB) (e.g. flexible PCBs), and other suitable hardware configured to perform the various functionality described throughout this disclosure, e.g. sensors, e.g. for sensing and/or registering physical properties of the environment, the device, the user, etc. Computer program shall be construed broadly to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executables, threads of execution, procedures, functions, etc., whether referred to as software, firmware, middleware, microcode, hardware description language, or otherwise.

[0168] The present application relates to the field of hearing aids. The disclosure relates in particular to own voice estimation, e.g. in noisy environments.

[0169] FIG. 1 shows a hearing aid according to the present disclosure in a setup configured for facilitating a telephone conversation. The hearing aid user is presented with a mixture (SO) of local sound (SI) and the voice (REMV) of a far-end talker (CP), while the far-end listener (CP) is presented with the audio obtained from the hearing aid microphones (M1, M2). The hearing aid microphone signals (IN1, IN2) may be enhanced (DSP) in order to present the voice (UV) of the hearing aid user with the noisy background sounds (SI) reduced.

[0170] FIG. 2 shows a hearing aid user wearing a binaural hearing aid system according to the present disclosure in a first mode of a telephone conversation conducted in asymmetrically distributed background noise. A hearing instrument (hearing aid, headset, hearable) is being used for phone conversation such that the voice (OV) of the hearing instrument wearer (U) is picked up by the hearing instrument and the sound is possibly enhanced and transmitted to the far-end listener via a telephone (Phone). This is illustrated in FIG. 1. The audio signal (REMV) from the far-end talker (CP) is streamed directly from the phone via the hearing instrument (HA1, HA2) into the ear canal of the person (U) wearing the hearing instrument. In order to keep a phone conversation running, it is important that both talkers are intelligible. In order to enhance the signal transmitted to the far-end listener, background noise can be suppressed e.g. by use of an own voice beamformer, aiming at reducing the noise as much as possible while the own voice is unaltered. The own voice may be further enhanced by an own voice filter. Typically, signal enhancement is applied locally at one of the hearing instruments, which then transmits the enhanced signal to the telephone. In some situations, as illustrated in FIG. 2, one hearing instrument (HA2) is exposed to more background noise than the other hearing instrument (HA1) (here due to noise source N #1, e.g. a baby crying). In this situation it would be better to transmit the enhanced audio signal of the instrument (HA1) which (e.g. due to the head shadowing effect) is exposed to a smaller amount of background noise. This is illustrated in FIG. 3.

[0171] FIG. 3 illustrates a hearing aid user wearing a binaural hearing aid system according to the present disclosure in a second mode of a telephone conversation conducted in asymmetrically distributed background noise. The second mode of a telephone conversation is similar to the first mode illustrated in FIG. 2, apart from the user's voice (UV) being picked up at the opposite ear (Ear1) in the second mode compared to the first mode (Ear2). By transmitting the (possibly enhanced) audio signal (OV) from the hearing instrument (HA1) exposed to the smallest amount of background noise, a more intelligible signal can be presented to the far-end listener.

[0172] By having access to two hearing instruments (HA1, HA2), it is possible to obtain an even better estimate of the hearing instrument user's own voice (OV). In order to achieve the full potential of the two hearing instruments, it would be advantageous to combine the microphone signals from the two instruments. One option is to transmit the microphone signal from one instrument to the other (oppositely located) instrument, which then transmits a (possibly linear) combination of the microphone signals to the far-end listener via the telephone. Transmitting the audio signal from one instrument (e.g. HA1) to the other (e.g. HA2) is, however, expensive (in a power consumption sense). Alternatively, the enhanced signals (OV1, OV2) from both hearing instruments (HA1, HA2) are transmitted to the telephone, which then combines the signals from the left and right instruments into a single enhanced signal. This solution is, however, difficult, as full access to the signal processing capabilities of different telephones (e.g. of different brands) is not always possible.

[0173] FIGS. 1, 4 and 5 illustrate exemplary embodiments of solutions to the problem illustrated in FIG. 2. FIG. 1, FIG. 3, FIG. 4 and FIG. 5 show a hearing aid (FIG. 1) and a binaural hearing aid system (FIG. 3, 4, 5) in a telephone mode of operation, where the user is engaged in a telephone conversation with a remote communication partner (CP) via respective telephone sets and, e.g. a public switched telephone network (PSTN). The hearing aid (HA) and the first and second hearing aids (HA1, HA2) of the binaural hearing aid system worn by the user are used as an audio interface to the user's telephone apparatus (Phone), here a portable telephone, e.g. a smartphone. As illustrated in FIG. 1, the electric input signals (S1, S2) are branched off from the forward (audio) path of the hearing aid(s) and processed in a processor (e.g. a digital signal processor, DSP) and an estimate Ŝ.sub.ov of the user's own voice is thereby provided. The user's own voice is (e.g. wirelessly, e.g. via a wireless link (WL), cf. ‘OV’) transmitted to another device, here to a telephone apparatus (phone), from where it—in a telephone mode of operation—may be transmitted to a telephone apparatus of a remotely connected communication partner (CP).

[0174] In other embodiments (modes of operation), the user's own voice (‘OV’) may be transmitted to a personal digital assistant, e.g. of a smartphone, or similar device, e.g. for providing an audio interface to a search engine, or to a cloud service, e.g. for keyword detection, speech recognition, source separation, or other tasks.

[0175] In the telephone mode of operation illustrated in FIG. 1, 4, 5, the hearing aid(s) receive(s) inputs from the user's telephone representing audio from the remote communication partner (CP). The remote voice (‘REMV’) is received by appropriate transceiver circuitry in the hearing aid(s) and forwarded as signal S.sub.REM to a combination unit (CU), e.g. a summation unit (‘+’) located in the forward (audio) path. The forward (audio) path comprises a processor (e.g. a digital signal processor, DSP) for applying one or more processing algorithms to the electric input signals (S1, S2), e.g. beamforming, noise reduction, compression (e.g. for compensating for a user's hearing loss), etc. and providing a processed signal (PS) representing sound from the environment (SI) as received by the input transducers (here microphones) (M1, M2) of the hearing aid. The processed signal (PS) may be mixed with, e.g. added to, the signal S.sub.REM comprising the sound received from the remote telephone apparatus (e.g. including the voice of the communication partner (CP)). The resulting (combined) signal (OUT) is fed to an output transducer of the hearing aid(s), here a loudspeaker (SPK) configured to convert the output signal (OUT) to acoustic stimuli (sound, SO) propagated to the user's ear (Ear). Thereby a handsfree audio interface to a telephone apparatus (Phone) of the user is established, cf. e.g. US20150163602A1.

[0176] FIG. 4 shows a first embodiment of binaural hearing aid system comprising first and second hearing aids according to the present disclosure in a telephone mode, where a telephone conversation is conducted with a remotely located person. Each of the first and second hearing aids (HA1, HA2) comprises the functional elements of the embodiment of FIG. 1. Based on a locally estimated background noise level (e.g. a background noise level estimate, SNR estimate (as in FIG. 4), speech intelligibility estimate, sound quality estimate or simply a level estimate), the hearing instrument (HA1 or HA2) exhibiting the best quality of the own voice estimate, and from which own voice audio should hence be transmitted via the phone to the far-end listener, may be selected.

[0177] At both left and right hearing instruments (HA1, HA2), a local own voice enhancement algorithm is running in a processor (DSP). In principle, no enhancement algorithm is mandatory for the proposed method, however. Furthermore, an SNR estimator (SNR) in each hearing instrument is configured to estimate a local (own voice) signal-to-noise ratio, which may be exchanged between the first and second instruments (HA1, HA2) via an interaural link, e.g. a wireless link (cf. dashed arrows denoted SNR.sub.1 (from HA1 to HA2) and SNR.sub.2 (from HA2 to HA1), respectively). The SNR values (SNR.sub.1 and SNR.sub.2) are compared in respective controllers (C&S Rx/Tx) in each hearing instrument, and the own voice signal estimate (Ŝ.sub.ov1, Ŝ.sub.ov2) from the instrument with the highest signal-to-noise ratio may be selected for audio transmission to the telephone (Phone). In the example of FIGS. 4 and 5 the best quality own voice signal estimate is the one provided by the first hearing instrument (HA1), and consequently the own voice estimate (Ŝ.sub.ov1) of the first hearing instrument (HA1) is transmitted from HA1 to the user's telephone apparatus (Phone), cf. zig-zag arrow (denoted OV1) from unit C&S Rx/Tx of the first hearing instrument (HA1) to the user's telephone apparatus (Phone). The controllers (C&S Rx/Tx) each comprise a comparator for comparing a property (here SNR) of the two electric input signals (or, as here, the SNR of the beamformed signals in the form of the own voice estimates (Ŝ.sub.ov1, Ŝ.sub.ov2)) of the local and the opposite hearing aids. The controllers are configured to provide a control signal indicative of which of the first and second hearing instruments exhibits the best own voice estimate according to a criterion related to the property or properties compared (here SNR, the criterion e.g. being largest SNR). 
The controllers (C&S Rx/Tx) each further comprise appropriate transceiver circuitry (Rx/Tx) allowing the property of the electric input signals (or a signal or signals derived therefrom, here an SNR of the beamformed own voice signal) to be exchanged between the two hearing instruments. The signal from the telephone of the remote communication partner (CP), as received by the user's telephone (Phone) (e.g. via a telephone network (PSTN)), is transmitted (via a wireless link, e.g. based on Bluetooth or Bluetooth Low Energy (or similar technology)) to the hearing aid(s), e.g. to the hearing instrument (here HA1) that transmits the user's own voice to the Phone, or to both hearing instruments (HA1, HA2), cf. ‘REMV’ from Phone to receiver (Rx) of the respective hearing instrument(s) (HA1, HA2). The remote signal is received in the hearing instrument(s) by respective wireless receivers (Rx) and the corresponding audio signal (S.sub.REM) is extracted and forwarded to the combination unit (CU), and e.g. mixed with the processed environment signal (PS1, PS2) of the forward audio path to form the output signal (OUT1, OUT2). The output signal (OUT1, OUT2) is presented to the user via the output transducer (SPK) of the hearing aid(s) in question.
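The comparison described above can be sketched as follows. This is a minimal illustration only, assuming the exchanged property is an SNR in dB and the criterion is largest SNR; the names `ControlSignal` and `comparison_control`, and the tie-breaking rule on the instrument label, are hypothetical and not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass
class ControlSignal:
    """Property of the local sound exchanged over the interaural link."""
    device_id: str  # instrument label, e.g. "HA1" or "HA2"
    snr_db: float   # local own-voice SNR estimate (dB is an assumed unit)


def comparison_control(local: ControlSignal, remote: ControlSignal) -> bool:
    """Return True if the local instrument should transmit its own-voice estimate.

    Criterion: largest SNR. Ties are broken on device_id (an assumption) so
    that both instruments, running the same rule, reach consistent decisions.
    """
    if local.snr_db != remote.snr_db:
        return local.snr_db > remote.snr_db
    return local.device_id < remote.device_id


# Each instrument evaluates the same rule with the roles swapped.
ha1 = ControlSignal("HA1", snr_db=12.0)
ha2 = ControlSignal("HA2", snr_db=4.5)
ha1_transmits = comparison_control(local=ha1, remote=ha2)
ha2_transmits = comparison_control(local=ha2, remote=ha1)
```

Because only the scalar property crosses the interaural link, each instrument can decide locally whether to open the audio link, without exchanging audio between the ears.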

[0178] The ‘far-end selection’, i.e. the selection of which of the first and second hearing instruments exhibits the best own voice estimate according to a criterion related to the property or properties compared, could be based on (or influenced by) how well each hearing instrument is mounted. This could e.g. be measured by an accelerometer measuring the tilt of the instrument. If the angle of the microphone array direction w.r.t. the mouth is small, a better own voice pickup is expected (the worst case is when the microphone array direction is orthogonal to the mouth direction).

[0179] As the own voice level is similar at the two hearing instruments (due to the symmetry of the head and ears relative to the mouth), other (simpler) measures than the SNR estimate may be applied for comparison and selection, e.g. noise level estimate (select instrument with lowest noise estimate) or simply a level estimate (select instrument with lowest level estimate, e.g. measured during absence of OV, and/or e.g. measured while the far-end talker is active).
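The simpler level-based selection can be sketched in the same spirit. The function name `select_by_lowest_level` and the dB values are illustrative assumptions; the estimates are taken to have been measured as described above (e.g. during absence of own voice).

```python
def select_by_lowest_level(levels_db):
    """Pick the instrument with the lowest noise (or level) estimate.

    levels_db maps an instrument label to its estimate in dB (assumed unit).
    Labels are sorted first so that ties resolve deterministically.
    """
    return min(sorted(levels_db), key=lambda name: levels_db[name])


# HA2 faces the noise source, so HA1 is selected for transmission.
selected = select_by_lowest_level({"HA1": 52.0, "HA2": 61.5})
```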

[0180] As an alternative to SNR, local speech intelligibility estimates or speech quality estimates may be applied as the selection criterion. In order to make a possible switch of transmitting instrument as inaudible as possible, switching may be performed while the far-end talker is speaking.
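One way to defer a switch of transmitting instrument until the far-end talker is active is sketched below. The class `TransmitterSelector` and the boolean `far_end_talking` flag (e.g. from a voice activity detector on the received signal) are hypothetical names introduced for illustration.

```python
class TransmitterSelector:
    """Tracks which instrument currently transmits own voice to the phone."""

    def __init__(self, initial="HA1"):
        self.active = initial

    def update(self, preferred, far_end_talking):
        """Adopt the preferred instrument only while the far end is talking.

        Switching during far-end speech means the user is unlikely to be
        speaking, so the change is less audible to the far-end listener.
        """
        if preferred != self.active and far_end_talking:
            self.active = preferred
        return self.active


selector = TransmitterSelector("HA2")
held = selector.update(preferred="HA1", far_end_talking=False)     # switch deferred
switched = selector.update(preferred="HA1", far_end_talking=True)  # switch applied
```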

[0181] FIG. 5 shows a second embodiment of binaural hearing aid system comprising first and second hearing aids according to the present disclosure in a telephone mode, where a telephone conversation is conducted with a remotely located person. The embodiment of a binaural hearing aid system of FIG. 5 is similar to the embodiment of FIG. 4 but comprises more functional elements and is described in more detail. Input sound s.sub.in1, s.sub.in2 at the input (IU.sub.MIC) of the respective first and second hearing aids (HA1, HA2) is picked up by M input transducers, e.g. microphones, and corresponding M electric input signals S.sub.11, . . . , S.sub.1M and S.sub.21, . . . , S.sub.2M of the first and second hearing aids are provided to the beamformer filter. The electric input signals of each hearing aid may be provided in a time-frequency representation (k,n) (where k and n are frequency and time indices, respectively) by respective (M) analysis filter banks (cf. e.g. ‘Filterbank’ in FIG. 6, 7), e.g. included in the input unit (IU.sub.MIC). The beamformer filter comprises an environment beamformer (BF) and an own voice beamformer (OVBF). The environment beamformer (BF) provides a spatially filtered environment signal Ŝ.sub.env(k,n), e.g. an estimate of a target signal in the (far-field) environment of the user. The own voice beamformer (OVBF) provides a spatially filtered estimate of the user's own voice Ŝ.sub.ov(k,n).

[0182] The first and second hearing aids each comprise a forward audio processing path for processing acoustic signals picked up by the input unit and for presenting them (at least in a normal mode of operation) to the user via an output transducer (OT), preferably in an enhanced version, e.g. for better perception (e.g. intelligibility of speech) by the user. In the embodiment of FIG. 5 the forward audio processing path is assumed to be conducted in the frequency domain (k,n). The forward audio processing path comprises the environment beamformer (BF) and a selector-mixer (SEL-MIX) connected to the environment beamformer. The selector-mixer is configured to allow a mixing of the environment signal Ŝ.sub.env(k,n) (or a processed version thereof) with another signal, here a signal (S.sub.REM) received from an external device, e.g. a telephone. The output signal (Ŝ.sub.x(k,n)) from the selector-mixer (SEL-MIX) is a weighted combination of the two input signals (Ŝ.sub.env(k,n), S.sub.REM) in the respective hearing aids (HA1, HA2). The output signal (Ŝ.sub.x(k,n)) from the selector-mixer (SEL-MIX) may be equal to one of the input signals or to a mixture of the two (e.g. Ŝ.sub.x(k,n)=α Ŝ.sub.env(k,n)+(1−α) S.sub.REM, 0≤α≤1, e.g. in each of the first and second hearing aids (HA1, HA2)). In a telephone mode, the weighting factor α may e.g. be smaller than 0.5, so that the largest weight is on the remotely received audio signal. In a normal (non-communication) mode, the weighting factor α may e.g. be equal to 1, so that only the environment sound signal (Ŝ.sub.env(k,n)) is propagated in the forward audio processing path. The selector-mixer (SEL-MIX) is controlled by a mode control signal (Mode). The forward audio processing path may further comprise a processor (HAG) for applying one or more processing algorithms to an input signal (Ŝ.sub.x1(k,n), Ŝ.sub.x2(k,n)), and providing a processed (enhanced) output signal (OUT1, OUT2). 
The one or more processing algorithms may comprise a compressive amplification algorithm configured to compensate for a hearing impairment of the user (e.g. to apply a frequency and level dependent gain to the input signal to the processor (HAG)). The forward path may further comprise a synthesis filter bank (FBS) configured to convert a signal (OUT1, OUT2) in the frequency domain to a signal in the time domain (out1, out2). The time domain output signal (out1, out2) is fed to respective output transducers of the first and second hearing aids (HA1, HA2) for presentation as stimuli perceivable as sound to the user of the binaural hearing aid system. The output transducers (OT) may comprise a loudspeaker of an air conduction hearing aid, and/or a vibrator of a bone conducting hearing aid or a multielectrode array of a cochlear implant type hearing aid. In the embodiment of FIG. 5, the output transducer (OT) of the first and second hearing aids is assumed to provide ‘Output sound’ s.sub.out1 and s.sub.out2 at the first and second ear, respectively, of the user.
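The weighted combination performed by the selector-mixer can be illustrated with a short sketch. The function `selector_mixer` and the table `MODE_ALPHA` are hypothetical; the telephone-mode value 0.2 is merely an assumed example of a weight below 0.5, and the signals are treated as one time frame of per-band values (real values here for simplicity, although time-frequency bins are generally complex).

```python
def selector_mixer(s_env, s_rem, alpha):
    """Mix environment and remote signals per band: S_x = a*S_env + (1 - a)*S_REM."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * e + (1.0 - alpha) * r for e, r in zip(s_env, s_rem)]


# Assumed mode-to-weight mapping: 1.0 passes only the environment signal,
# 0.2 puts the largest weight on the remotely received audio.
MODE_ALPHA = {"normal": 1.0, "telephone": 0.2}

s_env = [1.0, 2.0]  # environment beamformer output for two bands
s_rem = [3.0, 4.0]  # remote (telephone) signal for the same bands
normal_out = selector_mixer(s_env, s_rem, MODE_ALPHA["normal"])
quarter_mix = selector_mixer(s_env, s_rem, 0.25)
```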

[0183] The own voice beamformer (OVBF) is configured to provide an estimate (Ŝ.sub.ov1(k,n), Ŝ.sub.ov2(k,n)) of the user's own voice in dependence of the electric input signals (S.sub.11, . . . , S.sub.1M and S.sub.21, . . . , S.sub.2M) of the respective hearing aids (HA1, HA2). The estimate of the user's own voice (or a further processed (e.g. further noise reduced) version thereof) is fed to a synthesis filter bank (FBS) for converting the frequency sub-band signals (Ŝ.sub.ov1(k,n), Ŝ.sub.ov2(k,n)) to time-domain signals (ŝ.sub.ov1(t), ŝ.sub.ov2(t), where t is time) in the respective first and second hearing aids (HA1, HA2). The time domain representation of the own voice estimate is fed to the transmitter part (ATx) of the audio transceiver and transmitted to the external device (cf. ‘Own voice audio’ from ‘HA2’ to ‘Phone’ (cf. solid bold zig-zag arrow) in FIG. 5) in dependence of a comparison control signal (CTx1, CTx2), cf. below.

[0184] The first and second hearing aids may each comprise a controller (CTR1, CTR2) configured to evaluate the sound at the input unit (IU.sub.MIC) and to provide a control signal (PCT1, PCT2) indicative of a property (e.g. SNR, or noise level, etc.) of the sound, by evaluating one or more of the electric input signals (S.sub.11, . . . , S.sub.1M and S.sub.21, . . . , S.sub.2M) (or a processed (e.g. filtered) version thereof) of the first and second hearing aids, respectively, cf. signals S.sub.x-P1 and S.sub.x-P2 from the own voice beamformer (OVBF) to the controller (CTR1, CTR2) of the first and second hearing aids, respectively. The controller (CTR1, CTR2) may further be configured to control (e.g. enable, disable) the own voice beamformer (OVBF) (cf. signal CBF1, CBF2), e.g. in dependence of the mode of operation, e.g. controlled by a mode control signal (Mode). An own voice beamformer of a particular hearing aid may be disabled when the estimate of the user's own voice is not required (e.g. to be transmitted to the user's telephone in a telephone mode or to be forwarded to a voice control interface in a voice control mode, etc.).

[0185] The first and second hearing aids may each comprise a transceiver (IARx/IATx) configured to establish an (interaural) communication link (IA-WL) between the first and second hearing aids (HA1, HA2) allowing the exchange of the control signals (PCT1, PCT2) between the first and second hearing aids. The first and second hearing aids may each be configured to transmit said locally provided control signal (PCT1, PCT2) to, and to receive a corresponding remotely provided control signal (PCT2, PCT1) from, the opposite hearing aid via the (interaural) communication link (IA-WL). The controller (CTR1, CTR2) of the first and second hearing aids may be configured to compare the locally provided and the remotely provided control signals (PCT1, PCT2) and to provide a comparison control signal (CTx1, CTx2) in dependence thereof.
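The comparison of the locally and remotely provided control signals may, for example, be sketched as below. The sketch rests on assumptions that are not stated in the description: the property carried by PCT1/PCT2 is taken to be an SNR estimate in dB, the hearing aid with the higher SNR is taken to be the one that enables its audio transmitter, and a tie is resolved in favour of the first hearing aid. The function name `comparison_control` is hypothetical.

```python
def comparison_control(local_snr_db, remote_snr_db, local_is_first):
    """Return True if this hearing aid should enable its audio transmitter.

    local_snr_db:   locally provided control signal (assumed: SNR in dB)
    remote_snr_db:  control signal received from the opposite hearing aid
    local_is_first: True in the first hearing aid (HA1), False in HA2;
                    used only to break ties (an assumed convention)
    """
    if local_snr_db > remote_snr_db:
        return True
    if local_snr_db < remote_snr_db:
        return False
    # Equal values: the tie-break ensures exactly one device transmits.
    return local_is_first


# Both devices evaluate the same pair of values after the exchange.
pct1, pct2 = 12.0, 9.0
ctx1 = comparison_control(pct1, pct2, local_is_first=True)   # in HA1
ctx2 = comparison_control(pct2, pct1, local_is_first=False)  # in HA2
```

Because both hearing aids apply the same rule to the same exchanged pair of values, they reach complementary decisions without any further signalling.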

[0186] The first and second hearing aids (HA1, HA2) may each comprise an audio transceiver (ATx, ARx) for establishing an audio link for transmitting an audio signal, e.g. an own voice estimate (Ŝ.sub.ov1(k,n), Ŝ.sub.ov2(k,n)), or a processed version thereof, to another device, e.g. a telephone (Phone in FIG. 5). The first and second hearing aids (HA1, HA2) are configured to control the audio transceiver (at least the transmitter part (ATx)) in dependence of the comparison control signal (CTx1, CTx2). The controller (CTR1, CTR2) may be activated or deactivated in dependence of a mode control signal (Mode) indicative of a present mode of operation (e.g. a telephone mode or a normal (non-communication) mode). The controller (CTR1, CTR2) may be configured to provide the mode control signal (Mode) indicative of an intended present mode of operation, e.g. based on one or more detectors or external inputs (e.g. a request from a telephone) or based on an input from a user interface. In the telephone mode of operation, audio from a remote communication partner may be received by the first and/or second hearing aid (HA1, HA2) via the user's telephone (Phone), cf. ‘Remote audio’ from ‘Phone’ to the receiver (ARx) of the second hearing aid (HA2) (solid bold zig-zag arrow) and optionally to the receiver (ARx) of the first hearing aid (HA1) (dashed bold zig-zag arrow).

[0187] The first and second hearing aids (HA1, HA2) of the binaural hearing aid system are configured to operate in at least two modes, e.g. a communication mode (e.g. a telephone mode), a non-communication mode (e.g. a normal mode), and/or a voice control mode, e.g. controlled by a mode control signal (Mode). The mode control signal may be provided via a user interface (e.g. a remote control, e.g. implemented via an APP of a smartphone or similar device). The mode control signal may be provided automatically, as a result of one or more detectors or sensors or other control signals. The first and second hearing aids may be configured to receive the mode control signal from a telephone, e.g. indicative of an incoming call. The mode control signal, e.g. an incoming call indicator, may bring the first and second hearing aids into a communication mode, where the selector-mixer is controlled to select an input signal S.sub.REM from a remote speaker (or to mix such signal with the environment signal (Ŝ.sub.ENV1(k,n), Ŝ.sub.ENV2(k,n)) of the first and second hearing aids, respectively).

[0188] The first and second hearing aids (HA1, HA2) may each further comprise a keyword detector of a voice control interface (KWD-VCT) to allow a user to influence functionality of the hearing aid (cf. signal CHA) by a limited number of specific spoken commands. The keyword detector may receive the estimate of the user's own voice (Ŝ.sub.ov1(k,n), Ŝ.sub.ov2(k,n)). The voice control interface may be enabled in a specific voice control mode of operation. The keyword detector/voice control interface (KWD-VCT) provides a control signal (CHA) to the processor (HAG), e.g. to change a hearing aid program, e.g. to change mode of operation (e.g. to enter a telephone mode), to change a volume, etc. Keyword detection in a hearing aid is e.g. discussed in EP3726856A1.

[0189] FIGS. 6 and 7 illustrate respective embodiments of adaptive beamformer configurations that may be used to implement an own voice beamformer (OVBF) for use in a sound capture device according to the present disclosure as e.g. illustrated in FIG. 5. FIGS. 6 and 7 both show a two-microphone configuration, which is frequently used in state of the art hearing devices, e.g. hearing aids (or other sound capture devices). The beamformers may, however, be based on more than two microphones, e.g. on three or more (e.g. as a linear array or possibly arranged in a non-linear configuration). An adaptive beampattern (Ŝ.sub.ov(k)), for a given frequency band k, is obtained by linearly combining two beamformers C.sub.1(k) and C.sub.2(k) (time indices have been skipped for simplicity), each representing a different (possibly fixed) linear combination of first and second electric input signals X.sub.1 and X.sub.2, from first and second microphones M.sub.1 and M.sub.2, respectively. The first and second electric input signals X.sub.1 and X.sub.2 are provided by respective analysis filter banks (‘Filterbank’). The frequency domain signals (downstream of the respective analysis filter banks (‘Filterbank’)) are indicated with bold arrows, whereas the time domain nature of the outputs of the first and second microphones (M.sub.1, M.sub.2) is indicated by thin line arrows. The block ‘F-BF’ (indicated by a dashed rectangular enclosure) in FIGS. 6 and 7 refers to so-called fixed beamformers defined by complex sets of constants w.sub.1=(w.sub.11, w.sub.12) and w.sub.2=(w.sub.21, w.sub.22) providing beamformed signals C.sub.1(k) and C.sub.2(k), respectively.

[0190] FIG. 6 shows an adaptive beamformer configuration, wherein the adaptive beamformer in the k′th frequency sub-band Ŝ.sub.ov(k) is created by subtracting a (e.g. fixed) target cancelling beamformer C.sub.2(k) scaled by the adaptation factor β(k) from an (e.g. fixed) omni-directional beamformer C.sub.1(k). The adaptation factor β may e.g. be determined as

[00001] $$\beta = \frac{\langle C_2^{*}\, C_1 \rangle}{\langle |C_2|^{2} \rangle}$$

[0191] The two beamformers C.sub.1 and C.sub.2 of FIG. 6 are e.g. orthogonal. Orthogonality is, however, not a requirement; the beamformers of FIG. 7, for example, are not orthogonal. When the beamformers C.sub.1 and C.sub.2 are orthogonal, uncorrelated noise will be attenuated when β=0.

[0192] Whereas the (reference) beampattern C.sub.1(k) in FIG. 6 is an omni-directional beampattern, the (reference) beampattern C.sub.1(k) in FIG. 7 is a beamformer with a null towards the opposite direction of that of C.sub.2(k). Other sets of fixed beampatterns C.sub.1(k) and C.sub.2(k) may as well be used.

[0193] FIG. 7 shows an adaptive beamformer configuration similar to the one shown in FIG. 6, where the adaptive beampattern Ŝ.sub.ov(k) is created by subtracting a target cancelling beamformer C.sub.2(k) scaled by the adaptation factor β(k) from another fixed beampattern C.sub.1(k). This set of beamformers is not orthogonal. In case C.sub.2 in FIGS. 6 and 7 represents an own voice-cancelling beamformer, β will increase when own voice is present.

[0194] The beampatterns could e.g. be the combination of an omni-directional delay-and-sum-beamformer C.sub.1(k) and a delay-and-subtract-beamformer C.sub.2(k) with its null direction pointing towards the target direction (e.g. the mouth of the person wearing the device, i.e. a target-cancelling beamformer) as shown in FIG. 6, or it could be two delay-and-subtract-beamformers as shown in FIG. 7, where one, C.sub.1(k), has maximum gain towards the target direction, and the other beamformer, C.sub.2(k), is a target-cancelling beamformer. Other combinations of beamformers may as well be applied. Preferably, the beamformers should be orthogonal, i.e. w.sub.1w.sub.2.sup.H=[w.sub.11 w.sub.12][w.sub.21 w.sub.22].sup.H=0. The adaptive beampattern arises by scaling the target cancelling beamformer C.sub.2(k) by a complex-valued, frequency-dependent, e.g. adaptively updated scaling factor β(k) and subtracting it from C.sub.1(k), i.e.

[00002] $$Y(k) = C_1(k) - \beta(k)\, C_2(k) = \mathbf{w}_1^{H}(k)\, \mathbf{x}(k) - \beta(k)\, \mathbf{w}_2^{H}(k)\, \mathbf{x}(k)$$

[0195] where w.sub.1.sup.H=[w.sub.11, w.sub.12] and w.sub.2.sup.H=[w.sub.21, w.sub.22] are complex beamformer weights according to FIG. 6 or FIG. 7 and x=[x.sub.1, x.sub.2].sup.T represents the input signals at the two microphones (after filter bank processing).

[0196] In the context of FIGS. 6 and 7, the fixed reference beamformer is C.sub.1=w.sub.1.sup.H(k)x(k), and the fixed target-cancelling beamformer is C.sub.2=w.sub.2.sup.H(k)x(k), where w.sub.1.sup.H=[w.sub.11, w.sub.12], and w.sub.2.sup.H=[w.sub.21, w.sub.22] are complex beamformer weights, e.g. predetermined and stored in a memory (or occasionally updated during use), and x=[x.sub.1, x.sub.2].sup.T represents the (current) electric input signals at the two microphones (after filter bank processing).
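The adaptive combination Y(k)=C.sub.1(k)−β(k)C.sub.2(k) for a single frequency band may be sketched as follows. The sketch reads the adaptation factor as the ratio of the averaged product C.sub.2*·C.sub.1 to the averaged |C.sub.2|.sup.2, and it assumes that the averaging is a plain mean over a block of frames and that a small constant `eps` is added to the denominator to avoid division by zero. The weights and the toy signals (a target arriving identically at both microphones, a disturbance present at the first microphone only) are illustrative assumptions, not values from the disclosure.

```python
def fixed_beamformer(w_h, x):
    """Apply the row vector w^H = [w_1, w_2] to one frame x = [x_1, x_2]."""
    return sum(w * xi for w, xi in zip(w_h, x))


def adaptation_factor(c1, c2, eps=1e-12):
    """beta = <C2* C1> / <|C2|^2>, averaged over the given frames.

    eps is an assumed regularisation term, not part of the description.
    """
    num = sum(b.conjugate() * a for a, b in zip(c1, c2)) / len(c1)
    den = sum(abs(b) ** 2 for b in c2) / len(c2)
    return num / (den + eps)


def adaptive_output(c1, c2, beta):
    """Y = C1 - beta * C2 for every frame."""
    return [a - beta * b for a, b in zip(c1, c2)]


# Assumed orthogonal weights: omni-directional sum and target-cancelling
# difference (the target is assumed identical at both microphones).
W1_H = [0.5, 0.5]
W2_H = [0.5, -0.5]

target = [1 + 0j] * 4
noise = [2 + 0j, -2 + 0j, 2 + 0j, -2 + 0j]  # uncorrelated with the target
frames = [[t + n, t] for t, n in zip(target, noise)]  # noise at mic 1 only

c1 = [fixed_beamformer(W1_H, x) for x in frames]
c2 = [fixed_beamformer(W2_H, x) for x in frames]
beta = adaptation_factor(c1, c2)
y = adaptive_output(c1, c2, beta)
```

In this toy case the target-cancelling beamformer contains only the disturbance, so subtracting the scaled C.sub.2 from C.sub.1 removes the disturbance and leaves the target in `y`.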

[0197] An Example of Controlling Processing of Beamformers:

[0198] A method for selecting beamforming with a limited amount of processing is described in the following.

[0199] Consider the hearing device (HD) of FIG. 8, e.g. a hearing aid or a headset. FIG. 8 shows a hearing device in a telephone configuration. The hearing device user is listening to a mixture (OUT) of the surroundings (PS) and the far-end talker (S.sub.REM). The far-end talker is preferably listening to the hearing device user's own voice (Ŝ.sub.ov), wherein the background noise has been attenuated.

[0200] In other words, the hearing device (HD) is preferably processing two different sound streams: [0201] one sound stream to be presented to the hearing device user consisting of a mixture (OUT) of the far-end talker (S.sub.REM) and the possibly noise-reduced surroundings (signal PS); [0202] the other sound stream presented to the far-end talker mainly consisting of the hearing device wearer's own voice (Ŝ.sub.ov), preferably with background noise reduced, e.g. using a beamformer (OVBF).

[0203] The embodiment of a hearing device in FIG. 8 is equivalent to the embodiment shown in FIG. 1, which may represent a hearing aid (in a specific communication mode of operation) or a headset (in a normal mode of operation). The processors ‘DSP’ in FIG. 1 are denoted ‘BF-NR’ and ‘OVBF’, respectively, in FIG. 8. BF-NR represents an environment beamformer-noise reduction system (e.g. a beamformer filter followed by a post filter). OVBF represents an own voice beamformer-noise reduction system (e.g. a beamformer filter followed by a post filter).

[0204] Enhancing a sound by removing noise requires processing power. An adaptive beamformer may e.g. require more processing power than a fixed beamformer. We thus have a trade-off between performance and processing power.

[0205] In a typical hearing device, the speech enhancement system may consist of a directional microphone unit followed by a noise reduction system. The directional system may consist of an adaptive beamformer, which adaptively attenuates the noise while keeping the target sound unaltered. An example of such a beamformer is an MVDR beamformer. Compared to a fixed beamformer, an adaptive beamformer is able to adapt to the noise (and sometimes even to the direction of the target signal). For the special case where the hearing device microphones are used as input signals for a telephone conversation, the hearing device may be able to process the microphone signals into two output signals, each having a different purpose. One output contains the sound to be presented to the person who is wearing the hearing device(s) (local signal); the other output contains the sound which should be presented to the far-end listener (far-end signal). In most situations (e.g. in a hearing aid), the local signal is the most important sound, and the main processing power should be applied to the local signal in order to obtain the best possible balance between the sound of interest and the background noise. However, in a telephone situation, the situation is different. If the voice of the hearing device wearer is not intelligible, a telephone conversation is not possible. Thus, the most important signals are a) the far-end signal to be presented to the hearing instrument user and b) the voice of the hearing instrument wearer which is presented to the far-end listener. During a telephone conversation, the local signal is of less importance. It is typically presented to the hearing device wearer with a reduced level, where it does not reduce the intelligibility of the far-end talker—just to make the person wearing the hearing device aware of the surroundings.

[0206] In order to make the most of the available processing power, it is therefore proposed to prioritize the processing power (e.g. adaptive noise reduction, post processing, processing based on neural networks, etc.) depending on the mode of operation of the hearing device. If the hearing device (e.g. a hearing aid) is in a normal mode, the majority of the processing is aimed at enhancement of the surroundings while less (or no) processing power is applied in order to pick up the own voice signal (e.g. to be used as pre-processing for keyword detection). Conversely, if the hearing device is in telephone mode, it is proposed to change the processing such that the majority of the processing power available for noise reduction is applied to the own voice signal and less processing is applied to the local sound presented to the hearing instrument wearer. The proposed processing scheme is illustrated in FIG. 9A and FIG. 9B.

[0207] FIG. 9A and FIG. 9B illustrate a scheme for managing processing in a hearing device depending on its mode of operation.

[0208] FIG. 9A and FIG. 9B embody the following general concept in the form of a hearing aid configured to be worn by a user at or in an ear of the user. The hearing aid comprises at least two input transducers configured to pick up a sound at said at least two input transducers and to convert the sound to respective at least two electric input signals representative of said sound. The hearing aid further comprises first and second filters for filtering the at least two electric input signals and providing respective first and second filtered signals. The hearing aid further comprises an output transducer for converting the first filtered signal, or a signal derived therefrom, to stimuli perceivable by the user as sound. The second filter is configured to provide that the second filtered signal comprises a current estimate of the user's own voice. The hearing aid further comprises a transceiver for establishing an audio link to an external communication device (e.g. a telephone). The hearing aid may further comprise a controller configured to allow the hearing aid to operate in at least two modes, a communication mode wherein the audio link to the external communication device is established, and at least one non-communication mode. The first and second filters may be configured to operate in a more power consuming and a less power consuming mode in dependence of the controller. The controller may be configured to a) set said first filter in the less power consuming mode, and b) set said second filter in said more power consuming mode, when the hearing aid is in said communication mode. The controller may additionally or alternatively be configured to c) set the first filter in the more power consuming mode, and d) set the second filter in the less power consuming mode, when the hearing aid is in the non-communication mode.
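The controller behaviour of items a) to d) may be sketched as a simple mapping from the mode of operation to the power mode of each filter. The names `Mode`, `FilterPower` and `allocate_processing` are hypothetical and serve only to illustrate the concept; the comments pairing the power modes with adaptive and fixed beamformers reflect the example of FIG. 9A and FIG. 9B rather than a requirement.

```python
from enum import Enum


class Mode(Enum):
    COMMUNICATION = "communication"          # e.g. telephone mode
    NON_COMMUNICATION = "non-communication"  # e.g. normal mode


class FilterPower(Enum):
    LOW = "less power consuming"   # e.g. a fixed beamformer
    HIGH = "more power consuming"  # e.g. an adaptive beamformer


def allocate_processing(mode):
    """Return (first_filter_power, second_filter_power) for the given mode.

    The first filter feeds the output transducer (environment sound);
    the second filter provides the own voice estimate.
    """
    if mode is Mode.COMMUNICATION:
        # a) first filter low power, b) second filter high power
        return FilterPower.LOW, FilterPower.HIGH
    # c) first filter high power, d) second filter low power
    return FilterPower.HIGH, FilterPower.LOW


telephone_setting = allocate_processing(Mode.COMMUNICATION)
normal_setting = allocate_processing(Mode.NON_COMMUNICATION)
```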

[0209] FIG. 9A illustrates a normal mode of operation (e.g. of a hearing aid), where the adaptive beamformer is applied to the local processing (cf. block BF-NR implementing an adaptive beamformer providing a noise reduced version (PS) of a target signal in the environment of the hearing device, cf. e.g. FIG. 6, 7, where the target signal is a signal in the environment, e.g. in the look direction of the user (microphone direction of the hearing device)). The own voice beamformer (block Fixed OVBF-NR), on the other hand, relies on a fixed own voice enhancing beamformer (cf. signal Ŝ′.sub.ov), because the estimate of the user's own voice is used for secondary processing such as pre-processing for own voice keyword detection (cf. unit KWD). In the case of a fixed (e.g. own voice) beamformer, the weights are estimated based on a fixed noise distribution, e.g. in order to maximize the directivity or maximize the ratio between OV impinging within a certain range of near-field angles and omni-directional far-field noise.

[0210] FIG. 9B illustrates a telephone mode of operation (e.g. of a hearing aid or a normal mode of a headset), where the adaptive beamformer is applied in order to enhance the user's own voice (cf. block OVBF-NR providing an enhanced own voice estimate, cf. signal Ŝ.sub.ov). A fixed beamformer (or alternatively just the signal from a single microphone, cf. block Fixed BF-NR), on the other hand, is used to process the local signal (cf. resulting processed signal PS′), which is presented to the user together with the signal S.sub.REM from the remote end, as the main signal of interest to the hearing instrument user is the far-end talker.

[0211] In other modes too, where the main signal of interest is not received by the hearing instrument microphones, change of processing focus may be applied. Such situations could e.g. be TV streaming, Bluetooth streaming, FM or telecoil streaming, see e.g. EP3637800A1.

[0212] FIG. 10A shows a binaural hearing aid system comprising first and second hearing aids, where the binaural audio signals are combined.

[0213] In FIG. 10A, a binaural hearing aid system worn by a hearing aid user (U) is shown. The binaural hearing aid system may comprise a first and a second hearing aid each comprising a first hearing aid microphone (M1). The first and second hearing aids of the binaural hearing aid system may each comprise a level estimator (LVL), and one or both of the first and second hearing aids may comprise a comparison unit (COMP). The level estimator may either measure the level of the mixture or the level of a noise estimate (e.g. the level of a target cancelling beamformer), but as the level of the target is assumed to be similar on the two ears, the level measured directly on the mixture may be preferred.

[0214] A level may typically be measured in dB (or in the logarithmic domain). Alternatively, the level may be calculated directly from the magnitude or magnitude-squared signal (the actual level does not matter, the levels just need to be compared in order to find the minimum). The level may be based on a single sample (e.g. every millisecond) or be measured as an average across several samples, e.g. by filtering across the time axis by a 1st order IIR low pass filter with a time constant. In an embodiment, the time constant is 0 milliseconds; in another embodiment, the time constant is less than 5 milliseconds.
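A level estimator (LVL) of the kind described may be sketched as a 1st order IIR low pass filter applied to the magnitude-squared signal along the time axis. The mapping from time constant to filter coefficient, a=exp(−1/(τ·f)), is a common convention and is an assumption here, as are the function names, the frame rate and the small floor used before taking the logarithm. A time constant of 0 reduces the estimator to a single-sample level.

```python
import math


def smoothing_coefficient(time_constant_s, frame_rate_hz):
    """Coefficient a of y[n] = a*y[n-1] + (1-a)*p[n] (assumed convention)."""
    if time_constant_s <= 0.0:
        return 0.0  # no smoothing: level based on a single sample
    return math.exp(-1.0 / (time_constant_s * frame_rate_hz))


def level_db(samples, time_constant_s, frame_rate_hz, floor=1e-12):
    """Smoothed level in dB for each sample of one frequency band."""
    a = smoothing_coefficient(time_constant_s, frame_rate_hz)
    state = 0.0
    levels = []
    for x in samples:
        power = abs(x) ** 2                      # magnitude-squared signal
        state = a * state + (1.0 - a) * power    # 1st order IIR low pass
        levels.append(10.0 * math.log10(max(state, floor)))
    return levels


# One sample per millisecond (assumed frame rate of 1000 Hz).
instant = level_db([1.0, 10.0], time_constant_s=0.0, frame_rate_hz=1000.0)
smoothed = level_db([1.0] * 50, time_constant_s=0.005, frame_rate_hz=1000.0)
```

Since the levels are only compared between the two ears, the same computation could equally be carried out on the linear power values without the conversion to dB.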

[0215] The exemplified drawing shows a case with a single microphone on each ear, but more local microphones may be used (e.g. with two or more microphones in the first and/or second hearing aids as illustrated in the FIGS. above). The selection of local microphones may be done in a similar way as exemplified in the drawing, where the microphone (or linear combination of microphones) with the lowest level is selected.

[0216] In FIG. 10A it is assumed that the audio signal (e.g. the electric input signal) may be transmitted to the hearing aid which is selected to transmit the own voice enhanced signal to an external device, such as a mobile phone. The selection criterion for selecting which hearing aid should transmit the audio signal to the external device could e.g. be based on link quality between each of the hearing aids and the external device (which may be different (and independent) from the binaural link quality).

[0217] Based on noise level measurements/estimates (in time-frequency segments) of the audio signals of the two hearing aids, time-frequency segments may be selected such that the audio signal with the smallest noise level is chosen. Thereby, a binary gain pattern (BGP) relating to each of the first and second hearing aid may be created. It may be assumed that the level of the own voice signal will be similar in the first and the second hearing aids due to the similar and symmetric distance of the mouth compared to the microphones (M1).

[0218] A combination unit (‘Combination unit’) of the binaural hearing aid system may provide a combined audio signal based on the time-frequency segments in the binary gain pattern (BGP), where the audio signal with the smallest noise level in each time-frequency segment is selected. The resulting signal may be synthesized back to a time-domain signal and be transmitted to the external device.
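The selection performed by the comparison unit (COMP) and the combination unit of FIG. 10A may be sketched as follows. The function names, the nested-list representation of time-frequency segments (one list of bands per frame) and the tie-break in favour of the first hearing aid are assumptions made for illustration.

```python
def binary_gain_patterns(noise_first, noise_second):
    """Return (bgp_first, bgp_second), one 0/1 entry per time-frequency segment.

    An entry is 1 where that hearing aid has the smallest noise level.
    Ties go to the first hearing aid (an assumed convention).
    """
    bgp_first = [
        [1 if a <= b else 0 for a, b in zip(frame_a, frame_b)]
        for frame_a, frame_b in zip(noise_first, noise_second)
    ]
    bgp_second = [[1 - g for g in frame] for frame in bgp_first]
    return bgp_first, bgp_second


def combine(audio_first, audio_second, bgp_first):
    """Pick, per segment, the audio signal with the smallest noise level."""
    return [
        [a if g else b for a, b, g in zip(frame_a, frame_b, frame_g)]
        for frame_a, frame_b, frame_g in zip(audio_first, audio_second, bgp_first)
    ]


# Two frames with two frequency bands each (toy values).
noise_1 = [[1.0, 5.0], [3.0, 3.0]]
noise_2 = [[2.0, 4.0], [4.0, 3.0]]
audio_1 = [[10, 11], [12, 13]]
audio_2 = [[20, 21], [22, 23]]

bgp_1, bgp_2 = binary_gain_patterns(noise_1, noise_2)
combined = combine(audio_1, audio_2, bgp_1)
```

The combined time-frequency representation would subsequently be passed through a synthesis filter bank before transmission to the external device; that step is omitted here.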

[0219] Thereby, the binaural hearing aid system may combine binaural audio signals in order to reduce e.g. wind noise.

[0220] FIG. 10B shows a further binaural hearing aid system comprising first and second hearing aids, where the binaural audio signals are combined. For similar features as in FIG. 10A, similar reference numbers are used.

[0221] In FIG. 10B, only the noise levels (as estimated by the level estimator (LVL)) are exchanged between the first and second hearing aids. Only exchanging noise levels binaurally may require less binaural transmission bandwidth compared to transmitting a full audio signal between the two hearing devices.

[0222] The noise levels may be compared (by comparison units (COMP) in each of the first and second hearing aids) in order to select/create two binary gain patterns (BGP), which may be configured to attenuate the time-frequency units which have the highest local noise level after comparison.

[0223] The binary gain patterns (BGP) of the respective first and second hearing aids may be applied to the local audio signals, so that the respective audio signals may be attenuated or kept/enhanced depending on the binary gain patterns (BGP). The synthesized audio signals from both hearing aids may then be transmitted to an external device (such as a mobile phone), where the audio signals from the first and second hearing aids may be combined, e.g. by a simple addition.
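The scheme of FIG. 10B, where only noise levels are exchanged and the external device adds the two received signals, may be sketched for a single frame as follows. Several simplifications are assumed: the attenuation is taken to be complete (gain 0), the addition is shown on the time-frequency samples rather than on the synthesized time-domain signals (which gives the same result only insofar as the synthesis is linear), and ties are resolved in favour of one hearing aid. All function names are hypothetical.

```python
def local_gain_pattern(local_noise, remote_noise, wins_ties):
    """Gain per frequency band: 1.0 where the local noise level is lowest.

    wins_ties must be True in exactly one of the two hearing aids so that
    every band is kept by exactly one device (an assumed convention).
    """
    return [
        1.0 if (l < r or (l == r and wins_ties)) else 0.0
        for l, r in zip(local_noise, remote_noise)
    ]


def apply_gains(frame, gains):
    """Attenuate or keep each band of the local audio signal."""
    return [g * s for g, s in zip(gains, frame)]


def external_sum(frame_a, frame_b):
    """Combination in the external device by simple addition."""
    return [a + b for a, b in zip(frame_a, frame_b)]


noise_1 = [1.0, 5.0, 3.0]   # levels measured in the first hearing aid
noise_2 = [2.0, 4.0, 3.0]   # levels measured in the second hearing aid
audio_1 = [10.0, 11.0, 12.0]
audio_2 = [20.0, 21.0, 22.0]

gains_1 = local_gain_pattern(noise_1, noise_2, wins_ties=True)
gains_2 = local_gain_pattern(noise_2, noise_1, wins_ties=False)
received = external_sum(apply_gains(audio_1, gains_1),
                        apply_gains(audio_2, gains_2))
```

Because the two gain patterns are complementary, the sum formed in the external device contains, in each band, the signal from the hearing aid with the lowest noise level.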

[0224] Alternatively, the local microphone signals may be transmitted directly to the external device in which similar processing steps may take place. However, an external device may not be capable of doing the proposed processing steps, and it may thus be an advantage to apply the majority of the processing in the hearing aids before the audio signals are transmitted to the external device.

[0225] It is intended that the structural features of the devices described above, either in the detailed description and/or in the claims, may be combined with steps of the method, when appropriately substituted by a corresponding process.

[0226] As used, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well (i.e. to have the meaning “at least one”), unless expressly stated otherwise. It will be further understood that the terms “includes,” “comprises,” “including,” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will also be understood that when an element is referred to as being “connected” or “coupled” to another element, it can be directly connected or coupled to the other element but an intervening element may also be present, unless expressly stated otherwise. Furthermore, “connected” or “coupled” as used herein may include wirelessly connected or coupled. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items. The steps of any disclosed method are not limited to the exact order stated herein, unless expressly stated otherwise.

[0227] It should be appreciated that reference throughout this specification to “one embodiment” or “an embodiment” or “an aspect” or features included as “may” means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the disclosure. Furthermore, the particular features, structures or characteristics may be combined as suitable in one or more embodiments of the disclosure. The previous description is provided to enable any person skilled in the art to practice the various aspects described herein. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects.

[0228] The claims are not intended to be limited to the aspects shown herein but are to be accorded the full scope consistent with the language of the claims, wherein reference to an element in the singular is not intended to mean “one and only one” unless specifically so stated, but rather “one or more.” Unless specifically stated otherwise, the term “some” refers to one or more.

REFERENCES

[0229] US20150163602A1 (Oticon) 11 Jun. 2015
[0230] EP3637800A1 (Oticon) 15 Apr. 2020
[0231] EP3726856A1 (Oticon) 21 Oct. 2020