Hearing device or system adapted for navigation

Abstract

A hearing system comprises a hearing device, e.g. a hearing aid, and is adapted for being worn by a user, the hearing system comprising a) an audio input unit configured to receive a multitude of audio signals comprising sound from a number of localized sound sources in an environment around the user, and b) a sensor unit configured to receive and/or provide sensor signals from one or more sensors, said one or more sensors being located in said environment and/or form part of said hearing system, and c) a first processor configured to generate and update over time data representative of a map of said environment of the user, said data being termed map data, said environment comprising a number of, stationary or mobile, landmarks, said landmarks comprising said number of localized sound sources, and said map data being representative of the physical location of said landmarks in the environment relative to the user, wherein the hearing system is configured to, preferably continuously, generate and update over time said map data based on said audio signals and said sensor signals. The invention further comprises a method.

Claims

1. A hearing system comprising a hearing device, the hearing device being adapted for being worn by a user, the hearing system comprising an audio input unit configured to receive a multitude of audio signals comprising sound from a number of localized sound sources in an environment around the user, and a sensor unit configured to receive and/or provide sensor signals from one or more sensors, said one or more sensors being located in said environment and/or form part of said hearing system, and a first processor configured to generate and update over time data representative of a map of said environment of the user, said data being termed map data, said environment comprising a number of, stationary or mobile, landmarks, said landmarks comprising said number of localized sound sources, and said map data being representative of the physical location of said landmarks in the environment relative to the user, wherein the hearing system is configured to, generate and update over time said map data based on said audio signals and said sensor signals, and wherein the hearing device comprises a beamformer filtering unit coupled to the audio input unit and configured to provide one or more beamformers based on said multitude of audio signals, and wherein the first processor is configured to control and update over time the beamformer filtering unit in dependence of the map data, and to direct the one or more beamformers towards one or more of the localized sound sources in the environment of the user, a signal processor for enhancing said multitude of audio signals and providing a processed electric output signal comprising one or more of the localized sound sources based on the one or more beamformers, and an output unit for providing stimuli perceivable as sound to the user based on said processed electric output signal representing or comprising sound from said one or more of said localized sound sources.

2. A hearing system according to claim 1 wherein said beamformer filtering unit comprises a linearly constrained minimum variance (LCMV) beamformer.

3. A hearing system according to claim 1 wherein said first processor is configured to allow a user to select at least one of said localized sound sources as a sound source of current interest.

4. A hearing system according to claim 1 configured to automatically direct said beamformers towards said localized sound sources based on said map data.

5. A hearing system according to claim 1 comprising a memory and being configured to store said map data in said memory, allowing to track the movement of said user and/or said localized sound sources in said environment over time.

6. A hearing system according to claim 1 wherein the audio input unit comprises a microphone array comprising a multitude of microphones for picking up sound from said environment and providing respective microphone signals comprising sound from said number of localized sound sources and providing at least some of said multitude of audio signals.

7. A hearing system according to claim 6 comprising a head worn frame or structure whereon at least some of said multitude of microphones are located.

8. A hearing system according to claim 1 comprising one or two hearing devices configured to be located at or in respective left and right ears of a user, and a head worn frame or structure, wherein or whereon one or more of said sensors are mounted.

9. A hearing system according to claim 8 wherein communication between the one or two hearing devices and the one or more sensors, is wired or wireless.

10. A hearing system according to claim 1 wherein said sensor unit comprises one or more of an accelerometer, a gyroscope, and a magnetometer.

11. A hearing system according to claim 1 wherein said sensor unit comprises one or more of said sensors, and wherein at least one of said one or more sensors comprises an electrode for picking up body signals.

12. A hearing system according to claim 1 wherein said sensor unit comprises one or more vision sensors.

13. A hearing system according to claim 1 wherein said first processor comprises a simultaneous localization and mapping (SLAM) algorithm.

14. A hearing system according to claim 1 wherein said first processor comprises a face recognition algorithm for identifying one or more faces in the environment of the user.

15. A hearing system according to claim 1 wherein the hearing device comprises a hearing aid, a headset, an earphone, an ear protection device or a combination thereof.

16. A method of operating a hearing system comprising a hearing device, the hearing device being adapted for being worn by a user, the method comprising receiving and/or providing audio signals comprising sound from a number of localized sound sources in an environment around the user, receiving sensor signals from one or more sensors, said one or more sensors being located in said environment and/or form part of said hearing system, generating and updating over time data representative of a map of said environment of the user, said data being termed map data, said environment comprising a number of, stationary or mobile, landmarks, said landmarks comprising said number of localized sound sources, and said map data being representative of the physical location of said landmarks in the environment relative to the user; generating and updating over time said map data based on said audio signals and said sensor signals; providing one or more beamformers based on said audio signals, controlling and updating over time the one or more beamformers in dependence of the map data, and to direct the one or more beamformers towards one or more of the localized sound sources in the environment of the user, enhancing said multitude of audio signals and providing a processed electric output signal comprising one or more of the localized sound sources based on the one or more beamformers, and providing stimuli perceivable as sound to the user based on said processed electric output signal representing or comprising sound from said one or more of said localized sound sources by an output unit of said hearing device.

17. A hearing system according to claim 1 wherein said localized sound sources are constituted by or comprises a speech source originating from a person talking or from a localized sound transducer.

18. A hearing system according to claim 1 wherein number of localized sound sources is larger than or equal to two.

19. A hearing system according to claim 1 wherein localized sound sources are stationary or mobile.

Description

BRIEF DESCRIPTION OF DRAWINGS

(1) The patent or application file contains at least one color drawing. Copies of this patent or patent application publication with color drawing will be provided by the USPTO upon request and payment of the necessary fee.

(2) The aspects of the disclosure may be best understood from the following detailed description taken in conjunction with the accompanying figures. The figures are schematic and simplified for clarity, and they just show details to improve the understanding of the claims, while other details are left out. Throughout, the same reference numerals are used for identical or corresponding parts. The individual features of each aspect may each be combined with any or all features of the other aspects. These and other aspects, features and/or technical effect will be apparent from and elucidated with reference to the illustrations described hereinafter in which:

(3) FIGS. 1A and 1B show respective top and front views of an embodiment of a hearing system according to the present disclosure comprising a sensor integration device configured to be worn on the head of a user, e.g. comprising a head worn carrier, such as a spectacle frame,

(4) FIG. 2 illustrates a simultaneous localization and mapping technique where several time-stamps of the localization of the user (triangles) are indicated during movement in space,

(5) FIG. 3 shows map data created by a magnetic field sensor reflecting variations in the magnetic field in an environment comprising two localized sound sources, and a listener (user),

(6) FIG. 4 shows a scenario comprising a multitude of landmarks in the form of a number of (simultaneous or intermittent) talkers, a number of silent persons and additional landmarks, and a user wearing a hearing system according to the present disclosure,

(7) FIG. 5A shows an augmenting device comprising a head worn frame supporting a microphone array comprising a multitude of microphones (e.g. (MEMS) microphones),

(8) FIG. 5B shows a beamformer with 0 degrees horizontal angle relative to the frontal direction,

(9) FIG. 5C shows a beamformer with 30 degrees horizontal angle relative to the frontal direction, and

(10) FIG. 5D shows the 30 degrees beamformer of FIG. 5C in severe reverberant conditions,

(11) FIG. 6 shows a use situation of EarEOG control of a beamformer of a hearing device according to the present disclosure,

(12) FIG. 7 shows a use situation of EarEEG control of a beamformer of a hearing device according to the present disclosure, and

(13) FIG. 8A shows a pair of behind the ear (BTE) hearing devices comprising electrodes on a mould surface of an ITE part,

(14) FIG. 8B shows a pair of hearing devices comprising electrodes in an ear canal as well as outside the ear canal of the user, and a wireless interface for communicating with a separate processing part, and

(15) FIG. 8C shows a hearing system comprising a bimodal hearing system comprising an air conduction hearing device as well as a cochlear implant hearing device,

(16) FIG. 9A shows a first embodiment of a hearing device according to the present disclosure, and

(17) FIG. 9B shows a second embodiment of a hearing device according to the present disclosure,

(18) FIG. 10 illustrates a user interface for a hearing system according to the present disclosure.

(19) The figures are schematic and simplified for clarity, and they just show details which are essential to the understanding of the disclosure, while other details are left out. Throughout, the same reference signs are used for identical or corresponding parts.

(20) Further scope of applicability of the present disclosure will become apparent from the detailed description given hereinafter. However, it should be understood that the detailed description and specific examples, while indicating preferred embodiments of the disclosure, are given by way of illustration only. Other embodiments may become apparent to those skilled in the art from the following detailed description.

DETAILED DESCRIPTION OF EMBODIMENTS

(21) The detailed description set forth below in connection with the appended drawings is intended as a description of various configurations. The detailed description includes specific details for the purpose of providing a thorough understanding of various concepts. However, it will be apparent to those skilled in the art that these concepts may be practised without these specific details. Several aspects of the apparatus and methods are described by various blocks, functional units, modules, components, circuits, steps, processes, algorithms, etc. (collectively referred to as elements). Depending upon particular application, design constraints or other reasons, these elements may be implemented using electronic hardware, computer program, or any combination thereof.

(22) The electronic hardware may include microprocessors, microcontrollers, digital signal processors (DSPs), field programmable gate arrays (FPGAs), programmable logic devices (PLDs), gated logic, discrete hardware circuits, and other suitable hardware configured to perform the various functionality described throughout this disclosure. Computer program shall be construed broadly to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executables, threads of execution, procedures, functions, etc., whether referred to as software, firmware, middleware, microcode, hardware description language, or otherwise.

(23) The present application relates to the field of hearing systems, e.g. comprising one or more hearing devices, e.g. hearing aids, in particular to a hearing device or devices adapted to provide an improved sensation of sound sources in a confined environment, e.g. indoor, of a user wearing the hearing device.

(24) The present application deals with a hearing system comprising a sensor integration device (e.g. forming part of, such as integrated in, or in communication with a hearing device) that enhances the sounds in the user's focus.

(25) In an embodiment, the hearing system, e.g. a hearing device, comprises an indoor navigation device for user localization, multiple target mapping and tracking of sound sources around the user. In an embodiment, the hearing system is configured to allow the use of eye-gaze tracking of the user to enhance/augment a sound source of the user's (current) focus.

(26) Listeners/hearing aid users may have problems segregating, following and focusing attention on a given dynamic auditory object in a scene of multiple acoustic objects. This is especially true for hard-of-hearing listeners. In spite of the ability of modern hearing aids to create some separation of the sources, e.g. using multiple microphone beamforming (and further noise reduction), there is a need for augmenting the sounds that are in the user's focus.

(27) In other words, the user needs help to track multiple acoustic objects, to segregate these, and to indicate which one is of interest for the moment, and for auditory display for a natural perception of the sound scene.

(28) The solution consists of four elements:

(29) A. User localization and multiple target mapping and tracking (SLAM)

(30) B. Determining the (sound) source in user's focus of attention for the moment

(31) C. Augmenting/enhancing the source in the user's focus

(32) D. Calculating the auditory display (sound to be presented to the left and right ears) of the enhanced sound scene and transmitting this to a pair of hearing aids.

(33) FIGS. 1A and 1B illustrate respective top and front views of an embodiment of a hearing system according to the present disclosure comprising a sensor integration device configured to be worn on the head of a user, e.g. comprising a head worn carrier, such as a spectacle frame.

(34) FIG. 1A and FIG. 1B show respective (schematic) top and front views of a spectacle frame for carrying an embodiment of a hearing system according to the present disclosure. The hearing system comprises left and right hearing devices and a number of sensors mounted on the spectacle frame. The hearing system (HS) comprises a number of sensors S.sub.1i, S.sub.2i, (i=1, . . . , N.sub.S) associated with (e.g. forming part of or connected to) left and right hearing devices (HD.sub.1, HD.sub.2), respectively. N.sub.S is the number of sensors located on each side of the frame (in the example of FIG. 1A, 1B assumed to be symmetric, which need not necessarily be so, though). The first, second, third, and fourth sensors S.sub.11, S.sub.12, S.sub.13, S.sub.14 and S.sub.21, S.sub.22, S.sub.23, S.sub.24 are mounted on a spectacle frame of the glasses (GL). In the embodiment of FIG. 1A, sensors S.sub.11, S.sub.12 and S.sub.21, S.sub.22 are mounted on the respective side bars (SB.sub.1 and SB.sub.2), whereas sensors S.sub.13 and S.sub.23 are mounted on the cross bar (CB) having hinged connections to the right and left side bars (SB.sub.1 and SB.sub.2). Finally, sensors S.sub.14 and S.sub.24 are mounted on first and second nose sub-bars (NSB.sub.1, NSB.sub.2) extending from the cross bar (CB) and adapted for resting on the nose of the user. Glasses or lenses (LE) of the spectacles are mounted on the cross bar (CB) and nose sub-bars (NSB.sub.1, NSB.sub.2). The left and right hearing devices (HD.sub.1, HD.sub.2) comprise respective BTE-parts (BTE.sub.1, BTE.sub.2), and may e.g. further comprise respective ITE-parts (ITE.sub.1, ITE.sub.2). The ITE-parts may e.g. comprise electrodes for picking up body signals from the user, e.g. forming part of sensors S.sub.1i, S.sub.2i (i=1, . . . , N.sub.S) for monitoring physiological functions of the user, e.g. brain activity or eye movement activity or temperature. Likewise, one or more of the sensors on the spectacle frame may comprise electrodes for picking up body signals from the user. In an embodiment, sensors S.sub.11, S.sub.14 and S.sub.21, S.sub.24 (black rectangles) may represent sensor electrodes for picking up body signals, e.g. Electrooculography (EOG) potentials and/or brainwave potentials, e.g. Electroencephalography (EEG) potentials, cf. e.g. EP3185590A1. The sensors mounted on the spectacle frame may e.g. comprise one or more of an accelerometer, a gyroscope, a magnetometer, a radar sensor, an eye camera (e.g. for monitoring pupillometry), a camera (e.g. for imaging objects of the environment of the user), or other sensors for localizing or contributing to localization of a sound source (or other landmark) of interest to the user wearing the hearing system. The sensors (S.sub.13, S.sub.23) located on the cross bar (CB) and/or sensors (e.g. S.sub.12, S.sub.22) located on the side bars (SB.sub.1, SB.sub.2) may e.g. include one or more cameras or radar or ultrasound sensors for monitoring the environment. The hearing system further comprises a multitude of microphones, here configured in three separate microphone arrays (MA.sub.R, MA.sub.L, MA.sub.F) located on the right and left side bars and on the (front) cross bar, respectively. Each microphone array (MA.sub.R, MA.sub.L, MA.sub.F) comprises a multitude of microphones (MIC.sub.R, MIC.sub.L, MIC.sub.F, respectively), here four, four and eight, respectively. The microphones may form part of the hearing system (e.g. associated with the right and left hearing devices (HD.sub.1, HD.sub.2), respectively), and contribute to localising and spatially filtering sound from the respective sound sources of the environment around the user, cf. e.g. our co-pending European patent application number 17179464.7 filed with the European Patent Office on 4.sup.th of Jul. 2017 and having the title Direction Of Arrival Estimation In Miniature Devices Using A Sound Sensor Array.

(35) A. User Localization and Multiple Target Mapping and Tracking (SLAM)

(36) In SLAM the problem of localization (of the user) and mapping (of the targets) is solved simultaneously, and can take a multitude of sensor inputs to build up the localization and mapping. In FIG. 2, triangles may represent the true and estimated position (location) and orientation of the user, respectively. In an embodiment, the hearing system is configured to estimate a location and orientation of landmarks as well. In an embodiment, only the location is estimated (while its orientation is ignored). The orientation of landmarks can e.g. be estimated with a camera (as one of the sensors). The orientation of a landmark representing a sound source (e.g. a person speaking) may be useful for qualifying eye gaze estimation, as it may be less likely that the user attends to a source that is not facing the user.

(37) FIG. 2 shows a simultaneous localization and mapping (SLAM) technique where several time-stamps of the localization of the user (triangles) are indicated during movement in space. The time-stamps are represented by index k. FIG. 2 shows a movement (track) of the user from left to right (cf. locations X.sub.k-1, X.sub.k, X.sub.k+1, X.sub.k+2 of the user, represented by white triangles) in an environment comprising a number of landmarks m (e.g. LM1, . . . LM5, represented by sun symbols). The user's estimated and true locations at a given point in time are indicated by grey shaded and open (un-shaded, white) triangles, respectively. The estimated and true locations of the landmarks (LM1-LM5) at a given point in time are indicated by solid (black) and open (white) sun symbols, respectively. The user can be stationary (e.g. sitting at a dinner table) or moving around (e.g. walking around in a museum). The landmarks can be static sound objects with a given direction (direction of arrival, DOA) relative to and distance from the listener, but they can also be dynamic (e.g. represented by persons talking, while moving around in a museum). The DOA can e.g. be determined from hearing aid microphone arrays (e.g. worn by a user, cf. FIG. 1A, 1B), from a number of connected microphones at a table or mounted in a room, or from remote microphones close to the sound sources (e.g. talkers each wearing a wireless microphone). Other static landmarks can comprise objects whose location can be estimated from sensors capable of sensing physical properties of the objects in question (e.g. their magnetic properties). Such objects may be identified in a map created by a sensor over time, see e.g. the magnetometer map in FIG. 3, which maps out the static magnetic variations in a room. Other static landmarks can be beacons (e.g. BT transmitters, short wide band pulses of radio, or magnetic field transmitters), or the signals from remote static FM transmitters (FM radio band, creating a variation of the received signal strength in the room), or the signal strength from WiFi transmitters across a room. FIG. 3 shows a room with two sound sources and a listener (user).
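By way of illustration only, the relation between a user pose and a landmark location in the map data can be sketched as below. The names `Pose` and `landmark_relative_to_user`, the 2D room frame, and the sign convention (0 = straight ahead, positive = to the left) are assumptions made for this sketch and are not taken from the disclosure.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose:
    """Hypothetical user pose in a 2D room frame."""
    x: float        # position (m)
    y: float        # position (m)
    heading: float  # facing direction (rad), counter-clockwise from the x-axis


def landmark_relative_to_user(pose, lm_x, lm_y):
    """Return (distance, doa) of a landmark as seen from the user.

    doa is in radians, wrapped to [-pi, pi]; 0 means straight ahead and
    positive values are to the user's left (an assumed convention).
    """
    dx, dy = lm_x - pose.x, lm_y - pose.y
    distance = math.hypot(dx, dy)
    bearing = math.atan2(dy, dx)  # direction to the landmark in the room frame
    diff = bearing - pose.heading
    doa = math.atan2(math.sin(diff), math.cos(diff))  # wrap the angle
    return distance, doa


# User at the origin facing along +y; landmark 2 m straight ahead.
user = Pose(x=0.0, y=0.0, heading=math.pi / 2)
dist, doa = landmark_relative_to_user(user, 0.0, 2.0)
```

Re-evaluating this relation for each time-stamp k keeps the landmark directions expressed relative to the user while the user moves or turns the head.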

(38) In embodiments, one or more image sensors, e.g. cameras, may be used to contribute to the mapping of the environment, e.g. one or more of a monocular camera (cf. e.g. [Davison; 2003]), a stereo camera (cf. e.g. [Pax et al.; 2008]), and a combination of an IMU and a monocular or a stereo camera (cf. e.g. [Lupton; 2008]).

(39) FIG. 3 shows map data created by a magnetic field sensor reflecting variations in the magnetic field in an environment comprising two localized sound sources, and a listener (user). The legend (spanning magnetic field strengths from 0 to 100 micro Tesla (μT)) is indicated in the right part of the graph (cf. column denoted MFS-legend). The same information is intended to be (coarsely) given by letters L, M, H indicating a relatively low, middle and relatively high field strength, respectively, in the depicted part of the environment of the user. L, M and H may e.g. be associated with approximately 20, 50 and 80 μT, respectively. The magnetic field strength (assumed to be relatively static) can be used to identify a present location of a landmark (sound source, S1, S2) and/or the user (U), possibly in combination with other location dependent data.
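As a minimal sketch of how such a map might be queried, the example below returns all map cells whose stored field strength lies close to a new magnetometer reading. The grid layout, the values, the tolerance and the function name `candidate_cells` are hypothetical and chosen for illustration only.

```python
def candidate_cells(field_map, measured_ut, tol_ut=5.0):
    """Return the grid cells whose stored field strength (microtesla)
    lies within tol_ut of the measured value."""
    return [cell for cell, value in field_map.items()
            if abs(value - measured_ut) <= tol_ut]


# Hypothetical 2 x 3 grid map: (row, column) -> field strength in microtesla.
field_map = {
    (0, 0): 20.0, (0, 1): 50.0, (0, 2): 80.0,
    (1, 0): 22.0, (1, 1): 48.0, (1, 2): 52.0,
}

# A reading of 49 microtesla matches several cells of middle field strength.
cells = candidate_cells(field_map, measured_ut=49.0)
```

In this toy map a single reading is ambiguous (three cells match), which is one reason for combining the field strength with other location dependent data.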

EXAMPLE

(40) An example of how the combination of sensors can be used in hearing aids is provided in the following. A method for indoor localization using opportunistic signals, e.g., FM radio in the 88-106 MHz band, together with inertial sensors is proposed.

(41) Input:

(42) Antenna and receiver circuitry in the hearing device, sensitive to the 88-106 MHz FM band (and possibly other additional bands), provide electric signals representing the FM signal at the hearing device. Inertial sensors, that is, 3D accelerometers, 3D gyroscopes, and 3D magnetometers located in the hearing device, provide sensor signals representative of linear and angular movement of the user as well as the magnetic field strength at the hearing device.

Output: A map reflecting the received signal strength (RSS) of the FM signal, with uncertainties, of the radio environment of the visited locations for the selected frequencies (recorded during the movement of the user in the environment). 2D position and 3D orientation of the hearing device with their corresponding uncertainty estimates.

(43) There are several reasons why it is interesting to study the opportunistic use of multi-frequency RSS for indoor localization. FM radio and TV signals are present almost everywhere and may therefore be utilized in e.g. first responder scenarios where pre-installed infrastructure cannot be trusted. Signal fading characteristics depend on the frequency and on the surrounding environment, resulting in different RSS maps. These maps combined are naturally more informative than a single map alone.

(44) A radio signal interacts with the physical environment in an extremely complex manner. It experiences a distance dependent attenuation (path loss), and the radio signal is reflected off different objects, diffracted around obstacles, and scattered off objects. Hence, the receiver will receive many (distorted and delayed) signal components. The constructive and destructive addition of these multipath components is the cause of the rapid fluctuations of the RSS values as a function of spatial displacement that are typical for all non-line-of-sight (NLOS) wireless radio channels, which is the typical situation here. This phenomenon is called multipath fading. Apart from multipath fading, antenna orientation and shadowing also affect the measurements.

(45) Radio Environment Model

(46) An RSS fingerprint is a set of mean signal strength values collected at different frequencies at different positions. The RSS value at each frequency in the fingerprint vector is assumed to be a function of position in 2D space. In indoor localization, fingerprints are measured beforehand and used as a map. An alternative is to utilize signal source locations, e.g. WiFi access points (APs), and a radio channel path loss model which describes how the signal attenuates as a function of distance to each source. In some cases, these methods do not work, for instance if the map cannot be obtained beforehand, if the radio source locations are not known, or if it is not meaningful to describe the signal using a path loss model. An alternative then is to model the RSS with a Gaussian Process (GP), which locally describes how the RSS varies as a function of the 2D position.
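A minimal sketch of GP regression of the RSS over 2D position is given below for a single frequency. The squared-exponential kernel, its hyperparameters, the constant prior mean, the noise variance and the sample values are assumptions made for illustration; the disclosure does not specify them. Only the predictive mean is computed, and the linear system is solved with a small Gaussian elimination routine so that the example depends on the standard library only.

```python
import math


def sq_exp_kernel(p, q, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between two 2D positions (assumed kernel)."""
    d2 = (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    return signal_var * math.exp(-0.5 * d2 / length_scale ** 2)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def gp_predict_mean(train_pos, train_rss, query, noise_var=1e-6):
    """Predictive mean of the RSS at a query position.

    The prior mean is taken as the average of the training RSS values.
    """
    n = len(train_pos)
    mean_rss = sum(train_rss) / n
    k = [[sq_exp_kernel(train_pos[i], train_pos[j])
          + (noise_var if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    alpha = solve(k, [y - mean_rss for y in train_rss])
    k_star = [sq_exp_kernel(query, p) for p in train_pos]
    return mean_rss + sum(ks * a for ks, a in zip(k_star, alpha))


# Hypothetical RSS samples (dBm) recorded at three visited positions (m).
positions = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
rss = [-60.0, -65.0, -70.0]
estimate = gp_predict_mean(positions, rss, (0.5, 0.5))
```

Close to a visited position the prediction follows the recorded value, and far from all visited positions it falls back to the prior mean, which matches the local character of the GP description.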

(47) Inertial Navigation

(48) Inertial navigation is a well-known technique used for computing position, velocity, orientation, and other quantities by means of inertial sensors. In this application, inertial navigation is used to compute a position and orientation estimate. Without additional information, the errors of the position and orientation estimates grow quadratically and linearly over time, respectively.
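The growth rates can be illustrated with a simplified one-dimensional sketch in which the only error sources are a constant accelerometer bias and a constant gyroscope bias (both hypothetical values). The coupling of orientation errors into the position estimate is deliberately ignored here, so the sketch is an assumption-laden illustration rather than a full error model.

```python
def dead_reckoning_errors(acc_bias, gyro_bias, duration, dt=0.01):
    """Integrate constant sensor biases over time.

    acc_bias in m/s^2, gyro_bias in rad/s, duration and dt in s.
    Returns (position_error in m, orientation_error in rad).
    """
    vel_err = 0.0
    pos_err = 0.0
    ang_err = 0.0
    for _ in range(round(duration / dt)):
        ang_err += gyro_bias * dt  # one integration: linear growth
        pos_err += vel_err * dt + 0.5 * acc_bias * dt * dt
        vel_err += acc_bias * dt   # two integrations in total: quadratic growth
    return pos_err, ang_err


# Doubling the duration doubles the orientation error
# and quadruples the position error.
p10, a10 = dead_reckoning_errors(acc_bias=0.1, gyro_bias=0.01, duration=10.0)
p20, a20 = dead_reckoning_errors(acc_bias=0.1, gyro_bias=0.01, duration=20.0)
```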

(49) RSS SLAM

(50) Simultaneous localization and mapping (SLAM) is a methodology to jointly estimate the position of mobile sensors and a map of the signals that the sensors measure. In this application, the RSS map is represented as a GP over the traversed area. Inertial navigation is used to provide position and orientation estimates, which are corrected using the RSS map in a statistical filtering framework. These corrections are most useful when an area is revisited, as they can potentially remove the errors completely.
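The correction step can be sketched with a scalar Kalman filter update, used here as a simple stand-in for the statistical filtering framework (the disclosure does not name a specific filter). The sketch assumes, hypothetically, that matching the current RSS against the map yields a direct one-dimensional position observation with a known variance; a practical RSS measurement model would be nonlinear and multi-dimensional.

```python
def kalman_update(x_pred, p_pred, z, r):
    """Correct a predicted position with a map-derived observation.

    x_pred: position predicted by inertial navigation (m)
    p_pred: variance of that prediction (m^2)
    z:      position implied by matching the RSS against the map (m)
    r:      variance of the map-derived observation (m^2)
    Returns the corrected position and its (reduced) variance.
    """
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (z - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new


# A drifted inertial estimate (10 m, variance 4 m^2) is pulled towards the
# map-derived position (8 m, variance 1 m^2) when an area is revisited.
x, p = kalman_update(x_pred=10.0, p_pred=4.0, z=8.0, r=1.0)
```

The smaller the variance of the map-derived observation relative to that of the inertial prediction, the larger the share of the accumulated drift that is removed.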

(51) The processing for RSS SLAM comprises the following steps: 1) Provide radio and IMU data; 2) Preprocess; 3) RSS SLAM; 4) Position and map.

(52) In SLAM, static landmarks/sources are treated differently from dynamic/moving sources. Therefore, five different use scenarios are described below.

(53) Table 1 below describes a number of sensors that may be used in different combinations to solve the SLAM problem for the five different scenarios depending on whether sound sources are stationary or mobile, whether the listener (user) is stationary or mobile, and whether the SLAM solution considers a single room or a building (with several rooms).

(54) The sensors are arranged in the table according to the following groups: 3D Accelerometer, 3D Gyroscope, 3D Magnetometer, EEG/EOG electrodes, Augmenting device (MEMS) microphones, Radio antenna receiver, Radio antenna transmitter, Magnetic antenna (T-coil), Bluetooth transmitter, Bluetooth receiver, Ultrasound sensors, Front camera, Eye camera.

(55) TABLE 1. A number of sensors that may be used in different combinations to solve the SLAM problem.

Sensor/input type | Signals | Quantities of interest
3D Accelerometer (1 per ear or multiple) | Linear acceleration vector; Centrifugal force; Euler force | Linear acceleration; Linear velocity; Position; Direction of gravity vector; Magnitude of gravity vector; Movement detection - Walking, running . . . ; Anomalies . . .
3D Gyroscope (1 per ear or multiple) | Angular velocity | Angular velocity; Short time local orientation; Movement detection/classification - Walking, running . . . ; Anomalies
3D Magnetometer (1 per ear or multiple) | Magnetic field | Local magnetic field vector; Local magnetic field vector magnitude; Magnetic disturbances/targets characteristics; Magnetic field variation → linear velocity, position
EEG/EOG electrodes (2 per ear or multiple) | Electric field and other disturbances | Eye gaze direction relative to the head
Augmenting device (MEMS) microphones (Number of microphones: 12 or more) | Sound sources; Reverberation | Direction of arrival; Position of targets; Position of microphone; Number of sound sources; Environment characteristics - reverberation . . .
Radio antenna receiver (1 per ear) | Electromagnetic waves | RSS; RSSI; Access point IDs; Access point location; Position; Velocity
Radio antenna transmitter (1 per ear) | Electromagnetic waves |
Magnetic antenna (T-coil) (1 per ear) | Magnetic signal representing sound (a potential disturbance for navigation) | Magnetic field disturbance; Informed DOA - the sound in the T-coil can be used to estimate DOAs to targets
Bluetooth transmitter (1 per ear) | Radio signal | Transmitter position by external network
Bluetooth receiver | Radio signal | Position; Other information?
Ultrasound sensors (1 per ear) | Distance to objects; Object shape, etc. | Distance to objects. Can be used to produce a depth map w.r.t. the sensor.
Front camera | Object shape, etc. | Map w.r.t. the sensor
Eye camera | Pupil tracking | Eye gaze

(56) In the following the term augmenting device is used. The augmenting device comprises a number of microphones and may e.g. form part of the hearing device, or be a separate device in communication with the hearing device.

(57) Use Cases for SLAM:

(58) 1. One Stationary Listener (User) and N Stationary Sound Sources in a Room (Conference Scenario, Restaurant)

(59) FIG. 4 shows a scenario comprising a multitude of landmarks (LMi, i=1, 2, 3, 4, 5) in the form of a number of (simultaneous or intermittent) talkers in an environment around a user (U) wearing a hearing system (HS) according to the present disclosure. The environment (e.g. representing a restaurant) further comprises a number of silent persons (SP) and additional landmarks (e.g. objects denoted LM6, LM7, e.g. tables or other environmental structures).

(60) Here a 3D Accelerometer, a 3D Gyroscope, and a 3D Magnetometer may be used to estimate the head rotation (HR) of the listener (User U). An eye gaze angle (EGA) may e.g. be determined using an eye tracking camera or electrodes for measuring Electrooculography (EOG) potentials.
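One way in which head rotation and eye gaze angle might be combined to point at a landmark is sketched below. The function names, the use of degrees, the room-frame source directions and the nearest-angle selection rule are assumptions made for this illustration only.

```python
def wrap_deg(angle):
    """Wrap an angle in degrees to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def source_in_focus(head_rotation_deg, eye_gaze_deg, source_dirs_deg):
    """Pick the source whose direction is closest to the absolute gaze.

    head_rotation_deg: head rotation (HR) relative to the room frame
    eye_gaze_deg:      eye gaze angle (EGA) relative to the head
    source_dirs_deg:   dict mapping source name -> direction in the room frame
    """
    gaze = wrap_deg(head_rotation_deg + eye_gaze_deg)
    return min(source_dirs_deg,
               key=lambda name: abs(wrap_deg(source_dirs_deg[name] - gaze)))


# Hypothetical talker directions (degrees) in the room frame.
talkers = {"LM1": -40.0, "LM2": 10.0, "LM3": 60.0}

# Head turned 30 degrees, eyes turned -15 degrees relative to the head:
# the absolute gaze is 15 degrees, so LM2 is selected.
focus = source_in_focus(30.0, -15.0, talkers)
```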

(61) The Augmenting device (MEMS) microphones are e.g. integrated in a spectacle frame of the hearing system and used in combination to estimate the direction of arrival of the N stationary sound sources, cf. e.g. [Skoglund et al.; 2017]. The same microphones may be used on a sensor-integration device, combining remote microphones and head-worn microphones, cf. e.g. [Farmani et al.; 2015].

(62) Ultrasound sensors are e.g. used to measure the distance to each of the N stationary sound sources. A front camera is e.g. used to map the N stationary sound sources.

(63) 2. One Stationary Listener and N Mobile Sound Sources in a Room (Event, Like a Theater)

(64) Here the 3D Accelerometer, 3D Gyroscope, and the 3D Magnetometer, are used to estimate the head rotation of the listener.

(65) The Augmenting device (MEMS) microphones are used in combination to track the direction of arrival of the N mobile sound sources. In an embodiment, the number of (MEMS) microphones is larger than 4, such as 12 or more. In an embodiment, the augmenting device (e.g. the microphones) is used to estimate target velocity through the Doppler effect.
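
As an illustration of the Doppler-based velocity estimate mentioned above, the following minimal sketch solves the textbook moving-source Doppler relation for the radial speed of a source relative to a stationary listener. It assumes that the emitted frequency of the source is known or can be estimated, which the disclosure does not specify; the function name, the speed-of-sound constant and the sign convention are illustrative assumptions only.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees C (assumed value)


def radial_velocity(f_observed_hz, f_source_hz, c=SPEED_OF_SOUND):
    """Radial speed of a moving source relative to a stationary listener.

    Uses the moving-source Doppler relation f_obs = f_src * c / (c - v),
    solved for v. Positive v means the source approaches the listener.
    The emitted frequency f_source_hz is assumed to be known.
    """
    if f_observed_hz <= 0 or f_source_hz <= 0:
        raise ValueError("frequencies must be positive")
    return c * (1.0 - f_source_hz / f_observed_hz)


if __name__ == "__main__":
    # A 1000 Hz tone observed at a slightly higher frequency (approaching source).
    print(radial_velocity(1014.8, 1000.0))
```

In practice the emitted frequency is rarely known exactly, so such an estimate would more plausibly serve as one noisy input among several to the tracking filter.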

(66) The ultrasound sensors are used to measure the distance to each of the N mobile sound sources. The front camera is used to map the N mobile sound sources.

(67) 3. One Mobile Listener and N Mobile Sound Sources in a Room (the Dynamic Cocktail Party)

(68) The 3D Accelerometer, 3D Gyroscope, 3D Magnetometer, Radio Antenna Receiver, Magnetic antenna (T-coil), and Bluetooth receiver are used to track the position of the mobile listener, and the head rotation of the listener.

(69) The augmenting device (MEMS) microphones are used in combination to track the direction of arrival of the N mobile sound sources. The ultrasound sensors are used to measure the distance to each of the N mobile sound sources. The front camera is used to map the N mobile sound sources.

(70) 4. One Mobile Listener and N Stationary Sound Sources in a Building (an Empty Museum)

(71) The 3D Accelerometer, 3D Gyroscope, 3D Magnetometer, Radio antenna receiver, Magnetic antenna (T-coil), and Bluetooth receiver are used to track the position of the mobile listener, and the head rotation of the listener.

(72) The Augmenting device (MEMS) microphones are used in combination to track the direction of arrival of the N stationary sound sources.

(73) Radio antenna transmitter and Bluetooth transmitter are used to communicate with beacons. The ultrasound sensors are used to measure the distance to each of the N stationary sound sources. The front camera is used to map the N stationary sound sources.

(74) 5. One Mobile Listener and N Mobile Sound Sources in a Building (Crowded Museum)

(75) The 3D Accelerometer, 3D Gyroscope, 3D Magnetometer, Radio antenna receiver, Magnetic antenna (T-coil), and Bluetooth receiver are used to track the position of the mobile listener, and the head rotation of the listener.

(76) The Augmenting device (MEMS) microphones are used in combination to track the direction of arrival of the N mobile sound sources.

(77) Radio antenna transmitter and Bluetooth transmitter are used to communicate with beacons. The ultrasound sensors are used to measure the distance to each of the N mobile sound sources. The front camera is used to map the N mobile sound sources.

(78) B. Determining the (Sound) Source in User's Focus of Attention for the Moment

(79) Put electrodes (dry electrodes) where skin touches the device, at least two electrodes.

(80) a. Estimate the position and rotation of the head in the physical scene (see above)

(81) b. Estimate the eye-gaze vector relative to the head

(82) i. Combine a and b in the SLAM solutions above.

(83) 1. Compensating for the Missing DC Component in EOG.

(84) a. By using dead reckoning/modelling/estimation of the eye gaze (position)

(85) b. Relating this to landmarks (we can for this part of SLAM assume that we know the landmarks), and using on-line re-calibration to landmarks or zero gaze angle.

(86) 2. Modeling of Different Behaviors when Head-Movements (IMU) and Eye-Movements (EOG) are Integrated into One SLAM Model

(87) a. Fixations. If the estimate classifies that we are in a fixation we have two sources for estimating the eye-position:

(88) i. The EOG signal

(89) ii. The opposite IMU signal. Here the assumption (supported by the literature) is that, during a fixation, the oculomotor system tries to keep a fixed picture on the retina. Therefore, the head-movements are registered in the balance organ (and proprioceptors in the muscle coils in the neck), and directly (through a feedback control system) drive the eye-muscles to perform counter-movements that compensate for the head-movement. Thus, head-movements and eye-movements are tightly coupled. In this model (fixation), the IMU therefore also registers the eye-movements, with opposite sign.
b. Saccades. Model saccades, that is when the person switches from a fixation towards a new target.
c. Smooth pursuit. Model smooth pursuit, that is slow eye-movements (typically following a target at a distance). Smooth pursuit is difficult to detect with EOG since the change rate is so low (comparable to the DC-drift and thus filtered out), but can possibly be traced from the IMUs.
3. Given a Stable and Reliable Eye-Gaze Signal, Look at the Statistics of the Looking Behavior so as to Verify the Landmarks and Combine that with the Other Sensors (DOA Estimation).

(90) The use of eye gaze to control a hearing device, e.g. a beamformer is e.g. described in EP3185590A1.

(91) Alternatively, if size and power consumption permit, an eye-camera could be included in the device, which tracks the pupil (REFS, Mobile eye, pupillabs.com). This is e.g. dealt with in EP2813175A2.

(92) C. Augmenting/Enhancing the Source in the User's Focus

(93) Given the SLAM localization and mapping and the estimated eye-gaze direction, N beams are calculated from the Augmenting device (MEMS) microphone signals, one beam for each sound source, each pointing in the direction of the corresponding DOA estimate, e.g. using MVDR beamformers. The source in the user's focus is defined as the source at which the eye-gaze vector is pointing in the absolute coordinate system.
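
To make the "one beam per DOA estimate" step concrete, the following sketch computes narrow-band MVDR weights w = R^-1 d / (d^H R^-1 d) for each source. It is a minimal illustration under stated assumptions, not the implementation of the disclosure: a far-field linear array, a single frequency, and an interference covariance built from the other sources' steering vectors plus white noise are all assumed, and every function name (`solve`, `steering_vector`, `interference_covariance`, `mvdr_weights`, `response`) is hypothetical.

```python
import cmath
import math


def solve(a, b):
    """Solve a x = b for a square complex matrix (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def steering_vector(mic_x, doa_deg, freq_hz, c=343.0):
    """Far-field steering vector for microphones on a line (positions in metres, assumed geometry)."""
    k = 2.0 * math.pi * freq_hz / c
    s = math.sin(math.radians(doa_deg))
    return [cmath.exp(-1j * k * x * s) for x in mic_x]


def interference_covariance(steering, target_idx, sigma2=0.1):
    """R = sigma2 * I + sum over all non-target sources of d d^H (assumed noise model)."""
    n = len(steering[0])
    r = [[(sigma2 if i == j else 0.0) + 0j for j in range(n)] for i in range(n)]
    for idx, d in enumerate(steering):
        if idx == target_idx:
            continue
        for i in range(n):
            for j in range(n):
                r[i][j] += d[i] * d[j].conjugate()
    return r


def mvdr_weights(cov, d):
    """MVDR weights: unit response toward d, minimum output power for covariance cov."""
    r_inv_d = solve(cov, d)
    denom = sum(di.conjugate() * yi for di, yi in zip(d, r_inv_d))
    return [yi / denom for yi in r_inv_d]


def response(w, d):
    """Beamformer response w^H d toward the direction with steering vector d."""
    return sum(wi.conjugate() * di for wi, di in zip(w, d))


# One beam per DOA estimate (the DOAs would come from the map data).
mic_x = [0.0, 0.05, 0.10, 0.15]
doas_deg = [0.0, 30.0]
steer = [steering_vector(mic_x, a, 1000.0) for a in doas_deg]
beams = [mvdr_weights(interference_covariance(steer, n), steer[n])
         for n in range(len(steer))]
```

Each beam keeps unit gain toward its own source while attenuating the other mapped sources; a real device would estimate the covariance from the microphone signals per frequency band rather than from a model.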

(94) Furthermore, DNNs or other Noise cancelling techniques (e.g. Virtanen et al. 2017) can be used to further suppress the sources that are not in the user's focus.

(95) Another complementing solution (e.g. together with, or instead of the use of eye gaze) would be to use face recognition and stream segregation in combination with the teaching of the present disclosure (e.g. using standard face recognition algorithms, to place face positions on the map). By using an image sensor, e.g. a frontal camera/scene camera, and by using face recognition algorithms (cf. e.g. http://www.face-rec.org/algorithms/), the position of one or several faces in the image/video can be determined, video-frame by video-frame. The image sensor may be located on the spectacle frame(s), e.g. a cross-bar to provide a frontal field of view relative to the user (cf. e.g. FIG. 1A, 1B, e.g. sensors S.sub.13, S.sub.23). Combined with other sensors of the hearing system (e.g. IMU, RSS, acoustic DOA, magnetic field), the position of the face(s) can be determined in the reference coordinate system.

(96) Another complementing solution would be to use source classification based on video and audio. By combining video and audio data, there are several ways to determine the position of human objects in the reference coordinate system: visually, by the scene camera with face recognition (cf. above), and acoustically, by DOA and analysis of speaking objects.

(97) FIG. 5A shows an augmenting device comprising a head worn frame supporting a microphone array comprising a multitude of microphones (e.g. (MEMS) microphones),

(98) FIG. 5B shows a beamformer with 0 degrees horizontal angle relative to the frontal direction,

(99) FIG. 5C shows a beamformer with 30 degrees horizontal angle relative to the frontal direction, and

(100) FIG. 5D shows the 30 degrees beamformer of FIG. 5C in severe reverberant conditions.

(101) Use the eye-gaze vector to weigh (weighing function as a function of angle relative to the eye-gaze vector) the relative amplitude of the different beamformers (to make the acoustic object in focus being enhanced, but without completely removing the other acoustic object, a target enhancement of about 12 dB is suggested).
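
The weighting described above can be sketched as a gain per beam that depends on the angular distance between the beam direction and the eye-gaze direction. The shape of the weighting function is not specified in the text; the Gaussian taper and its 15 degree width below are assumptions chosen only for illustration, as is the function name `gaze_weights`. The 12 dB peak follows the suggested target enhancement.

```python
import math


def gaze_weights(source_angles_deg, gaze_angle_deg,
                 enhancement_db=12.0, width_deg=15.0):
    """Linear gain per beam as a function of angle relative to the eye-gaze vector.

    Beams far from the gaze direction keep unit gain (they are not removed);
    a beam exactly in the gaze direction is boosted by enhancement_db.
    The Gaussian taper and width_deg are illustrative assumptions.
    """
    peak = 10.0 ** (enhancement_db / 20.0)
    gains = []
    for angle in source_angles_deg:
        # Wrap the angular difference into [-180, 180) degrees.
        diff = (angle - gaze_angle_deg + 180.0) % 360.0 - 180.0
        taper = math.exp(-0.5 * (diff / width_deg) ** 2)
        gains.append(1.0 + (peak - 1.0) * taper)
    return gains


# Five talkers around a table, user looking at the talker at +30 degrees.
gains = gaze_weights([-60.0, -30.0, 0.0, 30.0, 60.0], 30.0)
```

Because the off-focus gains stay at unity rather than zero, the other acoustic objects remain audible, which matches the stated intent of enhancing without completely removing them.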

(102) FIG. 6 shows a use of EarEOG control of a beamformer of a hearing device according to the present disclosure. Steerable real-time beamformers can be implemented in hearing devices. In an embodiment, an EarEOG control signal is used to steer the beamformer angle of maximum sensitivity towards the gaze direction.

(103) The scenario of FIG. 6 is e.g. discussed in EP3185580A1. It illustrates a table situation with quasi constant location of a multitude of occasional talkers/listeners (S1-S5). In an embodiment, the hearing devices of a hearing system according to the present disclosure provide eye gaze control (i.e. comprise electrodes for picking up body potentials and are adapted for exchanging amplified voltages based thereon to provide a control signal representative of a current eye gaze direction (EyeGD in FIG. 6) of the user, e.g. relative to a look direction (LookD in FIG. 6) of the user). Each hearing device comprises a beamformer for providing a beam (i.e. the microphone system has a maximum in sensitivity in a specific direction) directed towards a target sound source, here a talker in the vicinity of the user, controlled by a user's eye gaze. For such a situation, a beamformer filtering unit for providing a number of beamformers directed at respective localized sound sources in the environment of a user is provided using map data according to the present disclosure. Alternatively, or additionally, predefined look vectors (transfer functions from sound source to microphones, or corresponding relative transfer functions) and/or filter weights corresponding to a particular direction may be determined and stored in a memory unit MEM of the hearing device, so that they can be quickly loaded into the beamformer, when a given talker is selected. Thereby a beamformer directed at a sound source of current interest of the user can be activated via eye gaze control. Thereby the non-active beams can be considered to represent virtual microphones that can be activated one at a time.

(104) FIG. 6 shows an application of a hearing system according to the present disclosure for segregating individual sound sources in a multi-sound source environment. In FIG. 6, the sound sources are persons (that at a given time are talkers or listeners) located around a user (U, that at the time illustrated is a listener). The user (U) wears a hearing system according to the present disclosure that allows segregation of each talker and allows the user to tune in depending on the person (S1, S2, S3, S4, S5) that is currently speaking as indicated by (schematic) elliptic beams of angular width () sufficiently small to enclose (mainly) one of the persons surrounding the user. In the example of FIG. 6, the person speaking is indicated by S2, and the sound system is focused on this person as indicated by direction of eye gaze of the user (EyeGD) and a bold elliptic beam including the speaker (S2).

(105) In an embodiment, the hearing device or devices of the hearing system worn by the user (U) are hearing devices according to the present disclosure. Preferably, the hearing system comprises two hearing devices forming part of a binaural hearing system, e.g. a binaural hearing aid system. In an embodiment, the sensor part of the hearing devices comprises a number of electromagnetic sensors each comprising a sensing electrode configured to be coupled to the surface of the user's head (e.g. at or around an ear or in an ear canal), when the hearing device is operatively mounted on the user. In an embodiment, the sensor part comprises an electrical potential sensor for sensing an electrical potential. In another embodiment, the sensor part comprises a magnetic field sensor for sensing a magnetic field (e.g. generated by a user's body, e.g. originating from neural activity in the user's head, e.g. the brain). In an embodiment, the electrical potential and/or magnetic field sensors are configured to sense electric and/or magnetic brain wave signals, respectively. In an embodiment, the sensing electrode(s) is(are) configured to be capacitively or inductively coupled to the surface of the user's head, when the hearing device is operatively mounted on the user. In an embodiment, the electrical potential sensor comprises a sensing electrode configured to be coupled to the surface of the user's head (e.g. at or around an ear or in an ear canal), when the hearing device is operatively mounted on the user. In an embodiment, the sensing electrode is configured to be directly (e.g. electrically (galvanically)) coupled to the surface of the user's head (e.g. via a dry or wet contact area between the skin of the user and the (electrically conducting) sensing electrode), when the hearing device is operatively mounted on the user.

(106) Another complementing solution would be to use visually based eye-trackers, e.g. glasses with cameras or eye-glasses with EOG (see e.g. Jins Meme, https://jins-meme.com/en/eyewear-apps).

(107) By Kalman-filtering the output from (Ear)EOG sensors (or other eye trackers), the eye-angle (cf. e.g. angle in FIG. 6) relative to the head's nose-pointing direction (cf. e.g. LookD in FIG. 6) can be estimated. By Kalman-filtering the output from 9DOF sensors (9 degree of freedom sensors: 3D accelerometer, 3D gyroscope, 3D magnetometer), or other motion-tracking devices, placed at head level close to the ear, the absolute head-angle relative to the local magnetic field and/or local gravity field can be determined. The absolute head-angle relative to the room can be determined in case the local magnetic field and/or local gravity field is/are known (e.g. measured by a magnetic field sensor and an accelerometer, respectively) for the room in question. By combining the outputs from the (Ear)EOG and 9DOF sensors, another (or the same) Kalman filter can be made whose output is the absolute eye-angle relative to the room.
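
The combination described above can be illustrated with a deliberately simplified, one-dimensional sketch: a scalar Kalman filter tracks head yaw by predicting with the gyroscope rate and correcting with a magnetometer heading, and the absolute eye-angle is obtained by adding the eye-angle relative to the head. The reduction to a single yaw angle, the noise variances `q` and `r`, the function names and the omission of angle wrap-around are all assumptions made for brevity.

```python
def kalman_yaw_step(yaw, var, gyro_rate, dt, mag_yaw, q, r):
    """One predict/update cycle for head yaw (degrees), scalar Kalman filter.

    Predict with the gyroscope angular rate, then correct with the
    magnetometer-derived heading. q and r are assumed process and
    measurement noise variances. Angle wrap-around is ignored here.
    """
    # Predict: integrate the angular velocity over one time step.
    yaw_pred = yaw + gyro_rate * dt
    var_pred = var + q
    # Update: blend in the magnetometer heading.
    gain = var_pred / (var_pred + r)
    yaw_new = yaw_pred + gain * (mag_yaw - yaw_pred)
    var_new = (1.0 - gain) * var_pred
    return yaw_new, var_new


def absolute_gaze(head_yaw, eye_angle_rel_head):
    """Absolute eye-angle relative to the room: head yaw plus eye-in-head angle."""
    return head_yaw + eye_angle_rel_head


def track_head(mag_readings, gyro_rates, dt=0.01, q=0.01, r=1.0):
    """Run the filter over paired magnetometer and gyroscope readings."""
    yaw, var = 0.0, 1.0
    for mag_yaw, rate in zip(mag_readings, gyro_rates):
        yaw, var = kalman_yaw_step(yaw, var, rate, dt, mag_yaw, q, r)
    return yaw, var
```

For example, `absolute_gaze(track_head([20.0] * 100, [0.0] * 100)[0], -5.0)` combines a head turned to roughly 20 degrees with an eye-angle of -5 degrees relative to the head. A full 9DOF solution would track three-dimensional orientation and use the accelerometer as well.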

(108) By further Kalman-filtering (e.g. using another or the same Kalman filter) the output from the absolute eye-angle relative to the room for Simultaneous Localization and Mapping (SLAM), a kind of hotspot(s) can be estimated, where some eye-gaze angles are more plausible than others (the person is probably looking more at the persons in the scene than at the backgrounds). The principal idea is to extend the Kalman filter, where eye-gaze angle is a state, with a number of states/parameters that describe the angle to the hotspots (the Map in general robotic-terms). This principle works well if you switch between a number of discrete hotspots, as is the case in this application. The Map can be points or normal-distributions, assuming that the eye-gaze angle follows a mixture of Gaussian distributions.
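
Under the stated assumption that the eye-gaze angle follows a mixture of Gaussian distributions, the hotspot currently being looked at can be inferred by evaluating each mixture component at the measured absolute gaze angle. The sketch below shows only this association step with fixed hotspot parameters; in the approach described above those parameters would themselves be states of the filter. The function name, the tuple layout and the example values are hypothetical.

```python
import math


def hotspot_posteriors(gaze_deg, hotspots):
    """Posterior probability of each hotspot given one absolute gaze angle.

    hotspots is a list of (mean_deg, std_deg, prior) tuples describing the
    Gaussian mixture components (an assumed representation of the map).
    """
    likelihoods = []
    for mean, std, prior in hotspots:
        z = (gaze_deg - mean) / std
        density = math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))
        likelihoods.append(prior * density)
    total = sum(likelihoods)
    if total == 0.0:
        # Gaze far from every hotspot: fall back to a uniform posterior.
        return [1.0 / len(hotspots)] * len(hotspots)
    return [value / total for value in likelihoods]


def most_likely_hotspot(gaze_deg, hotspots):
    """Index of the hotspot with the highest posterior probability."""
    post = hotspot_posteriors(gaze_deg, hotspots)
    return max(range(len(post)), key=post.__getitem__)


# Three talkers at -40, 0 and 30 degrees, each with a 5 degree spread.
example_map = [(-40.0, 5.0, 1.0 / 3), (0.0, 5.0, 1.0 / 3), (30.0, 5.0, 1.0 / 3)]
```

Accumulating such posteriors over time gives the looking statistics that can be used to verify landmarks against the acoustic DOA estimates.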

(109) Extended Kalman filter:

(110) In the extended Kalman filter (EKF), the state transition and observation models need not be linear functions of the state but may instead be differentiable functions.
$$x_k = f(x_{k-1}, u_{k-1}) + w_k$$
$$z_k = h(x_k) + v_k$$
where $w_k$ and $v_k$ are the process and observation noises, which are both assumed to be zero-mean multivariate Gaussian noises with covariances $Q_k$ and $R_k$, respectively, and $u_{k-1}$ is the control vector.

(111) The function f can be used to compute the predicted state from the previous estimate and similarly the function h can be used to compute the predicted measurement from the predicted state. However, f and h cannot be applied to the covariance directly. Instead a matrix of partial derivatives (the Jacobian) is computed.

(112) At each time step, the Jacobian is evaluated with the current predicted states. These matrices can be used in the Kalman filter equations. This process essentially linearizes the non-linear function around the current estimate.
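
The predict/linearize/update cycle described above can be sketched for a scalar state, where the Jacobians reduce to ordinary derivatives. This is a minimal illustration only: the transition model (angle plus control increment) and the observation model (sine of the angle) are hypothetical stand-ins chosen to make the observation non-linear, and the noise variances are arbitrary.

```python
import math


def ekf_step(x, p, u, z, f, h, f_jac, h_jac, q, r):
    """One predict/update cycle of a scalar extended Kalman filter."""
    # Predict: propagate the state with f, the variance with the Jacobian of f.
    x_pred = f(x, u)
    jac_f = f_jac(x, u)
    p_pred = jac_f * p * jac_f + q
    # Update: linearize h around the predicted state.
    jac_h = h_jac(x_pred)
    innovation_var = jac_h * p_pred * jac_h + r
    gain = p_pred * jac_h / innovation_var
    x_new = x_pred + gain * (z - h(x_pred))
    p_new = (1.0 - gain * jac_h) * p_pred
    return x_new, p_new


# Hypothetical models: the state is an angle in radians.
def f(x, u):
    return x + u          # transition: previous angle plus control increment


def f_jac(x, u):
    return 1.0            # derivative of f with respect to x


def h(x):
    return math.sin(x)    # non-linear observation of the angle


def h_jac(x):
    return math.cos(x)    # derivative of h with respect to x


def run_demo(true_angle=0.5, steps=50):
    """Estimate a constant angle from noise-free observations sin(true_angle)."""
    x, p = 0.0, 1.0
    for _ in range(steps):
        x, p = ekf_step(x, p, 0.0, math.sin(true_angle),
                        f, h, f_jac, h_jac, q=1e-4, r=1e-2)
    return x, p
```

For a multi-dimensional state such as head pose plus hotspot angles, the same structure applies with the scalars replaced by vectors, Jacobian matrices and covariance matrices.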

(113) If, e.g., a particle filter is used for SLAM (an approach known as FastSLAM), the functions f( ) and h( ) need not even be differentiable.

(114) Including a smartphone (or perhaps several smartphones, e.g. belonging to other users, sound sources (landmarks)) into the loop, their on-board sensors may also be utilized by the hearing system, either alone or in combination with the hearing device (e.g. embodied in glasses and/or hearing aids). In an embodiment, the smartphone may be used to compute a direction of arrival (DOA) of a sound source, e.g. in a restaurant scenario, cf. e.g. FIG. 6, e.g. in combination with the hearing device(s) (which provides more microphones, e.g. microphone pairs). The relative position and orientation between the user and the hearing device can be calculated e.g. via IMU, smartphone-device, DOA, or other methods. The smartphone can also be used as a remote microphone (array), possibly improving beamforming.

(115) FIG. 7 shows a use of EarEEG control of a beamformer of a hearing device according to the present disclosure. FIG. 7 illustrates a situation where the control unit configured to allow a user to select one of the localized sound sources as a sound source of current interest comprises a user interface based on detection of brain waves using electrodes for picking up EEG potentials from the user's brain. At least some of the electrodes may be located in or on the hearing device. The different localized sound sources may be identified as proposed by the present disclosure (e.g. using a SLAM algorithm) and e.g. sequentially (or simultaneously) presented to the user by directing beamformers at the sound sources and letting the user choose the sound source of current interest via a comparison of recorded EEG signals with the individual sound signals e.g. extracted by the beamformers. An example of such brain-wave dependent audio processing is e.g. presented in US20140098981A1.

(116) FIG. 8A shows a pair of behind the ear (BTE) hearing devices comprising electrodes on a mould surface of an ITE part. The embodiment of FIG. 8A shows electrodes (E1, E2, E3) located on the surface of (customized) in-the-ear moulds (ITE) of a hearing system (HS) comprising first and second hearing devices (HD1, HD2), but may also be applied to more general purpose domes or ITE-parts, or may be integrated in a behind the ear part (BTE). In an embodiment, the electrodes are conventional (dry or wet) electrodes for establishing a direct electric contact between skin and electrode.

(117) FIG. 8B shows a pair of hearing devices comprising electrodes in an ear canal as well as outside the ear canal of the user, and a wireless interface for communicating with a separate processing part. In the embodiment of FIG. 8B, the first and second hearing devices (HD.sub.1, HD.sub.2) comprise first and second parts (P1, P2), respectively, each being located partially in an ear canal and partially outside the ear canal of the user (e.g. in concha or pinna or behind the ear). Further, the first and second hearing devices (HD.sub.1, HD.sub.2) each comprise EEG- or EOG-electrodes, or a mixture thereof (ExGe1, ExGe2), and a reference electrode (REFe1, REFe2), respectively, arranged on the outer surface of the respective ear pieces (EarP1, EarP2). Each of the first and second parts (P1, P2) comprises a number of EEG- and/or EOG-electrodes (ExGe1, ExGe2), here four are shown (but more or fewer may be present in practice depending on the application), and a reference electrode (REFe1, REFe2). Thereby a reference voltage (V.sub.REF2) picked up by the reference electrode (REFe2) of the second part (P2) can be used as a reference voltage for the EEG- or EOG-potentials (V.sub.ExG1i) picked up by the EEG- or EOG-electrodes (ExGe1) of the first part (P1), and vice versa. In an embodiment, the first and second hearing devices (HD.sub.1, HD.sub.2) implement a binaural hearing system, e.g. a binaural hearing aid system. The reference voltages (V.sub.REF1, V.sub.REF2) may be transmitted from one part to the other (P1<->P2) via electric interface EI (e.g. a wired or wireless connection, and optionally to or via auxiliary device PRO, cf. wired or wireless links WLCon). The two sets of EEG-signal voltage differences (V.sub.ExG1, V.sub.ExG2) can be used separately in each of the respective first and second hearing devices (HD.sub.1, HD.sub.2) (e.g. to control processing of an input audio signal, or to select one of a multitude of audio sources) or combined in one of the hearing devices and/or in the auxiliary device (PRO, e.g. for display and/or further processing).

(118) FIG. 8C shows a hearing system comprising a bimodal hearing system comprising an air conduction hearing device as well as a cochlear implant hearing device. FIG. 8C shows an embodiment of a hybrid (bimodal) hearing device (HD) according to the present disclosure comprising an external part (ITE) for acoustically stimulating an ear of a user and an implanted part (IMP) for electrically stimulating the auditory nerve of the (same) ear of the user. The hybrid hearing device comprises a part (ITE) adapted to be inserted into an ear canal of the user to allow acoustic stimulation of the ear drum (Ear drum) from a loudspeaker located in the ITE-part, or in another part of the hearing device in acoustic communication with the ITE-part (e.g. via a tube). The implanted part (IMP) is adapted for being implanted in the head of the user. The hearing device (HD) further comprises a BTE-part adapted for being located at the ear (Ear), e.g. behind the ear (BTE), and a further external part (ANT) electrically connected to the BTE-part via a connecting element CON1. The further external part (ANT) comprises antenna (e.g. an inductor coil) and transceiver circuitry for providing communication between the BTE-part (BTE) and the implanted part (IMP) through the skin (Skin) of the user. The BTE-part is further electrically connected to the ITE-part via a connecting element CON2. The implanted part (IMP) comprises antenna (e.g. an inductor coil) and transceiver circuitry allowing a wireless link to be established between the further external part (ANT) and the implanted part (IMP). The implanted part (IMP) is located under the skin of the user appropriately arranged relative to the further external part (ANT) to allow exchange of data between the two parts and to allow energy from the external part to the implanted part to be transferred. The ITE-part (ITE) and the implanted part (IMP) both comprise electrodes (E1, E2, E3, E4, E5, E6) for picking up (e.g. evoked) potentials (e.g. 
from the brain) or EOG potentials from the eyes of the user via electrical contact to skin and/or tissue (Skin/Tissue) at the ear or in the ear canal (electrodes E1, E2, E3, E4) of the user and to the skin and/or skull-tissue (Tissue) under the skin (electrodes E5, E6) of the user, respectively. The implanted part comprises an electrode array (EA) and a reference electrode (REF) electrically connected to the implanted part (IMP) via electrical connections CON3 and CON4, respectively (e.g. electrical conductors). The reference electrode (REF) is preferably separated from the electrode array (EA) by tissue (Tissue), e.g. in that the electrode array is inserted in cochlea (Cochlea) and the reference electrode is positioned outside cochlea.

(119) In the embodiment of FIG. 8C, the ITE-part comprises an electro-acoustic transducer (e.g. a loudspeaker) for acoustically stimulating a first (e.g. lower) frequency range via sound SED played in the residual volume in front of the ear drum and thereby conveyed to the ear drum and further to cochlea via the middle ear (Middle ear).

(120) In the embodiment of FIG. 8C, the implanted part comprises an electrode array (EA) for electrically stimulating a second (higher) frequency range via separate electric stimuli supplied to at least some of the electrodes of the electrode array when located at the auditory nerve, e.g. in cochlea (Cochlea).

(121) The hybrid hearing aid (HD) of FIG. 8C comprises a receiver in the ear (RITE) type hearing aid and a cochlear implant (CI) type hearing aid. The BTE-part (BTE) adapted for being located behind pinna services the ITE-part as well as the implanted part (IMP) via electrical connections CON2 and CON1, respectively. The BTE part (BTE) comprises two input transducers (here microphones) (M1, M2) each for providing an electric input audio signal representative of an input sound signal from a sound source (S) in the environment of the hearing device. The BTE-part further comprises two wireless receivers (WLR1, WLR2) for providing respective directly received auxiliary audio and/or information signals to/from other devices. The BTE-part (BTE) further comprises a substrate (SUB) whereon a number of electronic components are mounted, functionally partitioned according to the application in question (analogue, digital, passive components, etc.), but including a configurable signal processing unit (SPU), coupled to each other and to input and output units via electrical conductors Wx. The configurable signal processing unit (SPU) is adapted to provide separate first and second enhanced signals to the ITE-part and to the implanted part (IMP) allowing a separate (simultaneous) acoustic and electric stimulation, respectively, at an ear of the user. The ITE part (ITE) comprises an output unit in the form of a loudspeaker (receiver) for converting the first enhanced signal from the BTE-unit to an acoustic signal comprising frequencies in a first frequency range (providing, or contributing to, acoustic signal SED at the ear drum (Ear drum)). The implanted part (IMP) is configured to generate or convey electric stimuli representative of a second frequency range based on the second enhanced signal from the BTE-unit to the electrode array (EA).

(122) The BTE-part (BTE) further comprises a battery (BAT) for energizing electronic components of the hearing aid (e.g. including electronic components of the ITE-part, the ANT-unit and the implanted part (IMP)).

(123) D. Calculating the Auditory Display (Sound to be Presented to the Left and Right Ears) of the Enhanced Sound Scene and Transmitting this to a Pair of Hearing Aids.

(124) a. Calculate auditory display for acoustic output

(125) i. Calculate left and right ear head-related transfer functions (HRTFs) for each of the N acoustic objects, e.g. HRTFs according to the CIPIC database.

(126) ii. Calculate the augmented auditory display by (for each ear) summing the N acoustic objects with the above weighting function, e.g. based on eye-gaze or face recognition, etc.

(127) b. Transmit the left and right output to the receiver for the corresponding ear

(128) i. Via electrical cables

(129) ii. Via wireless communication (e.g. from the sensor-integration device to the hearing instrument) to avoid cables [e.g. BLE, NFR, telecoil]
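
Steps a.i and a.ii above can be sketched in the time domain: each acoustic object (beamformed signal) is filtered with a left and a right head-related impulse response, scaled by its weight, and summed per ear. The impulse responses in the example are toy placeholders, not CIPIC data, and the function names and data layout are assumptions for illustration only.

```python
def convolve(x, h):
    """Full linear convolution of two real sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def auditory_display(objects, hrirs, gains):
    """Left and right ear signals for N weighted acoustic objects.

    objects[n] : mono samples of acoustic object n (e.g. a beamformer output)
    hrirs[n]   : (left_ir, right_ir) impulse responses for the object's direction
    gains[n]   : linear weight, e.g. derived from eye-gaze or face recognition
    """
    length = max(len(sig) + max(len(h_l), len(h_r)) - 1
                 for sig, (h_l, h_r) in zip(objects, hrirs))
    left = [0.0] * length
    right = [0.0] * length
    for sig, (h_l, h_r), gain in zip(objects, hrirs, gains):
        for ear, h_ear in ((left, h_l), (right, h_r)):
            for t, value in enumerate(convolve(sig, h_ear)):
                ear[t] += gain * value
    return left, right


# Toy example: two objects, single-tap placeholder impulse responses.
toy_objects = [[1.0, 0.0], [0.0, 1.0]]
toy_hrirs = [([1.0], [0.5]), ([0.5], [1.0])]
toy_left, toy_right = auditory_display(toy_objects, toy_hrirs, [2.0, 1.0])
```

In a device operating on frequency sub-band signals, the same summation would more likely be carried out per band with HRTF multiplications rather than time-domain convolutions.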

(130) FIG. 9A shows an embodiment of a hearing system comprising a hearing device according to the present disclosure.

(131) The hearing system comprises a hearing device (HD), e.g. a hearing aid, adapted to be worn by a user and configured to generate and update over time data representative of a map of a current environment of the user. The hearing device is configured to localize the user relative to one or more sound sources in the environment of the user in interaction with the map data. The hearing system comprises receiver circuitry for receiving signals from a multitude of signal sources (AIU) and/or from one or more sensors (XS Rx/Tx, SU). The receiver circuitry provides corresponding electric input signals. The multitude of signal sources comprises a number N of localized sound sources (audio signals, sensor signals). The receiver circuitry may e.g. comprise a multitude M of microphones configured to be stationary relative to the user, each microphone being adapted to pick up sound from sound sources in the environment and for providing an electric input sound signal (IN.sub.1, . . . , IN.sub.M) comprising the sound. The audio input unit (AIU) comprises M input units (IU.sub.1, . . . , IU.sub.M), e.g. comprising M input transducers, e.g. microphones or a mixture of input transducers and wireless audio receivers, and a corresponding number of analogue to digital converters and analysis filter banks, as appropriate to provide the electric input sound signals (IN.sub.1, . . . , IN.sub.M) (Audio signals) in a time-frequency representation (as a number of time variant frequency sub-band signals).

(132) The hearing system (HD) further comprises a first processor (1.sup.st PRO) (Map data) for, preferably continuously, estimating localization data relative to the user of the number N of, stationary or mobile, localized sound sources in the environment, such localization data for a given sound source e.g. comprising a direction of arrival of sound and/or a distance from said given sound source to the user, based on the multitude of electric input sound signals from the microphone array, and optionally, additionally, on sensor signals from the one or more sensors (and/or inputs UC from a user interface, UI).

(133) The first processor (1.sup.st PRO) is configured to (continuously, e.g. with a specific frequency/repetition rate) process said electric input signals (audio signals and sensor signals) and provide data representative of an approximate, time dependent map of said confined environment comprising a present location of said number of, stationary or mobile, landmarks, including the N localized sound sources, and an estimate of a present location of the user relative to the number of landmarks in the confined environment. The first processor (1.sup.st PRO) may e.g. produce said map data using a SLAM algorithm.

(134) The sensor unit (SU) may comprise a number of internal sensors and provide a number of corresponding internal sensor signals. The hearing device (HD) may further or alternatively comprise a number of receivers (XS, Rx/Tx, e.g. wireless receivers) for receiving signals from external sensors (e.g. sensors from a smartphone or signals from an FM-transmitter, etc.) and providing corresponding external sensor signals to the sensor unit (SU). The sensor unit (SU) is configured to forward the internal and external sensor signals to the first processor (1.sup.st PRO), cf. bold arrow denoted Sensor signals (external/internal).

(135) The first processor (1.sup.st PRO) is configured to, preferably continuously, estimate one or more preferred sound sources that are of current interest to the user among the sound sources presently in the user's environment (e.g. fully or partially based on a control signal UC from a user interface (UI)). A resulting audio signal RES comprising said sound source (or weighted mixture of sound sources) of current interest to the user is provided by selector/mixer (SEL/MIX) based on the current p beamformed signals Y.sub.BF1, Y.sub.BF2, . . . , Y.sub.BFp (cf. e.g. beams in FIG. 6 directed at sound sources of current interest S1-S5) from the beamformer filtering unit (BFU) and control signal SEL (e.g. based on or influenced by inputs (UC) from the user interface (UI)).

(136) The hearing system further (optionally) comprises a further processor (FP) for enhancing the preferred sound source(s) RES among said N sound sources and providing a processed electric output signal OUT. The processed electric output signal OUT is fed to an output unit (OU) (via synthesis filter bank (FBS) and digital to analogue conversion (DA) circuitry, as appropriate). The output unit provides stimuli perceivable to the user as sound (cf. Sound stimuli-out in FIG. 9B).

(137) FIG. 9B illustrates a second embodiment of a hearing device (HD) according to the present disclosure. The embodiment of FIG. 9B comprises the same functional elements as the embodiment of FIG. 9A. The hearing device, e.g. a hearing aid, (HD) comprises a forward path from a number M of input units (IU.sub.1, . . . , IU.sub.M) for picking up sound or receiving electric signals representing sound (Sound-in, Audio-in, respectively in FIG. 9B) to an output unit (OU) for providing stimuli (Sound stimuli-out) representing said sound and perceivable as sound by a user wearing the hearing device. The forward path further comprises a number M of analogue to digital converters (AD) and analysis filter banks (FBA) operationally coupled to each their input unit (IU.sub.1, . . . , IU.sub.M), where necessary, and providing respective digitized electric input signals IN.sub.1, . . . , IN.sub.M in time-frequency representation, each comprising a number K of frequency sub-band signals IN.sub.1(k,m), . . . , IN.sub.M(k,m), k and m being frequency and time indices, respectively, k=1, . . . , K. The forward path further comprises a beamformer filtering unit (BFU), or weighting unit, receiving the electric input signals IN.sub.1, . . . IN.sub.M as inputs and providing a resulting beamformed signals BF1, . . . , BFp (where p is number of beamformers provided by the hearing device), each providing a weighted combination of the M electric input signals. In other words, BFx=IN.sub.1(k,m)*w.sub.ix(k,m), . . . , IN.sub.M(k,m)*w.sub.Mk(k,m), where w.sub.ix. i=1, . . . , M, are real or complex (in general, time and frequency dependent) weights for a given beamformer BFx (x=1, p). The forward path further comprises a selector/mixer (SEL/MIX) for selecting the signal or combination of signals RES of current interest to the user. The selector (SEL) is controlled by the eye gaze control signal EOGCtr from the control unit (CONT). 
The forward path further comprises a signal processor (FP) for further processing the resulting signal RES, and providing a processed signal OUT. The signal processor (FP) is e.g. configured to apply a level and/or frequency dependent gain or attenuation according to a user's needs (e.g. hearing impairment), and/or to otherwise enhance the signal (RES). The forward path further comprises a synthesis filter bank (FBS) for converting frequency sub-band signals OUT to a single time-domain signal, and optionally a digital to analogue conversion unit (DA) to convert the digital processed time-domain signal to an analogue electric output signal to the output unit (OU).
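The weighted combination performed by the beamformer filtering unit (BFU) can be illustrated with a minimal sketch. It assumes that each electric input signal is available as a nested list of complex values indexed by frequency band k and time frame m; the function name `beamform`, the data layout and the example values are illustrative assumptions and are not taken from the disclosure.

```python
from typing import List, Sequence

# One signal in time-frequency representation, indexed [k][m]
# (k: frequency band, m: time frame). This layout is an assumption.
Tile = List[List[complex]]


def beamform(inputs: Sequence[Tile], weights: Sequence[Tile]) -> Tile:
    """Return BFx(k,m) = sum over i of IN_i(k,m) * w_ix(k,m) for one beamformer x."""
    if not inputs or len(inputs) != len(weights):
        raise ValueError("need one weight set per input signal")
    n_bands = len(inputs[0])
    n_frames = len(inputs[0][0])
    out = [[0j for _ in range(n_frames)] for _ in range(n_bands)]
    for signal, weight in zip(inputs, weights):
        for k in range(n_bands):
            for m in range(n_frames):
                out[k][m] += signal[k][m] * weight[k][m]
    return out


# Hypothetical example: M=2 inputs, K=1 band, two time frames.
in1 = [[1 + 0j, 2 + 0j]]
in2 = [[0 + 1j, 0 + 2j]]
w1 = [[0.5 + 0j, 0.5 + 0j]]
w2 = [[0 - 0.5j, 0 - 0.5j]]
bf = beamform([in1, in2], [w1, w2])
```

Calling `beamform` once per stored weight set would yield the p beamformed signals BF1, . . . , BFp from which the selector/mixer picks the signal RES.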

(138) The hearing device (HD) further comprises a bio signal unit (BSU) for picking up bio signals from the user's body. The bio signal unit (BSU) comprises a sensor part (E.sub.1, E.sub.2, . . . , E.sub.N), adapted for being located at or in an ear and/or for being fully or partially implanted in the head of a user. The sensor part comprises an electrical potential sensor for sensing an electrical potential from the body of the user, in particular from the head, e.g. due to brain activity or eye movement. In FIG. 9B, the sensor part is embodied as electrodes E.sub.1, E.sub.2, . . . , E.sub.N, which are electrodes of the hearing device configured to contact skin or tissue of the user's head (cf. e.g. FIG. 8A, 8B or 8C), when the hearing device is operationally mounted on the user (e.g. in an ear canal) or implanted in the head of the user. The bio signal unit (BSU) further comprises an amplifier (AMP), in the form of electronic circuitry coupled to the electrical potential sensor part to provide an amplified output. The amplifier, e.g. a differential amplifier, receives a number of potentials P.sub.1, P.sub.2, . . . , P.sub.N from the electrodes E.sub.1, E.sub.2, . . . , E.sub.N, and a reference potential P.sub.0 from a reference electrode (REF), and provides respective amplified voltages AV.sub.1, AV.sub.2, . . . , AV.sub.N. The amplified voltages are fed to respective analogue to digital converters (AD) providing digitized amplified voltages DAV.sub.i (i=1, 2, . . . , N). In an embodiment, the amplifier (AMP) includes analogue to digital conversion or is constituted by analogue to digital converters.

(139) In an embodiment, at least one (such as all) of the input units comprises an input transducer, e.g. a microphone, for converting sound (Sound-in) to electric signals representing the sound. In an embodiment, at least one (such as all) of the input units comprises a wireless transceiver, e.g. a wireless receiver, e.g. configured to receive a signal (Audio-in) representative of sound picked up by a remote (wireless) microphone.

(140) The hearing device further comprises a detector unit (DET) comprising a number N.sub.DI of detectors for providing sensor data (signals DIS) representative of a current location of the user and/or of landmarks in the user's environment, e.g. representative of the user's head, in a fixed coordinate system (e.g. relative to a specific location, e.g. a room), or in a movable coordinate system following the user. In an embodiment, the location sensor comprises a head tracker. In an embodiment, the location sensor comprises an accelerometer and a gyroscope. In an embodiment, the location sensor comprises a 9 degree of freedom sensor, comprising a 3D accelerometer, a 3D gyroscope, and a 3D magnetometer. The detector signals DIS are fed to the control unit (CONT) for comparison and processing, e.g. to provide or update the map data.

(141) The hearing device (HD) is configured to receive sensor signals from external sensors, e.g. regarding properties of the local environment, and/or from electromagnetic transmitters, e.g. FM-transmitters, Bluetooth transmitters, etc., e.g. via a wireless link (D-WL) and corresponding antenna and transceiver circuitry or radio receiver(s) (ANT Rx/Tx). In the hearing device (HD), these external sensor signals are denoted DXS (e.g. representing a number NDX of sensors) and are fed to the control unit (CONT), where they are used together with the detector signals DIS from the internal sensors to contribute to the generation or update of the map data.

(142) The hearing device further comprises a wireless transceiver (Rx/Tx) and appropriate antenna circuitry allowing reception of bio signals BioV from and transmission of bio signals BioV to another device, e.g. a contra-lateral hearing device, e.g. amplified voltages AV.sub.1, AV.sub.2, . . . , AV.sub.N, e.g. representing eye movement, via a wireless link (X-WL), cf. waved, arrowed line denoted To/From other devices in FIG. 9B. The bio signals BioV from a contra-lateral hearing device are fed to the calculation unit (CALC) and compared to the corresponding locally generated bio signal(s) DAV.sub.i (e.g. amplified voltages AV.sub.1, AV.sub.2, . . . , AV.sub.N). In an embodiment, the EarEOG signal is a function (f) of a difference between the left and right amplified voltages AV.sub.left and AV.sub.right, EarEOG=f(AV.sub.left - AV.sub.right). In an embodiment, each pair of voltages, AV.sub.1,left and AV.sub.1,right, . . . , AV.sub.N,left and AV.sub.N,right, may provide corresponding ear EOG signals, e.g. EarEOG.sub.1=f(AV.sub.1,left - AV.sub.1,right), . . . , EarEOG.sub.N=f(AV.sub.N,left - AV.sub.N,right). In an embodiment, a resulting ear EOG signal at a given time may be found as an average (e.g. a weighted average; e.g. in dependence of the distance of the electrodes in question from the eyes) of the N ear EOG signals.
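The computation of the ear EOG signals from left and right amplified voltages can be sketched as follows. The disclosure leaves the function f unspecified; the sketch assumes the identity function as a placeholder. The function names, the weights and the example voltages are likewise illustrative assumptions.

```python
from typing import Callable, List, Optional, Sequence


def ear_eog_signals(
    av_left: Sequence[float],
    av_right: Sequence[float],
    f: Callable[[float], float] = lambda d: d,  # assumed placeholder for f
) -> List[float]:
    """Return EarEOG_i = f(AV_i,left - AV_i,right) for each electrode pair i."""
    if len(av_left) != len(av_right):
        raise ValueError("left and right must have the same number of electrodes")
    return [f(left - right) for left, right in zip(av_left, av_right)]


def combined_ear_eog(
    signals: Sequence[float], weights: Optional[Sequence[float]] = None
) -> float:
    """Return the (optionally weighted) average of the N ear EOG signals."""
    if weights is None:
        weights = [1.0] * len(signals)
    if len(weights) != len(signals):
        raise ValueError("need one weight per ear EOG signal")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * s for w, s in zip(weights, signals)) / total


# Hypothetical voltages for N=2 electrode pairs (arbitrary units).
signals = ear_eog_signals([3.0, 5.0], [1.0, 1.0])
plain_average = combined_ear_eog(signals)
# Assumed weights, e.g. favouring the electrode pair closer to the eyes.
weighted_average = combined_ear_eog(signals, weights=[3.0, 1.0])
```

In this example the per-pair signals are 2.0 and 4.0, the plain average is 3.0, and the weighted average is 2.5.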

(143) The bio signal unit (BSU) and the calculation/filtering unit (CALC-FIL) form part of a user interface UIa. The hearing device or system may comprise a further user interface UIb in communication with the control unit (CONT) and e.g. allowing a user to influence the selection of one or more sound sources of current interest to the user (e.g. their mutual weight, if more than one sound source is of interest to the user at a given time), e.g. via an APP of a remote control device, e.g. a smartphone.

(144) The hearing device further comprises a processing unit (PU), equivalent to the 1.sup.st processor (1.sup.st PRO) of FIG. 9A, for providing a control signal for controlling a function of the hearing device based on the EarEOG signal(s), e.g. controlling the beamformer filtering unit (BFU), e.g. selecting one of a number of predefined beamformers in dependence of an eye gaze control signal EOGCtr (here in selector (SEL)). The predefined beamformers (BF1, BF2, . . . , BFp) may e.g. be stored in a memory of the hearing device (e.g. in the control unit CONT, cf. memory MEM), e.g. as sets of beamformer filtering coefficients, each corresponding to a given one of a number of predefined locations of a sound source of interest. The beamformers (and thus the sets of beamformer filtering coefficients or weights) are preferably updated using map data as proposed in the present disclosure. The processing unit (PU) comprises a calculation unit (CALC) configured to combine the sensor data DIS and DXS with the digitized amplified voltages DAV.sub.i (i=1, 2, . . . , N), representative of (ear) EEG and/or (ear) EOG signals, from the (local) bio signal unit (BSU) and received from a bio signal unit (BSU) of another device, e.g. a contra-lateral hearing device (cf. e.g. wireless link X-WL in FIG. 9B), to provide combined map data and resulting locations of the landmarks (current audio sources and the user). The processing unit (PU) may further comprise a Kalman filter (FIL) (or one or more Kalman filters) for filtering the location data and providing eye gaze angles in a fixed or relative coordinate system, cf. EOG data signal EOGD. The EOG data signal EOGD is forwarded to the control unit (CONT). The control unit (CONT) provides the control signal EOGCtr to the selector (SEL) based on the EOG data signal EOGD. The control unit is further configured to determine updated beamformers (e.g. beamformer filtering coefficients) directed at the current location of sound sources relative to the user (in dependence of map data).
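The chain from noisy gaze measurements to a selector decision can be sketched under strong simplifying assumptions. The sketch uses a scalar Kalman filter with a random-walk state model to smooth a single eye gaze angle, and then picks the predefined beamformer whose look direction is closest to the filtered angle. The state model, the noise variances, the nearest-angle rule and all names are assumptions made for illustration; the disclosure does not specify them, and angle wrap-around is ignored.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ScalarKalman:
    """Scalar Kalman filter with a random-walk state model (an assumption)."""

    x: float = 0.0   # estimated gaze angle in degrees
    p: float = 1.0   # variance of the estimate
    q: float = 0.01  # assumed process noise variance
    r: float = 1.0   # assumed measurement noise variance

    def update(self, z: float) -> float:
        # Predict: the angle is assumed unchanged, uncertainty grows by q.
        self.p += self.q
        # Correct: blend prediction and measurement z using the Kalman gain.
        gain = self.p / (self.p + self.r)
        self.x += gain * (z - self.x)
        self.p *= 1.0 - gain
        return self.x


def select_beamformer(angle_deg: float, look_angles_deg: Sequence[float]) -> int:
    """Return the index of the predefined beamformer closest to the gaze angle."""
    return min(
        range(len(look_angles_deg)),
        key=lambda i: abs(look_angles_deg[i] - angle_deg),
    )


# Hypothetical setup: three stored beamformers looking at -45, 0 and +45 degrees.
look_angles = [-45.0, 0.0, 45.0]
kf = ScalarKalman(x=0.0, p=1.0, q=0.0, r=1.0)
filtered = kf.update(10.0)          # first noisy gaze measurement
chosen = select_beamformer(filtered, look_angles)
far_right = select_beamformer(30.0, look_angles)
```

With equal prior and measurement variances, the first update moves the estimate halfway towards the measurement, so a single outlier does not immediately switch the selected beamformer.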

(145) In the embodiment of FIG. 9A or 9B, at least some of the input units (IU.sub.1, . . . , IU.sub.M) may be implemented as microphones (IT.sub.1, . . . , IT.sub.M). The output unit (OU) may e.g. be embodied in one or more of a loudspeaker, a vibrator for a bone-conducting hearing aid, and an electrode array of a cochlear implant hearing aid.

(146) FIG. 10 illustrates a user interface for a hearing system according to the present disclosure. FIG. 10 shows an embodiment of a hearing system implementing a binaural hearing system comprising left and right hearing devices (HD.sub.1, HD.sub.2) (e.g. hearing aids) in communication with an auxiliary device (AD), e.g. a cellphone, the auxiliary device comprising a user interface (UI) for the system, e.g. for viewing (and possibly influencing the selection of sound sources of current interest to the user among) the current sound sources (S.sub.S) in the environment of the binaural hearing system.

(147) The left and right hearing devices (HD.sub.1, HD.sub.2) are e.g. implemented as described in connection with FIG. 9A or 9B. The left and right hearing devices (HD.sub.1, HD.sub.2) and the auxiliary device (AD) each comprise relevant antenna and transceiver circuitry for establishing wireless communication links between the hearing devices (link IA-WL) as well as between at least one of or each of the hearing devices and the auxiliary device (link WL-RF). The antenna and transceiver circuitry in each of the left and right hearing devices necessary for establishing the two links is denoted RF-IA-RX/Tx-l, and RF-IA-RX/Tx-r, respectively, in FIG. 10. In an embodiment, the interaural link IA-WL is based on near-field communication (e.g. on inductive coupling), but may alternatively be based on radiated fields (e.g. according to the Bluetooth standard, and/or be based on audio transmission utilizing the Bluetooth Low Energy standard). In an embodiment, the link WL-RF between the auxiliary device and the hearing devices is based on radiated fields (e.g. according to the Bluetooth standard, and/or based on audio transmission utilizing the Bluetooth Low Energy standard), but may alternatively be based on near-field communication (e.g. on inductive coupling). The bandwidth of the links (IA-WL, WL-RF) is preferably adapted to allow sound source signals (or at least parts thereof, e.g. selected frequency bands and/or time segments) and/or localization parameters identifying a current location of a sound source (or other landmarks, and/or map data) to be transferred between the devices (e.g. to/from the auxiliary device). In an embodiment, processing of the system (e.g. sound source separation) and/or the function of a remote control is fully or partially implemented in the auxiliary device AD (e.g. a smartphone).
In an embodiment, the user interface UI is implemented by the smartphone, possibly running an APP (termed the SLAM app for Sound source localization and selection) allowing the user to control the functionality of the audio processing device via the smartphone, e.g. utilizing a display of the smartphone to implement a graphical interface (e.g. combined with text entry options).

(148) In an embodiment, the binaural hearing system is configured to allow a user to view a location of sound sources in the environment relative to the user (as e.g. shown in FIG. 2) via the user interface (UI) of the smartphone (which is convenient for viewing and interaction via a touch sensitive display, when the smartphone is held in a hand (Hand) of the user (U)). The sound sources LM1-LM5 displayed by the user interface (UI) are determined according to the present disclosure based on map data, e.g. using a SLAM algorithm. In the illustrated example in FIG. 10, the locations of five sound sources LM1, LM2, LM3, LM4 and LM5 are shown. In the example of FIG. 10, the user's movement over time relative to the sound sources is illustrated, and the user interface UI is configured to allow the user to indicate which one (or more) of the shown sound sources is of current interest (e.g. by clicking on the sound source symbol in question, and possibly indicating a weight of each source if more than one is selected). The binaural hearing system (including the auxiliary device) is configured to transmit information about the selected sound source(s) (and possibly their mutual relative weight) to the left and right hearing devices (HD.sub.1, HD.sub.2) to allow them to focus the processed output signals presented to the left and right ears of the user on the selected sound sources.
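How a hearing device might apply the transmitted selection and mutual weights can be sketched as a normalized weighted mix of per-source signals. The sketch assumes that one separated or beamformed signal per displayed source is available as a list of samples keyed by its label, and that the mutual weights are normalized to sum to one. The function name `mix_selected`, the data layout and the example values are illustrative assumptions, not part of the disclosure.

```python
from typing import Dict, List


def mix_selected(
    source_signals: Dict[str, List[float]], selection: Dict[str, float]
) -> List[float]:
    """Mix the selected sources using their mutual weights, normalized to sum to 1."""
    if not selection:
        raise ValueError("at least one sound source must be selected")
    if any(weight < 0 for weight in selection.values()):
        raise ValueError("weights must be non-negative")
    total = sum(selection.values())
    if total == 0:
        raise ValueError("weights must not all be zero")
    length = len(next(iter(source_signals.values())))
    mixed = [0.0] * length
    for label, weight in selection.items():
        signal = source_signals[label]
        for n in range(length):
            mixed[n] += (weight / total) * signal[n]
    return mixed


# Hypothetical per-source signals for three of the displayed sources.
signals = {
    "LM1": [1.0, 1.0],
    "LM2": [0.0, 2.0],
    "LM3": [9.0, 9.0],
}
# The user selected LM1 and LM2 with mutual weights 3:1; LM3 is not selected.
res = mix_selected(signals, {"LM1": 3.0, "LM2": 1.0})
```

Here `res` corresponds to 0.75 times the LM1 signal plus 0.25 times the LM2 signal, while the unselected source LM3 does not contribute.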

(149) It is intended that the structural features of the devices described above, either in the detailed description and/or in the claims, may be combined with steps of the method, when appropriately substituted by a corresponding process.

(150) As used, the singular forms "a," "an," and "the" are intended to include the plural forms as well (i.e. to have the meaning "at least one"), unless expressly stated otherwise. It will be further understood that the terms "includes," "comprises," "including," and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will also be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, but intervening elements may also be present, unless expressly stated otherwise. Furthermore, "connected" or "coupled" as used herein may include wirelessly connected or coupled. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items. The steps of any disclosed method are not limited to the exact order stated herein, unless expressly stated otherwise.

(151) It should be appreciated that reference throughout this specification to "one embodiment" or "an embodiment" or "an aspect" or features included as "may" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the disclosure. Furthermore, the particular features, structures or characteristics may be combined as suitable in one or more embodiments of the disclosure. The previous description is provided to enable any person skilled in the art to practice the various aspects described herein. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects.

(152) The claims are not intended to be limited to the aspects shown herein, but are to be accorded the full scope consistent with the language of the claims, wherein reference to an element in the singular is not intended to mean "one and only one" unless specifically so stated, but rather "one or more." Unless specifically stated otherwise, the term "some" refers to one or more.

(153) Accordingly, the scope should be judged in terms of the claims that follow.

REFERENCES

(154) EP3185590A1 (Oticon) 28 Jun. 2017

[Skoglund et al.; 2017] European patent application number 17179464.7 filed with the European Patent Office on 4 Jul. 2017 and having the title "Direction Of Arrival Estimation In Miniature Devices Using A Sound Sensor Array".

[Farmani et al.; 2015] Mojtaba Farmani, Michael Syskind Pedersen, Zheng-Hua Tan, Jesper Jensen, "Informed TDoA-based direction of arrival estimation for hearing aid applications", IEEE 2015 Global Conference on Signal and Information Processing (GLOBALSIP), pp. 953-957.

EP2813175A2 (Oticon) 17 Dec. 2014

US20140098981A1 (Oticon) 10 Apr. 2014

[Davison; 2003] A. J. Davison, "Real-time simultaneous localisation and mapping with a single camera", in Proceedings of the 9th IEEE International Conference on Computer Vision, pp. 1403-1410, Nice, France, 13-16 Oct. 2003.

[Paz et al.; 2008] L. M. Paz, P. Piniés, J. D. Tardós and J. Neira, "Large-Scale 6-DOF SLAM With Stereo-in-Hand", IEEE Transactions on Robotics, vol. 24, no. 5, pp. 946-957, October 2008.

[Lupton & Sukkarieh; 2008] T. Lupton and S. Sukkarieh, "Removing scale biases and ambiguity from 6DoF monocular SLAM using inertial", in Proceedings of the International Conference on Robotics and Automation (ICRA), pp. 3698-3703, Pasadena, Calif., USA, 2008. IEEE.

[Lorenz & Boyd; 2005] Robert G. Lorenz and Stephen P. Boyd, "Robust Minimum Variance Beamforming", IEEE Transactions on Signal Processing, vol. 53, no. 5, May 2005.

[Richardson & Spivey; 2004] D. C. Richardson and M. J. Spivey, "Eye tracking: Research areas and applications", in: G. E. Wnek, G. L. Bowlin, editors, Encyclopedia of Biomaterials and Biomedical Engineering. London, UK: Taylor & Francis; 2004. pp. 573-582.

CPC classification: G02C11/06; A61N1/36038; G06F3/011; H04R5/04; H04R1/1041; H04R2430/20; H04R2225/39; H04R25/606; H04R5/027; H04R25/407; H04R25/554; H04R2225/55; H04R2460/07; H04R2225/61; G06F3/015; G06F3/013; G06F1/163; H04R25/552; H04R1/1083; H04R25/43

International classification: H04R25/00; G02C11/06; G06F3/01; G06F1/16; H04R1/10