Hearing system configured to localize a target sound source
10945079 · 2021-03-09
CPC classification
H04R1/1091 (ELECTRICITY)
H04R2430/20 (ELECTRICITY)
G10L21/02 (PHYSICS)
H04R25/407 (ELECTRICITY)
International classification
G10K11/34 (PHYSICS)
G10L21/02 (PHYSICS)
Abstract
A hearing system is adapted to be worn by a user and configured to capture sound in an environment of the user and comprises a) a sensor array comprising M input transducers for providing M electric input signals representing said sound and having a known geometrical configuration relative to each other; b) a detector unit for detecting movements over time of the hearing system, and providing location data of said sensor array at different points in time t, t=1, . . . , N; c) a first processor for receiving said electric input signals and, in case said sound comprises sound from a localized sound source S, for extracting sensor array configuration specific data τ_ij of said sensor array indicative of differences between a time of arrival of sound from said localized sound source S at said respective input transducers, at said different points in time t, t=1, . . . , N; and d) a second processor configured to estimate data indicative of a location of said localized sound source S relative to the user based on corresponding values of said location data and said sensor array configuration data at said different points in time t, t=1, . . . , N.
Claims
1. A hearing system adapted to be worn by a user and configured to capture sound in an environment of the user, the hearing system comprising a sensor array of M input transducers, where M≥2, each for providing an electric input signal representing said sound in said environment, said input transducers p_i, i=1, . . . , M, of said array having a geometrical configuration relative to each other, when worn by the user, and a detector unit for detecting movements over time of the hearing system when worn by the user, and providing location data of said sensor array at different points in time t, t=1, . . . , N; a first processor for receiving said electric input signals and, in case said sound comprises sound from a localized sound source S, for extracting sensor array configuration specific data τ_ij of said sensor array indicative of differences between a time of arrival of sound from said localized sound source S at said respective input transducers, at said different points in time t, t=1, . . . , N; a second processor configured to estimate data indicative of a location of said localized sound source S relative to the user based on corresponding values of said location data and said sensor array configuration specific data at said different points in time t, t=1, . . . , N.
2. A hearing system according to claim 1 wherein the detector unit is configured to detect rotational and/or translational movements of the hearing system.
3. A hearing system according to claim 1 wherein said data indicative of a location of said localized sound source S relative to the user at said different points in time t, t=1, . . . , N constitutes or comprises a direction of arrival of sound from said sound source S.
4. A hearing system according to claim 1 wherein said data indicative of a location of said localized sound source S relative to the user at said different points in time t, t=1, . . . , N comprises coordinates of said sound source relative said user, or direction of arrival of sound from and distance to said sound source relative said user.
5. A hearing system according to claim 1 wherein said detector unit comprises a number of IMU-sensors including at least one of an accelerometer, a gyroscope and a magnetometer.
6. A hearing system according to claim 5 wherein at least one of said IMU-sensors is located in a separate device.
7. A hearing system according to claim 1 wherein said second processor is configured to estimate data indicative of a location of said localized sound source S relative to the user based on the following expression for stacked residual vectors r(S.sup.e) originating from said time instances t=1, . . . , N
r(S^e)=y_t^ij−h_ij(S^e,R_t,T_t^e), where S^e represents the position of said sound source in an inertial frame of reference, R_t and T_t^e are matrices describing a rotation and a translation, respectively, of the sensor array with respect to the inertial frame at time t, and y_t^ij=τ_ij+e_t represents said sensor array configuration specific data, where τ_ij represents said differences between a time of arrival of sound from said localized sound source S at said respective input transducers i, j, and e_t represents measurement noise, where (i,j)=1, . . . , M, j>i, and wherein h_ij is a model of the time differences τ_ij between each microphone pair p_i and p_j.
8. A hearing system according to claim 7 wherein the second processor is configured to solve the problem represented by the stacked residual vectors r(S.sup.e) in a maximum likelihood framework.
9. A hearing system according to claim 7 wherein the second processor is configured to solve the problem represented by the stacked residual vectors r(S.sup.e) using an Extended Kalman filter (EKF) algorithm.
10. A hearing system according to claim 1 comprising first and second hearing devices, adapted to be located at or in left and right ears of the user, or to be fully or partially implanted in the head at the left and right ears of the user, each of the first and second hearing devices comprising at least one input transducer for providing an electric input signal representing sound in said environment, at least one output transducer for providing stimuli perceivable to the user as representative of said sound in the environment, wherein said at least one input transducer of said first and second hearing devices constitutes or forms part of said sensor array.
11. A hearing system according to claim 10 wherein each of the first and second hearing devices comprises circuitry for wirelessly exchanging said electric input signals, or parts thereof, with the other hearing device, and/or with an auxiliary device.
12. A hearing system according to claim 10 wherein the first and second hearing devices are constituted by or comprise respective first and second hearing aids.
13. A hearing system according to claim 1 comprising a hearing aid, a headset, an earphone, an ear protection device or a combination thereof.
14. A hearing system according to claim 1 comprising an auxiliary device comprising said second processor.
15. A hearing system according to claim 1 comprising a carrier configured to carry at least some of the M input transducers of the sensor array, wherein the carrier has a dimension larger than 0.10 m.
16. A hearing system according to claim 15 wherein the carrier may be configured to carry at least some of the sensors of the detector unit.
17. A hearing system according to claim 1 wherein the number M of input transducers is larger than or equal to 8.
18. A hearing system according to claim 1 comprising one or more cameras.
19. A hearing system according to claim 1 comprising a number of EOG sensors or an eye tracking camera for eye-tracking, and a scene camera for Simultaneous Localization and Mapping (SLAM) combined with a number of Inertial Measurements Units (IMUs) for motion tracking/head rotation.
20. A method of operating a hearing system adapted to be worn by a user and configured to capture sound in an environment of the user, when said hearing system is operationally mounted on the user, the hearing system comprising a sensor array of M input transducers, where M≥2, each for providing an electric input signal representing said sound in said environment, said input transducers p_i, i=1, . . . , M, of said array having a geometrical configuration relative to each other, when worn by the user, the method comprising detecting movements over time of the hearing system when worn by the user, and providing location data of said sensor array at different points in time t, t=1, . . . , N; when said sound comprises sound from a localized sound source S, extracting sensor array configuration specific data τ_ij of said sensor array indicative of differences between a time of arrival of sound from said localized sound source S at said respective input transducers, at said different points in time t, t=1, . . . , N, from said electric input signals; and estimating data indicative of a location of said localized sound source S relative to the user based on corresponding values of said location data and said sensor array configuration specific data at said different points in time t, t=1, . . . , N.
Description
BRIEF DESCRIPTION OF DRAWINGS
(1) The aspects of the disclosure may be best understood from the following detailed description taken in conjunction with the accompanying figures. The figures are schematic and simplified for clarity, and they show only details which are essential to the understanding of the claims, while other details are left out. Throughout, the same reference numerals are used for identical or corresponding parts. The individual features of each aspect may each be combined with any or all features of the other aspects. These and other aspects, features and/or technical effects will be apparent from and elucidated with reference to the illustrations described hereinafter.
(13) Further scope of applicability of the present disclosure will become apparent from the detailed description given hereinafter. However, it should be understood that the detailed description and specific examples, while indicating preferred embodiments of the disclosure, are given by way of illustration only. Other embodiments may become apparent to those skilled in the art from the following detailed description.
DETAILED DESCRIPTION OF EMBODIMENTS
(14) The detailed description set forth below in connection with the appended drawings is intended as a description of various configurations. The detailed description includes specific details for the purpose of providing a thorough understanding of various concepts. However, it will be apparent to those skilled in the art that these concepts may be practiced without these specific details. Several aspects of the apparatus and methods are described by various blocks, functional units, modules, components, circuits, steps, processes, algorithms, etc. (collectively referred to as elements). Depending upon particular application, design constraints or other reasons, these elements may be implemented using electronic hardware, computer program, or any combination thereof.
(15) The electronic hardware may include microprocessors, microcontrollers, digital signal processors (DSPs), field programmable gate arrays (FPGAs), programmable logic devices (PLDs), gated logic, discrete hardware circuits, and other suitable hardware configured to perform the various functionality described throughout this disclosure. Computer program shall be construed broadly to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executables, threads of execution, procedures, functions, etc., whether referred to as software, firmware, middleware, microcode, hardware description language, or otherwise.
(16) The present application relates to the field of hearing devices, e.g. hearing aids, and to hearing systems, e.g. binaural hearing aid systems.
(17) Direction of Arrival (DOA) estimation and source-location estimation are becoming increasingly important. Examples include power saving and user tracking in WiFi access points and mobile cell towers, and detection and tracking of acoustic sources. With modern array processing techniques, applications such as Massive Multiple Input Multiple Output (M-MIMO) and Active Electronically Scanned Array (AESA) radars can steer the output energy or the antenna sensitivity in the desired direction. Both AESA and M-MIMO are based on planar arrays yielding directionality in azimuth and elevation. However, some systems are limited to linear arrays for computing the DOA, e.g. binaural hearing aid systems (HAS), which use one microphone per ear, and towed arrays in deep-sea exploration; such linear arrays can only estimate one angle.
(18) In this disclosure, linear arrays with two or more sensors receiving a signal from a source are considered. When the sensors are equidistantly spaced, a so-called uniform linear array (ULA) is obtained, which gives a uniform spatial sampling of the wavefield. This sampling eases non-parametric narrowband DOA methods, such as MUltiple SIgnal Classification (MUSIC) and Minimum Variance Distortionless Response (MVDR), as they seek the direction with the strongest power.
(19) To overcome the limitations of linear arrays, several methods have been proposed to estimate the 3D source direction or its full position. A chest-worn planar microphone array may be used to estimate the direction, while Head-Related Transfer Functions (HRTFs) may be used to estimate the position.
(20) The proposed methods utilize the geometrical properties of the array when subject to motion. The aperture is the space occupied by the array, and the simple idea utilized here is that the motion of the array synthesizes a larger aperture. A nonlinear least-squares (NLS) formulation utilizing known motion is proposed, together with two sequential solutions. The formulation is extended to include uncertainty in the motion, allowing estimation of the source locations and the motion simultaneously.
(23) The setting illustrated in
(24) When the sources are not perpendicular to the array, the distance between the sensors and the source will be different resulting in a time difference in the received signals. With known speed of the medium (here e.g. air), the time difference can be converted to a distance and with known separation between the sensors, the angle to the source can be calculated.
(26) For simplicity, a free-field assumption is made, which results in unobstructed waves impinging on the array. It is also assumed that the wave-front is planar.
(27) When the sensors are not necessarily equidistantly spaced, the DOA on a linear sensor array can be computed from

(28) sin(θ) = c·τ_ij/‖p_i − p_j‖   (1)

(29) where θ ∈ [−90°, 90°] is the DOA, τ_ij is the time difference between the signal at each sensor p_i and p_j with distance ‖p_i − p_j‖, and c is the transmission speed of the medium (e.g. air). Time difference measurements can for instance be obtained with time-domain methods based on the Generalized Cross-Correlation (cf. e.g. [Knapp & Carter; 1976]).
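For illustration, eq. (1) can be inverted to map a measured time difference to a DOA. The following is a minimal sketch (not part of the patent text); the function name doa_from_tdoa and the example numbers are illustrative, and the speed of sound in air is taken as approximately 343 m/s:

```python
import numpy as np

def doa_from_tdoa(tau_ij, sensor_distance, c=343.0):
    """DOA angle from a time-difference measurement, inverting eq. (1).

    tau_ij: measured time difference of arrival [s] between sensors i and j.
    sensor_distance: ||p_i - p_j|| [m].
    c: propagation speed of the medium [m/s].
    Returns the DOA in degrees, in [-90, 90].
    """
    s = np.clip(c * tau_ij / sensor_distance, -1.0, 1.0)  # guard against noise pushing |sin| > 1
    return np.degrees(np.arcsin(s))

# Example: 0.3 m microphone spacing (roughly ear-to-ear), 0.4 ms delay
print(doa_from_tdoa(4e-4, 0.3))  # approx. 27 degrees off broadside
```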
(30) A common setting is to consider the array and the source as all lying in the same plane (e.g. the xy-plane). Here, the array is considered in ℝ³ and the source as a point in the same space.
(31) The source direction then has two degrees of freedom (DOF), namely the azimuth (φ) and polar (or elevation) (ψ) angles.
(32) A body-fixed coordinate frame (b) containing the array, in which the sensor nodes are located, with X^b ∈ ℝ³, is defined. The orientation of the b frame with respect to an inertial frame of reference (e) is described with a rotation matrix R ∈ {R ∈ ℝ^(3×3) : det R = 1, R^T = R^(−1)}. Hence, for pure orientation changes, vectors between these frames are related by X^b = R·X^e and, trivially, X^e = R^(−1)·X^b = R^T·X^b. Denote the translation, i.e. the position, of the array vector by T^e ∈ ℝ³ and the position of the point source by S^e ∈ ℝ³; then the source expressed in the b frame is

S^b = R(S^e − T^e).   (2)
(33) This rigid body transformation of the array vector and the position of the source is illustrated in the figures.
(35) Let the pairwise difference between the M nodes be denoted by X_ij^b = p_i − p_j ∈ ℝ³, (i, j) = 1, . . . , M, j > i. The DOA in the b-frame is given by the scalar product between the vectors X_ij^b and S^b. Using eq. (1), the time difference measurement can be expressed as

(36) τ_ij ≈ h_ij(S^e, R, T^e) = (X_ij^b)^T·R(S^e − T^e)/(c·‖S^e − T^e‖)   (3)

(37) where h_ij is a model of the time differences τ_ij between each microphone pair p_i and p_j. Thus, the time difference between each node pair can be expressed as a nonlinear function of the source position, the array length, its position and orientation. Furthermore, with S^e = [x, y, z], the azimuth and elevation angles can be defined as

(38) φ = arctan(y/x) and ψ = arctan(z/√(x² + y²)),

respectively.
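The model of eqs. (2)-(3) and the angle definitions translate directly into code. Below is a minimal sketch, assuming the reconstructed form of eq. (3) above; the helper names h_ij and azimuth_elevation are illustrative:

```python
import numpy as np

def h_ij(S_e, R, T_e, p_i, p_j, c=343.0):
    """Predicted time difference tau_ij between sensors i and j (eq. (3))."""
    S_b = R @ (np.asarray(S_e) - np.asarray(T_e))   # eq. (2): source in the b frame
    X_ij = np.asarray(p_i) - np.asarray(p_j)        # pairwise node difference X_ij^b
    return X_ij @ S_b / (c * np.linalg.norm(S_b))   # scalar product, scaled by c and range

def azimuth_elevation(S_e):
    """Azimuth phi and elevation psi of S_e = [x, y, z], in radians (cf. eq. (38))."""
    x, y, z = S_e
    return np.arctan2(y, x), np.arctan2(z, np.hypot(x, y))

# Example: source 5 m ahead and 2 m up; array at the origin with 0.3 m spacing
R = np.eye(3)
tau = h_ij([5.0, 0.0, 2.0], R, [0.0, 0.0, 0.0], [0.15, 0.0, 0.0], [-0.15, 0.0, 0.0])
print(tau, azimuth_elevation([5.0, 0.0, 2.0]))
```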
(39) The unknown variable S^e only has two DOF, since distance is not observed, and it is therefore convenient to assume ‖S^e‖ = 1. In this case, the DOA measurements and the measurement function correspond to a system of nonlinear equations.
(40) Rotation only: If there is no translation, i.e., T_t^e = 0, t = 1, . . . , N, then the distance to the source cannot be found. Hence, S^e has two DOF and can only be determined up to an unknown scale. In the case that there is only one measurement, N = 1, the nonlinear system is underdetermined, since max rank H = 1. In the case N ≥ 2, there exists a search direction, by the corresponding normal equations, only if rank H = 2, since this is also the DOF of the unknown parameter S^e. The rank of the Jacobian is a function of the rotation and the location of the source.
(41) As discussed earlier, the general DOA problem has geometrical ambiguities resulting in rotational invariance for certain configurations. This invariance means that DOA remains the same since the relative distance to the source is not changed by the rotation.
(42) A rotation around the DOA array itself corresponds to a change in pitch. This is because any vector is rotationally invariant to rotations around its own axis, i.e., X^b = R(X^b)·X^b, where R(X^b) denotes a rotation around the vector X^b. Thus, for rotations around the DOA array, the two angles to the source cannot be resolved.
(43) Rotation and translation: When there is translation of the array, all three DOF of S^e can be considered on the basis of triangulation. Assume that X^b undergoes known rotation and translation {R_t, T_t^e, t = 1, . . . , N} and that there is a set of DOA measurements, as before. The corresponding measurement function (3) is parametrized by h(S^e, R_t, T_t^e). The basic requirement is that the number of measurements is greater than or equal to the DOF, i.e., N ≥ 3. The motion resulting in rank H < 3, from which a search direction cannot be found, is translation along vectors parallel to S^e − T^e, with any rotation. This result is immediate from (2), since the only information about S^e that affects the measurements (3) is related to orientation changes. From the discussion above, it was established that orientation can only contribute to finding two DOF of S^e. The intuition is that such motion does not result in any parallax, which is needed for triangulation.
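The rank conditions for the rotation-only and rotation-and-translation cases can be checked numerically. The sketch below is illustrative and not from the patent; it builds the stacked Jacobian H by central differences and evaluates its rank for the two motion types. The names tau_model and jacobian_rank are hypothetical, and the tolerance is chosen so that finite-difference noise is not mistaken for a third observable direction:

```python
import numpy as np

def tau_model(S_e, R, T_e, X_ij, c=343.0):
    """Modeled time difference h_ij (eq. (3)) for one pair at one pose."""
    S_b = R @ (S_e - T_e)
    return X_ij @ S_b / (c * np.linalg.norm(S_b))

def jacobian_rank(S_e, poses, X_ij, eps=1e-6, tol=1e-10):
    """Numerical rank of the stacked Jacobian H = dr/dS^e over a pose trajectory."""
    rows = []
    for R, T_e in poses:
        g = np.zeros(3)
        for k in range(3):
            d = np.zeros(3)
            d[k] = eps
            g[k] = (tau_model(S_e + d, R, T_e, X_ij)
                    - tau_model(S_e - d, R, T_e, X_ij)) / (2 * eps)
        rows.append(g)
    return np.linalg.matrix_rank(np.vstack(rows), tol=tol)

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

S_e = np.array([10.0, 5.0, 2.0])
X_ij = np.array([0.3, 0.0, 0.0])                 # array vector along x
rot_only = [(rot_z(a), np.zeros(3)) for a in np.linspace(0.0, 0.5, 5)]
rot_trans = [(rot_z(a), np.array([0.0, 0.05, 0.05]) * i)
             for i, a in enumerate(np.linspace(0.0, 0.5, 5))]
print(jacobian_rank(S_e, rot_only, X_ij))        # 2: direction only, scale unobservable
print(jacobian_rank(S_e, rot_trans, X_ij))       # 3: full position observable
```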
(44) Estimation:
(45) Assume that all rotations and translations (the pose trajectory) {R_t, T_t^e, t = 1, . . . , N} of the array vector X^b are available (e.g. from movement monitoring sensors, such as IMUs), and that there is a corresponding set of time difference measurements (e.g. based on maximizing respective correlation estimates between the signals in question)

{y_t^ij = τ_ij + e_t, (i, j) = 1, . . . , M, j > i, t = 1, . . . , N}
(46) Here y_t^ij is the measurement at the i-th node compared to node j at time t, such that j > i, and e_t is noise. The collection of measurements at each time t is called a snap-shot. With a stationary source S^e, the stacked residual vector for one time instant t can be written as

(47) r_t(S^e) = [y_t^12 − h_12(S^e, R_t, T_t^e), . . . , y_t^(M−1)M − h_(M−1)M(S^e, R_t, T_t^e)]^T   (4)
(48) By stacking the N residual vectors (for t = 1, . . . , N), we obtain

r(S^e) = [r_1(S^e)^T, . . . , r_N(S^e)^T]^T   (5)

(49) where r(S^e) ∈ ℝ^(B×1) and B = N·Σ_{i=1}^{M−1} i. The squared norm of (5) is

V(S^e) = ‖r(S^e)‖₂²   (6)
(50) which is a nonlinear least-squares (NLS) formulation. NLS problems are readily solved using e.g. the Levenberg-Marquardt (LM) method, cf. e.g. [Levenberg; 1944], [Marquardt; 1963]. LM uses only gradient information to perform a quasi-Newton search. The gradient of (6) is

(51) ∇V(S^e) = 2·H^T·r(S^e)

(52) where H is the Jacobian, i.e., the matrix of first-order partial derivatives

(53) H = ∂r(S^e)/∂S^e.
(54) It is also preferable to use a weighting strategy for the NLS problem, taking into account that the measurement noise may vary over time and/or differ between measurements. The corresponding residuals in (6) are then weighted by the inverse of the measurement covariance, r_i^T·R_i^(−1)·r_i, or for the whole batch as

V_R(S^e) = ‖r(S^e)‖²_(R^(−1))   (7)

(55) where R = diag(R_1, . . . , R_B). When the measurement errors are Gaussian, e_t ∼ N(0, R), the cost function (7) corresponds to the Maximum Likelihood (ML) criterion.
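Assuming the reconstructed measurement model of eq. (3), the weighted NLS problem (7) can be handed to a standard solver. The sketch below uses scipy.optimize.least_squares with the Levenberg-Marquardt method on a synthetic trajectory that mirrors the stationary-target example given later; all function names (h, residual, rpy) are illustrative, not the patent's:

```python
import numpy as np
from scipy.optimize import least_squares

C = 343.0  # propagation speed in air [m/s]

def h(S_e, R, T_e, X_ij):
    """Modeled time difference (eq. (3)) for one sensor pair at one pose."""
    S_b = R @ (S_e - T_e)                        # eq. (2): source in the body frame
    return X_ij @ S_b / (C * np.linalg.norm(S_b))

def residual(S_e, poses, X_pairs, y, sigma):
    """Weighted stacked residual of eqs. (5) and (7)."""
    return np.array([(y[t][k] - h(S_e, R, T_e, X)) / sigma
                     for t, (R, T_e) in enumerate(poses)
                     for k, X in enumerate(X_pairs)])

def rpy(a):
    """Rotation matrix for equal roll = pitch = yaw = a [rad]."""
    c, s = np.cos(a), np.sin(a)
    Rx = np.array([[1, 0, 0], [0, c, -s], [0, s, c]])
    Ry = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
    Rz = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

# Synthetic RT trajectory: 31 snapshots, 1 degree/step rotation, 1 cm/step translation
rng = np.random.default_rng(0)
S_true = np.array([10.0, 10.0, 10.0])
X_pairs = [np.array([0.3, 0.0, 0.0])]            # one microphone pair, 0.3 m apart
poses = [(rpy(np.radians(t)), np.array([0.0, 0.01, 0.01]) * t) for t in range(31)]
sigma = 1e-5                                     # time-difference noise std [s]
y = [[h(S_true, R, T, X) + sigma * rng.standard_normal() for X in X_pairs]
     for R, T in poses]

fit = least_squares(residual, x0=np.array([8.0, 12.0, 9.0]),
                    args=(poses, X_pairs, y, sigma), method='lm')
print(fit.x)  # direction is recovered well; range only weakly (small parallax)
```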
(56) The array is said to be unambiguous if the spatial distribution of the nodes yields a well-defined estimation problem. It turns out that there are two motions for which the array is ambiguous and S^e cannot be estimated. The first is rotation only (RO), for which only the source direction can be found, as long as the rotation is not around the array axis. The second is rotation and translation (RT) of the array. From such general motion, the source location is implicitly triangulated by the NLS solution, as long as the translation is non-parallel to S^e − T^e.
(57) Target tracking and SLAM: With the NLS problem defined for a stationary source and known motion of the array, it is straightforward to define more challenging cases. If the source is allowed to move, then the parameter S^e is changed to be time-varying, S_t^e, t = 1, . . . , N, in eq. (6), and the problem becomes one of target tracking. This is not well-defined, since there are more DOFs in the parameter than can be obtained from the measurements. A remedy may be to include a dynamic model of the parameter in the residual.
(58) r_dyn(X) = X_{t+1} − F·X_t, weighted by Q^(−1),   (8)

(59) where

X_{t+1} = vec(S_i^e), i = 2, . . . , N+1,  F = I_{3N},  X_t = vec(S_i^e), i = 1, . . . , N,

(60) and Q is a diagonal covariance matrix of appropriate dimension. In an embodiment, Q is large.
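One possible way to realize this is to stack the measurement residual with the weighted motion-model residual of eq. (8). A minimal sketch under the constant-position assumption F = I; the names augmented_residual, meas_residual and q_std are hypothetical placeholders:

```python
import numpy as np

def augmented_residual(X, meas_residual, q_std):
    """Augment the measurement residual (eq. (5)) with the dynamic-model
    residual X_{t+1} - F X_t of eq. (8), here with F = I (constant position),
    weighted by the inverse standard deviation of the process noise."""
    S = X.reshape(-1, 3)                  # time-varying source positions S_t^e
    r_meas = meas_residual(S)             # user-supplied stacked residual, as in eq. (5)
    r_dyn = (S[1:] - S[:-1]).ravel() / q_std
    return np.concatenate([r_meas, r_dyn])
```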
(61) When there is uncertainty in both the position of the sources and the motion of the array, a Simultaneous Localization and Mapping (SLAM) problem is obtained. The Maximum Likelihood (ML) version of SLAM does not consider any motion model, and thus the following NLS problem is obtained

V_R(S_k^e, T_t^e, R_t) = ‖r(S_k^e, T_t^e, R_t)‖²_(R^(−1))   (9)

where there are K stationary sources S_k^e, k = 1, . . . , K. This kind of formulation is common in computer vision, where it is called Bundle Adjustment.
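In such a SLAM/Bundle-Adjustment formulation, all unknowns are stacked into a single NLS parameter vector. The sketch below shows one possible packing; the layout and helper names are assumptions for illustration, not the patent's:

```python
import numpy as np

def pack(S, T, angles):
    """Pack K source positions, N array positions and N orientation
    parameters (e.g. roll/pitch/yaw per snapshot) into one NLS vector."""
    return np.concatenate([np.ravel(S), np.ravel(T), np.ravel(angles)])

def unpack(x, K, N):
    S = x[:3 * K].reshape(K, 3)                  # S_k^e, k = 1..K
    T = x[3 * K:3 * K + 3 * N].reshape(N, 3)     # T_t^e, t = 1..N
    angles = x[3 * K + 3 * N:].reshape(N, 3)     # orientation parameters per snapshot
    return S, T, angles
```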
(62) Sequential solutions: In many applications it is desired to process data in an on-line fashion. By construction, NLS is an off-line solution, but sequential recursive methods are easily derived from it. A well-known algorithm is the Extended Kalman filter (EKF) [Jazwinski; 1970], which can be viewed as a special case of NLS without iterations. This naturally leads to iterated solutions, which in general result in increased performance. In order to compute a search direction in the RO case, at least two snapshots are needed at each update. Similarly, at least three snapshots are needed in the RT case.
(63) Sequential Nonlinear Least-Squares: A simple sequential NLS (S-NLS) solution can be obtained as follows. Given an initial guess x^0 of the unknown parameter x, then, for an appropriate number of snapshots, iterate

x_{i+1} = x_i + α_i·(H^T·H)^(−1)·H^T·r   (10)

(64) until convergence. Here H and r are parametrized by the current iterate x_i, and α_i ∈ [0, 1] is a step-size, which can be computed with e.g. backtracking. In the RO case (x = S^e), x can only be estimated up to scale, and the estimate should therefore be normalized at each iteration as

(65) x_{i+1} ← x_{i+1}/‖x_{i+1}‖   (11)
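Eqs. (10) and (11) amount to a damped Gauss-Newton step with optional renormalization. A minimal sketch; the name snls_step is illustrative:

```python
import numpy as np

def snls_step(x, r, H, alpha=0.5, normalize=False):
    """One S-NLS iterate, eq. (10): x <- x + alpha * (H^T H)^{-1} H^T r.
    With normalize=True the estimate is projected to unit norm (eq. (11)),
    as required in the rotation-only case where only direction is observable.
    Assumes H^T H is invertible, i.e. the motion gives a full-rank Jacobian."""
    x = x + alpha * np.linalg.solve(H.T @ H, H.T @ r)
    return x / np.linalg.norm(x) if normalize else x
```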
(66) Iterated Extended Kalman filter: State space models are an important tool as they admit dynamic assumptions on the otherwise stationary parameter through a process model. As usual, the state is assumed to evolve according to some process model
x_{t+1} = f(x_t, w_t),   (12)
(67) where w_t is process noise. The iterated Extended Kalman filter (IEKF) can be seen as an NLS solver for state space models. The IEKF generally obtains smaller residual errors and is preferable to the standard EKF when the nonlinearities are severe and computational resources are available. The iterations are performed in the measurement update, where the maximum a posteriori (MAP) cost function is minimized with respect to the unknown state. The cost function can be used to ensure cost decrease and to decide when the iterations should terminate. A basic version of the measurement update in the IEKF is summarized in Algorithm 1; for a complete description and other options, the reader is referred to the literature.
(68) Algorithm 1 Iterated Extended Kalman Measurement Update:
(69) Require an initial state, x̂_{0|0} = x^0, and an initial state covariance, P̂_{0|0}.
(70) 1. Measurement update iterations
(71) K_i = P̂_{t|t−1}·H_i^T·(H_i·P̂_{t|t−1}·H_i^T + R)^(−1)   (13a)
x_{i+1} = x̂_{t|t−1} + K_i·(y_t − h(x_i) − H_i·(x̂_{t|t−1} − x_i))   (13b)
(72) 2. Update the state and the covariance
x̂_{t|t} = x_{i+1},   (14a)
P̂_{t|t} = (I − K_i·H_i)·P̂_{t|t−1}   (14b)
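A direct transcription of Algorithm 1 with the reconstructed iteration (13) might look as follows. This is a sketch of the generic IEKF measurement update with illustrative names, not the patent's reference implementation:

```python
import numpy as np

def iekf_measurement_update(x_pred, P_pred, y, h, H_jac, R, n_iter=10):
    """Iterated EKF measurement update (Algorithm 1, eqs. (13)-(14)).

    x_pred, P_pred: predicted state and covariance; y: measurement vector;
    h(x): measurement model; H_jac(x): its Jacobian; R: measurement covariance.
    """
    x_i = x_pred
    for _ in range(n_iter):
        H = H_jac(x_i)
        K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)    # eq. (13a)
        x_next = x_pred + K @ (y - h(x_i) - H @ (x_pred - x_i))   # eq. (13b)
        converged = np.allclose(x_next, x_i)
        x_i = x_next
        if converged:
            break
    P_upd = (np.eye(len(x_pred)) - K @ H) @ P_pred                # eq. (14b)
    return x_i, P_upd                                             # eq. (14a)
```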
(73) Example (Stationary Target):
(74) With a stationary target initialized at S^e = [10, 10, 10]^T + w, where w ∼ N(0_{3×1}, I_3), the cases of rotation only (RO) and rotation and translation (RT) are evaluated in a Monte Carlo (MC) fashion. For each case, the measurements are from an array with M = 2 and ‖p_1 − p_2‖ = 0.3, giving y_t = τ_12 + e_t, t = 1, . . . , 31, where e_t ∼ N(0, 0.01). The rotation sequence is given by a roll, pitch and yaw motion as R_t = [0, 0, 0]^T → [30, 30, 30]^T [°] in increments of one degree. The translation sequence is T_t^e = [0, 0, 0]^T → [0, 0.3, 0.3]^T [m] in increments of 0.01 m for the yz coordinates. For both cases, twenty runs were made, and all estimators were run until no significant progress could be made. The dynamic model used in the IEKF is constant position, x_{t+1} = x_t + w_t, where w_t ∼ N(0, Q = 0.01·I_3). The measurement covariance is R = 0.01·I, where I is either I_2 for RO or I_3 for RT. For all three methods, a fixed step size α = 0.5 was chosen, and the initial point in each MC iterate was (S^e)^0 = S^e + w^init, where w^init ∼ N(0, 0.5²·I_3). Table 1 shows the RMSE over the MC estimation results from the proposed methods on the two cases. All three methods work well and, as expected, the two sequential solutions perform slightly worse than NLS.
(75) TABLE 1: RMSE of estimates obtained with the proposed methods for the case of rotation only (RO) and the case of rotation and translation (RT).

Case   NLS      S-NLS    IEKF
RO     0.0069   0.1526   0.2222
RT     0.5737   0.7298   0.6762
(76) Example (Fixed Microphone Distance):
(77) The direction of arrival (DOA) of a soundwave, assumed to be a free-field, planar wave front, impinging on the array can be described by

(78) sin(θ) = (X^b)^T·R·(S^e − T^e)/(‖X^b‖·‖S^e − T^e‖)

(79) where θ represents the DOA, R is the 3D orientation of the array, and S^e (= (x_s, y_s, z_s)) is the position of the sound source.
(80) r(S^e) = [y_1 − θ_1(S^e), . . . , y_N − θ_N(S^e)]^T, where θ_t(S^e) denotes the modeled DOA at pose t,

(81) and where the y's are the DOA measurements found via e.g. delay-and-sum or other beamforming methods. Then the two-norm of the residual vector r(S^e) can be minimized in two scenarios:
1. Given two or more DOA measurements from distinct orientations, which are not rotations around the array axis X^b, the corresponding equation system can be solved with respect to S^e. In this scenario, only the direction, θ, to the source can be found, i.e., not the distance. This method requires that the orientation of the array can be computed. This can be done using inertial measurement units (IMU), e.g. a 3D-gyroscope and/or a 3D-accelerometer.
2. Given three or more DOA measurements at distinct positions, where the translation is not along the DOA vector, the corresponding equation system can be solved with respect to S^e. In this scenario, the full three degrees of freedom of the system can be found. This method requires that the position of the array can be computed. This can be done using the IMU over short time intervals.
(82) The minimization procedure can be any nonlinear least squares (NLS) method such as Levenberg-Marquardt or standard NLS with line-search.
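Scenario 1 above can be illustrated end-to-end: given time-difference measurements from a few distinct orientations (none around the array axis), a normalized Gauss-Newton iteration recovers the source direction but not the distance. All names and numbers in this sketch are illustrative, and the measurements are noise-free for clarity:

```python
import numpy as np

C = 343.0
X_b = np.array([0.3, 0.0, 0.0])               # array vector, fixed 0.3 m microphone distance

def tau(u_e, R):
    """Modeled time difference for a unit source direction u_e at orientation R."""
    return X_b @ (R @ u_e) / C

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

# Measurements from distinct orientations, none of which is around the array axis (x)
u_true = np.array([0.6, 0.64, 0.48])          # unit-norm source direction
Rs = [np.eye(3), rot_z(0.4), rot_y(0.4)]
ys = [tau(u_true, R) for R in Rs]             # noise-free measurements for illustration

u = np.array([1.0, 0.0, 0.0])                 # initial guess
for _ in range(10):                           # Gauss-Newton with normalization (cf. eq. (11))
    r = np.array([yk - tau(u, R) for yk, R in zip(ys, Rs)])
    H = np.vstack([R.T @ X_b / C for R in Rs])   # d tau / d u_e
    u = u + np.linalg.pinv(H) @ r
    u /= np.linalg.norm(u)
print(u)  # converges to u_true; the distance to the source remains unknown
```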
(84) The hearing system further comprises a detector unit (DET) (or is configured for receiving corresponding signals from separate sensors) for detecting movements over time of the hearing system when worn by the user, and providing location data of said sensor array at different points in time t, t=1, . . . , N. The detector (DET) provides data indicative of a track of the user (hearing system) relative to the sound source (cf. signal(s) trac, e.g. from Q different sensors or comprising Q different signals).
(85) The hearing system further comprises a first processor (PRO1) for receiving said electric input signals and, in case said sound comprises sound from a localized sound source S, for extracting sensor array configuration specific data τ_ij (cf. signal tau) of the sensor array indicative of differences between a time of arrival of sound from the localized sound source S at said respective input transducers (M1, M2), at different points in time t, t=1, . . . , N.
(87) The hearing system further comprises a second processor (PRO2) configured to estimate data indicative of a location of said localized sound source S relative to the user based on corresponding values of said location data and said sensor array configuration data at said different points in time t, t=1, . . . , N. The data indicative of a location of said localized sound source S relative to the user may e.g. be a direction of arrival (cf. signal doa from the processor (PRO2) to the beamformer filtering unit BF).
(88) The embodiment of a hearing system in
(89) The embodiment of a hearing system in
(90) The embodiment of a hearing system in
(92) The hearing aid (HD) exemplified in
(93) In an embodiment, the hearing device (HD) of
(94) The hearing aid (HD) may e.g. comprise a directional microphone system (including a beamformer filtering unit) adapted to spatially filter out a target acoustic source among a multitude of acoustic sources in the local environment of the user wearing the hearing aid, and to suppress noise from other sources in the environment. The beamformer filtering unit may receive as inputs the respective electric signals from input transducers IT_11, IT_12, IT_2 (and possibly further input transducers), or any combination thereof, and generate a beamformed signal based thereon. In an embodiment, the directional system is adapted to detect (such as adaptively detect) from which direction a particular part of the microphone signal (e.g. a target part and/or a noise part) originates. In an embodiment, the beamformer filtering unit is adapted to receive inputs from a user interface (e.g. a remote control or a smartphone) regarding the present target direction. A memory unit (MEM) may e.g. comprise predefined (or adaptively determined) complex, frequency dependent constants (W_ij) defining predefined (or adaptively determined) or fixed beam patterns (e.g. omni-directional, target cancelling, pointing in a number of specific directions relative to the user), together defining a beamformed signal Y_BF.
(95) The hearing aid of
(96) The hearing aid (HD) according to the present disclosure may comprise a user interface UI, e.g. as shown in
(98) The left and right hearing devices each comprise a forward path between M input units IU_i, i=1, . . . , M (each comprising e.g. an input transducer, such as a microphone or a microphone system, and/or a direct electric input (e.g. a wireless receiver)) and an output unit (SP), e.g. an output transducer, here a loudspeaker. A beamformer or selector (BF) and a signal processor (SPU) are located in the forward path. In an embodiment, the signal processor is adapted to provide a frequency dependent gain according to a user's particular needs. In the embodiment of
(101) In the embodiment of a hearing device in
(102) The hearing system (here, the hearing device HD) further comprises a detector unit comprising one or more inertial measurement units (IMU), e.g. a 3D gyroscope, a 3D accelerometer and/or a 3D magnetometer, here denoted IMU1 and located in the BTE-part (BTE). Inertial measurement units (IMUs), e.g. accelerometers, gyroscopes, and magnetometers, and combinations thereof, are available in a multitude of forms (e.g. multi-axis, such as 3D-versions), e.g. constituted by or forming part of an integrated circuit, and thus suitable for integration, even in miniature devices, such as hearing devices, e.g. hearing aids. The sensor IMU1 may thus be located on the substrate (SUB) together with other electronic components (e.g. MEM, FE, DSP). One or more movement sensors (IMU) may alternatively or additionally be located in or on the ITE part (ITE) or in or on the connecting element (IC).
(103) The hearing device (HD) further comprises an output unit (e.g. an output transducer) providing stimuli perceivable by the user as sound based on a processed audio signal from the processor or a signal derived therefrom. In the embodiment of a hearing device in
(104) An auxiliary electric signal derived from visual information from the video camera VC may be used in a mode of operation where it is combined with an electric sound signal from one or more of the input transducers (e.g. the microphones) to localize sound sources relative to the user. In another mode of operation, a beamformed signal is provided by appropriately combining electric input signals from the input transducers (M_BTE1, M_BTE2, M_BTE3, M_ITE1, M_ITE2), e.g. by applying appropriate complex weights to the respective electric input signals (beamformer). In a further mode of operation, the auxiliary electric signal is used as input to a processing algorithm (e.g. a single channel noise reduction algorithm) to enhance a signal of the forward path, e.g. a beamformed (spatially filtered) signal.
(105) The electric input signals (from input transducers M_BTE1, M_BTE2, M_BTE3, M_ITE1, M_ITE2) may be processed in the time domain or in the (time-)frequency domain (or partly in the time domain and partly in the frequency domain, as considered advantageous for the application in question).
(106) The hearing device (HD) exemplified in
(107) The hearing device in
(109) The hearing system in
(110) It is intended that the structural features of the devices described above, either in the detailed description and/or in the claims, may be combined with steps of the method, when appropriately substituted by a corresponding process.
(111) As used, the singular forms "a", "an", and "the" are intended to include the plural forms as well (i.e. to have the meaning "at least one"), unless expressly stated otherwise. It will be further understood that the terms "includes", "comprises", "including", and/or "comprising", when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will also be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, but an intervening element may also be present, unless expressly stated otherwise. Furthermore, "connected" or "coupled" as used herein may include wirelessly connected or coupled. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items. The steps of any disclosed method are not limited to the exact order stated herein, unless expressly stated otherwise.
(112) It should be appreciated that reference throughout this specification to "one embodiment" or "an embodiment" or "an aspect", or to features included as "may", means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the disclosure. Furthermore, the particular features, structures or characteristics may be combined as suitable in one or more embodiments of the disclosure. The previous description is provided to enable any person skilled in the art to practice the various aspects described herein. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects.
(113) The claims are not intended to be limited to the aspects shown herein, but are to be accorded the full scope consistent with the language of the claims, wherein reference to an element in the singular is not intended to mean "one and only one" unless specifically so stated, but rather "one or more". Unless specifically stated otherwise, the term "some" refers to one or more.
(114) Accordingly, the scope should be judged in terms of the claims that follow.
REFERENCES
(115) [Jazwinski; 1970] Andrew H. Jazwinski, Stochastic Processes and Filtering Theory, vol. 64 of Mathematics in Science and Engineering, Academic Press, Inc, 1970.
(116) [Knapp & Carter; 1976] C. Knapp and G. Carter, The generalized correlation method for estimation of time delay, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 24, no. 4, pp. 320-327, August 1976.
(117) [Levenberg; 1944] Kenneth Levenberg, A method for the solution of certain non-linear problems in least squares, Quarterly Journal of Applied Mathematics, vol. II, no. 2, pp. 164-168, 1944.
(118) [Marquardt; 1963] Donald W. Marquardt, An algorithm for least-squares estimation of nonlinear parameters, SIAM Journal on Applied Mathematics, vol. 11, no. 2, pp. 431-441, 1963.
(119) EP2701145A1 (Oticon, Retune) Feb. 26, 2014.
(120) EP3267697A1 (Oticon) Jan. 1, 2018.