Hearing device configured to utilize non-audio information to process audio signals

11689869 · 2023-06-27

Abstract

A hearing device, e.g. a hearing aid, is configured to be worn by a user, e.g. fully or partially on the head of the user, and comprises a) an input transducer for converting a sound comprising a target sound from a target talker and possible additional sound in an environment of the user, when the user wears the hearing device, to an electric sound signal representative of said sound, b) an auxiliary input unit configured to provide an auxiliary electric signal representative of said target sound or properties thereof, and c) a processor connected to said input transducer and to said auxiliary input unit, wherein said processor is configured to apply a processing algorithm to said electric sound signal, or a signal derived therefrom, to provide an enhanced signal by attenuating components of said additional sound relative to components of said target sound in said electric sound signal, or said signal derived therefrom. The auxiliary electric signal is derived from visual information, e.g. from a camera, containing information of current vibrations of a facial or throat region of said target talker, and the processing algorithm is configured to use the auxiliary electric signal or the signal derived therefrom to provide the enhanced signal.

Claims

1. A hearing device configured to be worn by a user or to be fully or partially implanted in the head of the user, the hearing device comprising: at least one input transducer for converting a sound comprising a target sound from a target talker and possible additional sound in an environment of the user, when the user wears the hearing device, to an electric sound signal representative of said sound, an auxiliary input unit configured to provide an auxiliary electric signal representative of said target sound or properties thereof, a processor connected to said input transducer and to said auxiliary input unit, said processor being configured to apply a processing algorithm to said electric sound signal, or a signal derived therefrom, to provide an enhanced signal by attenuating components of said additional sound relative to components of said target sound in said electric sound signal, or in said signal derived therefrom, wherein said auxiliary electric signal in the form of a vibration signal is derived from a video signal from a camera providing visual information containing information of current mechanical vibrations of a facial or throat region of said target talker originating from the vocal cord of the target talker, and wherein said processing algorithm is configured to use said auxiliary electric signal in the form of a vibration signal or said signal derived therefrom to provide said enhanced signal.

2. A hearing device according to claim 1 comprising a light sensitive sensor for providing said visual information.

3. A hearing device according to claim 1 comprising a camera for providing said visual information.

4. A hearing device according to claim 3 comprising a carrier whereon said camera is mounted.

5. A hearing device according to claim 4 wherein said carrier comprises a housing of the hearing device, a spectacle frame, or a boom of a headset, an article of clothing, or a clip.

6. A hearing device according to claim 1 comprising a user interface allowing a user to indicate a direction to or a location of a target talker of current interest to the user.

7. A hearing device according to claim 1 comprising a filter bank for decomposing the electric sound signal in frequency sub-bands, at least providing a low-frequency part and a high-frequency part of the electric sound signal.

8. A hearing device according to claim 7 comprising an adaptive filter and a combination unit for estimating a low-frequency part ŝ.sub.LF of the enhanced signal from said low frequency part x.sub.LF of the electric sound signal and said auxiliary electric signal e.sub.s.

9. A hearing device according to claim 8 comprising a synthesis filter bank or a sum unit for providing said enhanced signal ŝ from said low-frequency part ŝ.sub.LF and a high-frequency part ŝ.sub.HF of the enhanced signal.

10. A hearing device according to claim 1 comprising a voice activity detector for providing a voice activity indicator representing an estimate of whether or not, or with what probability, an input signal comprises a voice signal at a given point in time, and wherein said voice activity indicator is determined in dependence of said auxiliary electric signal or said signal derived therefrom.

11. A hearing device according to claim 1 comprising a face tracking algorithm to extract features of the face region of a person in a field of view of the camera.

12. A hearing device according to claim 1 comprising an output unit for providing stimuli perceivable as sound to a user based on said enhanced signal ŝ.

13. A hearing device according to claim 1 configured to provide that the use of the auxiliary electric signal in the providing of the enhanced signal is only enabled when vibrations in the facial or throat region are above a certain threshold, such vibrations being taken to be due to the person having activated the vocal cords, and hence to be talking.

14. A hearing device according to claim 1 being constituted by or comprising a hearing aid, a headset, an earphone, an ear protection device or a combination thereof.

15. A method of operating a hearing device configured to be worn by a user or to be fully or partially implanted in the head of the user, the method comprising: converting a sound comprising a target sound from a target talker and possible additional sound in an environment of the user, when the user wears the hearing device, to an electric sound signal representative of said sound, providing an auxiliary electric signal representative of said target sound or properties thereof, applying a processing algorithm to said electric sound signal, or a signal derived therefrom, to provide an enhanced signal by attenuating components of said additional sound relative to components of said target sound in said electric sound signal, or in said signal derived therefrom, deriving said auxiliary electric signal in the form of a vibration signal from a video signal from a camera providing visual information containing information of current mechanical vibrations of a facial or throat region of said target talker, and using said auxiliary electric signal in the form of a vibration signal or said signal derived therefrom to provide said enhanced signal.

16. A hearing aid system, comprising a hearing device according to claim 1 and an auxiliary device, the hearing system being adapted to establish a communication link between the hearing device and the auxiliary device to provide that information can be exchanged or forwarded from one to the other.

17. A hearing aid system according to claim 16 wherein the auxiliary device is or comprises the light sensitive sensor.

18. A hearing aid system according to claim 16 wherein the auxiliary device is or comprises a remote control, a smartphone, or other portable or wearable electronic device.

19. A hearing aid system according to claim 16 wherein the auxiliary device is or comprises another hearing device and wherein the hearing system implements a binaural hearing system.

Description

BRIEF DESCRIPTION OF DRAWINGS

(1) The aspects of the disclosure may be best understood from the following detailed description taken in conjunction with the accompanying figures. The figures are schematic and simplified for clarity, and they just show details needed to improve the understanding of the claims, while other details are left out. Throughout, the same reference numerals are used for identical or corresponding parts. The individual features of each aspect may each be combined with any or all features of the other aspects. These and other aspects, features and/or technical effects will be apparent from and elucidated with reference to the illustrations described hereinafter in which:

(2) FIG. 1A shows a top view of an embodiment of a use case of a hearing aid system according to the present disclosure; and

(3) FIG. 1B shows a side view of an embodiment of a use case of a hearing aid system according to the present disclosure,

(4) FIG. 2A shows an embodiment of a hearing aid system comprising an adaptive-filter based system to produce an enhanced target signal, ŝ(n), based on microphone signal x(n) and on vibration signal e.sub.s(n) derived from a throat video signal;

(5) FIG. 2B shows an exemplary more detailed embodiment of the hearing aid system of FIG. 2A;

(6) FIG. 2C shows an exemplary more detailed embodiment of the input stage comprising input transducer and analysis filter bank of the hearing aid system of FIG. 2B; and

(7) FIG. 2D schematically illustrates the filter characteristics of the high pass and low pass filters of the embodiment of an analysis filter bank of FIG. 2C,

(8) FIG. 3A shows a top view of an embodiment of a hearing aid system comprising first and second hearing devices integrated with a spectacle frame,

(9) FIG. 3B shows a front view of the embodiment in FIG. 3A, and

(10) FIG. 3C shows a side view of the embodiment in FIG. 3A,

(11) FIG. 4 shows an embodiment of a hearing device according to the present disclosure, and

(12) FIG. 5A shows a first embodiment of a hearing aid system comprising a multitude of input transducers and a video camera;

(13) FIG. 5B shows a second embodiment of a hearing aid system comprising a multitude of input transducers, a beamformer and a video camera, wherein a signal extracted from the video camera is used to enhance the beamformed signal; and

(14) FIG. 5C shows a third embodiment of a hearing aid system comprising a multitude of input transducers, a beamformer, a single channel noise reduction system, and a video camera, wherein a signal extracted from the video camera is used by the single channel noise reduction system to enhance the beamformed signal,

(15) FIG. 6 shows an embodiment of a binaural hearing system according to the present disclosure, and

(16) FIG. 7A schematically illustrates a time segment of an exemplary (clean) sound element (vowel /a/), e.g. 100 ms, at the mouth of the speaker,

(17) FIG. 7B schematically illustrates a time segment of an exemplary (clean) sound element (vowel /a/), e.g. 100 ms, at the vocal cords of the speaker, and

(18) FIG. 7C schematically shows spectra of the sound element /a/ corresponding to FIG. 7A (at the mouth of the speaker), to FIG. 7B (at the vocal cords of the speaker), and as recorded by a hearing aid microphone (including environment noise).

(19) The figures are schematic and simplified for clarity, and they just show details which are essential to the understanding of the disclosure, while other details are left out. Throughout, the same reference signs are used for identical or corresponding parts.

(20) Further scope of applicability of the present disclosure will become apparent from the detailed description given hereinafter. However, it should be understood that the detailed description and specific examples, while indicating preferred embodiments of the disclosure, are given by way of illustration only. Other embodiments may become apparent to those skilled in the art from the following detailed description.

DETAILED DESCRIPTION OF EMBODIMENTS

(21) The detailed description set forth below in connection with the appended drawings is intended as a description of various configurations. The detailed description includes specific details for the purpose of providing a thorough understanding of various concepts. However, it will be apparent to those skilled in the art that these concepts may be practiced without these specific details. Several aspects of the apparatus and methods are described by various blocks, functional units, modules, components, circuits, steps, processes, algorithms, etc. (collectively referred to as “elements”). Depending upon particular application, design constraints or other reasons, these elements may be implemented using electronic hardware, computer program, or any combination thereof.

(22) The electronic hardware may include microprocessors, microcontrollers, digital signal processors (DSPs), field programmable gate arrays (FPGAs), programmable logic devices (PLDs), gated logic, discrete hardware circuits, and other suitable hardware configured to perform the various functionality described throughout this disclosure. Computer program shall be construed broadly to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executables, threads of execution, procedures, functions, etc., whether referred to as software, firmware, middleware, microcode, hardware description language, or otherwise.

(23) The present application relates to the field of hearing devices, e.g. hearing aids, in particular to the improvement of speech intelligibility in difficult listening situations, e.g. situations exhibiting a low signal to noise ratio (SNR). It is proposed to combine traditional noise reduction algorithms (aiming at modifying a noisy speech signal to improve a user's intelligibility thereof) with the use of additional information (not derived from the acoustic signal as such, and hence—ideally—not disturbed by a low SNR of the acoustic signal), e.g. based on optical signals, e.g. images/video data.

(24) The use of visual face/mouth information of a target talker in combination with microphone signals is e.g. dealt with in EP3028475A1. Furthermore, in [1] it was demonstrated that sound signals may be somewhat reconstructed based on the tiny vibrations, as picked up by a high-speed camera, of lightweight surfaces, e.g. a bag of chips lying on a table, or the leaves of a plant. However, the idea proposed in the present disclosure is different in the sense that it reconstructs a clean speech signal based on a visual and an audio signal ([1] reconstructs audio entirely from video). The proposed idea is special in that it focuses exclusively on a vibrating surface of particular interest, e.g. the throat region of the target talker.

(25) Another class of methods exists, which tries to reconstruct speech from a video recording of a target talker, e.g. [2]. However, these methods are active, in that they require a signal, e.g. a laser, to be shined on the target talker. These methods then reconstruct speech from the reflected laser signal. The proposed method is passive in the sense that it does not involve any interference with the target talker.

(26) According to the present disclosure, a user is equipped with a hearing device or system (e.g. a hearing aid or a hearing aid system), which comprises one or more microphones, and a (high-speed) video camera focused towards a target talker. It is proposed to use visual information picked up by the camera, to help enhance the noisy speech signals picked up by the microphones. However, in contrast to existing audio-visual approaches, which may use face/mouth information of the target talker to help guide the speech enhancement algorithm, the proposed idea uses visual information from the throat area of the target talker.

(27) More specifically, with a high-speed video camera focused on the throat of the target talker, it may be possible to detect and record the vibrations of the skin, related to the vibration (or absence thereof) of the vocal cords of the target talker (see FIG. 1A, 1B). This video signal provides instantaneous information about the produced speech sound, which cannot be extracted from a face/mouth visual signal. Hence, this throat signal may be used in combination with face/mouth information or in a stand-alone configuration to help enhance the noisy signals picked up by the microphones of the hearing aid system. The proposed idea requires a video camera focused towards the throat of the target talker; modern face tracking algorithms allow the detection/tracking of faces in the video stream. Furthermore, additional algorithms can be used to detect facial features (e.g., eyes, mouth, etc.), and it is a fairly simple matter to adapt such algorithms to localize the throat region.
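
The disclosure does not mandate a particular localization method. As a minimal sketch of how a throat region could be localized and reduced to a one-dimensional vibration signal, the following uses OpenCV's stock frontal-face detector; the cascade file, the ROI geometry below the face box, and the mean-intensity reduction are illustrative assumptions, not the patented method (a practical system would use sub-pixel motion analysis in the spirit of reference [1]):

```python
import cv2
import numpy as np

def throat_vibration_signal(video_path):
    """Crude 1-D vibration signal e_s(n) from the throat region of a video.

    Illustrative sketch only: mean ROI intensity stands in for proper
    sub-pixel motion analysis of skin vibrations.
    """
    face_det = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    cap = cv2.VideoCapture(video_path)
    samples = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = face_det.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        if len(faces) == 0:
            samples.append(0.0)          # no face found in this frame
            continue
        x, y, w, h = faces[0]
        # Assumption: the throat lies in a band just below the face box.
        roi = gray[y + h : y + h + h // 3, x + w // 4 : x + 3 * w // 4]
        samples.append(float(roi.mean()) if roi.size else 0.0)
    cap.release()
    e_s = np.asarray(samples)
    return e_s - e_s.mean()              # remove DC offset
```

One sample is produced per frame, so the resulting signal is sampled at the camera frame rate f.sub.s,cam.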

(28) FIG. 1A shows a top view (cf. VERT-DIR indication perpendicular to the view) of a use case of an embodiment of a hearing aid system according to the present disclosure. FIG. 1A shows a hearing aid system (HS) comprising a hearing device (HD.sub.L) worn by a user (U) (here at a left ear of the user) and a (high-speed) video camera (VC) pointing away from the body of the hearing aid user (U), focused towards a target talker (TT) providing a target (speech) signal s′(n), where n represents time (e.g. a time index). The hearing device (HD.sub.L) comprises at least one microphone (M) for picking up sound from the environment. The goal is to retrieve clean target signal components s.sub.L(n) from the noisy signal x.sub.L(n) as recorded by the hearing aid microphone (M). The signal x.sub.L(n) picked up by the microphone of the hearing device comprises the voice (e.g. speech) of the target talker (TT) (propagated from the talker to the hearing device microphone). The (clean) speech of the target talker as received at the microphone (M) is denoted s.sub.L(n). Additionally, the signal x.sub.L(n) comprises possible noise v.sub.L(n) from the environment. The video camera (VC) has in its field of view (FOV) (e.g. is focused at) the face, and e.g. in particular the throat, or cheek or chin, of the target talker (TT). The field of view (FOV) is represented by dash-dotted arrows from the camera (VC) pointing at the target talker's head. A (preferably high-speed) video signal with frame rate f.sub.s,cam is picked up by the camera and represented by electrical signal e.sub.s(n), and made accessible to the hearing device(s) (or hearing aid system), e.g. for use in a noise reduction algorithm as described in the present disclosure. The microphone signal can thus be expressed as x.sub.L(n)=s.sub.L(n)+v.sub.L(n), corresponding to signals received at the microphone (M).

(29) The video camera (VC) may e.g. be mounted on the user (U), e.g. on the user's head, e.g. on a headband or on a spectacle frame (see e.g. FIG. 3A, 3B, 3C), or integrated with one of the hearing devices or connected to the hearing devices via a wireless or wired connection (cf. wireless link (LNK) in FIG. 1A). The mounting on the head of the user has the advantage that the focus of the camera may follow the head rotation of the user, so that it (for example) is focused on the object (e.g. a person, e.g. the target talker) that the user is currently looking at (and presumably paying attention to), cf. look direction (LOOK-DIR) in FIG. 1A of the user (U). In an embodiment, each of the hearing devices (HD.sub.L, HD.sub.R) comprises or is connected to a camera (e.g. to different cameras). In an embodiment, each hearing device is configured to use the information from the camera to enhance one of the microphone signals or a beamformed signal (e.g. formed as a (possibly complex) weighted combination of two or more microphone signals of the hearing device or system in question, see e.g. FIG. 5B, 5C).

(30) The video camera may or may not form part of the hearing device or the hearing aid system. In an embodiment, the hearing aid system (e.g. a hearing aid) comprises an interface to a video camera and is configured to receive (video) data from a video camera. The video camera may be located in a fixed position (possibly mounted in a way allowing rotation around one or more axes) or moved by an operator.

(31) The hearing device of FIG. 1A, 1B is shown as a BTE-part adapted to be located at or behind an ear (pinna). It may further comprise an ITE-part, e.g. a customized mould connected to the BTE part via an acoustically guiding tube, or comprising a loudspeaker connected to the BTE-part via a cable comprising electrical conductors. The hearing device may be constituted by any other appropriate hearing aid style, be it of an air conduction type, a cochlear implant type or a bone conduction type.

(32) FIG. 1B shows a side view of an embodiment of a use case of a hearing aid system according to the present disclosure. The view of FIG. 1B is a side view (looking on the left side of the head of the user (U), i.e. on the right side of the head of the target talker (TT)) of the system and scenario illustrated from the top in FIG. 1A. The signals and components of FIG. 1B are the same as depicted in FIG. 1A and discussed above. The field of view (FOV) of the video camera (VC) is indicated in a vertical direction to include the head and throat region of the target talker (TT). Vibrations of the vocal cords are visible (extractable) in the facial region, e.g. at the throat (Throat (vocal cord), signal e′.sub.s(n)) or at a cheek (Cheek, signal e″.sub.s(n)), or chin (not shown) of the target talker (TT).

(33) FIG. 2A shows an embodiment of a hearing aid system (HS) configured to produce an enhanced target signal (an estimate ŝ(n) of the target signal s(n)) based on a signal x(n), the electric sound signal from input transducer (IT) (e.g. from microphone (M) in FIG. 2B), and on an auxiliary electric signal in the form of vibration signal e.sub.s(n) derived from a throat video signal (and/or a video signal comprising other regions containing vibrations originating from the vocal cords). The hearing aid system (HS) comprises a hearing aid (HD) comprising an input transducer (IT) for converting a sound in the environment of the hearing system to an electric sound signal x(n), which is fed to a processor (PRO). The hearing aid system further comprises a camera (VC) (exhibiting a frame rate of f.sub.s,cam) for providing an auxiliary electric signal e.sub.s(n) derived from visual information containing information of the current vibration of the vocal cords of a target talker. The video camera (VC) is connected (by cable or wirelessly) to the hearing aid (HD) comprising an auxiliary input unit (AIN) comprising receiver circuitry for receiving signal E.sub.s(n) representing said target sound signal or characteristics thereof. The auxiliary input unit (AIN) provides the auxiliary electric signal e.sub.s(n), which is fed to the processor (PRO). The processor (PRO) is configured to enhance the (noisy) microphone signal x(n) using a processing algorithm, e.g. a noise reduction algorithm, configured to use the auxiliary electric signal e.sub.s(n) or a signal derived therefrom to provide the enhanced signal in the form of an estimate ŝ(n) of the target signal s(n). The enhanced signal ŝ(n) is fed to output transducer (OT) for providing stimuli based thereon and perceivable as sound to the user. The output transducer (OT) may e.g. comprise one or more of a loudspeaker, a vibrator, and a multi-electrode array.

(34) FIG. 2B shows an exemplary more detailed embodiment of the hearing aid system of FIG. 2A comprising a hearing aid (HD) coupled to a video camera (VC). In the embodiment of FIG. 2B, the hearing aid system (HS) comprises an adaptive-filter based system to produce the enhanced target signal ŝ(n). The hearing aid system comprises an analysis filter bank (FB-A) for converting the electric (time domain) sound signal x(n) provided by the input transducer (here a microphone (M)) into a number of frequency sub-band signals (here two), x.sub.HF(n) and x.sub.LF(n), respectively. The signal comprising frequencies above a cut-off frequency f.sub.cut is termed the high frequency part x.sub.HF(n) of the signal, whereas the signal comprising frequencies below the cut-off frequency f.sub.cut is termed the low frequency part x.sub.LF(n) of the signal.

(35) In other words, the analysis filter bank (FB-A) decomposes the noisy microphone signal x(n) into a low-frequency part,
x.sub.LF(n)=s.sub.LF(n)+v.sub.LF(n),
and a high-frequency part,
x.sub.HF(n)=s.sub.HF(n)+v.sub.HF(n).

(36) The analysis filter bank essentially implements a low-pass and a high-pass filter with identical cut-off frequency (f.sub.cut), see FIGS. 2C, 2D. The cut-off frequency of the filter bank may be related to the frame-rate f.sub.s,cam of the camera (VC). Specifically, e.sub.s(n) will contain no signal components at frequencies higher than half the camera frame rate. Hence, the cut-off frequency could, for example, be set to half of the camera frame rate (f.sub.cut=f.sub.s,cam/2). In some situations it is possible to increase the ‘effective’ frame rate f.sub.s,cam to a higher value. The physical frame rate is unchanged, but the effective frame rate is increased by temporal interpolation using spatial information. Hereby the cut-off frequency can be increased (so that higher frequencies can be included in the LF-signal).
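
A minimal sketch of this two-band analysis stage (cf. FIG. 2C), assuming complementary Butterworth filters via scipy; the filter order is an illustrative choice, not prescribed by the disclosure:

```python
import numpy as np
from scipy.signal import butter, lfilter

def split_bands(x, fs, fs_cam, order=4):
    """Split x(n) into x_LF(n) and x_HF(n) around f_cut = fs_cam / 2."""
    f_cut = fs_cam / 2.0  # the video carries no content above half the frame rate
    b_lo, a_lo = butter(order, f_cut, btype="low", fs=fs)
    b_hi, a_hi = butter(order, f_cut, btype="high", fs=fs)
    x_lf = lfilter(b_lo, a_lo, x)
    x_hf = lfilter(b_hi, a_hi, x)
    return x_lf, x_hf
```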

(37) The focus of the embodiment of FIGS. 2A-2D is the retrieval of the low-frequency part of the clean signal, s.sub.LF(n), using e.sub.s(n). Retrieval of the high-frequency part of the clean signal, s.sub.HF(n), may be approached using other speech enhancement methods, e.g. single channel noise reduction, see e.g. [6,7].

(38) An estimate of the combined impulse response vector, h.sub.tot(n), may be found using the adaptive filter setup depicted in FIG. 2B (or equivalently the frequency response by appropriate transforms to and from the frequency domain). The auxiliary electric signal e.sub.s is used to estimate ŝ.sub.LF, the low frequency part of the target signal. The adaptive filter is coupled to minimize a squared difference between the low frequency part x.sub.LF of the noisy microphone signal and the estimated low frequency part ŝ.sub.LF of the target signal ((x.sub.LF−ŝ.sub.LF).sup.2). Thereby the vibration signal recorded by the camera is adapted to resemble the low-frequency part of the (clean) target signal recorded at the (reference) microphone. The auxiliary electric signal e.sub.s(n) from the video camera (VC) is fed to the adaptive filter (denoted ĥ.sub.tot(n)) providing as an output an estimate of the low-frequency part ŝ.sub.LF(n) of the target speech signal. An estimate of the filter coefficient vector ĥ.sub.tot(n) may be found adaptively, by minimizing the mean-squared error criterion (minimizing an expectation value of the squared difference between the correct value s.sub.LF of the LF-part of the target signal received at the microphone and the estimated value ŝ.sub.LF of same):
E[(s.sub.LF(n)−h.sub.tot(n).sup.Te.sub.s(n)).sup.2],
where E[ ] is the expectation operator, superscript .sup.T denotes vector transposition, and e.sub.s(n) should be read as a vector of successive sample values of e.sub.s(n) up to and including the sample at time n. The dimension of vector e.sub.s(n) is identical to that of vector h.sub.tot(n).

(39) An adaptive estimate of vector h.sub.tot(n) may for example be found using variants of the well-known least-mean-square (LMS) algorithm (see e.g. [8]), leading to filter coefficient estimates of the form ĥ.sub.tot(n):
ĥ.sub.tot(n+1)=ĥ.sub.tot(n)+μ(n)e.sub.s(n)[x.sub.LF(n)−ŝ.sub.LF(n)],
where we assumed that target and noise signals observed at the microphone(s) are uncorrelated, where μ(n) is a step-length parameter, which may be fixed or time-varying (signal-dependent) (cf. e.g. [8]), and where
ŝ.sub.LF(n+1)=ĥ.sub.tot(n+1).sup.Te.sub.s(n+1).

(40) In other words, the estimate of the filter coefficient vector ĥ.sub.tot (n) may be found adaptively, by minimizing an expectation value of the squared difference between the noisy value x.sub.LF of the LF-part of the target signal received at the microphone and the estimated value ŝ.sub.LF of the LF-part of the target signal provided by the adaptive filter ĥ.sub.tot (as illustrated in FIG. 2B).

(41) Many other adaptive algorithms with better tracking/convergence properties are known [8] and can be used in this context.
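
A minimal time-domain sketch of this adaptive estimation is given below. The filter length and step size are illustrative choices; a normalized step size (i.e. a signal-dependent μ(n), consistent with paragraph (39)) is used for stability, and e.sub.s is assumed to have been resampled to the audio sample rate:

```python
import numpy as np

def lms_enhance_lf(x_lf, e_s, L=64, mu=0.05, eps=1e-8):
    """Estimate s_LF(n) from the vibration signal e_s(n) by (N)LMS.

    Implements h(n+1) = h(n) + mu(n) * e_s(n) * [x_LF(n) - s_LF_hat(n)],
    with s_LF_hat(n) = h(n)^T e_s(n).
    """
    N = len(x_lf)
    h = np.zeros(L)                          # estimate of h_tot
    s_lf_hat = np.zeros(N)
    for n in range(L - 1, N):
        u = e_s[n - L + 1 : n + 1][::-1]     # last L samples, newest first
        s_lf_hat[n] = h @ u                  # filter output s_LF_hat(n)
        err = x_lf[n] - s_lf_hat[n]          # error against noisy LF signal
        h += (mu / (eps + u @ u)) * err * u  # normalized LMS update
    return s_lf_hat
```

The enhanced output would then be formed as described below, in the simplest case ŝ(n)=ŝ.sub.LF(n)+x.sub.HF(n).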

(42) Finally, the estimate of the low-frequency part of the clean signal, ŝ.sub.LF(n), and the estimate of the high-frequency part of the clean signal, ŝ.sub.HF(n), are combined to form the estimate, ŝ(n), e.g. using a synthesis filter bank (FB-S), or simply by summing ŝ.sub.LF(n) and ŝ.sub.HF(n). Note that the estimate of the high-frequency clean signal content might simply be (approximated by) the unprocessed noisy high-frequency part of the signal, i.e., ŝ.sub.HF(n)=x.sub.HF(n). Optionally, the high-frequency part of the signal, ŝ.sub.HF(n), may be provided from the unprocessed noisy high-frequency part of the signal, x.sub.HF(n), e.g. by single channel noise reduction (‘post filtering’). The combined estimate ŝ(n) of the target signal may be presented to the user via output transducer (OT), e.g. a loudspeaker, or may be further processed (e.g. by applying one or more processing algorithms, e.g. compressive amplification to compensate for a user's hearing impairment) before presentation.

(43) FIG. 2C shows an exemplary more detailed embodiment of the input stage comprising input transducer (IT) and analysis filter bank (FB-A) of the hearing aid system of FIG. 2B. The analysis filter bank of FIG. 2B may be implemented by respective high pass and low pass filters (HPF, LPF) providing respective high frequency and low frequency parts (x.sub.HF(n), x.sub.LF(n)) of the noisy input signal x(n) from the input transducer (IT).

(44) FIG. 2D schematically illustrates exemplary filter characteristics of the high pass and low pass filters (HPF, LPF) of the embodiment of an analysis filter bank of FIG. 2C. The high pass (HPF) and low pass (LPF) filters are adapted to exhibit (substantially) identical (3 dB) cut-off frequencies f.sub.cut. The cut-off frequency divides the operational part of the frequency axis into a low frequency range (LF) from f.sub.min (e.g. 0 Hz) to f.sub.cut (e.g. 1 kHz) and a high frequency range (HF) from f.sub.cut to f.sub.max (e.g. 12 kHz).

(45) The minimum frequency may e.g. be of the order of 20 Hz or 50 Hz. The maximum frequency may e.g. be of the order of 8 kHz or 10 kHz. The cut-off frequency may e.g. be of the order of 1 kHz or 2 kHz.

(46) The solution described above uses the entire waveform e.sub.s(n) of the throat signal in the enhancement process. Other solutions may be envisioned, where, first, the signal e.sub.s(n) is analysed and certain features (characteristics) of the signal are extracted. These features may include a) speech activity (is the target talker speaking in the first place), b) voicing state (i.e., to which extent are the vocal cords vibrating), and c) fundamental frequency (i.e., if the vocal cords are vibrating, at which frequency).

(47) Such features may be used as side information in speech enhancement systems to improve their performance (see e.g. [5,6]).
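
A compact sketch of such feature extraction from the vibration signal is given below; the frame length, voicing threshold, and autocorrelation-based F0 estimator are illustrative assumptions (note that F0 candidates are necessarily limited to below half the camera frame rate):

```python
import numpy as np

def vibration_features(e_s, fs_cam, frame_len=0.1, f0_min=50.0, f0_max=200.0):
    """Per-frame (a) activity, (b) voicing and (c) F0 from e_s(n).

    f0_max must stay below fs_cam / 2 (the camera's Nyquist limit).
    """
    L = int(frame_len * fs_cam)
    lag_min = max(int(fs_cam / f0_max), 1)
    lag_max = int(fs_cam / f0_min)
    feats = []
    for i in range(0, len(e_s) - L + 1, L):
        fr = e_s[i : i + L] - e_s[i : i + L].mean()
        energy = float(fr @ fr) / L                      # (a) activity proxy
        ac = np.correlate(fr, fr, mode="full")[L - 1 :]  # autocorrelation
        ac = ac / (ac[0] + 1e-12)
        lag = lag_min + int(np.argmax(ac[lag_min : lag_max + 1]))
        voicing = float(ac[lag])                         # (b) periodicity strength
        f0 = fs_cam / lag if voicing > 0.3 else 0.0      # (c) F0 if voiced
        feats.append((energy, voicing, f0))
    return feats
```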

(48) FIG. 3A shows a top view of a first embodiment of a hearing aid system (HS) comprising first and second hearing devices (HD.sub.1, HD.sub.2) integrated with a spectacle frame. FIG. 3B shows a front view of the embodiment in FIG. 3A, and FIG. 3C shows a side view of the embodiment in FIG. 3A.

(49) The hearing aid system according to the present disclosure is configured to be worn on the head of a user and comprises a head worn carrier, here embodied in a spectacle frame.

(50) The hearing aid system (HS) comprises left and right hearing devices (HD.sub.1, HD.sub.2) and a number of sensors, wherein at least some of the sensors are mounted on the spectacle frame. The hearing aid system (HS) comprises a number of sensors S.sub.i (i=1, . . . , N.sub.S) associated with (e.g. forming part of or connected to) the left and right hearing devices (HD.sub.1, HD.sub.2), respectively. The number of sensors comprises at least one camera (e.g. a high-speed camera). Two or more (e.g. all) of the number of sensors N.sub.S (here four) may represent cameras, focused at different parts of the environment of the hearing aid system (i.e. of the user, when wearing the hearing aid system). In the example of FIG. 3A, 3B, 3C, the distribution of sensors is symmetric, though this need not necessarily be so. The first, second, third, and fourth sensors S.sub.1, S.sub.2, S.sub.3, S.sub.4 are mounted on a spectacle frame of the glasses (GL). In the embodiment of FIG. 3A, sensors S.sub.1 and S.sub.2 are mounted on the respective sidebars (SB.sub.1 and SB.sub.2), whereas sensors S.sub.3 and S.sub.4 are mounted on the cross bar (CB) having (e.g. hinged) connections to the right and left side bars (SB.sub.1 and SB.sub.2). Glasses or lenses (LE) of the spectacles may be mounted on the cross bar (CB) and nose sub-bars (NSB.sub.1, NSB.sub.2). The left and right hearing devices (HD.sub.1, HD.sub.2) comprise respective BTE-parts (BTE.sub.1, BTE.sub.2), and further comprise respective ITE-parts (ITE.sub.1, ITE.sub.2). Alternatively, at least one of the left and right hearing devices may comprise only a BTE part or only an ITE part, or be adapted to be fully or partially implanted in the head of the user. In an embodiment, the glasses comprise at least one camera mounted at the spectacle frame (e.g. on the cross bar) so that its focus follows the look direction of the user wearing the hearing aid system. In an embodiment, the hearing aid system is configured to select one of a multitude of cameras as the one to use in the enhancement of the electric sound signal received via the input transducer (e.g. IT in FIG. 2A), e.g. via a user interface, e.g. implemented as an APP of an auxiliary device, e.g. a smartphone, a smartwatch or the like.

(51) Some or all microphones of the hearing aid system (HS) may be located on the (frame of the) glasses and/or on the BTE part, and/or on an ITE-part. The ITE-parts may further e.g. comprise electrodes or other sensors for picking up body signals from the user, e.g. for monitoring physiological functions of the user, e.g. brain activity or eye movement activity or temperature, etc. The body signals may e.g. comprise Electrooculography (EOG) potentials and/or brainwave potentials, e.g. Electroencephalography (EEG) potentials, cf. e.g. EP3185590A1. The sensors mounted on the spectacle frame may (in addition to one or more cameras for picking up images of facial regions (e.g. including the throat region) of a target talker) e.g. comprise one or more of an accelerometer, a gyroscope, a magnetometer, a radar sensor, an eye camera (e.g. for monitoring pupillometry), or other sensors for localizing or contributing to localization of a sound source (or other landmark) of interest to the user wearing the hearing system and/or for identifying a target talker or a user's own voice.

(52) The BTE- and ITE parts (BTE and ITE) of the hearing devices are electrically connected, either wirelessly or wired, as indicated by the dashed connection between them in FIG. 3C. The ITE part (ITE.sub.1) may comprise a microphone (cf. M.sub.ITE in FIG. 4) and/or a loudspeaker (cf. SPK in FIG. 4) located in the ear canal during use.

(53) While a camera pointed towards the target talker might allow face, mouth, and throat information to be used, the proposed idea (using visually acquired information representing vibration of the vocal cords of the target talker) could be used in systems which i) rely only on such visual information, combined with the microphone signals picked up at the hearing aid user. A simplified system may be envisioned, which ii) relies exclusively on visual information representing vibrations of the vocal cords (e.g. throat information) (i.e., which does not use the microphone signals at all).

(54) FIG. 4 shows an embodiment of a hearing device according to the present disclosure. The hearing device (HD), e.g. a hearing aid, is of a particular style (sometimes termed receiver-in-the-ear, or RITE, style) comprising a BTE-part (BTE) adapted for being located at or behind an ear of a user, and an ITE-part (ITE) adapted for being located in or at an ear canal of the user's ear and comprising a receiver (loudspeaker). The BTE-part and the ITE-part are connected (e.g. electrically connected) by a connecting element (IC) and internal wiring in the ITE- and BTE-parts (cf. e.g. wiring W.sub.X in the BTE-part). The connecting element may alternatively be fully or partially constituted by a wireless link between the BTE- and ITE-parts (see e.g. FIG. 3C).

(55) In the embodiment of a hearing device in FIG. 4, the BTE part comprises two input units comprising respective input transducers (e.g. microphones) (M.sub.BTE1, M.sub.BTE2), each for providing an electric input audio signal representative of an input sound signal (S.sub.BTE) (originating from a sound field S around the hearing device). The input unit further comprises two wireless receivers (WLR.sub.1, WLR.sub.2) (or transceivers) for providing respective directly received auxiliary audio and/or control input signals (and/or allowing transmission of audio and/or control signals to other devices, e.g. a remote control or processing device). The input unit further comprises a video camera (VC) located in the housing of the BTE-part so that its field of view (FOV) is directed in a look direction of the user wearing the hearing device (here next to the electric interface to the connecting element (IC)). The hearing device (HD) comprises a substrate (SUB) whereon a number of electronic components are mounted, including a memory (MEM) e.g. storing different hearing aid programs (e.g. parameter settings defining such programs, or parameters of algorithms, e.g. optimized parameters of a neural network) and/or hearing aid configurations, e.g. input source combinations (M.sub.BTE1, M.sub.BTE2, WLR.sub.1, WLR.sub.2, VC), e.g. optimized for a number of different listening situations. The auxiliary electric signal derived from visual information (e.g. from video camera VC) may be used in a mode of operation where it is combined with an electric sound signal from one of the input transducers (e.g. a microphone, e.g. M.sub.BTE1). In another mode of operation, the auxiliary electric signal is used together with a beamformed signal provided by appropriately combining electric input signals from the first and second input transducers (M.sub.BTE1, M.sub.BTE2), e.g. by applying appropriate complex weights to the respective electric input signals (beamformer). In a mode of operation, the auxiliary electric signal is used as input to a processing algorithm (e.g. a single channel noise reduction algorithm) to enhance a signal of the forward path, e.g. a beamformed (spatially filtered) signal. In an embodiment, the auxiliary electric signal is used only when the hearing device is brought into a specific mode of operation (e.g. a ‘boost noise reduction’ mode representing a particularly difficult, e.g. multi-talker or extraordinarily noisy, acoustic environment). An activation of the specific mode of operation may be performed by a program shift, e.g. initiated via a user interface, e.g. implemented as an APP on a remote control device, e.g. a smartphone or other wearable device. In an embodiment, the light sensitive sensor (e.g. a camera) is only activated when the hearing device is brought into the specific mode of operation. In an embodiment, the light sensitive sensor (e.g. a camera) is activated in a low-power mode (e.g. a camera with reduced frame rate), when the hearing device is not in the specific mode of operation.

(56) The substrate further comprises a configurable signal processor (DSP, e.g. a digital signal processor, e.g. including a processor (e.g. PRO in FIG. 2A) for applying a frequency and level dependent gain, e.g. providing beamforming, noise reduction (including improvements using the camera), filter bank functionality, and other digital functionality of a hearing device according to the present disclosure). The configurable signal processor (DSP) is adapted to access the memory (MEM) and to select and process one or more of the electric input audio signals and/or one or more of the directly received auxiliary audio input signals, and/or the camera signal, based on a currently selected (activated) hearing aid program/parameter setting (e.g. either automatically selected, e.g. based on one or more sensors, or selected based on inputs from a user interface). The mentioned functional units (as well as other components) may be partitioned in circuits and components according to the application in question (e.g. with a view to size, power consumption, analogue vs. digital processing, etc.), e.g. integrated in one or more integrated circuits, or as a combination of one or more integrated circuits and one or more separate electronic components (e.g. inductor, capacitor, etc.). The configurable signal processor (DSP) provides a processed audio signal, which is intended to be presented to a user. The substrate further comprises a front-end IC (FE) for interfacing the configurable signal processor (DSP) to the input and output transducers, etc., and typically comprising interfaces between analogue and digital signals. The input and output transducers may be individual separate components, or integrated (e.g. MEMS-based) with other electronic circuitry.

(57) The hearing device (HD) further comprises an output unit (e.g. an output transducer) providing stimuli perceivable by the user as sound based on a processed audio signal from the processor or a signal derived therefrom. In the embodiment of a hearing device in FIG. 4, the ITE part comprises the output unit in the form of a loudspeaker (also termed a ‘receiver’) (SPK) for converting an electric signal to an acoustic (air borne) signal, which (when the hearing device is mounted at an ear of the user) is directed towards the ear drum (Ear drum), where sound signal (S.sub.ED) is provided. The ITE-part further comprises a guiding element, e.g. a dome, (DO) for guiding and positioning the ITE-part in the ear canal (Ear canal) of the user. The ITE-part further comprises a further input transducer, e.g. a microphone (M.sub.ITE), for providing an electric input audio signal representative of an input sound signal (S.sub.ITE) at the ear canal.

(58) The electric input signals (from input transducers M.sub.BTE1, M.sub.BTE2, M.sub.ITE) may be processed in the time domain or in the (time-) frequency domain (or partly in the time domain and partly in the frequency domain as considered advantageous for the application in question).

(59) The hearing device (HD) exemplified in FIG. 4 is a portable device and further comprises a battery (BAT), e.g. a rechargeable battery, e.g. based on Li-Ion battery technology, e.g. for energizing electronic components of the BTE- and possibly ITE-parts. In an embodiment, the hearing device, e.g. a hearing aid, is adapted to provide a frequency dependent gain and/or a level dependent compression and/or a transposition (with or without frequency compression) of one or more frequency ranges to one or more other frequency ranges, e.g. to compensate for a hearing impairment of a user.

(60) FIG. 5A shows a first embodiment of a hearing aid system (HS) comprising a hearing device (HD) and a video camera (VC). The hearing system comprises a multitude of input transducers (IT.sub.1, . . . , IT.sub.M) (e.g., as here, forming part of the hearing device; one or more of the input transducers may e.g. be external to the hearing device, e.g. located in an auxiliary device). The M input transducers (e.g. microphones) each provide respective electric (time-domain) input signals x.sub.1(n), . . . , x.sub.M(n) representing sound at the location of the input transducer in question (n representing time, e.g. a time index of a digital signal). The electric input signals x.sub.1(n), . . . , x.sub.M(n) from the input transducers (IT.sub.1, . . . , IT.sub.M) and the auxiliary electric signal e.sub.s(n) from the auxiliary input unit (AIN) (connected to the video camera) are fed to the processor (PRO) for processing the electric input signals in dependence of the auxiliary electric signal by applying one or more processing algorithms and providing an estimate ŝ(n) of a target signal. The estimate of the target signal is fed to the output transducer (OT) for presentation to a user wearing the hearing aid system as stimuli perceivable as sound. Apart from comprising a multitude of input transducers, the embodiment of FIG. 5A comprises the same elements as the embodiment shown in FIG. 2A and described above. The auxiliary input signal e.sub.s(n) provided by the video camera (VC) may be combined in a multitude of ways to enhance one or more, such as all of, or a weighted combination of the multitude of, the electric input signals x.sub.1(n), . . . , x.sub.M(n). Two examples thereof are illustrated in FIGS. 5B and 5C and described in the following.

(61) FIG. 5B shows a second embodiment of a hearing aid system (HS) comprising a multitude of microphones, a beamformer and a video camera, wherein a signal extracted from the video camera is used to enhance the beamformed signal. The embodiment of FIG. 5B is similar to the embodiment of FIG. 5A apart from the processor being shown to comprise a beamformer and a signal processor, and appropriate analysis and synthesis filter banks for executing processing in frequency sub-bands. The processor (PRO) of the embodiment of a hearing device of FIG. 5B comprises respective analysis filter banks (FBA) for providing respective electric sound signals x.sub.1(n), . . . , x.sub.M(n) as frequency sub-band signals x.sub.1(k,m), . . . , x.sub.M(k,m), where k is a frequency index, k=1, . . . , K, and m is a time frame index, m=1, 2, . . . . The processor (PRO) further comprises a spatial filter (beamformer, BF) for providing a beamformed signal x.sub.BF(k,m) from the frequency sub-band signals x.sub.1(k,m), . . . , x.sub.M(k,m). The processor (PRO) further comprises a signal processor (SPU) for applying one or more processing algorithms (e.g. a noise reduction algorithm) to the spatially filtered (beamformed) signal x.sub.BF(k,m) and providing an estimate ŝ(k,m) of the target signal (the enhanced signal) based thereon. The one or more processing algorithms is/are configured to use the auxiliary electric signal e.sub.s(n) (possibly converted to a time frequency representation e.sub.s(k,m)) to provide the enhanced signal ŝ(k,m). The processor (PRO) further comprises a synthesis filter bank (FBS, here indicated together with the signal processor (SPU) as SPU-FBS) for converting the time-frequency (frequency sub-band) representation of the estimate of the target signal ŝ(k,m) to a time domain signal ŝ(n), which is fed to the output transducer (OT) for presentation to the user as an audibly perceivable signal. The signal processor (SPU) may e.g. implement a time-frequency based version of the adaptive filter arrangement or an equivalent solution as shown in FIG. 2B and discussed above. The signal processor may further be configured to apply other processing to the beamformed signal, or a processed version thereof, e.g. to the estimate of the target signal ŝ(k,m), e.g. to compensate for a user's hearing impairment.
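
The disclosure does not prescribe a particular beamformer. As a minimal illustration of the spatial filter (BF) stage, a delay-and-sum beamformer in the STFT domain could be sketched as follows; the frame length and the delay parameterization are assumptions for illustration:

```python
import numpy as np
from scipy.signal import stft, istft

def delay_and_sum(x, fs, tau, nperseg=256):
    """Delay-and-sum beamformer in the STFT domain.

    x   : (M, N) array of microphone signals x_1(n), ..., x_M(n).
    tau : (M,) relative delays (s) of the target at each microphone.
    """
    f, _, X = stft(np.asarray(x), fs=fs, nperseg=nperseg)  # X: (M, K, T)
    steer = np.exp(2j * np.pi * np.outer(tau, f))          # phase-align to target
    X_bf = np.mean(steer[:, :, None] * X, axis=0)          # x_BF(k, m)
    _, x_bf = istft(X_bf, fs=fs, nperseg=nperseg)
    return X_bf, x_bf
```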

(62) FIG. 5C shows a third embodiment of a hearing aid system (HS) comprising a multitude of microphones, a beamformer, a single channel noise reduction system, a hearing aid processor, a video camera, and a feature extractor, wherein a signal extracted from the video camera is used to enhance the beamformed signal by using parameters extracted from the auxiliary signal, e.g. a voice activity indicator, as input to the noise reduction system. The embodiment of FIG. 5C is similar to the embodiment of FIG. 5B, apart from the following features. The single channel noise reduction system (SC-NR) is configured to apply a noise reduction algorithm to the beamformed signal x.sub.BF(k,m). The feature extractor (FEX) may e.g. be configured to extract from the video signal e.sub.s(n) (or a sub-band version thereof) a voice detection signal indicative of whether or not, or with what probability, the target talker is actively speaking or otherwise uses his voice at a given point in time. The further processing unit (HAP) is e.g. configured to apply other processing algorithms to the noise reduced signal, e.g. a level and frequency dependent gain (or attenuation) to compensate for a user's hearing impairment. The feature extractor (FEX) may be configured to extract other characteristics of the target sound signal from the auxiliary signal e.sub.s(n), e.g. a fundamental frequency, which may be used as input to one or more of the processing algorithms (as e.g. indicated by the dashed arrow to the hearing aid processor (HAP) in FIG. 5C). The fundamental frequency may e.g. be used as an indicator of a particular voice class of the target talker (e.g. male, female or a child). The indicator may e.g. be used to select a set of processing parameters in dependence of the determined fundamental frequency (such processing parameters being e.g. related to gain, compression, directionality, noise reduction, etc., the frequency dependency of different processing algorithms being e.g. different in dependence of the voice class (fundamental frequency)), cf. e.g. EP2081405A1.
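
A minimal sketch of how the video-derived voice activity indicator could gate the single channel noise reduction (SC-NR): the noise spectrum is updated only in frames the camera marks as speech pauses, and a Wiener-style gain is applied. The smoothing factor, gain floor, and gain rule are illustrative choices, not the patented algorithm:

```python
import numpy as np

def sc_nr_video_vad(X_bf, vad, alpha=0.9, g_min=0.1):
    """Apply VAD-gated noise reduction to STFT frames X_bf[k, m].

    vad[m] is True while the throat video indicates active speech.
    """
    K, M = X_bf.shape
    noise_psd = np.full(K, 1e-10)
    S_hat = np.empty_like(X_bf)
    for m in range(M):
        p = np.abs(X_bf[:, m]) ** 2
        if not vad[m]:                        # talker silent: learn the noise
            noise_psd = alpha * noise_psd + (1 - alpha) * p
        gain = np.maximum(1.0 - noise_psd / (p + 1e-12), g_min)
        S_hat[:, m] = gain * X_bf[:, m]       # Wiener-style spectral gain
    return S_hat
```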

(63) In the example of FIG. 2A-2D above, an adaptive filter solution to the problem of enhancing a microphone signal picked up at the hearing aid user, using a throat (or cheek or chin) video as side information, has been disclosed. Obviously, (deep) neural network solutions may be envisioned, which are trained to produce as output an enhanced microphone signal based on an input consisting of the noisy microphone signal(s) and the video signal.

(64) FIG. 6 shows an embodiment of a binaural hearing system according to the present disclosure. The scenario illustrated in FIG. 6 is similar to the one illustrated in FIG. 1A. A difference is that the user wears hearing devices at the left as well as at the right ear. Both hearing devices may be in communication with the video camera (VC) via respective wired or wireless links (LNK). Further (or alternatively), the left and right hearing devices (HD.sub.L, HD.sub.R) may be equipped with appropriate transceiver circuitry to allow communication (e.g. via an inter-aural link) to be established between them to thereby allow data (e.g. from the video camera) to be transferred from one hearing device to the other (possibly via an intermediate device). A further difference is that both hearing devices comprise two microphones, termed a front (FM.sub.L, FM.sub.R) and a rear (RM.sub.L, RM.sub.R) microphone, respectively, referring to the front and rear directions relative to the user's face (nose). The two (or more) microphones of a given hearing device may be used to create beamformed signals, e.g. focusing on a target direction, e.g. the look direction of the user (as indicated by the dashed arrow through the user's nose (NOSE) (by definition) pointing in a ‘front’ direction). The distance between the two hearing devices and thus the respective microphones is indicated by parameter a (e.g. of the order of 0.15-0.30 m). The hearing aid microphones of the left and right hearing devices (HD.sub.L, HD.sub.R) may be used to provide ‘separate’, local beamforming in each hearing device (only based on its ‘own’ microphones) and/or to provide binaural beamforming based on microphone(s) from both hearing devices.

(65) FIG. 7A schematically illustrates a time segment of an exemplary (clean) sound element (vowel /a/), e.g. of length of the order of 100 ms, at the mouth of the speaker.

(66) FIG. 7B schematically illustrates a time segment of an exemplary (clean) sound element (vowel /a/), e.g. of length of the order of 100 ms, at the vocal cords of the speaker.

(67) The middle part of FIG. 7C in dashed, bold line schematically shows a spectrum S(k) of the (clean) sound element /a/ (at the mouth of the speaker) corresponding to FIG. 7A. The top part of FIG. 7C in solid, bold line further schematically shows a spectrum X(k) of the (noisy) sound element /a/ as recorded by a hearing aid microphone (including environment noise). The bottom part of FIG. 7C further schematically shows a (line) spectrum E.sub.s(k) of the (clean) sound element /a/ (at the vocal cords of the speaker) corresponding to FIG. 7B. The line spectrum represents a fundamental frequency F.sub.0 (and harmonics thereof, qF.sub.0, q=2, 3, . . . ). The vertical dotted line in FIG. 7C, denoted f.sub.s,cam, indicates the frame rate of the camera. The frame rate of the camera f.sub.s,cam is related to the cut-off frequency, f.sub.cut, of the low-pass and the high-pass filters of FIG. 2A-2D, i.e. the limit between a low frequency range (LF) and a high frequency range (HF). The cut-off frequency, f.sub.cut, may, for example, be smaller than or equal to the frame rate of the camera, f.sub.s,cam. The cut-off frequency, f.sub.cut, may e.g. be set to half of the camera frame rate (f.sub.cut=f.sub.s,cam/2). The (video) camera may have a frame rate (f.sub.s,cam) in the range between 250 Hz and 1 kHz. The camera may be a high-speed video camera, e.g. having a frame rate larger than 1 kHz, such as larger than 2 kHz. Fundamental frequencies (F.sub.0) of the vocal tract of human beings during vocal utterances (e.g. speech) are typically in the range between 50 Hz and 550 Hz.
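
As a worked numerical example (the 500 Hz frame rate is an illustrative value within the stated 250 Hz-1 kHz range):

f.sub.cut=f.sub.s,cam/2=(500 Hz)/2=250 Hz,

so the video-derived signal can resolve vibration frequencies up to 250 Hz, covering typical male fundamental frequencies (85-165 Hz) but only the lower part of the 50-550 Hz range; resolving the full range would require a high-speed camera with f.sub.s,cam above 1.1 kHz.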

(68) Average fundamental frequencies differ between male, female and child voices. Male fundamental frequencies are e.g. typically in the range from 85 Hz to 165 Hz, see e.g. EP2081405A1. During speech, the vocal cords (and their immediate surroundings, e.g. skin, tissue and bone) will at least vibrate with the fundamental frequency F.sub.0, but higher harmonics (F.sub.q=qF.sub.0) will also be excited and be present in the user's speech signal together with a number of formant frequencies determined by the resonance properties (e.g. form and dimensions) of the vocal tract of the target talker. For the purposes of signal processing in hearing aids, speech frequencies are generally taken to lie in the range below 8-10 kHz. A majority of speech frequencies of importance to a user's intelligibility of speech are below 5 kHz, and mainly below 3 kHz, such as below 2 kHz. At least some of these frequencies (a low-frequency part<f.sub.cut (here=f.sub.s,cam)) will create corresponding vibrations in the facial region of the talker and be extractable by a video camera whose field of view includes (or is focused on) the facial region (e.g. the throat and/or cheek or chin regions).

(69) It is intended that the structural features of the devices described above, either in the detailed description and/or in the claims, may be combined with steps of the method, when appropriately substituted by a corresponding process.

(70) As used, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well (i.e. to have the meaning “at least one”), unless expressly stated otherwise. It will be further understood that the terms “includes,” “comprises,” “including,” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will also be understood that when an element is referred to as being “connected” or “coupled” to another element, it can be directly connected or coupled to the other element, but an intervening element may also be present, unless expressly stated otherwise. Furthermore, “connected” or “coupled” as used herein may include wirelessly connected or coupled. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items. The steps of any disclosed method are not limited to the exact order stated herein, unless expressly stated otherwise.

(71) It should be appreciated that reference throughout this specification to “one embodiment” or “an embodiment” or “an aspect” or features included as “may” means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the disclosure. Furthermore, the particular features, structures or characteristics may be combined as suitable in one or more embodiments of the disclosure. The previous description is provided to enable any person skilled in the art to practice the various aspects described herein. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects.

(72) The claims are not intended to be limited to the aspects shown herein, but are to be accorded the full scope consistent with the language of the claims, wherein reference to an element in the singular is not intended to mean “one and only one” unless specifically so stated, but rather “one or more.” Unless specifically stated otherwise, the term “some” refers to one or more.

(73) Accordingly, the scope should be judged in terms of the claims that follow.

REFERENCES

(74) [1] A. Davis et al., “The Visual Microphone: Passive Recovery of Sound from Video,” ACM Transactions on Graphics (Proc. SIGGRAPH), Vol. 33, No. 4, pp. 79:1-79:10, 2014.
[2] Z. Zalevsky et al., “Simultaneous remote extraction of multiple speech sources and heart beats from secondary speckles pattern,” Optics Express, Vol. 17, No. 24, pp. 21566-21580, 2009.
[3] M. A. Shabani et al., “Local Visual Microphones: Improved Sound Extraction from Silent Video,” 2017.
[4] P. Jax and P. Vary, “Artificial bandwidth extension of speech signals using MMSE estimation based on a hidden Markov model,” Proc. ICASSP 2003.
[5] J. R. Deller, J. H. L. Hansen, and J. G. Proakis, “Discrete-Time Processing of Speech Signals,” IEEE Press, 2000.
[6] P. C. Loizou, “Speech Enhancement: Theory and Practice,” CRC Press, 2007.
[7] R. C. Hendriks, T. Gerkmann, J. Jensen, “DFT-Domain Based Single-Microphone Noise Reduction for Speech Enhancement,” Morgan and Claypool, 2013.
EP3028475A1 (STARKEY) Feb. 5, 2015
EP3267697A1 (OTICON) Jan. 10, 2018
EP2081405A1 (BERNAFON) Jan. 21, 2008
EP3185590A1 (OTICON) Jun. 28, 2017