System and method of wind and noise reduction for a headphone
10341759 · 2019-07-02
Assignee
Inventors
- Sorin V. Dusan (San Jose, CA)
- Tom-Davy W. Saux (Santa Clara, CA, US)
- Vladan Bajic (San Francisco, CA, US)
Cpc classification
G10K11/17881
PHYSICS
G10K11/178
PHYSICS
G10K2200/10
PHYSICS
H04R2410/07
ELECTRICITY
H04R2201/107
ELECTRICITY
G10K2210/1081
PHYSICS
International classification
H04R1/10
ELECTRICITY
Abstract
Method of wind and noise reduction for headphones starts by receiving acoustic signals from first external microphone included on the outside of earcup's housing. Acoustic signals are received from internal microphone included inside earcup's housing. ANC downlink corrector processes downlink signal to generate echo estimate of speaker signal. First summator removes echo estimate of speaker signal from acoustic signals from internal microphone to generate corrected internal microphone signal. Spectral combiner performs spectral mixing of corrected internal microphone signal with acoustic signals from first external microphone to generate mixed signal. Lower frequency portion of mixed signal includes corresponding lower frequency portion of corrected internal microphone signal, and higher frequency portion of mixed signal includes corresponding higher frequency portion of acoustic signals from first external microphone. Other embodiments are also described.
Claims
1. A method of noise reduction for a headphone comprising: receiving an acoustic signal from an external microphone positioned outside a housing of an earcup of the headphone; receiving an acoustic signal from an internal microphone positioned inside the housing of the earcup; processing a downlink signal to generate an estimate of a speaker signal that is to be output by a speaker of the headphone; removing the estimate of the speaker signal from the acoustic signal from the internal microphone to generate a corrected internal microphone signal; spectrally mixing the corrected internal microphone signal with the acoustic signal from the external microphone to generate a mixed signal, wherein a lower frequency portion of the mixed signal includes a corresponding lower frequency portion of the corrected internal microphone signal, and a higher frequency portion of the mixed signal includes a corresponding higher frequency portion of the acoustic signal from the external microphone, processing the corrected internal microphone signal to generate an anti-noise signal; and adding the anti-noise signal to the downlink signal to generate the speaker signal to be output by the speaker.
2. The method of claim 1, further comprising: transforming the acoustic signal from the external microphone, the acoustic signal from the internal microphone, and the downlink signal from a time domain to a frequency domain; and transforming an enhanced mixed signal from the frequency domain to the time domain.
3. The method of claim 1, further comprising: removing a linear acoustic echo from the acoustic signal from the external microphone based on the downlink signal to generate an enhanced external microphone signal; and removing a linear acoustic echo from the corrected internal microphone signal based on the downlink signal to generate an enhanced internal microphone signal.
4. The method of claim 3, further comprising: scaling the enhanced internal microphone signal to match a level of the enhanced external microphone signal.
5. The method of claim 4, further comprising: amplifying the enhanced external microphone signal to generate an amplified enhanced external microphone signal, wherein the spectrally mixing comprises: spectrally mixing of the scaled enhanced internal microphone signal with the amplified enhanced external microphone signal to generate the mixed signal.
6. The method of claim 1, further comprising: transmitting the mixed signal as an uplink signal.
7. The method of claim 6, further comprising: detecting a presence of noise, wherein noise includes at least one of: wind noise or ambient noise.
8. The method of claim 7, wherein spectrally mixing to generate the mixed signal is based on detecting the presence of noise.
9. The method of claim 8, further comprising: removing at least one of a residual noise or a non-linear acoustic echo in the mixed signal based on detecting the presence of noise to generate an enhanced mixed signal.
10. A method of noise reduction for a headphone comprising: receiving an acoustic signal from a first external microphone and an acoustic signal from a second external microphone, wherein the first and second external microphones are included on an outside of a housing of an earcup of the headphone, generating a voicebeam signal based on the first external microphone signal and the second external microphone signal; receiving an acoustic signal from an internal microphone included inside the housing of the earcup; processing a downlink signal to generate an estimate of a speaker signal that is to be output by a speaker of the headphone; removing the estimate of the speaker signal from the acoustic signal from the internal microphone to generate a corrected internal microphone signal; spectrally mixing of the corrected internal microphone signal with the voicebeam signal to generate a mixed signal, wherein a lower frequency portion of the mixed signal includes a corresponding lower frequency portion of the corrected internal microphone signal, and a higher frequency portion of the mixed signal includes a corresponding higher frequency portion of the voicebeam signal, processing the corrected internal microphone signal to generate an anti-noise signal; and adding the anti-noise signal to the downlink signal to generate the speaker signal to be output by the speaker.
11. The method of claim 10, further comprising: transforming the acoustic signal from the first external microphone, the acoustic signal from the second external microphone, the acoustic signal from the internal microphone, and the downlink signal from a time domain to a frequency domain; and transforming an enhanced mixed signal from the frequency domain to the time domain.
12. The method of claim 10, further comprising: transmitting the mixed signal as an uplink signal.
13. The method of claim 12, further comprising: scaling the enhanced internal microphone signal to match a level of the enhanced first external microphone signal, wherein spectrally mixing comprises: spectrally mixing of a scaled enhanced internal microphone signal with the voicebeam signal to generate the mixed signal.
14. The method of claim 13, further comprising: detecting a presence of noise, wherein noise includes at least one of: wind noise or ambient noise, wherein spectrally mixing to generate the mixed signal is based on detecting the presence of noise.
15. The method of claim 14, further comprising: removing at least one of a residual noise or a non-linear acoustic echo in the mixed signal based on detecting the presence of noise to generate an enhanced mixed signal.
16. A system of noise reduction for a headphone comprising: a speaker to output a speaker signal based on a downlink signal; an earcup of the headphone includes a first external microphone included on an outside of a housing of the earcup, and an internal microphone included inside the housing of the earcup; an active-noise cancellation (ANC) downlink corrector to process the downlink signal to generate an estimate of the speaker signal; a first summator to remove the estimate of the speaker signal from an acoustic signal from the internal microphone to generate a corrected internal microphone signal; a first acoustic echo canceller to remove a linear acoustic echo from an acoustic signal from the first external microphone based on the downlink signal to generate an enhanced first external microphone signal; and a second acoustic echo canceller to remove a linear acoustic echo from the corrected internal microphone signal based on the downlink signal to generate an enhanced internal microphone signal; an equalizer to scale the enhanced internal microphone signal to match a level of the enhanced first external microphone signal; a spectral combiner to spectrally mix the enhanced internal microphone signal with the enhanced first external microphone signal to generate a mixed signal, wherein a lower frequency portion of the mixed signal includes a corresponding lower frequency portion of the enhanced internal microphone signal, and a higher frequency portion of the mixed signal includes a corresponding higher frequency portion of the enhanced first external microphone signal.
17. The system of claim 16, further comprising: a communications interface to transmit the mixed signal as an uplink signal.
18. The system of claim 17, further comprising: a feedback ANC corrector to process the corrected internal microphone signal to reduce amplification of a user's speech signal and of the ambient noise signal in the internal microphone and to generate an anti-noise signal; and a second summator to add the anti-noise signal to the downlink signal to generate the speaker signal.
19. The system of claim 18, further comprising: an amplifier to amplify the enhanced first external microphone signal to generate an amplified enhanced first external microphone signal, wherein the spectral combiner spectrally mixing comprises: spectrally mixing of an output of the equalizer with the amplified enhanced first external microphone signal to generate the mixed signal.
20. The system of claim 19, further comprising: a wind and noise detector to detect a presence of noise, wherein noise includes at least one of: wind noise or ambient noise.
21. The system of claim 20, wherein the spectral combiner spectrally mixes to generate the mixed signal based on detecting the presence of noise.
22. The system of claim 21, further comprising: a noise suppressor to remove at least one of a residual noise or a residual non-linear acoustic echo in the mixed signal based on detecting the presence of noise to generate an enhanced mixed signal.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The embodiments of the invention are illustrated by way of example and not by way of limitation in the figures of the accompanying drawings in which like references indicate similar elements. It should be noted that references to "an" or "one" embodiment of the invention in this disclosure are not necessarily to the same embodiment, and they mean at least one. In the drawings:
(2)
(3)
(4)
(5)
(6)
(7)
DETAILED DESCRIPTION
(8) In the following description, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In other instances, well-known circuits, structures, and techniques have not been shown to avoid obscuring the understanding of this description.
(9) Moreover, the following embodiments of the invention may be described as a process, which is usually depicted as a flowchart, a flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged. A process is terminated when its operations are completed. A process may correspond to a method, a procedure, etc.
(10)
(11) The headphone on
(12)
(13) As shown in
(14) The third microphone 11.sub.3 is located inside each earcup facing the user's ear cavity (e.g., inside microphone, internal microphone, or error microphone). Since the third microphone 11.sub.3 is located against or in the ear and the third microphone 11.sub.3 is placed inside the earcup 10.sub.1, the third microphone 11.sub.3 is protected from external noises such as ambient noise, environmental noise, and wind noise. In some embodiments, due to its location, the third microphone 11.sub.3 captures acoustic signals in which ambient noises are attenuated by 10 dB-20 dB and wind noises are attenuated by 15 dB-20 dB. In one embodiment, the earcup is an earbud such that the third microphone 11.sub.3 is located on the portion of the earbud (e.g., tube) that is placed in the user's ear such that the third microphone 11.sub.3 is as close as possible to the user's eardrum. In some embodiments, at least one of the external microphones 11.sub.1, 11.sub.2, and the internal microphone 11.sub.3 can be used to perform active noise cancellation (ANC).
(15) While
(16) While not shown in the
(17) In another embodiment, the earcups 10.sub.1, 10.sub.2 are wireless and may also include a battery device, a processor, and a communication interface (not shown). In this embodiment, the processor may be a digital signal processing chip that processes the acoustic signal from the microphones 11.sub.1-11.sub.3. In one embodiment, the processor may control or include at least one of the elements illustrated in the system 3 in
(18) The communication interface may include a Bluetooth receiver and transmitter which may communicate speaker audio signals or microphone signals from the microphones 11.sub.1-11.sub.3 wirelessly in both directions (uplink and downlink) with the electronic device. In some embodiments, the communication interface communicates an encoded signal from a speech codec (not shown) to the electronic device.
(19)
(20) As shown in
(21) In the embodiment in
(22) Embodiments of the invention may be applied in time domain or in frequency domain. In one embodiment, the sample rate converters (SRC) 301.sub.1-301.sub.3 in
(23) The time-frequency transformers (FBa) 302.sub.1-302.sub.3 transform the acoustic signals from the first microphone 11.sub.1, the acoustic signals from the second microphone 11.sub.2, and the acoustic signal from the third microphone 11.sub.3, from a time domain to a frequency domain. Similarly, the time-frequency transformer (FBa) 302.sub.4 transforms the downlink signal from a time domain to a frequency domain.
(24) An active-noise cancellation (ANC) downlink corrector 318 processes the downlink signal from the downlink DSP chain 317 to generate an echo estimate of the speaker signal. A first summator 304.sub.1 receives the acoustic signals from the third microphone (e.g., internal microphone) 11.sub.3 and the echo estimate of the speaker signal from the ANC downlink corrector 318. The first summator 304.sub.1 removes the echo estimate of the speaker signal from the acoustic signals from the internal microphone to generate a corrected internal microphone signal. Accordingly, the first summator 304.sub.1 extracts from the internal microphone signal the echo generated by the downlink signal that is produced by the speaker 316, which may be included in the earcup 10.sub.1 or the electronic device. This extraction further preserves the level of the speaker signal being played by the speaker 316.
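By way of illustration only, the operation of the ANC downlink corrector 318 and the first summator 304.sub.1 may be sketched in the time domain as follows. The sketch assumes that the path from the speaker to the internal microphone is modeled by a short FIR filter; the names `echo_estimate`, `remove_echo_estimate`, and `path_fir` are hypothetical and do not appear in this disclosure, which does not specify how the echo estimate is computed.

```python
from typing import List, Sequence


def echo_estimate(downlink: Sequence[float], path_fir: Sequence[float]) -> List[float]:
    """Estimate the speaker echo at the internal microphone.

    Assumption: the speaker-to-internal-microphone path is a short FIR
    filter (path_fir), so the estimate is a causal convolution.
    """
    estimate = []
    for n in range(len(downlink)):
        acc = 0.0
        for k, tap in enumerate(path_fir):
            if n - k >= 0:
                acc += tap * downlink[n - k]
        estimate.append(acc)
    return estimate


def remove_echo_estimate(
    internal_mic: Sequence[float],
    downlink: Sequence[float],
    path_fir: Sequence[float],
) -> List[float]:
    """Subtract the echo estimate from the internal microphone samples
    (the role of the first summator) to get the corrected signal."""
    estimate = echo_estimate(downlink, path_fir)
    return [m - e for m, e in zip(internal_mic, estimate)]
```

When the FIR model matches the actual path, the corrected internal microphone signal retains the user's speech and residual noise while the downlink contribution is removed.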
(25) Given the earcup 10.sub.1's occlusion on the user's ear, the user's speech that is captured by the third microphone 11.sub.3 is amplified at low frequencies compared with the external microphones 11.sub.1 and 11.sub.2. To reduce this amplification to a level close to what the user would hear normally without the earcup occlusion, a feedback ANC corrector 313 processes the corrected internal microphone signal from the first summator 304.sub.1 and generates an anti-noise signal. A second summator 304.sub.2 receives the anti-noise signal and the downlink signal from the downlink DSP chain 317. The second summator 304.sub.2 adds the anti-noise signal to the downlink signal to generate the speaker signal. The speaker signal is then played or output by the loudspeaker 316.
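As a highly simplified, non-limiting sketch of the signal flow just described, the anti-noise signal may be taken as an inverted, scaled copy of the corrected internal microphone signal, which the second summator then adds to the downlink signal. The fixed `gain` and the function names are assumptions made for illustration; a practical feedback ANC corrector would use a designed loop filter rather than a single gain.

```python
from typing import List, Sequence


def anti_noise(corrected_internal: Sequence[float], gain: float = 0.5) -> List[float]:
    """Hypothetical feedback ANC corrector: invert and scale the
    corrected internal microphone signal (assumed fixed gain)."""
    return [-gain * x for x in corrected_internal]


def speaker_signal(downlink: Sequence[float], anti: Sequence[float]) -> List[float]:
    """Second summator: add the anti-noise signal to the downlink signal."""
    return [d + a for d, a in zip(downlink, anti)]
```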
(26) As further shown in
(27) Referring back to the uplink signal processing, in
(28) The time-frequency transformers (FBa) 302.sub.1-302.sub.4 may transform the signals from a time domain to a frequency domain by filter bank analysis. In one embodiment, the time-frequency transformers (FBa) 302.sub.1-302.sub.4 may transform the signals from a time domain to a frequency domain using a Fast Fourier Transform (FFT).
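For illustration, the transformation of one frame of samples to the frequency domain and back may be sketched as below. The sketch uses a direct discrete Fourier transform, which an FFT computes more efficiently with the same result; the function names `analyze` and `synthesize` are hypothetical, and windowing and frame overlap are omitted.

```python
import cmath
from typing import List, Sequence


def analyze(frame: Sequence[float]) -> List[complex]:
    """Time domain to frequency domain for a single frame (direct DFT)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def synthesize(spectrum: Sequence[complex]) -> List[float]:
    """Frequency domain back to time domain (inverse DFT, real part)."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]
```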
(29) Acoustic echo cancellers (AEC) 303.sub.1-303.sub.3 provide additional echo suppression. For example, the first AEC 303.sub.1 removes a linear acoustic echo from acoustic signals from the first external microphone 11.sub.1 in the frequency domain based on a downlink signal in the frequency domain to generate an enhanced first external microphone signal in the frequency domain. The second AEC 303.sub.2 removes a linear acoustic echo from acoustic signals from the second external microphone 11.sub.2 in the frequency domain based on a downlink signal in the frequency domain to generate an enhanced second external microphone signal in the frequency domain. The third AEC 303.sub.3 removes a linear acoustic echo from the corrected internal microphone signal in the frequency domain based on the downlink signal in the frequency domain to generate an enhanced internal microphone signal in the frequency domain.
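One possible, non-limiting way to remove a linear acoustic echo in the frequency domain is a per-bin normalized LMS (NLMS) update, sketched below. The disclosure does not state which adaptation algorithm the AECs 303.sub.1-303.sub.3 use, so the NLMS choice, the class name `BinAEC`, and the parameters `step` and `eps` are assumptions. A practical canceller would also control adaptation while the user is speaking.

```python
from typing import List, Sequence


class BinAEC:
    """Hypothetical single-tap-per-bin linear echo canceller (NLMS)."""

    def __init__(self, num_bins: int, step: float = 0.5, eps: float = 1e-8) -> None:
        self.h = [0j] * num_bins  # estimated echo path per frequency bin
        self.step = step          # adaptation step size (assumed value)
        self.eps = eps            # regularizer to avoid division by zero

    def process(self, mic: Sequence[complex], downlink: Sequence[complex]) -> List[complex]:
        """Return the enhanced spectrum for one frame and adapt the path."""
        enhanced = []
        for k, (m, d) in enumerate(zip(mic, downlink)):
            err = m - self.h[k] * d  # microphone bin minus estimated echo
            self.h[k] += self.step * err * d.conjugate() / (abs(d) ** 2 + self.eps)
            enhanced.append(err)
        return enhanced
```

One instance would be created per microphone signal, and `process` would be called once per frame with the microphone spectrum and the downlink spectrum.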
(30) A beamformer 306 generates a voicebeam signal based on the enhanced first external microphone signal in the frequency domain and the enhanced second external microphone signal in the frequency domain.
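As one illustrative example only, a voicebeam signal may be formed by a two-microphone delay-and-sum operation in the frequency domain. The disclosure does not specify the beamforming method, so delay-and-sum, the function name, and the parameters `delay_s`, `sample_rate`, and `fft_size` are assumptions. The sketch assumes the user's speech reaches the second microphone `delay_s` seconds after the first.

```python
import cmath
from typing import List, Sequence


def delay_and_sum(
    x1: Sequence[complex],
    x2: Sequence[complex],
    delay_s: float,
    sample_rate: float,
    fft_size: int,
) -> List[complex]:
    """Align the second microphone spectrum to the first and average.

    Bin k is assumed to correspond to frequency k * sample_rate / fft_size.
    """
    beam = []
    for k, (a, b) in enumerate(zip(x1, x2)):
        freq = k * sample_rate / fft_size
        align = cmath.exp(2j * cmath.pi * freq * delay_s)  # undo the delay
        beam.append(0.5 * (a + b * align))
    return beam
```

Sound arriving with the assumed delay adds coherently, while sound arriving from other directions is partially attenuated.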
(31) In one embodiment, when only one external microphone is included in the system 3 (e.g., first microphone 11.sub.1), instead of a beamformer 306, the system 3 includes an amplifier 306 that is a single-microphone amplifier to amplify the enhanced first external microphone signal to generate an amplified enhanced first external microphone signal, which is transmitted to the spectral combiner 307 in lieu of the voicebeam signal.
(32) While the beamformer 306 is able to help capture the sounds from the user's mouth and attenuate some of the environmental noise, when the power of the environmental noise (or ambient noise) is above a given threshold or when wind noise is detected in at least two microphones, the acoustic signals captured by the beamformer 306 may not be adequate. Accordingly, in one embodiment of the invention, rather than only using the acoustic signals captured by the beamformer 306, the system 3 performs spectral mixing of the acoustic signals from the internal microphone 11.sub.3 and the voicebeam signal from the beamformer 306 to generate a mixed signal. In another embodiment, the system 3 performs spectral mixing of the acoustic signals from internal microphone 11.sub.3 with the acoustic signals captured by at least one of the external microphones 11.sub.1, 11.sub.2 or a combination of them to generate a mixed signal.
(33) As shown in
(34) In one embodiment, when only one external microphone is included in the system 3 (e.g., first microphone 11.sub.1), the wind and noise detector 305 only receives the enhanced first external microphone signal in the frequency domain from the first AEC 303.sub.1 and determines whether noise such as ambient noise and wind noise is detected in the enhanced first external microphone signal. In this embodiment, the wind and noise detector 305 detects ambient and wind noise when the acoustic noise power signal is greater than a pre-determined threshold. The wind and noise detector 305 generates the detector output to indicate whether the ambient or wind noise is detected in the enhanced first external microphone signal.
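The threshold comparison described for the single-microphone case may be sketched as follows. As a simplifying assumption, the sketch treats the mean power of the spectrum as the acoustic noise power; the disclosure does not state how noise power is estimated, and the function names and the threshold value are hypothetical.

```python
from typing import Sequence


def noise_power(spectrum: Sequence[complex]) -> float:
    """Mean power across frequency bins (assumed noise power estimate)."""
    return sum(abs(x) ** 2 for x in spectrum) / len(spectrum)


def noise_detected(spectrum: Sequence[complex], threshold: float) -> bool:
    """Detector output: True when the power exceeds the threshold."""
    return noise_power(spectrum) > threshold
```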
(35) In one embodiment, an equalizer 310 scales the enhanced internal microphone signal in the frequency domain to match a level of the enhanced first external microphone signal. The equalizer 310 corrects the frequency response of the third microphone 11.sub.3 (e.g., internal microphone) to match the frequency response of the first or second external microphones 11.sub.1, 11.sub.2. In one embodiment, the equalizer 310 may scale the enhanced internal microphone signal by a fixed scaling quantity. In another embodiment, the equalizer 310 may adaptively scale the enhanced internal microphone signal based on a comparison of the magnitudes of the signals from the first AEC 303.sub.1 and the third AEC 303.sub.3 at run time.
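The fixed and adaptive scaling options may be illustrated by the sketch below. The adaptive variant assumes that the comparison of magnitudes is a ratio of summed bin magnitudes, which is only one possible reading; the function names and the `eps` guard are hypothetical.

```python
from typing import List, Sequence


def fixed_scale(internal: Sequence[complex], scale: float) -> List[complex]:
    """Scale the enhanced internal microphone spectrum by a fixed quantity."""
    return [scale * x for x in internal]


def adaptive_scale(
    internal: Sequence[complex],
    external: Sequence[complex],
    eps: float = 1e-12,
) -> List[complex]:
    """Scale the internal spectrum so its summed magnitude matches the
    external spectrum (assumed form of the run-time comparison)."""
    mag_int = sum(abs(x) for x in internal)
    mag_ext = sum(abs(x) for x in external)
    gain = mag_ext / (mag_int + eps)
    return [gain * x for x in internal]
```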
(36) In
(37)
(38) As shown in
(39) Since acoustic signals from the internal microphone 11.sub.3 are more robust to the wind and ambient noise than the external microphones 11.sub.1, 11.sub.2 (or voicebeam signal from the beamformer 306), a lower frequency portion of the mixed signal generated by the spectral combiner 307 includes a corresponding lower frequency portion of the corrected internal microphone signal and a higher frequency portion of the mixed signal includes a corresponding higher frequency portion of the voicebeam signal. The mixed signal generated by the spectral combiner 307 includes the lower frequency portion and the higher frequency portion.
(40) In the embodiment where only one external microphone is included in the system 3 (e.g., first microphone 11.sub.1), the spectral combiner spectrally mixes the enhanced internal microphone signal with the enhanced first external microphone signal to generate a mixed signal. In one embodiment, prior to the spectral mixing, a single microphone amplifier may amplify the enhanced first external microphone signal as discussed above. In this embodiment, a lower frequency portion of the mixed signal includes a corresponding lower frequency portion of the enhanced internal microphone signal, and a higher frequency portion of the mixed signal includes a corresponding higher frequency portion of the enhanced first external microphone signal.
(41) As shown in
(42) In one embodiment, the wind and noise detector 305 may generate a detector output that indicates that noise is detected and further indicates the type of noise that is detected. For example, the detector output may indicate that the type of noise detected is either ambient noise or wind noise. As shown in
(43) In one embodiment, the spectral combiner 307 may include a low-pass filter and a high-pass filter. The low-pass filter applies the cutoff frequency (e.g., F1 or F2) to the acoustic signals from the internal microphone 11.sub.3 (or scaled enhanced internal microphone signal) and the high-pass filter applies the cutoff frequency (e.g., F1 or F2) to the acoustic signals from the first external microphone 11.sub.1 or to the voicebeam signal from the beamformer 306 to generate the mixed signal.
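A minimal sketch of the spectral mixing is given below, assuming one-sided spectra in which bin k corresponds to frequency k * sample_rate / fft_size. The sketch selects bins with an ideal (brick-wall) split at the cutoff frequency, which approximates the low-pass and high-pass filter pair; the function name and parameters are hypothetical, and no particular values of F1 or F2 are implied.

```python
from typing import List, Sequence


def spectral_mix(
    internal: Sequence[complex],
    external: Sequence[complex],
    cutoff_hz: float,
    sample_rate: float,
    fft_size: int,
) -> List[complex]:
    """Take bins below the cutoff from the internal microphone spectrum
    and bins at or above the cutoff from the external (or voicebeam)
    spectrum."""
    mixed = []
    for k, (low_src, high_src) in enumerate(zip(internal, external)):
        freq = k * sample_rate / fft_size
        mixed.append(low_src if freq < cutoff_hz else high_src)
    return mixed
```

For example, with a 16 kHz sample rate, an 8-point transform, and a 3 kHz cutoff, the bins at 0 Hz and 2 kHz come from the internal microphone signal and the bins at 4 kHz and above come from the external signal.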
(44) Referring to
(45) In one embodiment, the enhanced mixed signal may be in the frequency domain. In
(46) The following embodiments of the invention may be described as a process, which is usually depicted as a flowchart, a flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged. A process is terminated when its operations are completed. A process may correspond to a method, a procedure, etc.
(47)
(48) At Block 503, the ANC downlink corrector 318 processes a downlink signal to generate an echo estimate of a speaker signal to be output by a speaker 316. At Block 504, a first summator 304.sub.1 removes the echo estimate of the speaker signal from the acoustic signals from the internal microphone 11.sub.3 to generate a corrected internal microphone signal.
(49) At Block 505, a first AEC 303.sub.1 removes a linear acoustic echo from the acoustic signals from the first external microphone 11.sub.1 based on the downlink signal to generate an enhanced first external microphone signal. At Block 506, a second AEC (e.g., AEC 303.sub.3) removes a linear acoustic echo from the corrected internal microphone signal based on the downlink signal to generate an enhanced internal microphone signal.
(50) At Block 507, an equalizer 310 scales the enhanced internal microphone signal to match a level of the enhanced first external microphone signal. At Block 508, the spectral combiner 307 spectrally mixes the output of the equalizer (e.g., equalized corrected internal microphone signal) with the enhanced first external microphone signal to generate a mixed signal. In one embodiment, a lower frequency portion of the mixed signal includes a corresponding lower frequency portion of the output of the equalizer and a higher frequency portion of the mixed signal includes a corresponding higher frequency portion of the enhanced first external microphone signal. At Block 509, a feedback ANC corrector 313 processes the corrected internal microphone signal to reduce amplification of the user's speech signal by the internal microphone and to generate an anti-noise signal. At Block 510, a second summator 304.sub.2 adds the anti-noise signal to the downlink signal to generate the speaker signal to be output by the speaker.
(51)
(52) Keeping the above points in mind,
(53) An embodiment of the invention may be a machine-readable medium having stored thereon instructions which program a processor to perform some or all of the operations described above. A machine-readable medium may include any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computer), such as Compact Disc Read-Only Memory (CD-ROMs), Read-Only Memory (ROMs), Random Access Memory (RAM), and Erasable Programmable Read-Only Memory (EPROM). In other embodiments, some of these operations might be performed by specific hardware components that contain hardwired logic. Those operations might alternatively be performed by any combination of programmable computer components and fixed hardware circuit components.
(54) While the invention has been described in terms of several embodiments, those of ordinary skill in the art will recognize that the invention is not limited to the embodiments described, but can be practiced with modification and alteration within the spirit and scope of the appended claims. The description is thus to be regarded as illustrative instead of limiting. There are numerous other variations to different aspects of the invention described above, which in the interest of conciseness have not been provided in detail. Accordingly, other embodiments are within the scope of the claims.