INPUT SELECTION FOR WIND NOISE REDUCTION ON WEARABLE DEVICES
20240071404 · 2024-02-29
Assignee
Inventors
- Douglas George Morton (Southborough, MA, US)
- Olivia Montana Canavan (Natick, MA, US)
- Yang Liu (Boston, MA, US)
Cpc classification
H04R2420/07
ELECTRICITY
H04R2420/01
ELECTRICITY
G10K11/17873
PHYSICS
H04R2430/03
ELECTRICITY
G10K11/17837
PHYSICS
H04R2410/07
ELECTRICITY
H04R2201/107
ELECTRICITY
H04R2410/01
ELECTRICITY
G10K2210/1081
PHYSICS
International classification
G10K11/178
PHYSICS
H04R1/10
ELECTRICITY
Abstract
A wind noise reduction system including a beamformer, a comparator, and a voice mixer is provided. The beamformer may be an MVDR beamformer, and generates a beamformed signal based on a first microphone signal and a second microphone signal. The comparator generates a comparison signal based on the beamformed signal and a wind microphone signal. The comparison signal may be further based on a beamformed energy level of the beamformed signal and a wind energy level of the wind microphone signal. The voice mixer generates an output voice signal based on the beamformed signal, the wind microphone signal, and the comparison signal. The wind noise reduction system may further include a wind microphone corresponding to the wind microphone signal. The wind microphone may be arranged on a portion of a wearable audio device configured to be seated in a concha of a wearer.
Claims
1. A wind noise reduction system comprising: a beamformer configured to generate a beamformed signal based on a first microphone signal and a second microphone signal; a comparator configured to generate a comparison signal based on the beamformed signal and a wind microphone signal; and a dynamic voice mixer configured to generate an output voice signal based on the beamformed signal, the wind microphone signal, and the comparison signal.
2. The wind noise reduction system of claim 1, wherein the beamformer is a minimum variance distortionless response (MVDR) beamformer.
3. The wind noise reduction system of claim 1, wherein the comparison signal is further based on a beamformed energy level of the beamformed signal and a wind energy level of the wind microphone signal.
4. The wind noise reduction system of claim 1, wherein the output voice signal is a blend of the beamformed signal and the wind microphone signal.
5. The wind noise reduction system of claim 4, wherein a ratio of the wind microphone signal to the beamformed signal in the output voice signal corresponds to the comparison signal.
6. The wind noise reduction system of claim 5, wherein the ratio of the wind microphone signal to the beamformed signal in the output voice signal is frequency dependent.
7. The wind noise reduction system of claim 1, wherein the output voice signal corresponds to the wind microphone signal at a frequency range of 200 Hz to 2 kHz.
8. The wind noise reduction system of claim 1, further comprising: a first microphone corresponding to the first microphone signal; a second microphone corresponding to the second microphone signal; and a wind microphone corresponding to the wind microphone signal.
9. The wind noise reduction system of claim 8, wherein the wind microphone is arranged on a portion of a wearable audio device configured to be seated in a concha of a wearer.
10. The wind noise reduction system of claim 8, wherein the wind microphone faces a floor of a concha of a wearer during use.
11. The wind noise reduction system of claim 1, wherein the first microphone signal, the second microphone signal, and the wind microphone signal are frequency domain signals.
12. The wind noise reduction system of claim 1, wherein the first microphone signal, the second microphone signal, and the wind microphone signal are time domain signals.
13. The wind noise reduction system of claim 1, further comprising an equalizer configured to filter the beamformed signal prior to the beamformed signal being received by the comparator and the dynamic voice mixer.
14. The wind noise reduction system of claim 1, further comprising a high pass filter configured to filter the beamformed signal prior to the beamformed signal being received by the dynamic voice mixer.
15. The wind noise reduction system of claim 1, further comprising a feedforward noise cancellation controller for performing feedforward noise cancellation, wherein the feedforward noise cancellation controller receives an input corresponding to the wind microphone signal.
16. A wearable audio device, comprising: a first microphone configured to generate a first microphone signal; a second microphone configured to generate a second microphone signal; a wind microphone corresponding to a wind microphone signal; a beamformer configured to generate a beamformed signal based on the first microphone signal and the second microphone signal; a comparator configured to generate a comparison signal based on the beamformed signal and the wind microphone signal; and a dynamic voice mixer configured to generate an output voice signal based on the beamformed signal, the wind microphone signal, and the comparison signal.
17. The wearable audio device of claim 16, wherein the wind microphone is arranged on a portion of the wearable audio device configured to be inserted in a concha of a wearer.
18. The wearable audio device of claim 16, wherein the wearable audio device is an earbud.
19. A method for reducing wind noise, comprising: generating, via a first beamformer, a beamformed signal based on a first microphone signal and a second microphone signal; generating, via a comparator, a comparison signal based on the beamformed signal and a wind microphone signal; and generating, via a dynamic voice mixer, an output voice signal based on the beamformed signal, the wind microphone signal, and the comparison signal.
20. The method of claim 19, further comprising: generating, via a first microphone, the first microphone signal; generating, via a second microphone, the second microphone signal; and generating, via a wind microphone arranged on a portion of a wearable audio device configured to be disposed in a concha of a wearer and facing a floor of the concha, the wind microphone signal.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0025] In the drawings, like reference characters generally refer to the same parts throughout the different views. Also, the drawings are not necessarily to scale, emphasis instead generally being placed upon illustrating the principles of the various examples.
[0026]
[0027]
[0028]
[0029]
[0030]
[0031]
[0032]
[0033]
DETAILED DESCRIPTION
[0034] This disclosure generally relates to input selection for wind noise reduction on wearable audio devices. A wearable audio device captures spoken voice audio from a wearer via two microphones coupled to a beamformer, such as a minimum variance distortionless response (MVDR) beamformer. The wearable audio device also captures spoken voice audio via a wind microphone. The wind microphone is arranged on a portion of the wearable device configured to be seated in a concha of the wearer, such that the wind microphone faces a floor of the concha. Accordingly, the wind microphone will be shielded from wind noise by the concha and the structure of the ear of the wearer, and thus perform better in windy conditions than the beamformer. In particular, the wind microphone will generally outperform (in terms of characteristics such as signal to noise ratio (SNR) or noise floor level) the beamformer in the 200 Hz to 2 kHz frequency range.
[0035] The energy level of a beamformed signal generated by the beamformer is compared to the energy level of a wind microphone signal captured by the wind microphone. A dynamic voice mixer generates an output voice signal by switching between (or blending) the beamformed signal and the wind microphone signal based on the energy level comparison. If the energy level of the beamformed signal is higher than the energy level of the wind microphone signal, windy conditions are present, and at least a portion of the output voice signal will correspond to the wind microphone signal. Alternatively, if the energy level of the beamformed signal is lower than the energy level of the wind microphone signal, non-windy conditions are present, and at least a portion of the output voice signal will correspond to the beamformed signal.
[0036]
[0037]
[0038]
[0039] As with the first microphone 102, the second microphone 104 may be any microphone generally configured to capture spoken voice audio from the wearer W, such as an omnidirectional microphone. Further, as will be described in more detail below, the first and second microphones 102, 104 may be used with a beamformer 114 (see
[0040] As used herein, the term beamformer generally refers to a filter or filter array used to achieve directional signal transmission or reception. In the examples described in the present application, the beamformers combine audio signals received by multiple audio sensors (such as microphones and accelerometers) to focus on a desired spatial region, such as the region around the wearer's mouth. While different types of beamformers utilize different types of filtering, beamformers generally achieve directional reception by filtering the received signals such that, when combined, the signals received from the desired spatial region constructively interfere, while the signals received from the undesired spatial region destructively interfere. This interference results in an amplification of the signals from the desired spatial region, and rejection of the signals from the undesired spatial region. The desired constructive and destructive interference is generally achieved by controlling the phase and/or relative amplitude of the received signals before combining. The filtering may be implemented via one or more integrated circuit (IC) chips, such as a field-programmable gate array (FPGA). The filtering may also be implemented using software.
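By way of non-limiting illustration, the phase-alignment principle described above can be sketched for a single frequency bin using a two-microphone delay-and-sum beamformer. This is a simpler technique than the MVDR beamformer named elsewhere in this disclosure. The function name, the 5 kHz test frequency, and the 50 microsecond inter-microphone delays are assumptions chosen only so that the cancellation is exact.

```python
import cmath
import math


def delay_and_sum_gain(freq_hz, steer_delay_s, source_delay_s):
    """Magnitude response of a two-microphone delay-and-sum beamformer.

    Hypothetical helper for illustration only. The source reaches the
    second microphone source_delay_s seconds after the first; the
    beamformer is steered to sources with delay steer_delay_s.
    """
    # Unit-amplitude source at the first (reference) microphone.
    x1 = 1.0 + 0.0j
    # The same source at the second microphone, shifted in phase by its delay.
    x2 = cmath.exp(-2j * math.pi * freq_hz * source_delay_s)
    # Weight that undoes the phase shift of the steered (desired) direction.
    w2 = cmath.exp(2j * math.pi * freq_hz * steer_delay_s)
    # Average the aligned signals: in-phase terms add, out-of-phase terms cancel.
    return abs(0.5 * (x1 + w2 * x2))


# Desired region: the source delay matches the steering delay.
desired_gain = delay_and_sum_gain(5000.0, 50e-6, 50e-6)
# Undesired region: the source arrives from the opposite side of the array.
undesired_gain = delay_and_sum_gain(5000.0, 50e-6, -50e-6)
```

With these assumed values the desired source passes at unit gain while the undesired source is cancelled. At lower frequencies a closely spaced microphone pair provides much less rejection with this simple scheme, which is one reason adaptive designs such as MVDR are often used instead.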
[0041] The wearable audio device 10 also includes a wind microphone 106. The wind microphone 106 is positioned on a portion of the wearable audio device 10 configured to be seated in the concha C of the wearer W such that the wind microphone 106 faces the floor F of the concha C. By positioning the wind microphone 106 within the concha C, the wind microphone 106 is effectively shielded from wind noise. Thus, in windy conditions, it may be preferable to use spoken voice audio captured by the wind microphone 106 rather than the first or second microphone 102, 104. In some examples, the wind microphone 106 may also be used as an input to a feedforward noise cancellation system. In a feedforward noise cancellation system, audio captured by the wind microphone may be used to remove unwanted noise from audio played for the wearer W via the acoustic transducer 185. A further (left side) view of the wearable audio device 10, showing the wind microphone 106, ear tip 14, and acoustic transducer 185 is shown in
[0042]
[0043]
[0044] As illustrated in
[0045] Each of the microphones 102, 104, 106 generates time domain electrical signals corresponding to the captured voice audio. The first microphone 102 generates a first microphone signal 108, the second microphone 104 generates a second microphone signal 110, and the wind microphone 106 generates a wind microphone signal 112. The first microphone signal 108, the second microphone signal 110, and the wind microphone signal 112 are then converted to the frequency domain by a Weighted Overlap-Add (WOLA) analysis filter bank, yielding first and second frequency domain microphone signals 208, 210 and a frequency domain wind microphone signal 212.
[0046] The first and second frequency domain microphone signals 208, 210 are then provided to a beamformer 114. As previously described, the beamformer 114 is used to achieve directional audio capture using the first and second microphones 102, 104. The beamformer 114 uses the first and second frequency domain microphone signals 208, 210 to generate a beamformed signal 216. In the example of
[0047] The frequency domain wind microphone signal 212 is provided to equalizer 130. The equalizer 130 is configured to attenuate portions of the frequency domain wind microphone signal 212 such that in a quiet, non-windy environment, the energy level of an equalized wind microphone signal 254 equals the energy level of the beamformed signal 216 for more accurate wind detection. One or more filter weights of the equalizer 130 may be programmable and/or dynamic.
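By way of non-limiting illustration, one way to obtain equalizer weights of this kind is to measure per-bin energies of both signals in a quiet, non-windy environment and scale the wind microphone bins so that the energies match. The disclosure does not specify how the weights are derived; the calibration procedure, the function names, and the small energy floor below are assumptions.

```python
import math


def calibrate_equalizer(beam_quiet_energy, wind_quiet_energy, floor=1e-12):
    """Per-bin amplitude gains that match wind-mic energy to beamformed energy.

    Hypothetical calibration step: both arguments are per-bin energies
    measured in quiet, non-windy conditions. The floor avoids division
    by zero for silent bins.
    """
    return [
        math.sqrt(beam / max(wind, floor))
        for beam, wind in zip(beam_quiet_energy, wind_quiet_energy)
    ]


def apply_equalizer(wind_bins, gains):
    """Scale each complex frequency bin of the wind microphone signal."""
    return [gain * value for gain, value in zip(gains, wind_bins)]


# Example: the wind microphone is four times as energetic as the beamformer
# output in each bin, so each bin amplitude is attenuated by a factor of two.
gains = calibrate_equalizer([1.0, 4.0], [4.0, 16.0])
equalized = apply_equalizer([2 + 0j, 4j], gains)
```

Because energy scales with the square of amplitude, the gain is the square root of the energy ratio. After equalization the two example bins carry energies of 1 and 4, matching the beamformed reference.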
[0048] The wind noise reduction system 100 then determines the energy levels of the beamformed signal 216 and the equalized wind microphone signal 254. A first energy detector 138 receives the beamformed signal 216. The first energy detector 138 analyzes the beamformed signal 216 using smooth energy envelope analysis to generate a beamformed energy level signal 242 corresponding to the energy level of the beamformed signal 216. Similarly, a second energy detector 140 receives the equalized wind microphone signal 254. The second energy detector 140 analyzes the equalized wind microphone signal 254 using smooth energy envelope analysis to generate a wind microphone energy level signal 244 corresponding to the energy level of the equalized wind microphone signal 254.
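By way of non-limiting illustration, an energy detector of this kind can be sketched as a one-pole (exponential) smoother applied to the energy of each frequency domain frame. The disclosure does not define "smooth energy envelope analysis"; the exponential smoother, the class name, and the smoothing coefficient are assumptions.

```python
class EnergyDetector:
    """Hypothetical smoothed energy tracker for frequency domain frames."""

    def __init__(self, alpha=0.9):
        # alpha close to 1.0 gives a slowly varying envelope (assumed value).
        self.alpha = alpha
        self.level = 0.0

    def update(self, bins):
        """Fold one frame of complex bins into the running energy level."""
        frame_energy = sum(abs(value) ** 2 for value in bins)
        self.level = self.alpha * self.level + (1.0 - self.alpha) * frame_energy
        return self.level


# Example: two identical frames, each with a total energy of 2.0.
detector = EnergyDetector(alpha=0.5)
first_level = detector.update([1 + 0j, 1j])
second_level = detector.update([1 + 0j, 1j])
```

The level rises toward the frame energy over successive frames rather than jumping to it, which keeps brief gusts or speech transients from toggling the downstream comparison. One such detector would be instantiated per signal.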
[0049] The beamformed energy level signal 242 and the wind microphone energy level signal 244 are then provided to a comparator 118. The comparator 118 generates a comparison signal 220 indicating whether the beamformed energy level signal 242 or the wind microphone energy level signal 244 is greater. In some examples, the comparison signal 220 may also indicate the degree of difference between the beamformed energy level signal 242 and the wind microphone energy level signal 244. In further examples, the comparison signal 220 may be frequency dependent, indicative of fluctuating energy levels over frequency. In even further examples, the comparator 118 may be focused on comparing the energy levels in a defined frequency range, such as 200 Hz to 2 kHz. The frequency range of 200 Hz to 2 kHz is an example of a frequency range where the wind microphone signal 112 may outperform (in terms of characteristics such as SNR or noise floor level) the beamformed signal 216 in windy conditions.
[0050] The equalized wind microphone signal 254 is then provided to a high pass filter 132. The high pass filter 132 is configured to remove or attenuate low frequency noise in windy conditions, generating a filtered wind microphone signal 256. One or more filter weights of the high pass filter 132 may be programmable and/or dynamic. Notably, for accurate energy level comparisons, a high pass filter is not applied to the equalized wind microphone signal 254 received by the second energy detector 140. Further, a high pass filter is not applied to the beamformed signal 216 to preserve low frequency aspects in non-windy conditions.
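By way of non-limiting illustration, a comparator focused on a defined frequency range can be sketched as a band-limited energy ratio expressed in decibels. The decibel representation, the function names, the 16 kHz sample rate, and the 64-point transform size are assumptions.

```python
import math


def band_bins(sample_rate_hz, fft_size, low_hz=200.0, high_hz=2000.0):
    """Indices of the non-negative-frequency bins inside [low_hz, high_hz]."""
    bin_hz = sample_rate_hz / fft_size
    return [k for k in range(fft_size // 2 + 1) if low_hz <= k * bin_hz <= high_hz]


def compare_levels(beam_levels, wind_levels, bins, floor=1e-12):
    """Hypothetical comparison signal: band energy ratio in decibels.

    A positive value means the beamformed energy exceeds the wind
    microphone energy in the band (suggesting wind); a negative value
    means the opposite. The magnitude gives the degree of difference.
    """
    beam = sum(beam_levels[k] for k in bins)
    wind = sum(wind_levels[k] for k in bins)
    return 10.0 * math.log10(max(beam, floor) / max(wind, floor))


# Example: 250 Hz bin spacing, so bins 1 through 8 cover 250 Hz to 2 kHz.
bins = band_bins(16000, 64)
comparison_db = compare_levels([2.0] * 33, [1.0] * 33, bins)
```

In this example the beamformed energy is twice the wind microphone energy in every bin, giving a comparison value of roughly 3 dB. A frequency dependent comparison signal could instead be produced by evaluating the same ratio separately for each bin.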
[0051] The comparison signal 220, the beamformed signal 216, and the filtered wind microphone signal 256 are provided to a voice mixer 122. The voice mixer 122 may function as a crossfader to generate a frequency domain output voice signal 224 by switching between or blending the beamformed signal 216 and the filtered wind microphone signal 256 based on the comparison signal 220. As the comparison signal 220 will change based on the beamformed energy level signal 242 and the wind microphone energy level signal 244, the switching or blending settings of the voice mixer 122 will change accordingly. Thus, the voice mixer 122 can be considered to be a dynamic voice mixer.
[0052] In one example, the voice mixer 122 is configured to switch back and forth between the beamformed signal 216 and the filtered wind microphone signal 256 to generate the output voice signal 224. If the comparison signal 220 indicates that the energy level of the beamformed signal 216 is significantly higher than the energy level of the equalized wind microphone signal 254 (corresponding to windy conditions), the frequency domain output voice signal 224 may correspond to the filtered wind microphone signal 256. If the comparison signal 220 indicates that the energy level of the beamformed signal 216 decreases to be significantly less than the energy level of the equalized wind microphone signal 254 (corresponding to non-windy conditions), the frequency domain output voice signal 224 may switch to correspond to the beamformed signal 216. In some examples, this switching may be limited to frequency ranges where the wind microphone 106 (positioned in the concha C of the wearer W) performs significantly better than the beamformer 114 in windy conditions. By dynamically switching back and forth, the voice mixer 122 adapts the frequency domain output voice signal 224 to use the beamformed signal 216 in non-windy conditions and the filtered wind microphone signal 256 in windy conditions, improving performance over the entire applicable frequency range relative to using either signal alone.
[0053] In one example, this switching may be confined to a defined frequency range. For example, if the energy level of the beamformed signal 216 is significantly higher than the energy level of the equalized wind microphone signal 254 (indicating windy conditions), the frequency domain output voice signal 224 may be configured to correspond to the filtered wind microphone signal 256 below 2 kHz, while corresponding to the beamformed signal 216 above 2 kHz, as the impact of wind noise on the beamformed signal is most severe below 2 kHz.
[0054] In other examples, rather than switching between the beamformed signal 216 and the filtered wind microphone signal 256, the frequency domain output voice signal 224 may be a blend of the two, analogous to a blended crossfade. If the comparison signal 220 indicates the energy level of the beamformed signal 216 is higher than the energy level of the equalized wind microphone signal 254 (indicating windy conditions) by a ratio of 2 to 1, the frequency domain output voice signal 224 may be a blend of the beamformed signal 216 and the filtered wind microphone signal 256 at a ratio of 1 to 2. As with the previous examples, this blend of the beamformed signal 216 and the filtered wind microphone signal 256 may be confined to a defined frequency range within the frequency domain output voice signal 224, such as below 2 kHz. In some examples, the ratio of the beamformed signal 216 to the filtered wind microphone signal 256 may vary over frequency.
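By way of non-limiting illustration, the back-and-forth switching described above can be sketched as a two-state selector with separate thresholds for entering and leaving the windy state. The disclosure says only that the energy levels differ "significantly"; the decibel comparison value, the 6 dB thresholds, and the class name are assumptions.

```python
class SwitchingMixer:
    """Hypothetical two-state voice mixer with hysteresis."""

    def __init__(self, enter_wind_db=6.0, leave_wind_db=-6.0):
        # Assumed thresholds on a comparison value in decibels, where a
        # positive value means the beamformed energy exceeds the wind
        # microphone energy.
        self.enter_wind_db = enter_wind_db
        self.leave_wind_db = leave_wind_db
        self.use_wind = False

    def process(self, comparison_db, beam_bins, wind_bins):
        """Return the frame selected by the current wind state."""
        if comparison_db > self.enter_wind_db:
            self.use_wind = True  # beamformed energy significantly higher
        elif comparison_db < self.leave_wind_db:
            self.use_wind = False  # beamformed energy significantly lower
        # Between the thresholds the previous selection is held.
        return list(wind_bins) if self.use_wind else list(beam_bins)


mixer = SwitchingMixer()
selected = [
    mixer.process(level_db, [1.0], [2.0])
    for level_db in (0.0, 10.0, 0.0, -10.0)
]
```

Holding the previous selection between the two thresholds avoids rapid toggling when the comparison value hovers near a single threshold. The leave threshold is a tuning choice; a value nearer 0 dB would return to the beamformed signal sooner after the wind subsides.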
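By way of non-limiting illustration, confining the switching to frequencies below 2 kHz can be sketched as a per-bin selection. The function name, the boolean wind flag, the 16 kHz sample rate, and the 16-point transform size are assumptions.

```python
def band_limited_switch(beam_bins, wind_bins, windy, sample_rate_hz, fft_size,
                        crossover_hz=2000.0):
    """Hypothetical per-bin selector for one frequency domain frame.

    When windy is True, bins below crossover_hz are taken from the wind
    microphone signal; all other bins come from the beamformed signal.
    """
    bin_hz = sample_rate_hz / fft_size
    output = []
    for k, (beam, wind) in enumerate(zip(beam_bins, wind_bins)):
        below_crossover = k * bin_hz < crossover_hz
        output.append(wind if (windy and below_crossover) else beam)
    return output


# Example: 1 kHz bin spacing, so only bins 0 and 1 lie below 2 kHz.
beam_frame = [1.0] * 9
wind_frame = [2.0] * 9
windy_out = band_limited_switch(beam_frame, wind_frame, True, 16000, 16)
calm_out = band_limited_switch(beam_frame, wind_frame, False, 16000, 16)
```

In windy conditions only the two lowest bins of the example frame are replaced by wind microphone content, while the remaining bins keep the directional benefit of the beamformer.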
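By way of non-limiting illustration, one mapping consistent with the 2 to 1 example above weights each signal by the other signal's share of the total energy, so that the noisier signal contributes less. The disclosure does not specify the mapping; the formula and the function names are assumptions.

```python
def blend_weights(beam_level, wind_level):
    """Hypothetical crossfade weights (beam_weight, wind_weight).

    Each signal is weighted by the other signal's share of the total
    energy. Falls back to the beamformed signal when both are silent.
    """
    total = beam_level + wind_level
    if total <= 0.0:
        return 1.0, 0.0
    return wind_level / total, beam_level / total


def blend(beam_bins, wind_bins, beam_level, wind_level):
    """Mix two frames of complex bins using the crossfade weights."""
    beam_weight, wind_weight = blend_weights(beam_level, wind_level)
    return [
        beam_weight * beam + wind_weight * wind
        for beam, wind in zip(beam_bins, wind_bins)
    ]


# Example: beamformed energy is twice the wind microphone energy (2 to 1),
# so the output is one part beamformed signal to two parts wind microphone.
beam_weight, wind_weight = blend_weights(2.0, 1.0)
mixed = blend([3 + 0j], [6 + 0j], 2.0, 1.0)
```

Note that this simple mapping yields an equal blend when the two energy levels match, so a practical mixer would likely clamp or bias the weights toward the beamformed signal in non-windy conditions. Computing the weights separately for each bin would give a ratio that varies over frequency.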
[0055] Once generated, the frequency domain output voice signal 224 may be provided to additional circuitry for further processing in the frequency domain. Alternatively, the frequency domain output voice signal 224 may be converted into the time domain by a WOLA synthesis filter bank 150. The time domain output voice signal 124 may then be further processed before being transmitted, via a transceiver 195, to a peripheral device, such as a smartphone, for use in a telephone call or related application.
[0056] Aside from wind noise reduction, the (time domain) wind microphone signal 112 may also be used for feedforward noise cancellation to reduce noise played back to the wearer W via the acoustic transducer 185. The wind microphone signal 112 may be provided to a feedforward noise cancellation controller 134. The feedforward noise cancellation controller 134 then generates, based on the wind microphone signal 112, an anti-noise signal 146 which is provided to the acoustic transducer 185 to cancel out noise captured by the wind microphone 106, such as audible noise at the concha C of the wearer W.
[0057]
[0058] In
[0059]
[0060] According to another example, the method 900 may further include the optional steps of: (1) generating 908, via a first microphone, the first microphone signal; (2) generating 910, via a second microphone, the second microphone signal; and (3) generating 912, via a wind microphone arranged on a portion of a wearable audio device configured to be disposed in a concha of a wearer and facing a floor of the concha, the wind microphone signal.
[0061] All definitions, as defined and used herein, should be understood to control over dictionary definitions, definitions in documents incorporated by reference, and/or ordinary meanings of the defined terms.
[0062] The indefinite articles "a" and "an," as used herein in the specification and in the claims, unless clearly indicated to the contrary, should be understood to mean "at least one."
[0063] The phrase "and/or," as used herein in the specification and in the claims, should be understood to mean "either or both" of the elements so conjoined, i.e., elements that are conjunctively present in some cases and disjunctively present in other cases. Multiple elements listed with "and/or" should be construed in the same fashion, i.e., "one or more" of the elements so conjoined. Other elements may optionally be present other than the elements specifically identified by the "and/or" clause, whether related or unrelated to those elements specifically identified.
[0064] As used herein in the specification and in the claims, "or" should be understood to have the same meaning as "and/or" as defined above. For example, when separating items in a list, "or" or "and/or" shall be interpreted as being inclusive, i.e., the inclusion of at least one, but also including more than one, of a number or list of elements, and, optionally, additional unlisted items. Only terms clearly indicated to the contrary, such as "only one of" or "exactly one of," or, when used in the claims, "consisting of," will refer to the inclusion of exactly one element of a number or list of elements. In general, the term "or" as used herein shall only be interpreted as indicating exclusive alternatives (i.e. "one or the other but not both") when preceded by terms of exclusivity, such as "either," "one of," "only one of," or "exactly one of."
[0065] As used herein in the specification and in the claims, the phrase "at least one," in reference to a list of one or more elements, should be understood to mean at least one element selected from any one or more of the elements in the list of elements, but not necessarily including at least one of each and every element specifically listed within the list of elements and not excluding any combinations of elements in the list of elements. This definition also allows that elements may optionally be present other than the elements specifically identified within the list of elements to which the phrase "at least one" refers, whether related or unrelated to those elements specifically identified.
[0066] It should also be understood that, unless clearly indicated to the contrary, in any methods claimed herein that include more than one step or act, the order of the steps or acts of the method is not necessarily limited to the order in which the steps or acts of the method are recited.
[0067] In the claims, as well as in the specification above, all transitional phrases such as "comprising," "including," "carrying," "having," "containing," "involving," "holding," "composed of," and the like are to be understood to be open-ended, i.e., to mean including but not limited to. Only the transitional phrases "consisting of" and "consisting essentially of" shall be closed or semi-closed transitional phrases, respectively.
[0068] The above-described examples of the described subject matter can be implemented in any of numerous ways. For example, some aspects may be implemented using hardware, software or a combination thereof. When any aspect is implemented at least in part in software, the software code can be executed on any suitable processor or collection of processors, whether provided in a single device or computer or distributed among multiple devices/computers.
[0069] The present disclosure may be implemented as a system, a method, and/or a computer program product at any possible technical detail level of integration. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present disclosure.
[0070] The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
[0071] Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.
[0072] Computer readable program instructions for carrying out operations of the present disclosure may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, configuration data for integrated circuitry, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++, or the like, and procedural programming languages, such as the C programming language or similar programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some examples, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present disclosure.
[0073] Aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to examples of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.
[0074] The computer readable program instructions may be provided to a processor of a special purpose computer or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.
[0075] The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
[0076] The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various examples of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the Figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.
[0077] Other implementations are within the scope of the following claims and other claims to which the applicant may be entitled.
[0078] While various examples have been described and illustrated herein, those of ordinary skill in the art will readily envision a variety of other means and/or structures for performing the function and/or obtaining the results and/or one or more of the advantages described herein, and each of such variations and/or modifications is deemed to be within the scope of the examples described herein. More generally, those skilled in the art will readily appreciate that all parameters, dimensions, materials, and configurations described herein are meant to be exemplary and that the actual parameters, dimensions, materials, and/or configurations will depend upon the specific application or applications for which the teachings is/are used. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific examples described herein. It is, therefore, to be understood that the foregoing examples are presented by way of example only and that, within the scope of the appended claims and equivalents thereto, examples may be practiced otherwise than as specifically described and claimed. Examples of the present disclosure are directed to each individual feature, system, article, material, kit, and/or method described herein. In addition, any combination of two or more such features, systems, articles, materials, kits, and/or methods, if such features, systems, articles, materials, kits, and/or methods are not mutually inconsistent, is included within the scope of the present disclosure.