Controlling wind noise in a bilateral microphone array
09843861 · 2017-12-12
Assignee
Inventors
Cpc classification
H04R2201/107
ELECTRICITY
H04R2410/07
ELECTRICITY
International classification
Abstract
A pair of earphones have microphone arrays each providing a plurality of microphone signals. A processor receives the microphone signals and applies a first set of filters to a subset of the plurality of microphone signals from each of the arrays, the first set of filters inverting the signals below a cutoff frequency, and provides the first-filtered signals and the remainder of the microphone signals from each of the arrays to a second set of filters. The processor uses the second set of filters to combine the signals to generate a far-field signal that is more sensitive to sounds originating a short distance away from the earphones than to sounds close to the earphones above the cutoff frequency, and omnidirectional below the cutoff frequency, determines a level of wind noise present in the microphone signals, and adjusts the cutoff frequency as a function of the determined level of wind noise.
Claims
1. An apparatus comprising: a first earphone having a first microphone array providing a first plurality of microphone signals, and a first speaker; a second earphone having a second microphone array providing a second plurality of microphone signals, and a second speaker; and a processor receiving the first plurality of microphone signals and second plurality of microphone signals, and configured to: apply a first set of filters to a subset of the plurality of microphone signals from each of the first microphone array and the second microphone array, the first set of filters inverting the signals below a cutoff frequency; provide the first-filtered signals and the remainder of the microphone signals from each of the first microphone array and the second microphone array to a second set of filters; use the second set of filters to combine the microphone signals to generate a far-field signal that is more sensitive to sounds originating a short distance away from the apparatus than to sounds close to the apparatus above the cutoff frequency, and omnidirectional below the cutoff frequency; determine a level of wind noise present in the microphone signals; adjust the cutoff frequency as a function of the determined level of wind noise; and provide the far-field signal to the speakers for output.
2. The apparatus of claim 1, wherein the processor is further configured to: after generating the far-field signal in the second set of filters, apply gain to the output of the filters below a second cutoff frequency which is a function of the first cutoff frequency.
3. The apparatus of claim 1, wherein the processor is further configured to: after generating the far-field signal in the first set of filters, apply a high-pass filter to the output of the filters.
4. The apparatus of claim 1, wherein the processor is further configured to: determine a total low-frequency energy present in the microphone signals; and upon determining that the total sound level is below a first threshold, and the level of wind noise is below a second threshold, increase the cutoff frequency of the first set of filters.
5. The apparatus of claim 1, wherein generating the far-field signal comprises, in the processor: determining a total low-frequency energy present in the microphone signals; computing a sum of the microphone signals; computing a difference of the microphone signals; comparing the sum of the microphone signals to the difference of the microphone signals and to the total low-frequency energy; and determining the cutoff frequency based on the results of the comparison.
6. The apparatus of claim 5, wherein computing the difference of the microphone signals comprises: computing a first difference of microphone signals in the first plurality of microphone signals, computing a second difference of microphone signals in the second plurality of microphone signals, and computing a difference of the first difference and the second difference as the difference of the microphone signals.
7. An apparatus comprising: a first earphone having a first microphone array providing a first plurality of microphone signals, and a first speaker; a second earphone having a second microphone array providing a second plurality of microphone signals, and a second speaker; and a processor receiving the first plurality of microphone signals and second plurality of microphone signals, and configured to: use a first set of filters to combine the microphone signals to generate a far-field signal that is more sensitive to sounds originating a short distance away from the apparatus than to sounds close to the apparatus above a cutoff frequency, and omnidirectional below the cutoff frequency; determine a level of wind noise present in the microphone signals; adjust the cutoff frequency as a function of the determined level of wind noise; provide the far-field signal to the speakers for output; use a second set of filters to combine the microphone signals to generate a near-field signal that is more sensitive to voice signals from a person wearing the earphones than to sounds originating away from the apparatus; combine the microphone signals to generate an omnidirectional signal; combine the near-field signal and the omnidirectional signal using a weighted sum, the weight being a function of the determined level of wind noise to generate a communication signal; and provide the communication signal to a communication system.
8. The apparatus of claim 7, wherein the processor is configured to: determine the level of wind noise for adjusting the cutoff frequency based on a comparison of a sum of the microphone signals to a difference of the microphone signals; and determine the level of wind noise for adjusting the weight applied to the near field signal in the communication signal based on a comparison of the near field signal to the omnidirectional signal.
9. The apparatus of claim 7, wherein generating the far-field signal comprises, in the processor: applying an all-pass filter to a subset of the plurality of microphone signals from each of the first microphone array and the second microphone array, the all-pass filter inverting the signals below the cutoff frequency; and providing the all-pass-filtered signals and the remainder of the microphone signals from each of the first microphone array and the second microphone array to the first set of filters.
10. The apparatus of claim 7, wherein generating the near-field signal and omnidirectional signal comprises, in the processor: applying a third set of filters to a first subset of the plurality of microphone signals from each of the first microphone array and the second microphone array; applying a fourth set of filters to a second subset of the plurality of microphone signals from each of the first microphone array and the second microphone array; combining the filtered first subset with the filtered second subset to generate the near-field signal; and summing the first subset and the second subset to generate the omnidirectional signal.
11. The apparatus of claim 10, wherein generating the near-field signal and omnidirectional signal further comprises: summing the first subset and providing the summed first subset to the third set of filters; summing the second subset and providing the summed second subset to the fourth set of filters; summing the summed first subset and the second summed subset to generate the omnidirectional signal.
12. The apparatus of claim 10, wherein the processor comprises a plurality of sub-processors, and the summing of the first and second subsets is performed by a separate sub-processor from the applying of the third and fourth filters and combining of the filtered subsets.
13. A method comprising, in a processor: receiving, from a first earphone having a first microphone array, a first plurality of microphone signals; receiving, from a second earphone having a second microphone array, a second plurality of microphone signals; and applying a first set of filters to a subset of the plurality of microphone signals from each of the first microphone array and the second microphone array, the first set of filters inverting the signals below a cutoff frequency; providing the first-filtered signals and the remainder of the microphone signals from each of the first microphone array and the second microphone array to a second set of filters; using the second set of filters to combine the microphone signals to generate a far-field signal that is more sensitive to sounds originating a short distance away from the earphones than to sounds close to the apparatus above the cutoff frequency, and omnidirectional below the cutoff frequency; determining a level of wind noise present in the microphone signals; adjusting the cutoff frequency as a function of the determined level of wind noise; and providing the far-field signal to first and second speakers in the respective first and second earphones for output.
14. The method of claim 13, further comprising, in the processor: after generating the far-field signal in the second set of filters, applying gain to the output of the filters below a second cutoff frequency.
15. The method of claim 13, further comprising, in the processor: after generating the far-field signal in the first set of filters, applying a high-pass filter to the output of the filters.
16. The method of claim 13, further comprising, in the processor: determining a total sound level present in the microphone signals; and upon determining that the total sound level is below a first threshold, and the level of wind noise is below a second threshold, increasing the cutoff frequency of the first set of filters.
17. A method comprising, in a processor: receiving, from a first earphone having a first microphone array, a first plurality of microphone signals; receiving, from a second earphone having a second microphone array, a second plurality of microphone signals; using a first set of filters to combine the microphone signals to generate a far-field signal that is more sensitive to sounds originating a short distance away from the apparatus than to sounds close to the apparatus above a cutoff frequency, and omnidirectional below the cutoff frequency; determining a level of wind noise present in the microphone signals; adjusting the cutoff frequency as a function of the determined level of wind noise; providing the far-field signal to first and second speakers in the respective first and second earphones for output; using a second set of filters to combine the microphone signals to generate a near-field signal that is more sensitive to voice signals from a person wearing the earphones than to sounds originating away from the earphones; combining the microphone signals to generate an omnidirectional signal; combining the near-field signal and the omnidirectional signal using a weighted sum, the weight being a function of the determined level of wind noise to generate a communication signal; and providing the communication signal to a communication system.
18. The method of claim 17, further comprising, in the processor: determining the level of wind noise for adjusting the cutoff frequency based on a comparison of a sum of the microphone signals to a difference of the microphone signals; and determining the level of wind noise for adjusting the weight applied to the near field signal in the communication signal based on a comparison of the near field signal to the omnidirectional signal.
19. The method of claim 17, wherein generating the far-field signal comprises, in the processor: applying an all-pass filter to a subset of the plurality of microphone signals from each of the first microphone array and the second microphone array, the all-pass filter inverting the signals below the cutoff frequency; and providing the all-pass-filtered signals and the remainder of the microphone signals from each of the first microphone array and the second microphone array to the first set of filters.
20. The method of claim 17, wherein generating the near-field signal and omnidirectional signal comprises: applying a third set of filters to a first subset of the plurality of microphone signals from each of the first microphone array and the second microphone array; applying a fourth set of filters to a second subset of the plurality of microphone signals from each of the first microphone array and the second microphone array; combining the filtered first subset with the filtered second subset to generate the near-field signal; summing the first subset and the second subset to generate the omnidirectional signal.
21. The method of claim 20, wherein generating the near-field signal and omnidirectional signal further comprises: summing the first subset and providing the summed first subset to the third set of filters; summing the second subset and providing the summed second subset to the fourth set of filters; summing the summed first subset and the second summed subset to generate the omnidirectional signal.
22. The method of claim 20, wherein the processor comprises a plurality of sub-processors, and the summing of the first and second subsets is performed by a separate sub-processor from the applying of the third and fourth filters and combining of the filtered subsets.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1)
(2)
DESCRIPTION
(3) In a new headphone architecture shown in
(4) The processor 112 applies a number of configurable filters to the signals from the various microphones. The provision of a high-bandwidth communication channel from all four microphones 126, 128, 130, 132, two located at each ear, to a shared processing system provides new opportunities in both local conversation assistance and communication with a remote person or system. Specifically, as shown in
(5) A third set of filters 206 is used to combine the four microphone signals to form a near-field array optimized for detecting the user's own voice. When we say the array is optimized for detecting the user's own voice, we mean that the sensitivity of the array to signals originating from the user's mouth is greater than the sensitivity to sounds originating farther from the headphones. Even with the microphones 126, 128, 130, 132 physically arranged to optimize far-field pickup in front of the user, the combination of all four microphones has been found to provide near-field voice performance at least as good as, and in some cases better than, a two-microphone array in the same earbud location but physically aimed at the user's mouth.
(6) In some examples, yet another set of filters 208 is used for providing the user's voice back to the user himself, commonly called side-tone. The side-tone voice signal may be filtered differently from the outbound voice signal to account for the effect of the earphone's acoustics on the user's perception of his or her own voice. Finally, active noise reduction (ANR) filters 210, 212 for each ear use at least one of the local microphones to produce noise-cancelling signals. The ANR filters may use one or both external microphones and the feedback microphone for each ear to cancel ambient noise. In some examples, the external microphones from the opposite ear may also be used for ANR in each ear.
(7) The ANR signals, far-field array signals, side-tone signals, and any incoming communication or entertainment signals (not shown) are summed for each ear. As shown in
(8) In some examples, as shown in
(9) Far-Field Filtering
(10) An example topology for far-field microphone processing is shown in
(11) The particular filters and related signal processing for generating the far-field signals for output to the left and right ear are described in application U.S. 2015/0230026, incorporated by reference above. All of the filtering, summing, equalizing, and processing shown in
(12) Near-Field Communication Filters
(13) As noted above, even with the four microphones physically arranged to optimize far-field voice pickup, when all four are combined, they also produce good near-field voice signals for communication purposes. Previous communication headsets have combined two microphones to improve detection of the user's voice, for example, in a beam-forming array aimed at the user's mouth. At a high level, the same type of processing shown in
(14) Side-Tone Filters
(15) In headsets that block the user's ear, hearing their own voice played back can help the user control the level at which they speak, and feel more comfortable talking into the headset. As anyone who has listened to a recording of themselves can relate, however, simply providing the outbound communication signal to the user's ear may not sound natural. This is even more pronounced due to the way the earphones 102, 104 change how the user perceives their own voice. U.S. Pat. No. 9,020,160, incorporated here by reference, discusses ways of filtering feedback and feed-forward microphone signals to produce a self-voice signal that sounds more natural. These techniques can be used in the present architecture either using all four microphones, as shown by filter 208 in
(16) In a simplified example, such as in the example of
(17) Wind-Noise Mitigation
(18) As noted above, two microphones have previously been used as beam-forming arrays to detect the user's voice. In other examples, as described in U.S. Pat. No. 8,620,650, incorporated here by reference, two microphone signals can be combined to optimize rejection of ambient and wind noise. This can be adapted to the example of
(19) The far-field array signal is also susceptible to wind noise, but different processing is used to manage it. In some examples, as shown in
(20) A second set of wind filters 624 is applied after the far-field array processing 204. This second set of wind filters does two things: it decreases low-frequency gain, and it applies a high-pass filter. In the normal far-field array processing, high gain is applied at lower frequencies to account for the loss of energy due to the directionality of the array. As the sensitivity at lower frequencies is shifted to being omnidirectional, this energy is restored and the gain can be reduced. The cutoff frequency of this low-frequency gain is based on the cutoff frequency of the all-pass filters 622, but may not be exactly the same frequency. At the same time, the high-pass filter removes whatever residual wind noise is still picked up; at particularly high wind levels, this may be more effective than the other techniques. As the wind level increases, both the low-frequency gain cutoff frequency and the high-pass filter cutoff frequency are raised, following the rising inversion frequency of the wind pre-filters.
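The inversion behavior of the all-pass wind pre-filters can be illustrated with a minimal Python sketch. It assumes a first-order analog all-pass response H(s) = (s - wc)/(s + wc), which has unity magnitude at every frequency, a gain near -1 well below the cutoff, and a gain near +1 well above it. The filter order is not specified in the text, and the function names, the normalized wind level, and the 100 Hz to 1 kHz cutoff range are hypothetical choices made only for illustration. A simple two-microphone difference stands in for the far-field array filters.

```python
import math


def allpass_response(freq_hz, cutoff_hz):
    """Complex response of the assumed first-order all-pass H(s) = (s - wc) / (s + wc)."""
    s = 1j * 2.0 * math.pi * freq_hz
    wc = 2.0 * math.pi * cutoff_hz
    return (s - wc) / (s + wc)


def pair_response(freq_hz, cutoff_hz):
    """Response of a front-minus-filtered-rear pair to a sound that reaches
    both microphones in phase (an idealization of a long-wavelength sound).

    Below the cutoff the all-pass inverts the rear signal, so the difference
    acts as a sum (omnidirectional). Above the cutoff the rear signal passes
    uninverted, so the pair subtracts (directional) and in-phase sound cancels.
    """
    return 1.0 - allpass_response(freq_hz, cutoff_hz)


def cutoff_for_wind(wind_level, min_hz=100.0, max_hz=1000.0):
    """Hypothetical mapping from a wind level in [0, 1] to the inversion cutoff.

    Interpolates on a logarithmic frequency scale so that more wind raises
    the cutoff, as the text describes. The frequency range is an assumption.
    """
    level = min(max(wind_level, 0.0), 1.0)
    return min_hz * (max_hz / min_hz) ** level


if __name__ == "__main__":
    for wind in (0.0, 0.5, 1.0):
        fc = cutoff_for_wind(wind)
        low = abs(pair_response(fc / 100.0, fc))
        high = abs(pair_response(fc * 100.0, fc))
        print(f"wind={wind:.1f} cutoff={fc:7.1f} Hz  |pair| low={low:.3f} high={high:.3f}")
```

In this idealization the pair's magnitude approaches 2 well below the cutoff and 0 well above it, so raising the cutoff with the wind level moves more of the band into the summed, omnidirectional regime.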
(21) Mitigation of White Noise Gain at Low Frequencies
(22) In some examples, also shown in
(23) Bilateral Wind Mitigation
(24) Rather than combining the left and right microphone signals, as mentioned above in the discussion of near-field voice pickup, the wind-vs-ambient noise mixing algorithm used for the near-field signal can also be adapted to use separate left and right microphone signals to optimize rejection of noise that is asymmetric in the far-field microphone signal, e.g., if wind is striking the user from one side more than the other. In this example, as shown in
(25) The summing and comparison can be done in each of the array processors (assuming there are two, as in some of the examples), or done in one of them and a control signal provided to the other. If the communication processor were provided with all four microphone signals, rather than with the pre-summed front and rear signal pairs, then a similar left/right wind noise control could be applied to the near-end voice signal in combination with the omnidirectional/directional wind noise control shown in
(26) Simultaneous Operation
(27) With sufficient processing power, the different sets of filters can be used in parallel to simultaneously produce the near-field and far-field signals. This allows the user to hear his or her own voice and a conversation partner's voice simultaneously (i.e., if they are talking over each other), or to talk on the wireless connection at the same time as listening to another person. Aside from simply multitasking, the latter can be useful if more than one person in a conversation is using a device such as the one described herein. See, for example, U.S. Pat. No. 9,190,043, the entire contents of which are incorporated here by reference. Each of the multiple headsets can transmit its user's locally-detected voice, from the near-field filters, to the other headsets, where it can be combined with the results of that headset's far-field filters to provide the user with a complete set of their conversation partners' voices.
(28) The simultaneous detection of near-field and far-field voice can also be useful where the near-field is not being used for conversation. For example, if the headset implements or is connected to a voice personal assistant (VPA), the near-field signal can be directed to that system, or to a wake-up word detection process. The near-field signal should provide a higher signal-to-noise ratio for this than simply using ambient microphones.
(29) The near-field and far-field signals can also be compared to each other. One result of this comparison could be to estimate the proximity of the dominant signal—if the correlation of the two is high, it is the user speaking. This can be used for a voice activity detector, or to change other noise reduction algorithms, to name two examples.
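One way to picture the comparison described above is a normalized correlation computed over a short frame of each signal. The following Python sketch is illustrative only: the zero-lag correlation measure, the function names, and the 0.8 threshold are assumptions, since the text does not specify how the correlation is computed or what counts as high.

```python
import math


def normalized_correlation(near, far):
    """Zero-lag normalized correlation of two equal-length frames, in [-1, 1]."""
    if len(near) != len(far):
        raise ValueError("frames must have the same length")
    energy_near = sum(x * x for x in near)
    energy_far = sum(y * y for y in far)
    if energy_near == 0.0 or energy_far == 0.0:
        return 0.0  # a silent frame carries no evidence either way
    cross = sum(x * y for x, y in zip(near, far))
    return cross / math.sqrt(energy_near * energy_far)


def user_is_speaking(near, far, threshold=0.8):
    """Hypothetical voice-activity decision: high correlation between the
    near-field and far-field frames is taken to mean the user is speaking.
    The threshold value is an assumption, not taken from the text."""
    return normalized_correlation(near, far) >= threshold


if __name__ == "__main__":
    frame = 200
    voice = [math.sin(2.0 * math.pi * 5.0 * n / frame) for n in range(frame)]
    attenuated = [0.5 * v for v in voice]  # same sound, lower level
    unrelated = [math.cos(2.0 * math.pi * 5.0 * n / frame) for n in range(frame)]
    print(user_is_speaking(voice, attenuated))
    print(user_is_speaking(voice, unrelated))
```

Because the measure is normalized by the energy of each frame, a level difference between the two array outputs does not by itself lower the correlation.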
(30) In the particular example of
(31) Embodiments of the systems and methods described above comprise computer components and computer-implemented steps that will be apparent to those skilled in the art. For example, it should be understood by one of skill in the art that the computer-implemented steps may be stored as computer-executable instructions on a computer-readable medium such as, for example, flash ROMs, nonvolatile ROM, and RAM. Furthermore, it should be understood by one of skill in the art that the computer-executable instructions may be executed on a variety of processors such as, for example, microprocessors, digital signal processors, gate arrays, etc. For ease of exposition, not every step or element of the systems and methods described above is described herein as part of a computer system, but those skilled in the art will recognize that each step or element may have a corresponding computer system or software component. Such computer system and/or software components are therefore enabled by describing their corresponding steps or elements (that is, their functionality), and are within the scope of the disclosure.
(32) A number of implementations have been described. Nevertheless, it will be understood that additional modifications may be made without departing from the scope of the inventive concepts described herein, and, accordingly, other embodiments are within the scope of the following claims.