SYSTEM AND METHOD FOR MANAGING A HEADPHONES USERS SOUND EXPOSURE
20220225048 · 2022-07-14
Inventors
CPC classification
G06F17/156
PHYSICS
H04R5/04
ELECTRICITY
H04R1/1041
ELECTRICITY
H04S7/301
ELECTRICITY
H04R2430/01
ELECTRICITY
International classification
Abstract
A method and system for managing a user's sound exposure are described herein. The method includes collecting raw audio data, calculating spectral data and a sound pressure level (SPL) from the raw audio data, comparing the calculated SPL to a predetermined threshold level of sound exposure, and applying one or more modifications to the SPL in response to a determination that the SPL is approaching the predetermined threshold level of sound exposure, to ensure the SPL never reaches the predetermined threshold level of sound exposure.
Claims
1. A method executed by an audio playback device for managing a user sound exposure, the method comprising: collecting raw audio data; calculating SPL data from the raw audio data; comparing the calculated SPL to a predetermined threshold level of sound exposure; and in response to a determination that the SPL is approaching the predetermined threshold level of sound exposure, applying one or more modifications to the SPL to reduce the SPL to ensure the SPL never reaches the predetermined threshold level of sound exposure.
2. The method of claim 1, wherein the raw audio data comprises a first set of raw audio data from a first data source, a second set of raw audio data from a second data source, and a third set of raw audio data from another data source, and wherein the first source differs from the second source.
3. The method of claim 2, wherein the first data source is selected from the group consisting of: one or more internally facing microphones on the audio playback device and a Bluetooth chip memory of the audio playback device, and wherein the second data source comprises one or more externally facing microphones mounted on or nearby the audio playback device.
4. The method of claim 2, wherein the first set of raw audio data and the second set of raw audio data are obtained simultaneously.
5. The method of claim 2, wherein the first set of raw audio data and the second set of raw audio data are obtained at varying sample rates and sampling intervals.
6. The method of claim 1, wherein the calculation of the SPL occurs via a fast Fourier transform (FFT) process.
7. The method of claim 2, wherein the calculation of the SPL from the raw audio data comprises: applying an algorithm or process to the first set of raw audio data to form a first set of calculated data; and applying the algorithm or the process to the second set of raw audio data to form a second set of calculated data.
8. The method of claim 7, wherein the calculation of the SPL from the raw audio data calculation further comprises: storing the first set of calculated data in a first level data array; storing the second set of calculated data in a second level data array; and producing a cumulative array comprising an integral of the first level data array and the second level data array.
9. The method of claim 8, wherein the calculation of the SPL from the raw audio data calculation further comprises: arranging the first level data array, the second level data array, and the cumulative array as a circular buffer such that once the first level data array or the second level data array is filled, newest data overwrites oldest data; and storing the first level data array, the second level data array, and the cumulative array in a non-volatile memory of the audio playback device.
10. The method of claim 8, wherein each of the first level data array and the second level data array are stored with a timestamp.
11. The method of claim 10, wherein the timestamp is an absolute time derived from a real-time clock or a relative time derived as an incremental value from a defined starting point.
12. The method of claim 1, further comprising: transmitting the raw audio data, the spectral data, and/or the SPL to another device.
13. The method of claim 1, wherein each modification of the one or more modifications is selected from the group consisting of: active noise control (ANC), equalization (EQ), and a sound gain filter.
14. The method of claim 1, wherein the audio playback device comprises headphones.
15. The method of claim 1, further comprising: predicting a trend of the SPL.
16. A method executed by an audio playback device for managing a user sound exposure, the method comprising: collecting raw audio data; calculating SPL data from the raw audio data; comparing the calculated SPL to a predetermined threshold level of sound exposure; in response to a determination that the SPL is approaching the predetermined threshold level of sound exposure, applying one or more modifications to the SPL to reduce the SPL to ensure the SPL never reaches the predetermined threshold level of sound exposure, wherein each modification of the one or more modifications is selected from the group consisting of: active noise control (ANC), equalization (EQ), and a sound gain filter; and transmitting the raw audio data, the spectral data, and/or the SPL to another device.
17. The method of claim 16, wherein the calculation of the SPL from the raw audio data comprises: applying an algorithm or process to the first set of raw audio data to form a first set of calculated data; and applying the algorithm or the process to the second set of raw audio data to form a second set of calculated data.
18. The method of claim 17, wherein the calculation of the SPL from the raw audio data calculation further comprises: storing the first set of calculated data in a first level data array; storing the second set of calculated data in a second level data array; and producing a cumulative array comprising an integral of the first level data array and the second level data array.
19. The method of claim 18, wherein each of the first level data array and the second level data array comprise information regarding momentary SPLs, and wherein the cumulative array provides information regarding a total sound energy over a time period.
20. The method of claim 18, wherein the calculation of the SPL from the raw audio data calculation further comprises: arranging the first level data array, the second level data array, and the cumulative array as a circular buffer such that once the first level data array or the second level data array is filled, newest data overwrites oldest data; and storing the first level data array, the second level data array, and the cumulative array in a non-volatile memory of the audio playback device.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
DESCRIPTION OF THE PREFERRED EMBODIMENTS
[0044] The preferred embodiments of the present invention will now be described with reference to the drawings. Identical elements in the various figures are identified with the same reference numerals.
[0045] Reference will now be made in detail to each embodiment of the present invention. Such embodiments are provided by way of explanation of the present invention, which is not intended to be limited thereto. In fact, those of ordinary skill in the art may appreciate upon reading the present specification and viewing the present drawings that various modifications and variations can be made thereto.
[0046] As described herein, “active noise control,” “ANC,” “noise cancellation,” “active noise reduction,” or “ANR” is a method for reducing unwanted sound by the addition of a second sound specifically designed to cancel the first.
[0047] As described herein “sound pressure level” or “SPL” is a logarithmic measure of the effective pressure of a sound relative to a reference value.
[0048] As described herein, a “root mean square” or “RMS” is the square root of the mean square (the arithmetic mean of the squares of a set of numbers).
[0049] As described herein, “Bluetooth” is a wireless technology standard used for exchanging data between fixed and mobile devices over short distances using UHF radio waves in the industrial, scientific and medical radio bands, from 2.402 GHz to 2.480 GHz, and building personal area networks.
[0050] As described herein, “Bluetooth Low Energy” is a wireless personal area network technology aimed at applications in the healthcare, fitness, beacons, security, and home entertainment industries. When compared to classic Bluetooth, Bluetooth Low Energy is intended to provide considerably reduced power consumption and cost while maintaining a similar communication range. Mobile operating systems including iOS, Android, Windows Phone and BlackBerry, as well as macOS, Linux, Windows 8 and Windows 10, natively support Bluetooth Low Energy.
[0051] It should be appreciated that “headphones” may be used interchangeably with “headsets” or “ear buds” herein.
[0052] As described herein, “equalization” or “EQ” is the process of adjusting the balance between frequency components within an electronic signal. The most well-known use of equalization is in sound recording and reproduction, but there are many other applications in electronics and telecommunications.
[0053] A system described and depicted at least in
[0054] Furthermore, as described herein, a “sound exposure” is framed in terms of an equivalent continuous sound pressure level and is defined by the following expression:

L_eq = 10·log10[ (1/T_m) · ∫_Q^T_m (P(t)/P_0)² dt ]

In the above expression, P(t) is the sound pressure in pascals at time t, P_0 is the reference sound pressure level of 2×10⁻⁵ pascals, the integral is performed over the interval Q to T_m, Q represents a zero point in time, T_m is the period of time over which the integration extends, and L_eq is a sum of the squares of sound pressure level over time. As such, the integral term describes a total exposure, and the maximum allowable exposure is approximately equal to a constant, as shown:

∫_Q^T_m (P(t)/P_0)² dt ≈ constant
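The exposure integral defined above can be discretized as a sketch (the function name and the assumption of uniformly spaced samples are illustrative, not taken from the specification):

```python
import math

P0 = 2e-5  # reference sound pressure, 2x10^-5 pascals

def equivalent_continuous_level(pressures, dt):
    """Discretized L_eq (dB) from pressure samples taken every dt seconds.

    Approximates the integral of (P(t)/P0)^2 over the span Q..T_m with a
    sum over uniformly spaced samples, then normalizes by the total time.
    """
    total_time = len(pressures) * dt
    energy = sum((p / P0) ** 2 * dt for p in pressures)
    return 10.0 * math.log10(energy / total_time)
```

For a constant pressure, this reduces to 20·log10(P/P_0), the ordinary dB SPL of that pressure.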
[0055] A method executed by Bluetooth-enabled headphones for managing a user's sound exposure is also described herein. The method comprises numerous process steps, such as collecting raw audio data. The raw audio data may include a first set of raw audio data from a first data source and a second set of raw audio data from a second data source. It should be appreciated that the first source differs from the second source. Moreover, in examples, the first data source may include one or more internally facing microphones on the headphones, which may sense sound pressure inside the front cavity, such as a feedback ANC microphone, via an analog-to-digital converter (ADC).
[0056] In another example, the first data source may include a Bluetooth audio data stream of the headphones and the first set of raw audio data may include Bluetooth audio or voice data. In a further example, the second data source may include one or more externally facing microphones mounted on or nearby the headphones. Examples of such include an inline voice microphone, a feed forward ANC microphone or boom microphone, via the ADC.
[0057] In some examples, the first set of raw audio data and the second set of raw audio data are obtained simultaneously in systems that have sufficient processing power. The system's ability to provide continuous sampling of the incoming audio data stream, before it is reproduced as an acoustic signal, allows for immediate truncation and compression of the signal to avoid excessive SPL levels. In some applications, this situation would be prevented by design, and therefore the need for continuous sampling would be avoided. However, this solution is computationally expensive and would consume a relatively high amount of power. In other examples, the first set of raw audio data and the second set of raw audio data are obtained at varying sample rates and sampling intervals, where these intervals may be periodic (e.g., from a few times per second up to periods of several seconds).
[0058] Using multiple sources of data allows the system to differentiate between noise originating from program audio content and noise originating from ambient noise, as well as allowing for the ability to self-calibrate. Self-calibration may occur where a feedback microphone produces a direct measurement of sound pressure level that can then be used to calibrate the scaling factor applied to the Bluetooth digital data, where the scaling of the Bluetooth digital data gives an equivalent sound pressure level number.
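The self-calibration described above can be sketched as a single-point correction; the function name, arguments, and one-shot scheme are illustrative assumptions rather than the patent's specific procedure:

```python
def calibrated_bt_scale(feedback_pressure_pa, bt_rms_codes, current_scale):
    """Single-point self-calibration of the Bluetooth scale factor.

    `feedback_pressure_pa` is a direct RMS pressure measurement from the
    feedback microphone; `bt_rms_codes` is the RMS of the Bluetooth digital
    data over the same window. The returned factor (pascals per code) makes
    the scaled Bluetooth data match the measured sound pressure.
    """
    if bt_rms_codes == 0:
        return current_scale  # nothing playing; keep the old factor
    return feedback_pressure_pa / bt_rms_codes
```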
[0059] It should be appreciated that a sample data block size may be chosen so that a fast Fourier transform (“FFT”) of the data produces frequencies that adequately cover the relevant frequency spectrum, as shown in
[0060] As described herein, “a fast Fourier transform” or “FFT” is an algorithm that computes the discrete Fourier transform (DFT) of a sequence, or its inverse (IDFT). Fourier analysis converts a signal from its original domain (often time or space) to a representation in the frequency domain and vice versa. The DFT is obtained by decomposing a sequence of values into components of different frequencies.
[0061] Moreover, it should be appreciated that the data in each buffer is scaled to represent a common unit of measurement, such as the Pascal unit, referring to sound pressure at the ear of the user. As an example, in the case of a microphone signal sampled by the ADC, a scale factor may have the units, Pascals per ADC Code. The buffer is a block of memory that holds the block of sampled data, with one buffer being used for each data source. The data in the buffer is not retained once the calculation of SPL and spectrum data is completed. Instead, the buffer is then used for the next block of sampled data.
[0062] A scale factor is needed to convert the raw sample data into a common unit of measure. For example, the Bluetooth audio data comprises digital numbers that represent a waveform. The acoustic pressure that will result from playing this waveform through a DAC amplifier and speaker is proportional to the digital number. The scale factor accounts for this proportion so that when one scales the Bluetooth audio data numbers accordingly, one gets the equivalent pascals of sound pressure that the user will experience.
[0063] In the case of digital audio data obtained from a Bluetooth data stream, the scale factor would have the form of Pascals per digital code, but may also have an additional factor (e.g., the net system gain). This gain would change with user adjustment of equalization (EQ) and volume controls.
[0064] It is preferred that the system makes use of the data source already available in a given headphone design so that no additional component cost is incurred when implementing the instant system. Systems can therefore function with any combination of available sources. For example, the most basic implementation would include access only to Bluetooth digital data. An advanced implementation would include access to feedforward ANC microphones, feedback ANC microphones, and Bluetooth digital data.
[0065] Next, the method may include calculating spectral data and the SPL RMS values from the raw audio data. The spectral data is calculated using an FFT calculation; the SPL, however, can be computed with the relatively simple formula below. The FFT takes a block of sampled data that represents a waveform in time and transforms it into a set of frequency components. The FFT gives the magnitude (level) of each frequency component that makes up the waveform. From this information, one can determine which frequency components have the greatest influence on the total sound pressure level, which allows one to selectively reduce those components without having to attenuate the entire signal. The SPL calculation may occur by the following equation:

SPL(t) = sqrt[ (1/N) · Σ_{n=1..N} (A·X_n)² ]

In the above-referenced equation, X_n is the nth data sample, A is the scaling factor that transforms X_n into the equivalent pascals of sound pressure, the summation is performed over the range of 1 to N samples, N is the total number of samples in the sample data buffer, and SPL(t) is the SPL at the time point t (when the calculation is made).
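The per-buffer RMS calculation can be sketched as follows; the function names and the optional dB re 20 µPa conversion are illustrative additions:

```python
import math

def spl_rms(samples, scale):
    """RMS sound pressure (pascals) of one buffer of raw samples.

    `scale` is the factor A that converts raw sample units (ADC codes
    or Bluetooth PCM values) into pascals.
    """
    n = len(samples)
    mean_square = sum((scale * x) ** 2 for x in samples) / n
    return math.sqrt(mean_square)

def spl_db(samples, scale, p_ref=2e-5):
    """Express the buffer's RMS pressure as a dB SPL value."""
    return 20.0 * math.log10(spl_rms(samples, scale) / p_ref)
```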
[0066] It should be appreciated that in some examples, data may be computed only as the means of the squares, where the square root is not performed. This is done so that computing an accumulated acoustic energy number does not need to ‘undo’ the square root function (given the accumulated acoustic energy is based on the square of sound pressure). The use of RMS is interchangeable with the means of the squares, albeit in practice at the computation cost of a square or square root function. The complete block of sampled data is transformed with the FFT or SPL calculations.
[0067] The spectral data resulting from the FFT process may be reduced in size so that wide frequency ranges would be characterized by a single number. In one case, the ranges could be reduced to represent just bass, mid, and treble frequency ranges, thereby only requiring three numbers to be stored. However, in host chips with sufficient memory capacity, the complete set of FFT frequency bins could be stored.
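The reduction of FFT bins to three summary numbers might be sketched as below; the band edges of 250 Hz and 4 kHz are illustrative choices, not values from the specification:

```python
def reduce_to_bands(fft_magnitudes, bin_hz, edges=(250.0, 4000.0)):
    """Collapse FFT magnitude bins into bass / mid / treble summary values.

    `bin_hz` is the frequency spacing between bins. Powers (squared
    magnitudes) are accumulated so that the three numbers remain
    proportional to the energy in each range.
    """
    bands = [0.0, 0.0, 0.0]
    for i, mag in enumerate(fft_magnitudes):
        freq = i * bin_hz
        if freq < edges[0]:
            bands[0] += mag ** 2  # bass
        elif freq < edges[1]:
            bands[1] += mag ** 2  # mid
        else:
            bands[2] += mag ** 2  # treble
    return bands
```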
[0068] The calculation may form/generate a first set of calculated data and a second set of calculated data. Next, the first set of calculated data may be stored in a first level data array and the second set of calculated data may be stored in a second level data array. Each element or row of the first level data array and the second level data array encapsulates the information from a block of sampled data. There is one level data array for each data source, such that the information from each data source is maintained independently. As such, the level data provides a momentary snapshot of the acoustic sound pressure at a particular point in time.
[0069] Moreover, in the general case, the level data stored in the first level data array and the second level data array is not required to be regularly spaced in time. In some cases, it may be preferable to dynamically modify the period of sampling and the data calculation to reduce power consumption: when, for example, a very consistent level of acoustic signal is detected, a longer period between samplings can provide an adequate characterization of the acoustic signal within that period. Conversely, when the acoustic signal has relatively long periods of silence with short bursts of high-level signal, a shorter sampling interval would be required to adequately capture the high-level bursts.
[0070] The length of the first level data array and the second level data array, and therefore the time span of the measurements, may be user-defined/user-customizable and may be constrained by the available memory in the underlying device. However, this would typically be in the range of several seconds up to twenty-four hours. In some cases it would be possible (given sufficient data storage) to have an unlimited length.
[0071] A first cumulative array may be produced that includes an integral of the first level data array, and a second cumulative array may be produced that includes an integral of the second level data array, as shown below:

C[k] = Σ_{i=1..k} L[i]·Δt_i

where L[i] is the ith element of a level data array, Δt_i is the time period associated with that element, and C[k] is the kth element of the corresponding cumulative array.
[0072] Given the general case, each of the first level data array and the second level data array may have arbitrary time spacing, where each element of the first level data array and the second level data array is multiplied by its associated time period. In the case that all elements occur with an identical period, the cumulative array becomes the sum of the first level data array and the second level data array multiplied by the total time. Further, each cumulative array data element is given an absolute or relative timestamp. The cumulative array would most often be updated at a slower rate than the first level data array and the second level data array, but its update period could be equal to the level array period.
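A minimal sketch of producing a cumulative array from arbitrarily spaced level data (names and the pair-based representation are hypothetical):

```python
def cumulative_exposure(levels):
    """Integrate momentary levels over their associated time periods.

    `levels` is a list of (mean_square_pressure, period_seconds) pairs,
    each pair summarizing one block of sampled data; the running total
    approximates the integral of squared pressure over time.
    """
    cumulative = []
    total = 0.0
    for mean_square, period in levels:
        total += mean_square * period
        cumulative.append(total)
    return cumulative
```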
[0073] The first level data array, the second level data array, the first cumulative array, and the second cumulative array may be arranged as circular buffers such that once the first level data array or the second level data array is filled, the newest data overwrites the oldest data. Each of the first level data array and the second level data array comprises information regarding momentary sound exposure levels. Further, the first and second cumulative arrays provide information regarding a total sound energy over a time period.
[0074] In examples, each of the first level data array and the second level data array is also stored with a timestamp. In some examples, the timestamp is an absolute time derived from a real-time clock. In other examples, the timestamp is a relative time derived as an incremental value from a defined starting point. Further, the first level data array, the second level data array, the first cumulative array, and the second cumulative array may be stored in a non-volatile memory of the headphones. In some examples, the first level data array, the second level data array, the first cumulative array, and the second cumulative array may be stored in a Bluetooth chip of the headphones. The Bluetooth chip may further provide processing hardware used for at least a subset of the process steps described herein. As such, the technology described herein can be implemented without a need for additional equipment, and as such, is cost-effective.
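The circular-buffer arrangement of timestamped level entries can be sketched as a simplified in-memory model (persistence to non-volatile storage is omitted; the class and method names are illustrative):

```python
class LevelBuffer:
    """Fixed-size circular buffer of (timestamp, level) entries.

    Once full, the newest entry overwrites the oldest, matching the
    level-array behavior described above.
    """
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = [None] * capacity
        self.index = 0   # next slot to write
        self.count = 0   # number of valid entries

    def push(self, timestamp, level):
        self.entries[self.index] = (timestamp, level)
        self.index = (self.index + 1) % self.capacity
        self.count = min(self.count + 1, self.capacity)

    def oldest_first(self):
        """Return the stored entries in chronological order."""
        if self.count < self.capacity:
            return self.entries[:self.count]
        return self.entries[self.index:] + self.entries[:self.index]
```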
[0075] The process/chain of collecting the raw audio data through to calculating the cumulative array may be depicted in
[0076] It should be appreciated that in some examples, the system provides the user with the ability to clear the existing memory of levels and cumulative arrays, which are preferably stored in non-volatile memory so that when the device is powered-off and subsequently powered-on, the data is retained and the system can continue to operate, while including the information about prior levels and cumulative data.
[0077] Where the headphones host chipset provides only limited non-volatile data storage, the levels and cumulative data can be summarized into a reduced set of values that represents key aspects of the information normally stored in the levels and cumulative arrays. For example, an RTC may often provide a very low-power persistent data storage space. The cumulative data arrays could be reduced to a single value, together with the time period it represents, to capture the key information from the entire cumulative array. When the device is powered off, this summary data is stored to the RTC memory. When the device is powered on, this information is recovered and used to seed the cumulative array with a starting point that represents the prior total cumulative data.
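The power-off summary and power-on seeding could be sketched as below; the (total, span) pair format and function names are assumptions:

```python
def summarize_for_rtc(cumulative, elapsed_seconds):
    """Reduce a cumulative array to a (total, span) pair small enough
    for the RTC's persistent storage at power-off."""
    total = cumulative[-1] if cumulative else 0.0
    return (total, elapsed_seconds)

def seed_cumulative(summary):
    """Seed a fresh cumulative array at power-on with the prior total,
    so accumulation continues from where it left off."""
    total, _span = summary
    return [total]
```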
[0080] Moreover, the system includes the concept of a “limit cumulative value” and a “time to limit” value. The limit cumulative value parameter represents a maximum allowable acoustic energy exposure. This value may be user-defined or prescribed to align with international standards for sound exposure, such as those of OSHA and/or the WHO, as depicted in
[0082] The method may also include comparing the sound exposure to a predetermined threshold level of sound exposure. In response to a determination that the sound exposure is approaching the predetermined threshold level of sound exposure, the method may further include applying one or more parameters to the sound exposure to reduce the sound exposure to ensure the sound exposure never reaches the predetermined threshold level of sound exposure, as depicted in
[0083] In the case of an EQ change, the system includes a DSP equalizer aspect, usually in the form of a digital biquad filter structure with variable coefficients. When a particular band requires gain adjustment, the appropriate coefficient value is modified in the DSP system and the biquad filter changes the effective gain of that frequency band. This approach is particularly useful in cases where, for example, heavy bass levels are present in the audio content source and drive up the sound exposure level, while the rest of the frequency spectrum contributes very little to the total exposure. In this situation, the bass gain could progressively be reduced such that the user may continue to listen to the content at a similar overall level, with only the bass frequencies being attenuated. In practice, while a gradual adjustment takes place, a user's psychoacoustic experience compensates for the reduction in low-frequency output, so the change goes unnoticed and does not detract from the user's enjoyment.
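The patent describes a variable-coefficient biquad but does not give formulas; one standard parameterization is the peaking filter from the widely used RBJ audio EQ cookbook, shown here as an illustrative stand-in rather than the patent's specific filter:

```python
import math

def peaking_eq_coefficients(fs, f0, gain_db, q=0.707):
    """Biquad peaking-filter coefficients (RBJ audio EQ cookbook form).

    Lowering `gain_db` for a bass-centered f0 attenuates that band only,
    as in the progressive bass reduction described above. Returns
    normalized (b, a) coefficient lists.
    """
    a = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    b0, b1, b2 = 1 + alpha * a, -2 * math.cos(w0), 1 - alpha * a
    a0, a1, a2 = 1 + alpha / a, -2 * math.cos(w0), 1 - alpha / a
    return [b0 / a0, b1 / a0, b2 / a0], [1.0, a1 / a0, a2 / a0]
```

At 0 dB gain the numerator and denominator coincide and the filter passes audio unchanged, which makes a smooth taper from "no change" possible.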
[0084] In another example, the ANC system may have a range of available gain. When the ambient noise is contributing significantly to the user's sound exposure, an increase in the noise reduction effect would limit the user's cumulative exposure gradient without detracting from the listening experience. Activation of this part of the system would require either or both of a feedforward and a feedback microphone to enable discrimination of ambient noise from program audio material.
[0085] It should be appreciated that the simplest system, which has no variable EQ parameters or ANC, would allow only a net system gain change. In this case, the principle applies identically; however, only the net system gain can be adjusted to taper the user's cumulative sound level exposure.
[0086] In another example, if the time to limit exceeds a specified amount (e.g., 3 hours), the EQ and gain maximum may be increased by an increment value. In another example, if the time to limit exceeds a specified amount (e.g., 2 hours) but is less than 3 hours, no change is made. In a further example, if the time to limit is less than a specified amount (e.g., 2 hours), the system may reduce the overall gain, or the gain of a specific frequency band, by an increment amount. It should be appreciated that the EQ or gain change would lead to a lower value in the levels array and consequently a smaller increase in the cumulative array. This smaller increase would then cause the regression line to flatten out and thereby increase the time to limit. The system would be tuned such that the cumulative level would be forced to asymptote toward the limit level, but never exceed it.
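The three-way policy above can be sketched directly; the 2- and 3-hour thresholds are the example values from the text, while the symmetric ±step size is an assumption:

```python
def gain_adjustment_db(time_to_limit_hours, step_db=0.5):
    """Map the predicted time-to-limit to a gain change in dB."""
    if time_to_limit_hours > 3.0:
        return +step_db   # headroom available: allow a small gain increase
    if time_to_limit_hours >= 2.0:
        return 0.0        # on track: no change
    return -step_db       # approaching the limit: taper the gain down
```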
[0087] The amount of EQ band or overall gain change may be defined as a fixed increment or may be computed from a curve fit of the cumulative array data. In this method, one can calculate the line gradient that gives an intercept at the minimum required time point (e.g., 2 hours). Calculating the cumulative level increment for the next element would then give the required gradient. Next, the difference is found between the desired increment and the mean increment across the preceding points; this difference in increment is then the dB gain level change required. The system response may further be controlled by adding additional criteria to the change, for example, that the increment may not be greater than a given value (e.g., less than 1 dB), so that the user would not readily notice a step change in listening level; rather, a very gradual taper is applied such that the user's listening experience is not interrupted.
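One possible linearized version of this increment calculation, as a sketch under simplifying assumptions (uniform spacing `dt`, a mean increment in place of a least-squares fit, and a 10·log10 energy-to-dB mapping, since the accumulated values are integrals of squared pressure):

```python
import math

def required_gain_change_db(cumulative, dt, limit, min_time):
    """dB gain change so the cumulative-exposure line intercepts
    `limit` no sooner than `min_time` seconds from now.

    Compares the mean per-interval increment of the preceding points
    against the largest increment that still meets the target; a
    negative result means the gain must be tapered down.
    """
    current = cumulative[-1]
    mean_increment = (cumulative[-1] - cumulative[0]) / (len(cumulative) - 1)
    # Largest per-interval increment that still meets the target time.
    allowed_increment = (limit - current) * dt / min_time
    if mean_increment <= 0 or allowed_increment <= 0:
        return 0.0
    return 10.0 * math.log10(allowed_increment / mean_increment)
```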
[0088] It should be appreciated that, in the preferred embodiment, the computational functions herein would be completely executed on the Bluetooth chip. Given the periodic nature of the sampling, the levels array calculations, cumulative array calculations, and line intercept calculations, the demand on processing could be within reasonable limits. Example tables for the SPL time and the cumulative time are depicted in
[0090] Next, the method may include transmitting the raw audio data and/or the sound exposure to another device. This other device may be a smartphone, a PC, a facility, or a cloud-based server, among others. This means that the local device may retain data that spans a relatively limited time frame, e.g., 24 hours, while the connected device (e.g., the other device) may retain an unlimited span of data. This data logging ability allows users to access a historical record of acoustic energy exposure over long time frames and additionally allows users to observe patterns of sound exposure that may facilitate behavioral changes to help protect hearing health. For example, a user may discover that their exposure to high levels of acoustic energy always occurs in a specific circumstance that they may be able to avoid.
[0091] It should be appreciated that the system (e.g., the headphones) described herein is designed such that it is incapable of producing an SPL high enough to cause damage over a time period shorter than the period required to develop the extrapolated regression line that allows for the taper of the acoustic energy level delivered to the ear of the wearer.
[0092] The descriptions of the various embodiments of the present invention have been presented for purposes of illustration, but are not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
[0093] When introducing elements of the present disclosure or the embodiments thereof, the articles “a,” “an,” and “the” are intended to mean that there are one or more of the elements. Similarly, the adjective “another,” when used to introduce an element, is intended to mean one or more elements. The terms “including” and “having” are intended to be inclusive such that there may be additional elements other than the listed elements.
[0094] Although this invention has been described with a certain degree of particularity, it is to be understood that the present disclosure has been made only by way of illustration and that numerous changes in the details of construction and arrangement of parts may be resorted to without departing from the spirit and the scope of the invention.