Method and apparatus for measuring distortion and muffling of speech by a face mask
11295759 · 2022-04-05
CPC classification
G01H17/00 (PHYSICS)
G10L25/18 (PHYSICS)
International classification
G01N29/44 (PHYSICS)
G01H17/00 (PHYSICS)
G10L25/18 (PHYSICS)
Abstract
Systems and methods are provided for measuring the distortion and muffling caused by a face mask. For example, in one embodiment a simulated voice source produces a sound. The sound is then acoustically coupled to a simulated vocal tract and a face mask. A microphone receives sound and produces a signal and an analyzer receives the signal from the microphone. A manikin head or other facial structure may also simulate fitting of the face mask onto a face. The analyzer may further produce a quantitative assessment of the distortion and muffling of the face mask, for example, by comparing at least one spectrum obtained with the face mask and at least one spectrum obtained without the face mask.
Claims
1. A system, comprising a simulated voice source, configured to produce a sound; a simulated vocal tract, acoustically coupled to the simulated voice source; a face mask, acoustically coupled to the simulated vocal tract; a microphone, configured to receive the sound and produce a signal; and an analyzer, configured to receive the signal from the microphone.
2. The system of claim 1, further comprising a manikin head or other facial structure configured to simulate fitting of the face mask onto a face.
3. The system of claim 1, wherein the analyzer produces a quantitative assessment of the distortion and muffling of the face mask.
4. The system of claim 1, wherein the analyzer produces a quantitative assessment of the distortion and muffling of the face mask by comparing at least one spectrum obtained with the face mask and at least one spectrum obtained without the face mask.
5. The system of claim 1, wherein the analyzer produces a quantitative assessment of the distortion and muffling of the face mask by comparing at least one spectrum obtained with the face mask and a control.
6. The system of claim 1, wherein the analyzer uses an inverse filter.
7. The system of claim 1, wherein the analyzer produces a metric of the distortion and muffling of the face mask.
8. The system of claim 1, wherein the analyzer measures at least one of a frequency, amplitude, or bandwidth of a formant.
9. The system of claim 1, wherein the analyzer assesses the distortion and muffling of the face mask by measuring at least one of a shift in frequency, change in amplitude, or change in bandwidth damping of a formant.
10. The system of claim 1, further comprising a link between the analyzer and the simulated voice source.
11. The system of claim 1, wherein the analyzer comprises a display configured to visualize a comparison of formant spectra in the time or frequency domain.
12. A method comprising the steps of: producing a sound with a simulated voice source; providing a simulated vocal tract, acoustically coupled to the simulated voice source; providing a face mask, acoustically coupled to the simulated vocal tract; receiving the sound and producing a signal with a microphone; and receiving the signal from the microphone with an analyzer.
13. The method of claim 12, further comprising providing a manikin head or other facial structure configured to simulate fitting of the face mask onto a face.
14. The method of claim 12, wherein the analyzer produces a quantitative assessment of the distortion and muffling of the face mask.
15. The method of claim 12, wherein the analyzer produces a quantitative assessment of the distortion and muffling of the face mask by comparing at least one spectrum obtained with the face mask and at least one spectrum obtained without the face mask.
16. The method of claim 12, wherein the analyzer produces a quantitative assessment of the distortion and muffling of the face mask by comparing at least one spectrum obtained with the face mask and a control.
17. The method of claim 12, wherein the analyzer uses an inverse filter.
18. The method of claim 12, wherein the analyzer produces a metric of the distortion and muffling of the face mask.
19. The method of claim 12, wherein the analyzer measures at least one of a frequency, amplitude, or bandwidth of a formant.
20. The method of claim 12, wherein the analyzer assesses the distortion and muffling of the face mask by measuring at least one of a shift in frequency, change in amplitude, or change in bandwidth damping of a formant.
21. The method of claim 12, further providing a link between the analyzer and the simulated voice source.
22. The method of claim 12, wherein the analyzer comprises a display configured to visualize a comparison of formant spectra in the time or frequency domain.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
(18) Before the present invention is described in further detail, it is to be understood that the invention is not limited to the particular embodiments described, as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting, since the scope of the present invention will be limited only by the appended claims.
(19) Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limit of that range, and any other stated or intervening value in that stated range, is encompassed within the invention. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges, and each such smaller range is also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention.
(20) Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can also be used in the practice or testing of the present invention, a limited number of the exemplary methods and materials are described herein.
(21) It must be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise.
(22) All publications mentioned herein are incorporated herein by reference to disclose and describe the methods and/or materials in connection with which the publications are cited. The publications discussed herein are provided solely for their disclosure prior to the filing date of the present application. Nothing herein is to be construed as an admission that the present invention is not entitled to antedate such publication by virtue of prior invention. Further, the dates of publication provided may be different from the actual publication dates, which may need to be independently confirmed.
(23) The acoustic characteristics of speech can be modelled as a sound source, vocal tract filter, and radiation characteristics. The term “vocal tract,” or “supraglottal vocal tract” refers to the chambers of the mouth and pharynx above the laryngeal voice source.
(24) In voiced sounds, the sound source is due to the vibrating vocal folds. The energy of the sound source usually comes from air expelled from the lungs, which is converted to acoustic energy at the larynx (or “voice box”), as this flow of air passes between the vocal folds.
(25) The shape of the vocal tract can be modelled as the vocal tract filter, and is usually modelled separately from the vocal source. The vocal tract is usually measured from the glottis to the mouth, but can also include the nasal cavity, depending upon whether the velum is open or closed. For example, the nasal sounds such as /m/, /n/, and /ng/ require added resonance in the nasal cavity.
(26) When speech is voiced, the vocal folds vibrate, effectively producing sound waves. Articulators, such as the tongue, teeth, pharynx, jaw and lips, modify the spectrum of those sound waves. Radiation characteristics refer to the way in which sound as a speech pressure waveform radiates from the mouth. Sound production that involves moving the vocal folds close together is called glottal. Voiced (e.g., quasiperiodic) source sounds are glottal, in addition to whisper (e.g., aperiodic). On the other hand, there can be supra-glottal sound sources in speech that are aperiodic (i.e., random noise or impulses).
(27) An acoustic filter selectively strengthens or attenuates certain frequencies and allows other frequencies to pass through unstrengthened or unattenuated. During unnasalized voiced speech sounds, that is sounds for which the velar passageway is closed or almost closed, the vocal tract acoustic filter can be effectively characterized by a small number of acoustic resonances. These acoustic resonances in the vocal tract produce peaks in the spectral envelope of the output sound. Thus, the vocal tract is an acoustic filter, and the resonances of the vocal tract produce spectral peaks or formants in the output sound. The term “formant,” as used in the art, is used to describe either a spectral peak or a resonance that gives rise to it.
(28) A uniform tube closed at one end and open at the other is what is referred to in radio engineering as a quarter-wave resonator, and would have resonance frequencies in a 1, 3, 5, 7 multiplicative sequence. This is illustrated in the standing wave patterns shown in
(29) The resonances of the vocal tract can be estimated by modeling it as an acoustic waveguide, typically having a length of about 10-20 cm. The cross section along the length of the waveguide is varied by the geometry of the articulators. The frequencies of the resonances depend upon the shape. The frequencies of the first, second, third and ith resonances are called R.sub.1, R.sub.2, R.sub.3 . . . , R.sub.i . . . . As shown in
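The quarter-wave relationship above can be sketched numerically. In this sketch the speed of sound (350 m/s) and the tube length (17.5 cm, within the 10-20 cm range stated above) are assumed illustrative values, not figures from this disclosure:

```python
# Quarter-wave resonator: a uniform tube closed at one end and open at
# the other resonates at odd multiples of c / (4 * L).
c = 350.0   # assumed speed of sound in warm, humid air (m/s)
L = 0.175   # assumed tube length of 17.5 cm

resonances = [(2 * k - 1) * c / (4 * L) for k in (1, 2, 3)]
# R1, R2, R3 = 500, 1500, 2500 Hz: the 1, 3, 5 sequence described above
```

With these assumed values the first three resonances land at the 500/1500/2500 Hz pattern commonly cited for a neutral adult vocal tract.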
(30) During the voicing of vowels and voiced sonorant consonants, the area of the glottis is negligible compared to the opening at the lips, especially during the most closed portion of the vibratory cycle, when most of the acoustic energy is generated, so it can be effectively treated as closed in an acoustic analysis. The articulators (such as the tongue, teeth, pharynx, jaw and lips) are able to provide differences in vowel sounds, and produce significant changes in the formant frequencies. In other words, the different vowel sounds can be thought of as modifications to the vocal tract resonance. For example, the opening or closing of the mouth affects the resonance of the vocal tract cavity, as well as the length of the opening formed by the articulators, as shown in
(31) Thus, by specifying peaks in the spectrum, formants provide the information that people require to distinguish between speech sounds. The formant with the lowest frequency is called F.sub.1, the second F.sub.2, and the third F.sub.3. Most often the two first formants, F.sub.1 and F.sub.2, are sufficient to identify the vowel. Formants may be defined by their frequency and by their spectral width, or bandwidth.
(32) For a typical adult person, F.sub.1 will usually be between 200-800 Hz. The low end of the range would be realized for vowel pronunciation that requires a small opening of the mouth, whereas the high end of the range typically would be the case with a larger opening of the mouth. The second resonance of an adult vocal tract is typically in the range of 800-2000 Hz. Again, these values vary depending on the vowel pronounced. For example, the vowel /u/ requires a small opening of the mouth, so for a given speaker, R.sub.2 may be lower than 800 Hz (e.g., 500 Hz would not be uncommon). As discussed, the articulators (such as the tongue, teeth, pharynx, jaw and lips) are able to provide differences in vowel sounds by producing significant changes in the formant frequencies.
(33) The distortion and muffling of the speech of a face mask user can come from two primary sources: (1) blocking of the speech sounds from the mouth and/or nose, and (2) distortion and muffling of the speech sounds from the mouth and/or nose caused by the face mask.
(34) The second aspect of speech distortion, the modifying or distortion of the speech as it is emitted from the mouth and nose, is caused by the acoustic coupling of the face mask to the vocal tract as well as by resonances (and antiresonances) generated in the mask itself. While most people think that the reduced speech intelligibility caused by wearing a mask is due to the first source (blocking of the speech sounds), the second source (distortion and muffling) is actually the predominant cause.
(35) There are a number of methods used for measuring the frequency and damping of the speech formants. In mathematical terms, a formant is a resonance, defined by a frequency and a damping factor; alternatively, in some descriptions of vocal tract acoustics, a formant is described as a peak in the spectrum of the speech, with a center frequency and a bandwidth of that peak. The bandwidth, nominally the distance in Hz between the −3 dB points preceding and following the peak, can be mathematically derived from the damping factor, and vice versa.
(36) Also, in some applications, a formant is identified by only its frequency. For example, it is only the frequency of a formant that is identified by a spectrographic analysis.
(37) As further explained by the experiments shown below, when a face mask is worn on a face, there is a shifting in the frequency of the formants and/or the damping of the formants of the speech emitted from the mouth and nose caused by the acoustic coupling of the mask chambers to the chambers of the mouth and nose, so as to cause a reduction in the intelligibility of the speech. In other words, the natural chamber of the vocal tract produces formants of the voice, and when a face mask is worn, the chamber created over the mouth couples to the vocal chamber, and alters the formants. This effect is depicted in
(38) When worn, face masks can result in a shifting in the frequency, and/or the damping of the formants of speech emitted from the mouth and nose caused by the acoustic coupling of the mask chambers to the chambers of the mouth and nose (i.e., vocal tract and nasal cavity), and/or an increase or decrease in the spectral peaks generated by one or more formants. In other words, the interior of the mask becomes acoustically part of the vocal tract. This lengthening of the effective vocal tract will tend to lower the formants, with the effect varying with the vowel being spoken. In the tract/mask acoustic system, the departure from the closed-to-open tube model can also add additional resonances and antiresonances to the transfer function, to further muffle the speech.
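As a first-order sketch of this lengthening effect, the mask chamber can be treated as a simple extension of a quarter-wave tube. All lengths here are assumed illustrative values, and a real mask cavity is of course not a uniform tube, so this only shows the direction and rough size of the shift:

```python
c = 350.0        # assumed speed of sound (m/s)
L_tract = 0.175  # assumed vocal tract length (m)
dL_mask = 0.02   # assumed 2 cm of effective length added by the mask chamber

f1_no_mask = c / (4 * L_tract)            # first resonance, approx. 500 Hz
f1_mask = c / (4 * (L_tract + dL_mask))   # approx. 449 Hz: formant lowered
shift = f1_no_mask - f1_mask              # approx. 51 Hz downward shift
```

Even this crude model reproduces the qualitative behavior described above: lengthening the effective tract lowers the formants.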
(39) Because most of the information in speech is conveyed by the frequency and damping of the lowest 2 or 3 formants in the speech, it is possible to evaluate the degree of distortion or muffling of the speech caused by the mask by comparing the formant structure of the speech with and without the mask, as in
(40) A broadening of one or more of the formant peaks is generally known as a “dampening” effect, which may also be accompanied by a decrease in base-to-peak amplitude of one or more of the formant peaks. The terms “distortion” and “muffling” are essentially synonymous in the art. In some applications “muffling” may be more associated with damping effects, while “distortion” may be more associated with shifting effects. As used here, “distortion” and “muffling” are synonymous and may refer to any changes in formant structure caused by the face mask.
(41) While speech intelligibility is primarily determined by the first three formants, distortion or muffling may cause changes in only a single formant, multiple formants, or all formants. Additionally, different formants may be affected in different ways. For example, a particular mask may shift the first formant, dampen the second formant, and leave the third formant unaffected.
(43) The speaker attempted identical vowel /a/ sounds in each case, and the first three formants can be seen in both spectra, as labeled.
(44) The clear spectra in
(46) When estimating the distortion of the speech produced with a face mask, comparing the spectrum of the speech with and without the mask, including an estimation of the change in formant structure caused by the mask, has an advantage over subjective testing of speech intelligibility: it can yield repeatable, objective measures of the muffling of the speech in a short amount of time.
(47) There are a number of methods used for measuring the frequency and damping of the speech formants. In mathematical terms, a formant is a resonance that may, in some cases, be defined by a frequency and a damping factor. In other cases, in some descriptions of vocal tract acoustics, a formant is described as a peak in the spectrum of the speech and a center frequency and a bandwidth of that peak.
(48) In the mathematical specification of a damped resonance, the damping factor is the coefficient of the exponential decay of the sinusoidal oscillations that result from the resonance. The bandwidth, nominally, the distance in Hz between the −3 dB points preceding and following the peak, can be mathematically derived from the damping factor, and vice versa.
(49) The damping of a resonance can also be described mathematically by the % decay per cycle of oscillation. Rothenberg M. (1973). A new inverse-filtering technique for deriving the glottal air flow waveform during voicing. Journal of the Acoustical Society of America, 53(6), 1632-45.
(50) The damped sinusoids in
(51) However, in comparing the spectrum of speech with and without a mask, or with different masks, it must be kept in mind that there is a natural variability in human speech that can be reduced by using a trained speaker, but cannot be eliminated. For this reason, it is proposed in this application that such comparisons preferably be made using a synthesized voice generated by a mechanically stimulated physical vocal tract model, such as proposed in this application. Using a physically simulated voice source and vocal tract instead of natural speech thus allows the user to detect and measure the small changes in the spectrum caused by masks that are perceived to muffle the speech of the user but do not cause high levels of distortion.
(52) Among the plurality of tools available for the analysis and comparison of the spectra of the speech with and without a mask is the method of inverse filtering, in which a filter having zeros, or antiresonances, at the frequencies and damping of the resonances underlying a given spectrum is used to cancel such resonances. Inverse filtering could also introduce resonances to cancel antiresonances underlying a given vowel spectrum, as in nasalized speech. Inverse filtering has been widely used to analyze natural speech to study the voice source.
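A minimal numerical sketch of inverse filtering follows, assuming a single two-pole resonator stands in for one formant. The sample rate, frequency, and bandwidth values are illustrative, and the bandwidth-to-pole-radius mapping r = exp(−π·BW/fs) is the standard textbook one, not a formula from this disclosure:

```python
import math

fs, fr, bw = 16000.0, 500.0, 65.0    # assumed sample rate, formant freq, bandwidth
r = math.exp(-math.pi * bw / fs)     # pole radius from bandwidth (standard mapping)
theta = 2 * math.pi * fr / fs        # pole angle from formant frequency
a1, a2 = 2 * r * math.cos(theta), -r * r

# Impulse response of the resonator: a decaying formant oscillation
n = 200
y = [0.0] * n
for i in range(n):
    x = 1.0 if i == 0 else 0.0                   # impulsive stimulus
    y[i] = x + a1 * (y[i - 1] if i >= 1 else 0.0) \
             + a2 * (y[i - 2] if i >= 2 else 0.0)

# Inverse filter: an FIR filter whose zeros sit exactly on the resonator's
# poles cancels the resonance, leaving only the original impulse
z = [y[i] - a1 * (y[i - 1] if i >= 1 else 0.0)
          - a2 * (y[i - 2] if i >= 2 else 0.0) for i in range(n)]

residual = max(abs(v) for v in z[1:])   # near zero: oscillation cancelled
```

In a manual or automated procedure, the filter coefficients can be adjusted until the residual oscillation is minimized; the settings that accomplish this serve as estimates of the formant's frequency and damping.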
(53) According to an embodiment of the invention, the measurement of the formants of speech is accomplished by generating simulated vowels using a simulated vocal tract that is affixed to a physical model of a human head upon which the mask to be tested can be mounted, as shown diagrammatically in
(55) As shown in
(56) The voice source 203 should preferably have a high acoustic impedance so as to emulate the glottis during its closed phase, during which there is little or no formant energy absorbed by an open glottis. Formant energy absorbed by a simulated glottis with an impedance that is not high compared to the impedance of the simulated vocal tract will increase the formant damping and change the resonant frequency of the formant, thus creating errors in the measurements of the effects of the mask being tested.
(57) The simulated voice source 203 is acoustically coupled to the simulated vocal tract 202, such that sound output from simulated voice source 203 is coupled into the simulated vocal tract 202. The sound output from the simulated vocal tract 202 may then be acoustically coupled to the mask 204, such that the acoustic effect of the mask 204 can be detected.
(58) In some embodiments, a link 207 may be provided between the analyzer 206 and the simulated voice source 203 to synchronize the analyzer with the voice source, in order to aid in the analysis. The link 207 may provide the analyzer 206 information regarding the sound produced by voice source 203. For example, if the simulated voice source is an acoustic impulse, the link 207 can send the impulse data, and may signal to the analyzer 206 the time that the impulse is generated, so that the analysis can be set to occur over a time interval that begins an advantageous preset time after the impulse.
(59) Mask 204 is the mask to be tested. Microphone 205 is positioned so as to pick up the sound emitted from the simulated vocal tract 202. Microphone 205 may be any device that converts a received sound into a signal for the analyzer 206. If a face mask is in place, the microphone is preferably placed outside of the face mask such that all the effects of the distortion and muffling caused by the mask can be effectively captured.
(60) Analyzer 206 is a system for receiving an output signal from the microphone 205 and performing an analysis that yields a measure or measures of the muffling and distortion of the simulated voice caused by the mask. The analyzer may be a signal processor with circuitry or processors optimized for the operational needs of signal processing. Examples of the type of analysis that can be performed by the analyzer 206 are shown below in
(61) In one embodiment, the analyzer 206 compares the spectra with and without the face mask 204. The analyzer 206 may also compare the spectra with the mask to any other type of control. For example, the analyzer 206 may be provided the original or control signal generated by the simulated voice source 203 by link 207.
(62) With reference to
(64) A miniature loudspeaker having a high acoustic impedance at its output was inserted in one end of the tube to emulate the glottal voice source, to function as the simulated voice source 203. However, other sound sources could be used, such as a spark-generated acoustic impulse source.
(65) A microphone 205 was mounted a fixed distance, approximately 2 inches, from the manikin face to record the radiated acoustic signal. The signal from the microphone was processed by analyzer 206 in order to determine the distortion of the radiated acoustic waveform caused by the presence of a mask.
(66) The spectrum of the synthesized vowel, from the microphone a few inches from the face mouth opening, with no mask in place, is shown in
(67) The frequency peaks in
(72) To ascertain the source of the increased energy near 700 Hz in
(74) Changes in the spectrum caused by resonances or antiresonances in the mask itself may also be differentiated from changes in the radiated spectrum caused by the mask shifting the location or damping of the vocal tract formants, by moving the simulated voice source to a location closer to the mouth.
(75) A voice source at the location of the simulated glottis, as in
(76) Moving the simulated voice source in this way may be desirable if the goal of the user is to optimize mask design and not to only measure the muffling and distortion of a given design.
(77) The signal recorded from the microphone may also be played back through a loudspeaker or earphones for a subjective evaluation.
(78) The system is able to measure and report to the user at least the changes in the frequency and/or damping caused by wearing a mask of one or more vocal tract formants, as well as provide information about any additional resonances or antiresonances introduced by the mask. We illustrate here methods that could be used in the analyzer to provide such information to the user.
(79) The frequency of a formant can be measured in the time domain as the inverse of the period of the oscillations in the acoustic pressure waveform caused by the formant. In the frequency domain, the formant frequency can be estimated by the location of a spectral peak caused by the formant. There are a number of other methods discussed in the literature for estimating the frequency of a formant, as from the cepstrum or an autocorrelation analysis.
(80) The damping of a formant can also be estimated in the time domain or the frequency domain. In the frequency domain the damping can be estimated by the width of the related spectral peak, for example the bandwidth, as defined by the distance in Hertz between the frequencies at which the energy is 3 dB lower than at the peak.
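The frequency-domain estimate can be sketched as follows. The closed-form spectrum of a sampled decaying sinusoid is used so no FFT is needed; all parameter values are assumed for illustration, and the theoretical −3 dB bandwidth of a decay envelope e^(−Kt) is K/π Hz (a standard result, not a figure from this disclosure):

```python
import cmath, math

fs, fr, K = 16000.0, 500.0, 100.0    # assumed sample rate, formant freq, damping
a = cmath.exp(complex(-K / fs, 2 * math.pi * fr / fs))  # pole of the sampled decay

def mag(f):
    """Spectrum magnitude of x[n] = exp(-K*n/fs) * cos(2*pi*fr*n/fs)."""
    z = cmath.exp(complex(0.0, -2 * math.pi * f / fs))
    return abs(0.5 / (1 - a * z) + 0.5 / (1 - a.conjugate() * z))

grid = [400.0 + 0.1 * i for i in range(2001)]              # 400-600 Hz, 0.1 Hz steps
peak = max(mag(f) for f in grid)
band = [f for f in grid if mag(f) >= peak / math.sqrt(2)]  # within -3 dB of the peak
bw = band[-1] - band[0]        # close to the theoretical K / pi, about 31.8 Hz
```

The measured width of the spectral peak between its −3 dB points recovers the damping constant that generated it, which is the equivalence described in the text.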
(81) In the time domain the damping can be quantified by the rate of decay of the energy at the formant frequency after the vocal tract is stimulated by an impulsive signal.
(82) In the data shown in
(83) This estimate of the formant bandwidth was verified by measuring the decay in the time waveform, as shown in
(84) In
(85) A formant resonance at a frequency f.sub.r generates a waveform approximating the function e.sup.−Kt Cos[(2π)(f.sub.r)(t)] in response to an impulsive stimulus. The constant K in this expression determines the damping or rate of decay. K can be determined by the percent decay per oscillatory cycle, which is constant throughout an exponential decay.
(86) An exponentially decaying sinusoid is generated by a resonance only during periods in which no stimulus is applied. The first 5 or 6 oscillations in the response to an impulsive stimulus shown in
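The time-domain, per-cycle decay measurement can be sketched directly from the function e^(−Kt) cos(2π f_r t) given above. The values of fs, fr, and K are assumed for illustration, with fs/fr chosen as an integer so that samples one period apart land exactly on the cosine peaks:

```python
import math

fs = 16000                 # assumed sample rate (Hz)
fr, K = 500.0, 100.0       # assumed formant frequency and damping constant

# impulse response of the formant: e^(-K t) * cos(2 pi fr t), sampled
n = int(0.02 * fs)
x = [math.exp(-K * i / fs) * math.cos(2 * math.pi * fr * i / fs)
     for i in range(n)]

# samples one period (1/fr) apart sit on the decaying envelope, so their
# ratio is the constant per-cycle decay factor e^(-K/fr)
period = int(fs / fr)              # 32 samples per cycle
ratio = x[period] / x[0]           # e^(-0.2), about 0.819
decay_pct = 100.0 * (1.0 - ratio)  # about 18.1% decay per cycle
K_est = -fr * math.log(ratio)      # recovers K = 100
```

Because the percent decay per cycle is constant throughout an exponential decay, any pair of samples one period apart during the stimulus-free interval yields the same estimate.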
(87) In
(88) To show the effect of an increase of formant bandwidth on the rate of exponential decay, the waveforms in
(89) It is estimated in the art that for formants in the range found in speech, the bandwidth of a formant resonance can be estimated to an accuracy of approximately 5 Hz by superimposing a graph of an exponential decay over the measured decay in formant energy. Stevens K. N., House A. S. (1958). Estimation of Formant Band Widths from Measurements of Transient Response of the Vocal Tract. Journal of Speech and Hearing Research, 1(4), 309-315. This estimate agrees with our measurements.
(90) The frequency and damping of a formant can also be measured by using an inverse filter, such as the Waveview™ program marketed by Glottal Enterprises. In a manual procedure, a formant-based decaying oscillation can be displayed on a computer screen, and the frequency and damping parameters of the filter adjusted to minimize the oscillations on the screen. The settings required to accomplish this can be used as estimates of the frequency and damping of the formant.
(91) For bandwidths much less than the formant frequency, as is usually the case in speech, the formant bandwidth that is equivalent to a % decay per cycle of 7.0 can be computed by the expression: BW=2 f.sub.r (% decay per cycle/100). For a formant frequency f.sub.r of 467 Hz, this expression yields a BW of approximately 65 Hz, which roughly agrees with the bandwidth measured in
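The expression above can be checked numerically; this is a direct sketch of the formula exactly as stated in the text, using the text's own worked values:

```python
def bandwidth_from_decay(fr_hz, decay_pct):
    """BW = 2 * fr * (% decay per cycle / 100), per the expression above."""
    return 2.0 * fr_hz * decay_pct / 100.0

bw = bandwidth_from_decay(467.0, 7.0)   # approximately 65.4 Hz, as in the text
```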
(92) To estimate the change in bandwidth required to be detectable by superimposing a graph of an exponential decay over the measured decay in formant energy, the waveforms in
(93) This example indicates that if a decay lasting at least 5 or 6 oscillatory cycles can be used for the analysis, then for formants near 500 Hz, formant bandwidth changes of as little as 5 Hz should be clearly measurable using a decay-rate analysis.
(94) A quantitative assessment of the distortion and muffling of the face mask can be made by a comparison between the spectra with and without the face mask (e.g., between 10A and 10B, or 10A and 10C). In one embodiment, the comparison involves at least one of a comparison between the center frequency of one or more formants, the bandwidth of one or more formants, or the amplitude of one or more formants. In general, the greater the shift in the center frequency of a given formant, the greater the distortion and muffling. Likewise, the greater the change in bandwidth or amplitude (or both) of a formant, the greater the distortion and muffling.
(95) For example, the Weini K320T N95 mask (see
(96) As a non-limiting example, these shifts in frequency and changes in bandwidth and amplitude are factors that may be used as inputs into a distortion value. As discussed, a greater shift in frequency and greater changes in bandwidth and/or amplitude likely mean a greater distortion value. In some embodiments, the first three formants are considered. However, in some embodiments, only the first, or only the first and second, formants are considered. Furthermore, when multiple formants are considered, each formant can contribute equally to the distortion value, or the formants could be weighted. For example, even a small shift in frequency of the first formant can produce a large amount of distortion and muffling.
(97) With respect to the example of
(98) The following are non-limiting examples of calculations of distortion values using the example of
(99) In one example, the change in frequency or bandwidth of a formant caused by a particular mask can be summarized in a numerical distortion index to allow the comparison of various masks. One such definition of a distortion index might be formed by first considering values of normalized frequency shift, dF.sub.1, and bandwidth change, dB.sub.1, defined as follows (note that subscript “m”=mask, subscript “nm”=no mask, and bandwidth change is assumed to be an increase since the no mask condition results in a minimum bandwidth):
ΔF.sub.1=dF.sub.1=|F.sub.1m−F.sub.1nm|/F.sub.1nm
ΔBWF.sub.1=dB.sub.1=(BWF.sub.1m−BWF.sub.1nm)/BWF.sub.1nm
(100) Thus, if a first formant with a frequency value of 500 Hz with no mask and a bandwidth of 80 Hz with no mask, has a frequency value of 470 Hz and a bandwidth of 120 Hz with a particular mask, the value of dF.sub.1 would be 30/500=0.06, and the value of dB.sub.1 would be equal to (120−80)/80=0.50.
(101) Each of these values could be normalized by dividing it by the minimum perceptible value, as determined experimentally, which might be 0.01 for dF.sub.1 and 0.1 for dB.sub.1. This would yield a value of 6.0 for frequency shift and a value of 5.0 for bandwidth increase.
(102) Assuming that frequency shift and bandwidth increase contribute equally to distortion, these two measures may be added together to give them equal weighting, resulting a single numerical measure of the speech distortion and muffling. In this case the combined measure would be 11.0.
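The worked example above can be combined into a short sketch. The input values (F.sub.1: 500 Hz to 470 Hz, bandwidth: 80 Hz to 120 Hz) and the minimum-perceptible normalizers (0.01 and 0.1) are those given in the text:

```python
f1_nm, f1_m = 500.0, 470.0   # first-formant frequency: no mask, with mask
bw_nm, bw_m = 80.0, 120.0    # first-formant bandwidth: no mask, with mask

dF1 = abs(f1_m - f1_nm) / f1_nm    # normalized frequency shift: 0.06
dB1 = (bw_m - bw_nm) / bw_nm       # normalized bandwidth increase: 0.50

# divide each by its minimum perceptible value, then weight equally
index = dF1 / 0.01 + dB1 / 0.1     # 6.0 + 5.0 = 11.0
```

A single scalar of this kind allows different masks to be ranked by the distortion they introduce, as the text describes.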
(103) As another example, a normalization routine may find that the maximum shift in frequency is 50 Hz and the maximum change in bandwidth is 500 Hz, and may use a range of 0-100 for the distortion value. In this case, because the frequency shift and bandwidth change are considered equally, each would contribute 0-50 to the distortion value. Using a simple linear normalization, the shift of 5 Hz for the face mask of 10B would add 5 to the distortion value. If the bandwidth showed an increase of 50 Hz, the face mask of 10B would also add 5 to the distortion value (500/50=50/5). Thus, the distortion value for the face mask of
(104) As discussed, the analyzer 206 may compare the spectra with and without the face mask 204. In one embodiment, the analyzer may further comprise a graphical user interface or other display for visualizing such a comparison of formant spectra. For example,
(106) The invention is described in detail with respect to preferred embodiments, and it will now be apparent from the foregoing to those skilled in the art that changes and modifications may be made without departing from the invention in its broader aspects, and the invention, therefore, as defined in the claims, is intended to cover all such changes and modifications that fall within the true spirit of the invention.
(107) Thus, specific apparatus for and methods for objectively measuring the effect of wearing a mask on the acoustical properties of speech have been disclosed. It should be apparent, however, to those skilled in the art that many more modifications besides those already described are possible without departing from the inventive concepts herein. The inventive subject matter, therefore, is not to be restricted except in the spirit of the disclosure. Moreover, in interpreting the disclosure, all terms should be interpreted in the broadest possible manner consistent with the context. In particular, the terms “comprises” and “comprising” should be interpreted as referring to elements, components, or steps in a non-exclusive manner, indicating that the referenced elements, components, or steps may be present, or utilized, or combined with other elements, components, or steps that are not expressly referenced.