Method and system for audio critical listening and evaluation

Abstract

Disclosed herein is a method of constructing and utilizing a sound engineering evaluation and comparison process to allow for improved finished results. Such a method entails the utilization of a high-pass filter for listening evaluation of recorded music or sounds, including consistency with low-frequency mixing, to allow for a tool to implement changes in relation to the filtered results in order to accommodate sensitivities of the human ear (with the optional inclusion of a comparison method to provide possible further enhanced results and the avoidance of biases). In such a manner, a facilitating method for sound engineering mixing adjustments that provides such accommodations is provided, yielding improved sound recordings for distribution within on-line or recording product frameworks.

Claims

1. An audio processing auditory evaluation method for applying fine aesthetic adjustments in a sound recording studio to musical audio signals utilizing a separate bass track and non-bass track component, said method comprising: i) separation of an audio recording into at least one recorded non-bass track by filter exclusion of at least one bass track component of a musical audio recording; ii) providing a high-pass filter for utilization with said at least one recorded non-bass track, wherein said high-pass filter includes a distortion minimization component and optionally a maximum ear sensitivity frequency level treatment component; iii) providing a multi-level bass track filter for utilization with said at least one recorded bass track, wherein said bass track filter separates low-frequency bass track and high frequency portions; iv) applying said high-pass filter of step “ii” to said at least one recorded non-bass track to generate at least one filtered non-bass track component exhibiting results from filtered, distortion minimized, evaluation and optionally maximum ear sensitivity frequency level treatment results; v) applying said multi-level bass track filter of step “iii” to said at least one recorded bass track to generate at least one filtered bass track component exhibiting separated low-frequency and high-frequency resultant bass track portions; and vi) combining said at least one filtered non-bass track component of step “iv” with said at least one filtered bass track component of step “v” to generate a resultant mixed audio recording.

2. The audio processing auditory evaluation method of claim 1 wherein said optional maximum ear sensitivity frequency level treatment component is present.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

(1) FIG. 1 is a prior art graphical representation of a typical Auratone Frequency Response.

(2) FIG. 2 is a prior art graphical representation of a typical Yamaha NS10 Frequency response.

(3) FIG. 3 is a prior art graphical representation of an ISO 226:2003 Curve set at different frequencies.

(4) FIG. 4 is an inventive Primary Non-Bass Track Filter curve graphical representation.

(5) FIG. 5 is a graphical representation of a curve for a device evaluation result after filtering.

(6) FIG. 6 is a graphical representation of a curve showing FM radio typical results for a filtered non-bass track.

(7) FIG. 7 is a graphical representation of a curve showing a Mastering Harshness evaluation of a filtered non-bass track.

(8) FIG. 8 is a passive evaluation graphical representation of a curve showing a filtered non-bass track.

(9) FIG. 9 is a graphical representation of a curve showing total quality review for a filtered non-bass track.

(10) FIG. 10 is a graphical representation of a curve showing an Auratones-like curve without nonlinearity for a filtered non-bass track.

DETAILED DESCRIPTION OF THE DISCLOSED SYSTEM AND DRAWINGS

(11) When judging sound quality, both subjective and objective factors are at play, according to at least one source. Generally, audio is perceived both objectively and subjectively, and that source explores the objective side. This disclosure goes into detail about what objective factors exist, decomposes them, and covers how they can be measured and how to advance them. PEMO-Q is one of the more popular methods for this evaluation, while others exist and are in development. Listening without bass is a thread linking the experiences of Geoff Emerick, Bruce Swedien and Bob Clearmountain. Each of these key engineers describes a distinct listening technique and holds it to be central to his work and accomplishments. The first of the techniques pre-dates the work of Geoff Emerick, who attributes it to Norman Smith, an earlier engineer for The Beatles also known for working with the highly regarded band Pink Floyd. It is not certain how the technique originated, but it is known to have played a role in the success of one of the most popular musical groups of all time, which was noted for recordings that had a unique sound.

(12) Today, listening systems with an extended bass range are common and relatively inexpensive in audio engineering. Bass levels of a mix can be set with precision using these systems; however, mixing with them in full-range mode may detract from the potential mix quality and from the ability of a mix to translate well to a variety of systems. In the past, full-range systems were less common. Moreover, the techniques employed by Smith, Emerick, Swedien and Clearmountain distinctly eliminated the bass frequencies, which helped focus attention on the frequencies to which we are most sensitive. Because each of these engineers used a distinct technique, different options for evaluation could also lend unique styling to the material at hand.

(13) The sensitivity of the human ear and the nature of the bass frequency range are significant factors when considering the use of a high-pass filter for evaluation. While the Fletcher-Munson curves may be the most widely known description of the sensitivity of the ear, they were later followed by the Robinson-Dadson curves and most recently by ISO 226:2003 (FIG. 3, Prior Art). A general understanding from this research is that the ear is most sensitive around 3 kHz, with a peak there and diminished sensitivity above and below. Humans have little sensitivity to bass frequencies, or to frequencies as they approach the Nyquist limit. Comfort is also a factor: if a pure 50 Hz tone were played in a room, it would be far more comfortable to the ear than a tone at 3 kHz, or even 10 kHz, at the same loudness level. Using a high-pass filter to evaluate audio introduces a distortion that other sources would not lead one to expect: when high-pass filtering is applied, it affects frequency ranges significantly above the cutoff frequency. We also understand from this source that speakers can act as high-pass filters, just as an analog or digital filter could. Considering this understanding together with the techniques employed by Smith, Emerick, Swedien and Clearmountain, it is apparent that the classic evaluation techniques accounted for subtle distortions caused by filtering, while simultaneously focusing the audio engineer on the frequencies to which the human ear is most sensitive.
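The statement that high-pass filtering reaches well above the cutoff can be illustrated numerically. The following is a minimal sketch only, assuming an ideal analog first-order high-pass; the function name and the 100 Hz cutoff are hypothetical and are not taken from this disclosure.

```python
import math


def first_order_highpass_response(freq_hz, cutoff_hz):
    """Gain (dB) and phase lead (degrees) of an ideal analog first-order
    high-pass, H(jw) = (jw/wc) / (1 + jw/wc). An illustrative assumption,
    not the filter of this disclosure."""
    ratio = freq_hz / cutoff_hz
    magnitude = ratio / math.sqrt(1.0 + ratio * ratio)
    phase_deg = math.degrees(math.atan2(1.0, ratio))
    return 20.0 * math.log10(magnitude), phase_deg


CUTOFF_HZ = 100.0  # hypothetical cutoff, chosen only for illustration

for multiple in (1, 2, 4, 8):
    freq = multiple * CUTOFF_HZ
    gain_db, phase_deg = first_order_highpass_response(freq, CUTOFF_HZ)
    print(f"{freq:6.0f} Hz: {gain_db:6.2f} dB, {phase_deg:5.1f} deg phase lead")
```

Under these assumptions the response is about 3 dB down at the cutoff and is still measurably attenuated and phase-shifted an octave or more above it.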

(14) Bass-frequency wavelengths are longer, and thus less character is possible in the bass range. While no source cited in this research makes this claim, it is common knowledge to virtually all audio engineers with a basic understanding of the nature of the audible frequency range. Frequencies are measured in Hertz (Hz), that is, cycles per second. The human range of hearing spans from 20 Hz to 20,000 Hz. Because of this wide range, frequency is typically represented on a logarithmic scale, as a linear scale would be too wide and would not practically depict audible frequency energy. Because of the nature of the depictions and the nature of the frequency sensitivity of the ear, we understand that exponentially more energy is contained in the lower frequencies. It logically follows that exponentially less character is possible in the lower frequencies than in those higher. From this it is easy to see that bass frequencies do not carry much character. With bass, the greatest factors are loudness and transient characteristics. Combining this with the notion that the ear is less sensitive to bass frequencies, we see that this range is distinct from those upper ranges to which we are less sensitive but that can carry exponentially higher character.
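The contrast between linear and logarithmic depictions of the 20 Hz to 20,000 Hz range can be made concrete with a short calculation. This is a sketch only; the 200 Hz boundary between "bass" and the remainder is an assumption made for illustration and is not specified in this disclosure.

```python
import math


def octaves_between(low_hz, high_hz):
    """Number of octaves (doublings of frequency) spanned by a band."""
    return math.log2(high_hz / low_hz)


# Band edges are assumptions for illustration only.
bands = {
    "bass (assumed 20 Hz to 200 Hz)": (20.0, 200.0),
    "remainder (200 Hz to 20 kHz)": (200.0, 20000.0),
}

for name, (low, high) in bands.items():
    width_hz = high - low
    print(f"{name}: {width_hz:.0f} Hz wide, {octaves_between(low, high):.2f} octaves")
```

With these assumed edges, the bass band covers roughly a third of the range in octaves but under one percent of it in Hertz, which is one way to read the remark that a linear scale would not practically depict the audible range.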

(15) Thus, it is understood within this disclosure that bass frequencies obscure one's decision making during mixing. They are more pleasing to the ear, use far more energy, have exponentially less resolution than the ranges above, and are reproduced with the highest variability among different listening systems. From this, it has been determined that to craft a mix that translates better and focuses on the frequencies that contain character, we may mix without the bass frequencies present. This is tantamount to using a zoom feature when working in the visual modality. From there it was then realized that there are discrete ways to best implement a bass filter to yield discrete stylistic results for the user.

(16) An optimal high-pass filter for listening evaluation has thus been crafted herein. Such a high-pass filter is used to eliminate bass frequencies, thus the name high-pass, as higher frequencies are allowed to pass, while lower frequencies are not.

(17) All high-pass and low-pass filters have a cutoff frequency and a slope. All professional audio engineers are familiar with these parameters. However, there are other details that can affect the filter performance.

(18) There are several filter types, including the Bessel type, which apparently has the widest, most gradual crossover region with a gentle dip when summed in usage as a crossover. There are other filter types that have their own benefits, which should be considered when selecting the optimal high pass filter for filtering these frequencies. It is likely that a Bessel filter would be the most useful for the purpose of evaluating audio.
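The more gradual transition attributed to the Bessel type can be compared against a Butterworth response of the same order. The sketch below assumes textbook second-order analog prototypes, both scaled so that the response is 3 dB down at the cutoff; the function names and the 100 Hz cutoff are hypothetical and the choice of second order is made only to keep the formulas short.

```python
import math

# The delay-normalised 2nd-order Bessel low-pass prototype 3 / (s^2 + 3s + 3)
# is 3 dB down where w^4 + 3 w^2 - 9 = 0; this constant rescales it so that
# the -3 dB point sits at the nominal cutoff.
BESSEL_K = math.sqrt((math.sqrt(45.0) - 3.0) / 2.0)


def bessel2_highpass(freq_hz, cutoff_hz):
    """Complex response of a 2nd-order Bessel high-pass (illustrative)."""
    s = 1j * freq_hz / cutoff_hz
    p = BESSEL_K / s  # low-pass to high-pass substitution
    return 3.0 / (p * p + 3.0 * p + 3.0)


def butterworth2_highpass(freq_hz, cutoff_hz):
    """Complex response of a 2nd-order Butterworth high-pass (illustrative)."""
    s = 1j * freq_hz / cutoff_hz
    return (s * s) / (s * s + math.sqrt(2.0) * s + 1.0)


def to_db(response):
    return 20.0 * math.log10(abs(response))


CUTOFF_HZ = 100.0  # hypothetical cutoff for illustration

for freq in (50.0, 100.0, 200.0, 400.0):
    print(
        f"{freq:5.0f} Hz  Bessel {to_db(bessel2_highpass(freq, CUTOFF_HZ)):7.2f} dB"
        f"  Butterworth {to_db(butterworth2_highpass(freq, CUTOFF_HZ)):7.2f} dB"
    )
```

Under these assumptions the Bessel curve attenuates less below the cutoff and more above it than the Butterworth curve, that is, its transition is spread over a wider region.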

(19) High-pass filters have an attribute called the filter order. A higher filter order will allow for increased accuracy; in particular, a 4th or 6th order filter would provide the highest possible audible quality. However, the filter order also sets the slope (the ultimate roll-off steepens by roughly 6 dB per octave for each added order), so it may be best to find the slope and order combination that best fits the curve shown in ISO 226:2003.
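The relationship between filter order and slope can be checked with a closed-form magnitude response. The sketch below uses a Butterworth high-pass purely because its magnitude has a simple formula; that choice, the function names and the 100 Hz cutoff are assumptions for illustration, not the filter of this disclosure.

```python
import math


def butterworth_highpass_gain_db(freq_hz, cutoff_hz, order):
    """Gain in dB of an ideal Butterworth high-pass of the given order:
    |H|^2 = 1 / (1 + (fc / f)^(2 * order))."""
    ratio = cutoff_hz / freq_hz
    return -10.0 * math.log10(1.0 + ratio ** (2 * order))


def slope_db_per_octave(cutoff_hz, order):
    """Roll-off measured over one octave, well below the cutoff."""
    upper = butterworth_highpass_gain_db(cutoff_hz / 8.0, cutoff_hz, order)
    lower = butterworth_highpass_gain_db(cutoff_hz / 16.0, cutoff_hz, order)
    return upper - lower


CUTOFF_HZ = 100.0  # hypothetical cutoff for illustration

for order in (1, 2, 4, 6):
    print(f"order {order}: about {slope_db_per_octave(CUTOFF_HZ, order):.1f} dB per octave")
```

A fit to a target curve such as the one in ISO 226:2003 would compare these gains against the target at each frequency; the target data are not reproduced here.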

(20) The inventive high pass filter design and utilization disclosed herein for evaluating audio centers around the sensitivity of the human ear (or possibly, if desired, around other listening needs). Using the sensitivity of the ear allows the user to focus on those frequencies to which we are most sensitive. However, this does still include de-emphasized bass frequencies that may not be reproduced by smaller listening systems. With this in mind, slopes that cut more bass provide the highest zooming effect and produce the highest translatability.

(21) Thus, one separates non-bass tracks from bass tracks (such as those that are well understood by the ordinarily skilled artisan). Once separated, the non-bass track component(s) is then, in its recorded state, filtered by the same or a different type of filter (Bessel and the like) in order for the sound engineer to target maximum ear sensitivity at 3 kHz (and the like); the filter likewise acts on the recorded track component(s) to remove distortions therefrom (see FIGS. 4-10, for instance, as the resultant curves are smooth without any appreciable or noticeable distortion levels). The resultant filtered non-bass track component(s) is then retained for combination and mixing with the bass track. To that end, the bass track is then filtered again to separate the high-frequency portion from the low-frequency portion, allowing for “zooming” in on each individual track portion for more effective treatment. Such a filter step gives the sound engineer the ability to work his or her magic, as it were, and effectively blend the two separate track portions together. In the past, as noted above, such a bass track would only be treated as a single component, without any means to rework and/or mix such a heavy frequency recorded component beyond the single separation from the non-bass track. Thus, with the ability to further refine the bass track portions in this manner, the sound engineer has a greater palette for coloring the resultant musical product and for integrating the same to a more robust level with the non-bass track. The resultant effect is that the finished product is provided with a uniformity in sound quality such that any sound producing device will ostensibly provide the same basic listening results for the user (dependent more on individual auditory qualities, rather than the sound producing device itself). Such has heretofore been unattained since the bass lines of standard recorded music have been treated as a single track or component and the non-bass track has either exhibited high levels of distortion or been targeted to maximum ear sensitivity levels, not both.
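The signal flow described above (filter the non-bass track, split the bass track into low and high portions, treat each, then recombine) can be sketched as follows. This is a minimal illustration only: simple one-pole filters stand in for the Bessel-type filters mentioned in this disclosure, the distortion-minimization component is not modeled, and every name, cutoff, gain and sample rate is a hypothetical placeholder.

```python
import math


def one_pole_lowpass(samples, cutoff_hz, sample_rate_hz):
    """Simple one-pole low-pass (a stand-in, not the disclosed filter)."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)
    out, state = [], 0.0
    for x in samples:
        state += alpha * (x - state)
        out.append(state)
    return out


def one_pole_highpass(samples, cutoff_hz, sample_rate_hz):
    """High-pass formed as the input minus its low-passed copy."""
    low = one_pole_lowpass(samples, cutoff_hz, sample_rate_hz)
    return [x - l for x, l in zip(samples, low)]


def split_bass(bass, crossover_hz, sample_rate_hz):
    """Split the bass track into complementary low and high portions."""
    low = one_pole_lowpass(bass, crossover_hz, sample_rate_hz)
    high = [x - l for x, l in zip(bass, low)]
    return low, high


def mix_tracks(non_bass, bass, evaluation_cutoff_hz=200.0,
               bass_crossover_hz=80.0, low_gain=1.0, high_gain=1.0,
               sample_rate_hz=48000.0):
    """Filter the non-bass track, treat the two bass portions with
    separate (hypothetical) gains, and sum the result."""
    filtered = one_pole_highpass(non_bass, evaluation_cutoff_hz, sample_rate_hz)
    bass_low, bass_high = split_bass(bass, bass_crossover_hz, sample_rate_hz)
    return [n + low_gain * l + high_gain * h
            for n, l, h in zip(filtered, bass_low, bass_high)]
```

Because the high portion is defined as the bass track minus its low portion, the two portions sum back to the original bass track when both gains are 1.0, so any audible change comes only from the gains the engineer applies.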

(22) Additionally, as noted above, the overall mixing program accords the sound engineer an alert component that indicates when a certain amount of time has passed during high intensity (for instance) audio exposure during such a mixing operation. The ability to limit such continuous exposure helps the sound engineer to maximize his or her capabilities without suffering ear fatigue.
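An alert component of this kind could be as simple as an accumulating timer. The sketch below is an assumption about one possible shape for it: the class name, the time limit and the injectable clock are all hypothetical, and it tracks elapsed listening time only, not sound intensity.

```python
import time


class ExposureAlert:
    """Accumulates listening time and reports when a limit is reached.
    Hypothetical helper; not part of the disclosed system."""

    def __init__(self, limit_seconds, clock=time.monotonic):
        self._limit = limit_seconds
        self._clock = clock          # injectable so the timer can be tested
        self._started_at = None
        self._accumulated = 0.0

    def start(self):
        """Begin (or resume) counting exposure time."""
        if self._started_at is None:
            self._started_at = self._clock()

    def stop(self):
        """Pause counting, keeping the time accumulated so far."""
        if self._started_at is not None:
            self._accumulated += self._clock() - self._started_at
            self._started_at = None

    def elapsed(self):
        running = 0.0
        if self._started_at is not None:
            running = self._clock() - self._started_at
        return self._accumulated + running

    def should_alert(self):
        return self.elapsed() >= self._limit
```

A mixing program might call `start()` when playback begins, `stop()` when it pauses, and poll `should_alert()` to decide when to prompt the engineer to rest.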

(23) Evaluation of audio also includes comparison, and the remaining discussion is in regard to this topic. FIGS. 4-10 provide different evaluation curves for the sound engineer to determine whether the filtered tracks have been treated to the desired levels. Comparison may include comparing one recording to another, or one processed version of a recording to another processed version.

(24) Because differences in loudness of approximately 0.41 dB SPL are detectable by the ear per the popular research, all difference thresholds for comparison should be lower than this level. This can be made even more exact by implementing the loudness curves in ISO 226:2003, so that all thresholds use this filtering to ensure that differences are according to the sensitivity of the ear.
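A level-matching check against the 0.41 dB figure quoted above might look like the following. This is a sketch under stated assumptions: it compares plain unweighted RMS levels, it does not apply the ISO 226:2003 curves, and the function names are hypothetical.

```python
import math

DETECTABLE_DIFFERENCE_DB = 0.41  # figure quoted in the paragraph above


def rms(samples):
    """Root-mean-square level of a non-empty sample sequence."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def level_difference_db(first, second):
    """Level of `first` relative to `second` in dB (unweighted).
    Both inputs must be non-empty and non-silent."""
    return 20.0 * math.log10(rms(first) / rms(second))


def is_level_matched(first, second, threshold_db=DETECTABLE_DIFFERENCE_DB):
    """True when the two versions differ by less than the threshold."""
    return abs(level_difference_db(first, second)) < threshold_db
```

For example, a version scaled by a factor of 1.02 differs by about 0.17 dB and would count as matched, while one scaled by 1.1 differs by about 0.83 dB and would not.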

(25) These disclosed comparison devices thus involve an evolution beyond the earlier ideas of sound recording improvements noted above.

(26) Of the current comparison methods, ABX testing is perhaps the most noted. While ABX testing is practical for research and controlled experiments, it would not be very practical in the course of an audio engineer's daily work and evaluation. In these prior situations, even more stringent statistical controls and evaluations are in place (or at least suggested) than in the standard ABX fare; however, these advancements, including double-blind testing, are generally not practical for audio engineering purposes and for speed of evaluation. Usage of a third random element may not be practical for some audio engineering tasks but may be for others, so the inclusion of this idea is something that can be considered in the crafting of a comparison device. Recently, cross-modal interplay has been considered, which is interplay between vision and audition. In consideration of this research, an advanced comparison device would include a feature that would blank or otherwise diminish the visual modality during audition. In this way, the device interface would not affect audition, while minimizing any sacrifice in the graphical appeal of the comparison device.
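For reference, the bookkeeping behind a basic ABX session can be sketched briefly. The helper names below are hypothetical, and the one-sided binomial test against pure guessing is a common way to score such sessions rather than a method specified in this disclosure.

```python
import math
import random


def assign_x(trials, seed=None):
    """Randomly decide, per trial, whether the hidden X is sample A or B."""
    rng = random.Random(seed)
    return [rng.choice("AB") for _ in range(trials)]


def score(assignments, answers):
    """Count the trials in which the listener identified X correctly."""
    return sum(1 for truth, answer in zip(assignments, answers) if truth == answer)


def p_value_guessing(correct, trials):
    """Probability of at least `correct` hits in `trials` by coin-flip guessing."""
    favourable = sum(math.comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials
```

For instance, 12 correct answers out of 16 trials corresponds to a probability of 2517/65536 (just under 4 percent) of arising from guessing alone. The number of trials needed for such a figure to be meaningful is one reason the procedure is slow for day-to-day engineering work.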

(27) Three divisions for evaluation have been proposed: stimulus response, pleasantness of sound, and identifiability of sounds or sound sources. For audio engineers, the mode of evaluation may instead be along the lines of frequency range evaluation, and then temporal evaluation where the listening focuses primarily on timing and dynamics. This thinking may be combined now with prior considerations which describe that there are both objective and subjective modes of listening, with the objective modes being those upon which meters can be constructed. To that end, prior work suggests that PEMO-Q is one of the most popular evaluation methods, while there will be others that follow. The best comparison tools will help to prompt users to evaluate audio in different ways. Also, as an evolution of the intellectual property claims involving voting, a rating scale may be more useful to gauge how the program material at hand measures within prompted categories of evaluation.

(28) This two-component (possibly three- and/or four-component) system thus overcomes such prior limitations with highly effective results. FIGS. 4 through 10 provide graphical representations of the overall effectiveness of such multi-track treatments, particularly in terms of the smooth (low unintended sound qualities and/or distortion) results (as compared with, for instance, those in Prior Art FIGS. 1 and 2). The maximum ear sensitivity levels are obtained through the high-pass filter application to the recorded non-bass track component(s) as shown in FIG. 4, as well, a result heretofore unattained within the recording industry, precisely because of the lack of utilizing anything that can capture such low distortion and maximum ear sensitivity results. Combining the results generated for the separated high-frequency and low-frequency bass track recordings with the resultant treatments (mixing to optimum levels separately, in view of the integration with the non-bass track results) again provides effects that have heretofore been unexplored (and hence unattained) within the recording industry.

(29) While the technique of listening without bass dates back to the earliest popular recordings, the technique itself may not have been duly acknowledged. Herein disclosed are the best ways of constructing a listening tool that would help in evaluating bass, particularly in terms of two-level separated mixing methods.

(30) Sound engineering is an area where the utmost quality is demanded. This research, and the tools that can be constructed according to it, will help increase the quality of work for virtually any engineer who is not currently employing similar techniques, and will help increase the quality of work of those who are accomplishing these tasks in a more manual or less accurate way. Although specific embodiments of the invention have been disclosed, those having ordinary skill in the art will understand that changes can be made to the specific embodiments without departing from the spirit and scope of the invention. The scope of the invention is not to be restricted, therefore, to the specific embodiments, and it is intended that the description herein cover any and all such applications, modifications, and embodiments within the scope of the present invention.

CPC classification

H04S2400/15 (ELECTRICITY)
H04H60/04 (ELECTRICITY)
A61B5/12 (HUMAN NECESSITIES)
G08B21/24 (PHYSICS)
H04S7/307 (ELECTRICITY)
G10L21/0232 (PHYSICS)
H04R3/04 (ELECTRICITY)
G10L21/038 (PHYSICS)
A61B5/746 (HUMAN NECESSITIES)
A61B5/725 (HUMAN NECESSITIES)
H04S2400/13 (ELECTRICITY)

International classification

H04R3/04 (ELECTRICITY)
A61B5/00 (HUMAN NECESSITIES)
A61B5/12 (HUMAN NECESSITIES)
G08B21/24 (PHYSICS)
G10L21/0232 (PHYSICS)
G10L21/038 (PHYSICS)
H04H60/04 (ELECTRICITY)
