NON LINEAR FILTER WITH GROUP DELAY AT PRE-RESPONSE FREQUENCY FOR HIGH RES AUDIO
20170346465 · 2017-11-30
Abstract
Methods and devices are described for reducing the audible effect of pre-responses in an audio signal. The pre-responses are effectively delayed by employing a digital non-minimum-phase filter, which includes a zero lying outside the unit circle in its z-transform response. This zero is not paired with another zero at a reciprocal position inside the unit circle, as this would linearise the phase modification. The filtering can introduce a greater group delay at the pre-response frequency than at a low frequency, such as 500 Hz or even 0 Hz. The technique can be used to reduce pre-responses in an existing audio signal and also to pre-empt pre-responses that would be introduced to the audio signal by subsequent processing.
Claims
1. A method for reducing the audible effect of a pre-response having energy at a pre-response frequency, the method comprising: introducing group delay at the pre-response frequency by filtering a digital audio signal using a digital non-minimum-phase filter having a z-transform response that includes a zero lying outside the unit circle whose phase response is not linearised by a zero at a reciprocal position inside the unit circle.
2. A method for reducing the audible effect of a pre-response having energy at a pre-response frequency, the method comprising: introducing group delay at the pre-response frequency by filtering a digital audio signal using a digital non-minimum-phase filter having a z-transform response that includes a zero lying outside the unit circle, wherein the zero is selected to create a greater group delay at the pre-response frequency than at a frequency of 0 Hz.
3. A method according to claim 1, wherein the digital audio signal contains the pre-response prior to the filtering.
4. A method according to claim 1, wherein the filtering preconditions the digital audio signal to reduce a pre-response that will be generated in a subsequent upsampling process.
5. A method according to claim 1, wherein the pre-response frequency lies within 20% of a reference Nyquist frequency equal to one half of a reference sampling frequency that is less than or equal to the sampling frequency of the digital audio signal.
6. A method according to claim 5, wherein the z-transform response of the filter has at least three zeroes lying outside the unit circle, each selected such that it has a z-plane reciprocal whose real part is more negative than minus 0.5, wherein z represents a time advance of one sample at the reference sampling frequency.
7. A method according to claim 5, wherein the reference sampling frequency is a sampling frequency of a process that produced the pre-response.
8. A method according to claim 5, wherein the reference sampling frequency is 44.1 kHz or 48 kHz.
9. A method according to claim 5, wherein the reference sampling frequency is the sampling frequency of the digital audio signal.
10. A method according to claim 5, wherein the reference sampling frequency is one half of the sampling frequency of the digital audio signal.
11. A method according to claim 5, wherein the pre-response frequency is not greater than 60% of a signal Nyquist frequency equal to one half of the sampling frequency of the digital audio signal, and wherein the z-transform response of the filter includes a further zero lying outside the unit circle and contributing a group delay greater at the signal Nyquist frequency than at the pre-response frequency.
12. A method according to claim 1, wherein the z-transform response of the filter also includes a pole lying inside the unit circle at a reciprocal position to the zero, the pole and zero together selected to create an all-pass factor in the transfer function of the filter.
13. A method according to claim 1, wherein the z-transform response of the filter comprises one or more zeroes and one or more poles configured such that the combination of zeroes and poles provides an amplitude response flat within 1 dB over the frequency range 0 to 16 kHz.
14. A method according to claim 1, wherein the zero is selected to create a greater group delay at the pre-response frequency than at a comparison frequency lower than the pre-response frequency.
15. A method according to claim 14, wherein the group delay at the pre-response frequency exceeds the group delay at the comparison frequency by at least ten cycles at the pre-response frequency.
16. A method according to claim 14, wherein the comparison frequency is less than or equal to 500 Hz.
17. A method according to claim 16, wherein the comparison frequency is 0 Hz.
18. A method according to claim 1, wherein the group delay introduced at the pre-response frequency exceeds by at least ten cycles at the pre-response frequency the time interval from the start of an impulse response of the filter to a sample thereof having the largest absolute magnitude.
19. A mastering processor adapted to receive a first digital audio signal and to furnish a second digital audio signal for distribution, wherein the mastering processor is configured to reduce the audible effect of a pre-response on a signal rendered from the second signal for auditioning by a listener by introducing group delay at the pre-response frequency by filtering a digital audio signal using a digital non-minimum phase filter having a z-transform response that includes a zero lying outside the unit circle whose phase response is not linearized by a zero at a reciprocal position inside the unit circle, or whose zero is selected to create a greater group delay at the pre-response frequency than at a frequency of 0 Hz.
20. (canceled)
21. (canceled)
22. A non-transitory computer readable medium having stored therein instructions that when executed cause a computer to perform a method for reducing the audible effect of a pre-response having energy at a pre-response frequency, comprising: introducing a group delay at the pre-response frequency by filtering a digital audio signal using a digital non-minimum phase filter having a z-transform response that includes a zero lying outside the unit circle whose phase response is not linearized by a zero at a reciprocal position inside the unit circle.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0032] Examples of the present invention will be described in detail with reference to the accompanying drawings. (The figure descriptions of paragraphs [0033] to [0050] are not reproduced in this text.)
DETAILED DESCRIPTION
[0051]
[0052] The listener's equipment 7, 8, 9 includes a digital to analogue converter (DAC) 8 and a transducer 9 such as a headphone or loudspeaker, and optionally further processing (P2) 7.
[0053] As will be described later, processing according to the invention may be provided either as P1 in the mastering equipment 5 or as P2 in the listener's receiving equipment 7. In both cases, pre-rings generated by the ADC 2 or the SRC 4 or by the listener's DAC 8 will be treated. In some implementations, processing according to the invention may be provided at both locations. Furthermore, in some embodiments, processing according to the invention may be provided before the SRC, if present, or even before the Archive.
[0054] The CD uses a sample rate of 44.1 kHz and throughout the 1980s and 1990s many companies operated the whole recording chain at 44.1 kHz, also archiving at 44.1 kHz so that the SRC 4 was not used. More recently there has been a tendency to run the ADC and the archive at a higher rate such as 88.2 kHz, 176.4 kHz, 192 kHz, or even 2.8224 MHz for 1-bit ‘DSD’ recording, thus necessitating the sample rate converter 4, which can be either a separate piece of hardware or part of a software Digital Audio Workstation (DAW).
[0055] Sample rate conversion has a strong potential to generate pre-responses because of the necessary filtering. This problem is not evaded by running the whole chain at 44.1 kHz, for most commercial ADCs that furnish a 44.1 kHz output will operate internally at a higher frequency and then use a sample rate conversion process to provide the desired output sample rate.
[0056] Diverse architectures are known for sample rate conversion, the choice depending on factors such as whether the frequencies involved are in a simple integer ratio such as 2:1 or a more ‘difficult’ ratio such as 48:44.1. Alias-free downsampling to 44.1 kHz however always requires a low-pass filter that cuts quite sharply above 20 kHz. The requirements on the shape of the filter are not critically dependent on the sampling frequency of the source signal. This is true also for upsampling to an arbitrary new sample rate. Thus both downsampling and upsampling/reconstruction generate a requirement for a digital low-pass filter known as an ‘antialias’ filter when downsampling or as a ‘reconstruction’ filter when upsampling. The technical requirements for the two filters are not necessarily very different.
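The link between a sharp linear-phase low-pass filter and pre-responses can be illustrated with a minimal sketch. It is not any of the filters discussed in this document: the Hann-windowed sinc design, the 255-tap length, the 21 kHz cutoff and the 88.2 kHz rate are assumptions chosen only for illustration. Because such an impulse response is symmetric about its central peak, as much ringing energy arrives before the peak as after it.

```python
import math

def windowed_sinc_lowpass(num_taps, cutoff_hz, sample_rate_hz):
    """Linear-phase FIR low-pass: a Hann-windowed sinc, symmetric about its centre tap."""
    fc = cutoff_hz / sample_rate_hz            # cutoff in cycles per sample
    centre = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - centre
        # Ideal low-pass response (sinc), with the x = 0 limit handled explicitly.
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps

# Illustrative (assumed) parameters: 255 taps, 21 kHz cutoff, 88.2 kHz sample rate.
taps = windowed_sinc_lowpass(num_taps=255, cutoff_hz=21000.0, sample_rate_hz=88200.0)

# Split the impulse-response energy into the parts before and after the main peak.
peak_index = max(range(len(taps)), key=lambda n: abs(taps[n]))
pre_energy = sum(t * t for t in taps[:peak_index])
post_energy = sum(t * t for t in taps[peak_index + 1:])

print("peak at tap", peak_index)
print("energy before peak:", pre_energy)
print("energy after peak: ", post_energy)
```

The two energies agree to within rounding error, which is the symmetric pre-ring and post-ring of a linear-phase design; the ringing itself sits near the cutoff frequency.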
[0057] Opinion is divided on whether, when downsampling audio to 44.1 kHz or upsampling from 44.1 kHz, the low-pass filter should provide a substantial ‘stop-band’ attenuation such as 90 dB at 22.05 kHz or whether it is acceptable to use a filter such as a ‘half-band’ operating at 88.2 kHz and configured to provide 6 dB attenuation at 22.05 kHz and full attenuation by 24.1 kHz. Historically, it was usual to make the filter's transition band as wide as was considered acceptable in order to minimise the number of taps in a hardware transversal (‘FIR’) implementation. The transition band was thus about 2 kHz wide, for example from 20 kHz to 22.05 kHz, or alternatively about 4 kHz wide, for example from 20 kHz to 24.1 kHz. More recent software implementations have provided much narrower transition bands, for example the recent ‘Adobe Audition CS 5.5’ DAW offers SRC facilities having a transition band about 100 Hz wide, starting about 75 Hz below the Nyquist frequency.
[0058] Perhaps more typical is the earlier ‘Adobe Audition 1.5’ DAW which offers a filter having a transition band about 500 Hz wide, starting at 21.5 kHz. Many commercially issued recordings exhibit a near-Nyquist noise spectrum that suggests that a filter such as this may have been used at some stage in the processing.
[0059] The impulse responses of the Adobe and Arcam filters are shown in
[0060] The Adobe plot is in fact the output of ‘Adobe Audition 1.5’ when upsampling a single impulse in a 44.1 kHz stream to 88.2 kHz, with the “Pre/Post Filter” and “Quality=999” options selected. Investigation reveals that the same filter is used internally when Audition is used to downsample from 88.2 kHz to 44.1 kHz. In the far ‘tail’ of the pre-response,
[0061] To remove the Audition filter's pre-ring a double-notch filter might therefore be indicated but this would be specific to the Audition 1.5 SRC. We desire a more general method since a music archive may contain 44.1 kHz recordings made and/or downsampled using diverse and possibly unknown equipment.
[0062] Pre-response Suppression by Filtering
[0063] Assuming pre-responses may have energy in the range 20 kHz-22.05 kHz, one approach is to attenuate this frequency range. A third order IIR filter having the following z-transform response:
attenuates the region 20 kHz-22.05 kHz by 20 dB when operated at a 44.1 kHz sample rate. This IIR filter has poles (crosses) and zeroes (circles) as shown in
[0064] According to the invention, the pre-responses may be further reduced by replacing the minimum-phase filter shown above by the corresponding maximum-phase filter, as follows:
[0065] This filter has the same poles but with zeroes outside the unit circle, as shown in
[0066] With zeroes outside the unit circle, it is now possible to adjust the poles inside the unit circle so as to create an all-pass filter:
whose poles and zeroes are shown in
[0067] More powerful suppression of pre-responses is provided by a 12th order all-pass filter, as follows:
whose poles and zeroes are shown in
[0068] Referring to
[0069] The bottom trace of
[0070]
[0071] To measure pre-response delay a reference is needed, since a modest delay of the total signal does not affect the audio quality. One may conjecture that the ear may use as reference the highest peak in a filtered impulse response or a filtered envelope response. In practice it is found that non-minimum-phase zeroes each having a larger group delay in the vicinity of 20 kHz than at low audio frequencies are helpful. We note that group delay at a frequency of 0 Hz is well-defined mathematically: the group delay versus frequency of non-minimum-phase zeroes having various frequencies over the range 11.025 kHz-22.1 kHz is plotted in
[0072] Referring again to
1/(−0.12±0.06 i)=−6.46±3.43 i
and
1/(−0.4±0.16 i)=−2.15±0.87 i
are contributing little to the group delay near 20 kHz relative to group delay at low audio frequencies. Calculation confirms that indeed these four zeroes and four poles can be deleted while affecting the said relative group delay by only 5% but saving 33% in filter complexity.
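The reciprocal arithmetic quoted above can be checked with a short sketch. It takes only the two zero positions printed in the text and inverts them; the variable names are illustrative, and since the printed values are rounded, agreement is expected to two decimal places only.

```python
# Zero positions as printed in the text (one member of each conjugate pair).
quoted_zeroes = [complex(-6.46, 3.43), complex(-2.15, 0.87)]

# Reciprocal of each zero; the sign of the imaginary part flips, which only
# selects the other member of the conjugate pair.
recovered_poles = [1 / z for z in quoted_zeroes]

for z, p in zip(quoted_zeroes, recovered_poles):
    print("zero", z, "-> reciprocal", complex(round(p.real, 2), round(p.imag, 2)))
```

Both reciprocals lie close to the origin, with real parts well short of −0.5, which is consistent with the observation that these pole and zero pairs add little group delay near 20 kHz relative to low frequencies.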
[0073] Thus in the case of all-pass filters, it is the poles whose real part is more negative than −0.5, together with their corresponding zeroes, that are most helpful in delaying pre-responses close to the Nyquist frequency. In the case of filters that are not all-pass, it is the zeroes that are important, since a zero can provide helpful attenuation even if there is no corresponding pole. Thus in general, it is the zeroes lying outside the unit circle whose reciprocals have real parts more negative than −0.5 that are most helpful in reducing pre-responses.
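The effect of the real part of a pole (or of a zero's reciprocal) on group delay near the Nyquist frequency can be sketched numerically. The code below uses the standard closed-form group delay of a first-order all-pass section, and of a single zero lying outside the unit circle. The function names, the example pole at −0.8, and the 20 kHz and 44.1 kHz figures are assumptions made for illustration and are not the coefficients of the filters described here.

```python
import cmath
import math

def allpass_section_delay(pole, omega):
    """Group delay (samples) of (z^-1 - conj(p)) / (1 - p z^-1) at omega rad/sample."""
    r, theta = abs(pole), cmath.phase(pole)
    return (1 - r * r) / (1 - 2 * r * math.cos(omega - theta) + r * r)

def max_phase_zero_delay(reciprocal, omega):
    """Group delay (samples) of the factor (z^-1 - conj(p)), whose zero is at 1/conj(p).

    `reciprocal` is p, the z-plane reciprocal position inside the unit circle.
    """
    r, theta = abs(reciprocal), cmath.phase(reciprocal)
    c = math.cos(omega - theta)
    return (1 - r * c) / (1 - 2 * r * c + r * r)

def excess_delay(delay_fn, p, f_hz, fs_hz=44100.0):
    """Group delay at f_hz minus group delay at 0 Hz, in samples."""
    omega = 2 * math.pi * f_hz / fs_hz
    return delay_fn(p, omega) - delay_fn(p, 0.0)

# Hypothetical pole well to the left of -0.5, and one of the poles quoted in the text.
for p in (complex(-0.8, 0.0), complex(-0.12, 0.06)):
    print(p,
          "all-pass excess delay at 20 kHz:", excess_delay(allpass_section_delay, p, 20000.0),
          "bare-zero excess delay at 20 kHz:", excess_delay(max_phase_zero_delay, p, 20000.0))
```

Under these assumptions the section with its pole at −0.8 delays 20 kHz by several samples more than it delays 0 Hz, whereas the section with its pole at −0.12+0.06i contributes well under one sample of excess delay. For a real-coefficient filter the contributions of both members of each conjugate pair are added.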
[0074] In some cases it is possible to deduce the presence of a non-minimum-phase zero in a filtering apparatus by feeding in a sine-wave with an exponentially rising envelope. For example, in the case of the filter represented in
[0075] Of course, such a test signal must have a restricted duration in order not to provoke overload and care must be taken that processing delay is not mistaken for attenuation. A suitable test signal might start at a very low amplitude and contain an impulse as a time reference at the end of the increasing sine-wave. The test could include a comparison of the response to that signal with the response to a sine-wave at the same frequency but with constant amplitude. However, it is not practical to test for zeroes that are far outside the unit circle in this way and there may also be signal-to-noise difficulties in the case of zeroes that are extremely close to other zeroes. In difficult cases one may alternatively capture the impulse response of the apparatus to high precision using a technique such as chirp excitation, and then apply a root-finding algorithm to the impulse response.
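The root-finding approach mentioned above can be sketched as follows, assuming the captured impulse response is short and finite (FIR) with a non-zero first sample. The Durand-Kerner iteration is one possible root-finder, swapped in here for illustration; the text does not say which algorithm was used, and the example impulse response is hypothetical, constructed to have one zero at −1.25 and a conjugate pair at 0.5±0.5i.

```python
def polynomial_roots(coeffs, iterations=500):
    """Durand-Kerner root finder; coeffs are in descending powers, coeffs[0] != 0."""
    degree = len(coeffs) - 1
    monic = [c / coeffs[0] for c in coeffs]
    # Distinct, non-real starting points so that complex roots can be reached.
    roots = [complex(0.4, 0.9) ** k for k in range(degree)]
    for _ in range(iterations):
        for i in range(degree):
            value = 0j
            for c in monic:                    # Horner evaluation of the polynomial
                value = value * roots[i] + c
            denom = 1 + 0j
            for j in range(degree):
                if j != i:
                    denom *= roots[i] - roots[j]
            roots[i] -= value / denom
    return roots

def zeroes_outside_unit_circle(impulse_response):
    """Zeroes of H(z) = sum h[n] z^-n that lie outside the unit circle.

    Multiplying H(z) by z^N gives a polynomial in z whose coefficients, in
    descending powers, are simply h[0], h[1], ..., h[N].
    """
    return [z for z in polynomial_roots(list(impulse_response)) if abs(z) > 1.0]

# Hypothetical impulse response: (z + 1.25)(z^2 - z + 0.5) expanded.
h = [1.0, 0.25, -0.75, 0.625]
outside = zeroes_outside_unit_circle(h)
print(outside)
```

A measured impulse response would be longer and noisier than this example, so a more robust root-finder and some care over truncation would be needed in practice.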
[0076] In the situation depicted in
[0077] The treatment has also been found useful for ‘hi-res’ recordings at a sample frequency such as 96 kHz which may contain pre-rings having frequencies closer to 48 kHz. The same filter architecture and coefficients have been used, but clocked at 96 kHz so that the large group delay is achieved at frequencies in the range 44 kHz to 48 kHz.
[0078] Separately from the above, it is sometimes required to treat a signal that has already been upsampled: for example there is evidence that some nominally 88.2 kHz or 96 kHz commercially available recordings have been upsampled from 44.1 kHz or 48 kHz respectively, thereby containing pre-responses just above 20 kHz. In these cases we must distinguish between the sampling frequency of the signal presented for treatment and a ‘reference’ sampling frequency which relates to the process that created, or will subsequently create, the pre-rings it is desired to treat. Similar care is needed over the ‘z-transform’: for implementation purposes ‘z’ must represent a time advance of one sample of the signal presented for processing, but the criterion previously discussed relating to the positions of zeroes assumes a ‘z’ that represents one sample period of the process that produced or will produce a pre-response.
[0079] For the case where the reference sampling frequency is one-half of the signal's sampling frequency, an appropriate modification to the improvement filters already presented is to replace z by z^2 throughout, and hence z^2 is replaced by z^4. The poles and zeroes shown in
[0080] The filters thus modified could alternatively be implemented by separate processing of substreams consisting of odd samples and even samples respectively, and this may be more economical.
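The equivalence between replacing z by z^2 and filtering the even and odd substreams separately can be sketched with a first-order all-pass section. The section, its coefficient a = −0.8 and the function names are assumptions chosen for illustration, not the filters of this document.

```python
def allpass(x, a):
    """First-order all-pass H(z) = (z^-1 - a) / (1 - a z^-1), run sample by sample."""
    y, x_prev, y_prev = [], 0.0, 0.0
    for s in x:
        out = -a * s + x_prev + a * y_prev
        y.append(out)
        x_prev, y_prev = s, out
    return y

def allpass_z2(x, a):
    """The same section with z replaced by z^2: every delay becomes two samples."""
    y = []
    for n, s in enumerate(x):
        x_prev2 = x[n - 2] if n >= 2 else 0.0
        y_prev2 = y[n - 2] if n >= 2 else 0.0
        y.append(-a * s + x_prev2 + a * y_prev2)
    return y

def allpass_polyphase(x, a):
    """Filter the even and odd substreams independently, then re-interleave."""
    y = [0.0] * len(x)
    y[0::2] = allpass(x[0::2], a)
    y[1::2] = allpass(x[1::2], a)
    return y

# A unit impulse at the doubled sample rate (assumed test signal).
x = [1.0] + [0.0] * 15
direct = allpass_z2(x, -0.8)
split = allpass_polyphase(x, -0.8)
print(direct[:6])
print(split[:6])
```

Both routes give the same output. In the split form each substream runs the original filter at the reference rate, which is one way in which the economy mentioned above might arise.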
[0081] These possibilities are not exhaustive, and although the processing will be performed digitally, it is not excluded that analogue media may intervene. For example, the archive 3 in