Frequency response method and apparatus
11158341 · 2021-10-26
Assignee
Inventors
Cpc classification
H04S2400/15
ELECTRICITY
H03G5/165
ELECTRICITY
G10L21/00
PHYSICS
H03G5/025
ELECTRICITY
G11B20/10046
PHYSICS
International classification
G06F17/14
PHYSICS
Abstract
The invention provides a method and apparatus for filtering a temporal signal. A target magnitude frequency response H.sub.T(f) is specified (101,201) of frequency f in terms of a column vector l of K weights l.sub.k where log H.sub.T(f)=l.sup.TW(f) and W(f) is a column vector of K magnitude basis functions W.sub.k(f). A constrained frequency response H.sub.c(f) is computed (102,214) defined by log H.sub.c(f)=g.sup.TV(f) , where V(f) is a column vector of N constrained basis functions V.sub.n(f) for which each exp g.sub.nV.sub.n(f) satisfies a constraint preserved by concatenation, and g is a column vector of N coefficients satisfying a matching criterion between l.sup.TW(f) and g.sup.TV(f). An input temporal signal is received (103,212) and filtered (104,210) with the constrained frequency response H.sub.c(f) to form a filtered temporal signal; and the filtered temporal signal is output (105,211).
Claims
1. A method of filtering a temporal signal, the method comprising implementing on a digital signal processing device the steps of: specifying a target magnitude frequency response H.sub.T(f) of frequency f in terms of a column vector l of a number K of weights l.sub.k where log H.sub.T(f)=l.sup.TW(f) where l.sup.T is the vector transpose of l and W(f) is a column vector of K magnitude basis functions W.sub.k(f); computing a constrained frequency response H.sub.c(f) defined by log H.sub.c(f) =g.sup.T V(f), where V(f) is a column vector of a number N of constrained basis functions V.sub.n(f) for which each exp g.sub.nV.sub.n(f) satisfies a constraint preserved by concatenation, and g.sup.T is vector transpose of a column vector g of N coefficients g.sub.n satisfying a matching criterion between l.sup.TW(f) and g.sup.T V(f); receiving an input temporal signal; filtering the input temporal signal with the constrained frequency response H.sub.c(f) to form a filtered temporal signal; and outputting the filtered temporal signal.
2. The method of claim 1, wherein the matching criterion is a condition on one or more error measures between l.sup.TW(f) and g.sup.T V(f).
3. The method of claim 2, wherein the one or more error measures comprises an error measure an error function e(f) defined by l.sup.TW(f)−g.sup.T Re V(f).
4. The method of claim 3, wherein the condition of the matching criterion is a minimization with respect to variation of g, and the error measure of the error function e(f) is an inner product <e,e> where for any functions a(f) and b(f), the inner product denoted <a,b> is an integral or sum over f between a(f) and b(f), whereby g satisfies the matching criterion as Ml, where M=η.sup.+C, η is an N by N matrix η.sub.np defined by <Re V.sub.n, Re V.sub.p>, where Re denotes real part, η.sup.+ denotes an inverse or pseudo-inverse of η, and C is an N by K matrix C.sub.nk defined by <Re V.sub.n, W.sub.k>.
5. The method of claim 4, wherein the inverse or pseudo-inverse is a Moore-Penrose pseudo-inverse.
6. The method of any one of claims 1 to 3, wherein g is specified as g=Ml where M is an N by K matrix and the matching criterion comprises computing M to satisfy a condition on one or more error measures of an error function vector E(f) of K error functions defined by E(f)=W(f)−M.sup.TRe V(f).
7. The method of claim 6, wherein for any functions a(f) and b(f), the inner product an inner product denoted <a,b> is an integral or sum over f between a(f) and b(f), and the one or more error measures incorporate E(f) as at least one of the functions in the inner product.
8. The method of claim 7, wherein the one or more error measures incorporate <E.sub.k, E.sub.j> for at least one pair of values k,j between 1 and K.
9. The method of claim 8, wherein the one or more error measures comprise <E.sub.k, E.sub.k> for each value k between 1 and K, and the condition is minimisation minimization respect to variation of M.
10. The method of claim 7, wherein the one or more measures is <W.sub.k, E.sub.k> for each value k between 1 and K and for each value n between 1 and N, and the condition is <W.sub.k, E.sub.k>=0 whereby M satisfies the matching criterion as M=C.sup.+ϑ, where C is a K by N matrix having elements C.sub.k,n=<W.sub.k, ReV.sub.n> for each value k between 1 and K and for each value n between 1 and N, C.sup.+is an inverse or pseudo-inverse of C and ϑ is a K by K matrix having elements ϑ.sub.k,j=<W.sub.k, W.sub.j> for each pair of values k,j between 1 and K.
11. The method of claim 10, wherein the inverse or pseudo-inverse is a Moore-Penrose pseudo-inverse.
12. The method of claim 1, wherein: the step of specifying the target magnitude frequency response further comprises dynamically altering the target magnitude frequency response according to user preferences or operation of a computational auditory perception model, and the step of computing the constrained frequency response further comprises dynamically updating computation of the constrained frequency response following the dynamic alteration of the target magnitude frequency response.
13. The method of claim 1, wherein the constraint preserved by concatenation is a minimum phase constraint.
14. The method of claim 4, wherein an integrand of the inner product integral or sum is multiplied by a weight function λ(f) adapted to a desired frequency emphasis.
15. The method of claim 1, wherein at least one of the V.sub.n (f) comprises a derivative with respect to gain of a logarithm of a recursively implemented band filter frequency response.
16. The method of claim 1, wherein the steps of specifying the target magnitude frequency response and computing the constrained frequency response are repeatedly performed on demand.
17. The method of claim 1, wherein the temporal signal is an audio signal or a digital communication signal.
18. The method of claim 17, wherein the temporal signal is an audio signal.
19. An apparatus for filtering a temporal signal, the apparatus comprising a digital signal processing device comprising: a target magnitude frequency response specifier configured to specify a target magnitude frequency response H.sub.T(f) of frequency f in terms of a column vector l of a number K of weights l.sub.k where log H.sub.T(f)=l.sup.TW(f) where l.sup.T is a vector transpose of l and W(f) is a column vector of K magnitude basis functions W.sub.k(f); a constrained frequency response determiner programmed to compute a constrained frequency response H.sub.c(f) defined by log H.sub.c(f)=g.sup.T V(f), where V(f) is a column vector of a number N of constrained basis functions V.sub.n(f) for which each exp g.sub.nV.sub.n(f) satisfies a constraint preserved by concatenation, and g is a column vector of N coefficients satisfying a matching criterion between l.sup.TW(f) and g.sup.T V(f), where g.sup.T is a vector transpose of g; an input temporal signal receiver interface to receive an input temporal signal; a filter to filter the input temporal signal with the constrained frequency response H.sub.c(f) to form a filtered temporal signal; and a filtered temporal signal output interface to output the filtered temporal signal.
Description
BRIEF DESCRIPTION OF DRAWINGS
(1)
(2)
(3)
DETAILED DESCRIPTION OF EMBODIMENTS
(4) Embodiments of the invention will now be described. It will be appreciated by person skilled in the art that practical embodiments of the invention are implemented in computer hardware, which may be realised in a number of ways. Fundamentally, the invention produces an improved minimum phase equalisation method and system, and the system typically involves programmed off-the-shelf components which are easily obtainable and selectable by person skilled in the art. Consequently most of the description herein concerns definition of the algorithms which define embodiments of the method, programming for which is easily transferred to computer implementation.
(5) Functions including basis functions in this description and the claims are defined using the terminology of continuous functions of frequency, but the invention extends to discrete approximations. Functions are expressed in terms of frequency f (cycles per second), but may equivalently be expressed in terms of angular frequency ω=2πfT, where T is the time sampling interval of the digitally sampled audio signal, or any other form of frequency parameterisation.
(6) The method of the invention is enabled by several independent realisations on how the problem may be simplified into a linear system. First, the problem of concatenating minimum phase filters to match an arbitrary target magnitude frequency response can be recast as a linear equation in terms of the logarithm of the frequency response. Concatenation of filter stages by multiplication to match a target frequency response transforms into sums of logarithms of filter stages. Second, because the Hilbert transform is a linear operator, the minimum phase constraint equation (1) on the real and imaginary parts of the logarithm of the frequency response is satisfied for any linear combination of basis functions with real coefficients, as long as the exponential of each basis function also satisfies (1). Third, suitable basis functions in the logarithmic domain can be derived from recursive filters due to the linearity property of the Hilbert transform in the logarithmic. Hereinafter we will refer to theses basis functions as constrained basis functions, to distinguish them from the basis functions used in relation to the target magnitude frequency response, discussed below and referred to as magnitude basis functions.
(7) Constrained Basis Functions
(8) Constrained basis functions in this embodiment are functions the exponential of which is minimum phase, and the gain parameter is a real multiplicative coefficient. The set of linear coefficients multiplying each constrained basis function defines the resultant constrained frequency response.
(9) This may be expressed as
log H.sub.c(f)=g.sup.TV(f) (2)
(10) where H.sub.c(f) is the constrained frequency response, g.sup.Tis a transposed column vector g of N gain weights g.sub.n and V(f) is a column vector of N constrained basis functions V.sub.n(f).
(11) Because the gain parameter g.sub.n is a linear coefficient and therefore the basis functions do not change as the gain parameters change, the overlap between basis functions is constant and the optimisation problem becomes a simple solution of a linear equation, the details of which depends on the matching criterion used to compare the constrained frequency response with the target magnitude frequency response.
(12) Conveniently, suitable constrained basis functions may be chosen from known recursive minimum phase band filter stages and shelf filters, by making use of the linear nature of the minimum phase constraint equation (1). Specifically, consider a known minimum phase band filter stage function H.sub.n(f,γ.sub.n, f.sub.n,Δf.sub.n) of frequency f, defined by parameters gain γ.sub.n, centre peak frequency f.sub.n and bandwidth Δf.sub.n. Since the minimum phase constraint is a linear constraint operating on the imaginary and real components of log H.sub.n, the minimum phase constraint is also satisfied by its derivative with respect to log γ.sub.n, being a limit of subtraction of two functions satisfying the minimum phase constraint, so long as log γ.sub.n is real (γ.sub.n≥1). Therefore we may define
(13)
(14) and choose the constrained basis function V.sub.n(f) in the form
(15)
(16) An example choice of a class of recursive minimum phase functions H.sub.n(f, γ.sub.n, f.sub.n, Δ.sub.n) as the basis for the above calculation are the Zoelzer low shelf, high shelf and peak filters (Zoelzer, Digitale Audiosignalverarbeitung”, ISBN 9783519261803 page 137). In order to preserve the minimum phase character of exp g.sub.nV.sub.n(f), the coefficient g.sub.n is a real number if V.sub.n(f) is defined as above so that exp V.sub.n(f) is minimum phase. Clearly, opposite cancelling phase rotations in g.sub.n and V.sub.n(f) have no effect on the outcome.
(17) A typical choice of the number N of the constrained basis functions V.sub.n(f) is slightly less than 100.
(18) Notwithstanding the above convenient example, in general the constrained basis functions can be any functions the exponential of which satisfies the constraint, not necessarily functions representing single peaks. Any set of functions satisfying the constraint can be chosen which enable representation of a convenient range of constrained frequency responses to match the desired range of target magnitude frequency responses.
(19) Target Magnitude Basis Functions
(20) The logarithm of the target magnitude frequency response H.sub.T(f) which is a real function of frequency f, is specified in terms of a linear combination of K magnitude basis functions W.sub.k(f) multiplied by K positive or negative real weights l.sub.k using vector notation as follows:
log H.sub.T(f)=l.sup.TW(f) (5)
(21) The number K of magnitude basis functions and the form of each basis function is chosen to enable the desired freedom and smoothness in the functional form of H.sub.T(f) for the application. Conveniently, each W.sub.k(f) may be chosen as a triangle function with a peak of value 1 at a control point frequency f.sub.k linearly falling to zero at control point frequencies of neighbouring basis functions, except for the highest and lowest frequencies which may be better represented by shelf functions as depicted in
(22) There is no restriction on the number of magnitude basis functions, the only practical limitation of which may be computational speed. A typical choice of the number of magnitude basis functions and corresponding control points is 100 or greater, which does not need to be the same as the number of constrained basis functions. The larger values of K, the higher the computational load but with modern computing power values of K up to 500 or more are feasible with the method of the invention, when applied to real-time adjustment of the frequency response.
(23) Matching Criterion—Example 1
(24) The method uses a matching criterion to define the degree of match between the constrained frequency response and the target magnitude frequency response and does not require any particular matching criterion for practical application, but variants are discussed herein by way of example.
(25) First, it is useful to define an inner product of two functions
<a,b>∫df a*(f)b(f)λ(f) (6)
(26) where the function λ(f) is an optional weighting function for providing different weighting of different frequencies in accordance with requirements of the application. Typically, λ(f)˜1/f.
(27) Using the inner product, the method may then define an error function being a measure of difference between the constrained frequency response and the target magnitude frequency response in terms of the basis functions, such as the real function
e(f)l.sup.TW(f)−g.sup.TRe V(f) (7)
(28) where exp V.sub.n(f) is normalized as minimum phase so that g.sub.n is real (see par 0038 above). Taking the real part advantageously also halves the computational load of solving the resulting equations.
(29) A matching criterion can then be applied to the error function, in this example being minimisation of <e,e> with respect to variation of g.sub.k, for which the stationary point is straightforwardly derived as g=Ml where M is a N by K matrix defined by inner products between the basis functions M=η.sup.+C, η is an N by N matrix η.sub.np defined by <Re V.sub.n, Re V.sub.p>, the superscript + denotes an inverse or pseudo-inverse operation, and C is an N by K matrix C.sub.nk defined by <Re V.sub.n, W.sub.k>.
(30) While for this matching criterion η is regular and its inverse η.sup.−1 exists, for reasons of numerical stability the Moore-Penrose pseudo-inverse is preferred, because its use guarantees stability and produces a well-behaved solution even if η is close to ill-conditioned.
(31) The matrix M may be calculated once, for a given choice of constrained basis functions V.sub.n(f) and magnitude basis functions W.sub.k(f), and stored in computer memory. In a typical implementation, there is no need for adjustment of the basis functions, although for general-purpose systems different choices of basis functions may be part of a system. The matrix M involves inner product integrals which are defined as integrals over a continuous frequency variable f, and are conveniently calculated using numerical integration techniques. Alternatively, in some instances analytical solutions to the integrals may be known or algebraically derived and may be directly programmed into the system. Further, as discussed above, the inner products may be implemented as discrete sums rather than integrals in their definition.
(32) Choice of the real coefficients l.sub.k specifies the target magnitude frequency response, and calculation of the coefficients g.sub.n is therefore straightforward and deterministic knowing M. The actual constrained frequency response follows from (2) as
H.sub.c(f)=exp(V.sup.T(f)g)=exp(V.sup.T(f)Ml) (8)
Computation of Filtered Audio Signal
(33) Techniques of the prior art are adapted to apply the constrained frequency response H.sub.c (f) to a digitally sampled audio signal x(n), briefly described here. H.sub.c(f) is approximately represented by a vector H.sub.ci evaluated at a discrete set of frequencies. The digital audio signal is temporarily buffered in memory and processed in chunks of length C samples and the Fast Fourier Transform (FFT) performed on each chunk to produce a FFT transformed signal y.sub.j. In the inventors' current embodiment, the chunk length C (or C+R as discussed below) is typically 512 digital samples which at a sampling rate of about 50,000 samples per second comprises about 0.01 seconds of audio signal per chunk.
(34) For the FFT, simple pointwise multiplication of the FFT transformed signal y.sub.j with a discretised frequency response filter H.sub.ci transforms via the inverse Fast Fourier Transform (iFFT) into a circular convolution in the time domain, introducing temporal aliasing artefacts. When a recursive filter is used in the prior art, temporal aliasing is avoided in the time domain by padding both the time series chunks x(n) of length C and the filter defined in the time domain by the impulse response h(n) of length R with trailing zeros to form equal length series each of length C+R. In this case, the computed filter H.sub.ci is the frequency response in the frequency domain and the operation in the frequency domain corresponding to padding of the impulse response in the time domain to size C+R is equivalently provided to choose C+R rather than R equally spaced frequencies for the discretization of H.sub.ci. The C+R pointwize multiplication of H.sub.ci with the FFT transformed padded audio signal X.sub.j, is then transformed back into the time domain with the iFFT to produce the filtered audio signal y.sub.i of length C+R in the time domain.
(35) The filtered audio signal chunks of length C are constructed and joined using the overlap-add or overlap-save algorithms on the filtered chunks of length C+R. Further, in order to avoid artefacts generated by numerical discontinuities at the chunk boundaries, the input audio signal chunks of length C may also be constructed to overlap with a fading window applied to them so that all chunks add up to the original signal while fading in and out individually.
(36) Because H.sub.c(f) is minimum phase and therefore has minimum impulse response tail, the appropriate value for R is well constrained and for the impulse response length is given by approximately at least 2π/F where F the smallest width of any of the band basis functions V.sub.n, usually located at the lowest frequency band. The current implementation utilises R=10/F.
(37) A highly efficient implementation for calculation of the filtered audio signal can be provided discretising the V.sub.n(f) to V.sub.inthereby expressing equation (8) directly in the discrete form
H.sub.ci=exp(Σ.sub.n,kV.sub.inM.sub.nkl.sub.k)=exp(Σ.sub.kA.sub.ikl.sub.k) (9)
(38) so the (C+R) by K matrix A.sub.ik=Σ.sub.nV.sub.inM.sub.nk may be precalculated. In a system where l.sub.k may be dynamically changed, recalculation of H.sub.ci for each frequency i requires simply the multiplication Σ.sub.kA.sub.ikl.sub.k and a complex exponentiation, which may be performed easily in real-time, as defined by the elapsed time of a single chunk C.
(39) Typically, one dedicated processor and architecture is used to continually process and produce the filtered audio signal based on application of a current value of the discretised frequency response H.sub.ci, and a separate processor may be used to receive and calculate the updated H.sub.ci when the target magnitude frequency response represented by l.sub.k is changed either by direct user selection or the computed input of a psychoacoustic model.
(40) Referring now to
(41) Two examples of prior art computational auditory perception models are (1) a model as required by MPEG for MP3 compression encoding in order to calculate inter-band masking; and (2) standardised loudness models which provide a target magnitude magnitude frequency response based on a target loudness profile. An example of a machine learning model could be an automatic mechanism which learns how the time-varying spectral density of music behaves in music which is perceived by human listeners to be well-balanced, and then applies adjustments via the target magnitude frequency response to alter the spectral density of input music to improve the perceived balance.
(42) Step 101 may thus be performed repeatedly. Upon update of the vector l, in step 102 the constrained frequency response is calculated in accordance with the methods described. Steps 103, 104 and 105 are performed continually in a cycle on chunks of audio signal, usually in real-time, but in the broadest aspect of the invention the steps may be performed faster or slower than real-time in an off-line mode whereby the filtered audio signal may be output to a memory for later playing and reproduction on a sound system. In step 103, the input audio signal is received from a source, preferably in digital form. It will be understood that the source in the broadest aspect is unrestricted, and can be for example a microphone pick up, a synthesised audio signal, a locally stored audio signal or an audio signal streamed over a network from a remote location. The source may also be provided by a digital audio workstation host software application which also acts as the routing destination for the output and potential source of a sidechain control signal.
(43) Typically, the input audio signal is stored in a buffer for processing chunk by chunk. In step 104, each chunk of the input audio signal is filtered by applying the constrained frequency response to produce and join chunks of filtered audio signal, exemplified by the methods described above. In step 105, the filtered audio signal is output chunk by chunk to audio hardware for listening or stored for later reproduction. In one embodiment, optional step 106 comprises user adjustment of one or more of the basis functions W.sub.k(f) or V.sub.n(f), or the associated weighting function λ(f). This may be desired if a new application requires increased detail of the target magnitude frequency response in a particular frequency range, or sensitivity of the matching criterion to a particular frequency range.
(44) Referring now to
(45) An input audio signal receiver interface 212 which may be under selection by user 10 selects an input audio signal from a source, which may be a stored audio signal on digital memory 13 which may be sent from a digital audio workstation (which may be implemented as a software program running on a general purpose on a computer) or from a remote source via a remote network 11 such as the Internet, or a live audio pickup via microphone 12 and an analog to digital converter 213. Chunks of digital data series x.sub.i of the input audio signal are received by filter 210 and filtered with the discretised minimum phase frequency response H.sub.ci stored in constrained frequency response memory 214, using the fast linear convolution FFT algorithms of the prior art as described above to produce chunks of digital data series y.sub.i of the filtered audio signal. Filtered audio signal output interface 211 then outputs the filtered audio signal to one or more output destinations under selection by user 10 such as to a storage on digital memory 13 which may be in a digital audio workstation or from a remote source via a remote network 11 such as the Internet or to an audio system 15 to produce sound.
(46) Target magnitude frequency response specifier 201 allows adjustment of the target magnitude frequency response H.sub.T(f) via specification of the K weights l.sub.k stored in magnitude weight memory 207, by direct user selection of the weights which as described above may be control points capable of selection on a graphical interface depicting the target magnitude frequency response curve, by automatic operation of a computational auditory perception model 202 or a machine learning model which may process the input audio signal x.sub.i to determine characteristics of the audio signal for input into the model 202, and which may also be under user selection in respect of user-specifiable parameters of the model.
(47) As described above in relation to
(48) Upon modification of the magnitude weights l.sub.k or the match coefficient matrix A.sub.ik, constrained frequency response determiner 209 computes the discretised constrained frequency response H.sub.ci using formula (9) and stores the result in constrained frequency response memory 214.
(49) In the current implementation of the inventors, the software is written in generic C++ code and Intel 64-bit architecture is used, without the need for dedicated digital signal processing hardware. Other implementations could incorporate ARM, dedicated DSP hardware or even low-level logic devices such as FPGAs or ASICs.
(50) Other Matching Criteria Examples
(51) There are many other matching criteria which generally produce a small error value and therefore can in practice produce adequate matching results. For example, we may require a solution of the form g=Ml where M is an N by K matrix and the error function e(f) from equation (7) becomes l.sup.TE(f) with the error vector E(f) defined as E(f)=W(f)−M.sup.TRe V(f) and factoring out l.sup.k we can define a matching criterion involving each E.sub.k(f).
(52) With this approach, using a matching criterion varying M to minimise <E.sub.k, E.sub.k> for each k, results in the same outcome as example 1 above.
(53) Alternatively, using a non-variational matching criterion <W.sub.k, E.sub.k>=0 for each k results in the equation M=C.sup.+ϑ, where the K by N matrix C is C.sub.k,n=<W.sub.k, Re V.sub.n>, superscript + denotes a generalised inverse and the K by K matrix ϑ is ϑ.sub.k,j=<W.sub.k, W.sub.j >. In this case, C is in general not invertible and the Moore-Penrose pseudo-inverse may be necessary rather than desirable, giving a least-squares approximation fit. While this example may not provide as close a match in some circumstances as example 1, the result is mostly acceptable and it will be appreciated that many other variant matching criteria along similar lines are possible.
(54) Concluding Remarks
(55) Persons skilled in the art will appreciate that many variations may be made to the invention without departing from the scope of the invention, which is determined from the broadest scopes and claims.
(56) For example, while the invention is conceived primarily with respect to implementation of minimum phase audio signal filters, other constraint criteria are within the broadest scope of the invention, which can be applied to any constraint which survives concatenation of the exponential of the constrained basis functions, i.e. which survives linear combination of the constrained basis functions. Constraint criteria which are approximately minimum phase are also within the scope of the invention.
(57) Further, while the examples above recite particular error functions and error vectors, other means of matching the linear combination of target magnitude and constrained basis functions is within the broadest scope of the invention.
(58) Further still, while the examples above relate to audio signals, as discussed on page 1 of this description the applications extend in the broadest aspect of the invention to temporal signals which are not audio signals, such as digital communications signals where persons skilled in the art will appreciate that the methods of the invention can be equally applied to adaptive channel equalisation, where the aim is to improve the quality of a digital signal.
(59) Further also, in the broadest aspect of the invention the temporal signal may be a physical signal or a synthesised signal.
(60) In the claims which follow and in the preceding description of the invention, except where the context requires otherwise due to express language or necessary implication, the word “comprise” or variations such as “comprises” or “comprising” is used in an inclusive sense, i.e. to specify the presence of the stated features but not to preclude the presence or addition of further features in various embodiments of the invention. Further, any method steps recited in the claims are not necessarily intended to be performed temporally in the sequence written, or to be performed without pause once started, unless the context requires it.
(61) It is to be understood that, if any prior art publication is referred to herein, such reference does not constitute an admission that the publication forms a part of the common general knowledge in the art, in Australia or any other country.