Oversampling in a combined transposer filter bank
11591657 · 2023-02-28
Assignee
Inventors
Cpc classification
G10L19/265
PHYSICS
C12Q1/6883
CHEMISTRY; METALLURGY
International classification
C12Q1/6883
CHEMISTRY; METALLURGY
G10L19/022
PHYSICS
G10L19/02
PHYSICS
Abstract
The present invention relates to coding of audio signals, and in particular to high frequency reconstruction methods including a frequency domain harmonic transposer. A system and method for generating a high frequency component of a signal from a low frequency component of the signal is described. The system comprises an analysis filter bank (501) comprising an analysis transformation unit (601) having a frequency resolution of Δf; and an analysis window (611) having a duration of D.sub.A; the analysis filter bank (501) being configured to provide a set of analysis subband signals from the low frequency component of the signal; a nonlinear processing unit (502, 650) configured to determine a set of synthesis subband signals based on a portion of the set of analysis subband signals, wherein the portion of the set of analysis subband signals is phase shifted by a transposition order T; and a synthesis filter bank (504) comprising a synthesis transformation unit (602) having a frequency resolution of QΔf; and a synthesis window (612) having a duration of D.sub.S; the synthesis filter bank (504) being configured to generate the high frequency component of the signal from the set of synthesis subband signals; wherein Q is a frequency resolution factor with Q≥1 and smaller than the transposition order T; and wherein the value of the product of the frequency resolution Δf and the duration D.sub.A of the analysis filter bank is selected based on the frequency resolution factor Q.
Claims
1. A system for generating an output audio signal comprising a high frequency component from an input audio signal comprising a low frequency component using a transposition order T, comprising: an analysis window unit configured to apply an analysis window of a length of L.sub.A samples, thereby extracting a frame of the input signal; an analysis transformation unit of order M and having a frequency resolution Δf configured to transform the L.sub.A samples into M complex coefficients; a nonlinear processing unit, configured to modify phases of the complex coefficients based on the transposition order T, and to modify magnitudes of the complex coefficients based on the transposition order T; a synthesis transformation unit of order M and having a frequency resolution QΔf, configured to transform the altered coefficients into M altered samples; wherein Q is a frequency resolution factor independent of the transposition order T; and a synthesis window unit configured to apply a synthesis window of a length of L.sub.s samples to the M altered samples, thereby generating a frame of the output signal; wherein M is equal to (QL.sub.A+L.sub.s)/2.
2. A method for generating an output audio signal comprising a high frequency component from an input audio signal comprising a low frequency component using a transposition order T, the method comprising: applying an analysis window of a length of L.sub.A samples, thereby extracting a frame of the input signal; transforming the frame of L.sub.A samples of the input signal into M complex coefficients using an analysis transformation of order M and frequency resolution Δf; modifying phases of the complex coefficients based on the transposition order T, and modifying magnitudes of the complex coefficients based on the transposition order T; transforming the altered coefficients into M altered samples using a synthesis transformation of order M and frequency resolution QΔf; wherein Q is a frequency resolution factor independent of the transposition order T; and applying a synthesis window of a length of L.sub.s samples to the M altered samples, thereby generating a frame of the output signal; wherein M is equal to (QL.sub.A+L.sub.s)/2.
3. A non-transitory computer-readable storage medium comprising a sequence of instructions, wherein, when executed by one or more processors, the sequence of instructions causes the one or more processors to perform a method for generating an output audio signal comprising a high frequency component from an input audio signal comprising a low frequency component using a transposition order T, the method comprising: applying an analysis window of a length of L.sub.A samples, thereby extracting a frame of the input signal; transforming the frame of L.sub.A samples of the input signal into M complex coefficients using an analysis transformation of order M and frequency resolution Δf; modifying phases of the complex coefficients by using the transposition order T, and modifying magnitudes of the complex coefficients based on the transposition order T; transforming the altered coefficients into M altered samples using a synthesis transformation of order M and frequency resolution QΔf; wherein Q is a frequency resolution factor independent of the transposition order T; and applying a synthesis window of a length of L.sub.s samples to the M altered samples, thereby generating a frame of the output signal; wherein M is equal to (QL.sub.A+L.sub.s)/2.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The present invention will now be described by way of illustrative examples, not limiting the scope or spirit of the invention, with reference to the accompanying drawings, in which:
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
(11)
(12)
(13)
DESCRIPTION OF PREFERRED EMBODIMENTS
(14) The below-described embodiments are merely illustrative for the principles of the present invention for oversampling in a combined transposer filter bank. It is understood that modifications and variations of the arrangements and the details described herein will be apparent to others skilled in the art. It is the intent, therefore, to be limited only by the scope of the impending patent claims and not by the specific details presented by way of description and explanation of the embodiments herein.
(15)
(16) Typically, each filter bank has a physical frequency resolution Δf measured in Hertz and a physical time stride parameter Δt measured in seconds, wherein the physical frequency resolution Δf is usually associated with the frequency resolution of the transform function and the physical time stride parameter Δt is usually associated with the time interval between succeeding window functions. These two parameters, i.e. the frequency resolution and the time stride, define the discrete-time parameters of the filter bank given the chosen sampling rate. By choosing the physical time stride parameters, i.e. the time stride parameter measured in time units e.g. seconds, of the analysis and synthesis filter banks to be identical, an output signal of the transposer 100 may be obtained which has the same sampling rate as the input signal.
(17) Furthermore, by omitting the nonlinear processing 102 a perfect reconstruction of the input signal at the output may be achieved. This requires a careful design of the analysis and synthesis filter banks. On the other hand, if the output sampling rate is chosen to be different from the input sampling rate, a sampling rate conversion may be obtained. This mode of operation may be necessary in the case where the desired bandwidth of the output signal y is larger than half of the sampling rate of the input signal x, i.e. when the desired output bandwidth exceeds the Nyquist frequency of the input signal.
(18)
(19) It should be noted that each transposer 201-1, 201-2, . . . , 201-P requires an analysis and a synthesis filter bank as depicted in
(20)
(21) It should be noted that if the synthesis filter banks 303-1, 303-2, . . . , 303-P corresponding to the different transposition orders operate at different sampling rates, e.g. by using different degrees of bandwidth expansion, the time domain output signals of the different synthesis filter banks 303-1, 303-2, . . . , 303-P need to be differently resampled in order to align the P output signals to a common time grid, prior to their summation in combiner 304.
(22)
(23)
(24) As already indicated above, the nonlinear processing 102 typically provides a number of subbands at its output which corresponds to the number of subbands at the input. The non-linear processing 102 typically modifies the phase and/or the amplitude of the subband or the subband signal according to the underlying transposition order T. By way of example a subband at the input is converted to a subband at the output with T times higher frequency, i.e. a subband at the input to the nonlinear processing 102, i.e. the analysis subband,
[(k−½)Δf,(k+½)Δf]
may be transposed to a subband at the output of the nonlinear processing 102, i.e. the synthesis subband,
[(k−½)TΔf,(k+½)TΔf]
wherein k is a subband index number and Δf is the frequency resolution of the analysis filter bank. In order to allow for the use of common analysis filter banks 501 and common synthesis filter banks 504, one or more of the advanced processing units 502-1, 502-2, . . . , 502-P may be configured to provide a number of output subbands which may be different from the number of input subbands.
(27) In the following, the principles of advanced nonlinear processing in the nonlinear processing units 502-1, 502-2, . . . , 502-P will be outlined. For this purpose, it is assumed that the analysis filter bank and the synthesis filter bank share the same physical time stride parameter Δt; that the analysis filter bank has a physical frequency resolution Δf; and that the synthesis filter bank has a physical frequency resolution QΔf, where the resolution factor Q≥1 is an integer.
(28) Furthermore, it is assumed that the filter banks are evenly stacked, i.e. the subband with index zero is centered around the zero frequency, such that the analysis filter bank center frequencies are given by kΔf where the analysis subband index k=1, . . . , K.sub.A−1 and K.sub.A is the number of subbands of the analysis filter bank. The synthesis filter bank center frequencies are given by nQΔf where the synthesis subband index n=1, . . . , N.sub.S−1 and N.sub.S is the number of subbands of the synthesis filter bank.
(29) When performing a conventional transposition of integer order T≥1 as shown in
θ.sub.S(k)=Tθ.sub.A(k), (1)
where θ.sub.A(k) is the phase of a (complex) sample of the analysis subband k and θ.sub.S(k) is the phase of a (complex) sample of the synthesis subband k. The magnitude or amplitude of a sample of the subband may be kept unmodified or may be increased or decreased by a constant gain factor. Due to the fact that T is an integer, the operation of equation (1) is independent of the definition of the phase angle.
(30) In conventional multiple transposers, the resolution factor Q of an analysis/synthesis filter bank is selected to be equal to the transposition order T of the respective transposer, i.e. Q=T. In this case, the frequency resolution of the synthesis filter bank is TΔf and therefore depends on the transposition order T. Consequently, it is necessary to use different filter banks for different transposition orders T either in the analysis or synthesis stage. This is due to the fact that the transposition order T defines the quotient of physical frequency resolutions, i.e. the quotient of the frequency resolution Δf of the analysis filter bank and the frequency resolution TΔf of the synthesis filter bank.
(31) In order to be able to use a common analysis filter bank 501 and a common synthesis filter bank 504 for a plurality of different transposition orders T, it is proposed to set the frequency resolution of the synthesis filter bank 504 to QΔf, i.e. it is proposed to make the frequency resolution of the synthesis filter bank 504 independent of the transposition order T. Then the question arises of how to implement a transposition of order T when the resolution factor Q, i.e. the quotient Q of the physical frequency resolution of the analysis and synthesis filter bank, does not necessarily obey the relation Q=T.
(32) As outlined above, a principle of harmonic transposition is that the input to the synthesis filter bank subband n with center frequency nQΔf is determined from an analysis subband at a T times lower center frequency, i.e. at the center frequency nQΔf/T. The center frequencies of the analysis subbands are identified through the analysis subband index k as kΔf. Both expressions for the center frequency of the analysis subband index, i.e. nQΔf/T and kΔf, may be set equal. Taking into account that the index n is an integer value, the expression
nQ/T
is a rational number which can be expressed as the sum of an integer analysis subband index k and a remainder r∈{0, 1/T, 2/T, . . . , (T−1)/T} such that
nQ/T=k+r. (2)
(35) As such, it may be stipulated that the input to a synthesis subband with synthesis subband index n may be derived, using a transposition of order T, from the analysis subband with the index k given by equation (2). In view of the fact that
nQ/T
is a rational number, the remainder r may be unequal to 0 and the value k+r may be greater than the analysis subband index k and smaller than the analysis subband index k+1, i.e. k≤k+r<k+1. Consequently, the input to a synthesis subband with synthesis subband index n should be derived, using a transposition of order T, from the analysis subbands with the analysis subband index k and k+1, wherein k is given by equation (2). In other words, the input of a synthesis subband may be derived from two consecutive analysis subbands.
(37) As an outcome of the above, the advanced nonlinear processing performed in a nonlinear processing unit 502-1, 502-2, . . . , 502-P may comprise the step of considering two neighboring analysis subbands with index k and k+1 in order to provide the output for synthesis subband n. For a transposition order T, the phase modification performed by the nonlinear processing unit 502-1, 502-2, . . . , 502-P may for example be defined by the linear interpolation rule,
θ.sub.S(n)=T(1−r)θ.sub.A(k)+Trθ.sub.A(k+1), (3)
where θ.sub.A(k) is the phase of a sample of the analysis subband k, θ.sub.A(k+1) is the phase of a sample of the analysis subband k+1, and θ.sub.S(n) is the phase of a sample of the synthesis subband n. If the remainder r is close to zero, i.e. if the value k+r is close to k, then the main contribution of the phase of the synthesis subband sample is derived from the phase of the analysis subband sample of subband k. On the other hand, if the remainder r is close to one, i.e. if the value k+r is close to k+1, then the main contribution of the phase of the synthesis subband sample is derived from the phase of the analysis subband sample of subband k+1. It should be noted that the phase multipliers T(1−r) and Tr are both integers such that the phase modifications of equation (3) are well defined and independent of the definition of the phase angle.
(38) Concerning the magnitudes of the subband samples, the following geometrical mean value may be selected for the determination of the magnitude of the synthesis subband samples,
a.sub.S(n)=a.sub.A(k).sup.(1-r)a.sub.A(k+1).sup.r, (4)
where a.sub.S(n) denotes the magnitude of a sample of the synthesis subband n, a.sub.A(k) denotes the magnitude of a sample of the analysis subband k and a.sub.A(k+1) denotes the magnitude of a sample of the analysis subband k+1. It should be noted that other interpolation rules for the phase and/or the magnitude may be contemplated.
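By way of illustration only, the interpolation rules of equations (3) and (4) may be sketched as follows for evenly stacked filter banks. The sketch assumes that equation (2) reads nQ/T=k+r, as the surrounding text describes it. The function names `source_index` and `synthesis_sample` and the representation of a frame of analysis subband samples as a list of complex numbers are assumptions made for this example and do not appear in the description.

```python
import cmath
from fractions import Fraction


def source_index(n, Q, T):
    """Split n*Q/T into an integer analysis index k and a remainder r in [0, 1)."""
    ratio = Fraction(n * Q, T)
    k = ratio.numerator // ratio.denominator
    return k, ratio - k


def synthesis_sample(analysis, n, Q, T):
    """Interpolate one synthesis subband sample from analysis bins k and k+1.

    `analysis` is a hypothetical list holding one complex sample per
    analysis subband for the current frame.
    """
    k, r = source_index(n, Q, T)
    lo = analysis[k]
    if r == 0:
        # Only bin k contributes: phase multiplied by T, magnitude kept.
        return cmath.rect(abs(lo), T * cmath.phase(lo))
    hi = analysis[k + 1]
    # Equation (3): T(1-r) and T*r are integers for evenly stacked banks.
    phase = int(T * (1 - r)) * cmath.phase(lo) + int(T * r) * cmath.phase(hi)
    # Equation (4): geometric mean of the two magnitudes.
    magnitude = abs(lo) ** float(1 - r) * abs(hi) ** float(r)
    return cmath.rect(magnitude, phase)
```

For Q=1 and T=2, the synthesis subband n=3 is obtained from the analysis subbands k=1 and k+1=2 with r=½, so both phases enter with multiplier 1 and the magnitude is the geometric mean of the two source magnitudes.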
(39) For the case of an oddly stacked filter bank where the analysis filter bank center frequencies are given by
(k+½)Δf
with k=1, . . . , K.sub.A−1 and the synthesis filter bank center frequencies are given by
(n+½)QΔf
with n=1, . . . , N.sub.S−1, a corresponding equation to equation (2) may be derived by equating the transposed synthesis filter bank center frequency
(n+½)QΔf/T
and the analysis filter bank center frequency
(k+½)Δf.
Assuming an integer index k and a remainder r∈[0,1[ the following equation for oddly stacked filter banks can be derived:
((2n+1)Q−T)/(2T)=k+r. (5)
(45) The skilled person will appreciate that if T−Q, i.e. the difference between the transposition order and the resolution factor, is even, T(1−r) and Tr are both integers and the interpolation rules of equations (3) and (4) can be used.
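The parity condition on T−Q may be checked numerically. The following sketch assumes that the oddly stacked relation is obtained by equating (n+½)QΔf/T with (k+½+r)Δf, i.e. ((2n+1)Q−T)/(2T)=k+r. The function names are hypothetical and chosen for this example only. It should be noted that for T>Q the lowest synthesis subbands yield k=−1, i.e. a source position below the center of analysis subband 0.

```python
from fractions import Fraction


def odd_stack_source_index(n, Q, T):
    """Solve (n + 1/2) * Q / T = k + 1/2 + r for integer k and r in [0, 1)."""
    shifted = Fraction(2 * n + 1, 2) * Fraction(Q, T) - Fraction(1, 2)
    k = shifted.numerator // shifted.denominator  # floor, also for negatives
    return k, shifted - k


def has_integer_multipliers(Q, T, bins=32):
    """True if T*r (and hence T*(1-r)) is an integer for every tested bin n."""
    return all(
        (T * odd_stack_source_index(n, Q, T)[1]).denominator == 1
        for n in range(bins)
    )


for Q, T in [(2, 4), (1, 3), (1, 2), (2, 3)]:
    print(f"Q={Q}, T={T}, T-Q even: {(T - Q) % 2 == 0}, "
          f"integer multipliers: {has_integer_multipliers(Q, T)}")
```

In this sketch the multipliers are integers for the pairs with even T−Q and fail to be integers for the pairs with odd T−Q, in line with the statement above.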
(46) The mapping of analysis subbands into synthesis subbands is illustrated in
(47) In the illustrated case, equation (2) may be written as
n/T=k+r.
Consequently, for a transposition order T=1, an analysis subband with an index k is mapped to a corresponding synthesis subband n and the remainder r is always zero. This can be seen in
(49) In case of transposition order T=2, the remainder r takes on the values 0 and ½ and a source bin is mapped to a plurality of target bins. When reversing the perspective, it may be stated that each target bin 532, 535 receives a contribution from up to two source bins. This can be seen in
(50) A further interpretation of the above advanced nonlinear processing may be as follows. The advanced nonlinear processing may be understood as a combination of a transposition of a given order T into intermediate subband signals on an intermediate frequency grid TΔf, and a subsequent mapping of the intermediate subband signals to a frequency grid defined by a common synthesis filter bank, i.e. by a frequency grid QΔf. In order to illustrate this interpretation, reference is made again to
(51) In summary, a nonlinear processing method has been described which allows the determination of contributions to a synthesis subband by means of transposition of several analysis subbands. The nonlinear processing method enables the use of single common analysis and synthesis subband filter banks for different transposition orders, thereby significantly reducing the computational complexity of multiple harmonic transposers.
(52)
(53)
(54) It can be seen that the analysis window 611 is moved by a hop size 621 of 128 samples. The synthesis window 612 corresponding to a transposition of order T=2 is moved by a hop size 622 of 256 samples, i.e. a hop size 622 which is twice the hop size 621 of the analysis window 611. As outlined above, this leads to a time stretch of the signal by the factor T=2. Alternatively, if a T=2 times higher sampling rate is assumed, the difference between the analysis hop size 621 and the synthesis hop size 622 leads to a harmonic transposition of order T=2. I.e. a time stretch by an order T may be converted into a harmonic transposition by performing a sampling rate conversion of order T.
(55) In a similar manner, it can be seen that the synthesis hop size 623 associated with the harmonic transposer of order T=3 is T=3 times higher than the analysis hop size 621, and the synthesis hop size 624 associated with the harmonic transposer of order T=4 is T=4 times higher than the analysis hop size 621. In order to align the sampling rates of the 3.sup.rd order transposer and the 4.sup.th order transposer with the output sampling rate of the 2.sup.nd order transposer, the 3.sup.rd order transposer and the 4.sup.th order transposer comprise a factor 3/2 downsampler 633 and a factor 2 downsampler 634, respectively. In general terms, a transposer of order T would comprise a factor T/2 downsampler if an output sampling rate is requested which is 2 times higher than the input sampling rate. I.e. no downsampling is required for the harmonic transposer of order T=2.
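The relation between the transposition order, the synthesis hop size and the downsampling factor may be tabulated as in the following sketch. The function name `transposer_rates` and the parameter `output_rate_factor` (the ratio of output to input sampling rate, 2 in the example above) are assumptions introduced for illustration.

```python
from fractions import Fraction


def transposer_rates(analysis_hop, order, output_rate_factor=2):
    """Return (synthesis_hop, downsampling_factor) for one transposition order."""
    synthesis_hop = order * analysis_hop
    # An order-T stretch followed by downsampling by T / output_rate_factor
    # leaves a signal at output_rate_factor times the input sampling rate.
    downsampling = Fraction(order, output_rate_factor)
    return synthesis_hop, downsampling


for order in (2, 3, 4):
    hop, factor = transposer_rates(128, order)
    print(f"T={order}: synthesis hop {hop} samples, downsample by {factor}")
```

For an analysis hop size of 128 samples this yields synthesis hop sizes of 256, 384 and 512 samples and downsampling factors of 1, 3/2 and 2 for T=2, 3 and 4, respectively.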
(56) Finally,
(57) An efficient combined filter bank structure for the transposer can be obtained by limiting the multiple transposer of
(58) The analysis/synthesis filter bank of
(59) As has been outlined in the context of
(60) In a similar manner to the case of Q=1 illustrated in
(61) The above mentioned non-linear processing is performed in the multiple transposer unit 650 which determines target bins 730 for the different orders of transposition T=2, 3, 4 using advanced non-linear processing units 502-2, 502-3, 502-4. Subsequently, corresponding target bins 730 are combined in a combiner unit 503 to yield a single set of synthesis subband signals which are fed to the synthesis filter bank. As outlined above, the combiner unit 503 is configured to combine a plurality of contributions in overlapping frequency ranges from the output of the different non-linear processing units 502-2, 502-3, 502-4.
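A minimal sketch of the multiple transposer unit followed by the combiner is given below, under stated assumptions. It assumes evenly stacked filter banks, the mapping nQ/T=k+r, and a combiner that simply adds the contributions of the different orders; the description does not fix the combination rule, so the plain sum is an assumption, as are the function names `transpose_bin` and `combine_orders`. A practical transposer would in addition restrict each order to its intended source and target frequency ranges.

```python
import cmath
from fractions import Fraction


def transpose_bin(analysis, n, Q, T):
    """Contribution of order T to synthesis bin n, or None if out of range."""
    ratio = Fraction(n * Q, T)
    k = ratio.numerator // ratio.denominator
    r = ratio - k
    if k + (1 if r else 0) >= len(analysis):
        return None
    lo = analysis[k]
    hi = analysis[k + 1] if r else lo
    # Phase interpolation with integer multipliers T(1-r) and T*r.
    phase = int(T * (1 - r)) * cmath.phase(lo) + int(T * r) * cmath.phase(hi)
    # Geometric-mean magnitude interpolation.
    magnitude = abs(lo) ** float(1 - r) * abs(hi) ** float(r)
    return cmath.rect(magnitude, phase)


def combine_orders(analysis, num_synthesis_bins, Q, orders=(2, 3, 4)):
    """Add the contributions of all orders into one set of target bins."""
    target = [0j] * num_synthesis_bins
    for T in orders:
        for n in range(num_synthesis_bins):
            contribution = transpose_bin(analysis, n, Q, T)
            if contribution is not None:
                target[n] += contribution
    return target
```

The single list returned by `combine_orders` corresponds to the single set of synthesis subband signals which is fed to the common synthesis filter bank.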
(62) In the following, the harmonic transposition of transient signals using harmonic transposers is outlined. In this context, it should be noted that harmonic transposition of order T using analysis/synthesis filter banks may be interpreted as time stretching of an underlying signal by an integer transposition factor T followed by a downsampling and/or sample rate conversion. The time stretching is performed such that frequencies of sinusoids which compose the input signal are maintained. Such time stretching may be performed using the analysis/synthesis filter bank in combination with intermediate modification of the phases of the subband signals based on the transposition order T. As outlined above, the analysis filter bank may be a windowed DFT filter bank with analysis window v.sub.A and the synthesis filter bank may be a windowed inverse DFT filter bank with synthesis window v.sub.S. Such analysis/synthesis transform is also referred to as short-time Fourier Transform (STFT).
(63) A short-time Fourier transform is performed on a time-domain input signal x to obtain a succession of overlapped spectral frames. In order to minimize possible side-band effects, appropriate analysis/synthesis windows, e.g. Gaussian windows, cosine windows, Hamming windows, Hann windows, rectangular windows, Bartlett windows, Blackman windows, and others, should be selected. The time delay at which every spectral frame is picked up from the input signal x is referred to as the hop size Δs or physical time stride Δt. The STFT of the input signal x is referred to as the analysis stage and leads to a frequency domain representation of the input signal x. The frequency domain representation comprises a plurality of subband signals, wherein each subband signal represents a certain frequency component of the input signal.
(64) For the purpose of time-stretching of the input signal, each subband signal may be time-stretched, e.g. by delaying the subband signal samples. This may be achieved by using a synthesis hop-size which is greater than the analysis hop-size. The time domain signal may be rebuilt by performing an inverse (Fast) Fourier transform on all frames followed by a successive accumulation of the frames. This operation of the synthesis stage is referred to as overlap-add operation. The resulting output signal is a time-stretched version of the input signal comprising the same frequency components as the input signal. In other words, the resulting output signal has the same spectral composition as the input signal, but it is slower than the input signal i.e. its progression is stretched in time.
(65) The transposition to higher frequencies may then be obtained subsequently, or in an integrated manner, through downsampling of the stretched signals or by performing a sample-rate conversion of the time stretched output signal. As a result the transposed signal has the length in time of the initial signal, but comprises frequency components which are shifted upwards by a pre-defined transposition factor.
(66) In view of the above, the harmonic transposition of transient signals using harmonic transposers is described by considering as a starting point the time stretching of a prototype transient signal, i.e. a discrete time Dirac pulse at time instant t=t.sub.0,
x(t)=δ(t−t.sub.0).
(68) The Fourier transform of such a Dirac pulse has unit magnitude and a linear phase with a slope proportional to t.sub.0:
X(Ω.sub.m)=exp(−jΩ.sub.mt.sub.0),
wherein
Ω.sub.m=2πm/M
is the center frequency of the m.sup.th subband signal of the STFT analysis and M is the size of the discrete Fourier transform (DFT). Such Fourier transform can be considered as the analysis stage of the analysis filter bank described above, wherein a flat analysis window v.sub.A of infinite duration is used. In order to generate an output signal y which is time-stretched by a factor T, i.e. a Dirac pulse δ(t−Tt.sub.0) at the time instant t=Tt.sub.0, the phase of the analysis subband signals should be multiplied by the factor T in order to obtain the synthesis subband signal Y(Ω.sub.m)=exp(−jΩ.sub.mTt.sub.0) which yields the desired Dirac pulse δ(t−Tt.sub.0) as an output of an inverse Fourier Transform.
(71) However, it should be noted that the above considerations refer to an analysis/synthesis stage using analysis and synthesis windows of infinite lengths. Indeed, a theoretical transposer with a window of infinite duration would give the correct stretch of a Dirac pulse δ(t−t.sub.0). For a finite duration windowed analysis, the situation is scrambled by the fact that each analysis block is to be interpreted as one period interval of a periodic signal with a period equal to the size of the DFT.
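The effect of the DFT periodicity on a stretched Dirac pulse may be reproduced with the following sketch, which multiplies the phase of every bin of a single frame by T and locates the resulting pulse. It is a simplified illustration under stated assumptions: a single rectangular frame, time indices 0 to M−1 rather than a window centered at zero, and a naive DFT written out for self-containment. All function names are hypothetical.

```python
import cmath


def dft(x):
    """Naive forward DFT, adequate for a short illustrative frame."""
    M = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * m * t / M) for t in range(M))
            for m in range(M)]


def idft(X):
    """Naive inverse DFT matching dft() above."""
    M = len(X)
    return [sum(X[m] * cmath.exp(2j * cmath.pi * m * t / M) for m in range(M)) / M
            for t in range(M)]


def stretched_pulse_position(M, t0, T):
    """Multiply every bin phase by T and report where the pulse ends up."""
    x = [0.0] * M
    x[t0] = 1.0
    # Each bin has unit magnitude, so raising it to the integer power T
    # multiplies its phase by T without changing the magnitude.
    Y = [X_m ** T for X_m in dft(x)]
    y = [v.real for v in idft(Y)]
    return max(range(M), key=lambda t: y[t])
```

With M=16, a pulse at t.sub.0=3 and T=2 is moved to position 6 as desired, whereas a pulse at t.sub.0=5 and T=4 ends up at position 4 instead of 20, because the target position is wrapped modulo the DFT size.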
(72) This is illustrated in
(73) In a real-world system, the pulse train actually contains a few pulses only (depending on the transposition factor), one main pulse, i.e. the wanted term, a few pre-pulses and a few post-pulses, i.e. the unwanted terms. The pre- and post-pulses emerge because the DFT is periodic (with period L). When a pulse is located within an analysis window, so that the complex phase gets wrapped when multiplied by T (i.e. the pulse is shifted outside the end of the window and wraps back to the beginning), an unwanted pulse emerges within the synthesis window. The unwanted pulses may have, or may not have, the same polarity as the input pulse, depending on the location in the analysis window and the transposition factor.
(74) In the example of
(75) As the analysis and synthesis stage move along the time axis according to the hop size Δs or the time stride Δt, the pulse δ(t−t.sub.0) 812 will have another position relative to the center of the respective analysis window 811. As outlined above, the operation to achieve time-stretching consists in moving the pulse 812 to T times its position relative to the center of the window. As long as this position is within the window 821, this time-stretch operation guarantees that all contributions add up to a single time stretched synthesized pulse δ(t−Tt.sub.0) at t=Tt.sub.0.
(76) However, a problem occurs for the situation of
(77) The principle of the solution to this problem is described in reference to
(78) It should be noted that in a preferred embodiment the synthesis window and the analysis window have equal “nominal” lengths (measured in the number of samples). However, when using implicit resampling of the output signal by discarding or inserting samples in the frequency bands of the transform or filter bank, the synthesis window size (measured in the number of samples) will typically be different from the analysis size, depending on the resampling and/or transposition factor.
(79) The minimum value of F, i.e. the minimum frequency domain oversampling factor, can be deduced from
(80)
i.e. for any input pulse comprised within the analysis window 1011, the undesired image δ(t−Tt.sub.0+FL) at time instant t=Tt.sub.0−FL must be located to the left of the left edge of the synthesis window at
t=−L/2.
In an equivalent manner, the condition
TL/2−FL≤−L/2
must be met, which leads to the rule
F≥(T+1)/2. (6)
(84) As can be seen from formula (6), the minimum frequency domain oversampling factor F is a function of the transposition order T. More specifically, the minimum frequency domain oversampling factor F increases linearly with the transposition order T.
(85) By repeating the line of thinking above for the case where the analysis and synthesis windows have different lengths one obtains a more general formula. Let L.sub.A and L.sub.s be the lengths of the analysis and synthesis windows (measured in the number of samples), respectively, and let M be the DFT size employed. The general rule extending formula (6) is then
M≥(TL.sub.A+L.sub.s)/2. (7)
(87) That this rule indeed is an extension of (6) can be verified by inserting M=FL, and L.sub.A=L.sub.s=L in (7) and dividing by L on both sides of the resulting equation.
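The oversampling rule may be illustrated numerically. The sketch below assumes that formula (7) reads M≥(TL.sub.A+L.sub.s)/2, which reduces to F≥(T+1)/2 for M=FL and L.sub.A=L.sub.s=L, and that the windows span the sample interval from −L/2 to L/2−1 around the window center. The function names are hypothetical. Only the image one DFT period to the left of the wanted pulse is checked, as in the discussion above.

```python
import math


def min_dft_size(T, L_A, L_S):
    """Smallest integer DFT size M with M >= (T * L_A + L_S) / 2."""
    return math.ceil((T * L_A + L_S) / 2)


def image_outside_window(T, L, M):
    """Check that the wrapped image of every in-window pulse misses the window.

    A pulse at t0 is moved to T*t0; its image one DFT period earlier sits
    at T*t0 - M and must lie to the left of the window edge at -L/2.
    """
    half = L // 2
    return all(T * t0 - M < -half for t0 in range(-half, half))


L = 16
for T in (2, 3, 4):
    M = min_dft_size(T, L, L)
    print(f"T={T}: M={M}, F={M / L}, image suppressed: "
          f"{image_outside_window(T, L, M)}")
```

With a transform size equal to the window length, i.e. without frequency domain oversampling, the same check fails for T=2, which corresponds to the pre-echo situation described above.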
(88) The above analysis is performed for a rather special model of a transient, i.e. a Dirac pulse. However, the reasoning can be extended to show that when using the above described time-stretching and/or harmonic transposition scheme, input signals which have a near flat spectral envelope and which vanish outside a time interval [a, b] will be stretched to output signals which are small outside the interval [Ta,Tb]. It can also be verified, by studying spectrograms of real audio and/or speech signals, that pre-echoes disappear in the stretched or transposed signals when the above described rule for selecting an appropriate frequency domain oversampling factor is respected. A more quantitative analysis also reveals that pre-echoes are still reduced when using frequency domain oversampling factors which are slightly inferior to the value imposed by the condition of formula (6) or (7). This is due to the fact that typical window functions v.sub.S are small near their edges, thereby attenuating undesired pre-echoes which are positioned near the edges of the window functions.
(89) In summary, a way to improve the transient response of frequency domain harmonic transposers, or time-stretchers, has been described by introducing an oversampled transform, where the amount of oversampling is a function of the transposition factor chosen. The improved transient response of the transposer is obtained by means of frequency domain oversampling.
(90) In the multiple transposer of
(91) In the following, the use of frequency domain oversampling in the context of combined analysis/synthesis filter banks, such as described in the context of
(92) In general, for a combined transposition filter bank where the physical spacing QΔf of the synthesis filter bank subbands is Q times the physical spacing Δf of the analysis filter bank and where the physical analysis window duration D.sub.A (measured in units of time, e.g. seconds) is also Q times that of the synthesis filter bank, D.sub.A=QD.sub.S, the analysis for a Dirac pulse as above will apply for all transposition factors T=Q, Q+1, Q+2, . . . as if T=Q. In other words, the rule for the degree of frequency domain oversampling required in a combined transposition filter bank is given by
F≥(Q+1)/2. (6b)
(94) In particular, it should be noted that for T>Q, the frequency domain oversampling factor
F<(T+1)/2
is sufficient, while still ensuring the suppression of artifacts on transient signals caused by harmonic transposition of order T. I.e. using the above oversampling rules for the combined filter bank, it can be seen that even when using higher transposition orders T>Q, it is not required to further increase the oversampling factor F. As indicated by equation (6b), it is sufficient in the combined filter bank implementation of
(96) In a more general scenario, the physical time durations of the analysis and synthesis windows D.sub.A and D.sub.S, respectively, may be arbitrarily selected. Then the physical spacing Δf of the analysis filter bank subbands should satisfy
Δf≤2/(Q(D.sub.A+D.sub.S))
in order to avoid the described artifacts caused by harmonic transposition. It should be noted that the duration of a window D typically differs from the length of a window L. Whereas the length of a window L corresponds to the number of signal samples covered by the window, the duration of the window D corresponds to the time interval of the signal covered by the window. As illustrated in
D=L/f.sub.s.
In a similar manner, the frequency resolution of a transform Δf is related to the number of points or length M of the transform via the sampling frequency f.sub.s, i.e. notably
Δf=f.sub.s/M.
Furthermore, the physical time stride Δt of a filter bank is related to the hop size Δs of the filter bank via the sampling frequency f.sub.s, i.e. notably
Δt=Δs/f.sub.s.
(101) Using the above relations, equation (6b) may be written as
ΔfD.sub.A=QΔfD.sub.S≤2/(Q+1),
i.e. the product of the frequency resolution and the window length of the analysis filter bank and/or the frequency resolution and the window length of the synthesis filter bank should be selected to be smaller or equal to
2/(Q+1).
For T>Q, the product ΔfD.sub.A and/or QΔfD.sub.s may be selected to be greater than
2/(T+1),
thereby reducing the computational complexity of the filter banks.
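The conversion between discrete-time and physical parameters, and the resulting check of the product of frequency resolution and window duration, may be sketched as follows. The sketch assumes the relations D=L/f.sub.s and Δf=f.sub.s/M, so that the frequency domain oversampling factor M/L equals 1/(ΔfD), and assumes the bound 2/(Q+1) which follows from the rule F≥(Q+1)/2. The function names and the example sampling rate of 48000 Hz are illustrative assumptions.

```python
def oversampling_factor(fs, window_length, dft_size):
    """F = 1 / (delta_f * D), with delta_f = fs / M and D = L / fs."""
    delta_f = fs / dft_size          # frequency resolution in Hz
    duration = window_length / fs    # window duration in seconds
    return 1.0 / (delta_f * duration)


def satisfies_combined_rule(fs, L_A, M, Q):
    """Check delta_f * D_A <= 2 / (Q + 1) for the analysis filter bank."""
    delta_f = fs / M
    D_A = L_A / fs
    # Small tolerance so that the boundary case is not rejected by rounding.
    return delta_f * D_A <= 2.0 / (Q + 1) + 1e-12


fs = 48000
print(oversampling_factor(fs, 1024, 1536))
print(satisfies_combined_rule(fs, 1024, 1536, Q=2))
print(satisfies_combined_rule(fs, 1024, 1024, Q=2))
```

In this example an analysis window of 1024 samples combined with a transform of 1536 points meets the bound for Q=2 exactly, whereas a transform of 1024 points does not. The result does not depend on the sampling rate, since f.sub.s cancels in the product ΔfD.sub.A.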
(105) In the present document, various methods for performing harmonic transposition of signals, preferably audio and/or speech signals, have been described. Particular emphasis has been put on the computational complexity of multiple harmonic transposers. In this context, a multiple transposer has been described, which is configured to perform multiple orders of transposition using a combined analysis/synthesis filter bank, i.e. a filter bank comprising a single analysis filter bank and a single synthesis filter bank. A multiple transposer using a combined analysis/synthesis filter bank has reduced computational complexity compared to a conventional multiple transposer. Furthermore, frequency domain oversampling has been described in the context of combined analysis/synthesis filter banks. Frequency domain oversampling may be used to reduce or remove artifacts caused on transient signals by harmonic transposition. It has been shown that frequency domain oversampling can be implemented at reduced computational complexity within combined analysis/synthesis filter banks, compared to conventional multiple transposer implementations.
(106) While specific embodiments of the present invention and applications of the invention have been described herein, it will be apparent to those of ordinary skill in the art that many variations on the embodiments and applications described herein are possible without departing from the scope of the invention described and claimed herein. It should be understood that while certain forms of the invention have been shown and described, the invention is not to be limited to the specific embodiments described and shown or the specific methods described.
(107) The methods and systems described in the present document may be implemented as software, firmware and/or hardware. Certain components may e.g. be implemented as software running on a digital signal processor or microprocessor. Other components may e.g. be implemented as hardware and/or as application specific integrated circuits. The signals encountered in the described methods and systems may be stored on media such as random access memory or optical storage media. They may be transferred via networks, such as radio networks, satellite networks, wireless networks or wireline networks, e.g. the internet. Typical devices making use of the methods described in the present document are for example media players or set-top boxes which decode audio signals. On the encoding side, the systems and methods may be used e.g. in broadcasting stations and at multimedia production sites.