CROSS PRODUCT ENHANCED SUBBAND BLOCK BASED HARMONIC TRANSPOSITION
20220293113 · 2022-09-15
Assignee
Inventors
Cpc classification
G10L19/025
PHYSICS
International classification
G10L19/02
PHYSICS
G10L19/025
PHYSICS
Abstract
The invention provides an efficient implementation of cross-product enhanced high-frequency reconstruction (HFR), wherein a new component at frequency QΩ+rΩ.sub.0 is generated on the basis of existing components at Ω and Ω+Ω.sub.0. The invention provides a block-based harmonic transposition, wherein a time block of complex subband samples is processed with a common phase modification. Superposition of several modified samples has the net effect of limiting undesirable intermodulation products, thereby enabling a coarser frequency resolution and/or lower degree of oversampling to be used. In one embodiment, the invention further includes a window function suitable for use with block-based cross-product enhanced HFR. A hardware embodiment of the invention may include an analysis filter bank, a subband processing unit configurable by control data and a synthesis filter bank.
Claims
1. A system configured to generate a time stretched and/or frequency transposed signal from an input signal, the system comprising one or more processing elements that: derive a number Y≥1 of analysis subband signals from the input signal, wherein each analysis subband signal comprises a plurality of complex-valued analysis samples, each having a phase and a magnitude; generate a synthesis subband signal from the Y analysis subband signals using a subband transposition factor Q and a subband stretch factor S, at least one of Q and S being greater than one by: forming Y frames of L input samples, each frame being extracted from said plurality of complex-valued analysis samples in an analysis subband signal, wherein L is a frame length greater than 1, and wherein at least one of the L input samples is derived by interpolating two or more of the plurality of complex-valued analysis samples; applying a block hop size of h samples to said plurality of complex-valued analysis samples, prior to forming a subsequent frame of L input samples, thereby generating a sequence of frames of input samples; generating, on the basis of Y corresponding frames of input samples, a frame of processed samples by determining a phase and magnitude for each processed sample of the frame, wherein, for at least one processed sample: i) the phase of the processed sample is based on the respective phases of corresponding input samples in each of the Y frames of input samples; and ii) the magnitude of the processed sample is determined as a mean value of the magnitude of the corresponding input sample in a first frame of the Y frames of input samples and the magnitude of the corresponding input sample in a second frame of the Y frames of input samples; applying a window function to the frame of processed samples, wherein the window function is a rectangular window with a length corresponding to the frame length L; and determining the synthesis subband signal by overlapping and adding the samples 
of a sequence of windowed frames of processed samples; and generating the time stretched and/or frequency transposed signal from the synthesis subband signal, wherein the system is operable at least for Y=2.
2. A method for generating a time stretched and/or frequency transposed signal from an input signal, the method comprising: deriving a number Y≥2 of analysis subband signals from the input signal, wherein each analysis subband signal comprises a plurality of complex-valued analysis samples, each having a phase and a magnitude; forming Y frames of L input samples, each frame being extracted from said plurality of complex-valued analysis samples in an analysis subband signal, wherein L is a frame length greater than 1, and wherein at least one of the L input samples is derived by interpolating two or more of the plurality of complex-valued analysis samples; applying a block hop size of h samples to said plurality of complex-valued analysis samples, prior to deriving a subsequent frame of L input samples, thereby generating a sequence of frames of input samples; generating, on the basis of Y corresponding frames of input samples, a frame of processed samples by determining a phase and a magnitude for each processed sample of the frame, wherein, for at least one processed sample: i) the phase of the processed sample is based on the respective phases of corresponding input samples in each of the Y frames of input samples; and ii) the magnitude of the processed sample is determined as a mean value of the magnitude of the corresponding input sample in a first frame of the Y frames of input samples and the magnitude of the corresponding input sample in a second frame of the Y frames of input samples; determining the synthesis subband signal by applying a window function to the frame of processed samples, and overlapping and adding the samples of a sequence of windowed frames of processed samples, wherein the window function is a rectangular window with a length corresponding to the frame length L; and generating the time stretched and/or frequency transposed signal from the synthesis subband signal.
3. A non-transitory data carrier storing computer-readable instructions for performing the method set forth in claim 2.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0071] The present invention will now be described by way of illustrative examples, not limiting the scope or spirit of the invention, with reference to the accompanying drawings.
DESCRIPTION OF PREFERRED EMBODIMENTS
[0080] The embodiments described below are merely illustrative of the principles of the present invention, cross product enhanced subband block based harmonic transposition. It is understood that modifications and variations of the arrangements and the details described herein will be apparent to others skilled in the art. It is the intent, therefore, that the invention be limited only by the scope of the appended patent claims and not by the specific details presented by way of description and explanation of the embodiments herein.
[0083] A block extractor 201 samples a finite frame of samples from the complex valued input signal. The frame is defined by an input pointer position and the subband transposition factor. This frame undergoes nonlinear processing in processing section 202 and is subsequently windowed by windows of finite and possibly variable length in windowing section 203. The resulting samples are added to previously output samples in an overlap and add unit 204 where the output frame position is defined by an output pointer position. The input pointer is incremented by a fixed amount and the output pointer is incremented by the subband stretch factor times the same amount. An iteration of this chain of operations will produce an output signal with duration being the subband stretch factor times the input subband signal duration, up to the length of the synthesis window, and with complex frequencies transposed by the subband transposition factor. The control signal 104 may influence each of the three sections 201, 202, 203.
[0089] A cross processing control unit 404 furnishes this cross processing control data 403 given a portion of the control data 104 describing a fundamental frequency and the multitude of complex valued subband signals output from the analysis filter bank 101. The control data 104 may also carry other signal dependent configuration parameters which influence the cross product processing.
[0090] In the following text, a description of principles of cross product enhanced subband block based time stretch and transposition will be outlined with reference to
[0091] The two main configuration parameters of the overall harmonic transposer and/or time stretcher are
[0092] S.sub.φ: the desired physical time stretch factor; and
[0093] Q.sub.φ: the desired physical transposition factor.
[0094] The filter banks 101 and 103 can be of any complex exponential modulated type such as QMF or a windowed DFT or a wavelet transform. The analysis filter bank 101 and the synthesis filter bank 103 can be evenly or oddly stacked in the modulation and can be defined from a wide range of prototype filters and/or windows. While all these second order choices affect the details in the subsequent design such as phase corrections and subband mapping management, the main system design parameters for the subband processing can typically be derived from the two quotients Δt.sub.S/Δt.sub.A and Δf.sub.S/Δf.sub.A of the following four filter bank parameters, all measured in physical units. In the above quotients,
[0095] Δt.sub.A is the subband sample time step or time stride of the analysis filter bank 101 (e.g. measured in seconds [s]);
[0096] Δf.sub.A is the subband frequency spacing of the analysis filter bank 101 (e.g. measured in Hertz [1/s]);
[0097] Δt.sub.S is the subband sample time step or time stride of the synthesis filter bank 103 (e.g. measured in seconds [s]); and
[0098] Δf.sub.S is the subband frequency spacing of the synthesis filter bank 103 (e.g. measured in Hertz [1/s]).
[0099] For the configuration of the subband processing unit 102, the following parameters should be computed:
[0100] S: the subband stretch factor, i.e. the stretch factor which is applied within the subband processing unit 102 as a ratio of input and output samples in order to achieve an overall physical time stretch of the time domain signal by S.sub.φ;
[0101] Q: the subband transposition factor, i.e. the transposition factor which is applied within the subband processing unit 102 in order to achieve an overall physical frequency transposition of the time domain signal by the factor Q.sub.φ; and
[0102] the correspondence between source and target subband indices, wherein n denotes an index of an analysis subband entering the subband processing unit 102, and m denotes an index of a corresponding synthesis subband at the output of the subband processing unit 102.
[0103] In order to determine the subband stretch factor S, it is observed that an input signal to the analysis filter bank 101 of physical duration D corresponds to a number D/Δt.sub.A of analysis subband samples at the input to the subband processing unit 102. These D/Δt.sub.A samples will be stretched to S·D/Δt.sub.A samples by the subband processing unit 102 which applies the subband stretch factor S. At the output of the synthesis filter bank 103 these S·D/Δt.sub.A samples result in an output signal having a physical duration of Δt.sub.S·S·D/Δt.sub.A. Since this latter duration should meet the specified value S.sub.φ·D, i.e. since the duration of the time domain output signal should be time stretched compared to the time domain input signal by the physical time stretch factor S.sub.φ, the following design rule is obtained:
[0104] In order to determine the subband transposition factor Q which is applied within the subband processing unit 102 in order to achieve a physical transposition Q.sub.φ, it is observed that an input sinusoid to the analysis filter bank 101 of physical frequency Ω will result in a complex analysis subband signal with discrete time angular frequency ω=2πΩ·Δt.sub.A and the main contribution occurs within the analysis subband with index n≈Ω/Δf.sub.A. An output sinusoid at the output of the synthesis filter bank 103 of the desired transposed physical frequency Q.sub.φ·Ω will result from feeding the synthesis subband with index m≈Q.sub.φ·Ω/Δf.sub.S with a complex subband signal of discrete angular frequency 2πQ.sub.φ·Ω·Δt.sub.S. In this context, care should be taken in order to avoid the synthesis of aliased output frequencies different from Q.sub.φ·Ω. Typically this can be avoided by making appropriate second order choices as discussed, e.g. by selecting appropriate analysis and/or synthesis filter banks. The discrete frequency 2πQ.sub.φ·Ω·Δt.sub.S at the output of the subband processing unit 102 should correspond to the discrete time frequency ω=2πΩ·Δt.sub.A at the input of the subband processing unit 102 multiplied by the subband transposition factor Q. I.e., by setting equal 2πQΩΔt.sub.A and 2πQ.sub.φ·Ω·Δt.sub.S, the following relation between the physical transposition factor Q.sub.φ and the subband transposition factor Q may be determined:
[0105] Likewise, the appropriate source or analysis subband index n of the subband processing unit 102 for a given target or synthesis subband index m should obey
[0106] In one embodiment, it holds that Δf.sub.S/Δf.sub.A=Q.sub.φ, i.e. the frequency spacing of the synthesis filter bank 103 corresponds to the frequency spacing of the analysis filter bank 101 multiplied by the physical transposition factor, and the one-to-one mapping of analysis to synthesis subband index n=m can be applied. In other embodiments, the subband index mapping may depend on the details of the filter bank parameters. In particular, if the fraction of the frequency spacing of the synthesis filter bank 103 and the analysis filter bank 101 is different from the physical transposition factor Q.sub.φ, one or two source subbands may be assigned to a given target subband. In the case of two source subbands, it may be preferable to use two adjacent source subbands with index n, n+1, respectively. That is, the first and second source subbands are given by either (n(m), n(m)+1) or (n(m)+1, n(m)).
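The display formulas (1) to (3) are not reproduced in this text, but the derivations in the preceding paragraphs imply S=(Δt.sub.A/Δt.sub.S)·S.sub.φ, Q=(Δt.sub.S/Δt.sub.A)·Q.sub.φ and n≈(Δf.sub.S/Δf.sub.A)·m/Q.sub.φ. The following is a minimal sketch of that computation under this reading; the function name `subband_parameters` and its argument names are illustrative assumptions, not part of the disclosure.

```python
from fractions import Fraction

def subband_parameters(S_phi, Q_phi, dt_ratio, df_ratio):
    """Derive subband-domain parameters from physical ones.

    dt_ratio = dt_S / dt_A (synthesis over analysis time stride)
    df_ratio = df_S / df_A (synthesis over analysis frequency spacing)
    Both names are hypothetical; exact fractions avoid rounding noise.
    """
    # Duration match: dt_S * S * D / dt_A = S_phi * D
    S = Fraction(S_phi) / Fraction(dt_ratio)
    # Frequency match: 2*pi*Q*Omega*dt_A = 2*pi*Q_phi*Omega*dt_S
    Q = Fraction(Q_phi) * Fraction(dt_ratio)

    def source_index(m):
        # From n ~ Omega/df_A and m ~ Q_phi*Omega/df_S (generally non-integer)
        return Fraction(df_ratio) * m / Fraction(Q_phi)

    return S, Q, source_index

# Example: synthesis bank with half the time stride and twice the
# frequency spacing of the analysis bank, pure transposition by 3.
S, Q, n_of = subband_parameters(1, 3, Fraction(1, 2), 2)
print(S, Q, n_of(3))
```

With these ratios and S.sub.φ=1, the sketch gives S=2 and Q=3/2 for Q.sub.φ=3, and the mapping n=m when Q.sub.φ=2.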
[0107] The subband processing of
x.sub.l(k)=x(Qk+hl), k=−R.sub.1, . . . , R.sub.2−1, (4)
[0108] wherein the integer l is a block counting index, L=R.sub.1+R.sub.2 is the block length and R.sub.1, R.sub.2 are nonnegative integers. Note that for Q=1, the block is extracted from consecutive samples but for Q>1, a downsampling is performed in such a manner that the input addresses are stretched out by the factor Q. If Q is an integer this operation is typically straightforward to perform, whereas an interpolation method may be required for non-integer values of Q. This statement is relevant also for non-integer values of the increment h, i.e. of the input block stride. In an embodiment, short interpolation filters, e.g. filters having two filter taps, can be applied to the complex valued subband signal. For instance, if a sample at the fractional time index k+0.5 is required, a two tap interpolation of the form x(k+0.5)≈ax(k)+bx(k+1), where the coefficients a, b may be constants or may depend on a subband index (see, e.g., WO2004/097794 and WO2007/085275), may ensure a sufficient quality.
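The block extraction of formula (4) with a two tap interpolation can be sketched as follows. The sketch assumes plain linear interpolation, which gives a=b=0.5 at the half-sample position; the disclosure leaves the coefficients open. The helper names `sample_at` and `extract_block` are hypothetical, and samples outside the signal are treated as zero.

```python
import math

def sample_at(x, t):
    """Read subband signal x (list of complex) at a possibly fractional index t.

    Assumption: two tap linear interpolation between the neighbouring samples.
    """
    def get(i):
        return x[i] if 0 <= i < len(x) else 0j  # zero outside the signal

    k = math.floor(t)
    frac = t - k
    if frac == 0:
        return get(k)
    return (1.0 - frac) * get(k) + frac * get(k + 1)

def extract_block(x, Q, h, l, R1, R2):
    """Frame x_l(k) = x(Q*k + h*l) for k = -R1, ..., R2 - 1 (formula (4))."""
    return [sample_at(x, Q * k + h * l) for k in range(-R1, R2)]

ramp = [complex(n) for n in range(20)]
print(extract_block(ramp, 2, 1, 5, 2, 2))    # integer Q: plain downsampling
print(extract_block(ramp, 1.5, 1, 5, 2, 2))  # non-integer Q: interpolated
```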
[0109] An interesting special case of formula (4) is R.sub.1=0, R.sub.2=1 where the extracted block consists of a single sample, i.e. the block length is L=1.
[0110] With the polar representation of a complex number z=|z|exp(i∠z), wherein |z| is the magnitude of the complex number and ∠z is the phase of the complex number, the nonlinear processing unit 202 producing the output frame y.sub.l from the input frame x.sub.l is advantageously defined by the phase modification factor T=SQ through
[0111] where ρ∈[0,1] is a geometrical magnitude weighting parameter. The case ρ=0 corresponds to a pure phase modification of the extracted block. A particularly attractive value of the magnitude weighting is ρ=1−1/T for which a certain computational complexity relief is obtained irrespective of the block length L, and the resulting transient response is somewhat improved over the case ρ=0. The phase correction parameter θ depends on the filter bank details and the source and target subband indices. In an embodiment, the phase correction parameter θ may be determined experimentally by sweeping a set of input sinusoids. Furthermore, the phase correction parameter θ may be derived by studying the phase difference of adjacent target subband complex sinusoids or by optimizing the performance for a Dirac pulse type of input signal. Finally, with a suitable design of the analysis and synthesis filter banks 101 and 103, the phase correction parameter θ may be set to zero, or omitted. The phase modification factor T should be an integer such that the coefficients T−1 and 1 are integers in the linear combination of phases in the first line of formula (5). With this assumption, i.e. with the assumption that the phase modification factor T is an integer, the result of the nonlinear modification is well defined even though phases are ambiguous by identification modulo 2π.
[0112] In words, formula (5) specifies that the phase of an output frame sample is determined by offsetting the phase of a corresponding input frame sample by a constant offset value. This constant offset value may depend on the modification factor T, which itself depends on the subband stretch factor and/or the subband transposition factor. Furthermore, the constant offset value may depend on the phase of a particular input frame sample from the input frame. This particular input frame sample is kept fixed for the determination of the phase of all the output frame samples of a given block. In the case of formula (5), the phase of the center sample of the input frame is used as the phase of the particular input frame sample.
[0113] The second line of formula (5) specifies that the magnitude of a sample of the output frame may depend on the magnitude of the corresponding sample of the input frame. Furthermore, the magnitude of a sample of the output frame may depend on the magnitude of a particular input frame sample. This particular input frame sample may be used for the determination of the magnitude of all the output frame samples. In the case of formula (5), the center sample of the input frame is used as the particular input frame sample. In an embodiment, the magnitude of a sample of the output frame may correspond to the geometrical mean of the magnitude of the corresponding sample of the input frame and the particular input frame sample.
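Formula (5) itself is not reproduced in this text. Read together, the surrounding paragraphs describe a phase (T−1)·∠x.sub.l(0)+∠x.sub.l(k)+θ and a magnitude |x.sub.l(0)|.sup.ρ·|x.sub.l(k)|.sup.1−ρ. The following minimal sketch follows that reading, which is an inference from the prose rather than a transcription; `process_frame` is a hypothetical name.

```python
import cmath

def process_frame(frame, R1, T, rho=0.0, theta=0.0):
    """Nonlinear block processing as described around formula (5).

    frame holds x_l(k) for k = -R1, ..., R2 - 1, so frame[R1] is the
    center sample x_l(0). T must be an integer for a well defined phase.
    """
    center = frame[R1]
    out = []
    for xk in frame:
        # Common phase offset from the center sample, same for the whole block
        phase = (T - 1) * cmath.phase(center) + cmath.phase(xk) + theta
        # Weighted geometric mean of center and current magnitudes
        mag = abs(center) ** rho * abs(xk) ** (1.0 - rho)
        out.append(cmath.rect(mag, phase))
    return out

frame = [cmath.rect(2, 0.3), cmath.rect(4, 0.5), cmath.rect(1, -0.2)]
print(process_frame(frame, R1=1, T=3, rho=0.5))
```

With ρ=0 the block keeps its own magnitudes and only the phase changes; with ρ=0.5 each output magnitude is the plain geometric mean mentioned in paragraph [0113].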
[0114] In the windowing unit 203, a window w of length L is applied on the output frame, resulting in the windowed output frame
z.sub.l(k)=w(k)y.sub.l(k), k=−R.sub.1, . . . , R.sub.2−1. (6)
[0115] Finally, it is assumed that all frames are extended by zeros, and the overlap and add operation 204 is defined by
[0116] wherein it should be noted that the overlap and add unit 204 applies a block stride of Sh, i.e., a time stride which is S times higher than the input block stride h. Due to this difference in time strides of formula (4) and (7) the duration of the output signal z(k) is S times the duration of the input signal x(k), i.e., the synthesis subband signal has been stretched by the subband stretch factor S compared to the analysis subband signal. It should be noted that this observation typically applies if the length L of the window is negligible in comparison to the signal duration.
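The complete single-input chain (block extraction, nonlinear processing, windowing, overlap and add with output stride Sh) can be sketched as below. The sketch assumes integer Q, S and h so that no interpolation is needed, and it assumes the overlap and add sums each windowed frame at output index k+Shl, as the stated block stride suggests. The function name `transpose_subband` is hypothetical.

```python
import cmath

def transpose_subband(x, Q, S, h, R1, R2, w, rho=0.0, theta=0.0):
    """Single-input subband block processing, assuming integer Q, S and h.

    x is a list of complex analysis subband samples; w holds the
    L = R1 + R2 window taps for k = -R1, ..., R2 - 1.
    Returns a dict {output index: synthesis subband sample}.
    """
    T = S * Q  # phase modification factor
    z = {}
    l = 0
    while True:
        idx = [Q * k + h * l for k in range(-R1, R2)]  # input addresses
        if idx[-1] >= len(x):
            break                      # block would run past the input
        if idx[0] >= 0:
            center = x[h * l]          # sample with k = 0
            for tap, k, i in zip(w, range(-R1, R2), idx):
                phase = (T - 1) * cmath.phase(center) + cmath.phase(x[i]) + theta
                mag = abs(center) ** rho * abs(x[i]) ** (1.0 - rho)
                pos = k + S * h * l    # output stride is S*h
                z[pos] = z.get(pos, 0j) + tap * cmath.rect(mag, phase)
        l += 1
    return z

# A complex sinusoid at frequency omega, stretched by S = 2 with Q = 1
C, omega = cmath.rect(1.5, 0.4), 0.3
x = [C * cmath.exp(1j * omega * k) for k in range(100)]
z = transpose_subband(x, Q=1, S=2, h=1, R1=5, R2=5, w=[1.0] * 10)
print(max(z) - min(z) + 1, "output samples from", len(x), "input samples")
```

In this example the rectangular window of length 10 shifted with stride 2 sums to 5 everywhere, so away from the signal edges every output sample is 5·|C| times a unit phasor rotating at Qω.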
[0117] For the case where a complex sinusoid is used as input to the subband processing 102, i.e., an analysis subband signal corresponding to a complex sinusoid
x(k)=C exp(iωk), (8)
[0118] it may be determined by applying the formulas (4)-(7) that the output of the subband processing 102, i.e. the corresponding synthesis subband signal, is given by
[0119] independently of ρ. Hence, a complex sinusoid of discrete time frequency ω will be transformed into a complex sinusoid with discrete time frequency Qω provided the synthesis window shifts with a stride of Sh sum up to the same constant value K for all k,
[0120] It is illustrative to consider the special case of pure transposition where S=1 and T=Q. If the input block stride is h=1 and R.sub.1=0, R.sub.2=1, all the above, i.e. notably formula (5), reduces to the point-wise or sample based phase modification rule
[0121] The subband processing unit 102 may use the control data 104 to set certain processing parameters, e.g. the block length of the block extractors.
[0122] In the following, the description of the subband processing will be extended to cover the case of
[0123] The nonlinear processing 302 produces the output frame y.sub.l and may be defined by
[0124] The processing in 303 is again described by (6) and (7), and 204 is identical to the overlap and add processing described in the context of the single input case.
[0125] The definition of the nonnegative real parameters D.sub.1, D.sub.2, ρ and the nonnegative integer parameters T.sub.1, T.sub.2 and the synthesis window w now depends on the desired operation mode. Note that if the same subband is fed to both inputs, x.sup.(1)(k)=x.sup.(2)(k) and D.sub.1=Q, D.sub.2=0, T.sub.1=1, T.sub.2=T−1, the operations in (12) and (13) reduce to those of (4) and (5) in the single input case.
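Formulas (12) and (13) are not reproduced in this text. The reduction stated in the preceding paragraph is consistent with frames x.sup.(1)(D.sub.1k+hl) and x.sup.(2)(D.sub.2k+hl), a phase T.sub.1∠x.sup.(1)+T.sub.2∠x.sup.(2)+θ and a magnitude |x.sup.(1)|.sup.1−ρ·|x.sup.(2)|.sup.ρ. The sketch below assumes that form and integer D.sub.1, D.sub.2, and checks numerically that it collapses to the single-input rule; both function names are hypothetical.

```python
import cmath

def cross_frame(x1, x2, l, h, R1, R2, D1, D2, T1, T2, rho, theta=0.0):
    """Two-input nonlinear processing (assumed form of (12) and (13))."""
    out = []
    for k in range(-R1, R2):
        u1 = x1[D1 * k + h * l]
        u2 = x2[D2 * k + h * l]
        phase = T1 * cmath.phase(u1) + T2 * cmath.phase(u2) + theta
        mag = abs(u1) ** (1.0 - rho) * abs(u2) ** rho
        out.append(cmath.rect(mag, phase))
    return out

def single_frame(x, l, h, R1, R2, Q, T, rho, theta=0.0):
    """Single-input processing as described around formulas (4) and (5)."""
    center = x[h * l]
    out = []
    for k in range(-R1, R2):
        xk = x[Q * k + h * l]
        phase = (T - 1) * cmath.phase(center) + cmath.phase(xk) + theta
        mag = abs(center) ** rho * abs(xk) ** (1.0 - rho)
        out.append(cmath.rect(mag, phase))
    return out

# Arbitrary test signal fed to both inputs
x = [cmath.rect(1.0 + 0.05 * n, 0.1 * n * n) for n in range(40)]
Q, T, rho = 2, 4, 0.5
a = cross_frame(x, x, 10, 2, 3, 3, D1=Q, D2=0, T1=1, T2=T - 1, rho=rho)
b = single_frame(x, 10, 2, 3, 3, Q=Q, T=T, rho=rho)
print(max(abs(p - q) for p, q in zip(a, b)))  # numerically zero
```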
[0126] In one embodiment, wherein the ratio of the frequency spacing Δf.sub.S of the synthesis filter bank 103 and the frequency spacing Δf.sub.A of the analysis filter bank 101 is different from the desired physical transposition factor Q.sub.φ, it may be beneficial to determine the samples of a synthesis subband with index m from two analysis subbands with index n, n+1, respectively. For a given index m, the corresponding index n may be given by the integer value obtained by truncating the analysis index value n given by formula (3). One of the analysis subband signals, e.g., the analysis subband signal corresponding to index n, is fed into the first block extractor 301-1 and the other analysis subband signal, e.g. the one corresponding to index n+1, is fed into the second block extractor 301-2. Based on these two analysis subband signals a synthesis subband signal corresponding to index m is determined in accordance with the processing outlined above. The assignment of the adjacent analysis subband signals to the two block extractors 301-1 and 301-2 may be based on the remainder that is obtained when truncating the index value of formula (3), i.e. the difference of the exact index value given by formula (3) and the truncated integer value n obtained from formula (3). If the remainder is greater than 0.5, then the analysis subband signal corresponding to index n may be assigned to the second block extractor 301-2, otherwise this analysis subband signal may be assigned to the first block extractor 301-1. In this operation mode, the parameters may be designed such that input subband signals sharing the same complex frequency ω,
lead to an output subband signal being a complex sinusoid with discrete time frequency Qω. It turns out that this happens if the following relations hold:
[0127] For the operation mode of generating missing partials by means of cross products, the design criteria are different. Returning to the physical transposition parameter Q.sub.φ, the aim of a cross product addition is to produce output at the frequencies Q.sub.φΩ+rΩ.sub.0 for r=1, . . . , Q.sub.φ−1 given inputs at frequencies Ω and Ω+Ω.sub.0, where Ω.sub.0 is a fundamental frequency belonging to a dominant pitched component of the input signal. As described in WO2010/081892, the selective addition of those terms will result in a completion of the harmonic series and a significant reduction of the ghost pitch artifact.
[0128] A constructive algorithm for operating the cross processing control 404 will now be outlined. Given a target output subband index m, the parameter r=1, . . . , Q.sub.φ−1 and the fundamental frequency Ω.sub.0, one can deduce appropriate source subband indices n.sub.1 and n.sub.2 by solving the following system of equations in an approximate sense,
[0130] With the definitions [0131] p=Ω.sub.0/Δf.sub.A: the fundamental frequency measured in units of the analysis filter bank frequency spacing; [0132] F=Δf.sub.S/Δf.sub.A: the quotient of synthesis to analysis subband frequency spacing; and
[0133] an example of an advantageous approximate solution to (16) is given by selecting n.sub.1 as the integer closest to n.sup.f, and n.sub.2 as the integer closest to n.sup.f+p.
[0134] If the fundamental frequency is smaller than the analysis filter bank spacing, that is if p<1, it may be advantageous to cancel the addition of a cross product.
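The index selection of the two preceding paragraphs can be sketched as below. The definition of n.sup.f is not reproduced in this text, so the sketch takes it as an input. Rounding half upward is an assumption for the tie case, which the text does not specify, and `cross_source_indices` is a hypothetical name.

```python
import math

def cross_source_indices(n_f, p):
    """Pick source subbands (n1, n2) for a cross product, or None.

    n_f : real-valued lower source index (definition not shown in the text)
    p   : fundamental frequency in units of the analysis frequency spacing
    """
    if p < 1:
        # Fundamental below the analysis spacing: skip the cross product
        return None

    def nearest(v):
        return math.floor(v + 0.5)  # assumed tie-breaking rule

    return nearest(n_f), nearest(n_f + p)

print(cross_source_indices(10.3, 5.0196))
print(cross_source_indices(10.3, 0.8))
```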
[0135] As it is taught in WO2010/081892, a cross product should not be added to an output subband which already has a significant main contribution from the transposition without cross products. Moreover, at most one of cases r=1, . . . , Q.sub.φ−1 should contribute to the cross product output. Here, these rules may be carried out by performing the following three steps for each target output subband index m: [0136] 1. Compute the maximum M.sub.C over all choices of r=1, . . . Q.sub.φ−1 of the minimum of the candidate source subband magnitudes |x.sup.(1)| and |x.sup.(2)| evaluated in (or from a neighborhood of) the central time slot k=hl, wherein the source subbands x.sup.(1) and x.sup.(2) may be given by indices n.sub.1 and n.sub.2 as in equation (16); [0137] 2. Compute the corresponding magnitude M.sub.S for the direct source term |x| obtained from a source subband with index
[0139] Variations to this procedure may be desirable depending on the particular system configuration parameters. One such variation is to replace the hard thresholding of point 3 with softer rules depending on the quotient M.sub.C/M.sub.S. Another variation is to expand the maximization in point 1 to more than Q.sub.φ−1 choices, for example defined by a finite list of candidate values for fundamental frequency measured in analysis frequency spacing units p. Yet another variation is to apply different measures of the subband magnitudes, such as the magnitude of a fixed sample, a maximal magnitude, an average magnitude, a magnitude in l.sup.p-norm sense, etc.
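The selection procedure can be sketched as follows. Steps 2 and 3 are only partly reproduced in this text, so the sketch takes the direct-term magnitude M.sub.S as an input and assumes that the hard thresholding compares M.sub.C against a multiple q of M.sub.S. The threshold q, its default value and the function name are hypothetical.

```python
def select_cross_term(candidates, M_S, q=1.0):
    """Choose at most one cross product order r for a target subband.

    candidates : dict mapping r to (|x1|, |x2|), the two candidate source
                 magnitudes at the central time slot
    M_S        : magnitude of the direct source term
    q          : hypothetical threshold on the quotient M_C / M_S
    Returns (r, M_C) if a cross product should be added, else None.
    """
    best_r, M_C = None, 0.0
    for r, (m1, m2) in candidates.items():
        weakest = min(m1, m2)          # both sources must be present
        if weakest > M_C:
            best_r, M_C = r, weakest   # step 1: maximum of the minima
    if best_r is not None and M_C > q * M_S:
        return best_r, M_C             # assumed hard threshold (step 3)
    return None

cands = {1: (0.9, 0.2), 2: (0.6, 0.5)}
print(select_cross_term(cands, M_S=0.1))
print(select_cross_term(cands, M_S=1.0))
```

Taking the minimum of the two source magnitudes favours the order r for which both partials are actually present, and the threshold keeps the cross product out of subbands that the direct transposition already fills.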
[0140] The list of target subbands m selected for addition of a cross product together with the values of n.sub.1 and n.sub.2 constitutes a main part of the cross processing control data 403. What remains to be described are the configuration parameters D.sub.1, D.sub.2, ρ, the nonnegative integer parameters T.sub.1, T.sub.2 appearing in the phase rotation (13) and the synthesis window w to be used in the cross subband processing 402. Inserting the sinusoidal model for the cross product situation leads to the following source subband signals:
where ω=2πΩΔt.sub.A and ω.sub.0=2πΩ.sub.0Δt.sub.A. Likewise, the desired output subband is of the form
z(k)=C.sub.3exp[iQ(ω+rω.sub.0/Q.sub.φ)k]. (18)
Computations reveal that this target output can be achieved if (15) is fulfilled jointly with
The conditions (15) and (19) are equivalent to
which defines the integer factors T.sub.1, T.sub.2 for the phase modification in (13) and provides some design freedom in setting the values of downsampling factors D.sub.1, D.sub.2. The magnitude weighting parameter may be advantageously chosen as ρ=r/Q.sub.φ. As can be seen, these configuration parameters only depend on the fundamental frequency Ω.sub.0 through the selection of r. However, for (18) to hold, a new condition on the synthesis window w emerges, namely
[0141] A synthesis window w which satisfies (21) either exactly or approximately is to be provided as the last piece of cross processing control data 403.
[0142] It is noted that the above algorithm for computing cross processing control data 403 on the basis of input parameters, such as a target output subband index m and a fundamental frequency Ω.sub.0, is of a purely exemplifying nature and as such does not limit the scope of the invention. Variations of this disclosure within the skilled person's knowledge and routine experimentation—e.g., a further subband block based processing method providing a signal (18) as output in response to input signals (17)—fall entirely within the scope of the invention.
[0145] Consider first the case Q.sub.φ=2. Then 602-2 has to perform a subband stretch of S=2, a subband transposition of Q=1 (i.e. none) and the correspondence between source n and target subbands m is given by n=m for the direct subband processing. In the inventive scenario of cross product addition, there is only one type of cross product to consider, namely r=1 (see above, after equation (15)), and the equations (20) reduce to T.sub.1=T.sub.2=1 and D.sub.1+D.sub.2=1. An exemplary solution consists of choosing D.sub.1=0 and D.sub.2=1. For the direct processing synthesis window, a rectangular window of even length L=10 with R.sub.1=R.sub.2=5 may be used as it satisfies the condition (10). For the cross processing synthesis window, a short L=2 tap window can be used, with R.sub.1=R.sub.2=1, in order to keep the additional complexity of the cross products addition to a minimum. After all, the beneficial effect of using a long block for the subband processing is most notable in the case of complex audio signals, where unwanted intermodulation terms are suppressed; for the case of a dominant pitch, such artifacts are less probable to occur. The L=2 tap window is the shortest one that can satisfy (10) since h=1 and S=2. By the present invention, however, the window advantageously satisfies (21). For the parameters at hand, this amounts to
[0146] which is fulfilled by choosing w(0)=1 and w(−1)=exp(iα)=exp(iπp/2).
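The effect of the complex window tap can be checked numerically for the case Q.sub.φ=2 (S=2, Q=1, h=1, T.sub.1=T.sub.2=1, D.sub.1=0, D.sub.2=1, ρ=1/2). The sketch below assumes the two-input processing form used in the single and dual input descriptions above, θ=0, and unit-amplitude source sinusoids at ω and ω+ω.sub.0. It uses α=ω.sub.0/2, which equals πp/2 when ω.sub.0=πp (for example, when Δt.sub.A·Δf.sub.A=1/2, an assumption not stated in the text). The function name `cross_output` is hypothetical.

```python
import cmath

def cross_output(x1, x2, w, n_out, S=2, h=1, D1=0, D2=1, T1=1, T2=1, rho=0.5):
    """Overlap-added cross product output at index n_out.

    x1, x2 are callables returning source subband samples at integer
    indices; w maps the window taps k = -1, 0 (R1 = R2 = 1) to weights.
    """
    z = 0j
    for k in (-1, 0):
        if (n_out - k) % (S * h) != 0:
            continue                   # no block puts tap k on this index
        l = (n_out - k) // (S * h)
        u1 = x1(D1 * k + h * l)
        u2 = x2(D2 * k + h * l)
        phase = T1 * cmath.phase(u1) + T2 * cmath.phase(u2)
        mag = abs(u1) ** (1.0 - rho) * abs(u2) ** rho
        z += w[k] * cmath.rect(mag, phase)
    return z

omega, omega0 = 0.4, 0.25

def x1(k):
    return cmath.exp(1j * omega * k)             # partial at omega

def x2(k):
    return cmath.exp(1j * (omega + omega0) * k)  # partial at omega + omega0

w_cross = {0: 1.0, -1: cmath.exp(1j * omega0 / 2)}  # complex tap exp(i*alpha)
w_plain = {0: 1.0, -1: 1.0}                         # direct-processing style

def target(n):
    return cmath.exp(1j * (omega + omega0 / 2) * n)  # desired output (18)

err_cross = max(abs(cross_output(x1, x2, w_cross, n) - target(n)) for n in range(40))
err_plain = max(abs(cross_output(x1, x2, w_plain, n) - target(n)) for n in range(40))
print(err_cross, err_plain)
```

Under these assumptions the complex window yields a clean sinusoid at ω+ω.sub.0/2, while the all-ones window leaves a phase error of ω.sub.0/2 on every odd output sample.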
[0147] For the case Q.sub.φ=3 the specifications for 602-3 given by (1)-(3) are that it has to perform a subband stretch of S=2, a subband transposition of Q=3/2 and that the correspondence between source n and target m subbands for the direct term processing is given by n≈2m/3. There are two types of cross product terms r=1,2, and the equations (20) reduce to
[0148] An exemplary solution consists of choosing the downsampling parameters as [0149] D.sub.1=0 and D.sub.2=3/2 for r=1; [0150] D.sub.1=3/2 and D.sub.2=0 for r=2.
[0151] For the direct processing synthesis window, a rectangular window of even length L=8 with R.sub.1=R.sub.2=4 may be used. For the cross processing synthesis window, a short L=2 tap window can be used, with R.sub.1=R.sub.2=1, and satisfying
which is fulfilled by choosing w(0)=1 and w(−1)=exp(iα).
[0152] For the case Q.sub.φ=4, the specifications for 602-4 given by (1)-(3) are that it has to perform a subband stretch of S=2, a subband transposition of Q=2 and that the correspondence between source n and target subbands m for the direct term processing is given by n≈m/2. There are three types of cross product terms r=1,2,3, and the equations (20) reduce to
[0153] An exemplary solution consists of choosing [0154] D.sub.1=0 and D.sub.2=2 for r=1; [0155] D.sub.1=0 and D.sub.2=1 for r=2; [0156] D.sub.1=2 and D.sub.2=0 for r=3.
[0157] For the direct processing synthesis window, a rectangular window of even length L=6 with R.sub.1=R.sub.2=3 may be used. For the cross processing synthesis window, a short L=2 tap window can be used, with R.sub.1=R.sub.2=1, and satisfying
which is fulfilled by choosing w(0)=1 and w(−1)=exp(iα).
[0158] In each of the above cases where more than one r value is applicable, a selection will take place, e.g., similarly to the three-step procedure described before equation (17).
[0161] The top panel 801 depicts the output spectrum obtained if all cross product processing is canceled and only the direct subband processing 401 is active. This will be the case if the cross processing control 404 receives no pitch or p=0. Transposition by Q.sub.φ=2 generates the output in the range from 4 to 8 kHz and transposition by Q.sub.φ=3 generates the output in the range from 8 to 12 kHz. As it can be seen, the created partials are increasingly far apart and the output deviates significantly from the target high frequency signal 702. Audible double and triple “ghost” pitch artifacts will be present in the resulting audio output.
[0162] The middle panel 802 depicts the output spectrum obtained if cross product processing is active, the pitch parameter p=5 is used (which is an approximation to 128Ω.sub.0/fs=5.0196), but a simple two tap synthesis window with w(0)=w(−1)=1, satisfying condition (10), is used for the cross subband processing. This amounts to a straightforward combination of subband block based processing and cross-product enhanced harmonic transposition. As it can be seen, the additional output signal components compared to 801 do not align well with the desired harmonic series. This shows that it leads to insufficient audio quality to use the procedure inherited from the design of direct subband processing for the cross product processing.
[0163] The bottom panel 803 depicts the output spectrum obtained from the same scenario as for the middle panel 802, but now with the cross subband processing synthesis windows given by the formulas described in the cases Q.sub.φ=2,3 of
[0164]
[0165] It is possible to obtain the processed sample w according to this specification by pre-normalizing each of the input samples u.sub.1, u.sub.2 at a respective pre-normalizer 901, 902 and multiplying the pre-normalized input samples v.sub.1=u.sub.1/|u.sub.1|.sup.a, v.sub.2=u.sub.2/|u.sub.2|.sup.b at a weighted multiplier 910, which outputs w=v.sub.1.sup.αv.sub.2.sup.β. Clearly, the operation of the pre-normalizers 901, 902 and the weighted multiplier 910 is determined by input parameters a, b, α and β. It is easy to verify that equations (22) will be fulfilled if α=T.sub.1, β=T.sub.2, a=1−ρ/T.sub.1, b=1−(1−ρ)/T.sub.2. The skilled person will readily be able to generalize this layout to an arbitrary number N.sub.0 of input samples, wherein a multiplier is supplied with N.sub.0 input samples, of which some or all have undergone pre-normalization. One observes, then, that a common pre-normalization (a=b, implying that the pre-normalizers 901, 902 produce identical results) is possible if the parameter ρ is set to ρ=T.sub.1/(T.sub.1+T.sub.2). This results in a computational advantage when many subbands are considered, since a common pre-normalization step can be effected on all candidate subbands prior to the multiplication. In an advantageous hardware implementation, a plurality of identically functioning pre-normalizers is replaced by a single unit which alternates between samples from different subbands in a time-division fashion.
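The pre-normalizer and weighted multiplier arrangement may be sketched as follows. The sketch assumes that equations (22) prescribe a phase of T.sub.1 arg u.sub.1+T.sub.2 arg u.sub.2 and a magnitude of |u.sub.1|.sup.ρ|u.sub.2|.sup.1−ρ, which is what the parameter choice α=T.sub.1, β=T.sub.2, a=1−ρ/T.sub.1, b=1−(1−ρ)/T.sub.2 implies. It further assumes that T.sub.1 and T.sub.2 are positive integers. All function names are hypothetical and the numerical values are chosen for illustration only.

```python
import cmath
import math


def pre_normalize(u, exponent):
    """Pre-normalizer: return u / |u|**exponent.

    The phase of u is preserved; the magnitude becomes |u|**(1 - exponent).
    """
    mag = abs(u)
    if mag == 0.0:
        return 0j  # a zero sample stays zero (choice made for this sketch)
    return u / mag ** exponent


def normalizer_exponents(T1, T2, rho):
    """Exponents a, b for which the multiplier output has magnitude
    |u1|**rho * |u2|**(1 - rho). Assumes T1 and T2 are positive."""
    a = 1.0 - rho / T1
    b = 1.0 - (1.0 - rho) / T2
    return a, b


def common_rho(T1, T2):
    """Value of rho for which a == b, allowing one shared pre-normalizer."""
    return T1 / (T1 + T2)


def cross_product_sample(u1, u2, T1, T2, rho):
    """Pre-normalize both inputs, then apply the weighted multiplier
    w = v1**alpha * v2**beta with alpha = T1 and beta = T2."""
    a, b = normalizer_exponents(T1, T2, rho)
    v1 = pre_normalize(u1, a)
    v2 = pre_normalize(u2, b)
    return v1 ** T1 * v2 ** T2


if __name__ == "__main__":
    u1 = cmath.rect(4.0, 0.3)   # magnitude 4, phase 0.3 rad
    u2 = cmath.rect(9.0, -0.5)  # magnitude 9, phase -0.5 rad
    w = cross_product_sample(u1, u2, T1=2, T2=1, rho=0.5)
    # Magnitude: 4**0.5 * 9**0.5 = 6; phase: 2*0.3 + 1*(-0.5) = 0.1
    print(abs(w), cmath.phase(w))
    a, b = normalizer_exponents(2, 1, common_rho(2, 1))
    print(math.isclose(a, b))  # common pre-normalization, a equals b up to rounding
```

With ρ=T.sub.1/(T.sub.1+T.sub.2), both exponents equal 1−1/(T.sub.1+T.sub.2), so a single pre-normalization can be computed once per subband sample and reused for every candidate pairing.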
[0166] Further embodiments of the present invention will become apparent to a person skilled in the art after reading the description above. Even though the present description and drawings disclose embodiments and examples, the invention is not restricted to these specific examples. Numerous modifications and variations can be made without departing from the scope of the present invention, which is defined by the accompanying claims.
[0167] The systems and methods disclosed hereinabove may be implemented as software, firmware, hardware or a combination thereof. Certain components or all components may be implemented as software executed by a digital signal processor or microprocessor, or be implemented as hardware or as an application-specific integrated circuit. Such software may be distributed on computer readable media, which may comprise computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to a person skilled in the art, computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Further, it is well known to the skilled person that communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media.