Audio signal of an FM stereo radio receiver by using parametric stereo

Abstract

The invention relates to a method for improving a stereo audio signal of an FM stereo radio receiver. The method comprises determining one or more parametric stereo parameters based on the stereo audio signal in a frequency-variant or frequency-invariant manner. Preferably, these PS parameters are time- and frequency-variant. Moreover, the method comprises generating the improved stereo signal based on a first audio signal and the one or more parametric stereo parameters. The first audio signal is obtained from the stereo audio signal, e.g. by a downmix operation.

Claims

1. A method for improving a left/right or mid/side audio signal output by a frequency modulation (FM) stereo radio receiver, the method comprising: receiving the left/right or mid/side audio signal from the FM stereo radio receiver; generating a first audio signal based on the left/right or mid/side audio signal by a downmix operation; determining one or more parametric stereo parameters based on the left/right or mid/side audio signal in a frequency-variant manner; receiving the first audio signal and outputting a decorrelated signal; and generating a stereo signal based on the first audio signal, the one or more parametric stereo parameters, and selectively on: a second audio signal or at least a frequency band thereof, the second audio signal being a received side signal or a residual signal, the residual signal indicating an error associated with representing the left/right or mid/side audio signal by the first audio signal and the one or more parametric stereo parameters, or the decorrelated signal, wherein: the generating the stereo signal selectively based on the second audio signal or the decorrelated signal is frequency-variant and uses: the second audio signal for a first frequency range and the decorrelated signal for a second frequency range, the frequencies of the first frequency range being lower than the frequencies of the second frequency range.

2. The method of claim 1, wherein the method further comprises generating a decorrelated signal based on the first audio signal, and the generating the stereo signal is based on the first audio signal, the one or more parametric stereo parameters, and the decorrelated signal or at least a frequency band thereof.

3. The method of claim 1, wherein the generating the first audio signal is according to the following formula:
(L+R)/a wherein L and R denote the left and right channels of a left/right audio signal and a is a real number.

4. The method of claim 1, wherein the first audio signal corresponds to a received mid signal.

5. The method of claim 1, further comprising deriving the second audio signal based on the left/right audio or mid/side audio signal.

6. The method of claim 1, wherein the generating the stereo signal selectively depends: on a radio reception indicator indicative of the radio reception condition, and/or on a quality indicator indicative of the quality of the received side signal.

7. The method of claim 1, wherein the one or more parametric stereo parameters include a parameter indicating a channel level difference and/or a parameter indicating an inter-channel cross-correlation.

8. The method of claim 1, further comprising: performing noise reduction of the first audio signal, and generating the stereo signal based on a noise reduced first audio signal and the one or more parametric stereo parameters.

9. The method of claim 1, further comprising: performing noise reduction on the left/right or mid/side audio signal, and generating the one or more parametric stereo parameters based on the noise reduced left/right or mid/side audio signal.

10. The method of claim 9, further comprising: obtaining the first audio signal from the noise reduced left/right or mid/side audio signal.

11. The method of claim 1, further comprising: determining a noise parameter characteristic for the noise power of the received side signal; and determining the one or more parametric stereo parameters based on the left/right or mid/side audio signal and the noise parameter in a frequency-variant or frequency-invariant manner.

12. The method of claim 1, further comprising: noticing that the FM stereo receiver selects mono output of the stereo radio signal or noticing poor radio reception; and using one or more upmix parameters for blind upmix in case that the FM stereo receiver selecting mono output of the stereo radio signal is noticed or poor reception is noticed.

13. The method of claim 12, wherein the one or more upmix parameters for blind upmix are one or more preset upmix parameters.

14. The method of claim 12, further comprising: detecting whether the left/right or mid/side audio signal is predominantly speech, the one or more upmix parameters for blind upmix being dependent on said detection.

15. The method of claim 1, further comprising: noticing that the FM stereo receiver selects mono output of the stereo radio signal or noticing poor radio reception; and when the FM stereo receiver switches to mono output or poor radio reception is noticed, the generating the stereo signal uses one or more upmix parameters which are based on one or more previously estimated parametric stereo parameters from the determining.

16. The method of claim 15, wherein the generating the stereo signal continues to use the one or more previously estimated parametric stereo parameters as upmix parameters when the FM stereo receiver switches to mono output or poor radio reception occurs.

17. The method of claim 1, further comprising selecting the normal stereo mode in a frequency-variant manner.

18. The method of claim 1, wherein the determining one or more parametric stereo parameters is carried out with error compensation.

Description

DESCRIPTION OF DRAWINGS

(1) The invention is explained below by way of illustrative examples with reference to the accompanying drawings, wherein

(2) FIG. 1 illustrates a schematic embodiment for improving the stereo output of an FM stereo radio receiver;

(3) FIG. 2 illustrates an embodiment of the audio processing apparatus based on the concept of parametric stereo;

(4) FIG. 3 illustrates another embodiment of the PS based audio processing apparatus having a PS encoder and a PS decoder;

(5) FIG. 4 illustrates an extended version of the audio processing apparatus of FIG. 3;

(6) FIG. 5 illustrates an embodiment of the PS encoder and the PS decoder of FIG. 4;

(7) FIG. 6 illustrates an exemplary structure of the signal S used for upmix;

(8) FIG. 7 illustrates an extended version of the audio processing apparatus of FIG. 3, where a noise reduction algorithm is added;

(9) FIG. 8 illustrates a further embodiment of the audio processing apparatus with noise reduction for PS parameter estimation;

(10) FIG. 9 illustrates another embodiment of the audio processing apparatus for pseudo-stereo generation in case of mono only output of the FM receiver;

(11) FIG. 10 illustrates the occurrence of short drop-outs in stereo playback at the output of the FM receiver;

(12) FIG. 11 illustrates an advanced PS parameter estimation stage with error compensation; and

(13) FIG. 12 illustrates a further embodiment of the audio processing apparatus based on an HE-AAC v2 encoder.

DETAILED DESCRIPTION

(14) FIG. 1 shows a simplified schematic embodiment for improving the stereo output of an FM stereo radio receiver 1. As discussed in the background section, in FM radio the stereo signal is transmitted by design as a mid signal and side signal. In the FM receiver 1, the side signal is used to create the stereo difference between the left channel L and the right channel R at the output of the FM receiver 1 (at least when reception is good enough and the side signal information is not muted). The left and right channels L, R may be digital or analog signals. For improving the audio signals L, R of the FM receiver, an audio processing apparatus 2 is used, which generates a stereo audio signal L′ and R′ at its output. The audio processing apparatus 2 corresponds to a system which is enabled to perform noise reduction of a received FM radio signal using parametric stereo. The audio processing in the apparatus 2 is preferably performed in the digital domain; thus, in case of an analog interface between the FM receiver 1 and the audio processing apparatus 2, an analog-to-digital converter is used before digital audio processing in the apparatus 2. The FM receiver 1 and the audio processing apparatus 2 may be integrated on the same semiconductor chip or may be part of two semiconductor chips. The FM receiver 1 and the audio processing apparatus 2 can be part of a wireless communication device such as a cellular telephone, a personal digital assistant (PDA) or a smart phone. In this case, the FM receiver 1 may be part of the baseband chip having additional FM radio receiver functionality.

(15) Instead of using a left/right representation at the output of the FM receiver 1 and the input of the apparatus 2, a mid/side representation may be used at the interface between the FM receiver 1 and the apparatus 2 (see M, S in FIG. 1 for the mid/side representation and L, R for the left/right representation). Such a mid/side representation at the interface between the FM receiver 1 and the apparatus 2 may result in less effort since the FM receiver 1 already receives a mid/side signal and the audio processing apparatus 2 may directly process the mid/side signal without downmixing. The mid/side representation may be advantageous if the FM receiver 1 is tightly integrated with the audio processing apparatus 2, in particular if the FM receiver 1 and the audio processing apparatus 2 are integrated on the same semiconductor chip.

(16) Optionally, a signal strength signal 6 indicating the radio reception condition may be used for adapting the audio processing in the audio processing apparatus 2. This will be explained later in this specification.

(17) The combination of the FM radio receiver 1 and the audio processing apparatus 2 corresponds to an FM radio receiver having an integrated noise reduction system.

(18) FIG. 2 shows an embodiment of the audio processing apparatus 2 which is based on the concept of parametric stereo. The apparatus 2 comprises a PS parameter estimation stage 3. The parameter estimation stage 3 is configured to determine PS parameters 5 based on the input audio signal to be improved (which may be either in left/right or mid/side representation). The PS parameters 5 may include, amongst others, a parameter indicating inter-channel intensity differences (IID, also called CLD, channel level differences) and/or a parameter indicating an inter-channel cross-correlation (ICC). Preferably, the PS parameters 5 are time- and frequency-variant. In case of an M/S representation at the input of the parameter estimation stage 3, the parameter estimation stage 3 may nevertheless determine PS parameters 5 which relate to the L/R channels.

(19) An audio signal DM is obtained from the input signal. In case the input audio signal already uses a mid/side representation, the audio signal DM may directly correspond to the mid signal. In case the input audio signal has a left/right representation, the audio signal DM is generated by downmixing the input audio signal. Preferably, the resulting signal DM after downmix corresponds to the mid signal M and may be generated by the following equation:
DM=(L+R)/a, e.g. with a=2,
i.e. the downmix signal DM may correspond to the average of the L and R signals. For different values of a, the average of the L and R signals is amplified or attenuated.
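
As a minimal sketch of the downmix equation above, the following Python function computes DM sample by sample. The function name `downmix` and the list-based signal representation are assumptions made for illustration only.

```python
def downmix(left, right, a=2.0):
    """Return DM = (L + R) / a for two equally long sample sequences."""
    if a == 0:
        raise ValueError("a must be non-zero")
    if len(left) != len(right):
        raise ValueError("L and R must have the same length")
    return [(l + r) / a for l, r in zip(left, right)]


# Arbitrary example samples (not taken from the specification).
left = [0.5, -0.25, 1.0]
right = [0.25, 0.25, -1.0]

dm_avg = downmix(left, right)         # a = 2: average of L and R
dm_sum = downmix(left, right, a=1.0)  # a = 1: twice the average
```

With a=2 the result is the plain average; a smaller value of a amplifies it and a larger value attenuates it.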

(20) The apparatus further comprises an upmix stage 4, also called stereo mixing module or stereo upmixer. The upmix stage 4 is configured to generate a stereo signal L′, R′ based on the audio signal DM and the PS parameters 5. Preferably, the upmix stage 4 does not only use the DM signal but also uses a side signal or some kind of pseudo side signal (not shown). This will be explained later in the specification in connection with more extended embodiments in FIGS. 4 and 5.

(21) The apparatus 2 is based on the idea that, due to its noise, the received side signal may be too noisy for reconstructing the stereo signal by simply combining the received mid and side signals; nevertheless, in this case the side signal or the side signal's component in the L/R signal may still be good enough for stereo parameter analysis in the PS parameter estimation stage 3. The resulting PS parameters 5 can then be used for generating a stereo signal L′, R′ having a reduced level of noise in comparison to the audio signal directly at the output of the FM receiver 1.

(22) Thus, a bad FM radio signal can be cleaned up by using the parametric stereo concept. The major part of the distortion and noise in an FM radio signal is located in the side channel, which may be left unused in the PS downmix. Nevertheless, even in case of bad reception, the side channel is often of sufficient quality for PS parameter extraction.

(23) In all the following drawings, the input signal to the audio processing apparatus 2 is a left/right stereo signal. With minor modifications to some modules within the audio processing apparatus 2, the audio processing apparatus 2 can also process an input signal in mid/side representation. Therefore, the concepts discussed herein can be used in connection with an input signal in mid/side representation.

(24) FIG. 3 shows an embodiment of the PS based audio processing apparatus 2, which makes use of a PS encoder 7 and a PS decoder 8. The parameter estimation stage 3, in this example, is part of the PS encoder 7 and the upmix stage 4 is part of the PS decoder 8. The terms PS encoder and PS decoder are used as names for describing the function of the audio processing blocks within the apparatus 2. It should be noted that the audio processing is all happening at the same FM receiver device. These PS encoding and PS decoding processes may be tightly coupled and the terms PS encoding and PS decoding are only used to describe the heritage of the audio processing functions.

(25) The PS encoder 7 generates, based on the stereo audio input signal L, R, the audio signal DM and the PS parameters 5. Optionally, the PS encoder 7 further uses a signal strength signal 6. The audio signal DM is a mono downmix and preferably corresponds to the received mid signal. When summing the L/R channels to form the DM signal, the information of the received side channel may be completely excluded in the DM signal. Thus, in this case only the mid information is contained in the mono downmix DM. Hence, any noise from the side channel may be excluded in the DM signal. However, the side channel is part of the stereo parameter analysis in the encoder 7 as the encoder 7 typically takes L=M+S and R=M−S as input (consequently, DM=(L+R)/2=M).
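
The exclusion of side-channel noise from the downmix can be checked numerically. The sketch below assumes additive noise n on the side signal only; all sample values are arbitrary and chosen for illustration.

```python
mid = [0.5, -0.25, 0.125]
side = [0.125, 0.25, -0.5]
noise = [0.0625, -0.125, 0.25]  # hypothetical noise on the received side signal

# Receiver output: L = M + (S + n), R = M - (S + n)
left = [m + (s + n) for m, s, n in zip(mid, side, noise)]
right = [m - (s + n) for m, s, n in zip(mid, side, noise)]

# Downmix with a = 2: the side signal and its noise cancel, leaving M.
dm = [(l + r) / 2 for l, r in zip(left, right)]

# The half difference still carries the noisy side signal S + n.
side_received = [(l - r) / 2 for l, r in zip(left, right)]
```

Here `dm` equals `mid` exactly, while `side_received` equals the side signal plus its noise, which is what the parameter analysis still sees.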

(26) Experimental results indicate that a received side signal that contains intermediate levels of noise may not be good enough for reconstructing stereo itself but can be good enough for stereo parameter analysis in a PS encoder 7.

(27) The mono signal DM and the PS parameters 5 are used subsequently in the PS decoder 8 to reconstruct the stereo signal L′, R′.

(28) FIG. 4 shows an extended version of the audio processing apparatus 2 of FIG. 3. Here, in addition to the mono downmix signal DM and the PS parameters, the originally received side signal S.sub.0 is also passed on to the PS decoder 8. This approach is similar to residual coding techniques from PS coding, and allows making use of at least parts (e.g. certain frequency bands) of the received side signal S.sub.0 in case of good but not perfect reception conditions. The received side signal S.sub.0 is preferably used in case the mono downmix signal corresponds to the mid signal. However, in case the mono downmix signal does not correspond to the mid signal, a more generic residual signal can be used instead of the received side signal S.sub.0. Such a residual signal indicates the error associated with representing the original channels by their downmix and PS parameters and is often used in PS encoding schemes. In the following, the remarks on the use of the received side signal S.sub.0 apply also to a residual signal.

(29) The use of a residual signal in a PS encoder/decoder is e.g. described in the MPEG Surround standard (see document ISO/IEC 23003-1:2007, MPEG Surround) and in the paper "MPEG Surround - The ISO/MPEG Standard for Efficient and Compatible Multi-Channel Audio Coding", J. Herre et al., Audio Engineering Convention Paper 7084, 122nd Convention, May 5-8, 2007.

(30) FIG. 5 shows an embodiment of the PS encoder 7 and the PS decoder 8 of FIG. 4. The PS encoder module 7 comprises a downmix generator 9 and a PS parameter estimation stage 3. E.g. the downmix generator 9 may create a mono downmix DM which preferably corresponds to a mid signal M (e.g. DM=M=(L+R)/a) and may optionally also generate a second signal which corresponds to the received side signal S.sub.0=(L−R)/a.

(31) The PS parameter estimation stage 3 may estimate as PS parameters 5 the correlation and the level difference between the L and R inputs. Optionally, the parameter estimation stage receives the signal strength 6 which may be the signal power at the FM receiver. This information can be used to decide about the reliability of the PS parameters 5, e.g. in case of a low signal strength 6. In case of a low reliability the PS parameters 5 may be set such that the output signal L′, R′ is a mono output signal or a pseudo stereo output signal. In case of a mono output signal, the output signal L′ is equal to the output signal R′. In case of a pseudo stereo output signal, default PS parameters may be used to generate a pseudo or default stereo output signal L′, R′.
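
The reliability decision described above might be sketched as follows. The threshold `min_strength`, the parameter names `cld_db` and `icc`, and both fallback parameter sets are hypothetical; in particular, representing mono output by CLD = 0 dB and ICC = 1 is an assumption of this sketch, not a value given in the specification.

```python
# Assumed to make the two output channels identical (mono).
MONO_PARAMS = {"cld_db": 0.0, "icc": 1.0}
# Hypothetical default parameters for a pseudo stereo output.
DEFAULT_PSEUDO_STEREO = {"cld_db": 0.0, "icc": 0.5}


def select_ps_parameters(estimated, signal_strength, min_strength, fallback="mono"):
    """Use the estimated PS parameters only if reception is strong enough."""
    if signal_strength >= min_strength:
        return dict(estimated)
    if fallback == "mono":
        return dict(MONO_PARAMS)
    if fallback == "pseudo":
        return dict(DEFAULT_PSEUDO_STEREO)
    raise ValueError("fallback must be 'mono' or 'pseudo'")


estimated = {"cld_db": 3.0, "icc": 0.7}
good = select_ps_parameters(estimated, signal_strength=40.0, min_strength=20.0)
weak = select_ps_parameters(estimated, signal_strength=5.0, min_strength=20.0)
```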

(32) The PS decoder module 8 comprises a stereo mixing matrix 4a and a decorrelator 10. The decorrelator receives the mono downmix DM and generates a decorrelated signal S′ which is used as a pseudo side signal. The decorrelator 10 may be realized by an appropriate all-pass filter as discussed in section 4 of the cited document "Low Complexity Parametric Stereo Coding in MPEG-4". The stereo mixing matrix 4a is a 2×2 upmix matrix in this embodiment.
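
To illustrate the idea of an all-pass decorrelator, the sketch below uses a single Schroeder all-pass section. This is a generic textbook structure swapped in for illustration, not the filter of the cited document; the delay and gain values are arbitrary assumptions.

```python
def allpass_decorrelate(x, delay=3, g=0.5):
    """Schroeder all-pass: y[n] = -g*x[n] + x[n-delay] + g*y[n-delay].

    The magnitude response is flat, so the spectrum of the downmix is kept
    while its phase (and thus its waveform) is changed.
    """
    y = []
    for n, xn in enumerate(x):
        x_delayed = x[n - delay] if n >= delay else 0.0
        y_delayed = y[n - delay] if n >= delay else 0.0
        y.append(-g * xn + x_delayed + g * y_delayed)
    return y


# Impulse response of the filter; its energy stays (numerically) at 1.
impulse = [1.0] + [0.0] * 299
response = allpass_decorrelate(impulse)
energy = sum(v * v for v in response)
```

A practical decorrelator would typically combine several such sections with longer delays; the single section only shows the principle.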

(33) Dependent upon the estimated parameters 5, the matrix 4a mixes the DM signal with the received side signal S.sub.0 or the decorrelated signal S′ to create the stereo output signals L′ and R′. The selection between the signal S.sub.0 and the signal S′ may depend on a radio reception indicator indicative of the reception conditions, such as the signal strength 6. One may instead or in addition use a quality indicator indicative of the quality of the received side signal. One example of such a quality indicator may be an estimated noise (power) of the received side signal. In case of a side signal comprising a high degree of noise, the decorrelated signal S′ may be used to create the stereo output signal L′ and R′, whereas in low noise situations, the side signal S.sub.0 may be used. Various embodiments for estimating the noise of the received side signal are discussed later in this specification.

(34) As an example, in case of good reception conditions (i.e. the signal strength is high), the signal S.sub.0 is used for upmixing, whereas in case of bad conditions the upmixing is based on the decorrelated signal S′. Preferably, the decision whether the stereo mixing module 4 uses the received side signal S.sub.0 or the decorrelated signal S′ is frequency dependent, e.g. for lower frequencies the received side signal S.sub.0 is used and for higher frequencies the decorrelated signal S′ is used. This will be discussed in more detail in connection with FIG. 6.

(35) The frequency-variant or frequency-invariant selection between the signal S.sub.0 and the signal S′ may be done in the upmix stage 4 (e.g. by selector means in the upmix stage 4 which are controlled e.g. in dependency of the signal strength 6). Alternatively, the frequency-variant or frequency-invariant selection between the signal S.sub.0 and the signal S′ may be performed in the parameter estimation stage 3 (e.g. in dependency of the signal strength 6), and the parameter estimation stage 3 then sends upmix parameters to the upmix stage 4 that cause the respectively selected signal (either S.sub.0 or S′) to be used for the upmix, e.g. the upmix parameters relating to the signal S.sub.0 are set to zero and the parameters relating to S′ are not set to zero in case of selecting S′. Alternatively, a selection signal (not shown) may be sent to the upmix stage 4.

(36) The upmix operation is preferably carried out according to the following matrix equation:

(37) $\begin{pmatrix} L' \\ R' \end{pmatrix} = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} DM \\ S \end{pmatrix}$

(38) Here, the weighting factors α, β, γ, δ determine the weighting of the signals DM and S. The mono downmix DM preferably corresponds to the received mid signal. The signal S in the formula corresponds either to the decorrelated signal S′ or to the received side signal S.sub.0. The upmix matrix elements, i.e. the weighting factors α, β, γ, δ, may be derived e.g. as shown in the cited paper "Low Complexity Parametric Stereo Coding in MPEG-4" (see section 2.2), as shown in the cited MPEG-4 standardization document ISO/IEC 14496-3:2005 (see section 8.6.4.6.2) or as shown in MPEG Surround specification document ISO/IEC 23003-1 (see section 6.5.3.2). These sections of the documents (and also sections referred to in these sections) are hereby incorporated by reference for all purposes.
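
The matrix upmix can be sketched in a few lines of Python. The four weighting factors are named `alpha`, `beta`, `gamma`, `delta` here as an assumption for illustration, and they are simply passed in; deriving them from the PS parameters is covered by the cited documents and is not attempted in this sketch.

```python
def upmix(dm, s, alpha, beta, gamma, delta):
    """Apply a 2x2 upmix matrix per sample.

    Left output  = alpha * DM + beta  * S
    Right output = gamma * DM + delta * S
    """
    left_out = [alpha * d + beta * x for d, x in zip(dm, s)]
    right_out = [gamma * d + delta * x for d, x in zip(dm, s)]
    return left_out, right_out


# Arbitrary example signals and weights (hypothetical values).
dm = [0.5, -0.25]
s = [0.25, 0.5]
left_out, right_out = upmix(dm, s, alpha=1.0, beta=0.5, gamma=1.0, delta=-0.5)
```

In a time- and frequency-variant system the same operation would be applied per frequency band and per frame, each with its own set of weights.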

(39) Preferably, the selection between S′ and S.sub.0 is frequency dependent. This is shown in FIG. 6 indicating an exemplary structure of the signal S used for upmix. As indicated in FIG. 6, for lower frequencies the received side signal S.sub.0 is used for upmix and for higher frequencies the decorrelated signal S′ is used for upmix.
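
A minimal sketch of the frequency-dependent selection, assuming both candidate signals are already split into the same frequency bands ordered from low to high. The function name and the `crossover_band` parameter are hypothetical; how the crossover is chosen (e.g. from the signal strength) is left open here.

```python
def select_side_bands(s0_bands, s_decorr_bands, crossover_band):
    """Build the signal S used for upmix, band by band.

    Bands below crossover_band take the received side signal,
    bands at or above it take the decorrelated signal.
    """
    if len(s0_bands) != len(s_decorr_bands):
        raise ValueError("both signals must have the same number of bands")
    return [
        s0 if band < crossover_band else s_dec
        for band, (s0, s_dec) in enumerate(zip(s0_bands, s_decorr_bands))
    ]


# One value per band for brevity; real bands would hold sample blocks.
s0_bands = [0.4, 0.3, 0.2, 0.1]
s_decorr_bands = [-0.1, -0.2, -0.3, -0.4]
s_for_upmix = select_side_bands(s0_bands, s_decorr_bands, crossover_band=2)
```

Setting `crossover_band` to 0 uses only the decorrelated signal, and setting it to the number of bands uses only the received side signal, which covers the frequency-invariant cases.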

(40) If the received side signal S.sub.0 corresponds to S.sub.0=(L−R)/2 and L=M+S.sub.0 and R=M−S.sub.0, the mono downmix DM should preferably correspond to (L+R)/2; this allows perfect reconstruction, i.e. L′=L and R′=R.
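
The perfect-reconstruction property can be verified by substitution. Assuming, for illustration only, upmix weights of 1, 1, 1 and −1 (i.e. the side signal is added for the left output and subtracted for the right output), the outputs are:

$L' = DM + S_0 = \frac{L+R}{2} + \frac{L-R}{2} = L$
$R' = DM - S_0 = \frac{L+R}{2} - \frac{L-R}{2} = R$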

(41) Instead of using a PS upmixer using the received side signal S.sub.0, a generalized PS upmixer using a residual signal may be used. The resulting signals L′, R′ are a function of the PS parameters, the residual signal and the mono downmix.

(42) FIG. 7 shows an exemplary embodiment using noise reduction. As in FIG. 5, in FIG. 7 the signal S.sub.0 is optional. In case of having a signal S.sub.0, a common noise reduction algorithm may be used, which performs noise reduction of the DM and S.sub.0 signals. Alternatively, two differently configured noise reduction modules may be used, one for noise reduction of the signal DM and one for noise reduction of the signal S.sub.0. It is also possible that only one signal may be subject to noise reduction (e.g. the signal DM or the signal S.sub.0). In FIG. 7, the noise reduction stage 11 performs noise reduction of the signal DM, and the noise-reduced signal DM is fed to the PS decoder 8 and its internal upmix stage 4. Likewise, the noise reduction stage 11 performs noise reduction of the signal S.sub.0, and the noise-reduced signal S.sub.0 is fed to the PS decoder 8.

(43) FIG. 8 shows a further embodiment of the apparatus 2. Here, a noise reduction method 12 is applied to the stereo input signal, and the resulting noise-reduced L/R signal is thereafter analyzed by the PS parameter estimation stage 3 of the PS encoder 7. The noise reduction may be very aggressive and optimized for the PS parameter extraction, as the downmix signal DM takes another path not including the noise reduction stage 12.

(44) The mono downmix signal DM may be generated by adding the L, R channels with the same weighting factors (e.g. using weighting factors of 1 or using weighting factors of ½). The signal DM then corresponds to the received mid signal. When using weighting factors of ½, the amplitude of the signal DM is half of the amplitude of the signal DM obtained when using weighting factors of 1.

(45) Optionally, some form of noise reduction may be also applied to the signal L/R or the signal DM (and/or the S.sub.0 signal if used). E.g. some noise reduction may be applied to the signal DM (see the optional noise reduction stage 11 in FIG. 8). Preferably, this noise reduction stage is gentler than the aggressive noise reduction stage 12. The noise reduction stage 11 may be alternatively placed upstream of the downmix stage 9 (e.g. at the input of the apparatus 2 or directly before the downmix stage 9).

(46) In certain reception conditions, the FM receiver 1 only provides a mono signal, with the conveyed side signal being muted. This will typically happen when the reception conditions are very bad and the side signal is very noisy. In case the FM stereo receiver 1 has switched to mono playback of the stereo radio signal, the upmix stage preferably uses upmix parameters for blind upmix, such as preset upmix parameters, and generates a pseudo stereo signal, i.e. the upmix stage generates a stereo signal using the upmix parameters for blind upmix.

(47) There are also embodiments of the FM stereo receiver 1 which switch at too poor reception conditions to mono playback. If the reception conditions are too poor for estimation of reliable PS parameters 5, the upmix stage preferably uses upmix parameters for blind upmix and generates a pseudo stereo signal based thereon.

(48) FIG. 9 shows an embodiment for the pseudo-stereo generation in case of mono only output of the FM receiver 1. Here, a mono/stereo detector 13 is used to detect whether the input signal to the apparatus 2 is mono, i.e. whether the signals of the L and R channels are the same. In case of mono playback of the FM receiver 1, the mono/stereo detector 13 indicates to upmix to stereo using e.g. a PS decoder with fixed upmix parameters. In other words: in this case, the upmix stage 4 does not use PS parameters from the PS parameter estimation stage 3 (not shown in FIG. 9), but uses fixed upmix parameters (not shown in FIG. 9).

(49) Optionally, a speech detector 14 may be added to indicate if the received signal is predominantly speech or music. Such a speech detector 14 allows for signal dependent blind upmix. E.g. such a speech detector 14 may allow for signal dependent upmix parameters. Preferably, one or more upmix parameters may be used for speech and different one or more upmix parameters may be used for music. Such a speech detector 14 may be realized by a Voice Activity Detector (VAD). Strictly speaking, the upmix stage 4 in FIG. 9 comprises a decorrelator 10, a 2×2 upmix matrix 4a, and means to convert the output of the mono/stereo detector 13 and the speech detector 14 into some form of PS parameters used as input to the actual stereo upmix.
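
The control path of FIG. 9 might be sketched as below. The mono check compares the two channels sample by sample, and the blind upmix presets are hypothetical placeholder values. The speech/music decision itself is not implemented; it is assumed to be supplied by an external detector as the flag `is_speech`.

```python
# Hypothetical presets; real values would be tuned by listening tests.
BLIND_UPMIX_PRESETS = {
    "speech": {"cld_db": 0.0, "icc": 0.9},
    "music": {"cld_db": 0.0, "icc": 0.4},
}


def is_mono(left, right, tolerance=1e-6):
    """True if the L and R channels carry (almost) the same samples."""
    return all(abs(l - r) <= tolerance for l, r in zip(left, right))


def choose_upmix_parameters(left, right, estimated, is_speech):
    """Use estimated PS parameters for stereo input, presets for mono input."""
    if not is_mono(left, right):
        return estimated
    return BLIND_UPMIX_PRESETS["speech" if is_speech else "music"]


estimated = {"cld_db": 2.0, "icc": 0.6}
stereo_case = choose_upmix_parameters([0.5, 0.1], [0.2, 0.3], estimated, is_speech=False)
mono_speech = choose_upmix_parameters([0.5, 0.1], [0.5, 0.1], estimated, is_speech=True)
```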

(50) FIG. 10 illustrates a common problem when the audio signal provided by the FM receiver 1 toggles between stereo and mono due to time-variant bad reception conditions (e.g. fading). To maintain a stereo sound image during mono/stereo toggling, error concealment techniques may be used. Time intervals where concealment shall be applied are indicated by C in FIG. 10. An approach to concealment in PS coding is to use upmix parameters which are based on the previously estimated PS parameters in case that new PS parameters cannot be computed because the audio output of the FM receiver 1 dropped down to mono. E.g. the upmix stage 4 may continue to use the previously estimated PS parameters in case that new PS parameters cannot be computed because the audio output of the FM receiver 1 dropped down to mono. Thus, when the FM stereo receiver 1 switches to mono audio output, the stereo upmix stage 4 continues to use the previously estimated PS parameters from the PS parameter estimation stage 3. If the dropout periods in the stereo output are short enough so that the stereo sound image of the FM radio signal remains similar during a dropout period, the dropout is not audible or only scarcely audible in the audio output of the apparatus 2. Another approach may be to interpolate and/or extrapolate upmix parameters from previously estimated parameters. With respect to determination of upmix parameters based on the previously estimated PS parameters, one may, in light of the teachings herein also use other techniques known e.g. from error concealment mechanisms that can be used in audio decoders to mitigate the effect of transmission errors (e.g. corrupt or missing data).
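
The simplest concealment strategy described above, holding the last estimated parameters during a dropout, can be sketched as follows. The class name and the convention that `None` marks a frame without a usable estimate are assumptions of this sketch.

```python
class PsParameterConcealer:
    """Repeat the most recent valid PS parameters while none can be estimated."""

    def __init__(self, initial):
        self._last_good = dict(initial)

    def update(self, new_params):
        """Pass in the new estimate, or None during a mono dropout."""
        if new_params is not None:
            self._last_good = dict(new_params)
        return dict(self._last_good)


concealer = PsParameterConcealer({"cld_db": 0.0, "icc": 1.0})
frames = [{"cld_db": 3.0, "icc": 0.6}, None, None, {"cld_db": 1.0, "icc": 0.8}]
used = [concealer.update(frame) for frame in frames]
```

During the two dropout frames the upmix keeps using the parameters from the first frame, so the stereo image does not collapse; interpolation or extrapolation would replace the plain hold in `update`.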

(51) The same approach of using upmix parameters based on the previously estimated PS parameters can be also applied if the FM receiver 1 provides a noisy stereo signal during a short period of time, with the noisy stereo signal being too bad to estimate reliable PS parameters based thereon.

(52) In the following, an advanced PS parameter estimation stage 3 providing error compensation is discussed with reference to FIG. 11. In case of estimating PS parameters based on a stereo signal containing a noisy side component, there will be an error in the calculation of the PS parameters if conventional formulas for determining the PS parameters are used, such as for determining the CLD parameter (Channel Level Differences) and the ICC parameter (Inter-channel Cross-Correlation).

(53) When assuming that the noise in the side signal is independent of the mid signal: the ICC values get closer to 0 in comparison to the ICC values estimated based on a noiseless stereo signal, and the CLD values in decibel get closer to 0 dB in comparison to the CLD values estimated based on a noiseless stereo signal.

(54) For compensation of the error in the PS parameters the apparatus 2 preferably has a noise estimate stage which is configured to determine a noise parameter characteristic for the power of the noise of the received side signal that was caused by the (bad) radio transmission. The noise parameter is considered when estimating the PS parameters. This may be implemented as shown in FIG. 11.

(55) According to FIG. 11, the signal strength data 6 may be used for at least partly compensating the error. The signal strength 6 is often available in FM radio receivers. The signal strength 6 is input to the parameter analyzing stage 3 in the PS encoder 7. In a side signal noise power estimation stage 15, the signal strength value 6 may be converted to a side signal noise power estimate N.sup.2, with N.sup.2=E(n.sup.2), where E( ) is the expectation operator. As an alternative to the signal strength 6 or in addition to the signal strength 6, the audio signal L, R may be used for estimating the signal noise power as will be discussed later on.
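
One conceivable way to convert the signal strength into a noise power estimate is a lookup table with linear interpolation. The table values and the unit of the signal strength below are purely hypothetical; in practice such a table would come from measurements.

```python
from bisect import bisect_right

# Hypothetical table of (signal strength, side noise power), sorted by strength.
NOISE_TABLE = [(0.0, 1.0), (10.0, 0.5), (20.0, 0.0)]


def noise_power_from_strength(strength, table=NOISE_TABLE):
    """Linearly interpolate the noise power, clamping outside the table range."""
    strengths = [x for x, _ in table]
    if strength <= strengths[0]:
        return table[0][1]
    if strength >= strengths[-1]:
        return table[-1][1]
    i = bisect_right(strengths, strength)
    (x0, y0), (x1, y1) = table[i - 1], table[i]
    return y0 + (y1 - y0) * (strength - x0) / (x1 - x0)


n2_weak = noise_power_from_strength(5.0)     # between the first two entries
n2_strong = noise_power_from_strength(25.0)  # clamped to the last entry
```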

(56) The actual noisy stereo input signal values l.sub.w/noise and r.sub.w/noise, which are input to the inner PS parameter estimation stage 3 shown in FIG. 11, can be expressed in dependency of the respective values l.sub.w/o noise and r.sub.w/o noise without noise and the noise values n of the received side signal values:
$l_{\text{w/noise}} = m + (s + n) = l_{\text{w/o noise}} + n$
$r_{\text{w/noise}} = m - (s + n) = r_{\text{w/o noise}} - n$

(57) It should be noted that here the received side signal is modeled as s+n, where s is the original (undistorted) side signal, and n is the noise (distortion signal) caused by the radio transmission channel. Furthermore, it is assumed here that the signal m is not distorted by noise from the radio transmission channel.

(58) Thus, the corresponding input powers L.sub.w/noise.sup.2, R.sub.w/noise.sup.2 and the cross correlation L.sub.w/noiseR.sub.w/noise can be written as:
$L_{\text{w/noise}}^2 = E(l_{\text{w/noise}}^2) = E((m+s)^2) + E(n^2) = L_{\text{w/o noise}}^2 + N^2$
$R_{\text{w/noise}}^2 = E(r_{\text{w/noise}}^2) = E((m-s)^2) + E(n^2) = R_{\text{w/o noise}}^2 + N^2$
$L_{\text{w/noise}} R_{\text{w/noise}} = E(l_{\text{w/noise}} \cdot r_{\text{w/noise}}) = E((l_{\text{w/o noise}} + n) \cdot (r_{\text{w/o noise}} - n)) = L_{\text{w/o noise}} R_{\text{w/o noise}} - N^2$
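where N.sup.2=E(n.sup.2) is the side signal noise power estimate and E( ) is the expectation operator. The cross terms between n and the signals m and s vanish under the assumption that the noise n is uncorrelated with m and s.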

(59) By rearranging the above equations, the corresponding compensated powers and cross-correlation without noise can be determined to be:
L.sub.w/o noise.sup.2=L.sub.w/noise.sup.2−N.sup.2
R.sub.w/o noise.sup.2=R.sub.w/noise.sup.2−N.sup.2
L.sub.w/o noiseR.sub.w/o noise=L.sub.w/noiseR.sub.w/noise+N.sup.2

(60) An error-compensated PS parameter extraction based on the compensated powers and cross correlation may be carried out as given by the formulas below:
CLD=10·log.sub.10(L.sub.w/o noise.sup.2/R.sub.w/o noise.sup.2)
ICC=(L.sub.w/o noiseR.sub.w/o noise)/(L.sub.w/o noise.sup.2+R.sub.w/o noise.sup.2)

(61) Such a parameter extraction compensates for the estimated N.sup.2 term in the calculation of the PS parameters.
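
The compensation and parameter extraction described above can be illustrated with a minimal Python sketch. It is not the implementation of the apparatus 2; it only mirrors the formulas given above. The function name, the use of a sample mean in place of the expectation operator E( ), and the small positive floor that keeps the compensated powers above zero are assumptions introduced for illustration. The ICC expression follows the formula exactly as given above.

```python
import math


def compensated_ps_parameters(l_noisy, r_noisy, noise_power, floor=1e-12):
    """Return (CLD in dB, ICC) after subtracting the side noise power N^2.

    l_noisy, r_noisy: equally long sequences of noisy left/right samples.
    noise_power: estimate of N^2 = E(n^2).
    floor: hypothetical lower bound that keeps the compensated powers positive.
    """
    count = len(l_noisy)
    # Sample means stand in for the expectation operator E( ).
    power_l = sum(x * x for x in l_noisy) / count
    power_r = sum(y * y for y in r_noisy) / count
    cross_lr = sum(x * y for x, y in zip(l_noisy, r_noisy)) / count

    # Compensation: powers lose N^2, the cross-correlation gains N^2.
    power_l_clean = max(power_l - noise_power, floor)
    power_r_clean = max(power_r - noise_power, floor)
    cross_lr_clean = cross_lr + noise_power

    cld = 10.0 * math.log10(power_l_clean / power_r_clean)
    icc = cross_lr_clean / (power_l_clean + power_r_clean)
    return cld, icc
```

For example, with a constant mid signal m=1, a constant side signal s=0.5 and an alternating noise sequence of +0.1 and −0.1 (so that N.sup.2=0.01), the compensated powers equal those of the noise-free signals l=1.5 and r=0.5.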

(62) In FIG. 11, the side signal noise power estimation stage 15 is configured to derive the noise power estimate N.sup.2 based on the signal strength 6 and/or the audio input signals (L and R). The noise power estimate N.sup.2 can be both frequency-variant and time-variant.

(63) A variety of methods can be used for determining the side signal noise power N.sup.2, e.g.:
When detecting power minima of the mid signal (e.g. pauses in speech), it can be assumed that the power of the side signal is noise only (i.e. the power of the side signal corresponds to N.sup.2 in these situations).
The N.sup.2 estimate can be defined by a function of the signal strength data 6. The function (or lookup table) can be designed by experimental (physical) measurements.
The N.sup.2 estimate can be defined by a function of the signal strength data 6 and/or the audio input signals (L and R). The function can be designed by heuristic rules.
The N.sup.2 estimate can be based on studying the signal type coherence of the mid and side signals. The original mid and side signals can e.g. be assumed to have similar tonality-to-noise ratio or crest factor or other power envelope characteristics. Deviations of those properties can be used to indicate a high level of N.sup.2.
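
The first of the approaches listed above (power minima of the mid signal) can be sketched as follows. This is a simplified illustration, not the estimator of stage 15. The function name, the frame-wise power inputs and the relative threshold `pause_ratio` used to decide what counts as a power minimum are hypothetical.

```python
def estimate_side_noise_power(mid_powers, side_powers, pause_ratio=0.1):
    """Estimate N^2 from frames in which the mid signal is nearly silent.

    mid_powers, side_powers: per-frame powers of the mid and side signals.
    pause_ratio: hypothetical threshold, relative to the largest mid power,
        below which a frame is treated as a pause.
    Returns the average side power over the pause frames, or None if no
    frame qualifies as a pause.
    """
    threshold = pause_ratio * max(mid_powers)
    pause_side_powers = [
        side for mid, side in zip(mid_powers, side_powers) if mid <= threshold
    ]
    if not pause_side_powers:
        return None  # no pause detected, so no estimate from this method
    # In pauses the side signal is assumed to consist of noise only.
    return sum(pause_side_powers) / len(pause_side_powers)
```

In a frequency-variant estimator, the same function could be applied separately to the frame powers of each frequency band.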

(64) In the following, further preferred embodiments of the audio processing apparatus 2 are discussed.

(65) Preferably, the apparatus 2 is configured in such a way that for received side signals with practically only noise, the apparatus 2 smoothly switches to pseudo stereo (blind upmix) operation, as illustrated in FIGS. 9 and 10. This allows a pseudo stereo signal to be output by the apparatus 2 in case the FM receiver 1 has switched to mono operation (due to the high level of noise caused by bad reception conditions) or in case the side signal portion in the stereo signal at the input of the apparatus 2 is so noisy that reliable PS parameters cannot be estimated.

(66) For side signals with almost no noise, the apparatus 2 preferably switches smoothly to normal stereo operation instead of parametric stereo operation. In normal stereo operation, the signal improvement functionality of the apparatus 2 is essentially deactivated. For deactivation, the audio signal at the input of the apparatus 2 may be essentially fed through to the output of the apparatus 2.

(67) Alternatively, the normal stereo operation may be accomplished by using the received side signal S.sub.0, as illustrated in FIG. 4 and FIG. 6: For normal stereo operation, the received side signal S.sub.0 is used for mixing in the upmix stage 4. When appropriately selecting the upmix parameters in the upmix stage 4, the output signal L, R of the upmix stage 4 corresponds to the output signal L, R of the FM receiver 1, e.g. when mixing the mono downmix DM and the received side signal S.sub.0 according to:
L=DM+S.sub.0, R=DM−S.sub.0,
in case DM=M=(L+R)/2 and S.sub.0=(L−R)/2.
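
The following short Python sketch illustrates why this choice of upmix reproduces the input signal: the fixed downmix and the sum/difference upmix are exact inverses of each other. The function names are chosen for illustration only.

```python
def downmix_mid_side(left, right):
    """Fixed downmix: DM = M = (L + R) / 2 and S0 = (L - R) / 2."""
    dm = [(l + r) / 2 for l, r in zip(left, right)]
    s0 = [(l - r) / 2 for l, r in zip(left, right)]
    return dm, s0


def upmix_normal_stereo(dm, s0):
    """Normal stereo upmix: L = DM + S0 and R = DM - S0."""
    left = [d + s for d, s in zip(dm, s0)]
    right = [d - s for d, s in zip(dm, s0)]
    return left, right
```

Applying `upmix_normal_stereo` to the result of `downmix_mid_side` returns the original left and right samples, so in this mode the apparatus is transparent apart from any noise already present in S.sub.0.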

(68) More preferably, the normal stereo mode or the parametric stereo mode may be selected in a frequency-variant manner, i.e. the selection may be different for the different frequency bands. This is useful since the signal-to-noise ratio for the received side signal gets worse for higher frequencies.

(69) The smooth switching between different operation modes may be adapted dynamically to the current reception conditions, in order to always provide the best possible stereo signal at the output of the apparatus 2. In case of a high signal-to-noise ratio, normal FM stereo operation (without noise reduction based on PS processing) is preferred, whereas in case of a low signal-to-noise ratio, PS processing greatly improves the stereo signal.
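
One conceivable way to realize such a smooth, frequency-variant switching is a per-band crossfade controlled by an estimate of the side signal's signal-to-noise ratio. The sketch below is only an illustration under stated assumptions: the linear ramp, the threshold values `low_db` and `high_db`, the function names, and the simplification of blending two candidate side signal values per band are all hypothetical and are not taken from the embodiments described here.

```python
def stereo_blend_weight(snr_db, low_db=10.0, high_db=30.0):
    """Map a band SNR in dB to a weight in [0, 1].

    0.0 selects parametric stereo only, 1.0 selects normal stereo only.
    low_db and high_db are hypothetical thresholds of a linear ramp.
    """
    if snr_db <= low_db:
        return 0.0
    if snr_db >= high_db:
        return 1.0
    return (snr_db - low_db) / (high_db - low_db)


def blend_side_per_band(received_side, ps_side, band_snr_db):
    """Crossfade, band by band, between the received side signal value and
    a side signal value reconstructed by parametric stereo."""
    blended = []
    for s0, s_ps, snr_db in zip(received_side, ps_side, band_snr_db):
        weight = stereo_blend_weight(snr_db)
        blended.append(weight * s0 + (1.0 - weight) * s_ps)
    return blended
```

Because the signal-to-noise ratio of the received side signal tends to be worse at higher frequencies, such a scheme would typically end up using the received side signal in lower bands and PS processing in higher bands.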

(70) Preferably, the generation of the mono downmix DM in the PS encoder 7 should be done such that as little noise as possible from the side signal leaks into the mono downmix DM. This can require different downmix techniques than those typically used in a PS encoder (such as an MPEG-4 PS encoder), which is normally employed in the context of a very low bitrate coding system. This can be as simple as a fixed (non-adaptive) downmix DM=M=(L+R)/2, where the downmix simply corresponds to the mid signal. Furthermore, the upmix in the PS decoder 8 is typically adapted to the actual downmix technique used in the PS encoder 7.

(71) It should be noted that although in several drawings the PS encoder 7 and the PS decoder 8 are shown as separate modules, it is of course advantageous in the context of an efficient implementation to merge the PS encoder 7 and the PS decoder 8 as much as possible.

(72) The concepts discussed herein can be implemented in connection with any encoder using PS techniques, e.g. an HE-AAC v2 (High-Efficiency Advanced Audio Coding version 2) encoder as defined in the standard ISO/IEC 14496-3 (MPEG-4 Audio), an encoder based on MPEG Surround or an encoder based on MPEG USAC (Unified Speech and Audio coder) as well as encoders which are not covered by MPEG standards.

(73) In the following, by way of example, an HE-AAC v2 encoder is assumed; nevertheless, the concepts may be used in connection with any audio encoder using PS techniques.

(74) HE-AAC is a lossy audio compression scheme. HE-AAC v1 (HE-AAC version 1) makes use of spectral band replication (SBR) to increase the compression efficiency. HE-AAC v2 further includes parametric stereo to enhance the compression efficiency of stereo signals at very low bitrates. An HE-AAC v2 encoder inherently includes a PS encoder to allow operation at very low bitrates. The PS encoder of such an HE-AAC v2 encoder can be used as the PS encoder 7 of the audio processing apparatus 2. In particular, the PS parameter estimating stage within a PS encoder of an HE-AAC v2 encoder can be used as the PS parameter estimating stage 3 of the audio processing apparatus 2. Also the downmix stage within a PS encoder of an HE-AAC v2 encoder can be used as the downmix stage 9 of the apparatus 2.

(75) Hence, the concept discussed in this specification can be efficiently combined with an HE-AAC v2 encoder to realize an improved FM stereo radio receiver. Such an improved FM stereo radio receiver may have an HE-AAC v2 recording feature since the HE-AAC v2 encoder outputs an HE-AAC v2 bitstream which can be stored for recording purposes. This is shown in FIG. 12. In this embodiment, the apparatus 2 comprises an HE-AAC v2 encoder 16 and the PS decoder 8. The HE-AAC v2 encoder provides the PS encoder 7 used for generating the mono downmix DM and the PS parameters 5 as discussed in connection with the previous drawings.

(76) Optionally, the PS encoder 7 may be modified for the purpose of FM radio noise reduction to support a fixed downmix scheme, such as a downmix scheme according to DM=(L+R)/a.

(77) The mono downmix DM and the PS parameters 5 may be fed to the PS decoder 8 to generate the stereo signal L, R as discussed above. The mono downmix DM is fed to an HE-AAC v1 encoder 17 for perceptual encoding of the mono downmix DM. The resulting perceptually encoded audio signal and the PS information are multiplexed into an HE-AAC v2 bitstream 18. For recording purposes, the HE-AAC v2 bitstream 18 can be stored in a memory such as a flash memory or a hard disk.

(78) The HE-AAC v1 encoder 17 comprises an SBR encoder and an AAC encoder (not shown). The SBR encoder typically performs signal processing in the QMF (quadrature mirror filterbank) domain and thus needs QMF samples. In contrast, the AAC encoder typically needs time domain samples (typically downsampled by a factor of 2).

(79) The PS encoder 7 within the HE-AAC v2 encoder 16 typically provides the downmix signal DM already in the QMF domain.

(80) Since the PS encoder 7 may already send the QMF domain signal DM to the HE-AAC v1 encoder 17, the QMF analysis transform in the HE-AAC v1 encoder for the SBR analysis becomes unnecessary. Thus, the QMF analysis that is normally part of the HE-AAC v1 encoder can be avoided by providing the downmix signal DM as QMF samples. This reduces the computing effort and the implementation complexity.

(81) The time domain samples for the AAC encoder may be derived from the input of the apparatus 2, e.g. by performing the simple operation DM=(L+R)/2 in the time domain and by downsampling the time domain signal DM. This is probably the cheapest approach. Alternatively, the apparatus 2 may perform a half-rate QMF synthesis of the QMF domain DM samples.
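
The time domain variant can be sketched in Python as follows. The sketch is deliberately crude: averaging each pair of adjacent samples acts as a 2-tap low-pass filter followed by decimation by a factor of 2, and merely stands in for a proper anti-aliasing filter, which a real implementation would need. The function name and the choice of filter are assumptions made for illustration.

```python
def downmix_and_halve_rate(left, right):
    """Compute DM = (L + R) / 2 in the time domain and downsample by 2.

    Each output sample is the average of two adjacent DM samples, i.e. a
    crude 2-tap low-pass filter followed by decimation (placeholder for a
    proper anti-aliasing filter). A trailing unpaired sample is dropped.
    """
    dm = [(l + r) / 2 for l, r in zip(left, right)]
    return [(dm[i] + dm[i + 1]) / 2 for i in range(0, len(dm) - 1, 2)]
```

For four input sample pairs, the function returns two downmix samples at half the input sampling rate.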

(82) It should be noted that the PS encoder and PS decoder can be partly merged if both are implemented in the same module.

CPC classification: H04S1/007, H04S3/00, H04H40/45, H04S2420/03, H04S1/00, G10L19/008, H04S5/00, H04H40/81, H04H40/72

International classification: H04S3/00, H04S1/00, H04S5/00, G10L19/008, H04H40/45, H04H40/72, H04H40/81
