POST FILTER FOR AUDIO SIGNALS
20190214035 · 2019-07-11
Assignee
Inventors
Cpc classification
G10L19/09
PHYSICS
G10L19/12
PHYSICS
G10L19/22
PHYSICS
G10L19/265
PHYSICS
G10L19/125
PHYSICS
G10L19/20
PHYSICS
G10L19/107
PHYSICS
G10L19/02
PHYSICS
International classification
G10L19/22
PHYSICS
G10L19/125
PHYSICS
G10L19/02
PHYSICS
G10L19/09
PHYSICS
G10L19/20
PHYSICS
G10L19/12
PHYSICS
Abstract
In some embodiments, a pitch filter for filtering a preliminary audio signal generated from an audio bitstream is disclosed. The pitch filter has an operating mode selected from one of either: (i) an active mode where the preliminary audio signal is filtered using filtering information to obtain a filtered audio signal, and (ii) an inactive mode where the pitch filter is disabled. The preliminary audio signal is generated in an audio encoder or audio decoder having a coding mode selected from at least two distinct coding modes, and the pitch filter is capable of being selectively operated in either the active mode or the inactive mode while operating in the coding mode based on control information.
Claims
1. A decoder system for decoding a bit stream signal as an audio time signal, the decoder system including: a decoding section for decoding the bit stream signal as a preliminary audio time signal, wherein the decoding section comprises a code-excited linear prediction, CELP, decoding module and a transform-coded excitation, TCX, decoding module; and an interharmonic noise attenuation post filter adapted to receive the preliminary audio time signal, and to supply the audio time signal, wherein the post filter comprises a control section for selectively operating the post filter in one of the following modes: i) a filtering mode, wherein the post filter filters the preliminary audio time signal to obtain a filtered signal and supplies the filtered signal as the audio time signal; and ii) a pass-through mode, wherein the post filter supplies the preliminary audio time signal as the audio time signal, wherein the interharmonic noise attenuation depends on a value of a variable gain and on pitch information included in the bit stream signal.
2. The decoder system of claim 1, wherein the decoding section selectively operates in one of the following modes: a) the TCX module is enabled and the post filter is operated in the pass-through mode; b) the CELP module is enabled and, in response to a post-filtering signal, the post filter is operated in the filtering mode; and c) the CELP module is enabled and, in response to the post-filtering signal, the post filter is operated in the pass-through mode.
3. The decoder system of claim 2, the decoding section further comprising an Advanced Audio Coding, AAC, decoding module for decoding a bit stream signal as an audio time signal, the control section being adapted to operate the decoder also in the following mode: d) the AAC module is enabled and the post filter is disabled.
4. The decoder system of claim 1, wherein the post filter is adapted to attenuate only such spectral components which are located below a predetermined cut-off frequency.
5. The decoder system of claim 1, wherein the bit stream signal is a Moving Pictures Experts Group, MPEG, bit stream and is segmented into time frames and the control section is adapted to disable an entire time frame or a sequence of entire time frames; and the control section is further adapted to receive, for each time frame, a data field associated with this time frame and is operable, responsive to the value of the data field, to disable the post filter, whereby the preliminary audio time signal is output as the audio time signal.
6. The decoder system of claim 1, wherein the control section is operable to enable the pass-through mode by setting the value of the variable gain to zero.
7. A method of decoding a bit stream signal as an audio time signal, comprising: decoding the bit stream signal as a preliminary audio time signal in one of a plurality of decoding modes, the plurality of decoding modes comprising code-excited linear prediction, CELP, and transform-coded excitation, TCX, decoding modes; and filtering the preliminary audio time signal with an interharmonic noise attenuation post-filter to obtain the audio time signal, wherein the post-filter comprises a control section for selectively operating the post-filter in one of the following modes: i) a filtering mode, wherein the post filter filters the preliminary audio time signal to obtain a filtered signal and supplies the filtered signal as the audio time signal; and ii) a pass-through mode, wherein the post-filter supplies the preliminary audio time signal as the audio time signal, wherein the interharmonic noise attenuation depends on a value of a variable gain and on pitch information included in the bit stream signal.
8. The method of claim 7, wherein decoding the bit stream signal as an audio time signal comprises selectively operating in one of the following modes: a) enabling the TCX decoding mode and operating the post-filter in the pass-through mode; b) enabling the CELP decoding mode and, in response to a post-filtering signal, operating the post-filter in the filtering mode; and c) enabling the CELP decoding mode and, in response to the post-filtering signal, operating the post-filter in the pass-through mode.
9. The method of claim 8, the decoding modes further comprising an Advanced Audio Coding, AAC, decoding mode for decoding a bit stream signal as an audio time signal, the control section being adapted to operate the decoder also in the following mode: d) the AAC decoding mode is enabled and the post filter is disabled.
10. The method of claim 7, wherein the post filter is adapted to attenuate only such spectral components which are located below a predetermined cut-off frequency.
11. The method of claim 7, wherein the bit stream signal is a Moving Pictures Experts Group, MPEG, bit stream and is segmented into time frames and the control section is adapted to disable an entire time frame or a sequence of entire time frames; and the control section is further adapted to receive, for each time frame, a data field associated with this time frame and is operable, responsive to the value of the data field, to disable the post filter, whereby the preliminary audio time signal is output as the audio time signal.
12. The method of claim 7, wherein the control section is operable to enable the pass-through mode by setting the value of the variable gain to zero.
13. A non-transitory computer readable storage medium containing a program of instructions, which when executed by one or more processors, cause one or more devices to perform a method of decoding a bit stream signal as an audio time signal, the method comprising: decoding the bit stream signal as a preliminary audio time signal in one of a plurality of decoding modes, the plurality of decoding modes comprising code-excited linear prediction, CELP, and transform-coded excitation, TCX, decoding modes; and filtering the preliminary audio time signal with an interharmonic noise attenuation post-filter to obtain the audio time signal, wherein the post-filter comprises a control section for selectively operating the post-filter in one of the following modes: i) a filtering mode, wherein the post filter filters the preliminary audio time signal to obtain a filtered signal and supplies the filtered signal as the audio time signal; and ii) a pass-through mode, wherein the post-filter supplies the preliminary audio time signal as the audio time signal, wherein the interharmonic noise attenuation depends on a value of a variable gain and on pitch information included in the bit stream signal.
14. The medium of claim 13, wherein decoding the bit stream signal as an audio time signal comprises selectively operating in one of the following modes: a) enabling the TCX decoding mode and operating the post-filter in the pass-through mode; b) enabling the CELP decoding mode and, in response to a post-filtering signal, operating the post-filter in the filtering mode; and c) enabling the CELP decoding mode and, in response to the post-filtering signal, operating the post-filter in the pass-through mode.
15. The medium of claim 14, the decoding modes further comprising an Advanced Audio Coding, AAC, decoding mode for decoding a bit stream signal as an audio time signal, the control section being adapted to operate the decoder also in the following mode: d) the AAC decoding mode is enabled and the post filter is disabled.
16. The medium of claim 13, wherein the post filter is adapted to attenuate only such spectral components which are located below a predetermined cut-off frequency.
17. The medium of claim 13, wherein the bit stream signal is a Moving Pictures Experts Group, MPEG, bit stream and is segmented into time frames and the control section is adapted to disable an entire time frame or a sequence of entire time frames; and the control section is further adapted to receive, for each time frame, a data field associated with this time frame and is operable, responsive to the value of the data field, to disable the post filter, whereby the preliminary audio time signal is output as the audio time signal.
18. The medium of claim 13, wherein the control section is operable to enable the pass-through mode by setting the value of the variable gain to zero.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0041] Embodiments of the present invention will now be described with reference to the accompanying drawings, on which:
[0042]
[0043]
[0044]
[0045]
[0046]
[0047]
[0048]
[0049]
DETAILED DESCRIPTION OF EMBODIMENTS
[0050]
[0051]
[0052]
[0053] Information contained in the bit stream controls which decoding module is to be active. According to the invention, however, the pitch enhancement module 740 performs an analogous self-actuation: responsive to post filtering information in the bit stream, it may act as a post filter or simply as a pass-through. This may for instance be realized through the provision of a control section (not shown) in the pitch enhancement module 740, by means of which the post filtering action can be turned on or off. The pitch enhancement module 740 is always in its pass-through mode when the decoder system operates in the frequency-domain or TCX decoding mode, in which case, strictly speaking, no post filtering information is necessary. It is understood that modules not forming part of the inventive contribution and whose presence is obvious to the skilled person, e.g., a demultiplexer, have been omitted from
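The mode selection described in this paragraph can be illustrated by a minimal sketch. The names `CodingMode`, `PostFilterMode` and `select_post_filter_mode` are hypothetical and do not appear in the disclosure, and the boolean `post_filter_flag` merely stands in for the post filtering information carried in the bit stream, whose actual syntax is not specified here.

```python
from enum import Enum


class CodingMode(Enum):
    AAC = "aac"    # frequency-domain decoding mode
    TCX = "tcx"    # transform-coded excitation
    CELP = "celp"  # code-excited linear prediction


class PostFilterMode(Enum):
    FILTERING = "filtering"
    PASS_THROUGH = "pass_through"


def select_post_filter_mode(coding_mode, post_filter_flag):
    """Hypothetical control-section logic for the pitch enhancement module.

    In the frequency-domain and TCX modes the module is always a
    pass-through, so the flag is ignored. In the CELP mode the flag
    (an assumed stand-in for the post filtering information in the
    bit stream) decides whether filtering takes place.
    """
    if coding_mode is not CodingMode.CELP:
        return PostFilterMode.PASS_THROUGH
    if post_filter_flag:
        return PostFilterMode.FILTERING
    return PostFilterMode.PASS_THROUGH
```

In this sketch the flag is simply disregarded outside the CELP mode, which mirrors the remark that no post filtering information is strictly necessary there.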
[0054] As a variation, the decoder system of
\[ s_{\mathrm{ORIG}}(n) - s_{E}(n) = s_{\mathrm{ORIG}}(n) - \left( s_{\mathrm{DEC}}(n) - \alpha\,[s_{\mathrm{DEC}} * p_{\mathrm{LT}} * h_{\mathrm{LP}}](n) \right), \]
where $\alpha$ is the post filter gain. By studying the total energy, low-band energy, tonality, actual magnitude spectrum or past magnitude spectra of this signal, as disclosed in the Summary section and the claims, the control section may find a basis for the decision whether to activate or deactivate the pitch enhancement module 740.
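As an illustration only, the difference signal between the original signal and the post-filtered signal may be sketched as below. The excerpt does not define the long-term prediction filter or the low-pass filter, so the sketch assumes a symmetric three-tap form 1 - 0.5 z^T - 0.5 z^(-T) for the former (T being the pitch lag in samples) and accepts arbitrary FIR taps for the latter. All function names and parameters (`pitch_lag`, `gain`, `taps`) are hypothetical.

```python
def long_term_residual(s, pitch_lag):
    """Apply the ASSUMED long-term filter 1 - 0.5 z^T - 0.5 z^-T.

    Samples outside the buffer are treated as zero.
    """
    n_total = len(s)

    def at(i):
        return s[i] if 0 <= i < n_total else 0.0

    return [at(n) - 0.5 * at(n - pitch_lag) - 0.5 * at(n + pitch_lag)
            for n in range(n_total)]


def low_pass(x, taps):
    """Causal FIR filtering with hypothetical taps, truncated to len(x)."""
    return [sum(taps[k] * x[n - k] for k in range(len(taps)) if n - k >= 0)
            for n in range(len(x))]


def post_filter(s_dec, pitch_lag, gain, taps):
    """Post-filtered signal: decoded signal minus gain times the
    low-passed long-term residual."""
    removed = low_pass(long_term_residual(s_dec, pitch_lag), taps)
    return [d - gain * r for d, r in zip(s_dec, removed)]


def difference_signal(s_orig, s_e):
    """Sample-wise difference between original and post-filtered signal."""
    return [o - e for o, e in zip(s_orig, s_e)]


def energy(x):
    """Total energy, one of the quantities the control section may study."""
    return sum(v * v for v in x)
```

With a gain of zero, `post_filter` returns the decoded signal unchanged, which corresponds to enabling the pass-through mode by setting the variable gain to zero.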
[0055]
[0056] Preferably, the decision module 820 bases its decision on an approximate difference signal computed from an intermediate decoded signal $s_{\mathrm{i\_DEC}}$, which can be extracted from the encoding module 810. The intermediate decoded signal represents an intermediate stage in the decoding process, as discussed in preceding paragraphs, but may be extracted from a corresponding stage of the encoding process. However, in the encoder system 800 the original audio signal $s_{\mathrm{ORIG}}$ is available so that, advantageously, the approximate difference signal is formed as:
\[ s_{\mathrm{ORIG}}(n) - \left( s_{\mathrm{i\_DEC}}(n) - \alpha\,[(s_{\mathrm{i\_DEC}} * p_{\mathrm{LT}}) * h_{\mathrm{LP}}](n) \right). \]
The approximation resides in the fact that the intermediate decoded signal is used in lieu of the final decoded signal. This enables an appraisal of the nature of the component that a post filter would remove at decoding, and by applying one of the criteria discussed in the Summary section, the decision module 820 will be able to take a decision whether to disable post filtering.
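A minimal sketch of how an encoder-side decision could use the approximate difference signal follows. The criterion shown, comparing the error energy with and without post filtering, is an assumption chosen for illustration and is not necessarily one of the criteria of the Summary section. The argument `lt_lp_component` is a hypothetical name for the precomputed intermediate decoded signal after long-term prediction filtering and low-pass filtering.

```python
def approximate_difference(s_orig, s_i_dec, lt_lp_component, gain):
    """Original minus (intermediate decoded signal minus gain times
    the long-term-filtered, low-passed component)."""
    return [o - (d - gain * c)
            for o, d, c in zip(s_orig, s_i_dec, lt_lp_component)]


def should_disable_post_filter(s_orig, s_i_dec, lt_lp_component, gain):
    """ASSUMED criterion: disable post filtering when it would increase
    the energy of the error relative to the original signal."""
    with_filter = sum(
        v * v
        for v in approximate_difference(s_orig, s_i_dec, lt_lp_component, gain))
    # A gain of zero models decoding without post filtering.
    without_filter = sum(
        v * v
        for v in approximate_difference(s_orig, s_i_dec, lt_lp_component, 0.0))
    return with_filter > without_filter
```

The decision would then be signalled to the decoder as post filtering information; how that information is encoded is outside the scope of this sketch.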
[0057] As a variation to this, the decision module 820 may use the original signal in place of an intermediate decoded signal, so that the approximate difference signal will be $[(s_{\mathrm{i\_DEC}} * p_{\mathrm{LT}}) * h_{\mathrm{LP}}](n)$. This is likely to be a less faithful approximation but on the other hand makes the presence of a connection line 816 between the decision module 820 and the encoding module 810 optional.
[0058] In such other variations of this embodiment where the decision module 820 studies the audio signal directly, one or more of the following criteria may be applied:
[0059] Does the audio signal contain both a component with dominant fundamental frequency and a component located below the fundamental frequency? (The fundamental frequency may be supplied as a by-product of the encoding module 810.)
[0060] Does the audio signal contain both a component with dominant fundamental frequency and a component located between the harmonics of the fundamental frequency?
[0061] Does the audio signal contain significant signal energy below the fundamental frequency?
[0062] Is post-filtered decoding (likely to be) preferable to unfiltered decoding with respect to rate-distortion optimality?
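The criterion concerning signal energy below the fundamental frequency can be illustrated with the following sketch, which estimates the fraction of spectral energy lying below a given fundamental frequency by means of a direct discrete Fourier transform. The function names and the threshold value are hypothetical. The disclosure does not state what amount of energy counts as significant, and the fundamental frequency `f0` is assumed to be supplied externally, for instance by the encoding module.

```python
import cmath
import math


def energy_fraction_below(signal, sample_rate, f0):
    """Fraction of spectral energy (DC and Nyquist excluded) that lies
    strictly below the fundamental frequency f0, via a naive DFT."""
    n = len(signal)
    total = 0.0
    below = 0.0
    # Positive-frequency bins only, skipping DC (k = 0) and Nyquist.
    for k in range(1, (n - 1) // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal))
        power = abs(coeff) ** 2
        total += power
        if k * sample_rate / n < f0:
            below += power
    return below / total if total > 0.0 else 0.0


def has_significant_low_energy(signal, sample_rate, f0, threshold=0.1):
    """ASSUMED decision rule; the threshold of 0.1 is illustrative only."""
    return energy_fraction_below(signal, sample_rate, f0) > threshold
```

A practical implementation would likely use a fast transform and allow a margin around the fundamental frequency to account for spectral leakage; both refinements are omitted here for brevity.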
[0063] In all the described variations of the encoder structure shown in
[0064]
[0065]
[0066] A 6-person listening test has been carried out, during which music samples encoded and decoded according to the invention were compared with reference samples containing the same music coded while applying post filtering in the conventional fashion but maintaining all other parameters unchanged. The results confirm a perceived quality improvement.
[0067] Further embodiments of the present invention will become apparent to a person skilled in the art after reading the description above. Even though the present description and drawings disclose embodiments and examples, the invention is not restricted to these specific examples. Numerous modifications and variations can be made without departing from the scope of the present invention, which is defined by the accompanying claims.
[0068] The systems and methods disclosed hereinabove may be implemented as software, firmware, hardware or a combination thereof. Certain components or all components may be implemented as software executed by a digital signal processor or microprocessor, or be implemented as hardware or as an application-specific integrated circuit. Such software may be distributed on computer readable media, which may comprise computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to a person skilled in the art, computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Further, it is well known to the skilled person that communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media.