Patent classifications
G10L19/012
Method and Device for Voice Activity Detection
In accordance with an example embodiment of the present invention, a method and an apparatus for voice activity detection (VAD) are disclosed. The VAD comprises creating a signal indicative of a primary VAD decision and determining hangover addition. Whether hangover is added depends on a short-term activity measure and/or a long-term activity measure. A signal indicative of a final VAD decision is then created.
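The hangover logic described above can be illustrated with a minimal sketch. The abstract does not say how the activity measures are computed or thresholded, so everything here is an assumption: the measures are taken as the fraction of active primary decisions over a short and a long window, and the window lengths, thresholds, and hangover length are hypothetical values.

```python
from collections import deque


def final_vad_decisions(primary, short_len=4, long_len=16,
                        short_thresh=0.75, long_thresh=0.5,
                        hangover_frames=3):
    """Turn primary VAD flags into final flags by adding hangover.

    All parameter names and default values are hypothetical.
    """
    short_hist = deque(maxlen=short_len)  # recent primary decisions
    long_hist = deque(maxlen=long_len)    # longer decision history
    hangover = 0
    final = []
    for active in primary:
        flag = 1 if active else 0
        short_hist.append(flag)
        long_hist.append(flag)
        short_activity = sum(short_hist) / len(short_hist)
        long_activity = sum(long_hist) / len(long_hist)
        if active:
            # Arm the hangover only when recent activity is high enough;
            # an isolated burst gets no hangover.
            if short_activity >= short_thresh or long_activity >= long_thresh:
                hangover = hangover_frames
            else:
                hangover = 0
            final.append(True)
        elif hangover > 0:
            # Primary decision is inactive, but hangover keeps the frame active.
            hangover -= 1
            final.append(True)
        else:
            final.append(False)
    return final
```

With these assumed defaults, a sustained run of active frames is extended by three hangover frames, while a single active frame after a long silence is not extended.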
System, apparatus and method for transmitting continuous audio data
A system, apparatus and a method for transmitting continuous audio data configured to mitigate data discontinuities in a receiving device. The method may mitigate data discontinuities by transmitting a continuous stream of audio data that has reduced changes to the audio data characteristics. The method may transmit filler audio data when no application audio data is available. The application audio data and the filler audio data are processed to reduce changes to the audio data characteristics in each stream.
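As a rough illustration of the idea, the sketch below emits one frame per time slot and substitutes filler when no application audio is available. The abstract does not specify the processing, so the frame length, the use of `None` to mark an empty slot, the constant filler value, and the padding step that keeps every frame the same length are all assumptions.

```python
def continuous_stream(app_frames, frame_len=4, filler_value=0):
    """Yield one fixed-length frame per slot.

    A None entry in app_frames marks a slot with no application audio
    (a hypothetical convention for this sketch).
    """
    filler = [filler_value] * frame_len
    for frame in app_frames:
        if frame is None:
            # No application audio: send filler so the stream never stops.
            yield list(filler)
        else:
            # Pad or trim so application frames match the filler frame shape.
            yield (list(frame) + filler)[:frame_len]
```

For example, `list(continuous_stream([[1, 2, 3, 4], None, [5, 6]]))` produces three frames of length four, with the gap filled and the short frame padded.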
Generating spectrally shaped sound signal based on sensitivity of human hearing and background noise level
A communication device includes a loudspeaker to transmit sound into a room. A signal having a white noise-like frequency spectrum spanning a frequency range of human hearing is generated. Auditory thresholds of human hearing for frequencies spanning the frequency range are stored. Respective levels of background noise in the room at the frequencies are determined. The white noise-like frequency spectrum is spectrally shaped to produce a shaped frequency spectrum having, for each frequency, a respective level that follows either the auditory threshold or the level of background noise at that frequency, whichever is greater. The shaped frequency spectrum is transmitted from the loudspeaker into the room.
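The shaping rule in this abstract is a per-frequency maximum, which a short sketch makes concrete. The function name and the use of decibel values in plain lists are assumptions; the abstract does not state the units or the data layout.

```python
def shape_spectrum(auditory_threshold_db, background_noise_db):
    """Return, per frequency, the greater of the hearing threshold
    and the measured background noise level (both assumed to be in dB)."""
    if len(auditory_threshold_db) != len(background_noise_db):
        raise ValueError("both inputs must cover the same frequencies")
    return [max(threshold, noise)
            for threshold, noise in zip(auditory_threshold_db, background_noise_db)]
```

For instance, thresholds of `[10, 5, 20]` and noise levels of `[8, 12, 20]` give a shaped spectrum of `[10, 12, 20]`.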
Method and apparatus for high frequency decoding for bandwidth extension
Disclosed are a method and an apparatus for high frequency decoding for bandwidth extension. The method for high frequency decoding for bandwidth extension comprises the steps of: decoding an excitation class; transforming a decoded low frequency spectrum on the basis of the excitation class; and generating a high frequency excitation spectrum on the basis of the transformed low frequency spectrum. The method and apparatus for high frequency decoding for bandwidth extension according to an embodiment can transform a restored low frequency spectrum and generate a high frequency excitation spectrum, thereby improving the restored sound quality without an excessive increase in complexity.
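The three steps can be sketched as follows. The abstract does not describe the transform or how the high band is generated, so this sketch swaps in two simple stand-ins: a class-dependent power-law compression of the low-band magnitudes, and a cyclic copy of the transformed low band into the high band. The class identifiers and exponents are hypothetical.

```python
import math

# Hypothetical mapping from decoded excitation class to a compression exponent.
CLASS_EXPONENT = {0: 1.0, 1: 0.5}


def transform_low_band(low_spectrum, excitation_class):
    """Compress magnitudes with a class-dependent exponent, keeping signs."""
    exponent = CLASS_EXPONENT[excitation_class]
    return [math.copysign(abs(x) ** exponent, x) for x in low_spectrum]


def generate_high_band_excitation(low_spectrum, excitation_class, n_high):
    """Fill n_high bins by cyclically copying the transformed low band."""
    transformed = transform_low_band(low_spectrum, excitation_class)
    return [transformed[i % len(transformed)] for i in range(n_high)]
```

With class `1`, a low band of `[4.0, -9.0]` is compressed to `[2.0, -3.0]` before being copied upward; with class `0` it is copied unchanged.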
Encoded output data stream transmission
In some examples, an audio sending device receives a stream of application audio data, encodes the stream of application audio data, and in response to detecting an end of the stream of application audio data, provides pre-encoded filler audio data from a buffer in the audio sending device as an encoded stream of filler audio data. The audio sending device transmits the encoded stream of application audio data and the encoded stream of filler audio data in an encoded output data stream over a transport to an audio receiving device.
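The flow in this abstract can be sketched with a generator. The `toy_encode` function, the fixed number of filler frames, and the cyclic reuse of the filler buffer are assumptions made for illustration; the abstract only says that filler is pre-encoded and served from a buffer once the application stream ends.

```python
def toy_encode(frame):
    """Stand-in encoder for this sketch: packs small integers into bytes."""
    return bytes(frame)


def encoded_output_stream(app_frames, encode, pre_encoded_filler, filler_count):
    """Yield encoded application frames, then pre-encoded filler frames."""
    for frame in app_frames:
        yield encode(frame)
    # End of the application stream: serve filler straight from the buffer,
    # without running the encoder again.
    for i in range(filler_count):
        yield pre_encoded_filler[i % len(pre_encoded_filler)]
```

A call such as `list(encoded_output_stream([[1, 2], [3, 4]], toy_encode, [b"\x00\x00"], 2))` yields two encoded application frames followed by two copies of the buffered filler frame.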
Apparatus and method realizing a fading of an MDCT spectrum to white noise prior to FDNS application
An apparatus for decoding an encoded audio signal to obtain a reconstructed audio signal includes a receiving interface for receiving one or more frames comprising information on a plurality of audio signal samples of an audio signal spectrum of the encoded audio signal, and a processor for generating the reconstructed audio signal. The processor is configured to generate the reconstructed audio signal by fading a modified spectrum to a target spectrum, if a current frame is not received by the receiving interface or if the current frame is received by the receiving interface but is corrupted, wherein the modified spectrum includes a plurality of modified signal samples, wherein, for each of the modified signal samples of the modified spectrum, an absolute value of the modified signal sample is equal to an absolute value of one of the audio signal samples of the audio signal spectrum.
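A minimal sketch of this concealment step follows. The abstract requires only that each modified sample keeps the absolute value of one of the original samples; random sign flipping is one way to satisfy that and is an assumption here, as are the Gaussian white-noise target, the linear cross-fade, and the `fade_frames` parameter.

```python
import random


def sign_scramble(spectrum, rng):
    """Randomly flip signs; every absolute value is preserved."""
    return [x if rng.random() < 0.5 else -x for x in spectrum]


def fade_to_target(modified, target, fade):
    """Linear cross-fade: fade=0 gives modified, fade=1 gives target."""
    return [(1.0 - fade) * m + fade * t for m, t in zip(modified, target)]


def conceal_frame(last_spectrum, lost_count, fade_frames=4, seed=0):
    """Build a replacement spectrum for a lost or corrupted frame.

    lost_count is the number of consecutive frames already lost.
    """
    rng = random.Random(seed + lost_count)
    modified = sign_scramble(last_spectrum, rng)
    # White-noise target with the same RMS level as the last good spectrum.
    rms = (sum(x * x for x in last_spectrum) / len(last_spectrum)) ** 0.5
    target = [rng.gauss(0.0, rms) for _ in last_spectrum]
    fade = min(1.0, lost_count / fade_frames)
    return fade_to_target(modified, target, fade)
```

As more consecutive frames are lost, `fade` grows toward 1 and the output moves from the sign-scrambled spectrum to the noise target.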
Support for generation of comfort noise, and generation of comfort noise
A method for generation of comfort noise for at least two audio channels. The method comprises determining a spatial coherence between audio signals on the respective audio channels, wherein at least one spatial coherence value per frame and frequency band is determined to form a vector of spatial coherence values. A vector of predicted spatial coherence values is formed by a weighted combination of a first coherence prediction and a second coherence prediction that are combined using a weight factor α. The method comprises signaling information about the weight factor α to the receiving node, for enabling the generation of the comfort noise for the at least two audio channels at the receiving node.
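The weighted combination can be sketched as below. The abstract does not give the formula or say how the weight factor is chosen, so two things are assumed: the combination is convex, with the weight applied to the first prediction and its complement to the second, and the sender picks the weight from a small candidate set by minimizing squared error, then signals the index. The parameter name `alpha` and the candidate list are hypothetical.

```python
def predict_coherence(first_pred, second_pred, alpha):
    """Combine two coherence predictions band by band with weight alpha."""
    return [alpha * a + (1.0 - alpha) * b
            for a, b in zip(first_pred, second_pred)]


def choose_alpha_index(coherence, first_pred, second_pred, candidates):
    """Return the index of the candidate weight with the smallest squared error."""
    def squared_error(alpha):
        predicted = predict_coherence(first_pred, second_pred, alpha)
        return sum((c - p) ** 2 for c, p in zip(coherence, predicted))

    return min(range(len(candidates)),
               key=lambda i: squared_error(candidates[i]))
```

The receiving node would then look up the signaled index in the same candidate list and call `predict_coherence` with its own copies of the two predictions.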