G10L19/012

Jitter buffer control, audio decoder, method and computer program

A jitter buffer control for controlling a provision of a decoded audio content on the basis of an input audio content is configured to select a frame-based time scaling or a sample-based time scaling in a signal-adaptive manner. An audio decoder uses such a jitter buffer control.
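The abstract does not say which signal property drives the selection. As a purely illustrative sketch, the following assumes that low-energy frames (for example silence or background noise) are handled with frame-based scaling and all other frames with sample-based scaling. The function name `choose_time_scaling` and the threshold value are hypothetical and are not taken from the source.

```python
def choose_time_scaling(frame, energy_threshold=1e-4):
    """Pick a time scaling mode for one decoded frame.

    Assumption (not from the abstract): mean energy is the signal
    property used for the signal-adaptive decision.
    """
    energy = sum(s * s for s in frame) / len(frame)
    # Low-energy frames: inserting or dropping a whole frame is assumed
    # to be inaudible, so the coarse frame-based scaling is chosen.
    if energy < energy_threshold:
        return "frame-based"
    # Otherwise fall back to the finer sample-based scaling.
    return "sample-based"


silent_frame = [0.0] * 160
active_frame = [0.5, -0.5] * 80
print(choose_time_scaling(silent_frame))
print(choose_time_scaling(active_frame))
```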

Methods, Apparatus and Systems for Determining Reconstructed Audio Signal

According to an aspect of the present invention, a method for reconstructing an audio signal having a baseband portion and a highband portion is disclosed. The method includes obtaining a decoded baseband audio signal by decoding an encoded audio signal and obtaining a plurality of subband signals by filtering the decoded baseband audio signal. The method further includes generating a high-frequency reconstructed signal by copying a number of consecutive subband signals of the plurality of subband signals and obtaining an envelope adjusted high-frequency signal. The method further includes generating a noise component based on a noise parameter. Finally, the method includes adjusting a phase of the high-frequency reconstructed signal and obtaining a time-domain reconstructed audio signal by combining the decoded baseband audio signal and the combined high-frequency signal.
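The copy-up and envelope adjustment steps can be sketched as follows. This is a minimal illustration only: it assumes the subband signals are already available as lists of samples, models envelope adjustment as one scalar gain per copied band, and leaves out the noise component, the phase adjustment and the synthesis back to the time domain. The names `copy_up` and `adjust_envelope` are hypothetical.

```python
def copy_up(subbands, start, count):
    """Copy `count` consecutive subband signals beginning at index `start`."""
    return [list(band) for band in subbands[start:start + count]]


def adjust_envelope(bands, gains):
    """Scale each copied band by its gain (assumed: one scalar per band)."""
    return [[g * s for s in band] for band, g in zip(bands, gains)]


# Four low-frequency subbands with two samples each (toy data).
subbands = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]

high_bands = copy_up(subbands, start=1, count=2)
adjusted = adjust_envelope(high_bands, gains=[0.5, 0.25])
print(adjusted)
```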

SPEECH SYNTHESIZER, AUDIO WATERMARKING INFORMATION DETECTION APPARATUS, SPEECH SYNTHESIZING METHOD, AUDIO WATERMARKING INFORMATION DETECTION METHOD, AND COMPUTER PROGRAM PRODUCT

According to an embodiment, a speech synthesizer includes a source generator, a phase modulator, and a vocal tract filter unit. The source generator generates a source signal by using a fundamental frequency sequence and a pulse signal. The phase modulator modulates, with respect to the source signal generated by the source generator, a phase of the pulse signal at each pitch mark based on audio watermarking information. The vocal tract filter unit generates a speech signal by using a spectrum parameter sequence with respect to the source signal in which the phase of the pulse signal is modulated by the phase modulator.

ESTIMATION OF BACKGROUND NOISE IN AUDIO SIGNALS
20230215447 · 2023-07-06

Background noise estimators and methods are disclosed for estimating background noise in an audio signal. Some methods include obtaining at least one parameter associated with an audio signal segment, such as a frame or part of a frame, based on a first linear prediction gain, calculated as a quotient between a residual signal from a 0th-order linear prediction and a residual signal from a 2nd-order linear prediction for the audio signal segment. A second linear prediction gain is calculated as a quotient between a residual signal from a 2nd-order linear prediction and a residual signal from a 16th-order linear prediction for the audio signal segment. Whether the audio signal segment comprises a pause is determined based at least on the obtained at least one parameter. A background noise estimate is updated based on the audio signal segment when the audio signal segment comprises a pause.
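The two linear prediction gains can be sketched with a Levinson-Durbin recursion, which yields the residual energy for every prediction order in one pass. This is a sketch under assumptions, not the patented implementation: the quotients are taken between residual energies, the autocorrelation method is used without windowing, and a small white-noise correction on `r[0]` is added for numerical stability. The function names, the toy test frame and the `eps` guard are hypothetical. The pause decision itself is not shown, because the abstract gives no thresholds.

```python
import math


def autocorrelation(x, max_lag):
    """Return the autocorrelation values r[0..max_lag] of the frame x."""
    n = len(x)
    return [sum(x[i] * x[i - lag] for i in range(lag, n))
            for lag in range(max_lag + 1)]


def residual_energies(r, order):
    """Levinson-Durbin recursion; returns residual energies E[0..order]."""
    a = [1.0] + [0.0] * order
    err = r[0]
    energies = [err]
    for m in range(1, order + 1):
        if err <= 0.0:
            # Nothing left to predict; keep the energy at zero.
            energies.append(0.0)
            continue
        acc = sum(a[j] * r[m - j] for j in range(m))
        k = -acc / err  # reflection coefficient for order m
        prev = a[:]
        for j in range(1, m):
            a[j] = prev[j] + k * prev[m - j]
        a[m] = k
        err = max(err * (1.0 - k * k), 0.0)
        energies.append(err)
    return energies


def lp_gains(frame, eps=1e-12):
    """Return (E0 / E2, E2 / E16) for one audio segment."""
    r = autocorrelation(frame, 16)
    r[0] *= 1.0001  # assumed white-noise correction, not from the source
    e = residual_energies(r, 16)
    return e[0] / max(e[2], eps), e[2] / max(e[16], eps)


# Toy frame: two sinusoids, 320 samples.
frame = [math.sin(0.3 * n) + 0.5 * math.sin(1.1 * n + 0.4) for n in range(320)]
g_0_2, g_2_16 = lp_gains(frame)
print(g_0_2, g_2_16)
```

Because the residual energy cannot grow as the prediction order increases, both quotients are at least 1 for any segment that is not silent.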

METHODS AND DEVICES FOR ENCODING AND/OR DECODING SPATIAL BACKGROUND NOISE WITHIN A MULTI-CHANNEL INPUT SIGNAL

The present document describes a method (600) for encoding a multi-channel input signal (101) which comprises N different channels. The method (600) comprises, for a current frame of a sequence of frames, determining (601) whether the current frame is an active frame or an inactive frame using a signal and/or a voice activity detector, and determining (602) a downmix signal (103) based on the multi-channel input signal (101), wherein the downmix signal (103) comprises N or fewer channels. In addition, the method (600) comprises determining (603) upmixing metadata (105) comprising a set of parameters for generating, based on the downmix signal (103), a reconstructed multi-channel signal (111) comprising N channels, wherein the upmixing metadata (105) is determined depending on whether the current frame is an active frame or an inactive frame. The method (600) further comprises encoding (604) the upmixing metadata (105) into a bitstream.

Wireless communication device, and method and apparatus for processing voice data

The disclosure provides a wireless communication device and a method and apparatus for processing voice data. The wireless communication device includes a radio frequency chip and a computing power chip. The radio frequency chip includes a first processor and a radio frequency transceiver. The radio frequency chip and the computing power chip are connected via a preset communication interface, so that the first processor communicates with a second processor in the computing power chip. The second processor is configured to decode data received by the radio frequency transceiver and to encode data to be sent by the radio frequency transceiver. Because the computing power chip is provided in the wireless communication device and its second processor supports a codec with a good processing effect, the quality of the encoding or decoding processing may be improved.

TRUNCATEABLE PREDICTIVE CODING

A method, system, and computer program are provided to encode and decode a channel coherence parameter applied on a frequency band basis, where the coherence parameters of each frequency band form a coherence vector. The coherence vector is encoded and decoded using a predictive scheme followed by variable bit rate entropy coding.
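The combination of prediction and variable bit rate entropy coding can be illustrated with a toy round trip. Every design choice here is an assumption made for the sketch and none comes from the abstract: a uniform 8-level quantizer, prediction of each band from the previous band in the same vector, and a unary code over zigzag-mapped residuals as the entropy code. The function names are hypothetical.

```python
def zigzag(v):
    """Map a signed residual to a non-negative integer (0, -1, 1, -2, ...)."""
    return 2 * v if v >= 0 else -2 * v - 1


def unzigzag(u):
    """Inverse of zigzag."""
    return u // 2 if u % 2 == 0 else -(u + 1) // 2


def encode_coherence(coherence, levels=8):
    """Quantize, predict from the previous band, and unary-code residuals."""
    bits = []
    prev = 0
    for c in coherence:
        idx = min(levels - 1, max(0, round(c * (levels - 1))))
        residual = idx - prev
        # Small residuals get short codewords, so the bit rate varies.
        bits.append("1" * zigzag(residual) + "0")
        prev = idx
    return "".join(bits)


def decode_coherence(bits, num_bands, levels=8):
    """Invert encode_coherence and return dequantized coherence values."""
    pos = 0
    prev = 0
    out = []
    for _ in range(num_bands):
        u = 0
        while bits[pos] == "1":
            u += 1
            pos += 1
        pos += 1  # skip the terminating "0"
        prev += unzigzag(u)
        out.append(prev / (levels - 1))
    return out


coherence_vector = [0.9, 0.85, 0.8, 0.3]
bitstream = encode_coherence(coherence_vector)
decoded = decode_coherence(bitstream, len(coherence_vector))
print(bitstream, decoded)
```

In this toy vector the first three bands quantize to the same index, so the second and third cost one bit each, while the jump to the fourth band costs more.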
