G10L2019/0002

Enhanced user experience through bi-directional audio and visual signal generation

In various embodiments, a computer-implemented method of training a neural network for creating an output signal of a different modality from an input signal is described. In embodiments, the input signal may be a sound signal or a visual image, and the output signal is then a visual image or a sound signal, respectively. In embodiments, a model is trained using a first pair of visual and audio networks to train a set of codebooks using known visual signals and audio signals, and using a second pair of visual and audio networks to further train the set of codebooks using augmented visual signals and augmented audio signals. Further, the first and the second visual networks are equally weighted, and the first and the second audio networks are equally weighted. In aspects of the present disclosure, the set of codebooks comprises a visual codebook, an audio codebook and a correlation codebook. These codebooks are then used to create a visual image from a sound signal and/or a sound signal from a visual image.

APPARATUS AND METHOD REALIZING IMPROVED CONCEPTS FOR TCX LTP

An apparatus for decoding an encoded audio signal to obtain a reconstructed audio signal is provided. The apparatus includes a receiving interface, a delay buffer, a sample selector for selecting a plurality of selected audio signal samples from the audio signal samples stored in the delay buffer, and a sample processor for processing the selected audio signal samples to obtain reconstructed audio signal samples of the reconstructed audio signal. The sample selector is configured to select, if a current frame is received by the receiving interface and if the current frame being received is not corrupted, the plurality of selected audio signal samples from the audio signal samples stored in the delay buffer depending on pitch lag information included in the current frame.
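The selection step for an uncorrupted frame can be illustrated with a minimal sketch. The function name `select_samples`, the list-based buffer, and the choice to repeat the last pitch cycle when more samples are requested than one cycle holds are all assumptions made for illustration; the abstract does not specify them.

```python
def select_samples(delay_buffer, pitch_lag, count):
    """Pick `count` samples from the delay buffer, starting one pitch lag
    back from the end of the buffer (hypothetical helper, not from the source).

    If `count` exceeds `pitch_lag`, the last pitch cycle is repeated; this
    wrap-around behaviour is an assumption of the sketch.
    """
    if not 0 < pitch_lag <= len(delay_buffer):
        raise ValueError("pitch_lag must be in 1..len(delay_buffer)")
    start = len(delay_buffer) - pitch_lag
    return [delay_buffer[start + (i % pitch_lag)] for i in range(count)]


# Usage: the pitch lag would come from the current, uncorrupted frame.
past_samples = [0, 1, 2, 3, 4, 5, 6, 7]
selected = select_samples(past_samples, pitch_lag=3, count=5)
```

The selected samples would then be handed to the sample processor; how a corrupted frame is handled is outside this sketch.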

Speech/audio bitstream decoding method and apparatus

A speech/audio bitstream decoding method includes acquiring a speech/audio decoding parameter of a current speech/audio frame, where the foregoing current speech/audio frame is a redundant decoded frame or a speech/audio frame previous to the foregoing current speech/audio frame is a redundant decoded frame, performing post processing on the acquired speech/audio decoding parameter according to speech/audio parameters of X speech/audio frames, where the foregoing X speech/audio frames include M speech/audio frames previous to the foregoing current speech/audio frame and/or N speech/audio frames next to the foregoing current speech/audio frame, and recovering a speech/audio signal using the post-processed speech/audio decoding parameter of the foregoing current speech/audio frame. The technical solutions of the speech/audio bitstream decoding method help improve quality of an output speech/audio signal.

APPARATUS AND METHOD FOR IMPROVED SIGNAL FADE OUT IN DIFFERENT DOMAINS DURING ERROR CONCEALMENT

An apparatus for decoding an audio signal is provided, having a receiving interface, configured to receive a first frame having a first audio signal portion of the audio signal, and configured to receive a second frame having a second audio signal portion of the audio signal; a noise level tracing unit, wherein the noise level tracing unit is configured to determine noise level information depending on at least one of the first audio signal portion and the second audio signal portion; a first reconstruction unit for reconstructing, in a first reconstruction domain, a third audio signal portion of the audio signal depending on the noise level information; a transform unit for transforming the noise level information to a second reconstruction domain; and a second reconstruction unit for reconstructing, in the second reconstruction domain, a fourth audio signal portion of the audio signal depending on the noise level information.

Hybrid concealment method: combination of frequency and time domain packet loss concealment in audio codecs

Embodiments of the invention relate to an error concealment unit for providing an error concealment audio information for concealing a loss of an audio frame in an encoded audio information. The error concealment unit provides a first error concealment audio information component for a first frequency range using a frequency domain concealment. The error concealment unit also provides a second error concealment audio information component for a second frequency range, which includes lower frequencies than the first frequency range, using a time domain concealment. The error concealment unit also combines the first error concealment audio information component and the second error concealment audio information component, to obtain the error concealment audio information. Other embodiments of the invention relate to a decoder including the error concealment unit, as well as related encoders, methods, and computer programs for decoding and/or concealing.
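The combination of the two components can be sketched as follows, assuming both concealment components are already available as time-domain sample lists. The one-pole crossover filter, the `alpha` parameter, and all function names are hypothetical stand-ins for whatever band splitting an actual implementation uses; the abstract only states that the frequency domain component covers the higher range and the time domain component the lower range.

```python
def one_pole_lowpass(signal, alpha):
    """Simple one-pole low-pass filter; alpha in (0, 1] sets the smoothing."""
    out, state = [], 0.0
    for x in signal:
        state += alpha * (x - state)
        out.append(state)
    return out


def combine_concealment(fd_component, td_component, alpha=0.2):
    """Keep the low band of the time domain component and the high band of
    the frequency domain component, then sum them (illustrative crossover)."""
    if len(fd_component) != len(td_component):
        raise ValueError("components must have the same length")
    td_low = one_pole_lowpass(td_component, alpha)
    fd_low = one_pole_lowpass(fd_component, alpha)
    # High band of the frequency domain component = signal minus its low band.
    return [lo + (x - x_low)
            for lo, x, x_low in zip(td_low, fd_component, fd_low)]
```

Because the low-pass and its complement sum to the identity, feeding the same signal into both inputs returns that signal unchanged, which is a convenient sanity check for the crossover.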

Method and device for quantization of linear prediction coefficient and method and device for inverse quantization
11848020 · 2023-12-19

A quantization apparatus comprises: a first quantization module for performing quantization without an inter-frame prediction; and a second quantization module for performing quantization with an inter-frame prediction, and the first quantization module comprises: a first quantization part for quantizing an input signal; and a third quantization part for quantizing a first quantization error signal, and the second quantization module comprises: a second quantization part for quantizing a prediction error; and a fourth quantization part for quantizing a second quantization error signal, and the first quantization part and the second quantization part comprise a trellis structured vector quantizer.
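The two-module structure can be sketched in a simplified form. In this illustration a scalar uniform quantizer stands in for the trellis structured vector quantizer, and the prediction coefficient `rho`, the step sizes, and all function names are assumptions rather than details from the abstract.

```python
import math


def quantize(value, step):
    """Uniform scalar quantizer (stand-in for the trellis structured VQ)."""
    return math.floor(value / step + 0.5) * step


def two_stage(target, coarse_step, fine_step):
    """Quantize the target, then quantize the remaining quantization error."""
    first = quantize(target, coarse_step)
    second = quantize(target - first, fine_step)
    return first + second


def quantize_without_prediction(x, coarse_step=0.5, fine_step=0.125):
    """First module: the input itself is quantized, then its error signal."""
    return [two_stage(v, coarse_step, fine_step) for v in x]


def quantize_with_prediction(x, previous, rho=0.5,
                             coarse_step=0.5, fine_step=0.125):
    """Second module: the inter-frame prediction error is quantized, then
    its error signal. `previous` is the prior frame's quantized vector."""
    out = []
    for v, p in zip(x, previous):
        predicted = rho * p
        out.append(predicted + two_stage(v - predicted, coarse_step, fine_step))
    return out
```

In both paths the reconstruction error is bounded by half the fine step, since the second stage quantizes whatever the first stage left over. The predictive path only pays off when consecutive frames are correlated, which is one reason for keeping a path without prediction alongside it.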

CONCEPT FOR SWITCHING OF SAMPLING RATES AT AUDIO PROCESSING DEVICES

Audio decoder device for decoding a bitstream, the audio decoder device including: a predictive decoder for producing a decoded audio frame from the bitstream, wherein the predictive decoder includes a parameter decoder for producing one or more audio parameters for the decoded audio frame from the bitstream and wherein the predictive decoder includes a synthesis filter device for producing the decoded audio frame by synthesizing the one or more audio parameters for the decoded audio frame; a memory device including one or more memories, wherein each of the memories is configured to store a memory state for the decoded audio frame, wherein the memory state for the decoded audio frame of the one or more memories is used by the synthesis filter device for synthesizing the one or more audio parameters for the decoded audio frame; and a memory state resampling device configured to determine the memory state for synthesizing the one or more audio parameters for the decoded audio frame, which has a sampling rate, for one or more of the memories by resampling a preceding memory state for synthesizing one or more audio parameters for a preceding decoded audio frame, which has a preceding sampling rate being different from the sampling rate of the decoded audio frame, for one or more of the memories and to store the memory state for synthesizing of the one or more audio parameters for the decoded audio frame for one or more of the memories into the respective memory.

Apparatus and method for improved signal fade out for switched audio coding systems during error concealment

An apparatus for decoding an audio signal includes a receiving interface, wherein the receiving interface is configured to receive a first frame and a second frame. Moreover, the apparatus includes a noise level tracing unit for determining noise level information being represented in a tracing domain. Furthermore, the apparatus includes a first reconstruction unit for reconstructing a third audio signal portion of the audio signal depending on the noise level information and a second reconstruction unit for reconstructing a fourth audio signal portion depending on noise level information being represented in the second reconstruction domain.

APPARATUS AND METHOD FOR GENERATING AN ERROR CONCEALMENT SIGNAL USING POWER COMPENSATION

An apparatus for generating an error concealment signal includes: an LPC representation generator for generating a replacement LPC representation; a gain calculator for calculating gain information from the LPC representations; a compensator for compensating a gain influence of the replacement LPC representation using the gain information; and an LPC synthesizer for filtering codebook information using the replacement LPC representation to obtain the error concealment signal, wherein the compensator is configured for weighting the codebook information or an LPC synthesis output signal.
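One way such a compensation could work is sketched below. The abstract does not say how the gain information is computed, so this sketch assumes it is the ratio of impulse-response energies of the synthesis filters for the last good LPC representation and the replacement LPC representation. The function names, the coefficient convention A(z) = 1 + a1 z^-1 + ..., and the truncation length are all hypothetical.

```python
import math


def impulse_response(lpc, length):
    """Impulse response of the all-pole synthesis filter 1/A(z), where
    lpc = [1, a1, ..., ap] holds the coefficients of A(z)."""
    h = []
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0
        for k in range(1, len(lpc)):
            if n - k >= 0:
                acc -= lpc[k] * h[n - k]
        h.append(acc)
    return h


def compensation_gain(last_good_lpc, replacement_lpc, length=64):
    """Weight that equalizes the synthesis-filter power of the replacement
    LPC to that of the last good LPC (assumed definition of the gain)."""
    e_good = sum(v * v for v in impulse_response(last_good_lpc, length))
    e_repl = sum(v * v for v in impulse_response(replacement_lpc, length))
    return math.sqrt(e_good / e_repl)


# Usage: scale the codebook (excitation) samples before LPC synthesis.
gain = compensation_gain([1.0], [1.0, -0.5])
weighted_excitation = [gain * s for s in [0.2, -0.1, 0.4]]
```

Applying the weight to the excitation or to the synthesis output is equivalent for a linear filter, which matches the two options named in the abstract.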

APPARATUS AND METHOD FOR IMPROVED SIGNAL FADE OUT FOR SWITCHED AUDIO CODING SYSTEMS DURING ERROR CONCEALMENT

An apparatus for decoding an audio signal includes a receiving interface, wherein the receiving interface is configured to receive a first frame and a second frame. Moreover, the apparatus includes a noise level tracing unit for determining noise level information being represented in a tracing domain. Furthermore, the apparatus includes a first reconstruction unit for reconstructing a third audio signal portion of the audio signal depending on the noise level information and a second reconstruction unit for reconstructing a fourth audio signal portion depending on noise level information being represented in the second reconstruction domain.