G10L19/083

Apparatus and method for generating an adaptive spectral shape of comfort noise

An apparatus for decoding an encoded audio signal to obtain a reconstructed audio signal is provided, having: a receiving interface for receiving one or more frames, a coefficient generator, and a signal reconstructor. The coefficient generator is configured to determine one or more first audio signal coefficients, and one or more noise coefficients. Moreover, the coefficient generator is configured to generate one or more second audio signal coefficients, depending on the one or more first audio signal coefficients and depending on the one or more noise coefficients. The audio signal reconstructor is configured to reconstruct a first portion of the reconstructed audio signal depending on the one or more first audio signal coefficients and the audio signal reconstructor is configured to reconstruct a second portion of the reconstructed audio signal depending on the one or more second audio signal coefficients, if the current frame is not received by the receiving interface or if the current frame being received by the receiving interface is corrupted.

Apparatus and method for generating an adaptive spectral shape of comfort noise

An apparatus for decoding an encoded audio signal to obtain a reconstructed audio signal is provided, having: a receiving interface for receiving one or more frames, a coefficient generator, and a signal reconstructor. The coefficient generator is configured to determine one or more first audio signal coefficients, and one or more noise coefficients. Moreover, the coefficient generator is configured to generate one or more second audio signal coefficients, depending on the one or more first audio signal coefficients and depending on the one or more noise coefficients. The audio signal reconstructor is configured to reconstruct a first portion of the reconstructed audio signal depending on the one or more first audio signal coefficients and the audio signal reconstructor is configured to reconstruct a second portion of the reconstructed audio signal depending on the one or more second audio signal coefficients, if the current frame is not received by the receiving interface or if the current frame being received by the receiving interface is corrupted.

Audio encoder and decoder using a frequency domain processor, a time domain processor, and a cross processing for continuous initialization

An audio encoder for encoding an audio signal includes: a first encoding processor for encoding a first audio signal portion in a frequency domain, wherein the first encoding processor includes: a time frequency converter for converting the first audio signal portion into a frequency domain representation having spectral lines up to a maximum frequency of the first audio signal portion; a spectral encoder for encoding the frequency domain representation; a second encoding processor for encoding a second different audio signal portion in the time domain; a cross-processor for calculating, from the encoded spectral representation of the first audio signal portion, initialization data of the second encoding processor, so that the second encoding processing is initialized to encode the second audio signal portion immediately following the first audio signal portion in time in the audio signal; a controller configured for analyzing the audio signal and for determining, which portion of the audio signal is the first audio signal portion encoded in the frequency domain and which portion of the audio signal is the second audio signal portion encoded in the time domain; and an encoded signal former for forming an encoded audio signal including a first encoded signal portion for the first audio signal portion and a second encoded signal portion for the second audio signal portion.

Audio encoder and decoder using a frequency domain processor, a time domain processor, and a cross processing for continuous initialization

An audio encoder for encoding an audio signal includes: a first encoding processor for encoding a first audio signal portion in a frequency domain, wherein the first encoding processor includes: a time frequency converter for converting the first audio signal portion into a frequency domain representation having spectral lines up to a maximum frequency of the first audio signal portion; a spectral encoder for encoding the frequency domain representation; a second encoding processor for encoding a second different audio signal portion in the time domain; a cross-processor for calculating, from the encoded spectral representation of the first audio signal portion, initialization data of the second encoding processor, so that the second encoding processing is initialized to encode the second audio signal portion immediately following the first audio signal portion in time in the audio signal; a controller configured for analyzing the audio signal and for determining, which portion of the audio signal is the first audio signal portion encoded in the frequency domain and which portion of the audio signal is the second audio signal portion encoded in the time domain; and an encoded signal former for forming an encoded audio signal including a first encoded signal portion for the first audio signal portion and a second encoded signal portion for the second audio signal portion.

Method for decoding and encoding a downmix matrix, method for presenting audio content, encoder and decoder for a downmix matrix, audio encoder and audio decoder

A method is described which decodes a downmix matrix for mapping a plurality of input channels of audio content to a plurality of output channels, the input and output channels being associated with respective speakers at predetermined positions relative to a listener position, wherein the downmix matrix is encoded by exploiting the symmetry of speaker pairs of the plurality of input channels and the symmetry of speaker pairs of the plurality of output channels. Encoded information representing the encoded downmix matrix is received and decoded for obtaining the decoded downmix matrix.

Text categorization using natural language processing

A method performed by a device may include identifying a plurality of samples of textual content; performing tokenization of the plurality of samples to generate a respective plurality of tokenized samples; performing embedding of the plurality of tokenized samples to generate a sample matrix; determining groupings of attributes of the sample matrix using a convolutional neural network; determining context relationships between the groupings of attributes using a bidirectional long short term memory (LSTM) technique; selecting predicted labels for the plurality of samples using a model, wherein the model selects, for a particular sample of the plurality of samples, a predicted label of the predicted labels from a plurality of labels based on respective scores of the particular sample with regard to the plurality of labels and based on a nonparametric paired comparison of the respective scores; and providing information identifying the predicted labels.

Text categorization using natural language processing

A method performed by a device may include identifying a plurality of samples of textual content; performing tokenization of the plurality of samples to generate a respective plurality of tokenized samples; performing embedding of the plurality of tokenized samples to generate a sample matrix; determining groupings of attributes of the sample matrix using a convolutional neural network; determining context relationships between the groupings of attributes using a bidirectional long short term memory (LSTM) technique; selecting predicted labels for the plurality of samples using a model, wherein the model selects, for a particular sample of the plurality of samples, a predicted label of the predicted labels from a plurality of labels based on respective scores of the particular sample with regard to the plurality of labels and based on a nonparametric paired comparison of the respective scores; and providing information identifying the predicted labels.

METHODS AND DEVICES FOR GENERATION AND PROCESSING OF MODIFIED AUDIO BITSTREAMS

Described herein is a method for generating a modified bitstream on a source device, wherein the method includes the steps of: a) receiving, by a receiver, a bitstream including coded media data; b) generating, by an embedder, payload of additional media data and embedding the payload in the bitstream for obtaining, as an output from the embedder, a modified bitstream including the coded media data and the payload of the additional media data; and c) outputting the modified bitstream to a sink device. Described is further a method for processing said modified bitstream on a sink device. Described are moreover a respective source device and sink device as well as a system of a source device and a sink device and respective computer program products.

METHODS AND DEVICES FOR GENERATION AND PROCESSING OF MODIFIED AUDIO BITSTREAMS

Described herein is a method for generating a modified bitstream on a source device, wherein the method includes the steps of: a) receiving, by a receiver, a bitstream including coded media data; b) generating, by an embedder, payload of additional media data and embedding the payload in the bitstream for obtaining, as an output from the embedder, a modified bitstream including the coded media data and the payload of the additional media data; and c) outputting the modified bitstream to a sink device. Described is further a method for processing said modified bitstream on a sink device. Described are moreover a respective source device and sink device as well as a system of a source device and a sink device and respective computer program products.

Spatial sound reproduction using multichannel loudspeaker systems

An apparatus for spatial audio signal decoding associated with a plurality of speaker nodes (201, 203, 205, 207, 209) placed within a three dimensional space, the apparatus comprising at least one processor and at least one memory including a computer program code. The at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus at least to determine a non-overlapping virtual surface arrangement (400), the virtual surface arrangement (400) comprising a plurality of virtual surfaces (421, 423, 431, 433) with corners positioned at at least three speaker nodes of the plurality of speaker nodes (201, 203, 205, 207, 209) and sides connecting pairs of corners configured to be non-intersecting with at least one defined virtual plane within the three dimensional space. The apparatus is further caused to generate gains for the speaker nodes based on the determined the virtual surface arrangement and apply the gains to at least one audio signal, the at least one audio signal to be positioned within the three dimensional space.