Patent classifications
G10L19/16
Apparatus and method for screen related audio object remapping
An apparatus for generating loudspeaker signals includes an object metadata processor configured to receive metadata, to calculate a second position of the audio object depending on the first position of the audio object and on a size of a screen if the audio object is indicated in the metadata as being screen-related, to feed the first position of the audio object as the position information into the object renderer if the audio object is indicated in the metadata as being not screen-related, and to feed the second position of the audio object as the position information into the object renderer if the audio object is indicated in the metadata as being screen-related. The apparatus further includes an object renderer configured to receive an audio object and to generate the loudspeaker signals depending on the audio object and on position information.
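The branch described in this abstract (pass the first position through unchanged, or substitute a screen-dependent second position) can be sketched as follows. This is a minimal illustration, not the claimed method: the piecewise-linear azimuth mapping, the field names, and the default screen-edge angles are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class AudioObject:
    azimuth_deg: float    # first position as authored (hypothetical field)
    screen_related: bool  # flag carried in the object metadata


def remap_azimuth(azimuth_deg, ref_edge_deg, local_edge_deg):
    """Assumed piecewise-linear mapping from a reference screen to a local screen.

    Angles inside the reference screen edge are scaled onto the local screen;
    angles outside it are stretched so that 180 degrees still maps to 180.
    """
    sign = -1.0 if azimuth_deg < 0 else 1.0
    a = abs(azimuth_deg)
    if a <= ref_edge_deg:
        mapped = a * local_edge_deg / ref_edge_deg
    else:
        mapped = local_edge_deg + (a - ref_edge_deg) * (180.0 - local_edge_deg) / (180.0 - ref_edge_deg)
    return sign * mapped


def position_for_renderer(obj, ref_edge_deg=29.0, local_edge_deg=15.0):
    """Return the position information that would be fed to the object renderer."""
    if obj.screen_related:
        # Second position: depends on the first position and the screen size.
        return remap_azimuth(obj.azimuth_deg, ref_edge_deg, local_edge_deg)
    # Not screen-related: the first position is passed through unchanged.
    return obj.azimuth_deg
```

With these assumed defaults, an object authored at the reference screen edge moves to the local screen edge when flagged as screen-related and stays where it was otherwise.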
AUDIO PACKET LOSS CONCEALMENT VIA PACKET REPLICATION AT DECODER INPUT
A system includes a server to generate a real-time stream of audio packets and a client device to decode and play back the audio content of the stream. The client device includes a network interface configured to receive a stream of audio packets via a network and a buffer configured to temporarily buffer a subset of audio packets of the stream. The client device further includes an audio decoder having an input to receive audio packets from the buffer and an output to provide corresponding segments of a decoded audio data stream. The client device also includes a stream monitoring module configured to provide an audio packet of the subset in the buffer which was previously decoded by the decoder to the input of the decoder again for a repeated decoding in place of a decoding of an audio packet that is lost or late.
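The replication step can be illustrated with a short sketch. The `decode` function below is a stand-in for a real audio decoder, and the dictionary of received packets is a simplification of the buffer; both are assumptions made for illustration only.

```python
def decode(packet: bytes) -> str:
    """Stand-in for an audio decoder (hypothetical); returns a label, not PCM."""
    return f"pcm({packet.decode()})"


def play_stream(received, total):
    """Decode `total` packet slots, replicating the last decoded packet on loss.

    `received` maps sequence number -> packet bytes for packets that arrived
    in time. A missing entry represents a lost or late packet.
    """
    output = []
    last_decoded = None
    for seq in range(total):
        packet = received.get(seq)
        if packet is None and last_decoded is not None:
            # Lost or late: feed the previously decoded packet to the decoder again.
            packet = last_decoded
        if packet is None:
            # Nothing has been decoded yet, so there is nothing to replicate.
            output.append(None)
            continue
        output.append(decode(packet))
        last_decoded = packet
    return output
```

Calling `play_stream({0: b"a", 1: b"b", 3: b"d"}, 4)` decodes packet 1 a second time in the slot where packet 2 is missing.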
Apparatus and method for encoding an audio signal using compensation values between three spectral bands
An apparatus for encoding an audio signal includes: a core encoder for core encoding first audio data in a first spectral band; a parametric coder for parametrically coding second audio data in a second spectral band being different from the first spectral band, wherein the parametric coder includes: an analyzer for analyzing first audio data in the first spectral band to obtain a first analysis result and for analyzing second audio data in the second spectral band to obtain a second analysis result; a compensator for calculating a compensation value using the first analysis result and the second analysis result; and a parameter calculator for calculating a parameter from the second audio data in the second spectral band using the compensation value.
Mass media presentations with synchronized audio reactions
Systems and methods of the present disclosure provide a plurality of audio reactions from a plurality of client devices. The audio reactions are captured by microphones on the client devices and are time-stamped. The method also includes mixing the audio reactions by a mixer server to form a mixed audio reaction, and sending the mixed audio reaction to at least one of the client devices. The client device is adapted to play the mixed audio reaction and a mass media presentation. The mixed audio reaction and the mass media presentation are synchronized to create an audience effect for the mass media presentation. The present technology also provides echo removal, volume balancing, compression, and time stamping of an audio stream by the client device. Reactions from at least one of buttons and gestures may activate synthesized sounds, for example clapping, booing, and cheering, which are mixed into the mixed audio reaction.
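A minimal sketch of the mixing step, assuming each time stamp has already been converted to a sample offset on a shared timeline and that samples are floats in the range -1.0 to 1.0. The function name and the hard clipping are illustrative choices, not details from the abstract.

```python
def mix_reactions(reactions, length):
    """Sum time-aligned reactions into one buffer of `length` samples.

    `reactions` is a list of (start_sample, samples) pairs, where
    `start_sample` is derived from the reaction's time stamp (assumed).
    """
    mixed = [0.0] * length
    for start, samples in reactions:
        for i, sample in enumerate(samples):
            t = start + i
            if 0 <= t < length:
                mixed[t] += sample
    # Hard-clip to the valid range; a real mixer would more likely
    # balance volumes or apply compression instead.
    return [max(-1.0, min(1.0, v)) for v in mixed]
```

Two overlapping reactions add where they coincide, and the sum is clipped if it exceeds full scale.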
Multi-stride packet payload mapping for robust transmission of data
Systems and methods for packet payload mapping for robust transmission of data are described. For example, methods may include receiving, using a network interface, packets that each respectively include a primary frame and one or more preceding frames from the sequence of frames of data that are separated from the primary frame in the sequence of frames by a respective multiple of a stride parameter; storing the frames of the packets in a buffer with entries that each hold the primary frame and the one or more preceding frames of a packet; reading a first frame from the buffer as the primary frame from one of the entries; determining that a packet with a primary frame that is a next frame in the sequence has been lost; and, responsive to the determination, reading the next frame from the buffer as a preceding frame from one of the entries.
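The payload mapping and the recovery path can be sketched as below. The packet layout (a dictionary with `primary` and `preceding` entries), the function names, and the parameter `copies` are assumptions for illustration; the abstract specifies only that preceding frames are spaced by multiples of a stride parameter.

```python
def make_packet(frames, index, stride, copies):
    """Build a packet: the primary frame plus up to `copies` earlier frames,
    each separated from the primary frame by a multiple of `stride`."""
    preceding = {}
    for k in range(1, copies + 1):
        j = index - k * stride
        if j >= 0:
            preceding[j] = frames[j]
    return {"primary_index": index, "primary": frames[index], "preceding": preceding}


def recover(packets, num_frames):
    """Read frames in order; when a primary frame is missing, read the same
    frame as a preceding frame from another buffered packet."""
    by_primary = {p["primary_index"]: p for p in packets}
    out = []
    for n in range(num_frames):
        if n in by_primary:
            out.append(by_primary[n]["primary"])
            continue
        # The packet whose primary frame is n was lost: search the other entries.
        frame = None
        for p in packets:
            if n in p["preceding"]:
                frame = p["preceding"][n]
                break
        out.append(frame)  # None if no redundant copy was received
    return out
```

With a stride of 2, two consecutive lost packets are covered by two different later packets, which is why spacing the redundant frames apart helps against burst loss. Recovery in this sketch depends on those later packets having already arrived.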
APPARATUS AND METHOD FOR AUDIO ENCODING
An audio encoding apparatus comprises an audio receiver (201) receiving audio items representing an audio scene and a metadata receiver (203) receiving input presentation metadata for the audio items describing presentation constraints for the rendering of the audio items. The presentation constraints constrain a rendering parameter that can be adapted when rendering the audio items. An audio encoder (205) generates encoded audio data for the audio scene by encoding the plurality of audio items with the encoding being adapted in response to the input presentation metadata. A metadata circuit (207) generates output presentation metadata from the input presentation metadata. The output presentation metadata comprises data for encoded audio items which constrain the extent by which an adaptable parameter of a rendering can be adapted when rendering the encoded audio items. An output (209) generates an encoded audio data stream comprising the encoded audio data and the output presentation metadata.
Audio return channel data loopback
A system and method to process audio data received over the ARC or eARC interface of HDMI from audio sources are provided. A media device may receive compressed audio data in a number of data formats. The media device may convert between the audio formats provided by the audio sources and the audio formats supported by audio playback devices. The media device may inspect frames of audio data to determine if the frames are to be decoded. The frame may be decoded and subsequently encoded into the data formats supported by the audio playback devices. To reduce latency, the media device may enable a pass-through mode to bypass the decoding of the frames to allow the frames to be decoded at the audio playback devices. A bi-directional loopback application may route audio data received over the ARC or eARC interface from the audio sources to the audio playback devices.
Retransmission Softbit Decoding
Disclosed are methods and systems for using softbit decoding techniques in retransmission-based networks for error concealment of packets corrupted by bit-errors. The softbit decoding techniques derive softbit information from multiple corrupted hardbits of the retransmitted packet to aid a softbit decoder in decoding the packet. The approach realizes improved error concealment capability while maintaining a simple system architecture. A retransmission softbit module is inserted between a channel decoder used for channel-decoding and demodulating a compressed packet and the softbit decoder. The retransmission softbit module may derive an accumulated softbit packet from multiple corrupted copies of the packet received from the channel decoder, make bit decisions based on the accumulated softbit packet, and derive reliability information for the bit decisions. The bit decisions may be a majority decision packet (MDP) created using a majority voting scheme. The reliability information and the MDP may be provided to the softbit decoder for decoding.
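A minimal sketch of the majority voting scheme, assuming each corrupted copy is a list of hard bits (0 or 1) of equal length. The per-bit reliability used here, the fraction of copies that agree with the decision, is an illustrative choice and not necessarily the measure used in the disclosure.

```python
def majority_decision(copies):
    """Derive a majority decision packet (MDP) and per-bit reliability
    from several corrupted copies of the same packet."""
    n = len(copies)
    mdp, reliability = [], []
    for bits in zip(*copies):
        ones = sum(bits)                 # accumulated value for this bit position
        bit = 1 if 2 * ones > n else 0   # ties (even n) fall to 0 in this sketch
        agree = ones if bit == 1 else n - ones
        mdp.append(bit)
        reliability.append(agree / n)    # 1.0 means every copy agreed
    return mdp, reliability
```

Both outputs would then be handed to the softbit decoder, which can weight low-reliability positions accordingly.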
METHODS OF ENCODING AND DECODING AUDIO SIGNAL, AND ENCODER AND DECODER FOR PERFORMING THE METHODS
Disclosed are methods of encoding and decoding an audio signal, and an encoder and a decoder for performing the methods. The method of encoding an audio signal includes identifying an input signal corresponding to a low frequency band of the audio signal, windowing the input signal, generating a first latent vector by inputting the windowed input signal to a first encoding model, transforming the windowed input signal into a frequency domain, generating a second latent vector by inputting the transformed input signal to a second encoding model, generating a final latent vector by combining the first latent vector and the second latent vector, and generating a bitstream corresponding to the final latent vector.
Transmission device, transmission method, reception device, and a reception method
A reception side is enabled to easily recognize that metadata is inserted into an audio stream. A metafile is transmitted that includes meta information with which a reception device acquires an audio stream into which metadata is inserted. Identification information indicating that the metadata is inserted into the audio stream is inserted into the metafile. At the reception side, the insertion of the metadata into the audio stream can be easily recognized based on the identification information inserted into the metafile.