Patent classifications
G10L19/16
Methods and apparatuses for DTX hangover in audio coding
A transmitting node and a receiving node for audio coding, and methods therein, are described. The nodes are operable to encode/decode speech and to apply a discontinuous transmission (DTX) scheme comprising transmission/reception of Silence Insertion Descriptor (SID) frames during speech inactivity. The method in the transmitting node comprises determining, from amongst a number N of hangover frames, a set Y of frames that are representative of background noise, and transmitting the N hangover frames, comprising at least said set Y of frames, to the receiving node. The method further comprises transmitting a first SID frame to the receiving node in association with the transmission of the N hangover frames, where the SID frame comprises information indicating the determined set Y of hangover frames to the receiving node. The method enables the receiving node to generate comfort noise based on the hangover frames most suitable for the purpose.
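The selection and signalling steps of this abstract can be illustrated with a minimal sketch. The abstract does not say how the set Y is chosen or how it is encoded in the SID frame, so everything concrete here is an assumption made for illustration: the per-frame `energy` field, the "within a margin of the quietest frame" criterion, and the bitmask encoding.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HangoverFrame:
    index: int     # position within the N hangover frames (0-based)
    energy: float  # hypothetical per-frame energy estimate


def select_noise_frames(frames: List[HangoverFrame], margin: float = 2.0) -> List[int]:
    """Pick the set Y: frames whose energy is within `margin` times the
    quietest frame. This criterion is an assumption, not the patent's."""
    if not frames:
        return []
    floor = min(f.energy for f in frames)
    return [f.index for f in frames if f.energy <= floor * margin]


def encode_selection(selected: List[int], n: int) -> int:
    """Pack the indices of set Y into an N-bit mask (hypothetical SID field)."""
    mask = 0
    for i in selected:
        if not 0 <= i < n:
            raise ValueError(f"frame index {i} outside 0..{n - 1}")
        mask |= 1 << i
    return mask


def decode_selection(mask: int, n: int) -> List[int]:
    """Receiver side: recover the indices of set Y from the mask."""
    return [i for i in range(n) if mask & (1 << i)]
```

With such a mask the receiving node could restrict its comfort-noise estimate to the indicated frames and ignore hangover frames that still contain speech residue.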
METHOD FOR PREVENTING DUPLICATE APPLICATION OF AUDIO EFFECTS TO AUDIO DATA AND ELECTRONIC DEVICE SUPPORTING THE SAME
An electronic device may include an audio output device; and a processor configured to be operatively connected to the audio output device. The processor is configured to: acquire a first user input for reproducing a first audio related to a first application; based on the first user input, generate first decoded data by decoding the first audio using a first codec; generate first synthesized data by applying a first audio effect to the first decoded data; transmit the first synthesized data to an audio framework; based on the first audio being decoded using the first codec, transmit, to the audio framework, a first request for deactivating a function of applying a second audio effect; and output the first synthesized data via the audio output device without applying the second audio effect to the first synthesized data, based on the function of applying the second audio effect being deactivated.
Transmission device, transmission method, reception device, and reception method
The processing load on the receiving side is reduced in a case where a plurality of classes of audio data are transmitted. A predetermined number of audio streams including coded data of a plurality of groups are generated, and a container of a predetermined format having this predetermined number of audio streams is transmitted. Command information for creating a command specifying a group to be decoded from among the plurality of groups is inserted into the container and/or the audio stream. For example, a command insertion area for the receiving side to insert a command for specifying a group to be decoded is provided in at least one audio stream among the predetermined number of audio streams.
Dynamic network identification
Systems, apparatuses, and methods are described for dynamic network identification to facilitate selection of desired audio. A premises (e.g., a public bar) may have a plurality of display devices (e.g., television screens) outputting videos associated with a plurality of content items (e.g., television programs). A computing device may assign the audio data of each content item to be transmitted over a separate wireless network. A user may listen to the audio of a desired content item by causing a user device to join the wireless network assigned to transmit the audio data of that content item. The wireless network may be reused to transmit different audio data. A network identifier of the wireless network may be renamed to indicate the different audio data. The network identifier may be based on metadata associated with a content item.
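The renaming step can be sketched as follows. The metadata keys (`screen`, `title`), the `AUDIO` prefix, and the `AudioNetworks` class are hypothetical names chosen for illustration; the only outside fact relied on is that an IEEE 802.11 SSID is limited to 32 octets.

```python
from typing import Dict

MAX_SSID_BYTES = 32  # SSID length limit in IEEE 802.11


def make_network_name(metadata: Dict[str, str], prefix: str = "AUDIO") -> str:
    """Derive a network identifier from content-item metadata.
    The key names and the naming pattern are assumptions."""
    screen = metadata.get("screen", "screen")
    title = metadata.get("title", "untitled")
    label = f"{prefix}-{screen}-{title}"
    # Truncate on the encoded bytes, dropping any partial trailing character.
    return label.encode("utf-8")[:MAX_SSID_BYTES].decode("utf-8", errors="ignore")


class AudioNetworks:
    """Tracks which name each reusable wireless network currently advertises."""

    def __init__(self) -> None:
        self.names: Dict[str, str] = {}

    def assign(self, network_id: str, metadata: Dict[str, str]) -> str:
        """Assign (or reassign) a network to a content item and rename it."""
        name = make_network_name(metadata)
        self.names[network_id] = name
        return name
```

Calling `assign` a second time with the same `network_id` models the reuse case: the network stays the same while its advertised name changes to reflect the new content item.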
METHODS FOR PARAMETRIC MULTI-CHANNEL ENCODING
The present document relates to audio coding systems. In particular, the present document relates to efficient methods and systems for parametric multi-channel audio coding. An audio encoding system configured to generate a bitstream indicative of a downmix signal and spatial metadata for generating a multi-channel upmix signal from the downmix signal is described. The system comprises a downmix processing unit configured to generate the downmix signal from a multi-channel input signal; wherein the downmix signal comprises m channels and wherein the multi-channel input signal comprises n channels; n, m being integers with m<n. Furthermore, the system comprises a parameter processing unit configured to determine the spatial metadata from the multi-channel input signal. In addition, the system comprises a configuration unit configured to determine one or more control settings for the parameter processing unit based on one or more external settings; wherein the one or more external settings comprise a target data-rate for the bitstream and wherein the one or more control settings comprise a maximum data-rate for the spatial metadata.
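As a rough illustration of the configuration unit, the sketch below derives one control setting (a maximum data-rate for the spatial metadata) from one external setting (the target data-rate of the bitstream). The abstract does not state how the two are related; the fixed percentage used here, along with the names `ControlSettings` and `configure`, is purely an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlSettings:
    max_metadata_rate_bps: int  # upper bound for the spatial metadata


def configure(target_rate_bps: int, n_input: int, m_downmix: int,
              metadata_share_percent: int = 10) -> ControlSettings:
    """Map an external setting (target bitstream rate) to a control setting.
    The fixed-percentage rule is a placeholder, not the patented method."""
    if not 0 < m_downmix < n_input:
        raise ValueError("the downmix must have fewer channels than the input (m < n)")
    if target_rate_bps <= 0:
        raise ValueError("target data-rate must be positive")
    return ControlSettings(target_rate_bps * metadata_share_percent // 100)
```

For example, a 5.1 input (n = 6) downmixed to stereo (m = 2) at a 64 kbit/s target would, under this placeholder rule, leave at most 6400 bit/s for spatial metadata.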
METHODS, APPARATUS AND SYSTEMS FOR 6DOF AUDIO RENDERING AND DATA REPRESENTATIONS AND BITSTREAM STRUCTURES FOR 6DOF AUDIO RENDERING
The present disclosure relates to methods, apparatus and systems for encoding an audio signal into a bitstream, in particular at an encoder, comprising: encoding or including audio signal data associated with 3DoF audio rendering into one or more first bitstream parts of the bitstream, and encoding or including metadata associated with 6DoF audio rendering into one or more second bitstream parts of the bitstream. The present disclosure further relates to methods, apparatus and systems for decoding an audio signal and audio rendering based on the bitstream.
SOURCE CLASSIFICATION USING HDMI AUDIO METADATA
Methods, apparatus, systems and articles of manufacture are disclosed for source classification using HDMI audio metadata. An example apparatus includes a metadata extractor to extract values of audio encoding parameters from HDMI metadata obtained from a monitored HDMI port of a media device, the HDMI metadata corresponding to media being output from the monitored HDMI port; map the extracted values of the audio encoding parameters to a first unique encoding class (UEC) in a set of defined UECs, different ones of the set of defined UECs corresponding to different combinations of possible values of the audio encoding parameters capable of being included in the HDMI metadata; and identify a media source corresponding to the media output from the HDMI port based on one or more possible media sources mapped to the first UEC.
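The two-stage lookup described here (parameter values to UEC, then UEC to candidate sources) can be sketched with plain dictionaries. The parameter names, the table contents, and the source labels below are all invented for illustration and do not come from the patent.

```python
from typing import Any, Dict, List, Tuple

# (codec, sample rate in Hz, channel count): hypothetical encoding parameters
EncodingParams = Tuple[str, int, int]

# Hypothetical table: each distinct parameter combination is one UEC.
UEC_BY_PARAMS: Dict[EncodingParams, str] = {
    ("AC-3", 48000, 6): "UEC-A",
    ("PCM", 48000, 2): "UEC-B",
}

# Hypothetical table: media sources known to produce each UEC.
SOURCES_BY_UEC: Dict[str, List[str]] = {
    "UEC-A": ["set-top box", "game console"],
    "UEC-B": ["streaming stick"],
}


def extract_params(hdmi_metadata: Dict[str, Any]) -> EncodingParams:
    """Pull the audio encoding parameter values out of the metadata."""
    return (str(hdmi_metadata["codec"]),
            int(hdmi_metadata["sample_rate_hz"]),
            int(hdmi_metadata["channels"]))


def classify_source(hdmi_metadata: Dict[str, Any]) -> List[str]:
    """Return the candidate media sources for the observed metadata,
    or an empty list if the combination maps to no known UEC."""
    uec = UEC_BY_PARAMS.get(extract_params(hdmi_metadata))
    if uec is None:
        return []
    return SOURCES_BY_UEC.get(uec, [])
```

When several sources share a UEC, as with `UEC-A` here, the lookup only narrows the candidates; picking one source among them would need additional information that this sketch does not model.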
Stereo audio encoder and decoder
The present disclosure provides methods, devices and computer program products for encoding and decoding a stereo audio signal based on an input signal. According to the disclosure, a hybrid approach of using both parametric stereo coding and a discrete representation of the stereo audio signal is used which may improve the quality of the encoded and decoded audio for certain bitrates.
METHOD FOR ENCODING AUDIO AND VIDEO DATA, AND ELECTRONIC DEVICE
Provided is a method for encoding audio and video data. The method includes: encapsulating cached elementary stream (ES) data of audio frames into an audio packetized elementary stream (PES) packet, and then splitting the audio PES packet into consecutive audio transport stream (TS) packets; and outputting one or more audio TS packet groups based on an order of the audio frames, and outputting one or more video TS packet groups based on an order of the video frames; wherein the one or more video TS packet groups are present between the audio TS packet groups belonging to a same audio PES packet, and the one or more audio TS packet groups are present between the video TS packet groups belonging to different video PES packets.
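The step of splitting a PES packet into consecutive TS packets can be sketched as below. It follows the general MPEG-2 transport stream layout (188-byte packets, 4-byte header with sync byte 0x47, adaptation-field stuffing in the final packet) and covers only the splitting; the grouping and audio/video interleaving described in the abstract are not modelled. The function name and its parameters are assumptions.

```python
from typing import List

TS_PACKET_SIZE = 188
TS_HEADER_SIZE = 4
MAX_PAYLOAD = TS_PACKET_SIZE - TS_HEADER_SIZE  # 184 bytes


def packetize_pes(pes: bytes, pid: int, first_cc: int = 0) -> List[bytes]:
    """Split one PES packet into consecutive 188-byte TS packets."""
    packets = []
    cc = first_cc & 0x0F  # 4-bit continuity counter
    for offset in range(0, len(pes), MAX_PAYLOAD):
        chunk = pes[offset:offset + MAX_PAYLOAD]
        pusi = 1 if offset == 0 else 0      # payload_unit_start_indicator
        stuffing = MAX_PAYLOAD - len(chunk)  # bytes to fill in a short last packet
        afc = 0b11 if stuffing else 0b01     # adaptation_field_control
        header = bytes([
            0x47,                                # sync byte
            (pusi << 6) | ((pid >> 8) & 0x1F),   # PUSI + upper 5 PID bits
            pid & 0xFF,                          # lower 8 PID bits
            (afc << 4) | cc,                     # not scrambled, AFC, counter
        ])
        if stuffing == 0:
            adaptation = b""
        elif stuffing == 1:
            adaptation = b"\x00"  # adaptation_field_length = 0
        else:
            # length byte, empty flags byte, then 0xFF stuffing bytes
            adaptation = bytes([stuffing - 1, 0x00]) + b"\xFF" * (stuffing - 2)
        packets.append(header + adaptation + chunk)
        cc = (cc + 1) & 0x0F
    return packets
```

A muxer following the abstract would then emit these packets in groups, placing video TS packet groups between the audio groups of the same PES packet.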
Decoding audio bitstreams with enhanced spectral band replication metadata in at least one fill element
Embodiments relate to an audio processing unit that includes a buffer, a bitstream payload deformatter, and a decoding subsystem. The buffer stores at least one block of an encoded audio bitstream. The block includes a fill element that begins with an identifier followed by fill data. The fill data includes at least one flag identifying whether enhanced spectral band replication (eSBR) processing is to be performed on audio content of the block. A corresponding method for decoding an encoded audio bitstream is also provided.