G10L19/173

Methods and apparatuses for DTX hangover in audio coding

A transmitting node and a receiving node for audio coding, and methods therein, are disclosed. The nodes are operable to encode/decode speech and to apply a discontinuous transmission (DTX) scheme comprising transmission/reception of Silence Insertion Descriptor (SID) frames during speech inactivity. The method in the transmitting node comprises determining, from among a number N of hangover frames, a set Y of frames that are representative of background noise, and transmitting the N hangover frames, comprising at least said set Y of frames, to the receiving node. The method further comprises transmitting a first SID frame to the receiving node in association with the transmission of the N hangover frames, where the SID frame comprises information indicating the determined set Y of hangover frames to the receiving node. The method enables the receiving node to generate comfort noise based on the hangover frames best suited to the purpose.
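As a rough illustration of the selection step, the sketch below picks the hangover frames whose energy lies close to the quietest frame and packs the chosen indices into a bitmask that an SID frame could carry. The energy criterion, the 3 dB margin, the bitmask layout, and all function names are assumptions made for illustration; the abstract does not specify how the set Y is determined or signalled.

```python
def select_noise_frames(energies_db, margin_db=3.0):
    """Return indices of hangover frames within margin_db of the quietest frame.

    Hypothetical criterion: frames near the energy floor are treated as
    representative of background noise.
    """
    floor = min(energies_db)
    return [i for i, e in enumerate(energies_db) if e - floor <= margin_db]


def pack_mask(indices):
    """Pack selected frame indices into an integer bitmask (bit i = frame i)."""
    mask = 0
    for i in indices:
        mask |= 1 << i
    return mask


def unpack_mask(mask, n):
    """Recover the selected frame indices from the bitmask on the receiver side."""
    return [i for i in range(n) if (mask >> i) & 1]


# Seven hangover frames (N = 7) with illustrative energies in dB.
hangover_energies = [-30.0, -52.0, -51.0, -50.0, -35.0, -53.0, -49.0]
selected = select_noise_frames(hangover_energies)
sid_mask = pack_mask(selected)
```

The receiver would call `unpack_mask(sid_mask, 7)` to learn which buffered hangover frames to use when estimating comfort-noise parameters.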

Methods, Apparatus and Systems for Determining Reconstructed Audio Signal

According to an aspect of the present invention, a method for reconstructing an audio signal having a baseband portion and a highband portion is disclosed. The method includes obtaining a decoded baseband audio signal by decoding an encoded audio signal and obtaining a plurality of subband signals by filtering the decoded baseband audio signal. The method further includes generating a high-frequency reconstructed signal by copying a number of consecutive subband signals of the plurality of subband signals and obtaining an envelope adjusted high-frequency signal. The method further includes generating a noise component based on a noise parameter. Finally, the method includes adjusting a phase of the high-frequency reconstructed signal and obtaining a time-domain reconstructed audio signal by combining the decoded baseband audio signal and the combined high-frequency signal.
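The copy-up and envelope-adjustment steps can be sketched as follows. This is a minimal illustration only: subband signals are plain lists of real samples, the per-band target energies stand in for transmitted envelope data, and the function names are hypothetical. The noise component, the phase adjustment, and the synthesis filterbank described in the abstract are omitted.

```python
def copy_up(subbands, start, count):
    """Copy `count` consecutive low subbands, beginning at `start`, to act as high bands."""
    return [list(band) for band in subbands[start:start + count]]


def adjust_envelope(bands, target_energies):
    """Scale each copied band so its mean energy matches the target for that band."""
    adjusted = []
    for band, target in zip(bands, target_energies):
        energy = sum(x * x for x in band) / len(band)
        gain = (target / energy) ** 0.5 if energy > 0 else 0.0
        adjusted.append([gain * x for x in band])
    return adjusted


# Four baseband subbands of four samples each (illustrative values).
baseband = [
    [4.0, -4.0, 4.0, -4.0],
    [2.0, -2.0, 2.0, -2.0],
    [1.0, -1.0, 1.0, -1.0],
    [1.0, 1.0, -1.0, -1.0],
]
high = copy_up(baseband, start=2, count=2)
high_adjusted = adjust_envelope(high, target_energies=[4.0, 0.25])
```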

SOMATOSENSORY VIBRATOR FOR A WEARABLE DEVICE
20240214731 · 2024-06-27

A somatosensory vibrator for a wearable device is disclosed. The wearable device includes a wearable element, and the somatosensory vibrator includes a body, a speaker unit, and a communication device. The body has a coupling portion configured to couple with the wearable element. The speaker unit is arranged on the body. The communication device is electrically connected to the speaker unit and configured to cause the speaker unit to generate a vibration in response to a first signal.

Methods and apparatus to perform audio watermarking and watermark detection and extraction

Example methods and apparatus to perform audio watermarking and watermark detection and extraction are disclosed herein. An example apparatus disclosed herein includes memory, computer readable instructions, and processor circuitry to execute the computer readable instructions to at least detect a first symbol, a second symbol, a third symbol, and a fourth symbol sequentially in encoded audio samples, determine whether the first symbol is a synchronization symbol, in response to a determination that the first symbol is a synchronization symbol, determine that the first symbol and the third symbol are associated with a first message and the second symbol and the fourth symbol are associated with a second message, and output at least one of the first message or the second message.
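The symbol assignment described above amounts to de-interleaving two messages once a synchronization symbol is found in the first position. A minimal sketch, assuming symbols are short strings and that "S" marks synchronization (both are hypothetical choices, as is returning `None` when no synchronization symbol leads the sequence):

```python
SYNC = "S"  # hypothetical synchronization symbol


def split_messages(symbols, sync=SYNC):
    """De-interleave sequentially detected symbols into two messages.

    If the first symbol is the synchronization symbol, symbols 1, 3, ...
    form the first message and symbols 2, 4, ... form the second.
    Returns None otherwise; the abstract does not cover that case.
    """
    if not symbols or symbols[0] != sync:
        return None
    return symbols[0::2], symbols[1::2]


detected = ["S", "a", "b", "c"]
messages = split_messages(detected)
```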

Encoding and decoding of audio signals

An audio signal (X) is represented by a bitstream (B) segmented into frames. An audio processing system (500) comprises a buffer (510) and a decoding section (520). The buffer joins sets of audio data (D_1, D_2, ..., D_N) carried by N respective frames (F_1, F_2, ..., F_N) into one decodable set of audio data (D) corresponding to a first frame rate and to a first number of samples of the audio signal per frame. The frames have a second frame rate corresponding to a second number of samples of the audio signal per frame. The first number of samples is N times the second number of samples. The decoding section decodes the decodable set of audio data into a segment of the audio signal by at least employing signal synthesis, based on the decodable set of audio data, with a stride corresponding to the first number of samples of the audio signal.
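The buffering behaviour can be pictured with a small sketch: N short frames are collected, then released as one decodable set. Joining by byte concatenation and the class name `FrameJoiner` are assumptions for illustration; the abstract does not state how the sets of audio data are combined.

```python
class FrameJoiner:
    """Collect the payloads of N consecutive frames into one decodable set."""

    def __init__(self, n):
        self.n = n
        self.pending = []

    def push(self, frame_data):
        """Add one frame payload; return the joined set once N frames have arrived."""
        self.pending.append(frame_data)
        if len(self.pending) < self.n:
            return None
        joined = b"".join(self.pending)  # assumed joining rule: plain concatenation
        self.pending = []
        return joined


joiner = FrameJoiner(n=3)
results = [joiner.push(payload) for payload in (b"ab", b"cd", b"ef", b"gh")]
```

With N = 3, only every third call yields data, so the decoder runs once per N transport frames on a set covering N times as many samples.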

AUDIO METADATA PROVIDING APPARATUS AND METHOD, AND MULTICHANNEL AUDIO DATA PLAYBACK APPARATUS AND METHOD TO SUPPORT DYNAMIC FORMAT CONVERSION

An audio metadata providing apparatus and method and a multichannel audio data playback apparatus and method to support a dynamic format conversion are provided. Dynamic format conversion information may include information about a plurality of format conversion schemes that are used to convert a first format set by an author of multichannel audio data into a second format that is based on a playback environment of the multichannel audio data and that are each set for corresponding playback periods of the multichannel audio data. The audio metadata providing apparatus may provide audio metadata including the dynamic format conversion information. The multichannel audio data playback apparatus may identify the dynamic format conversion information from the audio metadata, may convert the first format of the multichannel audio data into the second format based on the identified dynamic format conversion information, and may play back the multichannel audio data in the second format.
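One way to picture dynamic format conversion information is as a list of playback periods, each paired with its own conversion scheme. The sketch below looks up the scheme for a playback time and applies it as a downmix matrix. The tuple layout, the matrix coefficients, and the function names are illustrative assumptions, not the metadata syntax of the disclosure.

```python
def scheme_for_time(conversion_info, t):
    """Return the conversion scheme whose playback period [start, end) contains t."""
    for start, end, scheme in conversion_info:
        if start <= t < end:
            return scheme
    return None


def convert_frame(frame, matrix):
    """Apply a conversion matrix (one row per output channel) to one multichannel sample."""
    return [sum(c * x for c, x in zip(row, frame)) for row in matrix]


# Two hypothetical schemes converting (L, R, C) to stereo, set for different periods.
CENTER_HALF = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]
CENTER_DROPPED = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
conversion_info = [(0.0, 10.0, CENTER_HALF), (10.0, 20.0, CENTER_DROPPED)]

sample = [1.0, 0.0, 1.0]
early = convert_frame(sample, scheme_for_time(conversion_info, 2.5))
late = convert_frame(sample, scheme_for_time(conversion_info, 12.0))
```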

METHODS, APPARATUS AND ARTICLES OF MANUFACTURE TO IDENTIFY SOURCES OF NETWORK STREAMING SERVICES
20190122673 · 2019-04-25

Methods, apparatus and articles of manufacture to identify sources of network streaming services are disclosed. An example apparatus includes a coding format identifier to identify, from a received first audio signal representing a decompressed second audio signal, an audio compression configuration used to compress a third audio signal to form the second audio signal, and a source identifier to identify a source of the second audio signal based on the identified audio compression configuration.

Optimized mixing of audio streams encoded by sub-band encoding
10242683 · 2019-03-26

The invention relates to a method for mixing a plurality of audio streams coded according to a frequency sub-band coding, comprising the steps of decoding (E201) a part of the coded streams over at least a first frequency sub-band and summing (E202) the streams thus decoded so as to form at least a first mixed stream. The method further comprises the steps of detecting (E203), over at least a second frequency sub-band different from the at least first sub-band, the presence of a predetermined frequency band in the plurality of coded audio streams, and summing (E205) the decoded audio streams (E204) for which the presence of the predetermined frequency band has been detected, over said at least a second sub-band, so as to form at least a second mixed stream. The invention also relates to a mixing device implementing the method described, which may be integrated into a conference bridge, a communications terminal or a communications gateway.
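The two-stage mix can be sketched as follows: every stream contributes to the first sub-band mix, while only streams in which the second band was detected contribute to the second. Here each stream is a dict whose `"high"` entry is `None` when the band is absent; this representation and the function names are assumptions for illustration, and actual decoding is left out.

```python
def sum_signals(signals, length):
    """Sum equal-length sample lists element by element."""
    total = [0.0] * length
    for signal in signals:
        for i, x in enumerate(signal):
            total[i] += x
    return total


def mix_streams(streams, length):
    """Mix the first sub-band of all streams and the second sub-band where present."""
    low_mix = sum_signals((s["low"] for s in streams), length)
    # Only streams with a detected second band are considered further.
    with_high = [s for s in streams if s["high"] is not None]
    high_mix = sum_signals((s["high"] for s in with_high), length)
    return low_mix, high_mix


streams = [
    {"low": [1.0, 2.0], "high": None},          # narrowband participant
    {"low": [0.5, 0.5], "high": [0.25, 0.0]},   # wideband participant
]
low_mix, high_mix = mix_streams(streams, length=2)
```

Skipping streams without the second band avoids decoding work that would contribute nothing to the mix.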

METHODS AND APPARATUS TO PERFORM AUDIO WATERMARKING AND WATERMARK DETECTION AND EXTRACTION
20190074021 · 2019-03-07

Example methods and apparatus to perform audio watermarking and watermark detection and extraction are disclosed herein. Example methods disclosed herein include determining a first watermark symbol encoded in encoded audio samples and storing the first watermark symbol in tangible memory. Disclosed example methods also include determining a second watermark symbol encoded in the encoded audio samples and storing the second watermark symbol in the tangible memory. Disclosed example methods further include, in response to determining that the first watermark symbol matches the second watermark symbol, outputting the first watermark symbol.

GAME STREAMING WITH SPATIAL AUDIO

A game engine may generate video and audio content on a per-frame basis. Audio data corresponding to a current frame may be generated to comprise sound-field information independent of a speaker configuration or spatialization technology that may be used to play the associated audio. The sound-field may be generated based on monaural audio data corresponding to a sound produced by an in-game object at the object's position as of the current frame. The sound-field information may be transmitted to a remote computing device for reproduction using a selected, available speaker configuration and spatialization technology.
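First-order Ambisonics is one well-known speaker-independent sound-field representation, and it serves here as a stand-in; the abstract does not name a specific format. The sketch encodes one monaural sample from an object at a listener-relative position into four channels in ACN order (W, Y, Z, X) with SN3D normalization. The axis convention (x forward, y left, z up) and the function name are assumptions.

```python
import math


def encode_object(sample, position):
    """Encode a mono sample at a listener-relative (x, y, z) position.

    Returns first-order Ambisonic channels in ACN order: (W, Y, Z, X).
    Distance attenuation is ignored in this sketch.
    """
    px, py, pz = position
    azimuth = math.atan2(py, px)                    # counter-clockwise from front
    elevation = math.atan2(pz, math.hypot(px, py))  # upward from the horizontal plane
    w = sample
    y = sample * math.sin(azimuth) * math.cos(elevation)
    z = sample * math.sin(elevation)
    x = sample * math.cos(azimuth) * math.cos(elevation)
    return (w, y, z, x)


front = encode_object(1.0, (1.0, 0.0, 0.0))
left = encode_object(1.0, (0.0, 1.0, 0.0))
```

Because the four channels describe the sound field rather than speaker feeds, the receiving device can decode them for whichever speaker layout or headphone renderer it has available.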