Patent classifications
G10L19/20
HYBRID EXPANSIVE FREQUENCY COMPRESSION FOR ENHANCING SPEECH PERCEPTION FOR INDIVIDUALS WITH HIGH-FREQUENCY HEARING LOSS
A method of audio signal processing comprising Hybrid Expansive Frequency Compression (hEFC) via a digital signal processor, wherein the method includes: classifying an audio signal input, which includes high-frequency frication speech energy, into two or more speech sound classes, followed by selecting a form of input-dependent frequency remapping function; and performing hEFC, including re-coding one or more input frequencies of the speech sound via the input-dependent frequency remapping function to generate an audio output signal, wherein the output signal is a representation of the audio signal input having a lower sound frequency.
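The two steps in this abstract (classify, then remap with a class-dependent function) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the class names, the energy-ratio classifier, the parameter table, and the log-domain compression rule, which is a generic frequency-lowering map and not the hEFC function itself.

```python
# Hypothetical per-class parameters (start frequency in Hz, compression ratio).
# These values are illustrative only and do not come from the patent.
REMAP_PARAMS = {
    "fricative": (2000.0, 0.5),
    "other": (4000.0, 0.8),
}


def classify_frame(high_band_energy, total_energy, threshold=0.5):
    """Toy classifier: label a frame by its share of high-frequency energy."""
    if total_energy <= 0.0:
        return "other"
    ratio = high_band_energy / total_energy
    return "fricative" if ratio >= threshold else "other"


def remap_frequency(f_in_hz, sound_class):
    """Map an input frequency to an output frequency that is never higher.

    Frequencies at or below the class's start frequency pass through
    unchanged; above it, the log-frequency distance from the start
    frequency is scaled by the class's compression ratio.
    """
    start_hz, ratio = REMAP_PARAMS[sound_class]
    if f_in_hz <= start_hz:
        return f_in_hz
    return start_hz * (f_in_hz / start_hz) ** ratio
```

With these assumed parameters, a frame dominated by high-band energy is labelled "fricative" and its 8 kHz component is moved down to 4 kHz, while a 1 kHz component is left where it is.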
Mapping virtual sound sources to physical speakers in extended reality applications
One or more embodiments include an audio processing system for generating an audio scene for an extended reality (XR) environment. The audio processing system determines that a first virtual sound source associated with the XR environment affects a sound in the audio scene. The audio processing system generates a sound component associated with the first virtual sound source based on a contribution of the first virtual sound source to the audio scene. The audio processing system maps the sound component to a first loudspeaker included in a plurality of loudspeakers. The audio processing system outputs at least a first portion of the component for playback on the first loudspeaker.
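As a rough illustration of mapping a sound component to one loudspeaker out of several, the sketch below picks the loudspeaker nearest to the virtual source. The nearest-speaker rule, the function names, and the coordinate convention are assumptions for illustration; the abstract does not say how the mapping is chosen.

```python
import math


def nearest_loudspeaker(source_pos, speaker_positions):
    """Return the index of the loudspeaker closest to the source position."""
    return min(
        range(len(speaker_positions)),
        key=lambda i: math.dist(source_pos, speaker_positions[i]),
    )


def render_component(samples, gain, source_pos, speaker_positions):
    """Route a source's sound component to a single loudspeaker.

    `gain` stands in for the source's contribution to the audio scene.
    Returns the chosen loudspeaker index and one output buffer per
    loudspeaker; only the chosen buffer is non-silent.
    """
    idx = nearest_loudspeaker(source_pos, speaker_positions)
    out = [[0.0] * len(samples) for _ in speaker_positions]
    out[idx] = [gain * s for s in samples]
    return idx, out
```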
SYSTEM AND METHOD FOR ADAPTIVE AUDIO SIGNAL GENERATION, CODING AND RENDERING
Embodiments are described for an adaptive audio system that processes audio data comprising a number of independent monophonic audio streams. One or more of the streams has associated with it metadata that specifies whether the stream is a channel-based or object-based stream. Channel-based streams have rendering information encoded by means of channel name; and the object-based streams have location information encoded through location expressions encoded in the associated metadata. A codec packages the independent audio streams into a single serial bitstream that contains all of the audio data. This configuration allows for the sound to be rendered according to an allocentric frame of reference, in which the rendering location of a sound is based on the characteristics of the playback environment (e.g., room size, shape, etc.) to correspond to the mixer's intent. The object position metadata contains the appropriate allocentric frame of reference information required to play the sound correctly using the available speaker positions in a room that is set up to play the adaptive audio content.
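The distinction between channel-based and object-based streams can be sketched as a small metadata record plus a lookup. The field names, the room-relative 0..1 coordinate convention, and the nearest-speaker choice are assumptions for illustration, not the format described in the patent.

```python
import math
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class StreamMetadata:
    kind: str                            # "channel" or "object"
    channel_name: Optional[str] = None   # used by channel-based streams
    position: Optional[Vec3] = None      # room-relative, each axis in 0..1


def target_speaker(meta: StreamMetadata, speakers: Dict[str, Vec3],
                   room_size: Vec3) -> str:
    """Pick a speaker name for one monophonic stream.

    Channel-based streams are routed by channel name. Object-based
    streams carry a room-relative position, which is scaled by the
    actual room size before the nearest speaker is chosen.
    """
    if meta.kind == "channel":
        if meta.channel_name not in speakers:
            raise KeyError(f"no speaker named {meta.channel_name!r}")
        return meta.channel_name
    if meta.kind == "object":
        if meta.position is None:
            raise ValueError("object-based stream needs a position")
        point = tuple(p * s for p, s in zip(meta.position, room_size))
        return min(speakers, key=lambda name: math.dist(point, speakers[name]))
    raise ValueError(f"unknown stream kind {meta.kind!r}")
```

Because the object position is stored relative to the room and not in absolute units, the same metadata resolves to sensible speakers in rooms of different sizes, which is the point of the allocentric frame of reference mentioned above.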
Signal processing device and method, and program
The present technology relates to a signal processing device and method, and a program making it possible to reduce the computational complexity of decoding at low cost. A signal processing device includes: a priority information generation unit configured to generate priority information about an audio object on the basis of a plurality of elements expressing a feature of the audio object. The present technology may be applied to an encoding device and a decoding device.
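One way to picture priority information built from several elements is a weighted combination of per-object features that is quantized to a small integer, which a decoder can then use to skip low-priority objects. The feature names, weights, number of levels, and threshold rule below are all hypothetical; the abstract does not specify them.

```python
def priority_from_features(features, weights, levels=8):
    """Combine normalized features (each in 0..1) into an integer priority.

    The weighted mean is clamped to 0..1 and quantized to the range
    0..levels-1, where a larger number means higher priority.
    """
    score = sum(weights[name] * features[name] for name in weights)
    total = sum(weights.values())
    normalized = min(max(score / total, 0.0), 1.0)
    return min(int(normalized * levels), levels - 1)


def objects_to_decode(priorities, min_priority):
    """Return indices of objects whose priority meets the threshold."""
    return [i for i, p in enumerate(priorities) if p >= min_priority]
```

Decoding only the indices returned by objects_to_decode is one simple way the computational cost of decoding could be reduced under these assumptions.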
Post filter for audio signals
In some embodiments, a pitch filter for filtering a preliminary audio signal generated from an audio bitstream is disclosed. The pitch filter has an operating mode selected from either (i) an active mode, in which the preliminary audio signal is filtered using filtering information to obtain a filtered audio signal, or (ii) an inactive mode, in which the pitch filter is disabled. The preliminary audio signal is generated in an audio encoder or audio decoder having a coding mode selected from at least two distinct coding modes, and the pitch filter is capable of being selectively operated in either the active mode or the inactive mode while operating in the coding mode based on control information.
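The active/inactive behaviour can be sketched with a simple feed-forward comb filter standing in for the pitch filter. The filter form, the normalization, and the parameter names (active, lag, gain) are assumptions for illustration; the abstract only says that filtering information and control information are used.

```python
def pitch_post_filter(x, active, lag=0, gain=0.0):
    """Apply a toy pitch post filter to a preliminary signal `x`.

    Inactive mode: the signal is passed through unchanged.
    Active mode: y[n] = (x[n] + gain * x[n - lag]) / (1 + gain),
    with samples before the start of the signal treated as zero.
    """
    if not active:
        return list(x)
    if lag <= 0:
        raise ValueError("lag must be a positive number of samples")
    y = []
    for n, sample in enumerate(x):
        delayed = x[n - lag] if n >= lag else 0.0
        y.append((sample + gain * delayed) / (1.0 + gain))
    return y
```

In this sketch the `active` flag plays the role of the control information, and `lag` and `gain` play the role of the filtering information.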
Low bitrate audio encoding/decoding scheme having cascaded switches
An audio encoder has a first, information-sink-oriented encoding branch such as a spectral-domain encoding branch; a second, information-source- or SNR-oriented encoding branch such as an LPC-domain encoding branch; and a switch for switching between the first and second encoding branches. The second encoding branch has a converter into a specific domain different from the spectral domain, such as an LPC analysis stage generating an excitation signal. The second encoding branch also has a specific-domain coding branch such as an LPC-domain processing branch, a specific spectral-domain coding branch such as an LPC spectral-domain processing branch, and an additional switch for switching between the specific-domain coding branch and the specific spectral-domain coding branch. An audio decoder has a first domain decoder, a second domain decoder, and a third domain decoder, as well as two cascaded switches for switching between the decoders.
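The cascaded-switch structure amounts to two nested decisions: the first switch chooses between the spectral-domain branch and the LPC-domain branch, and the second switch, reached only inside the LPC-domain branch, chooses between its two sub-branches. The sketch below shows only that routing; the flag names are hypothetical, and how each decision is made is not covered.

```python
from enum import Enum


class Branch(Enum):
    SPECTRAL = "spectral-domain"
    LPC_DOMAIN = "LPC-domain"
    LPC_SPECTRAL = "LPC spectral-domain"


def select_branch(use_lpc_branch, use_lpc_spectral):
    """Resolve two cascaded switches to one of three coding branches."""
    # First switch: spectral-domain branch versus LPC-domain branch.
    if not use_lpc_branch:
        return Branch.SPECTRAL
    # Second switch: only consulted inside the LPC-domain branch.
    return Branch.LPC_SPECTRAL if use_lpc_spectral else Branch.LPC_DOMAIN
```

The second flag is ignored when the first switch selects the spectral-domain branch, which is why two binary switches yield three branches and not four. This matches the three domain decoders on the decoder side.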