Patent classifications
H04S3/002
METHOD AND APPARATUS FOR BINAURAL RENDERING AUDIO SIGNAL USING VARIABLE ORDER FILTERING IN FREQUENCY DOMAIN
The present invention relates to a method and an apparatus for binaural rendering of an audio signal using variable order filtering in the frequency domain. To this end, provided are a method for processing an audio signal, including: receiving an input audio signal; receiving a set of truncated subband filter coefficients for filtering each subband signal of the input audio signal, the set of truncated subband filter coefficients being constituted by one or more FFT filter coefficients generated by performing an FFT with a predetermined block size; generating at least one subframe for each subband; generating at least one filtered subframe for each subband; performing an inverse FFT on the filtered subframe for each subband; and generating a filtered subband signal by overlap-adding the transformed subframe for each subband, and an apparatus for processing an audio signal using the same.
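The subframe, FFT, multiply, inverse FFT, and overlap-add steps listed in this abstract follow the general pattern of block-wise fast convolution. The following is a minimal sketch of that pattern for a single subband, not the patented method itself. It assumes a single FFT filter block (the abstract allows one or more), a power-of-two block size, and a small recursive FFT written only to keep the example free of external libraries. The names `fft`, `ifft`, and `overlap_add_filter` are hypothetical.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unscaled); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(spectrum):
    """Inverse FFT with the usual 1/N scaling."""
    n = len(spectrum)
    return [v / n for v in fft(spectrum, inverse=True)]


def overlap_add_filter(signal, taps, block=4):
    """Filter one subband signal with `taps` via block-wise FFT and overlap-add.

    Assumption: the (truncated) filter fits in one FFT block, i.e.
    len(taps) <= block + 1, so each zero-padded product is a linear convolution.
    """
    assert block > 0 and (block & (block - 1)) == 0, "block must be a power of two"
    assert len(taps) <= block + 1
    fft_size = 2 * block
    # FFT filter coefficients for the predetermined block size.
    h_fft = fft(list(taps) + [0.0] * (fft_size - len(taps)))
    out = [0j] * (len(signal) + fft_size)
    for start in range(0, len(signal), block):
        # Subframe: one block of samples, zero-padded to the FFT size.
        sub = list(signal[start:start + block])
        sub += [0.0] * (fft_size - len(sub))
        # Filtered subframe: pointwise product in the frequency domain.
        filtered = [a * b for a, b in zip(fft(sub), h_fft)]
        # Inverse FFT, then overlap-add into the output buffer.
        for i, v in enumerate(ifft(filtered)):
            out[start + i] += v
    return out[:len(signal) + len(taps) - 1]
```

For a real-valued input the imaginary parts of the result are numerical noise; subband signals from a complex filter bank can be passed in unchanged.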
Audio reproduction system and method for reproducing audio data of at least one audio object
An audio reproduction system for reproducing audio data of at least one audio object and/or at least one sound source of an acoustic scene in a given environment comprising: at least two audio systems acting distantly apart from each other, wherein one of the audio systems is adapted to reproduce the audio object and/or the sound source in a first distance range to a listener and another of the audio systems is adapted to reproduce the audio object and/or the sound source in a second distance range to the listener, wherein the first and second distance ranges are different and possibly spaced apart from each other or placed adjacent to each other; and a panning information provider adapted to process at least one input to generate at least one panning information for each audio system to drive the at least two audio systems.
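As an illustration of what panning information for two distance ranges could look like, the sketch below computes one gain per audio system from the distance of an audio object to the listener. The abstract does not specify a panning law. The equal-power crossfade, the range limits, and the function name `panning_gains` are all assumptions made for this example.

```python
import math


def panning_gains(distance, near_limit=1.0, far_limit=3.0):
    """Return (near_system_gain, far_system_gain) for an object at `distance`.

    Assumed model: the near system covers distances up to near_limit, the far
    system covers distances from far_limit on, and an equal-power crossfade
    is applied between the two limits.
    """
    if distance <= near_limit:
        t = 0.0
    elif distance >= far_limit:
        t = 1.0
    else:
        t = (distance - near_limit) / (far_limit - near_limit)
    return math.cos(t * math.pi / 2), math.sin(t * math.pi / 2)
```

With these assumed limits, an object halfway between the two ranges drives both systems at the same gain, and the summed power stays constant across the crossfade.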
Encoding device and method, decoding device and method, and program
The present technique relates to an encoding device and method, a decoding device and method, and a program capable of obtaining higher quality audio. An encoding unit encodes position information and a gain of an object in a current frame in multiple encoding modes. A compressing unit generates, for each combination of encoding modes of each piece of position information and gain, encoded meta data including encoding mode information indicating the encoding modes and encoded data, which are the encoded position information and gain, and compresses the encoding mode information included in the encoded meta data. A determining unit selects the encoded meta data with the smallest amount of data from among the encoded meta data generated for each combination, thus determining the encoding mode of each piece of position information and gain. The present technique can be applied to an encoder and a decoder.
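The selection step in this abstract amounts to an exhaustive search over encoding-mode combinations, keeping the combination whose encoded meta data is smallest. The sketch below illustrates that search. The two modes (`raw`, `delta`), their bit costs, and the rule used to stand in for compressing the mode information are hypothetical and are not taken from the patent.

```python
from itertools import product


def raw_bits(cur, prev):
    """Hypothetical fixed-width mode: always 8 bits."""
    return 8


def delta_bits(cur, prev):
    """Hypothetical differential mode: cheap when the value barely changes."""
    d = abs(cur - prev)
    return 1 if d == 0 else 3 + d.bit_length()


MODES = {"raw": raw_bits, "delta": delta_bits}


def mode_info_bits(combo):
    """Stand-in for compressed mode information (an assumed scheme).

    If all fields share one mode: 1 flag bit + 1 mode bit.
    Otherwise: 1 flag bit + 1 mode bit per field.
    """
    return 2 if len(set(combo)) == 1 else 1 + len(combo)


def choose_modes(cur, prev):
    """Try every mode combination and return (total_bits, {field: mode})."""
    fields = sorted(cur)
    best = None
    for combo in product(MODES, repeat=len(fields)):
        data = sum(MODES[m](cur[f], prev[f]) for f, m in zip(fields, combo))
        total = mode_info_bits(combo) + data
        if best is None or total < best[0]:
            best = (total, dict(zip(fields, combo)))
    return best
```

The search grows exponentially with the number of fields, which is acceptable here only because a single object carries a handful of position components and one gain.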
Signal processor and signal processing method
A signal processor includes an input unit that receives a first audio signal and a second audio signal including mutually correlated components, a delay unit that delays the first audio signal received at the input unit by a prescribed delay time, a synthesis unit that synthesizes the first audio signal having been delayed by the delay unit with the second audio signal received at the input unit, and outputs a third audio signal resulting from synthesis, and a frequency band restriction unit that restricts a level of the first audio signal before the synthesis in a prescribed frequency band including a frequency of a dip occurring at a lowest frequency among a plurality of dips occurring in a frequency characteristic of the third audio signal as a result of the synthesis performed by the synthesis unit.
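The dips mentioned in this abstract are the notches of a comb filter. In the idealized case where the two signals are identical and equal in level (an assumption, since the abstract only says they include correlated components), summing a signal with a copy delayed by a time tau gives notches at f = (2k + 1) / (2 * tau), so the lowest dip lies at 1 / (2 * tau). The sketch below computes those frequencies and shows how lowering the level of the delayed signal fills in the notch. The function names are hypothetical.

```python
import cmath
import math


def dip_frequencies(delay_s, f_max):
    """Notch frequencies (Hz) of y(t) = x(t) + x(t - delay_s), up to f_max."""
    dips = []
    k = 0
    while (2 * k + 1) / (2 * delay_s) <= f_max:
        dips.append((2 * k + 1) / (2 * delay_s))
        k += 1
    return dips


def combined_gain(freq_hz, delay_s, first_gain=1.0):
    """Magnitude of the sum when the delayed first signal is scaled by first_gain.

    first_gain < 1 models restricting the level of the first signal in a band
    around the dip before synthesis.
    """
    return abs(1 + first_gain * cmath.exp(-2j * math.pi * freq_hz * delay_s))
```

For a 1 ms delay the lowest dip falls at 500 Hz. At that frequency the unrestricted sum cancels almost completely, while scaling the delayed signal to 0.25 leaves a combined gain of about 0.75.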
Systems, methods, apparatus, and computer-readable media for backward-compatible audio coding
Systems, methods, and apparatus for backward-compatible coding of a set of basis function coefficients that describe a sound field are presented.
AUDIO OUTPUT CONTROLLING METHOD BASED ON ORIENTATION OF AUDIO OUTPUT APPARATUS AND AUDIO OUTPUT APPARATUS FOR CONTROLLING AUDIO OUTPUT BASED ON ORIENTATION THEREOF
Provided is a method of controlling an audio output according to an orientation of an audio output apparatus, performed by the audio output apparatus, the method including receiving a stereo signal; detecting the orientation of the audio output apparatus using a sensor; outputting the stereo signal to a left speaker unit and a right speaker unit from a viewpoint of a user who views a front surface portion of the audio output apparatus on which two speaker units are arranged; and down-mixing the stereo signal and outputting the down mixed signal to at least one among an upper speaker unit and a lower speaker unit from the user's viewpoint.
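One way to picture the routing described in this abstract is a device with one speaker on each edge, rotated in 90-degree steps. This four-speaker layout, the clockwise rotation convention, the simple (L + R) / 2 down-mix, and the function name `route` are assumptions for illustration only. The abstract does not fix any of them.

```python
def route(left, right, rotation_deg):
    """Return one sample per physical speaker: [top, right, bottom, left] edges.

    Physical speakers are indexed by the edge they sit on when the device is
    unrotated. rotation_deg is the clockwise rotation reported by the sensor,
    assumed to be a multiple of 90.
    """
    r = (rotation_deg // 90) % 4
    mono = 0.5 * (left + right)  # down-mixed signal for upper/lower speakers
    # Signal wanted at each position as seen by the user:
    # 0 = top, 1 = right, 2 = bottom, 3 = left.
    by_view = {0: mono, 1: right, 2: mono, 3: left}
    # After a clockwise rotation of r quarter turns, physical edge i appears
    # at viewer position (i + r) % 4.
    return [by_view[(i + r) % 4] for i in range(4)]
```

After a 90-degree clockwise turn, the speaker on the physical bottom edge sits on the user's left and therefore receives the left channel, while the physical left and right edges become upper and lower speakers and receive the down-mix.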
ADAPTIVE AUDIO RENDERING
The techniques disclosed herein can enable a system to coordinate the processing of object-based audio and channel-based audio generated by multiple applications. The system determines a spatialization technology to utilize based on contextual data. In some configurations, the contextual data can indicate the capabilities of one or more computing resources. In some configurations, the contextual data can also indicate preferences. The preferences, for example, can indicate user preferences for a type of spatialization technology, e.g., Dolby Atmos, over another type of spatialization technology, e.g., DTS:X. Based on the contextual data, the system can select a spatialization technology and a corresponding encoder to process the input signals to generate a spatially encoded stream that appropriately renders the audio of multiple applications to an available output device. The techniques disclosed herein also allow a system to dynamically change the spatialization technologies during use.
ADAPTIVE PANNER OF AUDIO OBJECTS
An audio object including audio content and object metadata is received. The object metadata indicates an object spatial position of the audio object to be rendered by audio speakers in a playback environment. Based on the object spatial position and source spatial positions of the audio speakers, initial gain values for the audio speakers are determined. The initial gain values can be used to select a set of audio speakers from among the audio speakers. Based on the object spatial position and a set of source spatial positions at which the set of audio speakers are respectively located in the playback environment, a set of non-negative optimized gain values for the set of audio speakers is determined. The audio object at the object spatial position is rendered with the set of optimized gain values for the set of audio speakers.
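The three stages in this abstract (initial gains, speaker selection, non-negative gains for the selected set) can be illustrated in two dimensions. The sketch below is a simplified stand-in, not the optimization claimed in the patent: it uses clipped cosine similarity for the initial gains, keeps the two best speakers, and then solves a pairwise vector-base amplitude panning (VBAP-style) system, clamping negative gains to zero. The function name `pan` and the use of azimuth-only positions are assumptions.

```python
import math


def pan(object_az_deg, speaker_az_deg):
    """Return one non-negative, power-normalized gain per speaker (2D sketch)."""
    px = math.cos(math.radians(object_az_deg))
    py = math.sin(math.radians(object_az_deg))
    dirs = [(math.cos(math.radians(a)), math.sin(math.radians(a)))
            for a in speaker_az_deg]

    # Stage 1: initial gains from clipped cosine similarity with the object.
    init = [max(0.0, dx * px + dy * py) for dx, dy in dirs]

    # Stage 2: select the two speakers with the largest initial gains.
    a, b = sorted(range(len(dirs)), key=lambda i: init[i], reverse=True)[:2]
    (ax, ay), (bx, by) = dirs[a], dirs[b]

    # Stage 3: solve g_a * dir_a + g_b * dir_b = object direction.
    det = ax * by - ay * bx
    if abs(det) < 1e-9:
        ga, gb = 1.0, 0.0  # degenerate pair: fall back to the best speaker
    else:
        ga = (px * by - py * bx) / det
        gb = (ax * py - ay * px) / det
    ga, gb = max(0.0, ga), max(0.0, gb)  # enforce non-negative gains
    norm = math.hypot(ga, gb) or 1.0

    gains = [0.0] * len(dirs)
    gains[a] = ga / norm
    gains[b] = gb / norm
    return gains
```

An object centered between speakers at -30 and +30 degrees gets equal gains of about 0.707 on those two speakers. An object located exactly at a speaker gets a gain of 1 on that speaker alone.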
Method and apparatus for rendering sound signal, and computer-readable recording medium
A method of reproducing a multi-channel audio signal including an elevation sound signal in a horizontal layout environment is provided. The method obtains a rendering parameter according to a rendering type and configures a down-mix matrix, so that effective rendering performance may be obtained even for an audio signal that is not suitable for virtual rendering. A method of rendering an audio signal includes receiving a multi-channel signal including a plurality of input channels to be converted into a plurality of output channels; determining a rendering type for elevation rendering based on a parameter determined from a characteristic of the multi-channel signal; and rendering at least one height input channel according to the determined rendering type, wherein the parameter is included in a bitstream of the multi-channel signal.
Voice audio rendering augmentation
An audio rendering device enhances voice audio such that audible voice is not overwhelmed by other aspects of the soundtrack. The device attenuates right and left channels in an audio stream in response to a detected voice component in the audio stream, and boosts the voice component in the audio stream based on the level of attenuation of the right and left channels. Voice components are distinguished from the non-voice components by separating center channel and mono information from the left, right and surround channels. Non-voice components are attenuated down towards a non-voice threshold level based on an attenuation ratio. Voice components are boosted up toward a voice threshold level, so that the spoken voice is more audible to viewers and not overwhelmed or drowned out by the non-voice aspects of the soundtrack.
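The level logic in this abstract resembles a downward compressor on the non-voice components paired with a bounded boost on the voice components. The sketch below works on levels in dB and is only one plausible reading. The threshold values, the default ratio, the rule that the voice boost equals the applied attenuation, and the function name `enhance` are assumptions. Voice detection and the separation of center and mono content are outside the scope of the example.

```python
def enhance(voice_db, nonvoice_db,
            nonvoice_threshold_db=-30.0, voice_threshold_db=-12.0, ratio=2.0):
    """Return (new_voice_db, new_nonvoice_db) for one analysis frame.

    Non-voice level above its threshold is pulled down toward the threshold
    by `ratio`. Voice is boosted by the same number of dB, but never past
    the voice threshold.
    """
    if nonvoice_db > nonvoice_threshold_db:
        new_nonvoice = nonvoice_threshold_db + (nonvoice_db - nonvoice_threshold_db) / ratio
    else:
        new_nonvoice = nonvoice_db
    attenuation = nonvoice_db - new_nonvoice

    if voice_db < voice_threshold_db:
        new_voice = min(voice_threshold_db, voice_db + attenuation)
    else:
        new_voice = voice_db
    return new_voice, new_nonvoice
```

With the assumed defaults, a frame with voice at -20 dB and non-voice at -10 dB ends up with non-voice at -20 dB and voice capped at the -12 dB voice threshold.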