Patent classifications
H04S3/00
APPARATUS AND METHOD FOR ENCODING A PLURALITY OF AUDIO OBJECTS USING DIRECTION INFORMATION DURING A DOWNMIXING OR APPARATUS AND METHOD FOR DECODING USING AN OPTIMIZED COVARIANCE SYNTHESIS
An apparatus for encoding a plurality of audio objects and related metadata indicating direction information on the plurality of audio objects includes: a downmixer for downmixing the plurality of audio objects to obtain one or more transport channels; a transport channel encoder for encoding the one or more transport channels to obtain one or more encoded transport channels; and an output interface for outputting an encoded audio signal comprising the one or more encoded transport channels, wherein the downmixer is configured to downmix the plurality of audio objects in response to the direction information on the plurality of audio objects.
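The direction-dependent downmix described above can be illustrated with a minimal sketch. The abstract does not specify the panning law or channel count, so this assumes two transport channels and a constant-power sine/cosine pan driven by each object's azimuth; all names here are hypothetical.

```python
import math

def downmix(objects, azimuths_deg):
    """Downmix N audio objects to 2 transport channels (L, R),
    weighting each object by constant-power gains derived from
    its azimuth (0 deg = front, +90 = left, -90 = right)."""
    n = len(objects[0])
    left = [0.0] * n
    right = [0.0] * n
    for samples, az in zip(objects, azimuths_deg):
        # map azimuth [-90, 90] onto a pan angle in [0, pi/2]
        theta = (az + 90.0) / 180.0 * (math.pi / 2.0)
        gl, gr = math.sin(theta), math.cos(theta)
        for i, s in enumerate(samples):
            left[i] += gl * s
            right[i] += gr * s
    return left, right
```

An object at azimuth +90 degrees is routed entirely to the left transport channel, one at 0 degrees equally to both.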
VIDEO PROCESSING DEVICE AND METHOD
A video processing apparatus includes a memory storing instructions, and at least one processor configured to execute the instructions to generate a plurality of feature information by analyzing a video signal comprising a plurality of images based on a first DNN, extract a first altitude component and a first planar component corresponding to a movement of an object in a video from the video signal based on a second DNN, extract a second planar component corresponding to a movement of a sound source in audio from a first audio signal based on a third DNN, generate a second altitude component based on the first altitude component, the first planar component, and the second planar component, output a second audio signal comprising the second altitude component based on the feature information, and synchronize the second audio signal with the video signal and output the synchronized second audio signal and video signal.
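The DNN stages themselves cannot be sketched from the abstract, but the fusion step (deriving the second altitude component from the first altitude component and the two planar components) can be illustrated. This toy stand-in scales the visually extracted altitude trajectory by the ratio of audio-to-video planar movement; the scaling rule is an assumption, not the patent's method.

```python
def estimate_audio_altitude(video_alt, video_plane, audio_plane):
    """Toy stand-in for the fusion step: scale the visual altitude
    trajectory by the ratio of audio-to-video planar movement, so the
    synthesized audio altitude tracks the visually observed elevation."""
    eps = 1e-9  # guard against a motionless video plane
    scale = sum(abs(a) for a in audio_plane) / (sum(abs(v) for v in video_plane) + eps)
    return [scale * a for a in video_alt]
```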
SIGNAL PROCESSING DEVICE, METHOD, AND PROGRAM
The present technology relates to a signal processing device, a method, and a program capable of improving transmission efficiency and reducing the amount of data processing. A signal processing device includes: an acquisition unit that acquires polar coordinate position information indicating a position of a first object expressed by polar coordinates, audio data of the first object, absolute coordinate position information indicating a position of a second object expressed by absolute coordinates, and audio data of the second object; a coordinate conversion unit that converts the absolute coordinate position information into polar coordinate position information indicating a position of the second object; and a rendering processing unit that performs rendering processing on the basis of the polar coordinate position information and the audio data of the first object and the polar coordinate position information and the audio data of the second object. The present technology can be applied to a content reproduction system.
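The coordinate conversion unit's job, mapping an absolute (Cartesian) position into the polar form the renderer consumes, can be sketched as follows. The axis convention is an assumption (the abstract does not fix one).

```python
import math

def cartesian_to_polar(x, y, z):
    """Convert an absolute (Cartesian) object position to the polar
    form (azimuth, elevation, radius) used by the renderer.
    Axes assumed: x = right, y = front, z = up."""
    radius = math.sqrt(x * x + y * y + z * z)
    azimuth = math.degrees(math.atan2(x, y))  # 0 = front, +90 = right
    elevation = math.degrees(math.asin(z / radius)) if radius > 0 else 0.0
    return azimuth, elevation, radius
```

With this convention, a second object straight ahead at unit distance converts to azimuth 0, elevation 0, radius 1, and can then be rendered alongside the first object's native polar position.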
INFORMATION PROCESSING DEVICE, CONTROL METHOD, AND NON-TRANSITORY COMPUTER-READABLE MEDIUM
An information processing apparatus, a control method, and a control program capable of providing an experience close to one experienced in a real space to a user are provided. An information processing apparatus includes: an acquisition unit configured to acquire terminal position information of a communication terminal; a holding unit configured to hold a predetermined area and acoustic-image localization position information of an audio content to be output to the communication terminal while associating them with each other; a generation unit configured to generate acoustic-image localization information based on the acoustic-image localization position information and the terminal position information when the terminal position information is included in the predetermined area; and an output unit configured to output the acoustic-image localization information.
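The generation unit's behavior, producing acoustic-image localization information only when the terminal is inside the predetermined area, can be sketched in 2D. The axis-aligned rectangular area and the returned direction/distance fields are illustrative assumptions.

```python
import math

def acoustic_image_info(terminal_pos, area, image_pos):
    """If the terminal is inside the (axis-aligned) predetermined area,
    return direction and distance from the terminal to the stored
    acoustic-image localization position; otherwise None.
    Positions are (x, y) tuples; `area` is (xmin, ymin, xmax, ymax)."""
    tx, ty = terminal_pos
    xmin, ymin, xmax, ymax = area
    if not (xmin <= tx <= xmax and ymin <= ty <= ymax):
        return None
    ix, iy = image_pos
    dx, dy = ix - tx, iy - ty
    return {"azimuth_deg": math.degrees(math.atan2(dx, dy)),
            "distance": math.hypot(dx, dy)}
```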
ACOUSTIC REPRODUCTION METHOD, ACOUSTIC REPRODUCTION DEVICE, AND RECORDING MEDIUM
An acoustic reproduction method includes: localizing a first sound image at a first position in a target space in which a user is present; and localizing, at a second position in the target space, a second sound image that represents an anchor sound for indicating a reference position.
Augmented hearing system
Some implementations may involve receiving, via an interface system, personnel location data indicating a location of at least one person and receiving, from an orientation system, headset orientation data corresponding with the orientation of a headset. First environmental element location data, indicating a location of at least a first environmental element, may be determined. Based at least in part on the headset orientation data, the personnel location data and the first environmental element location data, headset coordinate locations of at least one person and at least the first environmental element in a headset coordinate system corresponding with the orientation of the headset may be determined. An apparatus may be caused to provide spatialization indications of the headset coordinate locations. Providing the spatialization indications may involve controlling a speaker system to provide environmental element sonification corresponding with at least the first environmental element location data.
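Determining headset coordinate locations from world-frame personnel or environmental-element locations reduces to a translation plus a rotation by the headset orientation. A minimal 2D sketch, assuming yaw is counter-clockwise in degrees and that at yaw 0 the headset faces +y (conventions the abstract does not specify):

```python
import math

def to_headset_coords(world_pos, headset_pos, headset_yaw_deg):
    """Express a world-frame (x, y) location in headset coordinates
    by translating to the headset origin and rotating the relative
    vector by -yaw into the headset frame."""
    dx = world_pos[0] - headset_pos[0]
    dy = world_pos[1] - headset_pos[1]
    yaw = math.radians(headset_yaw_deg)
    hx = dx * math.cos(yaw) + dy * math.sin(yaw)
    hy = -dx * math.sin(yaw) + dy * math.cos(yaw)
    return hx, hy
```

A sonification renderer could then spatialize each person or environmental element at `(hx, hy)`, so a colleague the wearer has turned to face is heard straight ahead.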
Associated spatial audio playback
An apparatus including at least one processor and at least one memory including computer code for one or more programs, the at least one memory and the computer code configured, with the at least one processor, to cause the apparatus at least to: generate content lock information for a content lock, wherein the content lock information enables control of audio signal processing associated with audio signals related to one or more audio sources based on a position and/or orientation input.
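One common effect of such a content lock is to switch a source between world-locked rendering (direction compensated for head rotation) and head-locked rendering (direction fixed relative to the listener). This toy sketch illustrates that distinction; it is an assumed interpretation, not the claimed processing chain.

```python
def rendering_azimuth(source_az_deg, head_yaw_deg, content_locked):
    """Toy content lock: when unlocked, compensate the source azimuth
    for head rotation so it stays fixed in the world; when locked,
    the source follows the head. Result is wrapped to [-180, 180)."""
    if content_locked:
        return source_az_deg
    return (source_az_deg - head_yaw_deg + 180.0) % 360.0 - 180.0
```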
Inter-channel phase difference parameter encoding method and apparatus
This application discloses an inter-channel phase difference (IPD) parameter encoding method, including: obtaining a reference parameter used to determine an IPD parameter encoding scheme of a current frame of a multi-channel signal; determining the IPD parameter encoding scheme of the current frame based on the reference parameter, where the determined IPD parameter encoding scheme of the current frame is one of at least two preset IPD parameter encoding schemes; and processing an IPD parameter of the current frame based on the determined IPD parameter encoding scheme of the current frame. The technical solutions provided in this application can improve encoding quality of the multi-channel signal.
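The scheme-selection step can be sketched as follows. The two illustrative schemes (16-step vs. 4-step uniform quantization of the per-band IPD) and the bitrate threshold are assumptions; the patent's actual schemes and reference parameter are not specified here.

```python
import math

def encode_ipd(ipd_per_band, reference_bitrate_kbps):
    """Pick one of two illustrative IPD encoding schemes from a
    reference parameter (here, bitrate): fine uniform quantization
    at high rates, coarse quantization at low rates."""
    steps = 16 if reference_bitrate_kbps >= 32 else 4
    q = 2 * math.pi / steps  # quantization step over [-pi, pi)
    indices = [int(round(phi / q)) % steps for phi in ipd_per_band]
    return {"scheme": "A" if steps == 16 else "B", "indices": indices}
```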
Spatial audio for interactive audio environments
Systems and methods of presenting an output audio signal to a listener located at a first location in a virtual environment are disclosed. According to embodiments of a method, an input audio signal is received. For each sound source of a plurality of sound sources in the virtual environment, a respective first intermediate audio signal corresponding to the input audio signal is determined, based on a location of the respective sound source in the virtual environment, and the respective first intermediate audio signal is associated with a first bus. For each of the sound sources of the plurality of sound sources in the virtual environment, a respective second intermediate audio signal is determined. The respective second intermediate audio signal corresponds to a reverberation of the input audio signal in the virtual environment. The respective second intermediate audio signal is determined based on a location of the respective sound source, and further based on an acoustic property of the virtual environment. The respective second intermediate audio signal is associated with a second bus. The output audio signal is presented to the listener via the first bus and the second bus.
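The two-bus architecture above (a direct bus per source plus a reverb bus shaped by the room's acoustics) can be sketched per sample. The 1/distance dry law, the sqrt-distance reverb send, and the single room gain are schematic assumptions, not the patent's actual signal chain.

```python
import math

def mix_buses(sources, listener_pos, room_reverb_gain):
    """Per-source routing to a dry bus and a reverb bus: the dry gain
    falls off with distance to the listener; the reverb send also uses
    an acoustic property of the virtual room (a single gain here).
    `sources` is a list of (sample, (x, y)) pairs."""
    dry_bus, reverb_bus = 0.0, 0.0
    for sample, pos in sources:
        d = max(1.0, math.dist(pos, listener_pos))
        dry_bus += sample / d                            # 1/distance law
        reverb_bus += sample * room_reverb_gain / math.sqrt(d)
    return dry_bus, reverb_bus
```

Keeping the dry and reverberant paths on separate buses lets the renderer spatialize the first intermediate signals individually while applying one shared room response to the second.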