Patent classifications
H04S3/008
Apparatus, Method, or Computer Program for Processing an Encoded Audio Scene using a Parameter Smoothing
Apparatus for processing an audio scene representing a sound field, the audio scene having information on a transport signal and a first set of parameters. The apparatus has a parameter processor for processing the first set of parameters to obtain a second set of parameters, wherein the parameter processor is configured to calculate at least one raw parameter for each output time frame using at least one parameter of the first set of parameters for the input time frame, to calculate a smoothing information such as a factor for each raw parameter in accordance with a smoothing rule, and to apply a corresponding smoothing information to the corresponding raw parameter to derive the parameter of the second set of parameters for the output time frame. The apparatus further has an output interface for generating a processed audio scene using the second set of parameters and the information on the transport signal.
Apparatus, Method, or Computer Program for Processing an Encoded Audio Scene using a Bandwidth Extension
Apparatus for processing an audio scene representing a sound field, the audio scene comprising information on a transport signal and a set of parameters. The apparatus comprising an output interface for generating a processed audio scene using the set of parameters and the information on the transport signal, wherein the output interface is configured to generate a raw representation of two or more channels using the set of parameters and the transport signal and a multichannel enhancer for generating an enhancement representation of the two or more channels using the transport signal, and a signal combiner for combining the raw representation of the two or more channels and the enhancement representation of the two or more channels to obtain the processed audio scene.
APPARATUS AND METHOD FOR ENCODING A PLURALITY OF AUDIO OBJECTS USING DIRECTION INFORMATION DURING A DOWNMIXING OR APPARATUS AND METHOD FOR DECODING USING AN OPTIMIZED COVARIANCE SYNTHESIS
An apparatus for encoding a plurality of audio objects and related metadata indicating direction information on the plurality of audio objects has: a downmixer for downmixing the plurality of audio objects to obtain one or more transport channels; a transport channel encoder for encoding one or more transport channels to obtain one or more encoded transport channels; and an output interface for outputting an encoded audio signal comprising the one or more encoded transport channels, wherein the downmixer is configured to downmix the plurality of audio objects in response to the direction information on the plurality of audio objects.
SIGNAL PROCESSING APPARATUS AND METHOD, AND PROGRAM TO REDUCE CALCULATION AMOUNT BASED ON MUTE INFORMATION
The present technology relates to a signal processing apparatus and method, and a program that make it possible to reduce an arithmetic operation amount.
The signal processing apparatus performs, on the basis of audio object mute information indicative of whether or not a signal of an audio object is a mute signal, at least either one of a decoding process or a rendering process of an object signal of the audio object. The present technology can be applied to a signal processing apparatus.
VIDEO PROCESSING DEVICE AND METHOD
A video processing apparatus includes a memory storing instructions, and at least one processor configured to execute the instructions to generate a plurality of feature information by analyzing a video signal comprising a plurality of images based on a first DNN, extract a first altitude component and a first planar component corresponding to a movement of an object in a video from the video signal based on a second DNN, extract a second planar component corresponding to a movement of a sound source in audio from a first audio signal based on a third DNN, generate a second altitude component based on the first altitude component, the first planar component, and the second planar component, output a second audio signal comprising the second altitude component based on the feature information, and synchronize the second audio signal with the video signal and output the synchronized second audio signal and video signal.
SIGNAL PROCESSING DEVICE, METHOD, AND PROGRAM
The present technology relates to a signal processing device, a method, and a program capable of improving transmission efficiency and efficiency in the data processing amount. A signal processing device includes: an acquisition unit that acquires polar coordinate position information indicating a position of a first object expressed by polar coordinates, audio data of the first object, absolute coordinate position information indicating a position of a second object expressed by absolute coordinates, and audio data of the second object; a coordinate conversion unit that converts the absolute coordinate position information into polar coordinate position information indicating a position of the second object; and a rendering processing unit that performs rendering processing on the basis of the polar coordinate position information and the audio data of the first object and the polar coordinate position information and the audio data of the second object. The present technology can be applied to a content reproduction system.
INFORMATION PROCESSING DEVICE, CONTROL METHOD, AND NON-TRANSITORY COMPUTER-READABLE MEDIUM
An information processing apparatus, a control method, and a control program capable of providing an experience close to one experienced in a real space to a user are provided. An information processing apparatus includes: an acquisition unit configured to acquire terminal position information of a communication terminal; a holding unit configured to hold a predetermined area and acoustic-image localization position information of an audio content to be output to the communication terminal while associating them with each other; a generation unit configured to generate acoustic-image localization information based on the acoustic-image localization position information and the terminal position information when the terminal position information is included in the predetermined area; and an output unit configured to output the acoustic-image localization information.
TRANSMISSION APPARATUS, TRANSMISSION METHOD, RECEPTION APPARATUS, AND RECEPTION METHOD
There is provided a transmission signal to enable tactile presentation at more positions than the number of channels of a tactile vibration signal that can be transmitted. The transmission signal includes audio signals of a predetermined number of channels and tactile presentation signals of a predetermined number of channels and to which metadata designating the tactile presentation position targeted by each of the tactile presentation signals of the predetermined number of channels is added is generated. The transmission signal is transmitted to a reception side via a predetermined transmission path. For example, the metadata is dynamically changed to dynamically change the tactile presentation position targeted by each of the tactile presentation signals of the predetermined number of channels.
ACOUSTIC REPRODUCTION METHOD, ACOUSTIC REPRODUCTION DEVICE, AND RECORDING MEDIUM
An acoustic reproduction method includes: localizing a first sound image at a first position in a target space in which a user is present; and localizing, at a second position in the target space, a second sound image that represents an anchor sound for indicating a reference position.
Augmented hearing system
Some implementations may involve receiving, via an interface system, personnel location data indicating a location of at least one person and receiving, from an orientation system, headset orientation data corresponding with the orientation of a headset. First environmental element location data, indicating a location of at least a first environmental element, may be determined. Based at least in part on the headset orientation data, the personnel location data and the first environmental element location data, headset coordinate locations of at least one person and at least the first environmental element in a headset coordinate system corresponding with the orientation of the headset may be determined. An apparatus may be caused to provide spatialization indications of the headset coordinate locations. Providing the spatialization indications may involve controlling a speaker system to provide environmental element sonification corresponding with at least the first environmental element location data.