Patent classifications
H04S2420/11
Augmented reality and virtual reality feedback enhancement system, apparatus and method
- Chandrasekaran Sakthivel,
- Michael Apodaca,
- Kai Xiao,
- Altug Koker,
- Jeffery S. Boles,
- Adam T. Lake,
- Nikos Kaburlasos,
- Joydeep Ray,
- John H. Feit,
- Travis T. Schluessler,
- Jacek Kwiatkowski,
- James M. Holland,
- Prasoonkumar Surti,
- Jonathan Kennedy,
- Louis Feng,
- Barnan Das,
- Narayan Biswal,
- Stanley J. Baran,
- Gokcen Cilingir,
- Nilesh V. Shah,
- Archie Sharma,
- Mayuresh M. Varerkar
Systems, apparatuses and methods may provide a way to render virtual reality and augmented reality (VR/AR) environment information. More particularly, systems, apparatuses and methods may provide a way to selectively suppress and enhance VR/AR renderings of n-dimensional environments. The systems, apparatuses and methods may deepen a user's VR/AR experience by focusing on particular feedback information, while suppressing other feedback information from the environment.
APPARATUS AND METHOD FOR PROCESSING MULTI-CHANNEL AUDIO SIGNAL
Disclosed is an apparatus and method for processing a multichannel audio signal. A multichannel audio signal processing method may include: generating an N-channel audio signal of N channels by down-mixing an M-channel audio signal of M channels; and generating a stereo audio signal by performing binaural rendering of the N-channel audio signal.
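The two stages described in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the equal-weight mixing matrix, the unit-impulse "HRIRs", and all function names are illustrative assumptions.

```python
import numpy as np

def downmix(audio, mix_matrix):
    """Down-mix an M-channel signal (M x samples) to N channels
    via an N x M mixing matrix."""
    return mix_matrix @ audio

def binaural_render(audio, hrirs):
    """Render an N-channel signal to stereo by convolving each
    channel with its left/right head-related impulse response.
    hrirs: array of shape (N, 2, taps)."""
    n_ch, n_samp = audio.shape
    out = np.zeros((2, n_samp + hrirs.shape[2] - 1))
    for ch in range(n_ch):
        for ear in range(2):
            out[ear] += np.convolve(audio[ch], hrirs[ch, ear])
    return out

# Example: 5.1-style M=6 down-mixed to N=2, then binaurally rendered.
M, N, samples = 6, 2, 8
audio = np.random.randn(M, samples)
mix = np.full((N, M), 1.0 / M)                 # naive equal-weight down-mix
hrirs = np.zeros((N, 2, 3)); hrirs[:, :, 0] = 1.0  # unit-impulse stand-in HRIRs
stereo = binaural_render(downmix(audio, mix), hrirs)
```

In practice the mixing matrix would preserve energy per loudspeaker layout and the HRIRs would come from a measured or modeled database; both are placeholders here.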
Spatial Audio Representation and Rendering
An apparatus including circuitry configured to: obtain a spatial audio signal including at least one audio signal and spatial metadata associated with the at least one audio signal; obtain at least one data set related to binaural rendering; obtain at least one pre-defined data set related to binaural rendering; and generate a binaural audio signal based on a combination of at least part of the at least one data set and the at least one pre-defined data set, and the spatial audio signal.
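One way the "combination of at least part of the at least one data set and the at least one pre-defined data set" could work is a per-direction blend with fallback. This is only a sketch under that assumption; the dictionary keying, blending weight, and function name are all hypothetical.

```python
def combine_rendering_data(obtained, predefined, weight=0.5):
    """Merge two binaural-rendering data sets keyed by direction
    (e.g. azimuth in degrees -> filter coefficients).  Directions
    present in both sets are blended; otherwise the pre-defined
    entry is used as a fallback."""
    combined = dict(predefined)
    for direction, coeffs in obtained.items():
        if direction in predefined:
            base = predefined[direction]
            combined[direction] = [
                weight * a + (1 - weight) * b
                for a, b in zip(coeffs, base)
            ]
        else:
            combined[direction] = coeffs
    return combined
```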
SPATIAL AUDIO PARAMETER ENCODING AND ASSOCIATED DECODING
A method comprising: obtaining a first audio direction parameter value for each sub-band of a sub-frame of a frame of an audio signal; obtaining a second audio direction parameter value for the sub-frame of the frame of the audio signal for one or more audio objects associated with said audio signal; and determining a bit-efficient encoding for each first audio direction parameter value of the sub-frame based on a similarity between the first audio direction parameter value for each sub-band and the second audio direction parameter values for the one or more audio objects.
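The similarity-based bit allocation can be illustrated for a single azimuth value: when the sub-band direction lies close to some object direction, a short delta code against that object suffices; otherwise an absolute code is used. The bit widths, threshold, and quantization scheme below are illustrative assumptions, not the claimed encoding.

```python
def encode_direction(subband_azimuth, object_azimuths,
                     delta_bits=4, abs_bits=9, delta_range=15.0):
    """Choose a bit-efficient encoding for one sub-band direction:
    a small quantized delta against a nearby object direction when
    one exists, an absolute azimuth code otherwise."""
    for idx, obj_az in enumerate(object_azimuths):
        delta = subband_azimuth - obj_az
        if abs(delta) <= delta_range:
            step = 2 * delta_range / (2 ** delta_bits - 1)
            return ("delta", idx, round(delta / step), delta_bits)
    step = 360.0 / (2 ** abs_bits)
    q = round(subband_azimuth / step) % (2 ** abs_bits)
    return ("absolute", None, q, abs_bits)
```

A sub-band at 32 degrees with an object at 30 degrees thus spends 4 bits instead of 9, while a sub-band far from every object falls back to the absolute code.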
Audio processing apparatus and method therefor
An audio processing apparatus comprises a receiver (705) which receives audio data including audio components and render configuration data including audio transducer position data for a set of audio transducers (703). A renderer (707) generates audio transducer signals for the set of audio transducers from the audio data. The renderer (707) is capable of rendering audio components in accordance with a plurality of rendering modes. A render controller (709) selects the rendering modes for the renderer (707) from the plurality of rendering modes based on the audio transducer position data. The renderer (707) can employ different rendering modes for different subsets of the set of audio transducers, and the render controller (709) can independently select rendering modes for each of the different subsets of the set of audio transducers (703). The render controller (709) can select the rendering mode for a first audio transducer of the set of audio transducers (703) in response to a position of the first audio transducer relative to a predetermined position for that audio transducer. The approach may provide improved adaptation, e.g. to scenarios where most speakers are at the desired positions while a subset deviate from them.
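The per-transducer mode selection can be sketched as a simple deviation test against the nominal layout. The tolerance value and the two mode names are placeholders, not from the patent.

```python
import math

def select_rendering_modes(actual_positions, nominal_positions,
                           tolerance=0.5,
                           default_mode="amplitude_panning",
                           fallback_mode="least_squares"):
    """Pick a rendering mode for each transducer: keep the default
    mode when the speaker sits close to its nominal position, and
    switch to a fallback mode when it deviates by more than
    `tolerance` metres."""
    modes = []
    for actual, nominal in zip(actual_positions, nominal_positions):
        deviation = math.dist(actual, nominal)
        modes.append(default_mode if deviation <= tolerance
                     else fallback_mode)
    return modes
```

This captures the scenario the abstract highlights: speakers near their desired positions keep one rendering mode while displaced speakers independently receive another.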
Determining corrections to be applied to a multichannel audio signal, associated coding and decoding
A method and device for determining a set of corrections to be made to a multichannel sound signal, in which the set of corrections is determined on the basis of an item of information representative of a spatial image of an original multichannel signal and an item of information representative of a spatial image of the original multichannel signal that has been coded and then decoded.
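Taking per-channel energy as a crude stand-in for the "spatial image", the correction set can be sketched as gains that restore the decoded signal's energy distribution to the original's. Real spatial-image descriptors would include inter-channel level and correlation cues; this reduction is an assumption.

```python
import numpy as np

def spatial_image(signal):
    """Crude spatial-image descriptor: per-channel energy of a
    (channels x samples) multichannel signal."""
    return np.sum(signal ** 2, axis=1)

def correction_gains(original, decoded, eps=1e-12):
    """Per-channel gains that restore the decoded signal's energy
    distribution to that of the original."""
    e_orig = spatial_image(original)
    e_dec = spatial_image(decoded)
    return np.sqrt(e_orig / np.maximum(e_dec, eps))

# A coded/decoded copy whose first channel lost half its amplitude
# gets a gain of 2 on that channel and 1 on the untouched channel.
orig = np.array([[1.0, 1.0], [2.0, 0.0]])
dec = np.array([[0.5, 0.5], [2.0, 0.0]])
gains = correction_gains(orig, dec)
```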
SIGNAL PROCESSING DEVICE, METHOD, AND PROGRAM
The present technology relates to a signal processing device, a method, and a program that make it possible for a user to obtain a higher realistic feeling. The signal processing device includes: an audio generation unit that generates a sound source signal according to a type of a sound source on the basis of a recorded signal obtained by sound collection by a microphone attached to a moving object; a correction information generation unit that generates position correction information indicating a distance between the microphone and the sound source; and a position information generation unit that generates sound source position information indicating a position of the sound source in a target space on the basis of microphone position information indicating a position of the microphone in the target space and the position correction information. The present technology can be applied to a recording/transmission/reproduction system.
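The position-information step reduces to simple vector arithmetic once a direction from microphone to source is available; the abstract only specifies the distance, so the unit direction used below is an extra assumption, as are the function and variable names.

```python
import numpy as np

def source_position(mic_position, direction, distance):
    """Estimate the sound-source position in the target space from
    the microphone position, an (assumed known) unit direction
    toward the source, and the mic-to-source distance carried by
    the position correction information."""
    direction = np.asarray(direction, dtype=float)
    direction = direction / np.linalg.norm(direction)
    return np.asarray(mic_position, dtype=float) + distance * direction

# Microphone at (1, 0, 0), source 2 m away along +y.
pos = source_position([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], 2.0)
```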
AUDIO ZOOM
A device includes one or more processors configured to execute instructions to determine a first phase based on a first audio signal of first audio signals and to determine a second phase based on a second audio signal of second audio signals. The one or more processors are also configured to execute the instructions to apply spatial filtering to selected audio signals of the first audio signals and the second audio signals to generate an enhanced audio signal. The one or more processors are further configured to execute the instructions to generate a first output signal including combining a magnitude of the enhanced audio signal with the first phase and to generate a second output signal including combining the magnitude of the enhanced audio signal with the second phase. The first output signal and the second output signal correspond to an audio zoomed signal.
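The magnitude/phase recombination at the heart of this abstract can be sketched in the frequency domain. The averaging used as the "spatial filter" is a deliberate simplification (a real system would beamform toward the zoom target), and all names are illustrative.

```python
import numpy as np

def audio_zoom(first_sig, second_sig):
    """Frequency-domain sketch of the audio-zoom recombination: a
    stand-in 'spatial filter' (simple averaging) yields an enhanced
    spectrum whose magnitude is paired with each input's own phase,
    so the zoomed output keeps the original spatial (phase) cues."""
    f1, f2 = np.fft.rfft(first_sig), np.fft.rfft(second_sig)
    enhanced = 0.5 * (f1 + f2)              # placeholder for beamforming
    mag = np.abs(enhanced)
    out1 = np.fft.irfft(mag * np.exp(1j * np.angle(f1)), n=len(first_sig))
    out2 = np.fft.irfft(mag * np.exp(1j * np.angle(f2)), n=len(second_sig))
    return out1, out2
```

When the two inputs are identical, the recombination is transparent and each output reproduces its input, which is a handy sanity check.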
SYSTEM AND METHOD FOR ADAPTIVE AUDIO SIGNAL GENERATION, CODING AND RENDERING
Embodiments are described for an adaptive audio system that processes audio data comprising a number of independent monophonic audio streams. One or more of the streams has associated with it metadata that specifies whether the stream is a channel-based or object-based stream. Channel-based streams have rendering information encoded by means of channel name; and the object-based streams have location information encoded through location expressions encoded in the associated metadata. A codec packages the independent audio streams into a single serial bitstream that contains all of the audio data. This configuration allows for the sound to be rendered according to an allocentric frame of reference, in which the rendering location of a sound is based on the characteristics of the playback environment (e.g., room size, shape, etc.) to correspond to the mixer's intent. The object position metadata contains the appropriate allocentric frame of reference information required to play the sound correctly using the available speaker positions in a room that is set up to play the adaptive audio content.
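Allocentric rendering — positioning sound relative to the playback room rather than the listener — can be illustrated in one dimension with constant-power panning between two speakers whose room positions are known. This toy example is not the patented renderer; the clamping and the quarter-circle gain law are common conventions chosen here for illustration.

```python
import math

def allocentric_pan(object_x, left_x, right_x):
    """Toy allocentric panning in one dimension: the object position
    is expressed in room coordinates, and constant-power gains for
    two speakers are derived from where it falls between them."""
    # Clamp the object between the two speakers, then normalize to [0, 1].
    t = (min(max(object_x, left_x), right_x) - left_x) / (right_x - left_x)
    theta = t * math.pi / 2
    return math.cos(theta), math.sin(theta)  # (left gain, right gain)
```

An object at the room's midpoint between the speakers receives equal gains, and the squared gains always sum to one, preserving perceived loudness as the object moves.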
CONTROL DEVICE, PROCESSING METHOD FOR CONTROL DEVICE, AND STORAGE MEDIUM STORING PROGRAM FOR PROCESSING METHOD
A control device controls sound of a plurality of speakers. The control device includes: a generation unit that acquires image data and generates display image data from the image data by conversion processing that uses shape information indicating a shape of a display surface; a display controller that displays a display image on the display surface by using the display image data; a receiver that causes a cursor to be superimposed on the display image and receives, from a user who has visually recognized the cursor, position designation related to the plurality of speakers on the display image; and an identification unit that calculates a position of the cursor from the position designation by referring to a correspondence relationship between the image data before the conversion processing and the display image data after the conversion processing, and identifies that position as a position related to the plurality of speakers.