Patent classifications
H04S7/30
SPLIT RENDERING OF EXTENDED REALITY DATA OVER 5G NETWORKS
An example device for processing extended reality (XR) data includes one or more processors configured to: parse entry point data of an XR scene to extract information about one or more required virtual objects for the XR scene, the required virtual objects including a number of dynamic virtual objects equal to or greater than one, each of the dynamic virtual objects including at least one dynamic media component for which media data is to be retrieved; initialize a number of streaming sessions equal to or greater than the number of dynamic virtual objects using the entry point data; configure quality of service (QoS) and charging information for the streaming sessions; retrieve media data for the dynamic virtual objects via the streaming sessions; and send the retrieved media data to a rendering unit to render the XR scene to include the retrieved media data at corresponding locations within the XR scene.
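The session setup described in this abstract can be illustrated with a minimal sketch. It is not the claimed implementation: the entry point layout (a dict with an "objects" list), the `VirtualObject` and `StreamingSession` types, and the QoS field `max_latency_ms` are all hypothetical names chosen for illustration. The sketch opens one session per dynamic media component, so the session count is at least the number of dynamic virtual objects.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualObject:
    name: str
    # Identifiers of dynamic media components (hypothetical format).
    dynamic_media: list = field(default_factory=list)

    @property
    def is_dynamic(self):
        # An object is dynamic if it has at least one dynamic media component.
        return len(self.dynamic_media) > 0


@dataclass
class StreamingSession:
    object_name: str
    media_id: str
    qos: dict = field(default_factory=dict)


def parse_entry_point(entry_point):
    """Extract the required virtual objects from the (assumed) entry point dict."""
    return [
        VirtualObject(o["name"], list(o.get("dynamic_media", [])))
        for o in entry_point["objects"]
    ]


def init_sessions(objects, qos_profile):
    """Open one session per dynamic media component and attach a QoS profile."""
    sessions = []
    for obj in objects:
        for media_id in obj.dynamic_media:
            sessions.append(StreamingSession(obj.name, media_id, dict(qos_profile)))
    return sessions


# Hypothetical scene: one static object and two dynamic ones.
entry_point = {
    "objects": [
        {"name": "room"},
        {"name": "avatar", "dynamic_media": ["avatar/mesh", "avatar/audio"]},
        {"name": "screen", "dynamic_media": ["screen/video"]},
    ]
}
objects = parse_entry_point(entry_point)
sessions = init_sessions(objects, {"max_latency_ms": 50})
dynamic_count = sum(o.is_dynamic for o in objects)
```

Here the two dynamic objects yield three sessions, which satisfies the "equal to or greater than" condition in the abstract. Retrieval and rendering are left out of the sketch.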
Optimal crosstalk cancellation filter sets generated by using an obstructed field model and methods of use
A crosstalk cancellation filter set configured for use in delivering binaural signals to human ears is provided. The crosstalk cancellation filter set includes a pressure matching system configured to perform spatial filtering or sound field control and an obstructed field model in communication with the pressure matching system. The crosstalk cancellation filter set is configured to take acoustic advantage of scattering effects and occlusional effects caused by violations to a free-field assumption, thereby delivering improved crosstalk cancellation acoustic displays to a listener without the use of headphones.
Multi-camera device
This specification describes: using a first camera of a multi-camera device to obtain first video data of a first region; using a second camera of the multi-camera device to obtain second video data of a second region; generating a multi-camera video output from the first and second video data using a first video mapping to map the first video data to a first portion of the multi-camera video output and using a second video mapping to map the second video data to a second portion of the multi-camera video output; and generating an audio output from obtained audio data, the audio output comprising an audio output having a directional component within the first portion of the video output and an audio output having a directional component within the second portion of the video output, wherein generating the audio output comprises using a first audio mapping to map audio data having a directional component within the first region to the audio output having a directional component within the first portion of the video output and using a second audio mapping to map audio data having a directional component within the second region to the audio output having a directional component within the second portion of the video output.
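One way to picture the audio mappings in this abstract is as a remapping of source direction: audio arriving from within a camera's region is moved to a direction inside the portion of the output that shows that camera's video. The sketch below assumes a simple linear mapping of azimuth angles; the abstract does not state that the mappings are linear, and the region and portion angles are made-up example values.

```python
def make_linear_mapping(src_range, dst_range):
    """Return a function mapping an azimuth in src_range linearly into dst_range."""
    (s0, s1), (d0, d1) = src_range, dst_range
    scale = (d1 - d0) / (s1 - s0)

    def mapping(azimuth):
        if not (min(s0, s1) <= azimuth <= max(s0, s1)):
            raise ValueError("azimuth lies outside the source region")
        return d0 + (azimuth - s0) * scale

    return mapping


# Assumed geometry in degrees: the first camera covers -90..0 and is shown
# in the left half of the output (-45..0); the second camera covers 90..180
# and is shown in the right half (0..45).
first_audio_mapping = make_linear_mapping((-90.0, 0.0), (-45.0, 0.0))
second_audio_mapping = make_linear_mapping((90.0, 180.0), (0.0, 45.0))
```

With these assumed ranges, a source at the centre of the second camera's region (135 degrees) lands at the centre of the right half of the output (22.5 degrees).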
Detection of audio panning and synthesis of 3D audio from limited-channel surround sound
A method includes receiving a multi-channel audio signal (101) including multiple input audio channels (102, 104, 106, 108) that are configured to play audio from multiple respective locations relative to a listener. One or more spectral components that undergo a panning effect (1001, 1002, 1003) are identified in the multi-channel audio signal among at least some of the input audio channels. One or more virtual channels (1100, 1200, 1300) are generated, which together with the input audio channels form an extended set (111) of audio channels that retain the identified panning effect. A reduced set (222) of output audio signals, fewer in number than the input audio signals, is generated from the extended set, including recreating the panning effect in the output audio signals. The reduced set of output audio signals is outputted to a user.
Capturing and synchronizing data from multiple sensors
Processes, methods, systems, and devices are disclosed for synchronizing multiple wireless data streams captured in action by various sensors, with lost data recovery. For example, a source device may have multiple sensors acquiring data and sending the data streams (e.g., via Bluetooth connections) to a target device. Timing information may be appended for each of the data streams. Data packets of the multiple data streams may be formed with the timing information. The data packets may be transmitted to a target device that is configured to synchronize the multiple data streams using the timing information. The target device, applying the example processes or techniques of this disclosure, may accurately synchronize the multiple data streams. In some cases, the target device may capture additional data streams and the processor synchronizes all data streams of both the source and the target devices.
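A minimal sketch of timestamp-based synchronization with a simple form of lost data recovery follows. It assumes each stream is a sorted list of (timestamp, value) pairs and that a lost packet is recovered by linear interpolation between its neighbours. The abstract does not specify the recovery method, so the interpolation and the function names `resample` and `synchronize` are illustrative assumptions.

```python
from bisect import bisect_left


def resample(stream, timeline):
    """Return the stream's values at each time in timeline.

    Values between two samples are linearly interpolated; times outside
    the stream's range yield None.
    """
    times = [t for t, _ in stream]
    out = []
    for t in timeline:
        i = bisect_left(times, t)
        if i < len(times) and times[i] == t:
            out.append(stream[i][1])
        elif i == 0 or i == len(times):
            out.append(None)
        else:
            (t0, v0), (t1, v1) = stream[i - 1], stream[i]
            out.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return out


def synchronize(streams, step):
    """Align all streams on a shared timeline covering their common time span."""
    start = max(s[0][0] for s in streams.values())
    end = min(s[-1][0] for s in streams.values())
    timeline = list(range(start, end + 1, step))
    return timeline, {name: resample(s, timeline) for name, s in streams.items()}


# Hypothetical sensor data with integer millisecond timestamps.
# The accelerometer packet at t=20 was lost in transmission.
streams = {
    "accel": [(0, 0.0), (10, 1.0), (30, 3.0)],
    "gyro": [(5, 10.0), (15, 20.0), (25, 30.0)],
}
timeline, aligned = synchronize(streams, step=10)
```

The shared timeline here is [5, 15, 25], and the accelerometer values at 15 and 25 are reconstructed across the gap left by the lost packet.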
Determining corrections to be applied to a multichannel audio signal, associated coding and decoding
A method and device for determining a set of corrections to be made to a multichannel sound signal, in which the set of corrections is determined on the basis of an item of information representative of a spatial image of an original multichannel signal and an item of information representative of a spatial image of the original multichannel signal that has been coded and then decoded.
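To make the idea concrete, the sketch below takes per-channel energy as a stand-in for the "spatial image" and derives one gain per channel that restores the original energy in the decoded signal. This is an assumption made only for illustration: the abstract does not define how the spatial image is represented or how the corrections are computed.

```python
import math


def channel_energies(signal):
    """Per-channel energy, used here as a crude proxy for a spatial image."""
    return [sum(x * x for x in channel) for channel in signal]


def correction_gains(original, decoded, eps=1e-12):
    """One gain per channel so that decoded energy matches original energy."""
    e_orig = channel_energies(original)
    e_dec = channel_energies(decoded)
    return [math.sqrt(eo / max(ed, eps)) for eo, ed in zip(e_orig, e_dec)]


def apply_gains(signal, gains):
    """Scale each channel of the signal by its correction gain."""
    return [[g * x for x in channel] for channel, g in zip(signal, gains)]


# Toy two-channel example: coding halved the amplitude of the first channel.
original = [[1.0, -1.0], [0.5, 0.5]]
decoded = [[0.5, -0.5], [0.5, 0.5]]
gains = correction_gains(original, decoded)
corrected = apply_gains(decoded, gains)
```

In this toy case the first channel receives a gain of 2 and the second a gain of 1, so the corrected signal matches the original.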
Image and Audio Apparatus and Method
An apparatus including circuitry configured for causing audio processing to a spatial audio-visual representation of an image and sound apparatus, the spatial audio-visual representation being live or reproduced from recording; and modifying the audio processing applied to an audio-visually manipulated spatial section of the spatial audio-visual representation in response to information about a prior audio-visual manipulation with data processing in the audio-visually manipulated spatial section.
SYSTEM AND METHOD FOR ADAPTIVE AUDIO SIGNAL GENERATION, CODING AND RENDERING
Embodiments are described for an adaptive audio system that processes audio data comprising a number of independent monophonic audio streams. One or more of the streams has associated with it metadata that specifies whether the stream is a channel-based or object-based stream. Channel-based streams have rendering information encoded by means of channel name; and the object-based streams have location information encoded through location expressions encoded in the associated metadata. A codec packages the independent audio streams into a single serial bitstream that contains all of the audio data. This configuration allows for the sound to be rendered according to an allocentric frame of reference, in which the rendering location of a sound is based on the characteristics of the playback environment (e.g., room size, shape, etc.) to correspond to the mixer's intent. The object position metadata contains the appropriate allocentric frame of reference information required to play the sound correctly using the available speaker positions in a room that is set up to play the adaptive audio content.
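The distinction between channel-based and object-based streams can be sketched as follows. Channel-based streams are routed by channel name, while object-based streams carry a position in room-relative (allocentric) coordinates and are distributed over whatever speakers are available. The inverse-distance panning law, the `Stream` type, and the speaker layout are assumptions for illustration only; the abstract does not specify a rendering algorithm.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Stream:
    kind: str                                       # "channel" or "object"
    channel_name: Optional[str] = None              # used by channel-based streams
    position: Optional[Tuple[float, float]] = None  # room-relative (x, y), objects only


def speaker_gains(stream, speakers):
    """Return a gain per speaker for one monophonic stream.

    speakers maps a speaker name to its room-relative (x, y) position.
    """
    if stream.kind == "channel":
        # Channel-based: play only on the speaker with the matching name.
        return {name: 1.0 if name == stream.channel_name else 0.0 for name in speakers}
    # Object-based (assumed law): inverse-distance weights, power-normalized.
    weights = {
        name: 1.0 / (math.dist(stream.position, pos) + 1e-6)
        for name, pos in speakers.items()
    }
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {name: w / norm for name, w in weights.items()}


# Hypothetical two-speaker room with coordinates normalized to 0..1.
speakers = {"L": (0.0, 1.0), "R": (1.0, 1.0)}
bed = Stream(kind="channel", channel_name="L")
obj = Stream(kind="object", position=(0.5, 0.0))
bed_gains = speaker_gains(bed, speakers)
obj_gains = speaker_gains(obj, speakers)
```

Because the object position is expressed relative to the room rather than to a fixed channel layout, the same metadata yields different gains when the `speakers` dictionary describes a different installation.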
Virtual simulation of spatial audio characteristics
Embodiments of the present invention are directed to a system and method for demonstrating spatial performance of a demonstration speaker model to consumers in order to evaluate different speakers. The system and method comprise a microphone array for recording the output of the demonstration speaker model. The system and method comprise acoustic input samples for processing to an acoustic output and a processor for determining characteristics of each microphone recording, and processing an acoustic input sample and characteristics of each microphone recording corresponding to a selected demonstration speaker model. The system and method further comprise a reference speaker model for outputting an acoustic signal based on the result of the processing. The processing compensates for the performance characteristic of the reference speaker and the performance characteristic of the selected demonstration speaker so as to mimic the spatial characteristics of the demonstration speaker while avoiding bias from the reference speaker.
Spatial arrangement of sound broadcasting devices
The present invention relates to a spatial arrangement for optimizing the broadcasting of a sound signal and thus replacing the conventional stereo systems. For this purpose, the spatial arrangement is capable of broadcasting a spatialized sound signal, the spatialized sound signal comprising N mutually distinct audio signals, N being an integer strictly greater than 3, and the spatial arrangement comprising a set of N sound broadcasting devices predominantly distributed over the entire width of a scene. Each sound broadcasting device receives an audio signal and amplifies and broadcasts the sound that the signal carries. In particular, each sound broadcasting device is specifically capable of reproducing and preserving the characteristics of the sound transmitted by the audio signal received, in particular the sound frequency bands and the sound intensity of the frequency bands of the audio signal.