Patent classifications
H04S3/008
APPARATUS, METHODS AND COMPUTER PROGRAMS FOR ENABLING REPRODUCTION OF SPATIAL AUDIO SIGNALS
An apparatus (101) for enabling reproduction of spatial audio signals. The apparatus comprises means for obtaining (401) audio signals (501) comprising one or more channels and obtaining (403) spatial metadata (503) relating to the audio signals (501). The spatial metadata (503) comprises information that indicates how to spatially reproduce the audio signals. The apparatus also comprises means for obtaining (405) information relating to a field of view of video (505) wherein the video is for display on a display (205) of a rendering device (201) and wherein the video is associated with the audio signals (501). The apparatus also comprises means for aligning (407) spatial reproduction of the audio signals based, at least in part, on the obtained spatial metadata (503), with objects (309A, 309B) in the video according to the obtained information relating to the field of view of video; and enabling (409) reproduction of the audio signals based on the aligning (407).
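The alignment step can be pictured as remapping source directions from the captured video's field of view onto the display's field of view, so audio objects stay on top of their visual counterparts. The sketch below is hypothetical; the function name, degree-based parameters, and linear scaling are my assumptions, not the patent's method.

```python
def align_azimuths(azimuths_deg, view_center_deg, video_fov_deg, display_fov_deg):
    """Remap source azimuths so that objects visible within the video's
    field of view land at the screen positions where they are displayed.
    Sources outside the video FOV are left untouched."""
    scale = display_fov_deg / video_fov_deg
    aligned = []
    for az in azimuths_deg:
        rel = (az - view_center_deg + 180.0) % 360.0 - 180.0  # wrap to [-180, 180)
        if abs(rel) <= video_fov_deg / 2:
            aligned.append(view_center_deg + rel * scale)
        else:
            aligned.append(az)
    return aligned
```

With a 60-degree video FOV shown on a 30-degree display, a source at 30 degrees is pulled in to 15 degrees, while sources outside the FOV keep their original directions.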
TRANSPARENT AUDIO MODE FOR VEHICLES
In general, techniques are described by which to enable a transparency mode in vehicles. A device comprising one or more microphones and one or more processors may be configured to perform the techniques. The microphones may capture audio data representative of a sound scene external to a vehicle. The processors may perform beamforming with respect to the audio data to obtain object audio data representative of an audio object in the sound scene external to the vehicle. The processors may next reproduce, by interfacing with one or more speakers included within the vehicle and based on the object audio data, the audio object in the sound scene external to the vehicle.
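As a sketch of the beamforming stage, here is a basic far-field delay-and-sum beamformer that steers an array toward a chosen look direction. The abstract does not specify the beamforming algorithm; the function, its parameters, and the far-field assumption are illustrative.

```python
import numpy as np

def delay_and_sum(mic_signals, mic_positions, look_dir, fs, c=343.0):
    """mic_signals: (num_mics, num_samples); mic_positions: (num_mics, 3)
    in meters; look_dir: unit vector toward the source; fs: sample rate.
    Delay each channel so the look direction adds coherently, then average."""
    delays = mic_positions @ np.asarray(look_dir) / c  # seconds per mic
    delays -= delays.min()                             # make all delays >= 0
    out = np.zeros(mic_signals.shape[1])
    for sig, d in zip(mic_signals, delays):
        shift = int(round(d * fs))                     # delay in whole samples
        out[shift:] += sig[:out.size - shift]
    return out / len(mic_signals)
```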
Time-varying always-on compensation for tonally balanced 3D-audio rendering
A system reduces sound coloration caused by rendering of a 3D audio signal. The system renders the 3D audio signal, including a plurality of channels, using the input audio signal. Input spectral data defining spectral information of the input audio signal is computed. 3D spectral data defining spectral information of a single-channel representation of the 3D audio signal is computed. The system generates a tonal balance filter based on the input spectral data and the 3D spectral data. The tonal balance filter, when applied to the 3D audio signal, reduces sound coloration caused by the rendering of the 3D audio signal. The tonal balance filter is applied to the 3D audio signal to generate an output audio signal, and the output audio signal is presented via a speaker array.
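One minimal way to realize such a filter, assuming a simple magnitude-ratio design (the patented filter is not specified here), is to compare the input's spectrum with that of a mono fold-down of the 3D render and return per-bin gains:

```python
import numpy as np

def tonal_balance_gains(input_sig, rendered_mono, n_fft=1024, eps=1e-8):
    """Per-bin magnitude ratio between the input signal's spectrum and the
    spectrum of a mono fold-down of the 3D render; applying these gains
    to the render pulls its tonal balance back toward the input."""
    in_mag = np.abs(np.fft.rfft(input_sig, n_fft))
    out_mag = np.abs(np.fft.rfft(rendered_mono, n_fft))
    return in_mag / (out_mag + eps)   # eps avoids division by zero
```

If the render attenuated the signal by 6 dB across the board, every gain comes out near 2, restoring the original level per bin.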
MANAGEMENT OF MEDIA DEVICES HAVING LIMITED CAPABILITIES
Embodiments disclosed herein include managing playback devices with limited capabilities and playback devices with advanced capabilities by way of a control device. In some embodiments, the control device may control a first playback device by way of a legacy control application including a first control interface comprising first playback controls operable to control the first playback device in performing a set of legacy playback functions. The control device may control a second playback device by way of a production control application including a second control interface comprising second playback controls operable to control the second playback device in performing a set of production playback functions.
INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND NON-TRANSITORY COMPUTER READABLE MEDIUM
An information processing device includes a processor configured to output, in a case where a service is being used in which at least speech is exchanged among multiple users such that a conversation takes place among all of the multiple users, a speech of a separate conversation distinctly from a speech of the conversation taking place among all of the multiple users to a device of a user who is engaged in the separate conversation with a specific user from among the multiple users, and output the speech of the conversation taking place among all of the multiple users without outputting the speech of the separate conversation to a device of a user who is not engaged in the separate conversation.
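The routing rule above reduces to: everyone receives the main conversation, and a side conversation's speech is delivered only to its participants. A hypothetical sketch (names and data layout are mine):

```python
def streams_for(user, main_speech, side_conversations):
    """Return the speech streams to deliver to `user`: the main
    conversation goes to everyone; each side conversation's speech
    is added only for users listed among its participants."""
    out = list(main_speech)
    for convo in side_conversations:
        if user in convo["participants"]:
            out.extend(convo["speech"])
    return out
```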
Audio decoder, audio encoder, method for providing at least four audio channel signals on the basis of an encoded representation, method for providing an encoded representation on the basis of at least four audio channel signals and computer program using a bandwidth extension
An audio decoder for providing at least four bandwidth-extended channel signals on the basis of an encoded representation provides first and second downmix signals on the basis of a jointly encoded representation of the first and second downmix signals using a multi-channel decoding, provides at least first and second audio channel signals on the basis of the first downmix signal using a multi-channel decoding, and provides at least third and fourth audio channel signals on the basis of the second downmix signal using a multi-channel decoding. It performs a multi-channel bandwidth extension on the basis of the first and third audio channel signals, to obtain first and third bandwidth-extended channel signals, and performs a multi-channel bandwidth extension on the basis of the second and fourth audio channel signals, to obtain second and fourth bandwidth-extended channel signals. An audio encoder uses a related concept.
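The notable structure is the dataflow: two cascaded multi-channel decodings, then bandwidth extension paired across the two downmixes (first with third channel, second with fourth). The sketch below shows only that routing; the stand-in functions are toy placeholders, not real decoder or bandwidth-extension modules.

```python
def multichannel_decode(pair):
    # toy stand-in for a real parametric two-channel decoding stage
    return pair[0], pair[1]

def multichannel_bwe(a, b):
    # toy stand-in for joint bandwidth extension of a channel pair
    return a + "+bwe", b + "+bwe"

def decode_four_channels(jointly_encoded):
    """Dataflow of the described decoder: decode the joint representation
    into two downmixes, decode each downmix into a channel pair, then
    apply bandwidth extension across the downmixes."""
    dmx1, dmx2 = multichannel_decode(jointly_encoded)
    ch1, ch2 = multichannel_decode(dmx1)
    ch3, ch4 = multichannel_decode(dmx2)
    bwe1, bwe3 = multichannel_bwe(ch1, ch3)   # first with third
    bwe2, bwe4 = multichannel_bwe(ch2, ch4)   # second with fourth
    return bwe1, bwe2, bwe3, bwe4
```

Pairing across the downmixes (e.g. left front with left surround) lets the bandwidth extension exploit correlation between channels on the same side of the layout.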
Methods for parametric multi-channel encoding
The present document relates to audio coding systems. In particular, the present document relates to efficient methods and systems for parametric multi-channel audio coding. An audio encoding system configured to generate a bitstream indicative of a downmix signal and spatial metadata for generating a multi-channel upmix signal from the downmix signal is described. The system comprises a downmix processing unit configured to generate the downmix signal from a multi-channel input signal; wherein the downmix signal comprises m channels and wherein the multi-channel input signal comprises n channels; n, m being integers with m<n. Furthermore, the system comprises a parameter processing unit configured to determine the spatial metadata from the multi-channel input signal. In addition, the system comprises a configuration unit configured to determine one or more control settings for the parameter processing unit based on one or more external settings; wherein the one or more external settings comprise a target data-rate for the bitstream and wherein the one or more control settings comprise a maximum data-rate for the spatial metadata.
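For the m-from-n downmix with m<n, a common concrete instance is a 5-to-2 matrix downmix. The coefficients below follow the familiar -3 dB center/surround convention; the described encoder's actual weights are not given, so treat this purely as an illustration.

```python
import numpy as np

# Illustrative 5-to-2 downmix: n = 5 input channels (L, R, C, Ls, Rs)
# to m = 2 output channels (Lo, Ro), with -3 dB center and surrounds.
g = 1.0 / np.sqrt(2.0)
D = np.array([[1.0, 0.0, g, g, 0.0],
              [0.0, 1.0, g, 0.0, g]])

def downmix(x):
    """x: (5, num_samples) multi-channel input -> (2, num_samples) downmix."""
    return D @ x
```

A center-only input, for instance, appears at -3 dB in both output channels; the spatial metadata then carries what the matrix discards, so the upmix can re-spread the downmix back toward n channels.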
Systems and methods for sound source virtualization
A system and method for externalizing sound. The system includes a headphone assembly and a localizer configured to collect information related to a location of the user and of an acoustically reflective surface in the environment. A controller is configured to determine a location of at least one virtual sound source, and generate head related transfer functions that simulate characteristics of sound from the virtual sound source directly to the user and to the user via a reflection by the reflective surface. A signal processing assembly is configured to create one or more output signals by filtering the sound signal respectively with the HRTFs. Each speaker of the headphone assembly is configured to produce sound in accordance with the output signal.
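The two-path idea, direct sound plus one wall reflection, can be sketched as filtering the source with a direct-path HRIR pair and adding a delayed, attenuated copy filtered with a reflection-path HRIR pair. Real HRTFs and the reflection geometry would come from the localizer's measurements; everything named here is illustrative.

```python
import numpy as np

def binauralize(mono, hrir_direct, hrir_reflect, reflect_delay, reflect_gain=0.5):
    """Render a virtual source binaurally: the direct path through one
    (2, taps) HRIR pair plus one attenuated, delayed reflection through
    another pair. Both HRIR pairs are assumed to have equal length."""
    def conv_pair(sig, hrirs):
        return np.stack([np.convolve(sig, h) for h in hrirs])
    direct = conv_pair(mono, hrir_direct)
    refl = reflect_gain * conv_pair(mono, hrir_reflect)
    out = np.concatenate([direct, np.zeros((2, reflect_delay))], axis=1)
    out[:, reflect_delay:] += refl   # reflection arrives reflect_delay samples later
    return out
```

The delayed reflection is what pushes the image "outside the head": the brain receives the room cue it expects from a real external source.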
Streaming binaural audio from a cloud spatial audio processing system to a mobile station for playback on a personal audio delivery device
Spatial audio is received from an audio server over a first communication link. The spatial audio is converted by a cloud spatial audio processing system into binaural audio. The binauralized audio is streamed from the cloud spatial audio processing system to a mobile station over a second communication link to cause the mobile station to play the binaural audio on the personal audio delivery device.
PROCESSING OF AUDIO SIGNALS FROM MULTIPLE MICROPHONES
A first device includes a memory configured to store instructions and one or more processors configured to receive audio signals from multiple microphones. The one or more processors are configured to process the audio signals to generate direction-of-arrival information corresponding to one or more sources of sound represented in one or more of the audio signals. The one or more processors are also configured to send, to a second device, data based on the direction-of-arrival information and a class or embedding associated with the direction-of-arrival information.