H04S2420/11

Audio zoom

A device includes one or more processors configured to execute instructions to determine a first phase based on a first audio signal of first audio signals and to determine a second phase based on a second audio signal of second audio signals. The one or more processors are also configured to execute the instructions to apply spatial filtering to selected audio signals of the first audio signals and the second audio signals to generate an enhanced audio signal. The one or more processors are further configured to execute the instructions to generate a first output signal including combining a magnitude of the enhanced audio signal with the first phase and to generate a second output signal including combining the magnitude of the enhanced audio signal with the second phase. The first output signal and the second output signal correspond to an audio zoomed signal.
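
The magnitude and phase recombination described in this abstract can be illustrated with a minimal sketch. It assumes the signals are already available as complex frequency-domain bins; the helper name `combine_magnitude_with_phase` and all sample values are hypothetical, and the spatial filter output is replaced by a stand-in list rather than computed.

```python
import cmath

def combine_magnitude_with_phase(enhanced_bins, reference_bins):
    """Keep the magnitude of each enhanced bin and take the phase of the reference bin."""
    return [abs(e) * cmath.exp(1j * cmath.phase(r))
            for e, r in zip(enhanced_bins, reference_bins)]

# Hypothetical frequency-domain frames (one complex value per bin).
first_bins = [1 + 1j, 0 - 2j]      # from a first audio signal
second_bins = [-1 + 0j, 3 + 0j]    # from a second audio signal
enhanced_bins = [2 + 0j, 0 + 1j]   # stand-in for the spatial filtering output

# Both outputs share the enhanced magnitude but keep their own channel phase.
first_out = combine_magnitude_with_phase(enhanced_bins, first_bins)
second_out = combine_magnitude_with_phase(enhanced_bins, second_bins)
```

Because each output keeps its own channel phase, the inter-channel phase differences of the original pair are preserved while the magnitude comes from the enhanced signal.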

METHOD AND DEVICE FOR GENERATING AND PLAYING BACK AUDIO SIGNAL

A method for generating audio according to an embodiment of the present invention, for solving the above technical problem, comprises the steps of: receiving an audio signal through at least one mic; generating an input channel signal respectively corresponding to the at least one mic; generating a virtual input channel signal based on the input channel signal; generating additional information including playback positions of the input channel signal and the virtual input channel signal; and transmitting a multichannel audio signal including the input channel signal and the virtual input channel signal, and the additional information.

Switching between predictive and non-predictive quantization techniques in a higher order ambisonics (HOA) framework

A device comprising a memory and one or more processors may be configured to extract, from a bitstream, a type of quantization mode. The one or more processors may also be configured to switch, based on the type of quantization mode, between non-predictive vector dequantization to reconstruct a first set of one or more weights used to approximate the multi-directional V-Vector in the higher order ambisonics domain, and predictive vector dequantization to reconstruct a second set of one or more weights used to approximate the multi-directional V-Vector in the higher order ambisonics domain. The memory may be configured to store the reconstructed first set of one or more weights used to approximate the multi-directional V-Vector in the higher order ambisonics domain, and the reconstructed second set of one or more weights used to approximate the multi-directional V-Vector in the higher order ambisonics domain.

Binaural navigation cues

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for providing binaural navigational cues. In one aspect, a method includes presenting audio media in a non-directional playback state that presents the audio media in an original format, iteratively determining a navigational heading relative to a current navigational course, the navigational heading indicative of a direction to face to proceed along a navigational route, and for each iteration, determining whether a change is required to the current navigational course based on the navigational heading. The method further includes, for each determination that a change is not required to the current navigational course, presenting the audio media in the non-directional playback state, and for each determination that a change is required to the current navigational course, changing the non-directional playback state to a directional playback state that presents the audio media in a modified format that is directional from the navigational heading.

Sound collection and playback apparatus, and recording medium

A microphone array includes first and second microphones (Ma, Mb) placed on a first axis (f), a third microphone (Mc) placed on a plane (fg) formed by the first axis and a second axis (g) and at a position other than on the first axis, and a fourth microphone (Md) placed on a third axis (h), and at a position other than on the plane formed by the first and the second axes, and a processing circuit generates signals (Cx, Cy, Cz) having bidirectionality in first, second and third mutually perpendicular directions (x, y, z), and an omnidirectional signal (Cw), based on signals (Ba to Bd) obtained by sound collection by means of the first to fourth microphones. It is possible to generate signals having bidirectionality in mutually perpendicular directions, and an omnidirectional signal, without using special microphones, and without excessive restrictions with regard to the placement of the microphones.

METHOD AND APPARATUS FOR LOW BIT RATE COMPRESSION OF A HIGHER ORDER AMBISONICS HOA SIGNAL REPRESENTATION OF A SOUND FIELD
20170243589 · 2017-08-24

The invention is suited for improving a low bit rate compressed and decompressed Higher Order Ambisonics HOA signal representation of a sound field, wherein the decompression provides a spatially sparse decoded HOA representation and a set of indices of coefficient sequences of this representation. From reconstructed signals of the original HOA representation a number of modified phase spectra signals are created using de-correlation filters, which modified phase spectra signals are uncorrelated with the signals of said original representation. The modified phase spectra signals are mixed with each other using predetermined mixing parameters, in order to provide a replicated ambient HOA component. Finally the spatially sparse decoded HOA representation is enhanced with the replicated time domain HOA representation.

ORIENTATION-AWARE SURROUND SOUND PLAYBACK

Example embodiments disclosed herein relate to orientation-aware surround sound playback. A method for processing audio on an electronic device that includes a plurality of loudspeakers is disclosed, the loudspeakers arranged in more than one dimension of the electronic device. The method includes, responsive to receipt of a plurality of received audio streams, generating a rendering component associated with the plurality of received audio streams, determining an orientation dependent component of the rendering component, processing the rendering component by updating the orientation dependent component according to an orientation of the loudspeakers and dispatching the received audio streams to the plurality of loudspeakers for playback based on the processed rendering component. Corresponding system and computer program products are also disclosed.

METHOD, COMPUTER READABLE STORAGE MEDIUM, AND APPARATUS FOR DETERMINING A TARGET SOUND SCENE AT A TARGET POSITION FROM TWO OR MORE SOURCE SOUND SCENES

A method, a computer readable storage medium, and an apparatus for determining a target sound scene at a target position from two or more source sound scenes. A positioning unit positions spatial domain representations of the two or more source sound scenes in a virtual scene. These representations are represented by virtual loudspeaker positions. A projecting unit then obtains projected virtual loudspeaker positions of a spatial domain representation of the target sound scene by projecting the virtual loudspeaker positions of the two or more source sound scenes on a circle or a sphere around the target position.
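
The projection step can be sketched in two dimensions as follows. The abstract does not say how the projection is computed, so radial projection along the line from the target position to each virtual loudspeaker is an assumption here, and the function name, coordinates, and radius are hypothetical.

```python
import math

def project_onto_circle(points, target, radius=1.0):
    """Radially project 2-D points onto a circle of the given radius around target."""
    projected = []
    for x, y in points:
        dx, dy = x - target[0], y - target[1]
        dist = math.hypot(dx, dy)
        if dist == 0.0:
            continue  # direction is undefined for a point at the target itself
        projected.append((target[0] + radius * dx / dist,
                          target[1] + radius * dy / dist))
    return projected

# Hypothetical virtual loudspeaker positions taken from two source sound scenes.
virtual_speakers = [(3.0, 0.0), (0.0, 4.0), (-2.0, -2.0)]
on_circle = project_onto_circle(virtual_speakers, target=(0.0, 0.0))
```

Every projected position keeps its direction as seen from the target position and lies at the same distance from it.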

SIGNAL PROCESSING METHODS AND SYSTEMS FOR RENDERING AUDIO ON VIRTUAL LOUDSPEAKER ARRAYS
20170245082 · 2017-08-24

Techniques of rendering audio involve applying a balanced-realization state space model to each head-related transfer function (HRTF) to reduce the order of an effective FIR or even an infinite impulse response (IIR) filter. Along these lines, each HRTF G(z) is derived from a head-related impulse response filter (HRIR) via, e.g., a z-transform. The data of the HRIR may be used to construct a first state space representation [A, B, C, D] of the HRTF via the relation $G(z) = C(zI - A)^{-1}B + D$. This first state space representation is not unique and so for an FIR filter, A and B may be set to simple, binary-valued arrays, while C and D contain the HRIR data. This representation leads to a simple form of a Gramian Q whose eigenvectors provide system states that maximize the system gain as measured by a Hankel norm. Further, a factorization of Q provides a transformation into a balanced state space in which the Gramian is equal to a diagonal matrix of the eigenvalues of Q. By considering only those states associated with an eigenvalue greater than some threshold, the balanced state space representation of the HRTF may be truncated to provide an approximate HRTF that approximates the original HRTF very well while reducing the amount of computation required by as much as 90%.
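
As an illustration of the first state space representation only, the sketch below builds one possible [A, B, C, D] for an FIR filter, with a binary shift matrix A, a binary unit vector B, and the impulse response data in C and D. The specific layout, the function names, and the sample taps are assumptions, and the Gramian factorization and truncation steps are not shown.

```python
def fir_state_space(h):
    """Build one state space form [A, B, C, D] of an FIR filter with taps h."""
    n = len(h) - 1
    # A shifts the state down by one position; B feeds the input into state 0.
    A = [[1.0 if r == c + 1 else 0.0 for c in range(n)] for r in range(n)]
    B = [1.0 if r == 0 else 0.0 for r in range(n)]
    C = [float(v) for v in h[1:]]   # taps h[1..n]
    D = float(h[0])                 # direct feedthrough tap h[0]
    return A, B, C, D

def impulse_response(A, B, C, D, length):
    """Simulate x[k+1] = A x[k] + B u[k], y[k] = C x[k] + D u[k] for a unit impulse."""
    n = len(B)
    x = [0.0] * n
    out = []
    for k in range(length):
        u = 1.0 if k == 0 else 0.0
        out.append(sum(c * xi for c, xi in zip(C, x)) + D * u)
        x = [sum(A[r][c] * x[c] for c in range(n)) + B[r] * u for r in range(n)]
    return out

hrir = [0.5, -0.25, 0.125, 0.0625]   # hypothetical impulse response taps
A, B, C, D = fir_state_space(hrir)
response = impulse_response(A, B, C, D, 6)
```

Simulating the state equations with a unit impulse returns the original taps followed by zeros, which is a quick check that the representation realizes the same filter.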

Method and system for processing an audio signal including ambisonic encoding
11432092 · 2022-08-30

A method for processing a sound signal including synchronously acquiring an input sound signal S_input by means of at least two omnidirectional microphones, and encoding the input sound signal S_input in a sound data format D of the ambisonics type of order R, R being a natural number greater than or equal to one, the encoding step including a directivity optimisation sub-step carried out by means of filters of the Finite Impulse Response (FIR) type. Each of the signals acquired by the microphones is filtered during the directivity optimisation sub-step by an FIR filter, then subtracted from an unfiltered version of each of the other signals in order to obtain N enhanced signals. The present invention also relates to a system for processing the sound signal.
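
The filter-and-subtract sub-step can be sketched for the two-microphone case as follows. The abstract does not give the filter design, so a one-sample delay is used as the FIR filter purely for illustration; the function names and sample values are hypothetical.

```python
def fir_filter(signal, taps):
    """Direct-form FIR filter: out[n] = sum over k of taps[k] * signal[n - k]."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, t in enumerate(taps):
            if n - k >= 0:
                acc += t * signal[n - k]
        out.append(acc)
    return out

def enhance_pair(x1, x2, taps):
    """Subtract the filtered version of each signal from the unfiltered other signal."""
    f1 = fir_filter(x1, taps)
    f2 = fir_filter(x2, taps)
    e1 = [a - b for a, b in zip(x1, f2)]
    e2 = [a - b for a, b in zip(x2, f1)]
    return e1, e2

# Hypothetical scene: the sound reaches microphone 2 one sample after microphone 1.
source = [1.0, 0.5, -0.5, 0.25, 0.0]
mic1 = source
mic2 = [0.0] + source[:-1]
delay_taps = [0.0, 1.0]   # assumption: the FIR filter is a pure one-sample delay
e1, e2 = enhance_pair(mic1, mic2, delay_taps)
```

With these values the second enhanced signal cancels completely while the first does not, which shows how the subtraction shapes directivity from omnidirectional capsules.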