H04R5/00

Enhanced spatialization system

A system enhances spatialization in an audio signal at a receiving location. The system applies a phase difference analysis to signals received from an array of spaced apart input devices that convert sound into electrical signals. The system derives spatial or directional information about the relative locations of the sound sources. The converted signals may be mixed using weights derived from the spatial information to generate a multichannel output signal that, when processed by a remote or local audio system, reproduces at the receiving location a representation of the relative locations of the sound sources at the originating location.
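
The chain from phase difference to mixing weights can be illustrated with a minimal sketch. It is not the patented method: it assumes only two microphones, a single narrowband source, and a constant-power stereo pan law, and all function names are hypothetical. The arrival angle follows from sin θ = c·Δφ / (2π·f·d), where d is the microphone spacing.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def arrival_angle(phase_diff, freq_hz, spacing_m):
    """Estimate the arrival angle (radians, 0 = broadside) from a phase difference.

    Only unambiguous when spacing_m is below half a wavelength.
    """
    s = SPEED_OF_SOUND * phase_diff / (2.0 * math.pi * freq_hz * spacing_m)
    s = max(-1.0, min(1.0, s))  # clamp in case noise pushes |s| above 1
    return math.asin(s)


def pan_weights(angle):
    """Map an angle in [-pi/2, pi/2] to constant-power (left, right) weights."""
    p = (angle + math.pi / 2.0) / 2.0  # ranges over 0 .. pi/2
    return math.cos(p), math.sin(p)


def spatialize(sample, phase_diff, freq_hz, spacing_m):
    """Mix one mono sample into a (left, right) pair using the derived weights."""
    left, right = pan_weights(arrival_angle(phase_diff, freq_hz, spacing_m))
    return left * sample, right * sample
```

A zero phase difference yields equal left and right weights; under the sign convention assumed here, a positive phase difference shifts the source toward the right channel.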

Encoder, decoder and methods for backward compatible dynamic adaption of time/frequency resolution spatial-audio-object-coding

A decoder for generating an audio output signal having one or more audio output channels from a downmix signal having a plurality of time-domain downmix samples is provided. The downmix signal encodes two or more audio object signals. The decoder has a window-sequence generator for determining a plurality of analysis windows, each having a plurality of time-domain downmix samples of the downmix signal and a window length indicating the number of the time-domain downmix samples. Moreover, the decoder has a t/f-analysis module for transforming the plurality of time-domain downmix samples of each analysis window from a time-domain to a time-frequency domain depending on the window length of said analysis window, to obtain a transformed downmix. Furthermore, the decoder has an un-mixing unit for un-mixing the transformed downmix based on parametric side information on the two or more audio object signals to obtain the audio output signal. Moreover, an encoder is provided.
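
The decoder pipeline (window sequence, window-length-dependent transform, parametric un-mixing) can be sketched as follows. This is a deliberately simplified illustration, not the codec described in the abstract: it assumes non-overlapping rectangular windows, a naive DFT as the time-frequency transform, and one broadband gain per object per window as the "parametric side information". All names are hypothetical.

```python
import cmath


def dft(samples):
    """Naive discrete Fourier transform; its size equals the window length."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def analyze(downmix, window_lengths):
    """Cut the downmix into consecutive windows and transform each one.

    The transform size follows the window length, so short windows give
    fine time resolution and long windows give fine frequency resolution.
    """
    if sum(window_lengths) > len(downmix):
        raise ValueError("window sequence is longer than the downmix")
    frames, start = [], 0
    for length in window_lengths:
        frames.append(dft(downmix[start:start + length]))
        start += length
    return frames


def unmix(frames, object_gains):
    """Scale each transformed window by per-object gains (assumed side info).

    Returns result[window][object] as a list of frequency coefficients.
    """
    return [
        [[gain * coeff for coeff in frame] for gain in gains]
        for frame, gains in zip(frames, object_gains)
    ]
```

For example, `analyze([1, 1, 1, 1, 1, 0], [4, 2])` produces one 4-bin frame followed by one 2-bin frame, which `unmix` then scales per object.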

Method and apparatus for compressing and decompressing a Higher Order Ambisonics representation

Higher Order Ambisonics represents three-dimensional sound independent of a specific loudspeaker set-up. However, transmission of an HOA representation results in a very high bit rate. Therefore compression with a fixed number of channels is used, in which directional and ambient signal components are processed differently. The ambient HOA component is represented by a minimum number of HOA coefficient sequences. The remaining channels contain either directional signals or additional coefficient sequences of the ambient HOA component, depending on what will result in optimum perceptual quality. This processing can change on a frame-by-frame basis.

Acoustic camera based audio visual scene analysis

Techniques are disclosed for scene analysis including the use of acoustic imaging and computer audio vision processes for monitoring applications. In some embodiments, an acoustic image device is utilized with a microphone array, image sensor, acoustic image controller, and a controller. In some cases, the controller analyzes at least a portion of the spatial spectrum within the acoustic image data to detect sound variations by identifying regions of pixels having intensities exceeding a particular threshold. In addition, the controller can detect two or more co-occurring sound events based on the relative distance between pixels with intensities exceeding the threshold. The resulting data fusion of image pixel data, audio sample data, and acoustic image data can be analyzed using computer audio vision, sound/voice recognition, and acoustic signature techniques to recognize/identify audio and visual features associated with the event and to empirically or theoretically determine one or more conditions causing each event.
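
The thresholding and co-occurrence steps can be sketched in a few lines. This is an illustrative assumption about how such a detector might work, not the disclosed implementation: the acoustic image is taken to be a 2-D list of intensities, and pixels are grouped into one event when their Chebyshev distance is at most a hypothetical `max_gap`.

```python
def hot_pixels(image, threshold):
    """Return (row, col) of every pixel whose intensity exceeds the threshold."""
    return [
        (r, c)
        for r, row in enumerate(image)
        for c, value in enumerate(row)
        if value > threshold
    ]


def group_events(pixels, max_gap):
    """Group pixels into events; pixels within max_gap of each other are linked.

    Two or more returned groups suggest co-occurring sound events.
    """
    unvisited = set(pixels)
    events = []
    while unvisited:
        seed = unvisited.pop()
        cluster, frontier = [seed], [seed]
        while frontier:
            r, c = frontier.pop()
            near = [
                p for p in unvisited
                if max(abs(p[0] - r), abs(p[1] - c)) <= max_gap
            ]
            for p in near:
                unvisited.remove(p)
            cluster.extend(near)
            frontier.extend(near)
        events.append(sorted(cluster))
    return sorted(events)
```

Two adjacent bright pixels on one side of the image and a single bright pixel far away would be reported as two separate events.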

Stereo microphone device
09729962 · 2017-08-08

There is provided an external stereo microphone device that makes it possible to change orientations of microphones over a wide range. An external stereo microphone attached to a mobile electronic device includes a pair of symmetrically-positioned microphones 60, a holder unit 40 having a pair of holders 46 symmetrically positioned so as to accommodate the pair of microphones 60 respectively and a joint 48 for joining the pair of holders 46 together, and a case 14 having a substantially-cylindrical portion 20 that rotatably supports the holder unit 40 and a body 18 that accommodates a circuit board 22 of the case 14. The joint 48 is a substantially-plate-shaped region that joins outer circumferential edges of the holders 46 together and that has a circumferential width which is one-half or less of an entire circumference of the holder unit.

Audio processing

An audio processing system (100) for spatial synthesis comprises an upmix stage (110) that receives a decoded m-channel downmix signal (X) and outputs, based thereon, an n-channel upmix signal (Y), wherein 2 ≤ m < n. The upmix stage comprises a downmix modifying processor (120), which receives the m-channel downmix signal and outputs a modified downmix signal (d_1, d_2) obtained by cross mixing and non-linear processing of the downmix signal, and further comprises a first mixing matrix (130), which receives the downmix signal and the modified downmix signal, forms an n-channel linear combination of the downmix signal channels and modified downmix signal channels only, and outputs this as the n-channel upmix signal. In an embodiment, the first mixing matrix accepts one or more mixing parameters (g, α_1, ...) controlling at least one gain in the linear combination performed by the first mixing matrix. The gains are polynomials of degree ≤ 2.
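
The structure of the upmix stage can be sketched for m = 2 and n = 3. The cross-mix coefficients, the use of tanh as the non-linearity, and the function names are assumptions chosen only for illustration; the abstract does not specify them. Each matrix entry is a polynomial of degree at most 2 in a single mixing parameter g.

```python
import math


def modify_downmix(x1, x2):
    """Cross-mix the two downmix channels, then apply a non-linearity.

    The 0.5 cross-mix factor and tanh are placeholder choices.
    """
    return math.tanh(x1 + 0.5 * x2), math.tanh(x2 + 0.5 * x1)


def poly_gain(coeffs, g):
    """Evaluate c0 + c1*g + c2*g**2, a gain polynomial of degree <= 2."""
    c0, c1, c2 = coeffs
    return c0 + c1 * g + c2 * g * g


def upmix(x, g, gain_polys):
    """Form n output channels as a linear combination of (x1, x2, d1, d2).

    gain_polys has one row per output channel; each row holds four
    (c0, c1, c2) coefficient triples, one per input to the matrix.
    """
    d1, d2 = modify_downmix(x[0], x[1])
    inputs = (x[0], x[1], d1, d2)
    return [
        sum(poly_gain(coeffs, g) * signal for coeffs, signal in zip(row, inputs))
        for row in gain_polys
    ]
```

With a matrix whose first two rows pass x1 and x2 through unchanged and whose third row weights d1 and d2 by g, the call returns the two original channels plus one synthesized channel scaled by the mixing parameter.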

Intelligently increasing the sound level of player

Techniques for optimizing a player based on the addition of a second player are disclosed. In an embodiment, when a first player no longer needs to play certain audio frequencies due to the addition of a second player, the gain of the first player is automatically increased as part of the setup process. In another embodiment, when a first player needs to play certain audio frequencies, for example due to the removal of a second player, the gain of the first player is automatically decreased. Many other embodiments are disclosed.

Front loudspeaker directivity for surround sound systems
09729992 · 2017-08-08

An audio receiver that receives left, right, and center audio channels for a piece of sound program content is described. A content processor in the audio receiver generates separate audio signals corresponding to each channel for driving corresponding transducers in left and right loudspeaker arrays. The content processor generates (1) first center audio signals for driving transducers in the left array to generate a first center pattern, (2) second center audio signals for driving transducers in the right array to generate a second center pattern, (3) left audio signals for driving transducers in the left array to generate a left pattern, and (4) right audio signals for driving transducers in the right array to generate a right pattern. The first and second center patterns collectively represent the center channel while the left and right patterns respectively represent the left and right channels.

Information processing apparatus, information processing method, and program
11240624 · 2022-02-01

An information processing apparatus includes a storage, a sensor, a controller, and a sound output unit. The storage is capable of storing a plurality of sound information items associated with respective positions. The sensor is capable of detecting a displacement of one of the information processing apparatus and a user of the information processing apparatus. The controller is capable of extracting at least one sound information item satisfying a predetermined condition out of the plurality of stored sound information items and generating, based on the detected displacement, multichannel sound information obtained by localizing the extracted sound information item at the associated position. The sound output unit is capable of converting the generated multichannel sound information into stereo sound information and outputting it.
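
The extract-then-localize flow can be sketched under strong simplifying assumptions: positions are 2-D, the "predetermined condition" is taken to be a distance limit, the sensor output is reduced to a user position and a heading, and the multichannel stage is skipped by panning directly to stereo. None of these choices, nor the function names, come from the abstract.

```python
import math


def select_items(items, user_pos, radius):
    """Keep the sound items whose stored position lies within radius of the user."""
    return [it for it in items if math.dist(it["pos"], user_pos) <= radius]


def stereo_gains(item_pos, user_pos, heading):
    """Constant-power (left, right) gains for a source seen from the user.

    heading is in radians, measured clockwise from the +y axis. Sources
    directly ahead and directly behind both map to the centre, a known
    limitation of plain stereo panning.
    """
    dx = item_pos[0] - user_pos[0]
    dy = item_pos[1] - user_pos[1]
    azimuth = math.atan2(dx, dy) - heading  # 0 = straight ahead, positive = right
    pan = math.sin(azimuth)                 # -1 (left) .. +1 (right)
    p = (pan + 1.0) * math.pi / 4.0         # 0 .. pi/2
    return math.cos(p), math.sin(p)
```

As the user moves or turns, recomputing `stereo_gains` with the updated position and heading keeps each selected sound anchored at its stored position.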