H04S2400/15

System and Method for Self-attention-based Combining of Multichannel Signals for Speech Processing

A method, computer program product, and computing system for receiving a plurality of signals from a plurality of microphones, thus defining a plurality of channels. A weighted multichannel representation of the plurality of channels may be generated. A plurality of weights for each channel of the plurality of channels may be generated based upon, at least in part, the weighted multichannel representation of the plurality of channels. A single channel representation of the plurality of channels may be generated based upon, at least in part, the weighted multichannel representation of the plurality of channels and the plurality of weights generated for each channel of the plurality of channels.
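The abstract describes three stages: a weighted multichannel representation, per-channel weights derived from it, and a single-channel output combining both. A minimal sketch of that pipeline, using scaled dot-product attention over channels, is shown below; the projection matrices, dimensions, and function names are illustrative stand-ins (random projections in place of learned ones), not the patented method.

```python
import numpy as np

def combine_channels(x, d_k=16, seed=0):
    """Attention-style combining of C microphone channels into one.

    x: (C, T) array of time-aligned microphone signals.
    Returns a (T,) single-channel signal formed as a weighted sum of
    the channels, with weights derived from scaled dot-product
    attention over per-channel feature projections.
    """
    rng = np.random.default_rng(seed)
    C, T = x.shape
    # Illustrative projections (random stand-ins for learned weights).
    W_q = rng.standard_normal((T, d_k)) / np.sqrt(T)
    W_k = rng.standard_normal((T, d_k)) / np.sqrt(T)
    q = x @ W_q                       # (C, d_k) queries, one per channel
    k = x @ W_k                       # (C, d_k) keys
    scores = q @ k.T / np.sqrt(d_k)   # (C, C) channel-to-channel scores
    attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)  # softmax over channels
    weighted = attn @ x               # (C, T) weighted multichannel repr.
    w = attn.mean(axis=0)             # (C,) one scalar weight per channel
    return w @ weighted               # (T,) single-channel output
```

In a trained system the projections would be learned jointly with a downstream speech task; the sketch only shows the data flow from channels to weights to a single output.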

Distributed audio capturing techniques for virtual reality (VR), augmented reality (AR), and mixed reality (MR) systems

Systems, devices, and methods for capturing audio which can be used in applications such as virtual reality, augmented reality, and mixed reality systems. Some systems can include a plurality of distributed monitoring devices. Each monitoring device can include a microphone and a location tracking unit. The monitoring devices can capture audio signals in an environment, as well as location tracking signals which respectively indicate the locations of the monitoring devices over time during capture of the audio signals. The system can also include a processor to receive the audio signals and the location tracking signals. The processor can determine one or more acoustic properties of the environment based on the audio signals and the location tracking signals.

Acquisition equipment, sound acquisition method, and sound source tracking system and method

Acquisition equipment, a sound acquisition method, a sound source tracking system, and a sound source tracking method are provided. The acquisition equipment includes an audio acquisition device, an image acquisition device, an information processing device, and an angle control device. The audio acquisition device is configured to acquire the sound of a target object; the image acquisition device is configured to acquire an optical image including an acquisition object; the information processing device is configured to process the optical image to determine position information of the target object; and the angle control device is configured to receive the position information of the target object sent by the information processing device, and control the sound pick-up angle of the audio acquisition device according to the position information of the target object.

Wearer identification based on personalized acoustic transfer functions

A wearable device includes an audio system. In one embodiment, the audio system includes a sensor array that includes a plurality of acoustic sensors. When a user wears the wearable device, the audio system determines an acoustic transfer function for the user based upon detected sounds within a local area surrounding the sensor array. Because the acoustic transfer function is based upon the size, shape, and density of the user's body (e.g., the user's head), different acoustic transfer functions will be determined for different users. The determined acoustic transfer functions are compared with stored acoustic transfer functions of known users in order to authenticate the user of the wearable device.

APPARATUS, METHOD AND COMPUTER-READABLE STORAGE MEDIUM FOR MIXING COLLECTED SOUND SIGNALS OF MICROPHONES
20220394382 · 2022-12-08 ·

An apparatus comprising: one or more processors; and one or more memory devices configured to store one or more computer programs executable by the one or more processors. The one or more programs, when executed by the one or more processors, cause the apparatus to function as: a setting unit configured to set a user-selected angle section at a single sound collection position; an analysis unit configured to convert each of M collected sound signals into a frequency component; a beamforming unit configured to multiply the M frequency components obtained through conversion by the analysis unit by respective beamforming matrices to generate a plurality of acoustic signals of two channels; and a signal generation unit configured to synthesize the acoustic signals per channel and output an acoustic signal for every channel.
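The beamforming step described above (convert each of M signals to frequency components, multiply by per-bin beamforming matrices, synthesize a two-channel output) can be sketched as follows; the steering matrices are left as an input since their design is not specified in the abstract, and the function name is illustrative.

```python
import numpy as np

def beamform_two_channel(signals, steering):
    """Frequency-domain beamforming of M mic signals into 2 channels.

    signals:  (M, T) time-domain collected sound signals.
    steering: (F, 2, M) complex beamforming matrix per frequency bin,
              mapping M mic spectra to a 2-channel output.
    Returns a (2, T) time-domain two-channel acoustic signal.
    """
    M, T = signals.shape
    spectra = np.fft.rfft(signals, axis=-1)      # analysis: (M, F)
    F = spectra.shape[-1]
    assert steering.shape == (F, 2, M)
    # Per-bin matrix multiply: out[c, f] = sum_m steering[f, c, m] * spectra[m, f]
    out = np.einsum('fcm,mf->cf', steering, spectra)  # (2, F)
    return np.fft.irfft(out, n=T, axis=-1)       # synthesis: (2, T)
```

A practical implementation would work on short overlapping frames (STFT) rather than one full-length transform; the per-bin matrix multiply is the same either way.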

SOMATOSENSORY VIBRATION GENERATING DEVICE AND METHOD FOR FORMING SOMATOSENSORY VIBRATION
20220392318 · 2022-12-08 ·

The invention provides a somatosensory vibration generating device comprising: an audio signal receiving module for receiving sound waves of external environmental sounds and converting the sound waves into a first audio frequency signal; a digital-to-analog conversion module for performing digital-to-analog conversion on the first audio frequency signal to generate and output a second audio frequency signal after digital-to-analog conversion; a digital signal processing module for converting the second audio frequency signal output by the digital-to-analog conversion module into a first vibration signal; an operational amplifier for performing gain processing on the first vibration signal and outputting a second vibration signal after gain processing; and at least one tactile transducer at least comprising a vibration element; wherein a frequency of the second audio frequency signal is less than 200 Hz.
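The signal chain above restricts the signal driving the transducer to low frequencies (under 200 Hz) and applies a gain stage. A minimal sketch of those two steps, using a brick-wall FFT low-pass as an illustrative stand-in for the device's digital signal processing module, assuming the function name and parameters:

```python
import numpy as np

def audio_to_vibration(audio, sample_rate, cutoff_hz=200.0, gain=2.0):
    """Derive a vibration-driving signal from an audio signal.

    Keeps only spectral content below cutoff_hz (the abstract requires
    the signal to stay under 200 Hz), then applies a gain stage
    standing in for the operational amplifier.
    """
    spectrum = np.fft.rfft(audio)
    freqs = np.fft.rfftfreq(len(audio), d=1.0 / sample_rate)
    spectrum[freqs >= cutoff_hz] = 0.0      # brick-wall low-pass
    first_vibration = np.fft.irfft(spectrum, n=len(audio))
    return gain * first_vibration           # post-gain vibration signal
```

A real device would use a causal analog or IIR filter rather than an offline FFT mask, but the spectral constraint is the same.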

Audio apparatus and method of operation therefor

An audio apparatus, e.g. for rendering audio for a virtual/augmented reality application, comprises a receiver (201) for receiving audio data for an audio scene including a first audio component representing a real-world audio source present in an audio environment of a user. A determinator (203) determines a first property of a real-world audio component from the real-world audio source and a target processor (205) determines a target property for a combined audio component being a combination of the real-world audio component received by the user and rendered audio of the first audio component received by the user. An adjuster determines a render property by modifying a property of the first audio component indicated by the audio data for the first audio component in response to the target property and the first property. A renderer (209) renders the first audio component in response to the render property.

Spatial audio processing

According to an example embodiment, a method is provided for processing a spatial audio signal that represents an audio scene, wherein the spatial audio signal is controllable and associated with at least two viewing directions. The method includes receiving a focus direction and a focus amount; processing the spatial audio signal by modifying the audio scene so as to control emphasis in, at least in part, a portion of the spatial audio signal in said focus direction according to said focus amount; and outputting the processed spatial audio signal, wherein the modified audio scene enables the emphasis in, at least in part, said portion of the spatial audio signal in said focus direction according to said focus amount.
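One simple way to realize "emphasis in a focus direction according to a focus amount" is a per-source gain that blends between unity and a directional window. The sketch below assumes a parametric scene of sources with known azimuths; the cosine window and function name are illustrative choices, not the method of the abstract.

```python
import numpy as np

def apply_focus(directions_deg, gains_in, focus_dir_deg, focus_amount):
    """Emphasize sources near a focus direction by a focus amount.

    directions_deg: per-source azimuths of a parametric spatial scene.
    focus_amount in [0, 1]: 0 leaves the scene unchanged, 1 applies
    the full emphasis pattern (a cosine window around focus_dir_deg).
    """
    delta = np.radians(np.asarray(directions_deg) - focus_dir_deg)
    pattern = 0.5 * (1.0 + np.cos(delta))   # 1 at focus, 0 opposite
    gains = (1.0 - focus_amount) + focus_amount * pattern
    return np.asarray(gains_in) * gains
```

With focus_amount = 0 all gains pass through unchanged; at focus_amount = 1 a source opposite the focus direction is fully attenuated.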

Apparatus and method for processing volumetric audio

A method including receiving an audio scene including at least one source captured using at least one near field microphone and at least one far field microphone. The method includes determining at least one room-impulse-response (RIR) associated with the audio scene based on the at least one near field microphone and the at least one far field microphone, accessing a predetermined scene geometry corresponding to the audio scene, and identifying a best match to the predetermined scene geometry in a scene geometry database. The method also includes performing an RIR comparison based on the at least one RIR and at least one geometric RIR associated with the best-matching geometry, and rendering a volumetric audio scene based on a result of the RIR comparison.

Audio system for dynamic determination of personalized acoustic transfer functions

An eyewear device includes an audio system. In one embodiment, the audio system includes a microphone array that includes a plurality of acoustic sensors. Each acoustic sensor is configured to detect sounds within a local area surrounding the microphone array. For a plurality of the detected sounds, the audio system performs a direction of arrival (DoA) estimation. Based on parameters of the detected sound and/or the DoA estimation, the audio system may then generate or update one or more acoustic transfer functions unique to a user. The audio system may use the one or more acoustic transfer functions to generate audio content for the user.
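A common building block for the direction-of-arrival (DoA) estimation mentioned above is GCC-PHAT: estimate the inter-microphone time delay from the phase of the cross-spectrum, then convert the delay to an arrival angle. The sketch below illustrates that standard technique for a two-microphone pair; it is not the specific estimator of the abstract, and the function names are illustrative.

```python
import numpy as np

def gcc_phat_delay(sig, ref, fs):
    """Estimate the inter-microphone time delay with GCC-PHAT."""
    n = len(sig) + len(ref)
    S = np.fft.rfft(sig, n=n)
    R = np.fft.rfft(ref, n=n)
    cross = S * np.conj(R)
    cross /= np.abs(cross) + 1e-12           # PHAT weighting
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = np.argmax(np.abs(cc)) - max_shift
    return shift / fs                        # delay in seconds

def doa_from_delay(delay, mic_distance, speed_of_sound=343.0):
    """Convert a time delay to an arrival angle for a two-mic pair."""
    cos_theta = np.clip(delay * speed_of_sound / mic_distance, -1.0, 1.0)
    return np.degrees(np.arccos(cos_theta))
```

With more than two acoustic sensors, delays from several pairs would be combined to resolve a full direction rather than a single angle.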