H04M2203/509

Collecting and correlating microphone data from multiple co-located clients, and constructing 3D sound profile of a room
20190268476 · 2019-08-29

An overlay network platform facilitates a multi-party conference. End users participate in the conference using client-based web browser software, and using a protocol such as WebRTC. According to this disclosure, an enhanced audio experience for the conference is provided by collecting and correlating microphone data from multiple co-located clients, and then constructing (at the platform) a three-dimensional (3D) sound profile of the room in which the clients are co-located. By processing in the platform (as opposed to locally at each client), the approach enables platform-side creation of an ad-hoc, high-quality microphone array that identifies the relative positions and orientations of the microphones that are being used by the clients. Individual audio streams received from the microphones are combined, and the relative position information (of the individual microphones) is used to render a single audio stream that represents a high-quality recording of the audio in the common physical space. Other clients in the conference request, receive and play back this high-quality stream to obtain a high-fidelity 3D representation of the audio as if they are physically present in the room.
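
The abstract does not say how the platform correlates the microphone data. As a minimal sketch only, assuming the correlation step amounts to finding the sample offset between two clients' recordings of the same sound, a brute-force cross-correlation could look as follows. The function name `estimate_lag` and the sample streams are hypothetical and are not taken from the disclosure.

```python
def estimate_lag(ref, other, max_lag):
    """Return the lag (in samples) at which `other` best matches `ref`.

    A positive result means `other` is a delayed copy of `ref`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Correlation score for this candidate lag.
        score = 0.0
        for n, r in enumerate(ref):
            m = n + lag
            if 0 <= m < len(other):
                score += r * other[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical streams from two co-located clients: client B picks up
# the same click 3 samples after client A.
click = [0.0, 1.0, 0.5, -0.3, 0.0]
stream_a = click + [0.0] * 8
stream_b = [0.0] * 3 + click + [0.0] * 5

lag = estimate_lag(stream_a, stream_b, max_lag=6)
print(lag)
```

Pairwise lags of this kind, multiplied by the speed of sound and divided by the sample rate, give path-length differences from which relative microphone positions could in principle be inferred. That further step is not shown here.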

Method for optimizing speech pickup in a communication device

A method for optimizing speech pickup in a speakerphone system, wherein the speakerphone system comprises a microphone system placed in a specific configuration, wherein the method comprises receiving acoustic input signals by the microphone system, processing said acoustic input signals by using an algorithm for focusing and steering a selected target sound signal towards a desired direction, and transmitting an output signal based on said processing.
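
The abstract does not name the focusing and steering algorithm. One common technique that fits the description is delay-and-sum beamforming. The sketch below is an illustration under assumptions, not the claimed method: it assumes a linear array, a far-field source, integer sample delays, and a sign convention in which microphones at larger `x` hear the wavefront later for positive angles. All names are hypothetical.

```python
import math


def steering_delays(mic_x, angle_deg, fs, c=343.0):
    """Per-microphone arrival delays (in samples) for a far-field source.

    mic_x: microphone positions along the array axis, in metres.
    angle_deg: source angle from broadside (assumed convention).
    fs: sample rate in Hz; c: speed of sound in m/s.
    """
    raw = [x * math.sin(math.radians(angle_deg)) / c * fs for x in mic_x]
    base = min(raw)  # shift so the earliest microphone has delay 0
    return [round(r - base) for r in raw]


def delay_and_sum(signals, delays):
    """Align each channel by its arrival delay, then average the channels."""
    length = min(len(s) - d for s, d in zip(signals, delays))
    return [
        sum(s[n + d] for s, d in zip(signals, delays)) / len(signals)
        for n in range(length)
    ]


# Hypothetical three-microphone capture of one target signal that
# reaches each microphone one sample later than the previous one.
target = [1.0, -1.0, 2.0, 0.0]
mics = [
    target + [0.0, 0.0],
    [0.0] + target + [0.0],
    [0.0, 0.0] + target,
]
delays = steering_delays([0.0, 0.1, 0.2], angle_deg=90, fs=3430)
output = delay_and_sum(mics, delays)
print(delays, output)
```

When the steering delays match the true arrival delays, the target adds coherently, while sound from other directions is averaged out of phase and attenuated.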

SOUND SIGNAL PROCESSING METHOD AND APPARATUS, AND COMPUTER-READABLE STORAGE MEDIUM
20240185876 · 2024-06-06

The present disclosure discloses a sound signal processing method and apparatus, and a computer-readable storage medium, and belongs to the field of audio processing technologies. A device first obtains first spatial location information of a sound pickup space and second spatial location information of a plurality of microphones deployed non-linearly, and then determines, based on first sound signals respectively received by the plurality of microphones and the second spatial location information, a spatial location of a first sound source that emits the first sound signals. If the first sound source is located in the sound pickup space, the device performs enhancement processing on the sound signals emitted by the first sound source. In the present disclosure, sound sources in different spaces are distinguished, so that only a sound signal emitted by a sound source located in the sound pickup space is enhanced.
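
The final decision step described above can be sketched as follows. This is a hedged illustration only: it assumes the sound pickup space is an axis-aligned box, that the source location has already been estimated from the microphone signals, and that "enhancement" is a plain gain. The names `in_pickup_space` and `enhance_if_inside` are hypothetical.

```python
def in_pickup_space(source, box_min, box_max):
    """True if the (x, y, z) source position lies inside the box."""
    return all(lo <= s <= hi for s, lo, hi in zip(source, box_min, box_max))


def enhance_if_inside(frame, source, box_min, box_max, gain=2.0):
    """Apply gain only to sound whose source is inside the pickup space."""
    if in_pickup_space(source, box_min, box_max):
        return [x * gain for x in frame]
    return list(frame)  # sources outside the space are left unenhanced


# Hypothetical 4 m x 3 m x 2.5 m pickup space.
box_min, box_max = (0.0, 0.0, 0.0), (4.0, 3.0, 2.5)
frame = [0.1, -0.2, 0.3]

inside = enhance_if_inside(frame, (1.0, 1.5, 1.2), box_min, box_max)
outside = enhance_if_inside(frame, (5.0, 1.5, 1.2), box_min, box_max)
print(inside, outside)
```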

OPTIMAL VIEW SELECTION METHOD IN A VIDEO CONFERENCE
20190158733 · 2019-05-23

A system for ensuring that the best available view of a person's face is included in a video stream when the person's face is being captured by multiple cameras at multiple angles at a first endpoint. The system uses one or more microphone arrays to capture direct-reverberant ratio information corresponding to the views, and determines which view most closely matches a view of the person looking directly at the camera, thereby improving the experience for viewers at a second endpoint.
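
The abstract does not state how the direct-reverberant ratio is computed or compared. As a rough sketch, assuming an estimated room impulse response is available per view and that a higher ratio indicates the talker is facing that view's camera, the selection could be expressed as below. The function `drr_db`, the split point `direct_samples`, and the response values are illustrative assumptions.

```python
import math


def drr_db(impulse_response, direct_samples):
    """Direct-to-reverberant energy ratio in dB.

    Energy in the first `direct_samples` taps is treated as the direct
    path; the remainder is treated as reverberation.
    """
    direct = sum(h * h for h in impulse_response[:direct_samples])
    reverb = sum(h * h for h in impulse_response[direct_samples:])
    return 10.0 * math.log10(direct / reverb)


# Hypothetical impulse responses measured near two cameras.
views = {
    "camera_a": [1.0, 0.1, 0.1],  # strong direct path
    "camera_b": [0.5, 0.3, 0.3],  # weaker direct path, more reverberation
}
scores = {name: drr_db(h, direct_samples=1) for name, h in views.items()}
best_view = max(scores, key=scores.get)
print(best_view)
```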

Collecting and correlating microphone data from multiple co-located clients, and constructing 3D sound profile of a room
10291783 · 2019-05-14

An overlay network platform facilitates a multi-party conference. End users participate in the conference using client-based web browser software, and using a protocol such as WebRTC. According to this disclosure, an enhanced audio experience for the conference is provided by collecting and correlating microphone data from multiple co-located clients, and then constructing (at the platform) a three-dimensional (3D) sound profile of the room in which the clients are co-located. By processing in the platform (as opposed to locally at each client), the approach enables platform-side creation of an ad-hoc, high-quality microphone array that identifies the relative positions and orientations of the microphones that are being used by the clients. Individual audio streams received from the microphones are combined, and the relative position information (of the individual microphones) is used to render a single audio stream that represents a high-quality recording of the audio in the common physical space. Other clients in the conference request, receive and play back this high-quality stream to obtain a high-fidelity 3D representation of the audio as if they are physically present in the room.

METHOD FOR OPTIMIZING SPEECH PICKUP IN A COMMUNICATION DEVICE

A method for optimizing speech pickup in a speakerphone system, wherein the speakerphone system comprises a microphone system placed in a specific configuration, wherein the method comprises receiving acoustic input signals by the microphone system, processing said acoustic input signals by using an algorithm for focusing and steering a selected target sound signal towards a desired direction, and transmitting an output signal based on said processing.

Wireless conference call telephone

A wireless conference call telephone system uses body-worn wired or wireless audio endpoints comprising microphones and, optionally, speakers. These audio-endpoints, which include headsets, pendants, and clip-on microphones to name a few, are used to capture the user's voice and the resulting data may be used to remove echo and environmental acoustic noise. Each audio-endpoint transmits its audio to the telephony gateway, where noise and echo suppression can take place if not already performed on the audio-endpoint, and where each audio-endpoint's output can be labeled, integrated with the output of other audio-endpoints, and transmitted over one or more telephony channels of a telephone network. The noise and echo suppression can also be done on the audio-endpoint. The labeling of each user's output can be used by the outside caller's phone to spatially locate each user in space, increasing intelligibility.

MULTITALKER OPTIMISED BEAMFORMING SYSTEM AND METHOD

A method of processing a series of microphone inputs of an audio conference, the method including the steps of: (a) conducting a spatial analysis and feature extraction of the audio conference based on current audio activity; (b) aggregating historical information to obtain information about the approximate relative location of recent sound objects relative to the series of microphone inputs; (c) utilising the relative location or distance of the sound objects from the series of microphone inputs to determine if beam forming should be utilised to enhance the audio reception from recent sound objects.
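
Steps (b) and (c) can be illustrated with a small sketch. It assumes, purely for illustration, that historical information is aggregated by exponentially smoothing each sound object's estimated distance, and that beamforming is enabled when any recent object is farther than a threshold. The class `TalkerHistory`, the smoothing factor, and the threshold are hypothetical and are not taken from the abstract.

```python
class TalkerHistory:
    """Running estimate of each sound object's distance from the array."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha   # weight given to the newest observation
        self.distance = {}   # sound object id -> smoothed distance (m)

    def update(self, talker, observed_distance):
        """Fold a new distance observation into the history."""
        prev = self.distance.get(talker)
        if prev is None:
            self.distance[talker] = observed_distance
        else:
            self.distance[talker] = (
                self.alpha * observed_distance + (1 - self.alpha) * prev
            )

    def use_beamforming(self, threshold=1.5):
        """Assumed rule: beamform if any recent object is beyond threshold."""
        return any(d > threshold for d in self.distance.values())


history = TalkerHistory()
history.update("near_talker", 0.5)
history.update("near_talker", 0.7)
near_only = history.use_beamforming()

history.update("far_talker", 3.0)
with_far = history.use_beamforming()
print(near_only, with_far)
```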

Array microphone module and system

A microphone module comprises a housing, an audio bus, and a first plurality of microphones in communication with the audio bus. The microphone module further comprises a module processor in communication with the first plurality of microphones and the audio bus. The module processor is configured to detect the presence of an array processor in communication with the audio bus, detect the presence of a second microphone module in communication with the audio bus, and configure the audio bus to pass audio signals from both the first plurality of microphones and the second microphone module to the array processor.

PHASE RESPONSE MISMATCH CORRECTION FOR MULTIPLE MICROPHONES

For a multiple microphone system, a phase response mismatch may be corrected. One embodiment includes: receiving audio from a first microphone and from a second microphone, the microphones being coupled to a single device for combining the received audio; recording the received audio from the first microphone and the second microphone before combining the received audio; detecting a phase response mismatch in the recording at the device between the audio received at the second microphone and the audio received at the first microphone; if a phase response mismatch is detected, estimating a phase delay between the second microphone and the first microphone; and storing the estimated phase delay for use in correcting the phase delay in received audio before combining the received audio.
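
The detect, estimate, store, and correct sequence can be sketched as follows. This is a simplified illustration that treats the phase delay as a whole-sample time offset found by cross-correlation; the abstract does not specify the estimator. The `calibration` store and all function names are hypothetical.

```python
def best_lag(first, second, max_lag):
    """Lag (in samples) at which `second` lines up best with `first`."""
    def score(lag):
        return sum(
            first[n] * second[n + lag]
            for n in range(len(first))
            if 0 <= n + lag < len(second)
        )
    return max(range(-max_lag, max_lag + 1), key=score)


calibration = {}  # hypothetical store: microphone pair -> delay in samples


def calibrate(pair, first, second, max_lag=4):
    """Detect a mismatch and, if one is found, store the estimated delay."""
    lag = best_lag(first, second, max_lag)
    if lag != 0:
        calibration[pair] = lag
    return lag


def correct(pair, second):
    """Shift the second microphone's audio by the stored delay."""
    lag = calibration.get(pair, 0)
    if lag > 0:
        return second[lag:] + [0.0] * lag
    if lag < 0:
        return [0.0] * (-lag) + second[:lag]
    return list(second)


# Hypothetical recording in which microphone 2 lags microphone 1 by 2 samples.
mic1 = [0.0, 1.0, -1.0, 0.5, 0.0, 0.0]
mic2 = [0.0, 0.0, 0.0, 1.0, -1.0, 0.5]

estimated = calibrate(("mic1", "mic2"), mic1, mic2)
aligned = correct(("mic1", "mic2"), mic2)
print(estimated, aligned)
```

Once stored, the delay can be applied to later audio from the second microphone before the two channels are combined, without re-estimating it each time.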