Patent classifications
H04M2203/509
Nearby talker obscuring, duplicate dialogue amelioration and automatic muting of acoustically proximate participants
In an audio conferencing environment, including multiple users participating by means of a series of associated audio input devices for the provision of audio input, and a series of audio output devices for the output of audio output streams to the multiple users, with the audio input and output devices being interconnected to a mixing control server for the control and mixing of the audio inputs from each audio input device to present a series of audio streams to the audio output devices, a method of reducing the effects of cross talk pickup of at least a first audio conversation by multiple audio input devices, the method including the steps of: (a) monitoring the series of audio input devices for the presence of a duplicate audio conversation input from at least two input audio sources in an audio output stream; and (b) where a duplicate audio conversation input is detected, suppressing the presence of the duplicate audio conversation input in the audio output stream.
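Steps (a) and (b) could be sketched as follows, assuming normalized cross-correlation as the duplicate detector and a keep-the-loudest-copy policy for suppression; the correlation window, threshold, and policy are illustrative assumptions, not details from the abstract:

```python
import numpy as np

def detect_duplicate(sig_a, sig_b, sample_rate, max_delay_s=0.05, threshold=0.6):
    """Step (a): flag two captures as the same conversation when their
    normalized cross-correlation peaks within a small lag window."""
    max_lag = int(max_delay_s * sample_rate)
    a = (sig_a - sig_a.mean()) / (sig_a.std() + 1e-12)
    b = (sig_b - sig_b.mean()) / (sig_b.std() + 1e-12)
    corr = np.correlate(a, b, mode="full") / len(a)
    mid = len(corr) // 2  # index of zero lag
    window = corr[mid - max_lag: mid + max_lag + 1]
    return bool(np.abs(window).max() >= threshold)

def suppress_duplicate(streams, sample_rate):
    """Step (b): among duplicated captures, keep only the loudest copy
    (an illustrative suppression policy) and return the kept indices."""
    keep = list(range(len(streams)))
    for i in range(len(streams)):
        for j in range(i + 1, len(streams)):
            if i in keep and j in keep and detect_duplicate(
                    streams[i], streams[j], sample_rate):
                quieter = i if np.abs(streams[i]).mean() < np.abs(streams[j]).mean() else j
                keep.remove(quieter)
    return keep
```

In practice the threshold and the mute-versus-attenuate decision would be tuned to the conferencing environment; this sketch simply drops the quieter duplicate from the mix.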
Joint acoustic echo control and adaptive array processing
A system and method for joint acoustic echo control and adaptive array processing, comprising the decomposition of a captured sound field into N sub-sound fields, applying linear echo cancellation to each sub-sound field, selecting L sub-sound fields from the N sub-sound fields, performing L channel adaptive array processing utilizing the L selected sub-sound fields, and applying non-linear audio echo cancellation.
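The pipeline in the abstract can be approximated in a toy form, assuming a uniform FFT filterbank stands in for the sound-field decomposition, NLMS for the linear echo canceller, an energy-based selection rule, and a plain sum as the array-processing stage (all of these are simplifying assumptions; the patent does not specify them):

```python
import numpy as np

def split_bands(x, n_bands):
    """Stand-in for sound-field decomposition: split the spectrum into
    n_bands uniform sub-bands, returned as time-domain sub-signals."""
    X = np.fft.rfft(x)
    edges = np.linspace(0, len(X), n_bands + 1).astype(int)
    out = []
    for k in range(n_bands):
        Y = np.zeros_like(X)
        Y[edges[k]:edges[k + 1]] = X[edges[k]:edges[k + 1]]
        out.append(np.fft.irfft(Y, n=len(x)))
    return out

def nlms_echo_cancel(mic, ref, taps=32, mu=0.5):
    """Linear echo canceller: NLMS-adapted FIR estimate of the echo
    path, subtracted from the microphone signal."""
    w = np.zeros(taps)
    e_out = np.empty_like(mic)
    for n in range(len(mic)):
        x = ref[max(0, n - taps + 1):n + 1][::-1]
        x = np.pad(x, (0, taps - len(x)))
        e = mic[n] - w @ x
        w += mu * e * x / (x @ x + 1e-8)
        e_out[n] = e
    return e_out

def joint_aec_array(mic, ref, n_bands=8, n_select=4):
    """Cancel echo per sub-band, keep the n_select sub-bands with the
    most residual (near-end) energy, and sum them as a trivial array
    stage; a non-linear residual suppressor would follow in practice."""
    cleaned = [nlms_echo_cancel(m, r)
               for m, r in zip(split_bands(mic, n_bands),
                               split_bands(ref, n_bands))]
    picked = np.argsort([np.mean(c ** 2) for c in cleaned])[-n_select:]
    return np.sum([cleaned[i] for i in picked], axis=0)
```

A real system would use a proper analysis/synthesis filterbank and spatially informed selection; the point here is only the order of operations: decompose, cancel linearly per sub-field, select, combine.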
Filtering sounds for conferencing applications
A conferencing system includes a display device that displays video received from a remote communication device of a communication partner. An audio stream is transmitted to the remote communication device. The audio stream includes real-world sounds produced by one or more real-world audio sources captured by a microphone array and virtual sounds produced by one or more virtual audio sources. A relative volume of sounds in the audio stream is selectively adjusted based, at least in part, on real-world positioning of corresponding audio sources, including real-world and/or virtualized audio sources.
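Position-based relative volume of the kind described could be realized with a simple distance law; the inverse-distance gain and its constants below are illustrative assumptions, not the patent's method:

```python
import numpy as np

def relative_gain(capture_pos, source_pos, ref_distance=1.0, min_gain=0.05):
    """Attenuate a source by its distance from the capture point; sources
    inside the reference distance get unity gain. Works for real-world
    and virtualized source positions alike."""
    d = np.linalg.norm(np.asarray(capture_pos, float) - np.asarray(source_pos, float))
    return float(np.clip(ref_distance / max(d, 1e-9), min_gain, 1.0))

def mix(streams, gains):
    """Apply per-source gains and sum real and virtual sources into the
    outgoing audio stream."""
    return np.sum([g * s for g, s in zip(gains, streams)], axis=0)
```

For example, a source two metres from the microphone array would be mixed at half the level of one at the reference distance.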
Optimal view selection method in a video conference
A system for ensuring that the best available view of a person's face is included in a video stream when the person's face is being captured by multiple cameras at multiple angles at a first endpoint. The system uses one or more microphone arrays to capture direct-reverberant ratio information corresponding to the views, and determines which view most closely matches a view of the person looking directly at the camera, thereby improving the experience for viewers at a second endpoint.
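One way to rank views by direct-to-reverberant ratio (DRR) is to split each view's measured impulse response into a short direct-sound window and a reverberant tail; the 2.5 ms window and the energy-ratio formulation are common conventions assumed here, not details from the abstract:

```python
import numpy as np

def drr_db(ir, sample_rate, direct_ms=2.5):
    """Direct-to-reverberant ratio: energy up to direct_ms after the
    strongest arrival versus the remaining tail, in dB."""
    peak = int(np.argmax(np.abs(ir)))
    k = peak + int(direct_ms * 1e-3 * sample_rate)
    direct = np.sum(ir[:k] ** 2)
    reverb = np.sum(ir[k:] ** 2) + 1e-12
    return 10.0 * np.log10(direct / reverb)

def best_view(irs, sample_rate):
    """Pick the view with the highest DRR: a talker facing a camera's
    microphone array sends more direct energy toward it."""
    return int(np.argmax([drr_db(ir, sample_rate) for ir in irs]))
```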
Array microphone module and system
A microphone module comprises a housing, an audio bus, and a first plurality of microphones in communication with the audio bus. The microphone module further comprises a module processor in communication with the first plurality of microphones and the audio bus. The module processor is configured to detect the presence of an array processor in communication with the audio bus, detect the presence of a second microphone module in communication with the audio bus, and configure the audio bus to pass audio signals from both the first plurality of microphones and the second microphone module to the array processor.
Sound emission and collection device, and sound emission and collection method
A sound emission and collection device includes a speaker, a filter processing a sound emission signal, microphones, echo cancellers cancelling regression sound signals of the sound emitted by the speaker from the sound collection signals of the corresponding microphones, a first integration section integrating adaptive filter coefficients taken out from the plurality of echo cancellers, a reverberation time estimation section estimating the reverberation time for each frequency band in the space in which the speaker and the plurality of microphones are present on the basis of the integrated adaptive filter coefficient, and an arithmetic operation section specifying a frequency band having a long reverberation time from the sound emission signal based on the estimated reverberation time, calculating a filter coefficient for suppressing power of the specified frequency band, and setting the filter coefficient to the filter.
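The reverberation-time estimation and band-suppression steps can be sketched as follows, assuming Schroeder backward integration over a per-band echo-path estimate and a simple gain rule for long-reverberation bands (both are illustrative stand-ins for the patent's filter-coefficient machinery):

```python
import numpy as np

def split_bands(ir, n_bands):
    """Split an impulse-response estimate into uniform frequency bands."""
    X = np.fft.rfft(ir)
    edges = np.linspace(0, len(X), n_bands + 1).astype(int)
    bands = []
    for k in range(n_bands):
        Y = np.zeros_like(X)
        Y[edges[k]:edges[k + 1]] = X[edges[k]:edges[k + 1]]
        bands.append(np.fft.irfft(Y, n=len(ir)))
    return bands

def band_rt60(ir, sample_rate, n_bands=4):
    """Per-band RT60 from an echo-path (adaptive filter) estimate:
    Schroeder backward integration, then extrapolate the -20 dB
    crossing to -60 dB. Returns 0 for bands that never decay 20 dB."""
    times = []
    for b in split_bands(ir, n_bands):
        edc = np.cumsum((b ** 2)[::-1])[::-1]          # energy decay curve
        edc_db = 10.0 * np.log10(edc / edc[0] + 1e-12)
        idx = int(np.argmax(edc_db <= -20.0))          # first -20 dB sample
        times.append(3.0 * idx / sample_rate)          # extrapolate to -60 dB
    return times

def suppression_gains(times, rt_limit):
    """Attenuate bands whose reverberation time exceeds rt_limit (a
    hypothetical gain rule standing in for the computed filter)."""
    return [1.0 if t <= rt_limit else rt_limit / t for t in times]
```

The gains would then be folded into the filter applied to the sound emission signal, reducing power in the bands the room sustains longest.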
Collecting and correlating microphone data from multiple co-located clients, and constructing 3D sound profile of a room
An overlay network platform facilitates a multi-party conference. End users participate in the conference using client-based web browser software, and using a protocol such as WebRTC. According to this disclosure, an enhanced audio experience for the conference is provided by collecting and correlating microphone data from multiple co-located clients, and then constructing (at the platform) a three-dimensional (3D) sound profile of the room in which the clients are co-located. By processing in the platform (as opposed to locally at each client), the approach enables platform-side creation of an ad-hoc, high quality microphone array that identifies the relative positions and orientations of the microphones that are being used by the clients. Individual audio streams received from the microphones are combined, and the relative position information (of the individual microphones) is used to render a single audio stream that represents a high quality recording of the audio in the common physical space. Other clients in the conference request, receive and play back this high quality stream to obtain a high-fidelity 3D representation of the audio as if they are physically present in the room.
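Correlating co-located clients' microphone data typically starts from pairwise time differences of arrival, from which relative positions can be inferred. A common estimator for this is GCC-PHAT, assumed here as an illustration (the abstract does not name a method):

```python
import numpy as np

def gcc_phat(sig, ref, sample_rate, max_tau=0.02):
    """Estimate the time difference of arrival between two clients'
    captures with GCC-PHAT. A positive value means `sig` arrives later
    than `ref`; pairwise delays like this let the platform lay out an
    ad-hoc microphone array."""
    n = len(sig) + len(ref)
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    R = SIG * np.conj(REF)
    cc = np.fft.irfft(R / (np.abs(R) + 1e-12), n=n)   # PHAT weighting
    max_shift = int(max_tau * sample_rate)
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / sample_rate
```

The PHAT weighting whitens the cross-spectrum, which makes the delay peak robust to the differing frequency responses of heterogeneous client microphones.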
Active speaker location detection
Various examples related to determining a location of an active participant are provided. In one example, image data of a room from an image capture device is received. First audio data from a first microphone array at the image capture device is received. Second audio data from a second microphone array spaced from the image capture device is received. Using a three dimensional model, a location of the second microphone array is determined. Using the first audio data, second audio data, location of the second microphone array, and an angular orientation of the second microphone array, an estimated location of the active participant is determined.
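Once each microphone array yields a bearing toward the active participant (the second array's bearing corrected by its angular orientation), the location estimate reduces to intersecting two rays. A minimal 2D sketch of that triangulation step, under the assumption that both bearings are already in room coordinates:

```python
import numpy as np

def intersect_bearings(p1, theta1, p2, theta2):
    """Triangulate a talker position from two arrays' direction-of-
    arrival bearings (radians, room coordinates). Solves
    p1 + t1*d1 = p2 + t2*d2 for the ray parameters."""
    d1 = np.array([np.cos(theta1), np.sin(theta1)])
    d2 = np.array([np.cos(theta2), np.sin(theta2)])
    A = np.column_stack((d1, -d2))
    b = np.asarray(p2, float) - np.asarray(p1, float)
    t = np.linalg.solve(A, b)  # fails if bearings are parallel
    return np.asarray(p1, float) + t[0] * d1
```

In the disclosed system the image data and 3D model additionally constrain the second array's position; here that position is simply given.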
Acoustic echo cancellation for audio system with bring your own devices (BYOD)
A controller for the conference session generates a speaker signal for speakers in a conference room. The controller correlates the speaker signal with network timing information and generates speaker timing information. The controller transmits the correlated speaker signal and timing information to a mobile device participating in the conference session. The mobile device generates an echo cancelled microphone signal from a microphone of the mobile device, and transmits the echo cancelled signal back to the controller. The controller also receives array microphone signals associated with an array of microphones at known positions in the room. The controller estimates a relative location of the mobile device within the conference room. The controller dynamically selects as audio output corresponding to the mobile device location either the echo cancelled microphone signal from the mobile device or an echo cancelled array microphone signal associated with the relative location of the mobile device.
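The controller's final selection step can be sketched as a location-based routing decision; the nearest-beam rule and coverage radius below are illustrative assumptions about how "corresponding to the mobile device location" might be decided:

```python
import numpy as np

def nearest_beam(mobile_pos, beam_positions):
    """Index of the array beam steered closest to the estimated
    mobile-device location."""
    d = [np.linalg.norm(np.asarray(mobile_pos, float) - np.asarray(p, float))
         for p in beam_positions]
    return int(np.argmin(d))

def select_output(mobile_pos, beam_positions, coverage_radius=0.5):
    """If the mobile device sits inside the nearest beam's coverage,
    route that beam's echo-cancelled array signal; otherwise fall back
    to the device's own echo-cancelled microphone signal."""
    k = nearest_beam(mobile_pos, beam_positions)
    d = np.linalg.norm(np.asarray(mobile_pos, float)
                       - np.asarray(beam_positions[k], float))
    return ("array", k) if d <= coverage_radius else ("mobile", None)
```

A production controller would likely hysterese this decision over time to avoid audible switching; the sketch returns only the routing choice per estimate.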