Patent classifications
H04M3/568
METHOD TO PROTECT PRIVATE AUDIO COMMUNICATIONS
A computer-implemented method for detecting and concealing confidential communications is disclosed. The computer-implemented method includes determining that an audio output source used by a participant of an audio conference is unidentifiable. The computer-implemented method further includes, responsive to determining that the audio output source used by the participant of the audio conference is unidentifiable, transmitting a high-frequency signal via audio conference software used by the participant to conduct the audio conference. The computer-implemented method further includes, responsive to detecting the high-frequency signal via a microphone of a user device used by the participant to listen to the audio conference, determining that the participant is using a speaker to listen to the audio conference.
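The probe-and-listen check described in this abstract could be sketched as follows. The tone frequency, band width, detection threshold, and all function names here are illustrative assumptions, not details from the patent: a near-ultrasonic tone is injected into the conference audio, and the participant's microphone capture is checked for acoustic leakage of that tone, which implies an open loudspeaker rather than headphones.

```python
import numpy as np

SAMPLE_RATE = 48_000   # assumed capture rate
PROBE_HZ = 19_000      # assumed near-ultrasonic probe tone

def make_probe(duration_s=0.5):
    """Generate the high-frequency probe played through the conference audio."""
    t = np.arange(int(SAMPLE_RATE * duration_s)) / SAMPLE_RATE
    return 0.1 * np.sin(2 * np.pi * PROBE_HZ * t)

def probe_detected(mic_samples, threshold_db=-10.0):
    """Return True if the probe tone dominates a narrow band of the
    microphone capture, implying the participant is on a speaker."""
    windowed = mic_samples * np.hanning(len(mic_samples))
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(mic_samples), 1 / SAMPLE_RATE)
    band = (freqs > PROBE_HZ - 100) & (freqs < PROBE_HZ + 100)
    band_power = np.sum(spectrum[band] ** 2)
    total_power = np.sum(spectrum ** 2) + 1e-12
    return 10 * np.log10(band_power / total_power + 1e-12) > threshold_db
```

On a loudspeaker path the probe leaks from speaker to microphone and concentrates nearly all spectral power in the probe band; on a headphone path the microphone hears only unrelated room noise and the band-to-total power ratio stays far below the threshold.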
Adaptive energy limiting for transient noise suppression
The present disclosure describes aspects of adaptive energy limiting for transient noise suppression. In some aspects, an adaptive energy limiter sets a limiter ceiling for an audio signal to full scale and receives a portion of the audio signal. For the portion of the audio signal, the adaptive energy limiter determines a maximum amplitude and evaluates the portion with a neural network to provide a voice likelihood estimate. Based on the maximum amplitude and the voice likelihood estimate, the adaptive energy limiter determines that the portion of the audio signal includes noise. In response to determining that the portion of the audio signal includes noise, the adaptive energy limiter decreases the limiter ceiling and provides the limiter ceiling to a limiter module effective to limit an amount of energy of the audio signal. This may be effective to prevent audio signals from carrying full-energy transient noise into conference audio.
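The limiter loop in this abstract might be sketched as below. The attack/release steps, the ceiling floor, and the `voice_likelihood` heuristic are all assumptions; in particular, the zero-crossing heuristic is a crude stand-in for the neural-network voice estimate the abstract describes, not the patent's model.

```python
import numpy as np

FULL_SCALE = 1.0

def voice_likelihood(frame):
    """Stand-in for the neural-network estimate: a zero-crossing-rate
    heuristic (voiced speech crosses zero less often than broadband noise)."""
    zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2
    return float(np.clip(1.0 - zcr, 0.0, 1.0))

class AdaptiveEnergyLimiter:
    def __init__(self, attack=0.5, release=0.05, floor=0.1):
        self.ceiling = FULL_SCALE   # limiter ceiling starts at full scale
        self.attack, self.release, self.floor = attack, release, floor

    def process(self, frame):
        peak = np.max(np.abs(frame))   # maximum amplitude of the portion
        is_noise = peak > 0.5 and voice_likelihood(frame) < 0.5
        if is_noise:
            # transient noise: pull the ceiling down quickly
            self.ceiling = max(self.floor, self.ceiling * self.attack)
        else:
            # no noise: recover slowly back toward full scale
            self.ceiling = min(FULL_SCALE, self.ceiling + self.release)
        # limiter module: clamp the frame's energy to the current ceiling
        return np.clip(frame, -self.ceiling, self.ceiling)
```

A loud, rapidly alternating burst drops the ceiling in one frame and is clipped on output, while subsequent quiet frames let the ceiling drift back up, matching the decrease-then-recover behavior the abstract describes.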
Multiple device conferencing with improved destination playback
Disclosed are systems and methods for providing a virtual conference using the personal devices of the participants. In one embodiment, a proximity value is generated and encoded in the audio stream from each device. A server can compare proximity values and enable the most suitable microphone while disabling the remaining microphones. Systems and techniques for improved capture and synchronization of source audio, and for improved audio playback at the destination, are also disclosed.
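The server-side microphone arbitration could look like the sketch below. The proximity scale (higher = closer) and the function name are assumptions for illustration; the abstract does not specify how proximity values are computed or compared.

```python
def select_microphone(proximity_by_device):
    """Given {device_id: proximity_value} decoded from each device's audio
    stream, enable the closest device's microphone and disable the rest."""
    best = max(proximity_by_device, key=proximity_by_device.get)
    return {device: device == best for device in proximity_by_device}
```

For example, with three personal devices in the same room, only the one reporting the highest proximity value keeps its microphone enabled, avoiding echo and double-capture from co-located devices.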
Emotes for non-verbal communication in a videoconferencing system
A method is disclosed for videoconferencing in a three-dimensional virtual environment. In the method, a position and direction, a specification of an emote, and a video stream are received. The position and direction specify a location and orientation in the virtual environment and are input by a first user. The specification of the emote is also input by the first user. The video stream is captured from a camera on a device of the first user that is positioned to capture photographic images of the first user. The video stream is mapped onto a three-dimensional model of an avatar. From a perspective of a virtual camera of a second user, the virtual environment is rendered for display to the second user. The rendered environment includes the mapped three-dimensional model of the avatar located at the position and oriented at the direction, and the emote attached to the video-stream-mapped avatar.
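The per-user update received by the renderer might be modeled as below. The field names, the yaw-only direction, and `apply_update` are hypothetical; the method as disclosed does not define a wire format.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class AvatarUpdate:
    position: tuple            # (x, y, z) location in the virtual environment
    direction: float           # orientation (yaw, radians), input by the user
    emote: Optional[str]       # e.g. "wave" or "thumbs_up", or None
    video_frame: bytes         # latest camera frame mapped onto the avatar

def apply_update(scene, user_id, update):
    """Place or move the user's avatar and attach any emote to it."""
    avatar = scene.setdefault(user_id, {})
    avatar["position"] = update.position
    avatar["direction"] = update.direction
    avatar["texture"] = update.video_frame   # video-stream-mapped model
    avatar["emote"] = update.emote           # rendered attached to the avatar
    return scene
```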
AUTOCORRECTION OF PRONUNCIATIONS OF KEYWORDS IN AUDIO/VIDEOCONFERENCES
The present disclosure relates to automatically correcting mispronounced keywords during a conference session. More particularly, the present invention provides methods and systems for automatically correcting audio data generated from audio input having indications of mispronounced keywords in an audio/videoconferencing system. In some embodiments, the process of automatically correcting the audio data may require a re-encoding process of the audio data at the conference server. In alternative embodiments, the process may require updating the audio data at the receiver end of the conferencing system.
SYSTEMS AND METHODS FOR VIRTUAL MEETING SPEAKER SEPARATION
A computer-implemented machine learning method for improving speaker separation is provided. The method comprises processing audio data to generate prepared audio data and determining feature data and speaker data from the prepared audio data through a clustering iteration to generate an audio file. The method further comprises re-segmenting the audio file to generate a speaker segment and causing the speaker segment to be displayed through a client device.
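The cluster-then-resegment flow could be sketched as below. The 2-D "features" and the plain k-means loop are stand-ins for real speaker embeddings and whatever clustering the method actually uses; function names are assumptions.

```python
import numpy as np

def kmeans(features, k, iters=20, seed=0):
    """Toy clustering iteration assigning each frame to a speaker cluster."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(features[:, None] - centers[None, :], axis=2)
        labels = np.argmin(dists, axis=1)
        centers = np.array([features[labels == i].mean(axis=0)
                            if np.any(labels == i) else centers[i]
                            for i in range(k)])
    return labels

def resegment(labels):
    """Merge consecutive frames with the same label into speaker segments,
    returned as (start_frame, end_frame, speaker) tuples."""
    segments, start = [], 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((start, i, int(labels[start])))
            start = i
    return segments
```

With two well-separated clusters of frame features, the frames group into two speakers, and re-segmentation turns the per-frame labels into contiguous segments suitable for display.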
BREAKOUT OF PARTICIPANTS IN A CONFERENCE CALL
Systems and methods for creating and managing a breakout conference for a primary conference are disclosed. The system monitors communications between participants of a primary conference to determine whether a) participants have a disagreement that needs to be resolved or b) a topic from the meeting agenda requires additional time for discussion. Participant language (including negations and repetitive word usage), job profiles, body language, and overlapping voice signals, among other factors, are monitored to determine whether a disagreement exists. If a disagreement exists or additional time is required, the system automatically creates a virtual breakout session, determines the topic that created the disagreement, determines the participants associated with the disputed topic, and moves them to the breakout session. The system also provides meeting tools such that participants in the primary conference may communicate with and alert participants in the breakout session, and vice versa, without leaving their respective sessions.
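Combining the monitored factors into a trigger might look like the sketch below. The negation word list, the factor weights, and the threshold are illustrative assumptions; the system as disclosed also weighs job profiles and body language, which are omitted here.

```python
from collections import Counter

NEGATIONS = {"no", "not", "disagree", "never", "wrong"}  # assumed word list

def disagreement_score(utterances, overlap_ratio):
    """utterances: transcript strings; overlap_ratio: fraction of time with
    overlapping voice signals (0..1)."""
    words = " ".join(utterances).lower().split()
    if not words:
        return 0.3 * overlap_ratio
    negation_rate = sum(w in NEGATIONS for w in words) / len(words)
    # repetitive word usage: how much the most frequent word dominates
    repetition_rate = Counter(words).most_common(1)[0][1] / len(words)
    return 0.5 * negation_rate + 0.2 * repetition_rate + 0.3 * overlap_ratio

def needs_breakout(utterances, overlap_ratio, threshold=0.15):
    """Trigger the automatic breakout session when the score is high."""
    return disagreement_score(utterances, overlap_ratio) > threshold
```

A heated exchange with frequent negations and heavy voice overlap scores well above the threshold, while an agreeable exchange with no overlap stays below it.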
Use of local link to support transmission of spatial audio in a virtual environment
A method including enabling a first user having a first user device to communicate with one or more second users having one or more second user devices via a network, wherein each user has a spatial position within a virtual space, such that, for each user within the virtual space, all other users within the virtual space have a relative spatial position; providing spatial audio data from the first user device to the one or more second user devices and receiving spatial audio data at the first user device from the one or more second user devices, such that each user is provided with audio from the other users in the respective relative spatial positions of the other users; and enabling a third user having a third user device to communicate with the first user and the one or more second users via the first user device.
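The relative-position computation implied above, with a simple stereo rendering of it, could be sketched as follows. The constant-power panning law and the 2-D coordinate convention (y pointing "ahead" of the listener) are assumptions for illustration, not details from the application.

```python
import math

def relative_position(listener, source):
    """Relative spatial position of a source user as seen by a listener;
    both are (x, y) positions in the shared virtual space."""
    return (source[0] - listener[0], source[1] - listener[1])

def pan_gains(listener, source):
    """Constant-power (left, right) gains from the source's bearing, so
    each user hears the others at their relative spatial positions."""
    dx, dy = relative_position(listener, source)
    azimuth = math.atan2(dx, dy)          # 0 = straight ahead of listener
    p = (azimuth / math.pi + 1) / 2       # map [-pi, pi] -> [0, 1]
    return math.cos(p * math.pi / 2), math.sin(p * math.pi / 2)
```

A source directly ahead yields equal left/right gains, while a source to the listener's right yields a larger right gain, which is the per-listener spatialization each user device must apply to the others' audio.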
Relaying device and method of recording voice communication
[Problem] Provided is a relaying device that can track and record a communicated voice of a specific communication terminal using a mixing function of the relaying device. [Solution] When a communication terminal has made a call, a communication session in which the communication terminal that has made a call and a communication terminal that has been called are participating terminals is established. When a voice signal is transmitted from one participating terminal of the established communication session, this voice signal is transmitted to the other participating terminal(s) of the same communication session along with session information. A virtual device is associated with a communication terminal, and is registered in a communication session in which the communication terminal participates as a participating terminal together with the communication terminal. A communication monitoring unit records a voice signal transmitted to the virtual device from the communication session.
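The virtual-device recording scheme above could be sketched as below. The class and method names are assumptions; the point is that the virtual device is registered in the session as an ordinary participating terminal, so the relay's existing mixing/fan-out path delivers it every voice signal to record.

```python
class Terminal:
    def __init__(self, name):
        self.name, self.inbox = name, []

    def receive(self, voice, session_info):
        self.inbox.append((voice, session_info))

class VirtualDevice(Terminal):
    """Associated with a monitored terminal and registered in its session;
    its inbox serves as the recording of the communicated voice."""

class Session:
    def __init__(self, *terminals):
        self.participants = list(terminals)   # caller and called terminals

    def register(self, terminal):
        self.participants.append(terminal)    # e.g. a VirtualDevice

    def transmit(self, sender, voice, session_info):
        # relay the voice signal to every other participating terminal,
        # along with the session information
        for t in self.participants:
            if t is not sender:
                t.receive(voice, session_info)
```

When terminal A speaks, both the called terminal B and the virtual device associated with A receive the signal, so the recording requires no change to the relay's normal transmission path.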
Holographic Calling for Artificial Reality
A holographic calling system can capture and encode holographic data at a sender-side of a holographic calling pipeline and decode and present the holographic data as a 3D representation of a sender at a receiver-side of the holographic calling pipeline. The holographic calling pipeline can include stages to capture audio, color images, and depth images; densify the depth images to have a depth value for each pixel while generating parts masks and a body model; use the masks to segment the images into parts needed for hologram generation; convert depth images into a 3D mesh; paint the 3D mesh with color data; perform torso disocclusion; perform face reconstruction; and perform audio synchronization. In various implementations, different ones of these stages can be performed sender-side or receiver-side. The holographic calling pipeline also includes sender-side compression, transmission over a communication channel, and receiver-side decompression and hologram output.
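Because the abstract notes that different stages can run sender-side or receiver-side, the pipeline is naturally expressed as an ordered list of stage functions whose split point is configurable. The stage bodies below are placeholders that only tag the frame dict; real stages would operate on image and mesh data.

```python
def run_pipeline(frame, stages):
    """Apply pipeline stages in order to one captured frame."""
    for stage in stages:
        frame = stage(frame)
    return frame

def densify_depth(frame):   # depth value for every pixel, plus masks/body model
    return {**frame, "dense_depth": True}

def segment_parts(frame):   # use masks to keep parts needed for the hologram
    return {**frame, "parts": ["head", "torso"]}

def build_mesh(frame):      # convert the depth image into a 3D mesh
    return {**frame, "mesh": True}

def paint_mesh(frame):      # paint the 3D mesh with color data
    return {**frame, "painted": True}

# one possible sender-side split; later stages (disocclusion, face
# reconstruction, audio sync) could run on either side per the abstract
SENDER_STAGES = [densify_depth, segment_parts, build_mesh, paint_mesh]
```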