Patent classifications
H04M3/568
VOICE CALL CONTROL METHOD AND APPARATUS, COMPUTER-READABLE MEDIUM, AND ELECTRONIC DEVICE
Embodiments of this application provide a real-time voice call control method performed by an electronic device. The method includes: obtaining a mixed call voice in real time during a cloud conference call, where the mixed call voice includes at least one branch voice; determining energy information corresponding to each frequency point of the call voice in a frequency domain; determining an energy proportion of each branch voice at each frequency point in total energy of the frequency point based on the energy information at the frequency point; determining a quantity of branch voices comprised in the call voice based on the energy proportion of each branch voice at each frequency point; and controlling the voice call by setting a call voice control manner based on the quantity of branch voices.
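The energy steps in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the per-branch energy proportions are estimated from the mixed signal, so the sketch assumes they are already available (for example, from a separate model). The function names, the direct DFT, and the 0.2 threshold are illustrative assumptions, not the claimed method.

```python
import cmath
import math

def frequency_point_energy(frame):
    """Return |X[k]|^2 for k = 0..N//2 using a direct DFT (adequate for short frames)."""
    n = len(frame)
    energies = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        energies.append(abs(acc) ** 2)
    return energies

def count_branch_voices(proportions, threshold=0.2):
    """Count branches whose mean energy share across frequency points exceeds threshold.

    proportions[b][k] is the ASSUMED share of branch b in the total energy at
    frequency point k; how these shares are obtained is outside this sketch.
    """
    count = 0
    for per_branch in proportions:
        if sum(per_branch) / len(per_branch) > threshold:
            count += 1
    return count

# One 8-sample frame holding a pure tone at frequency point 2.
frame = [math.cos(2 * math.pi * 2 * t / 8) for t in range(8)]
energies = frequency_point_energy(frame)

# Hypothetical shares for three candidate branches over two frequency points.
active = count_branch_voices([[0.6, 0.5], [0.4, 0.5], [0.0, 0.0]])
```

A control step could then branch on `active`, for instance by choosing a different processing mode when more than one voice is present.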
Systems and methods of handling speech audio stream interruptions
A device for communication includes one or more processors configured to receive, during an online meeting, a speech audio stream representing speech of a first user. The one or more processors are also configured to receive a text stream representing the speech of the first user. The one or more processors are further configured to selectively generate an output based on the text stream in response to an interruption in the speech audio stream.
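The selective fallback described above might look like the following minimal sketch. It assumes an interruption shows up as a missing audio frame (`None`). The function name and the tuple return shape are hypothetical.

```python
from typing import Optional, Tuple, Union

def select_output(audio_frame: Optional[bytes], text_segment: str) -> Tuple[str, Union[bytes, str]]:
    """Prefer the speech audio; fall back to the text stream when audio is missing.

    Assumption: an interruption in the speech audio stream is signalled by
    audio_frame being None. A real system would likely use timeouts or
    sequence-number gaps instead.
    """
    if audio_frame is not None:
        return ("audio", audio_frame)
    return ("text", text_segment)

normal = select_output(b"\x01\x02", "hello")
interrupted = select_output(None, "hello")
```

During an interruption, the `"text"` result could be shown as a caption or passed to a speech synthesizer. The abstract leaves that choice open.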
Video Conferencing Systems Featuring Multiple Spatial Interaction Modes
Systems and methods for multi-attendee video conferencing are described. A system can convert from huddle video conference mode to spatial video conference mode. In particular, by assigning user roles, specific users can have greater control of the video conference as compared to other users. For instance, moderators may have a greater level of control of the video conferencing system. Thus, in example implementations of the present disclosure, specific users can affect transition between two or more video conferencing modes, such as between a huddle video conference mode and a spatial video conference mode.
POST-CONFERENCE PLAYBACK SYSTEM HAVING HIGHER PERCEIVED QUALITY THAN ORIGINALLY HEARD IN THE CONFERENCE
Some aspects of the present disclosure involve the recording, processing and playback of audio data corresponding to conferences, such as teleconferences. In some teleconference implementations, the audio experience heard when a recording of the conference is played back may be substantially different from the audio experience of an individual conference participant during the original teleconference. In some implementations, the recorded audio data may include at least some audio data that was not available during the teleconference. In some examples, the spatial characteristics of the played-back audio data may be different from those of the audio heard by participants of the teleconference.
Controlling Audio Signal Parameters
A method and corresponding system for correcting deviations in a performance that includes a plurality of audio sources, the method comprising: detecting a parameter relating to an audio source; determining whether the parameter deviates from a predetermined characteristic; and, in response to determining that the parameter deviates from the predetermined characteristic, causing display of a user interface configured to control the parameter, to allow a user to correct the deviation.
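The deviation check above can be sketched as follows. The sketch assumes the parameter is a level in dB and the predetermined characteristic is a target level with a tolerance. All names and numbers are hypothetical.

```python
def find_deviations(levels_db, target_db=-18.0, tolerance_db=3.0):
    """Return the audio sources whose level deviates from the target by more than the tolerance.

    Assumptions: the monitored parameter is a level in dB, and the
    "predetermined characteristic" is target_db +/- tolerance_db.
    """
    return [
        name
        for name, level in levels_db.items()
        if abs(level - target_db) > tolerance_db
    ]

# Sources flagged here would be the ones for which a control UI is displayed.
flagged = find_deviations({"vocals": -17.0, "guitar": -10.0, "drums": -22.5})
```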
MIXED REALITY VIRTUAL REVERBERATION
A method of presenting an audio signal to a user of a mixed reality environment is disclosed, the method comprising the steps of detecting a first audio signal in the mixed reality environment, where the first audio signal is a real audio signal; identifying a virtual object intersected by the first audio signal in the mixed reality environment; identifying a listener coordinate associated with the user; determining, using the virtual object and the listener coordinate, a transfer function; applying the transfer function to the first audio signal to produce a second audio signal; and presenting, to the user, the second audio signal.
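The final steps above (apply the transfer function, then present the result) can be sketched as below. The sketch assumes the transfer function is a short time-domain impulse response applied by direct convolution. The abstract does not specify the representation, and the sketch does not model how the response is derived from the virtual object and listener coordinate.

```python
def apply_transfer_function(signal, impulse_response):
    """Convolve the first audio signal with an impulse response to get the second signal.

    Assumption: the transfer function is given as a finite impulse response.
    """
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, sample in enumerate(signal):
        for j, tap in enumerate(impulse_response):
            out[i + j] += sample * tap
    return out

# A unit impulse passed through a hypothetical response with one echo
# two samples later at half amplitude.
second_signal = apply_transfer_function([1.0, 0.0, 0.0], [1.0, 0.0, 0.5])
```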
COMMUNICATION SYSTEM AND EVALUATION METHOD
A communication system is configured to broadcast utterance voice data received from one of mobile communication terminals to other mobile communication terminals, to control text delivery such that a result of utterance voice recognition from voice recognition processing on the received utterance voice data is displayed on the mobile communication terminals in synchronization, and to use the result of utterance voice recognition to perform communication evaluation. The communication evaluation includes a first evaluation including evaluating a dialogue between users based on a group dialogue index to produce group communication evaluation information, a second evaluation including evaluating utterances constituting the dialogue between the users based on a personal utterance index to produce personal utterance evaluation information, and a third evaluation including using the group communication evaluation information and the personal utterance evaluation information to produce entire communication group evaluation information.
AUDIO WATERMARK ADDITION METHOD, AUDIO WATERMARK PARSING METHOD, DEVICE, AND MEDIUM
An audio watermark addition method is provided, in which a playback terminal obtains first audio in real time, embeds an audio watermark associated with the playback terminal into the first audio, and plays the first audio embedded with the audio watermark.
Virtual Conferencing System with Layered Conversations
A virtual communication system including logic for supporting layered conversations, the system comprising: at least two computers communicating with one another over a communications medium, where each computer represents a User, each computer including a display, audio input, audio output, video input (such as a camera), memory, data storage, and a processor; a knowledge base stored in memory, the knowledge base containing an identifier identifying each conversation associated with a given user, the knowledge base storing information identifying each participant in each of the conversations, the knowledge base storing information identifying the type of each conversation associated with the given one of the at least two computers, where each User may participate in multiple simultaneous conversations; and layered conversation logic executed by the processor, the layered conversation logic controlling a volume and a visual layout of each participant in each conversation in which the User participates in accordance with the conversation type.
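The stored conversation records and the volume control described above can be sketched as follows. The abstract names no conversation types or volume levels, so the labels "primary" and "side" and the gain values are hypothetical. The sketch covers volume only, not visual layout.

```python
from dataclasses import dataclass, field

@dataclass
class Conversation:
    conversation_id: str
    conversation_type: str  # hypothetical labels: "primary" or "side"
    participants: list = field(default_factory=list)

# Hypothetical playback gain per conversation type.
VOLUME_BY_TYPE = {"primary": 1.0, "side": 0.3}

def participant_volumes(conversations):
    """Map each participant to a playback volume based on conversation type.

    A participant who appears in several conversations gets the loudest
    applicable gain (a design choice of this sketch, not of the source).
    """
    volumes = {}
    for conv in conversations:
        gain = VOLUME_BY_TYPE.get(conv.conversation_type, 0.0)
        for person in conv.participants:
            volumes[person] = max(volumes.get(person, 0.0), gain)
    return volumes

volumes = participant_volumes([
    Conversation("c1", "primary", ["alice", "bob"]),
    Conversation("c2", "side", ["bob", "carol"]),
])
```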
Voice Filtering Other Speakers From Calls And Audio Messages
A method includes receiving a first instance of raw audio data corresponding to a voice-based command and receiving a second instance of the raw audio data corresponding to an utterance of audible contents for an audio-based communication spoken by a user. When a voice filtering recognition routine determines to activate voice filtering for at least the voice of the user, the method also includes obtaining a respective speaker embedding of the user and processing, using the respective speaker embedding, the second instance of the raw audio data to generate enhanced audio data for the audio-based communication that isolates the utterance of the audible contents spoken by the user and excludes at least a portion of the one or more additional sounds that are not spoken by the user. The method also includes executing.