G10L17/14

Voice monitoring system and voice monitoring method

A recording device records a video with its imaging time, together with voice. Based on the voice, a sound parameter calculator calculates, for each pixel and for each unit time, a sound parameter specifying the magnitude of the voice in a monitoring area at the imaging time. A sound parameter storage unit stores the sound parameters. A sound parameter display controller superimposes a voice heat map on a captured image of the monitoring area and displays the superimposed image on a monitor. In doing so, the sound parameter display controller displays the voice heat map based on a cumulative value of voice magnitude over a designated time range.
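The heat-map step described above can be sketched in a few lines: accumulate per-pixel sound magnitudes over the designated time range, normalize, and alpha-blend the result onto the captured frame. This is a minimal illustration, not the patented implementation; array shapes, the red-channel rendering, and the blend factor are all assumptions.

```python
import numpy as np

def cumulative_heatmap(sound_params, t_start, t_end):
    """Sum per-pixel sound magnitudes over [t_start, t_end) and normalize to [0, 1].

    sound_params: array of shape (time, height, width), one magnitude per pixel per unit time.
    """
    cumulative = sound_params[t_start:t_end].sum(axis=0)
    peak = cumulative.max()
    return cumulative / peak if peak > 0 else cumulative

def overlay(image, heatmap, alpha=0.5):
    """Alpha-blend the heat map (rendered on the red channel) onto an RGB frame."""
    layer = np.zeros_like(image, dtype=float)
    layer[..., 0] = heatmap * 255.0
    return ((1 - alpha) * image + alpha * layer).astype(np.uint8)
```

Restricting the sum to `[t_start, t_end)` is what lets the display respond to the user's designated time range without recomputing the stored per-unit-time parameters.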

Information processing method, system, electronic device, and computer storage medium
11664030 · 2023-05-30

An information processing method includes receiving first text information, which is generated according to a speech, input through a first input device; receiving audio information recorded by a second input device, wherein the audio information is generated and recorded according to the speech; performing speech recognition on the audio information to obtain second text information; and presenting the first text information and the second text information. A correspondence relationship exists between content in the first text information and content in the second text information.
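The correspondence between the typed text and the recognized text could be established in many ways; one simple heuristic is to pair each first-text entry with the most similar second-text sentence by string similarity. The sketch below uses `difflib` for that purpose; the threshold and the pairing strategy are assumptions, not the method claimed.

```python
import difflib

def correspond(first_texts, second_texts, threshold=0.6):
    """Pair each typed entry with its best-matching ASR sentence.

    Returns (typed, recognized) pairs whose similarity ratio meets the threshold.
    """
    pairs = []
    for typed in first_texts:
        best = max(second_texts,
                   key=lambda s: difflib.SequenceMatcher(None, typed, s).ratio())
        score = difflib.SequenceMatcher(None, typed, best).ratio()
        if score >= threshold:
            pairs.append((typed, best))
    return pairs
```

Presenting both texts side by side with such pairings lets a reader check hurried notes against the full recognized transcript.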

Authentication via a dynamic passphrase
11625467 · 2023-04-11

A computerized method for voice authentication of a customer in a self-service system is provided. A request for authentication of the customer is received and the customer is enrolled in the self-service system with a text-independent voice print. A passphrase from a plurality of passphrases to transmit to the customer is determined based on comparing each of the plurality of passphrases to a text-dependent or text-independent voice biometric model. The passphrase is transmitted to the customer, and when the customer responds, an audio stream of the passphrase is received. The customer is authenticated by comparing the audio stream of the passphrase against the text-independent voice print. If the customer is authenticated, then the audio stream of the passphrase and the topic of the passphrase may be stored.
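The flow above can be illustrated with embeddings and cosine similarity. The abstract does not specify the selection criterion; here the sketch picks the candidate passphrase least similar to the biometric model, which is one plausible reading (favoring phrases the model has not already covered). All names, the embedding representation, and the threshold are assumptions.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def pick_passphrase(candidates, biometric_model):
    """Choose the passphrase whose embedding is least similar to the model (assumed criterion)."""
    return min(candidates, key=lambda p: cosine(p["embedding"], biometric_model))

def authenticate(audio_embedding, voice_print, threshold=0.8):
    """Accept the caller if the passphrase audio matches the text-independent voice print."""
    return cosine(audio_embedding, voice_print) >= threshold
```

Because the voice print is text-independent, the comparison works even though each authentication uses a freshly chosen passphrase, which is what makes the passphrase dynamic.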

Shared Assistant Profiles Verified Via Speaker Identification
20230153410 · 2023-05-18

A method for sharing assistant profiles includes receiving, at a profile service, from an assistant service interacting with a user device of a user, a request for the profile service to release personal information associated with the user to the assistant service. The method also includes performing a verification process to verify that the user consents to releasing the requested personal information by: instructing the assistant service to prompt the user to recite a unique token prescribed to the user; receiving audio data characterizing a spoken utterance captured by the user device; processing the audio data to determine whether a transcription of the spoken utterance recites the unique token; and, when the transcription recites the unique token, releasing, to the assistant service, the requested personal information stored on a centralized data store managed by the profile service.
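The token-checking gate at the heart of this verification step is simple to sketch: normalize the transcription, test whether it recites the prescribed token, and release the stored profile only on a match. The store layout and function names below are illustrative assumptions.

```python
import re

def recites_token(transcription, token):
    """Check whether the normalized transcription contains the prescribed token as a word."""
    words = re.findall(r"[a-z0-9]+", transcription.lower())
    return token.lower() in words

def release_profile(profile_store, user_id, transcription, token):
    """Release the user's personal information only if the utterance recites the token."""
    if recites_token(transcription, token):
        return profile_store.get(user_id)
    return None
```

Keeping the check word-based (rather than substring-based) avoids accepting a token that merely appears inside another word.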

ELECTRONIC APPARATUS AND CONTROLLING METHOD THEREOF

An electronic apparatus is disclosed. The electronic apparatus may include a microphone; a memory configured to store a wakeup word; and a processor configured to: identify, based on context information of the electronic apparatus, an occurrence of a pre-determined event; change, based on the occurrence of the pre-determined event, a first threshold value for recognizing the wakeup word; obtain, based on a first user voice input received via the microphone, a similarity value between first text information corresponding to the first user voice input and the wakeup word; and perform, based on the similarity value being greater than or equal to the first threshold value, a voice recognition function on second text information corresponding to a second user voice input received via the microphone after the first user voice input.
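The threshold-adjustment logic can be sketched as follows. The abstract does not say in which direction the event changes the threshold; the sketch assumes the event (say, media playback raising ambient noise) lowers it to keep the wakeup word detectable, and uses `difflib` as a stand-in for the apparatus's unspecified similarity scorer. Both are assumptions.

```python
import difflib

def wake_similarity(text, wakeup_word):
    """Similarity between recognized text and the stored wakeup word (difflib as a stand-in scorer)."""
    return difflib.SequenceMatcher(None, text.lower(), wakeup_word.lower()).ratio()

def adjust_threshold(base, event_active, delta=0.1):
    """Change the first threshold when the predetermined event occurs (direction assumed)."""
    return base - delta if event_active else base

def should_wake(similarity, threshold):
    """Trigger voice recognition on subsequent input when similarity meets the threshold."""
    return similarity >= threshold
```

Only after `should_wake` passes does the apparatus run full voice recognition on the second user input, so the threshold directly trades false wakes against missed wakes.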

REDUCING BANDWIDTH REQUIREMENTS OF VIRTUAL COLLABORATION SESSIONS
20230146818 · 2023-05-11

A computer-implemented method, a computer system and a computer program product reduce bandwidth requirements of a virtual collaboration session. The method includes capturing session data from a virtual collaboration session. The session data is selected from a group consisting of video data, audio data, an image of a screen of a connected device and text data. The method also includes connecting to a live blog platform. The method further includes transmitting a text transcription of the virtual collaboration session to the live blog platform. The text transcription is generated by scanning the audio data using a speech-to-text algorithm. In addition, the method includes classifying a topic in the virtual collaboration session based on importance. Lastly, the method includes transmitting a multimedia file related to the topic to the live blog platform in response to the topic being classified as important. The multimedia file is extracted from the session data.
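The bandwidth-saving decision reduces to: always send the lightweight text transcription, and attach the heavier multimedia file only when the topic is classified as important. The keyword-weight classifier below is one simple stand-in for the unspecified classification step; the weights, cutoff, and payload shape are assumptions.

```python
def important(topic, keyword_weights, cutoff=0.5):
    """Score a topic by average keyword weight; a simple stand-in importance classifier."""
    words = topic.lower().split()
    score = sum(keyword_weights.get(w, 0.0) for w in words) / max(len(words), 1)
    return score >= cutoff

def payload_for(topic, transcript_chunk, media_file, keyword_weights):
    """Transmit full media to the live blog only for important topics; otherwise text alone."""
    if important(topic, keyword_weights):
        return {"text": transcript_chunk, "media": media_file}
    return {"text": transcript_chunk}
```

Since the transcription is a small fraction of the size of the extracted video or image data, gating the multimedia on importance is where the bandwidth saving comes from.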