Patent classifications
G10L17/12
USER IDENTITY VERIFICATION USING VOICE ANALYTICS FOR MULTIPLE FACTORS AND SITUATIONS
A security platform architecture is described herein. The user identity platform architecture uses a multitude of biometric analytics to create an identity token unique to an individual human. This token is derived from biometric factors such as human behaviors, motion analytics, physical characteristics like facial patterns, voice recognition prints, device usage patterns, user location actions and other human behaviors. It can serve as a token or as a dynamic password identifying the unique individual with high calculated confidence. Because of its dynamic nature and the many different factors involved, this method is extremely difficult for malicious actors or malware to spoof or hack.
Electronic device and control method therefor
An electronic device is disclosed. The electronic device comprises: a voice input unit; a storage unit for storing a first text according to a first transcript format and at least one second text obtained by transcribing the first text in a second transcript format; and a processor for, when a voice text converted from a user voice input through the voice input unit corresponds to a preset instruction, executing a function according to the preset instruction. The processor executes a function according to a preset instruction when the preset instruction includes a first text and a voice text is a text in which the first text of the preset instruction has been transcribed into a second text of a second transcript format.
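The matching rule in this abstract can be illustrated with a minimal sketch. It assumes that a "transcript format" is simply an alternative spelling of a token (for example "TV" versus "tee vee"); the table contents and the names `TRANSCRIPT_VARIANTS`, `candidate_forms` and `matches_instruction` are hypothetical and do not come from the patent.

```python
from itertools import product

# Hypothetical storage: first-format text -> second-format transcriptions.
TRANSCRIPT_VARIANTS = {
    "TV": ["tee vee", "teevee"],
    "3": ["three"],
}


def normalize(text):
    """Lower-case and collapse whitespace so comparison is format-insensitive."""
    return " ".join(text.lower().split())


def candidate_forms(instruction):
    """Return every form of the instruction in which a stored first text
    may be replaced by one of its second-format transcriptions."""
    options = []
    for token in instruction.split():
        variants = TRANSCRIPT_VARIANTS.get(token, [])
        options.append([token, *variants])
    return {normalize(" ".join(combo)) for combo in product(*options)}


def matches_instruction(voice_text, instruction):
    """True if the recognized voice text corresponds to the preset instruction."""
    return normalize(voice_text) in candidate_forms(instruction)
```

With this table, `matches_instruction("Turn on tee vee", "turn on TV")` accepts the command even though the recognizer produced the second transcript format.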
SPEAKER IDENTIFICATION DEVICE AND METHOD FOR REGISTERING FEATURES OF REGISTERED SPEECH FOR IDENTIFYING SPEAKER
[Problem] To suppress an erroneous identification resulting from registration speech, and identify the speaker stably and precisely.
[Solving means] The speech recognition unit 102 extracts the text data corresponding to the registration speech as the extracted text data. The registration speech is speech input by a registration speaker reading aloud the registration target text data, which is preset text data. The registration speech evaluation unit 103 calculates, for each registration speaker, a score representing the degree of similarity between the extracted text data and the registration target text data (the registration speech score). The dictionary registration unit 104 registers the feature value of the registration speech in the speaker identification dictionary 108, which holds the feature value of the registration speech for each registration speaker, according to the evaluation result of the registration speech evaluation unit 103.
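As a rough illustration of the evaluation and registration steps, the sketch below scores the extracted text against the target text and registers the feature only when the score is high enough. The word-level similarity from `difflib.SequenceMatcher`, the 0.8 threshold, and the function names are assumptions made for this example; the abstract does not specify how the score is computed or how the evaluation result gates registration.

```python
from difflib import SequenceMatcher


def registration_speech_score(extracted_text, target_text):
    """Similarity in [0, 1] between recognized text and the text the speaker
    was asked to read, compared word by word (an assumed scoring method)."""
    extracted = extracted_text.lower().split()
    target = target_text.lower().split()
    return SequenceMatcher(None, extracted, target).ratio()


def register_if_reliable(dictionary, speaker_id, feature,
                         extracted_text, target_text, threshold=0.8):
    """Store the speaker's feature only if the registration speech matched
    the target text closely enough. The threshold is hypothetical."""
    score = registration_speech_score(extracted_text, target_text)
    if score >= threshold:
        dictionary[speaker_id] = feature
        return True
    return False
```

Rejecting low-scoring registration speech keeps misread or noisy recordings out of the dictionary, which is the erroneous-identification problem the abstract sets out to suppress.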
Assessing Speaker Recognition Performance
A method for evaluating a verification model includes receiving a first and a second set of verification results where each verification result indicates whether a primary model or an alternative model verifies an identity of a user as a registered user. The method further includes identifying each verification result in the first and second sets that includes a performance metric. The method also includes determining a first score of the primary model based on a number of the verification results identified in the first set that includes the performance metric and determining a second score of the alternative model based on a number of the verification results identified in the second set that includes the performance metric. The method further includes determining whether a verification capability of the alternative model is better than a verification capability of the primary model based on the first score and the second score.
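The comparison described here can be sketched as follows. The sketch assumes the performance metric labels an error (for example a false reject), so that a smaller share of flagged results means a better model; the label strings, the `VerificationResult` type and the use of a fraction rather than a raw count are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class VerificationResult:
    accepted: bool                 # did the model verify the user as registered?
    metric: Optional[str] = None   # hypothetical label, e.g. "false_reject"


def metric_score(results: Sequence[VerificationResult], metric: str) -> float:
    """Fraction of results in the set that carry the given performance metric."""
    if not results:
        return 0.0
    flagged = sum(1 for r in results if r.metric == metric)
    return flagged / len(results)


def alternative_is_better(primary_results, alternative_results,
                          metric="false_reject"):
    """Assumes the metric marks an error, so the lower score wins."""
    first_score = metric_score(primary_results, metric)
    second_score = metric_score(alternative_results, metric)
    return second_score < first_score
```

If the metric instead marked a success, the comparison in `alternative_is_better` would simply be reversed.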
Computer-implement voice command authentication method and electronic device
A computer-implemented voice command authentication method is provided. The method includes obtaining a sound signal stream; calculating a Signal-to-Noise Ratio (SNR) value of the sound signal stream; converting the sound signal stream into a Mel-Frequency Cepstral Coefficients (MFCC) stream; calculating a Dynamic Time Warping (DTW) distance corresponding to the MFCC stream according to the MFCC stream and one of a plurality of sample streams generated by the Gaussian Mixture Model with Universal Background Model (GMM-UBM); calculating, according to the MFCC stream and the sample streams, a log-likelihood ratio value corresponding to the MFCC stream as a GMM-UBM score; determining whether the sound signal stream passes a voice command authentication according to the GMM-UBM score, the DTW distance and the SNR value; and, in response to determining that the sound signal stream passes the voice command authentication, determining that the sound signal stream is a voice stream spoken by a legal user.
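Three of the quantities named in this abstract can be sketched in plain Python. The example below computes an SNR in decibels from signal and noise power, a classic DTW distance between two sequences of feature vectors, and a simple threshold-based decision. MFCC extraction and GMM-UBM scoring are treated as inputs computed elsewhere, and the decision rule with its threshold values is an assumption; the abstract does not say how the three values are combined.

```python
import math


def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in decibels from average powers."""
    return 10.0 * math.log10(signal_power / noise_power)


def dtw_distance(a, b):
    """Dynamic Time Warping distance between two sequences of feature
    vectors (e.g. MFCC frames), using Euclidean frame-to-frame cost."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def passes_authentication(llr_score, dtw, snr,
                          llr_min=0.0, dtw_max=5.0, snr_min=10.0):
    """Hypothetical rule: require a clean enough signal, a high enough
    GMM-UBM log-likelihood ratio, and a small enough DTW distance."""
    return snr >= snr_min and llr_score >= llr_min and dtw <= dtw_max
```

DTW tolerates differences in speaking rate: a sample in which one frame is held longer still aligns to the template at zero cost, as long as the frames themselves match.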
Automatic speaker identification using speech recognition features
Features are disclosed for automatically identifying a speaker. Artifacts of automatic speech recognition (“ASR”) and/or other automatically determined information may be processed against individual user profiles or models. Scores may be determined reflecting the likelihood that individual users made an utterance. The scores can be based on, e.g., individual components of Gaussian mixture models (“GMMs”) that score best for frames of audio data of an utterance. A user associated with the highest likelihood score for a particular utterance can be identified as the speaker of the utterance. Information regarding the identified user can be provided to components of a spoken language processing system, separate applications, etc.
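One possible reading of the scoring step is sketched below: each user profile is a diagonal-covariance GMM, each audio frame is scored by its best-scoring component, the frame scores are summed over the utterance, and the user with the highest total is returned. The profile layout (a list of `(weight, mean, variance)` tuples) and the function names are assumptions for illustration, not the disclosed implementation.

```python
import math


def log_gaussian(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def utterance_score(frames, gmm):
    """Sum, over frames, of the best-scoring component's weighted log-likelihood.
    `gmm` is a list of (weight, mean, variance) tuples (assumed layout)."""
    total = 0.0
    for frame in frames:
        total += max(
            math.log(weight) + log_gaussian(frame, mean, var)
            for weight, mean, var in gmm
        )
    return total


def identify_speaker(frames, profiles):
    """Return the user whose profile gives the utterance the highest score."""
    return max(profiles, key=lambda user: utterance_score(frames, profiles[user]))
```

A usage example with two toy profiles:

```python
profiles = {
    "alice": [(0.5, (0.0, 0.0), (1.0, 1.0)), (0.5, (1.0, 1.0), (1.0, 1.0))],
    "bob": [(0.5, (5.0, 5.0), (1.0, 1.0)), (0.5, (6.0, 6.0), (1.0, 1.0))],
}
frames = [(0.1, -0.2), (0.9, 1.1)]
speaker = identify_speaker(frames, profiles)  # "alice" for these frames
```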
Methods, apparatus and systems for authentication
Methods, apparatus and systems for biometric authentication based on an audio signal are provided. The audio signal comprises a representation of a voice signal of a user conducted via at least part of a user's skeleton. Further embodiments may relate to biometric authentication based upon a combination of a bone-conducted audio signal, or a bone-conducted voice biometric process, with an air-conducted voice signal.