G10L17/06

SYSTEM AND METHOD FOR AUGMENTED AUTHENTICATION USING ACOUSTIC DEVICES
20230216845 · 2023-07-06

Systems, methods, and computer program products are provided for augmented authentication using acoustic devices. The method includes receiving a transfer request including an NFT identifier from one of one or more acoustic devices. The NFT identifier corresponds to an acoustic device NFT associated with the given acoustic device and a device user. The method includes comparing the NFT identifier with one or more stored NFT identifiers to determine the given acoustic device associated with the NFT identifier. The method further includes confirming that the identity of the voice command user matches the device user associated with the acoustic device. The method still further includes causing an authentication of the transfer request upon confirming the acoustic device is associated with the voice command user.
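The lookup-and-confirm flow described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `DeviceRecord` type, the in-memory `STORED_NFTS` registry, the identifier strings, and the assumption that the voice command user has already been identified upstream are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceRecord:
    """An acoustic device and the device user it is registered to."""
    device_id: str
    device_user: str


# Hypothetical store of NFT identifiers; a real system would query a ledger or database.
STORED_NFTS = {
    "nft-001": DeviceRecord("earbud-A", "alice"),
    "nft-002": DeviceRecord("speaker-B", "bob"),
}


def authenticate_transfer(nft_identifier, voice_command_user, registry=STORED_NFTS):
    """Return True only if the NFT identifier maps to a known device
    whose device user matches the identified voice command user."""
    record = registry.get(nft_identifier)  # compare against stored NFT identifiers
    if record is None:
        return False  # unknown device: do not authenticate
    return record.device_user == voice_command_user


print(authenticate_transfer("nft-001", "alice"))
```

The two checks are kept separate so that an unknown device and a user mismatch can be logged or handled differently if needed.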

SYSTEM AND METHOD FOR REAL-TIME FRAUD DETECTION IN VOICE BIOMETRIC SYSTEMS USING PHONEMES IN FRAUDSTER VOICE PRINTS
20230214850 · 2023-07-06

A system and method for real-time fraud detection with a social engineering phoneme (SEP) watchlist of phoneme sequences may perform real-time fraud prevention operations including receiving incoming call interactions and grouping the call interactions into one or more clusters, each cluster associated with a speaker's voice based on voiceprints. For a pair of voiceprints in a cluster, a phoneme sequence is extracted for each voiceprint. From the extracted phoneme sequences, a similarity score is then calculated to determine, based on a threshold, whether a match exists between the extracted phoneme sequences. If a match is determined to exist, the phoneme sequence may be added to the SEP watchlist.
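The pairwise comparison step can be sketched as below. The abstract does not say how the similarity score is computed, so this example assumes a sequence-matching ratio from Python's standard `difflib` and a threshold of 0.8; the function names and the phoneme labels are illustrative only.

```python
from difflib import SequenceMatcher
from itertools import combinations


def phoneme_similarity(seq_a, seq_b):
    """Similarity in [0, 1] between two phoneme sequences (assumed metric)."""
    return SequenceMatcher(None, seq_a, seq_b, autojunk=False).ratio()


def update_watchlist(cluster, watchlist, threshold=0.8):
    """Compare every pair of phoneme sequences in one speaker cluster and
    add a sequence to the watchlist when a pair scores at or above the threshold."""
    for seq_a, seq_b in combinations(cluster, 2):
        if phoneme_similarity(seq_a, seq_b) >= threshold:
            candidate = tuple(seq_a)
            if candidate not in watchlist:  # avoid duplicate entries
                watchlist.append(candidate)
    return watchlist


# One cluster holding a phoneme sequence per voiceprint (made-up data).
cluster = [
    ["K", "AH", "N", "F", "ER", "M"],
    ["K", "AH", "N", "F", "ER", "M"],
    ["HH", "EH", "L", "OW"],
]
print(update_watchlist(cluster, []))
```

Only the first sequence of a matching pair is stored here; whether both sequences, or a merged form, should be kept is a design choice the abstract leaves open.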

Speaker identification
11694695 · 2023-07-04

A method of speaker identification comprises receiving an audio signal representing speech; performing a first voice biometric process on the audio signal to attempt to identify whether the speech is the speech of an enrolled speaker; and, if the first voice biometric process makes an initial determination that the speech is the speech of an enrolled user, performing a second voice biometric process on the audio signal to attempt to identify whether the speech is the speech of the enrolled speaker. The second voice biometric process is selected to be more discriminative than the first voice biometric process.
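The two-stage arrangement can be sketched as a simple cascade. The scoring callables, the score range of 0 to 1, and both thresholds are assumptions made for illustration; the abstract specifies only that the second process runs after a positive initial determination and is more discriminative than the first.

```python
def identify_speaker(audio, first_process, second_process,
                     first_threshold=0.5, second_threshold=0.8):
    """Run the first voice biometric process, and run the more discriminative
    second process only if the first one makes a positive initial determination."""
    if first_process(audio) < first_threshold:
        return False  # rejected early; the second process never runs
    return second_process(audio) >= second_threshold


# Stand-in scorers; real ones would compute biometric match scores from the audio.
calls = []


def cheap_scorer(audio):
    calls.append("first")
    return audio["coarse_score"]


def strict_scorer(audio):
    calls.append("second")
    return audio["fine_score"]


sample = {"coarse_score": 0.7, "fine_score": 0.9}
print(identify_speaker(sample, cheap_scorer, strict_scorer))
```

Gating the second process on the first keeps the more expensive check off the common path where the speech clearly does not belong to the enrolled speaker.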

Auto-completion for gesture-input in assistant systems

In one embodiment, a method includes receiving an initial input in a first modality from a first user from a client system associated with the first user, determining one or more intents corresponding to the initial input by an intent-understanding module, generating one or more candidate continuation-inputs based on the one or more intents, where the one or more candidate continuation-inputs are in one or more candidate modalities, respectively, and wherein the candidate modalities are different from the first modality, and sending instructions for presenting one or more suggested inputs corresponding to one or more of the candidate continuation-inputs to the client system.
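A toy sketch of the described flow follows: map an initial input to intents, generate candidate continuations, and keep only those in a modality different from the first. The lookup tables, the modality names, and the `Candidate` type are hypothetical placeholders for the intent-understanding module and generation logic, which the abstract does not detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    modality: str
    suggestion: str


# Hypothetical stand-in for the intent-understanding module.
INTENT_RULES = {"point": ["select_item"], "wave": ["greet"]}

# Hypothetical continuations per intent, each tagged with its modality.
CONTINUATIONS = {
    "select_item": [
        Candidate("voice", "say 'add this to my cart'"),
        Candidate("gesture", "tap to confirm"),
    ],
    "greet": [Candidate("text", "type a greeting")],
}


def suggest_continuations(initial_input, first_modality):
    """Return candidate continuation-inputs whose modality differs from the first modality."""
    intents = INTENT_RULES.get(initial_input, [])
    return [
        candidate
        for intent in intents
        for candidate in CONTINUATIONS.get(intent, [])
        if candidate.modality != first_modality
    ]


print(suggest_continuations("point", "gesture"))
```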

Detecting deep-fake audio through vocal tract reconstruction

A method is provided for distinguishing synthetic “deep-fake” audio samples from organic audio samples. Methods may include: generating a model of a vocal tract using one or more organic audio samples from a user; identifying a set of bigram-feature pairs from the one or more audio samples; estimating the cross-sectional area of the vocal tract of the user when speaking the set of bigram-feature pairs; receiving a candidate audio sample; identifying bigram-feature pairs of the candidate audio sample that are in the set of bigram-feature pairs; calculating a cross-sectional area of a theoretical vocal tract of a user when speaking the identified bigram-feature pairs; and identifying the candidate audio sample as a deep-fake audio sample in response to the calculated cross-sectional area of the theoretical vocal tract of a user failing to correspond within a predetermined measure of the estimated cross-sectional area of the vocal tract of the user.
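The final comparison step can be sketched as below, assuming the cross-sectional areas have already been estimated per bigram by some upstream acoustic model (not shown). The mean relative deviation metric, the 20% tolerance, the bigram labels, and the area values are all assumptions; the abstract says only that the areas must correspond "within a predetermined measure".

```python
def is_deepfake(enrolled_areas, candidate_areas, tolerance=0.2):
    """Flag a candidate sample when its vocal tract areas deviate, on average,
    from the enrolled user's areas by more than the tolerance.

    Both arguments map a bigram label to an estimated cross-sectional area.
    """
    shared = enrolled_areas.keys() & candidate_areas.keys()
    if not shared:
        raise ValueError("candidate shares no bigrams with the enrolled set")
    deviations = [
        abs(candidate_areas[bigram] - enrolled_areas[bigram]) / enrolled_areas[bigram]
        for bigram in shared
    ]
    return sum(deviations) / len(deviations) > tolerance


# Made-up area values for illustration only.
enrolled = {"ae-t": 2.0, "ih-n": 1.5}
organic = {"ae-t": 2.1, "ih-n": 1.45}
synthetic = {"ae-t": 0.4, "ih-n": 3.0}
print(is_deepfake(enrolled, organic), is_deepfake(enrolled, synthetic))
```

Only bigrams present in both sets are compared, mirroring the abstract's step of identifying candidate bigram-feature pairs that are in the enrolled set.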
