G10L17/12

Information processing apparatus, control method, and program

The information processing apparatus (2000) computes a first score representing a degree of similarity between the input sound data (10) and the registrant sound data (22) of the registrant (20). The information processing apparatus (2000) obtains a plurality of pieces of segmented sound data (12) by segmenting the input sound data (10) in the time direction. The information processing apparatus (2000) computes, for each piece of segmented sound data (12), a second score representing the degree of similarity between the segmented sound data (12) and the registrant sound data (22). The information processing apparatus (2000) makes a first determination of whether the number of speakers of sound included in the input sound data (10) is one or multiple, using at least the second scores. The information processing apparatus (2000) makes a second determination of whether the input sound data (10) includes the sound of the registrant (20), based on the first score, the second scores, and a result of the first determination.
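
The two-stage decision described in this abstract could be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say how similarity is measured or how the scores are combined, so the use of cosine similarity over embedding vectors, the averaging of segment embeddings to stand in for the whole input, both thresholds, and the function names are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def includes_registrant(segment_embeddings, registrant_embedding,
                        accept_threshold=0.7, spread_threshold=0.3):
    """Return (includes_registrant, multiple_speakers).

    Assumption: each time segment of the input is already an embedding
    vector, and the whole input is approximated by the mean of its segments.
    """
    # First score: whole input versus the registrant.
    first_score = cosine(mean_vector(segment_embeddings), registrant_embedding)
    # Second scores: one per time segment.
    second_scores = [cosine(s, registrant_embedding) for s in segment_embeddings]
    # First determination (assumed rule): a wide spread of segment scores
    # suggests that more than one speaker is present.
    multiple = (max(second_scores) - min(second_scores)) > spread_threshold
    # Second determination (assumed rule): with multiple speakers, rely on
    # the best-matching segment; otherwise rely on the whole-input score.
    decisive = max(second_scores) if multiple else first_score
    return decisive >= accept_threshold, multiple
```

With these assumed rules, an input in which only one segment matches the registrant is still accepted, because the whole-input score would be diluted by the other speaker.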

Presence data determination and utilization

Systems and methods for presence ground truth approximation and utilization are disclosed. For example, a system detects the presence of a predefined subject, such as a person associated with a given user profile, and/or determines that authentication criteria for performing an action in association with the user profile have been satisfied. A period of time with which to associate data is determined, and data of one or more data types is labeled as being associated with the speaker identification event. That data may be formatted and input into one or more models to train those models to more accurately detect presence and/or determine whether authentication of a user profile should succeed.
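
One way to picture the labeling step is a time window around the detection event. The sketch below is a hypothetical illustration only: the sample layout (dictionaries with a `timestamp` key), the symmetric 30-second window, and the function name `label_window` are assumptions, not details from the abstract.

```python
from datetime import datetime, timedelta

def label_window(samples, event_time, profile_id,
                 before=timedelta(seconds=30), after=timedelta(seconds=30)):
    """Label samples whose timestamp falls inside the window around the event.

    Samples inside the window get the profile identifier as their label;
    samples outside it get None (no presence label).
    """
    start, end = event_time - before, event_time + after
    labeled = []
    for sample in samples:
        in_window = start <= sample["timestamp"] <= end
        labeled.append({**sample, "label": profile_id if in_window else None})
    return labeled
```

The labeled records could then be formatted as training examples for a presence model.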

Systems and methods for proactive fraudster exposure in a customer service channel
11451658 · 2022-09-20

Methods for improved fraudster detection in a call center are disclosed. A subset of a plurality of voiceprints from a plurality of interactions between callers and agents at a call center can be used as the basis for fraudster detection. A plurality of connected components that represent one or more voiceprints can be determined based on the subset of the plurality of voiceprints.
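
To make the idea of connected components over voiceprints concrete, here is a minimal sketch. The abstract does not say how the graph is built, so the assumptions are labeled: voiceprints are treated as plain vectors, two voiceprints are linked when their cosine similarity reaches a hypothetical threshold, and components are found with a union-find structure.

```python
import math
from itertools import combinations

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def connected_components(voiceprints, threshold=0.8):
    """Group voiceprint indices into connected components.

    Assumption: an edge exists between two voiceprints when their
    cosine similarity is at least `threshold`.
    """
    parent = list(range(len(voiceprints)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(voiceprints)), 2):
        if cosine(voiceprints[i], voiceprints[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(voiceprints)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Under these assumptions, a component spanning many interactions would indicate one voice recurring across calls, which is the kind of cluster a reviewer might inspect.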

Speaker identification with ultra-short speech segments for far and near field voice assistance applications

A speaker recognition device includes a memory and a processor. The memory stores enrolled key phrase data corresponding to utterances of a key phrase by enrolled users, and text-dependent and text-independent acoustic speaker models of the enrolled users. The processor is operatively connected to the memory, and executes instructions to authenticate a speaker as an enrolled user, which includes detecting input key phrase data corresponding to a key phrase uttered by the speaker, computing text-dependent and text-independent scores for the speaker using speech models of the enrolled user, computing a confidence score, and authenticating or rejecting the speaker as the enrolled user based on whether the confidence score indicates that the input key phrase data corresponds to the speech from the enrolled user.
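
The final authentication step can be illustrated with a simple score fusion. The abstract does not state how the confidence score is derived from the text-dependent and text-independent scores, so the weighted sum, the weight, the threshold, and the function names below are assumptions chosen for illustration.

```python
def confidence_score(td_score, ti_score, td_weight=0.6):
    """Fuse the two scores with a weighted sum (assumed fusion rule)."""
    return td_weight * td_score + (1.0 - td_weight) * ti_score

def authenticate(td_score, ti_score, threshold=0.5, td_weight=0.6):
    """Accept the speaker when the fused confidence reaches the threshold."""
    return confidence_score(td_score, ti_score, td_weight) >= threshold
```

For example, `authenticate(0.9, 0.4)` accepts the speaker under these assumed values, while `authenticate(0.2, 0.3)` rejects.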

Assessing speaker recognition performance
11837238 · 2023-12-05

A method for evaluating a verification model includes receiving a first set and a second set of verification results, where each verification result indicates whether a primary model or an alternative model verifies an identity of a user as a registered user. The method further includes identifying each verification result in the first and second sets that includes a performance metric. The method also includes determining a first score of the primary model based on a number of the verification results identified in the first set that include the performance metric and determining a second score of the alternative model based on a number of the verification results identified in the second set that include the performance metric. The method further includes determining whether a verification capability of the alternative model is better than a verification capability of the primary model based on the first score and the second score.
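
The comparison described here can be sketched as counting flagged results per model. The following is a sketch under assumptions: each verification result is modeled as a dictionary with an optional `metrics` list, the performance metric is taken to be an error indicator named `false_reject`, and each score is the fraction of results carrying that metric, so a lower score is treated as better. None of these specifics come from the abstract.

```python
def metric_rate(results, metric="false_reject"):
    """Fraction of verification results that include the given metric."""
    if not results:
        return 0.0
    flagged = sum(1 for r in results if metric in r.get("metrics", ()))
    return flagged / len(results)

def alternative_is_better(primary_results, alternative_results,
                          metric="false_reject"):
    """Compare the two models; a lower error-metric rate counts as better (assumed)."""
    first_score = metric_rate(primary_results, metric)
    second_score = metric_rate(alternative_results, metric)
    return second_score < first_score
```

Normalizing by the size of each set keeps the comparison fair when the two models have handled different numbers of verification attempts.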

APPARATUS AND METHOD FOR OWN VOICE SUPPRESSION
20220078561 · 2022-03-10

An own voice suppression apparatus applicable to a hearing aid is disclosed. The own voice suppression apparatus comprises an air conduction sensor, an own voice indication module, and a suppression module. The air conduction sensor is configured to generate an audio signal. The own voice indication module is configured to generate an indication signal according to at least one of the user's mouth vibration information and the user's voice feature vector comparison result. The suppression module, coupled to the air conduction sensor and the own voice indication module, is configured to generate an own-voice-suppressed signal according to the indication signal and the audio signal.
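
As a rough picture of how a suppression module might combine the two signals, the sketch below attenuates each audio sample in proportion to the indication signal. This is an assumed scheme, not the one claimed: the indication signal is taken to be a per-sample own-voice likelihood in the range 0 to 1, and `max_attenuation` is a hypothetical parameter.

```python
def suppress_own_voice(audio, indication, max_attenuation=0.9):
    """Scale each sample down according to the own-voice indication.

    An indication of 0.0 leaves the sample unchanged; an indication of 1.0
    reduces it by `max_attenuation`. Indication values are clamped to [0, 1].
    """
    if len(audio) != len(indication):
        raise ValueError("audio and indication must have the same length")
    output = []
    for sample, likelihood in zip(audio, indication):
        likelihood = min(max(likelihood, 0.0), 1.0)
        output.append(sample * (1.0 - max_attenuation * likelihood))
    return output
```

A practical device would more likely operate on frames or frequency bands than on individual samples; the per-sample form is used here only to keep the example short.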