G10L17/02

MACHINE LEARNING FOR IMPROVING QUALITY OF VOICE BIOMETRICS

Methods and systems are disclosed herein for improving the quality of audio for use in a biometric. A biometric system may use machine learning to determine whether audio or a portion of the audio should be used as a biometric for a user. A sample of the user's voice may be used to generate a voice signature of the user. Portions of the audio that do not meet a similarity threshold when compared with the voice signature may be removed from the audio. Additionally or alternatively, interfering noises may be detected and removed from the audio to improve the quality of a voice biometric generated from the audio.
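
The similarity-threshold step described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: it assumes each audio portion has already been converted to a fixed-length embedding by some model, and the names `cosine_similarity`, `keep_matching_segments`, and the default threshold of 0.8 are hypothetical.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def keep_matching_segments(
    segment_embeddings: Sequence[Sequence[float]],
    voice_signature: Sequence[float],
    threshold: float = 0.8,  # assumed value; the abstract does not state one
) -> List[int]:
    """Return indices of segments whose similarity to the signature meets the threshold."""
    return [
        i
        for i, embedding in enumerate(segment_embeddings)
        if cosine_similarity(embedding, voice_signature) >= threshold
    ]
```

Segments whose indices are not returned would be the portions removed from the audio before a voice biometric is generated.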

Voiceprint recognition method, model training method, and server

Embodiments of this application disclose a voiceprint recognition method performed by a computer. After obtaining a to-be-recognized target voice message, the computer obtains target feature information of the target voice message by using a voiceprint recognition model, the voiceprint recognition model being obtained through training according to a first loss function and a second loss function, the first loss function being a normalized exponential function and the second loss function being a centralization function. Next, the computer determines a voiceprint recognition result according to the target feature information and registration feature information, the registration feature information being obtained from a voice message of a to-be-recognized object by using the voiceprint recognition model. The normalized exponential function and the centralization function jointly optimize the voiceprint recognition model and can reduce the intra-class variation among deep features from the same speaker. Because the two functions supervise the training of the model simultaneously, the deep features become more discriminative, thereby improving recognition performance.
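
One common way to write a joint objective of this kind is shown below. The abstract gives no formula, so this is a sketch under assumptions: the normalized exponential function is taken to be a softmax cross-entropy loss, the centralization function is taken to be a center loss, and all symbols (the deep feature x_i of utterance i, its speaker label y_i, the class center c_{y_i}, the classifier weights W and biases b, the batch size M, the number of speakers N, and the balancing weight \lambda) are introduced here for illustration only.

```latex
\[
L = L_s + \lambda L_c, \qquad
L_s = -\frac{1}{M}\sum_{i=1}^{M}
      \log\frac{e^{W_{y_i}^{\top}x_i + b_{y_i}}}{\sum_{j=1}^{N} e^{W_j^{\top}x_i + b_j}}, \qquad
L_c = \frac{1}{2M}\sum_{i=1}^{M}\lVert x_i - c_{y_i}\rVert_2^{2}
\]
```

Under this reading, L_s separates features of different speakers, while L_c pulls each feature toward the center of its own speaker, which is what shrinks the intra-class variation.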

ELECTRONIC DEVICE, METHOD AND COMPUTER PROGRAM

An electronic device having circuitry configured to perform source separation on an audio signal to obtain a separated source and a residual signal, to perform feature extraction on the separated source to obtain one or more processing parameters, and to perform audio processing on a captured audio signal based on the one or more processing parameters to obtain an adjusted separated source.

SYSTEM WITH POST-CONVERSATION EVALUATION, ELECTRONIC DEVICE, AND RELATED METHODS
20220358935 · 2022-11-10

A system, an electronic device, and related methods are disclosed, in particular a method of operating a system comprising an electronic device. The method comprises: obtaining one or more audio signals including a first audio signal of a first conversation; determining a first conversation period of the first conversation, the first conversation period having a first duration; determining first conversation metric data including a first conversation metric based on the first conversation period; determining a second conversation period of the first conversation different from the first conversation period, the second conversation period having a second duration; determining second conversation metric data including a second conversation metric based on the second conversation period; determining a first performance metric based on a change between the first conversation metric data and the second conversation metric data; and outputting, via an interface of the electronic device, the first performance metric.
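
The relationship between the two conversation periods and the performance metric can be sketched as follows. This is only an illustration: the abstract does not say what a conversation metric is or how the change is computed, so the per-period metric is assumed here to be the mean of per-frame scores (for example, sentiment scores), and the change is assumed to be a simple difference. The names `conversation_metric` and `performance_metric` are hypothetical.

```python
from statistics import mean
from typing import Sequence


def conversation_metric(frame_scores: Sequence[float]) -> float:
    """Assumed metric for one conversation period: the mean of its per-frame scores."""
    return mean(frame_scores)


def performance_metric(
    first_period_scores: Sequence[float],
    second_period_scores: Sequence[float],
) -> float:
    """Assumed performance metric: change from the first period to the second."""
    return conversation_metric(second_period_scores) - conversation_metric(first_period_scores)
```

With this convention, a positive value means the metric rose between the first and the second conversation period.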

Information presentation device, and information presentation method
11495209 · 2022-11-08

There is provided an information presentation device that is configured to present information, to a plurality of users that differ in level, in such a manner that each of the users can easily understand the information, and an information presentation method. The information presentation device includes: an identification unit that identifies respective levels of one or more users; an obtaining unit that obtains presentation information to be presented to the users; a conversion unit that appropriately converts the obtained presentation information according to the level of each user; and a presentation unit that presents the appropriately converted presentation information to each user. The present technology can be applied to, for example, a robot, a signage device, a car navigation device, and the like.

Headset for acoustic authentication of a user
11494473 · 2022-11-08

A headset for acoustic authentication of a user is provided, the headset comprising at least a first microphone, a second microphone, a controllable filter, and an authenticator. The first microphone is arranged to obtain a first input signal. The second microphone is arranged to obtain a second input signal. The controllable filter is configured to receive the first input signal and the second input signal and to determine at least one filter transfer function from the received first input signal and the second input signal. The authenticator is configured to determine a current user acoustic signature from the at least one filter transfer function and to compare the current user acoustic signature with a predefined user acoustic signature and to authenticate the user based on the comparison of the current user acoustic signature with the predefined user acoustic signature.
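
The flow from two microphone signals to an authentication decision can be sketched as follows. The abstract does not specify the filter type or the comparison rule, so this sketch makes two loud assumptions: the controllable filter is modelled as a short FIR filter adapted with normalized LMS so that it maps the first input signal onto the second, and the signatures are compared by Euclidean distance against a tolerance. All function names, the tap count, the step size, and the tolerance are hypothetical.

```python
import math
import random
from typing import List, Sequence


def white_noise(n: int, seed: int = 0) -> List[float]:
    """Synthetic probe signal standing in for the first microphone input."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def fir(x: Sequence[float], h: Sequence[float]) -> List[float]:
    """Simulate the second microphone input as x filtered by the acoustic path h."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]


def estimate_filter(
    first: Sequence[float],
    second: Sequence[float],
    taps: int = 4,
    mu: float = 0.5,
    eps: float = 1e-8,
) -> List[float]:
    """Adapt FIR coefficients with normalized LMS so that filter(first) tracks second."""
    w = [0.0] * taps
    for n in range(len(first)):
        # Most recent `taps` samples of the first signal, zero-padded at the start.
        window = [first[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        error = second[n] - sum(wi * xi for wi, xi in zip(w, window))
        power = sum(xi * xi for xi in window) + eps
        w = [wi + mu * error * xi / power for wi, xi in zip(w, window)]
    return w


def authenticate(
    current_signature: Sequence[float],
    predefined_signature: Sequence[float],
    max_distance: float = 0.05,  # assumed tolerance
) -> bool:
    """Accept the user if the current signature is close to the predefined one."""
    return math.dist(current_signature, predefined_signature) <= max_distance
```

Here the adapted coefficients stand in for the filter transfer function and serve directly as the current user acoustic signature; a real device could derive the signature from them in some other way.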