Patent classifications
G10L17/04
Attentive Scoring Function for Speaker Identification
A speaker verification method includes receiving audio data corresponding to an utterance and processing the audio data to generate an evaluation attentive d-vector (ad-vector) representing voice characteristics of the utterance. The evaluation ad-vector includes n_e style classes, each including a respective value vector concatenated with a corresponding routing vector. The method also includes generating, using a self-attention mechanism, at least one multi-condition attention score that indicates a likelihood that the evaluation ad-vector matches a respective reference ad-vector associated with a respective user. The method also includes identifying the speaker of the utterance as the respective user associated with the respective reference ad-vector based on the multi-condition attention score.
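The abstract does not give the scoring formula, so the following is only an illustrative sketch of how an attention-weighted score between an evaluation ad-vector and a reference ad-vector might be computed. Everything in it is an assumption: each ad-vector is modeled as a list of (value vector, routing vector) pairs, the routing vectors produce softmax attention weights over all class pairs, and those weights average the cosine similarities of the value vectors. The function name `attentive_score` and its helpers are hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot(a, b) / (na * nb)


def attentive_score(eval_ad, ref_ad):
    """Attention-weighted similarity between two ad-vectors (assumed form).

    Each ad-vector is a list of (value_vector, routing_vector) pairs,
    one pair per style class.
    """
    logits, sims = [], []
    for v_e, r_e in eval_ad:
        for v_r, r_r in ref_ad:
            # Routing vectors decide how much this class pair matters.
            logits.append(dot(r_e, r_r) / math.sqrt(len(r_e)))
            # Value vectors carry the speaker similarity itself.
            sims.append(cosine(v_e, v_r))
    # Numerically stable softmax over all class pairs.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return sum((w / z) * s for w, s in zip(exps, sims))


same = [([1.0, 0.0], [1.0, 0.0])]
other = [([0.0, 1.0], [1.0, 0.0])]
print(attentive_score(same, same))   # identical value vectors
print(attentive_score(same, other))  # orthogonal value vectors
```

Under these assumptions the score stays within [-1, 1], and a speaker would be identified by taking the enrolled user whose reference ad-vector yields the highest score.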
Method and apparatus for outputting information
A method and an apparatus for outputting information are provided. The method includes acquiring voice information received within a preset time period before a device is awakened, where the device is provided with a wake-up model for outputting preset response information when a preset wake-up word is received; performing speech recognition on the voice information to obtain a recognition result; extracting feature information of the voice information in response to determining that the recognition result does not include the preset wake-up word; generating a counterexample training sample according to the feature information; and training the wake-up model using the counterexample training sample, and outputting the trained wake-up model.
ACCESS CONTROL SYSTEM
Method and system for granting access to a restricted area or allowing a user to access a restricted area. The method includes training a machine learning engine, recording a voiceprint of a user, and determining the voiceprint of a user to be validated when access is attempted. If a primary key is entered, the voiceprint identified by the primary key is selected; if no primary key is entered, the closest of the recorded voiceprints is identified. The voiceprint is then validated. The training of the machine learning engine may be carried out by capturing audio samples through an audio capture device, wherein each user from a plurality of users repeats a same fixed phrase at least three times and each audio sample of each user is divided into two audio parts, training the machine learning engine using the first part of the audio samples, and validating the trained machine learning engine using the second part of the audio samples. An anti-fraud method includes requesting the user to repeat a phrase, capturing the audio sample of the phrase repeated by the user, transcribing the phrase repeated by the user, and comparing the transcribed phrase with the targeted phrase.
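The selection logic in this abstract (use the primary key if one is entered, otherwise fall back to the closest recorded voiceprint, then validate) can be sketched as follows. This is a minimal illustration, not the patented implementation: voiceprints are assumed to be plain numeric vectors, closeness is assumed to be cosine similarity, and the names `select_voiceprint`, `validate`, and the `threshold` value of 0.8 are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    num = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return num / (na * nb)


def select_voiceprint(enrolled, probe, primary_key=None):
    """Pick the recorded voiceprint to compare against.

    enrolled: dict mapping user id -> recorded voiceprint vector.
    probe: voiceprint captured at the access attempt.
    Returns (user_id, similarity), or (None, 0.0) if nothing applies.
    """
    if primary_key is not None:
        # A primary key was entered: compare only against that user.
        if primary_key not in enrolled:
            return None, 0.0
        return primary_key, cosine(enrolled[primary_key], probe)
    if not enrolled:
        return None, 0.0
    # No primary key: search for the closest recorded voiceprint.
    best = max(enrolled, key=lambda uid: cosine(enrolled[uid], probe))
    return best, cosine(enrolled[best], probe)


def validate(enrolled, probe, primary_key=None, threshold=0.8):
    """Return the validated user id, or None if access is denied."""
    uid, score = select_voiceprint(enrolled, probe, primary_key)
    return uid if uid is not None and score >= threshold else None


enrolled = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
print(validate(enrolled, [0.9, 0.1]))                     # closest match
print(validate(enrolled, [0.9, 0.1], primary_key="bob"))  # keyed check
```

In a real system the threshold would be tuned on the held-out second part of the audio samples that the abstract reserves for validating the trained engine.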
Systems and methods for preventing errors in medical imaging
A method for preventing wrong-patient errors includes receiving a selection of a current imaging subject. The current imaging subject is selected for a current image acquisition session comprising capturing one or more current images of the current imaging subject utilizing at least a first image sensor system of a first imaging modality. The method includes accessing one or more previous images of a previous imaging subject. The one or more previous images depict the previous imaging subject according to at least a second imaging modality that is different from the first imaging modality. The method includes presenting the one or more previous images on a display system and, in response to determining that the previous imaging subject matches the current imaging subject based upon the one or more previous images, performing the current image acquisition session.
MULTI-USER PERSONALIZATION AT A VOICE INTERFACE DEVICE
A method at an electronic device with one or more microphones and a speaker includes receiving a first voice input; comparing the first voice input to one or more voice models; based on the comparing, determining whether the first voice input corresponds to any of a plurality of occupants, and according to the determination, authenticating an occupant and presenting a response, or restricting functionality of the electronic device.