Patent classifications
G10L17/06
Method and device with data recognition
A processor-implemented method with data recognition includes: extracting input feature data from input data; calculating a matching score between the extracted input feature data and enrolled feature data of an enrolled user, based on the extracted input feature data, common component data of a plurality of enrolled feature data corresponding to the enrolled user, and distribution component data of the plurality of enrolled feature data corresponding to the enrolled user; and recognizing the input data based on the matching score.
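The abstract does not say how the common component and distribution component are computed or combined. As a minimal sketch only, the following assumes the common component is the per-dimension mean of the enrolled feature vectors, the distribution component is the per-dimension variance, and the matching score is a variance-weighted cosine similarity. The function names, the `eps` term, and the `threshold` value are hypothetical and do not come from the patent.

```python
import math

def common_component(enrolled):
    """Per-dimension mean of the enrolled vectors (assumed meaning of 'common component')."""
    n = len(enrolled)
    return [sum(col) / n for col in zip(*enrolled)]

def distribution_component(enrolled, common, eps=1e-6):
    """Per-dimension variance around the mean (assumed meaning of 'distribution component')."""
    n = len(enrolled)
    return [
        sum((v - m) ** 2 for v in col) / n + eps  # eps avoids division by zero later
        for col, m in zip(zip(*enrolled), common)
    ]

def matching_score(x, common, variance):
    """Cosine similarity after scaling each dimension by 1/sqrt(variance)."""
    weights = [1.0 / math.sqrt(v) for v in variance]
    a = [xi * w for xi, w in zip(x, weights)]
    b = [mi * w for mi, w in zip(common, weights)]
    dot = sum(ai * bi for ai, bi in zip(a, b))
    norm = math.sqrt(sum(ai * ai for ai in a)) * math.sqrt(sum(bi * bi for bi in b))
    return dot / norm if norm else 0.0

def recognize(x, enrolled, threshold=0.8):
    """Accept the input when the score reaches a hypothetical threshold."""
    common = common_component(enrolled)
    variance = distribution_component(enrolled, common)
    return matching_score(x, common, variance) >= threshold

# Three enrolled feature vectors for one user (illustrative values only).
enrolled = [[1.0, 0.0], [0.9, 0.1], [1.1, -0.1]]
accepted = recognize([1.0, 0.05], enrolled)   # close to the enrolled mean
rejected = recognize([0.0, 1.0], enrolled)    # orthogonal to the enrolled mean
```

Scaling by the inverse standard deviation makes dimensions in which the enrolled samples agree closely count for more than dimensions in which they vary. This is one plausible reading of using distribution data in the score, not necessarily the claimed one.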
Voice interaction scripts
This disclosure describes systems and methods that identify activities for which scripts can be built to perform an activity when requested by a user. The scripts can be voice-activated by a defined customized voice command and can include delivery preferences. The user's identity can be verified by analyzing voice biometrics of the customized voice command. After performance of the activity, results can be delivered to the device in the format indicated in the script.
Systems and methods for performing commands in a vehicle using speech and image recognition
Systems and methods are disclosed herein for implementation of a vehicle command operation system that may use multi-modal technology to authenticate an occupant of the vehicle to authorize a command and receive natural language commands for vehicular operations. The system may utilize sensors to receive data indicative of a voice command from an occupant of the vehicle. The system may receive second sensor data to aid in the determination of the corresponding vehicular operation in response to the received command. The system may retrieve authentication data for the occupants of the vehicle. The system authenticates the occupant to authorize a vehicular operation command using a neural network based on at least one of the first sensor data, the second sensor data, and the authentication data. Responsive to the authentication, the system may authorize the operation to be performed in the vehicle based on the vehicular operation command.
Utilizing sensor data for automated user identification
This disclosure describes techniques for identifying users that are enrolled for use of a user-recognition system. To be identified using the user-recognition system, a user may first enroll in the system by stating an utterance at a first device having a first microphone. In response, the first microphone may generate first audio data. Later, when the user would like to be identified by the system, the user may state the utterance again, although this time to a second device having a second microphone. This second microphone may accordingly generate second audio data. Because the acoustic response of the first microphone may differ from the acoustic response of the second microphone, however, this disclosure describes techniques to apply a relative transfer function to one or both of the first or second audio data prior to comparing these data so as to increase the recognition accuracy of the system.
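The idea of compensating for differing microphone responses before comparison can be illustrated with a small sketch. It assumes, purely for illustration, that audio is summarized as per-band magnitude spectra and that the relative transfer function is a per-band gain ratio estimated from a reference signal captured by both microphones. The abstract does not specify how the transfer function is estimated or applied, and all names below are hypothetical.

```python
import math

def estimate_rtf(ref_first, ref_second, eps=1e-9):
    """Per-band gain mapping second-microphone magnitudes onto the first microphone's response."""
    return [a / (b + eps) for a, b in zip(ref_first, ref_second)]

def apply_rtf(spectrum, rtf):
    """Scale each band of a magnitude spectrum by the relative transfer function."""
    return [s * h for s, h in zip(spectrum, rtf)]

def cosine(a, b):
    """Cosine similarity between two magnitude spectra."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Illustrative band responses: the first microphone is flat,
# the second attenuates the higher bands.
first_mic_response = [1.0, 1.0, 1.0, 1.0]
second_mic_response = [1.0, 0.8, 0.5, 0.25]
rtf = estimate_rtf(first_mic_response, second_mic_response)

# The same utterance as seen through each microphone.
enrolled = [4.0, 3.0, 2.0, 1.0]    # first audio data (enrollment)
query = [4.0, 2.4, 1.0, 0.25]      # second audio data (identification)

before = cosine(enrolled, query)
after = cosine(enrolled, apply_rtf(query, rtf))
```

In this toy setup the compensated query lines up with the enrollment spectrum, so `after` is higher than `before`. A practical system would more likely work on complex spectra per frame rather than a single magnitude vector.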
Audio data processing method, apparatus and storage medium for detecting wake-up words based on multi-path audio from microphone array
An audio data processing method is provided. The method includes: obtaining multi-path audio data in an environmental space, obtaining a speech data set based on the multi-path audio data, and separately generating, in a plurality of enhancement directions, enhanced speech information corresponding to the speech data set; matching a speech hidden feature in the enhanced speech information with a target matching word, and determining an enhancement direction corresponding to the enhanced speech information having a highest degree of matching with the target matching word as a target audio direction; obtaining speech spectrum features in the enhanced speech information, and obtaining, from the speech spectrum features, a speech spectrum feature in the target audio direction; and performing speech authentication on the speech hidden feature and the speech spectrum feature that are in the target audio direction based on the target matching word, to obtain a target authentication result.
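The direction-selection and authentication steps in this abstract can be sketched as follows. The sketch assumes the per-direction matching degree and a spectrum-based score have already been produced by models that are not shown, and it assumes the final authentication is a weighted fusion of the two scores. The class, field names, weights, and thresholds are all hypothetical; the abstract does not define them.

```python
from dataclasses import dataclass

@dataclass
class EnhancedSpeech:
    direction_deg: int      # enhancement (beamforming) direction
    wake_score: float       # assumed: matching degree of the hidden feature with the target word
    spectrum_score: float   # assumed: score derived from the speech spectrum feature

def pick_target_direction(candidates):
    """Choose the enhancement direction with the highest matching degree."""
    return max(candidates, key=lambda c: c.wake_score)

def authenticate(target, wake_threshold=0.5, fused_threshold=0.7, weight=0.5):
    """Second-stage check fusing both scores in the target direction (hypothetical rule)."""
    if target.wake_score < wake_threshold:
        return False
    fused = weight * target.wake_score + (1.0 - weight) * target.spectrum_score
    return fused >= fused_threshold

# Illustrative scores for three enhancement directions.
candidates = [
    EnhancedSpeech(direction_deg=0, wake_score=0.2, spectrum_score=0.9),
    EnhancedSpeech(direction_deg=90, wake_score=0.9, spectrum_score=0.8),
    EnhancedSpeech(direction_deg=180, wake_score=0.4, spectrum_score=0.1),
]
target = pick_target_direction(candidates)
result = authenticate(target)
```

Only the spectrum feature from the selected direction enters the second stage, which mirrors the abstract's step of taking the spectrum feature in the target audio direction.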