Patent classifications
G10L15/01
Detecting potential significant errors in speech recognition results
In some embodiments, recognition results produced by a speech processing system based on an analysis of a speech input (which may include two or more recognition results, including a top recognition result and one or more alternative recognition results) are evaluated for indications of potential errors. In some embodiments, the indications of potential errors may include discrepancies between recognition results that are meaningful for a domain, such as medically-meaningful discrepancies. The evaluation of the recognition results may be carried out using any suitable criteria, including one or more criteria that differ from criteria used by an ASR system in determining the top recognition result and the alternative recognition results from the speech input. In some embodiments, a recognition result may additionally or alternatively be processed to determine whether the recognition result includes a word or phrase that is unlikely to appear in a domain to which the speech input relates.
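The comparison of a top recognition result against its alternatives can be illustrated with a minimal sketch. It is not the method claimed in the patent: the word-level diff, the function name `meaningful_discrepancies`, and the small `MEDICAL_TERMS` set are all assumptions made for illustration. The sketch flags only those differences that involve a term considered meaningful for the domain.

```python
from difflib import SequenceMatcher

# Hypothetical list of domain-meaningful terms (an assumption, not from the source).
MEDICAL_TERMS = {"hypertension", "hypotension", "no", "left", "right"}


def meaningful_discrepancies(top, alternatives, domain_terms=MEDICAL_TERMS):
    """Return (top_span, alt_span) word pairs that differ and involve a domain term."""
    top_words = top.lower().split()
    flagged = []
    for alt in alternatives:
        alt_words = alt.lower().split()
        # Word-level alignment between the top result and one alternative.
        matcher = SequenceMatcher(a=top_words, b=alt_words, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                continue
            top_span = top_words[i1:i2]
            alt_span = alt_words[j1:j2]
            # Keep the discrepancy only if either side contains a domain term.
            if domain_terms.intersection(top_span + alt_span):
                flagged.append((top_span, alt_span))
    return flagged


if __name__ == "__main__":
    print(meaningful_discrepancies(
        "patient has hypertension",
        ["patient has hypotension", "patient has a hypertension"],
    ))
```

In this sketch the inserted article "a" in the second alternative is ignored, while the hypertension/hypotension substitution is reported as a potential significant error.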
Information presentation device, and information presentation method
There is provided an information presentation device that is configured to present information, to a plurality of users that differ in level, in such a manner that each of the users can easily understand the information, and an information presentation method. The information presentation device includes: an identification unit that identifies respective levels of one or more users; an obtaining unit that obtains presentation information to be presented to the users; a conversion unit that appropriately converts the obtained presentation information according to the level of each user; and a presentation unit that presents the appropriately converted presentation information to each user. The present technology can be applied to, for example, a robot, a signage device, a car navigation device, and the like.
Artificial intelligence device for providing voice recognition service and method of operating the same
An artificial intelligence device for providing a voice recognition service includes a microphone configured to receive a voice command, a memory configured to store an error analysis model for inferring an error cause of voice recognition, an output unit, and a processor configured to determine whether voice recognition of the voice command has failed based on the voice command and voice recognition surrounding information, acquire the error cause from the voice recognition surrounding information using the error analysis model, and output the acquired error cause through the output unit.
Developing an Automatic Speech Recognition System Using Normalization
A computer-implemented technique identifies terms in an original reference transcription and original ASR output results that are considered valid variants of each other, even though these terms have different textual forms. Based on this finding, the technique produces a normalized reference transcription and normalized ASR output results in which valid variants are assigned the same textual form. In some implementations, the technique uses the normalized text to develop a model for an ASR system. For example, the technique may generate a word error rate (WER) measure by comparing the normalized reference transcription with the normalized ASR output results, and use the WER measure as guidance in developing the model. Some aspects of the technique involve identifying occasions in which a term can be properly split into component parts. Other aspects can identify other ways in which two terms may vary in spelling, but nonetheless remain valid variants.
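A small sketch may help show why normalization matters for the WER measure. The variant table below (including the split of "icecream" into "ice cream") and the function names are hypothetical; the abstract does not say how valid variants are identified. WER is computed here as the word-level edit distance divided by the reference length.

```python
# Hypothetical variant table: each term maps to its normalized token sequence.
VARIANTS = {
    "icecream": ["ice", "cream"],  # a term split into component parts
    "colour": ["color"],           # a spelling variant
}


def normalize(words, variants=VARIANTS):
    """Replace each word with its normalized form, splitting terms where needed."""
    normalized = []
    for word in words:
        normalized.extend(variants.get(word, [word]))
    return normalized


def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the number of reference words."""
    if not reference:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, 1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, 1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(reference)


if __name__ == "__main__":
    ref = "i like ice cream".split()
    hyp = "i like icecream".split()
    print(word_error_rate(ref, hyp))                        # before normalization
    print(word_error_rate(normalize(ref), normalize(hyp)))  # after normalization
```

Without normalization the two transcriptions differ by one substitution and one deletion; after normalization they are identical, so the WER no longer penalizes a valid variant.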
Enhancing ASR System Performance for Agglutinative Languages
A training-stage technique trains a language model for use in an ASR system. The technique includes: obtaining a training corpus that includes a sequence of terms; determining that an original term in the training corpus is not present in a dictionary resource; segmenting the original term into two or more sub-terms using a segmentation resource; determining that the segmentation of the original term into the two or more sub-terms is a valid segmentation, based on two or more validity tests; and training the language model based on the terms that have been identified. A computer-implemented inference-stage technique applies the language model to produce ASR output results. The inference-stage technique merges a sub-term with a preceding term if these two terms are separated by no more than a prescribed interval of time.
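The inference-stage merging rule can be sketched as follows. The `Token` structure, the `is_subterm` flag, and the 0.1 second default for the prescribed interval are assumptions made for illustration; the abstract does not specify how sub-terms are marked or what interval is used.

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    start: float              # start time in seconds
    end: float                # end time in seconds
    is_subterm: bool = False  # hypothetical flag set by the segmentation step


def merge_subterms(tokens, max_gap=0.1):
    """Merge a sub-term into the preceding term if the gap between them is at most max_gap."""
    merged = []
    for tok in tokens:
        if merged and tok.is_subterm and tok.start - merged[-1].end <= max_gap:
            prev = merged[-1]
            # Join the texts and extend the time span to cover both tokens.
            merged[-1] = Token(prev.text + tok.text, prev.start, tok.end, prev.is_subterm)
        else:
            merged.append(tok)
    return merged


if __name__ == "__main__":
    tokens = [
        Token("ev", 0.0, 0.3),
        Token("ler", 0.32, 0.6, is_subterm=True),  # close in time: merged
        Token("de", 1.5, 1.7, is_subterm=True),    # long pause: kept separate
    ]
    print([t.text for t in merge_subterms(tokens)])
```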
Method and System for Monitoring the Performance of a Voice Recognition Assistance System in a Data Sensitive Environment
The disclosure relates to a method and system for monitoring the performance of a voice recognition (VR) assistance system in a data sensitive environment, wherein the VR assistance system comprises one or more client devices and a server, the server comprising a monitoring component. The method comprises determining, by at least one client device, client input data; processing, by the VR assistance system, the client input data; determining, by the monitoring component, one or more anonymized performance indicators of the VR assistance system; determining, by the monitoring component, one or more anonymized performance indicator values for the one or more anonymized performance indicators during the processing of the client input data; and outputting and/or saving, by the monitoring component, the determined one or more anonymized performance indicator values.
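One way to picture a monitoring component of this kind is a wrapper that records only aggregate numbers while the input is processed. This is a sketch under stated assumptions, not the disclosed system: the class name `AnonymizedMonitor`, the chosen indicators (request count, success rate, mean latency), and the convention that a `None` result means failed recognition are all hypothetical.

```python
import time
from statistics import mean


class AnonymizedMonitor:
    """Collects numeric indicators only; never stores the input or any client identity."""

    def __init__(self):
        self._latencies = []
        self._successes = 0
        self._requests = 0

    def observe(self, process, client_input):
        """Run process(client_input) and record timing and outcome, not content."""
        start = time.perf_counter()
        result = process(client_input)
        self._latencies.append(time.perf_counter() - start)
        self._requests += 1
        if result is not None:  # assumption: None signals a failed recognition
            self._successes += 1
        return result

    def indicator_values(self):
        """Return the anonymized indicator values gathered so far."""
        return {
            "requests": self._requests,
            "success_rate": self._successes / self._requests if self._requests else 0.0,
            "mean_latency_s": mean(self._latencies) if self._latencies else 0.0,
        }


if __name__ == "__main__":
    monitor = AnonymizedMonitor()
    monitor.observe(lambda text: text.upper(), "turn on the lights")
    monitor.observe(lambda text: None, "unintelligible input")
    print(monitor.indicator_values())
```

The values returned by `indicator_values` can be output or saved without exposing what any client said, which is the property the abstract emphasizes.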