Patent classifications
G10L15/01
Automated Social Agent Interaction Quality Monitoring and Improvement
A system for monitoring and improving social agent interaction quality includes a computing platform having processing hardware and a system memory storing a software code. The processing hardware is configured to execute the software code to receive, from a social agent, interaction data describing an interaction of the social agent with a user, and to perform an assessment of the interaction, using the interaction data, as one of successful or including a flaw. When the assessment indicates that the interaction includes the flaw, the processing hardware is further configured to execute the software code to identify an interaction strategy for correcting the flaw, and to deliver, to the social agent, one or both of the assessment and the interaction strategy to correct the flaw in the interaction.
Methods and systems for correcting transcribed audio files
Methods and systems for correcting transcribed text. One method includes receiving audio data from one or more audio data sources and transcribing the audio data based on a voice model to generate text data. The method also includes making the text data available to a plurality of users over at least one computer network and receiving corrected text data over the at least one computer network from the plurality of users. In addition, the method can include modifying the voice model based on the corrected text data.
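The transcribe, crowd-correct, and retrain loop in the abstract above can be sketched as follows. This is a minimal illustration, not an API from the patent; the names (`VoiceModel`, `transcribe`, `update`) and the toy word-substitution "model" are assumptions standing in for a real ASR voice model.

```python
# Sketch of the loop: transcribe audio with a voice model, publish the text,
# receive corrected text back from users, and update the model from the diff.
from dataclasses import dataclass, field

@dataclass
class VoiceModel:
    # maps a misrecognized word to its most frequent human correction
    corrections: dict[str, str] = field(default_factory=dict)

    def transcribe(self, audio_tokens: list[str]) -> str:
        # stand-in for real ASR: apply learned corrections token by token
        return " ".join(self.corrections.get(t, t) for t in audio_tokens)

    def update(self, original: str, corrected: str) -> None:
        # learn word-level substitutions from a user-supplied correction
        for o, c in zip(original.split(), corrected.split()):
            if o != c:
                self.corrections[o] = c

model = VoiceModel()
draft = model.transcribe(["recognise", "speech"])  # text made available to users
model.update(draft, "recognize speech")            # corrected text received back
```

After the update, re-transcribing the same audio reflects the correction, which is the "modifying the voice model based on the corrected text data" step.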
Method and apparatus for correcting failures in automated speech recognition systems
Systems and methods are disclosed and described for correcting errors in ASR transcriptions. For an incorrect transcription, different words or phrases from the transcription, and/or related words or phrases, are submitted as hint words to the ASR system, and the voice query is submitted again to determine new transcriptions. This process is repeated with different transcription terms until a different, more accurate transcription is generated, improving the accuracy of ASR systems.
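The hint-word retry loop described above can be sketched as below. The `asr` stub and every name here are hypothetical illustrations, assuming an ASR service that accepts a set of hint words alongside the audio.

```python
# Resubmit the voice query with different candidate hint words until the
# ASR system produces a transcription that differs from the failed one.

def asr(query_audio: bytes, hints: frozenset = frozenset()) -> str:
    # stand-in ASR: misrecognizes "beatles" unless given the right hint
    return "play the beatles" if "beatles" in hints else "play the beetles"

def correct_with_hints(query_audio: bytes, bad_transcription: str,
                       related_terms: list[str]) -> str:
    # candidate hints: words from the bad transcription, plus related terms
    candidates = bad_transcription.split() + related_terms
    for hint in candidates:
        result = asr(query_audio, hints=frozenset({hint}))
        if result != bad_transcription:
            return result          # a different (hopefully better) transcription
    return bad_transcription       # no hint changed the output

fixed = correct_with_hints(b"...", "play the beetles", ["beatles", "band"])
```

A production version would also need a stopping criterion and a way to rank the new transcriptions, which the abstract leaves unspecified.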
UTTERANCE EVALUATION APPARATUS, UTTERANCE EVALUATION METHOD, AND PROGRAM
A stable evaluation result is obtained from speech audio for any sentence. A speech evaluation device (1) outputs a score evaluating the speech of an input voice signal spoken by a speaker in a first group. A feature extraction unit (11) extracts an acoustic feature from the input voice signal. A conversion unit (12) converts the acoustic feature of the input voice signal into the acoustic feature that would be obtained if a speaker in a second group spoke the same text as that of the input voice signal. An evaluation unit (13) calculates a score that indicates a higher evaluation as the distance between the acoustic feature before conversion and the acoustic feature after conversion becomes shorter.
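The scoring rule above (shorter pre/post-conversion feature distance means a higher score) can be sketched as follows. The `convert` stub, the linear score mapping, and `max_dist` are all assumptions for illustration; the patent does not specify the conversion model or the distance-to-score mapping.

```python
# Score an utterance by converting its acoustic features toward a reference
# (second-group) rendering of the same text, then mapping the Euclidean
# distance between the original and converted features to a score in [0, 1].
import math

def convert(features: list[float]) -> list[float]:
    # stand-in for the group-2 feature conversion (e.g. a trained model)
    return [f * 0.9 for f in features]

def score(features: list[float], max_dist: float = 10.0) -> float:
    converted = convert(features)
    dist = math.dist(features, converted)
    return max(0.0, 1.0 - dist / max_dist)  # shorter distance -> higher score
```

With this toy conversion, feature vectors that the conversion barely changes score close to 1.0, mirroring the evaluation unit (13) in the abstract.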
Method and system for adaptive language learning
Methods and systems provide an adaptive method of language learning using automatic speech recognition that allows a user to learn a new language using only their voice, without using their hands or eyes. The system may be implemented in an application for a smartphone. Each lesson comprises a series of questions that adapt to the user's knowledge. The questions ask for the translation of a word or phrase by playing an audio prompt in the origin language, recording the user speaking the translation in the target language, indicating whether the utterance was correct or incorrect, and providing feedback related to the user's utterance. Each user response is evaluated in real time, and the application provides individualized feedback to the user based on their response. Subsequent questions in the lesson and future lessons are dynamically ordered to adapt to the user's knowledge.
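One simple reading of the dynamic ordering above is that missed or unseen items are scheduled before items the user already answered correctly. The sketch below assumes that policy; the function name and the boolean result format are hypothetical, and a real system would likely use graded scores and spaced repetition rather than a binary split.

```python
# Order the next lesson so that questions the user missed (or has never
# seen) come before questions they answered correctly last time.

def next_lesson(questions: list[str], results: dict[str, bool]) -> list[str]:
    # results maps a question to whether it was answered correctly;
    # unseen questions are treated as missed
    missed = [q for q in questions if not results.get(q, False)]
    known = [q for q in questions if results.get(q, False)]
    return missed + known

order = next_lesson(["hola", "adiós", "gracias"], {"hola": True, "adiós": False})
```

Here the missed "adiós" and the never-seen "gracias" are scheduled ahead of the already-known "hola".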
INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND PROGRAM
Disclosed is an information processing apparatus including a control section that estimates utterance environment information while set to cooperate with a predetermined mobile terminal.