Patent classifications
G10L15/10
Improving speech recognition transcriptions
An approach to correcting transcriptions of speech recognition models may be provided. A list of similar-sounding phonemes associated with the phonemes of high-frequency terms may be generated for a particular node associated with a virtual assistant. An utterance may be transcribed, and the transcription may receive a confidence score regarding its correctness based on audio metrics and other factors. The phonemes of the utterance can be compared to the phonemes of the high-frequency terms from the list, and a score for the matching phonemes and similar-sounding phonemes can be determined. If the sounds-similar score for a term from the high-frequency term list is determined to be above a threshold, the transcription can be replaced with the term, providing a corrected transcription.
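The comparison step described in this abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the similar-phoneme table, the 1.0/0.5 scoring weights, both thresholds, and the rule that only low-confidence transcriptions are candidates for replacement are not taken from the patent. The sketch also compares only phoneme sequences of equal length, whereas a real system would likely align sequences of different lengths.

```python
# Hypothetical table of similar-sounding phoneme pairs (illustrative only).
SIMILAR_PHONEMES = {("B", "P"), ("D", "T"), ("M", "N"), ("S", "Z")}


def phoneme_score(a, b):
    """Return 1.0 for identical phonemes, 0.5 for similar-sounding ones, else 0.0."""
    if a == b:
        return 1.0
    if (a, b) in SIMILAR_PHONEMES or (b, a) in SIMILAR_PHONEMES:
        return 0.5
    return 0.0


def sounds_similar_score(utterance_phonemes, term_phonemes):
    """Average per-position score; sequences of different length score 0.0 (simplification)."""
    if not term_phonemes or len(utterance_phonemes) != len(term_phonemes):
        return 0.0
    total = sum(phoneme_score(a, b) for a, b in zip(utterance_phonemes, term_phonemes))
    return total / len(term_phonemes)


def correct_transcription(transcription, confidence, utterance_phonemes,
                          high_frequency_terms,
                          confidence_threshold=0.8, similarity_threshold=0.7):
    """Replace the transcription with the best-matching high-frequency term, if any.

    Assumption: a transcription whose confidence is already high is left alone.
    """
    if confidence >= confidence_threshold:
        return transcription
    best_term, best_score = None, 0.0
    for term, term_phonemes in high_frequency_terms.items():
        score = sounds_similar_score(utterance_phonemes, term_phonemes)
        if score > best_score:
            best_term, best_score = term, score
    if best_term is not None and best_score > similarity_threshold:
        return best_term
    return transcription


# Example: a low-confidence "palance" is matched against the frequent term "balance".
terms = {"balance": ["B", "AE", "L", "AH", "N", "S"]}
corrected = correct_transcription("palance", 0.4, ["P", "AE", "L", "AH", "N", "S"], terms)
```

In this example only the first phoneme differs, and it is in the similar-phoneme table, so the averaged score stays above the assumed similarity threshold and the term replaces the transcription.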
Satisfaction estimation model learning apparatus, satisfaction estimating apparatus, satisfaction estimation model learning method, satisfaction estimation method, and program
Estimation accuracies of a conversation satisfaction and a speech satisfaction are improved. A learning data storage unit (10) stores learning data including a conversation voice containing a conversation including a plurality of speeches, a correct answer value of a conversation satisfaction for the conversation, and a correct answer value of a speech satisfaction for each speech included in the conversation. A model learning unit (13) learns a satisfaction estimation model using a feature quantity of each speech extracted from the conversation voice, the correct answer value of the speech satisfaction, and the correct answer value of the conversation satisfaction, the satisfaction estimation model configured by connecting a speech satisfaction estimation model part that receives a feature quantity of each speech and estimates the speech satisfaction of each speech with a conversation satisfaction estimation model part that receives at least the speech satisfaction of each speech and estimates the conversation satisfaction.
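The connected two-part structure in this abstract can be sketched as a forward pass with a joint loss. This is a deliberately simplified stand-in, not the patent's model: the logistic-regression speech part, the mean-pooling conversation part, the squared-error losses, and the mixing weight `alpha` are all assumptions chosen to keep the example short.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def estimate_speech_satisfaction(features, w, b):
    """Speech part: map each speech's feature vector to a satisfaction in (0, 1)."""
    return [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) for x in features]


def estimate_conversation_satisfaction(speech_scores, v, c):
    """Conversation part: consumes the per-speech estimates (here, their mean)."""
    mean_score = sum(speech_scores) / len(speech_scores)
    return sigmoid(v * mean_score + c)


def joint_loss(features, speech_targets, conversation_target, params, alpha=0.5):
    """Weighted sum of the speech-level and conversation-level squared errors.

    Because the conversation part takes the speech part's output as input,
    minimizing this loss would train both parts together.
    """
    w, b, v, c = params
    speech_scores = estimate_speech_satisfaction(features, w, b)
    conversation_score = estimate_conversation_satisfaction(speech_scores, v, c)
    speech_loss = sum((p - t) ** 2 for p, t in zip(speech_scores, speech_targets))
    speech_loss /= len(speech_scores)
    conversation_loss = (conversation_score - conversation_target) ** 2
    return alpha * speech_loss + (1.0 - alpha) * conversation_loss


# Example: two speeches with two (hypothetical) features each.
features = [[1.0, 2.0], [3.0, 4.0]]
params = ([0.1, -0.2], 0.0, 1.0, 0.0)
loss = joint_loss(features, [0.8, 0.3], 0.6, params)
```

The design point the sketch tries to show is the connection itself: the conversation-level estimate depends on the speech-level estimates, so both correct answer values contribute to one training objective.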
Confusion network distributed representation generation apparatus, confusion network classification apparatus, confusion network distributed representation generation method, confusion network classification method and program
A technique is provided for transforming a confusion network into a representation that can be used as an input for machine learning. The technique includes a confusion network distributed representation sequence generating part that generates a confusion network distributed representation sequence, which is a vector sequence, from an arc word set sequence and an arc weight set sequence constituting the confusion network. The confusion network distributed representation sequence generating part comprises: an arc word distributed representation set sequence transforming part that, by transforming each arc word included in an arc word set into a word distributed representation, obtains an arc word distributed representation set and generates an arc word distributed representation set sequence; and an arc word distributed representation set weighting/integrating part that generates the confusion network distributed representation sequence from the arc word distributed representation set sequence and the arc weight set sequence.
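One plausible reading of the weighting/integrating step is a weighted sum of word vectors at each position of the confusion network. The sketch below assumes that reading; the toy embedding table, the function name, and the choice of a plain weighted sum are illustrative assumptions rather than details given in the abstract.

```python
def confusion_network_representation(arc_word_sets, arc_weight_sets, embeddings):
    """Turn a confusion network into a vector sequence.

    arc_word_sets:   one list of competing words per position
    arc_weight_sets: one list of arc weights per position, aligned with the words
    embeddings:      word -> vector (list of floats), all of the same dimension
    """
    dim = len(next(iter(embeddings.values())))
    sequence = []
    for words, weights in zip(arc_word_sets, arc_weight_sets):
        vector = [0.0] * dim
        for word, weight in zip(words, weights):
            # Transform the arc word to its distributed representation,
            # then integrate it with its arc weight.
            for i, value in enumerate(embeddings[word]):
                vector[i] += weight * value
        sequence.append(vector)
    return sequence


# Toy two-dimensional embeddings (hypothetical values).
embeddings = {"I": [1.0, 0.0], "eye": [0.0, 1.0], "see": [1.0, 1.0]}
arc_word_sets = [["I", "eye"], ["see"]]
arc_weight_sets = [[0.75, 0.25], [1.0]]
vectors = confusion_network_representation(arc_word_sets, arc_weight_sets, embeddings)
# vectors == [[0.75, 0.25], [1.0, 1.0]]
```

The result has one vector per position regardless of how many competing arcs that position has, which is what makes it usable as a fixed-shape input sequence for a downstream model.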
Systems and methods for identifying and storing a portion of a media asset
Systems and methods are described herein for a media guidance application that can cause a specific portion of a media asset to be stored based on a user command. For example, if the user requests the closing scene from a given movie, the media guidance application may detect the command, determine that it comprises an instruction to store a portion of a media asset, identify a source of the portion of the media asset, and cause the portion of the media asset to be stored. The media guidance application may also cause the entirety of the media asset to be stored and initiate playback at the start of the requested portion. This may allow users to store and watch portions of particular interest without requiring that the users seek through the entire media asset on their own.
Waste identification method, waste identification device, and waste identification program
An excreta identification device includes: a sound data acquisition unit that acquires sound data collected by a microphone arranged in a toilet; an excreta identification unit that identifies which of defecation, urination, and farting has been performed by inputting the acquired sound data to an identification model trained by machine learning, with sound data indicating any of a defecation sound, a urination sound, and a farting sound as the input value and which of defecation, urination, and farting has been performed as the output value; and an identification result output unit that outputs an identification result.
Information processing device, information processing method, and program
An information processing device including a control unit that performs control not to react to a user's expression, if the user's expression includes a representation of a predetermined non-response setting, until predetermined setting conditions are satisfied and to react to the user's expression if the user's expression does not include the representation of the non-response setting.
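The control rule in this abstract can be sketched as a small state machine. This follows one reading of the abstract: once an expression contains a non-response setting, the device stays silent until the setting condition is satisfied. The class name, the phrase list, the substring matching, and the use of an elapsed-time condition are all assumptions made for illustration.

```python
class ResponseController:
    """Decides whether to react to a user's expression."""

    def __init__(self, non_response_phrases, quiet_seconds):
        # Hypothetical non-response settings, matched as lowercase substrings.
        self.non_response_phrases = [p.lower() for p in non_response_phrases]
        self.quiet_seconds = quiet_seconds
        self.quiet_until = None  # time at which the setting condition is satisfied

    def should_react(self, expression, now):
        text = expression.lower()
        if any(phrase in text for phrase in self.non_response_phrases):
            # The expression carries a non-response setting: start the quiet period.
            self.quiet_until = now + self.quiet_seconds
            return False
        if self.quiet_until is not None and now < self.quiet_until:
            # Setting condition (elapsed time, by assumption) not yet satisfied.
            return False
        self.quiet_until = None
        return True


controller = ResponseController(["don't answer"], quiet_seconds=60)
first = controller.should_react("Don't answer for a while", now=0)
during = controller.should_react("What time is it?", now=30)
after = controller.should_react("What time is it?", now=61)
```

Passing `now` explicitly instead of reading a clock keeps the sketch deterministic and easy to test; a real device would more likely supply the current time itself, and its setting conditions need not be time-based.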
Information processing apparatus, information processing method, and computer program
Smooth text communication is realized between users. An information processing apparatus according to the present disclosure includes a control unit configured to: determine speech generated by a first user on the basis of sensing information of at least one sensor apparatus sensing at least one of the first user and a second user communicating with the first user on the basis of the speech generation of the first user; and control information output to the first user on the basis of a result of the determination of the speech generation of the first user.