Patent classifications
G10L15/19
EXTRACTING NEXT STEP SENTENCES FROM A COMMUNICATION SESSION
Methods and systems provide for extracting next step sentences from a communication session. In one embodiment, the system connects to a communication session involving one or more participants; receives or generates a transcript of a conversation; extracts, from the transcript, a number of utterances including one or more sentences spoken by the participants; identifies a subset of the number of utterances spoken by a subset of the participants associated with a prespecified organization; extracts one or more next step sentences within the subset of the utterances, where the next step sentences each include an owner-action pair structure in which the action is an actionable verb in future tense or present tense; determines a set of analytics data corresponding to the next step sentences and the associated participants; and presents, to one or more users, at least a subset of the analytics data corresponding to the next step sentences.
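The owner–action pair structure described above can be sketched with a simple pattern-matching rule. This is an illustrative toy, not the patent's actual extraction model: the pronoun list, auxiliary-verb patterns, and sample utterances are all invented for the example.

```python
import re

# Toy rule: a "next step" sentence pairs an owner (a pronoun, here) with
# an actionable verb in future or present tense, e.g. "I will send the
# contract" or "We are scheduling a follow-up".
NEXT_STEP_PATTERN = re.compile(
    r"\b(?P<owner>I|We|You|She|He|They)\b\s+"
    r"(?P<aux>will|am going to|are going to|am|are|is)\s+"
    r"(?P<action>\w+)",
    re.IGNORECASE,
)

def extract_next_steps(utterances):
    """Return (owner, action) pairs found in a list of utterance strings."""
    pairs = []
    for sentence in utterances:
        match = NEXT_STEP_PATTERN.search(sentence)
        if match:
            pairs.append((match.group("owner"), match.group("action")))
    return pairs

utterances = [
    "Thanks for joining today.",
    "I will send over the revised proposal tomorrow.",
    "We are scheduling a demo for next week.",
]
print(extract_next_steps(utterances))
# [('I', 'send'), ('We', 'scheduling')]
```

A production system would likely use a trained tagger rather than regexes, but the owner–action pairing itself is the same shape as above.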
INFORMATION PROCESSING APPARATUS, METHOD AND COMPUTER READABLE MEDIUM
According to one embodiment, an information processing apparatus includes a processor. The processor generates a template for a recording data sheet including a plurality of items, covering one or more of the items that can be specified, with reference to an input order of input target items selected from the items. The processor performs speech recognition on an utterance of a user and generates a speech recognition result. The processor determines an input target range relating to one or more items specified by the utterance of the user among the items, based on the template and the speech recognition result.
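The input-target-range determination can be sketched as follows, under the simplifying assumption that the "template" is just the ordered list of input target items on the recording data sheet, and that a recognized utterance names the endpoints of the range. The item names are invented examples.

```python
# Ordered input target items on a hypothetical recording data sheet.
TEMPLATE = ["date", "blood pressure", "pulse", "temperature", "notes"]

def input_target_range(recognized_text, template=TEMPLATE):
    """Return the contiguous slice of template items spanned by the utterance."""
    hits = [i for i, item in enumerate(template) if item in recognized_text.lower()]
    if not hits:
        return []
    return template[min(hits):max(hits) + 1]

print(input_target_range("record from blood pressure to temperature"))
# ['blood pressure', 'pulse', 'temperature']
```

Note how the template's item order lets an utterance naming only two items implicitly select everything between them.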
Language models using domain-specific model components
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for language models using domain-specific model components. In some implementations, context data for an utterance is obtained. A domain-specific model component is selected from among multiple domain-specific model components of a language model based on the non-linguistic context of the utterance. A score for a candidate transcription for the utterance is generated using the selected domain-specific model component and a baseline model component of the language model that is domain-independent. A transcription for the utterance is determined using the score, and the transcription is provided as output of an automated speech recognition system.
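The scoring step can be illustrated with a toy interpolation of the two components. The probabilities, the interpolation weight, and the use of an app identifier as non-linguistic context are all assumptions for the sketch, not values from the patent.

```python
import math

# Toy baseline (domain-independent) and domain-specific probabilities for
# two acoustically similar candidate transcriptions.
BASELINE = {"play music": 0.02, "pay music": 0.001}
DOMAIN = {
    "media_app": {"play music": 0.30, "pay music": 0.0005},
    "banking_app": {"play music": 0.0005, "pay music": 0.05},
}

def score(candidate, context, weight=0.5):
    """Interpolate baseline and context-selected domain probabilities."""
    component = DOMAIN[context]
    p = ((1 - weight) * BASELINE.get(candidate, 1e-6)
         + weight * component.get(candidate, 1e-6))
    return math.log(p)

best = max(BASELINE, key=lambda c: score(c, "media_app"))
print(best)  # play music
```

The point of the structure is visible even in the toy: the same acoustically ambiguous pair resolves differently depending on which domain component the context selects.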
Techniques for dialog processing using contextual data
Techniques are described for using data stored for a user in association with context levels to improve the efficiency and accuracy of dialog processing tasks. A dialog system stores historical dialog data in association with a plurality of configured context levels. The dialog system receives an utterance and identifies a term for disambiguation from the utterance. Based on a determined context level, the dialog system identifies relevant historical data stored to a database. The historical data may be used to perform tasks such as resolving an ambiguity based on user preferences, disambiguating named entities based on a prior dialog, and identifying previously generated answers to queries. Based on the context level, the dialog system can efficiently identify the relevant information and use the identified information to provide a response.
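The context-level lookup can be sketched as a tiered store consulted from the narrowest level outward. The level names and stored data are invented for illustration; the patent's configured context levels may differ.

```python
# Historical dialog data keyed by hypothetical context levels, from the
# current session out to long-term user preferences.
HISTORY = {
    "session": {"it": "the thermostat"},
    "recent": {"my usual": "large oat-milk latte"},
    "long_term": {"home": "123 Main St"},
}
CONTEXT_LEVELS = ["session", "recent", "long_term"]

def resolve(term):
    """Resolve an ambiguous term against history, narrowest level first."""
    for level in CONTEXT_LEVELS:
        if term in HISTORY[level]:
            return HISTORY[level][term], level
    return None, None

print(resolve("my usual"))
# ('large oat-milk latte', 'recent')
```

Scoping the lookup to a determined context level, as the abstract describes, avoids scanning the entire dialog history for every disambiguation.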
Method and apparatus for automatic categorization of calls in a call center environment
A system for categorizing a call between an agent and a caller comprises at least one processor and a memory communicably coupled to the at least one processor. The memory comprises computer executable instructions which, when executed by the at least one processor, implement a method as follows. A call document comprising text of the call between the agent and the caller is received by the system. The system categorizes the call into at least one class using regressive probability analysis of the call document. The system splits the call document into at least two portions, the at least two portions comprising a call header and a call body, and thereafter, using rule-based entity extraction, the system extracts a mandatory entity from the call header and an optional entity from the call body.
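The split-and-extract step can be sketched as below. The choice of a blank line as the header/body boundary, and the specific entities (an agent ID as the mandatory entity, an order number as the optional one), are assumptions for the example, not the patent's rules.

```python
import re

def split_and_extract(call_document):
    """Split a call document at the first blank line and extract entities by rule."""
    header, _, body = call_document.partition("\n\n")
    agent = re.search(r"Agent:\s*(\w+)", header)
    if agent is None:
        raise ValueError("mandatory entity missing from call header")
    order = re.search(r"order\s+#?(\d+)", body, re.IGNORECASE)
    return {
        "agent_id": agent.group(1),                         # mandatory
        "order_number": order.group(1) if order else None,  # optional
    }

doc = "Agent: A123\nQueue: billing\n\nCaller: I'm calling about order #55821."
print(split_and_extract(doc))
# {'agent_id': 'A123', 'order_number': '55821'}
```

Raising on a missing agent ID while tolerating a missing order number mirrors the mandatory/optional distinction the abstract draws between header and body entities.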
Triggering voice control disambiguation
In various embodiments, a voice command is associated with a plurality of processing steps to be performed. The plurality of processing steps may include analysis of audio data using automatic speech recognition, generating and selecting a search query from the utterance text, and conducting a search of a database of items using the search query. The plurality of processing steps may include additional or different steps, depending on the type of the request. In performing one or more of these processing steps, an error or ambiguity may be detected. An error or ambiguity may either halt the processing step or create more than one path of actions. A model may be used to determine if and how to request additional user input to attempt to resolve the error or ambiguity. The voice-enabled device or a second client device is then caused to output a request for the additional user input.
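The "if and how to prompt" decision can be sketched with a confidence-margin rule standing in for the model the abstract mentions. The thresholds, action names, and candidate format are invented for the sketch.

```python
def disambiguation_action(candidates):
    """Decide how to proceed given (action, confidence) pairs from a processing step."""
    if not candidates:
        # The step failed outright: ask the user to rephrase.
        return ("ask_rephrase", None)
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    top_action, top_conf = ranked[0]
    margin = top_conf - (ranked[1][1] if len(ranked) > 1 else 0.0)
    if top_conf >= 0.9 or margin >= 0.4:
        # Clear winner: no additional user input needed.
        return ("proceed", top_action)
    # Ambiguous: prompt the user to choose between the top candidates.
    return ("ask_choice", [a for a, _ in ranked[:2]])

print(disambiguation_action([("play_song", 0.55), ("play_album", 0.50)]))
# ('ask_choice', ['play_song', 'play_album'])
```

This captures the abstract's two ambiguity outcomes: a halted step maps to a rephrase request, while multiple action paths map to a choice prompt, and only genuinely ambiguous cases trigger additional user input.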