G10L15/183

Intelligent Voice Interface for Handling Out-of-Context Dialog

In a method for handling out-of-sequence caller dialog, an intelligent voice interface is configured to lead callers through pathways of an algorithmic dialog that includes available voice prompts for requesting different types of caller information. The method may include, during a voice communication with a caller via a caller device, receiving from the caller device caller input data indicative of a voice input of the caller, without having first provided to the caller device any voice prompt that requests a first type of caller information, and determining, by processing the caller input data, that the voice input includes caller information of the first type. The method also includes, after determining that the voice input includes the caller information of the first type, bypassing one or more voice prompts, of the available voice prompts, that request the first type of caller information.
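The prompt-bypassing behavior described above can be sketched as a dialog manager that detects which information types a caller has already volunteered and skips the prompts that would request them. The prompt set, regex detectors, and function names below are illustrative assumptions, not part of the disclosure:

```python
import re

# Hypothetical prompt sequence: each prompt requests one type of caller information.
PROMPTS = {
    "account_number": "Please say your account number.",
    "date_of_birth": "Please say your date of birth.",
    "reason": "How can we help you today?",
}

# Toy pattern-based detectors standing in for speech-understanding models.
DETECTORS = {
    "account_number": re.compile(r"\b\d{8}\b"),
    "date_of_birth": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
}

def next_prompts(caller_utterance):
    """Return the prompts still needed, bypassing any prompt whose
    information type the caller already supplied out of sequence."""
    supplied = {
        info_type
        for info_type, pattern in DETECTORS.items()
        if pattern.search(caller_utterance)
    }
    return [text for info_type, text in PROMPTS.items()
            if info_type not in supplied]
```

A caller who opens with both an account number and a date of birth would then hear only the remaining prompt, rather than being re-asked for information already given.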

Generating topic-specific language models

Speech recognition may be improved by generating and using a topic-specific language model. A topic-specific language model may be created by performing an initial pass on an audio signal using a generic or basis language model. A speech recognition device may then determine topics relating to the audio signal based on the words identified in the initial pass and retrieve a corpus of text relating to those topics. Using the retrieved corpus of text, the speech recognition device may create a topic-specific language model. In one example, the speech recognition device may adapt or otherwise modify the generic language model based on the retrieved corpus of text.
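The adaptation step can be illustrated with a minimal sketch: a unigram model built from a topic-related corpus is interpolated with the generic model. Real systems would use n-gram or neural language models; the unigram interpolation and the `weight` parameter here are simplifying assumptions:

```python
from collections import Counter

def unigram_model(corpus_words):
    """Build a unigram probability distribution from a list of words."""
    counts = Counter(corpus_words)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def adapt_model(generic, topic_corpus_words, weight=0.5):
    """Interpolate a generic unigram model with one built from
    topic-related text: a simple stand-in for LM adaptation."""
    topic = unigram_model(topic_corpus_words)
    vocab = set(generic) | set(topic)
    return {w: (1 - weight) * generic.get(w, 0.0) + weight * topic.get(w, 0.0)
            for w in vocab}
```

After adaptation, words from the detected topic receive higher probability than they would under the generic model alone, which is what improves the second recognition pass.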

SYSTEM AND METHOD FOR GENERATING WRAP UP INFORMATION

A system for generating wrap-up information is capable of learning, using natural language processing, how interactions are transformed into contact notes and outcome codes, and can generate contact notes and outcome codes for new incoming interactions by applying prediction models trained on interaction data, contact notes, and outcome codes. The system receives interaction data, including interaction audio data, interaction transcripts, associated contact notes, and associated outcome codes. The interaction transcripts are generated from previous interactions between agents and customers. The contact notes and outcome codes are generated by agents during the associated previous interactions. The system processes and uses the interaction data to train prediction models to analyze interaction audio data and interaction transcripts and predict appropriate contact notes and outcome codes for an interaction. Once trained, the prediction model(s) can generate appropriate contact notes and outcome codes for new interactions.
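The train-then-predict flow can be sketched with a toy word-overlap classifier standing in for the trained prediction models; the function names and scoring rule are illustrative assumptions, not the disclosed models:

```python
from collections import Counter, defaultdict

def train(transcripts_with_codes):
    """Learn word frequencies per outcome code from historical
    agent-labeled interactions (a toy stand-in for model training)."""
    profiles = defaultdict(Counter)
    for transcript, code in transcripts_with_codes:
        profiles[code].update(transcript.lower().split())
    return profiles

def predict_outcome(profiles, transcript):
    """Score each outcome code by word overlap with a new transcript
    and return the best-scoring code."""
    words = transcript.lower().split()
    def score(code):
        return sum(profiles[code][w] for w in words)
    return max(profiles, key=score)
```

In the described system the same labeled history would also drive generation of free-text contact notes, not just a code label.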

SYSTEM AND METHOD FOR SIMULTANEOUSLY IDENTIFYING INTENT AND SLOTS IN VOICE ASSISTANT COMMANDS

In an embodiment of the present disclosure, a method of simultaneously identifying intent and slots in a voice assistant command includes tokenizing, into a plurality of tokens, a current utterance of a user of a device comprising the voice assistant command; prepending the plurality of tokens with a previous utterance and a separation token; obtaining, using a transformer-based machine learning model, one or more predictions for the voice assistant command from the prepended plurality of tokens, the one or more predictions including an intent prediction and at least one of a flag prediction, a goal prediction, and a sub-goal prediction; aligning, according to one or more constraints, the at least one of the flag prediction, the goal prediction, and the sub-goal prediction; providing, to the device, the identified intent and the identified slots based on the intent prediction and the aligned at least one of the flag prediction, the goal prediction, and the sub-goal prediction; and causing the device to perform the voice assistant command according to the identified intent and the identified slots.
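Two of the steps above lend themselves to a small sketch: constructing the model input by prepending the previous utterance plus a separation token, and aligning slot predictions under a constraint. The `[SEP]` token and the BIO-repair rule below are illustrative assumptions (the transformer model itself is omitted):

```python
def build_input(previous_utterance, current_utterance, sep="[SEP]"):
    """Prepend the previous utterance and a separation token to the
    tokenized current utterance, as the model input is constructed."""
    return previous_utterance.split() + [sep] + current_utterance.split()

def align_tags(tags):
    """Enforce one illustrative alignment constraint on BIO slot tags:
    an I- tag may not start a sequence or follow a different label,
    so it is repaired to the corresponding B- tag."""
    fixed = []
    prev = "O"
    for t in tags:
        if t.startswith("I-") and not (prev != "O" and prev.endswith(t[2:])):
            t = "B-" + t[2:]
        fixed.append(t)
        prev = t
    return fixed
```

Constraining raw per-token predictions this way keeps the identified slots well-formed before they are handed back to the device.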

Adversarial learning framework for persona-based dialogue modeling

Various embodiments may be generally directed to the use of an adversarial learning framework for persona-based dialogue modeling. In some embodiments, automated multi-turn dialogue response generation may be performed using a persona-based hierarchical recurrent encoder-decoder-based generative adversarial network (phredGAN). Such a phredGAN may feature a persona-based hierarchical recurrent encoder-decoder (PHRED) generator and a conditional discriminator. In some embodiments, the conditional discriminator may include an adversarial discriminator that is provided with attribute representations as inputs. In some other embodiments, the conditional discriminator may include an attribute discriminator, and attribute representations may be handled as targets of the attribute discriminator. The embodiments are not limited in this context.
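The distinction between the two conditional-discriminator variants can be made concrete with a toy sketch: one discriminator receives the persona attribute as an input when judging realness, while the other treats the attribute as its classification target. The persona names and signature-word heuristics are purely illustrative stand-ins for learned networks:

```python
# Variant 1: adversarial discriminator with the attribute as an INPUT.
def adversarial_discriminator(response_words, attribute):
    """Scores how 'real' a response looks given a persona attribute.
    Toy rule: real iff the response uses that persona's signature word."""
    signature = {"pirate": "arrr", "butler": "certainly"}[attribute]
    return 1.0 if signature in response_words else 0.0

# Variant 2: attribute discriminator with the attribute as the TARGET.
def attribute_discriminator(response_words):
    """Predicts which persona attribute produced the response; the
    attribute is the output to classify, not an input."""
    if "arrr" in response_words:
        return "pirate"
    if "certainly" in response_words:
        return "butler"
    return "unknown"
```

In the actual framework both variants would be neural networks trained adversarially against the PHRED generator; the sketch only shows where the attribute representation enters each one.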
