Patent classifications
G10L2015/0638
APPARATUS AND METHOD FOR COMPOSITIONAL SPOKEN LANGUAGE UNDERSTANDING
A method includes identifying multiple tokens contained in an input utterance. The method also includes generating slot labels for at least some of the tokens contained in the input utterance using a trained machine learning model. The method further includes determining at least one action to be performed in response to the input utterance based on at least one of the slot labels. The trained machine learning model is trained to use attention distributions generated such that (i) the attention distributions associated with tokens having dissimilar slot labels are forced to be different and (ii) the attention distribution associated with each token is forced to not focus primarily on that token itself.
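The two attention constraints described above can be sketched as a training-time penalty term. This is an illustrative reconstruction, not the patent's actual loss; all names are hypothetical, and attention rows are plain Python lists that each sum to 1:

```python
import math

def attention_regularizer(attn, labels):
    """Penalty implementing the two constraints on attention distributions.

    attn[i][j] = attention weight of token i over token j.
    labels[i]  = slot label for token i.
    The penalty grows when (i) tokens with dissimilar slot labels have
    similar attention rows, or (ii) a token attends primarily to itself.
    """
    n = len(attn)
    penalty = 0.0
    # (ii) discourage each token from focusing on itself
    for i in range(n):
        penalty += attn[i][i]
    # (i) push rows with dissimilar labels apart by penalizing cosine similarity
    for i in range(n):
        for j in range(i + 1, n):
            if labels[i] != labels[j]:
                dot = sum(a * b for a, b in zip(attn[i], attn[j]))
                norm_i = math.sqrt(sum(a * a for a in attn[i]))
                norm_j = math.sqrt(sum(b * b for b in attn[j]))
                penalty += dot / (norm_i * norm_j)
    return penalty
```

During training, a term like this would be added to the slot-labeling loss so that gradient descent shapes the attention maps as claimed.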
Natural language input routing
Techniques for performing runtime ranking of skill components are described. A skill developer may generate a rule indicating a skill component is to be invoked at runtime when a natural language input corresponds to a specific context. At runtime, a virtual assistant system may implement a machine learned model to generate an initial ranking of skill components. Thereafter, the virtual assistant system may use skill component-specific rules to adjust the initial ranking, and this second ranking is used to determine which skill component to invoke to respond to the natural language input. Over time, if a rule results in beneficial user experiences, the virtual assistant system may incorporate the rule into the machine learned model.
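The two-stage ranking described above might look like the following sketch. The rule shape (a context predicate, a target skill, and a score boost) is an assumption for illustration, not the patent's actual rule format:

```python
def rerank_skills(initial_ranking, context, rules):
    """Adjust an ML ranker's scores with developer-supplied rules.

    initial_ranking: list of (skill_name, score) pairs from the ML ranker.
    context: dict describing the runtime context of the natural language input.
    rules: list of (predicate, skill_name, boost) triples; when predicate(context)
           holds, the named skill's score is boosted.
    Returns skill names sorted by the adjusted (second) ranking.
    """
    scores = dict(initial_ranking)
    for predicate, skill, boost in rules:
        if skill in scores and predicate(context):
            scores[skill] += boost
    return sorted(scores, key=scores.get, reverse=True)
```

The top element of the returned list would be the skill component invoked to respond to the input.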
INTELLIGENT VOICE INTERACTION METHOD AND APPARATUS, DEVICE AND COMPUTER STORAGE MEDIUM
The present disclosure discloses an intelligent voice interaction method and apparatus, a device and a computer storage medium, and relates to voice, big data and deep learning technologies in the field of artificial intelligence. A specific implementation involves: acquiring first conversational voice entered by a user; and inputting the first conversational voice into a voice interaction model to acquire second conversational voice generated by the model for the first conversational voice, for return to the user. The voice interaction model includes: a voice encoding submodel configured to encode the first conversational voice and historical conversational voice of the current session to obtain a voice state embedding; a state memory network configured to obtain an embedding of at least one preset attribute by using the voice state embedding; and a voice generation submodel configured to generate the second conversational voice by using the voice state embedding and the embedding of the at least one preset attribute. The at least one preset attribute is preset according to information of a verified object. The present disclosure thereby realizes intelligent data verification.
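The three-submodel composition can be sketched as a simple dataflow, with the encoder, state memory network, and generator passed in as callables. All names and value shapes here are illustrative stand-ins for the trained submodels:

```python
def respond(first_voice, history, encoder, memory, generator):
    """Compose the three submodels of the voice interaction model.

    encoder(first_voice, history) -> voice state embedding
    memory(state)                 -> embeddings of the preset attributes
    generator(state, attrs)       -> second conversational voice for the user
    """
    state = encoder(first_voice, history)   # voice state embedding
    attrs = memory(state)                   # preset-attribute embeddings
    return generator(state, attrs)          # second conversational voice
```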
SYSTEMS AND METHODS FOR FEW-SHOT INTENT CLASSIFIER MODELS
Some embodiments of the current disclosure disclose methods and systems for training a natural language processing intent classification model to perform few-shot classification tasks. In some embodiments, a pair of an utterance and a first semantic label labeling the utterance may be generated, and a neural network that is configured to perform natural language inference tasks may be utilized to determine the existence of an entailment relationship between the utterance and the semantic label. The semantic label may be predicted as the intent class of the utterance based on the entailment relationship, and the pair may be used to train the natural language processing intent classification model to perform few-shot classification tasks.
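The entailment-based prediction step can be sketched as choosing the candidate label that the NLI network scores highest as a hypothesis against the utterance. `entailment_score` is a hypothetical stand-in for the trained NLI model:

```python
def predict_intent(utterance, candidate_labels, entailment_score):
    """Predict an intent class via natural language inference.

    Each candidate semantic label is treated as a hypothesis; the utterance
    is the premise. entailment_score(premise, hypothesis) returns a value
    in [0, 1]; the label with the strongest entailment wins.
    """
    return max(candidate_labels, key=lambda label: entailment_score(utterance, label))
```

The winning (utterance, label) pair could then serve as a training example for the few-shot intent classifier, as the abstract describes.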
Real time key conversational metrics prediction and notability
A system and a method are disclosed for alerting a manager device to an occurrence of an event at an agent device during a conversation between the agent device and an external party. In an embodiment, a processor receives transcript data during a conversation between the agent device and the external party. The processor normalizes the transcript data and inputs the normalized transcript data into a machine learning model, the machine learning model trained to identify an inflection point in the conversation. The processor receives, as output from the machine learning model, a measure of notability of the normalized transcript data. The processor determines whether the measure of notability corresponds to an inflection point, and, responsive to determining that the measure of notability corresponds to an inflection point, alerts the manager device.
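The normalize-score-alert loop can be sketched as follows, with the normalizer, the trained model, and the alert channel passed in as callables. Treating "corresponds to an inflection point" as a threshold test on the notability score is an assumption for illustration:

```python
def process_transcript(transcript, normalize, model, threshold, alert):
    """Score a transcript chunk and alert the manager device if notable.

    normalize(transcript)  -> normalized transcript text
    model(text)            -> notability measure in [0, 1]
    alert(text, score)     -> side effect notifying the manager device
    Returns True when an alert was raised.
    """
    text = normalize(transcript)
    notability = model(text)
    if notability >= threshold:          # assumed inflection-point criterion
        alert(text, notability)
        return True
    return False
```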
Enhancing signature word detection in voice assistants
Systems and methods for detecting a spoken sentence in a speech recognition system are disclosed herein. Speech data is buffered based on an audio signal captured at a computing device operating in an active mode. The speech data is buffered irrespective of whether the speech data comprises a signature word. The buffered speech data is processed to detect the presence of a sentence comprising at least one command and a query for the computing device. Processing the buffered speech data includes detecting the signature word in the buffered speech data, and in response to detecting the signature word in the speech data, initiating detection of the sentence in the buffered speech data.
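The always-on buffering can be sketched as a bounded ring buffer that retains recent speech regardless of the wake word; once the signature word appears, the sentence around it is recovered from data that was already buffered. Token-level text stands in for audio here, and the class shape is illustrative:

```python
from collections import deque

class SpeechBuffer:
    """Ring buffer of recent speech tokens, filled irrespective of the
    signature word, so speech preceding the wake word is not lost."""

    def __init__(self, capacity, signature):
        self.buffer = deque(maxlen=capacity)  # oldest tokens drop off automatically
        self.signature = signature

    def push(self, token):
        self.buffer.append(token)

    def detect_sentence(self):
        """Return the buffered sentence once the signature word is present,
        or None when detection should not yet be initiated."""
        tokens = list(self.buffer)
        if self.signature in tokens:
            return tokens  # command and query surrounding the signature word
        return None
```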
Onboard device, traveling state estimation method, server device, information processing method, and traveling state estimation system
An onboard device estimates a traveling state of a vehicle that may be influenced by the psychological state of a driver, based on an utterance of the driver and without the use of various sensors. The device includes: a voice collection unit for collecting the driver's voice; a traveling state collection unit for collecting traveling state information representing a traveling state of the vehicle; a database generation unit for generating a database by associating voice information corresponding to the collected voice with the collected traveling state information; a learning unit for training an estimation model using, as learning data, pairs of the voice information and the traveling state information recorded in the generated database; and an estimation unit for estimating, based on an utterance of the driver, the traveling state of the vehicle that may be influenced by the driver's psychological state, by using the estimation model.
System and method for automating natural language understanding (NLU) in skill development
A method includes receiving, from an electronic device, information defining a user utterance associated with a skill to be performed, where the skill is not recognized by a natural language understanding (NLU) engine. The method also includes receiving, from the electronic device, information defining one or more actions for performing the skill. The method further includes identifying, using at least one processor, one or more known skills having one or more slots that map to at least one word or phrase in the user utterance. The method also includes creating, using the at least one processor, a plurality of additional utterances based on the one or more mapped slots. In addition, the method includes training, using the at least one processor, the NLU engine using the plurality of additional utterances.
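The utterance-expansion step can be sketched as substituting alternative fillers for words that mapped to slots of known skills. The `slot_map` shape (a surface word mapped to alternative slot values drawn from known skills) is an assumption for illustration:

```python
def create_additional_utterances(utterance, slot_map):
    """Generate extra training utterances by slot substitution.

    utterance: the user utterance for the unrecognized skill.
    slot_map: {surface_word: [alternative slot fillers]} built by mapping
              words in the utterance to slots of known skills.
    Returns the additional utterances used to train the NLU engine.
    """
    additional = []
    for word, alternatives in slot_map.items():
        if word not in utterance:
            continue
        for alt in alternatives:
            additional.append(utterance.replace(word, alt))
    return additional
```

Each generated variant keeps the original sentence frame, so the NLU engine learns the new skill's pattern rather than a single fixed phrase.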
Apparatus and method for providing voice assistant service
Provided are an electronic device and method for providing a voice assistant service. The method, performed by the electronic device, of providing the voice assistant service includes: obtaining a voice of a user; obtaining voice analysis information of the voice of the user by inputting the voice of the user to a natural language understanding model; determining whether a response operation with respect to the voice of the user is performable, according to a preset criterion, based on the obtained voice analysis information; and based on the determining that the response operation is not performable, outputting a series of guide messages for learning the response operation related to the voice of the user.
Display apparatus and method for registration of user command
An apparatus including a user input receiver; a user voice input receiver; a display; and a processor. The processor is configured to: (a) based on a user input being received through the user input receiver, perform a function corresponding to a voice input state for receiving a user voice input; (b) receive a user voice input through the user voice input receiver; (c) identify whether or not a text corresponding to the received user voice input is related to a pre-registered voice command or a prohibited expression; and (d) based on the text being related to the pre-registered voice command or the prohibited expression, control the display to display an indicator that the text is related to the pre-registered voice command or the prohibited expression. A method and non-transitory computer-readable medium are also provided.
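The text check in step (c) can be sketched as a lookup against the registered commands and the prohibited expressions. The matching rules here (substring match for prohibited expressions, exact case-insensitive match for registered commands) are assumptions for illustration:

```python
def classify_voice_text(text, registered_commands, prohibited_expressions):
    """Classify recognized text so the display can show the right indicator.

    Returns "prohibited" if the text contains a prohibited expression,
    "registered" if it matches a pre-registered voice command,
    and None otherwise (i.e., the command may be registered as new).
    """
    normalized = text.strip().lower()
    if any(expr.lower() in normalized for expr in prohibited_expressions):
        return "prohibited"
    if normalized in {cmd.lower() for cmd in registered_commands}:
        return "registered"
    return None
```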