Patent classifications
G10L2015/0638
SPEECH SKILL JUMPING METHOD FOR MAN MACHINE DIALOGUE, ELECTRONIC DEVICE AND STORAGE MEDIUM
Disclosed is a speech skill jumping method for man-machine dialogue applied to an electronic device, comprising constructing a field migration map in advance based on a user's historical man-machine dialogue data, the field migration map being a directed map including a plurality of dialogue fields; receiving external speech; determining a dialogue field that the external speech hits; and judging whether the hit dialogue field belongs to one of the plurality of dialogue fields in the field migration map, and ignoring the external speech if not, or jumping to a speech skill corresponding to the hit dialogue field if yes. A field migration map is generated based on a user's historical man-machine dialogue data, which reflects the user's interaction habits, and whether to perform a speech skill jump is judged based on the field migration map. Obviously abnormal input content can thus be shielded, improving task completion and interaction efficiency.
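The jump decision described in this abstract can be sketched in a few lines. This is a minimal illustration, not the patented implementation: it assumes the historical data is available as lists of dialogue-field names per session, that an edge is drawn between consecutive distinct fields, and that a skill is identified by a string. The names `build_migration_map`, `handle_speech`, and the `skill:` prefix are hypothetical.

```python
from collections import defaultdict


def build_migration_map(history):
    """Build a directed map of dialogue fields from past sessions.

    `history` is assumed to be a list of sessions, each an ordered list
    of dialogue-field names. An edge src -> dst is added whenever dst
    directly follows src in a session (an assumption of this sketch).
    """
    edges = defaultdict(set)
    for session in history:
        for src, dst in zip(session, session[1:]):
            if src != dst:
                edges[src].add(dst)
        for field in session:
            # Register every field as a node, even with no outgoing edge.
            edges.setdefault(field, set())
    return dict(edges)


def handle_speech(migration_map, hit_field):
    """Return the skill to jump to, or None if the speech is ignored."""
    if hit_field in migration_map:
        return f"skill:{hit_field}"  # hypothetical skill identifier
    return None
```

With a history such as `[["navigation", "music", "weather"], ["music", "navigation"]]`, speech that hits `music` triggers a jump, while speech that hits a field the user has never used, such as `stocks`, is ignored.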
Explaining anomalous phonetic translations
A method includes: receiving, by a computing device, a digital voice stream; receiving, by the computing device, converted text that represents the digital voice stream; identifying, by the computing device, an erroneously converted portion of the converted text; selecting, by the computing device, the erroneously converted portion for explainability processing; parsing, by the computing device, the erroneously converted portion into parts based on a predetermined parsing level; collecting, by the computing device, supplementary input data related to the erroneously converted portion; and determining, by the computing device and based on the supplementary input data, a reason why the erroneously converted portion was erroneously converted.
Predicting and learning carrier phrases for speech input
Predicting and learning users' intended actions on an electronic device based on free-form speech input. Users' actions can be monitored to develop a list of carrier phrases having one or more actions that correspond to the carrier phrases. A user can speak a command into a device to initiate an action. The spoken command can be parsed and compared to a list of carrier phrases. If the spoken command matches one of the known carrier phrases, the corresponding action(s) can be presented to the user for selection. If the spoken command does not match one of the known carrier phrases, search results (e.g., Internet search results) corresponding to the spoken command can be presented to the user. The actions of the user in response to the presented action(s) and/or the search results can be monitored to update the list of carrier phrases.
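The match, fallback, and update loop in this abstract can be illustrated with a small sketch. It assumes that a carrier phrase matches when it is a whole-word prefix of the normalized command, and that actions are ranked by how often the user has chosen them; neither detail comes from the abstract. The class `CarrierPhraseBook` and its methods are hypothetical names.

```python
def normalize(text):
    """Lowercase and collapse whitespace so commands compare consistently."""
    return " ".join(text.lower().split())


class CarrierPhraseBook:
    def __init__(self):
        # carrier phrase -> {action name: times the user chose it}
        self.actions = {}

    def suggest(self, command):
        """Return ("actions", ranked list) on a match, else ("search", query)."""
        cmd = normalize(command)
        matches = [p for p in self.actions
                   if cmd == p or cmd.startswith(p + " ")]
        if not matches:
            return ("search", cmd)  # fall back to search results
        phrase = max(matches, key=len)  # prefer the most specific phrase
        counts = self.actions[phrase]
        ranked = sorted(counts, key=lambda a: (-counts[a], a))
        return ("actions", ranked)

    def learn(self, phrase, action):
        """Record that the user performed `action` after saying `phrase`."""
        counts = self.actions.setdefault(normalize(phrase), {})
        counts[action] = counts.get(action, 0) + 1
```

A caller would invoke `suggest` on each spoken command, present the result, and then call `learn` with whatever the user actually did, so that the phrase list grows over time.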
VOICE DIALOGUE PROCESSING METHOD AND APPARATUS
The present application discloses a voice dialogue processing method and apparatus. The voice dialogue processing method includes: determining voice semantics corresponding to a user voice to be processed; determining a reply sentence for the voice semantics based on a dialogue management engine, a training sample set of which is constructed from a dialogue business customization file including at least one dialogue flow, the dialogue flow including a plurality of dialogue nodes in a set order; and generating a customer service voice for replying to the user voice according to the determined reply sentence.
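One way to picture how a training sample set might be derived from an ordered dialogue flow is sketched below. The abstract does not specify the file format or the sample layout, so everything here is an assumption: each node is taken to be a dict with an `intent` and a `reply`, and each sample pairs the intents seen so far with the current intent and its reply. The function name `build_training_samples` is hypothetical.

```python
def build_training_samples(flow):
    """Turn an ordered dialogue flow into (history, intent, reply) samples.

    `flow` is assumed to be a list of nodes in their set order, each a
    dict with "intent" and "reply" keys (a hypothetical layout).
    """
    samples = []
    history = []
    for node in flow:
        samples.append({
            "history": tuple(history),  # intents of the preceding nodes
            "intent": node["intent"],
            "reply": node["reply"],
        })
        history.append(node["intent"])
    return samples
```

Keeping the preceding intents in each sample preserves the node order, which a dialogue management engine could use to distinguish the same utterance at different points in the flow.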
Utterance quality estimation
Techniques herein relate to improving quality of classification models for differentiating different user intents by improving the quality of training samples used to train the classification models. Pairs of user intents that are difficult to differentiate by classification models trained using the given training samples are identified based upon distinguishability scores (e.g., F-scores). For each of the identified pairs of intents, pairs of training samples each including a training sample associated with a first intent and a training sample associated with a second intent in the pair of intents are ranked based upon a similarity score between the two training samples in each pair of training samples. A particular pair of training samples with a highest similarity score is selected and provided as output with a suggestion for modifying the particular pair of training samples.
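The ranking step for one hard-to-distinguish intent pair can be sketched as follows. The abstract does not name a similarity measure, so this sketch substitutes Jaccard overlap of lowercase tokens as a stand-in; the step that selects intent pairs by F-score is omitted. The function names are hypothetical.

```python
from itertools import product


def jaccard(a, b):
    """Token-set Jaccard similarity (a stand-in similarity score)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def rank_sample_pairs(samples_a, samples_b):
    """Rank every cross-intent sample pair, most similar first."""
    scored = [(jaccard(a, b), a, b) for a, b in product(samples_a, samples_b)]
    scored.sort(key=lambda t: t[0], reverse=True)
    return scored
```

The first element of the ranked list is the pair that would be surfaced to the developer with a suggestion to reword one or both samples.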
INTERACTIVE SYSTEM APPLICATION FOR DIGITAL HUMANS
A digital human interactive platform can determine a contextual response to a user input and generate a digital human. The digital human can convey the contextual response to the user in real time. The digital human can be configured to convey the contextual response with a predetermined behavior corresponding to the contextual response.
DEVELOPMENT PLATFORM FOR DIGITAL HUMANS
A digital human development platform can enable a user to generate a digital human. The digital human development platform can receive user input specifying a dialogue for the digital human and one or more behaviors for the digital human, the one or more specified behaviors corresponding with one or more portions of the dialogue on a common timeline. Scene data can be generated with the digital human development platform by merging the one or more behaviors with one or more portions of the dialogue based on times of the one or more behaviors and the one or more portions of the dialogue on the common timeline.
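The timeline merge described here can be illustrated with a short sketch. It assumes dialogue portions are given as `(start, end, text)` tuples and behaviors as `(time, name)` tuples in the same time units, and that a behavior attaches to the portion whose half-open interval contains its time. These representations and the name `build_scene` are hypothetical.

```python
def build_scene(dialogue, behaviors):
    """Merge behaviors into dialogue portions on a common timeline.

    dialogue:  list of (start, end, text) tuples
    behaviors: list of (time, name) tuples
    A behavior is attached to a portion when start <= time < end.
    """
    ordered_behaviors = sorted(behaviors)
    scene = []
    for start, end, text in sorted(dialogue):
        active = [name for t, name in ordered_behaviors if start <= t < end]
        scene.append({"start": start, "end": end,
                      "text": text, "behaviors": active})
    return scene
```

Behaviors whose time falls outside every dialogue portion are dropped in this sketch; a real platform might instead render them as standalone scene entries.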
STRUCTURAL ASSEMBLY FOR DIGITAL HUMAN INTERACTIVE DISPLAY
A digital human display assembly can include a rectangular display panel enclosed within a frame, the rectangular display panel capable of visually rendering a digital human during an interactive dialogue between the digital human and a user. The digital human display assembly also can include a glass covering enclosed within the frame and extending over a front side of the rectangular display panel. A base can support the frame in an upright position. A light emitting diode (LED) arrangement can be positioned on one or more outer portions of the frame.
COMMUNICATIVE LIGHT ASSEMBLY SYSTEM FOR DIGITAL HUMANS
Communications pertaining to a digital human can include communicating via a lighting system based on determining an attribute of a user based on one or more sensor-generated signals. A communicative lighting sequence can be determined based on the user attribute. The lighting sequence can correspond to a condition of a digital human and can be configured to communicate to the user the condition of the digital human. The lighting sequence can be generated with an LED array mounted on a digital human display assembly.
METHOD AND SYSTEM FOR CLASSIFYING A USER OF AN ELECTRONIC DEVICE
A method and a system for training a machine-learning algorithm (MLA) to determine a user class of a user of an electronic device are provided. The method comprises: receiving a training audio signal representative of a training user utterance; soliciting, by a processor, a plurality of assessor-generated labels for the training audio signal, a given one of the plurality of assessor-generated labels being indicative of whether the training user is perceived to be one of a first class and a second class; generating an amalgamated assessor-generated label for the training audio signal, the amalgamated assessor-generated label being indicative of a label distribution of the plurality of assessor-generated labels between the first class and the second class; and generating a training set of data including the training audio signal and the amalgamated assessor-generated label to train the MLA to determine the user class of the user producing an in-use user utterance.
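The amalgamated label can be read as a soft label: the fraction of assessors who chose each class. A minimal sketch under that reading follows; the class names `"first"` and `"second"` are placeholders, and the function name `amalgamate_labels` is hypothetical.

```python
from collections import Counter


def amalgamate_labels(labels, classes=("first", "second")):
    """Return the distribution of assessor labels over the two classes."""
    counts = Counter(labels)
    total = sum(counts[c] for c in classes)
    if total == 0:
        raise ValueError("no assessor labels for the given classes")
    return {c: counts[c] / total for c in classes}
```

For example, three assessors choosing the first class and one choosing the second yields a distribution of 0.75 and 0.25, which can be stored alongside the training audio signal as its training target.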