Patent classifications
G10L25/39
METHOD AND SYSTEM FOR AUTOMATIC BACK-CHANNEL GENERATION IN INTERACTIVE AGENT SYSTEM
There are provided a method and a system for automatically generating a back-channel in an interactive agent system. According to an embodiment of the disclosure, an automatic back-channel generation method includes: predicting a back-channel by analyzing an utterance of a user inputted into a back-channel prediction model; and generating the predicted back-channel, wherein the back-channel prediction model is an AI model trained to predict the back-channel to express based on the utterance of the user. Accordingly, a back-channel is automatically generated by utilizing a back-channel prediction module based on a language model, so that a natural dialogue interaction with a user may be implemented in an interactive agent system and the quality of the dialogue service provided to the user may be enhanced.
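The predict-then-generate flow described above can be sketched as follows. This is a hypothetical illustration only: the patent's trained language-model predictor is replaced by a toy heuristic so the pipeline is runnable, and all names are illustrative rather than from the patent.

```python
# Toy stand-in for the trained back-channel prediction model.
def predict_backchannel(utterance: str) -> str:
    """Predict which back-channel to express for a user utterance."""
    text = utterance.lower().strip()
    if text.endswith("!"):
        return "really?"   # react to emphatic content
    if len(text.split()) > 8:
        return "I see"     # acknowledge a longer explanation
    return "mm-hmm"        # default continuer

def generate_backchannel(utterance: str) -> str:
    """Generate (here: simply return as text) the predicted back-channel."""
    return predict_backchannel(utterance)

print(generate_backchannel("I just won the lottery!"))  # → really?
```

In the patent's system the heuristic above would be a language model scoring candidate back-channels, and generation could be text-to-speech rather than text.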
SYSTEMS AND METHODS FOR RULE-BASED USER CONTROL OF AUDIO RENDERING
A sound processing system includes a sound input device for providing a sound input, a sound output device for providing a sound output, and processing electronics including a processor and a memory, wherein the processing electronics is configured to receive a target sound input identifying a target sound, receive a rule input establishing a sound processing rule that references the target sound, receive a sound input from the sound input device, analyze the sound input for the target sound, process the sound input according to the sound processing rule in view of the analysis of the sound input, and provide a processed sound output to the sound output device.
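A minimal sketch of the rule-matching step follows, assuming sound analysis yields labels for detected target sounds; the labels, gains, and data structures are illustrative stand-ins for the patent's processing electronics, not its actual interfaces.

```python
from dataclasses import dataclass

@dataclass
class SoundRule:
    target: str   # the target sound this rule references
    gain: float   # gain applied to the output when the target is detected

def analyze(sound_input: dict) -> set:
    """Stand-in for sound analysis: here the input already carries labels."""
    return set(sound_input["labels"])

def process(sound_input: dict, rules: list) -> float:
    """Apply the first matching rule's gain to the input level."""
    detected = analyze(sound_input)
    for rule in rules:
        if rule.target in detected:
            return sound_input["level"] * rule.gain
    return sound_input["level"]   # no rule matched: pass through unchanged

rules = [SoundRule(target="siren", gain=2.0)]  # e.g. boost sirens for safety
print(process({"labels": ["siren", "speech"], "level": 0.4}, rules))  # → 0.8
```

A real system would run a sound classifier in `analyze` and apply per-band filtering rather than a scalar gain; the point here is only the rule-referencing control flow.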
Automated speech recognition proxy system for natural language understanding
An interactive response system mixes human speech recognition (HSR) subsystems with automatic speech recognition (ASR) subsystems to improve the overall capability of voice user interfaces. The system permits imperfect ASR subsystems to nonetheless relieve the burden on HSR subsystems. An ASR proxy is used to implement an interactive voice response (IVR) system, and the proxy dynamically determines how many ASR and HSR subsystems are to perform recognition for any particular utterance, based on factors such as the confidence thresholds of the ASRs and the availability of human resources for HSRs. In some embodiments, the ASR proxy dynamically selects one or more recognizers based at least in part on the identified grammar and the time length of the utterance.
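One possible proxy selection policy can be sketched as below: use the best ASR hypothesis when its confidence clears a threshold, otherwise route the utterance to a human (HSR) if one is available. The threshold value and result shapes are assumptions for illustration, not the patent's actual interfaces.

```python
def select_recognizer(asr_results, humans_available, threshold=0.8):
    """Pick an ASR hypothesis or defer to HSR (hypothetical policy sketch)."""
    confident = [r for r in asr_results if r["confidence"] >= threshold]
    if confident:
        best = max(confident, key=lambda r: r["confidence"])
        return best["text"], "ASR"          # ASR alone suffices
    if humans_available > 0:
        return None, "HSR"                  # defer to a human transcriber
    best = max(asr_results, key=lambda r: r["confidence"])
    return best["text"], "ASR-low"          # degrade gracefully

results = [{"text": "pay my bill", "confidence": 0.91},
           {"text": "play my bill", "confidence": 0.62}]
print(select_recognizer(results, humans_available=3))  # → ('pay my bill', 'ASR')
```

The patent additionally weighs the identified grammar and utterance length; those factors would enter as extra terms in the same decision function.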
PREDICTION METHOD, DEVICE AND SYSTEM FOR ROCK MASS INSTABILITY STAGES
Embodiments of the present application provide a prediction method, device and system for rock mass instability stages, belonging to the technical field of rock mass instability prediction. The method includes the steps of: acquiring acoustic emission signals of a rock mass; extracting feature parameters from the acquired acoustic emission signals; and predicting instability stages of the rock mass in accordance with the feature parameters and a preset back propagation (BP) neural network model, wherein the preset BP neural network model is obtained by training a BP neural network, combined with a genetic algorithm, on the feature parameters of the acoustic emission signals at different rock mass instability stages. The technical solution of the present application effectively mitigates the problem that, during training of a BP neural network model, model parameter optimization is easily trapped in a locally optimal solution.
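A toy illustration of why a genetic algorithm (GA) helps: evolve the weights of a single linear unit on synthetic "acoustic-emission feature" data by population search rather than gradient descent alone, which can start in a poor basin. The data, network size, and GA settings below are all illustrative assumptions, not the patent's configuration.

```python
import random
random.seed(0)

# (features, label) pairs generated from y = x0 + x1, so the optimum is w = [1, 1].
DATA = [([1.0, 0.4], 1.4), ([0.5, 0.8], 1.3), ([0.1, 0.9], 1.0)]

def mse(w):
    """Mean squared error of a linear unit with weights w on DATA."""
    return sum((w[0] * x0 + w[1] * x1 - y) ** 2
               for (x0, x1), y in DATA) / len(DATA)

# GA: random initial population, elitist selection, Gaussian mutation.
pop = [[random.uniform(-2, 2), random.uniform(-2, 2)] for _ in range(20)]
for _ in range(100):
    pop.sort(key=mse)                     # rank by fitness (lower error is better)
    parents = pop[:5]                     # keep the 5 best individuals
    pop = parents + [[g + random.gauss(0, 0.1) for g in p]
                     for p in random.choices(parents, k=15)]

best = min(pop, key=mse)
print(round(mse(best), 4))   # close to 0; BP would then fine-tune from here
```

In the patent's scheme the GA plays this global-search role for the BP network's parameters, with backpropagation refining the GA's result.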
Voiceprint recognition method, device, storage medium and background server
The present invention provides a voiceprint recognition method, a device, a storage medium and a background server. The voiceprint recognition method comprises: collecting, by a client, a test voice of a user, and sending a voice recognition request to the background server, wherein the voice recognition request comprises the user ID and the test voice; receiving, by the background server, the voice recognition request, and determining the voice recognition request to be processed with a message queue and an asynchronous mechanism; acquiring, by the background server, a target voiceprint feature which corresponds to the user ID of the voice recognition request to be processed, and acquiring a test voiceprint feature which corresponds to the test voice of the voice recognition request to be processed; judging, by the background server according to the target voiceprint feature and the test voiceprint feature, whether the target voiceprint feature and the test voiceprint feature correspond to the same user, and outputting the result of the judging to the client; and receiving and displaying, by the client, the result of the judging.
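The server-side comparison step can be sketched as below, assuming voiceprint features are fixed-length vectors compared by cosine similarity behind a message queue; the feature values, threshold, and queue usage are illustrative assumptions, not the patent's actual pipeline.

```python
import math
import queue

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

ENROLLED = {"user42": [0.9, 0.1, 0.3]}     # target voiceprint per user ID (toy)

requests = queue.Queue()                   # message queue decouples clients
requests.put({"user_id": "user42", "test_feature": [0.88, 0.12, 0.31]})

def handle(request, threshold=0.95):
    """Judge whether the target and test features belong to the same user."""
    target = ENROLLED[request["user_id"]]
    same = cosine(target, request["test_feature"]) >= threshold
    return {"user_id": request["user_id"], "same_user": same}

print(handle(requests.get()))  # → {'user_id': 'user42', 'same_user': True}
```

In the patent's design the queue would be consumed asynchronously by worker processes, which is what yields the claimed throughput gain; the scoring itself is unchanged.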
Technologies for end-of-sentence detection using syntactic coherence
Technologies for detecting an end of a sentence in automatic speech recognition are disclosed. An automatic speech recognition device may acquire speech data, and identify phonemes and words of the speech data. The automatic speech recognition device may perform a syntactic parse based on the recognized words, and determine an end of a sentence based on the syntactic parse. For example, if the syntactic parse indicates that a certain set of consecutive recognized words form a syntactically complete and correct sentence, the automatic speech recognition device may determine that there is an end of a sentence at the end of that set of words.
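The boundary-detection loop described above can be sketched with a toy stand-in for the syntactic check: scan the recognized word stream and declare a sentence boundary once the running prefix forms a (very) simplified complete clause. The word lists and completeness test are illustrative assumptions, not the patent's actual parser.

```python
SUBJECTS = {"i", "we", "they", "he", "she"}
VERBS = {"left", "agree", "ran", "went"}
DANGLING = {"and", "but", "the", "to"}     # words a sentence cannot end on

def is_complete(words):
    """Toy syntactic-completeness test: subject + verb, no dangling word."""
    return (any(w in SUBJECTS for w in words)
            and any(w in VERBS for w in words)
            and words[-1] not in DANGLING)

def find_sentence_end(words):
    """Return the index just past the first syntactically complete prefix."""
    for i in range(1, len(words) + 1):
        if is_complete(words[:i]):
            return i
    return None

stream = ["we", "left", "and", "they", "agree"]
print(find_sentence_end(stream))  # → 2, i.e. a boundary after "we left"
```

A real implementation would replace `is_complete` with a full syntactic parse over the recognized words, but the incremental prefix scan is the same.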