G10L2015/0636

Speech correction system and speech correction method

The speech correction system includes a storage device, an audio receiver and a processing device. The processing device includes a speech recognition engine and a determination module. The storage device is configured to store a database. The audio receiver is configured to receive an audio signal. The speech recognition engine is configured to identify a key speech pattern in the audio signal and generate a candidate vocabulary list and a transcode corresponding to the key speech pattern; wherein the candidate vocabulary list includes a candidate vocabulary corresponding to the key speech pattern and a vocabulary score corresponding to the candidate vocabulary. The determination module is configured to determine whether the vocabulary score is greater than a score threshold. If the vocabulary score is greater than the score threshold, the determination module stores the candidate vocabulary corresponding to the vocabulary score in the database.
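The determination step described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Candidate` type, the `store_confident_candidates` function, and the choice of a dictionary keyed by transcode as the "database" are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    vocabulary: str  # candidate vocabulary for the key speech pattern
    score: float     # vocabulary score assigned by the recognition engine


def store_confident_candidates(transcode, candidates, score_threshold, database):
    """Store every candidate whose score is strictly greater than the threshold.

    `database` is a hypothetical dict mapping a transcode to the list of
    vocabularies accepted for it.
    """
    for candidate in candidates:
        if candidate.score > score_threshold:
            database.setdefault(transcode, []).append(candidate.vocabulary)
    return database


database = {}
candidates = [Candidate("open", 0.9), Candidate("oven", 0.4)]
store_confident_candidates("t1", candidates, 0.5, database)
```

A candidate whose score equals the threshold is not stored, matching the "greater than" wording of the abstract.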

AUTOMATED WORD CORRECTION IN SPEECH RECOGNITION SYSTEMS
20210272550 · 2021-09-02

Systems and methods for correcting recognition errors in speech recognition systems are disclosed herein. Natural conversational variations are identified to determine whether a query intends to correct a speech recognition error or whether the query is a new command. When the query intends to correct a speech recognition error, the system identifies a location of the error and performs the correction. The corrected query can be presented to the user or be acted upon as a command for the system.
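One way to picture the correct-or-new-command decision is the sketch below. It is illustrative only: the regular expression, the `apply_correction` function, the single-word replacement, and the use of string similarity to locate the error are assumptions, not the method the abstract discloses.

```python
import difflib
import re

# Hypothetical pattern for conversational corrections such as "no, I meant John".
CORRECTION_PATTERN = re.compile(r"^(?:no,?\s+)?i (?:said|meant)\s+(.+)$", re.IGNORECASE)


def apply_correction(previous_query, new_query):
    """Return the corrected previous query, or new_query if it is a new command."""
    match = CORRECTION_PATTERN.match(new_query.strip())
    if match is None:
        return new_query  # no correction phrasing: treat as a new command
    replacement = match.group(1).strip()
    words = previous_query.split()
    lowered = [word.lower() for word in words]
    # Locate the error: the previous word most similar to the replacement.
    close = difflib.get_close_matches(replacement.lower(), lowered, n=1, cutoff=0.6)
    if not close:
        return new_query  # nothing similar enough to correct
    words[lowered.index(close[0])] = replacement
    return " ".join(words)


corrected = apply_correction("call Jon", "no, I meant John")
```

The returned string could then be shown to the user or executed as a command, as the abstract describes.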

Systems and methods for mixed setting training for slot filling machine learning tasks in a machine learning task-oriented dialogue system

Systems and methods for intelligently training a subject machine learning model include identifying new observations comprising a plurality of distinct samples unseen by a target model during a prior training; creating an incremental training corpus based on randomly sampling a collection of training data samples that includes a plurality of new observations and a plurality of historical training data samples used in the prior training of the target model; implementing a first training mode that includes an incremental training of the target model using samples from the incremental training corpus as model training input; computing performance metrics of the target model based on the incremental training; evaluating the performance metrics of the target model against training mode thresholds; and selectively choosing, based on the evaluation, one of maintaining the first training mode and automatically switching to a second training mode that includes a full retraining of the target model.
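The corpus construction and mode selection described above can be sketched as follows. The function names, the sample counts, the metric names, and the rule "every metric must meet its minimum" are assumptions for illustration; the abstract does not specify them.

```python
import random


def build_incremental_corpus(new_observations, historical, n_new, n_hist, seed=0):
    """Randomly sample new and historical examples into one shuffled corpus."""
    rng = random.Random(seed)
    corpus = rng.sample(new_observations, min(n_new, len(new_observations)))
    corpus += rng.sample(historical, min(n_hist, len(historical)))
    rng.shuffle(corpus)
    return corpus


def choose_training_mode(metrics, thresholds):
    """Keep incremental training only if every metric meets its minimum.

    A metric missing from `metrics` is treated as failing its threshold.
    """
    meets_all = all(
        metrics.get(name, float("-inf")) >= minimum
        for name, minimum in thresholds.items()
    )
    return "incremental" if meets_all else "full_retrain"


corpus = build_incremental_corpus(["n1", "n2", "n3"], ["h1", "h2", "h3", "h4"], 2, 3)
mode = choose_training_mode({"slot_f1": 0.81}, {"slot_f1": 0.85})
```

Here `mode` selects a full retraining because the hypothetical `slot_f1` metric falls below its threshold.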

Audio-based link generation

First and second speech data can be received from respective first and second devices. The first and second speech data can be determined to be from a same dialog. A link can be generated based on the dialog.

AUTOMATED WORD CORRECTION IN SPEECH RECOGNITION SYSTEMS
20230410792 · 2023-12-21

Systems and methods for correcting recognition errors in speech recognition systems are disclosed herein. Natural conversational variations are identified to determine whether a query intends to correct a speech recognition error or whether the query is a new command. When the query intends to correct a speech recognition error, the system identifies a location of the error and performs the correction. The corrected query can be presented to the user or be acted upon as a command for the system.

TEXT INDEPENDENT SPEAKER RECOGNITION

Text independent speaker recognition models can be utilized by an automated assistant to verify a particular user spoke a spoken utterance and/or to identify the user who spoke a spoken utterance. Implementations can include automatically updating a speaker embedding for a particular user based on previous utterances by the particular user. Additionally or alternatively, implementations can include verifying a particular user spoke a spoken utterance using output generated by both a text independent speaker recognition model as well as a text dependent speaker recognition model. Furthermore, implementations can additionally or alternatively include prefetching content for several users associated with a spoken utterance prior to determining which user spoke the spoken utterance.
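A rough sketch of two of the ideas above: updating a stored speaker embedding from earlier utterances, and combining text-independent (TI) and text-dependent (TD) outputs for verification. The running-mean update, the weighted-average combination, and all names and default values are assumptions, since the abstract does not say how either step is carried out.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def update_speaker_embedding(current, new, count):
    """Running mean, where `current` already averages `count` utterances."""
    return [(c * count + n) / (count + 1) for c, n in zip(current, new)]


def verify_speaker(ti_score, td_score, threshold=0.7, ti_weight=0.5):
    """Accept the speaker if the weighted TI/TD score reaches the threshold."""
    combined = ti_weight * ti_score + (1 - ti_weight) * td_score
    return combined >= threshold


embedding = update_speaker_embedding([1.0, 0.0], [0.0, 1.0], count=1)
ti_score = cosine_similarity(embedding, [0.5, 0.5])
accepted = verify_speaker(ti_score, td_score=0.6)
```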

Artificial Intelligence Based System And Method For Controlling Virtual Agent Task Flow

The present system and method may generally include organizing the task flow of a virtual agent in a way that is controlled by a set of rules and a set of conditional probability distributions. The system and method may include receiving a user utterance including a first task, identifying the first task from the user utterance, and obtaining a set of rules related to a plurality of tasks. The set of rules may determine whether pre-tasks and/or pre-conditions are to be executed before executing the first task. The set of rules may also determine whether post-tasks and/or post-conditions are to be executed after executing the first task. The system and method may include executing the task; running a probabilistic graphical model on the plurality of tasks to determine a second task based on the first task; suggesting to the user the second task; and updating the probabilistic graphical model after a threshold number of runs.
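The rule-driven ordering and the deferred model update can be sketched as below. Note the substitution: a simple conditional frequency table stands in for the probabilistic graphical model, and the `TaskFlow` class, its rule format, and the task names are hypothetical.

```python
from collections import Counter, defaultdict


class TaskFlow:
    def __init__(self, rules, update_every=3):
        self.rules = rules                    # task -> {"pre": [...], "post": [...]}
        self.update_every = update_every      # threshold number of runs
        self.pending = []                     # observations not yet folded in
        self.counts = defaultdict(Counter)    # first task -> Counter of next tasks

    def plan(self, task):
        """Order pre-tasks, the task itself, and post-tasks per the rules."""
        rule = self.rules.get(task, {})
        return list(rule.get("pre", [])) + [task] + list(rule.get("post", []))

    def suggest(self, task):
        """Return the most frequently observed follow-up task, if any."""
        following = self.counts.get(task)
        if not following:
            return None
        return following.most_common(1)[0][0]

    def record(self, first_task, next_task):
        """Buffer an observation; update the table once the threshold is reached."""
        self.pending.append((first_task, next_task))
        if len(self.pending) >= self.update_every:
            for first, nxt in self.pending:
                self.counts[first][nxt] += 1
            self.pending.clear()


flow = TaskFlow({"book_flight": {"pre": ["login"], "post": ["send_receipt"]}}, update_every=2)
steps = flow.plan("book_flight")
```

Buffering observations in `pending` mirrors the abstract's "updating ... after a threshold number of runs": suggestions do not change until enough runs have accumulated.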

System to detect and reduce understanding bias in intelligent virtual assistants
11854532 · 2023-12-26

Disclosed is a system and method for detecting and addressing bias in training data prior to building language models based on the training data. Accordingly, the system and method detect bias in training data for Intelligent Virtual Assistant (IVA) understanding and highlight any bias found. Suggestions for reducing or eliminating the bias may be provided. This detection may be done for each model within the Natural Language Understanding (NLU) component. For example, the language model, as well as any sentiment or other metadata models used by the NLU, can introduce understanding bias. For each model deployed, training data is automatically analyzed for bias and corrections are suggested.

METHOD AND DEVICE FOR PROVIDING VOICE RECOGNITION SERVICE
20200365138 · 2020-11-19

A method, performed by an electronic device, of providing a voice recognition service includes obtaining a user call keyword for activating the voice recognition service, based on a first user voice input; generating a user-customized voice database (DB) by inputting the obtained user-customized keyword to a text-to-speech module; and obtaining a user-customized feature by inputting an audio signal of the user-customized voice DB to a pre-trained wake-up recognition module.

DETECTING AND SUPPRESSING VOICE QUERIES

A computing system receives requests from client devices to process voice queries that have been detected in local environments of the client devices. The system identifies that a value that is based on a number of requests to process voice queries received by the system during a specified time interval satisfies one or more criteria. In response, the system triggers analysis of at least some of the requests received during the specified time interval to determine a set of requests that each identify a common voice query. The system can generate an electronic fingerprint that indicates a distinctive model of the common voice query. The fingerprint can then be used to detect an illegitimate voice query identified in a request from a client device at a later time.
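The volume check, grouping, and later detection described above can be sketched as follows. This is a simplification under stated assumptions: the fingerprint here is a hash of a normalized transcript rather than a model of the audio, and the function names, the volume criterion, and the `min_share` parameter are hypothetical.

```python
import hashlib
from collections import Counter


def fingerprint(query_text):
    """Hash a whitespace- and case-normalized transcript (stand-in for an audio model)."""
    normalized = " ".join(query_text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_suspicious_fingerprints(requests, volume_threshold, min_share=0.5):
    """Analyze one interval's requests only if their count exceeds the threshold.

    Returns fingerprints of queries that make up at least `min_share` of the traffic.
    """
    if len(requests) <= volume_threshold:
        return set()
    counts = Counter(fingerprint(query) for query in requests)
    return {fp for fp, n in counts.items() if n / len(requests) >= min_share}


def is_illegitimate(query_text, blocked_fingerprints):
    """Check a later request against previously generated fingerprints."""
    return fingerprint(query_text) in blocked_fingerprints


interval = ["OK device  order pizza"] * 8 + ["what time is it", "play jazz"]
blocked = find_suspicious_fingerprints(interval, volume_threshold=5)
suppressed = is_illegitimate("ok device order pizza", blocked)
```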