G10L15/01

Virtual assistant architecture for natural language understanding in a customer service system

A virtual assistant system for communicating with customers uses human intelligence to correct any errors in the system AI, while collecting data for machine learning and future improvements for more automation. The system may use a modular design, with separate components for carrying out different system functions and sub-functions, and with frameworks for selecting the component best able to respond to a given customer conversation.

Bias detection in speech recognition models
11626112 · 2023-04-11

Systems and methods for detecting demographic bias in automatic speech recognition (ASR) systems. Corpora of transcriptions from different demographic groups are analyzed, where one of the groups is known to be susceptible to bias and another group is known not to be susceptible to bias. ASR accuracy is measured for each group, and the results are compared using both statistics-based and practicality-based methodologies to determine whether a given ASR system or model exhibits a meaningful level of bias.
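The comparison described in this abstract might be sketched as follows. This is a minimal illustration, not the patented method: it assumes per-utterance word error rate (WER) as the accuracy measure, a permutation test as the statistics-based check, and a minimum WER gap as the practicality-based check. The function names and the `alpha` and `min_gap` defaults are hypothetical.

```python
import random
from statistics import mean


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ref_word != hyp_word)))  # substitution
        prev = cur
    return prev[-1] / max(len(ref), 1)


def permutation_p_value(a, b, trials=2000, seed=0):
    """Two-sided permutation test on the difference of mean WER."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (trials + 1)


def exhibits_meaningful_bias(wer_at_risk, wer_reference,
                             alpha=0.05, min_gap=0.02):
    """Flag bias only if the gap is both statistically and practically
    significant (both thresholds are assumed, illustrative values)."""
    gap = mean(wer_at_risk) - mean(wer_reference)
    p_value = permutation_p_value(wer_at_risk, wer_reference)
    return gap >= min_gap and p_value < alpha
```

Requiring both checks reflects the abstract's pairing of statistical and practical criteria: a very large corpus can make a negligible WER gap statistically significant, while a small corpus can show a large gap by chance.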

Exploring Heterogeneous Characteristics of Layers In ASR Models For More Efficient Training

A computer-implemented method includes obtaining a multi-domain (MD) dataset and training a neural network model using the MD dataset with short-form data withheld (MD-SF). The neural network model includes a plurality of layers each having a plurality of parameters. The method also includes resetting each respective layer in the trained neural network one at a time. For each respective layer in the trained neural network model, and after resetting the respective layer, the method also includes determining a corresponding word error rate of the trained neural network model and identifying the respective layer as corresponding to an ambient layer when the corresponding word error rate satisfies a word error rate threshold. The method also includes transmitting an on-device neural network model to execute on one or more client devices for generating gradients based on the withheld domain (SF) of the MD dataset.
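The layer-by-layer reset loop can be illustrated with a toy sketch. Several things here are assumptions rather than details from the abstract: the model is represented as a plain mapping from layer name to parameters, "resetting" is taken to mean restoring a layer's initial values, and the threshold is treated as satisfied when the WER after the reset stays at or below it. `find_ambient_layers` and `evaluate_wer` are hypothetical names.

```python
from typing import Callable, Dict, List

Params = Dict[str, List[float]]


def find_ambient_layers(trained: Params,
                        initial: Params,
                        evaluate_wer: Callable[[Params], float],
                        wer_threshold: float) -> List[str]:
    """Reset one layer at a time and keep those whose reset barely hurts WER."""
    ambient = []
    for name in trained:
        probe = dict(trained)        # shallow copy: other layers stay trained
        probe[name] = initial[name]  # reset only this layer
        if evaluate_wer(probe) <= wer_threshold:
            ambient.append(name)
    return ambient
```

In practice `evaluate_wer` would decode a held-out set with the probed model; here it is left as a caller-supplied function so the loop itself stays framework-independent.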

Automated speech recognition confidence classifier

A method of enhancing an automated speech recognition confidence classifier includes receiving a set of baseline confidence features from one or more decoded words, deriving word embedding confidence features from the baseline confidence features, joining the baseline confidence features with word embedding confidence features to create a feature vector, and executing the confidence classifier to generate a confidence score, wherein the confidence classifier is trained with a set of training examples having labeled features corresponding to the feature vector.
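A minimal sketch of the feature-joining step, under stated assumptions: the decoded word's identity is assumed to travel with its baseline features and to drive an embedding lookup, the classifier is assumed to be a logistic model, and the embedding table, weights, and bias are hypothetical placeholders standing in for trained values.

```python
import math
from typing import Dict, List, Sequence

# Hypothetical embedding table; a real system would load trained embeddings.
EMBEDDINGS: Dict[str, List[float]] = {
    "hello": [0.2, -0.1],
    "world": [0.4, 0.3],
}
UNKNOWN_EMBEDDING = [0.0, 0.0]


def build_feature_vector(word: str, baseline: Sequence[float]) -> List[float]:
    """Join baseline confidence features with word-embedding features."""
    return list(baseline) + EMBEDDINGS.get(word, UNKNOWN_EMBEDDING)


def confidence_score(features: Sequence[float],
                     weights: Sequence[float],
                     bias: float) -> float:
    """Logistic classifier output in (0, 1), read as a confidence score."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))
```

For example, `build_feature_vector("hello", [0.9, 0.8])` yields a four-element vector whose first two entries are the baseline features and whose last two are the embedding.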

SYSTEM AND METHOD FOR AUTOMATED OBSERVATION AND ANALYSIS OF INSTRUCTIONAL DISCOURSE

A method of analyzing instructor discourse includes recording an audio signal representing speech of the instructor during a class session, converting the audio signal to a session transcript comprising speech data for the session using an automatic speech recognition tool and segmenting the transcript into utterances, extracting a set of features from the session transcript, filtering student talk out from the utterances, analyzing a first subset of the features to produce a number of local context predictions for each utterance of the session transcript, analyzing a second subset of the features to produce a number of global context predictions for the session transcript, and combining a subset of the number of local context predictions and the number of global context predictions into a classification that attends to differential reliability.
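One way to picture the final combining step is a reliability-weighted average of the local and global predictions. This is only an assumed reading of "attends to differential reliability"; the abstract does not specify the combination rule, and `combine_predictions` is a hypothetical name.

```python
from typing import Sequence, Tuple


def combine_predictions(predictions: Sequence[Tuple[float, float]]) -> float:
    """Average (probability, reliability) pairs, weighting by reliability."""
    total = sum(reliability for _, reliability in predictions)
    if total <= 0:
        raise ValueError("at least one prediction needs positive reliability")
    return sum(p * r for p, r in predictions) / total
```

With this rule, a prediction from a more reliable context contributes proportionally more to the combined classification score.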

Recognizing accented speech
11651765 · 2023-05-16

Techniques and apparatuses for recognizing accented speech are described. In some embodiments, an accent module recognizes accented speech using an accent library based on device data, uses different speech recognition correction levels based on an application field into which recognized words are set to be provided, or updates an accent library based on corrections made to incorrectly recognized speech.
