Patent classifications
G06F40/284
Method for training speech recognition model, method and system for speech recognition
Disclosed are a method for training a speech recognition model, and a method and a system for speech recognition. The disclosure relates to the field of speech recognition and includes: inputting an audio training sample into the acoustic encoder to encode the acoustic features of the audio training sample and determine an acoustic encoded state vector; inputting a preset vocabulary into the language predictor to determine a text prediction vector; inputting the text prediction vector into the text mapping layer to obtain a text output probability distribution; calculating a first loss function according to a target text sequence corresponding to the audio training sample and the text output probability distribution; inputting the text prediction vector and the acoustic encoded state vector into the joint network to calculate a second loss function; and performing iterative optimization according to the first loss function and the second loss function.
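As a rough illustration of how two losses might be combined during optimization, the sketch below computes a first loss as cross-entropy between a target token sequence and per-step output distributions, then adds it to a second loss that is passed in as a plain number. The abstract does not state the form of either loss or how they are combined; the cross-entropy choice, the weighted sum, and the names `cross_entropy`, `combined_loss`, and `text_weight` are all assumptions made for illustration.

```python
import math

def cross_entropy(target_ids, step_distributions):
    """Mean negative log-likelihood of the target tokens (assumed form of the first loss)."""
    total = 0.0
    for token_id, dist in zip(target_ids, step_distributions):
        total -= math.log(dist[token_id])
    return total / len(target_ids)

def combined_loss(first_loss, second_loss, text_weight=0.5):
    """Weighted sum of both losses; text_weight is a hypothetical hyperparameter."""
    return text_weight * first_loss + second_loss

# Toy data: a 3-token vocabulary and the target sequence [0, 2].
distributions = [[0.5, 0.25, 0.25], [0.25, 0.25, 0.5]]
first = cross_entropy([0, 2], distributions)

# The second loss would come from the joint network; here it is a placeholder value.
total = combined_loss(first, second_loss=1.0)
print(first, total)
```

In a real training loop the combined value would be minimized by a gradient-based optimizer; that part is omitted here.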
Systems for real-time intelligent haptic correction to typing errors and methods thereof
Systems and methods of the present disclosure enable context-aware haptic error notifications. The systems and methods include a processor to receive input segments into a software application from a character input component and determine a destination. A context identification model predicts a context classification of the input segments based at least in part on the software application and the destination. Potential errors are determined in the input segments based on the context classification. An error characterization machine learning model determines an error type classification and an error severity score associated with each potential error. A haptic feedback pattern is determined for each potential error based on its error type classification and error severity score. A haptic event latency is likewise determined based on the error type classification and the error severity score of each potential error.
Detecting system events based on user sentiment in social media messages
Methods and systems are disclosed herein for using anomaly detection in timeseries data of user sentiment to detect incidents in computing systems and identify events within an enterprise. An anomaly detection system may receive social media messages that include a timestamp indicating when each message was published. The system may generate sentiment identifiers for the social media messages. The sentiment identifiers and timestamps associated with the social media messages may be used to generate a timeseries dataset for each type of sentiment identifier. The timeseries datasets may be input into an anomaly detection model to determine whether an anomaly has occurred. The system may retrieve textual data from the social media messages associated with the detected anomaly and may use the text to determine a computing system or event associated with the detected anomaly.
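The pipeline described above (timestamps and sentiment identifiers turned into one time series per sentiment, then checked for anomalies) can be sketched as follows. This is a minimal sketch under stated assumptions: the abstract does not name an anomaly detection model, so a simple z-score threshold stands in for it, sentiment labels are assumed to be already assigned, and the function names, hourly bucketing, and toy messages are hypothetical.

```python
from collections import Counter
from datetime import datetime
from statistics import mean, pstdev

def hourly_counts(messages, sentiment):
    """Build a time series: number of messages with `sentiment` per hour bucket."""
    counts = Counter()
    for timestamp, label in messages:
        if label == sentiment:
            bucket = timestamp.replace(minute=0, second=0, microsecond=0)
            counts[bucket] += 1
    return counts

def find_anomalies(series, threshold=2.0):
    """Return buckets whose count lies more than `threshold` std devs above the mean.

    A z-score test is an assumed stand-in for the anomaly detection model.
    """
    values = list(series.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [b for b, v in sorted(series.items()) if (v - mu) / sigma > threshold]

# Toy data: two negative messages per hour, then a burst of twenty in hour 6.
messages = []
for hour in range(7):
    burst = 20 if hour == 6 else 2
    for minute in range(burst):
        messages.append((datetime(2024, 1, 1, hour, minute), "negative"))
messages.append((datetime(2024, 1, 1, 3, 30), "positive"))

negative_series = hourly_counts(messages, "negative")
print(find_anomalies(negative_series))
```

Once an anomalous bucket is found, the text of the messages in that bucket could be inspected to identify the affected system or event; that step is left out of the sketch.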
Refining training sets and parsers for large and dynamic text environments
Briefly stated, the invention is directed to retrieving a semantically matched knowledge structure. A question and answer pair is received, wherein the answer is received from a query of a search engine. The question is constraint-matched with the answer by maximizing a plurality of constraints, wherein at least one of the constraints is a similarity score between the question and the answer, and wherein the constraint matching generates a matched sequence. For one or more answer sequences, a subsequence is found that is not parsed as answer slots. Query results are obtained from another search engine based on a combination of the answer or question and the non-answer subsequence. A knowledge base (KB) is refined based on the query results, the constraint matching, and neural network training, for subsequent semantic matching, wherein the KB includes a dense semantic vector indication of concepts.
Machine translation of chat sessions
An embodiment may involve a database containing a first user profile that specifies a first preferred language of a first user and a second user profile that specifies a second preferred language of a second user. The embodiment may also involve one or more processors configured to: receive, from the first user and within a chat session, a first set of messages in the first preferred language; cause the first set of messages to be translated into the second preferred language; provide, to the second user and within the chat session, the first set of messages as translated; receive, from the second user and within the chat session, a second set of messages in the second preferred language; cause the second set of messages to be translated into the first preferred language; and provide, to the first user and within the chat session, the second set of messages as translated.
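The message flow in this abstract amounts to routing each set of messages through a translation step chosen from the two user profiles. The sketch below illustrates that routing only. `UserProfile`, `translate`, and `relay` are hypothetical names, and `translate` is a stand-in that tags the text rather than calling any real translation service.

```python
from dataclasses import dataclass

@dataclass
class UserProfile:
    name: str
    preferred_language: str

def translate(text, source, target):
    """Hypothetical stand-in for a translation service: tags the text instead."""
    if source == target:
        return text
    return f"[{source}->{target}] {text}"

def relay(messages, sender, recipient):
    """Translate a set of messages from the sender's language to the recipient's."""
    return [
        translate(m, sender.preferred_language, recipient.preferred_language)
        for m in messages
    ]

first_user = UserProfile("first", "en")
second_user = UserProfile("second", "pt")

to_second = relay(["Hello", "How can I help?"], first_user, second_user)
to_first = relay(["Obrigado"], second_user, first_user)
print(to_second, to_first)
```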
Pointer sentinel mixture architecture
The technology disclosed provides a so-called “pointer sentinel mixture architecture” for neural network sequence models that has the ability to either reproduce a token from a recent context or produce a token from a predefined vocabulary. In one implementation, a pointer sentinel-LSTM architecture achieves state of the art language modeling performance of 70.9 perplexity on the Penn Treebank dataset, while using far fewer parameters than a standard softmax LSTM.
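The choice between copying a token from recent context and producing one from the vocabulary can be illustrated with a small mixture computation. The abstract does not give the formula, so the sketch assumes one common formulation: a sentinel score is appended to the pointer scores before a softmax, the probability mass landing on the sentinel acts as the gate on the vocabulary distribution, and the remaining mass is added to the context tokens it points at. The function names and toy values are hypothetical.

```python
import math
from collections import defaultdict

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def pointer_sentinel_mixture(vocab_probs, context_tokens, pointer_scores, sentinel_score):
    """Mix a vocabulary distribution with pointer mass over recent context tokens.

    The gate is the softmax mass assigned to the sentinel (an assumed formulation).
    """
    attention = softmax(list(pointer_scores) + [sentinel_score])
    gate = attention[-1]
    mixed = defaultdict(float)
    for token, p in vocab_probs.items():
        mixed[token] += gate * p          # vocabulary component, scaled by the gate
    for token, a in zip(context_tokens, attention[:-1]):
        mixed[token] += a                 # pointer component, summed per token
    return dict(mixed), gate

vocab_probs = {"the": 0.5, "bank": 0.25, "<unk>": 0.25}
context_tokens = ["Fed", "bank", "Fed"]   # "Fed" is absent from the vocabulary
mixed, gate = pointer_sentinel_mixture(vocab_probs, context_tokens, [0.0, 0.0, 0.0], 0.0)
print(gate, mixed)
```

With equal scores, the sentinel receives a quarter of the mass, so the out-of-vocabulary token "Fed" still ends up with probability 0.5 through the pointer alone.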
Hierarchical multi-task term embedding learning for synonym prediction
Due to the high variability of language use in real life, manual construction of semantic resources to cover all synonyms is prohibitively expensive and may result in limited coverage. Described herein are systems and methods that automate the process of synonymy resource development, including both formal entities and noisy descriptions from end-users. Embodiments of a multi-task model with hierarchical task relationship are presented that learn more representative entity/term embeddings and apply them to synonym prediction. In model embodiments, a skip-gram word embedding model is extended by introducing an auxiliary task, “neighboring word/term semantic type prediction,” and the tasks are organized hierarchically based on task complexity. In one or more embodiments, existing term-term synonymous knowledge is integrated into the word embedding learning framework. Embeddings trained from the multi-task model embodiments yield significant improvement for entity semantic relatedness evaluation, neighboring word/term semantic type prediction, and synonym prediction compared with baselines.
Systems and methods for response selection in multi-party conversations with dynamic topic tracking
Embodiments described herein provide a dynamic topic tracking mechanism that tracks how conversation topics change from one utterance to another and uses the tracking information to rank candidate responses. A pre-trained language model may be used for response selection in multi-party conversations in two steps: (1) topic-based pre-training to embed topic information into the language model with self-supervised learning, and (2) multi-task learning on the pre-trained model by jointly training response selection, dynamic topic prediction, and disentanglement tasks.
Machine learning system and method to map keywords and records into an embedding space
In some embodiments, a method includes determining a position for a search query and a position for each audience record from multiple audience records in an embedding space. The method further includes receiving multiple device records, each associated with an audience record. The method further includes determining multiple keywords, each associated with an audience record and determining a position for each keyword in the embedding space. The method further includes calculating a first distance between the position of the search query in the embedding space and the position of each audience record in the embedding space. The method further includes calculating a second distance between the position of the search query in the embedding space and the position of each keyword in the embedding space. The method further includes ranking each audience record based on the first distance and the second distance.
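The ranking step can be sketched with plain Euclidean distances. The abstract does not say how the first and second distances are combined, so this sketch assumes the score for each audience record is its own distance to the query plus the distance from the query to its nearest keyword, with lower scores ranked first. The function name, the 2-D positions, and the record and keyword names are hypothetical, and device records are left out.

```python
import math

def rank_audience_records(query_pos, record_pos, keyword_pos, record_keywords):
    """Rank audience records by combined distance to the query (closest first).

    Score = first distance (query to record) + second distance (query to the
    record's nearest keyword). The sum is an assumed way to combine the two.
    """
    scored = []
    for record_id, pos in record_pos.items():
        first = math.dist(query_pos, pos)
        second = min(
            math.dist(query_pos, keyword_pos[k]) for k in record_keywords[record_id]
        )
        scored.append((first + second, record_id))
    return [record_id for _, record_id in sorted(scored)]

query_pos = (0.0, 0.0)
record_pos = {"sports_fans": (3.0, 4.0), "home_cooks": (1.0, 0.0)}
keyword_pos = {"football": (0.0, 1.0), "recipes": (6.0, 8.0)}
record_keywords = {"sports_fans": ["football"], "home_cooks": ["recipes"]}

ranking = rank_audience_records(query_pos, record_pos, keyword_pos, record_keywords)
print(ranking)
```

In this toy case "home_cooks" sits closer to the query, but its keyword is far away, so "sports_fans" ranks first under the assumed sum.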