Patent classifications
G06F40/216
WORD MINING METHOD AND APPARATUS, ELECTRONIC DEVICE AND READABLE STORAGE MEDIUM
The present disclosure provides a word mining method and apparatus, an electronic device and a readable storage medium, and relates to the field of artificial intelligence technologies, such as natural language processing, deep learning and cloud service technologies. The word mining method includes: acquiring search data; taking first identification information, a search sentence and second identification information in the search data as nodes, and taking the relationship between the first identification information and the search sentence, the relationship between the first identification information and the second identification information, and the relationship between the search sentence and the second identification information as edges to construct a behavior graph; obtaining a label vector of each search sentence in the behavior graph according to a search sentence with a preset label in the behavior graph; determining a target search sentence in the behavior graph according to the label vector; and extracting a target word from the target search sentence, and taking the target word as a word mining result of the search data.
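The graph construction and label-vector steps described above can be sketched as follows. This is a minimal illustration, not the patented method: the abstract does not say how label vectors are computed, so the sketch assumes simple neighbor-averaging label propagation with the pre-labeled search sentences held fixed. The function names, the `(first_id, query, second_id)` record layout and the 0.5 threshold are all hypothetical, and the final word-extraction step is omitted.

```python
from collections import defaultdict


def build_behavior_graph(records):
    """Build an undirected graph from (first_id, query, second_id) records.

    Each record contributes three nodes and the three pairwise
    relationships between them as edges.
    """
    graph = defaultdict(set)
    for first_id, query, second_id in records:
        u, q, s = ("id1", first_id), ("query", query), ("id2", second_id)
        for a, b in ((u, q), (u, s), (q, s)):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def propagate_labels(graph, seed_labels, num_labels, iterations=10):
    """Assumed propagation rule: each unlabeled node takes the mean of its
    neighbors' vectors; seed nodes (which must be in the graph) stay fixed."""
    vectors = {node: [0.0] * num_labels for node in graph}
    for node, label in seed_labels.items():
        vectors[node][label] = 1.0
    for _ in range(iterations):
        updated = {}
        for node, neighbors in graph.items():
            if node in seed_labels:
                updated[node] = vectors[node]
                continue
            updated[node] = [
                sum(vectors[n][k] for n in neighbors) / len(neighbors)
                for k in range(num_labels)
            ]
        vectors = updated
    return vectors


def select_target_queries(vectors, seed_labels, target_label, threshold=0.5):
    """Return unlabeled search sentences whose score for target_label
    reaches the (hypothetical) threshold."""
    return sorted(
        node[1]
        for node, vec in vectors.items()
        if node[0] == "query"
        and node not in seed_labels
        and vec[target_label] >= threshold
    )
```

With two records sharing the same identifiers and one labeled search sentence among them, the unlabeled sentence inherits most of the seed's label mass after a few iterations and is returned by `select_target_queries`.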
Systems, Methods, and Devices for a Form Converter
Methods, systems, and devices for automatically converting a static electronic file format and its various elements into a dynamic digital form with executable elements that can be customized before being used. The resulting digital form is compatible with digital workflows and processes. The disclosed systems, methods, and devices go beyond simply extracting data from the original electronic file format and instead enable users to, without using code, convert the source form into a dynamic, malleable digital form while still retaining the source form's original purpose and functionality.
MULTI-MODEL APPROACH TO NATURAL LANGUAGE PROCESSING AND RECOMMENDATION GENERATION
In some implementations, a device may monitor a set of data sources to generate a set of language models corresponding to the set of data sources. The device may determine a plurality of sets of keyword groups. The device may generate a plurality of sets of skill catalogs. The device may receive a source document for processing. The device may process the source document to extract a key phrase set and to determine a first similarity distance. The device may select a corresponding skill catalog and an associated language model based on a relevancy value. The device may determine second similarity distances between the source document and one or more target documents using the corresponding skill catalog and the associated language model. The device may output information associated with one or more target documents based at least in part on the second similarity distances.
Multimodal based punctuation and/or casing prediction
Techniques for predicting punctuation and casing using multimodal fusion are described. An exemplary method includes processing generated text by: tokenizing the generated text into sub-words, and generating a sequence of lexical features for the sub-words using a pre-trained lexical encoder; processing the audio by: generating a sequence of frame level acoustic embeddings using a pre-trained acoustic encoder on the audio, and generating task specific embeddings from the frame level acoustic embeddings; performing multimodal fusion of the sub-word level acoustic embeddings and the sequence of lexical features by: aligning the task specific embeddings to the sequence of lexical features, and combining the sequence of lexical features and aligned acoustic sequence; predicting punctuation and casing from the combined sequence of lexical features and aligned acoustic sequence; concatenating the sub-words of the text, and applying the predicted punctuation and casing; and outputting text having the predicted punctuation and casing.
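The fusion step can be pictured with a small sketch. The abstract does not state how the acoustic embeddings are aligned to the sub-words, so this example swaps in a simple stand-in: it assumes each sub-word comes with a known frame span and mean-pools the frames in that span before concatenating them with the lexical features. The function name `align_and_fuse` and the `frame_spans` input are hypothetical.

```python
def align_and_fuse(lexical_features, acoustic_frames, frame_spans):
    """Mean-pool the acoustic frames in each sub-word's span, then
    concatenate the pooled vector onto that sub-word's lexical features.

    lexical_features: one feature list per sub-word
    acoustic_frames:  one embedding list per audio frame (non-empty)
    frame_spans:      assumed (start, end) frame indices per sub-word,
                      end exclusive
    """
    dim = len(acoustic_frames[0])
    fused = []
    for features, (start, end) in zip(lexical_features, frame_spans):
        frames = acoustic_frames[start:end]
        if frames:
            pooled = [sum(f[d] for f in frames) / len(frames) for d in range(dim)]
        else:
            # No frames for this sub-word: fall back to a zero vector.
            pooled = [0.0] * dim
        fused.append(list(features) + pooled)
    return fused
```

The fused sequence has one vector per sub-word, which is the shape a downstream punctuation and casing classifier would consume.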
Predictive time series data object machine learning system
Provided is a method including obtaining a first data object including a first set of data entries, wherein each data entry of the first set of data entries includes text content associated with a time entry. The method includes generating a first data object score using scoring parameters together with the text content and the time entries included in the first set of data entries; determining that the first data object score satisfies a data object score condition; and performing, in response to the first data object score satisfying the data object score condition, a condition-specific action associated with the data object score condition.
Tracking specialized concepts, topics, and activities in conversations
Embodiments are directed to organizing conversation information. A tracker vocabulary may be provided to a universal model to predict a generalized vocabulary associated with the tracker vocabulary. A tracker model may be generated based on the portions of the universal model activated by the tracker vocabulary such that a remainder of the universal model may be excluded from the tracker model. Portions of a conversation stream may be provided to the tracker model. A match score may be generated based on the tracker model and the portions of the conversation stream such that the match score predicts whether the portions of the conversation stream may be in the generalized vocabulary predicted for the tracker vocabulary. Tracker metrics may be collected based on the portions of the conversation and the match scores such that the tracker metrics may be included in reports or notifications.
Systems for real-time intelligent haptic correction to typing errors and methods thereof
Systems and methods of the present disclosure enable context-aware haptic error notifications. The systems and methods include a processor to receive input segments into a software application from a character input component and determine a destination. A context identification model predicts a context classification of the input segments based at least in part on the software application and the destination. Potential errors are determined in the input segments based on the context classification. An error characterization machine learning model determines an error type classification and an error severity score associated with each potential error. A haptic feedback pattern and a haptic event latency are then determined for each potential error based on that error's type classification and severity score.
Algorithm for scoring partial matches between words
Techniques are disclosed relating to scoring partial matches between words. In certain embodiments, a method may include receiving a request to determine a similarity between an input text data and a stored text data. The method also includes determining, based on comparing one or more words included in the input text data with one or more words included in the stored text data, a set of word pairs and a set of unpaired words. Further, in response to determining that the set of unpaired words passes elimination criteria, the method includes calculating a base similarity score between the input text data and the stored text data based on the set of word pairs. The method also includes determining a scoring penalty based on the set of unpaired words and generating a final similarity score between the input text data and the stored text data by modifying the base similarity score based on the scoring penalty.
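The pairing, elimination, base score and penalty steps can be illustrated with a short sketch. The abstract does not define how words are paired, what the elimination criteria are, or how the penalty is computed, so every concrete choice below is an assumption: words are paired greedily by `difflib.SequenceMatcher` ratio, the elimination check is a cap on the number of unpaired words, failing that check returns 0.0, and the penalty is a fixed amount per unpaired word. The function name and all parameters are hypothetical.

```python
from difflib import SequenceMatcher


def score_partial_match(input_text, stored_text, pair_threshold=0.8,
                        max_unpaired=3, penalty_per_word=0.1):
    """Score the similarity of two texts from paired and unpaired words."""
    input_words = input_text.lower().split()
    remaining = stored_text.lower().split()
    pairs, unpaired = [], []

    # Greedily pair each input word with its most similar stored word.
    for word in input_words:
        best, best_ratio = None, 0.0
        for candidate in remaining:
            ratio = SequenceMatcher(None, word, candidate).ratio()
            if ratio > best_ratio:
                best, best_ratio = candidate, ratio
        if best is not None and best_ratio >= pair_threshold:
            pairs.append((word, best, best_ratio))
            remaining.remove(best)
        else:
            unpaired.append(word)
    unpaired.extend(remaining)  # stored words that found no partner

    # Assumed elimination criteria: no pairs at all, or too many leftovers.
    if not pairs or len(unpaired) > max_unpaired:
        return 0.0

    base_score = sum(ratio for _, _, ratio in pairs) / len(pairs)
    penalty = penalty_per_word * len(unpaired)
    return max(0.0, base_score - penalty)
```

Under these assumptions, identical texts score 1.0, while a stored text with one extra word scores lower because of the unpaired-word penalty.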
Real-time anomaly determination using integrated probabilistic system
An audio stream is detected during a communication session with a user. Natural language processing is performed on the audio stream to update a set of attributes by supplementing the set of attributes with attributes derived from the audio stream. A set of filter values is updated based on the updated set of attributes. The updated set of filter values is used to query a set of databases to obtain datasets. A probabilistic program is executed during the communication session by determining a set of probability parameters characterizing a probability of an anomaly occurring based on the datasets and the set of attributes. A determination is made whether the probability satisfies a threshold. In response to a determination that the probability satisfies the threshold, a record identifying the communication session is updated to indicate that the threshold is satisfied.