G06F16/353

Building and managing cohesive interaction for virtual assistants

A method includes receiving data comprising a plurality of requests and a plurality of responses to the requests. The requests and the responses are associated with a virtual assistant programmed to address the plurality of requests. In the method, a machine learning (ML) classifier is used to partition the requests into a plurality of partitions corresponding to a plurality of request types. An interface for a user is generated to display a subset of the requests corresponding to at least one partition of the plurality of partitions and to display a response corresponding to the subset of the plurality of requests, wherein the response is based on one or more of the plurality of responses. The interface is configured to permit editing of the response by the user. The method also includes processing the response edited by the user, and transmitting the edited response to the virtual assistant.

Neologism classification techniques with trigrams and longest common subsequences

Techniques are provided for identifying attributes associated with a neologism or an unknown word or name. Real world characteristics can be predicted for the neologism. Trigrams are identified for an input word and word embedding model vector values are calculated for the identified trigrams and entered into a matrix. Trigrams are identified for nearest names. Classification values are calculated based on the trigrams for the input word and the trigrams from the nearest names and the classification values are entered into the matrix. A convolutional neural network can process the matrix to identify one or more characteristics associated with the neologism.
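
The two string operations named in the title can be illustrated with a minimal sketch. It assumes character-level trigrams and a standard dynamic-programming longest common subsequence; the abstract does not state how either is computed or how the classification values are derived from them, and the function names `char_trigrams` and `lcs_length` are hypothetical.

```python
def char_trigrams(word: str) -> list[str]:
    """Return the overlapping three-character windows of a lowercased word.

    Assumption: trigrams are character trigrams; the abstract does not say.
    """
    w = word.lower()
    return [w[i:i + 3] for i in range(len(w) - 2)]


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the best subsequence that ended before both characters.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise carry over the better of skipping one character.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def shared_trigrams(word: str, neighbour: str) -> set[str]:
    """Trigrams that an input word has in common with one nearby name."""
    return set(char_trigrams(word)) & set(char_trigrams(neighbour))
```

For instance, `shared_trigrams("Zara", "Sara")` yields the single trigram `ara`. Values of this kind could fill matrix rows, though the matrix layout and the convolutional network are outside the scope of this sketch.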

Cross-lingual unsupervised classification with multi-view transfer learning
11694042 · 2023-07-04

Presented herein are embodiments of an unsupervised cross-lingual sentiment classification model (which may be referred to as a multi-view encoder-classifier (MVEC)) that leverages an unsupervised machine translation (UMT) system and a language discriminator. Unlike previous language model (LM)-based fine-tuning approaches that adjust parameters solely based on the classification error on training data, embodiments employ an encoder-decoder framework of a UMT as a regularization component on the shared network parameters. In one or more embodiments, the cross-lingual encoder learns a shared representation, which is effective for both reconstructing input sentences of two languages and generating more representative views from the input for classification. Experiments on five language pairs verify that an MVEC embodiment significantly outperforms other models for 8/11 sentiment classification tasks.

Permutation-based clustering of computer-generated data entries
11693851 · 2023-07-04

A computer-generated data entry is received. The computer-generated data entry is segmented into a set of tokens. A plurality of different token permutation groupings are determined. Each of the different token permutation groupings includes a different subset of tokens from the set of tokens of the computer-generated data entry. For the computer-generated data entry, a corresponding token permutation grouping identifier is determined for each grouping of the plurality of different token permutation groupings. It is determined whether the computer-generated data entry belongs to any data entry cluster among a plurality of previously identified data entry clusters based on a search performed using the token permutation grouping identifiers of the computer-generated data entry.

AI-AUGMENTED AUDITING PLATFORM INCLUDING TECHNIQUES FOR AUTOMATED ASSESSMENT OF VOUCHING EVIDENCE

Systems and methods for determining whether an electronic document constitutes vouching evidence are provided. The system may receive ERP item data and generate hypothesis data based thereon, and may receive electronic document data and extract ERP information therefrom. The system may then apply one or more models to compare the hypothesis data to the extracted ERP information to determine whether the electronic document constitutes vouching evidence for the ERP item. Systems and methods for verifying an assertion against a source document are also provided. The system may receive first data indicating an unverified assertion and second data comprising a plurality of source documents. The system may apply one or more extraction models to extract a set of key data from the plurality of source documents and may apply one or more matching models to compare the first data to the set of key data to determine whether vouching criteria are met.

GENERATING SECURITY EVENT CASE FILES FROM DISPARATE UNSTRUCTURED DATA
20230004978 · 2023-01-05

Described herein are systems and methods for generating security event case files with unstructured data. For example, the method can include receiving, by a computing system, unstructured data and system-based inferences from devices positioned throughout a store, and adding structure to the unstructured data and system-based inferences based on applying one or more structuring models. Adding structure can include labeling the data and system-based inferences, classifying them into security event categories, and identifying objective identifiers to identify users in the data and system-based inferences. The method also can include generating case files for each of the objective identifiers, where the case files include the associated data. The method can include determining whether the case files satisfy alerting rules. The case files can then be reported out and acted upon (e.g., based on satisfying the alerting rules) and/or stored for subsequent analysis and use.

System and method for parsing user query

A system and a method for parsing a user query. The system includes a database arrangement operable to store an ontology; and a processing module communicably coupled to the database arrangement. The processing module is operable to receive the user query; refine the user query to obtain a search query using an algorithm; generate a plurality of strings for the obtained search query; sort the plurality of strings in a decreasing order of length of the plurality of strings; assign a part-of-speech tag to each of the query segments of the plurality of strings based on the ontology; identify at least one of the query segments as at least one output class or at least one input class based on the assigned part-of-speech tags; and establish semantic associations between the query segments based on the ontology to obtain the parsed user query.
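
The string generation and sorting steps might look like the sketch below. It assumes the "strings" are contiguous word sequences of the search query and that length is measured in words; the abstract specifies neither, and the function name `candidate_strings` is hypothetical. Refinement, part-of-speech tagging, and the ontology lookup are not shown.

```python
def candidate_strings(search_query: str) -> list[str]:
    """Generate contiguous word sequences and sort them longest first.

    Assumption: length means word count. Python's sort is stable, so
    strings of equal length keep their left-to-right order.
    """
    words = search_query.split()
    spans = [
        words[i:j]
        for i in range(len(words))
        for j in range(i + 1, len(words) + 1)
    ]
    spans.sort(key=len, reverse=True)
    return [" ".join(span) for span in spans]
```

Sorting longest first means a multi-word segment can be matched against the ontology before its individual words are considered on their own.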

Artificially intelligent system employing modularized and taxonomy-based classifications to generate and predict compliance-related content
11537649 · 2022-12-27

A new and improved artificially intelligent system employing modularized and taxonomy-based classifications to generate compliance-related content. In one embodiment, the system comprises monitoring circuitry that receives regulatory compliance data from one or more regulatory institutions, as well as a taxonomy engine that processes the regulatory compliance data to generate taxonomy-based classifications of the regulatory compliance data comprising a plurality of modules and compliance requirements within each module. In certain embodiments, the system also includes a database storing the taxonomy-based classifications of the regulatory compliance data, and a plurality of processors in operative communication with the database that receive at least two of the plurality of modules from the taxonomy-based classifications and process the compliance requirements within each received module using natural language processing to generate a mapping of semantic relationship pairs between each received module. In certain embodiments, the system also includes scoring circuitry that processes the mapping of semantic relationship pairs between each received module to produce a similarity score for each relationship pair, as well as interface circuitry that uses the similarity scores to generate a set of compliance steps that covers all compliance requirements from each of the received modules.

Moderator tool for moderating acceptable and unacceptable contents and training of moderator model
11531834 · 2022-12-20

A computer-executable method for moderating publication of data content with a moderator tool. The data contents used as training data are labelled as acceptable or unacceptable. The moderator tool receives the training data and executes a first algorithm that identifies features that exist in the training data and extracts them, ending up with a feature space. The moderator tool executes a second algorithm in the feature space for defining a distribution of data features that differentiate between the acceptable contents and the unacceptable contents in order to create a moderation model. When the moderator tool receives a new data content to be moderated, it processes the new data content to identify the data features in it in accordance with the moderation model created, and produces a moderation result for the new data content by indicating whether the new data content is acceptable.

Systems and methods for identifying personal identifiers in content
11531779 · 2022-12-20

Provided herein are systems and methods for identifying personal identifiers in content. An entity engine may receive content to identify candidate personal identifiers. The entity engine may determine that a text string in the content matches a data format specified in entity definitions corresponding to types of personal identifiers and a rule for finding a geographic or linguistic term in the content correlated to the specific type of personal identifier. Each entity definition may specify a data format for finding a specific type of personal identifier in content. The data format corresponds to a type of personal identifier. The entity engine may identify, according to a rule of the entity definition, a geographic or linguistic term in the content correlated to the type of personal identifier. The entity engine may classify the text string as the type of personal identifier, for preventing data breach or exfiltration.
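
The pairing of a data format with a correlated term can be sketched as below. Everything specific here is an assumption for illustration: the `EntityDefinition` class, the single example definition (a nine-digit pattern with a short list of English context terms), and the rule that a context term may appear anywhere in the content. The abstract does not disclose actual formats, terms, or proximity rules.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityDefinition:
    """Hypothetical entity definition: a data format plus correlated terms."""
    identifier_type: str
    data_format: re.Pattern
    context_terms: tuple[str, ...]


# Illustrative definition only; real definitions would be configured per type.
DEFINITIONS = [
    EntityDefinition(
        identifier_type="us_ssn",
        data_format=re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
        context_terms=("ssn", "social security"),
    ),
]


def classify(content: str, definitions=DEFINITIONS) -> list[tuple[str, str]]:
    """Return (text string, identifier type) pairs found in the content.

    A format match is classified only when one of the definition's
    context terms also occurs somewhere in the content.
    """
    lowered = content.lower()
    hits = []
    for definition in definitions:
        if not any(term in lowered for term in definition.context_terms):
            continue
        for match in definition.data_format.finditer(content):
            hits.append((match.group(), definition.identifier_type))
    return hits
```

Requiring the correlated term reduces false positives: an order number that happens to share the format is not flagged unless the surrounding content also mentions one of the context terms.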