G06F16/90344

MACHINE LEARNING ENHANCED CLASSIFIER
20230046471 · 2023-02-16

The presently disclosed subject matter includes a computerized method and system that provide the ability to train and execute a unique machine learning (ML) model specifically configured to enhance classifier (e.g., RegEx) output by identifying and removing false positive results from the classifier's output. Classifier output, comprising a collection of data-subsets (e.g., columns in a relational database) of one or more structured or semi-structured data sources (e.g., tables of a relational database), is transformed into a plurality of numerical vectors. The numerical vectors are used during the training phase (as well as the execution phase) of a machine learning model that enhances the classifier output and reduces false positives.
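The transformation of a classifier-flagged column into a numerical vector might look roughly like the sketch below. The function name `column_to_vector` and the three features (match fraction, mean value length, distinct-value ratio) are illustrative assumptions; the abstract does not say which features the disclosed method uses.

```python
import re
from statistics import mean


def column_to_vector(values, pattern):
    """Summarize one column as [match_fraction, mean_length, distinct_ratio].

    The feature choice is an assumption made for illustration only.
    """
    values = [str(v) for v in values]
    if not values:
        return [0.0, 0.0, 0.0]
    compiled = re.compile(pattern)
    # Share of values that the RegEx classifier accepts in full.
    match_fraction = float(mean(1.0 if compiled.fullmatch(v) else 0.0 for v in values))
    # Average string length of the values in the column.
    mean_length = float(mean(len(v) for v in values))
    # Ratio of distinct values to all values (low for placeholder-heavy columns).
    distinct_ratio = len(set(values)) / len(values)
    return [match_fraction, mean_length, distinct_ratio]
```

Vectors of this kind, paired with labels marking each flagged column as a true or false positive, could then be fed to any standard supervised model; the training step itself is left out of the sketch.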

SYSTEMS AND METHODS FOR MATCHING ELECTRONIC ACTIVITIES WITH RECORD OBJECTS BASED ON ENTITY RELATIONSHIPS

The present disclosure relates to systems and methods for matching electronic activities with record objects based on entity relationships. The method can include accessing a plurality of electronic activities, identifying an electronic activity, identifying a first participant associated with a first entity and a second participant associated with a second entity, determining whether a record object identifier is included in the electronic activity, identifying a first record object of the system of record that includes an instance of the record object identifier, and storing an association between the electronic activity and the first record object. The method can include determining a second record object corresponding to the second entity, identifying, using a matching policy, a third record object linked to the second record object and identifying a third entity, and storing, by the one or more processors, an association between the electronic activity and the third record object.

ELECTRONIC APPARATUS THAT CAUSES DISPLAY DEVICE TO DISPLAY INFORMATION CORRESPONDING TO KEYWORD AND INTERROGATIVE IN INPUTTED CHARACTER STRING, AND IMAGE FORMING APPARATUS

An electronic apparatus includes a display device, an operation device, and a control device. The control device acts as a controller. The controller causes the display device, when a character string inputted via the operation device contains a keyword and a first interrogative for questioning a method, to display procedure information indicating a procedure related to the setting item corresponding to the keyword. The controller causes the display device, when the character string contains the keyword and a second interrogative for questioning a location, to display location information indicating a location of a setup screen related to the setting item corresponding to the keyword. The controller causes the display device, when the character string contains the keyword and a third interrogative for questioning what a subject is, to display a current set value of the setting item corresponding to the keyword.

Regular expression generation using span highlighting alignment

Techniques for generating regular expressions are disclosed. In some embodiments, a regular expression generator may receive input data comprising one or more character sequences. The regular expression generator may convert the character sequences into sets of regular expression codes and/or span data structures. The regular expression generator may identify a longest common subsequence shared by the sets of regular expression codes and/or spans, and may generate a regular expression based upon the longest common subsequence. Alignment of span data structures may be performed when generating the regular expression.
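A minimal sketch of the longest-common-subsequence idea, under stated assumptions: the code alphabet (`D` for a run of ASCII digits, `L` for a run of ASCII letters, any other character as a literal) and the names `to_codes`, `lcs`, and `generate_regex` are hypothetical, and span alignment is not modeled.

```python
import re

# Hypothetical code alphabet: each code maps to a regular expression fragment.
FRAGMENTS = {"D": r"\d+", "L": r"[A-Za-z]+"}


def to_codes(text):
    """Convert a string to codes, collapsing runs of digits or letters."""
    codes = []
    for ch in text:
        if ch.isascii() and ch.isdigit():
            code = "D"
        elif ch.isascii() and ch.isalpha():
            code = "L"
        else:
            code = ch  # punctuation and other characters stay literal
        if not codes or codes[-1] != code or code not in FRAGMENTS:
            codes.append(code)
    return codes


def lcs(a, b):
    """Longest common subsequence of two lists via dynamic programming."""
    m, n = len(a), len(b)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = 1 + table[i + 1][j + 1]
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    out, i, j = [], 0, 0
    while i < m and j < n:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return out


def generate_regex(examples):
    """Build a pattern from the codes common to all examples."""
    common = to_codes(examples[0])
    for example in examples[1:]:
        common = lcs(common, to_codes(example))
    return "".join(FRAGMENTS.get(code, re.escape(code)) for code in common)
```

For the inputs `"2023-02-16"` and `"2023-01-10"` the common codes are digits, hyphen, digits, hyphen, digits, so the generated pattern also matches other dates of that shape. When the examples differ in structure, the subsequence drops codes and the resulting pattern may no longer match every input; handling such gaps is outside this sketch.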

AUTOMATED INTEROPERATIONAL TRACKING IN COMPUTING SYSTEMS
20230040862 · 2023-02-09

Techniques of automated interoperation tracking in computing systems are disclosed herein. One example technique includes tokenizing a first event log from a first software component and a second event log from a second software component by calculating frequencies of appearance corresponding to strings in the first and second event logs and selecting, as tokens, a first subset of the strings in the first event log and a second subset of the strings in the second event log individually having calculated frequencies of appearance above a preset frequency threshold. The example technique can also include generating an overall event log for a task executed by both the first and second software components by matching one of the strings in the first subset to another of the strings in the second subset.
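The frequency-based tokenization and matching can be sketched as follows. This is an illustration under assumptions, not the disclosed implementation: strings are taken to be whitespace-separated words, frequency is an absolute count, matching is exact equality, and the names `tokenize` and `merge_logs` are hypothetical.

```python
from collections import Counter


def tokenize(log_lines, threshold):
    """Return the strings whose count in the log exceeds the threshold."""
    counts = Counter(word for line in log_lines for word in line.split())
    return {word for word, count in counts.items() if count > threshold}


def merge_logs(first_log, second_log, threshold):
    """Match tokens across two logs and collect the lines that mention them."""
    shared = tokenize(first_log, threshold) & tokenize(second_log, threshold)
    # Keep every line, from either component, that contains a shared token.
    merged = [line for line in first_log + second_log if shared & set(line.split())]
    return shared, merged
```

With a threshold of 1, an identifier such as `task-42` that appears twice in each component's log becomes a token in both, and the lines that mention it form the combined log for that task.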

Tenant-isolated custom annotations for search within a public corpus

Annotations are customized for a tenant-specific search within a public corpus. In a non-limiting embodiment of the invention, a cartridge file is received by a semantic search application. The cartridge file includes a new attribute definition that is not available in an index of the semantic search application. The new attribute definition is incorporated within the index based on an approximation of one or more existing attributes in the index. One or more documents are retrieved from the public corpus based on a concept search using the incorporated new attribute definition and the one or more documents are annotated based on the incorporated new attribute definition. The annotated one or more documents are stored in a tenant-specific dataset separate from the public corpus.

Method for automatically collecting and matching of laboratory data

The present disclosure provides a method for automatically collecting and matching laboratory data, including: obtaining a creation time of experimental data, determining target experimental data corresponding to a target time in accordance with the creation time, segmenting the target experimental data into a plurality of data blocks, generating a data block index table, including at least one data block identifier, according to the data blocks, selecting a target matching mode from a plurality of predetermined matching modes according to the data block index table, obtaining the data block identifier upon determining that the target experimental data in a storage node is loaded, and extracting data content in the target experimental data corresponding to the data block identifier by the target matching mode. This method may greatly reduce the number of string matching operations and may reduce the complexity of the algorithm.

Vehicle scenario mining for machine learning models
11550851 · 2023-01-10

Provided are methods for vehicle scenario mining for machine learning methods, which can include determining a set of attributes associated with an untested scenario for which a machine learning model of an autonomous vehicle is to make planned movements. The method includes searching a scenario database for the untested scenario based on the set of attributes. The scenario database includes a plurality of datasets representative of data received from an autonomous vehicle sensor system in which the plurality of datasets is marked with at least one attribute of the set of attributes. The method further includes obtaining the untested scenario from the scenario database for inputting into the machine learning model for training the machine learning model. The machine learning model is configured to make the planned movements for the autonomous vehicle. Systems and computer program products are also provided.

Feature engineering pipeline generation for machine learning using decoupled dataset analysis and interpretation

Techniques for feature engineering pipeline generation for machine learning using decoupled dataset analysis and interpretation are described. A feature engineering engine obtains a dataset and utilizes a number of analyzers to generate data facts associated with the columnar values of the dataset. The data facts are consolidated together as a set of data statements that are used by multiple interpretation engines that implement different strategies for treating the data in order to generate feature engineering pipeline code.

ENGINE ARCHITECTURE FOR PROCESSING FINITE AUTOMATA

An engine architecture for processing finite automata includes a hyper non-deterministic automata (HNA) processor specialized for non-deterministic finite automata (NFA) processing. The HNA processor includes a plurality of super-clusters and an HNA scheduler. Each super-cluster includes a plurality of clusters. Each cluster of the plurality of clusters includes a plurality of HNA processing units (HPUs). A corresponding plurality of HPUs of a corresponding plurality of clusters of at least one selected super-cluster is available as a resource pool of HPUs to the HNA scheduler for assignment of at least one HNA instruction to enable acceleration of a match of at least one regular expression pattern in an input stream received from a network.
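For context, the kind of NFA matching that such hardware accelerates can be shown in software by tracking the set of active states per input symbol. The sketch below is a generic textbook simulation, not the HNA processor's design; the name `nfa_match` and the transition-table layout are assumptions.

```python
def nfa_match(transitions, start, accepting, text):
    """Simulate an NFA without epsilon moves on the whole input.

    transitions maps (state, symbol) to a set of next states.
    """
    active = {start}
    for symbol in text:
        # Follow every possible transition from every active state.
        active = {
            nxt
            for state in active
            for nxt in transitions.get((state, symbol), ())
        }
        if not active:
            return False  # no live path remains
    return bool(active & accepting)


# Hypothetical NFA for the pattern (a|ab)c: reading "a" activates two states.
EXAMPLE_NFA = {
    (0, "a"): {1, 2},
    (1, "c"): {4},
    (2, "b"): {3},
    (3, "c"): {4},
}
```

Calling `nfa_match(EXAMPLE_NFA, 0, {4}, "abc")` follows both branches after the first symbol and accepts once one of them reaches state 4. Each active state can be advanced independently of the others, which is why this workload lends itself to pools of parallel processing units.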