G06F16/2468

DATA STRUCTURE MANAGEMENT SYSTEM
20230043217 · 2023-02-09

A computing device generates a first token for first data content that is associated with a first relationship and a second relationship, and a second token for second data content that is associated with the first relationship and a third relationship, such that the first token and second token are generated based on a frequency of use of data values included in the first and the second data content. The computing device calculates a first similarity score of data values from third data content that is associated with the second relationship and a fourth relationship with data values from fourth data content that is associated with the third relationship and the fourth relationship in response to the first and second token matching. The computing device then performs, in response to the first similarity score satisfying a similarity threshold, a first modification to any of the data content.
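One way to picture the token-then-score flow is a blocking key followed by a similarity check. The sketch below is an illustration under loud assumptions, not the claimed method: it treats the "token" as the rarest word in a record's name (rarity measured by frequency of use across all names), and scores a second field (an address) with Jaccard similarity only when two tokens match. The names `rarest_word_token`, `jaccard`, and `find_merge_candidates`, and the name/address record layout, are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def rarest_word_token(text: str, word_freq: Counter) -> str:
    """Pick the least frequently used word as a token (ties broken alphabetically)."""
    words = text.lower().split()
    return min(words, key=lambda w: (word_freq[w], w), default="")


def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity between two strings."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def find_merge_candidates(
    records: List[Tuple[str, str]], threshold: float = 0.5
) -> List[Tuple[int, int]]:
    """Return index pairs whose name tokens match and whose addresses are similar."""
    # Frequency of use of each word across all names (assumed token basis).
    word_freq = Counter(w for name, _ in records for w in name.lower().split())
    by_token: Dict[str, List[int]] = {}
    for idx, (name, _) in enumerate(records):
        by_token.setdefault(rarest_word_token(name, word_freq), []).append(idx)
    pairs = []
    for indices in by_token.values():
        for i in range(len(indices)):
            for j in range(i + 1, len(indices)):
                a, b = indices[i], indices[j]
                # Score the second field only for records whose tokens matched.
                if jaccard(records[a][1], records[b][1]) >= threshold:
                    pairs.append((a, b))
    return pairs
```

A pair returned here would be a candidate for the "modification" step, for example merging the two records; what that modification is remains outside this sketch.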

Multi-stage adaptable continuous learning / feedback system for machine learning models

Data is received that specifies a term generated by user input in a graphical user interface. Thereafter, the term is looked up in a dictionary in which there are multiple classes for terms. The term can be classified based on a first class having a top ranked effective count for the term within the dictionary when a ratio of the first class relative to a second class having a second ranked effective count for the term in the dictionary is above a pre-defined threshold. In addition, the term is classified using a machine learning model when the ratio of the first class relative to the second class is below the pre-defined threshold. Data can be provided which characterizes the classifying. Related apparatus, systems, techniques and articles are also described.
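The two-stage decision described above can be sketched as follows. This is a minimal illustration, assuming the dictionary maps each term to per-class effective counts and that the machine learning model is any callable returning a class label; the function name `classify_term`, the data layout, and the handling of unknown terms and exact ties are assumptions, since the abstract does not specify them.

```python
from typing import Callable, Dict


def classify_term(
    term: str,
    dictionary: Dict[str, Dict[str, float]],
    ratio_threshold: float,
    model: Callable[[str], str],
) -> str:
    """Use the dictionary when the top class clearly dominates, else the model."""
    counts = dictionary.get(term, {})
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked:
        return model(term)  # assumption: unknown terms go to the model
    top_class, top_count = ranked[0]
    if len(ranked) == 1:
        return top_class  # assumption: no competing class, so trust the dictionary
    second_count = ranked[1][1]
    # Ratio of top-ranked to second-ranked effective count.
    if second_count == 0 or top_count / second_count > ratio_threshold:
        return top_class
    return model(term)  # ratio at or below the threshold: defer to the model
```

For example, with counts `{"fruit": 90, "company": 10}` and a threshold of 3, the ratio is 9 and the dictionary class wins; with `{"animal": 55, "car": 45}` the ratio is about 1.2 and the model decides.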

SYSTEMS AND METHODS FOR A DATA SEARCH ENGINE BASED ON DATA PROFILES

Systems and methods for searching data are disclosed. For example, the system may include one or more memory units storing instructions and one or more processors configured to execute the instructions to perform operations. The operations may include receiving a sample dataset and identifying a data schema of the sample dataset. The operations may include generating a sample data vector that includes statistical metrics of the sample dataset and information based on the data schema of the sample dataset. The operations may include searching a data index comprising a plurality of stored data vectors corresponding to a plurality of reference datasets. The stored data vectors may include statistical metrics of the reference datasets and information based on corresponding data schema. The operations may include generating, based on the search and the sample data vector, one or more similarity metrics of the sample dataset to individual ones of the reference datasets.
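A rough sketch of profile-based search follows. It assumes a dataset is a mapping from column name to numeric values, that the profile vector holds the column count (a stand-in for schema information) plus the overall mean and standard deviation, and that cosine similarity serves as the similarity metric. None of these choices come from the abstract; `profile_vector`, `cosine_similarity`, and `search_index` are hypothetical names.

```python
import math
import statistics
from typing import Dict, List, Tuple

Dataset = Dict[str, List[float]]


def profile_vector(dataset: Dataset) -> List[float]:
    """Summarize a dataset as [column count, overall mean, overall std dev]."""
    values = [float(v) for column in dataset.values() for v in column]
    return [float(len(dataset)), statistics.mean(values), statistics.pstdev(values)]


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_index(
    sample: Dataset, index: Dict[str, List[float]]
) -> List[Tuple[str, float]]:
    """Rank stored profile vectors by similarity to the sample's profile."""
    sample_vec = profile_vector(sample)
    scores = {name: cosine_similarity(sample_vec, vec) for name, vec in index.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

In practice the features would need scaling so that one large-valued statistic does not dominate the score; that step is omitted here for brevity.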

FUZZY LOGIC MODELING FOR DETECTION AND PRESENTMENT OF ANOMALOUS MESSAGING
20230239322 · 2023-07-27

Disclosed is an approach that applies a fuzzy logic model that may involve fuzzy-matching a plurality of address fields to determine a common physical address, and determining a number of communiques directed to that address with reference to a threshold that may determine an excessive number of communiques. The plurality of address fields may also be fuzzy-matched to information in a fraud-risk database which may comprise a fraud-risk address. One or more matches may be presented to a user who may adjust the views of the various matches, track various trends within the data, and harmonize the various address fields relating to a physical address.
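The address-grouping and threshold check might look like the sketch below. It swaps in Python's standard `difflib.SequenceMatcher` as the fuzzy matcher, which is an assumption rather than the disclosed fuzzy logic model; the normalization rules, the 0.85 ratio, and the function names are likewise illustrative.

```python
import re
from difflib import SequenceMatcher
from typing import List, Tuple


def normalize(address: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]", " ", address.lower())
    return " ".join(text.split())


def group_addresses(
    addresses: List[str], min_ratio: float = 0.85
) -> List[Tuple[str, List[str]]]:
    """Group raw address strings that fuzzy-match a group's representative."""
    groups: List[Tuple[str, List[str]]] = []
    for raw in addresses:
        norm = normalize(raw)
        for representative, members in groups:
            if SequenceMatcher(None, norm, representative).ratio() >= min_ratio:
                members.append(raw)
                break
        else:
            # No existing group is close enough: start a new one.
            groups.append((norm, [raw]))
    return groups


def flag_excessive(
    addresses: List[str], max_messages: int, min_ratio: float = 0.85
) -> List[str]:
    """Return representatives of groups that received more than max_messages."""
    return [
        representative
        for representative, members in group_addresses(addresses, min_ratio)
        if len(members) > max_messages
    ]
```

The same `group_addresses` helper could be pointed at a list of known fraud-risk addresses to find matches, though that lookup is not shown.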

Adaptive data retrieval with runtime authorization

Methods and systems are disclosed for data retrieval, from databases to clients, in an environment requiring runtime authorization. In response to a request for T data records, a learning module provides a prediction R of a suitable number of data records to retrieve from a database. Following retrieval of R records or record identifiers, authorization is sought from an authorization service, resulting in A of the records being authorized. The A authorized records are returned to the requesting client, and, if more records are needed, T is decremented and the cycle is repeated. A performance notification is provided to the learning module for training, with respect to providing values of prediction R. The performance notification can be based on a measure of authorization service performance, the number A of authorized records, latency, communication or resource costs, a measure of resource congestion, or other parameters. Variants are disclosed.
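The retrieve, authorize, and repeat cycle can be sketched as a loop. This is a simplified illustration: the "learning module" is reduced to a running estimate of the authorization rate, the database is a list, and the authorization service is a predicate. The class `RatePredictor`, the function `fetch_authorized`, and the update rule are assumptions, not the disclosed design.

```python
import math
from typing import Callable, List


class RatePredictor:
    """Hypothetical learning module: tracks an estimated authorization rate."""

    def __init__(self, initial_rate: float = 1.0, step: float = 0.5) -> None:
        self.rate = initial_rate
        self.step = step

    def predict(self, needed: int) -> int:
        """Prediction R: how many records to retrieve to obtain `needed` authorized ones."""
        return max(1, math.ceil(needed / max(self.rate, 0.01)))

    def notify(self, retrieved: int, authorized: int) -> None:
        """Performance notification: move the estimate toward the observed rate."""
        if retrieved:
            self.rate += self.step * (authorized / retrieved - self.rate)


def fetch_authorized(
    records: List[int],
    is_authorized: Callable[[int], bool],
    target: int,
    predictor: RatePredictor,
) -> List[int]:
    """Return up to `target` authorized records, retrieving in predicted batches."""
    results: List[int] = []
    cursor = 0
    while len(results) < target and cursor < len(records):
        needed = target - len(results)  # the remaining T
        batch = records[cursor:cursor + predictor.predict(needed)]
        cursor += len(batch)
        allowed = [rec for rec in batch if is_authorized(rec)]  # the A records
        predictor.notify(len(batch), len(allowed))
        results.extend(allowed[:needed])
    return results
```

Here the notification uses only the count A of authorized records; the abstract also lists latency, cost, and congestion as possible inputs, which a fuller predictor could take into account.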

Method and system for associating a license plate number with a user
11709828 · 2023-07-25

Methods and systems for determining at least one candidate user for a license plate number (LPN) are described herein. A set of license plate recognition (LPR) events that correspond to the LPN and a set of access events of a plurality of users may be obtained. One or more associated events for each respective user of the plurality of users may be determined. A confidence score for each respective user may be determined based on the one or more associated events for the respective user. At least one candidate user for the LPN may be identified based on the confidence score. An indication that the at least one candidate user is a candidate for the LPN is output. Methods and systems for determining at least one candidate license plate number for a user are also described herein. Additional related methods and systems are described herein.
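As an illustration of the scoring step, the sketch below assumes that an access event is "associated" with an LPR event when the two timestamps fall within a fixed window, and that a user's confidence score is the fraction of LPR events with at least one associated access event. The window, the score definition, and the name `candidate_users` are assumptions made for the example.

```python
from typing import Dict, List, Tuple


def candidate_users(
    lpr_times: List[float],
    access_events: Dict[str, List[float]],
    window: float = 60.0,
    min_confidence: float = 0.5,
) -> List[Tuple[str, float]]:
    """Rank users by how often their access events coincide with LPR events."""
    scored = []
    for user, times in access_events.items():
        # Count LPR events that have at least one access event within the window.
        matched = sum(
            1 for t in lpr_times if any(abs(t - a) <= window for a in times)
        )
        confidence = matched / len(lpr_times) if lpr_times else 0.0
        if confidence >= min_confidence:
            scored.append((user, confidence))
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Swapping the roles of the inputs (one user's access events against LPR events for many plates) gives the reverse lookup of candidate license plate numbers for a user.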

Method for improving maintenance of complex systems

A computer-implemented method of improving maintenance of a complex system, the complex system having a plurality of components, the method involving: preparing data across a plurality of data streams from a plurality of data sources; generating a matrix representation of the data; calculating a time proximity of the data; calculating a plurality of corresponding cell values of the matrix representation; matching event information across the plurality of data streams from a plurality of data sources, the plurality of data sources corresponding to the plurality of components, wherein at least one data stream of the plurality of data streams has at least one of low fidelity data and imprecise event generation information; and scoring the imprecise event generation information across the plurality of data streams, thereby providing a score indicating a match quality of the imprecise event generation information.
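The matrix-and-score idea can be sketched for two event streams. The example assumes events are plain timestamps, that each matrix cell holds a linear time-proximity score that falls to zero at a chosen window, and that each event in the first stream is matched to its best-scoring counterpart. The scoring function and the names `match_quality` and `match_events` are hypothetical.

```python
from typing import List, Tuple


def match_quality(t_a: float, t_b: float, window: float) -> float:
    """Score in [0, 1]: 1 for identical times, 0 at or beyond the window."""
    return max(0.0, 1.0 - abs(t_a - t_b) / window)


def match_events(
    stream_a: List[float], stream_b: List[float], window: float
) -> Tuple[List[List[float]], List[Tuple[int, int, float]]]:
    """Build the proximity matrix and pick the best match for each event in stream_a."""
    matrix = [[match_quality(a, b, window) for b in stream_b] for a in stream_a]
    matches = []
    for i, row in enumerate(matrix):
        if not row:
            continue
        j = max(range(len(row)), key=row.__getitem__)
        if row[j] > 0.0:
            matches.append((i, j, row[j]))  # (index in a, index in b, score)
    return matrix, matches
```

A low score flags a weak match between imprecisely timed events, which is where a maintainer might want to review the pairing by hand.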

Method for providing data associated with original data and electronic device and storage medium for the same

According to an embodiment, an electronic device comprises at least one processor and a memory that stores instructions configured to cause the at least one processor to obtain first data associated with original data based on a random number using a first program, obtain first similarity information between the original data and the first data, obtain second data associated with the original data based on the random number using a second program, obtain second similarity information between the original data and the second data, and, in response to receiving a request, provide the first program or the second program based on information included in the request that corresponds to a range that includes at least one of the first similarity information or the second similarity information.

Similarity sharding
11704342 · 2023-07-18

Computer-implemented systems and methods for efficiently searching large data volumes for one or more items with a definable degree of similarity. The systems and methods may include functionality directed to selecting at least one token from the one or more tokens in a target item, the token including an identifiable character string defining, fully or partially, at least one of a name, an address, an entity or other identifier associated with the target item; extracting a character from the identifiable character string after the character string is standardized to a known common version of the character string; responsive to a character distribution lookup, determining that the extracted character corresponds to a first shard from among a plurality of discrete shards; and grouping the item into the first shard, the character distribution lookup being adjustable over time to provide for a balanced distribution of items across the plurality of discrete shards.
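A compact sketch of the sharding steps follows. It assumes standardization means uppercasing plus a synonym table that maps variants to a common version, that the extracted character is the first character of the standardized token, and that rebalancing greedily assigns the most frequent characters to the least loaded shard. These are illustrative choices, and `standardize`, `shard_for`, and `rebalance` are hypothetical names.

```python
from collections import Counter
from typing import Dict, List


def standardize(token: str, synonyms: Dict[str, str]) -> str:
    """Map a token to its known common version (for example BILL to WILLIAM)."""
    cleaned = " ".join(token.upper().split())
    return synonyms.get(cleaned, cleaned)


def shard_for(
    token: str,
    synonyms: Dict[str, str],
    char_to_shard: Dict[str, int],
    default_shard: int = 0,
) -> int:
    """Look up the shard for the first character of the standardized token."""
    standardized = standardize(token, synonyms)
    if not standardized:
        return default_shard
    return char_to_shard.get(standardized[0], default_shard)


def rebalance(
    tokens: List[str], synonyms: Dict[str, str], shard_count: int
) -> Dict[str, int]:
    """Rebuild the character lookup so shard loads stay roughly even."""
    standardized = [standardize(t, synonyms) for t in tokens]
    freq = Counter(s[0] for s in standardized if s)
    loads = [0] * shard_count
    lookup: Dict[str, int] = {}
    for char, count in freq.most_common():
        target = loads.index(min(loads))  # least loaded shard so far
        lookup[char] = target
        loads[target] += count
    return lookup
```

Because variants are standardized before the character is extracted, "Bill" and "William" land in the same shard, so a similarity search for either needs to scan only that shard.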

FUZZY LOGIC MODELING FOR DETECTION AND PRESENTMENT OF ANOMALOUS MESSAGING
20230231876 · 2023-07-20

Disclosed is an approach that applies a fuzzy logic model that may involve fuzzy-matching a plurality of address fields to determine a common physical address, and determining a number of communiques directed to that address with reference to a threshold that may determine an excessive number of communiques. The plurality of address fields may also be fuzzy-matched to information in a fraud-risk database which may comprise a fraud-risk address. One or more matches may be presented to a user who may adjust the views of the various matches, track various trends within the data, and harmonize the various address fields relating to a physical address.