G06F16/00

Data mapper tool

An apparatus includes a processor. The processor extracts a column from an external source for import into a database configured to store a set of columns including a first and second column. The processor splits the entries of the import column into a set of terms. The processor generates a first, second, and third vector based on the frequency of each term of the set of terms in the first, second, and import columns, respectively. The processor determines a first similarity measure between the first and third vectors and a second similarity measure between the second and third vectors. The first similarity measure is greater than the second. In response, the processor provides an indication to a user that the first column is a mapping candidate for the import column, such that entries of the import column may be stored in the database as additional entries in the first column.
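
The column-matching idea in this abstract can be illustrated with a minimal sketch. The abstract does not name a tokenizer or a similarity measure, so the regex split and cosine similarity below are assumptions, and all function and column names are hypothetical.

```python
import math
import re
from collections import Counter


def split_terms(entries):
    """Split column entries into lowercase alphanumeric terms (assumed tokenizer)."""
    terms = []
    for entry in entries:
        terms.extend(re.findall(r"[a-z0-9]+", str(entry).lower()))
    return terms


def frequency_vector(vocabulary, entries):
    """Frequency of each vocabulary term within one column's entries."""
    counts = Counter(split_terms(entries))
    return [counts[term] for term in vocabulary]


def cosine(u, v):
    """Cosine similarity (an assumed choice of similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_columns(import_entries, stored_columns):
    """Score every stored column against the import column; return the best name and all scores."""
    # The vocabulary is the set of terms obtained by splitting the import column.
    vocabulary = sorted(set(split_terms(import_entries)))
    import_vector = frequency_vector(vocabulary, import_entries)
    scores = {
        name: cosine(frequency_vector(vocabulary, entries), import_vector)
        for name, entries in stored_columns.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical data: two stored columns and one column to import.
stored_columns = {
    "city": ["new york", "san francisco", "new orleans"],
    "product": ["laptop stand", "usb cable", "laptop bag"],
}
import_column = ["new york", "san diego", "york"]
candidate, scores = rank_columns(import_column, stored_columns)
```

Here `candidate` names the stored column whose term frequencies are closest to those of the import column, which is the column that would be offered to the user as the mapping candidate.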

Focused probabilistic entity resolution from multiple data sources

Various systems and methods are provided for performing soft entity resolution. A plurality of data objects are retrieved from a plurality of data stores to create aggregated data objects for one or more entities. One or more retrieved data objects may be associated with the same entity, based at least in part upon one or more attribute types and attribute values of the data objects. In response to a determination that one or more of the retrieved data objects should be associated with the same entity, metadata is generated that associates the data objects with the entity, the metadata being stored separately from the data objects, such that the underlying data objects remain unchanged. In addition, one or more additional attributes may be determined for the entity, based upon the data objects associated with the entity.
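
A minimal sketch of keeping the linkage metadata separate from the data objects follows. It assumes exact matching on a few key attributes as a stand-in for the probabilistic matching the title refers to; the object ids, attribute names, and function names are hypothetical.

```python
import copy
from collections import defaultdict


def link_objects(data_objects, key_attributes):
    """Return linkage metadata {entity id: [object ids]} without modifying the objects."""
    parent = {oid: oid for oid in data_objects}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # (attribute type, attribute value) -> first object id carrying it
    for oid, attrs in data_objects.items():
        for attr in key_attributes:
            value = attrs.get(attr)
            if value is None:
                continue
            key = (attr, value)
            if key in first_seen:
                parent[find(oid)] = find(first_seen[key])
            else:
                first_seen[key] = oid

    groups = defaultdict(list)
    for oid in data_objects:
        groups[find(oid)].append(oid)
    ordered = sorted(sorted(members) for members in groups.values())
    return {f"entity-{i}": members for i, members in enumerate(ordered)}


def merged_attributes(member_ids, data_objects):
    """Derive entity-level attributes from the linked objects (first value seen wins)."""
    profile = {}
    for oid in member_ids:
        for attr, value in data_objects[oid].items():
            profile.setdefault(attr, value)
    return profile


# Hypothetical objects retrieved from three data stores.
data_objects = {
    "crm:1": {"name": "Ann Lee", "email": "ann@example.com"},
    "billing:7": {"email": "ann@example.com", "phone": "555-0100"},
    "support:3": {"name": "Bob Roy", "phone": "555-0199"},
}
snapshot = copy.deepcopy(data_objects)
links = link_objects(data_objects, ["email", "phone"])
profile = merged_attributes(links["entity-0"], data_objects)
```

Because `links` is a separate structure, discarding or recomputing it leaves the retrieved objects exactly as they were.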

Personalizing explainable recommendations with bandits

Methods, systems and computer program products are provided for personalizing recommendations of items with associated explanations. The example embodiments described herein use contextual bandits to personalize explainable recommendations (“recsplanations”) as treatments (“Bart”). Bart learns and predicts satisfaction (e.g., click-through rate, consumption probability) for any combination of item, explanation, and context and, through logging and contextual bandit retraining, can learn from its mistakes in an online setting.
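
The loop of predicting satisfaction, logging feedback, and retraining can be sketched with a very simple bandit. The abstract does not say which bandit algorithm or reward model Bart uses, so the tabular epsilon-greedy policy below is a stand-in, and the class name, contexts, and actions are hypothetical.

```python
import random
from collections import defaultdict


class EpsilonGreedyRecsplainer:
    """Tabular epsilon-greedy bandit over (item, explanation) actions per context."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)
        self.means = defaultdict(float)

    def predict(self, context, action):
        """Predicted satisfaction: the mean logged reward for this context and action."""
        return self.means[(context, action)]

    def choose(self, context):
        """Explore with probability epsilon, otherwise pick the best predicted action."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.means[(context, a)])

    def update(self, context, action, reward):
        """Retrain from one logged interaction using an incremental mean."""
        key = (context, action)
        self.counts[key] += 1
        self.means[key] += (reward - self.means[key]) / self.counts[key]


# Hypothetical actions: each pairs an item with an explanation.
actions = [
    ("jazz_playlist", "because you like jazz"),
    ("jazz_playlist", "popular now"),
    ("rock_playlist", "popular now"),
]
bandit = EpsilonGreedyRecsplainer(actions, epsilon=0.0)
logged_feedback = [
    ("morning", actions[0], 1.0),
    ("morning", actions[0], 0.0),
    ("morning", actions[1], 1.0),
    ("evening", actions[2], 1.0),
]
for context, action, reward in logged_feedback:
    bandit.update(context, action, reward)
morning_choice = bandit.choose("morning")
evening_choice = bandit.choose("evening")
```

Setting `epsilon` above zero would make the policy occasionally try other item and explanation pairs, which is how a bandit of this kind gathers the feedback it needs to correct earlier mistakes.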

Cost-based query optimization for array fields in database systems

A document-oriented database system generates an optimal query execution plan for database queries on an untyped data field included in a collection of documents. The system generates histograms for multiple types of data stored by the untyped data field and uses the histograms to assign costs to operators usable to execute the database query. The system generates the optimal query execution plan by selecting operators based on the assigned costs. In various embodiments, the untyped data field stores scalars, arrays, and objects.
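
As a rough illustration of per-type histograms feeding operator costs, the sketch below counts values separately for scalars, arrays, and objects and then compares two operators for an equality predicate. The cost constants, operator names, and the rule that an equality predicate matches a scalar or any array element are assumptions, not details from the abstract.

```python
from collections import Counter, defaultdict

COLLECTION_SCAN_COST_PER_DOC = 1.0  # hypothetical cost of reading one document sequentially
INDEX_FETCH_COST_PER_MATCH = 4.0    # hypothetical cost of one index lookup plus fetch


def build_histograms(documents, field):
    """Build one value histogram per data type stored in an untyped field."""
    histograms = defaultdict(Counter)
    for doc in documents:
        if field not in doc:
            continue
        value = doc[field]
        if isinstance(value, list):
            # Count each distinct element once per document (elements assumed hashable).
            for element in set(value):
                histograms["array"][element] += 1
        elif isinstance(value, dict):
            for key in value:
                histograms["object"][key] += 1
        else:
            histograms["scalar"][value] += 1
    return histograms


def estimate_matches(histograms, value):
    """Estimate documents where the field equals the value or contains it as an array element."""
    return histograms["scalar"][value] + histograms["array"][value]


def choose_operator(histograms, value, total_documents):
    """Assign a cost to each candidate operator and pick the cheapest."""
    costs = {
        "collection_scan": total_documents * COLLECTION_SCAN_COST_PER_DOC,
        "index_scan": estimate_matches(histograms, value) * INDEX_FETCH_COST_PER_MATCH,
    }
    return min(costs, key=costs.get), costs


# Hypothetical collection whose "tags" field holds scalars, arrays, and objects.
documents = [
    {"tags": "red"},
    {"tags": ["red", "blue"]},
    {"tags": ["blue"]},
    {"tags": {"primary": "green"}},
    {"tags": "blue"},
    {"tags": ["green", "blue"]},
    {"tags": "blue"},
    {"tags": ["blue", "red"]},
]
histograms = build_histograms(documents, "tags")
rare_plan, rare_costs = choose_operator(histograms, "green", len(documents))
common_plan, common_costs = choose_operator(histograms, "blue", len(documents))
```

A rare value favors the index operator, while a value present in most documents makes the sequential scan cheaper under these assumed constants.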

Recommendation method and apparatus, and storage medium

A recommendation method is provided. In the method, a candidate item to be recommended to a social network user is obtained. The social network user has at least two different types of social relationships. For at least one target social object in each of the at least two different types of social relationships of the social network user, attention of each of the at least one target social object in the respective type of social relationship to the candidate item is determined. According to the attention of each of the at least one target social object in the at least two different types of social relationships to the candidate item, a comprehensive attention of the target social objects of the at least two different types of social relationships to the candidate item is determined. According to the comprehensive attention, whether to recommend the candidate item to the social network user is determined.
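
One way to picture the combination step is a weighted sum of per-relationship averages compared against a threshold. The abstract does not specify how attention is measured or combined, so the averaging, the weights, the threshold, and all names below are assumptions.

```python
def comprehensive_attention(attention_by_type, type_weights):
    """Combine per-relationship attention into one score.

    attention_by_type maps a relationship type to the attention scores (0..1)
    of the target social objects in that relationship. Each type contributes
    its mean attention multiplied by an assumed weight.
    """
    total = 0.0
    for relationship, scores in attention_by_type.items():
        if not scores:
            continue
        total += type_weights[relationship] * (sum(scores) / len(scores))
    return total


def should_recommend(attention_by_type, type_weights, threshold=0.5):
    """Recommend the candidate item when the combined attention reaches the threshold."""
    return comprehensive_attention(attention_by_type, type_weights) >= threshold


# Hypothetical attention of a user's friends and coworkers to one candidate item.
attention = {"friend": [0.9, 0.5], "coworker": [0.2]}
weights = {"friend": 0.7, "coworker": 0.3}
score = comprehensive_attention(attention, weights)
decision = should_recommend(attention, weights)
```

With these numbers the friends' mean attention of 0.7 and the coworker's 0.2 combine to 0.55, so the item would be recommended at a threshold of 0.5.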

Securing computing resources through multi-dimensional enchainment of mediated entity relationships
11710052 · 2023-07-25

Synthesizing a control object for a computing event, the control object for securing a computing resource based on a set of access and privilege information provided through a set of mediated associations that are represented by an enchained set of certificates, portions of which are encrypted including entity-specific paths to entity-specific predecessor certificates and partial decryption keys therefor, wherein the control object is applied to secure the computing resource for performing a computing action indicated by a process-type entity identified in the certificate for the control object.

SYSTEM AND METHOD FOR OPTIMIZING WEBSITE CREATION TOOLS BASED ON USER CONTEXT
20180011608 · 2018-01-11

A system and method for optimizing website creation tools based on user context. The method includes accessing user information from a user device, wherein the user information includes a direct user input; analyzing the user information to determine a user context; and generating a website development dashboard for display, wherein the website development dashboard includes at least one website development tool selected based on the determined user context.

Systems and methods for targeted annotation of data

There is provided a system and a method of generating an annotated structured dataset, comprising: receiving a medical classification term, searching over unstructured patient data to extract unclassified unstructured text fragments, presenting a subset of the unclassified unstructured text fragments, receiving an indication of a selection of none or at least one of the text fragments, and one of: (i) classifying non-selected unclassified unstructured text fragments according to the medical classification term, and classifying selected text fragments as not satisfying the medical classification term, and (ii) classifying selected unclassified unstructured text fragments according to the medical classification term, and classifying non-selected unclassified unstructured text fragments as not satisfying the medical classification term, and iterating the searching, and/or the presenting, until no text fragments are obtained by the search, wherein the annotated structured dataset is created by the classification of unclassified unstructured text fragments into the medical classification term.

System and method for detecting potential matches between a candidate biometric and a dataset of biometrics
11710297 · 2023-07-25

A system and method for detecting a potential match between a candidate facial image and a dataset of facial images is described. Some implementations of the invention determine whether a candidate facial image (or multiple facial images) of a person taken, for example, at a point of entry corresponds to one or more facial images stored in a dataset of persons of interest (e.g., suspects, criminals, terrorists, employees, VIPs, “whales,” etc.). Some implementations of the invention detect potential fraud in a dataset of facial images. In a first form of potential fraud, the same facial image is associated with multiple identities. In a second form of potential fraud, different facial images are associated with a single identity, as in the case, for example, of identity theft. According to various implementations of the invention, spectral clustering techniques are used to determine a likelihood that pairs of facial images (or pairs of facial image sets) correspond to the same person or to different persons.

Using sparse merkle trees for smart synchronization of S3
11711204 · 2023-07-25

One example method, which may be performed in connection with an object store, includes receiving a key of a key-value pair, correlating the key to a location in a base of a Merkle tree, inserting the key at the location, hashing the value associated with the key to produce a data hash, and inserting the data hash in the Merkle tree. The Merkle tree may then be checked for consistency, and synchronized with another Merkle tree. The Merkle tree may be of a fixed size, and insertion of the key in the Merkle tree does not change the location of any keys existing in the Merkle tree prior to insertion of the new key.
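
The fixed-size property described here can be sketched as follows. This is a simplified illustration rather than the patented method: it assumes SHA-256, a tree of 2^8 leaves, a key-to-leaf mapping taken from the key's hash, and leaves that hold only the data hash. The class and method names are hypothetical.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


EMPTY_LEAF = sha256(b"")  # placeholder hash for leaves that hold no key


class SparseMerkleTree:
    """Fixed-size Merkle tree whose leaf position depends only on the key."""

    def __init__(self, depth=8):
        self.depth = depth
        self.leaves = {}  # leaf index -> data hash; absent indices are empty

    def location(self, key: str) -> int:
        """Correlate a key to a leaf index in the base of the tree.

        Two keys can collide on one index in a tree this small; a real design
        would need a much larger base or a collision rule.
        """
        return int.from_bytes(sha256(key.encode()), "big") % (1 << self.depth)

    def insert(self, key: str, value: bytes) -> int:
        """Hash the value and store the data hash at the key's fixed location."""
        index = self.location(key)
        self.leaves[index] = sha256(value)
        return index

    def root(self) -> bytes:
        """Recompute the root by hashing pairs of nodes up from the full base level."""
        level = [self.leaves.get(i, EMPTY_LEAF) for i in range(1 << self.depth)]
        while len(level) > 1:
            level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        return level[0]

    def diff(self, other: "SparseMerkleTree") -> list:
        """Leaf indices whose data hashes differ between two trees."""
        indices = set(self.leaves) | set(other.leaves)
        return sorted(i for i in indices if self.leaves.get(i) != other.leaves.get(i))


# Hypothetical object-store keys mirrored in two trees.
local, remote = SparseMerkleTree(), SparseMerkleTree()
local.insert("photos/cat.jpg", b"v1")
remote.insert("photos/cat.jpg", b"v1")
roots_match_before = local.root() == remote.root()
new_index = local.insert("docs/readme.txt", b"hello")
roots_match_after = local.root() == remote.root()
out_of_sync = local.diff(remote)
```

Because a key's location is a pure function of the key, inserting a new key never moves an existing one, and two trees with equal roots need no further comparison. A fuller implementation would descend only into subtrees whose hashes differ instead of comparing leaves directly.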