G06F16/24547

Support for Multi-Type Users in a Single-Type Computing System
20230050683 · 2023-02-16

Persistent storage contains a parent table and one or more child tables, the parent table containing: a class field specifying types, and one or more filter fields. One or more processors may: receive a first request to read first information of a first type for a first entity; determine that, in a first entry of the parent table for the first entity, the first type is specified in the class field; obtain the first information from a child table associated with the first type; receive a second request to read second information of a second type for a second entity; determine that, in a second entry of the parent table for the second entity, the second type is indicated as present by a filter field that is associated with the second type; and obtain the second information from a set of additional fields in the second entry.
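The two read paths in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table contents, the `employee` and `contractor` types, and the names `PARENT`, `CHILD_TABLES`, `FILTER_FIELDS`, and `read_info` are all hypothetical and do not come from the application.

```python
# Hypothetical in-memory stand-ins for the parent table and its child tables.
PARENT = {
    "e1": {"class": "employee", "is_contractor": False, "contract_end": None},
    "e2": {"class": "employee", "is_contractor": True, "contract_end": "2024-06-30"},
}
CHILD_TABLES = {
    "employee": {"e1": {"badge": "A-17"}, "e2": {"badge": "B-02"}},
}
# Assumed mapping: type -> (filter field that flags it, additional parent fields holding its data).
FILTER_FIELDS = {"contractor": ("is_contractor", ["contract_end"])}


def read_info(entity_id, wanted_type):
    """Return information of wanted_type for an entity, or None if it has none."""
    entry = PARENT[entity_id]
    # Path 1: the type is named in the class field, so read from the child table.
    if entry["class"] == wanted_type:
        return CHILD_TABLES[wanted_type][entity_id]
    # Path 2: the type is flagged by a filter field, so read the additional
    # fields stored directly in the parent entry.
    if wanted_type in FILTER_FIELDS:
        flag, extra_fields = FILTER_FIELDS[wanted_type]
        if entry.get(flag):
            return {name: entry[name] for name in extra_fields}
    return None
```

In this sketch, entity `e2` is an `employee` by class and a `contractor` by filter field, so one entry can serve both kinds of request.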

Compression, searching, and decompression of log messages
11593373 · 2023-02-28

Log messages are compressed, searched, and decompressed. A dictionary is used to store non-numeric expressions found in log messages. Both numeric and non-numeric expressions found in log messages are represented by placeholders in a string of log “type” information. Another dictionary is used to store the log type information. A compressed log message contains a key to the log-type dictionary and a sequence of values that are keys to the non-numeric dictionary and/or numeric values. Searching may be performed by parsing a search query into subqueries that target the dictionaries and/or content of the compressed log messages. A dictionary may reference segments that contain a number of log messages, so that not all log messages need to be considered for some searches.
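A minimal Python sketch of the two-dictionary scheme follows. It rests on several assumptions that are not in the abstract: tokens are split on single spaces, a token counts as a non-numeric variable when it contains a digit but is not a plain integer, and the control characters `\x11` and `\x12` serve as placeholders. The class and method names are hypothetical, and segments are not modeled.

```python
import re

NUM, VAR = "\x11", "\x12"                 # assumed placeholder characters
INTEGER = re.compile(r"0|-?[1-9]\d*")     # integers that survive an int/str round trip


class LogStore:
    def __init__(self):
        self.var_dict, self.var_index = [], {}    # non-numeric expressions
        self.type_dict, self.type_index = [], {}  # log type strings
        self.messages = []                        # (type key, values) pairs

    @staticmethod
    def _intern(value, items, index):
        """Return the dictionary key for value, adding it if it is new."""
        if value not in index:
            index[value] = len(items)
            items.append(value)
        return index[value]

    def compress(self, line):
        template, values = [], []
        for tok in line.split(" "):
            if INTEGER.fullmatch(tok):
                template.append(NUM)
                values.append(int(tok))           # numeric value stored directly
            elif any(ch.isdigit() for ch in tok):
                template.append(VAR)
                values.append(self._intern(tok, self.var_dict, self.var_index))
            else:
                template.append(tok)              # static text stays in the log type
        type_key = self._intern(" ".join(template), self.type_dict, self.type_index)
        self.messages.append((type_key, values))
        return len(self.messages) - 1

    def decompress(self, msg_id):
        type_key, values = self.messages[msg_id]
        remaining = iter(values)
        out = []
        for tok in self.type_dict[type_key].split(" "):
            if tok == NUM:
                out.append(str(next(remaining)))
            elif tok == VAR:
                out.append(self.var_dict[next(remaining)])
            else:
                out.append(tok)
        return " ".join(out)

    def search_var(self, value):
        """Find messages containing a non-numeric expression, dictionary first."""
        key = self.var_index.get(value)
        if key is None:
            return []                             # dictionary miss rules out every message
        hits = []
        for msg_id, (type_key, values) in enumerate(self.messages):
            slots = [t for t in self.type_dict[type_key].split(" ") if t in (NUM, VAR)]
            if any(s == VAR and v == key for s, v in zip(slots, values)):
                hits.append(msg_id)
        return hits
```

Two messages that differ only in their variable parts share one log type entry, which is where most of the space saving in this sketch comes from.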

Cost-based query optimization for array fields in database systems

A document-oriented database system generates an optimal query execution plan for database queries on an untyped data field included in a collection of documents. The system generates histograms for multiple types of data stored by the untyped data field and uses the histograms to assign costs to operators usable to execute the database query. The system generates the optimal query execution plan by selecting operators based on the assigned costs. In various embodiments, the untyped data field stores scalars, arrays, and objects.
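As a rough illustration, the Python sketch below keeps one histogram per value type for an untyped field and uses the estimated match count to pick the cheaper of two operators. The operator names, the per-document cost constants, and all function names are assumptions made for this example, and array elements are assumed to be hashable scalars.

```python
from collections import Counter, defaultdict


def type_name(value):
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    return "scalar"


def build_histograms(docs, field):
    """Build one frequency histogram per type stored in the untyped field."""
    hist = defaultdict(Counter)
    for doc in docs:
        if field not in doc:
            continue
        value = doc[field]
        kind = type_name(value)
        if kind == "scalar":
            hist[kind][value] += 1
        elif kind == "array":
            for element in set(value):    # count each document once per distinct element
                hist[kind][element] += 1
        else:
            for key in value:             # objects are summarized by their keys
                hist[kind][key] += 1
    return hist


def estimate_matches(hist, value):
    """Estimate documents whose field equals value or whose array contains it."""
    return hist["scalar"][value] + hist["array"][value]


def choose_operator(hist, value, total_docs, index_cost=4.0, scan_cost=1.0):
    """Assign a cost to each candidate operator and return the cheapest."""
    costs = {
        "index_scan": estimate_matches(hist, value) * index_cost,
        "full_scan": total_docs * scan_cost,
    }
    return min(costs, key=costs.get), costs
```

With a selective predicate and a large collection the index scan wins; on a tiny collection the full scan is cheaper under these assumed constants.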

System and method for user interactive contextual model classification based on metadata

A system and a method for contextual categorization of data comprise a server having a processor and a non-transitory computer-readable storage medium in electronic communication with the processor and comprising program instructions executable by the processor to access an initial inventory of a data set and metadata associated with that inventory. The system is then configured to use the metadata to classify the initial inventory into (a) a reduced set of data comprising high level sensitivity classification and (b) a remainder data set. The system and method can be further configured for contextual categorization of data that involves receiving an initial data set to be categorized; establishing a library of contextual classifiers, the library comprising (1) a set of predetermined high level sensitivity classifications and (2) a set of user-generated business-specific sensitivity classifications subordinated below the high level sensitivity classifications; identifying and removing redundant, outdated, trivial or abandoned (ROTA) data from the initial data set to create a reduced data set and a remainder data set of ROTA data; applying the user-generated business-specific sensitivity classifications to the reduced data set to create a first set of classified data and a second set of unclassified data; and iteratively applying additional user-generated business-specific sensitivity classifications to both the first set of classified data and the second set of unclassified data until all data in the reduced data set has been classified in exactly one user-generated business-specific sensitivity classification.

METADATA CLASSIFICATION
20230222142 · 2023-07-13

Systems and methods are disclosed that retrieve data from a data set organized in a plurality of columns. For each column in the plurality of columns, the systems and methods generate one or more candidate semantic categories for the column, where each of the one or more candidate semantic categories has a corresponding probability. The systems and methods create a feature vector for the column from the one or more candidate semantic categories and the corresponding probabilities. The systems and methods determine a semantic category type of the column based on the feature vector. The systems and methods anonymize the data in the column based on the semantic category type, which includes replacing more specific data in the column with less specific data based on a data hierarchy that relates the more specific data to the less specific data.
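The pipeline from candidate categories to anonymized values might look like the Python sketch below. The category list, the city-to-country hierarchy, and the threshold are invented for illustration, and picking the highest-probability entry is a simple stand-in: the abstract does not say how the category is determined from the feature vector.

```python
CATEGORIES = ["city", "email", "name"]   # assumed fixed category order
# Assumed data hierarchy: more specific value -> less specific value.
HIERARCHY = {"Paris": "France", "Lyon": "France", "Berlin": "Germany"}


def feature_vector(candidates):
    """Turn {category: probability} into a fixed-length vector."""
    return [candidates.get(category, 0.0) for category in CATEGORIES]


def decide_category(vector, threshold=0.5):
    """Stand-in decision rule: highest probability, if it clears the threshold."""
    best = max(range(len(vector)), key=vector.__getitem__)
    return CATEGORIES[best] if vector[best] >= threshold else None


def anonymize(column, category):
    """Generalize city values one level up the hierarchy; leave other columns as-is."""
    if category != "city":
        return list(column)
    return [HIERARCHY.get(value, "*") for value in column]
```

Values missing from the hierarchy are replaced with `*` here, which is one possible policy rather than anything the abstract specifies.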

Using machine learning to estimate query resource consumption in MPPDB

Methods and apparatus are provided for using machine learning to estimate query resource consumption in a massively parallel processing database (MPPDB). In various embodiments, the machine learning may jointly perform query resource consumption estimation and resource extreme event detection for a query, utilize an adaptive kernel that is configured to learn the optimal similarity relation metric for data from each system setting, and utilize multi-level stacking technology configured to leverage outputs of diverse base classifier models. Advantages and benefits of the disclosed embodiments include providing faster and more reliable system performance and avoiding resource issues such as out of memory (OOM) occurrences.

DATA ACCESS CONTROL METHOD, DATA ACCESS CONTROL APPARATUS, AND DATA ACCESS CONTROL PROGRAM
20220405377 · 2022-12-22

A policy determination unit acquires a rule for a request for accessing data based on a preset access control policy, and selects whether to acquire attribute information about an attribute of each record of the data from outside the database in which the data is stored. When acquisition is selected, the attribute information is acquired and the rule is evaluated based on it; when acquisition is not selected, the database is caused to execute filtering of the data based on the rule. Then, based on the evaluation result of the rule or the filtering execution result, a record of the data corresponding to the access request is acquired from the database.
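The choice between the two branches can be sketched with Python's built-in `sqlite3` module. The `records` table, its columns, the rule dictionary layout, and the function names are all hypothetical; a rule here is just an equality test on one attribute, which is far simpler than a real access control policy.

```python
import sqlite3

ALLOWED_COLUMNS = {"department"}   # attributes the database itself can filter on


def make_demo_db():
    """Create a small in-memory database for the example."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records (id INTEGER PRIMARY KEY, department TEXT, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO records VALUES (?, ?, ?)",
        [(1, "sales", "q1 forecast"), (2, "hr", "salary bands"), (3, "sales", "pipeline")],
    )
    return conn


def fetch_permitted(conn, rule, external_lookup=None):
    """Return (id, body) rows the rule permits, using one of the two branches."""
    if rule["attribute_is_external"]:
        # Branch 1: acquire the attribute from outside the database, evaluate
        # the rule per record, then fetch only the permitted records.
        ids = [row[0] for row in conn.execute("SELECT id FROM records ORDER BY id")]
        allowed = [i for i in ids if external_lookup(i) == rule["equals"]]
        if not allowed:
            return []
        marks = ",".join("?" * len(allowed))
        query = f"SELECT id, body FROM records WHERE id IN ({marks}) ORDER BY id"
        return conn.execute(query, allowed).fetchall()
    # Branch 2: let the database execute the filtering itself.
    column = rule["attribute"]
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"cannot filter on column {column!r}")
    query = f"SELECT id, body FROM records WHERE {column} = ? ORDER BY id"
    return conn.execute(query, (rule["equals"],)).fetchall()
```

The column name is checked against a whitelist before it is interpolated into the query, while the compared value is always passed as a bound parameter.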

Support for multi-type users in a single-type computing system

Persistent storage contains a parent table and one or more child tables, the parent table containing: a class field specifying types, and one or more filter fields. One or more processors may: receive a first request to read first information of a first type for a first entity; determine that, in a first entry of the parent table for the first entity, the first type is specified in the class field; obtain the first information from a child table associated with the first type; receive a second request to read second information of a second type for a second entity; determine that, in a second entry of the parent table for the second entity, the second type is indicated as present by a filter field that is associated with the second type; and obtain the second information from a set of additional fields in the second entry.

Method and apparatus for optimizing database transactions
11500869 · 2022-11-15

The disclosure provides a database operation method and apparatus. The method comprises: sequentially acquiring, during a process of executing a target transaction by an application server, database operation commands executed by the application server for the target transaction; executing a prediction algorithm on the database operation commands, returning predicted execution results to the application server so that the application server determines a next to-be-executed database operation command, and locally recording the database operation commands and predicted execution data generated from the executing of the prediction; and when acquiring a transaction commit command regarding the target transaction, controlling a database corresponding to the application server to actually execute the target transaction according to the locally recorded database operation commands and the predicted execution data. The disclosed embodiments improve transaction execution efficiency and increase transaction throughput.
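The record-then-replay flow might be sketched as follows in Python. The abstract does not describe the prediction algorithm, so this example substitutes a very simple one: reads are answered from locally recorded writes, falling back to the current database contents. The `PredictingProxy` class, the command tuples, and the dict standing in for the database are all hypothetical.

```python
class PredictingProxy:
    """Sits between an application server and a database (here, a plain dict)."""

    def __init__(self, database):
        self.database = database
        self.commands = []     # locally recorded database operation commands
        self.predicted = {}    # predicted execution data: key -> value

    def execute(self, op, key, value=None):
        """Record the command and return a predicted result without touching the database."""
        self.commands.append((op, key, value))
        if op == "write":
            self.predicted[key] = value
            return True
        # Assumed prediction for reads: the latest local write, else the stored value.
        return self.predicted.get(key, self.database.get(key))

    def commit(self):
        """On commit, have the database actually execute the recorded commands."""
        for op, key, value in self.commands:
            if op == "write":
                self.database[key] = value
        applied = len(self.commands)
        self.commands.clear()
        self.predicted.clear()
        return applied
```

The database is modified only inside `commit`, so the application server can decide its next command from predicted results while the real work is deferred to a single step.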

Optimizing cloud query execution

An approach for optimizing server application response times. The approach creates a trust sharing context between edge clients and a server application. The approach identifies similar requests from the edge clients to the server application. The approach integrates the similar requests into a single request and normalizes the single request into a normalized data structure. The approach sends the single request to the server application for processing and receives the server application response to the single request. The approach distributes at least a portion of the response to the edge clients.
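The merge-and-distribute steps can be illustrated with a short Python sketch. It treats two requests as similar when they are identical after normalization, which is a narrower notion than the abstract may intend, and it does not model the trust sharing context at all. The request layout and the names `normalize` and `serve_batched` are assumptions.

```python
from collections import defaultdict


def normalize(request):
    """Assumed normalization: lowercase path plus sorted parameters, as a hashable key."""
    return (request["path"].lower(), tuple(sorted(request["params"].items())))


def serve_batched(requests, server):
    """Send one request per group of similar requests and fan the response out."""
    groups = defaultdict(list)
    for client_id, request in requests:
        groups[normalize(request)].append(client_id)
    responses = {}
    for key, clients in groups.items():
        reply = server(key)               # single call for the whole group
        for client_id in clients:
            responses[client_id] = reply  # distribute the response to each edge client
    return responses
```

Here `server` is any callable that accepts the normalized key, so the server application is called once per distinct normalized request rather than once per client.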