G06F16/24547

Large scale application specific computing system architecture and operation
11249998 · 2022-02-15

A data input sub-system of a large scale application specific computing system receives a data set that includes a plurality of records, each with a plurality of data fields, and divides the data set into a plurality of data segments. The data input sub-system further restructures records of data segments based on a key field of the plurality of data fields to produce restructured data segments and generates storage instructions for storing the restructured data segments. A data storage and processing sub-system of the computing system interprets the storage instructions to determine resources to engage and stores the restructured data segments using engaged resources. A query and results sub-system of the computing system generates an initial query plan based on a data processing request, optimizes the initial query plan to produce an optimized query plan, and sends the optimized query plan to the data storage and processing sub-system for execution.
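
The input path described above (divide, restructure by key field, generate storage instructions) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: it assumes "restructuring" means ordering records by the key field and that storage instructions assign segments to nodes round-robin. The names `divide_into_segments`, `restructure_segment`, and `storage_instructions` are hypothetical.

```python
from typing import Dict, List

Record = Dict[str, object]


def divide_into_segments(records: List[Record], segment_size: int) -> List[List[Record]]:
    """Split the data set into fixed-size segments (the last one may be shorter)."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    return [records[i:i + segment_size] for i in range(0, len(records), segment_size)]


def restructure_segment(segment: List[Record], key_field: str) -> List[Record]:
    """Assumption: restructuring is modeled as ordering records by the key field."""
    return sorted(segment, key=lambda record: record[key_field])


def storage_instructions(segments: List[List[Record]], key_field: str, node_count: int) -> List[dict]:
    """Assumption: each segment is assigned to a storage node round-robin."""
    return [
        {
            "segment_id": i,
            "node": i % node_count,
            "key_field": key_field,
            "record_count": len(segment),
        }
        for i, segment in enumerate(segments)
    ]
```

A storage sub-system would then read each instruction to decide which resources to engage for the corresponding segment.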

INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD AND PROGRAM

An information processing device that anonymizes data composed of records including one or more items through statistical processing includes a memory and a processor. The processor is configured to classify the respective records constituting the data into one or more first sets based on masking target items, a dictionary, and a selected hierarchy level; classify the respective records into one or more second sets based on the number of records belonging to each of the one or more first sets; calculate the number of records in each of the one or more second sets and the ratio of the records belonging to each second set to the records constituting the data; change the selected hierarchy level based on the ratio and a priority set in advance; and create a statistically processed record by statistically processing the records belonging to the same first set.
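
One way to picture the level-selection loop is the sketch below. It is a simplified illustration under several assumptions: a single masking target item, a hypothetical generalization dictionary `DICTIONARY`, two second sets (first sets with fewer than `k` records and the rest), and a mean as the statistical processing. The priority handling across multiple items is omitted.

```python
from collections import defaultdict

# Hypothetical dictionary: each value maps to its label at hierarchy levels 0, 1, 2.
DICTIONARY = {
    "Tokyo": ["Tokyo", "Kanto", "Japan"],
    "Yokohama": ["Yokohama", "Kanto", "Japan"],
    "Osaka": ["Osaka", "Kansai", "Japan"],
}
MAX_LEVEL = 2


def first_sets(records, item, level):
    """Group records by the generalized label of the masking target item."""
    groups = defaultdict(list)
    for record in records:
        groups[DICTIONARY[record[item]][level]].append(record)
    return groups


def small_set_ratio(groups, k, total):
    """Ratio of records whose first set holds fewer than k records."""
    small = sum(len(g) for g in groups.values() if len(g) < k)
    return small / total


def anonymize(records, item, numeric_item, k=2, max_ratio=0.0):
    """Raise the hierarchy level until the small-set ratio is acceptable."""
    for level in range(MAX_LEVEL + 1):
        groups = first_sets(records, item, level)
        if small_set_ratio(groups, k, len(records)) <= max_ratio:
            break
    # Statistically process each first set: here, a count and a mean.
    processed = [
        {
            item: label,
            "count": len(group),
            "mean_" + numeric_item: sum(r[numeric_item] for r in group) / len(group),
        }
        for label, group in sorted(groups.items())
    ]
    return level, processed
```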

Cost-based optimization for document-oriented database queries

A document-oriented database system generates an optimal query execution plan for database queries on an untyped data field included in a collection of documents. The system generates histograms for multiple types of data stored by the untyped data field and uses the histograms to assign costs to operators usable to execute the database query. The system generates the optimal query execution plan by selecting operators based on the assigned costs. In various embodiments, the untyped data field stores scalars, arrays, and objects.
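
The idea of per-type histograms feeding operator costs can be sketched as follows. This is an assumption-laden toy model, not the system's actual optimizer: the histograms are exact value counts keyed by JSON text, only two operators are considered, and the cost constants are invented for illustration.

```python
import json
from collections import Counter, defaultdict

# Hypothetical cost constants, chosen only for illustration.
SCAN_COST_PER_DOC = 1.0
INDEX_LOOKUP_COST = 2.0
FETCH_COST_PER_MATCH = 4.0


def type_name(value):
    """Classify a value stored in the untyped field."""
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        return "array"
    return "scalar"


def build_histograms(documents, field):
    """Build one value-frequency histogram per data type for the field."""
    histograms = defaultdict(Counter)
    for doc in documents:
        if field in doc:
            value = doc[field]
            # JSON text is used as the bucket key so arrays and objects can be counted.
            histograms[type_name(value)][json.dumps(value, sort_keys=True)] += 1
    return histograms


def estimate_matches(histograms, value):
    """Estimated number of documents whose field equals the value."""
    return histograms[type_name(value)][json.dumps(value, sort_keys=True)]


def choose_operator(histograms, total_docs, value):
    """Assign a cost to each candidate operator and pick the cheapest."""
    matches = estimate_matches(histograms, value)
    costs = {
        "full_scan": total_docs * SCAN_COST_PER_DOC,
        "index_scan": INDEX_LOOKUP_COST + matches * FETCH_COST_PER_MATCH,
    }
    return min(costs, key=costs.get), costs
```

With these constants, a predicate that matches most documents favors the full scan, while a rare array or object value favors the index scan.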

SYSTEMS AND METHODS FOR TRANSLATING N-ARY TREES TO BINARY QUERY TREES FOR QUERY EXECUTION BY A RELATIONAL DATABASE MANAGEMENT SYSTEM
20210382898 · 2021-12-09

A method for obtaining query response data by a relational database management system (RDBMS) is provided. The method receives a user input query, by a processor associated with the RDBMS, wherein the user input query comprises a query request for a set of data; formats the user input query into a second query language suitable for communication between the RDBMS and a query response interface associated with a second data storage external to the RDBMS, by the processor, to generate a reformatted user input query, wherein the RDBMS is configured to perform query operations using an n-ary tree format, and wherein the query response interface is configured to perform query operations using a binary tree format consisting of two child nodes per non-terminal node of a binary tree; and transmits the reformatted user input query to the query response interface, via a communication device communicatively coupled to the processor.
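
The abstract does not say how the n-ary tree is converted. A common approach for associative operators such as AND and OR is to fold the children into a left-deep binary tree, sketched below. The `NaryNode` and `BinaryNode` types and the `to_binary` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class NaryNode:
    op: str                                   # e.g. "AND" or "OR"
    children: List[Union["NaryNode", str]]    # sub-trees or leaf predicates


@dataclass
class BinaryNode:
    op: str
    left: Union["BinaryNode", str]
    right: Union["BinaryNode", str]


def to_binary(node: Union[NaryNode, str]) -> Union[BinaryNode, str]:
    """Fold an n-ary node into a left-deep binary tree.

    Assumes the operator is associative, so (a op b op c) == ((a op b) op c).
    """
    if isinstance(node, str):
        return node  # leaf predicates are passed through unchanged
    converted = [to_binary(child) for child in node.children]
    if len(converted) < 2:
        raise ValueError("a non-terminal node needs at least two children")
    result = BinaryNode(node.op, converted[0], converted[1])
    for child in converted[2:]:
        result = BinaryNode(node.op, result, child)
    return result
```

Every non-terminal node in the result has exactly two children, which is the shape the abstract attributes to the query response interface.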

Constrained query execution

Service interruptions in a multi-tenancy, network-based storage system can be mitigated by constraining the execution of queries. In various examples, a network-based storage system may receive a request to execute a query against data maintained by the network-based storage system. The network-based storage system may perform a unit of work to execute the query, progressing through some, but not all, of a set of operations that are to be completed for completing execution of the query. Upon completion of the unit of work, query execution may be paused, query state data may be saved, and query results may be generated for consumption by the requesting computing device. In some embodiments, tokens that are usable to resume query execution based on the saved query state data may be sent to customer computing devices for resuming query execution on-demand.
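
A minimal sketch of pausing and resuming a query with tokens follows. It assumes the unit of work is a fixed number of scanned rows and that the saved query state is just a scan offset carried inside the token itself. A real service would more likely store the state server-side and sign or encrypt the token. All names are hypothetical.

```python
import base64
import json


def encode_token(offset):
    """Pack the saved query state (here only a scan offset) into an opaque token."""
    return base64.urlsafe_b64encode(json.dumps({"offset": offset}).encode()).decode()


def decode_token(token):
    """Recover the scan offset from a resume token."""
    return json.loads(base64.urlsafe_b64decode(token.encode()))["offset"]


def run_unit_of_work(rows, predicate, token=None, max_rows_scanned=3):
    """Scan at most max_rows_scanned rows, then pause.

    Returns the partial results and a token for resuming, or None as the
    token when the query has run to completion.
    """
    offset = decode_token(token) if token else 0
    end = min(offset + max_rows_scanned, len(rows))
    results = [row for row in rows[offset:end] if predicate(row)]
    next_token = encode_token(end) if end < len(rows) else None
    return results, next_token
```

A client can call `run_unit_of_work` repeatedly, passing back the returned token each time, until the token comes back as `None`.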

METHOD AND SYSTEM FOR ADAPTING PROGRAMS FOR INTEROPERABILITY AND ADAPTERS THEREFOR
20220171784 · 2022-06-02

A method and system according to embodiments enable generalized program-to-program interoperability. The method and system employ an automatic or substantially automatic transform adapter for using a given exchange standard for two-way communication with a program. In order for the adapter to employ the exchange standard, a discovery manager may learn the program's data communications structure and/or format, and may learn data meaning information from the program. An adapter creator may derive a transform which converts the program's data communications structure and data meaning into the exchange standard. The transform may be used by the adapter to enable two-way communication with any adapter and/or program similarly employing the given exchange standard to achieve interoperability.

STRUCTURED QUERY LANGUAGE INTERFACE FOR TABULAR ABSTRACTION OF STRUCTURED AND UNSTRUCTURED DATA
20220171772 · 2022-06-02

The subject matter discloses a system for handling complex database queries using parallel HTTP requests. The system includes a processor configured to receive a query from a software application to access relational data from a data source. The system processes the received query to generate a plurality of queries that are delivered concurrently as HTTP requests to API Backends to retrieve data. For each request to each of the API Backends, a Cypher request is sent to the data source. A plurality of GraphQL responses are received from the backends. The system further generates a table from the responses based on a relational database schema such as user-defined table functions (UDTFs) and transmits the generated table to the nodes as a response to the database query.

Content masking in a storage system
11340800 · 2022-05-24

Content masking within a storage system includes: responsive to receiving a first request to access a portion of a stored snapshot, creating a transformed snapshot portion by applying a transformation specified in an access policy to one or more data objects contained within the portion of the stored snapshot; and presenting the transformed snapshot portion.
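
As a rough illustration, the sketch below models a snapshot as a mapping from object ids to records and an access policy as a mapping from field names to transformation functions. Both representations, and the names `mask_email`, `mask_ssn`, and `present_snapshot_portion`, are assumptions made for this example.

```python
import copy


def mask_email(value):
    """Hide the local part of an e-mail address, keeping the domain."""
    _, _, domain = value.partition("@")
    return "***@" + domain


def mask_ssn(value):
    """Keep only the last four characters of an identifier."""
    return "***-**-" + value[-4:]


def present_snapshot_portion(snapshot, object_ids, access_policy):
    """Return a transformed copy of the requested portion of a stored snapshot.

    The stored snapshot itself is never modified.
    """
    transformed = []
    for object_id in object_ids:
        obj = copy.deepcopy(snapshot[object_id])
        for field_name, transform in access_policy.items():
            if field_name in obj:
                obj[field_name] = transform(obj[field_name])
        transformed.append(obj)
    return transformed
```

Working on a deep copy keeps the stored snapshot immutable, so different access policies can present different views of the same underlying data.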

Adaptive model transformation in multi-tenant environment

A query processing method includes receiving a query from a requestor. The query is directed to a first data model specifying multiple base data fields. The method includes determining a set of extension bindings for the first data model based on the query. Each binding specifies an extension to the first data model from a set of model extensions and specifies one of the base data fields of the first data model as a node at which the extension is added. The method includes generating a data object from base data values and extension data values according to an extended data model. The extended data model is defined by the first data model extended by, for each binding of the set, adding data fields from the specified extension to the first data model at the specified node. The method includes returning the data object to the requestor.
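
The binding mechanism can be sketched as follows, under simplifying assumptions: a data model is a dictionary mapping each base data field (node) to its sub-fields, bindings are looked up by a tenant id carried in the query, and missing values default to `None`. The registry `TENANT_BINDINGS` and all function names are hypothetical.

```python
import copy

# Hypothetical registry of model extensions and per-tenant bindings.
EXTENSIONS = {"loyalty": {"tier": "str"}, "gift": {"wrap": "bool"}}
TENANT_BINDINGS = {
    "tenant_a": [("loyalty", "customer")],
    "tenant_b": [("loyalty", "customer"), ("gift", "order")],
}


def bindings_for(query):
    """Assumption: the set of bindings is determined by the query's tenant id."""
    return TENANT_BINDINGS.get(query["tenant"], [])


def extend_model(base_model, bindings):
    """Add each bound extension's data fields at the specified node."""
    extended = copy.deepcopy(base_model)
    for extension_name, node in bindings:
        extended[node].update(EXTENSIONS[extension_name])
    return extended


def build_object(extended_model, base_values, extension_values):
    """Assemble the data object from base and extension data values."""
    obj = {}
    for node, fields in extended_model.items():
        obj[node] = {}
        for name in fields:
            if name in base_values.get(node, {}):
                obj[node][name] = base_values[node][name]
            else:
                obj[node][name] = extension_values.get(node, {}).get(name)
    return obj
```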

Platform management of integrated access of public and privately-accessible datasets utilizing federated query generation and query schema rewriting optimization

Various techniques are described for platform management of integrated access of public and privately-accessible datasets utilizing federated query generation and query schema rewriting optimization, including receiving at a dataset access platform a query formatted according to a first data schema, generating a copy of the query, saving the query and the copy to a datastore, parsing the copy of the query in the first schema using an inference engine, determining whether the query comprises data associated with an access control condition associated with accessing the dataset, the access control condition being configured to indicate whether the query is permitted to access the dataset, and rewriting, using a proxy server, the copy of the query in a second schema by converting the copy of the query into a triple associated with the query and another triple associated with the access control condition.