Patent classifications
G06F16/24547
QUERY EXECUTION VIA NODES WITH PARALLELIZED RESOURCES
A node includes a plurality of processing core resources. Each processing core resource of the plurality of processing core resources includes a corresponding processing module, a corresponding memory interface module, a corresponding memory device, and a corresponding cache memory. The plurality of processing core resources of the node is operable to collectively perform corresponding operations of the node. Each processing core resource of the plurality of processing core resources of the node is operable to perform operations independently from other ones of the plurality of processing core resources of the node.
Tokenization and encryption of sensitive data
A method and system for anonymizing data are disclosed. The method and system include receiving, at a wrapper, a request to store data in a data source. The wrapper includes a dispatcher and at least one service. The dispatcher receives the request and is data agnostic. The method and system also include providing the request from the dispatcher to the at least one service and anonymizing, at the service(s), the data to provide anonymized data.
Database system implementation of a plurality of operating system layers
A computing device comprises a plurality of nodes and a plurality of operating system layers. The plurality of operating system layers includes a local database operating system and a sub-system database operating system. The plurality of nodes utilize the local database operating system to execute at least one database operation independently of other ones of the plurality of nodes. The computing device utilizes the sub-system database operating system in conjunction with other ones of a plurality of computing devices of at least one sub-system to facilitate execution of at least one sub-system operation of the at least one sub-system.
JOINING LARGE DATABASE TABLES
Techniques are described for processing a query and performing a join of tables that are distributed across nodes of a network. The join can be performed by analyzing a Where clause. An active flag structure can have flag values that identify table entries satisfying criteria of the Where clause. Keys of surviving entries of a first table can be used to generate a request for a second table to be joined. The request can be for second flags for the second table when the Where clause has criteria for the second table. A response can be used to update the first flags to change a first flag to False. After updating, data can be retrieved for first flags that are True. Requests can use identifiers associated with the first table that identify a location for sending the request, e.g., using RDMA or MPI.
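The flag-based join described above can be sketched in a few lines of Python. This is a minimal single-process illustration, not the patented implementation: the function names (evaluate_flags, answer_flag_request, flag_join), the orders/customers tables, and the in-memory dictionary standing in for the remote node are all assumptions made for the example, and no RDMA or MPI transport is modeled.

```python
def evaluate_flags(rows, predicate):
    """Return one boolean flag per row: True if the row meets the criteria."""
    return [bool(predicate(row)) for row in rows]


def answer_flag_request(second_by_key, keys, predicate):
    """Hypothetical stand-in for the node that owns the second table.

    For each requested key, report whether a row exists and satisfies
    the Where-clause criteria for the second table.
    """
    return {
        k: k in second_by_key and bool(predicate(second_by_key[k]))
        for k in keys
    }


def flag_join(first_rows, second_by_key, key, first_pred, second_pred):
    # 1. First flags: which first-table entries satisfy their criteria.
    first_flags = evaluate_flags(first_rows, first_pred)
    # 2. Only keys of surviving entries go into the request.
    surviving = {row[key] for row, f in zip(first_rows, first_flags) if f}
    # 3. The response carries second flags for the requested keys.
    second_flags = answer_flag_request(second_by_key, surviving, second_pred)
    # 4. Update first flags: flip to False where the second table fails.
    for i, row in enumerate(first_rows):
        if first_flags[i] and not second_flags[row[key]]:
            first_flags[i] = False
    # 5. Retrieve data only for entries whose flag is still True.
    return [
        {**row, **second_by_key[row[key]]}
        for row, f in zip(first_rows, first_flags)
        if f
    ]


# Illustrative data (assumed): WHERE amount > 100 AND country = 'US'
orders = [
    {"order_id": 1, "cust_id": 10, "amount": 50},
    {"order_id": 2, "cust_id": 11, "amount": 500},
    {"order_id": 3, "cust_id": 12, "amount": 700},
    {"order_id": 4, "cust_id": 10, "amount": 900},
]
customers = {
    10: {"cust_id": 10, "country": "US"},
    11: {"cust_id": 11, "country": "DE"},
    12: {"cust_id": 12, "country": "US"},
}
joined = flag_join(
    orders,
    customers,
    "cust_id",
    lambda o: o["amount"] > 100,
    lambda c: c["country"] == "US",
)
```

In this sketch only the keys of surviving first-table entries cross the simulated node boundary, and row data is fetched only after both sets of criteria have been applied to the flags.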
Behavioral baselining from a data source perspective for detection of compromised users
A method and system are disclosed. The method and system include receiving, at a wrapper, a communication and a context associated with the communication from a client. The communication is for a data source. The wrapper includes a dispatcher and a service. The dispatcher receives the communication and is data agnostic. The method and system also include providing the context from the dispatcher to the service. In some embodiments, the method and system use the service to compare the context to a behavioral baseline for the client. The behavioral baseline incorporates a plurality of contexts previously received from the client.
DISTINCT VALUE ESTIMATION FOR QUERY PLANNING
The problem of distinct value estimation has many applications, but is particularly important in the field of database technology where such information is utilized by query planners to generate and optimize query plans. Introduced is a novel technique for estimating the number of distinct values in a given dataset without scanning all of the values in the dataset. In an example embodiment, the introduced technique includes 1) gathering multiple intermediate probabilistic estimates based on varying samples of the dataset, 2) plotting the multiple intermediate probabilistic estimates against indications of sample size, 3) fitting a function to the plotted data points, and 4) determining an overall distinct value estimate by extrapolating the fitted function to an estimated or known total number of values in the dataset.
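The sample, fit, and extrapolate steps above can be illustrated with a short Python sketch. Several choices here are assumptions rather than details from the abstract: exact distinct counts on each sample stand in for the intermediate probabilistic estimates, the fitted function is a power law d = a * n^b fitted by least squares in log-log space, and the names fit_power_law and estimate_distinct are hypothetical. For brevity the sketch samples from an in-memory list; a real system would sample without materializing the dataset.

```python
import math
import random


def fit_power_law(points):
    """Least-squares fit of log(d) = log(a) + b * log(n); returns (a, b).

    `points` is a list of (sample_size, distinct_estimate) pairs and must
    contain at least two different sample sizes.
    """
    xs = [math.log(n) for n, _ in points]
    ys = [math.log(d) for _, d in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = cov_xy / var_x
    a = math.exp(mean_y - b * mean_x)
    return a, b


def estimate_distinct(values, sample_sizes, total_count, seed=0):
    """Extrapolate a distinct-value estimate from small samples."""
    rng = random.Random(seed)
    points = []
    for n in sample_sizes:
        sample = rng.sample(values, n)        # sample without replacement
        points.append((n, len(set(sample))))  # intermediate estimate
    a, b = fit_power_law(points)
    # The number of distinct values can never exceed the total count.
    return min(a * total_count ** b, total_count)
```

A power law is only one candidate function; it tends to overestimate when the distinct count saturates early, so a saturating curve may be a better fit for low-cardinality columns.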
GENERATE DIGITAL SIGNATURE OF A QUERY EXECUTION PLAN USING SIMILARITY HASHING
Embodiments are for generating a digital signature of a query execution plan using similarity hashing. A technique includes generating a node digital signature for nodes in a query and generating an edge digital signature for edges in the query, the edges connecting the nodes. The technique includes selecting at least one previously executed query based on the node digital signature and the edge digital signature for the query and causing the query to be processed according to an assignment associated with the at least one previously executed query.
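One way to picture node and edge signatures is with MinHash, a common similarity-hashing scheme. The abstract does not name a specific hash, so MinHash is a swapped-in assumption here, as are the function names, the operator labels, and the "assignment" values in the history list.

```python
import hashlib


def _hash(seed, item):
    """Deterministic 64-bit hash of an item under a given seed."""
    data = f"{seed}:{item}".encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def minhash(items, num_hashes=64):
    """MinHash signature of a non-empty collection of hashable items."""
    return tuple(min(_hash(s, it) for it in items) for s in range(num_hashes))


def similarity(sig_a, sig_b):
    """Fraction of matching positions; approximates Jaccard similarity."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


def plan_signature(nodes, edges):
    """Return (node signature, edge signature) for a query plan graph."""
    node_sig = minhash(set(nodes))
    edge_sig = minhash({f"{u}->{v}" for u, v in edges})
    return node_sig, edge_sig


def most_similar(query_sig, history):
    """Pick the previously executed query with the closest signatures."""
    def score(entry):
        node_sig, edge_sig = entry["signature"]
        return similarity(query_sig[0], node_sig) + similarity(query_sig[1], edge_sig)
    return max(history, key=score)


# Illustrative history of previously executed queries (assumed data).
history = [
    {
        "assignment": "pool-a",
        "signature": plan_signature(
            ["scan:orders", "filter", "hash_join", "scan:customers"],
            [("scan:orders", "filter"), ("filter", "hash_join"),
             ("scan:customers", "hash_join")],
        ),
    },
    {
        "assignment": "pool-b",
        "signature": plan_signature(
            ["scan:events", "aggregate", "sort"],
            [("scan:events", "aggregate"), ("aggregate", "sort")],
        ),
    },
]
new_sig = plan_signature(
    ["scan:orders", "filter", "hash_join", "scan:customers"],
    [("scan:orders", "filter"), ("filter", "hash_join"),
     ("scan:customers", "hash_join")],
)
chosen = most_similar(new_sig, history)
```

Here the incoming query would be processed according to the assignment stored with the closest historical plan. Summing the two similarities with equal weight is an arbitrary choice for the sketch.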
PROCESSING DATABASE QUERIES USING FORMAT CONVERSION
Devices, methods, and systems for processing database queries formatted differently than the database storage model being queried are disclosed. Processing database queries independent of the storage model of the queried database may be performed by receiving a query for one or more data items stored in a database, determining whether to use at least one query operator that uses data having a format different from the storage model format of at least one of the one or more data items stored in the database, and converting the format of the data used by the at least one query operator to a format that matches the storage model format of at least one of the one or more data items stored in the database. Related systems, methods, and articles of manufacture are also described.
Parallel Processing Of Data
A data parallel pipeline may specify multiple parallel data objects that contain multiple elements and multiple parallel operations that operate on the parallel data objects. Based on the data parallel pipeline, a dataflow graph of deferred parallel data objects and deferred parallel operations corresponding to the data parallel pipeline may be generated and one or more graph transformations may be applied to the dataflow graph to generate a revised dataflow graph that includes one or more of the deferred parallel data objects and deferred, combined parallel data operations. The deferred, combined parallel operations may be executed to produce materialized parallel data objects corresponding to the deferred parallel data objects.
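The idea of deferring operations, transforming the graph, and only then executing can be shown with a toy Python sketch. It is deliberately limited and rests on assumptions: the class and function names are hypothetical, the "graph" is only a linear chain of element-wise maps, and the single transformation shown is fusing consecutive maps into one combined operation. The abstract covers more general graphs and transformations.

```python
from concurrent.futures import ThreadPoolExecutor


class DeferredCollection:
    """A node in the dataflow graph; nothing is computed until run()."""

    def __init__(self, source=None, parent=None, fn=None):
        self.source = source  # input elements (root node only)
        self.parent = parent  # upstream deferred collection
        self.fn = fn          # deferred per-element operation

    def parallel_map(self, fn):
        # Record the operation instead of executing it.
        return DeferredCollection(parent=self, fn=fn)


def fuse(node):
    """Graph transformation: collapse a chain of maps into one stage.

    Returns (source, [fn1, fn2, ...]) with functions in pipeline order.
    """
    fns = []
    while node.parent is not None:
        fns.append(node.fn)
        node = node.parent
    fns.reverse()
    return node.source, fns


def run(node):
    """Execute the fused operation to materialize the collection."""
    source, fns = fuse(node)

    def combined(x):
        for fn in fns:
            x = fn(x)
        return x

    # One pass over the data; results keep the input order.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(combined, source))


pipeline = (
    DeferredCollection(source=[1, 2, 3])
    .parallel_map(lambda x: x + 1)
    .parallel_map(lambda x: x * 2)
)
materialized = run(pipeline)
```

Fusing the two maps means each element is visited once and no intermediate collection is materialized between the stages.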
Method and system for adapting programs for interoperability and adapters therefor
A method and system according to embodiments enable generalized program-to-program interoperability. The method and system employ an automatic or substantially automatic transform adapter for using a given exchange standard for two-way communication with a program. In order for the adapter to employ the exchange standard, a discovery manager may learn the program's data communications structure and/or format, and may learn data meaning information from the program. An adapter creator may derive a transform which converts the program's data communications structure and data meaning into the exchange standard. The transform may be used by the adapter to enable two-way communication with any adapter and/or program similarly employing the given exchange standard to achieve interoperability.