Patent classifications
G06F2211/1011
NULL ELIMINATION DATA SLAB COMPRESSION SCHEME OF A PARALLELIZED DATABASE SYSTEM
A data input sub-system of a parallelized database system includes processing core resources operable to obtain divisions of data slabs of a dataset and compress the divisions of data slabs using a null elimination compression scheme to produce divisions of compressed data slabs. A first data slab of a first division of data slabs of the divisions of data slabs is compressed using the null elimination compression scheme to produce a first compressed data slab. The first compressed data slab includes first compressed data and first compression information. The processing core resources are further operable to store a respective division of compressed data slabs of the divisions of compressed data slabs.
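The abstract does not say how the compressed data and the compression information are laid out. As a minimal sketch, assuming a data slab is a column of values in which nulls are represented by `None`, and assuming the compression information is a per-position presence mask (both are illustrative choices, not taken from the abstract), null elimination could look like this in Python:

```python
from typing import Any, List, Optional, Tuple


def null_eliminate(slab: List[Optional[Any]]) -> Tuple[List[Any], List[bool]]:
    """Compress one data slab by dropping its null entries.

    Returns (compressed_data, compression_info). The presence mask used
    as compression_info is an assumed layout, not one stated in the abstract.
    """
    compressed_data = [value for value in slab if value is not None]
    compression_info = [value is not None for value in slab]
    return compressed_data, compression_info


def restore(compressed_data: List[Any], compression_info: List[bool]) -> List[Optional[Any]]:
    """Rebuild the original slab from the compressed data and the mask."""
    values = iter(compressed_data)
    return [next(values) if present else None for present in compression_info]


# A slab with two nulls keeps only its two real values.
data, info = null_eliminate([7, None, None, 9])
# data == [7, 9]; info == [True, False, False, True]
```

The mask lets `restore` put each stored value back at its original position, so the scheme is lossless under these assumptions.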
PARALLELIZED DATA INPUT SUB-SYSTEM OF A DATABASE SYSTEM
A parallelized data input sub-system of a database system includes a first set of loader nodes of pluralities of computing nodes of a plurality of computing device clusters. The first set of loader nodes includes a plurality of memory devices and a plurality of processing modules. The first set of loader nodes is operable to ingest at least a portion of a dataset (data), a set of the memory devices stores the data, and a set of the processing modules determines whether the data is regarding a query. When the data is regarding the query, the set of processing modules provides the data to a query and response sub-system. When the data is not regarding the query, the set of processing modules determines long term storage parameters, processes the data in accordance with the parameters to produce formatted data, and provides the formatted data to a store and compute sub-system.
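The routing decision described above can be sketched in Python. Everything concrete here is hypothetical: the abstract does not say how a loader node decides that data regards a query, what the long term storage parameters are, or what the formatted data looks like. The `regards_query` flag, the `rows_per_chunk` parameter, and the chunking step are placeholders chosen only to make the control flow runnable.

```python
from typing import Any, Dict, List, Tuple


def determine_storage_parameters(rows: List[Any]) -> Dict[str, int]:
    """Placeholder policy: a fixed chunk size stands in for the real parameters."""
    return {"rows_per_chunk": 2}


def format_for_storage(rows: List[Any], params: Dict[str, int]) -> List[List[Any]]:
    """Placeholder formatting: split the rows into fixed-size chunks."""
    n = params["rows_per_chunk"]
    return [rows[i:i + n] for i in range(0, len(rows), n)]


def route(rows: List[Any], regards_query: bool) -> Tuple[str, Any]:
    """Return (destination sub-system, payload) for one ingested portion of data."""
    if regards_query:
        # Query-related data goes to the query and response sub-system unchanged.
        return "query_and_response", rows
    # Otherwise: determine parameters, format, and hand off for storage.
    params = determine_storage_parameters(rows)
    return "store_and_compute", format_for_storage(rows, params)
```

For example, `route([1, 2, 3], regards_query=False)` returns `("store_and_compute", [[1, 2], [3]])` under the placeholder chunk size.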
PARALLEL TASK EXECUTION BY A STORE AND COMPUTE SUB-SYSTEM OF A DATABASE SYSTEM
A store and compute sub-system of a database system includes a computing cluster that is operable to receive a plurality of tasks. The computing cluster is further operable to execute, in a concurrent manner, the plurality of tasks. For a first task, a first lead computing device of the computing cluster is operable to: generate a plurality of first partial tasks based on the first task; and allocate the plurality of first partial tasks to the plurality of computing devices, wherein the plurality of computing devices executes the plurality of first partial tasks. For a second task, a second lead computing device of the computing cluster is operable to: generate a plurality of second partial tasks based on the second task; and allocate the plurality of second partial tasks to the plurality of computing devices, wherein the plurality of computing devices executes the plurality of second partial tasks.
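A rough illustration of the split-and-allocate pattern follows. It assumes, purely for the example, that a task is "sum a list of integers", that a partial task is a slice of that list, and that worker threads stand in for the computing devices. None of these details come from the abstract, and `make_partial_tasks` and `execute_tasks` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def make_partial_tasks(task: List[int], num_devices: int) -> List[List[int]]:
    """Lead-device step: split one task into one partial task per device."""
    return [task[i::num_devices] for i in range(num_devices)]


def execute_tasks(tasks: List[List[int]], num_devices: int = 4) -> List[int]:
    """Run several tasks concurrently on a shared pool of 'devices'."""
    with ThreadPoolExecutor(max_workers=num_devices) as devices:
        pending = []
        for task in tasks:
            # Each task's partial tasks are allocated to the same device pool,
            # so partial tasks from different tasks can interleave.
            partials = make_partial_tasks(task, num_devices)
            pending.append([devices.submit(sum, p) for p in partials])
        # Combine the partial results of each task into its final result.
        return [sum(f.result() for f in futures) for futures in pending]


results = execute_tasks([list(range(10)), list(range(100))])
# results == [45, 4950]
```

All partial tasks are submitted before any result is collected, so work from both tasks is in flight at once on the shared pool.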
DATABASE SUB-SYSTEM ARCHITECTURE
A database system includes a data ingest subsystem, a store and compute subsystem, and a query and response subsystem interconnected through a system communication network. Each subsystem includes a hierarchy of computing resources defined by an a number of computing clusters, a b number of computing entities per cluster providing an a*b number of computing entities, a c number of computing devices per entity providing an a*b*c number of computing devices, a d number of computing nodes per device providing an a*b*c*d number of computing nodes, and an e number of processing core resources per node providing an a*b*c*d*e number of processing core resources, wherein an asterisk (*) denotes multiplication. The network operably couples the subsystems to enable distributed data ingestion, storage, and query execution across scalable processing resources.
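The resource counts at each level of the hierarchy follow directly from the products named in the abstract. A small Python sketch, with a hypothetical `Hierarchy` class and example values chosen only for illustration:

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Hierarchy:
    a: int  # computing clusters
    b: int  # computing entities per cluster
    c: int  # computing devices per entity
    d: int  # computing nodes per device
    e: int  # processing core resources per node

    def totals(self) -> Dict[str, int]:
        """Total resources at each level, as cumulative products."""
        clusters = self.a
        entities = clusters * self.b   # a*b
        devices = entities * self.c    # a*b*c
        nodes = devices * self.d       # a*b*c*d
        cores = nodes * self.e         # a*b*c*d*e
        return {
            "clusters": clusters,
            "entities": entities,
            "devices": devices,
            "nodes": nodes,
            "processing_core_resources": cores,
        }


# Example values only: 2 clusters, 3 entities each, 4 devices each, and so on.
example = Hierarchy(a=2, b=3, c=4, d=5, e=6).totals()
# example["processing_core_resources"] == 720
```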