Patent classifications
G06F16/24561
Tail-based top-N query evaluation
Techniques are described for executing a query with a top-N clause to select a first N-number of rows in a data source arranged at least according to a first key and a second key of the data source, using a first sort order specified by the query for the first key and a second sort order specified by the query for the second key. The data source may include one or more tiles that include at least a portion of the first key and the second key. To execute the query, in an embodiment, a DBMS determines, in a first vector of first key values that are in a first tile, row identifiers identifying entries of the first vector that contain values equal to a tail value that follows a particular top number of the first key values. The DBMS may select, from a second vector of values of the second key in the first tile, second key values identified based on the determined row identifiers of the first vector. In an embodiment, the DBMS generates a result set of the query that includes at least a value from the second key values selected from the second vector based on the determined row identifiers.
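The tie-breaking idea in this abstract can be illustrated with a minimal sketch. It assumes a single tile held as two Python lists, ascending order on both keys, and a tail value taken to be the N-th smallest first-key value; the function name `top_n_two_keys` is hypothetical and this is not the patented implementation.

```python
def top_n_two_keys(key1, key2, n):
    """Return row ids of the first n rows ordered by (key1 asc, key2 asc).

    key1 and key2 are equal-length column vectors of one tile (assumed layout).
    """
    if n <= 0:
        return []
    if n >= len(key1):
        return sorted(range(len(key1)), key=lambda r: (key1[r], key2[r]))

    # Tail value: the n-th smallest first-key value (assumption of this sketch).
    tail = sorted(key1)[n - 1]

    # Rows strictly better than the tail are selected without reading key2.
    sure = [r for r, v in enumerate(key1) if v < tail]

    # Rows equal to the tail are ties; only these need the second key
    # to decide which of them make the cut.
    tied = [r for r, v in enumerate(key1) if v == tail]
    tied.sort(key=lambda r: key2[r])

    chosen = sure + tied[: n - len(sure)]
    # Final ordering of the n selected rows by both keys.
    chosen.sort(key=lambda r: (key1[r], key2[r]))
    return chosen
```

For example, with `key1 = [3, 1, 2, 2, 2, 5]`, `key2 = [0, 9, 7, 1, 4, 2]` and `n = 3`, the tail value is 2, row 1 is selected outright, and only rows 2, 3 and 4 are compared on the second key.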
Constrained query execution
Service interruptions in a multi-tenancy, network-based storage system can be mitigated by constraining the execution of queries. In various examples, a network-based storage system may receive a request to execute a query against data maintained by the network-based storage system. The network-based storage system may perform a unit of work to execute the query, progressing through some, but not all, of a set of operations that are to be completed for completing execution of the query. Upon completion of the unit of work, query execution may be paused, query state data may be saved, and query results may be generated for consumption by the requesting computing device. In some embodiments, tokens that are usable to resume query execution based on the saved query state data may be sent to customer computing devices for resuming query execution on-demand.
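The pause-and-resume behavior described above might look roughly like the following sketch. It assumes the query is a filtered scan over an in-memory list, that a unit of work is a fixed number of rows, and that the saved query state is just the next offset; the class `ConstrainedScanner` and its token format are hypothetical.

```python
import uuid


class ConstrainedScanner:
    """Executes a filtered scan one bounded unit of work at a time."""

    def __init__(self, rows, unit_of_work):
        self.rows = rows
        self.unit = unit_of_work  # max rows examined per call (assumed unit)
        self._saved = {}          # resume token -> next offset (query state)

    def execute(self, predicate, token=None):
        # Resume from saved state when a token is presented, else start fresh.
        start = self._saved.pop(token) if token is not None else 0
        end = min(start + self.unit, len(self.rows))

        # Perform one unit of work: some, but not all, of the scan.
        results = [r for r in self.rows[start:end] if predicate(r)]

        if end < len(self.rows):
            # Pause: save state and hand back a token for on-demand resume.
            next_token = uuid.uuid4().hex
            self._saved[next_token] = end
            return results, next_token
        return results, None  # scan complete, nothing left to resume
```

A caller would keep invoking `execute` with the returned token until the token comes back as `None`.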
INTERMEDIATE DATA OBJECTS AND USES THEREOF
An intermediate data object is described that bridges source data to analysis-ready variables (ARVs).
Asynchronous Search of Electronic Assets Via a Distributed Search Engine
Asynchronous search of electronic assets via a distributed search engine is disclosed herein. An example method includes receiving a request from a user, the request including a query and a query time parameter, the query time parameter defining a time that the user will wait for results to be completed synchronously, determining that the query is incomplete and that the time has been exceeded, issuing the query a unique query identifier, and asynchronously adding results to an index based on the unique query identifier.
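A rough sketch of the synchronous-then-asynchronous handoff follows. It assumes a thread pool stands in for the distributed search engine and a plain dictionary stands in for the index; `AsyncSearch`, `search_fn` and the reply shape are hypothetical names chosen for illustration.

```python
import concurrent.futures as cf
import uuid


class AsyncSearch:
    def __init__(self, search_fn):
        self.search_fn = search_fn            # hypothetical backend search call
        self.pool = cf.ThreadPoolExecutor(max_workers=4)
        self.index = {}                       # query id -> results (stand-in index)

    def request(self, query, wait_seconds):
        future = self.pool.submit(self.search_fn, query)
        try:
            # Synchronous path: results arrive within the caller's wait time.
            return {"results": future.result(timeout=wait_seconds)}
        except cf.TimeoutError:
            # Wait time exceeded and query incomplete: issue a unique id and
            # store the results under that id once the search finishes.
            query_id = uuid.uuid4().hex

            def store(done_future):
                self.index[query_id] = done_future.result()

            future.add_done_callback(store)
            return {"query_id": query_id}
```

The caller can later look up `index[query_id]` to collect results that were not ready within the wait time.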
SYSTEMS AND METHODS FOR GENERATING A SNAPSHOT VIEW OF VIRTUAL INFRASTRUCTURE
A computer may receive a request to generate a snapshot view of a virtual infrastructure. The virtual infrastructure may comprise a plurality of virtual server management applications, each managing a respective set of virtual machines. The computer may implement a multi-threaded process to contemporaneously query one or more databases and retrieve status and other information of the virtual machines from different virtual server management applications. The computer may aggregate the retrieved information to determine summary counters and statistical information for the virtual machines. The computer may generate a snapshot view file based on the retrieved information. The snapshot view file may be in hypertext markup language (HTML) format. The computer may transmit a selectable link to the snapshot view file to multiple user devices. A user may select the link and the respective user device may display the snapshot view in an application such as a web browser.
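As an illustration only, the multi-threaded collection and HTML generation could be sketched as below. The sketch assumes each management application is represented by a zero-argument callable returning `(vm_name, status)` pairs; `build_snapshot` and that data shape are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from html import escape


def build_snapshot(managers):
    """managers: dict of name -> callable returning a list of (vm_name, status)."""
    # Query every management application contemporaneously, one thread each.
    with ThreadPoolExecutor(max_workers=max(1, len(managers))) as pool:
        futures = {name: pool.submit(fetch) for name, fetch in managers.items()}
        inventory = {name: fut.result() for name, fut in futures.items()}

    # Aggregate the retrieved information into summary counters per status.
    counts = Counter(status for vms in inventory.values() for _, status in vms)

    # Render the counters as a small HTML snapshot view.
    rows = "".join(
        f"<tr><td>{escape(status)}</td><td>{n}</td></tr>"
        for status, n in sorted(counts.items())
    )
    return counts, f"<html><body><table>{rows}</table></body></html>"
```

Writing the returned HTML string to a file and sharing a link to it would correspond to the last steps of the abstract.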
System And Method For Analyzing Data Records
Systems and methods for analyzing input data records are provided in which a master process initiates a plurality of concurrent first processes each of which comprises, for each data record in at least a subset of a plurality of input data records, creating a parsed representation of the data record and independently applying a procedural language query to the parsed representation to extract one or more values. A respective emit operator is applied to at least one of the extracted one or more values thereby adding corresponding information to a respective intermediate data structure. The respective emit operator implements one of a predefined set of statistical information processing functions. The master process also initiates a plurality of second processes each of which aggregates information from a corresponding subset of intermediate data structures to produce aggregated data that is, in turn, combined to produce output data.
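The two-phase flow above can be sketched in a few lines. The sketch assumes JSON-encoded records, a query that extracts hypothetical `country` and `bytes` fields, and a sum-style emit operator; it runs the first processes sequentially for brevity, whereas the abstract describes them as concurrent.

```python
import json
from collections import Counter


def first_process(records):
    """Parse each record, apply the query, and emit into an intermediate table."""
    table = Counter()                      # intermediate data structure
    for raw in records:
        parsed = json.loads(raw)           # parsed representation of the record
        key, value = parsed["country"], parsed["bytes"]  # extracted values
        table[key] += value                # emit operator: sum per key
    return table


def second_process(tables):
    """Aggregate a subset of intermediate tables into combined output."""
    total = Counter()
    for table in tables:
        total.update(table)                # Counter.update adds counts
    return total
```

Each call to `first_process` would handle one shard of the input, and `second_process` merges the resulting tables.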
Efficient use of TRIE data structure in databases
The invention provides a time-efficient way of performing a query in a database or information retrieval system comprising operations such as intersection, union, difference and exclusive disjunction on two or more sets of keys stored in a database or information retrieval system. In a novel execution model, all data sources are tries. Two or more input tries are combined in accordance with the respective set operation, to obtain the set of keys associated with the nodes of a respective resulting trie. An intersection operation performed in this way can be used for efficient range queries, in particular when two or more data items are involved in the query. The physical algebra of the implementation of tries based on bitmaps corresponds directly to the logical algebra for the set operations and allows for efficient implementation by means of bitwise Boolean operations.
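To make the bitmap idea concrete, here is a minimal sketch of intersecting two tries with a bitwise AND at each node. It assumes fixed-length byte-string keys and one integer bitmap per node whose set bits mark existing children; the `Node` layout and helper names are hypothetical and not taken from the patent.

```python
class Node:
    __slots__ = ("bitmap", "children")

    def __init__(self):
        self.bitmap = 0      # bit s is set when a child for symbol s exists
        self.children = {}   # symbol -> child Node


def insert(root, key):
    node = root
    for sym in key:          # iterating bytes yields ints in 0..255
        node.bitmap |= 1 << sym
        node = node.children.setdefault(sym, Node())


def intersect(a, b, depth):
    """Intersect two tries whose keys all have length `depth` (depth >= 1)."""
    out = Node()
    common = a.bitmap & b.bitmap          # set operation as a bitwise AND
    while common:
        low = common & -common            # isolate the lowest set bit
        sym = low.bit_length() - 1
        common ^= low
        if depth == 1:
            child = Node()                # last symbol of the key
        else:
            child = intersect(a.children[sym], b.children[sym], depth - 1)
            if child.bitmap == 0:
                continue                  # no shared key below this symbol
        out.bitmap |= low
        out.children[sym] = child
    return out


def keys(node, depth, prefix=b""):
    if depth == 0:
        yield prefix
        return
    for sym in sorted(node.children):
        yield from keys(node.children[sym], depth - 1, prefix + bytes([sym]))
```

Union, difference and exclusive disjunction would follow the same recursion with `|`, `& ~` and `^` respectively, with extra handling for children present in only one input.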
Data processing apparatus and data processing method
According to an aspect of the invention, a data processing apparatus is provided. The data processing apparatus includes a data processing unit. The data processing unit is configured to perform at least part of: preprocessing on a data processing request to a common database issued by one of a plurality of applications accessing the common database; and post-processing on a search result returned from the common database in response to the data processing request.
Metadata converter and memory management system
System, method, and various embodiments for providing a metadata converter and memory management system are described herein. An embodiment operates by determining that first metadata corresponding to a table of a database comprises load preferences at a column level for a plurality of columns of the table, wherein the load preferences include either column load or page load. It is determined that the database is enabled with both load preferences for a table level and load preferences for a partition level, in addition to load preferences for the column level. Values for the load preferences are automatically assigned for both the table level and the partition level in second metadata, wherein the second metadata preserves the load preferences for the column level of the first metadata. A query against the table is processed based on load preferences from the second metadata.
FINGERPRINTS FOR COMPRESSED COLUMNAR DATA SEARCH
The present disclosure involves systems, software, and computer implemented methods for compressed columnar data search using fingerprints. One example method includes compressing columnar data that includes dividing the columnar data into multiple data blocks and generating a fingerprint for each data block, storing the compressed columnar data and the generated fingerprints in an in-memory database, receiving a query for the columnar data, for each in-memory data block stored in the in-memory database, determining whether the in-memory data block satisfies the query and in response to a determination that the in-memory data block does not satisfy the query, pruning the in-memory data block from the multiple data blocks to generate an unpruned set of data blocks, decompressing the unpruned set of data blocks, and performing a query search on the decompressed unpruned set of data blocks for the received query.
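The prune-then-decompress flow can be sketched as follows. The sketch assumes an equality query, a 64-bit fingerprint built by hashing each value to one bit with CRC32, and zlib-compressed JSON as the block format; these choices and the function names are illustrative assumptions, not the disclosed fingerprint scheme.

```python
import json
import zlib

BITS = 64  # fingerprint width (assumption of this sketch)


def fingerprint(values):
    """Set one bit per value; a clear bit proves the value is absent."""
    fp = 0
    for v in values:
        fp |= 1 << (zlib.crc32(str(v).encode()) % BITS)
    return fp


def compress_column(column, block_size):
    """Split a column into blocks; keep (offset, fingerprint, payload) per block."""
    blocks = []
    for i in range(0, len(column), block_size):
        chunk = column[i:i + block_size]
        payload = zlib.compress(json.dumps(chunk).encode())
        blocks.append((i, fingerprint(chunk), payload))
    return blocks


def search_equal(blocks, target):
    """Return matching row offsets and the number of blocks decompressed."""
    probe = fingerprint([target])
    hits, decompressed = [], 0
    for offset, fp, payload in blocks:
        if fp & probe == 0:
            continue                      # pruned: block cannot contain target
        decompressed += 1
        chunk = json.loads(zlib.decompress(payload))
        hits.extend(offset + j for j, v in enumerate(chunk) if v == target)
    return hits, decompressed
```

Because different values can hash to the same bit, a block may survive pruning without containing the target, so the final search over the decompressed block is still required.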