Patent classifications
G06F16/24539
Data retrieval systems and methods
Described herein is a computer implemented method. The method comprises generating a subscription identifier based on an original SQL query and determining if a set of first stage query results is associated with the subscription identifier in a cache. If so, the method further comprises generating a second stage SQL query based on a second subset of the plurality of clauses and the set of first stage query results, causing execution of the second stage SQL query to obtain a set of second stage query results from a database, and returning the set of second stage query results.
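The cached two-stage flow in this abstract can be sketched as follows. This is a minimal illustration, not the claimed implementation: the hashing scheme in `subscription_id`, the in-memory `cache` dictionary, the `issues` table, and the way the original query is split into first and second stage queries are all assumptions. The abstract only covers the cache-hit case; populating the cache on a miss is added here so the example runs end to end.

```python
import hashlib
import sqlite3


def subscription_id(original_sql: str) -> str:
    """Derive a cache key from the original query text (assumed scheme)."""
    normalized = " ".join(original_sql.split())  # collapse whitespace only
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def run_two_stage(conn, original_sql, first_stage_sql, second_stage_template, cache):
    """Return second stage results, reusing cached first stage results if present."""
    sub_id = subscription_id(original_sql)
    if sub_id not in cache:
        # Cache miss: this fallback is an assumption; the abstract covers only the hit.
        cache[sub_id] = [row[0] for row in conn.execute(first_stage_sql)]
    ids = cache[sub_id]
    if not ids:
        return []
    # Build the second stage query from the remaining clauses plus the cached ids.
    placeholders = ", ".join("?" for _ in ids)
    second_stage_sql = second_stage_template.format(ids=placeholders)
    return conn.execute(second_stage_sql, ids).fetchall()


# Hypothetical data and queries, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE issues (id INTEGER PRIMARY KEY, project TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO issues VALUES (?, ?, ?)",
    [(1, "A", "first"), (2, "B", "second"), (3, "A", "third")],
)
cache = {}
original = "SELECT id, title FROM issues WHERE project = 'A' ORDER BY id"
first_stage = "SELECT id FROM issues WHERE project = 'A' ORDER BY id"
second_stage = "SELECT id, title FROM issues WHERE id IN ({ids}) ORDER BY id"
rows = run_two_stage(conn, original, first_stage, second_stage, cache)
print(rows)
```

Because only the first stage result (the matching ids) is cached, a repeated call still reads current column values from the database in the second stage.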
OPTIMIZING INDEXES FOR ACCESSING DATABASE TABLES
A system executes a set of database operations and determines counts of instances that each key is specified for a corresponding column by any database operations on a database table. The system identifies each key which is associated with any determined count that satisfies a threshold as a corresponding frequently accessed key. The system creates an optimized index for each column which stores any frequently accessed key. The system inserts each frequently accessed key into a corresponding optimized index. The system receives a database operation that specifies a specific key for a specific column in the database table. If any optimized index matches the specific column and stores any frequently accessed key that matches the specific key, then the system references a matching frequently accessed key in a matching optimized index to access a record, which is associated with the specific column and the specific key, via the database table.
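A rough sketch of the counting and lookup steps, under several assumptions: the table is a list of dictionaries, each logged operation is reduced to a `(column, key)` pair, "satisfies a threshold" is read as "count is at least the threshold", and the function names are hypothetical.

```python
from collections import Counter


def build_optimized_indexes(table, operations, threshold):
    """Index only the (column, key) pairs that operations request often enough."""
    counts = Counter(operations)  # (column, key) -> number of operations
    indexes = {}
    for (column, key), count in counts.items():
        if count >= threshold:  # assumed meaning of "satisfies a threshold"
            positions = [i for i, rec in enumerate(table) if rec.get(column) == key]
            indexes.setdefault(column, {})[key] = positions
    return indexes


def lookup(table, indexes, column, key):
    """Use the optimized index when it holds the key, otherwise scan the table."""
    hot_keys = indexes.get(column, {})
    if key in hot_keys:
        return [table[i] for i in hot_keys[key]]
    return [rec for rec in table if rec.get(column) == key]


# Hypothetical table and operation log.
table = [
    {"id": 1, "city": "Oslo"},
    {"id": 2, "city": "Rome"},
    {"id": 3, "city": "Oslo"},
]
operations = [("city", "Oslo")] * 3 + [("city", "Rome")]
indexes = build_optimized_indexes(table, operations, threshold=2)
print(indexes)
print(lookup(table, indexes, "city", "Oslo"))
```

Only "Oslo" crosses the threshold here, so a lookup for "Rome" falls back to a full scan.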
METHOD AND APPARATUS FOR QUERYING SIMILAR VECTORS IN A CANDIDATE VECTOR SET
A method for querying a candidate vector set for candidate vectors similar to object vectors is disclosed, wherein the candidate vector set comprises a plurality of candidate vectors each being quantized as having a central vector portion and a residual vector portion, and the candidate vector set comprises a plurality of candidate vector subsets, the method comprising: acquiring a set of object vectors; querying, for each object vector of the set of object vectors, a first number of candidate vector subsets that are closest to the object vector; generating and storing a plurality of common calculation results based on a set of central vector portions and a set of residual vector portions of candidate vectors of the first number of candidate vector subsets; generating and storing pre-calculation results based on the set of object vectors and the set of residual vector portions; and determining, for each object vector of the set of object vectors, a second number of candidate vectors that are similar to the object vector among the candidate vectors in the corresponding first number of candidate vector subsets based on the stored pre-calculation results and common calculation results.
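One way to read the split between "common calculation results" and "pre-calculation results" is the standard expansion of squared distance for a vector quantized as centroid c plus residual r: ||q - c - r||^2 = ||q - c||^2 - 2 q.r + (2 c.r + ||r||^2). The last term does not depend on the query and the middle term does not depend on the centroid. Whether the patent uses exactly this decomposition is an assumption; the function names and the tiny codebooks below are hypothetical. For brevity the sketch precomputes the common term for all subsets rather than only the probed ones.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search(queries, centroids, residuals, candidates, n_probe, top_k):
    """candidates[i] = (centroid_id, residual_id); returns top_k candidate ids per query."""
    # Query-independent term 2*c.r + ||r||^2, shared by every query.
    common = {
        (ci, ri): 2 * dot(c, r) + dot(r, r)
        for ci, c in enumerate(centroids)
        for ri, r in enumerate(residuals)
    }
    # Centroid-independent term q.r, computed once per (query, residual).
    pre = [[dot(q, r) for r in residuals] for q in queries]
    results = []
    for qi, q in enumerate(queries):
        coarse = {ci: sq_dist(q, c) for ci, c in enumerate(centroids)}
        probed = set(sorted(coarse, key=coarse.get)[:n_probe])  # closest subsets
        scored = [
            (coarse[ci] - 2 * pre[qi][ri] + common[(ci, ri)], idx)
            for idx, (ci, ri) in enumerate(candidates)
            if ci in probed
        ]
        results.append([idx for _, idx in sorted(scored)[:top_k]])
    return results


# Hypothetical 2-D data: candidates decode to (1,0), (0,1), (11,10), (10,11).
centroids = [[0.0, 0.0], [10.0, 10.0]]
residuals = [[1.0, 0.0], [0.0, 1.0]]
candidates = [(0, 0), (0, 1), (1, 0), (1, 1)]
queries = [[1.0, 0.2], [10.0, 11.5]]
nearest = search(queries, centroids, residuals, candidates, n_probe=1, top_k=1)
print(nearest)
```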
SYSTEM AND METHOD FOR EFFICIENT MULTI-STAGE QUERYING OF ARCHIVED DATA
A method for searching indexed packages comprises: generating indexed packages for records of data based on a parameter, each indexed package characterized by a package key; generating metadata for the indexed packages, the metadata comprising the package key and a reference to the packaged records of data based on a value of the parameter; storing the indexed packages; and querying the records of data based on a query defining a search value of the parameter. Querying the records comprises: searching the metadata based on the search value and identifying a package key for the metadata referencing the search value of the parameter; loading, from a file-based cache, an indexed package based on the identified package key when the indexed package is stored in the cache; and loading the indexed package from the data repository, which is an archive storage, when the indexed package is not stored in the cache.
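The metadata lookup followed by a cache-or-archive load might look like the sketch below. Plain dictionaries stand in for the file-based cache and the archive storage, the monthly package key is an invented example, and copying a package into the cache after an archive load is an assumption the abstract does not state.

```python
def build_packages(records, parameter, package_key_for):
    """Group records into packages and build metadata: parameter value -> package key."""
    packages, metadata = {}, {}
    for record in records:
        value = record[parameter]
        key = package_key_for(value)
        packages.setdefault(key, []).append(record)
        metadata[value] = key
    return packages, metadata


def query(search_value, parameter, metadata, cache, archive, stats):
    """Stage 1: search the metadata. Stage 2: load the package and filter it."""
    key = metadata.get(search_value)
    if key is None:
        return []
    if key in cache:
        package = cache[key]
    else:
        package = archive[key]  # slow path: archive storage
        stats["archive_loads"] += 1
        cache[key] = package  # assumption: keep the package for later queries
    return [rec for rec in package if rec[parameter] == search_value]


# Hypothetical records packaged by month of the "day" parameter.
records = [
    {"day": "2024-01-02", "amount": 10},
    {"day": "2024-01-15", "amount": 20},
    {"day": "2024-02-01", "amount": 30},
]
archive, metadata = build_packages(records, "day", lambda day: day[:7])
cache, stats = {}, {"archive_loads": 0}
first = query("2024-01-02", "day", metadata, cache, archive, stats)
second = query("2024-01-15", "day", metadata, cache, archive, stats)
print(first, second, stats)
```

The second query hits the same monthly package, so it is served from the cache without another archive load.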
Point in time consistent materialization for object storage environments
A query that is frequently processed to access an object storage is identified. Results returned from the object storage for the query are transformed into a relational database format as a materialized view. When the query is submitted a subsequent time, updated results are managed from the materialized view, other materialized views, and/or the object storage when needed.
Conversational interface for generating and executing controlled natural language queries on a relational database
A conversational analytics system may provide for a conversational interface to any relational database. A controlled natural language may be constructed in an automated manner from a given database (e.g., from schema and values associated with a relational database). For instance, a user natural language expression may be converted to an expression in the constructed controlled natural language and the controlled natural language expression may be converted into a sequence of one or more queries in a query language (e.g., queries in structured query language (SQL)). Such an intermediate controlled natural language may provide queries without ambiguity (e.g., as each expression or phrase in the controlled natural language may be mapped to one sequence of SQL queries). Accordingly, any natural language user utterance that ultimately follows the controlled natural language may be automatically converted into a sequence of one or more SQL queries sent to the database.
Managing Real Time Data Stream Processing
A method for managing data processing includes receiving, from a user of a data query system, a data query for data stored in a data store in communication with the data query system. The method also includes receiving a staleness parameter indicating an upper time boundary for the data query. The upper time boundary limits a query response to data within the data store that is older than the upper time boundary. The method further includes determining whether the data stored within the data store satisfies the staleness parameter. When a portion of the data within the data store fails to satisfy the staleness parameter, the method includes generating the query response that excludes the portion of the data that fails to satisfy the staleness parameter.
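The exclusion step can be illustrated with a small filter. The `written_at` field, the `respond` function, and the sample rows are hypothetical; the comparison follows the abstract's wording that the response is limited to data older than the upper time boundary.

```python
from datetime import datetime, timezone


def respond(rows, upper_bound, matches=lambda row: True):
    """Return matching rows, excluding rows that fail the staleness parameter."""
    eligible = [row for row in rows if row["written_at"] < upper_bound]
    return [row for row in eligible if matches(row)]


# Hypothetical stream rows with the time each one reached the data store.
rows = [
    {"id": 1, "written_at": datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)},
    {"id": 2, "written_at": datetime(2024, 5, 1, 12, 5, tzinfo=timezone.utc)},
    {"id": 3, "written_at": datetime(2024, 5, 1, 12, 9, tzinfo=timezone.utc)},
]
upper_bound = datetime(2024, 5, 1, 12, 6, tzinfo=timezone.utc)
response = respond(rows, upper_bound)
print([row["id"] for row in response])
```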
SQL implication decider for materialized view selection
A system and method for managing queries including receiving a new query comprising a first plurality of conjoined terms, accessing a filtered view of a database from memory, the filtered view being filtered by a previously received query according to a filter represented by a second plurality of conjoined terms, at least one of the first plurality of conjoined terms or the second plurality of conjoined terms including at least one NULL value, determining that the filter of the new query implies the filter of the previously received query, and based on the determination that the filter of the new query implies the filter of the previously received query, executing the new query using the filtered view of the previously received query.
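A conservative implication check over conjoined terms could be sketched as below. This is not the patented decider: the `(column, operator, value)` term format, the supported operators, and all function names are assumptions. The check is sound but incomplete, meaning it may answer "not implied" for filters that are in fact implied. NULL is handled with the usual SQL rule that a comparison against NULL is never true, so any satisfied comparison on a column implies `IS NOT NULL` for that column.

```python
import operator

COMPARE = {"=": operator.eq, "<": operator.lt, "<=": operator.le,
           ">": operator.gt, ">=": operator.ge}


def satisfies(row, term):
    """Evaluate one term; comparisons against NULL (None) are never true."""
    column, op, value = term
    actual = row.get(column)
    if op == "IS NULL":
        return actual is None
    if op == "IS NOT NULL":
        return actual is not None
    return actual is not None and COMPARE[op](actual, value)


def term_implies(new, old):
    """True only if every row satisfying `new` must also satisfy `old`."""
    (ncol, nop, nval), (ocol, oop, oval) = new, old
    if ncol != ocol:
        return False
    if new == old:
        return True
    if nop not in COMPARE:
        return False
    if oop == "IS NOT NULL":
        return True  # a true comparison means the column is not NULL
    if oop not in COMPARE:
        return False
    if nop == "=":
        return COMPARE[oop](nval, oval)
    if nop[0] != oop[0]:  # different direction, or a range against "="
        return False
    tighter = nval < oval if nop[0] == "<" else nval > oval
    same_bound_ok = nval == oval and not (len(nop) == 2 and len(oop) == 1)
    return tighter or same_bound_ok


def filter_implies(new_filter, old_filter):
    return all(any(term_implies(n, o) for n in new_filter) for o in old_filter)


def answer(new_filter, table, view_rows, view_filter):
    """Run the new query on the filtered view when its filter implies the view's."""
    source = view_rows if filter_implies(new_filter, view_filter) else table
    return [row for row in source if all(satisfies(row, t) for t in new_filter)]


# Hypothetical table, cached view, and new query.
table = [{"id": 1, "price": 30, "qty": 2}, {"id": 2, "price": None, "qty": 1},
         {"id": 3, "price": 80, "qty": 0}, {"id": 4, "price": 150, "qty": 5}]
view_filter = [("price", "<", 100), ("price", "IS NOT NULL", None)]
view_rows = [r for r in table if all(satisfies(r, t) for t in view_filter)]
new_filter = [("price", "<", 50), ("qty", ">", 0)]
print(filter_implies(new_filter, view_filter), answer(new_filter, table, view_rows, view_filter))
```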
PROVIDING A RESILIENT APPLICATION PROGRAMMING INTERFACE FOR GEOGRAPHIC SERVICES
A method for providing availability of geographic data to enterprise clients. The method is implemented by processing hardware and includes generating a storage storing geographic information available to an enterprise client via an API call, where the enterprise client is configured to (i) receive service requests from user devices and (ii) invoke the API to provide, in response to the service requests, information related to geography. When the enterprise client invokes the API to submit a query (304), the method includes: in a first instance, transmitting the query to a geographic service via a communication network (306) and generating a network-based response to the query using geographic information received from the geographic service in response to the query (312), and in a second instance, generating a storage-based response to the query using the geographic information stored in the storage (322).
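The two response paths can be sketched as a thin wrapper around the remote call. The abstract does not say what triggers the second instance; treating a `ConnectionError` from the geographic service as the trigger is an assumption, as are the class name, the callable service, and the dictionary-backed storage.

```python
class ResilientGeoApi:
    """Answer from the geographic service when reachable, else from local storage."""

    def __init__(self, geographic_service, storage):
        self._service = geographic_service  # callable: query -> result (hypothetical)
        self._storage = storage             # pre-generated local geographic data

    def query(self, q):
        try:
            # First instance: network-based response.
            return {"source": "network", "result": self._service(q)}
        except ConnectionError:
            # Second instance (assumed trigger): storage-based response.
            return {"source": "storage", "result": self._storage.get(q)}


def healthy_service(q):
    return {"Berlin": (52.52, 13.405)}[q]


def unreachable_service(q):
    raise ConnectionError("geographic service unreachable")


storage = {"Berlin": (52.5, 13.4)}  # coarser, locally stored coordinates
online = ResilientGeoApi(healthy_service, storage).query("Berlin")
offline = ResilientGeoApi(unreachable_service, storage).query("Berlin")
print(online, offline)
```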
Continuous cloud-scale query optimization and processing
Runtime statistics from the actual performance of operations on a set of data are collected and utilized to dynamically modify the execution plan for processing a set of data. The operations performed are modified to include statistics collection operations, the statistics being tailored to the specific operations being quantified. Optimization policy defines how often optimization is attempted and how much more efficient an execution plan should be to justify transitioning from the current one. Optimization is based on the collected runtime statistics but also takes into account already materialized intermediate data to gain further optimization by avoiding reprocessing.
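The optimization policy described here has two knobs: how often to attempt re-optimization and how large a gain justifies switching plans. A minimal sketch follows, assuming plans are tuples of step names, observed per-step costs come from the collected runtime statistics, and already materialized steps cost nothing to reuse. All names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class OptimizationPolicy:
    attempt_every: int  # attempt re-optimization every N completed steps
    min_gain: float     # required relative cost reduction, e.g. 0.2 for 20%


def remaining_cost(plan, observed_cost, materialized):
    """Cost of the steps still to run; materialized intermediates are reused."""
    return sum(observed_cost[step] for step in plan if step not in materialized)


def maybe_switch(current, candidates, observed_cost, materialized, steps_done, policy):
    """Return the plan to continue with; `candidates` must be non-empty."""
    if steps_done % policy.attempt_every != 0:
        return current  # not time to attempt optimization yet
    current_cost = remaining_cost(current, observed_cost, materialized)
    best = min(candidates, key=lambda p: remaining_cost(p, observed_cost, materialized))
    best_cost = remaining_cost(best, observed_cost, materialized)
    if best_cost <= current_cost * (1 - policy.min_gain):
        return best
    return current


# Hypothetical statistics gathered while the current plan was running.
observed_cost = {"scan_a": 10, "scan_b": 10, "join_ab": 50, "join_ba": 20, "agg": 5}
current = ("scan_a", "scan_b", "join_ab", "agg")
alternative = ("scan_a", "scan_b", "join_ba", "agg")
materialized = {"scan_a", "scan_b"}
policy = OptimizationPolicy(attempt_every=2, min_gain=0.2)
chosen = maybe_switch(current, [alternative], observed_cost, materialized, 2, policy)
print(chosen)
```

With both scans already materialized, the remaining cost drops from 55 to 25, which clears the 20% bar, so the alternative plan is chosen.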