Patent classifications
G06F16/24557
Method and apparatus for stress management in a searchable data service
Method and apparatus for stress management in a searchable data service. The searchable data service may provide a searchable index to a backend data store, and an interface to build and query the searchable index, that enables client applications to search for and retrieve locators for stored entities in the backend data store. Embodiments of the searchable data service may implement a distributed stress management mechanism that may provide functionality including, but not limited to, the automated monitoring of critical resources, analysis of resource usage, and decisions on and performance of actions to keep resource usage within comfort zones. In one embodiment, in response to usage of a particular resource being detected as out of the comfort zone on a node, an action may be performed to transfer at least part of the resource usage for the local resource to another node that provides a similar resource.
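The comfort-zone behavior described above can be sketched roughly as follows. This is only an illustrative sketch, not the patented mechanism: the `Node` class, the `relieve_stress` function, the single upper threshold `comfort_max`, and the "least-loaded peer" policy are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    capacity: float  # hypothetical capacity of one resource, e.g. GB of disk
    used: float

    @property
    def usage(self) -> float:
        return self.used / self.capacity


def relieve_stress(nodes, comfort_max=0.8):
    """Move load off nodes above comfort_max onto their least-loaded peer."""
    moves = []
    for node in nodes:
        if node.usage <= comfort_max:
            continue  # still inside the comfort zone
        peers = [n for n in nodes if n is not node]
        if not peers:
            continue  # nowhere to transfer to
        target = min(peers, key=lambda n: n.usage)
        excess = node.used - comfort_max * node.capacity
        headroom = comfort_max * target.capacity - target.used
        amount = min(excess, headroom)  # never push the target out of its zone
        if amount > 0:
            node.used -= amount
            target.used += amount
            moves.append((node.name, target.name, amount))
    return moves
```

A real service would monitor several resources per node and would move actual data partitions rather than an abstract amount.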
System and method for storing and reading a database on flash memory or other degradable storage
A system and method store a database file in Flash memory or other write-constrained storage. To process a request, the system and method decompress only the data that they determine, via metadata, might correspond to a criterion in the request.
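One way to picture metadata-guided decompression is the sketch below. It assumes, purely for illustration, that rows are grouped into zlib-compressed blocks and that the metadata is a per-block minimum and maximum `id`; the abstract does not specify the compression scheme or the metadata layout, and `build_blocks` and `lookup` are hypothetical names.

```python
import json
import zlib


def build_blocks(rows, block_size=2):
    """Compress rows in blocks and keep min/max id metadata uncompressed."""
    rows = sorted(rows, key=lambda r: r["id"])
    blocks = []
    for i in range(0, len(rows), block_size):
        chunk = rows[i:i + block_size]
        blocks.append({
            "min": chunk[0]["id"],
            "max": chunk[-1]["id"],
            "payload": zlib.compress(json.dumps(chunk).encode("utf-8")),
        })
    return blocks


def lookup(blocks, wanted_id):
    """Return (row or None, number of blocks actually decompressed)."""
    decompressed = 0
    for block in blocks:
        # Metadata check: skip blocks that cannot contain the id.
        if not (block["min"] <= wanted_id <= block["max"]):
            continue
        decompressed += 1
        for row in json.loads(zlib.decompress(block["payload"])):
            if row["id"] == wanted_id:
                return row, decompressed
    return None, decompressed
```

With six rows in blocks of two, a lookup for one id touches a single block and leaves the other two compressed.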
Systems and methods for large scale complex storage operation execution
A Multi-Threaded Indexed (“MTI”) file system may use a first set of threads, processes, or executable instances to index desired file attributes in a database while simultaneously but independently executing file operations with a second set of threads, processes, or executable instances. In response to receiving a file operation, the second set of threads, processes, or executable instances may query the database to directly identify files that are indirectly implicated by the file operation with a wildcard, regular expression, and/or other expression that indirectly identifies the files based on different file attributes, paths, name expressions, or combinations thereof. The second set of threads, processes, or executable instances are therefore able to identify the files implicated by the file operation based solely on the indexed file attributes already entered in the database without the need to load and scan the metadata of files in directories targeted by the file operation.
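The idea of resolving a wildcard against indexed attributes, instead of walking directories, might look like the following sketch. It uses an in-memory SQLite table as a stand-in for the attribute database; the table layout and the `build_index` and `resolve` helpers are assumptions, and the two independent thread sets from the abstract are omitted for brevity.

```python
import sqlite3


def build_index(files):
    """Index (path, size) attributes; stands in for the indexing threads."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE files (path TEXT PRIMARY KEY, size INTEGER)")
    db.executemany("INSERT INTO files VALUES (?, ?)", files)
    return db


def resolve(db, pattern, min_size=0):
    """Find files implicated by a wildcard using only the index."""
    rows = db.execute(
        "SELECT path FROM files WHERE path GLOB ? AND size >= ? ORDER BY path",
        (pattern, min_size),
    )
    return [row[0] for row in rows]
```

Note that SQLite's `GLOB` lets `*` match across `/`, so `/data/*.log` here also matches files in subdirectories; a real file system would need its own path semantics.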
Systems and methods for database query efficiency improvement
Methods and systems for database query efficiency improvement are disclosed. In one embodiment, a method includes mirroring a primary database to a secondary database; creating a testing database comprising the schema; receiving a query; running the query on the testing database; and evaluating the query by: identifying predicates in the query; determining most common values for each column name by querying the secondary database; creating, for each column name, a list comprising at least one of the most common values; creating a test predicate comprising one of the column names and an entry from the list corresponding to the column name; creating a test query comprising one or more test predicates; determining a resource utilization of the query by running each of the test queries on the secondary database; and providing, to a user interface for display, an efficiency improvement recommendation when the resource utilization exceeds a threshold.
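The step of turning most common values into test queries could be sketched as below. This is a simplified illustration under stated assumptions: rows are plain dictionaries rather than a mirrored secondary database, predicates are equality-only, and `most_common_values` and `build_test_queries` are hypothetical helper names.

```python
from collections import Counter
from itertools import product


def most_common_values(rows, column, k=2):
    """Return the k most frequent values seen in one column."""
    counts = Counter(row[column] for row in rows)
    return [value for value, _ in counts.most_common(k)]


def build_test_queries(table, columns, rows, k=2):
    """Build parameterized test queries, one per combination of common values."""
    value_lists = [most_common_values(rows, c, k) for c in columns]
    where = " AND ".join(f"{c} = ?" for c in columns)
    sql = f"SELECT * FROM {table} WHERE {where}"
    return [(sql, combo) for combo in product(*value_lists)]
```

Each resulting pair could then be executed against the secondary database while measuring resource utilization, which this sketch does not attempt.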
Metadata routing in a distributed system
In some examples, a first computing device may receive, from a second computing device, partition mapping information indicating partitions of a metadata database. The first computing device may be able to communicate with a plurality of metadata nodes, each metadata node maintaining a portion of the metadata database based on the partitioning of the metadata database to distribute the metadata database across the plurality of metadata nodes. The first computing device may determine to send a request to the metadata database based at least on key information. The first computing device may determine, based on the partition mapping information, a first metadata node of the plurality of metadata nodes indicated to maintain a partition of the metadata database corresponding to the key information. The first computing device may send, to the first metadata node, based on the partition mapping information, a request to perform a database operation.
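Client-side routing from key information to a metadata node can be sketched as follows. The abstract does not say how keys map to partitions, so this sketch assumes hash-range partitioning over 65,536 slots; the `PartitionMap` class, its boundary layout, and the node names are all hypothetical.

```python
import hashlib
from bisect import bisect_right


class PartitionMap:
    """Locally cached partition mapping received from another device."""

    def __init__(self, boundaries, nodes):
        # boundaries[i] is the first hash slot owned by nodes[i];
        # the list must be sorted and start at 0.
        self.boundaries = boundaries
        self.nodes = nodes

    @staticmethod
    def slot_for(key):
        """Hash a key into a slot in the range 0..65535."""
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:2], "big")

    def node_for_slot(self, slot):
        return self.nodes[bisect_right(self.boundaries, slot) - 1]

    def node_for(self, key):
        """Pick the metadata node that maintains the key's partition."""
        return self.node_for_slot(self.slot_for(key))
```

The caller would then send its database operation directly to the node returned by `node_for`, without asking a coordinator on every request.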
Database table with a minimum-maximum filter for multiple ranges
A method for responding to a tabular database (TD) query may include (i) receiving the TD query, wherein the TD query comprises one or more numerical conditions; (ii) determining, using gap filters and based on the one or more numerical conditions, a relevancy to the TD query of groups of cells of the TD that are associated with the gap filters; wherein different gap filters are associated with different groups of cells of the TD; wherein each gap filter comprises one or more pairs of minimum-maximum values that are defined based on one or more gaps between sorted values of the group of cells, wherein at least one gap filter of the gap filters is set up based on a storage parameter of the gap filter and a filtering parameter of the gap filter; (iii) skipping a scanning of one or more groups of cells of the TD that are irrelevant to the TD query; and (iv) generating a response to the TD query, wherein the generating comprises scanning one or more groups of cells of the TD that are relevant to the TD query.
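A gap filter with several minimum-maximum pairs might be built along these lines. This is a rough sketch, assuming the filter splits the sorted values at the largest gaps and that `max_ranges` plays the role of the storage parameter; `build_gap_filter` and `may_match` are names invented here.

```python
def build_gap_filter(values, max_ranges=2):
    """Split sorted distinct values at the largest gaps into (min, max) pairs."""
    vals = sorted(set(values))
    if not vals:
        return []
    # Candidate cut positions, widest gap first.
    by_gap = sorted(range(1, len(vals)),
                    key=lambda i: vals[i] - vals[i - 1], reverse=True)
    cuts = sorted(by_gap[:max_ranges - 1])
    ranges, start = [], 0
    for cut in cuts:
        ranges.append((vals[start], vals[cut - 1]))
        start = cut
    ranges.append((vals[start], vals[-1]))
    return ranges


def may_match(ranges, lo, hi):
    """True if the query interval [lo, hi] overlaps any (min, max) pair."""
    return any(rmin <= hi and lo <= rmax for rmin, rmax in ranges)
```

For a group holding 1, 2, 3, 100, and 101, a single minimum-maximum pair of (1, 101) cannot rule out a query for values between 50 and 60, whereas the two pairs (1, 3) and (100, 101) allow the group to be skipped.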
Light weight index for querying low-frequency data in a big data environment
The present disclosure relates to searching for and committing low-frequency data to a database. An example method generally includes receiving, from a requesting application, a query for data from the data repository. A database system retrieves a set of indices associated with the data specified in the query from an index table in the data repository. Upon determining that the set of indices comprises a non-null set, the database system retrieves records associated with each index in the set of indices from a data table associated with the index table and returns the retrieved records to the requesting application.
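The lookup path described above can be sketched in a few lines. Plain dictionaries stand in for the index table and the data table here, and the `query` function and its arguments are assumptions made for illustration.

```python
def query(index_table, data_table, key):
    """Fetch records for a key via the index table, if any indices exist."""
    indices = index_table.get(key)
    if not indices:
        # Null or empty index set: nothing to read from the data table.
        return []
    return [data_table[i] for i in indices]
```

Because low-frequency data usually has no index entry, most lookups end at the index table and never touch the larger data table.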
Pruning index generation for pattern matching queries
A query directed at a source table organized into a set of batch units is received. The query includes a pattern matching predicate that specifies a search pattern. A set of N-grams is generated based on the search pattern. A pruning index associated with the source table is accessed. The pruning index comprises a set of filters that index distinct N-grams in each column of the source table. The pruning index is used to identify a subset of batch units to scan for matching data based on the set of N-grams generated for the search pattern. The query is processed by scanning the subset of batch units.
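The pruning step can be illustrated with the sketch below. It assumes SQL LIKE-style patterns with `%` and `_` wildcards, a single text column, and a plain Python set per batch unit in place of whatever compact filter the actual index uses; all function names are hypothetical.

```python
def ngrams(text, n=3):
    """All distinct substrings of length n (empty if text is shorter)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def pattern_ngrams(pattern, n=3):
    """N-grams from the literal pieces of a LIKE pattern."""
    grams = set()
    for literal in pattern.replace("_", "%").split("%"):
        grams |= ngrams(literal, n)
    return grams


def build_pruning_index(batches, n=3):
    """One set of distinct N-grams per batch unit."""
    return [set().union(*(ngrams(value, n) for value in batch))
            for batch in batches]


def batches_to_scan(index, pattern, n=3):
    """Indexes of batch units that may contain a match for the pattern."""
    wanted = pattern_ngrams(pattern, n)
    return [i for i, grams in enumerate(index) if wanted <= grams]
```

The result is a superset of the matching batch units, so the query still scans the selected units to confirm matches. A pattern whose literals are shorter than `n` yields no N-grams and therefore prunes nothing.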
Method for querying tables with different partition information
Disclosed is a method for processing a query related to a plurality of partitions included in a plurality of tables having different partition information, which is performed by a computing device including one or more processors. The method includes acquiring the plurality of partitions for processing the query. The method includes acquiring global partition indexes for encompassing the acquired partitions and acquiring local partition indexes corresponding to the acquired partitions, respectively. The method includes processing the query at least partially based on the global partition indexes and the local partition indexes.
Storage system and data cache method
A database management system identifies a required column which is required for executing the query, reads out data of the identified required column from a storage device, and executes the query based on the data of the required column. When reading out the data of the required column, the database management system preferentially reads out the data of the required column from a high-speed storage device storing the data of the required column among a memory, a second storage, and a first storage, stores, in the memory, data of the second data size unit including the data of the required column used for executing the query, and, when the data of the required column is read out from the first storage, stores the data of the second data size unit in the memory and stores the read-out data of the first data size unit in the second storage.
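The preferential read order across the three tiers might be sketched as follows. This is a deliberately simplified illustration: it caches whole columns and ignores the first and second data size units from the abstract, and the `TieredColumnReader` class with its `reads` log is an assumption added for demonstration.

```python
class TieredColumnReader:
    """Read a column from the fastest tier that holds it, then cache it."""

    def __init__(self, first_storage):
        self.memory = {}            # fastest tier
        self.second = {}            # faster storage used as a cache
        self.first = first_storage  # slowest tier, holds all columns
        self.reads = []             # which tier served each read

    def read(self, column):
        if column in self.memory:
            self.reads.append("memory")
            return self.memory[column]
        if column in self.second:
            self.reads.append("second")
            data = self.second[column]
        else:
            self.reads.append("first")
            data = self.first[column]
            self.second[column] = data  # populate the second storage
        self.memory[column] = data      # always populate memory
        return data
```

After the first read of a column, later reads are served from memory, or from the second storage if the in-memory copy has been dropped.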