G06F16/2272

Hierarchical window function
11544267 · 2023-01-03

A method may include generating, based on a representation of a hierarchy stored in a database, a visiting sequence data structure. The hierarchy may be stored in a table in the database. Each of a plurality of rows comprising the table may correspond to one of a plurality of nodes comprising the hierarchy. The visiting sequence data structure may include a row vector specifying an order for traversing the plurality of nodes in the hierarchy. A hierarchical window function may be executed by iterating through the plurality of rows in the table in accordance with the order specified by the row vector. The execution of the hierarchical window function may further include determining, for a first node in the hierarchy, a summary value corresponding to a first value of the first node and a second value of a second node descendant from the first node.
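The abstract above can be illustrated with a minimal sketch. It assumes a hypothetical table layout (`id`, `parent`, `value`), a bottom-up visiting order, and summation as the summary value; none of these specifics come from the patent text, and the function names are illustrative only.

```python
from collections import defaultdict

# Hypothetical table: each row is one node of the hierarchy.
ROWS = [
    {"id": "A", "parent": None, "value": 10},
    {"id": "B", "parent": "A", "value": 5},
    {"id": "C", "parent": "A", "value": 3},
    {"id": "D", "parent": "B", "value": 2},
]


def build_visiting_sequence(rows):
    """Return a row vector (list of row indices) in which every node
    appears after all of its descendants."""
    index_of = {row["id"]: i for i, row in enumerate(rows)}
    children = defaultdict(list)
    roots = []
    for i, row in enumerate(rows):
        if row["parent"] is None:
            roots.append(i)
        else:
            children[index_of[row["parent"]]].append(i)
    # Pre-order walk with an explicit stack; reversing it places each
    # parent after its whole subtree.
    order, stack = [], list(roots)
    while stack:
        i = stack.pop()
        order.append(i)
        stack.extend(children[i])
    order.reverse()
    return order


def hierarchical_sum(rows, sequence):
    """Iterate the rows in the given order and fold each node's running
    total into its parent (assumed summary: sum over the subtree)."""
    index_of = {row["id"]: i for i, row in enumerate(rows)}
    totals = [row["value"] for row in rows]
    for i in sequence:
        parent = rows[i]["parent"]
        if parent is not None:
            totals[index_of[parent]] += totals[i]
    return {rows[i]["id"]: totals[i] for i in range(len(rows))}


sequence = build_visiting_sequence(ROWS)
summary = hierarchical_sum(ROWS, sequence)
```

Because each node is visited only after its descendants, its total is final by the time it is added to its parent, so a single pass over the row vector suffices.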

Management of indexed data to improve content retrieval processing

The present disclosure relates to processing operations configured to uniquely utilize indexing of content to improve content retrieval processing, particularly when working with large data sets. The techniques described herein enable efficient content retrieval when working with large data sets such as those that may be associated with a plurality of tenants of a data storage application/service. Among other technical advantages, the present disclosure is applicable to training a classifier using relevant samples based on text search in tenant-specific scenarios, where accurate searching can be executed for content associated with one or more tenant accounts of an application/service concurrently in milliseconds, even in instances where there may be millions of documents to be searched. As an example, exemplary data shards may be generated and managed for efficient and scalable content retrieval processing, including training of a classifier (e.g., artificial intelligence classifier) and real-time (or near real-time) query processing.

BLOCKCHAIN DATA INDEX METHOD, BLOCKCHAIN DATA STORAGE METHOD AND DEVICE
20220414090 · 2022-12-29

A blockchain data index method, a blockchain data storage method and a device are provided. The method first persists a block into a block file and persists an index data layer into an index database, then obtains a matching transaction location according to a query condition, and finally obtains the complete transaction from the block file. By extending the blockchain data storage process, the present application stores the blockchain index data in an independent index database and guarantees transaction atomicity across multiple databases. It also establishes a data index mechanism in which business information is customized in a transaction note field, a transaction is associated with the business information through the customized transaction note index field, and the complete transaction is finally acquired based on key transaction information.
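The store-then-look-up flow described above might be sketched as follows. This is only an illustration under stated assumptions: an in-memory `io.BytesIO` buffer stands in for the block file, a plain dictionary stands in for the independent index database, and the class name, the `note` field, and the JSON encoding are hypothetical. The multi-database atomicity guarantee mentioned in the abstract is not modeled here.

```python
import io
import json


class IndexedBlockStore:
    """Toy store: transactions go to a block file, locations to an index."""

    def __init__(self):
        self.block_file = io.BytesIO()  # stand-in for the block file
        self.note_index = {}            # stand-in for the index database

    def persist_block(self, transactions):
        """Append each transaction to the block file and record its
        (offset, length) under the business value in its note field."""
        for tx in transactions:
            payload = json.dumps(tx).encode("utf-8")
            offset = self.block_file.seek(0, io.SEEK_END)
            self.block_file.write(payload)
            self.note_index.setdefault(tx["note"], []).append(
                (offset, len(payload))
            )

    def query_by_note(self, note):
        """Resolve the query to transaction locations via the index, then
        read the complete transactions back from the block file."""
        results = []
        for offset, length in self.note_index.get(note, []):
            self.block_file.seek(offset)
            results.append(json.loads(self.block_file.read(length)))
        return results


store = IndexedBlockStore()
store.persist_block([
    {"id": "tx1", "note": "order-17", "amount": 5},
    {"id": "tx2", "note": "order-18", "amount": 7},
])
store.persist_block([{"id": "tx3", "note": "order-17", "amount": 1}])
matches = store.query_by_note("order-17")
```

The index holds only locations, so the query never scans the block file; it reads exactly the byte ranges that the index points to.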

Pre-caching of relational database management system based on data retrieval patterns

A processor tracks a frequency of access requests of a first index corresponding to a first data page of a plurality of data pages stored in a database. The processor determines the first index corresponding to the first data page having a frequency of access requests that exceeds a configurable target, and the processor retains, with preference, the first data page that corresponds to the first index, within the cache memory.
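One way to picture the retention rule above is a cache that counts accesses per index and avoids evicting pages whose count exceeds a configurable target. The sketch below assumes a least-recently-used fallback and hypothetical names (`FrequencyAwareCache`, `hot_threshold`, `load_page`); the abstract does not specify the eviction policy.

```python
from collections import Counter, OrderedDict


class FrequencyAwareCache:
    """Page cache that prefers to keep pages of frequently accessed indexes."""

    def __init__(self, capacity, hot_threshold):
        self.capacity = capacity
        self.hot_threshold = hot_threshold  # the configurable target
        self.access_counts = Counter()      # index key -> number of requests
        self.pages = OrderedDict()          # index key -> page, oldest first

    def _is_hot(self, key):
        return self.access_counts[key] > self.hot_threshold

    def access(self, key, load_page):
        """Count the request and return the page, loading it on a miss."""
        self.access_counts[key] += 1
        if key in self.pages:
            self.pages.move_to_end(key)
            return self.pages[key]
        page = load_page(key)
        if len(self.pages) >= self.capacity:
            self._evict()
        self.pages[key] = page
        return page

    def _evict(self):
        """Evict the least recently used page that is not hot; fall back
        to plain LRU only when every cached page is hot."""
        victim = next((k for k in self.pages if not self._is_hot(k)), None)
        if victim is None:
            self.pages.popitem(last=False)
        else:
            del self.pages[victim]


def load(key):
    # Hypothetical loader standing in for a read from the database.
    return f"page-{key}"


cache = FrequencyAwareCache(capacity=2, hot_threshold=2)
for _ in range(3):
    cache.access("a", load)  # "a" exceeds the target after three requests
cache.access("b", load)
cache.access("c", load)      # cache is full: "b" is evicted, "a" is retained
```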

Fast database loading with time-stamped records
11537599 · 2022-12-27

For a first record of a batch of records, a first timestamp may be determined to be stored with the first record in a database into which the batch of records is to be loaded as part of a database loading process. For each remaining record of the batch of records, a future timestamp may be generated using the first timestamp, until a final timestamp of a final record of the batch of records is generated. If the load completion time, at which the database loading process completes, is prior to the final timestamp, a wait time until a batch completion time may be determined by comparing the load completion time and the final timestamp, and the process may wait for the wait time to reach the batch completion time. If the load completion time is at or after the final timestamp, the batch completion time may be reached at the load completion time.
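The timestamp arithmetic described above can be sketched briefly. The fixed increment (`step`) between consecutive records and the integer time unit are assumptions made for illustration; the abstract only says that future timestamps are generated from the first one.

```python
def assign_timestamps(batch_size, first_timestamp, step=1):
    """Pre-generate one timestamp per record, starting at the first
    timestamp and moving forward by an assumed fixed step."""
    return [first_timestamp + i * step for i in range(batch_size)]


def wait_time(load_completion_time, final_timestamp):
    """Time to wait after loading so that no stored timestamp is still
    in the future when the batch is declared complete."""
    return max(0, final_timestamp - load_completion_time)


def batch_completion_time(load_completion_time, final_timestamp):
    return load_completion_time + wait_time(load_completion_time, final_timestamp)


timestamps = assign_timestamps(batch_size=4, first_timestamp=1000)
final_timestamp = timestamps[-1]

# Loading finished early: wait until the final timestamp is reached.
early = batch_completion_time(1001, final_timestamp)
# Loading finished late: the batch completes at the load completion time.
late = batch_completion_time(1005, final_timestamp)
```

Generating the timestamps arithmetically avoids a clock read per record, and the wait at the end ensures that no stored timestamp lies in the future once the batch is reported complete.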

SYSTEM AND METHOD TO EVALUATE DATA CONDITION FOR DATA ANALYTICS

A system, program product, and/or method for evaluating the condition of data for using data analytics options that includes: collecting data to evaluate its condition for supporting a plurality of data analytics options; determining, for each data analytics option, a plurality of a group of data indices, the group consisting of: a volume index for measuring the amount of data, a history index for measuring the amount of historical data, a variety index for measuring the variety and type of data, a veracity index for measuring the quality of the data, and a value index for measuring the information gain provided by the data; and determining a data readiness score that encompasses scaling, for each of the data analytics options, the plurality of the data indices group. Utilizing a data requirements matrix, providing domain-specific business objectives, and calculating the information gain for each of the data analytics options are also disclosed.

Automated database index management

A database index management system uses one or more machine learning models to analyze a query log in relation to a database. A machine learning model may identify a query pattern and/or a change in the query pattern from the query log, identify a column associated with the query pattern, and identify an addition, removal, or modification of an index related to the identified column. The database index management system may perform one or more additions, removals or modifications of indices of the database based on query patterns identified in the query log. The database index management system continuously improves database performance in response to changing database usage patterns over time.

Meta-indexing, search, compliance, and test framework for software development using smart contracts
11531538 · 2022-12-20

A system and method for meta-indexing, search, compliance, and test framework for software development using smart contracts is provided, comprising an indexing service configured to create a dataset by processing and indexing source code of a project provided by a developer, perform a code audit on the indexed source code, store results from the code audit in the dataset, gather additional information relating to the provided project, store the additional information in the dataset, and store the dataset into memory; and a monitoring service configured to continuously monitor the project for at least source code changes and make changes to the dataset as needed. Additionally, a smart contract authority creates and enforces smart contracts for every transaction taking place upon the software essentially mandating and guaranteeing the security and authenticity of the software during the software's development and use.

Indexing partitions using distributed bloom filters

Methods, systems, and computer-readable media for indexing partitions using distributed Bloom filters are disclosed. A data indexing system generates a plurality of indices for a plurality of partitions in a distributed object store. The indices comprise a plurality of Bloom filters. An individual one of the Bloom filters corresponds to one or more fields of an individual one of the partitions. Using the Bloom filters, the data indexing system determines a first portion of the partitions that possibly comprise a value and a second portion of the partitions that do not comprise the value. Based (at least in part) on a scan of the first portion of the partitions and not the second portion of the partitions, the data indexing system determines one or more partitions of the first portion of the partitions that comprise the value.
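The pruning step described above can be sketched with a small hand-rolled Bloom filter. The partition names, the filter size, the SHA-256-based hashing, and the single-field partitions are all assumptions for illustration, not details from the abstract.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: false positives are possible, false negatives are not."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit array packed into one integer

    def _positions(self, value):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


# Hypothetical partitions, each holding the values of one field.
PARTITIONS = {
    "p0": ["alice", "bob"],
    "p1": ["carol"],
    "p2": ["dave", "alice"],
}


def build_index(partitions):
    """Build one Bloom filter per partition."""
    index = {}
    for name, values in partitions.items():
        bloom = BloomFilter()
        for value in values:
            bloom.add(value)
        index[name] = bloom
    return index


def find_partitions(partitions, index, value):
    """Scan only the partitions whose filter says the value may be present."""
    candidates = [n for n in partitions if index[n].might_contain(value)]
    return [n for n in candidates if value in partitions[n]]


index = build_index(PARTITIONS)
hits = find_partitions(PARTITIONS, index, "alice")
```

The confirming scan is what keeps the result exact: a filter can wrongly admit a partition, but it can never wrongly exclude one.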

AUTOMATIC GENERATION OF A MATCHING ALGORITHM IN MASTER DATA MANAGEMENT

A method for receiving an additional dataset including a plurality of additional data records; determining a record type using classifiers and an internal domain knowledge corpus; dividing the plurality of additional data records into a plurality of indexing groups; assigning the given additional data record to a match set based on completeness and similarity of natures of attributes of the given additional data record; and assigning the given additional data record to a comparison group based on completeness and similarity of natures of attributes of the given additional data record.