G06F16/2272

TRUSTED LEDGER MANAGEMENT SYSTEMS AND METHODS

The disclosure relates to, among other things, systems and methods for mitigating the risks of errors, benign or otherwise, occurring within trusted ledgers and/or for validating the integrity of information provided by operators of trusted ledgers. Consistent with embodiments disclosed herein, trusted agents, which may comprise proxy agents and/or test agents, may be employed to examine ledgers and/or derivatives, which may be meshed with other ledgers, to ensure the integrity of information provided by ledger operators. Ledger meshing techniques are described to link ledgers in a manner that improves the ability to verify ledger entries and/or recover from data faults. Further embodiments provide tagging processes that may be performed to give semantic meaning to hashes included in trusted ledgers.
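As a rough illustration of linking ledgers so that one can help verify another, the sketch below hash-chains two toy ledgers and records the head hash of one as an entry in the other. The `Ledger` class, the `mesh` helper, and the `anchor:` payload prefix are hypothetical names chosen for this example; the abstract does not specify how meshing is implemented.

```python
import hashlib

GENESIS = "0" * 64  # assumed placeholder hash for an empty ledger


def entry_hash(prev_hash, payload):
    """Hash an entry together with the hash of the entry before it."""
    return hashlib.sha256((prev_hash + "|" + payload).encode()).hexdigest()


class Ledger:
    """Toy append-only ledger: a list of (payload, hash) pairs."""

    def __init__(self):
        self.entries = []

    def head(self):
        return self.entries[-1][1] if self.entries else GENESIS

    def append(self, payload):
        h = entry_hash(self.head(), payload)
        self.entries.append((payload, h))
        return h

    def verify(self):
        """Recompute every hash and check that the chain is unbroken."""
        prev = GENESIS
        for payload, h in self.entries:
            if entry_hash(prev, payload) != h:
                return False
            prev = h
        return True


def mesh(source, target):
    """Hypothetical meshing step: record source's head hash in target."""
    return target.append("anchor:" + source.head())


ledger_a, ledger_b = Ledger(), Ledger()
ledger_a.append("asset 17 registered")
anchored_head = ledger_a.head()
mesh(ledger_a, ledger_b)

# Simulate a data fault in ledger A: the payload changes, the stored hash does not.
ledger_a.entries[0] = ("asset 18 registered", ledger_a.entries[0][1])
```

After the simulated fault, `ledger_a.verify()` fails while ledger B still holds the previously anchored head, which is the kind of independent evidence a meshed ledger could offer.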

Object scriptability

Object scriptability methods and systems are described herein. The method includes generating a graph-based data structure including a plurality of nodes, where each node from the plurality of nodes represents a respective data-analysis object in a data analysis system, where each node from the plurality of nodes is connected to at least one other node from the plurality of nodes by an edge, and where the edge represents a relationship between the respective objects in the data analysis system. Generating the graph-based data structure includes receiving a high-level language script describing at least one data-analysis object, and generating at least one node from the plurality of nodes in accordance with the high-level language script.

Index sheets for robust spreadsheet-based applications

At a data management service, an index structure corresponding to a data sheet is stored. The data sheet comprises a grid of cells. An entry of the index structure comprises a reference to content of a cell of the data sheet. In response to a grid structure change of the data sheet, the index entry is automatically updated such that the same content remains referenced from the index entry as before. A result of a computation of an application is obtained using an identifier of the index entry to obtain content from the data sheet. The result is provided to a destination.
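The behavior described above can be sketched as follows. This is a minimal illustration, assuming an in-memory sheet and a row-insertion change only; the `DataSheet` class, its method names, and the `widget_price` identifier are hypothetical and not taken from the abstract.

```python
class DataSheet:
    """Toy grid of cells with an index structure that tracks cell content."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]
        self.index = {}  # index entry identifier -> [row, col]

    def add_index_entry(self, entry_id, row, col):
        self.index[entry_id] = [row, col]

    def insert_row(self, at, values):
        """Grid structure change: shift index references at or below `at`."""
        self.rows.insert(at, list(values))
        for ref in self.index.values():
            if ref[0] >= at:
                ref[0] += 1

    def lookup(self, entry_id):
        row, col = self.index[entry_id]
        return self.rows[row][col]


sheet = DataSheet([["item", "price"], ["widget", 4.0]])
sheet.add_index_entry("widget_price", 1, 1)

# Inserting a row above the widget row moves the cell, not the index entry's meaning.
sheet.insert_row(1, ["gadget", 9.5])

# The application computes with the index entry identifier, not a grid coordinate.
total = sheet.lookup("widget_price") * 3
```

Because the application refers to `widget_price` rather than a fixed coordinate, the computation is unaffected by the inserted row.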

Method for duplicate determination in a graph

Embodiments of the present invention determine duplicates in a graph. The graph comprises nodes representing entities and edges representing relationships between the entities. The method comprises identifying at least two nodes in the graph. A neighborhood subgraph may be determined for each of the two nodes. The neighborhood subgraph includes the respective node. The method further comprises determining whether the two nodes are duplicates with respect to each other, based on a result of a comparison between the two subgraphs.
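One way to picture the subgraph comparison is sketched below. It assumes a one-hop neighborhood, edges given as `(source, relationship, target)` triples, and Jaccard similarity with a fixed threshold as the comparison. All of these are illustrative choices; the abstract does not state which comparison is used.

```python
def neighborhood_subgraph(edges, node):
    """Edges touching `node`, with `node` renamed to '*' so that the
    subgraphs of two different nodes can be compared directly."""
    sub = set()
    for src, rel, dst in edges:
        if node in (src, dst):
            sub.add(("*" if src == node else src, rel, "*" if dst == node else dst))
    return sub


def are_duplicates(edges, a, b, threshold=0.5):
    """Assumed comparison: Jaccard similarity of the two neighborhood subgraphs."""
    sub_a = neighborhood_subgraph(edges, a)
    sub_b = neighborhood_subgraph(edges, b)
    if not sub_a and not sub_b:
        return False  # isolated nodes give no evidence either way
    return len(sub_a & sub_b) / len(sub_a | sub_b) >= threshold


edges = [
    ("p1", "works_at", "acme"),
    ("p1", "lives_in", "berlin"),
    ("p2", "works_at", "acme"),
    ("p2", "lives_in", "berlin"),
    ("p2", "owns", "car7"),
    ("p3", "works_at", "globex"),
]

p1_p2 = are_duplicates(edges, "p1", "p2")  # 2 shared edges out of 3
p1_p3 = are_duplicates(edges, "p1", "p3")  # no shared edges
```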

Adaptive compression optimization for effective pruning

A database management system is described that can encode data to generate a plurality of data vectors. The database management system can perform the encoding by using a dictionary. The database management system can adaptively reorder the plurality of data vectors to prepare for compression of the plurality of data vectors. During a forward pass of the adaptive reordering, the most frequent values of a data vector of the plurality of data vectors can be moved up in the data vector. During a backward pass of the adaptive reordering, content within a rest range of a plurality of rest ranges can be rearranged within the plurality of data vectors according to frequencies of the content. The reordering according to frequency can further sort the rest range by value. Related apparatuses, systems, methods, techniques, computer programmable products, computer readable media, and articles are also described.
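The sketch below gives a simplified, single-pass picture of the idea: dictionary-encode two columns, then reorder rows so that the most frequent value of the leading vector moves to the top and the remaining rows are grouped by frequency and then by value. It does not reproduce the forward and backward passes or the rest ranges described above; the function names and the run-count metric are assumptions made for illustration.

```python
from collections import Counter


def dictionary_encode(values):
    """Replace each value by its position in a sorted dictionary."""
    dictionary = sorted(set(values))
    ids = {v: i for i, v in enumerate(dictionary)}
    return dictionary, [ids[v] for v in values]


def reorder(vectors):
    """Apply one row permutation to all vectors, ordering rows by
    (descending frequency, value) of the first vector."""
    lead = vectors[0]
    freq = Counter(lead)
    order = sorted(range(len(lead)), key=lambda i: (-freq[lead[i]], lead[i]))
    return [[vec[i] for i in order] for vec in vectors]


def run_count(vec):
    """Number of runs of equal values; fewer runs suit run-length compression."""
    return sum(1 for i, v in enumerate(vec) if i == 0 or v != vec[i - 1])


city_dict, city_codes = dictionary_encode(
    ["rome", "oslo", "rome", "bern", "rome", "oslo"]
)
status_dict, status_codes = dictionary_encode(["a", "b", "a", "a", "b", "b"])

reordered = reorder([city_codes, status_codes])
runs_before = run_count(city_codes)
runs_after = run_count(reordered[0])
```

In this toy input the leading vector goes from six runs to three after reordering, which is the property that makes the later compression step more effective.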

Transmuting data associations among data arrangements to facilitate data operations in a system of networked collaborative datasets

Various embodiments relate generally to data science and data analysis, and to computer software and systems that provide an interface between repositories of disparate datasets and computing machine-based entities that seek access to the datasets. More specifically, they relate to a computing and data storage platform configured to transmute associations between data arrangements of different formats or different data models to facilitate data operations, such as queries, configured to enhance, for example, an ingested dataset via transmuted associations as, for example, interrelations among a system of networked collaborative datasets. For example, a method may include identifying a referential indicator, determining an association between a value representative of the referential indicator and an equivalent value representative of another referential indicator associated with a different dataset, transmuting the association to form a transmuted association as a link between the value and the equivalent value, and integrating the link into an ingested data arrangement.

Database performance degradation detection and prevention

Techniques for database performance degradation detection and prevention are described. A statement performance monitor observes queries executed against a database engine and clusters the queries into groups of queries. The index utilization of the query groups and execution metrics are tracked over time, and a sudden change of index utilization can be detected. The change can be reported to users and/or new indexes may be automatically generated to serve affected query groups. Additionally, a statement performance monitor may be deployed to statically analyze code to identify modified queries and the resultant change of use of query indexes.
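A minimal sketch of the monitoring idea follows. It assumes that queries are grouped by a normalized fingerprint (literals replaced by `?`) and that each observation already carries the name of the index the engine used, or `None` for a scan. The `fingerprint` and `detect_index_changes` helpers are hypothetical; the abstract does not describe the clustering method.

```python
import re


def fingerprint(sql):
    """Group key: the query text with literals and spacing normalized."""
    sql = re.sub(r"'[^']*'", "?", sql)   # string literals
    sql = re.sub(r"\b\d+\b", "?", sql)   # standalone numeric literals
    return re.sub(r"\s+", " ", sql).strip().lower()


def detect_index_changes(observations):
    """observations: (sql, index_used_or_None) pairs in time order.
    Returns (group, previous_index, new_index) for each change seen."""
    last_index = {}
    changes = []
    for sql, index in observations:
        group = fingerprint(sql)
        if group in last_index and last_index[group] != index:
            changes.append((group, last_index[group], index))
        last_index[group] = index
    return changes


observations = [
    ("SELECT * FROM t WHERE id = 5", "idx_id"),
    ("select * from t where id = 42", "idx_id"),
    ("SELECT * FROM t  WHERE id = 7", None),  # index no longer used
]

changes = detect_index_changes(observations)
```

Each reported change could then be surfaced to users or used to trigger creation of a new index for the affected query group.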

LIGHTWEIGHT GRAPH DATABASE AND SEARCHABLE DATASTORE

A computer-implemented method includes receiving a message comprising an origin, a destination and a relationship type for a relationship between the origin and the destination. The message further includes a payload. A first node is created in a graph database for the origin and a second node is created in the graph database for the destination. A relationship is set between the first node and the second node in the graph database based on the relationship type. A node is created in the graph database for the message while preventing the payload from being stored in the graph database. A relationship is created between the first node and the node for the message. The message, including the payload, is stored in a searchable datastore separate from the graph database.
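The split between the graph and the searchable datastore can be sketched as below. Both stores are plain in-memory dictionaries and sets here, and the `SENT` relationship label, the `kind` property, and the substring search are assumptions made for illustration only.

```python
class GraphAndStore:
    """Toy pairing of a graph (no payloads) and a searchable datastore."""

    def __init__(self):
        self.nodes = {}      # node id -> properties
        self.edges = set()   # (source, relationship, target)
        self.datastore = {}  # message id -> full message, payload included

    def ingest(self, msg_id, message):
        origin, destination = message["origin"], message["destination"]
        # Nodes for the origin and the destination.
        self.nodes.setdefault(origin, {"kind": "entity"})
        self.nodes.setdefault(destination, {"kind": "entity"})
        self.edges.add((origin, message["relationship_type"], destination))
        # Node for the message itself; the payload is deliberately left out.
        self.nodes[msg_id] = {"kind": "message"}
        self.edges.add((origin, "SENT", msg_id))  # assumed relationship label
        # The full message, payload included, goes to the separate datastore.
        self.datastore[msg_id] = dict(message)

    def search(self, term):
        return [mid for mid, m in self.datastore.items() if term in m["payload"]]


store = GraphAndStore()
store.ingest("m1", {
    "origin": "alice",
    "destination": "bob",
    "relationship_type": "EMAILED",
    "payload": "quarterly report attached",
})
hits = store.search("report")
```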

MANAGEMENT OF CONSISTENT INDEXES WITHOUT TRANSACTIONS

In various embodiments, a computer-implemented method for supporting consistent secondary indexes comprises receiving, at a first node, a write request comprising a data entry, storing the data entry in an in-memory structure separate from a primary structure for storing the data entry, generating, based on the data entry, a secondary index data entry for a secondary index, and transmitting the secondary index data entry to a second node for inclusion in the secondary index.
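The write path can be sketched roughly as follows, with two in-process objects standing in for the first and second nodes and a direct method call standing in for network transmission. The class names, the `email` field, and the `(indexed_value, primary_key)` entry shape are hypothetical.

```python
class IndexNode:
    """Second node: holds the secondary index."""

    def __init__(self):
        self.secondary_index = {}  # indexed value -> set of primary keys

    def receive(self, index_entry):
        indexed_value, primary_key = index_entry
        self.secondary_index.setdefault(indexed_value, set()).add(primary_key)


class WriteNode:
    """First node: accepts writes and derives secondary index entries."""

    def __init__(self, index_node, indexed_field):
        self.primary = {}    # primary structure, untouched in this sketch
        self.in_memory = {}  # in-memory structure separate from the primary one
        self.index_node = index_node
        self.indexed_field = indexed_field

    def write(self, primary_key, data_entry):
        self.in_memory[primary_key] = data_entry
        index_entry = (data_entry[self.indexed_field], primary_key)
        self.index_node.receive(index_entry)  # stands in for a network send
        return index_entry


index_node = IndexNode()
write_node = WriteNode(index_node, indexed_field="email")
write_node.write("u1", {"email": "a@example.com", "name": "Ada"})
write_node.write("u2", {"email": "b@example.com", "name": "Bob"})
```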

LAZY REASSEMBLING OF SEMI-STRUCTURED DATA
20220358128 · 2022-11-10 ·

A pruning index is generated for a source table organized into a set of batch units. The source table comprises a column of semi-structured data. The pruning index comprises a set of filters that index distinct values in each column of the source table. Rather than reassembling an entire tree structure of the semi-structured data prior to indexing, the generating of the pruning index comprises traversing a reassembly hook object that represents a first portion of the semi-structured data that is subcolumnarized and traversing a residual object that represents a second portion of the semi-structured data that is not subcolumnarized. The reassembly hook object is traversed to identify values corresponding to the first portion of the semi-structured data and the residual object is traversed to identify values corresponding to the second portion. The pruning index is stored with an association with the source table.
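A simplified sketch of building such an index without reassembling each document is shown below. It assumes each batch unit exposes its subcolumnarized part as a mapping from path to values and its residual part as a list of nested dictionaries, and it uses a plain Python set where a real system would likely use a probabilistic filter. These structures and function names are illustrative and are not the reassembly hook object described above.

```python
def walk_residual(obj, prefix=""):
    """Yield (path, value) pairs from a nested residual object."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            yield from walk_residual(value, f"{prefix}.{key}" if prefix else key)
    elif isinstance(obj, list):
        for item in obj:
            yield from walk_residual(item, prefix)
    else:
        yield prefix, obj


def build_pruning_index(batches):
    """One filter per batch unit, fed from both parts of the data."""
    index = []
    for batch in batches:
        seen = set()  # stand-in for a probabilistic filter
        for path, values in batch["subcolumns"].items():
            seen.update((path, v) for v in values)
        for doc in batch["residual"]:
            seen.update(walk_residual(doc))
        index.append(seen)
    return index


def candidate_batches(index, path, value):
    """Batch units that may contain path == value; the rest can be pruned."""
    return [i for i, flt in enumerate(index) if (path, value) in flt]


batches = [
    {"subcolumns": {"user.id": [1, 2]},
     "residual": [{"tags": {"color": "red"}}, {}]},
    {"subcolumns": {"user.id": [3]},
     "residual": [{"tags": {"color": "blue"}}]},
]
index = build_pruning_index(batches)
```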