G06F16/2255

SYSTEM AND METHOD FOR AN ULTRA HIGHLY AVAILABLE, HIGH PERFORMANCE, PERSISTENT MEMORY OPTIMIZED, SCALE-OUT DATABASE

A shared-nothing database system is provided in which parallelism and workload balancing are increased by assigning the rows of each table to “slices”, and storing multiple copies (“duplicas”) of each slice across the persistent storage of multiple nodes of the shared-nothing database system. When the data for a table is distributed among the nodes of a shared-nothing system in this manner, requests to read data from a particular row of the table may be handled by any node that stores a duplica of the slice to which the row is assigned. For each slice, a single duplica of the slice is designated as the “primary duplica”. All DML operations (e.g. inserts, deletes, updates, etc.) that target a particular row of the table are performed by the node that has the primary duplica of the slice to which the particular row is assigned. The changes made by the DML operations are then propagated from the primary duplica to the other duplicas (“secondary duplicas”) of the same slice.
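
The routing rule described above can be illustrated with a minimal sketch. The slice count, the node names, the placement map, and the use of a CRC32 hash to assign rows to slices are all assumptions made for illustration; the abstract does not say how rows are assigned to slices.

```python
import zlib

NUM_SLICES = 4  # hypothetical slice count

# Hypothetical placement: slice id -> nodes holding a duplica.
# By convention in this sketch, the first node holds the primary duplica.
DUPLICAS = {
    0: ["node_a", "node_b"],
    1: ["node_b", "node_c"],
    2: ["node_c", "node_a"],
    3: ["node_a", "node_c"],
}

def slice_of(row_key):
    """Assign a row to a slice with a stable hash of its key (assumed scheme)."""
    return zlib.crc32(row_key.encode("utf-8")) % NUM_SLICES

def nodes_for_read(row_key):
    """Any node holding a duplica of the row's slice may serve a read."""
    return list(DUPLICAS[slice_of(row_key)])

def node_for_dml(row_key):
    """DML goes only to the node holding the primary duplica."""
    return DUPLICAS[slice_of(row_key)][0]
```

Propagation of changes from the primary duplica to the secondary duplicas is not modeled here.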

SYSTEM AND METHOD FOR REDUCING FEATURE CALCULATIONS

A computer-implemented system, platform, computer program product, and/or method for reducing data processing that includes identifying data properties used to generate features used as input to data analytic models; associating the data properties used to generate the features to corresponding features; determining whether an incoming data record is a previously processed data record; determining, in response to an incoming data record being a previously processed data record, whether the incoming data record matches the previously processed data record; identifying data properties in the incoming data record that have changed; determining features associated with the data properties in the incoming data record that have changed; and generating the features associated with the data properties in the incoming data record that have changed.
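
As a rough illustration of the idea, the sketch below recomputes only those features whose source properties changed since the record was last seen. The feature names, the property names, and the in-memory store are hypothetical and are not taken from the abstract.

```python
# Hypothetical mapping: feature name -> (data properties it depends on, function).
FEATURES = {
    "full_name": (("first", "last"), lambda r: f"{r['first']} {r['last']}"),
    "is_adult": (("age",), lambda r: r["age"] >= 18),
}

_store = {}  # record id -> (last seen record, cached features)

def process(record_id, record):
    """Return (features, sorted names of the features that were recomputed)."""
    if record_id not in _store:
        # Not previously processed: generate every feature.
        features = {name: fn(record) for name, (_, fn) in FEATURES.items()}
        _store[record_id] = (dict(record), features)
        return features, sorted(FEATURES)

    previous, cached = _store[record_id]
    # Data properties whose values differ from the previously processed record.
    changed = {k for k in record.keys() | previous.keys()
               if record.get(k) != previous.get(k)}

    features = dict(cached)
    recomputed = []
    for name, (props, fn) in FEATURES.items():
        if changed.intersection(props):
            features[name] = fn(record)
            recomputed.append(name)
    _store[record_id] = (dict(record), features)
    return features, sorted(recomputed)
```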

Storage volume regulation for multi-modal machine data

A network storage volume stores first entries in a first-mode storage bucket and second entries in a second-mode storage bucket. The first-mode storage bucket has first bucket metadata, and the second-mode storage bucket has second bucket metadata. A computer-implemented method includes comparing a utilized capacity of the network storage volume to target capacity information of the network storage volume to obtain a comparison result. Based on the comparison result, at least one bucket is selected to be purged from the buckets of the network storage volume based at least in part on bucket metadata of the buckets. The method further includes causing a purge of the at least one selected bucket from the network storage volume.
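
One possible reading of the selection step is sketched below. The `Bucket` fields and the oldest-first policy are assumptions; the abstract says only that selection is based at least in part on bucket metadata.

```python
from dataclasses import dataclass

@dataclass
class Bucket:
    name: str
    mode: str       # storage mode of the bucket, e.g. "first" or "second"
    size: int       # capacity used by the bucket
    latest_ts: int  # hypothetical metadata: timestamp of the newest entry

def select_buckets_to_purge(buckets, target_capacity):
    """Pick buckets, oldest first, until utilized capacity fits the target."""
    used = sum(b.size for b in buckets)
    selected = []
    for bucket in sorted(buckets, key=lambda b: b.latest_ts):
        if used <= target_capacity:
            break
        selected.append(bucket)
        used -= bucket.size
    return selected
```

Because the sort key ignores `mode`, buckets of both storage modes compete on the same metadata field; a mode-specific policy would need a different key.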

Scanning of content in weblink

An illustrative computing system for a weblink content scanning system scans an electronic message for the presence of one or more weblinks. The computing system accesses, in a sandbox computing environment, content linked to the one or more weblinks. The computing system generates a hash of the accessed content and/or content linked to weblinks accessible via the accessed content. The computing system scans the content accessed via the one or more weblinks for a presence of malicious content and categorizes the scanned content accessed via the one or more weblinks (e.g., safe, malicious, and the like), associates the categorization with each corresponding hash, and saves such information to a data store for future analysis. Based on a result of this analysis, the computing system allows delivery of the original electronic message or generates a modified electronic message for delivery to a recipient device.
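
The hash-and-categorize step might look like the following sketch. SHA-256, the in-memory `verdicts` dictionary, and the `toy_scanner` function are stand-ins chosen for illustration; the abstract does not name a hash function, a data store, or a scanning method, and sandboxed retrieval of the content is not modeled.

```python
import hashlib

verdicts = {}  # stand-in data store: content hash -> category

def toy_scanner(content):
    """Hypothetical scanner: flags content containing a marker string."""
    return "malicious" if b"evil" in content else "safe"

def categorize(content, scanner=toy_scanner):
    """Hash the content and scan it only if that hash has no stored category."""
    digest = hashlib.sha256(content).hexdigest()
    if digest not in verdicts:
        verdicts[digest] = scanner(content)
    return verdicts[digest]
```

Keying the store by content hash means identical content reached through different weblinks is scanned once.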

High performance dictionary for managed environment

Systems and methods are provided for optimizing data structures to improve data retrieval through the use of bucketing techniques. The number of objects within an environment is drastically reduced by the bucketing. Within each bucket, items are organized sequentially so that they can be located more quickly. Items, or keys, with the same hash value are stored together in a bucket, along with a mapping of the hash value to the offset of the first key occurrence in that bucket. This guarantees that each lookup operation requires only two random read accesses. The systems and methods provided herein reduce garbage-collection pressure on a system and minimize memory usage with minimal impact on performance.
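
The layout can be approximated with a small read-only sketch: keys that share a hash value sit next to each other in one flat sequence, and an offset table maps each hash value to the position of its first key. The class name, the number of hash values, and the use of Python's built-in `hash` are assumptions. Python lists do not give the memory behavior the abstract claims, so this shows only the indexing scheme.

```python
class FlatDict:
    """Read-only dictionary stored as flat arrays plus a hash -> offset table."""

    def __init__(self, items, num_hashes=8):
        self.num_hashes = num_hashes
        # Group entries so that keys with the same hash value are adjacent.
        ordered = sorted(items, key=lambda kv: self._h(kv[0]))
        self.keys = [k for k, _ in ordered]
        self.values = [v for _, v in ordered]
        # offsets[h] is the index of the first key with hash value h;
        # offsets[h + 1] marks the end of that run.
        self.offsets = [0] * (num_hashes + 1)
        for k in self.keys:
            self.offsets[self._h(k) + 1] += 1
        for h in range(num_hashes):
            self.offsets[h + 1] += self.offsets[h]

    def _h(self, key):
        return hash(key) % self.num_hashes

    def get(self, key, default=None):
        h = self._h(key)
        # One read into the offset table, then a sequential scan of the run.
        for i in range(self.offsets[h], self.offsets[h + 1]):
            if self.keys[i] == key:
                return self.values[i]
        return default
```

Only three list objects exist regardless of the number of entries, which is the object-count reduction the abstract refers to.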

Cache conscious techniques for generation of quasi-dense grouping codes of compressed columnar data in relational database systems

Herein are techniques for dynamic aggregation of results of a database request, including concurrent grouping of result items in memory based on quasi-dense keys. Each of many computational threads concurrently performs as follows. A hash code is calculated that represents a particular natural grouping key (NGK) for an aggregate result of a database request. Based on the hash code, the thread detects that a set of distinct NGKs that are already stored in the aggregate result does not contain the particular NGK. A distinct dense grouping key for the particular NGK is statefully generated. The dense grouping key is bound to the particular NGK. Based on said binding, the particular NGK is added to the set of distinct NGKs in the aggregate result.
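
A simplified sketch of assigning dense grouping keys from concurrent threads follows. It swaps in a plain dictionary and a single lock for whatever cache-conscious structure the techniques actually use, so it illustrates only the check, generate, and bind sequence. The class and method names are hypothetical.

```python
import threading

class DenseKeyAssigner:
    """Maps each distinct natural grouping key (NGK) to a dense integer code."""

    def __init__(self):
        self._codes = {}  # NGK -> dense grouping key (the dict hashes the NGK)
        self._lock = threading.Lock()

    def code_for(self, ngk):
        code = self._codes.get(ngk)  # hash-based membership check
        if code is None:
            with self._lock:
                # Re-check under the lock: another thread may have bound it.
                code = self._codes.get(ngk)
                if code is None:
                    code = len(self._codes)  # stateful: next unused dense key
                    self._codes[ngk] = code  # bind and add to the distinct set
        return code
```

Codes are assigned in first-seen order, so with several threads the code given to a particular NGK can vary between runs, while the set of codes stays dense.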

Industrial data verification using secure, distributed ledger

A verification platform may include a data connection to receive a stream of industrial asset data, including a subset of the industrial asset data, from industrial asset sensors. The verification platform may store the subset of industrial asset data into a data store, the subset of industrial asset data being marked as invalid, and record a hash value associated with a compressed representation of the subset of industrial asset data combined with metadata in a secure, distributed ledger (e.g., associated with blockchain technology). The verification platform may then receive a transaction identifier from the secure, distributed ledger and mark the subset of industrial asset data in the data store as being valid after using the transaction identifier to verify that the recorded hash value matches a hash value of an independently created version of the compressed representation of the subset of industrial asset data combined with metadata.
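
The record-then-verify flow can be sketched as follows. The in-memory `ledger` dictionary stands in for a secure, distributed ledger, and the choice of zlib, JSON serialization, SHA-256, and the transaction identifier format are all assumptions made for illustration.

```python
import hashlib
import json
import zlib

ledger = {}  # stand-in for the distributed ledger: transaction id -> hash

def fingerprint(data, metadata):
    """Hash a compressed representation of the data combined with metadata."""
    compressed = zlib.compress(json.dumps(data, sort_keys=True).encode("utf-8"))
    meta = json.dumps(metadata, sort_keys=True).encode("utf-8")
    return hashlib.sha256(compressed + meta).hexdigest()

def record(data, metadata):
    """Record the hash and return a (hypothetical) transaction identifier."""
    txid = f"tx-{len(ledger)}"
    ledger[txid] = fingerprint(data, metadata)
    return txid

def verify(txid, data, metadata):
    """Recompute the hash independently and compare with the recorded one."""
    return ledger.get(txid) == fingerprint(data, metadata)
```

In the flow described above, the stored subset would be marked valid only after `verify` returns `True` for its transaction identifier.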

Systems and methods for identifying unknown protocols associated with industrial control systems

A device may receive a hash table that includes lists of protocol detectors, wherein the hash table is generated based on historical process data identifying potential process variables associated with an industrial control system. The device may receive a packet identifying potential process variables associated with the industrial control system, and may extract, from the packet, packet data identifying a source address, a destination address, a port, and a transport protocol. The device may compare the packet data with data in the hash table to identify a set of lists of protocol detectors, and may process the packet data, with the set of lists of protocol detectors, to determine a matching protocol, no matching protocol, or a potential matching protocol for the packet. The device may perform one or more actions based on determining the matching protocol, no matching protocol, or the potential matching protocol for the packet.
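
A toy version of the lookup is sketched below. The detectors, the protocol names, the table contents, and the choice to key the table on port and transport protocol alone are hypothetical. The rule used here (exactly one detector hit is a match, several hits are a potential match) is also an assumption, since the abstract does not define how the three outcomes are distinguished.

```python
# Hypothetical detectors: each returns True if the payload resembles its protocol.
def detect_a(payload):
    return payload[:1] == b"\x01"

def detect_b(payload):
    return len(payload) == 4

# Hash table: (port, transport protocol) -> list of (protocol name, detector).
DETECTOR_TABLE = {
    (502, "tcp"): [("proto_a", detect_a), ("proto_b", detect_b)],
    (9600, "udp"): [("proto_b", detect_b)],
}

def identify(packet):
    """Return (outcome, protocol names), outcome in match / potential / none."""
    key = (packet["port"], packet["transport"])
    hits = [name for name, detector in DETECTOR_TABLE.get(key, [])
            if detector(packet["payload"])]
    if len(hits) == 1:
        return "match", hits
    if hits:
        return "potential", hits
    return "none", hits
```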

Utilizing metadata to prune a data set

A query directed to database data stored across a set of files is received. The query includes a plurality of predicates, and each file from the set of files is associated with metadata stored in a metadata store that is separate from a storage platform that stores the set of files. One or more files whose metadata does not satisfy a predicate of the plurality of predicates are removed from the set of files to generate a pruned set of files. One or more predicates that are satisfied by the metadata of the pruned set of files are removed to generate a modified query.
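
Both pruning steps can be illustrated with per-file minimum and maximum values. The file names, the `year` column, the range-predicate format, and the use of min/max statistics as the metadata are assumptions; the abstract does not say what the metadata contains.

```python
# Hypothetical metadata store: file name -> column -> (min value, max value).
FILES = {
    "f1": {"year": (2018, 2019)},
    "f2": {"year": (2020, 2020)},
    "f3": {"year": (2020, 2022)},
}

# A predicate is (column, low, high), meaning low <= column <= high.

def may_match(meta, pred):
    """True if some row in the file could satisfy the predicate."""
    col, low, high = pred
    col_min, col_max = meta[col]
    return col_max >= low and col_min <= high

def always_matches(meta, pred):
    """True if every row in the file must satisfy the predicate."""
    col, low, high = pred
    col_min, col_max = meta[col]
    return low <= col_min and col_max <= high

def prune(files, predicates):
    """Return (sorted names of kept files, predicates still to be evaluated)."""
    kept = {name: meta for name, meta in files.items()
            if all(may_match(meta, p) for p in predicates)}
    remaining = [p for p in predicates
                 if not all(always_matches(meta, p) for meta in kept.values())]
    return sorted(kept), remaining
```

A predicate is dropped only when the metadata of every remaining file shows that all rows satisfy it, so the modified query returns the same rows as the original.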

System and method for disjunctive joins using a lookup table

Joining data on a disjunctive operator by means of lookup tables is described. An example computer-implemented method can include receiving a query with a set of conjunctive predicates and a set of disjunctive predicates. The method may also include generating a lookup table for each predicate in the sets of conjunctive predicates and disjunctive predicates. The method may further include, for each row in a probe-side table, looking up a value associated with that row in each of the lookup tables and adding the row to a results set when there is a match. Additionally, the method may include returning the results set.
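
A minimal sketch of the disjunctive part follows, assuming equality predicates combined with OR and rows represented as dictionaries. The function name and the row format are hypothetical, and the conjunctive predicates mentioned in the abstract are not modeled.

```python
from collections import defaultdict

def disjunctive_join(probe, build, predicates):
    """Join on OR-combined equalities given as (probe_col, build_col) pairs."""
    # One lookup table per predicate: build column value -> build row indices.
    tables = []
    for _, build_col in predicates:
        table = defaultdict(list)
        for i, row in enumerate(build):
            table[row[build_col]].append(i)
        tables.append(table)

    results = []
    for probe_row in probe:
        matched = set()  # a set avoids duplicates when several predicates hit
        for (probe_col, _), table in zip(predicates, tables):
            matched.update(table.get(probe_row[probe_col], ()))
        results.extend((probe_row, build[i]) for i in sorted(matched))
    return results
```

Collecting matches in a set means a pair of rows that satisfies more than one predicate appears once in the results.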