G06F16/9014

Revealing content reuse using coarse analysis

Systems and methods for managing content provenance are provided. A network system accesses a plurality of documents. The plurality of documents is then hashed to identify one or more content features within each of the documents. In one embodiment, the hash is a MinHash. The network system compares the content features of each of the plurality of documents to determine a similarity score between each of the plurality of documents. In one embodiment, the similarity score is a Jaccard score. The network system then clusters the plurality of documents into one or more clusters based on the similarity score of each of the plurality of documents. In one embodiment, the clustering is performed using DBSCAN. DBSCAN can be iteratively performed with decreasing epsilon values to derive clusters of related but relatively dissimilar documents. The clustering information associated with the clusters is stored for use during runtime.
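
As an illustration only, the hashing and comparison steps of this abstract might be sketched as follows. The function names, the word-shingle size, and the use of salted BLAKE2b as the hash family are assumptions made for the example and are not taken from the patent. The DBSCAN clustering step is omitted.

```python
import hashlib


def shingles(text, k=3):
    """Return the set of k-word shingles in text (k is an illustrative choice)."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}


def minhash_signature(items, num_hashes=64):
    """Build a MinHash signature: the minimum hash value under each salted hash."""
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "big")  # a distinct salt acts as a distinct hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(item.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for item in items
        ))
    return signature


def estimated_jaccard(sig_a, sig_b):
    """Estimate the Jaccard score as the fraction of matching signature positions."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def exact_jaccard(set_a, set_b):
    """Exact Jaccard score, for comparison with the estimate."""
    return len(set_a & set_b) / len(set_a | set_b)
```

The estimated score approaches the exact Jaccard score as `num_hashes` grows. A distance of one minus the score could then be passed to a clustering routine.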

Key-Value Storage System including a Resource-Efficient Index
20180011852 · 2018-01-11

A key-value storage system is described herein for interacting with key-value entries in a content store using a resource-efficient index. The index provides a data structure that includes a plurality of hash buckets. Each hash bucket includes a linked list of hash bucket units. The key-value storage system stores hash entries in each linked list of hash bucket units in a distributed manner between an in-memory index store and a secondary index store, based on time of their creation. The key-value storage system is further configured to store hash entries in a particular collection of linked hash bucket units in a chronological order to reflect time of their creation. The index further includes various tunable parameters that affect the performance of the key-value storage system.
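
A minimal sketch of the index layout described above, under stated assumptions: the class names, the CRC32 bucket function, and the `unit_capacity` and `units_in_memory` parameters are hypothetical stand-ins for the tunable parameters the abstract mentions. The secondary index store is simulated with a flag rather than real storage, and entries hold the key itself where a real index would likely hold a compact hash.

```python
import zlib
from dataclasses import dataclass, field
from typing import Any, List, Optional, Tuple


@dataclass
class BucketUnit:
    entries: List[Tuple[str, Any]] = field(default_factory=list)  # oldest first
    older: Optional["BucketUnit"] = None  # link to the previously created unit
    in_memory: bool = True                # False simulates the secondary index store


class ChronologicalIndex:
    def __init__(self, num_buckets=4, unit_capacity=2, units_in_memory=1):
        self.num_buckets = num_buckets
        self.unit_capacity = unit_capacity
        self.units_in_memory = units_in_memory
        self.heads: List[Optional[BucketUnit]] = [None] * num_buckets

    def _bucket(self, key):
        return zlib.crc32(key.encode("utf-8")) % self.num_buckets

    def put(self, key, address):
        b = self._bucket(key)
        head = self.heads[b]
        if head is None or len(head.entries) >= self.unit_capacity:
            # Start a new unit at the head so the list stays newest-to-oldest.
            head = BucketUnit(older=head)
            self.heads[b] = head
            self._demote_old_units(b)
        head.entries.append((key, address))

    def _demote_old_units(self, b):
        # Only the newest units stay in memory; older ones move to secondary storage.
        unit, position = self.heads[b], 0
        while unit is not None:
            unit.in_memory = position < self.units_in_memory
            unit, position = unit.older, position + 1

    def get(self, key):
        # Search newest entries first so the most recent write for a key wins.
        unit = self.heads[self._bucket(key)]
        while unit is not None:
            for stored_key, address in reversed(unit.entries):
                if stored_key == key:
                    return address
            unit = unit.older
        return None

    def unit_locations(self, key):
        """Return the in-memory flags of the key's bucket, newest unit first."""
        flags, unit = [], self.heads[self._bucket(key)]
        while unit is not None:
            flags.append(unit.in_memory)
            unit = unit.older
        return flags
```

Because each list is ordered by creation time, a lookup for a recently written key typically ends in the in-memory portion without touching the simulated secondary store.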

Dynamic CFI using line-of-code behavior and relation models
11709981 · 2023-07-25

Disclosed herein are techniques for analyzing control-flow integrity based on functional line-of-code behavior and relation models. Techniques include receiving data based on runtime operations of a controller; constructing a line-of-code behavior and relation model representing execution of functions on the controller based on the received data; constructing, based on the line-of-code behavioral and relation model, a dynamic control flow integrity model configured for the controller to enforce in real-time; and deploying the dynamic control flow integrity model to the controller.

TECHNIQUES FOR IN-MEMORY DATA SEARCHING

A method performs efficient data searches in a distributed computing system. The method may include receiving a first key. The method may further include determining a hash map associated with the first key from among a plurality of hash maps. In some examples, the determined hash map maps a partition of a set of keys to particular index values. The method may further include determining an index value associated with a second key using the determined hash map. The method may further include determining transaction processing data associated with the first key using the determined index value and providing the transaction processing data. Utilization of the plurality of hash maps may enable a data search to be performed using on-board memory of an electronic device of the distributed computing system.
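
The lookup path described above could look roughly like the following sketch. The class name, the CRC32 partitioning rule, and the record list are assumptions for illustration. For simplicity the sketch uses a single key for both the hash map selection and the index lookup, whereas the abstract distinguishes a first and a second key.

```python
import zlib


class PartitionedIndex:
    """Several small hash maps, each covering one partition of the key set."""

    def __init__(self, num_partitions=4):
        self.maps = [dict() for _ in range(num_partitions)]  # key -> index value
        self.records = []                                    # index value -> data

    def _partition(self, key):
        # Hypothetical rule for choosing the hash map associated with a key.
        return zlib.crc32(key.encode("utf-8")) % len(self.maps)

    def insert(self, key, record):
        self.records.append(record)
        self.maps[self._partition(key)][key] = len(self.records) - 1

    def lookup(self, key):
        # Step 1: determine the hash map for the key.
        hash_map = self.maps[self._partition(key)]
        # Step 2: determine the index value using that hash map.
        index = hash_map.get(key)
        # Step 3: fetch the data stored at that index value.
        return None if index is None else self.records[index]
```

Splitting the key set across several maps keeps each map small, which is one plausible way for a single map to fit in the on-board memory of one device.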

FEDERATED IDENTITY MANAGEMENT WITH DECENTRALIZED COMPUTING PLATFORMS
20230239284 · 2023-07-27

Provided is a process that establishes user identities within a decentralized data store, like a blockchain. A user's mobile device may establish credential values within a trusted execution environment of the mobile device. Representations of those credentials may be generated on the mobile device and transmitted for storage in association with an identity of the user established on the blockchain. Similarly, one or more key-pairs may be generated or otherwise used by the mobile device for signatures and signature verification. Private keys may remain resident on the device (or known and input by the user) while corresponding public keys may be stored in association with the user identity on the blockchain. A private key is used to sign representations of credentials and other values as a proof of knowledge of the private key and credential values for authentication of the user to the user identity on the blockchain.

Technologies for providing edge deduplication

Technologies for providing deduplication of data in an edge network include a compute device having circuitry to obtain a request to write a data set. The circuitry is also to apply, to the data set, an approximation function to produce an approximated data set. Additionally, the circuitry is to determine whether the approximated data set is already present in a shared memory and write, to a translation table and in response to a determination that the approximated data set is already present in the shared memory, an association between a local memory address and a location, in the shared memory, where the approximated data set is already present. Additionally, the circuitry is to increase a reference count associated with the location in the shared memory.
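
The write path in this abstract might be sketched as below. The class name, the integer quantization used as the approximation function, and the `step` parameter are illustrative assumptions. The abstract does not say which approximation function is used, and the shared memory is simulated with a Python list.

```python
class DedupStore:
    def __init__(self, step=10):
        self.step = step
        self.shared = []       # simulated shared memory: location -> approximated data set
        self.location_of = {}  # approximated data set -> location in shared memory
        self.ref_count = []    # reference count per shared location
        self.translation = {}  # translation table: local address -> shared location

    def approximate(self, data):
        # Hypothetical approximation function: round each value down to a multiple of step.
        return tuple((value // self.step) * self.step for value in data)

    def write(self, local_address, data):
        approx = self.approximate(data)
        location = self.location_of.get(approx)
        if location is None:
            # Not present yet: store one copy in shared memory.
            location = len(self.shared)
            self.shared.append(approx)
            self.location_of[approx] = location
            self.ref_count.append(0)
        previous = self.translation.get(local_address)
        if previous is not None:
            self.ref_count[previous] -= 1  # the address no longer refers to its old data
        self.translation[local_address] = location
        self.ref_count[location] += 1
        return location

    def read(self, local_address):
        return self.shared[self.translation[local_address]]
```

Two writes whose data differ only below the quantization step end up sharing one location, and the reference count records how many local addresses point at it.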

Methods and apparatuses for vulnerability detection and maintenance prediction in industrial control systems using hash data analytics

Method, apparatus and computer program product for detecting vulnerability and predicting maintenance in an industrial control system are described herein.

Location sensitive ensemble classifier

Computer-implemented systems and methods for generating and using a location sensitive ensemble classifier for classifying content include dividing a validation data set into regions. Each region encompasses data points of the validation data set that fall within the region. A regional ensemble classifier is generated for each region based on the data points that fall within the region. A content item is then classified in at least one of a plurality of classes using the regional ensemble classifier for the region to which the content item belongs.
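
One way to picture the regional ensemble is the sketch below. It assumes one-dimensional data points, fixed-width regions, and base classifiers supplied as plain functions. It also assumes that each regional ensemble is a weighted vote, with weights equal to each base classifier's accuracy on the validation points in that region. None of these choices come from the abstract.

```python
from collections import defaultdict


def region_of(x, width=10.0):
    """Map a data point to a region id (fixed-width intervals, an assumption)."""
    return int(x // width)


def build_regional_weights(base_classifiers, validation, width=10.0):
    """Return {region: [accuracy of each base classifier on that region's points]}."""
    hits = defaultdict(lambda: [0] * len(base_classifiers))
    counts = defaultdict(int)
    for x, label in validation:
        region = region_of(x, width)
        counts[region] += 1
        for i, classifier in enumerate(base_classifiers):
            hits[region][i] += classifier(x) == label
    return {region: [h / counts[region] for h in hits[region]] for region in counts}


def classify(x, base_classifiers, weights, width=10.0):
    """Weighted vote using the weights of the region that x falls in."""
    # Regions with no validation data fall back to equal weights.
    regional = weights.get(region_of(x, width), [1.0] * len(base_classifiers))
    votes = defaultdict(float)
    for classifier, weight in zip(base_classifiers, regional):
        votes[classifier(x)] += weight
    return max(votes, key=votes.get)
```

A base classifier that performs well in one region and poorly in another is trusted only where the validation data supports it.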

Fast and accurate rule selection for interpretable decision sets
11704591 · 2023-07-18

An IDS generator determines multiple classes for electronic data items. The IDS generator determines, for each class, a class-specific candidate ruleset. The IDS generator performs a differential analysis of each class-specific candidate ruleset. The differential analysis is based on differences between result values of a scoring objective function. In some cases, the differential analysis determines at least one of the differences based on additional data structures, such as an augmented frequent-pattern tree. A probability function based on the differences is compared to a threshold probability. At least one testing ruleset is modified based on the comparison. The IDS generator determines, for each class, a class-specific optimized ruleset based on the differential analysis of each class-specific candidate ruleset. The IDS generator creates an optimized interpretable decision set based on combined class-specific optimized rulesets for the multiple classes.

KEY PACKING FOR FLASH KEY VALUE STORE OPERATIONS

A key value (KV) store, a method thereof, and a storage system are provided herein. The KV store may include a key logger; and a processor configured to receive a first command for storing a first KV in the KV store, write a first value of the first KV to a first NAND page, generate an extent map for identifying the first memory page including the first value, write the extent map to a second memory page, append an entry for storing the first KV to the key logger, and update a device hashmap of the KV store to include a first key of the first KV, upon a threshold being met within the key logger.
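
The command flow in this abstract could be sketched roughly as follows. The class name, the list that stands in for NAND pages, the dictionary form of the extent map, and the entry-count threshold are all assumptions for illustration. The abstract does not specify what the threshold measures.

```python
class FlashKVStore:
    def __init__(self, flush_threshold=3):
        self.pages = []            # simulated memory pages, addressed by list index
        self.key_logger = []       # pending (key, extent_map_page) entries, oldest first
        self.device_hashmap = {}   # key -> page holding that key's extent map
        self.flush_threshold = flush_threshold

    def put(self, key, value):
        # Write the value to one page, then the extent map that locates it to another.
        value_page = len(self.pages)
        self.pages.append(value)
        extent_map_page = len(self.pages)
        self.pages.append({"value_pages": [value_page]})
        # Append to the key logger; the device hashmap is updated only in batches.
        self.key_logger.append((key, extent_map_page))
        if len(self.key_logger) >= self.flush_threshold:
            self._flush()

    def _flush(self):
        for key, extent_map_page in self.key_logger:
            self.device_hashmap[key] = extent_map_page
        self.key_logger.clear()

    def _read(self, extent_map_page):
        extent_map = self.pages[extent_map_page]
        return self.pages[extent_map["value_pages"][0]]

    def get(self, key):
        # Recent writes may still be in the key logger, so check it first, newest first.
        for logged_key, extent_map_page in reversed(self.key_logger):
            if logged_key == key:
                return self._read(extent_map_page)
        extent_map_page = self.device_hashmap.get(key)
        return None if extent_map_page is None else self._read(extent_map_page)
```

Batching keys in the logger means the device hashmap is updated once per threshold crossing rather than once per write.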