G06F16/1844

SYSTEM AND METHOD FOR SECURING INFORMATION IN A DISTRIBUTED NETWORK VIA A DISTRIBUTED IDENTIFIER

Embodiments of the invention are directed to a system, method, or computer program product for an approach to securing information stored in a distributed network. The system allows for generating distributed identifiers for information entries, wherein the distributed identifiers mask the information entries using a hash function and the distributed identifiers are dispersed across distributed ledgers. The system also allows for originating nodes to access the information entries within the distributed identifiers, while permitting other nodes and domains to reference the distributed identifiers themselves instead of referencing the information entries.
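As a rough illustration of the masking idea in this abstract, the following minimal sketch derives an identifier from an information entry with a salted SHA-256 hash and places only the identifier on several ledgers. The names `make_distributed_identifier`, `OriginatingNode`, and `ledgers` are hypothetical, and the use of a per-node salt and plain Python sets as ledgers is an assumption, not something the abstract specifies.

```python
import hashlib
import secrets


def make_distributed_identifier(entry: bytes, salt: bytes) -> str:
    """Mask an information entry with a salted SHA-256 digest (assumed hash choice)."""
    return hashlib.sha256(salt + entry).hexdigest()


class OriginatingNode:
    """Hypothetical node that privately keeps the identifier-to-entry mapping."""

    def __init__(self) -> None:
        self._salt = secrets.token_bytes(16)
        self._entries = {}

    def register(self, entry: bytes) -> str:
        ident = make_distributed_identifier(entry, self._salt)
        self._entries[ident] = entry
        return ident

    def resolve(self, ident: str) -> bytes:
        # Only the originating node can map an identifier back to its entry.
        return self._entries[ident]


# Hypothetical ledgers: each stores identifiers only, never the entries.
ledgers = [set(), set(), set()]
node = OriginatingNode()
ident = node.register(b"account=12345")
for ledger in ledgers:
    ledger.add(ident)
```

Other nodes in this sketch can reference `ident` through any ledger, while only `node.resolve(ident)` recovers the underlying entry.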

METHOD AND APPARATUS FOR IMPLEMENTING CHANGES TO A FILE SYSTEM THAT IS EMULATED WITH AN OBJECT STORAGE SYSTEM

A method is described. The method includes receiving logs from multiple connector nodes, each of the logs having entries that describe changes made to a file system at its respective one of the connector nodes. The method includes constructing a directed acyclic graph (DAG) from the logs' respective entries, the DAG comprising nodes connected by flows, the nodes describing actions made to directories and files of the file system, wherein, a flow of the flows connects a subset of the nodes that describe actions made to a particular directory or file of the file system. The method includes removing one or more irrelevant nodes from the DAG. The method includes applying actions described by remaining nodes of the DAG to an object storage system having respective objects for directories and files of the file system.
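The steps in this abstract can be sketched in a simplified form. The example below merges logs from two connector nodes into per-path flows (each flow is an ordered chain, the simplest form of the DAG), prunes flows under an assumed irrelevance rule, and applies what remains to a dictionary standing in for the object storage system. The log layout, the function names, and the rule "a path created and later deleted within the logs is irrelevant" are all assumptions made for illustration; the abstract does not define them.

```python
from collections import defaultdict

# Hypothetical log entries per connector node: (timestamp, action, path)
logs = {
    "connector-a": [(1, "create", "/docs/a.txt"), (4, "delete", "/docs/a.txt")],
    "connector-b": [(2, "write", "/docs/a.txt"), (3, "create", "/docs/b.txt")],
}


def build_flows(logs):
    """Group entries by path and order them by timestamp; each chain is one flow."""
    flows = defaultdict(list)
    for connector, entries in logs.items():
        for ts, action, path in entries:
            flows[path].append((ts, connector, action))
    for chain in flows.values():
        chain.sort()
    return flows


def prune(flows):
    """Assumed rule: drop a flow whose path is created and then deleted in the logs."""
    kept = {}
    for path, chain in flows.items():
        actions = [action for _, _, action in chain]
        if actions[0] == "create" and actions[-1] == "delete":
            continue
        kept[path] = chain
    return kept


def apply_to_object_store(flows, store):
    """Replay the remaining actions against a dict that stands in for the object store."""
    for path, chain in flows.items():
        for _, _, action in chain:
            if action == "delete":
                store.pop(path, None)
            else:
                store[path] = action  # placeholder for the object payload
    return store


store = apply_to_object_store(prune(build_flows(logs)), {})
```

In this sketch the three actions on `/docs/a.txt` never reach the object store, because the file does not exist once the logs are reconciled.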

EFFICIENT REPLICATION OF FILE CLONES
20220414064 · 2022-12-29

A method for managing replication of cloned files is provided. Embodiments include determining, at a source system, that a first file has been cloned to create a second file. Embodiments include sending, from the source system to a replica system, an address of the first extent and an indication that a status of the first extent has changed from non-cloned to cloned. Embodiments include changing, at the replica system, a status of a second extent associated with a replica of the first file on the replica system from non-cloned to cloned and creating a mapping of the address of the first extent to an address of the second extent on the replica system. Embodiments include creating, at the replica system, a replica of the second file comprising a reference to the address of the second extent on the replica system.
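A minimal sketch of the replica-side handling described here, assuming the source's message carries the source extent address, the name of the first file, the index of the extent within that file, and the name of the clone. The `ReplicaSystem` class, its fields, and the message shape are hypothetical. The point illustrated is that the replica marks its own extent as cloned and has the new file reference it, so no extent data is copied.

```python
from dataclasses import dataclass, field


@dataclass
class Extent:
    data: bytes
    cloned: bool = False


@dataclass
class ReplicaSystem:
    extents: dict = field(default_factory=dict)   # replica address -> Extent
    files: dict = field(default_factory=dict)     # file name -> list of replica addresses
    addr_map: dict = field(default_factory=dict)  # source address -> replica address

    def handle_clone(self, src_addr, first_file, extent_index, second_file):
        """Apply a 'non-cloned -> cloned' notification received from the source."""
        replica_addr = self.files[first_file][extent_index]
        self.extents[replica_addr].cloned = True
        self.addr_map[src_addr] = replica_addr
        # The replica of the clone references the existing extent; no data is copied.
        self.files[second_file] = [replica_addr]


replica = ReplicaSystem()
replica.extents[900] = Extent(b"payload")
replica.files["a.bin"] = [900]
replica.handle_clone(src_addr=100, first_file="a.bin", extent_index=0, second_file="b.bin")
```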

Data guardianship in a cloud-based data storage system
11537475 · 2022-12-27

Techniques and mechanisms described herein provide for verification of data across cloud-based and on-premises data storage systems. According to various embodiments, a backup client implemented on a first compute node can store a data file in a backup data repository. A data guardianship can store first data file state information describing the data file in a key-value store accessible via the internet. A data verification instance can analyze the backup data repository to verify that the data file is stored intact in the backup data repository.

FILE STORAGE METHOD, TERMINAL, AND STORAGE MEDIUM
20220407725 · 2022-12-22

Embodiments of the present disclosure disclose a file storage method, terminal, and storage medium. The file storage method includes: obtaining a to-be-stored file, performing splitting processing on the to-be-stored file to obtain N sub-files corresponding to the to-be-stored file, wherein N is an integer greater than or equal to 1; sending the N sub-files to an IPFS, and receiving M pieces of address information corresponding to the N sub-files returned by the IPFS, wherein M is an integer greater than or equal to 1 and less than or equal to N; generating an address set corresponding to the to-be-stored file according to the M pieces of address information, and encrypting the address set to obtain an address set ciphertext; sending the address set ciphertext to a blockchain network and receiving a target index value returned by the blockchain network, wherein the target index value is used to identify the address set ciphertext.
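The flow in this abstract can be sketched end to end with in-memory stand-ins. `FakeIPFS` and `FakeChain` are hypothetical placeholders for an IPFS node and a blockchain network, and `toy_encrypt` is a deliberately simple XOR keystream used only to keep the example self-contained; it is not secure and is not the cipher the disclosure uses. The sketch also assumes content addressing, which is one way M can end up smaller than N: identical sub-files share an address.

```python
import hashlib
import json


def split(data: bytes, size: int):
    """Split the to-be-stored file into fixed-size sub-files."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class FakeIPFS:
    """Stand-in for IPFS: content-addressed, so identical chunks share an address."""

    def __init__(self):
        self.blocks = {}

    def add(self, chunk: bytes) -> str:
        addr = hashlib.sha256(chunk).hexdigest()
        self.blocks[addr] = chunk
        return addr


def toy_encrypt(data: bytes, key: bytes) -> bytes:
    """Placeholder XOR keystream (NOT secure); applying it twice decrypts."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


class FakeChain:
    """Stand-in for a blockchain network: returns an index for each stored payload."""

    def __init__(self):
        self.records = []

    def submit(self, payload: bytes) -> int:
        self.records.append(payload)
        return len(self.records) - 1


key = b"demo-key"
sub_files = split(b"AAAABBBBAAAA", 4)                 # N = 3 sub-files
ipfs = FakeIPFS()
addresses = [ipfs.add(chunk) for chunk in sub_files]
address_set = list(dict.fromkeys(addresses))           # M = 2 distinct addresses
ciphertext = toy_encrypt(json.dumps(address_set).encode(), key)
target_index = FakeChain().submit(ciphertext)          # identifies the ciphertext
```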

VIRTUAL PRIVATE DATA LAKES AND DATA CORRELATION DISCOVERY TOOL FOR IMPORTED DATA
20220398258 · 2022-12-15

Methods, systems, and computer-readable storage media for providing a VPDL within a data exploration system, storing enterprise-provided data in the VPDL, the enterprise-provided data including enterprise data from an enterprise system and data lake data from an enterprise data lake, importing, from an external data source, external data, automatically identifying associations between a sub-set of the enterprise-provided data and a sub-set of the external data and storing correlation data in the VPDL in response to an association, and reading at least a portion of the enterprise-provided data, at least a portion of the external data, and at least a portion of the correlation data, the data exploration tool being configured to generate one or more of visualizations and analytics by processing the at least a portion of the enterprise-provided data, the at least a portion of the external data, and the at least a portion of the correlation data.

Data validation for data record migrations
11526470 · 2022-12-13

Methods, systems, and devices for data validation are described. A user may store a set of data records on a source database and back up the set of data records at a target database through a data migration. A migration and validation server may initiate the data migration. After the data migration is complete, the migration and validation server may perform a validation process that includes comparing hash values calculated at the source database and at the target database, each based on unique identifiers and timestamps for each data record in the set of data records migrated from the source database to the target database. The migration and validation server may determine that the data migration was successful (e.g., the data was transferred correctly) if the hash value calculated for the data records at the target database equals the hash value calculated for the data records at the source database.
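The comparison described here can be sketched as a single digest computed over the (unique identifier, timestamp) pairs on each side. The record layout, the field names `id` and `updated_at`, and the choice of SHA-256 over sorted pairs are assumptions for illustration; the abstract only states that the hash is based on unique identifiers and timestamps.

```python
import hashlib


def migration_digest(records):
    """Hash the sorted (id, timestamp) pairs so that record order does not matter."""
    h = hashlib.sha256()
    for rec_id, ts in sorted((r["id"], r["updated_at"]) for r in records):
        h.update(f"{rec_id}|{ts}\n".encode())
    return h.hexdigest()


source = [
    {"id": "r1", "updated_at": "2022-01-01T00:00:00Z", "body": "alpha"},
    {"id": "r2", "updated_at": "2022-01-02T00:00:00Z", "body": "beta"},
]
target = list(reversed(source))  # same records, different storage order

migration_ok = migration_digest(source) == migration_digest(target)

# A record whose timestamp differs on the target makes the digests disagree.
stale = [dict(source[0], updated_at="2021-12-31T00:00:00Z"), source[1]]
stale_ok = migration_digest(source) == migration_digest(stale)
```

Because only identifiers and timestamps feed the digest, this sketch detects missing or stale records but would not notice a record whose body changed while its timestamp stayed the same.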

STORING VARIATIONS OF DATA ACROSS DIFFERENT REPLICATION SITES

A computer-implemented method according to one embodiment includes determining patterns of an application that utilizes a filesystem and/or properties of queries of the application. Data of the filesystem is stored across a plurality of replication sites of a data storage system. Based on the determined patterns of the application and/or the determined properties of the queries of the application, a utility of storing at least some of the data of the filesystem in different variations at more than one of the replication sites is estimated. The estimated utility is compared against a predetermined utility threshold, and in response to a determination that the estimated utility is greater than the predetermined utility threshold, a write system call offered by the filesystem is modified to store the data in different variations at more than one of the replication sites.

Identifying a network node to which data will be replicated

A method performed by a device for identifying a network node within a network to which data will be replicated is disclosed. The method comprises encrypting a session key according to an attribute-based encryption scheme; broadcasting the encrypted session key within the network; receiving at least one message encrypted using the session key from at least one network node within the network; and selecting a network node from the at least one network node to which data will be replicated. A further method, a device and a non-transitory machine-readable medium are also disclosed.

Remote control of a change data capture system

The present disclosure relates to a control system for remotely controlling a change data capture (CDC) system. The CDC system comprises a source computing system and a target computing system. The target computing system is configured to store a copy of data of the source computing system. The source computing system and the target computing system are configured to execute coordinated actions using predefined agents in order to identify a change to data of the source computing system and to propagate and store the change at the target computing system. The control system is configured for dynamically installing user-defined functions (UDFs) in the source and target systems in order to control the agents to perform the predefined actions.