G06F16/273

Systems and methods for managing distributed database deployments

Various aspects provide for implementation of a cloud service for running, monitoring, and maintaining cloud distributed database deployments and, in particular examples, provide cloud-based services to run, monitor, and maintain deployments of the known MongoDB database. Various embodiments provide services and interfaces, and manage provisioning of dedicated servers for the distributed database instances (e.g., MongoDB instances). Further aspects include providing a database as a cloud service that eliminates the design challenges associated with many distributed database implementations, while allowing the client's input on configuration choices in building the database. In some implementations, clients can simply identify a number of database nodes and the capability of the nodes, and within minutes have a fully functioning, scalable, replicated, and secure distributed database in the cloud.

Interface custom resource definition for stateful service management of clusters
11544289 · 2023-01-03

In an example embodiment, an additional interface custom resource definition (CRD), which operates in conjunction with the normal CRD, is utilized. The interface CRD may be called a service CRD. The service CRD provides an abstraction of the original CRD by abstracting away all technical details that no other services should depend upon. The service CRD provides a façade to the original CRD. Both are kept in sync by a component called an operator, which infers the specification of the original CRD on the basis of the specification of a given service CRD. Furthermore, status updates sent to the original CRD that are relevant to the dependent services are mirrored back to the corresponding service CRD. Correspondingly, status updates with technical details that are too specific for the dependent services are not mirrored back.

Optimized content object storage service for large scale content

Provided are techniques for an optimized content object storage service for large scale content. A content object file is created. An index entry for the content object file is created with a content object key and a content object location. The content object file is appended to an aggregated file on a storage node. In response to a request to retrieve the content object file from the aggregated file, the content object key is used to access the content object location, which describes the storage node, a name of the aggregated file, an offset into the aggregated file, and a size of the content object file, in order to retrieve the content object file.
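
The append-and-index scheme described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the names `Location`, `AggregatedStore`, `put`, and `get`, the single local directory standing in for a storage node, and the in-memory dictionary used as the index are all hypothetical.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """Content object location as listed in the abstract (field names assumed)."""
    node: str             # storage node identifier
    aggregated_file: str  # name of the aggregated file on that node
    offset: int           # byte offset of the object inside the aggregated file
    size: int             # length of the object in bytes


class AggregatedStore:
    """Hypothetical store: one directory plays the role of one storage node."""

    def __init__(self, root, node="node-1", aggregated_file="aggregate.bin"):
        self.root = root
        self.node = node
        self.aggregated_file = aggregated_file
        self.index = {}  # content object key -> Location (in-memory, assumed)

    def put(self, key, data):
        """Append the object to the aggregated file and record its index entry."""
        path = os.path.join(self.root, self.aggregated_file)
        # The current file size is the offset at which the new object starts.
        offset = os.path.getsize(path) if os.path.exists(path) else 0
        with open(path, "ab") as f:
            f.write(data)
        location = Location(self.node, self.aggregated_file, offset, len(data))
        self.index[key] = location
        return location

    def get(self, key):
        """Use the key to find the location, then read exactly `size` bytes."""
        location = self.index[key]
        path = os.path.join(self.root, location.aggregated_file)
        with open(path, "rb") as f:
            f.seek(location.offset)
            return f.read(location.size)
```

Retrieval never scans the aggregated file: the index entry alone gives the offset and size, so each read is one seek and one bounded read.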

Data system on a module (DSoM) for connecting computing devices and cloud-based services
11537631 · 2022-12-27

A communication device (e.g., a data system on a module (DSoM) or a Data System in a Package (DSiP)) is provided for communicatively coupling a computing device with a cloud-based service to synchronize one or more data modifications on an asynchronous basis with respect to one another. The communication device may be configured to be communicatively coupled to the computing device and may include a wireless cellular transceiver. The communication device may be configured to perform one or more of transmitting at least one data object from the computing device to the cloud-based service and receiving at least one data object from the cloud-based service. Embodiments of the present disclosure may provide a distributed replicated spatiotemporal database packaged on a communication device and integrated with an internet-based secure communication hub.

Data system configured to transparently cache data of data sources and access the cached data

The disclosed embodiments include a method for caching by a data system. The method includes automatically caching a portion of a data object from an external data source to a local cluster of nodes in accordance with a unit of caching. The portion of the data object can be selected for caching based on a frequency of accessing the portion of the data object. The portion of the data object in the cache is mapped to the external data source in accordance with a unit of hashing. The method further includes, responsive to the data system receiving a query for data stored in the external data source, obtaining query results that satisfy the received query by reading the portion of the cached data object instead of reading the data object from the external data source.

DATABASE REPLICATION SYSTEM AND METHOD, SOURCE END DEVICE, AND DESTINATION END DEVICE

In a database replication operation, a source end device obtains at least two groups of transaction logs from a log file of a source end database in parallel, where the at least two groups of transaction logs include a first group of transaction logs and a second group of transaction logs. The first group of transaction logs includes at least a first transaction log and a second transaction log that are adjacent to each other, and the second group of transaction logs includes at least a third transaction log and a fourth transaction log that are adjacent to each other, and a generation time point of the second transaction log is earlier than a generation time point of the third transaction log. The source end device then sends the at least two groups of transaction logs to a destination end device.

SYSTEM AND METHOD FOR MANAGING B TREE NODE SHARING USING OPERATION SEQUENCE NUMBERS

A system and method for managing copy-on-write (COW) B tree structures for metadata of storage objects stored in a storage system determine, when a request to modify a target storage object stored in the storage system that requires a modification of a target leaf node in a B tree structure for metadata of the target storage object is received, whether an operation sequence number of the target leaf node is greater than a snapshot sequence number of a parent snapshot of a running point of the B tree structure. When the operation sequence number is greater than the snapshot sequence number, the target leaf node is modified in place without copying the target leaf node. When the operation sequence number is not greater than the snapshot sequence number, the target leaf node is copied as a new leaf node for the B tree structure and the new leaf node is modified.
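
The sequence-number comparison in this abstract can be sketched as below. The sketch is illustrative only: `LeafNode`, `modify_leaf`, and the dictionary of entries are hypothetical names, and stamping the copied leaf with the current operation sequence number is an assumption that the abstract does not state.

```python
from dataclasses import dataclass, field


@dataclass
class LeafNode:
    op_seq: int                                  # operation sequence number of the leaf
    entries: dict = field(default_factory=dict)  # metadata held by the leaf


def modify_leaf(leaf, parent_snapshot_seq, current_op_seq, key, value):
    """Apply a modification and return the leaf that now holds it.

    parent_snapshot_seq: snapshot sequence number of the running point's
    parent snapshot. current_op_seq: sequence number of this operation
    (assumed to be what a freshly copied leaf is stamped with).
    """
    if leaf.op_seq > parent_snapshot_seq:
        # The leaf was written after the parent snapshot was taken, so the
        # snapshot cannot reference it: modify in place, no copy.
        leaf.entries[key] = value
        return leaf
    # Otherwise the leaf may be shared with the snapshot: copy it and
    # modify only the copy, leaving the snapshot's view untouched.
    new_leaf = LeafNode(op_seq=current_op_seq, entries=dict(leaf.entries))
    new_leaf.entries[key] = value
    return new_leaf
```

Under that stamping assumption, the first write after a snapshot pays for one copy, and later writes to the same leaf take the in-place branch because the copy's sequence number now exceeds the snapshot's.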

Discovery of linkage points between data sources

Data records are linked across a plurality of datasets. Each dataset contains at least one data record, and each data record is associated with an entity and includes one or more attributes of that entity and a value for each attribute. Values associated with attributes are compared across datasets, and matching attributes having values that satisfy a predetermined similarity threshold are identified. In addition, linkage points between pairs of datasets are identified. Each linkage point links one or more pairs of data records. Each data record in the pair of data records is contained in one of a given pair of datasets, and each pair of data records is associated with a common entity having matching attributes in the given pair of datasets. Data records associated with the common entities are linked across datasets using the identified linkage points.
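
One way to picture the two stages in this abstract (finding attribute pairs whose values are similar enough, then linking records through them) is the sketch below. It swaps in simple stand-ins that the abstract does not specify: Jaccard similarity over the sets of attribute values as the similarity measure, and exact value equality for linking individual records. The function names and the list-of-dicts dataset layout are hypothetical.

```python
def jaccard(values_a, values_b):
    """Jaccard similarity of two collections of hashable values."""
    a, b = set(values_a), set(values_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_linkage_points(dataset_1, dataset_2, threshold):
    """Return attribute pairs whose value sets meet the similarity threshold."""
    attrs_1 = sorted({name for record in dataset_1 for name in record})
    attrs_2 = sorted({name for record in dataset_2 for name in record})
    points = []
    for a1 in attrs_1:
        values_1 = [r[a1] for r in dataset_1 if a1 in r]
        for a2 in attrs_2:
            values_2 = [r[a2] for r in dataset_2 if a2 in r]
            if jaccard(values_1, values_2) >= threshold:
                points.append((a1, a2))
    return points


def link_records(dataset_1, dataset_2, points):
    """Link record pairs that agree on the value of at least one linkage point."""
    links = []
    for i, r1 in enumerate(dataset_1):
        for j, r2 in enumerate(dataset_2):
            if any(a1 in r1 and a2 in r2 and r1[a1] == r2[a2]
                   for a1, a2 in points):
                links.append((i, j))
    return links
```

The attribute names need not match across datasets: an `email` column in one dataset and a `contact` column in another can form a linkage point purely because their values overlap.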

Data recovery method and apparatus, server, and computer-readable storage medium

A data recovery method is provided. In the method, a backup type of a backup data packet is identified. Data recovery is performed based on physically backed up data in the backup data packet in a case that the identified backup type is a hybrid backup, the hybrid backup being a backup process that includes a physical backup and a logical backup. Data recovery is performed on logically backed up data in the backup data packet after the data recovery based on the physically backed up data is completed.
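
The ordering described in this abstract (physical restore first, logical restore only after it completes) can be sketched as follows. Everything concrete here is an assumption made for illustration: the packet is modeled as a dictionary with `backup_type`, `physical`, and `logical` fields, the database is a plain dictionary, and the logical backup is a list of `(operation, key, value)` tuples.

```python
def recover(packet, database):
    """Restore `database` from a hybrid backup packet (hypothetical layout)."""
    # Step 1: identify the backup type of the packet.
    backup_type = packet["backup_type"]
    if backup_type != "hybrid":
        # The abstract only describes the hybrid case; others are out of scope.
        raise NotImplementedError(f"unsupported backup type: {backup_type}")

    # Step 2: recover from the physically backed up data (a full image here).
    database.clear()
    database.update(packet["physical"])

    # Step 3: only after the physical restore completes, apply the
    # logically backed up data on top of it.
    for operation, key, value in packet["logical"]:
        if operation == "set":
            database[key] = value
        elif operation == "delete":
            database.pop(key, None)
    return database
```

The order matters in this sketch: replaying the logical operations before loading the physical image would have them overwritten by the older image.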

Replicating data changes through distributed invalidation

A computer-implemented method for replicating data changes through distributed invalidation includes receiving, by a distributed database system, an instruction to change a data element in a table. The distributed database system includes at least a first server and a second server. A first copy of the table is stored on the first server, and a second copy of the table is stored on the second server. The method further includes, in response to the instruction, determining that the data element is secured by a replication key that is stored on a shared key management system accessible by the first server and by the second server, wherein the replication key is unique to the data element. The method further includes invalidating the replication key and modifying the first copy of the table on the first server according to the received instruction.