Patent classifications
G06F11/1471
System and method for hybrid kernel and user-space checkpointing using a character device
A system, method, and computer readable medium for hybrid kernel-mode and user-mode checkpointing of multi-process applications using a character device. The computer readable medium includes computer-executable instructions for execution by a processing system. A multi-process application runs on primary hosts and is checkpointed by a checkpointer composed of a kernel-mode checkpointer module and one or more user-space interceptors providing barrier synchronization, a checkpointing thread, resource flushing, and an application virtualization space. Checkpoints may be written to storage and the application restored from said stored checkpoint at a later time. Checkpointing is transparent to the application and requires no modification to the application, operating system, networking stack, or libraries. In an alternate embodiment, the kernel-mode checkpointer is built into the kernel.
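The barrier synchronization step is the piece most easily shown in isolation. Below is a minimal user-space sketch in Python, assuming a double-barrier quiesce; the worker count, the Event flag, and the snapshot-by-copy are illustrative, and the patented system does this with a kernel-mode module and character device rather than pure threading.

```python
import threading

N_WORKERS = 4
state = [0] * N_WORKERS
checkpoint_requested = threading.Event()
stop = threading.Event()
quiesce = threading.Barrier(N_WORKERS + 1)  # N workers + 1 checkpointer
resume = threading.Barrier(N_WORKERS + 1)

def worker(idx):
    while not stop.is_set():
        state[idx] += 1                     # application work
        if checkpoint_requested.is_set():
            quiesce.wait()                  # pause at a consistent point
            resume.wait()                   # wait until the snapshot is taken

def checkpoint():
    checkpoint_requested.set()
    quiesce.wait()                          # every worker is now quiescent
    snapshot = list(state)                  # copy (or flush) resources
    checkpoint_requested.clear()
    resume.wait()                           # release the workers
    return snapshot

workers = [threading.Thread(target=worker, args=(i,)) for i in range(N_WORKERS)]
for t in workers:
    t.start()
print("checkpoint:", checkpoint())
stop.set()
for t in workers:
    t.join()
```

Clearing the request flag before releasing the resume barrier ensures no worker re-enters the quiesce barrier against a checkpointer that has already left.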
Managing storage systems that are synchronously replicating a dataset
Managing storage systems that are synchronously replicating a dataset, including: detecting a change in membership of the set of storage systems synchronously replicating the dataset; and applying one or more membership protocols to determine a new set of storage systems to synchronously replicate the dataset, wherein the one or more membership protocols include a quorum protocol, an external management protocol, or a racing protocol, and wherein one or more I/O operations directed to the dataset are applied to the new set of storage systems.
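Of the three membership protocols, the quorum protocol is the simplest to sketch. A minimal Python illustration follows, assuming a strict-majority rule over the prior membership; the array names and the majority rule are assumptions, as the abstract leaves the quorum details open.

```python
# Minimal quorum check: the new membership is the reachable subset of the
# old set, accepted only if it is a strict majority of the old set.
def quorum_new_membership(current, reachable):
    """Return the surviving replica set, or None if quorum was lost."""
    candidate = current & reachable
    if len(candidate) > len(current) / 2:    # strict majority of the old set
        return candidate                     # I/O continues on these systems
    return None                              # no quorum: stop applying I/O

old = {"array-a", "array-b", "array-c"}
print(quorum_new_membership(old, {"array-a", "array-b"}))  # two of three: quorum
print(quorum_new_membership(old, {"array-c"}))             # one of three: None
```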
Method for replacing a currently operating data replication engine in a bidirectional data replication environment without application downtime and while preserving target database consistency, and by using audit trail tokens that provide a list of active transactions
An automated method is provided for use when replacing a currently operating data replication engine in a first system with a new data replication engine in the first system in a bidirectional data replication environment. The currently operating data replication engine in the first system and the new data replication engine in the first system replicate first database transactions from an audit trail of a first database in the first system to a second database in a second system. The new data replication engine in the first system generates a list of active database transactions in the first system and sends the list of active database transactions to the new data replication engine in the second system as a first token. The new data replication engine in the second system receives the first token, fetches a transaction event from an audit trail of the second database, and replicates the fetched transaction event to the new data replication engine of the first system when the fetched transaction event does not match a transaction on the list in the first token. These steps are repeated during operation of the new data replication engine of the second system. The currently operating data replication engine in the first system is stopped from replicating first database transactions when all of the transactions on the generated list of active database transactions have been replicated to the second system.
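A rough Python sketch of the token filter on the second system, assuming events carry a transaction identifier; the dict shapes and names are invented, and the real engines read database audit trails rather than Python lists.

```python
def replicate_from_audit_trail(audit_trail, token_active_txns, send):
    """Replicate the second system's events back to the first system,
    skipping transactions named in the first system's token."""
    in_flight = set(token_active_txns)
    for event in audit_trail:
        if event["txn_id"] in in_flight:
            continue  # still active on the first system; don't echo it back
        send(event)

events = [{"txn_id": "t1", "op": "UPDATE"},
          {"txn_id": "t9", "op": "INSERT"}]
replicate_from_audit_trail(events, token_active_txns=["t1"], send=print)
# prints only the t9 event
```

Filtering on the token is what prevents transactions that originated on the first system from being replicated back to it in a loop while both engines run.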
Input/output (I/O) quiescing for sequential ordering of operations in a write-ahead-log (WAL)-based storage system
A method of input/output (I/O) quiescing in a write-ahead-log (WAL)-based storage system comprising a WAL is provided. The method generally includes receiving a request to process a control operation for the storage system, determining whether a memory buffer includes payload data for one or more write requests previously received for the storage system and added to the WAL, forcing an asynchronous flush of the payload data in the memory buffer to a persistent layer of the storage system when the memory buffer includes the payload data, and processing the control operation subsequent to completing the asynchronous flush, without waiting for processing of one or more other write requests in the WAL corresponding to payload data that was not added to the memory buffer prior to receiving the request to process the control operation.
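The ordering rule is easier to see in code. A toy Python sketch, where the WAL, memory buffer, and persistent layer are in-memory stand-ins for the structures named in the abstract:

```python
class WalStore:
    """Toy WAL-based store with a memory buffer and a persistent layer."""
    def __init__(self):
        self.wal = []          # write records, logged ahead of the data
        self.buffer = []       # buffered payload data awaiting flush
        self.persistent = []   # persistent layer

    def write(self, payload):
        self.wal.append(payload)      # write-ahead log entry
        self.buffer.append(payload)   # payload buffered for a later flush

    def control_op(self, name):
        if self.buffer:               # quiesce only what was buffered so far
            self.persistent.extend(self.buffer)   # forced flush
            self.buffer.clear()
        # proceed without draining writes whose payloads were never buffered
        return f"{name} processed after {len(self.persistent)} flushed writes"

store = WalStore()
store.write("w1")
store.write("w2")
print(store.control_op("resize"))   # flushes w1, w2, then runs the control op
```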
Messaging system failover
A device receives a notification indicating a failure of a first server device responsible for a primary message queue that includes messages at a time of the failure. A second server device is responsible for a standby message queue to which the messages are replicated, where a position in the standby message queue and a message time are assigned to each of the replicated messages. The device obtains a record time that identifies the message time of one of the messages that was last obtained from the primary message queue prior to the failure, compares an adjusted record time and the message time of one or more of the messages of the standby message queue to determine a starting position in the standby message queue, and processes messages obtained from the standby message queue beginning at one of the messages assigned to the position that matches the starting position.
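A small Python sketch of the starting-position search, assuming the adjustment models clock skew between the primary and standby queues; the field layout and the skew constant are assumptions, not taken from the patent.

```python
SKEW_ALLOWANCE = 2.0  # seconds; assumed adjustment applied to the record time

def starting_position(standby, record_time):
    """standby: list of (position, message_time, payload), ordered by position.
    record_time: message time of the last message consumed from the primary."""
    adjusted = record_time - SKEW_ALLOWANCE
    for position, message_time, _ in standby:
        if message_time >= adjusted:
            return position          # resume here
    return None                      # nothing newer: start past the end

standby = [(0, 10.0, "a"), (1, 11.5, "b"), (2, 14.0, "c")]
print(starting_position(standby, record_time=13.0))  # 1 (11.5 >= 11.0)
```

Adjusting the record time backward trades a few reprocessed messages for the guarantee that none are skipped, i.e. at-least-once delivery across the failover.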
Machine learning to predict container failure for data transactions in distributed computing environment
In-flight transactions with predictable pod failures in distributed computing environments are managed by integrating a transaction manager into pods having containers running applications in a distributed computing environment, wherein the transaction manager records a transaction log having data indicative of historical pod failure. A pod health checker that is also integrated into the pods determines predictive pod failure scenarios from the data of historical pod failure in the transaction log. Pod health can be tracked using the pod health checker by matching the predictive pod failure scenarios to transaction calls. Calls may be sent to a load balancer for recovery from pod failure for transaction calls matching the predictive pod failure scenarios. Pods can be configured to recover from the predicted pod failures.
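A hedged Python sketch of the matching step: the scenario shapes, call names, and the stub load balancer are invented for illustration, and in the described system the scenarios would be learned from the transaction log rather than hard-coded.

```python
FAILURE_SCENARIOS = [                             # assumed shape, stand-in for
    {"call": "charge", "payload_over_kb": 512},   # patterns mined from the
    {"call": "refund", "payload_over_kb": 128},   # historical transaction log
]

def predicts_failure(call_name, payload_kb):
    return any(s["call"] == call_name and payload_kb > s["payload_over_kb"]
               for s in FAILURE_SCENARIOS)

class LoadBalancer:                       # stub standing in for the real one
    def reroute(self, call_name):
        print("rerouting", call_name, "away from the at-risk pod")

def dispatch(call_name, payload_kb, lb):
    if predicts_failure(call_name, payload_kb):
        lb.reroute(call_name)             # recover before the pod fails
    # otherwise send the call to the current pod as usual

dispatch("charge", 1024, LoadBalancer())  # matches a failure scenario
```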
Systems and methods for provisioning and decoupled maintenance of cloud-based database systems
Methods and systems are described for provisioning cloud-based database systems and performing decoupled maintenance. For example, conventional systems may rely on database management systems to provision and modify databases hosted by a service provider. However, for entities operating complex database systems with the need for highly customized cloud infrastructure, database management systems fail to provide the granular customization and the control necessary to create and service these systems. In contrast, the described solutions provide an improvement over conventional database management system architecture by providing direct communication between an entity and its cloud-based database systems via command line prompts or API calls, decoupling database system maintenance from the database system provisioning process to increase the speed and granular customization of the database system. Moreover, the described solutions leverage machine learning to predict optimal database system provisioning and maintenance processes and resources.
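To make the decoupling concrete, here is an entirely hypothetical client sketch in Python; the endpoint paths, field names, and transport are invented, and serve only to show provisioning and maintenance as independent calls with their own lifecycles.

```python
import json

class DbSystemClient:
    """Hypothetical client illustrating decoupled provisioning/maintenance."""
    def __init__(self, send):
        self.send = send          # e.g. an HTTP POST or a CLI wrapper

    def provision(self, spec):
        return self.send("POST /systems", json.dumps(spec))

    def maintain(self, system_id, action):
        # Maintenance addresses an already-provisioned system directly,
        # with no dependency on the provisioning workflow.
        return self.send(f"POST /systems/{system_id}/maintenance",
                         json.dumps({"action": action}))

client = DbSystemClient(send=lambda path, body: print(path, body))
client.provision({"engine": "postgres", "nodes": 3, "storage_gb": 500})
client.maintain("db-42", "minor-version-upgrade")
```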
Detecting execution hazards in offloaded operations
Detecting execution hazards in offloaded operations is disclosed. A second offload operation is compared to a first offload operation that precedes the second offload operation. It is determined whether the second offload operation creates an execution hazard on an offload target device based on the comparison of the second offload operation to the first offload operation. If the execution hazard is detected, an error handling operation may be performed. In some examples, the offload operations are processing-in-memory operations.
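The comparison amounts to a classic data-hazard check between two operations. A minimal Python sketch, assuming each offload operation is represented by its read and write address sets (an assumed shape; real operations carry memory operands):

```python
def hazard(first, second):
    """Detect RAW/WAR/WAW hazards between two offloaded operations."""
    raw = first["writes"] & second["reads"]     # read-after-write
    war = first["reads"] & second["writes"]     # write-after-read
    waw = first["writes"] & second["writes"]    # write-after-write
    return bool(raw or war or waw)

op1 = {"reads": {0x10}, "writes": {0x20}}
op2 = {"reads": {0x20}, "writes": {0x30}}       # reads what op1 writes
if hazard(op1, op2):
    print("hazard detected: invoke error handling")  # e.g. stall or reorder
```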
Relaying storage operation requests to storage systems using underlying volume identifiers
Example implementations relate to virtual persistent volumes. In an example, a storage operation request includes a volume identifier. A volume mapping that corresponds to the volume identifier is identified. Underlying volume identifiers are identified based on the volume mapping. The underlying volume identifiers relate to underlying storage volumes that form at least part of a virtual persistent volume associated with the volume identifier. The storage operation request is relayed, using the underlying volume identifiers, to storage systems on which the underlying storage volumes are respectively located.
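A short Python sketch of the relay step; the mapping table, identifiers, and dispatch stub are illustrative, and the described implementation operates on virtual persistent volumes inside a storage virtualization layer rather than on dictionaries.

```python
# Hypothetical volume mapping: one virtual persistent volume backed by
# underlying volumes on two different storage systems.
VOLUME_MAP = {
    "vpv-123": [("underlying-a", "storage-sys-1"),
                ("underlying-b", "storage-sys-2")],
}

def relay(request):
    """Fan a storage operation out to each underlying volume's system."""
    for underlying_id, system in VOLUME_MAP[request["volume_id"]]:
        dispatch(system, {**request, "volume_id": underlying_id})

def dispatch(system, req):   # stand-in for the per-system storage API
    print(f"{system}: {req['op']} on {req['volume_id']}")

relay({"volume_id": "vpv-123", "op": "snapshot"})
```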
Techniques for command execution using a state machine
Techniques for processing a request may include: providing tasks to a state machine framework, wherein the tasks perform processing of a workflow for servicing the request; generating, by the state machine framework, a state machine for processing the request, wherein the state machine includes states associated with the tasks, wherein generating the state machine may include automatically determining a first state transition of the state machine between a first and a second of the states; receiving the request; and responsive to receiving the request, performing first processing using the state machine to service the request. The framework may automatically generate triggers that drive the state machine to determine subsequent states in accordance with defined state transitions. State machine internal state information may be persistently stored and used in restoring the state machine to one of its states in connection with processing of the command.
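A compressed Python sketch of the framework idea, assuming a linear workflow: task names, the journal, and the completion-driven trigger are simplified stand-ins for the framework's generated transitions and persisted internal state.

```python
import json

def make_state_machine(tasks):
    """tasks: ordered list of (name, fn); each completion triggers the next."""
    transitions = {tasks[i][0]: tasks[i + 1][0] for i in range(len(tasks) - 1)}
    return {"tasks": dict(tasks), "transitions": transitions,
            "state": tasks[0][0]}

def run(machine, request, journal):
    state = machine["state"]
    while state is not None:
        machine["tasks"][state](request)                   # execute the task
        journal.append(json.dumps({"completed": state}))   # persist progress
        state = machine["transitions"].get(state)          # automatic trigger
        machine["state"] = state

journal = []
tasks = [("validate", lambda r: None), ("execute", lambda r: None),
         ("commit", lambda r: None)]
run(make_state_machine(tasks), request={}, journal=journal)
# After a crash, the journal identifies the last completed state so the
# machine can be restored there instead of replaying the whole workflow.
```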