G06F11/1474

Transaction processing method, apparatus, and device and computer storage medium

A transaction processing method includes: dividing a to-be-processed transaction obtained from a database into at least two subtransactions; dividing each subtransaction into N parts with an association relationship; processing the N parts of each subtransaction based on the association relationship, to obtain a processing result of the last-executed part of the N parts; determining, upon detecting an abnormal subtransaction based on the processing result, a processing policy matching an abnormality reason of the abnormal subtransaction; and processing the abnormal subtransaction by using the processing policy, to obtain a final processing result of the to-be-processed transaction.

Assignment of quora values to nodes based on importance of the nodes

Embodiments described herein are generally directed to techniques for avoiding or mitigating shared-state damage during a split-brain condition in a distributed network of compute nodes. According to an example, a number, N, of nodes within the distributed computing system is determined. During normal operation of the distributed computing system, a unified state is maintained by synchronizing shared state information. The nodes are ordered by increasing importance to an application from 1 to N. A quora value, q_n, is assigned to each of the nodes in accordance with the ordering, where q_1 = 1 and each subsequent quora value, q_{n+1}, is a sum of all prior quora values, q_1 to q_n, plus either 1 or a current value of n. These quora values may then be used to determine membership in the dominant or a yielding set to facilitate recovery from the split-brain condition by performing pessimistic or optimistic mitigation actions.
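
The recurrence in this abstract can be worked through in a few lines. The following is a minimal Python sketch, not the patented implementation; the function name `assign_quora` and the `add_n` flag are illustrative assumptions.

```python
def assign_quora(num_nodes, add_n=False):
    """Return quora values q_1..q_N for nodes ordered by increasing importance.

    q_1 = 1 and q_{n+1} = (q_1 + ... + q_n) + 1, or + n when add_n is True.
    The names here are hypothetical; only the recurrence comes from the abstract.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    values = [1]  # q_1
    for n in range(1, num_nodes):
        increment = n if add_n else 1
        values.append(sum(values) + increment)  # q_{n+1}
    return values


print(assign_quora(5))              # [1, 2, 4, 8, 16]
print(assign_quora(5, add_n=True))  # [1, 2, 5, 11, 23]
```

Under either variant each value exceeds the sum of all lower values. If sets of nodes are compared by their quora totals (an assumption; the abstract does not spell out the comparison), a set containing the most important node always outweighs any set of the remaining nodes.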

Hardware validation of safety critical scheduling
11537481 · 2022-12-27

The exemplary embodiments are related to a device, a system, and a method for implementing a hardware mechanism that is configured to validate the performance of scheduling software utilized by a safety-critical system. The hardware device may receive an indication that a first frame of a frame schedule is in use. The hardware device may also monitor a time parameter corresponding to the first frame. The hardware device may also determine whether an indication that a second frame of the frame schedule is in use is received prior to the expiration of the time parameter. When the indication that the second frame of the frame schedule is in use is not received prior to the expiration of the time parameter, the hardware device may send a signal to an operating system of the safety-critical system indicating that an error in executing the frame schedule has occurred.
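
The deadline check described in this abstract can be modelled in software. The following is a minimal Python sketch under stated assumptions: the class `FrameMonitor`, its method names, and the use of plain numeric timestamps are hypothetical, and the abstract describes a hardware device rather than code.

```python
class FrameMonitor:
    """Software model of the check: the next frame must start before a deadline."""

    def __init__(self, signal_error):
        self.signal_error = signal_error  # stand-in for the signal to the OS
        self.current_frame = None
        self.deadline = None

    def frame_in_use(self, frame_id, now, time_budget):
        """Record that a frame is in use and arm its time parameter."""
        self.check(now)  # a late indication still counts as a missed deadline
        self.current_frame = frame_id
        self.deadline = now + time_budget

    def check(self, now):
        """Signal an error if the time parameter expired with no new frame."""
        if self.deadline is not None and now >= self.deadline:
            self.signal_error(self.current_frame)
            self.deadline = None  # report each missed deadline once
            return True
        return False


errors = []
monitor = FrameMonitor(errors.append)
monitor.frame_in_use(1, now=0, time_budget=10)
monitor.frame_in_use(2, now=8, time_budget=10)  # arrives in time, no error
monitor.check(now=18)                           # frame 2 overran its budget
```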

Systems and methods of backup and recovery of journaling systems

In part, the disclosure relates to a backup and restoration system for a transactional log based journaling application. The system includes a transactional log backup process executing on one or more computing devices; an archive stored in non-transitory computer readable memory; and a binary difference file generator in electronic communication with the archive and responsive to instructions from the transactional log backup process. In one embodiment, the binary difference file generator includes a backup driver in electrical communication with and responsive to communication signals from the transactional backup process.

Data recovery method and apparatus, server, and computer-readable storage medium

A data recovery method is provided. In the method, a backup type of a backup data packet is identified. Data recovery is performed based on physically backed up data in the backup data packet in a case that the identified backup type is a hybrid backup, the hybrid backup being a backup process that includes a physical backup and a logical backup. Data recovery is performed on logically backed up data in the backup data packet after the data recovery based on the physically backed up data is completed.
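
The ordering in this abstract (physical data first, logical data afterwards) can be sketched as follows. This is a minimal Python illustration, not the claimed method: the packet layout, the key names `backup_type`, `physical`, and `logical`, and the use of a dict as the database are all assumptions.

```python
def recover(packet, database):
    """Restore `database` (a dict) from a backup packet, physical data first."""
    backup_type = packet["backup_type"]
    if backup_type != "hybrid":
        # The abstract only describes the hybrid case.
        raise NotImplementedError(f"unsupported backup type: {backup_type}")
    steps = []
    # Step 1: restore from the physically backed up data (modelled as a snapshot).
    database.clear()
    database.update(packet["physical"])
    steps.append("physical")
    # Step 2: only after step 1 completes, apply the logically backed up data
    # (modelled as an ordered list of key/value writes).
    for key, value in packet["logical"]:
        database[key] = value
    steps.append("logical")
    return steps


db = {"stale": 0}
packet = {
    "backup_type": "hybrid",
    "physical": {"a": 1, "b": 2},
    "logical": [("b", 3), ("c", 4)],
}
order = recover(packet, db)
```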

Error recovery for non-volatile memory modules

A memory controller includes a command queue, a memory interface queue, at least one storage queue, and a replay control circuit. The command queue has a first input for receiving memory access commands. The memory interface queue receives commands selected from the command queue and couples to a heterogeneous memory channel which is coupled to at least one non-volatile storage class memory (SCM) module. The at least one storage queue stores memory access commands that are placed in the memory interface queue. The replay control circuit detects that an error has occurred requiring a recovery sequence, and in response to the error, initiates the recovery sequence. In the recovery sequence, the replay control circuit transmits selected memory access commands from the at least one storage queue by grouping non-volatile read commands together separately from all pending volatile reads, volatile writes, and non-volatile writes.
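
The grouping step of the recovery sequence can be illustrated with a short Python sketch. It assumes a hypothetical `Command` record and function name; the abstract does not state which group is transmitted first, so the sketch only separates the two groups and preserves the stored order within each.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    op: str            # "read" or "write"
    nonvolatile: bool  # True if the command targets the non-volatile SCM module
    address: int


def group_for_replay(stored_commands):
    """Split stored commands into (non-volatile reads, everything else)."""
    nv_reads, others = [], []
    for cmd in stored_commands:
        if cmd.nonvolatile and cmd.op == "read":
            nv_reads.append(cmd)
        else:
            # volatile reads, volatile writes, and non-volatile writes
            others.append(cmd)
    return nv_reads, others


queue = [
    Command("read", True, 0x10),
    Command("write", False, 0x20),
    Command("read", False, 0x30),
    Command("write", True, 0x40),
    Command("read", True, 0x50),
]
nv_reads, others = group_for_replay(queue)
```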

Persistent memory image capture

A memory image can be captured by generating metadata indicative of a state of volatile memory and/or byte-addressable PMEM at a particular time during execution of a process by an application. This memory image can be persisted without copying the in-memory data into a separate persistent storage by storing the metadata and safekeeping the in-memory data in the volatile memory and/or PMEM. Metadata associated with multiple time-evolved memory images captured can be stored and managed using a linked index scheme. A linked index scheme can be configured in various ways including a full index and a difference-only index. The memory images can be used for various purposes including suspending and later resuming execution of the application process, restoring a failed application to a previous point in time, cloning an application, and recovering an application process to a most recent state in an application log.

System and method for improving detection and capture of a host system catastrophic failure

An information handling system includes a non-volatile storage device communicatively coupled to a boot processor and an application processor. The boot processor, prior to the execution of a hang sensitive transaction, stores information associated with the hang sensitive transaction at a memory device. The application processor is configured to detect a catastrophic failure of the hang sensitive transaction. In response to the detection of the catastrophic failure, the application processor retrieves the information stored at the memory device and stores the information at the non-volatile storage device.

Jitter-tolerant distributed two-phase commit (2PC) systems
11507411 · 2022-11-22

A method of ensuring atomicity of transactions across a plurality of active hosts in a distributed environment is provided. The method generally includes receiving, from a client, a second request to commit a second transaction subsequent to receiving a first request to commit a first transaction; assigning a second prepare identifier (ID) to the second transaction, wherein the second prepare ID assigned to the second transaction is greater than a first prepare ID assigned to the first transaction; transmitting, to the plurality of active hosts, instructions to prepare for committing the second transaction, the instructions including the second prepare ID; receiving, from each host, an acknowledgement indicating successful preparation for committing the second transaction; and transmitting, to the plurality of active hosts, instructions to commit the second transaction prior to receiving, from each host, an acknowledgement indicating successful preparation for committing the first transaction.
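
The coordinator behavior in this abstract, where a later transaction may commit before an earlier one has been fully acknowledged, can be sketched in Python. The class `Coordinator`, its method names, and the in-memory `outbox` standing in for network messages are hypothetical; this is an illustration under those assumptions, not the claimed system.

```python
import itertools


class Coordinator:
    """Tracks prepare acknowledgements per transaction, independently."""

    def __init__(self, hosts):
        self.hosts = list(hosts)
        self._ids = itertools.count(1)  # monotonically increasing prepare IDs
        self.awaiting = {}              # prepare_id -> (txn, hosts yet to ack)
        self.outbox = []                # stand-in for messages sent to all hosts

    def request_commit(self, txn):
        """Assign the next prepare ID and send prepare instructions."""
        prepare_id = next(self._ids)
        self.awaiting[prepare_id] = (txn, set(self.hosts))
        self.outbox.append(("prepare", prepare_id, txn))
        return prepare_id

    def on_prepare_ack(self, host, prepare_id):
        """Record an ack; send commit once every host has acknowledged."""
        txn, waiting = self.awaiting[prepare_id]
        waiting.discard(host)
        if waiting:
            return False
        del self.awaiting[prepare_id]
        self.outbox.append(("commit", prepare_id, txn))
        return True


coord = Coordinator(["h1", "h2"])
first = coord.request_commit("t1")
second = coord.request_commit("t2")  # receives the larger prepare ID
coord.on_prepare_ack("h1", second)
coord.on_prepare_ack("h2", second)   # t2 commits while t1 is still pending
```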

Rollback recovery with data lineage capture for data pipelines
20220365851 · 2022-11-17

Computer-readable media, methods, and systems are disclosed for performing rollback recovery with data lineage capture for data pipelines. A middle operator receives ingested input events from a source operator reading data from an external input data source. The middle operator then logs information regarding middle input events to a middle operator input log, designating the logged middle input event information as incomplete. The middle operator then processes data associated with the middle input events and updates the middle operator input log entries, setting a completed logging status designation for the middle input events that were consumed to produce one or more middle output events. The middle operator then transmits the middle output events to subsequent operators. Garbage collection is performed to remove completed entries from the middle operator output log. Finally, based on receiving a recovering message from a subsequent operator, corresponding middle output events are re-sent.
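
The logging and re-send cycle of the middle operator can be sketched in Python. Several simplifications are assumptions and not taken from the abstract: each input event produces exactly one output event under the same ID, garbage collection is triggered by downstream acknowledgement, and the class and method names are hypothetical.

```python
class MiddleOperator:
    """Logs inputs as incomplete, processes them, and keeps outputs for re-send."""

    def __init__(self, transform):
        self.transform = transform
        self.input_log = {}   # event_id -> "incomplete" or "completed"
        self.output_log = {}  # event_id -> output event kept for recovery

    def process(self, event_id, payload):
        self.input_log[event_id] = "incomplete"  # logged before processing
        output = self.transform(payload)
        self.output_log[event_id] = output
        self.input_log[event_id] = "completed"   # input was consumed
        return output                            # transmitted downstream

    def garbage_collect(self, acknowledged_ids):
        """Drop output log entries that no longer need to be re-sent."""
        for event_id in acknowledged_ids:
            self.output_log.pop(event_id, None)

    def on_recovery(self, requested_ids):
        """Re-send the logged output events a recovering operator asks for."""
        return [self.output_log[i] for i in requested_ids if i in self.output_log]


op = MiddleOperator(str.upper)
op.process(1, "a")
op.process(2, "b")
op.garbage_collect([1])          # event 1 was acknowledged downstream
resent = op.on_recovery([1, 2])  # only event 2 is still in the output log
```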