Patent classifications
G06F11/203
METHOD, ELECTRONIC DEVICE, AND PROGRAM PRODUCT FOR FAILURE HANDLING
Embodiments of the present disclosure relate to a method, an electronic device, and a computer program product for failure handling. This failure handling method includes determining a sector set failure type associated with at least one failed sector set of a disk; if the sector set failure type indicates that the number of failed sector sets in the at least one failed sector set is greater than a first threshold number, generating an instruction for replacing the disk; and otherwise performing at least one of the following: migrating data from a failed sector set in which the number of failed sectors is greater than a second threshold number to a spare sector set, and performing a failure recovery for a failed sector set in which the number of failed sectors is less than or equal to the second threshold number.
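The two-threshold decision described in this abstract can be sketched as follows. This is only an illustrative sketch, not the claimed implementation: the names `SectorSet`, `plan_failure_handling`, and the action labels are hypothetical, and the sketch assumes that both the migration and the recovery branches are applied, although the abstract requires only at least one of them.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SectorSet:
    """Hypothetical record for one failed sector set of a disk."""
    set_id: int
    failed_sectors: int  # number of failed sectors in this set


def plan_failure_handling(
    failed_sets: List[SectorSet],
    first_threshold: int,
    second_threshold: int,
) -> List[Tuple[str, Optional[int]]]:
    """Return a list of (action, sector set id) pairs for a disk."""
    # Too many failed sector sets: the whole disk should be replaced.
    if len(failed_sets) > first_threshold:
        return [("replace_disk", None)]

    actions: List[Tuple[str, Optional[int]]] = []
    for sector_set in failed_sets:
        if sector_set.failed_sectors > second_threshold:
            # Heavily damaged set: move its data to a spare sector set.
            actions.append(("migrate_to_spare", sector_set.set_id))
        else:
            # Lightly damaged set: attempt failure recovery in place.
            actions.append(("recover_in_place", sector_set.set_id))
    return actions
```

For example, with a first threshold of 3 and a second threshold of 2, a disk with two failed sector sets containing 5 and 1 failed sectors would yield one migration and one in-place recovery rather than a disk replacement.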
REMOTELY HEALING CRASHED PROCESSES
A method of repairing crashed applications includes detecting a crash in an application operating in a host computing device. The application is migrated to a remote computer server. The remote computer server provisions computing resources to the application, while the application is resident in the remote computer server. Resumed operation of the application is executed, using the provisioned computing resources, in the remote computer server. Execution results are generated from the application, in the remote computer server. The generated execution results are migrated from the application back to the host computing device.
SYSTEMS AND METHODS FOR MIGRATION OF VIRTUAL COMPUTING RESOURCES USING SMART NETWORK INTERFACE CONTROLLER ACCELERATION
An information handling system may include a first host system, comprising a first processor and a first network interface, and a second host system, comprising a second processor and a second network interface. The first network interface may be configured to accelerate migration of a designated virtual resource from the first host system to the second host system.
Processes and systems that detect abnormal behavior of objects of a distributed computing system
Automated processes and systems for detecting abnormally behaving objects of a distributed computing system are described. Processes and systems obtain metrics that are generated in a historical time window and are associated with an object of the distributed computing system. Processes and systems use the metrics to compute a time-dependent system indicator over the historical time window. Each value of the system indicator corresponds to a point in time of the historical time window when the object was in a normal or an abnormal state. Processes and systems use the normal and abnormal states of the system indicator in the historical time window to train a state classifier that is used to detect run-time abnormal behavior of the object. When the state classifier identifies abnormal behavior of the object, an alert is generated, indicating the abnormal behavior of the object.
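The pipeline in this abstract (metrics, then a system indicator, then a trained state classifier, then an alert) might look roughly like the sketch below. The abstract does not say how the indicator is computed or which classifier is used, so everything here is an assumption: the indicator is taken to be the mean of the metrics at each time point, the historical labels are assumed to be given, and a nearest-centroid rule stands in for the state classifier. All function names are hypothetical.

```python
from statistics import mean
from typing import Callable, List, Optional, Sequence


def system_indicator(metric_rows: Sequence[Sequence[float]]) -> List[float]:
    """Collapse the metrics at each time point into one value (assumed: mean)."""
    return [mean(row) for row in metric_rows]


def train_state_classifier(
    indicator: Sequence[float], labels: Sequence[str]
) -> Callable[[float], str]:
    """Train a nearest-centroid classifier on labeled indicator values."""
    normal = [v for v, label in zip(indicator, labels) if label == "normal"]
    abnormal = [v for v, label in zip(indicator, labels) if label == "abnormal"]
    normal_centroid, abnormal_centroid = mean(normal), mean(abnormal)

    def classify(value: float) -> str:
        # Assign the state whose historical centroid is closer.
        if abs(value - abnormal_centroid) < abs(value - normal_centroid):
            return "abnormal"
        return "normal"

    return classify


def check(classify: Callable[[float], str], metrics_row: Sequence[float]) -> Optional[str]:
    """Return an alert message for run-time metrics, or None if normal."""
    value = mean(metrics_row)
    if classify(value) == "abnormal":
        return f"ALERT: abnormal behavior (indicator={value:.2f})"
    return None
```

A production system would likely use a richer indicator and classifier; the point of the sketch is only the separation between the historical training step and the run-time check.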
Container-Based Application Data Protection Method and System
A computer-implemented method of continuous restore for containerized applications includes initiating a continuous restore process for a containerized application having an application template and application data, where the containerized application executes on a first cluster. A backup plan for the containerized application is generated. A persistent volume containing the application data in the first cluster is identified and some of the application data is moved from the persistent volume to a backup target based on the backup plan schedule. The backup plan is received at a data synch process executing on a second cluster. A persistent volume is created on the second cluster. Some of the application data is moved from the backup target to the created persistent volume on the second cluster based on the backup plan schedule. The containerized application is recovered at the second cluster based on some of the application data moved to the persistent volume on the second cluster by the data synch process such that the recovered containerized application is operational at the most recent backup point-of-time of the backup plan schedule.
Monitoring of replicated data instances
Replicated instances in a distributed computing environment provide for automatic failover and recovery. A component monitors the status of event processors in a set or bucket and handles the failure of an event processor. For a large number of instances, the data environment can be partitioned such that each monitoring component is assigned a partition of the workload. At intervals, each event processor sends a “heartbeat” message to the event processors in the bucket covering the same workload partition, to inform the other event processors of the status of the event processor sending the heartbeat. If it is determined that a heartbeat is received from each event processor in the bucket, a current process can continue. In the event of monitoring component failure, the instances can be repartitioned, and the remaining monitoring components can be assigned to the new partitions to substantially evenly distribute the workload.
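Two pieces of this abstract lend themselves to a short sketch: checking that every event processor in a bucket has sent a recent heartbeat, and redistributing instances evenly when a monitoring component fails. The function names, the timeout parameter, and the round-robin assignment are assumptions made for illustration; the abstract only states that the workload is distributed "substantially evenly".

```python
from typing import Dict, List


def all_heartbeats_received(
    last_seen: Dict[str, float], now: float, timeout: float
) -> bool:
    """True if every event processor in the bucket reported within `timeout`."""
    return all(now - seen <= timeout for seen in last_seen.values())


def repartition(instances: List[str], monitors: List[str]) -> Dict[str, List[str]]:
    """Assign instances round-robin to the remaining monitoring components."""
    if not monitors:
        raise ValueError("no monitoring components left to take the workload")
    assignment: Dict[str, List[str]] = {monitor: [] for monitor in monitors}
    for index, instance in enumerate(instances):
        # Round-robin keeps partition sizes within one instance of each other.
        assignment[monitors[index % len(monitors)]].append(instance)
    return assignment
```

With five instances and two surviving monitors, the round-robin rule gives one monitor three instances and the other two.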
MANAGING APPLICATIONS IN A CLUSTER
Approaches for managing applications in a cluster are described. In an example, a first agent may be executing on a first programmable network adapter card installed within a first computing node within a cluster. The first agent may isolate an application executing on the first computing node. Thereafter, the application may be managed by a second computing node.
Methods, systems and apparatus to dynamically facilitate boundaryless, high availability M:N working configuration system management
A system for dynamically load-balancing at least one redistribution element across a group of computing resources that facilitates at least an aspect of an Industrial Execution Process in an M:N working configuration is illustrated. The system is configured to: access from a central or distributed data store, a configuration component operational data and capabilities or characteristics associated with the M:N working configuration; identify a load-balancing opportunity to trigger redistribution of a redistribution element to a redistribution target selected from a redistribution target pool defined by remaining computing resource components associated with the M:N computing resource working configuration; select at least one redistribution target for redeployment; redeploy the at least one redistribution element to the redistribution target; determine redeployment to the at least one selected redistribution target to be a viable redeployment; and execute the Industrial Execution Process utilizing the at least one redistribution element at the selected redistribution target.
SYSTEMS, METHODS, AND APPARATUS FOR HIGH AVAILABILITY APPLICATION MIGRATION IN A VIRTUALIZED ENVIRONMENT
Methods, apparatus, systems, and articles of manufacture are disclosed for high availability (HA) application migration in a virtualized environment. An example apparatus includes at least one memory, instructions in the apparatus, and processor circuitry to at least one of execute or instantiate the instructions to identify an HA slot in a virtual server rack, the HA slot to facilitate a failover of an application executing on a first virtual machine (VM) in the virtual server rack, the first VM identified as a protected VM, deploy a second VM in the HA slot, transfer data from the first VM to the second VM, and, in response to not identifying a failure of at least one of the first or second VMs during the transfer, trigger a shutdown of the first VM, and synchronize migration data associated with the virtual server rack to identify the second VM as the protected VM.
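The sequence of steps in this abstract can be illustrated with a small in-memory model. This is a sketch under stated assumptions rather than the disclosed apparatus: the `VM` class, the `rack_state` dictionary, and the `healthy` flag used to stand in for failure detection are all hypothetical, and data transfer is reduced to copying a dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class VM:
    """Hypothetical stand-in for a virtual machine in the server rack."""
    name: str
    data: Dict[str, str] = field(default_factory=dict)
    running: bool = True
    healthy: bool = True


def migrate_protected_vm(first_vm: VM, rack_state: dict, ha_slot: str) -> VM:
    """Move a protected VM's application into an HA slot."""
    # Deploy a second VM in the identified HA slot.
    second_vm = VM(name=f"{first_vm.name}-ha")
    rack_state["slots"][ha_slot] = second_vm.name

    # Transfer data from the first VM to the second VM.
    second_vm.data = dict(first_vm.data)

    # Only if neither VM failed during the transfer: shut down the first VM
    # and record the second VM as the protected VM.
    if first_vm.healthy and second_vm.healthy:
        first_vm.running = False
        rack_state["protected_vm"] = second_vm.name
    return second_vm
```

Keeping the first VM running until the transfer is confirmed means a failure during the transfer leaves the original protected VM untouched.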
Data Center Restoration and Migration
A system can maintain a first data center that comprises a virtualized overlay network and virtualized volume identifiers. The system can determine to perform a restore of data of the first data center to a second data center, the data comprising first instances of virtualized workloads. The system can transfer the data to the second data center. The system can configure the second data center with a second instance of the virtualized overlay network and a second instance of the virtualized volume identifiers. The system can operate second instances of the virtualized workloads on the second data center, the second instances of the virtualized workloads invoking the second instance of the virtualized overlay network and the second instance of the virtualized volume identifiers.