G06F11/1438

Failover management for batch jobs

Computer-implemented methods, computer program products, and computer systems are provided. A method includes generating a running result matrix for a plurality of batch jobs, indicating corresponding running results for respective processing actions in batch jobs of the plurality of batch jobs. The method further includes obtaining an internal dependency matrix for the plurality of batch jobs, indicating corresponding dependencies between respective processing actions within a batch job of the plurality of batch jobs. The method further includes calculating a recovery matrix for the plurality of batch jobs based, at least in part, on the running result matrix and the internal dependency matrix, the recovery matrix indicating corresponding recovery actions for respective processing actions in batch jobs of the plurality of batch jobs. The method further includes executing failover management for one or more batch jobs based, at least in part, on the calculated recovery matrix.
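As a rough illustration of the calculation described above, the sketch below derives a recovery matrix from a running result matrix and per-job internal dependency matrices. The encoding (1 = succeeded, 0 = failed), the recovery rule (rerun a failed action and every action that transitively depends on it), and all function names are assumptions made for this example; the abstract does not specify them.

```python
from typing import List

Matrix = List[List[int]]


def recovery_row(results: List[int], deps: Matrix) -> List[int]:
    """Return 1 for each action to rerun and 0 for each action to skip.

    results[i] is 1 if action i succeeded, 0 if it failed (assumed encoding).
    deps[i][j] is 1 if action i depends on action j within the same job.
    """
    n = len(results)
    rerun = [1 - r for r in results]  # every failed action must rerun
    changed = True
    while changed:  # propagate reruns to dependents until nothing changes
        changed = False
        for i in range(n):
            if rerun[i]:
                continue
            if any(deps[i][j] and rerun[j] for j in range(n)):
                rerun[i] = 1
                changed = True
    return rerun


def recovery_matrix(running_results: Matrix, deps_per_job: List[Matrix]) -> Matrix:
    """Build one recovery row per batch job."""
    return [recovery_row(r, d) for r, d in zip(running_results, deps_per_job)]
```

For a job whose three actions form a chain (action 1 depends on action 0, action 2 on action 1) and whose middle action failed, this rule marks actions 1 and 2 for rerun and leaves action 0 alone.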

Saving page retire information persistently across operating system reboots

Examples described herein include systems and methods for retaining information about bad memory pages across an operating system reboot. An example method includes detecting, by a first instance of an operating system, an error in a memory page of a non-transitory storage medium of a computing device executing the operating system. The operating system can tag the memory page as a bad memory page, indicating that the memory page should not be used by the operating system. The operating system can also store tag information indicating memory pages of the storage medium that are tagged as bad memory pages. The example method can also include receiving an instruction to reboot the operating system, booting a second instance of the operating system, and providing the tag information to the second instance of the operating system. The operating system can use the tag information to avoid using the bad memory pages.
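The idea of carrying tag information from one operating system instance to the next can be sketched in user space as follows. This is only a toy model: the `PageRetireStore` class, the JSON file format, and the use of a file path as the persistent location are all assumptions for illustration, not the mechanism described in the abstract.

```python
import json
from pathlib import Path
from typing import Set


class PageRetireStore:
    """Toy model of bad-page tag information that survives a 'reboot'."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.bad_pages: Set[int] = set()
        # A new instance picks up tags left behind by the previous one.
        if path.exists():
            self.bad_pages = set(json.loads(path.read_text()))

    def tag_bad(self, page: int) -> None:
        """Tag a page as bad and persist the updated tag information."""
        self.bad_pages.add(page)
        self.path.write_text(json.dumps(sorted(self.bad_pages)))

    def usable(self, page: int) -> bool:
        """Return True if the page has not been tagged as bad."""
        return page not in self.bad_pages
```

Creating a second `PageRetireStore` on the same path stands in for booting the second operating system instance: it starts with the tags written by the first.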

System and method for hybrid kernel- and user-space incremental and full checkpointing

A system includes a multi-process application that runs on primary hosts and is checkpointed by a checkpointer comprising at least one of a kernel-mode checkpointer module and one or more user-space interceptors providing at least one of barrier synchronization, a checkpointing thread, resource flushing, and an application virtualization space. Checkpoints may be written to storage and the application restored from a stored checkpoint at a later time. Checkpointing may be incremental using Page Table Entry (PTE) pages and Virtual Memory Areas (VMA) information. Checkpointing is transparent to the application and requires no modification to the application, operating system, networking stack or libraries. In an alternate embodiment the kernel-mode checkpointer is built into the kernel.
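The difference between full and incremental checkpoints can be shown with a small sketch. Note the substitution: the abstract detects changes through PTE and VMA information, whereas this example simply compares page contents against the previous snapshot. The function names and the page-number-to-bytes dictionary model are hypothetical, and freed pages are not handled.

```python
from typing import Dict

Pages = Dict[int, bytes]  # page number -> page contents (assumed model)


def incremental_checkpoint(memory: Pages, previous: Pages) -> Pages:
    """Return only the pages that are new or changed since `previous`."""
    return {page: data for page, data in memory.items()
            if previous.get(page) != data}


def restore(full: Pages, *increments: Pages) -> Pages:
    """Rebuild state by applying increments, oldest first, to a full checkpoint."""
    state = dict(full)
    for inc in increments:
        state.update(inc)
    return state
```

Restoring from the full checkpoint plus every later increment, applied in order, reproduces the latest state while each increment stores only the pages that changed.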

Storage architecture for virtual machines
20180011731 · 2018-01-11

Some embodiments of the present invention include a method comprising: accessing units of network storage that encode state data of respective virtual machines, wherein the state data for respective ones of the virtual machines are stored in distinct ones of the network storage units such that the state data for more than one virtual machine are not commingled in any one of the network storage units.

Restoring virtual network function (VNF) performance via VNF reset of lifecycle management

Techniques for identifying and remedying performance issues of Virtualized Network Functions (VNFs) are discussed. An example method includes outputting a request to a network Element Manager (EM) to create a VNF Performance Measurement (PM) job to collect VNF PM data from a VNF and receiving a set of VNF PM data associated with the VNF from the EM. The set of VNF PM data associated with the VNF is processed. A request is output to the EM to create a Virtualization Resource (VR) PM job to collect, through a VNF Manager (VNFM) and a Virtualized Infrastructure Manager (VIM), VR PM data from a VR used by the VNF. Then a set of VR PM data is received from the EM and processed.

Image processing apparatus and control method for image processing apparatus for error reduction
11570330 · 2023-01-31

An image processing apparatus includes a processing section and a control section configured to instruct operation of the processing section. The control section executes, before sending, to the processing section, a first instruction corresponding to an instruction that caused an error in the past, an error avoidance operation based on instruction history information and operation state history information acquired from a storing section that stores the instruction history information and the operation state history information, the instruction history information indicating an instruction given to the processing section by the control section, the operation state history information indicating an operation state of the processing section caused by the instruction.

Communication apparatus, communication method, program, and communication system

Communication is performed more reliably. A CCI (I3C DDR) processing section determines status of an index when requested to be accessed by an I3C master for a read operation. An error handling section then controls an I3C slave 13 to detect occurrence of an error based on the status of the index and to neglect all communication until DDR mode is stopped or restarted by the I3C master, the I3C slave 13 being further controlled to send a NACK response when performing acknowledge processing on a signal sent from the I3C master. This technology can be applied to the I3C bus, for example.

NF service consumer restart detection using direct signaling between NFs

Systems and methods are disclosed for detecting, e.g., that a Network Function (NF) service consumer in a core network of a cellular communications system has restarted. In some embodiments, a method of operation of an NF service consumer in a core network of a cellular communications system comprises sending, to an NF service producer, a message comprising information related to a unit of the NF service consumer.

Input-output path selection using switch topology information

Switch topology-aware path selection in an information processing system is provided. For example, an apparatus comprises a host device comprising a processor coupled to a memory. The host device is configured to communicate with a storage system over a network with a plurality of switches. The host device is further configured to obtain topology information associated with the plurality of switches in the network, and select a path from the host device to the storage system through one or more of the plurality of switches based at least in part on the obtained topology information.
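One simple way to use switch topology information when choosing a path is to pick the route with the fewest hops. The sketch below does this with a breadth-first search over an adjacency map. The selection criterion, the `select_path` name, and the dictionary representation of the topology are assumptions for illustration; the abstract only states that selection is based at least in part on the topology information.

```python
from collections import deque
from typing import Dict, List, Optional


def select_path(topology: Dict[str, List[str]], host: str,
                storage: str) -> Optional[List[str]]:
    """Return a fewest-hop path from host to storage, or None if unreachable.

    topology maps each node (host, switch, or storage system) to the
    nodes it links to directly.
    """
    parents: Dict[str, Optional[str]] = {host: None}
    queue = deque([host])
    while queue:
        node = queue.popleft()
        if node == storage:
            # Walk parent links back to the host, then reverse.
            path: List[str] = []
            current: Optional[str] = node
            while current is not None:
                path.append(current)
                current = parents[current]
            return path[::-1]
        for nxt in topology.get(node, []):
            if nxt not in parents:  # visit each node once
                parents[nxt] = node
                queue.append(nxt)
    return None
```

A real multipath driver would likely weigh further factors, such as shared switches between redundant paths or link load; hop count is used here only because it is the easiest criterion to show.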

Basic input/output system (BIOS) device management

A computing device includes a hardware switch that is activated when a primary Basic Input/Output System (BIOS) of a first BIOS chip of the device fails to load an Operating System (OS) image from an OS partition of a hard drive. The switch passes control to a backup BIOS that executes from a backup BIOS chip. The backup BIOS loads a recovery image from a BIOS recovery partition of the hard drive, which causes a reflash application to execute from the recovery image. The reflash application obtains a recovery BIOS from the BIOS recovery partition of the hard drive, writes the recovery BIOS onto the first BIOS chip, and reboots the device. Following the reboot, the recovery BIOS loads the OS image from the OS partition and becomes the primary BIOS on the first BIOS chip of the device.