G06F11/203

POOLED MEMORY HEARTBEAT IN SHARED MEMORY ARCHITECTURE

Examples provide a pooled memory heartbeat for virtual machine hosts. A virtual controller creates a pooled memory heartbeat file system in a shared memory partition of a pooled memory. An agent running on each host in a plurality of virtual machine hosts updates a heartbeat file at an update time interval to lock the heartbeat file. The lock indicates that the heartbeat status for a given host is active. A master agent accesses the shared memory partition to check the heartbeat status of each host in the pooled memory file system. The heartbeat status is used to determine whether a host has lost pooled memory access, is network isolated, or has failed. If the pooled memory heartbeat status for a given host indicates the host is a failed host, the set of virtual machines running on the given host is respawned on another, healthier host.
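The status check described above can be sketched as follows. This is a minimal illustration and not the method of the disclosure: the abstract does not say how the three fault conditions are told apart, so the sketch assumes the master agent combines the pooled memory heartbeat with a separate network reachability signal. The names `HostStatus`, `heartbeat_is_fresh`, `classify_host`, and the `missed_allowed` threshold are hypothetical.

```python
from enum import Enum


class HostStatus(Enum):
    ACTIVE = "active"
    LOST_POOLED_MEMORY = "lost pooled memory access"
    NETWORK_ISOLATED = "network isolated"
    FAILED = "failed"


def heartbeat_is_fresh(last_update: float, now: float,
                       update_interval: float, missed_allowed: int = 3) -> bool:
    """Treat the heartbeat lock as held if it was refreshed recently enough.

    Assumption: a host may miss up to `missed_allowed` update intervals
    before its heartbeat is considered stale.
    """
    return (now - last_update) <= update_interval * missed_allowed


def classify_host(pooled_fresh: bool, network_reachable: bool) -> HostStatus:
    """Combine the two signals into one status (assumed decision table)."""
    if pooled_fresh and network_reachable:
        return HostStatus.ACTIVE
    if pooled_fresh:
        # Still updating its heartbeat file, but unreachable over the network.
        return HostStatus.NETWORK_ISOLATED
    if network_reachable:
        # Reachable, but no longer refreshing its heartbeat file.
        return HostStatus.LOST_POOLED_MEMORY
    return HostStatus.FAILED
```

Under this assumed table, only a host that is silent on both channels is treated as failed, which is the case in which its virtual machines would be respawned elsewhere.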

Asynchronous data replication in a multiple availability zone cloud platform

The present disclosure relates to computer-implemented methods, software, and systems for managing asynchronous data replication in a multiple availability zone cloud environment. Metadata for files for asynchronous replication at a second availability zone is stored at an in-memory data grid of a first instance of a storage service at a first availability zone at a multiple availability zone cloud platform that provides storage services. The in-memory data grid includes a queue data structure of metadata records and a map of metadata records. In response to determining that a connection from the first availability zone to the second availability zone is available, asynchronous data replication for files identified at the map is executed. A file for replication is identified at the map and provided for replication at a second file storage at the second availability zone through a replication interface of a second instance of the storage service at the second availability zone.
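The queue and map of metadata records can be illustrated with a small in-process sketch. It is not the disclosed system: an in-memory data grid is replaced here by a plain `deque` and `dict`, and the class `ReplicationGrid` and the callbacks `connection_available` and `replicate` are hypothetical stand-ins for the connectivity check and the replication interface of the second zone.

```python
from collections import deque


class ReplicationGrid:
    """Holds metadata for files awaiting replication to a second zone."""

    def __init__(self):
        self.queue = deque()   # file ids in arrival order
        self.records = {}      # file id -> metadata record

    def enqueue(self, file_id, metadata):
        # Queue each file once; a later update only refreshes its metadata.
        if file_id not in self.records:
            self.queue.append(file_id)
        self.records[file_id] = metadata

    def drain(self, connection_available, replicate):
        """Replicate pending files while the second zone is reachable."""
        replicated = []
        while self.queue and connection_available():
            file_id = self.queue[0]
            replicate(file_id, self.records[file_id])
            # Remove the record only after replication returned normally.
            self.queue.popleft()
            del self.records[file_id]
            replicated.append(file_id)
        return replicated
```

Because a record is removed only after `replicate` returns, a file whose replication raises an error stays queued for the next attempt.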

METHOD AND APPARATUS FOR FAILOVER PROCESSING
20170364423 · 2017-12-21

Embodiments of the present disclosure provide a method and apparatus for failover. In an embodiment, a method implemented at a first node in a cluster comprising a plurality of heterogeneous nodes is provided. The method comprises: determining whether an application at a second node in the cluster has failed; and in response to determining that the application has failed, causing migration of data and services associated with the application from the second node to a third node in the cluster, the migration involving at least one node heterogeneous to the second node in the cluster. The present disclosure further provides a method implemented at the third node in the cluster and corresponding devices and computer program products.

CORE PAIRING IN MULTICORE SYSTEMS
20170364421 · 2017-12-21

A method, executed by a computer, includes pairing a first core with a second core to form a first core group, wherein each core of the group has a plurality of functional units, transferring instructions received by the first core to the second core for execution via a first inter-core communication bus, and executing the instructions on the second core. A computer system and computer program product corresponding to the above method are also disclosed herein.

LIFECYCLE MANAGEMENT OF VIRTUAL INFRASTRUCTURE MANAGEMENT SERVER APPLIANCE

A method of upgrading a VIM server appliance includes: creating a snapshot of logical volumes mapped to physical volumes that store configuration and database files of virtual infrastructure management (VIM) services provided by a first VIM server appliance to be upgraded; after the snapshot is created, expanding the configuration and database files to be compatible with a second VIM server appliance; replicating the logical volumes which have been modified as a result of expanding the configuration and database files, in the second VIM server appliance; after replication, performing a switchover of VIM services that are provided, from the first VIM server appliance to the second VIM server appliance; and upon failure of any of the steps of expanding, replicating, and performing the switchover, aborting the upgrade, and reverting to a version of the configuration and database files that was preserved by creating the snapshot.
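The snapshot, upgrade, and revert sequence can be sketched as below. This is only an illustration under simplifying assumptions: the logical volumes are modeled as a Python dictionary, the snapshot as a deep copy, and the expand, replicate, and switchover steps as caller-supplied functions. The name `upgrade_with_rollback` is hypothetical.

```python
import copy


def upgrade_with_rollback(state: dict, steps) -> bool:
    """Run upgrade steps in order; restore the snapshot if any step fails.

    `state` stands in for the configuration and database files, and each
    entry of `steps` is a function that modifies `state` in place.
    """
    snapshot = copy.deepcopy(state)   # taken before any step runs
    try:
        for step in steps:
            step(state)
    except Exception:
        # Abort the upgrade and revert to the preserved version.
        state.clear()
        state.update(snapshot)
        return False
    return True
```

Taking the snapshot before the first step means a failure at any later step, including the switchover, leaves the original data intact.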

Load balancing and fault tolerant service in a distributed data system
11681566 · 2023-06-20

Techniques for load balancing and fault tolerant service are described. An apparatus may comprise a load balancing and fault tolerant component operative to execute a load balancing and fault tolerant service in a distributed data system. The load balancing and fault tolerant service distributes a load of a task to a first node in a cluster of nodes using a routing table. The load balancing and fault tolerant service stores information to indicate the first node from the cluster of nodes is assigned to perform the task. The load balancing and fault tolerant service detects a failure condition for the first node. The load balancing and fault tolerant service moves the task to a second node from the cluster of nodes to perform the task for the first node upon occurrence of the failure condition.
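The assign, record, detect, and move sequence can be sketched as follows. The abstract does not describe the layout of the routing table, so this sketch assumes it maps each task to an ordered list of candidate nodes. The class `FaultTolerantBalancer` and its method names are hypothetical.

```python
class FaultTolerantBalancer:
    """Assigns tasks from a routing table and reassigns them on node failure."""

    def __init__(self, routing_table):
        # Assumed layout: task -> ordered list of candidate nodes.
        self.routing_table = routing_table
        self.failed = set()
        self.assignments = {}   # task -> node currently responsible

    def assign(self, task):
        """Give the task to the first candidate node not marked as failed."""
        for node in self.routing_table[task]:
            if node not in self.failed:
                self.assignments[task] = node
                return node
        raise RuntimeError(f"no healthy node for task {task!r}")

    def report_failure(self, node):
        """Mark a node as failed and move its tasks to other nodes."""
        self.failed.add(node)
        moved = {}
        for task, owner in list(self.assignments.items()):
            if owner == node:
                moved[task] = self.assign(task)
        return moved
```

Storing the assignment separately from the routing table is what lets the service find the tasks that belonged to a failed node.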

API registry in a container platform providing property-based API functionality

A method of customizing deployment and operation of services in container environments may include receiving, at an API registry, a property for a service that is or will be encapsulated in a container that is or will be deployed in a container environment. The method may also include determining whether the property for the service affects the deployment of the service to the container environment, and in response to a determination that the property affects the deployment of the service, deploying the service based at least in part on the property. The method may additionally include determining whether the property for the service affects the generation of a client library that calls the service in the container environment, and in response to a determination that the property affects the generation of the client library, generating the client library based at least in part on the property.

HIGH RELIABILITY FAULT TOLERANT COMPUTER ARCHITECTURE

A fault tolerant computer system and method are disclosed. The system may include a plurality of CPU nodes, each including: a processor and a memory; at least two IO domains, wherein at least one of the IO domains is designated an active IO domain performing communication functions for the active CPU nodes; and a switching fabric connecting each CPU node to each IO domain. One CPU node is designated a standby CPU node and the remainder are designated as active CPU nodes. If a failure, a beginning of a failure, or a predicted failure occurs in an active node, the state and memory of the active CPU node are transferred to the standby CPU node which becomes the new active CPU node. If a failure occurs in an active IO domain, the communication functions performed by the failing active IO domain are transferred to the other IO domain.
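The transfer from an active CPU node to the standby node can be sketched as below. This is a simplified model and not the disclosed hardware mechanism: node memory is represented as a dictionary, and the `CpuNode` class, the `fail_over` function, and the `"failed"` role label are assumptions made for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CpuNode:
    name: str
    role: str                       # "active", "standby", or "failed" (assumed labels)
    memory: dict = field(default_factory=dict)


def fail_over(failing: CpuNode, standby: CpuNode) -> CpuNode:
    """Copy state from a failing active node to the standby and promote it."""
    if failing.role != "active" or standby.role != "standby":
        raise ValueError("fail_over needs one active node and one standby node")
    standby.memory = dict(failing.memory)   # transfer state and memory
    standby.role = "active"                 # standby becomes the new active node
    failing.role = "failed"
    return standby
```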

Single code set applications executing in a multiple platform system
09836303 · 2017-12-05

Embodiments of the claimed subject matter are directed to methods and a system that allow an application comprising a single code set under the COBOL Programming Language to execute in multiple platforms on the same multi-platform system (such as a mainframe). In one embodiment, a single code set is pre-compiled to determine specific portions of the code set compatible with the host (or prospective) platform. Once the code set has been pre-compiled to determine compatible portions, those portions may be compiled and executed in the host platform. According to these embodiments, an application may be executed from a single code set that is compatible with multiple platforms, thereby potentially reducing the complexity of developing the application for multiple platforms.

OPERATING A DATA CENTER

In an approach, a primary data center is provided including primary source and primary target database systems, where a function is activated causing the primary target database system to: include a copy of data and receive analysis queries from the primary source database system; and execute the analysis queries on data. A processor, in response to detecting a failure in the primary source database system: offloads queries intended for the primary source database system to a secondary source database system of a secondary data center also including a secondary target database system and a copy of data, where the function is deactivated. A processor, responsive to the primary target database system being available: receives analysis queries, processed by the secondary source database system, of the offloaded queries; and copies data to the secondary target database system. A processor causes the function to be activated in the secondary data center.