Patent classifications
G06F11/2038
ENHANCED FILE INDEXING, LIVE BROWSING, AND RESTORING OF BACKUP COPIES OF VIRTUAL MACHINES AND/OR FILE SYSTEMS BY POPULATING AND TRACKING A CACHE STORAGE AREA AND A BACKUP INDEX
An illustrative approach accelerates file indexing operations for block-level backup copies in a data storage management system. A cache storage area is maintained for locally storing and serving key data blocks, thus relying less on retrieving data on demand from the backup copy. File indexing operations are used for populating the cache storage area for speedier retrieval during subsequent live browsing of the same backup copy, and vice versa. The key data blocks cached while file indexing and/or live browsing an earlier backup copy help to pre-fetch corresponding data blocks of later backup copies, thus producing a beneficial learning cycle. The approach is especially beneficial for cloud and tape backup media, and is available for a variety of data sources and backup copies, including block-level backup copies of virtual machines (VMs) and block-level backup copies of file systems, including UNIX-based and Windows-based operating systems and corresponding file systems.
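The "learning cycle" described above can be illustrated with a minimal sketch. It assumes that a block number identifies the corresponding block across backup copies and that the backup media is reachable through a caller-supplied fetch function. The names `BlockCache`, `fetch_block`, `read`, and `prefetch` are hypothetical and do not come from the abstract.

```python
class BlockCache:
    """Hypothetical local cache of key data blocks, keyed by (backup_id, block_no)."""

    def __init__(self, fetch_block):
        # fetch_block(backup_id, block_no) -> bytes stands in for a slow
        # on-demand read from cloud or tape backup media (an assumption).
        self.fetch_block = fetch_block
        self.cache = {}
        # Block numbers touched by file indexing or live browsing so far.
        self.key_blocks = set()

    def read(self, backup_id, block_no):
        """Serve a block locally if cached, otherwise fetch and remember it."""
        key = (backup_id, block_no)
        if key not in self.cache:
            self.cache[key] = self.fetch_block(backup_id, block_no)
        self.key_blocks.add(block_no)
        return self.cache[key]

    def prefetch(self, backup_id):
        """Warm the cache for a later backup copy using blocks learned earlier."""
        for block_no in sorted(self.key_blocks):
            key = (backup_id, block_no)
            if key not in self.cache:
                self.cache[key] = self.fetch_block(backup_id, block_no)
```

In this sketch, indexing an earlier backup copy populates `key_blocks`, and calling `prefetch` on a later copy means that subsequent reads of those blocks do not go back to the backup media.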
Fault tolerant system, server, and operation method of fault tolerant system
A first server and a second server use a virtual address to mount the storage synchronous area in a storage via NFS. The first server obtains a snapshot of memory content of a virtual system operated as an active system and transmits the snapshot to the second server. The first server replicates content of the storage synchronous area in the storage to a storage synchronous area in a storage. When a failure occurs in the first server, the second server sets a virtual address to the storage and uses the virtual address to mount the storage synchronous area in the storage via NFS. The second server uses the snapshot received from the first server to execute the application on the virtual system.
HARDWARE ASSIST MECHANISMS FOR ALIVE DETECTION OF REDUNDANT DEVICES
An apparatus includes a first hardware assist device having at least one transmitter, at least one receiver, and a timer. The at least one transmitter is configured to transmit at least one first signal to a second hardware assist device of a redundant second apparatus. The at least one first signal indicates that the apparatus is functional. The at least one receiver is configured to receive at least one second signal from the second hardware assist device. The at least one second signal indicates that the second apparatus is functional. The timer is configured to control a driver to block transmission of the at least one first signal in response to a fault associated with the apparatus. The apparatus also includes at least one processing device configured to perform one or more actions in response to a loss of the at least one second signal from the second apparatus.
Fast single-master failover
Techniques for switching mastership from one service in a first data center to a second (redundant) service in a second data center are provided. A service coordinator in the first data center is notified about the master switch. The service coordinator notifies each instance of the first service that the first service is not a master. Each instance responds with an acknowledgement. After it is confirmed that all instances of the first service have responded with an acknowledgement, a client coordinator in the first and/or second data center is updated to indicate that the second service is the master so that clients may send requests to the second service. Also, a service coordinator in the second data center is notified that the second service is the master. The service coordinator notifies each instance of the second service that the second service is the master. Each instance responds with an acknowledgement.
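The ordering of notifications and acknowledgements above can be sketched as follows. This is an in-process model under stated assumptions, not the patented implementation: the classes `Instance`, `ServiceCoordinator`, and `ClientCoordinator` and the function `switch_master` are hypothetical names, and network messages are replaced by direct method calls.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """One instance of a service (hypothetical model)."""
    name: str
    is_master: bool = False

    def notify(self, is_master: bool) -> bool:
        # Record the new role and return an acknowledgement.
        self.is_master = is_master
        return True


@dataclass
class ServiceCoordinator:
    """Per-data-center coordinator that notifies every instance of a role change."""
    instances: list

    def set_mastership(self, is_master: bool) -> bool:
        # Notify all instances first, then report whether every one acknowledged.
        acks = [inst.notify(is_master) for inst in self.instances]
        return all(acks)


@dataclass
class ClientCoordinator:
    """Tells clients which service currently accepts requests."""
    master: str


def switch_master(old_sc, new_sc, client_coord, new_master_name):
    # Step 1: demote the old master; stop unless every instance acknowledged.
    if not old_sc.set_mastership(False):
        return False
    # Step 2: only now redirect clients to the second service.
    client_coord.master = new_master_name
    # Step 3: tell every instance of the second service that it is the master.
    return new_sc.set_mastership(True)
```

The point of the sketch is the ordering: clients are redirected only after all instances of the first service have acknowledged that they are no longer the master, so at no point do both services believe they hold mastership.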
HIGH RELIABILITY FAULT TOLERANT COMPUTER ARCHITECTURE
A fault tolerant computer system and method are disclosed. The system may include a plurality of CPU nodes, each including: a processor and a memory; at least two IO domains, wherein at least one of the IO domains is designated an active IO domain performing communication functions for the active CPU nodes; and a switching fabric connecting each CPU node to each IO domain. One CPU node is designated a standby CPU node and the remainder are designated as active CPU nodes. If a failure, a beginning of a failure, or a predicted failure occurs in an active node, the state and memory of the active CPU node are transferred to the standby CPU node which becomes the new active CPU node. If a failure occurs in an active IO domain, the communication functions performed by the failing active IO domain are transferred to the other IO domain.
OPERATING A DATA CENTER
In an approach, a primary data center is provided including primary source and primary target database systems, where a function is activated causing the primary target database system to: include a copy of data and receive analysis queries from the primary source database system; and execute the analysis queries on data. A processor, in response to detecting a failure in the primary source database system: offloads queries intended for the primary source database system to a secondary source database system of a secondary data center also including a secondary target database system and a copy of data, where the function is deactivated. A processor, responsive to the primary target database system being available: receives analysis queries, processed by the secondary source database system, of the offloaded queries; and copies data to the secondary target database system. A processor causes the function to be activated in the secondary data center.
DISTRIBUTED STORAGE SYSTEM
A distributed storage system includes a plurality of host servers including a primary compute node and backup compute nodes for processing first data having a first identifier, and a plurality of storage nodes that communicate with the plurality of compute nodes and include a plurality of storage volumes. The plurality of storage volumes include a primary storage volume and backup storage volumes for storing the first data. When a write request for the first data is received, the primary compute node provides a replication request for the first data to a primary storage node providing the primary storage volume. Based on the replication request, the primary storage node stores the first data in the primary storage volume, copies the first data to the backup storage volumes, and provides, to the primary compute node, a completion acknowledgement to the replication request.
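The write path described above can be sketched in a few lines. The sketch assumes in-memory dictionaries in place of real storage volumes and direct calls in place of network requests. The names `StorageVolume`, `PrimaryStorageNode`, `PrimaryComputeNode`, and the `"ack"` return value are hypothetical.

```python
class StorageVolume:
    """Hypothetical storage volume backed by an in-memory dict."""

    def __init__(self):
        self.blocks = {}


class PrimaryStorageNode:
    """Stores data in the primary volume and copies it to the backup volumes."""

    def __init__(self, primary, backups):
        self.primary = primary
        self.backups = backups

    def handle_replication_request(self, data_id, data):
        # Store in the primary storage volume first.
        self.primary.blocks[data_id] = data
        # Then copy to every backup storage volume.
        for volume in self.backups:
            volume.blocks[data_id] = data
        # Completion acknowledgement returned to the primary compute node.
        return "ack"


class PrimaryComputeNode:
    """Turns an incoming write request into a single replication request."""

    def __init__(self, storage_node):
        self.storage_node = storage_node

    def write(self, data_id, data):
        return self.storage_node.handle_replication_request(data_id, data)
```

In this model the compute node issues one request per write, and the fan-out to backup volumes happens on the storage side.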
METHODS AND SYSTEMS FOR A NON-DISRUPTIVE PLANNED FAILOVER FROM A PRIMARY COPY OF DATA AT A PRIMARY STORAGE SYSTEM TO A MIRROR COPY OF THE DATA AT A CROSS-SITE SECONDARY STORAGE SYSTEM WITHOUT USING AN EXTERNAL MEDIATOR
Systems and methods are described for a non-disruptive planned failover from a primary copy of data at a primary storage cluster to a mirror copy of the data at a cross-site secondary storage cluster without using an external mediator. According to an example, a planned failover feature of a multi-site distributed storage system provides an order of operations such that a primary copy of a first data center continues to serve I/O operations until a mirror copy of a second data center is ready. This planned failover feature improves functionality and efficiency of the distributed storage system by providing non-disruptiveness during planned failover without using an external mediator based on a primary storage cluster being selected as an authority to implement a state machine with a persistent configuration database to track a planned failover state for the planned failover.
FLIGHT MANAGEMENT SYSTEM FOR AN AIRCRAFT AND METHOD OF SECURING OPEN WORLD DATA USING SUCH A SYSTEM
A flight management system for an aircraft and method of securing open world data using such a system. The flight management system includes at least two flight management computers including one computer termed active forming part of an active guidance subsystem configured to supply data for guiding the aircraft. Another computer is termed inactive at the current time. The flight management system includes a validation subsystem that includes the inactive flight management computer and a validation unit connected to the flight management computers. The validation subsystem is independent of the active guidance subsystem and configured to validate open world data and to transmit at least to the active flight management computer data that is validated during the validation.
REMOTE DIRECT MEMORY ACCESS (RDMA)-BASED RECOVERY OF DIRTY DATA IN REMOTE MEMORY
Techniques for implementing RDMA-based recovery of dirty data in remote memory are provided. In one set of embodiments, upon occurrence of a failure at a first (i.e., source) host system, a second (i.e., failover) host system can allocate a new memory region corresponding to a memory region of the source host system and retrieve a baseline copy of the memory region from a storage backend shared by the source and failover host systems. The failover host system can further populate the new memory region with the baseline copy and retrieve one or more dirty page lists for the memory region from the source host system via RDMA, where the one or more dirty page lists identify memory pages in the memory region that include data updates not present in the baseline copy. For each memory page identified in the one or more dirty page lists, the failover host system can then copy the content of that memory page from the memory region of the source host system to the new memory region via RDMA.
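The recovery sequence above reduces to a short loop, sketched here under stated assumptions: memory regions are modeled as dictionaries from page number to page content, and the RDMA read from the source host is replaced by a caller-supplied function. The names `recover_region` and `read_remote_page` are hypothetical.

```python
def recover_region(baseline, dirty_page_lists, read_remote_page):
    """Rebuild a memory region on the failover host.

    baseline: dict of page number -> bytes, retrieved from the shared storage backend.
    dirty_page_lists: iterable of lists of page numbers updated since the baseline.
    read_remote_page: stand-in for an RDMA read of one page from the source host.
    """
    # Populate the newly allocated region with the baseline copy.
    new_region = dict(baseline)
    # Overwrite only the pages whose updates are missing from the baseline.
    for page_list in dirty_page_lists:
        for page_no in page_list:
            new_region[page_no] = read_remote_page(page_no)
    return new_region
```

Only the pages named in the dirty page lists are read from the source host in this sketch; all other pages come from the baseline copy.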