DUPLEX OPERATION SYSTEM, DUPLEX OPERATION METHOD, AND PROGRAM
20230081290 · 2023-03-16
Inventors
- Kotaro MIHARA (Musashino-shi, Tokyo, JP)
- Nobuhiro KIMURA (Musashino-shi, Tokyo, JP)
- Minoru SAKUMA (Musashino-shi, Tokyo, JP)
- Takato TODA (Musashino-shi, Tokyo, JP)
Cpc classification
G06F11/20
PHYSICS
International classification
Abstract
A virtual machine control device 20 includes: an external disk 22 that has recorded thereon initialization information including user data and application software for each virtual machine 11; and a restart control unit 21 that, when a failure in which a reboot of an OS is executed without a restart escalation for expanding an initialization range in stages occurs in a virtual machine 11.sub.0 of an active system (ACT), stops a duplexed operation, causes another general-purpose device 10.sub.x to load the initialization information for the virtual machine 11.sub.0 of an active system that has stopped and to reboot an OS and also causes a virtual machine 11.sub.1 of a standby system (SBY) that has stopped the duplexed operation to load the initialization information for the virtual machine 11.sub.1 and to reboot an OS, and sets as an active system a general-purpose device 10.sub.1 that has started up first, and sets as a standby system a general-purpose device 10.sub.x that has started up later.
Claims
1. A duplexed operation system comprising: a plurality of general-purpose devices that have a plurality of virtual machines installed thereon; and a virtual machine control device that controls duplexed operation by two systems of an active system and a standby system of the virtual machines; wherein the virtual machine control device includes: an external disk that has initialization information recorded thereon, the initialization information including user data and application software for each of the virtual machines; a processor; a memory device storing instructions that, when executed by the processor, cause the processor to perform operations comprising: when a failure occurs in a first one of the virtual machines, stopping the duplexed operation, the first one being an active system, the failure being such that a reboot of an OS is executed without a restart escalation, the restart escalation being for expanding an initialization range in stages; causing another of the general-purpose devices to load the initialization information of the first virtual machine of an active system that has stopped and to reboot an OS, and also causing a second one of the virtual machines, the second one being a standby system, that has stopped the duplexed operation to load the initialization information of the second virtual machine and to reboot an OS; and setting as an active system one of the general-purpose devices that has started up first and setting as a standby system one of the general-purpose devices that has started up later.
2. A duplexed operation method executed by a virtual machine control device of a duplexed operation system, the duplexed operation system comprising: a plurality of general-purpose devices that have a plurality of virtual machines installed thereon; and the virtual machine control device that controls duplexed operation by two systems of an active system and a standby system of the virtual machines; wherein the virtual machine control device performs operations comprising: when a failure occurs in a first one of the virtual machines, stopping the duplexed operation, the first one being an active system, the failure being such that a reboot of an OS is executed without a restart escalation, the restart escalation being for expanding an initialization range in stages; causing another of the general-purpose devices to load initialization information including user data and application software of the first virtual machine of an active system that has stopped and to reboot an OS, and also causing a second one of the virtual machines, the second one being a standby system, that has stopped the duplexed operation to load the initialization information of the second virtual machine and to reboot an OS, and setting as an active system one of the general-purpose devices that has started up first and setting as a standby system one of the general-purpose devices that has started up later.
3. A non-transitory computer-readable medium storing software comprising instructions executable by one or more computers of a virtual machine control device of a duplexed operation system, the duplexed operation system comprising: a plurality of general-purpose devices that have a plurality of virtual machines installed thereon; and the virtual machine control device that controls duplexed operation by two systems of an active system and a standby system of the virtual machines; wherein the virtual machine control device performs operations comprising: when a failure occurs in a first one of the virtual machines, stopping the duplexed operation, the first one being an active system, the failure being such that a reboot of an OS is executed without a restart escalation, the restart escalation being for expanding an initialization range in stages; causing another of the general-purpose devices to load initialization information including user data and application software of the first virtual machine of an active system that has stopped and to reboot an OS, and also causing a second one of the virtual machines, the second one being a standby system, that has stopped the duplexed operation to load the initialization information of the second virtual machine and to reboot an OS, and setting as an active system one of the general-purpose devices that has started up first and setting as a standby system one of the general-purpose devices that has started up later.
Description
BRIEF DESCRIPTION OF DRAWINGS
[0011]
[0012]
[0013]
[0014]
[0015]
[0016]
DESCRIPTION OF EMBODIMENTS
[0017] Hereinafter, an embodiment of the present invention will be described with reference to drawings. The same components in a plurality of drawings are denoted by the same reference characters and description thereof will not be repeated.
[0018]
[0019] As illustrated in
[0020] Thus, the duplexed operation system 100 includes a plurality of general-purpose devices 10 each having a virtual machine 11 installed thereon and a plurality of general-purpose devices 10 (in
[0021] The general-purpose device 10 and the virtual machine control device 20 can be implemented by a computer including, for example, a ROM, RAM, and CPU. In this case, the processing contents of functions that the general-purpose device 10 and the virtual machine control device 20 should include are described by a program.
[0022] The virtual machine control device 20 includes a restart control unit 21 and an external disk 22; and controls a duplexed operation by two systems of an active system (ACT) and a standby system (SBY) of the virtual machines 11.
[0023] The external disk 22 has recorded thereon initialization information including user data and application software for each virtual machine 11. The external disk 22 is configured with, for example, a hard disk drive (HDD).
[0024] The restart control unit 21 stops the duplexed operation when a failure in which a reboot of an operating system (OS) is executed without a restart escalation for expanding an initialization range in stages occurs in a virtual machine 11 of an active system. The restart control unit 21 causes another general-purpose device 10 to load initialization information for a virtual machine 11.sub.0 of an active system (ACT) that has stopped and to reboot an OS; and also causes a virtual machine 11.sub.1 of a standby system (SBY) that has stopped the duplexed operation to load initialization information for the virtual machine 11.sub.1 and to reboot an OS. The restart control unit 21 sets as an active system (ACT) a general-purpose device 10.sub.1 that has started up first and sets as a standby system (SBY) a general-purpose device 10.sub.x that has started up later.
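The role assignment described in this paragraph can be illustrated with a minimal sketch. It assumes that each general-purpose device reports the time at which its OS reboot completed; the `StartupReport` record, the `assign_roles` function, and the device labels are hypothetical names introduced only for illustration and are not part of the embodiment.

```python
from dataclasses import dataclass


@dataclass
class StartupReport:
    """Hypothetical record of one general-purpose device finishing its OS reboot."""
    device: str          # illustrative label, e.g. "10_1" or "10_x"
    startup_time: float  # assumed: seconds elapsed since the duplexed operation stopped


def assign_roles(reports):
    """Return {device: role}; the device that started up first becomes ACT."""
    if len(reports) != 2:
        raise ValueError("duplexed operation expects exactly two devices")
    first, later = sorted(reports, key=lambda r: r.startup_time)
    return {first.device: "ACT", later.device: "SBY"}


roles = assign_roles([
    StartupReport(device="10_x", startup_time=41.0),
    StartupReport(device="10_1", startup_time=12.5),
])
print(roles)  # {'10_1': 'ACT', '10_x': 'SBY'}
```

In this sketch the roles depend only on startup order, not on which device was previously the active system or the standby system.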
[0025] The restart escalation refers to expanding in stages the range of reboot when a failure occurs in a voice communication system, for example, that controls the duplexed operation of the duplexed operation system 100.
[0026]
[0027] The PH0.5 is an individual process reset. Only an individual process on the same hardware is reset, and no reboot is performed.
[0028] The PH1.0 causes initialization of operation by application software. Hereinafter, application software may be referred to as an app (APL). Only the operation of a specific app on the same hardware is reset, and no reboot is performed.
[0029] The PH2.0 causes initialization of operation by the app and middleware. Only the specific app and middleware on the same hardware are reset, and no reboot is performed. The middleware (MW) refers to software in a layer that connects the app and an operating system (OS).
[0030] The PH2.5 causes initialization of the OS in addition to the initialization range of the PH2.0. The PH2.5 performs the initialization by reloading the app, MW, and OS on the same hardware, and causes a reboot of the OS. In this case, the initialization is performed by using a current file.
[0031] The PH3.0 is different from the PH2.5 in that initialization is performed by using a LAF file that is backup data which is backed up daily, for example. In addition, initialization may be performed by using a REF file that is an initial data set. Note that the PH3.0 may cause initialization by using either the LAF file or REF file. Alternatively, initialization by the REF file may be separated as a PH3.5 from that stage.
[0032] The PH0.5 to PH3.0 stages are initializations performed on the same hardware. If a failure is not resolved by executing the restart phase of PH3.0, Auto Healing is executed, in which the target virtual machine 11 is deleted and reconfigured on other hardware.
[0033] Execution of initialization by performing in sequence each of the stages from PH0.5 to Auto Healing described above is a common restart escalation. Compared to this common restart escalation, restart control of the present embodiment is different in that Auto Healing is executed when a failure in which an OS is rebooted without the restart escalation described above occurs in a virtual machine 11 of an active system.
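The common restart escalation described above can be summarized as an ordered ladder of stages. The following is a minimal sketch; the `Phase` enumeration, its fields, and the `escalation_path` function are hypothetical names chosen for illustration, and the scope strings merely paraphrase the stage descriptions above.

```python
from enum import Enum


class Phase(Enum):
    # value = (label, initialization range, whether the OS is rebooted)
    PH0_5 = ("PH0.5", "individual process", False)
    PH1_0 = ("PH1.0", "application software", False)
    PH2_0 = ("PH2.0", "application software and middleware", False)
    PH2_5 = ("PH2.5", "app, middleware, and OS using the current file", True)
    PH3_0 = ("PH3.0", "app, middleware, and OS using the LAF or REF file", True)
    AUTO_HEALING = ("Auto Healing", "virtual machine reconfigured on other hardware", True)

    @property
    def label(self):
        return self.value[0]

    @property
    def reboots_os(self):
        return self.value[2]


def escalation_path(resolves_at):
    """List the stages tried, in order, until the failure is resolved at `resolves_at`."""
    path = []
    for phase in Phase:  # Enum iteration follows definition order
        path.append(phase)
        if phase is resolves_at:
            break
    return path


print([p.label for p in escalation_path(Phase.PH2_0)])  # ['PH0.5', 'PH1.0', 'PH2.0']
```

A failure that is resolved only by Auto Healing passes through all six stages in this sketch, which is the recovery time that the restart control of the present embodiment aims to avoid.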
[0034] The restart control of the present embodiment will be described in detail with reference to
[0035]
[0036] The virtual machine 11.sub.1 of a standby system has stopped providing a service. However, data for the active system (#0) and data for the standby system (#1) in the external disk 22 are sequentially updated in synchronization with each other.
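As a rough illustration of how the data for the active system (#0) and the standby system (#1) might be kept in step, the following sketch applies every update to both records. The `ExternalDisk` class, its methods, and the sample application and user-data values are assumptions made for illustration; the embodiment does not specify the storage layout or the update mechanism.

```python
import copy


class ExternalDisk:
    """Hypothetical model of the external disk 22 holding one record per system."""

    def __init__(self, application, user_data):
        record = {"application": application, "user_data": dict(user_data)}
        # Separate copies for the active system (#0) and the standby system (#1).
        self.records = {0: copy.deepcopy(record), 1: copy.deepcopy(record)}

    def update_user_data(self, key, value):
        """Apply the same update to both records so that they stay synchronized."""
        for record in self.records.values():
            record["user_data"][key] = value

    def load(self, system):
        """Return a copy of the initialization information for system #0 or #1."""
        return copy.deepcopy(self.records[system])


disk = ExternalDisk("call-control-app", {"subscriber-1": "idle"})
disk.update_user_data("subscriber-1", "in-call")
print(disk.load(0) == disk.load(1))  # True
```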
[0037]
[0038]
[0039] More specifically,
[0040]
[0041] As described above, the duplexed operation system 100 of this embodiment is a duplexed operation system that includes: a plurality of general-purpose devices 10 that have a plurality of virtual machines 11 installed thereon; and a virtual machine control device 20 that controls duplexed operation by two systems of an active system (ACT) and a standby system (SBY) of the virtual machines 11. The virtual machine control device 20 includes: an external disk 22 that has recorded thereon initialization information including user data and application software for each of the virtual machines 11; and a restart control unit 21 that, when a failure in which a reboot of an OS is executed without a restart escalation for expanding an initialization range in stages occurs in an active system (ACT), stops the duplexed operation, causes another of the general-purpose devices 10.sub.x to load the initialization information for a virtual machine 11.sub.0 of the active system (ACT) that has stopped and to reboot an OS and also causes a virtual machine 11.sub.1 of a standby system (SBY) that has stopped the duplexed operation to load initialization information for the virtual machine 11.sub.1 and to reboot an OS, and sets as an active system (ACT) a general-purpose device 10.sub.1 that has started up first and sets as a standby system (SBY) a general-purpose device 10.sub.x that has started up later. Thus, the duplexed operation system 100 of this embodiment can reduce a recovery time, thereby improving the reliability of the system.
[0042] More specifically, if a soft failure due to a hardware failure occurs first, Auto Healing is executed without performing a restart escalation. Therefore, a recovery time is reduced and thereby, the reliability of the system can be improved.
[0043] (Duplexed Operation Method)
[0044]
[0045] When the duplexed operation system 100 starts operation, the occurrence of a failure in a general-purpose device 10 of an active system (ACT) is monitored (step S1). The monitoring of a failure is repeated until a failure is detected (step S2: NO).
[0046] If a failure in the general-purpose device 10 of an active system (ACT) is detected (step S2: YES), whether a restart escalation is in progress is determined (step S3). For example, assume a case in which a failure occurs in an individual process of the general-purpose device 10.
[0047] In this case, it is a failure at the beginning of starting a restart escalation and therefore, the restart escalation has not been started yet (step S3: NO). Therefore, a determination at step S5 is also made as NO and a restart escalation starts from PH0.5 (step S4).
[0048] After that, if the failure is resolved by the restart of PH0.5, NO at step S2 and a loop at step S1 (failure detection) are repeated. If the failure is not resolved by the restart of PH0.5, a restart escalation is performed in the order of PH1.0, PH2.0, PH2.5, PH3.0, and Auto Healing.
[0049] This process flow of step S1, NO at step S5, and step S4 is the operation of a conventional restart escalation. Therefore, description of this flow will be omitted.
[0050] The duplexed operation method according to this embodiment is different from the conventional restart method in that Auto Healing is executed in a case where a failure requiring the restart of PH2.5 occurs first (step S5: YES), for example, in a case where NG is detected by a watchdog.
[0051] If a failure requiring the restart of PH2.5 occurs (step S5: YES) in a state where a restart escalation is not being executed (step S3: NO), duplexed operation is immediately stopped (step S6).
[0052] Next, another general-purpose device is caused to load initialization information including user data and application software of a virtual machine 11.sub.0 of an active system (ACT) that has stopped and to reboot an OS, and also, a virtual machine 11.sub.1 of a standby system (SBY) that has stopped the duplexed operation is caused to load initialization information for the virtual machine 11.sub.1 and to reboot an OS (step S7).
[0053] Then, a restart control step is performed in which a general-purpose device 10.sub.1 that has started up first is set as an active system (ACT) and a general-purpose device 10.sub.x that has started up later is set as a standby system (SBY) (step S8).
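The branching in steps S3 to S8 can be sketched as a single decision function. This is a minimal sketch under stated assumptions: the function name `restart_control` and the action strings are hypothetical, and the behavior for step S3: YES (continuing an escalation that is already in progress) is an assumption, since the description above covers only the case of step S3: NO in detail.

```python
def restart_control(escalation_in_progress, needs_os_reboot):
    """Return the ordered actions for one detected failure (step S2: YES)."""
    if escalation_in_progress:
        # Step S3: YES. Assumed: the conventional escalation simply continues.
        return ["continue restart escalation"]
    if not needs_os_reboot:
        # Step S5: NO. The conventional restart escalation starts from PH0.5.
        return ["S4: start restart escalation from PH0.5"]
    # Step S5: YES. Auto Healing is executed without a restart escalation.
    return [
        "S6: stop duplexed operation",
        "S7: load initialization information and reboot OS on another device and on the standby VM",
        "S8: set the device that started up first as ACT and the later one as SBY",
    ]


for action in restart_control(escalation_in_progress=False, needs_os_reboot=True):
    print(action)
```

Here `needs_os_reboot` stands for a failure that requires the restart of PH2.5, such as one detected by a watchdog.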
[0054] As described above, the duplexed operation method according to this embodiment is a duplexed operation method that is executed by a virtual machine control device 20 of a duplexed operation system including: a plurality of general-purpose devices 10 that have a plurality of virtual machines installed thereon; and the virtual machine control device 20 that controls duplexed operation by two systems of an active system (ACT) and a standby system (SBY) of the virtual machines 11. The virtual machine control device 20 performs a restart control step of: when a failure in which a reboot of an OS is executed without a restart escalation for expanding an initialization range in stages occurs in an active system (ACT), stopping the duplexed operation; causing another general-purpose device 10.sub.x to load initialization information including user data and application software of a virtual machine 11.sub.0 of the active system that has stopped and to reboot an OS, and also causing a virtual machine 11.sub.1 of a standby system (SBY) that has stopped the duplexed operation to load initialization information for the virtual machine 11.sub.1 and to reboot an OS; and setting as an active system (ACT) a general-purpose device 10.sub.1 that has started up first and setting as a standby system (SBY) the general-purpose device 10.sub.x that has started up later.
Thus, this embodiment can provide a duplexed operation method capable of reducing a recovery time and thereby improving the reliability of the system.
[0055] The virtual machine control device 20 and general-purpose device 10 that constitute the duplexed operation system 100 can be implemented by a common computer system illustrated in
[0056] The present invention is not limited to the embodiment described above, and modifications are possible within the gist thereof. For example, description has been made by using an example in which the virtual machine control device 20 executes Auto Healing when a failure that requires the restart of PH2.5 occurs; however, the present invention is not limited thereto. Auto Healing may be executed for any failure involving a reboot of an OS. For example, Auto Healing may be executed during the PH3.0.
[0057] In addition, description has been made by using an example in which the duplexed operation system 100 of the present invention is applied to a voice communication system; however, the present invention is not limited to this example. The present invention can be widely applied to communication systems that communicate information other than voice.
[0058] As described above, the present invention naturally includes various embodiments not described herein. Therefore, the technical scope of the present invention is defined only by the matters specifying the invention according to the scope of claims reasonable from the above description.
REFERENCE SIGNS LIST
[0059] 100 Duplexed operation system
[0060] 10 General-purpose device
[0061] 11 Virtual machine
[0062] 20 Virtual machine control device
[0063] 21 Restart control unit
[0064] 22 External disk
[0065] VM Virtual machine
[0066] ACT Active system
[0067] SBY Standby system