G06F11/0703

Cloud oversubscription system

A cloud oversubscription system comprising an overload detector configured to model a time series of data of at least one virtual machine on a host as a vector-valued stochastic process including at least one model parameter, the overload detector communicating with an inventory database, the overload detector configured to obtain an availability requirement for each of the at least one virtual machine; a model parameter estimator communicating with the overload detector, the model parameter estimator communicating with a database containing resource measurement data for the at least one virtual machine on the host at a selected time interval, the model parameter estimator configured to estimate the at least one model parameter from the resource measurement data; a loading assessment module communicating with the model parameter estimator to obtain the at least one model parameter for each of at least one host running at least one virtual machine and determine a probability of overload based on the at least one model parameter, wherein the loading assessment module communicates the probability of overload to the overload detector; wherein the overload detector compares the probability of overload to the availability requirement to identify a probable overload condition value; and wherein the overload detector communicates the probable overload condition value to a recommender, wherein the recommender generates an alert when the probable overload condition value exceeds the service level agreement requirements for any of the at least one virtual machine.
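The probability-of-overload comparison in the claim can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes (hypothetically) that each virtual machine's demand is modeled as an independent Gaussian process, so the aggregate host load is also Gaussian and the tail probability follows from the estimated model parameters (means and variances).

```python
import math

def overload_probability(vm_means, vm_variances, host_capacity):
    """P(total demand > capacity) under a hypothetical independent-Gaussian
    model of per-VM demand at the selected time interval."""
    mu = sum(vm_means)                    # estimated mean of the aggregate load
    sigma = math.sqrt(sum(vm_variances))  # estimated std. dev. of the aggregate load
    if sigma == 0.0:
        return 1.0 if mu > host_capacity else 0.0
    z = (host_capacity - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))  # 1 - Phi(z)

def probable_overload(p_overload, availability_requirement):
    """An availability requirement of 0.999 tolerates at most a 0.001
    probability of overload in any interval."""
    return p_overload > (1.0 - availability_requirement)
```

The second function is where the overload detector's comparison against the availability requirement happens; a recommender would alert on any host for which it returns true.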

AUGMENTED EXCEPTION PROGNOSIS AND MANAGEMENT IN REAL TIME SAFETY CRITICAL EMBEDDED APPLICATIONS

A smart exception handler system for safety-critical real-time systems is provided. The system is configured to: receive a plurality of parameters at a plurality of nodal points in a real-time execution path; analyze the received parameters using a trained exception handling model, wherein the trained exception handling model has been trained using machine learning techniques to learn the critical path of execution and/or critical range of parameters at critical nodes, wherein the critical range of parameters comprises a learned threshold at a node; compute, using the trained exception handling model, a probability of fault at the critical nodes; compare the probability of fault at a critical node against a learned threshold at the node; and take proactive action in real-time to avoid the occurrence of a fault when the probability of fault at the node is higher than the learned threshold at the node.
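The compare-and-act step at a critical node can be sketched in Python. This is an illustrative skeleton under stated assumptions: the trained exception handling model is abstracted as a callable returning a fault probability, and the names (`SmartExceptionHandler`, `observe`) are hypothetical, not from the disclosure.

```python
class SmartExceptionHandler:
    """Compares a computed fault probability against a learned per-node
    threshold and fires a proactive action before the fault can occur."""

    def __init__(self, learned_thresholds, fault_model, proactive_action):
        self.thresholds = learned_thresholds  # node -> learned threshold
        self.fault_model = fault_model        # (node, params) -> fault probability
        self.proactive_action = proactive_action

    def observe(self, node, params):
        """Called at each nodal point in the real-time execution path."""
        p_fault = self.fault_model(node, params)
        threshold = self.thresholds.get(node)
        if threshold is not None and p_fault > threshold:
            self.proactive_action(node, p_fault)  # avoid the fault in real time
            return True
        return False
```

In a real safety-critical system the model inference and the action would have to meet hard deadlines; this sketch shows only the control flow.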

METHODS AND SYSTEMS FOR MANAGING APPLICATION CONFIGURATIONS

Systems and methods for changing a configuration of an application are disclosed. A change request to update a property of the application, approved by a change management process, can be identified. The property can be updated in a test environment. A test of the application with the updated property in the test environment can be identified as successful. After the test is identified as successful, the property can be updated in a database of an application configuration environment. The application can then refresh the property by replacing it with the updated property from the database, without restarting or recreating the application.
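The refresh-without-restart step can be sketched as below. This is a minimal illustration, with hypothetical names; the configuration database is stood in by a plain dictionary.

```python
class ApplicationConfig:
    """Live application properties backed by a configuration database;
    a refresh swaps in an updated property without restarting the app."""

    def __init__(self, config_db):
        self.config_db = config_db
        self.properties = dict(config_db)  # snapshot loaded at startup

    def refresh(self, name):
        # Pull the updated (test-verified) property from the database and
        # replace the live value while the application keeps running.
        self.properties[name] = self.config_db[name]
```

The running application reads from `properties`; only after a change has passed the change management process and the test environment does `refresh` make it live.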

METHOD AND SYSTEM FOR REDUCING INCIDENT ALERTS

A system and method for reducing incident alerts for an enterprise environment are described. In one embodiment, a method of reducing incident alerts for an enterprise environment includes receiving a plurality of historical incident alerts associated with previous incidents at nodes within an enterprise environment. The method includes extracting from a first subset of the historical incident alerts a plurality of rules to generate a rule knowledge base and analyzing a second subset of the historical incident alerts against the plurality of rules to identify candidate incident alerts as potential dead-end tickets. The method also includes providing feedback on the candidate incident alerts to confirm or deny that each candidate alert is a dead-end ticket. Based on the feedback, a prescriptive avoidance rule set is generated to identify an incident alert as a dead-end ticket and eliminate dead-end tickets from submitted incident alerts.
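The extract / analyze / confirm / filter pipeline can be sketched as follows. This is an illustrative assumption, not the patented method: a rule is taken to be a (node, alert type) pair whose historical alerts were always closed without action, and feedback is abstracted as a confirmation callback.

```python
from collections import defaultdict

def extract_rules(historical_alerts):
    """Rule knowledge base: (node, type) pairs that were always dead ends."""
    stats = defaultdict(lambda: [0, 0])  # key -> [total, dead_end_count]
    for alert in historical_alerts:
        key = (alert['node'], alert['type'])
        stats[key][0] += 1
        stats[key][1] += int(alert['dead_end'])
    return {key for key, (total, dead) in stats.items() if dead == total}

def candidate_dead_ends(alerts, rules):
    """Analyze a second subset of alerts against the rules."""
    return [a for a in alerts if (a['node'], a['type']) in rules]

def prescriptive_rules(candidates, confirm):
    """Keep only the rules that feedback confirms."""
    return {(a['node'], a['type']) for a in candidates if confirm(a)}

def filter_alerts(alerts, rule_set):
    """Eliminate dead-end tickets from submitted incident alerts."""
    return [a for a in alerts if (a['node'], a['type']) not in rule_set]
```

A production system would use richer rule features than a two-field key; the sketch shows only the data flow between the four stages.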

SYSTEM AND METHOD FOR CONFIGURATION DRIFT DETECTION AND REMEDIATION

Administration of IHSs (Information Handling Systems) within a data center results in gradual drift of the configuration parameters of the individual IHSs, such that the IHSs may no longer be in compliance with data center policies, such as policies in support of security and disaster recovery procedures. Embodiments provide techniques for distributed determination of drift within a network of managed IHSs, in which each managed IHS is provided with baselines for the configuration parameters utilized by that IHS. Using the provided baselines, each managed IHS identifies discrepancies between its current configuration and the applicable baselines. Based on discrepancies reported by the managed IHSs, a management console evaluates drift within the network of managed IHSs and determines when to trigger remediation procedures in order to correct the drift.
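The two halves of the scheme, per-IHS discrepancy detection and console-side drift evaluation, can be sketched as below. The remediation-trigger policy (fraction of drifted systems above a threshold) is an assumption for illustration; the embodiments leave the evaluation criterion open.

```python
def find_discrepancies(current_config, baseline):
    """Runs on each managed IHS: compare current parameters to the
    provided baseline and report only the differences."""
    return {param: (current_config.get(param), expected)
            for param, expected in baseline.items()
            if current_config.get(param) != expected}

def should_remediate(reports, drift_threshold):
    """Runs on the management console: trigger remediation when the
    fraction of drifted systems exceeds the threshold."""
    drifted = sum(1 for discrepancies in reports.values() if discrepancies)
    return drifted / len(reports) > drift_threshold
```

Note the division of labor: the comparison is distributed to the managed IHSs, so only compact discrepancy reports cross the network to the console.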

INFORMATION PROCESSING DEVICE, IMAGE FORMING APPARATUS, IMAGE FORMING SYSTEM, AND INFORMATION PROCESSING METHOD

An information processing device includes control circuitry. The control circuitry is configured to store, in a first memory, record information formed each time a predetermined event occurs in a device and perform an update process of successively updating an old piece of record information with a new piece of record information in the record information stored in the first memory; transmit the record information stored in the first memory via a first signal line; transfer communication abnormality record information stored in the first memory to a second memory, which is configured not to update record information, and store the communication abnormality record information in the second memory when a communication abnormality signal of the first signal line is supplied via a second signal line; and transmit the communication abnormality record information stored in the second memory via the first signal line when communication of the first signal line is restored.
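The two-memory behavior can be sketched in Python. This is a minimal model, not the disclosed circuitry: the first memory is a fixed-capacity buffer whose oldest record is overwritten by each new one, the second memory is written once on a communication abnormality and never updated, and all class and method names are hypothetical.

```python
from collections import deque

class RecordStore:
    def __init__(self, capacity):
        # First memory: old records are successively overwritten by new ones.
        self.first_memory = deque(maxlen=capacity)
        # Second memory: configured NOT to update record information.
        self.second_memory = None

    def record_event(self, info):
        """Store record information each time a predetermined event occurs."""
        self.first_memory.append(info)

    def on_communication_abnormality(self):
        # Transfer the abnormality records to the non-updating second memory.
        if self.second_memory is None:
            self.second_memory = list(self.first_memory)

    def on_communication_restored(self):
        # Transmit the preserved records once the first signal line recovers.
        preserved, self.second_memory = self.second_memory, None
        return preserved
```

The point of the second memory is visible in the sketch: records captured at the moment of the abnormality survive even though the first memory keeps overwriting itself.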

Inter-process communication fault detection and recovery system

An inter-process communication (IPC) system, includes a first client engine, a first server engine, and a broker engine that is coupled to the first client engine. The broker engine initiates a first timer that is configured to reset when traffic is received from the first server engine while the first server engine is registered with the broker engine and coupled to the broker engine via a communication channel. The traffic that causes the first timer to reset includes at least one of: traffic generated by the first client engine to complete a request, and a first server-to-broker heartbeat message generated by the first server engine. The broker engine determines that the first timer has reached a predefined time amount, and in response, removes the registration of the first server engine and removes the communication channel between the broker engine and the first server engine.
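The broker's timer logic can be sketched as follows. This is an illustration under simplifying assumptions: time is passed in explicitly, the communication channel is represented only by the registration entry, and the names are hypothetical.

```python
class BrokerEngine:
    """Tracks a per-server inactivity timer; client-request traffic or a
    server-to-broker heartbeat resets it."""

    def __init__(self, timeout):
        self.timeout = timeout   # the predefined time amount
        self.registered = {}     # server name -> time of last traffic

    def register_server(self, server, now):
        # Registration also stands in for opening the communication channel.
        self.registered[server] = now

    def on_traffic(self, server, now):
        if server in self.registered:
            self.registered[server] = now  # reset the first timer

    def sweep(self, now):
        """Remove the registration (and channel) of every server whose
        timer has reached the predefined time amount."""
        expired = [s for s, t in self.registered.items()
                   if now - t >= self.timeout]
        for server in expired:
            del self.registered[server]
        return expired
```

Because both client traffic and heartbeats funnel through `on_traffic`, an idle but healthy server stays registered as long as it keeps heartbeating.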

Apparatus And Method For Alarm Management

A Petri net that corresponds to a mathematical model of the alarm system is determined, and the Petri net includes the domination relationships and mutual dependency relationships between individual alarms. The Petri net is analyzed to determine a set of construction rules. The construction rules define an approach to reduce the number of alarms presented to an operator or a computer program.
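One kind of construction rule that such an analysis could yield is suppression by domination: when a dominating alarm is active, the alarms it dominates carry no extra information and need not be presented. A minimal sketch, with the domination relation given directly as a mapping rather than derived from a Petri net:

```python
def reduce_alarms(active_alarms, dominates):
    """dominates maps an alarm to the set of alarms it dominates; a
    dominated alarm is suppressed whenever its dominator is active."""
    suppressed = set()
    for alarm in active_alarms:
        suppressed |= dominates.get(alarm, set())
    return [a for a in active_alarms if a not in suppressed]
```

For example, a pump-failure alarm that dominates the downstream low-flow and low-pressure alarms reduces an alarm flood of three to a single actionable alarm.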

DESIGN SUPPORT SYSTEM AND NON-TRANSITORY COMPUTER READABLE MEDIUM

A design support system includes memory, a receiving unit, and an associating unit. The memory stores information on a design element classification that classifies a design element included in a product, and information on a design requirement classification that classifies a design requirement required for the product. The receiving unit receives technical information regarding a design trouble. The associating unit refers to the technical information received by the receiving unit and associates a classification item in the design requirement classification to which the design trouble belongs with a classification item in the design element classification to which a design element causing the design trouble belongs, along with information on a phenomenon indicating a failure status of the design element included in the technical information.
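The association produced by the associating unit can be sketched as a simple lookup that yields a (requirement class, element class, phenomenon) triple. The classifications are stood in by dictionaries and all field names are hypothetical:

```python
def associate(trouble, requirement_classification, element_classification):
    """Return the association stored by the associating unit: the
    requirement class the trouble belongs to, the element class of the
    causing design element, and the phenomenon (failure status)."""
    return (requirement_classification[trouble['requirement']],
            element_classification[trouble['causing_element']],
            trouble['phenomenon'])
```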

Method and apparatus for diagnosis and recovery of system problems

Embodiments of the present invention relate to a method and apparatus for system problem diagnosis and recovery. According to embodiments of the present invention, problem symptom information in a system can be automatically monitored and collected by a monitoring apparatus (also referred to as an agent) deployed at the system side. Upon receiving such information, the diagnosis apparatus may, for example, automatically determine a root cause of the problem by querying a backend knowledge repository, and possibly generate an executable software package for recovering from the problem. If the diagnosis apparatus determines that the currently available information is insufficient to determine a sufficiently credible root cause and/or to generate the software package for recovering from the problem, the diagnosis apparatus may interactively control the monitoring apparatus to collect the desired additional information. In this way, the efficiency and accuracy of problem diagnosis and recovery may be improved.
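The interactive diagnosis loop can be sketched as below. The knowledge repository is abstracted as a callable returning a (root cause, confidence) pair and the agent as a callable returning additional symptom data; both, along with the confidence threshold, are illustrative assumptions.

```python
def diagnose(symptoms, knowledge_repository, collect_more, min_confidence):
    """Query the knowledge repository; if the root cause is not credible
    enough, interactively ask the monitoring agent for more information."""
    while True:
        root_cause, confidence = knowledge_repository(symptoms)
        if confidence >= min_confidence:
            return root_cause
        extra = collect_more(symptoms)  # control the monitoring apparatus
        if not extra:
            return None  # no further data available; give up
        symptoms = symptoms + extra
```

The loop terminates either with a credible root cause (which could then drive generation of a recovery package) or when the agent can supply no further data.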