G06F11/3065

Resource monitor for monitoring long-standing computing resources
11567802 · 2023-01-31

Disclosed herein are system, apparatus, article of manufacture, method, and/or computer program product embodiments for monitoring long-standing computing resources. An apparatus may operate by receiving a cloud monitoring notification, where the cloud monitoring notification may indicate an occurrence of a monitored condition. The apparatus may then operate by scanning a cluster computing system for a resource having a client-assigned resource identifier and a computing resource attribute, based on a resource identifier scan parameter and a resource attribute scan parameter. The apparatus may further operate by generating a resource notification request based on the scanning of the cluster computing system and transmitting the resource notification request to a communications system to notify a user that the resource has a computing resource attribute that matches the resource attribute scan parameter.
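
The scan-and-notify flow in this abstract can be sketched as a filter over resource records. This is a minimal illustration under stated assumptions, not the claimed method: the `Resource` record, the prefix match on the identifier, and the age threshold standing in for the resource attribute scan parameter are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    # Hypothetical record; the field names are illustrative, not from the source.
    resource_id: str  # client-assigned resource identifier
    age_days: int     # one possible computing resource attribute


def scan_cluster(resources, id_prefix, min_age_days):
    """Return resources matching both the identifier and attribute scan parameters."""
    return [
        r for r in resources
        if r.resource_id.startswith(id_prefix) and r.age_days >= min_age_days
    ]


def build_notification_request(matches, min_age_days):
    """Build a payload for an assumed downstream communications system."""
    return {
        "resource_ids": [r.resource_id for r in matches],
        "reason": f"age >= {min_age_days} days",
    }


cluster = [
    Resource("team-a-db", 45),
    Resource("team-a-cache", 3),
    Resource("team-b-db", 90),
]
matches = scan_cluster(cluster, id_prefix="team-a", min_age_days=30)
request = build_notification_request(matches, min_age_days=30)
```

Here only `team-a-db` satisfies both parameters, so it is the only identifier placed in the request.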

Centralized error telemetry using segment routing header tunneling

A network device receives a data packet including a source address and a destination address. The network device drops the data packet before it reaches the destination address and generates an error message indicating that the data packet has been dropped. The network device encapsulates the error message with a segment routing header comprising a list of segments. The first segment of the list of segments in the segment routing header identifies a remote server, and at least one additional segment is an instruction for handling the error message. The network device sends the encapsulated error message to the remote server based on the first segment of the segment routing header.
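
The encapsulation step can be modeled as wrapping the error message in a header whose first segment names the remote server and whose remaining segments carry handling instructions. The sketch below is a logical model only: the class names, the dictionary payload, and the example segment values (taken from the IPv6 documentation prefix) are assumptions, and it does not reproduce any on-wire header encoding.

```python
from dataclasses import dataclass


@dataclass
class SegmentRoutingHeader:
    # Logical order: the first entry identifies the remote server,
    # later entries are instructions for handling the error message.
    segments: list


@dataclass
class EncapsulatedError:
    header: SegmentRoutingHeader
    payload: dict


def encapsulate_drop_error(src, dst, reason, server_segment, handling_segments):
    """Describe a dropped packet and wrap the description in a segment list."""
    if not handling_segments:
        raise ValueError("at least one handling segment is required")
    error = {"src": src, "dst": dst, "reason": reason}
    header = SegmentRoutingHeader(segments=[server_segment, *handling_segments])
    return EncapsulatedError(header=header, payload=error)


def first_segment(packet):
    """The device forwards the encapsulated error toward this segment."""
    return packet.header.segments[0]


packet = encapsulate_drop_error(
    src="2001:db8:a::1",
    dst="2001:db8:b::1",
    reason="no route to destination",
    server_segment="2001:db8:ffff::1",         # hypothetical telemetry server
    handling_segments=["2001:db8:ffff::100"],  # hypothetical handling instruction
)
```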

Session triage and remediation systems and methods
11704177 · 2023-07-18

A computer system is provided. The computer system includes a memory and at least one processor coupled to the memory. The at least one processor is configured to scan session data representative of operation of a user interface comprising a plurality of user interface elements; detect, at a point in the session data, at least one changed element within the plurality of user interface elements; classify, in response to detecting the at least one changed element, the at least one changed element as either indicating or not indicating an error; store an association between the error and the point in the session data; and provide access to the point in the session data via the association.

Message Cloud

A method for error management is provided. The method comprises receiving a message call request regarding an error event generated by a software application. The message call request comprises a message ID associated with an error type. In response to the call request, a message cache is searched for the message ID. If the message ID is in the cache, an error message associated with the message ID is returned. The error message provides a description of the error and suggested remedial action. If the message ID is not in the cache, the error message is fetched from a message repository that contains error messages corresponding to respective message IDs. The fetched error message is loaded into the cache and returned. Message call request data is stored in a metrics repository. The message call request data comprises frequency metrics that describe how often the message ID is received.
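
The lookup described here follows a cache-aside pattern with a side channel for request metrics. A minimal sketch, assuming an in-memory dictionary for both the cache and the message repository and a `Counter` standing in for the metrics repository; the class name, message ID, and message fields are hypothetical.

```python
from collections import Counter


class MessageService:
    def __init__(self, repository):
        self.repository = repository  # message ID -> error message
        self.cache = {}               # messages already fetched
        self.call_counts = Counter()  # stand-in for the metrics repository

    def get_message(self, message_id):
        """Return the error message for message_id, caching it on first use."""
        self.call_counts[message_id] += 1  # record frequency of this ID
        if message_id in self.cache:
            return self.cache[message_id]
        message = self.repository[message_id]  # raises KeyError for unknown IDs
        self.cache[message_id] = message
        return message


repository = {
    "E1001": {
        "description": "Database connection timed out.",
        "remedy": "Check network connectivity and retry.",
    }
}
service = MessageService(repository)
first = service.get_message("E1001")   # cache miss: fetched from the repository
second = service.get_message("E1001")  # cache hit: served from the cache
```

After the two calls, the cache holds `E1001` and its frequency count is 2.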

Predictive batch job failure detection and remediation

Systems, methods, and computer programming products for predicting, preventing, and remediating failures of batch jobs being executed and/or queued for processing at a future scheduled time. Batch job parameters, messages, and system logs are stored in knowledge bases and/or inputted into AI models for analysis. Using predictive analytics and/or machine learning, batch job failures are predicted before the failures occur. Mappings of processes used by each batch job, historical data from previous batch jobs, and data identifying the success or failure thereof build an archive that can be refined over time through active learning feedback and AI modeling to predictively recommend actions that have historically prevented or remediated failures. Recommended actions are reported to the system administrator or automatically applied. As job failures occur over time, mappings of the current system log to logs for the unsuccessful batch jobs help root cause analysis become simpler and more automated.

Method and system for determining the state of application upgrades using a device emulation system of a customer environment

A method for managing a client environment includes obtaining, by a state processor, a state prediction request associated with an application upgrade on an emulation of a client device; in response to the state prediction request: obtaining live data associated with the application upgrade; performing natural language processing on the live data to obtain processed live data; applying a state prediction model to the processed live data to generate a state prediction; making a first determination that the state prediction indicates that the application upgrade was not successful; and in response to the first determination: making a second determination that the state prediction indicates that the application upgrade is fixable; and in response to the second determination: initiating the remediation of the application upgrade.

Technologies for deploying virtual machines in a virtual network function infrastructure

Technologies for deploying virtual machines (VMs) in a virtual network function (VNF) infrastructure include a compute device configured to collect a plurality of performance metrics based on a set of key performance indicators, determine a key performance indicator value for each of the set of key performance indicators based on the collected plurality of performance metrics, and determine a service quality index for a virtual machine (VM) instance of a plurality of VM instances managed by the compute device as a function of each key performance indicator value. Additionally, the compute device is configured to determine whether the determined service quality index is acceptable and perform, in response to a determination that the determined service quality index is not acceptable, an optimization action to ensure the VM instance is deployed on an acceptable host of the compute device. Other embodiments are described herein.

Branch target filtering based on memory region access count

A branch predictor of a processor includes one or more prediction structures, including a predicted branch address and predicted branch direction, that identify predicted branches. To reduce power consumption, the branch predictor selects one or more of the prediction structures that are not expected to provide useful branch prediction information and filters the selected structures such that the filtered structures are not used for branch prediction. The branch predictor thereby reduces the amount of power used for branch prediction without substantially reducing the accuracy of the predicted branches.

Optimizing hardware replacement using performance analytics

A solution is disclosed for computer hardware replacement using performance analytics that selects replacement computer hardware based on actual user needs and enterprise priorities. Key performance data is collected and compared with various baselines, thereby identifying hardware that is performing below acceptable levels. Enterprise data and collected data are received from an instrumented operating system on a computing device. The collected data includes boot performance, application performance, and hardware performance. Based at least on the collected data, a usability score is determined by performing a weighted calculation on the collected data. Based at least on the usability score and the enterprise data, it is determined whether a score improvement is required. Based at least on the enterprise data, a score improvement selection is determined. The score improvement selection is reported based at least on determining that a score improvement is required.
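
The abstract does not specify the weighted calculation, so the following is only one plausible form: a weighted average over metrics that are assumed to be pre-normalized to the range 0 to 1 (higher is better). The metric names, weights, and threshold are hypothetical, with the threshold standing in for a value derived from enterprise data.

```python
def usability_score(metrics, weights):
    """Weighted average of normalized metrics; weights need not sum to 1."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(metrics[name] * w for name, w in weights.items()) / total_weight


def improvement_required(score, threshold):
    """A score below the enterprise-defined threshold calls for an improvement."""
    return score < threshold


# Hypothetical collected data for one device.
metrics = {"boot": 0.5, "application": 0.75, "hardware": 1.0}
weights = {"boot": 1, "application": 2, "hardware": 1}

score = usability_score(metrics, weights)  # (0.5 + 1.5 + 1.0) / 4 = 0.75
needs_improvement = improvement_required(score, threshold=0.8)
```

With these inputs the score is 0.75, which falls below the assumed threshold of 0.8, so an improvement selection would be reported.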

Unified event processing and log management over multiple domains
11544124 · 2023-01-03

A computer-implemented method of providing unified event monitoring and log processing is disclosed. The method comprises receiving streaming event data comprising a plurality of event entries from a plurality of domains including a cloud manager for a cloud platform and an application running within a container on the cloud platform; processing the streaming event data into a normalized, domain-independent format; evaluating a plurality of policy rules on the streaming event data, wherein the plurality of policy rules is defined with a unified syntax; and in response to the evaluating satisfying a condition of a first rule of the plurality of policy rules, transmitting to a remote device data related to an action defined in the first rule, wherein the receiving, processing, evaluating, and transmitting for each event entry for the plurality of event entries are performed in real time.
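
The pipeline of normalizing entries from different domains and then evaluating uniformly defined rules can be sketched with a generator that handles each entry as it arrives. Everything concrete below is an assumption for illustration: the raw field names per domain, the normalized schema, the rule structure, and the action name.

```python
def normalize(domain, raw):
    """Map a domain-specific entry onto a common schema (field names assumed)."""
    if domain == "cloud_manager":
        return {"source": domain, "severity": raw["level"].lower(), "message": raw["msg"]}
    if domain == "container_app":
        return {"source": domain, "severity": raw["sev"].lower(), "message": raw["text"]}
    raise ValueError(f"unknown domain: {domain}")


# Every rule uses the same shape regardless of which domain produced the event.
RULES = [
    {
        "name": "page-on-error",
        "condition": lambda event: event["severity"] == "error",
        "action": "notify_oncall",
    },
]


def process_stream(entries, rules):
    """Yield (action, event) pairs entry by entry, without batching."""
    for domain, raw in entries:
        event = normalize(domain, raw)
        for rule in rules:
            if rule["condition"](event):
                yield rule["action"], event


stream = [
    ("cloud_manager", {"level": "INFO", "msg": "vm started"}),
    ("container_app", {"sev": "ERROR", "text": "db timeout"}),
]
actions = list(process_stream(stream, RULES))
```

Only the container application's error entry satisfies the rule, so a single `notify_oncall` action is produced for transmission.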