G06F11/0706

Identifying root causes of software defects

Root cause identification of a software defect includes: identifying, in program code of a software feature, hedge code of the software feature based on errors induced from temporarily substituting program code of the software feature with substitute program code, and obtaining an error graph for the hedge code; obtaining error logs of an application that incorporates the software feature, the error logs indicating errors with the software feature of the application; automatically generating an application error graph reflective of the errors with the software feature of the application; mapping the application error graph to the error graph for the hedge code; and, based on the mapping aligning one or more errors reflected in the application error graph to error(s) reflected in the error graph for the hedge code, identifying the hedge code as inducing a root error identified in the application error graph.
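
The mapping step can be illustrated with a minimal sketch. It assumes, purely for illustration, that each error graph is a dictionary from an error identifier to the errors it induces, that a root error is one no other error induces, and that errors align when their identifiers match. The error names and helper functions are hypothetical and are not taken from the abstract.

```python
from typing import Dict, List, Set

# Assumption: an error graph maps an error id to the errors it induces downstream.
ErrorGraph = Dict[str, List[str]]


def nodes(graph: ErrorGraph) -> Set[str]:
    """Collect every error that appears as a source or a target."""
    found = set(graph)
    for targets in graph.values():
        found.update(targets)
    return found


def root_errors(graph: ErrorGraph) -> Set[str]:
    """Errors that are not induced by any other error in the graph."""
    induced = {t for targets in graph.values() for t in targets}
    return nodes(graph) - induced


def aligned_errors(app_graph: ErrorGraph, hedge_graph: ErrorGraph) -> Set[str]:
    """Assumption: two errors align when their identifiers are equal."""
    return nodes(app_graph) & nodes(hedge_graph)


def hedge_induces_root(app_graph: ErrorGraph, hedge_graph: ErrorGraph) -> bool:
    """True when a root error of the application graph aligns with the hedge graph."""
    return bool(root_errors(app_graph) & aligned_errors(app_graph, hedge_graph))


# Hypothetical graphs: one from substituting code, one built from application logs.
hedge_graph = {"E_NULL_CONFIG": ["E_TIMEOUT"], "E_TIMEOUT": ["E_RETRY_EXHAUSTED"]}
app_graph = {"E_NULL_CONFIG": ["E_TIMEOUT"], "E_TIMEOUT": ["E_UI_FROZEN"]}

if __name__ == "__main__":
    print(root_errors(app_graph))
    print(hedge_induces_root(app_graph, hedge_graph))
```

A real implementation would likely need fuzzier alignment than exact identifier equality, for example matching on stack frames or message templates.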

TECHNIQUES TO PROVIDE SELF-HEALING DATA PIPELINES IN A CLOUD COMPUTING ENVIRONMENT
20230214289 · 2023-07-06

Embodiments may generally be directed to systems and techniques to detect failure events in data pipelines, determine one or more remedial actions to perform, and perform the one or more remedial actions.

METHOD FOR ENCODED DIAGNOSTICS IN A FUNCTIONAL SAFETY SYSTEM
20230006697 · 2023-01-05

A method includes storing a set of valid codewords including: a first valid functional codeword representing a functional state of a controller subsystem; a first valid fault codeword representing a fault state of the controller subsystem and characterized by a minimum Hamming distance from the first valid functional codeword; a second valid functional codeword representing a functional state of a controller; and a second valid fault codeword representing a fault state of the controller; in response to detecting functional operation of the controller subsystem, storing the first valid functional codeword in a first memory; in response to detecting a match between contents of the first memory and the first valid functional codeword, outputting the second valid functional codeword; and, in response to detecting a mismatch between contents of the first memory and every codeword in the set of valid codewords, outputting the second valid fault codeword.
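
The codeword check described above can be sketched as follows. The 8-bit codeword values, the `MIN_DISTANCE` of 4, and the function names are hypothetical choices made for illustration. The abstract does not say what happens when the first memory holds a valid codeword other than the first functional one, so treating that case as a fault is an assumption of this sketch.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two codewords differ."""
    return bin(a ^ b).count("1")


# Hypothetical 8-bit codewords; real systems would choose their own values.
SUB_FUNCTIONAL = 0b10100101   # first valid functional codeword (subsystem OK)
SUB_FAULT = 0b01011010        # first valid fault codeword (subsystem fault)
CTRL_FUNCTIONAL = 0b11000011  # second valid functional codeword (controller OK)
CTRL_FAULT = 0b00111100       # second valid fault codeword (controller fault)
VALID_CODEWORDS = {SUB_FUNCTIONAL, SUB_FAULT, CTRL_FUNCTIONAL, CTRL_FAULT}

# Assumed minimum separation, so a few flipped bits cannot turn "OK" into "fault".
MIN_DISTANCE = 4
assert hamming_distance(SUB_FUNCTIONAL, SUB_FAULT) >= MIN_DISTANCE


def controller_output(first_memory: int) -> int:
    """Select the controller-level codeword from the contents of the first memory."""
    if first_memory == SUB_FUNCTIONAL:
        return CTRL_FUNCTIONAL
    if first_memory not in VALID_CODEWORDS:
        # Contents match no valid codeword, e.g. after memory corruption.
        return CTRL_FAULT
    # Assumption: any other valid codeword is also reported as a fault.
    return CTRL_FAULT


if __name__ == "__main__":
    corrupted = SUB_FUNCTIONAL ^ 0b1  # single bit flip
    print(controller_output(SUB_FUNCTIONAL) == CTRL_FUNCTIONAL)
    print(controller_output(corrupted) == CTRL_FAULT)
```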

Utilizing artificial intelligence to generate and update a root cause analysis classification model

A device trains a classification model with defect classifier training data to generate a trained classification model and processes information indicating priorities and rework efforts for defects, with a Pareto analysis model, to select a set of classes for the defects. The device calculates defect scores for the set of the classes and selects a particular class, from the set of the classes, based on the defect scores. The device processes a historical data set for the particular class to identify a root cause corrective action (RCCA) recommendation and processes information indicating a defect associated with the particular class, with the trained classification model, to generate a predicted RCCA recommendation for the defect. The device processes the predicted RCCA recommendation and the RCCA recommendation, with a linear regression model, to determine an effectiveness score for the predicted RCCA recommendation and retrains the classification model based on the effectiveness score.
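
As a rough illustration of the Pareto analysis step, the sketch below selects the defect classes that together account for a chosen share of total weight. The weights (imagined here as priority multiplied by rework effort), the 80% threshold, and the class names are all assumptions. The abstract does not specify how priorities and rework efforts are combined.

```python
from typing import Dict, List


def pareto_select(weights: Dict[str, float], threshold: float = 0.8) -> List[str]:
    """Pick the heaviest classes until their cumulative weight reaches the threshold share."""
    total = sum(weights.values())
    selected: List[str] = []
    cumulative = 0.0
    # Walk the classes from heaviest to lightest.
    for cls, weight in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
        if cumulative >= threshold * total:
            break
        selected.append(cls)
        cumulative += weight
    return selected


# Hypothetical weights per defect class.
defect_weights = {
    "requirements": 55,
    "coding": 30,
    "environment": 10,
    "documentation": 5,
}

if __name__ == "__main__":
    print(pareto_select(defect_weights))
```

With these weights, the two heaviest classes already cover 85% of the total, so only they are selected.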

IN-APP FAILURE INTELLIGENT DATA COLLECTION AND ANALYSIS

Intelligent collection and analysis of in-app failure data is disclosed herein. Upon an application failure in a client device, the client device may collect failure information uniquely identifying a specific failure and provide the failure information to an analysis system. Based on the failure information, the analysis system may identify a specific failure that identifies the application and a specific portion of the code in the application, and match an action correlated to the specific failure, where the action is uniquely designed to resolve the specific failure in the application. The action may include instructions for the client device used to intelligently lead to a resolution of the specific failure. The analysis system may transmit the action to the client device to perform the action and provide any follow-up information to the analysis server. The analysis server may use the information to further analyze the specific failure.

CUSTOM BASEBOARD MANAGEMENT CONTROLLER (BMC) FIRMWARE STACK MONITORING SYSTEM AND METHOD

An Information Handling System (IHS) includes multiple hardware devices, and a Baseboard Management Controller (BMC) in communication with the multiple hardware devices of the IHS. The BMC includes executable instructions for monitoring a parameter of one or more of the hardware devices when a custom BMC firmware stack is executed on the BMC. The instructions that monitor the parameter are separate and distinct from the instructions of the custom BMC firmware stack. The instructions also control the BMC to perform one or more operations to remediate an excessive parameter when the parameter exceeds a specified threshold.

Method and system for power equipment diagnosis based on windowed feature and Hilbert visualization

A method and a system for power equipment diagnosis based on windowed feature and Hilbert visualization are provided, which belong to the field of power equipment fault diagnosis. The method includes: obtaining an original data set of monitoring data containing power equipment fault features; introducing windowed feature calculation considering logarithmic constraints to process data to obtain a feature sequence; using Hilbert visualization method for further processing to obtain a Hilbert image data set used to train and verify a convolutional neural network; and finally directly inputting newly obtained test sample data after windowed feature calculation and Hilbert visualization processing into the trained network for fault diagnosis and location. The disclosure uses windowed feature calculation and Hilbert visualization to process the monitoring data of a power equipment to fully extract fault features and effectively improve diagnostic accuracy, and uses the convolutional neural network for diagnosis to improve the intelligence of diagnosis.
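
The Hilbert visualization step can be sketched as below: a one-dimensional feature sequence is laid out along a Hilbert curve so that neighbouring features stay close together in the image. The index-to-coordinate conversion is the standard Hilbert curve algorithm. The window feature shown (`log1p` of the mean absolute value) is only a stand-in, since the abstract does not define its "logarithmic constraints", and the function names are hypothetical.

```python
import math
from typing import List, Tuple


def d2xy(n: int, d: int) -> Tuple[int, int]:
    """Convert index d along a Hilbert curve into (x, y) on an n x n grid (n a power of 2)."""
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # Flip the quadrant before rotating it.
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def windowed_features(signal: List[float], window: int) -> List[float]:
    """Assumed feature: log1p of the mean absolute value of each non-overlapping window."""
    feats = []
    for start in range(0, len(signal) - window + 1, window):
        chunk = signal[start:start + window]
        feats.append(math.log1p(sum(abs(v) for v in chunk) / window))
    return feats


def hilbert_image(features: List[float], n: int) -> List[List[float]]:
    """Place features along the Hilbert curve; unused cells stay at 0.0."""
    image = [[0.0] * n for _ in range(n)]
    for d, value in enumerate(features[: n * n]):
        x, y = d2xy(n, d)
        image[y][x] = value
    return image


if __name__ == "__main__":
    feats = windowed_features([1.0, -1.0, 2.0, -2.0, 0.5, 0.5, 0.0, 0.0], window=2)
    for row in hilbert_image(feats, n=2):
        print(row)
```

The resulting grid can then be fed to an image classifier such as a convolutional neural network, which is the role the abstract assigns to the Hilbert image data set.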

TECHNIQUES TO PROVIDE SELF-HEALING DATA PIPELINES IN A CLOUD COMPUTING ENVIRONMENT
20220382620 · 2022-12-01

Embodiments may generally be directed to systems and techniques to detect failure events in data pipelines, determine one or more remedial actions to perform, and perform the one or more remedial actions.

Hot-swap controller fault reporting system

A hot-swap controller fault reporting system includes component(s), a hot-swap controller that is coupled to the component(s), and a hot-swap controller fault reporting subsystem that is coupled to the hot-swap controller. The hot-swap controller fault reporting subsystem identifies a hot-swap controller fault that was generated by the hot-swap controller and that is associated with the component(s), generates an Intelligent Platform Management Interface (IPMI) bit combination that is based on the hot-swap controller fault and that is configured to identify the hot-swap controller and a type of the hot-swap controller fault, and provides a log entry based on the IPMI bit combination in a log database.

IDENTIFYING ROOT CAUSES OF SOFTWARE DEFECTS

Root cause identification of a software defect includes: identifying, in program code of a software feature, hedge code of the software feature based on errors induced from temporarily substituting program code of the software feature with substitute program code, and obtaining an error graph for the hedge code; obtaining error logs of an application that incorporates the software feature, the error logs indicating errors with the software feature of the application; automatically generating an application error graph reflective of the errors with the software feature of the application; mapping the application error graph to the error graph for the hedge code; and, based on the mapping aligning one or more errors reflected in the application error graph to error(s) reflected in the error graph for the hedge code, identifying the hedge code as inducing a root error identified in the application error graph.