G06F11/0793

ONLINE ERROR RECOVERY
20230029315 · 2023-01-26 ·

A technique for correcting errors in a data storage system operates while the data storage system remains online. The technique includes identifying an object for validation, scanning a plurality of pointers, and counting a number of pointers that point to the object. The technique further includes repairing a discrepancy between the count of pointers and a reference count stored in connection with the object.
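
The count-and-repair step described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes pointers are held in a dictionary mapping a pointer id to the object id it references, and that stored reference counts live in a second dictionary. The names `validate_object`, `pointers`, and `refcounts` are hypothetical.

```python
def validate_object(obj_id, pointers, refcounts):
    """Recount pointers to obj_id and repair its stored reference count.

    pointers:  dict of pointer id -> referenced object id (assumed layout)
    refcounts: dict of object id -> stored reference count (assumed layout)
    Returns (stored, counted) if a repair was made, otherwise None.
    """
    # Scan every pointer and count those that point to the object.
    counted = sum(1 for target in pointers.values() if target == obj_id)
    stored = refcounts.get(obj_id, 0)
    if counted == stored:
        return None
    # Repair the discrepancy by trusting the freshly scanned count.
    refcounts[obj_id] = counted
    return (stored, counted)
```

A real storage system would also need to guard the scan against concurrent writes, since the technique runs while the system stays online; that coordination is left out of this sketch.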

HARDWARE-BASED SENSOR ANALYSIS
20230229549 · 2023-07-20 ·

A method of monitoring messages from a sensor using an integrated circuit is provided. The messages include data measured by that sensor. The method includes reading a first message from interconnect circuitry of the integrated circuit. The interconnect circuitry connects the sensor to one or more core devices configured to process the messages. A first hash value is calculated for the first message. The first hash value is compared to one or more prior hash values stored in a hash store. Each prior hash value of the one or more prior hash values corresponds to a message that was read from the interconnect circuitry prior to the first message. A corrective action is performed when a difference between the first hash value and at least one of the prior hash values stored in the hash store is below a predetermined threshold.
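
The hash comparison described above can be modeled in software as follows. The abstract does not say how the hash is computed or how the difference is measured, so this sketch assumes integer hash values and uses Hamming distance as the difference; the class `HashMonitor` and its bounded store are hypothetical stand-ins for the hardware hash store.

```python
from collections import deque


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two integer hash values differ."""
    return bin(a ^ b).count("1")


class HashMonitor:
    """Software model of a hash store holding recent message hashes."""

    def __init__(self, threshold: int, capacity: int = 8):
        self.threshold = threshold
        # Bounded store: the oldest hash is dropped once capacity is reached.
        self.store = deque(maxlen=capacity)

    def check(self, hash_value: int) -> bool:
        """Return True when a corrective action should be performed.

        That is the case when the difference to at least one prior
        hash is below the threshold.
        """
        flagged = any(
            hamming_distance(hash_value, prior) < self.threshold
            for prior in self.store
        )
        self.store.append(hash_value)
        return flagged
```

With a similarity-preserving hash, a difference below the threshold would suggest that a message nearly repeats an earlier one, for example from a stuck sensor; that interpretation is an assumption, not something the abstract states.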

Simulated Data Center

A system, method, and computer-readable medium are disclosed for performing a data center monitoring and management operation. The data center monitoring and management operation includes: selecting a data center asset for simulation; identifying a set of session input data for use during simulation; and performing a data center asset simulation session operation for the data center asset based upon the set of session input data.

MULTI-CONTROLLER DECLARATIVE FAULT MANAGEMENT AND COORDINATION FOR MICROSERVICES
20230023744 · 2023-01-26 ·

Methods, systems, and computer program products for multi-controller declarative fault management and coordination for microservices are provided herein. A computer-implemented method includes processing information pertaining to at least one fault impacting multiple resources within a given system, wherein respective portions of the multiple resources are managed by multiple independent controllers; determining, by each of at least a portion of the multiple independent controllers and based at least in part on the processing of the information, one or more desired resource states and one or more remediation actions; generating, based at least in part on one or more of the determined desired resource states and the determined remediation actions, a sequential ordering of the determined remediation actions to be carried out by the at least a portion of the multiple controllers; and automatically initiating execution of the determined remediation actions in accordance with the generated sequential ordering.
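
One way to generate the sequential ordering described above is a topological sort over the proposed remediation actions. The abstract does not specify how the ordering is derived, so this is a sketch under assumptions: each controller is assumed to declare, for every action, the actions that must finish first. The function name and the data layout are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def order_remediation_actions(proposals):
    """Merge per-controller proposals into one sequential ordering.

    proposals: dict of controller name -> list of (action, prerequisites),
               where prerequisites is a list of actions that must run first.
    Raises graphlib.CycleError if the declared prerequisites form a cycle.
    """
    sorter = TopologicalSorter()
    for actions in proposals.values():
        for action, prerequisites in actions:
            sorter.add(action, *prerequisites)
    # static_order() yields every action after all of its prerequisites.
    return list(sorter.static_order())
```

Actions with no ordering constraint between them may appear in either order, so a production coordinator would likely need a tie-breaking rule on top of this.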

UTILIZING AUTOMATIC LABELLING, PRIORITIZING, AND ROOT CAUSE ANALYSIS MACHINE LEARNING MODELS AND DEPENDENCY GRAPHS TO DETERMINE RECOMMENDATIONS FOR SOFTWARE PRODUCTS

A device may receive software data identifying current logs and events associated with software products utilized by an entity and may process the software data, with a machine learning model, to generate error severity scores for the software products. The machine learning model may be trained based on historical software data identifying events and logs associated with software products utilized by the entity and based on a combination of historical health scores, historical sentiment scores, and historical dissimilarity scores for the software products. The device may process the error severity scores, with a prioritization model, to generate prioritized error scores and may process the error severity scores and the prioritized error scores, with a root cause analysis model, to generate root cause data identifying root causes associated with the error severity scores. The device may perform one or more actions based on the root cause data.

CONTROLLER AREA NETWORK AND CONNECTIVITY HEALTH TROUBLESHOOTING SYSTEM

A system and method for diagnosing connection and communication in an industrial machine are provided. An electronic processing system includes a CAN bus, an ethernet network, and a plurality of devices connected to the CAN bus and the ethernet network. The plurality of devices includes at least one controller programmed to run one or more software applications. A connectivity check is performed to obtain CAN connection status data and ethernet connection status data for the plurality of devices. The CAN connection status data and the ethernet connection status data are analyzed to determine a likely cause of a device connection issue. A solution to the device connection issue is output to a user based on the analyzed data.
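
The analysis step can be illustrated with a small rule-based sketch. The rules, cause labels, and suggested solutions below are assumptions made for illustration; the abstract does not state how the likely cause is determined. The function `diagnose` and the status layout are hypothetical.

```python
def diagnose(status):
    """Map connection status data to likely causes and suggested solutions.

    status: dict of device name -> {"can": bool, "ethernet": bool},
            where True means the connectivity check reached the device.
    Returns a list of (subject, likely_cause, suggested_solution) tuples.
    """
    findings = []
    if not status:
        return findings
    # A link that is down for every device points at the shared medium.
    can_all_down = all(not s["can"] for s in status.values())
    eth_all_down = all(not s["ethernet"] for s in status.values())
    if can_all_down:
        findings.append(("CAN bus", "bus-wide fault",
                         "Inspect CAN termination and trunk wiring."))
    if eth_all_down:
        findings.append(("ethernet network", "network-wide fault",
                         "Inspect the ethernet switch."))
    for device, s in status.items():
        # Only report per-device faults not already explained above.
        can_fault = not s["can"] and not can_all_down
        eth_fault = not s["ethernet"] and not eth_all_down
        if can_fault and eth_fault:
            findings.append((device, "device offline",
                             "Check device power."))
        elif can_fault:
            findings.append((device, "CAN link fault",
                             "Check the device's CAN connector."))
        elif eth_fault:
            findings.append((device, "ethernet link fault",
                             "Check the device's ethernet cable."))
    return findings
```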

Systems and Methods for Predicting Power Converter Health

A method for predicting power converter health is provided. The method comprises receiving a plurality of parameter measurements associated with a power converter system comprising a power converter. The plurality of parameter measurements comprises a first set of system measurements and a second set of failure precursor measurements. The method further comprises inputting the first set of system measurements into a first machine learning algorithm to generate expected failure precursor measurement information and inputting the expected failure precursor measurement information and the second set of failure precursor measurements into a second machine learning algorithm to generate component failure prediction information. The method also comprises performing one or more actions based on the generated component failure prediction information.

Control device for vehicle-mounted apparatus

A control device for a vehicle-mounted apparatus includes: a second CPU state judging section provided to the first CPU, and configured to judge a state of the second CPU based on a state of the inter-CPU communication and a voltage value of the electric power supplied from the first electric power supply section, or the second reset signal; and a first CPU state judging section provided to the second CPU, and configured to judge a state of the first CPU based on the state of the inter-CPU communication and a voltage value of the electric power supplied from the second electric power supply section, or the first reset signal.

QUANTUM COMPUTER SYSTEM SCHEDULING AND PARAMETERIZATION BASED ON ERROR CORRECTION HISTORY
20230229491 · 2023-07-20 ·

In one example described herein, a system can receive, by a scheduler of a server, a request to execute a quantum algorithm. The system can determine, by the scheduler, a quantum computer system of a plurality of quantum computer systems to execute the quantum algorithm based on a database that stores associations between each quantum computer system of the plurality of quantum computer systems, at least one parameter associated with the quantum algorithm, and error information. The system can transmit, by the scheduler, the request to the quantum computer system for executing the quantum algorithm.
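
The selection step can be sketched as a lookup over stored associations. The abstract does not define the form of the error information or the selection rule, so this sketch assumes each record carries a numeric error rate and picks the system with the lowest mean rate for the requested parameter. The function `select_system` and the record layout are hypothetical.

```python
from statistics import mean


def select_system(records, parameter):
    """Pick the quantum computer system with the lowest mean error rate.

    records:   iterable of (system, parameter, error_rate) associations,
               standing in for the database described in the abstract.
    parameter: the algorithm parameter attached to the incoming request.
    Returns the chosen system name, or None if no history matches.
    """
    rates = {}
    for system, param, error_rate in records:
        if param == parameter:
            rates.setdefault(system, []).append(error_rate)
    if not rates:
        return None
    # Lower mean historical error rate is assumed to be better.
    return min(rates, key=lambda s: mean(rates[s]))
```

Returning `None` when no history matches leaves the fallback policy, such as a default system or rejecting the request, to the caller.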

SYSTEM AND METHODS TO DETECT FAULTY COMPONENTS DURING SESSION LAUNCH

A computer system configured to identify errors in a session launch initiated by a client application is provided. The computer system includes a memory and at least one processor coupled to the memory. The at least one processor is configured to receive one or more events from one or more applications or devices involved in the session launch, wherein an event of the one or more events comprises information from an application or device call (e.g., an application programming interface (API) call) communicated during the session launch, the information comprising destination information; build a primary Directed Acyclic Graph (DAG) based on the information from the API call; determine an error identifier based on the primary DAG; retrieve a troubleshooting recommendation from a library based on the error identifier; and send the troubleshooting recommendation to the client application.
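
The DAG-based steps above can be sketched as follows. The event fields, the rule for deriving an error identifier, and the library contents are all assumptions for illustration: each event is assumed to carry a `source`, a `destination`, and an `ok` flag, and the identifier is taken from the deepest failed call, meaning a failed call whose destination made no failed calls of its own.

```python
def build_dag(events):
    """Build an adjacency map from call events.

    events: list of dicts with "source", "destination", and "ok" keys
            (assumed event shape).
    Returns dict of node -> list of (destination, ok) edges.
    """
    dag = {}
    for event in events:
        edge = (event["destination"], event["ok"])
        dag.setdefault(event["source"], []).append(edge)
        dag.setdefault(event["destination"], [])
    return dag


def error_identifier(dag):
    """Return "source->destination" for the deepest failed call, or None."""
    for node, edges in dag.items():
        for dest, ok in edges:
            # A failure is deepest when the callee had no failures itself.
            if not ok and all(child_ok for _, child_ok in dag[dest]):
                return f"{node}->{dest}"
    return None


def recommend(events, library):
    """Look up a troubleshooting recommendation for a session launch."""
    error_id = error_identifier(build_dag(events))
    return library.get(error_id) if error_id is not None else None
```

Here `library` is a plain dictionary from error identifier to recommendation text, standing in for the troubleshooting library mentioned in the abstract.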