Patent classifications
G06F11/0778
MINIMIZING IMPACT OF FIRST FAILURE DATA CAPTURE ON COMPUTING SYSTEM USING RECOVERY PROCESS BOOST
A computer-implemented method for capturing system memory dumps includes receiving, by a diagnostic data component, an instruction to capture a system memory dump associated with a computer process being executed by a computing system comprising one or more processing units, the system memory dump comprising data from a plurality of memory locations associated with the computer process. In response to determining that the system memory dump satisfies a predetermined criterion, the diagnostic data component sends a request for a computing resource boost from the computing system. Further, in response to the request for the computing resource boost being granted, the diagnostic data component uses additional computing resources from the one or more processing units to store the data from the plurality of memory locations in the system memory dump and to execute backlogged operations that were halted due to the system memory dump capture.
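The request-then-boost flow in this abstract can be sketched roughly as follows. This is an illustration only, not the claimed method: it assumes the "predetermined criterion" is a dump-size threshold and that a granted boost simply means more worker threads, and the names `capture_dump`, `request_boost`, `size_threshold`, `base_workers`, and `boosted_workers` are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def capture_dump(regions, request_boost, size_threshold=1 << 20,
                 base_workers=1, boosted_workers=4):
    """Copy memory regions into a dump, asking for a boost when the dump is large.

    regions:        list of bytes-like objects standing in for memory locations
    request_boost:  hypothetical callback; returns True if the boost is granted
    """
    total = sum(len(r) for r in regions)

    # Assumed criterion: only dumps at or above the threshold request a boost.
    workers = base_workers
    if total >= size_threshold and request_boost(total):
        workers = boosted_workers

    # Copy every region; pool.map keeps the results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        dump = list(pool.map(bytes, regions))
    return dump, workers


regions = [bytearray(b"ab"), bytearray(b"cd")]
dump, workers = capture_dump(regions, request_boost=lambda size: True,
                             size_threshold=4)
```

In a real system the boost would also be applied to the work that queued up while the dump was being taken; that part is omitted here.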
Electronic apparatus equipped with HDD, control method therefor, and storage medium
An electronic apparatus which is capable of preventing loss of data in an HDD resulting from an instantaneous power failure. The electronic apparatus is equipped with the HDD that has a nonvolatile storage area and a volatile storage area in which data is temporarily held. A control unit executes a plurality of processes including a held data writing process in which the data held in the volatile storage area is written into the nonvolatile storage area, in a predetermined order according to an off instruction by a user. In a case where a stop of the power supply to the electronic apparatus is detected, the control unit executes the plurality of processes including the held data writing process in a different order from the predetermined order.
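A minimal sketch of the reordering idea, under stated assumptions: the abstract only says the order differs when a power stop is detected, so moving the cache write to the front is an assumption, and the step names in `NORMAL_ORDER` are hypothetical.

```python
# Hypothetical shutdown steps in the order used for a normal user "off" instruction.
NORMAL_ORDER = ["save_settings", "stop_jobs", "flush_hdd_cache", "power_off"]


def shutdown_order(power_loss_detected):
    """Return the list of shutdown steps to run, in order."""
    if not power_loss_detected:
        return list(NORMAL_ORDER)
    # Assumed policy: write the volatile cache to nonvolatile storage first,
    # then run the remaining steps in their usual relative order.
    rest = [step for step in NORMAL_ORDER if step != "flush_hdd_cache"]
    return ["flush_hdd_cache"] + rest
```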
Profiling and debugging for remote neural network execution
Remote access for debugging or profiling a remotely executing neural network graph can be performed by a client using an in-band application programming interface (API). The client can provide indicator flags for debugging or profiling in an inference request sent to a remote server computer executing the neural network graph using the API. The remote server computer can collect metadata for debugging or profiling during the inference operation using the neural network graph and send it back to the client using the same API. Additionally, the metadata can be collected at various granularity levels also specified in the inference request.
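The in-band idea, where flags travel in the inference request and metadata comes back in the same response, might look like the sketch below. The request fields (`input`, `profile`, `debug`, `granularity`), the response layout, and the representation of the graph as a list of named Python callables are all assumptions made for illustration.

```python
import time


def handle_inference(request, layers):
    """Run the layers in order and attach metadata when the request asks for it.

    request: dict with "input" and optional "profile", "debug", "granularity"
    layers:  list of (name, callable) pairs standing in for a neural network graph
    """
    want_profile = request.get("profile", False)
    want_debug = request.get("debug", False)
    granularity = request.get("granularity", "graph")  # "graph" or "layer"

    value = request["input"]
    per_layer = []
    graph_start = time.perf_counter()
    for name, fn in layers:
        layer_start = time.perf_counter()
        value = fn(value)
        if granularity == "layer" and (want_profile or want_debug):
            entry = {"layer": name}
            if want_profile:
                entry["seconds"] = time.perf_counter() - layer_start
            if want_debug:
                entry["output"] = value
            per_layer.append(entry)

    response = {"output": value}
    if want_profile or want_debug:
        metadata = {}
        if want_profile:
            metadata["total_seconds"] = time.perf_counter() - graph_start
        if granularity == "layer":
            metadata["layers"] = per_layer
        response["metadata"] = metadata  # returned through the same API call
    return response


layers = [("scale", lambda x: x * 2), ("shift", lambda x: x + 1)]
response = handle_inference(
    {"input": 3, "profile": True, "debug": True, "granularity": "layer"}, layers
)
```

A request without the flags gets back only `output`, so ordinary clients see no change in the response shape.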
SELF-OPTIMIZING ANALYSIS SYSTEM FOR CORE DUMPS
A method for facilitating root cause analysis of a software crash by core dump analysis is disclosed. The method comprises receiving a core dump file relating to a software program, identifying unique source code lines in the core dump file for each thread running at the time of the crash, and determining unique source code lines as conspicuous source code lines depending on an abstraction level value indicating a number of occurrences of the conspicuous source code line in different threads. Furthermore, the method comprises determining an abstraction ratio as a function of a number of conspicuous source code lines and a number of unique source code lines, evaluating whether the predefined abstraction level value has to be adjusted by determining a unique source code line as a conspicuous source code line and determining an abstraction ratio, and outputting the conspicuous source code lines and an assessment value for the abstraction ratio.
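The counting steps in this abstract can be sketched as below. Several details are assumptions rather than anything the abstract states: the input is taken to be a mapping from thread id to the source lines found in that thread's stack, a line is treated as conspicuous when it appears in at least `abstraction_level` threads, and the abstraction ratio is taken to be conspicuous lines divided by unique lines.

```python
from collections import Counter


def conspicuous_lines(thread_lines, abstraction_level):
    """Return (conspicuous, unique, ratio) for the given per-thread source lines."""
    counts = Counter()
    for lines in thread_lines.values():
        counts.update(set(lines))  # count each line at most once per thread

    unique = sorted(counts)
    conspicuous = sorted(
        line for line, n in counts.items() if n >= abstraction_level
    )
    ratio = len(conspicuous) / len(unique) if unique else 0.0
    return conspicuous, unique, ratio


# Hypothetical stacks for three threads at crash time.
threads = {
    1: ["lock.c:88", "queue.c:41"],
    2: ["lock.c:88", "io.c:12"],
    3: ["lock.c:88", "queue.c:41", "main.c:7"],
}
conspicuous, unique, ratio = conspicuous_lines(threads, abstraction_level=2)
# conspicuous == ["lock.c:88", "queue.c:41"], ratio == 0.5
```

A self-adjusting variant would raise or lower `abstraction_level` and recompute until the ratio falls in a target range; how that range is chosen is not given in the abstract.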
Progressive error handling
Systems and methods herein describe receiving identification from a data pipeline, accessing first data offset information for a first data origin and second data offset information for a second data origin, bisecting the first data origin using the first data offset information, processing the data pipeline with the bisected first data offset information and the second data offset information, receiving a notification indicating a data pipeline status, and causing presentation of the notification on a graphical user interface of a computing device.
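One way to read the bisection step is as a binary search over the first origin's offset range while the second origin's offset stays fixed. The sketch below assumes exactly one bad offset in the range, and the `run_pipeline(first_start, first_end, second_offset)` callback that returns `True` on success is hypothetical.

```python
def find_failing_offset(run_pipeline, start, end, second_offset):
    """Narrow the half-open range [start, end) to the one offset that fails.

    Assumes run_pipeline(start, end, second_offset) fails for the full range
    and that a single offset is responsible.
    """
    while end - start > 1:
        mid = (start + end) // 2
        if run_pipeline(start, mid, second_offset):
            start = mid  # first half is clean, so the fault is in the second half
        else:
            end = mid    # the fault is in the first half
    return start


# Hypothetical pipeline that fails whenever offset 37 is in the processed range.
def run_pipeline(first_start, first_end, second_offset):
    return not (first_start <= 37 < first_end)


bad = find_failing_offset(run_pipeline, 0, 100, second_offset=0)
# bad == 37
```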
Agricultural sensor placement and fault detection in wireless sensor networks
Disclosed are various embodiments for optimized sensor deployment and fault detection in the context of agricultural irrigation and similar applications. For instance, a computing device may execute a genetic algorithm (GA) routine to determine an optimal sensor deployment scheme such that a mean-time-to-failure (MTTF) for the system is maximized, thereby improving communication of sensor measurements. Moreover, in various embodiments, a centralized fault detection scheme may be employed and a soil moisture of a field can be determined by statistically inferring soil moistures at locations of faulty nodes using spatial and temporal correlations.
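As a rough illustration of filling in a faulty node's reading from its neighbors, the sketch below uses inverse-distance weighting. This is a swapped-in, purely spatial technique chosen for brevity; the abstract's statistical inference also uses temporal correlation, which is not modeled here, and the function name and data layout are assumptions.

```python
import math


def infer_moisture(target_xy, healthy):
    """Estimate soil moisture at target_xy from healthy sensor readings.

    healthy: list of ((x, y), moisture) pairs from non-faulty nodes
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), moisture in healthy:
        distance = math.hypot(x - target_xy[0], y - target_xy[1])
        if distance == 0:
            return moisture  # a healthy sensor sits exactly at the target
        weight = 1.0 / distance ** 2  # closer sensors count more
        weighted_sum += weight * moisture
        weight_total += weight
    if weight_total == 0:
        raise ValueError("no healthy sensors to infer from")
    return weighted_sum / weight_total


estimate = infer_moisture((0, 0), [((1, 0), 0.2), ((-1, 0), 0.4)])
# two equidistant neighbors, so the estimate is their average (about 0.3)
```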
Performing root cause analysis in a multi-role application
A new snapshot of a storage volume is created by instructing computing nodes to suppress write requests. A snapshot of the application may be created and used to roll back or clone the application. Clone snapshots of storage volumes may be gradually populated with data from prior snapshots to reduce loading on a primary snapshot. Components of cloned applications may communicate with one another using addresses of these components in the parent application. Jobs implementing a bundled application may be referenced with a simulated file system that generates reads to hosts only when the job log file is actually read. Job logs and a job hierarchy may be used to perform root cause analysis. Job logs may be for tasks such as creating the bundled application, cloning, rolling back, backing up, scaling out, scaling in, deleting, pruning unused application images, or the like.
Supervisor module for crash detection and mitigation
Disclosed are systems and methods for controlling a video display of a computing device during malfunction. The systems and methods can include receiving a first video stream and determining that the first video stream includes an error message for display on the video display. Once an error message is detected, a second video stream can be transmitted to the video display. The second video stream can include an alternate message for display on the video display.
FRAMEWORK FOR ANOMALY DETECTION AND RESOLUTION PREDICTION
A method comprises collecting operational data for one or more devices and identifying one or more anomalies associated with the one or more devices based at least in part on the collected operational data. At least a portion of the collected operational data corresponding to the identified one or more anomalies is analyzed, and a probability of automatic resolution for respective ones of the identified one or more anomalies is determined based at least in part on the analysis. The identifying, the analyzing and the determining are performed using one or more machine learning models.
Memory anomaly detection method and device
A method includes obtaining a first memory log, where the first memory log includes log information of a plurality of garbage collections, and log information of each garbage collection includes a garbage collection time and at least one of a downtime, memory usage after garbage collection, and memory usage before garbage collection; obtaining, based on log information in a first detection time window, first statistical information corresponding to the first detection time window; and determining, based on the first statistical information corresponding to the first detection time window, an anomaly degree corresponding to the log information in the first detection time window.
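The window-then-score flow can be sketched as follows. The abstract does not say which statistics or which scoring rule are used, so the choices here are assumptions: the window statistics are simple means, and the anomaly degree is a z-score-style distance from a baseline of earlier windows. The `GcRecord` fields and function names are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class GcRecord:
    time_s: float         # when the garbage collection happened
    pause_ms: float       # downtime caused by the collection
    used_after_mb: float  # memory usage after the collection


def window_stats(records, start_s, end_s):
    """Summarize the GC records whose time falls in [start_s, end_s)."""
    in_window = [r for r in records if start_s <= r.time_s < end_s]
    if not in_window:
        return None
    return {
        "count": len(in_window),
        "mean_pause_ms": mean(r.pause_ms for r in in_window),
        "mean_used_after_mb": mean(r.used_after_mb for r in in_window),
    }


def anomaly_degree(current, baseline_values):
    """Distance of `current` from the baseline mean, in baseline standard deviations."""
    mu = mean(baseline_values)
    sigma = pstdev(baseline_values)
    if sigma == 0:
        return 0.0 if current == mu else float("inf")
    return abs(current - mu) / sigma


records = [
    GcRecord(1.0, 12.0, 100.0),
    GcRecord(2.0, 14.0, 110.0),
    GcRecord(11.0, 90.0, 300.0),
]
first_window = window_stats(records, 0.0, 10.0)
degree = anomaly_degree(300.0, [100.0, 102.0, 98.0, 100.0])
```

A steadily rising `mean_used_after_mb` across windows is the usual sign of a leak, which is why post-collection usage is a natural statistic to score.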