Patent classifications
G06F11/00
Remote debug for scaled computing environments
Techniques and apparatus for remotely accessing debugging resources of a target system are described. A target system including physical compute resources, such as processors and a chipset, can be coupled to a controller remotely accessible over a network. The controller can be arranged to facilitate remote access to debug resources of the physical compute resources. The controller can be coupled to debug pins, such as those of a debug port, and arranged to assert control signals on the pins to access debug resources. The controller can also be arranged to exchange information elements with a remote debug host, including indications of debug operations and/or debug results.
System and method for hybrid kernel- and user-space incremental and full checkpointing
A multi-process application runs on primary hosts and is checkpointed by a checkpointer comprising at least one of a kernel-mode checkpointer module and one or more user-space interceptors providing at least one of barrier synchronization, a checkpointing thread, resource flushing, and an application virtualization space. Checkpoints may be written to storage and the application restored from a stored checkpoint at a later time. Checkpointing may be incremental, using Page Table Entry (PTE) pages and Virtual Memory Area (VMA) information. Checkpointing is transparent to the application and requires no modification to the application, operating system, networking stack, or libraries. In an alternate embodiment, the kernel-mode checkpointer is built into the kernel.
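The difference between full and incremental checkpoints can be illustrated with a small user-space simulation. This is only a sketch under stated assumptions: the `Memory` class, its explicit dirty-page set, and the `restore` helper are hypothetical stand-ins for the PTE and VMA tracking named in the abstract, not the patented mechanism.

```python
class Memory:
    """Toy paged memory that records which pages changed since the last checkpoint."""

    def __init__(self, num_pages: int, page_size: int = 4096):
        self.pages = [bytes(page_size) for _ in range(num_pages)]
        self.dirty = set()  # hypothetical stand-in for dirty bits in page table entries

    def write(self, page: int, data: bytes) -> None:
        self.pages[page] = data
        self.dirty.add(page)

    def full_checkpoint(self) -> dict:
        """Copy every page and reset dirty tracking."""
        self.dirty.clear()
        return dict(enumerate(self.pages))

    def incremental_checkpoint(self) -> dict:
        """Copy only the pages written since the previous checkpoint."""
        delta = {p: self.pages[p] for p in self.dirty}
        self.dirty.clear()
        return delta


def restore(full: dict, increments: list) -> list:
    """Rebuild memory by applying incremental checkpoints, oldest first, on top of a full one."""
    pages = dict(full)
    for delta in increments:
        pages.update(delta)
    return [pages[i] for i in range(len(pages))]
```

In this sketch an incremental checkpoint holds only the pages touched since the previous checkpoint, so its size tracks the application's write activity rather than its total memory footprint.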
Systems, methods, and apparatuses for detecting and creating operation incidents
Techniques for determining insight are described. An exemplary method includes receiving a request to provide insight into potential abnormal behavior; receiving one or more of anomaly information and event information associated with the potential abnormal behavior; evaluating the received one or more of the anomaly information and event information associated with the abnormal behavior to determine there is insight as to what is causing the potential abnormal behavior and to add to an insight at least two of an indication of a metric involved in the abnormal behavior, a severity for the insight indication, an indication of a relevant event involved in the abnormal behavior, and a recommendation on how to cure the potential abnormal behavior; and providing an insight indication for the generated insight.
Fast distributed caching using erasure coded object parts
Systems and methods are described for providing rapid access to data objects stored in a cache. Rather than storing data objects directly, each object can be broken into a number of parts via erasure coding, which enables the object to be generated from fewer than all parts. When servicing a request for the data object, a device can attempt to retrieve all parts, but begin to generate the data object as soon as a sufficient number of parts is retrieved, even if requests for other parts are outstanding. In this way, the data object can be retrieved without delay due to the slowest requests. For example, where one or more requests time out, such as due to failure of cache devices, the timeout may have no effect on the time required to retrieve the data object from the cache.
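The retrieval pattern described in this abstract can be sketched as follows. The sketch assumes a deliberately simple erasure code (two data halves plus one XOR parity part, so any two of three parts suffice); the abstract does not name a specific code, and `encode`, `decode`, `fetch_object`, and the `fetch_part` callback are hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes) -> dict:
    """Split data into two halves plus an XOR parity part (any 2 of 3 rebuild it)."""
    half = (len(data) + 1) // 2
    a, b = data[:half], data[half:].ljust(half, b"\0")
    return {0: a, 1: b, 2: xor_bytes(a, b)}


def decode(parts: dict, length: int) -> bytes:
    """Rebuild the original bytes from any two of the three parts."""
    if 0 in parts and 1 in parts:
        a, b = parts[0], parts[1]
    elif 0 in parts:
        a = parts[0]
        b = xor_bytes(a, parts[2])
    else:
        b = parts[1]
        a = xor_bytes(b, parts[2])
    return (a + b)[:length]


def fetch_object(fetch_part, length: int, needed: int = 2, total: int = 3) -> bytes:
    """Request all parts at once and decode as soon as `needed` parts have arrived."""
    pool = ThreadPoolExecutor(max_workers=total)
    futures = {pool.submit(fetch_part, i): i for i in range(total)}
    got = {}
    try:
        for fut in as_completed(futures):
            try:
                got[futures[fut]] = fut.result()
            except Exception:
                continue  # a failed or timed-out part is simply skipped
            if len(got) >= needed:
                return decode(got, length)
        raise RuntimeError("too few parts retrieved to rebuild the object")
    finally:
        # Do not wait for stragglers; outstanding requests are abandoned.
        pool.shutdown(wait=False, cancel_futures=True)
```

Because `fetch_object` returns once enough parts are in hand, a slow or failed request for the remaining part does not extend the retrieval time in this sketch.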
Configuring new storage systems based on write endurance
A method, performed by a computing device, of configuring a new design of a new data storage system (DSS) having initial configuration parameters is provided. The new design includes an initial plurality of storage drives. The method includes (a) collecting operational information from a plurality of remote DSSs in operation, the operational information including numbers of writes of various write sizes received by respective remote DSSs of the plurality of remote DSSs over time; (b) modeling a number of drive writes per day (DWPD) of the initial plurality of storage drives of the new DSS based on the collected operational information from the plurality of remote DSSs and the initial configuration parameters; (c) comparing the modeled number of DWPD to a threshold value; and (d) in response to the modeled number of DWPD exceeding the threshold value, reconfiguring the new DSS with an updated design.
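Steps (b) through (d) can be sketched with a simple model in which DWPD is the total bytes written per day divided by the total drive capacity. This is an illustrative assumption, not the model claimed in the patent; `WriteSample`, `modeled_dwpd`, `configure`, and the choice to reconfigure by adding drives are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WriteSample:
    """Hypothetical summary of collected telemetry: writes of one size per day."""
    write_size_bytes: int
    writes_per_day: float


def modeled_dwpd(samples, drive_count: int, drive_capacity_bytes: int) -> float:
    """Model DWPD as bytes written per day divided by total drive capacity."""
    daily_bytes = sum(s.write_size_bytes * s.writes_per_day for s in samples)
    return daily_bytes / (drive_count * drive_capacity_bytes)


def configure(samples, drive_count: int, drive_capacity_bytes: int,
              threshold: float, max_drives: int = 64) -> int:
    """Add drives (one assumed way to update the design) until DWPD is within the threshold."""
    while (modeled_dwpd(samples, drive_count, drive_capacity_bytes) > threshold
           and drive_count < max_drives):
        drive_count += 1
    return drive_count
```

For example, 10^9 writes of 4096 bytes per day spread over four 1 TB drives gives a modeled DWPD of about 1.02, which exceeds a threshold of 1.0, so this sketch would move to five drives (about 0.82 DWPD).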
Configurable integrated circuit (IC) with cyclic redundancy check (CRC) arbitration
An integrated circuit (IC) includes: a storage having a storage interface and addressable bytes, the storage interface coupled to first and second sets of peripheral terminals; control circuitry having control circuitry inputs and control circuitry outputs, the control circuitry inputs coupled to the storage interface and configured to receive configuration bits provided by the storage responsive to a control circuitry update trigger, and the control circuitry outputs coupled to first and second sets of peripheral outputs; and a cyclic-redundancy check (CRC) engine coupled to the storage interface, the CRC engine configured to distinguish between purposeful updates to the data in the storage and bit errors in the data in the storage.
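One way a CRC check could tell purposeful updates from bit errors is sketched below, in software rather than hardware. The assumption, which the abstract does not confirm, is that a purposeful update goes through a write path that refreshes the reference CRC, while a bit error changes the stored data without refreshing it. The `GuardedStorage` class is hypothetical; `zlib.crc32` is the standard-library CRC-32.

```python
import zlib


class GuardedStorage:
    """Byte-addressable storage with a reference CRC refreshed only on purposeful writes."""

    def __init__(self, size: int):
        self.data = bytearray(size)
        self.crc = zlib.crc32(self.data)

    def write(self, addr: int, value: int) -> None:
        """Purposeful update: change the byte and recompute the reference CRC."""
        self.data[addr] = value
        self.crc = zlib.crc32(self.data)

    def is_intact(self) -> bool:
        """False if the data no longer matches the reference CRC (a suspected bit error)."""
        return zlib.crc32(self.data) == self.crc
```

In this sketch, a write through `write` leaves `is_intact()` true, whereas flipping a bit in `data` directly makes it return false.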
Processor with debug pipeline
A processor includes an execution pipeline that includes a plurality of execution stages, execution pipeline control logic, and a debug system. The execution pipeline control logic is configured to control flow of an instruction through the execution stages. The debug system includes a debug pipeline and debug pipeline control logic. The debug pipeline includes a plurality of debug stages. Each debug pipeline stage corresponds to an execution pipeline stage, and the total number of debug stages corresponds to the total number of execution stages. The debug pipeline control logic is coupled to the execution pipeline control logic. The debug pipeline control logic is configured to control flow through the debug stages of debug information associated with the instruction, and to advance the debug information into a next of the debug stages in correspondence with the execution pipeline control logic advancing the instruction into a corresponding stage of the execution pipeline.
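The lockstep behavior of the two pipelines can be modeled in a few lines. This is a behavioral sketch only: the four stage names and the `Pipeline` class are assumptions for illustration, and a single `advance` call stands in for the coupled control logic.

```python
class Pipeline:
    """Execution pipeline with a parallel debug pipeline of the same depth."""

    STAGES = ("fetch", "decode", "execute", "writeback")  # assumed stage names

    def __init__(self):
        self.exec_stages = [None] * len(self.STAGES)
        self.debug_stages = [None] * len(self.STAGES)

    def advance(self, instruction=None, debug_info=None):
        """Shift both pipelines by one stage together and return what left the last stage."""
        retired = (self.exec_stages[-1], self.debug_stages[-1])
        self.exec_stages = [instruction] + self.exec_stages[:-1]
        self.debug_stages = [debug_info] + self.debug_stages[:-1]
        return retired
```

Because both lists shift in the same call, the debug information for an instruction always sits at the same stage index as the instruction itself.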
Systems and methods for self-healing and/or failure analysis of information handling system storage
Systems and methods are provided that may be implemented to perform failure analysis and/or self-healing of information handling system storage. In one example, an information handling system may perform self-recovery actions to self-heal system storage issues when there is an OS boot failure due to a failure to detect a system storage drive, by determining one or more possible recovery actions based on a current system storage drive status retrieved by an embedded controller (EC) or other programmable integrated circuit of the information handling system. In another example, manufacturing quality control analysis may be performed on boot failure information that is collected at a remote server from multiple failed information handling systems.
Cloud-based providing of one or more corrective measures for a storage system
An illustrative method includes detecting, by a cloud-based storage system services provider based on a problem signature, that a storage system has experienced a problem that is associated with the problem signature; and deploying, without user intervention, one or more corrective measures that modify the storage system to resolve the problem.
Servicing data storage devices in a data storage array
Systems and methods for replacing and testing a data storage device are disclosed. In disclosed embodiments, a system includes a data storage array (DSA) including a plurality of data storage devices (DSDs) in an enclosure. The system further includes an I/O server coupling the DSA to a client node and configured to provide data access between the client node and the DSA. The system further includes a management server coupled to the DSA, configured to detect a failed DSD in the DSA, detect a replacement DSD in the enclosure that replaces the failed DSD, and add the replacement DSD to a logical path of the DSA. The management server is further configured to display an indication of a state of the DSA based on the comparing.