G06F11/348

SYSTEM-ON-CHIP FOR SPECULATIVE EXECUTION EVENT COUNTER CHECKPOINTING AND RESTORING

An example system for speculative execution event counter checkpointing and restoring may include a plurality of symmetric cores, at least one of the symmetric cores to simultaneously process a plurality of threads and to perform out-of-order instruction processing for the plurality of threads, and at least one shared cache circuit to be shared among two or more of the symmetric cores. The system may further include a memory controller to couple the symmetric cores to a system memory and a data communication interface to couple one or more of the cores to input/output devices. The system may further include event counter circuitry comprising: a plurality of event counters including programmable event counters and fixed event counters and one or more configuration registers to store configuration data to specify an event type to be counted by the programmable event counters, wherein at least one of the one or more configuration registers is to store configuration data for a plurality of the programmable event counters. The system may further include transactional memory circuitry to process transactional memory operations including load operations and store operations, the transactional memory circuitry to process a transaction begin instruction to indicate a start of a transactional execution region of a program, a transaction end instruction to indicate an end of the transactional execution region, and a transaction abort instruction to abort processing of the transactional execution region. The system may further include transaction checkpoint circuitry to store a processor state at the start of the transactional execution region of the program, the processor state including values of one or more of the event counters.
The system may further include lock elision circuitry to cause critical sections of the program to execute as transactions on multiple threads without acquiring a lock, the lock elision circuitry to cause the critical sections to be re-executed non-speculatively using one or more locks in response to detecting a transaction failure.
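
The checkpointing behaviour described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the class `EventCounters`, its method names, and the counter names `loads` and `stores` are hypothetical, and the model assumes that counters are restored from the checkpoint when a transaction aborts, as the title suggests. It does not model cores, caches, or lock elision.

```python
class EventCounters:
    """Toy model of event counters with transactional checkpointing (hypothetical)."""

    def __init__(self, names):
        self.values = {name: 0 for name in names}
        self._checkpoint = None  # saved counter values while a transaction is open

    def count(self, name, n=1):
        # Record n occurrences of the named event.
        self.values[name] += n

    def tx_begin(self):
        # Save the counter values as part of the state taken at transaction start.
        self._checkpoint = dict(self.values)

    def tx_end(self):
        # Commit: counts made inside the transaction are kept.
        self._checkpoint = None

    def tx_abort(self):
        # Abort: restore the counters so aborted speculative work is not counted.
        if self._checkpoint is None:
            raise RuntimeError("no open transaction")
        self.values = self._checkpoint
        self._checkpoint = None


counters = EventCounters(["loads", "stores"])
counters.count("loads", 3)

counters.tx_begin()
counters.count("loads", 5)
counters.count("stores", 2)
counters.tx_abort()
after_abort = dict(counters.values)

counters.tx_begin()
counters.count("stores", 4)
counters.tx_end()
after_commit = dict(counters.values)
```

In this model an aborted region leaves the counters at their values from the start of the region, while a committed region keeps its counts.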

System for providing trace data in a data processor having a pipelined architecture

The invention is a method and system for providing trace data in a pipelined data processor. Aspects of the invention include providing a trace pipeline in parallel to the execution pipeline, providing trace information on whether conditional instructions complete or not, providing trace information on the interrupt status of the processor, replacing instructions in the processor with functionally equivalent instructions that also produce trace information and modifying the scheduling of instructions in the processor based on the occupancy of a trace output buffer.

Performance monitoring of shared processing resources
09720744 · 2017-08-01

A system and method are provided for a performance monitoring hardware unit that may include logic to poll one or more performance monitoring shared resources and determine a status of each performance monitoring shared resource. The performance monitoring hardware unit may also include an interface to provide the status to allow programming of the one or more performance monitoring shared resources. The status may correspond to a usage and/or an errata condition. Thus, the performance monitoring hardware unit may prevent programming conflicts of the one or more performance monitoring shared resources.

Securing crash dump files

In a computer storage system, crash dump files are secured without power fencing in a cluster of a plurality of nodes connected to a storage system. Upon a panic of a crashing node, and before a surviving node receives the crashing node's panic message, a capturing node is loaded in the cluster to become active, before the surviving node declares a totem token lost, in order to capture the crash dump files of the crashing node.

Monitoring performance of a processing device to manage non-precise events

In accordance with embodiments disclosed herein, there are provided systems and methods for monitoring performance of a processing device to manage non-precise events. A processing device includes a performance counter to increment upon occurrence of a non-precise event in the processing device. The processing device also includes a precise event based sampling (PEBS) enable control communicably coupled to the performance counter. The processing device also includes a PEBS handler to generate and store a PEBS record including architectural metadata defining a state of the processing device at a time of generation of the PEBS record. The processing device further includes a non-precise event based sampling (NPEBS) module communicably coupled to the PEBS enable control and the PEBS handler. The NPEBS module causes the PEBS handler to generate the PEBS record for the non-precise event upon overflow of the performance counter.
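
The overflow-driven record generation described above can be sketched as a small software model. The names `PerfCounter`, `threshold`, and `on_event`, and the use of a dictionary with an `ip` field as the architectural state, are hypothetical and chosen only for illustration; real hardware counts up to a register overflow rather than to a small threshold.

```python
from dataclasses import dataclass, field


@dataclass
class PerfCounter:
    """Toy counter that emits a sampling record on overflow (hypothetical model)."""

    threshold: int              # hypothetical number of events per overflow
    pebs_enabled: bool = True   # stands in for the PEBS enable control
    value: int = 0
    records: list = field(default_factory=list)

    def on_event(self, arch_state):
        # Increment on each non-precise event.
        self.value += 1
        if self.value >= self.threshold:
            self.value = 0
            if self.pebs_enabled:
                # Snapshot the state at the time the record is generated.
                self.records.append(dict(arch_state))


counter = PerfCounter(threshold=3)
for ip in range(100, 107):  # seven events with made-up instruction pointers
    counter.on_event({"ip": ip})
```

With a threshold of three, the seven events above produce two records (taken at the third and sixth events) and leave one event counted toward the next overflow.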

Timing based arbiter systems and circuits for ZQ calibration
09767921 · 2017-09-19

Systems and apparatuses are provided for an arbiter circuit for timing based ZQ calibration. An example system includes a resistor and a plurality of chips. Each of the plurality of chips further includes a terminal coupled to the resistor, a register storing timing information, and an arbiter circuit configured to determine whether the resistor is available based, at least in part, on the timing information stored in the register. The timing information stored in the register of each respective chip of the plurality of chips is unique to the respective chip among the plurality of chips.

Memory devices and electronic systems having a hybrid cache including static and dynamic caches that may be selectively disabled based on cache workload or availability, and related methods

Memory devices including a hybrid cache, methods of operating a memory device, and associated electronic systems including a memory device having a hybrid cache, are disclosed. The hybrid cache includes a dynamic cache that may include x-level cell (XLC) blocks of non-volatile memory cells, which may include multi-level cells (MLC), triple-level cells (TLC), quad-level cells (QLC), etc., shared between the dynamic cache and a main memory. The hybrid cache includes a static cache including single-level cell (SLC) blocks of non-volatile memory cells. The memory device further includes a memory controller configured to disable at least one of the static cache and the dynamic cache based on a workload of the hybrid cache relative to a Total Bytes Written (TBW) Spec for the memory device. The cache may be disabled based on, for example, program/erase (PE) cycles of one or more portions of the memory device or the workload exceeding a threshold, which may define one or more switch points. A method of operating a memory device may include writing data in the static cache if the static cache is available, and writing the data in the dynamic cache if the static cache is unavailable.
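
The write method in the last sentence, together with disabling a cache at a switch point, can be sketched as follows. This is a simplified illustration, not the claimed controller: `HybridCache`, `static_capacity`, `pe_cycle_limit`, and `flush_static` are hypothetical names, only the static cache is ever disabled here, and the switch point is assumed to be a plain program/erase cycle count.

```python
class HybridCache:
    """Toy hybrid cache: prefer the static (SLC) cache, fall back to the dynamic one."""

    def __init__(self, static_capacity, pe_cycle_limit):
        self.static_capacity = static_capacity
        self.pe_cycle_limit = pe_cycle_limit  # hypothetical switch point
        self.static_blocks = []
        self.dynamic_blocks = []
        self.static_pe_cycles = 0
        self.static_enabled = True

    def static_available(self):
        # Available means enabled and not yet full.
        return self.static_enabled and len(self.static_blocks) < self.static_capacity

    def write(self, data):
        # Write to the static cache if available, otherwise to the dynamic cache.
        if self.static_available():
            self.static_blocks.append(data)
            return "static"
        self.dynamic_blocks.append(data)
        return "dynamic"

    def flush_static(self):
        # Erasing the static cache costs one PE cycle; disable it at the limit.
        self.static_blocks.clear()
        self.static_pe_cycles += 1
        if self.static_pe_cycles >= self.pe_cycle_limit:
            self.static_enabled = False


cache = HybridCache(static_capacity=2, pe_cycle_limit=1)
placements = [cache.write(block) for block in ("a", "b", "c")]
cache.flush_static()
placement_after_disable = cache.write("d")
```

The third write spills to the dynamic cache because the static cache is full, and the write after the flush goes to the dynamic cache because the static cache has reached its assumed cycle limit and been disabled.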

Architecture agnostic replay verification
11200147 · 2021-12-14

According to aspects of the disclosure a method is provided, comprising: generating a live execution trace log corresponding to a live execution of a computer program, the live execution being performed by using both hardware emulation and hardware acceleration; generating a first trace entry corresponding to a replay execution of the computer program, the replay execution being performed by using hardware emulation without hardware acceleration, the replay execution being performed based on a set of events that are recorded during the live execution of the computer program; detecting whether the first trace entry is valid based on the live execution trace log; and in response to detecting that the first trace entry is not valid, transitioning into a safe state.
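
The validity check can be sketched in a few lines. The abstract does not say how a replay trace entry is judged valid, so this sketch assumes the simplest rule: an entry is valid if it equals the live log entry at the same position. The function name `verify_replay`, the tuple format of the entries, and the string `"safe_state"` are hypothetical.

```python
def verify_replay(live_log, replay_entries):
    """Check replay trace entries against the live log, stopping at the first mismatch.

    Returns ("ok", n) when all n replay entries match, or ("safe_state", i)
    when the entry at position i is not valid.
    """
    checked = 0
    for entry in replay_entries:
        # Assumed validity rule: the entry must equal the live entry at the same index.
        if checked >= len(live_log) or entry != live_log[checked]:
            return "safe_state", checked
        checked += 1
    return "ok", checked


live = [("pc", 0x10), ("pc", 0x14), ("pc", 0x18)]
matching = verify_replay(live, list(live))
diverged = verify_replay(live, [("pc", 0x10), ("pc", 0x20), ("pc", 0x18)])
```

Here the matching replay is accepted in full, while the diverging replay triggers the transition at its second entry (index 1).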

Trace data

A data processing apparatus is provided that includes monitor circuitry to produce local trace data indicating behaviour of the data processing apparatus. Interface circuitry communicates with a second data processing apparatus and encoding circuitry produces an encoded instruction to cause the local trace data to be stored in storage circuitry of the second data processing apparatus or to be output at output circuitry of the second data processing apparatus. The interface circuitry transmits the encoded instruction to the second data processing apparatus.

Technology For Dynamically Tuning Processor Features

A processor comprises a microarchitectural feature and dynamic tuning unit (DTU) circuitry. The processor executes a program for first and second execution windows with the microarchitectural feature disabled and enabled, respectively. The DTU circuitry automatically determines whether the processor achieved worse performance in the second execution window. In response to determining that the processor achieved worse performance in the second execution window, the DTU circuitry updates a usefulness state for a selected address of the program to denote worse performance. In response to multiple consecutive determinations that the processor achieved worse performance with the microarchitectural feature enabled, the DTU circuitry automatically updates the usefulness state to denote a confirmed bad state. In response to the usefulness state denoting the confirmed bad state, the DTU circuitry automatically disables the microarchitectural feature for the selected address for execution windows after the second execution window. Other embodiments are described and claimed.
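
The usefulness-state updates described above can be modelled as a small per-address state machine. This is a sketch under stated assumptions: `UsefulnessTracker`, the state names, and `CONFIRM_AFTER` are hypothetical, performance is assumed to be a higher-is-better number such as instructions per cycle, and "multiple consecutive determinations" is assumed to mean two.

```python
CONFIRM_AFTER = 2  # hypothetical: consecutive worse results needed to confirm


class UsefulnessTracker:
    """Toy per-address usefulness state for one microarchitectural feature."""

    def __init__(self):
        self.bad_streak = {}        # address -> consecutive worse-performance count
        self.confirmed_bad = set()  # addresses for which the feature stays disabled

    def record_windows(self, address, perf_disabled, perf_enabled):
        # Compare a window with the feature disabled against one with it enabled.
        if address in self.confirmed_bad:
            return  # confirmed bad is treated as sticky in this sketch
        if perf_enabled < perf_disabled:
            streak = self.bad_streak.get(address, 0) + 1
            self.bad_streak[address] = streak
            if streak >= CONFIRM_AFTER:
                self.confirmed_bad.add(address)
        else:
            self.bad_streak[address] = 0

    def state(self, address):
        if address in self.confirmed_bad:
            return "confirmed_bad"
        return "bad" if self.bad_streak.get(address, 0) > 0 else "good"

    def feature_enabled(self, address):
        # The feature is disabled only for addresses in the confirmed bad state.
        return address not in self.confirmed_bad


tracker = UsefulnessTracker()
tracker.record_windows(0x400, perf_disabled=1.2, perf_enabled=1.0)
state_after_one = tracker.state(0x400)
tracker.record_windows(0x400, perf_disabled=1.2, perf_enabled=1.1)
state_after_two = tracker.state(0x400)

tracker.record_windows(0x500, perf_disabled=1.0, perf_enabled=0.9)
tracker.record_windows(0x500, perf_disabled=1.0, perf_enabled=1.3)
```

A single worse window only marks the address as bad; the feature is disabled for that address once the worse result repeats, and a better result in between resets the streak.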