G06F11/348

Processor with debug pipeline

A processor includes an execution pipeline that includes a plurality of execution stages, execution pipeline control logic, and a debug system. The execution pipeline control logic is configured to control flow of an instruction through the execution stages. The debug system includes a debug pipeline and debug pipeline control logic. The debug pipeline includes a plurality of debug stages. Each debug pipeline stage corresponds to an execution pipeline stage, and the total number of debug stages corresponds to the total number of execution stages. The debug pipeline control logic is coupled to the execution pipeline control logic. The debug pipeline control logic is configured to control flow through the debug stages of debug information associated with the instruction, and to advance the debug information into a next of the debug stages in correspondence with the execution pipeline control logic advancing the instruction into a corresponding stage of the execution pipeline.
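
As a rough illustration of the lockstep behaviour described above, the following minimal sketch models both pipelines as equal-length lists that shift together. The class name, the `advance` method, and the whole-pipeline `stall` flag are assumptions made for the example and are not taken from the abstract.

```python
class LockstepPipelines:
    """Toy model: one debug stage per execution stage, advanced together."""

    def __init__(self, num_stages):
        # Same number of debug stages as execution stages.
        self.exec_stages = [None] * num_stages
        self.debug_stages = [None] * num_stages

    def advance(self, instr=None, debug_info=None, stall=False):
        """Shift both pipelines by one stage unless stalled.

        Returns the (instruction, debug info) pair leaving the last stage.
        A stall freezes the whole pipeline here, which is a simplification.
        """
        if stall:
            return None, None
        retired = (self.exec_stages[-1], self.debug_stages[-1])
        # The debug information moves exactly when its instruction moves.
        self.exec_stages = [instr] + self.exec_stages[:-1]
        self.debug_stages = [debug_info] + self.debug_stages[:-1]
        return retired


pipe = LockstepPipelines(num_stages=3)
pipe.advance("add", {"pc": 0x100})
pipe.advance(stall=True)            # neither pipeline moves
pipe.advance("sub", {"pc": 0x104})
```

After these calls the debug entry for `add` sits in the same stage index as the `add` instruction itself.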

Monitoring Performance of a Processing Device to Manage Non-Precise Events

Embodiments disclosed herein provide for monitoring performance of a processing device to manage non-precise events. A processing device includes a performance counter to track a non-precise event and to increment upon occurrence of the non-precise event, wherein the non-precise event comprises a first type of performance event that is not linked to an instruction in an instruction trace. The processing device also includes a first handler circuit to generate and store a first record, the first record comprising architectural metadata defining a state of the processing device at a time of generation of the first record, wherein the first handler circuit is to generate records corresponding to precise events. The processing device further includes a second handler circuit communicably coupled to the first handler circuit, the second handler circuit to cause the first handler circuit to generate a second record for the non-precise event upon overflow of the performance counter.
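
The counter-overflow path described above can be sketched in software as follows. This is only a behavioural model under stated assumptions: the class name, the `overflow_threshold` parameter, and the `snapshot_state` callable (standing in for capture of architectural metadata) are hypothetical and do not come from the abstract.

```python
class NonPreciseEventMonitor:
    """Behavioural sketch of the two cooperating handlers."""

    def __init__(self, overflow_threshold, snapshot_state):
        self.overflow_threshold = overflow_threshold
        self.snapshot_state = snapshot_state  # hypothetical metadata source
        self.count = 0                        # the performance counter
        self.records = []                     # stored records

    def _first_handler_generate_record(self, kind):
        # First handler: store a record holding the current state snapshot.
        self.records.append({"kind": kind, "state": self.snapshot_state()})

    def on_precise_event(self):
        self._first_handler_generate_record("precise")

    def on_non_precise_event(self):
        self.count += 1
        if self.count >= self.overflow_threshold:
            # Second handler: on overflow, reuse the first handler's
            # record path for the non-precise event.
            self.count = 0
            self._first_handler_generate_record("non_precise")


monitor = NonPreciseEventMonitor(3, snapshot_state=lambda: {"ip": 0x40})
for _ in range(7):
    monitor.on_non_precise_event()
```

Seven events with a threshold of three produce two records and leave one event counted toward the next overflow.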

DETECTING ANOMALOUS LATENT COMMUNICATIONS IN AN INTEGRATED CIRCUIT CHIP
20230004473 · 2023-01-05

A method of detecting anomalous latencies in communications between components on an integrated circuit (IC) chip. The method includes: (i) monitoring communications between a first component of the IC chip and other components of the IC chip, each communication comprising a command sent from the first component to another component, and a response received by the first component from that other component, the monitoring comprising: measuring the number of communications in each of a series of monitored time windows, and measuring the latency of each communication in the series of monitored time windows; (ii) calculating a maximum tolerable latency for each operational time window of the first component from the number of communications in that operational time window, an available stall time of the first component in that operational time window, and a latency penalty factor for that operational time window; and (iii) determining a measured latency to be anomalous if the measured latency is greater than the maximum tolerable latency.
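
The abstract names the three inputs to the maximum tolerable latency but does not give the formula. The sketch below therefore assumes one simple combination (stall time shared evenly across the window's communications, scaled by the penalty factor) purely for illustration. The function names and the formula itself are assumptions, not the claimed method.

```python
def max_tolerable_latency(num_comms, stall_time, penalty_factor):
    """ASSUMED formula: per-communication share of the available stall
    time, scaled by the latency penalty factor."""
    if num_comms == 0:
        return float("inf")  # nothing to compare against in an empty window
    return penalty_factor * stall_time / num_comms


def anomalous_latencies(latencies, stall_time, penalty_factor):
    """Return the measured latencies in one window that exceed the limit."""
    limit = max_tolerable_latency(len(latencies), stall_time, penalty_factor)
    return [lat for lat in latencies if lat > limit]


# Four communications sharing 80 cycles of stall time: limit is 20 cycles.
flagged = anomalous_latencies([10, 12, 50, 11], stall_time=80, penalty_factor=1.0)
```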

IDENTIFYING CAUSES OF ANOMALIES OBSERVED IN AN INTEGRATED CIRCUIT CHIP
20230004471 · 2023-01-05

A method of identifying a cause of an anomalous feature measured from system circuitry on an integrated circuit (IC) chip, the IC chip comprising the system circuitry and monitoring circuitry for monitoring the system circuitry by measuring features of the system circuitry in each window of a series of windows, the method comprising: (i) from a set of windows prior to the anomalous window comprising the anomalous feature, identifying a candidate window set in which to search for the cause of the anomalous feature; (ii) for each of the measured features of the system circuitry: (a) calculating a first feature probability distribution of that measured feature for the candidate window set; (b) calculating a second feature probability distribution of that measured feature for window(s) not in the candidate window set; (c) comparing the first and second feature probability distributions; and (d) identifying that measured feature in the timeframe of the candidate window set as a cause of the anomalous feature if the first and second feature probability distributions differ by more than a threshold value; (iii) iterating steps (i) and (ii) for further candidate window sets from the set of windows prior to the anomalous window; and (iv) outputting a signal indicating those measured feature(s) of step (ii)(d) identified as a cause of the anomalous feature.
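
Step (ii) above can be sketched for a single candidate window set as follows. The abstract does not say how the two distributions are compared, so this example assumes fixed-width histogram bins and total variation distance. Those choices, along with all function and parameter names, are assumptions made for illustration.

```python
from collections import Counter


def distribution(values, bin_width):
    """Empirical probability of each histogram bin for the given values."""
    counts = Counter(int(v // bin_width) for v in values)
    total = sum(counts.values())
    return {b: c / total for b, c in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def suspect_features(windows, candidate_idx, threshold, bin_width=10):
    """windows: list of {feature: value}; candidate_idx: set of window indices."""
    suspects = []
    for feature in windows[0]:
        inside = [w[feature] for i, w in enumerate(windows) if i in candidate_idx]
        outside = [w[feature] for i, w in enumerate(windows) if i not in candidate_idx]
        if not inside or not outside:
            continue
        p = distribution(inside, bin_width)    # step (ii)(a)
        q = distribution(outside, bin_width)   # step (ii)(b)
        if total_variation(p, q) > threshold:  # steps (ii)(c) and (ii)(d)
            suspects.append(feature)
    return suspects


windows = [{"bw": 5, "lat": 3}, {"bw": 7, "lat": 4},
           {"bw": 95, "lat": 5}, {"bw": 98, "lat": 2}]
causes = suspect_features(windows, candidate_idx={2, 3}, threshold=0.5)
```

Step (iii) would repeat the call with other candidate window sets drawn from the windows before the anomalous one.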

PROGRAMMABLE STATE MACHINE FOR A HARDWARE PERFORMANCE MONITOR

A processing unit can include a performance monitor for monitoring the performance of the processing unit and associated sub-units. The performance monitor can include a state machine. The state machine can be implemented via state machine data entries stored in a memory associated with the performance monitor. A state machine data entry includes information indicating a state transition condition and output signals. The state transition condition includes a current state and input signals required to meet the condition. The output signals include a next state, one or more counter actions, and one or more triggers. The performance monitor implements logic circuits that determine, based on input signals and the state machine data entries, the next state to transition to and the associated output signals. The state machine data entries can be written and re-written by a user.
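
A table-driven model of such a state machine might look like the sketch below. The entry layout (a dictionary keyed by current state and the exact set of asserted inputs), the counter-action encoding, and all names are assumptions chosen for the example, not the format described in the abstract.

```python
class PerfMonitorFSM:
    """State machine driven entirely by rewritable table entries."""

    def __init__(self, entries, initial_state):
        # (current_state, frozenset(inputs)) ->
        #     (next_state, counter_actions, triggers)
        self.entries = dict(entries)
        self.state = initial_state
        self.counters = {}
        self.fired = []

    def step(self, inputs):
        entry = self.entries.get((self.state, frozenset(inputs)))
        if entry is None:
            return  # no matching condition: remain in the current state
        next_state, counter_actions, triggers = entry
        for name, delta in counter_actions:
            self.counters[name] = self.counters.get(name, 0) + delta
        self.fired.extend(triggers)
        self.state = next_state


fsm = PerfMonitorFSM(
    {
        ("IDLE", frozenset({"req"})): ("BUSY", [("requests", 1)], []),
        ("BUSY", frozenset({"done"})): ("IDLE", [], ["irq"]),
    },
    initial_state="IDLE",
)
for inputs in (["req"], ["done"], ["req"]):
    fsm.step(inputs)
```

Because the behaviour lives in `fsm.entries`, a user can assign new entries at run time to reprogram the monitor without changing the stepping logic.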

COMMANDED JTAG TEST ACCESS PORT OPERATIONS
20230221368 · 2023-07-13

The disclosure describes a novel method and apparatus for improving the operation of a TAP architecture in a device through the use of Command signal inputs to the TAP architecture. In response to a Command signal input, the TAP architecture can perform streamlined and uninterrupted Update, Capture and Shift operation cycles to a target circuit in the device, or streamlined and uninterrupted Capture and Shift operation cycles to a target circuit in the device. The Command signals can be input to the TAP architecture via the device's dedicated TMS or TDI inputs or via a separate CMD input to the device.

CONSTRAINED CARRIES ON SPECULATIVE COUNTERS

A computer-implemented method of constrained carries on speculative counters includes providing one or more speculative counters having an upper portion of most significant bits partially embedded in a random-access memory (RAM) array, and a pre-counter portion external to the RAM array having a plurality of least significant bits. The one or more speculative counters are configured to count a plurality of events of interest during a processor core instruction execution. A carry output from the pre-counter portion to the RAM array is suppressed for a duration of a speculative event period.
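
The split counter and its suppressed carry can be modelled roughly as below. The abstract does not say what happens to held-back carries when speculation ends, so this sketch assumes they are deferred and then either propagated (commit) or discarded by restoring a checkpoint (abort). That policy, the unbounded pre-counter during speculation, and all names are assumptions.

```python
class SpeculativeCounter:
    """Upper bits model the RAM array entry; `pre` models the pre-counter."""

    def __init__(self, lsb_bits=4):
        self.lsb_bits = lsb_bits
        self.ram_upper = 0        # most significant bits (RAM array)
        self.pre = 0              # least significant bits (outside the RAM)
        self.speculating = False
        self._checkpoint = 0

    def _flush_carries(self):
        # Propagate any pending carry from the pre-counter into the RAM.
        self.ram_upper += self.pre >> self.lsb_bits
        self.pre &= (1 << self.lsb_bits) - 1

    def count(self, n=1):
        self.pre += n
        if not self.speculating:  # carry is suppressed while speculating
            self._flush_carries()

    def begin_speculation(self):
        self.speculating = True
        self._checkpoint = self.pre

    def end_speculation(self, commit):
        if not commit:
            self.pre = self._checkpoint  # ASSUMED rollback policy
        self.speculating = False
        self._flush_carries()

    @property
    def value(self):
        return (self.ram_upper << self.lsb_bits) + self.pre


ctr = SpeculativeCounter(lsb_bits=2)
ctr.count(3)
ctr.begin_speculation()
ctr.count(2)                      # would carry, but the RAM is untouched
ctr.end_speculation(commit=False)
```

In this model the RAM-resident upper bits never need to be undone, because no speculative carry reaches them.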

Systems and methods for intercycle gap refresh and backpressure management

A system may include a synchronization device and an emulation chip including a processor and a memory. The processor may evaluate, during a first cycle, at least one of a set of one or more execution instructions in the memory or evaluation primitives configured to emulate a circuit, and evaluate, during a second cycle, at least one of the set of one or more execution instructions or a set of configured logic primitives. The synchronization device may interpose a gap period between the first cycle and the second cycle such that during the gap period, the processor does not evaluate one or more instructions from the set of one or more execution instructions or re-evaluate primitives. The synchronization device may cause, during the gap period, the emulation chip to perform refreshes on the memory of the emulation chip.

Debug Trace Fabric for Integrated Circuit
20220374326 · 2022-11-24

A trace network for debugging integrated circuits is disclosed. At least one functional network includes a plurality of components interconnected by a number of network switches, implemented on at least one integrated circuit. A trace network is also implemented on the at least one integrated circuit, and includes a plurality of trace circuits configured to generate trace data based on transactions between ones of the plurality of components. The plurality of trace circuits are coupled to one another by a plurality of trace network switches. The trace circuits are configured to convey the generated trace data to an interface, via the trace network, without using the at least one functional network.

Universal profiling device and method for simulating performance monitoring unit

Disclosed is a universal profiling device operable to simulate a performance monitoring unit for a heterogeneous system. The universal profiling device includes a main circuit and a storage circuit. The main circuit is configured to execute at least one of multiple steps including an active data collection step and a passive data collection step. The active data collection step registers a callback function for an event of a designated object according to a predetermined setting or a user setting, and actively calls the callback function to obtain information of the event. The passive data collection step registers the event of the designated object according to the predetermined setting or the user setting and thereby receives the information of the event without requesting the designated object, wherein the information of the event is stored in the storage circuit.
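
The difference between the active and passive collection steps can be illustrated with a small software analogy. Both classes below, their method names, and the use of a Python list in place of the storage circuit are hypothetical stand-ins, not the disclosed device.

```python
class DesignatedObject:
    """Hypothetical stand-in for a profiled component."""

    def __init__(self):
        self.subscribers = {}
        self.stats = {}

    def subscribe(self, event, sink):
        self.subscribers.setdefault(event, []).append(sink)

    def emit(self, event, info):
        self.stats[event] = info
        for sink in self.subscribers.get(event, []):
            sink(event, info)  # push to passive listeners


class Profiler:
    def __init__(self):
        self.storage = []      # stands in for the storage circuit
        self.callbacks = {}

    def register_active(self, event, callback):
        # Active collection: remember a callback to call on demand.
        self.callbacks[event] = callback

    def poll(self, event):
        info = self.callbacks[event]()  # profiler actively requests the data
        self.storage.append((event, info))
        return info

    def register_passive(self, obj, event):
        # Passive collection: the object pushes data without being asked.
        obj.subscribe(event, lambda ev, info: self.storage.append((ev, info)))


gpu = DesignatedObject()
prof = Profiler()
prof.register_passive(gpu, "cache_miss")
prof.register_active("cycles", lambda: gpu.stats.get("cycles", 0))
gpu.emit("cache_miss", 7)   # stored immediately via the passive path
gpu.emit("cycles", 120)     # not stored until the profiler polls
prof.poll("cycles")
```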