G06F9/30047

STREAMING ENGINE WITH STREAM METADATA SAVING FOR CONTEXT SWITCHING
20230004391 · 2023-01-05 ·

A streaming engine employed in a digital data processor specifies a fixed read only data stream defined by plural nested loops. An address generator produces addresses of data elements. A steam head register stores data elements next to be supplied to functional units for use as operands. Stream metadata is stored in response to a stream store instruction. Stored stream metadata is restored to the stream engine in response to a stream restore instruction. An interrupt changes an open stream to a frozen state discarding stored stream data. A return from interrupt changes a frozen stream to an active state.

Memory system and data processing system including the same
11544063 · 2023-01-03 · ·

A data processing system includes a compute blade generating a write command to store data and a read command to read the data, and a memory blade. The compute blade has a memory that stores information about performance characteristics of each of a plurality of memories, and determines priority information through which eviction of a cache line is carried out based on the stored information.

RANGE PREFETCH INSTRUCTION

In response to an instruction decoder decoding a range prefetch instruction specifying first and second address-range-specifying parameters and a stride parameter, prefetch circuitry controls, depending on the first and second address-range-specifying parameters and the stride parameter, prefetching of data from a plurality of specified ranges of addresses into the at least one cache. A start address and size of each specified range is dependent on the first and second address-range-specifying parameters. The stride parameter specifies an offset between start addresses of successive specified ranges. Use of the range prefetch instruction helps to improve programmability and improve the balance between prefetch coverage and circuit area of the prefetch circuitry.

CACHE SUPPORT FOR INDIRECT LOADS AND INDIRECT STORES IN GRAPH APPLICATIONS

Techniques for operating on an indirect memory access instruction, where the instruction accesses a memory location via at least one indirect address. A pipeline processes the instruction and a memory operation engine generates a first access to the at least one indirect address and a second access to a target address determined by the at least one indirect address. A cache memory used with the pipeline and the memory operation engine caches pointers. In response to a cache hit when executing the indirect memory access instruction, operations dereference a pointer to obtain the at least one indirect address, not set a cache bit, and return data for the instruction without storing the data in the cache memory; and in response to a cache miss, operations set the cache bit, obtain, and store a cache line for a missed pointer, and return data without storing the data in the cache memory.

Digitally coordinated dynamically adaptable clock and voltage supply apparatus and method

An apparatus and method is described that digitally coordinates dynamically adaptable clock and voltage supply to significantly reduce the energy consumed by a processor without impacting its performance or latency. A signal is generated that indicates a long latency operation. This signal is used to reduce power supply voltage and frequency of the adaptable clock. An early resume indicator is generated a few nanoseconds before normal operations are about to resume. This early resume signal is used to power up the power-downed voltage regulator, and/or can increase frequency and/or supply voltage back to normal level before normal processor operations are about to resume.

APPARATUS, SYSTEM, AND METHOD FOR CONFIGURING A CONFIGURABLE COMBINED PRIVATE AND SHARED CACHE

Aspects disclosed in the detailed description include configuring a configurable combined private and shared cache in a processor. Related processor-based systems and methods are also disclosed. A combined private and shared cache structure is configurable to select a private cache portion and a shared cache portion.

Speculative side-channel hint instruction

An apparatus comprises processing circuitry 14 to perform data processing in response to instructions, the processing circuitry supporting speculative processing of read operations for reading data from a memory system 20, 22; and control circuitry 12, 14, 20 to identify whether a sequence of instructions to be processed by the processing circuitry includes a speculative side-channel hint instruction indicative of whether there is a risk of information leakage if at least one subsequent read operation is processed speculatively, and to determine whether to trigger a speculative side-channel mitigation measure depending on whether the instructions include the speculative side-channel hint instruction. This can help to reduce the performance impact of measures taken to protect against speculative side-channel attacks.

Prefetch mechanism for a cache structure

An apparatus and method is provided, the apparatus comprising a processor pipeline to execute instructions, a cache structure to store information for reference by the processor pipeline when executing said instructions; and prefetch circuitry to issue prefetch requests to the cache structure to cause the cache structure to prefetch information into the cache structure in anticipation of a demand request for that information being issued to the cache structure by the processor pipeline. The processor pipeline is arranged to issue a trigger to the prefetch circuitry on detection of a given event that will result in a reduced level of demand requests being issued by the processor pipeline, and the prefetch circuitry is configured to control issuing of prefetch requests in dependence on reception of the trigger.

METHOD AND DEVICE FOR OPERATING A SELF-DRIVING CAR
20220388531 · 2022-12-08 ·

A method and device for operating a Self-Driving Car (SDC) are disclosed. The device executes in real-time a first processing pipeline including generating static attributes for a plurality of potential positions of the SDC on the road segment, and caching the static attributes in association with the respective ones of the plurality of potential positions. The device executes in real-time a second processing pipeline in parallel with the first processing pipeline including generating a graph-structure for operating the SDC on the road segment. The generating the graph-structure includes generating dynamic attributes for a given edge of the graph-structure, acquiring from the cache memory static attributes for the given edge of the graph-structure, such that the given edge in the graph-structure is associated with the static attributes generated by the first processing pipeline and with the dynamic attributes generated by the second processing pipeline.

SHARING INSTRUCTION CACHE FOOTPRINT BETWEEN MULITPLE THREADS

Aspects are provided for sharing instruction cache footprint between multiple threads. A set/way pointer to an instruction cache line is derived from a system memory address associated with an instruction fetch from a memory page. It is determined that the instruction cache line is shareable between the first thread and the second thread. An alias table entry is created indicating that other instruction cache lines associated with the memory page are also shareable between threads. Another instruction fetch is received from another thread requesting an instruction from another system memory address associated with the memory page. A further set/way pointer to another instruction cache line is derived from the other system memory address. It is determined that the other instruction cache line is sharable based on the alias table entry.