G06F12/0804

Write data for bin resynchronization after power loss

A system includes a memory device and a processing device, operatively coupled to the memory device, the processing device to perform operations comprising: measuring one of a temperature voltage shift or a read bit error rate of fixed data stored in the memory device in response to detecting a power on of the memory device, the fixed data having been programmed in response to detecting a power loss; estimating an amount of time for which the memory device was powered off based on results of the measuring; and in response to the amount of time satisfying a threshold criterion, updating a value for a temporal voltage shift of a block family based on the amount of time.

CONTROL STATE PRESERVATION DURING TRANSACTIONAL EXECUTION

A method includes saving a control state for a processor in response to commencing a transactional processing sequence, wherein saving the control state produces a saved control state. The method also includes permitting updates to the control state for the processor while executing the transactional processing sequence. Examples of updates to the control state include key mask changes, primary region table origin changes, primary segment table origin changes, CPU tracing mode changes, and interrupt mode changes. The method also includes restoring the control state for the processor to the saved control state in response to encountering a transactional error during the transactional processing sequence. In some embodiments, saving the control state comprises saving the current control state to memory corresponding to internal registers for an unused thread or another level of virtualization. A corresponding computer system and computer program product are also disclosed herein.

Method and Apparatus for Shared Virtual Memory to Manage Data Coherency in a Heterogeneous Processing System
20180011792 · 2018-01-11 · ·

One embodiment provides for a heterogeneous computing device comprising a first processor coupled with a second processor, wherein one or more of the first or second processor includes graphics processing logic; wherein each of the first processor and the second processor includes first logic to perform virtual to physical memory address translation; and wherein the first logic includes cache coherency state for a block of memory associated with a virtual memory address.

Method and Apparatus for Shared Virtual Memory to Manage Data Coherency in a Heterogeneous Processing System
20180011792 · 2018-01-11 · ·

One embodiment provides for a heterogeneous computing device comprising a first processor coupled with a second processor, wherein one or more of the first or second processor includes graphics processing logic; wherein each of the first processor and the second processor includes first logic to perform virtual to physical memory address translation; and wherein the first logic includes cache coherency state for a block of memory associated with a virtual memory address.

Graphics processors and graphics processing units having dot product accumulate instruction for hybrid floating point format

Described herein is a graphics processing unit (GPU) comprising a first processing cluster to perform parallel processing operations, the parallel processing operations including a ray tracing operation and a matrix multiply operation; and a second processing cluster coupled to the first processing cluster, wherein the first processing cluster includes a floating-point unit to perform floating point operations, the floating-point unit is configured to process an instruction using a bfloat16 (BF16) format with a multiplier to multiply second and third source operands while an accumulator adds a first source operand with output from the multiplier.

Graphics processors and graphics processing units having dot product accumulate instruction for hybrid floating point format

Described herein is a graphics processing unit (GPU) comprising a first processing cluster to perform parallel processing operations, the parallel processing operations including a ray tracing operation and a matrix multiply operation; and a second processing cluster coupled to the first processing cluster, wherein the first processing cluster includes a floating-point unit to perform floating point operations, the floating-point unit is configured to process an instruction using a bfloat16 (BF16) format with a multiplier to multiply second and third source operands while an accumulator adds a first source operand with output from the multiplier.

METHODS AND APPARATUS TO FACILITATE READ-MODIFY-WRITE SUPPORT IN A COHERENT VICTIM CACHE WITH PARALLEL DATA PATHS

Methods, apparatus, systems and articles of manufacture are disclosed facilitate read-modify-write support in a coherent victim cache with parallel data paths. An example apparatus includes a random-access memory configured to be coupled to a central processing unit via a first interface and a second interface, the random-access memory configured to obtain a read request indicating a first address to read via a snoop interface, an address encoder coupled to the random-access memory, the address encoder to, when the random-access memory indicates a hit of the read request, generate a second address corresponding to a victim cache based on the first address, and a multiplexer coupled to the victim cache to transmit a response including data obtained from the second address of the victim cache.

METHODS AND APPARATUS TO FACILITATE READ-MODIFY-WRITE SUPPORT IN A COHERENT VICTIM CACHE WITH PARALLEL DATA PATHS

Methods, apparatus, systems and articles of manufacture are disclosed facilitate read-modify-write support in a coherent victim cache with parallel data paths. An example apparatus includes a random-access memory configured to be coupled to a central processing unit via a first interface and a second interface, the random-access memory configured to obtain a read request indicating a first address to read via a snoop interface, an address encoder coupled to the random-access memory, the address encoder to, when the random-access memory indicates a hit of the read request, generate a second address corresponding to a victim cache based on the first address, and a multiplexer coupled to the victim cache to transmit a response including data obtained from the second address of the victim cache.

AGGRESSIVE WRITE FLUSH SCHEME FOR A VICTIM CACHE
20230004500 · 2023-01-05 ·

A caching system including a first sub-cache and a second sub-cache in parallel with the first sub-cache, wherein the second sub-cache includes: line type bits configured to store an indication that a corresponding cache line of the second sub-cache is configured to store write-miss data, and an eviction controller configured to evict a cache line of the second sub-cache storing write-miss data based on an indication that the cache line has been fully written.

AGGRESSIVE WRITE FLUSH SCHEME FOR A VICTIM CACHE
20230004500 · 2023-01-05 ·

A caching system including a first sub-cache and a second sub-cache in parallel with the first sub-cache, wherein the second sub-cache includes: line type bits configured to store an indication that a corresponding cache line of the second sub-cache is configured to store write-miss data, and an eviction controller configured to evict a cache line of the second sub-cache storing write-miss data based on an indication that the cache line has been fully written.