G06F2212/608

Method and Apparatus for Shared Virtual Memory to Manage Data Coherency in a Heterogeneous Processing System
20180011792 · 2018-01-11 · ·

One embodiment provides for a heterogeneous computing device comprising a first processor coupled with a second processor, wherein one or more of the first or second processor includes graphics processing logic; wherein each of the first processor and the second processor includes first logic to perform virtual to physical memory address translation; and wherein the first logic includes cache coherency state for a block of memory associated with a virtual memory address.

Managing datapath validation on per-transaction basis

A technique for managing a datapath of a data storage system includes receiving a request to access target data and creating a transaction that includes multiple datapath elements in a cache, where the datapath elements are used for accessing the target data. In response to detecting that one of the datapath elements is invalid, the technique further includes processing the transaction in a rescue mode. The rescue mode attempts to replace each invalid datapath element of the transaction with a valid version thereof obtained from elsewhere in the data storage system. The technique further includes committing the transaction as processed in the rescue mode.

Graphics processors and graphics processing units having dot product accumulate instruction for hybrid floating point format

Described herein is a graphics processing unit (GPU) comprising a first processing cluster to perform parallel processing operations, the parallel processing operations including a ray tracing operation and a matrix multiply operation; and a second processing cluster coupled to the first processing cluster, wherein the first processing cluster includes a floating-point unit to perform floating point operations, the floating-point unit is configured to process an instruction using a bfloat16 (BF16) format with a multiplier to multiply second and third source operands while an accumulator adds a first source operand with output from the multiplier.

METHODS AND APPARATUS TO FACILITATE READ-MODIFY-WRITE SUPPORT IN A COHERENT VICTIM CACHE WITH PARALLEL DATA PATHS

Methods, apparatus, systems and articles of manufacture are disclosed facilitate read-modify-write support in a coherent victim cache with parallel data paths. An example apparatus includes a random-access memory configured to be coupled to a central processing unit via a first interface and a second interface, the random-access memory configured to obtain a read request indicating a first address to read via a snoop interface, an address encoder coupled to the random-access memory, the address encoder to, when the random-access memory indicates a hit of the read request, generate a second address corresponding to a victim cache based on the first address, and a multiplexer coupled to the victim cache to transmit a response including data obtained from the second address of the victim cache.

AGGRESSIVE WRITE FLUSH SCHEME FOR A VICTIM CACHE
20230004500 · 2023-01-05 ·

A caching system including a first sub-cache and a second sub-cache in parallel with the first sub-cache, wherein the second sub-cache includes: line type bits configured to store an indication that a corresponding cache line of the second sub-cache is configured to store write-miss data, and an eviction controller configured to evict a cache line of the second sub-cache storing write-miss data based on an indication that the cache line has been fully written.

System and method for optimizing DRAM bus switching using LLC
11567885 · 2023-01-31 · ·

The present disclosure relates to a system and method for optimizing switching of a DRAM bus using LLC. An embodiment of the disclosure includes sending a first type request from a first type queue to the second memory via the memory bus if a direction setting of the memory bus is in a first direction corresponding to the first type request, decrementing a current direction credit count by a first type transaction decrement value, if the decremented current direction credit count is greater than zero, sending another first type request to the second memory via the memory bus and decrementing the current direction credit count again by the first type transaction decrement value, and if the decremented current direction credit count is zero, switching the direction setting of the memory bus to a second direction and resetting the current direction credit count to a second type initial value.

Integrated circuit and address mapping method for cache memory

An integrated circuit (IC) is provided. The IC includes a cache memory divided into a plurality of groups and an address decoder. The groups are assigned in rotation for a plurality of time periods. Each group is assigned in a corresponding single one of the time periods. The address decoder is configured to obtain a set address according to an access address and provide a physical address according to the set address. When the access address corresponds to a first group, the physical address is different from the set address. When the access address corresponds to the groups other than the first group, the physical address is the same as the set address. The sets of the first group that is assigned in a first time period are not overlapping with the sets of other first groups assigned in the time periods other than the first time period.

Cache management for search optimization
11714758 · 2023-08-01 · ·

A method to store a data value onto a cache of a storage hierarchy. A range of a collection of values that resides on a first tier of the hierarchy is initialized. The range is partitioned into disjointed range partitions; a first subset of which is designated as cached; a second subset is designated as uncached. The collection is partitioned into a subset of uncached data and cached data and placed into respective portions. The range partition to which the data value belongs (i.e. the target range partition) is identified as being cached. If the cache is full, the target range partition is divided into two partitions, the partition that excludes the data value is designated as uncached; the values therein are evicted. If the cache has space, the data value is copied onto the cache; otherwise the division/eviction are repeated until the cache has space.

State semantics kexec based firmware update

A kexec-based system update process wherein user-specific data is transferred on reboot of the second kernel. Upon initializing kexec load, buffer memory is assigned to the second kernel and the system loads control pages of fixed size for the second kernel boot, and also loads user-specific data onto extended control pages of variable size. Upon boot of the second kernel, the user-specific data is extracted from the extended control pages and transferred to the corresponding applications.

Three tiered hierarchical memory systems

Systems, apparatuses, and methods related to three tiered hierarchical memory systems are described herein. A three tiered hierarchical memory system can leverage persistent memory to store data that is generally stored in a non-persistent memory, thereby increasing an amount of storage space allocated to a computing system at a lower cost than approaches that rely solely on non-persistent memory. An example apparatus may include a persistent memory, and one or more non-persistent memories configured to map an address associated with an input/output (I/O) device to an address in logic circuitry prior to the apparatus receiving a request from the I/O device to access data stored in the persistent memory, and map the address associated with the I/O device to an address in a non-persistent memory subsequent to the apparatus receiving the request and accessing the data.