G06F2212/27

Selection of variable memory-access size

A method for dynamically selecting the size of a memory access may be provided. The method comprises accessing blocks having a variable number of consecutive cache lines, maintaining a vector whose entries record the past utilization of each block size, and adapting said block size before a next access to the blocks.
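For illustration, a minimal sketch of how such a utilization vector might drive the size decision. The power-of-two block sizes, the weighted averaging, and the scoring rule are illustrative assumptions, not details from the abstract:

```c
/* Minimal sketch of block-size adaptation from a past-utilization vector.
 * Sizes, EWMA weighting, and scoring are assumptions for illustration. */
#define NUM_SIZES 4
static const int block_lines[NUM_SIZES] = {1, 2, 4, 8};  /* lines per block */
static double past_util[NUM_SIZES];       /* vector of past utilizations    */

/* After an access, record what fraction of the fetched block was used. */
void record_utilization(int size_idx, double used_fraction)
{
    /* Exponentially weighted average keeps old history from dominating. */
    past_util[size_idx] = 0.75 * past_util[size_idx] + 0.25 * used_fraction;
}

/* Before the next access, pick the size with the best expected payoff. */
int select_block_size(void)
{
    int best = 0;
    double best_score = -1.0;
    for (int i = 0; i < NUM_SIZES; i++) {
        /* Favor larger blocks only when their extra lines were used. */
        double score = past_util[i] * block_lines[i];
        if (score > best_score) { best_score = score; best = i; }
    }
    return block_lines[best];
}
```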

Processing node, computer system, and transaction conflict detection method

A processing node, a computer system, and a transaction conflict detection method are provided, where the processing node includes a processor and a transactional cache that caches shared data of a transaction processed by the processing node. When the processor obtains a first operation instruction in a transaction for accessing shared data, it accesses the transactional cache. If the transactional cache determines that the first operation instruction fails to hit a cache line in the transactional cache, it sends a first destination address in the operation instruction to a transactional cache in another processing node. After receiving, from the other processing node, status information for the cache line hit by the first destination address, the transactional cache determines, based on the received status information, whether the first operation instruction conflicts with a second operation instruction executed by the other processing node.
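As a rough illustration of the final determination step, a sketch of a conflict check over the returned status, assuming a simplified MESI-like status encoding and read/write operation types; the actual protocol and encodings are not specified in the abstract:

```c
/* Illustrative remote-status conflict check. The status encoding and the
 * read/write conflict rule are assumptions, not the patent's protocol. */
typedef enum { LINE_INVALID, LINE_SHARED, LINE_MODIFIED } line_status_t;
typedef enum { OP_READ, OP_WRITE } op_type_t;

/* Called after the remote transactional cache returns the status of the
 * cache line hit by the first destination address. */
int conflicts(op_type_t local_op, line_status_t remote_status,
              op_type_t remote_op)
{
    if (remote_status == LINE_INVALID)
        return 0;                  /* remote node is not using the line */
    /* Two reads can proceed concurrently; a write on either side of a
     * line the other node holds is a transactional conflict. */
    return (local_op == OP_WRITE) || (remote_op == OP_WRITE);
}
```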

Processor with selective data storage (of accelerator) operable as either victim cache data storage or accelerator memory and having victim cache tags in lower level cache wherein evicted cache line is stored in said data storage when said data storage is in a first mode and said cache line is stored in system memory rather than said data storage when said data storage is in a second mode

A first data storage holds cache lines. An accelerator has a second data storage that selectively holds either accelerator data or cache lines evicted from the first data storage. A tag directory holds tags for cache lines stored in the first and second data storages, and a mode indicator indicates whether the second data storage is operating in a first mode, in which it holds cache lines evicted from the first data storage, or a second mode, in which it holds accelerator data. In response to a request to evict a cache line from the first data storage, in the first mode the control logic writes the cache line to the second data storage and updates a tag in the tag directory to indicate that the cache line is present in the second data storage; in the second mode the control logic instead writes the cache line to a system memory.
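A minimal sketch of the mode-dependent eviction path, assuming illustrative structure names, a 64-byte line, and stubbed tag-directory and system-memory helpers:

```c
/* Sketch of mode-dependent eviction. All names are illustrative. */
#include <stdint.h>
#include <string.h>

#define LINE_BYTES 64

enum sds_mode { MODE_VICTIM = 1, MODE_ACCEL = 2 };  /* first / second mode */

struct second_storage {
    enum sds_mode mode;                 /* mode indicator                  */
    uint8_t data[1024][LINE_BYTES];     /* second data storage             */
};

/* Stubs standing in for the tag directory and system-memory write-back. */
static void tag_dir_mark_in_second(uint64_t addr) { (void)addr; }
static void sysmem_write(uint64_t addr, const uint8_t *line)
{ (void)addr; (void)line; }

void evict_line(struct second_storage *ss, uint64_t addr,
                const uint8_t line[LINE_BYTES], size_t slot)
{
    if (ss->mode == MODE_VICTIM) {
        /* First mode: the second data storage acts as a victim cache. */
        memcpy(ss->data[slot], line, LINE_BYTES);
        tag_dir_mark_in_second(addr);
    } else {
        /* Second mode: the storage belongs to the accelerator, so the
         * evicted line goes to system memory instead. */
        sysmem_write(addr, line);
    }
}
```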

CONTROL FLOW GUIDED LOCK ADDRESS PREFETCH AND FILTERING
20200151100 · 2020-05-14 ·

A method of prefetching target data includes, in response to detecting a lock-prefixed instruction for execution in a processor, determining a predicted target memory location for the lock-prefixed instruction based on control flow information associating the lock-prefixed instruction with the predicted target memory location. Target data is prefetched from the predicted target memory location to a cache coupled with the processor, and after completion of the prefetching, the lock-prefixed instruction is executed in the processor using the prefetched target data.
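A sketch of how such a control-flow-guided predictor might be organized, assuming a small direct-mapped table keyed by a hash of the instruction address and recent branch history; the table layout, hash, and training points are assumptions for illustration (x86 intrinsics are used only to express the prefetch):

```c
/* Sketch of a control-flow-guided lock-address predictor. */
#include <stdint.h>
#include <xmmintrin.h>   /* _mm_prefetch */

#define TABLE_SIZE 256

struct predictor_entry {
    uint64_t key;     /* hash of instruction PC and control-flow history */
    uint64_t target;  /* predicted target memory location                */
};
static struct predictor_entry table[TABLE_SIZE];

static uint64_t key_of(uint64_t pc, uint64_t branch_history)
{
    return pc ^ (branch_history * 0x9E3779B97F4A7C15ull);
}

/* On detecting a lock-prefixed instruction: prefetch the predicted
 * target into the cache before the instruction executes. */
void on_lock_detected(uint64_t pc, uint64_t branch_history)
{
    uint64_t k = key_of(pc, branch_history);
    struct predictor_entry *e = &table[k % TABLE_SIZE];
    if (e->key == k)
        _mm_prefetch((const char *)(uintptr_t)e->target, _MM_HINT_T0);
}

/* After the instruction completes: train the predictor with the
 * address it actually accessed. */
void on_lock_retired(uint64_t pc, uint64_t branch_history, uint64_t target)
{
    uint64_t k = key_of(pc, branch_history);
    struct predictor_entry *e = &table[k % TABLE_SIZE];
    e->key = k;
    e->target = target;
}
```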

Shallow cache for content replication

Embodiments relate to efficiently replicating data from a source storage space to a target storage space. The storage spaces share a common namespace of paths where content units are stored. A shallow cache is maintained for the target storage space. Each entry in the cache includes a hash of a content unit in the target storage space and associated hierarchy paths in the target storage space where the corresponding content unit is stored. When a set of content units in the source storage space is to be replicated at the target storage space, any content unit with a hash in the cache is replicated from one of the associated paths in the cache, thus avoiding having to replicate content from the source storage space.
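A minimal sketch of the shallow-cache lookup during replication, assuming a fixed-size entry table, a 64-bit content hash, and stubbed local-copy and remote-fetch operations:

```c
/* Sketch of hash-based replication via a shallow cache. Hash type, copy
 * helpers, and sizes are assumptions for illustration. */
#include <stdint.h>
#include <stdio.h>

#define MAX_ENTRIES 1024
#define MAX_PATHS   4

struct cache_entry {
    uint64_t hash;              /* hash of a content unit                */
    char paths[MAX_PATHS][256]; /* hierarchy paths in the target space   */
    int npaths;
};

static struct cache_entry shallow_cache[MAX_ENTRIES];
static int ncached;

/* Stubs standing in for real local-copy and remote-fetch operations. */
static void copy_local(const char *from, const char *to)
{ printf("local copy %s -> %s\n", from, to); }
static void fetch_from_source(uint64_t hash, const char *to)
{ printf("remote fetch %016llx -> %s\n", (unsigned long long)hash, to); }

void replicate_unit(uint64_t hash, const char *dest_path)
{
    for (int i = 0; i < ncached; i++) {
        if (shallow_cache[i].hash == hash && shallow_cache[i].npaths > 0) {
            /* Hit: the content already exists in the target space, so
             * copy from one of its associated paths. */
            copy_local(shallow_cache[i].paths[0], dest_path);
            return;
        }
    }
    /* Miss: the unit must be replicated from the source storage space. */
    fetch_from_source(hash, dest_path);
}
```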

SHALLOW CACHE FOR CONTENT REPLICATION
20190391917 · 2019-12-26 ·

Embodiments relate to efficiently replicating data from a source storage space to a target storage space. The storage spaces share a common namespace of paths where content units are stored. A shallow cache is maintained for the target storage space. Each entry in the cache includes a hash of a content unit in the target storage space and associated hierarchy paths in the target storage space where the corresponding content unit is stored. When a set of content units in the source storage space is to be replicated at the target storage space, any content unit with a hash in the cache is replicated from one of the associated paths in the cache, thus avoiding having to replicate content from the source storage space.

No-locality hint vector memory access processors, methods, systems, and instructions
11892952 · 2024-02-06 ·

A processor of an aspect includes a plurality of packed data registers, and a decode unit to decode a no-locality hint vector memory access instruction. The no-locality hint vector memory access instruction indicates a packed data register, of the plurality of packed data registers, that is to hold source packed memory indices. The source packed memory indices have a plurality of memory indices. The no-locality hint vector memory access instruction provides a no-locality hint to the processor for data elements that are to be accessed with the memory indices. The processor also includes an execution unit coupled with the decode unit and the plurality of packed data registers. The execution unit, in response to the no-locality hint vector memory access instruction, accesses the data elements at memory locations that are based on the memory indices.
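Mainstream x86 gathers have no architectural no-locality form, so a software approximation is the closest runnable illustration: non-temporal (NTA) prefetches ahead of the indexed loads. The sketch below captures the intent of the abstract, not any specific ISA encoding:

```c
/* Approximation of a no-locality-hinted gather: hint the cache hierarchy
 * that the gathered lines need not be retained. */
#include <stdint.h>
#include <xmmintrin.h>   /* _mm_prefetch, _MM_HINT_NTA */

/* Gather n floats from base[idx[i]], hinting that the accessed lines
 * have no temporal locality. */
void gather_no_locality(const float *base, const int32_t *idx,
                        float *out, int n)
{
    for (int i = 0; i < n; i++)
        _mm_prefetch((const char *)&base[idx[i]], _MM_HINT_NTA);
    for (int i = 0; i < n; i++)
        out[i] = base[idx[i]];     /* the actual data-element accesses */
}
```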

METHODS AND APPARATUS TO IMPLEMENT MULTIPLE INFERENCE COMPUTE ENGINES

Methods and apparatus to implement multiple inference compute engines are disclosed herein. A disclosed example apparatus includes a first inference compute engine, a second inference compute engine, and an accelerator on coherent fabric to couple the first inference compute engine and the second inference compute engine to a converged coherency fabric of a system-on-chip, the accelerator on coherent fabric to arbitrate requests from the first inference compute engine and the second inference compute engine to utilize a single in-die interconnect port.
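As an illustration of the arbitration role, a round-robin grant between two requesters sharing one port; the request/grant model and all names are assumptions, not details from the abstract:

```c
/* Illustrative round-robin arbiter for two inference compute engines
 * sharing a single in-die interconnect port. */
#include <stdbool.h>

struct acf_arbiter {
    int last_granted;   /* 0 or 1: engine granted most recently */
};

/* Returns which engine (0 or 1) may drive the single port this cycle,
 * or -1 when neither engine is requesting. */
int arbitrate(struct acf_arbiter *a, bool req0, bool req1)
{
    if (req0 && req1) {
        /* Both engines request: alternate so neither starves. */
        a->last_granted = 1 - a->last_granted;
        return a->last_granted;
    }
    if (req0) { a->last_granted = 0; return 0; }
    if (req1) { a->last_granted = 1; return 1; }
    return -1;
}
```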

MEMORY SYSTEM AND OPERATING METHOD THEREOF
20190227940 · 2019-07-25 ·

A memory system includes a nonvolatile memory device configured to store a plurality of segments, each of which is composed of a plurality of map data; a first region configured to cache, among the plurality of segments, a target segment including target map data; a second region configured to cache a target map data group selected from among a plurality of map data groups in the target segment; and a controller configured to control data caching of the first region and the second region, wherein each of the plurality of map data groups includes a plurality of map data, and the second region caches data in units smaller than those of the first region.
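A sketch of the two-level map caching this describes, assuming illustrative segment and group sizes and a stubbed read from the nonvolatile device: on a miss, the whole target segment is cached in the first region and the selected, smaller map data group is promoted into the second region:

```c
/* Two-level map-data cache sketch. Sizes and lookup flow are assumptions. */
#include <stdint.h>
#include <string.h>

#define GROUPS_PER_SEGMENT 16
#define ENTRIES_PER_GROUP  64

typedef uint64_t map_entry_t;            /* one logical-to-physical mapping */

struct map_group   { map_entry_t e[ENTRIES_PER_GROUP]; };
struct map_segment { struct map_group g[GROUPS_PER_SEGMENT]; };

static struct map_segment first_region;  /* caches the whole target segment */
static int first_seg = -1;
static struct map_group second_region;   /* caches one smaller map data group */
static int second_seg = -1, second_grp = -1;

/* Stub standing in for a segment read from the nonvolatile memory device. */
static void nvm_read_segment(int seg, struct map_segment *dst)
{ (void)seg; memset(dst, 0, sizeof *dst); }

map_entry_t map_lookup(int seg, int grp, int idx)
{
    if (second_seg == seg && second_grp == grp)
        return second_region.e[idx];     /* hit in the finer-grained region */
    if (first_seg != seg) {              /* segment miss: fill first region */
        nvm_read_segment(seg, &first_region);
        first_seg = seg;
    }
    /* Promote the selected group into the smaller second region. */
    second_region = first_region.g[grp];
    second_seg = seg;
    second_grp = grp;
    return second_region.e[idx];
}
```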

NO-LOCALITY HINT VECTOR MEMORY ACCESS PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS
20190179762 · 2019-06-13 ·

A processor of an aspect includes a plurality of packed data registers, and a decode unit to decode a no-locality hint vector memory access instruction. The no-locality hint vector memory access instruction indicates a packed data register, of the plurality of packed data registers, that is to hold source packed memory indices. The source packed memory indices have a plurality of memory indices. The no-locality hint vector memory access instruction provides a no-locality hint to the processor for data elements that are to be accessed with the memory indices. The processor also includes an execution unit coupled with the decode unit and the plurality of packed data registers. The execution unit, in response to the no-locality hint vector memory access instruction, accesses the data elements at memory locations that are based on the memory indices.