Patent classifications
G06F2212/301
Write merging on stores with different privilege levels
A caching system including a first sub-cache, a second sub-cache, coupled in parallel with the first sub-cache, for storing write-memory commands that are not cached in the first sub-cache, the second sub-cache including privilege bits configured to store an indication that a corresponding cache line of the second sub-cache is associated with a level of privilege, and wherein the second sub-cache is further configured to receive a first write memory command for a memory address associated with a first level of privilege, store, in the second sub-cache, first data associated with the first write memory command and the level of privilege associated with the cache line, receive a second write memory command for the cache line, the second write memory command associated with a second level of privilege, merge the first level of privilege with the second level of privilege, and output the merged privilege level with the cache line.
Victim cache with write miss merging
A caching system including a first sub-cache, a second sub-cache, coupled in parallel with the first sub-cache, for storing cache data evicted from the first sub-cache and write-memory commands that are not cached in the first sub-cache, and a cache controller configured to receive two or more cache commands, determine a conflict exists between the received two or more cache commands, determine a conflict resolution between the received two or more cache commands, and sending the two or more cache commands to the first sub-cache and the second sub-cache.
Methods and apparatus for multi-banked victim cache with dual datapath
Methods, apparatus, systems and articles of manufacture are disclosed for multi-banked victim cache with dual datapath. An example cache system includes a storage element that includes banks operable to store data, ports operable to receive memory operations in parallel, wherein each of the memory operations has a respective address, and a plurality of comparators coupled such that each of the comparators is coupled to a respective port of the ports and a respective bank of the banks and is operable to determine whether a respective address of a respective memory operation received by the respective port corresponds to the data stored in the respective bank.
Block device interface using non-volatile pinned memory
A method comprising: receiving, at a block device interface, an instruction to write data, the instruction comprising a memory location of the data; copying the data to pinned memory; performing, by a vector processor, one or more invertible transforms on the data; and writing the data from the pinned memory to one or more storage devices asynchronously; wherein the pinned memory of the data corresponds to a location in pinned memory, the pinned memory being accessible by the vector processor and one or more other processors.
METHODS AND APPARATUS TO FACILITATE ATOMIC COMPARE AND SWAP IN CACHE FOR A COHERENT LEVEL 1 DATA CACHE SYSTEM
Methods, apparatus, systems and articles of manufacture to facilitate atomic compare and swap in cache for a coherent level 1 data cache system are disclosed. An example system includes a cache storage; a cache controller coupled to the cache storage wherein the cache controller is operable to: receive a memory operation that specifies a key, a memory address, and a first set of data; retrieve a second set of data corresponding to the memory address; compare the second set of data to the key; based on the second set of data corresponding to the key, cause the first set of data to be stored at the memory address; and based on the second set of data not corresponding to the key, complete the memory operation without causing the first set of data to be stored at the memory address.
FULLY PIPELINED READ-MODIFY-WRITE SUPPORT
Methods, apparatus, systems and articles of manufacture are disclosed to facilitate fully pipelined read-modify-write support in level 1 data cache using store queue and data forwarding. An example apparatus includes a first storage, a second storage, a store queue coupled to the first storage and the second storage, the store queue operable to receive a first memory operation specifying a first set of data, process the first memory operation for storing the first set of data in at least one of the first storage and the second storage, receive a second memory operation, and prior to storing the first set of data in the at least one of the first storage and the second storage, feedback the first set of data for use in the second memory operation.
Protecting host memory from access by untrusted accelerators
A host processor receives an address translation request from an accelerator, which may be trusted or un-trusted. The address translation request includes a virtual address in a virtual address space that is shared by the host processor and the accelerator. The host processor encrypts a physical address in a host memory indicated by the virtual address in response to the accelerator being permitted to access the physical address. The host processor then provides the encrypted physical address to the accelerator. The accelerator provides memory access requests including the encrypted physical address to the host processor, which decrypts the physical address and selectively accesses a location in the host memory indicated by the decrypted physical address depending upon whether the accelerator is permitted to access the location indicated by the decrypted physical address.
Semiconductor device and memory access setup method
Limitations on memory access decrease the computing capability of related-art semiconductor devices during convolution processing in a convolutional neural network. A semiconductor device according to an aspect of the present invention includes an accelerator section that performs computation on a plurality of intermediate layers included in a convolutional neural network by using a memory having a plurality of banks capable of changing the read/write status on an individual bank basis. The accelerator section includes a network layer control section that controls a memory control section in such a manner as to change the read/write status assigned to the banks storing input data or output data of the intermediate layers in accordance with the transfer amounts and transfer rates of the input data and output data of the intermediate layers included in the convolutional neural network.
WRITE MERGING ON STORES WITH DIFFERENT TAGS
Techniques for caching data are provided that include receiving, by a caching system, a write memory command for a memory address, the write memory command associated with a first color tag, determining, by a first sub-cache of the caching system, that the memory address is not cached in the first sub-cache, determining, by second sub-cache of the caching system, that the memory address is not cached in the second sub-cache, storing first data associated with the first write memory command in a cache line of the second sub-cache, storing the first color tag in the second sub-cache, receiving a second write memory command for the cache line, the write memory command associated with a second color tag, merging the second color tag with the first color tag, storing the merged color tag, and evicting the cache line based on the merged color tag.
METHODS AND APPARATUS TO FACILITATE READ-MODIFY-WRITE SUPPORT IN A COHERENT VICTIM CACHE WITH PARALLEL DATA PATHS
Methods, apparatus, systems and articles of manufacture are disclosed facilitate read-modify-write support in a coherent victim cache with parallel data paths. An example apparatus includes a random-access memory configured to be coupled to a central processing unit via a first interface and a second interface, the random-access memory configured to obtain a read request indicating a first address to read via a snoop interface, an address encoder coupled to the random-access memory, the address encoder to, when the random-access memory indicates a hit of the read request, generate a second address corresponding to a victim cache based on the first address, and a multiplexer coupled to the victim cache to transmit a response including data obtained from the second address of the victim cache.