G06F12/1054

Memory management unit with address translation cache
11507515 · 2022-11-22 ·

The present disclosure advantageously provides a memory management unit and methods for invalidating cache lines in address translation caches. The memory management unit has one or more address translation caches, and each address translation cache has a plurality of cache lines. The memory management unit receives transactions from a source of transactions. The transactions include, inter alia, memory transactions and a set-aside translation transaction. The memory transactions include at least a first memory transaction and a last memory transaction, and each memory transaction includes the same virtual memory address and the same translation context identifier. The set-aside translation transaction also includes the same virtual memory address and the same translation context identifier. In response to receiving the set-aside translation transaction, the memory management unit deallocates each cache line that stores an address translation for the same virtual memory address and the same translation context identifier.
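
A minimal C sketch of this behavior follows, assuming a small fully associative translation cache. The structure names, cache size, miss-fill policy, and the fabricated translation value are illustrative assumptions, not details from the patent.

/* Sketch: an address-translation cache in which a "set-aside translation"
 * transaction deallocates every cache line whose virtual address and
 * translation context identifier match the transaction. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define ATC_LINES 8   /* hypothetical number of cache lines */

typedef struct {
    bool     valid;
    uint64_t virt_addr;   /* virtual address of the cached translation  */
    uint32_t ctx_id;      /* translation context identifier             */
    uint64_t phys_addr;   /* cached physical address                    */
} atc_line_t;

typedef struct {
    atc_line_t line[ATC_LINES];
} atc_t;

/* Memory transaction: look up (and here, fill on miss) a translation. */
static uint64_t atc_translate(atc_t *atc, uint64_t va, uint32_t ctx)
{
    for (int i = 0; i < ATC_LINES; i++) {
        atc_line_t *l = &atc->line[i];
        if (l->valid && l->virt_addr == va && l->ctx_id == ctx)
            return l->phys_addr;                 /* hit */
    }
    /* Miss: allocate the first free line with a made-up translation. */
    for (int i = 0; i < ATC_LINES; i++) {
        if (!atc->line[i].valid) {
            atc->line[i] = (atc_line_t){ true, va, ctx, va ^ 0xFFF000u };
            return atc->line[i].phys_addr;
        }
    }
    return 0; /* no free line; a real design would evict */
}

/* Set-aside translation transaction: deallocate every line that stores a
 * translation for the same virtual address and context identifier. */
static void atc_set_aside(atc_t *atc, uint64_t va, uint32_t ctx)
{
    for (int i = 0; i < ATC_LINES; i++) {
        atc_line_t *l = &atc->line[i];
        if (l->valid && l->virt_addr == va && l->ctx_id == ctx)
            l->valid = false;                    /* deallocate the line */
    }
}

int main(void)
{
    atc_t atc = {0};
    /* First and last memory transactions use the same VA and context. */
    atc_translate(&atc, 0x4000, 7);
    atc_translate(&atc, 0x4000, 7);
    atc_set_aside(&atc, 0x4000, 7);   /* arrives after the last memory txn */
    printf("line 0 valid after set-aside: %d\n", atc.line[0].valid);
    return 0;
}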

TECHNOLOGIES FOR DYNAMIC ACCELERATOR SELECTION
20230050698 · 2023-02-16 ·

Technologies for dynamic accelerator selection include a compute sled. The compute sled includes a network interface controller to communicate with a remote accelerator of an accelerator sled over a network, where the network interface controller includes a local accelerator and a compute engine. The compute engine is to obtain network telemetry data indicative of a level of bandwidth saturation of the network. The compute engine is also to determine whether to accelerate a function managed by the compute sled. The compute engine is further to determine, in response to a determination to accelerate the function, whether to offload the function to the remote accelerator of the accelerator sled based on the telemetry data. The compute engine is also to assign, in response to a determination not to offload the function to the remote accelerator, the function to the local accelerator of the network interface controller.
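
The decision flow lends itself to a short sketch. The C fragment below is a hypothetical rendering: the 70% saturation cutoff, the telemetry and function descriptors, and all field names are assumptions made for illustration only.

/* Sketch: decide where to run a function based on network telemetry. */
#include <stdbool.h>
#include <stdio.h>

typedef enum { RUN_ON_CPU, RUN_ON_LOCAL_ACCEL, RUN_ON_REMOTE_ACCEL } placement_t;

typedef struct {
    double bandwidth_saturation;  /* 0.0 .. 1.0, from network telemetry */
} telemetry_t;

typedef struct {
    bool worth_accelerating;      /* is the function a candidate at all? */
} function_info_t;

static placement_t choose_placement(const function_info_t *fn,
                                    const telemetry_t *net)
{
    if (!fn->worth_accelerating)
        return RUN_ON_CPU;                       /* do not accelerate */

    /* Offload only if the network is not saturated (assumed 70% cutoff). */
    if (net->bandwidth_saturation < 0.70)
        return RUN_ON_REMOTE_ACCEL;              /* accelerator sled */

    return RUN_ON_LOCAL_ACCEL;                   /* NIC-local accelerator */
}

int main(void)
{
    function_info_t fn  = { .worth_accelerating = true };
    telemetry_t     net = { .bandwidth_saturation = 0.85 };
    printf("placement = %d\n", choose_placement(&fn, &net)); /* local accel */
    return 0;
}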

NON-VOLATILE STORAGE CONTROLLER WITH PARTIAL LOGICAL-TO-PHYSICAL (L2P) ADDRESS TRANSLATION TABLE
20220358051 · 2022-11-10 ·

Systems, apparatus and methods are provided for logical-to-physical (L2P) address translation. A method may comprise receiving a request for a first logical data address (LDA), and calculating a first translation data unit (TDU) index for a first TDU. The first TDU may contain an L2P entry for the first LDA. The method may further comprise searching a cache of lookup directory entries of recently accessed TDUs using the first TDU index, determining that there is a cache miss, generating and storing an outstanding request for the lookup directory entry for the first TDU in a miss buffer, retrieving the lookup directory entry for the first TDU from an in-memory lookup directory, determining that the lookup directory entry for the first TDU is not valid, reserving a TDU space for the first TDU in a memory, and generating a load request for the first TDU.
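
The lookup path can be sketched step by step in C. The TDU size, directory-cache size, miss-buffer depth, and the stubbed in-memory directory below are assumptions chosen only to make the flow concrete.

/* Sketch: LDA -> TDU index -> directory-cache search -> miss buffer ->
 * in-memory directory -> reserve TDU space and issue a load request. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define L2P_ENTRIES_PER_TDU 512u   /* assumed L2P entries per TDU         */
#define DIR_CACHE_SLOTS     4u     /* assumed lookup-directory cache size */
#define MISS_BUFFER_SLOTS   8u

typedef struct { bool valid; uint32_t tdu_index; uint64_t tdu_addr; } dir_entry_t;

static dir_entry_t dir_cache[DIR_CACHE_SLOTS];       /* recently accessed TDUs */
static uint32_t    miss_buffer[MISS_BUFFER_SLOTS];   /* outstanding requests   */
static unsigned    miss_count;

/* Step 1: the TDU index is simply which TDU holds the L2P entry for the LDA. */
static uint32_t tdu_index_for_lda(uint64_t lda)
{
    return (uint32_t)(lda / L2P_ENTRIES_PER_TDU);
}

/* In-memory lookup directory (stub): pretend no TDU is resident yet. */
static dir_entry_t lookup_directory_read(uint32_t tdu_index)
{
    return (dir_entry_t){ .valid = false, .tdu_index = tdu_index };
}

static void handle_lda_request(uint64_t lda)
{
    uint32_t tdu = tdu_index_for_lda(lda);

    /* Step 2: search the cache of recently accessed TDU directory entries. */
    for (unsigned i = 0; i < DIR_CACHE_SLOTS; i++)
        if (dir_cache[i].valid && dir_cache[i].tdu_index == tdu) {
            printf("cache hit: TDU %u at 0x%llx\n",
                   tdu, (unsigned long long)dir_cache[i].tdu_addr);
            return;
        }

    /* Step 3: cache miss -> record an outstanding request in the miss buffer. */
    if (miss_count < MISS_BUFFER_SLOTS)
        miss_buffer[miss_count++] = tdu;

    /* Step 4: consult the in-memory lookup directory. */
    dir_entry_t e = lookup_directory_read(tdu);
    if (!e.valid) {
        /* Step 5: entry not valid -> reserve TDU space and issue a load. */
        printf("reserve TDU space and issue load request for TDU %u\n", tdu);
    }
}

int main(void)
{
    handle_lda_request(123456);  /* first LDA */
    return 0;
}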

Multi-tile memory management mechanism

Graphics processors for implementing multi-tile memory management are disclosed. In one embodiment, a graphics processor includes a first graphics device having a local memory, a second graphics device having a local memory, and a graphics driver to provide a single virtual allocation with a common virtual address range to mirror a resource to each local memory of the first and second graphics devices.
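The mirroring idea can be illustrated with a hedged host-side sketch: one virtual allocation with a single common address range, backed by a copy of the resource in each device's local memory. The mirrored_alloc_t structure, the mirror_write helper, and the use of heap buffers as stand-ins for device-local memory are all assumptions; a real driver would program GPU page tables instead.

/* Sketch: one virtual allocation mirrored into each tile's local memory. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NUM_TILES 2                 /* first and second graphics devices */

typedef struct {
    uint64_t gpu_va_base;           /* common virtual address range base      */
    size_t   size;
    void    *local_copy[NUM_TILES]; /* stand-in for each tile's local memory  */
} mirrored_alloc_t;

/* "Driver" creates one virtual allocation and a backing copy per tile. */
static mirrored_alloc_t mirror_alloc(uint64_t gpu_va_base, size_t size)
{
    mirrored_alloc_t a = { .gpu_va_base = gpu_va_base, .size = size };
    for (int t = 0; t < NUM_TILES; t++)
        a.local_copy[t] = calloc(1, size);
    return a;
}

/* A write through the common VA range is replicated to every tile's copy. */
static void mirror_write(mirrored_alloc_t *a, uint64_t gpu_va,
                         const void *src, size_t len)
{
    size_t off = (size_t)(gpu_va - a->gpu_va_base);
    for (int t = 0; t < NUM_TILES; t++)
        memcpy((char *)a->local_copy[t] + off, src, len);
}

int main(void)
{
    mirrored_alloc_t a = mirror_alloc(0x100000000ull, 4096);
    uint32_t v = 42;
    mirror_write(&a, 0x100000000ull + 16, &v, sizeof v);
    printf("tile 1 sees %u\n", *(uint32_t *)((char *)a.local_copy[1] + 16));
    for (int t = 0; t < NUM_TILES; t++)
        free(a.local_copy[t]);
    return 0;
}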

L2P TRANSLATION TECHNIQUES IN LIMITED RAM SYSTEMS TO INCREASE RANDOM WRITE PERFORMANCE USING MULTIPLE L2P CACHES
20230031365 · 2023-02-02 ·

Devices and techniques are disclosed herein for more efficiently performing random write operations for a memory device. In an example, a method of operating a flash memory device can include receiving a write request at a flash memory device from a host, the write request including a first logical block address and write data, saving the write data to a location of the flash memory device having a first physical address, operating the flash memory device in a first mode when an amount of write data associated with the write request is above a threshold, operating the flash memory device in a second mode when an amount of write data is below the threshold, and comparing the amount of write data to the threshold.
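
The core of the abstract is a threshold comparison that selects one of two operating modes. The sketch below assumes a 128 KiB cutoff and generic mode names; both are illustrative values, not figures from the disclosure.

/* Sketch: pick a write mode by comparing the request size to a threshold. */
#include <stdint.h>
#include <stdio.h>

#define MODE_THRESHOLD_BYTES (128u * 1024u)   /* assumed cutoff */

typedef enum { MODE_FIRST, MODE_SECOND } write_mode_t;

typedef struct {
    uint64_t lba;          /* first logical block address */
    uint32_t length_bytes; /* amount of write data        */
} write_request_t;

static write_mode_t pick_mode(const write_request_t *req)
{
    /* Compare the amount of write data to the threshold. */
    return (req->length_bytes > MODE_THRESHOLD_BYTES) ? MODE_FIRST : MODE_SECOND;
}

int main(void)
{
    write_request_t big   = { .lba = 0,   .length_bytes = 512u * 1024u };
    write_request_t small = { .lba = 100, .length_bytes = 4096u };
    printf("big -> mode %d, small -> mode %d\n",
           pick_mode(&big), pick_mode(&small));
    return 0;
}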

OPERATIONAL CODE STORAGE FOR AN ON-DIE MICROPROCESSOR

Methods, systems, and devices for operational code storage for an on-die microprocessor are described. A microprocessor may be formed on-die with a memory array. Operating code for the microprocessor may be stored in the memory array, possibly along with other data (e.g., tracking or statistical data) used or generated by the on-die microprocessor. A wear leveling algorithm may result in some number of rows within the memory array not being used to store user data at any given time, and these rows may be used to store the operating code and possibly other data for the on-die microprocessor. The on-die microprocessor may boot and run based on the operating code stored in the memory array.
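
A rough sketch of the idea: place the operating code in rows the wear-leveling scheme is not currently using for user data, then read it back at boot. The row counts, the holds_user_data map, and the boot routine below are assumptions for illustration, not the patented design.

/* Sketch: store operating code in rows without user data; boot from them. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define NUM_ROWS  16
#define ROW_BYTES 64

static uint8_t array[NUM_ROWS][ROW_BYTES];   /* the memory array            */
static bool    holds_user_data[NUM_ROWS];    /* maintained by wear leveling */

/* Place operating code into rows not currently used for user data. */
static int store_opcode(const uint8_t *code, size_t len, int rows_out[])
{
    int n = 0;
    for (int r = 0; r < NUM_ROWS && len > 0; r++) {
        if (holds_user_data[r])
            continue;                        /* skip rows holding user data */
        size_t chunk = len < ROW_BYTES ? len : ROW_BYTES;
        memcpy(array[r], code, chunk);
        rows_out[n++] = r;
        code += chunk;
        len  -= chunk;
    }
    return n;                                /* rows consumed by the code   */
}

/* Boot path: read the operating code back from the reserved rows. */
static void boot_from_rows(const int rows[], int n, uint8_t *out)
{
    for (int i = 0; i < n; i++)
        memcpy(out + (size_t)i * ROW_BYTES, array[rows[i]], ROW_BYTES);
}

int main(void)
{
    holds_user_data[0] = holds_user_data[1] = true;   /* user-data rows */
    uint8_t code[100] = "on-die microprocessor operating code image";
    int rows[NUM_ROWS];
    int used = store_opcode(code, sizeof code, rows);
    uint8_t image[NUM_ROWS * ROW_BYTES];
    boot_from_rows(rows, used, image);
    printf("boot image starts: %.20s\n", (char *)image);
    return 0;
}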

Implementing mapping data structures to minimize sequentially written data accesses
11615020 · 2023-03-28 ·

A system includes a memory device, and a processing device, operatively coupled to the memory device, to perform operations including receiving a request to sequentially write data to a block of a memory device, in response to receiving the request, writing the data to the block to obtain sequentially written data, initiating accumulation of logical-to-physical (L2P) mapping data corresponding to the sequentially written data, determining that a criterion for terminating the accumulation of the L2P mapping data is satisfied, in response to determining that the criterion is satisfied, terminating the accumulation of the L2P mapping data to obtain accumulated L2P mapping data, and updating an L2P mapping data structure based on the accumulated L2P mapping data.
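
The accumulate-then-update flow can be shown with a small C sketch. Here the termination criterion is simply "the accumulation buffer is full", which is an assumption standing in for whatever criterion the claims cover; buffer sizes and names are likewise illustrative.

/* Sketch: accumulate L2P entries during a sequential write, then flush them
 * into the L2P mapping data structure once a criterion is satisfied. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define ACCUM_CAPACITY 4            /* assumed criterion: buffer full */
#define TABLE_SIZE     64

typedef struct { uint32_t lba; uint32_t ppa; } l2p_entry_t;

static uint32_t    l2p_table[TABLE_SIZE];   /* the L2P mapping data structure */
static l2p_entry_t accum[ACCUM_CAPACITY];   /* accumulated L2P mapping data   */
static int         accum_len;

static bool criterion_satisfied(void)
{
    return accum_len == ACCUM_CAPACITY;     /* stand-in termination criterion */
}

/* Flush the accumulated entries into the L2P mapping data structure. */
static void update_l2p_table(void)
{
    for (int i = 0; i < accum_len; i++)
        l2p_table[accum[i].lba] = accum[i].ppa;
    accum_len = 0;                          /* accumulation terminated */
}

/* Sequential write of one logical block: write data, accumulate its mapping. */
static void sequential_write(uint32_t lba, uint32_t ppa)
{
    /* (the data itself would be written to the block here) */
    accum[accum_len++] = (l2p_entry_t){ lba, ppa };
    if (criterion_satisfied())
        update_l2p_table();
}

int main(void)
{
    for (uint32_t i = 0; i < 8; i++)
        sequential_write(i, 1000 + i);      /* flushes after entries 3 and 7 */
    printf("l2p_table[5] = %u\n", l2p_table[5]);
    return 0;
}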

L2P translation techniques in limited RAM systems to increase random write performance using multiple L2P caches
11487653 · 2022-11-01 · ·

Devices and techniques are disclosed herein for more efficiently performing random write operations for a memory device. In an example, a method of operating a flash memory device can include receiving a write request at a flash memory device from a host, the write request including a first logical block address and write data, saving the write data to a location of the flash memory device having a first physical address, operating the flash memory device in a first mode when an amount of write data associated with the write request is above a threshold, operating the flash memory device in a second mode when an amount of write data is below the threshold, and comparing the amount of write data to the threshold.
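
The title's "multiple L2P caches" suggests holding more than one cached slice of the L2P table in limited RAM. The sketch below is speculative: it assumes a small cache dedicated to recent random writes that is consulted before a main cached L2P region; that split, and all sizes and names, are assumptions rather than details from the disclosure.

/* Sketch: look up an LBA in a small random-write L2P cache, then in a main
 * cached L2P region, before falling back to loading from flash. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define RANDOM_CACHE_SLOTS 4
#define MAIN_CACHE_LBAS    16      /* one contiguous cached L2P region */

typedef struct { bool valid; uint32_t lba; uint32_t ppa; } rand_slot_t;

static rand_slot_t random_cache[RANDOM_CACHE_SLOTS]; /* small, for random writes */
static uint32_t    main_cache[MAIN_CACHE_LBAS];      /* covers LBAs 0..15 here   */

static bool l2p_lookup(uint32_t lba, uint32_t *ppa_out)
{
    /* 1) small random-write cache first */
    for (int i = 0; i < RANDOM_CACHE_SLOTS; i++)
        if (random_cache[i].valid && random_cache[i].lba == lba) {
            *ppa_out = random_cache[i].ppa;
            return true;
        }
    /* 2) then the main cached L2P region */
    if (lba < MAIN_CACHE_LBAS) {
        *ppa_out = main_cache[lba];
        return true;
    }
    return false;  /* would require loading another L2P region from flash */
}

int main(void)
{
    random_cache[0] = (rand_slot_t){ true, 900, 5555 }; /* a recent random write */
    main_cache[3]   = 4242;
    uint32_t ppa;
    if (l2p_lookup(900, &ppa)) printf("LBA 900 -> PPA %u\n", ppa);
    if (l2p_lookup(3, &ppa))   printf("LBA 3   -> PPA %u\n", ppa);
    return 0;
}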

DATA PROCESSING APPARATUS AND METHOD FOR PERFORMING ADDRESS TRANSLATION

There is provided a data processing apparatus and method of data processing. The data processing apparatus comprises storage circuitry to store a hierarchy of page tables comprising an intermediate level page table. Each entry of the intermediate level page table comprises base address information of a next level page table and control information indicating whether an addressing function has been applied to reorder physical storage locations of entries of the next level page table. Address translation circuitry is provided to perform address translations in response to receipt of a virtual address by performing a lookup in a next level page table dependent on the base address information and a page table index from the virtual address. When the control information indicates that the addressing function has been applied, the lookup is performed at a modified storage location generated by applying the addressing function to the page table index.
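
The lookup logic can be sketched concisely. The fragment below assumes a simple XOR-based addressing function, a 16-entry next-level table, and a particular index field in the virtual address; all of these are illustrative choices rather than the apparatus described in the claims.

/* Sketch: an intermediate-level entry carries a base address plus a control
 * bit; when the bit says the next level was reordered, the page-table index
 * is passed through the addressing function before indexing. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define ENTRIES_PER_TABLE 16u   /* assumed entries per next-level page table */

typedef struct {
    uint64_t next_base;         /* base address information of next level      */
    bool     reordered;         /* control information: addressing fn applied  */
} intermediate_entry_t;

/* Assumed addressing function: permute the index (here, XOR with a constant). */
static uint32_t addressing_function(uint32_t index)
{
    return (index ^ 0x5u) & (ENTRIES_PER_TABLE - 1u);
}

/* Perform the lookup in the next-level table for one virtual address. */
static uint64_t next_level_entry_address(const intermediate_entry_t *e,
                                         uint64_t virtual_address)
{
    /* Page-table index taken from the virtual address (assumed bit field). */
    uint32_t index = (uint32_t)((virtual_address >> 12) & (ENTRIES_PER_TABLE - 1u));

    /* If entries were physically reordered, look up the modified location. */
    if (e->reordered)
        index = addressing_function(index);

    return e->next_base + (uint64_t)index * sizeof(uint64_t);
}

int main(void)
{
    intermediate_entry_t plain     = { .next_base = 0x80000, .reordered = false };
    intermediate_entry_t reordered = { .next_base = 0x90000, .reordered = true  };
    uint64_t va = 0x3000;   /* page-table index 3 */
    printf("plain:     0x%llx\n",
           (unsigned long long)next_level_entry_address(&plain, va));
    printf("reordered: 0x%llx\n",
           (unsigned long long)next_level_entry_address(&reordered, va));
    return 0;
}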

Hashing with Soft Memory Folding

In an embodiment, a system may support programmable hashing of address bits at a plurality of levels of granularity to map memory addresses to memory controllers and ultimately at least to memory devices. The hashing may be programmed to distribute pages of memory across the memory controllers, and consecutive blocks of the page may be mapped to physically distant memory controllers. In an embodiment, address bits may be dropped from each level of granularity, forming a compacted pipe address to save power within the memory controller. In an embodiment, a memory folding scheme may be employed to reduce the number of active memory devices and/or memory controllers in the system when the full complement of memory is not needed.
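
A speculative sketch of the hashing idea follows: a programmable mask selects which physical-address bits are XOR-reduced into each bit of the memory-controller number, and a consumed bit can then be dropped to form a shorter compacted address. The masks, bit widths, and compaction rule are assumptions chosen only for illustration.

/* Sketch: hash address bits to a memory-controller number, then drop a bit
 * to form a compacted address for use inside the controller. */
#include <stdint.h>
#include <stdio.h>

#define MC_SELECT_BITS 2               /* 4 memory controllers in this sketch */

/* One programmable XOR mask per controller-select bit. */
static const uint64_t hash_mask[MC_SELECT_BITS] = {
    0x0000000000055540ull,             /* bits folded into select bit 0 */
    0x00000000000AAA80ull,             /* bits folded into select bit 1 */
};

/* XOR-reduce (parity of) the address bits chosen by a mask. */
static unsigned xor_reduce(uint64_t addr, uint64_t mask)
{
    uint64_t v = addr & mask;
    unsigned parity = 0;
    while (v) { parity ^= 1u; v &= v - 1; }   /* count selected bits mod 2 */
    return parity;
}

/* Hash the address to a memory-controller number. */
static unsigned select_controller(uint64_t addr)
{
    unsigned mc = 0;
    for (int b = 0; b < MC_SELECT_BITS; b++)
        mc |= xor_reduce(addr, hash_mask[b]) << b;
    return mc;
}

/* Drop one address bit (assumed per-level compaction) to shorten the address. */
static uint64_t drop_bit(uint64_t addr, unsigned bit)
{
    uint64_t low  = addr & ((1ull << bit) - 1u);
    uint64_t high = addr >> (bit + 1u);
    return (high << bit) | low;
}

int main(void)
{
    uint64_t addr = 0x12345678ull;
    unsigned mc   = select_controller(addr);
    uint64_t pipe = drop_bit(addr, 6);     /* compacted pipe address */
    printf("addr 0x%llx -> controller %u, compacted 0x%llx\n",
           (unsigned long long)addr, mc, (unsigned long long)pipe);
    return 0;
}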