G06F12/0607

Systems for performing instructions for fast element unpacking into 2-dimensional registers

Disclosed embodiments relate to instructions for fast element unpacking. In one example, a processor includes fetch circuitry to fetch an instruction whose format includes fields to specify an opcode and locations of an Array-of-Structures (AOS) source matrix and one or more Structure of Arrays (SOA) destination matrices, wherein: the specified opcode calls for unpacking elements of the specified AOS source matrix into the specified Structure of Arrays (SOA) destination matrices, the AOS source matrix is to contain N structures each containing K elements of different types, with same-typed elements in consecutive structures separated by a stride, the SOA destination matrices together contain K segregated groups, each containing N same-typed elements, decode circuitry to decode the fetched instruction, and execution circuitry, responsive to the decoded instruction, to unpack each element of the specified AOS matrix into one of the K element types of the one or more SOA matrices.

Memory sub-system for decoding non-power-of-two addressable unit address boundaries

A system generating, using a first addressable unit address decoder, a first addressable unit address based on an input address, an interleaving factor, and a number of first addressable units. The system then generating, using an internal address decoder, an internal address based on the input address, the interleaving factor, and the number of first addressable units. Generating the internal address includes: determining a lower address value by extracting lower bits of the internal address, determining an upper address value by extracting upper bits of the internal address, and adding the lower address value to the upper address value to generate the internal address. Using an internal power-of-two address boundary decoder and the internal address, the system then generating a second addressable unit address, a third addressable unit address, a fourth addressable unit address, and a fifth addressable unit address.

Data storage apparatus and method for managing valid data based on bitmap table
11593006 · 2023-02-28 · ·

A data storage apparatus includes a storage including a plurality of planes, which include a plurality of pages, and a controller configured to control the storage to read data by grouping the plurality of pages as a page group in an interleaving unit, manage pages in which valid data are stored, among the plurality of pages, as a first bitmap table, and manage a second bitmap table generated by compressing the first bitmap table in a page group unit.

TECHNOLOGIES FOR DYNAMIC ACCELERATOR SELECTION
20230050698 · 2023-02-16 ·

Technologies for dynamic accelerator selection include a compute sled. The compute sled includes a network interface controller to communicate with a remote accelerator of an accelerator sled over a network, where the network interface controller includes a local accelerator and a compute engine. The compute engine is to obtain network telemetry data indicative of a level of bandwidth saturation of the network. The compute engine is also to determine whether to accelerate a function managed by the compute sled. The compute engine is further to determine, in response to a determination to accelerate the function, whether to offload the function to the remote accelerator of the accelerator sled based on the telemetry data. Also the compute engine is to assign, in response a determination not to offload the function to the remote accelerator, the function to the local accelerator of the network interface controller.

MEMORY SUB-SYSTEM FOR DECODING NON-POWER-OF-TWO ADDRESSABLE UNIT ADDRESS BOUNDARIES

A system generating, using a first addressable unit address decoder, a first addressable unit address based on an input address, an interleaving factor, and a number of first addressable units. The system then generating, using an internal address decoder, an internal address based on the input address, the interleaving factor, and the number of first addressable units. Generating the internal address includes: determining a lower address value by extracting lower bits of the internal address, determining an upper address value by extracting upper bits of the internal address, and adding the lower address value to the upper address value to generate the internal address. Using an internal power-of-two address boundary decoder and the internal address, the system then generating a second addressable unit address, a third addressable unit address, a fourth addressable unit address, and a fifth addressable unit address.

Apparatus and method for controlling input/output throughput of a memory system
11500720 · 2022-11-15 · ·

A memory system includes a memory device including a plurality of memory units capable of inputting or outputting data individually, and a controller coupled with the plurality of memory units via a plurality of data paths. The controller is configured to perform a correlation operation on two or more read requests among a plurality of read requests input from an external device, so that the plurality of memory units output plural pieces of data corresponding to the plurality of read requests via the plurality of data paths based on an interleaving manner. The controller is configured to determine whether to load map data associated with the plurality of read requests before a count of the plurality of read requests reaches a threshold, to divide the plurality of read request into two groups based on whether to load the map data, and to perform the correlation operation per group.

Multi-tile memory management mechanism

Graphics processors for implementing multi-tile memory management are disclosed. In one embodiment, a graphics processor includes a first graphics device having a local memory, a second graphics device having a local memory, and a graphics driver to provide a single virtual allocation with a common virtual address range to mirror a resource to each local memory of the first and second graphics devices.

MEMORY ACCESS THRESHOLD BASED MEMORY MANAGEMENT
20230040062 · 2023-02-09 ·

A method includes determining respective memory access counts of a plurality of blocks of non-volatile memory cells that are grouped into a plurality of respective groups, comparing the respective memory access counts to respective memory access thresholds, determining a respective memory access count of a block of non-volatile memory cells exceeds a respective memory access threshold, and performing a media scan operation on the block of non-volatile memory cells.

Memory system and SoC including linear address remapping logic
11573716 · 2023-02-07 · ·

A system-on-chip is connected to a first memory device and a second memory device. The system-on-chip comprises a memory controller configured to control an interleaving access operation on the first and second memory devices. A modem processor is configured to provide an address for accessing the first or second memory devices. A linear address remapping logic is configured to remap an address received from the modem processor and to provide the remapped address to the memory controller. The memory controller dorms a linear access operation on the first or second memory device in response to receiving the remapped address.

Memory devices and methods which may facilitate tensor memory access with memory maps based on memory operations

Examples described herein include systems and methods which include an apparatus comprising a memory array including a plurality of memory cells and a memory controller coupled to the memory array. The memory controller comprises a memory mapper configured to configure a memory map on the basis of a memory command associated with a memory access operation. The memory map comprises a specific sequence of memory access instructions to access at least one memory cell of the memory array. For example, the specific sequence of memory access instructions for a diagonal memory command comprises a sequence of memory access instructions that each access a memory cell along a diagonal of the memory array.