G06F13/1631

Scalable network-on-chip for high-bandwidth memory

Described herein are memory controllers for integrated circuits that implement a network-on-chip (NoC) to couple the processing cores of the integrated circuit to a memory device. The NoC may be dedicated to servicing the memory controller and may include one or more routers that manage access to the memory controller.

SIGNAL PROCESSOR AND SIGNAL PROCESSING SYSTEM
20240176752 · 2024-05-30

The present disclosure provides a signal processor (signal processing device). The signal processor includes: a first terminal, into which one of a clock signal and a data signal is input; a second terminal, into which the other of the clock signal and the data signal is input; a storage unit, in which an address map is set to define an address permitting data writing and an address prohibiting data writing; a designation unit, configured to designate the address map set in the storage unit; and a processing unit, configured to write data based on the data signal to the storage unit according to the address map designated by the designation unit. The designation unit may designate a first address map and a second address map different from the first address map.
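
As a rough illustration of the address-map gating described above, the following Python sketch models a storage unit whose writes are permitted or prohibited by whichever map is currently designated. The class name `AddressMappedStorage`, the map names, and the representation of a map as a set of writable addresses are all assumptions made for this example; the disclosure does not specify them.

```python
from typing import Dict, Optional, Set


class AddressMappedStorage:
    """Toy model: a storage unit gated by a designated address map (hypothetical)."""

    def __init__(self, address_maps: Dict[str, Set[int]]) -> None:
        # Each map lists the addresses that permit writing; all others are prohibited.
        self.address_maps = address_maps
        self.designated: Optional[str] = None
        self.cells: Dict[int, int] = {}

    def designate(self, name: str) -> None:
        """Play the role of the designation unit: select which map is in force."""
        if name not in self.address_maps:
            raise KeyError(f"unknown address map: {name}")
        self.designated = name

    def write(self, address: int, value: int) -> bool:
        """Play the role of the processing unit: write only if the map permits it."""
        if self.designated is None:
            return False
        if address not in self.address_maps[self.designated]:
            return False
        self.cells[address] = value
        return True


storage = AddressMappedStorage({"first": {0x00, 0x01}, "second": {0x01, 0x02}})
storage.designate("first")
accepted_first = storage.write(0x00, 0xAA)   # permitted by the first map
rejected_first = storage.write(0x02, 0xBB)   # prohibited by the first map
storage.designate("second")
accepted_second = storage.write(0x02, 0xCC)  # permitted by the second map
```

Switching the designated map changes which addresses accept writes without touching the stored data, which is the behavior the two-map arrangement appears intended to provide.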

Run-time memory access uniformity checking
10346055 · 2019-07-09

Systems, apparatuses, and methods for performing run-time checking of access uniformity of vector memory access instructions are disclosed. A system includes a vector unit, a scalar unit, and a memory. The system performs a run-time check to determine if two or more threads of a wave have access uniformity to the memory prior to executing a vector memory access instruction for the wave on the vector unit. The system replaces the vector memory access instruction with a group of instructions responsive to determining that two or more threads of the wave have access uniformity to the memory. The group of instructions includes a scalar access instruction to memory followed by a cross-thread data sharing instruction. The scalar access instruction is executed on the scalar unit. Alternatively, the group of instructions can include a vector memory access instruction by only a single thread in each group having access uniformity.
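
The run-time check can be sketched in Python as follows. This is a simplified model, not the patented mechanism: it assumes uniformity means every thread in the wave targets the same address, and it stands in for the scalar access and the cross-thread data sharing instruction with one counted load followed by a list broadcast. `CountingMemory` and `wave_load` are hypothetical names.

```python
from typing import List, Sequence


class CountingMemory:
    """Memory model that counts loads so the saving is observable (hypothetical)."""

    def __init__(self, data: Sequence[int]) -> None:
        self.data = list(data)
        self.reads = 0

    def load(self, address: int) -> int:
        self.reads += 1
        return self.data[address]


def wave_load(mem: CountingMemory, addresses: Sequence[int]) -> List[int]:
    """Return one loaded value per thread, using a single load when access is uniform."""
    # Run-time uniformity check (assumed here: all threads share one address).
    if addresses and all(a == addresses[0] for a in addresses):
        value = mem.load(addresses[0])    # one scalar access
        return [value] * len(addresses)   # broadcast, standing in for cross-thread sharing
    # Non-uniform case: fall back to one access per thread (the vector access).
    return [mem.load(a) for a in addresses]


memory = CountingMemory(range(100, 116))
uniform_result = wave_load(memory, [3, 3, 3, 3])
reads_after_uniform = memory.reads
divergent_result = wave_load(memory, [0, 1, 2, 3])
reads_after_divergent = memory.reads
```

In this model the uniform wave costs one memory access instead of four, while the divergent wave still issues one access per thread.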

Systems and methods for reducing write latency

A computer system having reduced write latency and methods for use in computer systems for reducing write latency are provided. Processing circuitry of the computer system is configured to execute a volume filter driver (VFD) that can be switched between a fast termination (FT) mode of operations and a normal, or quiescent, mode of operations. When the processing circuitry receives input/output (IO) write requests to write data to memory while the VFD is in the FT mode of operations, the VFD causes metadata associated with received IO write requests to be written to a volume of memory while preventing actual data associated with received IO write requests from being written to the volume, thereby resulting in extremely fast FT mode operation. After the file has been written to the volume, the VFD enters the quiescent mode of operations during which the VFD passes all IO write requests to the volume.

SCHEDULING MEMORY REQUESTS FOR A GANGED MEMORY DEVICE
20190196721 · 2019-06-27

Systems, apparatuses, and methods for performing efficient memory accesses for a computing system are disclosed. A computing system includes one or more clients for processing applications. A memory controller transfers traffic between the memory controller and two channels, each connected to a memory device. A client sends a 64-byte memory request with an indication specifying that there are two 32-byte requests targeting non-contiguous data within a same page. The memory controller generates two addresses, and sends a single command and the two addresses to two channels to simultaneously access non-contiguous data in a same page.

COMMAND SPLITTING FOR HIGH-COST DATA ACCESS OPERATIONS
20190163651 · 2019-05-30

A method for improving write throughput of a storage device includes receiving a data access command targeting an LBA extent and determining that logical execution of the data access command includes reading or writing data logically across an identified high-performance-cost boundary. Responsive to the determination, the data access command is split into two or more separate data access commands that are separately queued in memory for execution.
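
A minimal Python sketch of the splitting step, under stated assumptions: a command is modeled as a `(start_lba, length)` pair, high-performance-cost boundaries are given as a list of LBAs, and `split_command` is a hypothetical helper name. How boundaries are identified and how the resulting commands are scheduled are not modeled.

```python
from collections import deque
from typing import Deque, Iterable, List, Tuple

Extent = Tuple[int, int]  # (start_lba, length), an assumed representation


def split_command(start_lba: int, length: int, boundaries: Iterable[int]) -> List[Extent]:
    """Split the extent [start_lba, start_lba + length) at every boundary inside it."""
    end = start_lba + length
    # Only boundaries strictly inside the extent force a split.
    cuts = sorted(b for b in set(boundaries) if start_lba < b < end)
    pieces: List[Extent] = []
    cursor = start_lba
    for cut in cuts:
        pieces.append((cursor, cut - cursor))
        cursor = cut
    pieces.append((cursor, end - cursor))
    return pieces


# Each resulting command is queued separately for execution.
command_queue: Deque[Extent] = deque()
command_queue.extend(split_command(90, 20, boundaries=[100]))
```

Here a 20-block command starting at LBA 90 crosses the assumed boundary at LBA 100, so it is queued as two 10-block commands; a command that crosses no boundary is queued unchanged.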

TECHNIQUES FOR EFFICIENTLY HANDLING MISALIGNED SEQUENTIAL READS
20240192887 · 2024-06-13

Methods, systems, and devices for techniques for efficiently handling misaligned sequential reads are described. A memory system may include a memory device that includes multiple memory dies. The memory system may receive a first read command and a second read command from a host system. The first read command may be associated with a first set of physical addresses and the second read command may be associated with a second set of physical addresses. The memory system may determine, based on the first set of physical addresses and the second set of physical addresses, that the first read command and the second read command are for a same memory die of the multiple memory dies. The memory system may then transmit to the memory die a read request that indicates the first set of physical addresses and the second set of physical addresses.
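
The same-die determination can be sketched in Python as below. The mapping from physical address to die (`address // DIE_SIZE`), the value of `DIE_SIZE`, and the assumption that each read command's addresses fall on a single die are all invented for illustration; a real memory system would derive the die from its own address layout.

```python
from typing import List, Sequence, Tuple

DIE_SIZE = 1024  # assumed number of physical addresses per die

DieRequest = Tuple[int, List[int]]  # (die index, physical addresses to read)


def die_of(address: int) -> int:
    """Map a physical address to a die index under the assumed linear layout."""
    return address // DIE_SIZE


def build_die_requests(first: Sequence[int], second: Sequence[int]) -> List[DieRequest]:
    """Combine two read commands into one die request when they target the same die."""
    if not first or not second:
        raise ValueError("each read command needs at least one physical address")
    die_first, die_second = die_of(first[0]), die_of(second[0])
    if die_first == die_second:
        # One request carries both sets of physical addresses to the shared die.
        return [(die_first, list(first) + list(second))]
    # Different dies: keep the reads as separate requests.
    return [(die_first, list(first)), (die_second, list(second))]


same_die = build_die_requests([10, 11], [12, 13])
different_dies = build_die_requests([10, 11], [1030])
```

When both commands resolve to die 0, a single request is produced; when the second command resolves to die 1, the two reads stay separate.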

On-chip interconnect for memory channel controllers
12007913 · 2024-06-11

Methods, systems, and apparatus, including computer-readable media, are described for an integrated circuit that accelerates machine-learning computations. The circuit includes processor cores that each include: multiple channel controllers; an interface controller for coupling each channel controller to any memory channel of a system memory; and a fetch unit in each channel controller. Each fetch unit is configured to: receive channel data that encodes addressing information; obtain, based on the addressing information, data from any memory channel of the system memory using the interface controller; and write the obtained data to a vector memory of the processor core via the corresponding channel controller that includes the respective fetch unit.

Memory and method for operating a memory with interruptible command sequence

A memory device includes command logic supporting a command protocol that allows a first command sequence, such as a page write sequence, to be interrupted, and the device to proceed directly to receiving and decoding a second command sequence, such as a read sequence, without the latency associated with completing the first command sequence. Also, the command logic is configured to be responsive to a third command sequence after the second command sequence and its associated embedded operation have been completed, which completes the interrupted first command sequence and enables execution of an embedded operation identified by the first command sequence. A memory controller supporting such protocols is described.

Scalable Network-on-Chip for High-Bandwidth Memory
20190138493 · 2019-05-09

Described herein are memory controllers for integrated circuits that implement a network-on-chip (NoC) to couple the processing cores of the integrated circuit to a memory device. The NoC may be dedicated to servicing the memory controller and may include one or more routers that manage access to the memory controller.