G06F13/1657

PROCESSING ACCELERATOR ARCHITECTURES
20220035760 · 2022-02-03

A processing element/unit can include a plurality of networks, a plurality of cores, crossbar interconnects, a plurality of memory controllers, and local memory on an integrated circuit (IC) chip. The plurality of cores can be coupled together by the plurality of networks on the chip. The crossbar interconnects can couple the networks of cores to the plurality of memory controllers. The plurality of memory controllers can be configured to access data stored in off-chip memory. The local memory can be configured to cache portions of the accessed data. The local memory can be directly accessible by the network of processing cores, or can be distributed across the plurality of memory controllers. The memory controllers can be narrow-channel (NC) memory controllers having widths of 4, 8, 12, 16, or another multiple of 4 bits.
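
The narrow-channel width constraint above can be illustrated with a minimal sketch; the class and function names are invented for illustration and do not come from the patent:

```python
# Illustrative sketch: validate narrow-channel (NC) memory controller widths,
# which the abstract describes as 4, 8, 12, 16, or another multiple of 4 bits.

def is_valid_nc_width(width_bits: int) -> bool:
    """A width is acceptable if it is a positive multiple of 4 bits."""
    return width_bits > 0 and width_bits % 4 == 0

class NCMemoryController:
    """Hypothetical model of one narrow-channel controller on the IC."""
    def __init__(self, channel_id: int, width_bits: int):
        if not is_valid_nc_width(width_bits):
            raise ValueError(f"NC width must be a multiple of 4 bits, got {width_bits}")
        self.channel_id = channel_id
        self.width_bits = width_bits

# All of these widths satisfy the multiple-of-4 rule.
controllers = [NCMemoryController(i, w) for i, w in enumerate([4, 8, 12, 16, 32])]
```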

First boot with one memory channel

An embodiment of a semiconductor package apparatus may include technology to identify a partial set of populated memory channels from a full set of populated memory channels of a multi-channel memory system, and complete a first boot of an operating system with only the identified partial set of memory channels of the multi-channel memory system. Other embodiments are disclosed and claimed.
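
The partial-channel first-boot flow might be sketched as follows; the channel names, the one-channel choice, and the deferred-initialization step are assumptions for illustration, not details from the abstract:

```python
# Sketch of the first-boot flow: identify a partial set of the populated
# memory channels and boot the OS using only that subset.

def identify_partial_set(populated_channels, channels_for_first_boot=1):
    """Choose a subset of the populated channels for the first boot."""
    return populated_channels[:channels_for_first_boot]

def first_boot(populated_channels):
    partial = identify_partial_set(populated_channels)
    # Remaining channels could be initialized later, once the OS is up.
    deferred = [c for c in populated_channels if c not in partial]
    return {"boot_channels": partial, "deferred_channels": deferred}

result = first_boot(["ch0", "ch1", "ch2", "ch3"])
```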

DATA PROCESSING METHODS, APPARATUSES, ELECTRONIC DEVICES AND COMPUTER-READABLE STORAGE MEDIA
20220269622 · 2022-08-25

Embodiments of the present disclosure provide a data processing method, apparatus, electronic device, and computer-readable storage medium. The data processing method includes: receiving, by a processing core, a synchronization signal; determining, by the processing core, according to the synchronization signal, a first storage area used by a self-task of the processing core and a second storage area used by a non-self-task of the processing core, wherein the first storage area differs from the second storage area; and accessing, by the processing core, the first storage area to execute the self-task and the second storage area to execute the non-self-task. By separating the storage areas corresponding to different tasks of the processing core, this method avoids the complex data-consistency mechanisms and low processing efficiency caused by reading from and writing to the same storage area in existing technology.
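
A toy model of the area separation might look like this; representing the synchronization signal as a phase number and deriving area names from it are illustrative assumptions:

```python
# Sketch: on a synchronization signal, a core resolves two distinct storage
# areas, one for its self-task and one for a non-self-task.

class ProcessingCore:
    def __init__(self, core_id: int):
        self.core_id = core_id

    def on_sync(self, sync_signal: int):
        # The sync signal (modeled here as a phase number) selects the areas;
        # the two areas must differ, per the claimed method.
        first = f"core{self.core_id}_self_phase{sync_signal}"
        second = f"core{self.core_id}_other_phase{sync_signal}"
        return first, second

core = ProcessingCore(0)
self_area, other_area = core.on_sync(1)
```

Because each task type always resolves to its own area, no cross-task consistency protocol is needed for the shared phase.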

MEMORY MAT AS A REGISTER FILE
20220269645 · 2022-08-25

In some embodiments, an integrated circuit may include a substrate and a memory array disposed on the substrate, where the memory array includes a plurality of discrete memory banks. The integrated circuit may also include a processing array disposed on the substrate, where the processing array includes a plurality of processor subunits, each one of the plurality of processor subunits being associated with one or more discrete memory banks among the plurality of discrete memory banks. The integrated circuit may also include a controller configured to implement at least one security measure with respect to an operation of the integrated circuit and take one or more remedial actions if the at least one security measure is triggered.
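
One way to picture the controller's security measure is a bank-access check with a remedial action; the specific policy (subunits restricted to their associated banks) and the blocking remedy are invented for this sketch:

```python
# Sketch: a controller enforces one possible security measure (each processor
# subunit may only touch its associated memory banks) and takes a remedial
# action (blocking further operation) when the measure is triggered.

class SecurityController:
    def __init__(self, allowed_banks):
        # allowed_banks: subunit id -> set of bank ids it may access
        self.allowed_banks = allowed_banks
        self.blocked = False

    def check_access(self, subunit_id: int, bank_id: int) -> bool:
        if bank_id not in self.allowed_banks.get(subunit_id, set()):
            self.remedial_action()
            return False
        return True

    def remedial_action(self):
        self.blocked = True  # one possible remedy among several

ctrl = SecurityController({0: {0, 1}})
ok = ctrl.check_access(0, 3)  # bank 3 is outside subunit 0's set
```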

Scalable, parameterizable, and script-generatable buffer manager architecture
09767051 · 2017-09-19

A buffer manager is generated by executing a script with respect to a buffer architecture template and a configuration file specifying parameters for the buffer, such as the number, width, and depth of the memory banks and the client bridge FIFO depth. The script converts the buffer architecture template into a hardware description language (HDL) description of a buffer manager having those parameters. Client bridges accumulate requests for memory banks in a FIFO that is presented to the buffer manager once the client bridge is granted arbitration. Accesses of memory banks may be performed one at a time in consecutive clock cycles in a pipelined manner. Client bridges and the buffer manager may operate in different clock domains. The clock frequency of the buffer manager may be increased or decreased according to requests from client devices.
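
The generation step can be sketched as simple template substitution; the Verilog-style module header and the parameter names are invented stand-ins for the patent's template and configuration file:

```python
# Sketch: a script fills a buffer-architecture template with parameters from
# a configuration file, emitting an HDL description of the buffer manager.
from string import Template

config = {
    "num_banks": 4,
    "bank_width": 32,
    "bank_depth": 1024,
    "fifo_depth": 8,
}

template = Template(
    "module buffer_manager #(\n"
    "  parameter NUM_BANKS  = $num_banks,\n"
    "  parameter BANK_WIDTH = $bank_width,\n"
    "  parameter BANK_DEPTH = $bank_depth,\n"
    "  parameter FIFO_DEPTH = $fifo_depth\n"
    ");\n"
)

hdl = template.substitute(config)
```

Because the architecture is parameterized this way, the same template can yield differently sized buffer managers by editing only the configuration file.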

Data processing device and data processing method
09760507 · 2017-09-12

A data processing device includes a first sub-arbiter configured to arbitrate access to data stored in a memory by first and second masters; a second sub-arbiter configured to arbitrate access to the memory by a plurality of masters other than the first and second masters; a main arbiter configured to prioritize the access to the memory by the first sub-arbiter over the access to the memory by the second sub-arbiter; and a limiting unit configured to limit the amount of access to the memory by the second master to within a preset range.
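
A minimal sketch of the two-level arbitration, with the limiting unit capping the second master's grants; the queue representation and master names are assumptions for illustration:

```python
# Sketch: the main arbiter grants the first sub-arbiter's queue priority over
# the second sub-arbiter's, while a limiting unit caps how many accesses the
# second master may receive.

class LimitingUnit:
    """Limits the second master's access count to a preset range."""
    def __init__(self, limit: int):
        self.limit = limit
        self.count = 0

    def allow(self) -> bool:
        if self.count < self.limit:
            self.count += 1
            return True
        return False

def arbitrate(first_sub_queue, second_sub_queue, limiter):
    while first_sub_queue:
        master, op = first_sub_queue.pop(0)
        if master == "master2" and not limiter.allow():
            continue  # second master exceeded its access budget; skip it
        return (master, op)
    if second_sub_queue:
        return second_sub_queue.pop(0)
    return None

limiter = LimitingUnit(limit=1)
g1 = arbitrate([("master2", "read")], [("master3", "read")], limiter)
g2 = arbitrate([("master2", "write")], [("master3", "read")], limiter)
```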

METHOD AND SYSTEM FOR ASYNCHRONOUS MULTI-PLANE INDEPENDENT (AMPI) MEMORY READ OPERATION
20220238145 · 2022-07-28

A flash memory device includes a plurality of memory planes, each containing arrays of memory cells; a host interface for accessing the plurality of memory planes by an external host; and a controller connected to the plurality of memory planes via a memory interface and controlling the host interface for accessing the plurality of memory planes. The controller is configured to perform: receiving one or more commands on the host interface from the external host; determining whether to perform an asynchronous multi-plane independent (AMPI) read operation corresponding to the commands; and, after determining to start the AMPI read operation, accessing the memory planes in parallel according to the commands and completing the AMPI read operation in an order of the commands determined from an indicator signal provided to the controller that corresponds to the sequence of the commands received on the host interface.
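
The completion-ordering idea can be sketched as follows; modeling the indicator signal as a recorded command sequence, and the plane/data names, are illustrative assumptions:

```python
# Sketch: planes are read in parallel and may finish in any internal order,
# but results are returned in the command order given by the indicator
# sequence recorded as commands arrived on the host interface.

def complete_ampi_read(plane_results, command_order):
    """plane_results: plane -> data read from that plane.
    command_order: planes in the order their commands were received."""
    return [plane_results[p] for p in command_order]

# Internal completion order differs from command order...
results = {"plane2": b"B", "plane0": b"A", "plane1": b"C"}
# ...yet completion follows the received command sequence.
ordered = complete_ampi_read(results, ["plane0", "plane1", "plane2"])
```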

STORAGE DEVICE FOR TRANSMITTING DATA HAVING AN EMBEDDED COMMAND IN BOTH DIRECTIONS OF A SHARED CHANNEL, AND A METHOD OF OPERATING THE STORAGE DEVICE

A method of operating a storage device including first and second memory devices and a memory controller, all connected to a single channel, the method including: transmitting first data output from the first memory device to the memory controller through a data signal line in the single channel; and transmitting a command to the second memory device through the same data signal line while the memory controller receives the first data, wherein a voltage level of the data signal line is based on both the command and the first data loaded on the data signal line by the first memory device, such that the first data and the command are transmitted in both directions of the data signal line.
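
A very rough model of the shared-line idea, treating the line level as a pair carrying both components; the encoding is purely illustrative (the patent describes voltage levels, not tuples):

```python
# Sketch: while first-device data travels toward the controller, a command to
# the second device is simultaneously carried on the same data signal line.

def line_level(data_bit: int, command_bit: int):
    """Model the line level as jointly determined by the outgoing data bit
    and the incoming command bit (e.g., distinct voltage offsets)."""
    return (data_bit, command_bit)

def recover_data(levels):
    return [d for d, _ in levels]       # what the memory controller receives

def recover_command(levels):
    return [c for _, c in levels]       # what the second memory device sees

levels = [line_level(d, c) for d, c in zip([1, 0, 1], [0, 1, 1])]
```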

Cross-threaded memory system
11194749 · 2021-12-07

A multi-chip package includes a logic integrated circuit (IC) die formed with plural memory controller circuits, a first memory IC die and a second memory IC die. The second memory IC die is mounted to the first memory IC die. The first memory IC die and the logic IC die are mounted to one another. The logic IC die includes a serial link interface for coupling to multiple serial links. The first memory die includes a first memory group accessed by a first one of the plural memory controller circuits, and a second memory group accessed by a second one of the plural memory controller circuits.

Accelerator control device, accelerator control method, and recording medium with accelerator control program stored therein
11194618 · 2021-12-07

This accelerator control device includes: a decision unit which, from processing flow information representing a flow by which a task generated according to a program executed by an accelerator processes data, identifies temporary data among the data that is temporarily generated during execution of the program; a determination unit which, on the basis of the execution status of the task by the accelerator and the processing flow information, determines, for every task using the identified temporary data, whether or not execution of that task has been completed; and a deletion unit which deletes the identified data stored in a memory of the accelerator when execution of every task using it has been completed. Degradation of processing performance by the accelerator, which occurs when the size of the data to be processed by the accelerator is large, is thereby avoided.
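
The decide/determine/delete flow resembles reference counting over the task graph; the tracker class, task ids, and data names below are invented for this sketch:

```python
# Sketch: temporary data identified from the processing flow is deleted from
# accelerator memory once every task that uses it has completed.

class TempDataTracker:
    def __init__(self, temp_data_users):
        # temp_data_users: temporary-data name -> set of task ids using it
        # (the "decision unit" output, derived from processing flow info)
        self.pending = {d: set(tasks) for d, tasks in temp_data_users.items()}
        self.accelerator_memory = set(temp_data_users)

    def on_task_complete(self, task_id):
        # "Determination unit": update which tasks still need each datum;
        # "deletion unit": free the datum when no users remain.
        for data, users in self.pending.items():
            users.discard(task_id)
            if not users and data in self.accelerator_memory:
                self.accelerator_memory.remove(data)

tracker = TempDataTracker({"tmp0": {"t1", "t2"}})
tracker.on_task_complete("t1")
still_resident = "tmp0" in tracker.accelerator_memory  # t2 still needs it
tracker.on_task_complete("t2")
freed = "tmp0" not in tracker.accelerator_memory       # all users done
```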