G06F2213/2806

INPUT AND OUTPUT SPATIAL CROPPING OPERATIONS IN NEURAL PROCESSOR CIRCUITS
20240330217 · 2024-10-03 · ·

An SoC circuit includes a neural processor circuit coupled to a CPU. The neural processor circuit includes neural engines, a data processor DMA circuit, a system memory, and a data processor circuit. The CPU is configured to execute a compiler, which in turn is configured to decide to perform a mode of spatial cropping and to determine the associated crop offset. The neural processor circuit is configured to support arbitrary cropping in the x and y dimensions. The compiler is configured to generate task descriptor(s), which are distributed to components of the neural processor circuit. The data processor DMA circuit is configured to fetch and format data corresponding to the crop from a source to a buffer. The buffer is configured to realign the data according to the crop origin for broadcast to the neural engines. The neural engines are configured to perform a computation operation that uses the cropped data.
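
The cropping step described in this abstract can be illustrated in software. The following is a minimal sketch only, assuming a row-major 2D array and a crop given by an (x, y) offset plus a width and height; the function name `crop_2d` and its parameters are hypothetical and do not come from the patent, which describes hardware rather than this code.

```python
def crop_2d(data, x_off, y_off, width, height):
    """Return the height-by-width window of `data` whose origin is (x_off, y_off).

    Hypothetical software analogue of a spatial crop; not the patented circuit.
    """
    rows = len(data)
    cols = len(data[0]) if rows else 0
    if min(x_off, y_off, width, height) < 0:
        raise ValueError("offsets and sizes must be non-negative")
    if x_off + width > cols or y_off + height > rows:
        raise ValueError("crop window falls outside the source data")
    # Realign so that the crop origin becomes element (0, 0) of the result.
    return [row[x_off:x_off + width] for row in data[y_off:y_off + height]]


# A 4-row by 5-column source in which each value encodes its position as 10*y + x.
source = [[10 * y + x for x in range(5)] for y in range(4)]
cropped = crop_2d(source, x_off=1, y_off=2, width=3, height=2)
```

Here `cropped` holds the two rows starting at y = 2 and the three columns starting at x = 1, re-indexed from the crop origin.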

COMPUTER AND CONTROL METHOD FOR COMPUTER

By assigning a physically continuous memory area to a virtual storage apparatus operated on an OS, the performance of the virtual storage apparatus is secured. A processor operates an OS, and the processor executes a plurality of processes on the OS. The plurality of processes includes a first virtual storage apparatus. The first virtual storage apparatus executes an I/O process, and includes a cache for storing data that is subjected to the I/O process. The processor assigns a resource in a computer to the plurality of processes, and the processor creates area information that indicates physical addresses assigned to the processes in a memory. On the basis of the area information, the processor selects a continuous area, which is a physically continuous area, from the memory and assigns the continuous area to the cache.
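
Selecting a continuous area from the area information can be sketched as a first-fit search over the gaps between assigned ranges. This is an illustrative assumption, not the method claimed in the patent: the `(start, length)` tuple representation, the first-fit policy, and the name `find_continuous_area` are all hypothetical.

```python
def find_continuous_area(assigned, memory_size, needed):
    """Return the start of the first free physically continuous run of `needed`
    bytes, or None if no such run exists.

    `assigned` is a hypothetical form of the area information: a list of
    (start, length) physical ranges already given to processes.
    """
    cursor = 0  # lowest address not yet known to be assigned
    for start, length in sorted(assigned):
        if start - cursor >= needed:
            return cursor  # the gap before this range is large enough
        cursor = max(cursor, start + length)
    if memory_size - cursor >= needed:
        return cursor  # the tail of memory is large enough
    return None


area_info = [(0x0000, 0x1000), (0x3000, 0x1000), (0x1800, 0x0800)]
cache_base = find_continuous_area(area_info, memory_size=0x8000, needed=0x2000)
```

With these sample ranges no gap between assigned areas is large enough, so the search falls through to the free tail of memory.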

Optimized credit return mechanism for packet sends
09984020 · 2018-05-29 · ·

Method and apparatus for implementing an optimized credit return mechanism for packet sends. A Programmed Input/Output (PIO) send memory is partitioned into a plurality of send contexts, each comprising a memory buffer including a plurality of send blocks configured to store packet data. A storage scheme using FIFO semantics is implemented with each send block associated with a respective FIFO slot. In response to receiving packet data written to the send blocks and detecting the data in those send blocks has egressed from a send context, corresponding freed FIFO slots are detected, and a lowest slot for which credit return indicia has not been returned is determined. The highest slot in a sequence of freed slots from the lowest slot is then determined, and corresponding credit return indicia is returned. In one embodiment an absolute credit return count is implemented for each send context, with an associated absolute credit sent count tracked via software that writes to the PIO send memory, with the two absolute credit counts used for flow control.
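
The slot bookkeeping in this abstract can be modeled in a few lines. The sketch below is a toy software model under stated assumptions: one credit per send block, a circular FIFO, and unbounded integer counters. The class `SendContext`, its fields, and the helper `credits_available` are hypothetical names chosen for illustration and are not taken from the patent.

```python
class SendContext:
    """Toy model of one send context whose send blocks map 1:1 to FIFO slots."""

    def __init__(self, num_slots):
        self.freed = [False] * num_slots  # hypothetical per-slot "data has egressed" flags
        self.lowest_unreturned = 0        # lowest slot whose credit has not been returned
        self.credits_returned = 0         # absolute credit return count

    def mark_egressed(self, slot):
        self.freed[slot] = True

    def return_credits(self):
        """Return credits for the contiguous run of freed slots that starts at
        the lowest unreturned slot; slots freed out of order wait their turn."""
        n = len(self.freed)
        returned = 0
        while returned < n and self.freed[self.lowest_unreturned]:
            self.freed[self.lowest_unreturned] = False
            self.lowest_unreturned = (self.lowest_unreturned + 1) % n
            returned += 1
        self.credits_returned += returned
        return returned


def credits_available(num_slots, credits_sent, credits_returned):
    """Software-side flow control from the two absolute counts."""
    return num_slots - (credits_sent - credits_returned)


ctx = SendContext(num_slots=4)
ctx.mark_egressed(1)
first = ctx.return_credits()   # slot 0 is still busy, so nothing is returned yet
ctx.mark_egressed(0)
second = ctx.return_credits()  # slots 0 and 1 now form a freed run
```

Comparing the absolute sent and returned counts lets the sender compute free space without reading per-slot state.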

STORAGE APPARATUS ACCESSED BY USING MEMORY BUS
20180113826 · 2018-04-26 ·

A storage apparatus accessed by using a memory bus is disclosed. The apparatus includes an interface controller, a storage module, a storage controller, a command register, a status register, and a buffer. In addition, the interface controller can be electrically connected to a memory module interface of a computer system. The interface controller receives an access command for accessing the storage module sent by a CPU. The interface controller writes the access command into the command register, and records a current access status or result by using the status register. The storage controller performs status setting on the status register according to the access command in the command register, and performs a corresponding read/write operation on the storage module.

Methods and systems for direct memory access operations
09952979 · 2018-04-24 · ·

Systems and methods for a direct memory access (DMA) operation are provided. The method includes receiving a host memory address by a device coupled to a computing device; storing the host memory address at a device memory by a DMA engine; receiving a packet at the device for the computing device; instructing the DMA engine by a device processor to retrieve the host memory address from the device memory; retrieving the host memory address by the DMA engine without the device processor reading the host memory address; and transferring the packet to the computing device by a DMA operation.

OPTIMIZED CREDIT RETURN MECHANISM FOR PACKET SENDS
20180039593 · 2018-02-08 · ·

Method and apparatus for implementing an optimized credit return mechanism for packet sends. A Programmed Input/Output (PIO) send memory is partitioned into a plurality of send contexts, each comprising a memory buffer including a plurality of send blocks configured to store packet data. A storage scheme using FIFO semantics is implemented with each send block associated with a respective FIFO slot. In response to receiving packet data written to the send blocks and detecting the data in those send blocks has egressed from a send context, corresponding freed FIFO slots are detected, and a lowest slot for which credit return indicia has not been returned is determined. The highest slot in a sequence of freed slots from the lowest slot is then determined, and corresponding credit return indicia is returned. In one embodiment an absolute credit return count is implemented for each send context, with an associated absolute credit sent count tracked via software that writes to the PIO send memory, with the two absolute credit counts used for flow control.

Memory management for finite automata processing

Matching at least one regular expression pattern in an input stream may be optimized by initializing a search context in a run stack based on (i) partial match results determined from walking segments of a payload of a flow through a first finite automaton and (ii) a historical search context associated with the flow. The search context may be modified via push or pop operations to direct at least one processor to walk segments of the payload through at least one second finite automaton. The search context may be maintained in a manner that obviates overflow of the search context and stalling of the push or pop operations, thereby increasing match performance.

Peripheral component interconnect express (PCIe) device method for delaying command operations based on generated throughput analysis information

Provided are a Peripheral Component Interconnect Express (PCIe) device and a method of operating the same. The PCIe device may include a performance analyzer, a delay time information generator, and a command fetcher. The performance analyzer may measure throughputs of a plurality of functions, and generate throughput analysis information indicating a comparison result between the throughputs of the plurality of functions and throughput limits corresponding to the plurality of functions. The delay time information generator may generate a delay time for delaying a command fetch operation for each of the plurality of functions based on the throughput analysis information. The command fetcher may fetch a target command from a host based on a delay time of a function corresponding to the target command.
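
One way to picture the three components is as a small feedback loop. The abstract does not specify how the delay time is derived from the comparison result, so the step-up/step-down policy below is purely an assumption, as are the names `update_delay_us` and `fetch_time_us`, the microsecond units, and the sample numbers.

```python
def update_delay_us(throughput, limit, current_delay_us=0, step_us=10):
    """Hypothetical policy: lengthen the fetch delay while a function exceeds
    its throughput limit, and shorten it (never below zero) otherwise."""
    if throughput > limit:
        return current_delay_us + step_us
    return max(0, current_delay_us - step_us)


def fetch_time_us(ready_time_us, delay_us):
    """Earliest time at which a command for this function would be fetched."""
    return ready_time_us + delay_us


# Measured throughput and limit per function (illustrative values only).
measured = {"fn0": (120, 100), "fn1": (80, 100)}
delays = {name: update_delay_us(tp, lim) for name, (tp, lim) in measured.items()}
when = fetch_time_us(ready_time_us=1000, delay_us=delays["fn0"])
```

Under this policy only the function that exceeds its limit has its command fetch postponed, while the other function's commands are fetched as soon as they are ready.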

Optimized credit return mechanism for packet sends
09792235 · 2017-10-17 · ·

Method and apparatus for implementing an optimized credit return mechanism for packet sends. A Programmed Input/Output (PIO) send memory is partitioned into a plurality of send contexts, each comprising a memory buffer including a plurality of send blocks configured to store packet data. A storage scheme using FIFO semantics is implemented with each send block associated with a respective FIFO slot. In response to receiving packet data written to the send blocks and detecting the data in those send blocks has egressed from a send context, corresponding freed FIFO slots are detected, and a lowest slot for which credit return indicia has not been returned is determined. The highest slot in a sequence of freed slots from the lowest slot is then determined, and corresponding credit return indicia is returned. In one embodiment an absolute credit return count is implemented for each send context, with an associated absolute credit sent count tracked via software that writes to the PIO send memory, with the two absolute credit counts used for flow control.

Engine architecture for processing finite automata

An engine architecture for processing finite automata includes a hyper non-deterministic automata (HNA) processor specialized for non-deterministic finite automata (NFA) processing. The HNA processor includes a plurality of super-clusters and an HNA scheduler. Each super-cluster includes a plurality of clusters. Each cluster of the plurality of clusters includes a plurality of HNA processing units (HPUs). A corresponding plurality of HPUs of a corresponding plurality of clusters of at least one selected super-cluster is available as a resource pool of HPUs to the HNA scheduler for assignment of at least one HNA instruction to enable acceleration of a match of at least one regular expression pattern in an input stream received from a network.