G06F9/30069

METHOD AND APPARATUS FOR DETERMINING BINARY FUNCTION ENTRY

A method for determining a binary function entry includes distinguishing a text section and an exception handling section by parsing binary code; disassembling the text section to determine addresses of end branch instructions, direct call targets, and direct jump targets; determining an indirect return function call address from the addresses of the end branch instructions; determining an exception handling block address from the addresses of the end branch instructions; excluding the indirect return function call address and the exception handling block address from the addresses of the end branch instructions; and determining a tail call corresponding to the binary function entry from the addresses of the direct jump targets.
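The exclusion logic in this abstract can be summarized in a short sketch. The following Python is a hypothetical illustration, assuming a disassembler has already produced the address sets; all function and parameter names here are invented, not taken from the patent:

```python
# Hypothetical sketch of the entry-determination logic described above.
# Disassembly is assumed done elsewhere; the inputs are address sets.

def function_entries(endbr_addrs, direct_call_targets, direct_jump_targets,
                     indirect_return_call_addrs, exception_block_addrs):
    """Derive binary function entries from disassembled address sets.

    endbr_addrs: addresses of end branch (e.g. ENDBR64) instructions
    direct_call_targets: targets of direct call instructions
    direct_jump_targets: targets of direct jump instructions
    indirect_return_call_addrs: end-branch addresses reached only as
        indirect return function calls (not true entries)
    exception_block_addrs: end-branch addresses that begin exception
        handling blocks (not true entries)
    """
    # Exclude indirect-return call addresses and exception handling
    # block addresses from the end branch addresses.
    candidates = (set(endbr_addrs)
                  - set(indirect_return_call_addrs)
                  - set(exception_block_addrs))
    # Direct call targets are function entries by construction.
    entries = candidates | set(direct_call_targets)
    # A direct jump into a remaining candidate is treated as a tail
    # call, so its target is also a function entry.
    tail_call_entries = set(direct_jump_targets) & candidates
    return entries | tail_call_entries

entries = function_entries(
    endbr_addrs={0x1000, 0x1040, 0x1080},
    direct_call_targets={0x1000},
    direct_jump_targets={0x1080, 0x2000},
    indirect_return_call_addrs={0x1040},
    exception_block_addrs=set(),
)
# {0x1000, 0x1080}: 0x1040 is excluded, 0x1080 is a tail-call target
```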

Scatter and gather streaming data through a circular FIFO
12001365 · 2024-06-04

Systems, apparatuses, and methods for performing scatter and gather direct memory access (DMA) streaming through a circular buffer are described. A system includes a circular buffer, producer DMA engine, and consumer DMA engine. After the producer DMA engine writes or skips over a given data chunk of a first frame to the buffer, the producer DMA engine sends an updated write pointer to the consumer DMA engine indicating that a data credit has been committed to the buffer and that the data credit is ready to be consumed. After the consumer DMA engine reads or skips over the given data chunk of the first frame from the buffer, the consumer DMA engine sends an updated read pointer to the producer DMA engine indicating that the data credit has been consumed and that space has been freed up in the buffer to be reused by the producer DMA engine.
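The pointer-exchange protocol the abstract describes can be modeled compactly. Below is a minimal Python sketch, assuming pointers that count chunks monotonically so buffer occupancy is their difference; the class and function names are illustrative, not the patented design:

```python
# Hypothetical model of the write/read pointer exchange described above.
# One "data credit" is one chunk committed to the circular buffer.

class CircularFifo:
    def __init__(self, capacity_chunks):
        self.capacity = capacity_chunks
        self.write_ptr = 0   # chunks committed by the producer
        self.read_ptr = 0    # chunks consumed by the consumer

    def credits_available(self):
        # Chunks committed but not yet consumed.
        return self.write_ptr - self.read_ptr

    def space_available(self):
        # Chunks the producer may write or skip without overrunning.
        return self.capacity - self.credits_available()

def producer_step(fifo, chunk):
    """Write or skip one chunk, then publish the new write pointer."""
    if fifo.space_available() == 0:
        return False          # wait for the consumer to free space
    # ... DMA write of `chunk`, or skip it to leave a scatter hole ...
    fifo.write_ptr += 1       # commit one data credit
    return True               # updated write_ptr is now visible

def consumer_step(fifo):
    """Read or skip one chunk, then publish the new read pointer."""
    if fifo.credits_available() == 0:
        return False          # wait for the producer to commit a credit
    # ... DMA read of the chunk, or skip it to gather selectively ...
    fifo.read_ptr += 1        # free the chunk for producer reuse
    return True               # updated read_ptr is now visible

# Example: stream a 6-chunk frame through a 4-chunk buffer.
fifo = CircularFifo(capacity_chunks=4)
for chunk in range(6):
    while not producer_step(fifo, chunk):
        consumer_step(fifo)   # drain until space is freed
while consumer_step(fifo):
    pass
```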

Scatter and Gather Streaming Data through a Circular FIFO
20240264963 · 2024-08-08

Systems, apparatuses, and methods for performing scatter and gather direct memory access (DMA) streaming through a circular buffer are described. A system includes a circular buffer, producer DMA engine, and consumer DMA engine. After the producer DMA engine writes or skips over a given data chunk of a first frame to the buffer, the producer DMA engine sends an updated write pointer to the consumer DMA engine indicating that a data credit has been committed to the buffer and that the data credit is ready to be consumed. After the consumer DMA engine reads or skips over the given data chunk of the first frame from the buffer, the consumer DMA engine sends an updated read pointer to the producer DMA engine indicating that the data credit has been consumed and that space has been freed up in the buffer to be reused by the producer DMA engine.

Managing execution of continuous delivery pipelines for a cloud platform based data center

Computing systems, for example multi-tenant systems, deploy software artifacts in data centers created in a cloud platform using a cloud platform infrastructure language that is cloud platform independent. The system generates pipelines for deploying software artifacts in data center entities configured in a cloud platform. The system allows partial execution of pipelines such that a pipeline can be executed again to complete execution of the remaining stages. The system maintains the state of the pipeline execution and checks the state to determine whether a stage should be executed during subsequent executions. The system allows a failed stage to be retried multiple times based on a retry strategy. A retry strategy may depend on the data center entity in a hierarchy of data center entities of a data center.
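A minimal sketch of this resume-and-retry behavior, assuming stage state is persisted to a simple JSON file; the file name, state values, and flat retry count are assumptions (the abstract notes a retry strategy may instead depend on the data center entity):

```python
# Hypothetical sketch of partial pipeline execution with retries.
# Stage state is persisted so a re-run skips already-completed stages.

import json, os

STATE_FILE = "pipeline_state.json"   # assumed persistence location

def load_state():
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return json.load(f)
    return {}

def save_state(state):
    with open(STATE_FILE, "w") as f:
        json.dump(state, f)

def run_pipeline(stages, max_retries=3):
    """stages: list of (name, callable) pairs; callables raise on failure."""
    state = load_state()
    for name, action in stages:
        if state.get(name) == "completed":
            continue                      # skip on subsequent executions
        # Flat retry strategy; a real strategy may vary per entity.
        for _ in range(max_retries):
            try:
                action()
                state[name] = "completed"
                break
            except Exception:
                state[name] = "failed"
        save_state(state)                 # checkpoint after each stage
        if state[name] != "completed":
            return False                  # partial execution; rerun later
    return True
```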

NPU implemented for fusion-artificial neural network to process heterogeneous data provided by heterogeneous sensors
12077185 · 2024-09-03

A neural processing unit (NPU) includes a controller including a scheduler, the controller configured to receive from a compiler a machine code of an artificial neural network (ANN) including a fusion ANN, the machine code including data locality information of the fusion ANN, and receive heterogeneous sensor data from a plurality of sensors corresponding to the fusion ANN; at least one processing element configured to perform fusion operations of the fusion ANN including a convolution operation and at least one special function operation; a special function unit (SFU) configured to perform a special function operation of the fusion ANN; and an on-chip memory configured to store operation data of the fusion ANN, wherein the scheduler is configured to control the at least one processing element and the on-chip memory such that all operations of the fusion ANN are processed in a predetermined sequence according to the data locality information.
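A rough Python model of the dispatch loop the abstract describes, assuming the data locality information is an ordered list of op descriptors; the PE array, SFU, and descriptor fields are stand-ins, not the patented hardware:

```python
# Hypothetical sketch of the scheduling described above. The PE array,
# SFU, and op descriptors are placeholders for illustration only.

class PeArray:
    def convolve(self, inputs, weights):
        return ("conv", inputs, weights)       # placeholder result

class Sfu:
    def apply(self, kind, inputs):
        return (kind, inputs)                  # placeholder result

def schedule(machine_code, pe_array, sfu, on_chip_memory):
    """Dispatch every fusion-ANN op in the predetermined sequence
    given by the machine code's data locality information."""
    for op in machine_code["data_locality"]:
        inputs = [on_chip_memory[t] for t in op["inputs"]]
        if op["kind"] == "conv":               # processing element work
            result = pe_array.convolve(inputs, op["weights"])
        else:                                  # special function work
            result = sfu.apply(op["kind"], inputs)
        on_chip_memory[op["output"]] = result  # operands stay on-chip

# Example: camera and radar tensors processed, then fused by an add.
memory = {"camera": [1, 2], "radar": [3, 4]}
code = {"data_locality": [
    {"kind": "conv", "inputs": ["camera"], "weights": "w0", "output": "f0"},
    {"kind": "conv", "inputs": ["radar"], "weights": "w1", "output": "f1"},
    {"kind": "add", "inputs": ["f0", "f1"], "output": "fused"},
]}
schedule(code, PeArray(), Sfu(), memory)
print(memory["fused"])
```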

MICRO-OP FUSION FOR NON-ADJACENT INSTRUCTIONS
20180129498 · 2018-05-10

Method(s) for up/down fusion and/or pseudo-fusion of micro-operations are performed in a hardware processor configured to execute program code. A mergeable pair of micro-operations is identified in a sequence of micro-operations of the program code. The pair of micro-operations includes a first micro-operation for performing a first function and a non-consecutive second micro-operation for performing a second function. The first micro-operation precedes the second micro-operation in the sequence of micro-operations being processed. The first micro-operation is merged into the second micro-operation to create a third micro-operation which performs both the first function and the second function. In up/down fusion the third micro-operation is dispatched instead of the first micro-operation or instead of the second micro-operation, based on whether fuse-up or fuse-down is performed. In pseudo-fusion the first micro-operation is retained in the sequence of micro-operations and the second micro-operation is replaced with the third micro-operation.
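The three fusion modes can be illustrated on a plain list of micro-ops. This Python sketch reduces "mergeable" to explicit indices and is a hypothetical rendering, not the patented hardware mechanism:

```python
# Hypothetical sketch of the three fusion modes described above, applied
# to a list of micro-ops.

def fuse(uops, i, j, mode):
    """Merge uops[i] (first) into uops[j] (non-consecutive second).

    mode: 'up'     -> fused op replaces the first, second is dropped
          'down'   -> fused op replaces the second, first is dropped
          'pseudo' -> first is retained, second becomes the fused op
    """
    assert i < j, "first micro-op must precede the second"
    fused = ("fused", uops[i], uops[j])   # performs both functions
    out = list(uops)
    if mode == "up":
        out[i] = fused
        del out[j]
    elif mode == "down":
        out[j] = fused
        del out[i]
    else:                                  # pseudo-fusion
        out[j] = fused                     # first op stays in place
    return out

seq = ["load r1", "add r2", "store r1"]    # non-adjacent mergeable pair
print(fuse(seq, 0, 2, "pseudo"))
# ['load r1', 'add r2', ('fused', 'load r1', 'store r1')]
```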

MICRO-OP FUSION FOR NON-ADJACENT INSTRUCTIONS
20180129501 · 2018-05-10

Method(s) for up/down fusion and/or pseudo-fusion of micro-operations are performed in a hardware processor configured to execute program code. A mergeable pair of micro-operations is identified in a sequence of micro-operations of the program code. The pair of micro-operations includes a first micro-operation for performing a first function and a non-consecutive second micro-operation for performing a second function. The first micro-operation precedes the second micro-operation in the sequence of micro-operations being processed. The first micro-operation is merged into the second micro-operation to create a third micro-operation which performs both the first function and the second function. In up/down fusion the third micro-operation is dispatched instead of the first micro-operation or instead of the second micro-operation, based on whether fuse-up or fuse-down is performed. In pseudo-fusion the first micro-operation is retained in the sequence of micro-operations and the second micro-operation is replaced with the third micro-operation.

METHODS AND SYSTEM FOR IMPROVED PROCESSING OF SEQUENTIAL DATA IN A NEURAL NETWORK

Disclosed is a system that includes a processor configured to process data in a neural network and a memory associated with a primary flow path and at least one secondary flow path within the neural network. The primary flow path comprises one or more primary operators to process the data, and the at least one secondary flow path is configured to pass the data to a combining operator by skipping the processing of the data over the primary flow path. The processor is configured to provide the primary flow path and the at least one secondary flow path with a primary sequence of data and a secondary sequence of data, respectively, such that the secondary sequence of data is time offset from the processed primary sequence of data.
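A small Python sketch of the two flow paths, assuming an element-wise combining operator and a fixed time offset; the operators and sequences are placeholders, not the claimed network:

```python
# Hypothetical sketch of the two flow paths described above: the primary
# path processes the sequence; the secondary path skips processing and
# delivers a time-offset copy to the combining operator.

def primary_path(sequence, operators):
    """Process the primary sequence through the primary operators."""
    for op in operators:
        sequence = [op(x) for x in sequence]
    return sequence

def combine(processed, raw, offset):
    """Combining operator: sum each processed sample with the raw
    sample `offset` steps earlier. The secondary path does no
    processing; it only passes the time-offset data forward."""
    return [processed[t] + raw[t - offset]
            for t in range(offset, len(processed))]

data = [1.0, 2.0, 3.0, 4.0]
processed = primary_path(data, [lambda x: 2 * x])
print(combine(processed, data, offset=1))
# [4.0 + 1.0, 6.0 + 2.0, 8.0 + 3.0] -> [5.0, 8.0, 11.0]
```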

APPARATUS, NPU AND CHIPSET IMPLEMENTED FOR FUSION NEURAL NETWORK
20240367681 · 2024-11-07

A neural processing unit (NPU) includes a controller including a scheduler, the controller configured to receive from a compiler a machine code of an artificial neural network (ANN) including a fusion ANN, the machine code including data locality information of the fusion ANN, and receive heterogeneous sensor data from a plurality of sensors corresponding to the fusion ANN; at least one processing element configured to perform fusion operations of the fusion ANN including a convolution operation and at least one special function operation; a special function unit (SFU) configured to perform a special function operation of the fusion ANN; and an on-chip memory configured to store operation data of the fusion ANN, wherein the scheduler is configured to control the at least one processing element and the on-chip memory such that all operations of the fusion ANN are processed in a predetermined sequence according to the data locality information.

Hashing for deduplication through skipping selected data
12174806 · 2024-12-24

A system calculates a fingerprint across a data set by identifying a data set to hash, the data set comprising a set of data blocks; identifying data within the data set to skip; generating, by a hash engine, a hash for each data block in the set of data blocks except for the data to skip; and compressing the data.
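A minimal sketch of the described flow using Python's standard-library hashing and compression; the block size, skip-index representation, and SHA-256 choice are assumptions, not the patented hash engine:

```python
# Hypothetical sketch: hash every block except those marked to skip,
# then compress the data.

import hashlib, zlib

def fingerprint_and_compress(data, block_size, skip_blocks):
    """Return per-block hashes plus compressed data.

    skip_blocks: indices of blocks excluded from the fingerprint,
    e.g. volatile headers that would otherwise defeat deduplication.
    """
    hashes = []
    for i in range(0, len(data), block_size):
        if i // block_size in skip_blocks:
            continue                       # skipped data is not hashed
        block = data[i:i + block_size]
        hashes.append(hashlib.sha256(block).hexdigest())
    return hashes, zlib.compress(data)

payload = b"header-v1" + b"A" * 16 + b"B" * 16
hashes, packed = fingerprint_and_compress(payload, block_size=16,
                                          skip_blocks={0})
```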