Patent classifications
G06F8/445
Management of the untranslated to translated code steering logic in a dynamic binary translation based processor
A processor comprising an instruction execution circuit to execute a second code stored at a second address of a memory, wherein the second code is translated from a first code stored at a first address of the memory and a translation table (TT) controller coupled to a translation table to store a TT entry comprising a mapping between the first address and the second address and an attribute field comprising an attribute value associated with execution of the second code, wherein the TT controller is to monitor execution of the second code by the instruction execution circuit and update, based on a performance metric of the execution, the attribute value of the TT entry.
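As a rough illustration of the translation table described in this abstract, the following Python sketch models a TT entry as a mapping from a first address to a second address plus one attribute field. The class names, the saturating-counter attribute, and the cycle-count metric are assumptions made for illustration only; the abstract does not specify what the attribute holds or which performance metric is used.

```python
from dataclasses import dataclass


@dataclass
class TTEntry:
    first_addr: int     # address of the original (untranslated) code
    second_addr: int    # address of the translated code
    attribute: int = 0  # assumed attribute: a small saturating counter


class TranslationTable:
    """Toy software model of a translation table and its controller."""

    MAX_ATTR = 3  # assumed saturation limit for the attribute value

    def __init__(self):
        self.entries = {}

    def insert(self, first_addr, second_addr):
        self.entries[first_addr] = TTEntry(first_addr, second_addr)

    def lookup(self, first_addr):
        """Steer to the translated code if a mapping exists, else stay put."""
        entry = self.entries.get(first_addr)
        return entry.second_addr if entry else first_addr

    def record_execution(self, first_addr, cycles, baseline_cycles):
        """Update the attribute from a performance metric.

        The metric here (measured cycles against an assumed baseline)
        is hypothetical; it stands in for whatever the hardware monitors.
        """
        entry = self.entries[first_addr]
        if cycles < baseline_cycles:
            entry.attribute = min(entry.attribute + 1, self.MAX_ATTR)
        else:
            entry.attribute = max(entry.attribute - 1, 0)
        return entry.attribute
```

In this sketch, a lookup of an unmapped address simply returns that address, which mimics continuing with the untranslated code.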
Tuning of loop orders in blocked dense basic linear algebra subroutines
An example includes a sequence generator to generate a plurality of sequence pairs, a first one of the sequence pairs including: (i) a first input sequence representing first accesses to first tensors in a first loop nest of a first computer program, and (ii) a first output sequence representing a first tuned loop nest corresponding to the first accesses to the first tensors in the first loop nest; a model trainer to train a recurrent neural network based on the sequence pairs as training data, the recurrent neural network to be trained to tune loop ordering of a second computer program based on a second input sequence representing second accesses to a second tensor in a second loop nest of the second computer program; and a memory interface to store, in memory, a trained model corresponding to the recurrent neural network.
SYSTEMS AND METHODS OF AUDITING SERVER PARAMETERS IN A TELEPHONY NETWORK
The current disclosure relates to a system and method for auditing application parameters in a communication network. In particular, the method includes retrieving a set of golden parameter master lists from an auditing database, where each golden parameter master list corresponds to an application of the telecommunications network, and where each golden parameter master list comprises a set of parameter values for the respective application. The method also includes selecting a list of one or more applications for auditing, and for each application in the list: selecting a corresponding golden parameter master list from the set of golden parameter master lists and comparing each of the parameter values of the golden parameter master list to a corresponding set of parameter values of the application to generate a set of discrepancies.
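The comparison step in this abstract can be sketched in a few lines of Python. The function name, the dictionary layout, and the tuple format of a discrepancy are assumptions for illustration; the abstract does not describe how the auditing database or the application parameters are represented.

```python
def audit_applications(golden_lists, live_parameters, selected_apps):
    """Compare each selected application's parameters to its golden list.

    golden_lists:    {app_name: {parameter_name: expected_value}}
    live_parameters: {app_name: {parameter_name: actual_value}}
    Returns {app_name: [(parameter_name, expected, actual), ...]}.
    A parameter missing from the application is reported with actual=None.
    """
    discrepancies = {}
    for app in selected_apps:
        golden = golden_lists[app]
        actual = live_parameters.get(app, {})
        diffs = []
        for name, expected in golden.items():
            found = actual.get(name)
            if found != expected:
                diffs.append((name, expected, found))
        discrepancies[app] = diffs
    return discrepancies
```

An application whose parameters all match its golden list yields an empty discrepancy list in this sketch.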
COMPUTER-IMPLEMENTED METHOD AND A COMPUTER-READABLE MEDIUM
A computer-implemented method includes receiving a program code comprising a sequence of array instructions for at least one input array data structure storing multiple elements of a respective common data type, and function meta information, FMI. The FMI allows determining output size information for an output of each array instruction of the sequence of array instructions from input size information of the at least one input array data structure. The method includes receiving hardware information of a processing unit; compiling, based on the first program segment, the runtime size information and the hardware information, a first compute kernel which is executable on the processing unit; and executing the first compute kernel on the processing unit using the runtime instance of the at least one input array data structure as input.
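One way to picture the role of the function meta information is as a table of size rules that lets output sizes be derived without running the instructions. The Python sketch below assumes a hypothetical FMI dictionary with three made-up instruction names; none of these names or rules come from the abstract.

```python
def elementwise_size(shape):
    # Element-wise instructions keep the input shape.
    return shape


def transpose_size(shape):
    # Transposition reverses the order of the dimensions.
    return tuple(reversed(shape))


def sum_axis0_size(shape):
    # Reducing along the first axis drops that dimension.
    return shape[1:]


# Hypothetical FMI: instruction name -> rule mapping input size to output size.
FMI = {
    "abs": elementwise_size,
    "transpose": transpose_size,
    "sum_axis0": sum_axis0_size,
}


def infer_output_sizes(instructions, input_shape):
    """Propagate size information through a sequence of array instructions."""
    sizes = []
    shape = tuple(input_shape)
    for name in instructions:
        shape = FMI[name](shape)
        sizes.append(shape)
    return sizes
```

With sizes known ahead of execution, a compiler could in principle specialize a kernel for them, which is the part of the method this sketch leaves out.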
Method of using multidimensional blockification to optimize computer program and device thereof
Disclosed embodiments relate to a method and device for optimizing compilation of source code. The proposed method receives a first intermediate representation code of a source code and analyses each basic block instruction of a plurality of basic block instructions contained in the first intermediate representation code for blockification. To blockify identical instructions, one or more groups of basic block instructions are assessed for eligibility for blockification. Upon being determined eligible, a group of basic block instructions is blockified using either one-dimensional or two-dimensional SIMD vectorization. The method further generates a second intermediate representation of the source code, which is translated to executable target code with more efficient processing capacity.
Large lookup tables for an image processor
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for supporting large lookup tables on an image processor. One of the methods includes receiving an input kernel program for an image processor having a two-dimensional array of execution lanes, a shift-register array, and a plurality of memory banks. If the kernel program has an instruction that reads a lookup table value for a lookup table partitioned across the plurality of memory banks, the instruction in the kernel program is replaced with a sequence of instructions that, when executed by an execution lane, causes the execution lane to read a first value from a local memory bank and a second value from the local memory bank on behalf of another execution lane belonging to a different group of execution lanes.
METHOD OF OPTIMIZING REGISTER MEMORY ALLOCATION FOR VECTOR INSTRUCTIONS AND A SYSTEM THEREOF
The present disclosure relates to a system and a method of optimizing register allocation by a processor. The method comprises receiving an intermediate representation (IR) code of a source code and initializing a single instruction multiple data (SIMD) width for the IR code. The method comprises analyzing each basic block of the IR code to determine one or more instructions of the IR code as vector instructions, wherein each basic block is one of LOAD, STORE and arithmetic logical and multiply (ALM) instructions. The method comprises dynamically setting the SIMD width for each of the vector instructions.
Allocating variables to computer memory
A method of allocating variables to computer memory includes determining at compile time when each of a plurality of variables is live in a memory region and allocating a memory region to each variable, wherein at least some variables are allocated at compile time to overlapping memory regions, to be stored in those memory regions at runtime at non-overlapping times.
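The idea of sharing memory between variables that are never live at the same time can be sketched with a simple first-fit allocator over live ranges. The function below is an illustrative assumption, not the allocation strategy claimed in the patent: it takes hypothetical `(name, size, start, end)` tuples with half-open live ranges and assigns each variable the lowest offset that does not collide with any variable live at an overlapping time.

```python
def allocate(variables):
    """Assign memory offsets so that only time-disjoint variables share space.

    variables: list of (name, size, start, end); live range is [start, end).
    Returns {name: offset}.
    """
    placed = []   # (offset, size, start, end) of variables already allocated
    offsets = {}
    for name, size, start, end in sorted(variables, key=lambda v: v[2]):
        # Memory ranges held by variables whose live ranges overlap this one.
        busy = sorted(
            (o, o + s) for o, s, st, en in placed if st < end and start < en
        )
        offset = 0
        for lo, hi in busy:
            if offset + size <= lo:
                break  # the gap before this busy range is large enough
            offset = max(offset, hi)
        offsets[name] = offset
        placed.append((offset, size, start, end))
    return offsets
```

For three 4-byte variables live over [0, 2), [1, 3) and [2, 4), the first and third end up at the same offset because their live ranges do not overlap, while the second is placed above them.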
Compiler for translating between a virtual image processor instruction set architecture (ISA) and target hardware having a two-dimensional shift array structure
A method is described that includes translating higher level program code including higher level instructions having an instruction format that identifies pixels to be accessed from a memory with first and second coordinates from an orthogonal coordinate system into lower level instructions that target a hardware architecture having an array of execution lanes and a shift register array structure that is able to shift data along two different axes. The translating includes replacing the higher level instructions having the instruction format with lower level shift instructions that shift data within the shift register array structure.
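The translation from coordinate-based pixel access to shift instructions can be pictured with a small software model. In the Python sketch below, the instruction names `SHIFT_X` and `SHIFT_Y`, the zero fill at the borders, and the unit-step lowering are all assumptions for illustration; the abstract does not describe the actual instruction encoding.

```python
def shift(plane, dx, dy, fill=0):
    """Shift a 2D plane so lane (x, y) receives the value from (x+dx, y+dy)."""
    h, w = len(plane), len(plane[0])
    return [
        [
            plane[y + dy][x + dx] if 0 <= y + dy < h and 0 <= x + dx < w else fill
            for x in range(w)
        ]
        for y in range(h)
    ]


def lower_load(dx, dy):
    """Lower 'read pixel at relative (dx, dy)' into unit shift instructions."""
    ops = [("SHIFT_X", 1 if dx > 0 else -1)] * abs(dx)
    ops += [("SHIFT_Y", 1 if dy > 0 else -1)] * abs(dy)
    return ops


def run(plane, ops):
    """Apply a list of shift instructions to the whole plane."""
    for op, step in ops:
        plane = shift(plane, step, 0) if op == "SHIFT_X" else shift(plane, 0, step)
    return plane
```

After running the lowered instructions, every lane holds the pixel it asked for at its own position, so no lane needs to address memory by coordinate.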
Allocating Variables to Computer Memory
A method of allocating variables to computer memory includes determining at compile time when each of a plurality of variables is live in a memory region and allocating a memory region to each variable, wherein at least some variables are allocated at compile time to overlapping memory regions, to be stored in those memory regions at runtime at non-overlapping times.