Patent classifications
G06F8/452
OFFLOADING SERVER AND OFFLOADING PROGRAM
An offloading server includes: a data transfer designation section configured to analyze reference relationships of variables used in loop statements in an application and designate, for data that can be transferred outside a loop, a data transfer using an explicit directive that explicitly specifies a data transfer outside the loop; a parallel processing designation section configured to identify loop statements in the application and specify a directive specifying application of parallel processing by an accelerator and perform compilation for each of the loop statements; and a parallel processing pattern creation section configured to exclude loop statements causing a compilation error from loop statements to be offloaded and create a plurality of parallel processing patterns each of which specifies whether to perform parallel processing for each of the loop statements not causing a compilation error.
Method and system for converting a single-threaded software program into an application-specific supercomputer
The invention comprises (i) a compilation method for automatically converting a single-threaded software program into an application-specific supercomputer, and (ii) the supercomputer system structure generated as a result of applying this method. The compilation method comprises: (a) Converting an arbitrary code fragment from the application into customized hardware whose execution is functionally equivalent to the software execution of the code fragment; and (b) Generating interfaces on the hardware and software parts of the application, which (i) Perform a software-to-hardware program state transfer at the entries of the code fragment; (ii) Perform a hardware-to-software program state transfer at the exits of the code fragment; and (iii) Maintain memory coherence between the software and hardware memories. If the resulting hardware design is large, it is divided into partitions such that each partition can fit into a single chip. Then, a single union chip is created which can realize any of the partitions.
COMPILER PROGRAM, COMPILING METHOD, INFORMATION PROCESSING DEVICE
A compiler program causes a computer to execute optimization processing for an optimization target program. The optimization target program includes a loop including a vector store instruction and a vector load instruction for an array variable. The optimization processing includes (1) unrolling the vector store instruction and the vector load instruction in the loop by an unrolling number of times to generate a plurality of unrolled vector store instructions and a plurality of unrolled vector load instructions, and (2) scheduling to move an unrolled vector load instruction among the plurality of unrolled vector load instructions, which is located after a first unrolled vector store instruction that is located at first among the plurality of unrolled vector load instructions, before the first unrolled vector store instruction.
Anti-Congestion Flow Control for Reconfigurable Processors
A compiler configured to configure memory nodes with a ready-to-read credit counter and a write credit counter. The ready-to-read credit counter of a particular upstream memory node initialized with as many read credits as a buffer depth of a corresponding downstream memory node. The ready-to-read credit counter configured to decrement when a buffer data unit is written by the particular upstream memory node into the corresponding downstream memory node, and to increment when the particular upstream memory node receives from the corresponding downstream memory node a read ready token. The write credit counter of the particular upstream memory node initialized with one or more write credits and configured to decrement when the particular upstream memory node begins writing the buffer data unit into the corresponding downstream memory node, and to increment when the particular upstream memory node receives from the corresponding downstream memory node a write done token.
TRAINING A NEURAL NETWORK USING A NON-HOMOGENOUS SET OF RECONFIGURABLE PROCESSORS
A system for training parameters of a neural network includes a processing node with a processor reconfigurable at a first level of configuration granularity and a controller reconfigurable at a finer level of configuration granularity. The processor is configured to execute a first dataflow segment of the neural network with training data to generate a predicted output value using a set of neural network parameters, calculate a first intermediate result for a parameter based on the predicted output value, a target output value, and a parameter gradient, and provide the first intermediate result to the controller. The controller is configured to receive a second intermediate result over a network, and execute a second dataflow segment, dependent upon the first intermediate result and the second intermediate result, to generate a third intermediate result indicative of an update of the parameter.
Automatic compiler dataflow optimization to enable pipelining of loops with local storage requirements
Systems, apparatuses and methods may provide for technology that detects one or more local variables in source code, wherein the local variable(s) lack dependencies across iterations of a loop in the source code, automatically generate pipeline execution code for the local variable(s), and incorporate the pipeline execution code into an output of a compiler. In one example, the pipeline execution code includes an initialization of a pool of buffer storage for the local variable(s).
Partial data type promotion to exploit efficient vectorization in microprocessors
Aspects of the invention include a compiler detecting an expression in a loop that includes elements of mixed data types. The compiler then promotes elements of a sub-expression of the expression to a same intermediate data type. The compiler then calculates the sub-expression using the elements of the same intermediate data type.
Non-transitory computer-readable recording medium, compilation method, and compiler device
The present disclosure relates to a compiler for causing a computer to execute a process. The process includes generating a first program, wherein the first program includes a first code that determines whether a first area of a memory that a process inside a loop included in a second program refers to in a first execution time of the loop is in duplicate with a second area of the memory that the process refers to in a second execution time of the loop, a second code that executes the process in an order of the first and second execution times when it is determined that the first and the second areas are duplicate, and a third code that executes the process for the first execution time and the process for the second execution time in parallel when it is determined that the first and the second areas are not duplicate.
METHOD FOR COMPILING FROM A HIGH-LEVEL SCRIPTING LANGUAGE TO A BLOCKCHAIN NATIVE SCRIPTING LANGUAGE
The invention provides methods and systems which enable additional functionality to be inserted into blockchain scripts with ease and in an effective and manner. According to one embodiment, the invention provides a blockchain-implemented method comprising the steps of arranging a plurality or selection of scripting language primitives to provide, upon execution, the functionality of a high-level scripting language primitive, wherein the scripting language is associated with a blockchain protocol; inserting the plurality of scripting language primitives at least once into a script; and inserting the script into blockchain transaction (Tx). The high-level scripting language primitive may perform, for example, an arithmetic operation such as multiplication or division. The scripting language primitives may be called op-codes, words or commands, and are native to the scripting language. The scripting language may be Script, and the blockchain protocol may be a version of the Bitcoin protocol.
Method for controlling the flow execution of a generated script of a blockchain transaction
The invention provides a computer-implemented method (and corresponding system) for generating a blockchain transaction (Tx). This may be a transaction for the Bitcoin blockchain or another blockchain protocol. The method comprises the step of using a software resource to receive, generate or otherwise derive at least one data item; and then insert, at least once, at least one portion of code into a script associated the transaction. Upon execution of the script, the portion of code provides the functionality of a control flow mechanism, the behaviour of the control flow mechanism being controlled or influenced by the at least one data item. In one embodiment, the code is copied/inserted into the script more than once. The control flow mechanism can be a loop, such as a while or for loop, or a selection control mechanism such as a switch statement. Thus, the invention allows the generation of a more complex blockchain script and controls how the script will execute when implemented on the blockchain. This, in turns provides control over how or when the output of the blockchain transaction is unlocked.