Patent classifications
G06F8/452
Program, information conversion device, and information conversion method
An information conversion device has one of: a replication necessity analysis unit for specifying where an instruction referred to by phi functions is present in one basic block and inserting a transfer instruction therein; an intra-loop constant analysis unit for specifying a closed path in which a phi function reference is circulated and inserting the transfer instruction therein; an inter-instruction dependency analysis unit for specifying where data dependency is present between instructions as a reference destination of the phi functions and inserting the transfer instruction therein; a same instruction reference analysis unit for specifying where the phi functions referring to a result of a same instruction before branching are present and inserting the transfer instruction therein; and a spill out validity analysis unit for storing a value present in a loop processing, loading the value after the loop processing ends, and deleting the transfer instruction.
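For orientation, the baseline that these analysis units refine can be sketched as follows: a phi function is removed by placing a transfer (copy) instruction at the end of each predecessor block. This is a minimal illustration, not the patented method. The tuple-based instruction format, the block names, and the function name lower_phis are all assumptions made for this sketch, and blocks are assumed to carry no explicit branch instruction at their end.

```python
def lower_phis(blocks):
    """Replace phi functions with copies in predecessor blocks (naive scheme).

    blocks maps a block name to a list of instructions. An instruction is
    either ("phi", dest, {pred_name: source}) or ("op", text).
    This format is hypothetical and exists only for this sketch.
    """
    # Start from a copy of every block with its phi functions removed.
    lowered = {
        name: [ins for ins in body if ins[0] != "phi"]
        for name, body in blocks.items()
    }
    # For each phi, append one transfer instruction per incoming edge.
    for body in blocks.values():
        for ins in body:
            if ins[0] == "phi":
                _, dest, sources = ins
                for pred, src in sources.items():
                    lowered[pred].append(("copy", dest, src))
    return lowered


example = {
    "then": [("op", "x1 = 1")],
    "else": [("op", "x2 = 2")],
    "join": [("phi", "x3", {"then": "x1", "else": "x2"}), ("op", "use x3")],
}
lowered_example = lower_phis(example)
```

Naive copy insertion of this kind is known to be unsafe when phi functions in the same block depend on one another, which is one reason a compiler needs analyses to decide where transfer instructions are actually required.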
SYSTEM FOR CO-ORDINATION OF LOGICAL SEQUENCE OF INSTRUCTIONS ACROSS ELECTRONIC DEVICES USING VISUAL PROGRAMMING AND WIRELESS COMMUNICATION
An orchestration engine provides a technical output across multiple programmable objects such as electronic devices, virtual objects and cloud based services in response to user specified logic. The orchestration engine may be deployed on a mobile computer, a tablet computer, a laptop computer, a desktop computer, a wired or wireless electronic device in the system or on a server computer connected via the internet. The orchestration engine is capable of supporting extensibility in order to expand support for similar common interaction methods to newer electronic devices via a plug-in framework by specifying the communication protocol of the new element and its capabilities in a descriptive way via a markup language. The orchestration engine is provided along with a library of drag and drop Visual Programming Language steps required for providing executable computer program steps, so that user specified logic can be specified by a person who is not literate in computer languages.
FLOW CONTROL FOR RECONFIGURABLE PROCESSORS
The technology disclosed relates to storing a dataflow graph with a plurality of compute nodes that transmit data along data connections, and controlling data transmission between compute nodes in the plurality of compute nodes along the data connections by using control connections to control writing of data.
Operation Fusion in Nested Meta-pipeline Loops
A method for improving throughput in a reconfigurable computing system includes detecting, in an algebraic representation of a computing task for a reconfigurable dataflow processor, an outer meta-pipeline loop, detecting an inner meta-pipeline loop nested within the outer meta-pipeline loop, and determining that the inner meta-pipeline loop and the outer meta-pipeline loop each conduct a common operation. The method also includes fusing the common operation for the inner meta-pipeline loop and the outer meta-pipeline loop into a single operation within the inner meta-pipeline loop. The instances of the common operation may be fused if the output of a first instance of the common operation is the source for a second instance of the common operation. Examples of the common operation include an accumulator operation, a re-read operation, and a temporal (chip buffer synchronized) operation such as a temporal concatenation operation and a temporal slicing operation.
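The accumulator case can be illustrated with ordinary nested loops. This is a loose sketch only: plain Python loops stand in for meta-pipeline loops, and the function names sum_unfused and sum_fused are hypothetical. In the unfused form the output of the inner accumulator is the source for the outer accumulator; the fused form keeps a single accumulator inside the inner loop.

```python
def sum_unfused(rows):
    """Two accumulators: one per loop level."""
    total = 0
    for row in rows:              # outer loop
        partial = 0
        for value in row:         # inner loop: first accumulator
            partial += value
        total += partial          # outer loop: second accumulator reads the first
    return total


def sum_fused(rows):
    """One accumulator, placed in the inner loop."""
    total = 0
    for row in rows:
        for value in row:
            total += value        # single fused accumulation
    return total


rows = [[1, 2, 3], [4, 5], []]
```

With integer data both functions return the same value. With floating-point data the fused form changes the order of additions, so results may differ in the last bits.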
Method and system for converting a single-threaded software program into an application-specific supercomputer
The invention comprises (i) a compilation method for automatically converting a single-threaded software program into an application-specific supercomputer, and (ii) the supercomputer system structure generated as a result of applying this method. The compilation method comprises: (a) Converting an arbitrary code fragment from the application into customized hardware whose execution is functionally equivalent to the software execution of the code fragment; and (b) Generating interfaces on the hardware and software parts of the application, which (i) Perform a software-to-hardware program state transfer at the entries of the code fragment; (ii) Perform a hardware-to-software program state transfer at the exits of the code fragment; and (iii) Maintain memory coherence between the software and hardware memories. If the resulting hardware design is large, it is divided into partitions such that each partition can fit into a single chip. Then, a single union chip is created which can realize any of the partitions.
Methods and devices for computing a memory size for software optimization
There are provided methods and devices for computing a tile size for software optimization. A method includes receiving, by a computing device, information indicative of one or more of a set of loop bounds and a set of data shapes; processing, by the computing device, the information to determine a computation configuration, the computation configuration implementable by a compiler, said processing including evaluating at least the computation configuration based on a build cost model, the build cost model representative of a data transfer cost and a data efficiency of the computation configuration; and transmitting, by the computing device, instructions directing the compiler to implement the computation configuration.
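A toy version of such a selection might look like the sketch below. Everything in it is an assumption made for illustration: the function name choose_tile, the single loop bound, the fixed per-transfer overhead, and the particular cost formula (transfer cost divided by the fraction of transferred elements that are useful). The abstract does not disclose the actual cost model.

```python
import math


def choose_tile(bound, candidates, buffer_elems, transfer_overhead=64):
    """Pick the candidate tile size with the lowest modelled cost.

    bound: loop trip count to be tiled.
    candidates: tile sizes to evaluate.
    buffer_elems: capacity of the local buffer, in elements.
    transfer_overhead: hypothetical fixed cost per data transfer.
    """
    best_tile, best_cost = None, math.inf
    for tile in candidates:
        if tile > buffer_elems:
            continue  # tile does not fit in the buffer
        n_tiles = math.ceil(bound / tile)
        # Data transfer cost: fixed overhead plus payload for every tile.
        transfer_cost = n_tiles * (transfer_overhead + tile)
        # Data efficiency: useful elements over elements actually moved.
        efficiency = bound / (n_tiles * tile)
        cost = transfer_cost / efficiency
        if cost < best_cost:
            best_tile, best_cost = tile, cost
    return best_tile


chosen = choose_tile(bound=100, candidates=[8, 16, 32, 64], buffer_elems=32)
```

Under this model larger tiles amortise the per-transfer overhead, while tiles that do not divide the loop bound evenly are penalised for the padding they move.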
Method for compiling from a high-level scripting language to a blockchain native scripting language
The invention provides methods and systems which enable additional functionality to be inserted into blockchain scripts with ease and in an effective manner. According to one embodiment, the invention provides a blockchain-implemented method comprising the steps of arranging a plurality or selection of scripting language primitives to provide, upon execution, the functionality of a high-level scripting language primitive, wherein the scripting language is associated with a blockchain protocol; inserting the plurality of scripting language primitives at least once into a script; and inserting the script into a blockchain transaction (Tx). The high-level scripting language primitive may perform, for example, an arithmetic operation such as multiplication or division. The scripting language primitives may be called op-codes, words or commands, and are native to the scripting language. The scripting language may be Script, and the blockchain protocol may be a version of the Bitcoin protocol.
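As a rough illustration of the idea, multiplication by a constant known at compile time can be expressed with only duplicate and add primitives. The sketch below is an assumption-laden toy, not the patented compiler: expand_mul_const and run are hypothetical names, and run is a minimal stack interpreter written for this example, not an implementation of Bitcoin Script. The op-code names OP_DUP and OP_ADD are borrowed from Script.

```python
def expand_mul_const(k):
    """Return op-codes that multiply the top stack item by the constant k >= 1."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # k - 1 duplicates leave k copies of x; k - 1 additions sum them to k * x.
    return ["OP_DUP"] * (k - 1) + ["OP_ADD"] * (k - 1)


def run(script):
    """Toy stack machine: integers are pushed, op-codes act on the stack."""
    stack = []
    for item in script:
        if item == "OP_DUP":
            stack.append(stack[-1])
        elif item == "OP_ADD":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        else:
            stack.append(int(item))  # anything else is treated as a number push
    return stack


result = run([7] + expand_mul_const(3))
```

The expansion is inserted into the script wherever the high-level multiplication appears, so the final script contains only native primitives.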
High performance processor
Implementations relate to a data processor that includes a data processing unit having a plurality of processing elements and a cache hierarchy including a plurality of levels of data caches. The data caches include a first level data cache connected to a second level data cache, and a main memory connected to the highest level cache of the cache hierarchy. At least one of the first level data cache or second level data cache is divided into a plurality of cache segments, and during operation of the data processor, at least some of the plurality of cache segments are excluded from cache operation. Each of the excluded cache segments is dedicated to an associated processing element as tightly coupled local access memory.
INFORMATION PROCESSING DEVICE AND COMPILER METHOD
A non-transitory computer-readable recording medium stores a program for causing a computer to execute a process, the process including: determining, for an n-dimensional array (n≥3) included in an instruction code in an innermost loop of a multiple loop included in a source code, whether array sizes of a first argument and a second argument match numbers of rotations of a first index and a second index in the multiple loop, respectively; and, when the array sizes match the numbers of rotations and when each initial value and each increment value of the first and second indexes is 1, replacing the first argument and the second argument of the n-dimensional array included in the instruction code with a third argument, changing the n-dimensional array to an (n−1)-dimensional array, and integrating a loop that uses the first index and a loop that uses the second index.
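The effect of this transformation can be sketched on a three-dimensional array whose first two loops cover the full extent of their dimensions. The sketch makes several assumptions: nested Python lists stand in for the array, indices are zero-based (the abstract requires an initial value of 1, as in Fortran-style loops), and the function names are hypothetical.

```python
def scale_nested(a, factor):
    """Original form: three loops over a 3-D array a[i][j][k]."""
    n1, n2, n3 = len(a), len(a[0]), len(a[0][0])
    out = [[[0] * n3 for _ in range(n2)] for _ in range(n1)]
    for i in range(n1):
        for j in range(n2):
            for k in range(n3):
                out[i][j][k] = a[i][j][k] * factor
    return out


def collapse_first_two(a):
    """View the 3-D array as 2-D by merging the first two dimensions."""
    return [row for plane in a for row in plane]


def scale_collapsed(b, factor):
    """Transformed form: loops i and j are integrated into one loop ij."""
    n12, n3 = len(b), len(b[0])
    out = [[0] * n3 for _ in range(n12)]
    for ij in range(n12):
        for k in range(n3):
            out[ij][k] = b[ij][k] * factor
    return out


a = [[[i * 100 + j * 10 + k for k in range(4)] for j in range(3)] for i in range(2)]
```

The merged loop has a trip count equal to the product of the two original trip counts, which is only valid because each original loop visits every element of its dimension exactly once.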
Global modulo allocation in neural network compilation
In one example, a method performed by a compiler comprises: receiving a dataflow graph of a neural network, the neural network comprising a neural network operator; receiving information of computation resources and memory resources of a neural network hardware accelerator intended to execute the neural network operator; determining, based on the dataflow graph, iterations of an operation on elements of a tensor included in the neural network operator; determining, based on the information, a mapping of the elements of the tensor to addresses in a portion of a local memory, and a number of the iterations of the operation to be included in a batch, wherein the iterations in the batch are to be executed in parallel by the neural network hardware accelerator; and generating a schedule of execution of the batches of the iterations of the operation.
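One way to picture the mapping and batching steps is the sketch below. It assumes, based only on the word "modulo" in the title, that element e is placed at address e modulo the size of the memory portion; the abstract itself does not state the mapping. The function name modulo_schedule and its parameters are hypothetical, and one iteration is assumed to touch exactly one tensor element.

```python
def modulo_schedule(num_elements, region_size, batch_size):
    """Map elements to addresses and group iterations into parallel batches.

    num_elements: number of tensor elements (one iteration per element).
    region_size: number of addresses in the local-memory portion.
    batch_size: iterations executed in parallel; must not exceed region_size
                so that no two iterations in a batch share an address.
    """
    if batch_size > region_size:
        raise ValueError("batch would reuse an address within itself")
    # Assumed mapping: wrap element indices around the memory portion.
    address = {e: e % region_size for e in range(num_elements)}
    # Schedule: consecutive iterations are grouped into batches.
    batches = [
        list(range(start, min(start + batch_size, num_elements)))
        for start in range(0, num_elements, batch_size)
    ]
    return address, batches


address, batches = modulo_schedule(num_elements=10, region_size=4, batch_size=4)
```

Because each batch holds at most region_size consecutive indices, the addresses within a batch are always distinct, so the iterations of a batch can run in parallel without colliding in local memory.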