G06F8/4435

COMPILER-INITIATED TILE REPLACEMENT TO ENABLE HARDWARE ACCELERATION RESOURCES
20220269492 · 2022-08-25

A processing system includes a compiler that automatically identifies sequences of instructions of tileable source code that can be replaced with tensor operations. The compiler generates enhanced code that replaces the identified sequences of instructions with tensor operations that invoke a special-purpose hardware accelerator. By automatically replacing instructions with tensor operations that invoke the special-purpose hardware accelerator, the compiler makes the performance improvements achievable through the special-purpose hardware accelerator available to programmers using high-level programming languages.
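The replacement step can be illustrated with a small source-to-source pass. This is a minimal sketch, not the patented compiler: it assumes the "tileable" sequence is a plain three-level matrix-multiply loop nest, and `tensor_matmul`, `TileReplacer`, and `enhance` are hypothetical names, with `tensor_matmul` standing in for an accelerator intrinsic.

```python
import ast
import re


def tensor_matmul(a, b, c):
    """Hypothetical stand-in for an accelerator tensor op: c += a @ b, in place."""
    for i in range(len(a)):
        for j in range(len(b[0])):
            c[i][j] += sum(a[i][k] * b[k][j] for k in range(len(b)))


class TileReplacer(ast.NodeTransformer):
    """Rewrites `for i: for j: for k: C[i][j] += A[i][k] * B[k][j]` into one call."""

    def visit_For(self, node):
        self.generic_visit(node)
        # Collect the loop variables of a perfectly nested loop chain.
        loop_vars, cur = [], node
        while (isinstance(cur, ast.For) and isinstance(cur.target, ast.Name)
               and len(cur.body) == 1):
            loop_vars.append(cur.target.id)
            cur = cur.body[0]
        if len(loop_vars) != 3 or not isinstance(cur, ast.AugAssign):
            return node
        i, j, k = loop_vars
        pattern = (rf"(\w+)\[{i}\]\[{j}\] \+= "
                   rf"(\w+)\[{i}\]\[{k}\] \* (\w+)\[{k}\]\[{j}\]")
        match = re.fullmatch(pattern, ast.unparse(cur))
        if match is None:
            return node
        # Assumption: the loops cover the full matrices and the operands do not
        # alias. A real compiler would have to prove both before replacing.
        c, a, b = match.groups()
        call = ast.parse(f"tensor_matmul({a}, {b}, {c})").body[0]
        return ast.copy_location(call, node)


def enhance(source):
    """Return source text with matching loop nests replaced by tensor calls."""
    tree = TileReplacer().visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))


SOURCE = """
for i in range(2):
    for j in range(2):
        for k in range(2):
            C[i][j] += A[i][k] * B[k][j]
"""

print(enhance(SOURCE))
```

The programmer's source stays an ordinary loop nest; only the generated code refers to the tensor operation.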

A DEEP LEARNING MODEL FOR LEARNING PROGRAM EMBEDDINGS
20220044119 · 2022-02-10

A system and method for using a deep learning model to learn program semantics are disclosed. The method includes receiving a plurality of execution traces of a program, each execution trace comprising a plurality of variable values. The variable values are encoded by a first recurrent neural network to generate a plurality of program states for each execution trace. A bi-directional recurrent neural network can then determine a reduced set of program states for each execution trace from the plurality of program states. The reduced set of program states is then encoded by a second recurrent neural network to generate a plurality of executions for the program. The method then includes pooling the plurality of executions to generate a program embedding and predicting semantics of the program using the program embedding.

Translating text encodings of machine learning models to executable code
11210073 · 2021-12-28

A method for translating text encodings of machine learning models to executable code comprises: receiving a text encoding of a machine learning model; generating, based on the text encoding of the machine learning model, compilable code encoding the machine learning model; and generating, based on the compilable code, executable code encoding the machine learning model.
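The three stages (text encoding, compilable code, executable code) can be sketched in a few lines. This is only an illustration under stated assumptions: the text encoding is taken to be a JSON description of a linear model, the compilable code is Python source, and Python bytecode stands in for the executable. The JSON schema and the names `generate_source` and `build_executable` are hypothetical.

```python
import json


def generate_source(text_encoding):
    """Stage 2: turn a JSON model description into compilable Python source."""
    spec = json.loads(text_encoding)
    weights = [float(w) for w in spec["weights"]]
    bias = float(spec["bias"])
    terms = " + ".join(f"{w!r} * x[{i}]" for i, w in enumerate(weights))
    body = f"{terms} + {bias!r}" if terms else repr(bias)
    return f"def predict(x):\n    return {body}\n"


def build_executable(source):
    """Stage 3: compile the generated source and return the callable model."""
    namespace = {}
    exec(compile(source, "<generated-model>", "exec"), namespace)
    return namespace["predict"]


# Stage 1: receive the text encoding (hypothetical schema).
encoding = '{"weights": [2.0, -1.0], "bias": 0.5}'
predict = build_executable(generate_source(encoding))
print(predict([3.0, 4.0]))  # 2.0*3.0 + -1.0*4.0 + 0.5 = 2.5
```

Because the weights are baked into the generated source as literals, the resulting function needs no model file at run time.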

SYSTEMS AND METHODS TO DECREASE THE SIZE OF A COMPOUND VIRTUAL APPLIANCE FILE

An application is provided as a compound virtual appliance having components to be hosted by virtual machines. Each component includes a set of virtual machine disks. Partial versions of the components are created by removing from each component each virtual machine disk determined to be a duplicate of a virtual machine disk of another component. A compact version of the compound virtual appliance is created by packing together the partial versions of the components and a single copy of each virtual machine disk having been determined to be a duplicate. The compact compound virtual appliance is deployed to a customer site. At the customer site, a complete version of the compound virtual appliance is reconstructed by adding back the single copy of each virtual machine disk having been determined to be a duplicate into each component having had the duplicate virtual machine disk removed.
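The pack and reconstruct steps amount to content-based deduplication across components. The following is a minimal sketch, assuming each disk is available as bytes and that duplicates are detected by SHA-256 digest; the abstract does not say how duplicates are determined, and `pack`, `unpack`, and the dictionary layout are hypothetical.

```python
import hashlib


def _digest(data):
    return hashlib.sha256(data).hexdigest()


def pack(appliance):
    """appliance maps component name -> {disk name: disk bytes}."""
    # Record which components contain each distinct disk image.
    owners = {}
    for component, disks in appliance.items():
        for data in disks.values():
            owners.setdefault(_digest(data), set()).add(component)

    shared = {}   # digest -> single retained copy of a duplicated disk
    partial = {}  # component -> its unique disks plus references to shared ones
    for component, disks in appliance.items():
        kept, refs = {}, {}
        for name, data in disks.items():
            digest = _digest(data)
            if len(owners[digest]) > 1:   # duplicate of a disk in another component
                shared[digest] = data
                refs[name] = digest
            else:
                kept[name] = data
        partial[component] = {"disks": kept, "refs": refs}
    return {"partial": partial, "shared": shared}


def unpack(compact):
    """Rebuild the complete appliance by adding shared disks back."""
    restored = {}
    for component, entry in compact["partial"].items():
        disks = dict(entry["disks"])
        for name, digest in entry["refs"].items():
            disks[name] = compact["shared"][digest]
        restored[component] = disks
    return restored


appliance = {
    "web": {"os.vmdk": b"base-os-image", "web.vmdk": b"web-data"},
    "db": {"os.vmdk": b"base-os-image", "db.vmdk": b"db-data"},
}
compact = pack(appliance)
print(len(compact["shared"]))  # one shared copy of the OS disk
```

Here `unpack(pack(appliance))` returns the original mapping, while the compact form stores the shared OS disk once.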

METHOD AND APPARATUS FOR OPTIMIZING CODE FOR FIELD PROGRAMMABLE GATE ARRAYS

A method for the generation of a hardware accelerator (20) is described. The method comprises inputting (110) a program (105) with a plurality of lines of code describing an algorithm to be implemented on the hardware accelerator (20) and generating (125) a dataflow graph in memory from the inputted program (105). The dataflow graph is optimized and an output program (140) created from the dataflow graph is output. The output program (140) is then provided to a high-level synthesis tool for generating the hardware accelerator (20).

METHOD AND APPARATUS FOR RETAINING OPTIMAL WIDTH VECTOR OPERATIONS IN ARBITRARY/FLEXIBLE VECTOR WIDTH ARCHITECTURE

A method and apparatus to optimize a list of vector instructions using dynamic programming, in particular memoization, by generating a table containing instruction subvectors having individual (parts), contiguous (superparts) and repeated (broadcasts) lanes. Because the instructions in the table are subvectors selected to have individual, contiguous and repeated lanes in the registers, compiler optimizations can be enhanced. Introduction of such dynamic programming allows for speculative lane optimizations, as well as improved analysis-guided optimizations, either of which can be performed alone or in combination with other optimizations, whether or not they make use of dynamic programming.

Rendering Optimisation by Recompiling Shader Instructions
20220172422 · 2022-06-02

A rendering optimisation identifies a draw call within a current render (which may be the first draw call in the render or a subsequent draw call in the render) and analyses a last shader in the series of shaders used by the draw call to determine whether the last shader samples from the one or more buffers at coordinates matching a current fragment location. If this determination is positive, the method further recompiles the last shader to replace an instruction that reads data from one of the one or more buffers at coordinates matching a current fragment location with an instruction that reads from the one or more buffers at coordinates stored in on-chip registers.

Automatic resource management for build systems

A method may include searching compiled code for a variable name of a resource, the variable name containing a predefined string; identifying a variable name in a resource manifest of a library that matches the variable name that contains the predefined string; and, based on the identifying, importing the resource to a location associated with the compiled code.
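The search, match, and import steps can be sketched as follows. This is an illustration only: the predefined string is assumed to be the prefix `res_`, the compiled code is modeled as raw bytes, the manifest as a dictionary from variable name to resource path, and "importing" as copying the entry into a destination mapping. All of these names and structures are hypothetical.

```python
import re

MARKER = "res_"  # hypothetical predefined string


def find_resource_names(compiled_code, marker=MARKER):
    """Return identifiers in the compiled bytes that start with the marker."""
    pattern = re.compile(rb"\b" + re.escape(marker.encode()) + rb"\w+")
    return sorted({m.decode() for m in pattern.findall(compiled_code)})


def import_resources(compiled_code, manifest, destination):
    """Copy every manifest resource whose name appears in the compiled code."""
    imported = []
    for name in find_resource_names(compiled_code):
        if name in manifest:              # name matches a manifest entry
            destination[name] = manifest[name]
            imported.append(name)
    return imported


compiled = b"\x00\x01res_logo\x00call\x00res_font_main\x00other"
manifest = {"res_logo": "lib/logo.png", "res_unused": "lib/unused.png"}
build_dir = {}
print(import_resources(compiled, manifest, build_dir))  # ['res_logo']
```

Only resources that the compiled code actually references are imported, so `res_unused` is left out of the build.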

Non-transitory computer-readable recording medium, assembly instruction conversion method and information processing apparatus
11327758 · 2022-05-10

A non-transitory computer-readable recording medium has stored therein a program for causing a computer to execute a process. The process includes storing each of a plurality of generation instructions in a storage area for each of assembly instructions, the generation instructions instructing the generation of instruction sequences of a first instruction set, each instruction sequence of the first instruction set executing processing equivalent to each assembly instruction of a second instruction set; identifying a first register that is not used by any of the assembly instructions corresponding to the plurality of generation instructions by referring to the storage area; determining a second register of the first instruction set corresponding to the first register as a temporary register in each of the instruction sequences; and generating the instruction sequence that uses the temporary register.
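The temporary-register selection can be sketched with two toy instruction sets. Everything concrete here is an assumption for illustration: the source registers `r0` to `r3`, the target registers `x0` to `x3`, the `mov` and `xchg` opcodes, and the function names are hypothetical and not taken from the patent.

```python
# Hypothetical mapping from source-set registers to target-set registers.
REGISTER_MAP = {"r0": "x0", "r1": "x1", "r2": "x2", "r3": "x3"}


def pick_temporary(program):
    """Find a source register no stored instruction uses; return its target twin."""
    used = {operand for _, operands in program for operand in operands}
    for source_reg, target_reg in REGISTER_MAP.items():
        if source_reg not in used:
            return target_reg
    raise ValueError("no unused register available as a temporary")


def convert(program):
    """Emit target instructions, using the temporary where one source
    instruction expands into a multi-instruction sequence."""
    tmp = pick_temporary(program)
    out = []
    for opcode, operands in program:
        regs = [REGISTER_MAP[r] for r in operands]
        if opcode == "mov":
            out.append(f"mov {regs[0]}, {regs[1]}")
        elif opcode == "xchg":
            # The target set is assumed to lack a swap, so go through tmp.
            a, b = regs
            out += [f"mov {tmp}, {a}", f"mov {a}, {b}", f"mov {b}, {tmp}"]
        else:
            raise ValueError(f"unsupported opcode: {opcode}")
    return out


# The whole program is stored first, so register usage can be inspected
# before any target instruction is generated.
program = [("mov", ("r0", "r1")), ("xchg", ("r0", "r2"))]
print(convert(program))
```

In this example `r3` is unused by the stored instructions, so `x3` serves as the temporary and no save or restore code is needed around the swap.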