Patent classifications
G06F8/4443
A STREAMING COMPILER FOR AUTOMATIC ADJOINT DIFFERENTIATION
A method for operating on a target function to provide computer code instructions configured to implement automatic adjoint differentiation of the target function. The method comprises: determining, based on the target function, a linearized computational map (100), LCM, of the target function, wherein each node of the LCM (100) comprises an elementary operation; for each node of the LCM (100), forming computer code instructions configured to: (i) compute intermediate data associated with a forward function of an automatic adjoint differentiation algorithm; and (ii) increment, according to the automatic adjoint differentiation algorithm, adjoint variables of the preceding connected nodes of that node in dependence on the intermediate data; wherein forming computer code instructions for both step (i) and step (ii) for each node is performed prior to performing said steps for a subsequent node of the LCM (100).
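The per-node code formation described above can be sketched roughly as follows. This is a minimal illustration and not the claimed implementation: the tuple encoding of the LCM, the `FORWARD` and `PARTIALS` template tables, and the `generate` function are all hypothetical names chosen for the example. For each node, the forward statement and the adjoint increments are both formed before the next node is visited; the adjoint blocks are only placed in reverse order when the final function text is assembled.

```python
import math

# Hypothetical LCM for f(x0, x1) = sin(x0 * x1) + x0.
# Each node is (output variable, elementary operation, input variables).
LCM = [
    ("v2", "mul", ("x0", "x1")),
    ("v3", "sin", ("v2",)),
    ("v4", "add", ("v3", "x0")),
]

# Assumed code templates: forward expression per operation, and the partial
# derivative of the operation with respect to each of its inputs.
FORWARD = {"mul": "{a} * {b}", "sin": "math.sin({a})", "add": "{a} + {b}"}
PARTIALS = {
    "mul": ("{b}", "{a}"),
    "sin": ("math.cos({a})",),
    "add": ("1.0", "1.0"),
}

def generate(lcm, inputs, output):
    """Emit Python source computing the output value and its gradient."""
    forward, reverse = [], []
    for out, op, ins in lcm:
        names = dict(zip("ab", ins))
        # Step (i): forward code producing this node's intermediate value.
        forward.append(f"    {out} = {FORWARD[op].format(**names)}")
        # Step (ii): adjoint increments for this node's predecessors,
        # formed now, before the next node is processed.
        reverse.append([
            f"    d_{src} += d_{out} * ({partial.format(**names)})"
            for src, partial in zip(ins, PARTIALS[op])
        ])
    variables = list(inputs) + [out for out, _, _ in lcm]
    lines = [f"def f_and_grad({', '.join(inputs)}):"]
    lines += forward
    lines += [f"    d_{v} = 0.0" for v in variables]
    lines.append(f"    d_{output} = 1.0")  # seed the output adjoint
    for block in reversed(reverse):        # adjoint sweep runs backwards
        lines += block
    grads = ", ".join(f"d_{v}" for v in inputs)
    lines.append(f"    return {output}, ({grads},)")
    return "\n".join(lines)

source = generate(LCM, ["x0", "x1"], "v4")
namespace = {"math": math}
exec(source, namespace)  # compile the generated instructions
f_and_grad = namespace["f_and_grad"]
```

Calling `f_and_grad(0.5, 2.0)` returns the function value together with the partial derivatives with respect to `x0` and `x1`. Printing `source` shows the generated instructions.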
PROCEDURAL CODE GENERATION FOR CHALLENGE CODE
A method by one or more computing devices for obfuscating challenge code. The method includes obtaining challenge code for interrogating a client, inserting, into the challenge code, code for obfuscating outputs that are to be generated by the client, where the code for obfuscating the outputs includes code for applying a first chain of reversible transformations to the outputs using client-generated random values, interning strings appearing in the challenge code with obfuscated strings, inserting code for deobfuscating the obfuscated strings into the challenge code, inlining function calls in the challenge code, removing function definitions that are unused in the challenge code due to the inlining, reordering the challenge code without changing the functionality of the challenge code, and providing the challenge code for execution by the client.
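One way to picture a chain of reversible transformations keyed by random values is the sketch below. It is an assumption-laden illustration: the abstract does not say which transformations are used, so byte-wise XOR and modular addition are stand-ins, and `make_chain`, `obfuscate`, and `deobfuscate` are hypothetical names.

```python
import secrets

def _xor(data: bytes, key: int) -> bytes:
    return bytes(b ^ key for b in data)

def _add(data: bytes, key: int) -> bytes:
    return bytes((b + key) % 256 for b in data)

def _sub(data: bytes, key: int) -> bytes:
    return bytes((b - key) % 256 for b in data)

def make_chain(length: int = 3):
    """Pick a random sequence of (operation, key) steps."""
    return [
        (secrets.choice(["xor", "add"]), secrets.randbelow(256))
        for _ in range(length)
    ]

def obfuscate(data: bytes, chain) -> bytes:
    """Apply each reversible step in order."""
    for op, key in chain:
        data = _xor(data, key) if op == "xor" else _add(data, key)
    return data

def deobfuscate(data: bytes, chain) -> bytes:
    """Undo the steps in reverse order, using each step's inverse."""
    for op, key in reversed(chain):
        data = _xor(data, key) if op == "xor" else _sub(data, key)
    return data
```

In this sketch the party that interprets the output needs the same chain and the same random values to reverse the transformation, so `deobfuscate(obfuscate(x, chain), chain)` returns `x` for any byte string `x`.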
Compilation framework for dynamic inlining
Disclosed embodiments include generating code from a database query and providing a framework for developing complex data structures, together with the functions that access them, outside of the generated code. These data structure functions can be precompiled to save compilation time at query runtime, and linked to the generated code in such a way that the framework can still inline function calls and apply various optimizations to the linked code.
Loop-oriented neural network compilation
Methods of accelerating the execution of neural networks are disclosed. A description of a neural network may be received. A plurality of operators may be identified based on the description of the neural network. A plurality of symbolic models associated with the plurality of operators may be generated. For each symbolic model, a nested loop associated with an operator may be identified, a loop order may be defined, and a set of data dependencies may be defined. A set of inter-operator dependencies may be extracted based on the description of the neural network. The plurality of symbolic models and the set of inter-operator dependencies may be analyzed to identify a combinable pair of nested loops. The combinable pair of nested loops may be combined to form a combined nested loop.
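The idea of modeling each operator as a nested loop and then combining a compatible pair can be sketched as below. This is a simplified illustration under assumptions: the `LoopNest` record, the `combinable` test (same bounds, same loop order, consumer reads exactly what the producer writes, element-wise), and the `fuse` helper are hypothetical and far narrower than a real dependency analysis.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class LoopNest:
    """Hypothetical symbolic model of an element-wise operator."""
    name: str
    bounds: Tuple[int, int]          # iteration counts of the (i, j) loops
    order: Tuple[str, str]           # loop order, outermost first
    reads: str                       # tensor consumed
    writes: str                      # tensor produced
    body: Callable[[float], float]   # per-element computation

def combinable(producer: LoopNest, consumer: LoopNest) -> bool:
    """Two nests can be combined if they iterate identically and the
    consumer's input is exactly the producer's output."""
    return (
        producer.bounds == consumer.bounds
        and producer.order == consumer.order
        and consumer.reads == producer.writes
    )

def fuse(producer: LoopNest, consumer: LoopNest) -> LoopNest:
    """Combine the pair into one nest; the intermediate tensor disappears."""
    if not combinable(producer, consumer):
        raise ValueError("loop nests are not combinable")
    return LoopNest(
        name=f"{producer.name}+{consumer.name}",
        bounds=producer.bounds,
        order=producer.order,
        reads=producer.reads,
        writes=consumer.writes,
        body=lambda x: consumer.body(producer.body(x)),
    )

def run(nest: LoopNest, tensor: List[List[float]]) -> List[List[float]]:
    """Execute a loop nest over a 2-D tensor stored as nested lists."""
    rows, cols = nest.bounds
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            out[i][j] = nest.body(tensor[i][j])
    return out
```

For example, a `scale` nest followed by a `relu` nest over the same 2 x 3 tensor can be fused into a single pass that never materializes the scaled intermediate.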
PERFORMANCE OPTIMIZATION OF CLASS INSTANCE COMPARISONS
An embodiment includes executing a code interpretation engine such that the interpretation engine interprets a first portion of a source code that includes a first comparison between a first pair of operands. The embodiment also includes performing, in memory, a first bitwise comparison between a block A1 and a block B1 of the first portion of the source code. The embodiment also speeds up execution of the first portion of the source code responsive to the first bitwise comparison producing a negative result. The embodiment speeds up the first portion by omitting at least one of (i) a second bitwise comparison between a block A2 and a block B2, and (ii) a field-wise comparison between a block A3 and a block B3.
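The early-exit behavior described above might look roughly like the following. This sketch assumes the two operands are available as equal-layout byte images of class instances (built here with `struct.pack` over integer fields only); the 8-byte block size, the `fast_equal` name, and the `stats` counter are hypothetical and exist only to make the skipped work visible.

```python
import struct

def fast_equal(a: bytes, b: bytes, block_size: int = 8, stats=None) -> bool:
    """Compare two instance images block by block.

    A mismatch in any block is a negative result, so the remaining block
    comparisons (and any slower field-wise comparison) are omitted.
    """
    if len(a) != len(b):
        return False
    for start in range(0, len(a), block_size):
        if stats is not None:
            stats["blocks_compared"] += 1
        if a[start:start + block_size] != b[start:start + block_size]:
            return False
    return True

def pack_instance(ident: int, count: int, total: int) -> bytes:
    """Hypothetical fixed layout: three little-endian 64-bit integers."""
    return struct.pack("<qqq", ident, count, total)
```

With two instances that differ in their first field, only one block is compared before the function returns `False`; identical instances require all three blocks to be checked.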
Application development method, tool, and device, and storage medium
Embodiments of the present application disclose an application development method performed at a computing device. The method includes: obtaining an input file in a predetermined format, the input file including content code of each part used for forming an application; disassembling the content code of each part in the input file into category code according to its category; invoking a corresponding compiler according to an attribute of each piece of the category code to compile that piece and obtain a description file for it; and performing plug-in processing on the description files of the category code of each part to obtain the application.
FIRST FUTAMURA PROJECTION IN THE CONTEXT OF SQL EXPRESSION EVALUATION
The present invention relates to execution optimization of database queries. Herein are techniques for optimal execution based on query interpretation by translation to a domain specific language (DSL), with optimizations such as partial evaluation, abstract syntax tree (AST) rewriting, just in time (JIT) compilation, dynamic profiling, speculative logic, and Futamura projection. In an embodiment, a database management system (DBMS) that is hosted on a computer generates a query tree that represents a database query that contains an expression that is represented by a subtree of the query tree. The DBMS generates a sequence of DSL instructions that represents the subtree. The sequence of DSL instructions is executed to evaluate the expression during execution of the database query. In an embodiment, an AST is generated from the sequence of DSL instructions. In an embodiment, the DSL AST is optimally rewritten based on a runtime feedback loop that includes dynamic profiling information.
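The first Futamura projection says that specializing an interpreter to a fixed program yields a compiled form of that program. A small sketch of that idea for expression evaluation follows. It is illustrative only: the tuple-based AST, the `interpret` function, and the hand-written `specialize` function (standing in for an automatic partial evaluator) are assumptions, not the DSL described in the abstract.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul, ">": operator.gt}

# Hypothetical AST node shapes:
#   ("const", value) | ("col", name) | ("bin", op, left, right)

def interpret(node, row):
    """Plain interpreter: walks the tree again for every row."""
    kind = node[0]
    if kind == "const":
        return node[1]
    if kind == "col":
        return row[node[1]]
    _, op, left, right = node
    return OPS[op](interpret(left, row), interpret(right, row))

def specialize(node):
    """Partially evaluate `interpret` for a fixed expression.

    The tree walk and operator lookup happen once, here; the returned
    closure only performs the per-row work.
    """
    kind = node[0]
    if kind == "const":
        value = node[1]
        return lambda row: value
    if kind == "col":
        name = node[1]
        return lambda row: row[name]
    _, op, left, right = node
    fn, left_fn, right_fn = OPS[op], specialize(left), specialize(right)
    return lambda row: fn(left_fn(row), right_fn(row))

# Expression: price * qty > 100
EXPR = ("bin", ">", ("bin", "*", ("col", "price"), ("col", "qty")), ("const", 100))
compiled = specialize(EXPR)
```

`compiled(row)` gives the same answer as `interpret(EXPR, row)` for every row, but without re-dispatching on node kinds each time.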
MACHINE LEARNING-BASED TECHNIQUE FOR EXECUTION MODE SELECTION
Described herein are techniques for generating a compiled shader program. The techniques include identifying input features of a shader program, providing the identified input features of the shader program to a trained model for selecting compiler operation values for shader programs, receiving, as output from the trained model, a compiler operation value for the shader program, and generating a compiled shader program based on the compiler operation value for execution on one or more compute units.
Dataflow analysis to reduce the overhead of on stack replacement
An approach is provided in which an information handling system selects an assumption point in a software program corresponding to a compile-time assumption made by a compiler, and selects an assumption violation point in the software program corresponding to a location at which the compile-time assumption can be violated at runtime. The information handling system propagates backwards in the software program from the assumption point and reaches the assumption violation point. The information handling system determines that the assumption point corresponds to a first method and the assumption violation point corresponds to a second method that is different from the first method, and inserts a conditional transition in the software program at the assumption violation point. The information handling system executes a compiled version of the software program that includes the conditional transition.
Transforming loops in program code based on a capacity of a cache
An electronic device acquires, from program code, two or more program code loops having specified data dependencies. The electronic device places each of the program code loops into a corresponding blocking loop, each blocking loop including at least one blocking loop induction variable that is incremented by a corresponding block size and used to specify a number of iterations for at least one internal loop induction variable of the respective program code loop. The electronic device fuses the blocking loops into a fused loop by placing all of the blocking loops in the fused loop and replacing the blocking loop induction variables of the blocking loops with a fused loop induction variable that is incremented by the corresponding block size and used to specify the number of iterations for respective internal loop induction variables in the blocking loops.
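The blocking-then-fusing transformation can be sketched as follows for two dependent loops, `b[i] = 2 * a[i]` and `c[i] = b[i] + 1`. The example is a simplified assumption: the loop bodies, the function names, and the `block_size_for` heuristic (divide an assumed cache capacity evenly among the arrays touched) are hypothetical and are not taken from the abstract.

```python
def block_size_for(cache_bytes: int, element_bytes: int, arrays: int) -> int:
    """Hypothetical heuristic: elements per block so that one block of
    every array fits in the cache at the same time."""
    return max(1, cache_bytes // (element_bytes * arrays))

def blocked_separately(a, block):
    """Each original loop wrapped in its own blocking loop."""
    n = len(a)
    b, c = [0] * n, [0] * n
    for ib in range(0, n, block):            # blocking loop 1
        for i in range(ib, min(ib + block, n)):
            b[i] = 2 * a[i]
    for jb in range(0, n, block):            # blocking loop 2
        for j in range(jb, min(jb + block, n)):
            c[j] = b[j] + 1
    return c

def blocked_and_fused(a, block):
    """Both blocking loops replaced by one fused loop whose induction
    variable `fb` advances by the block size."""
    n = len(a)
    b, c = [0] * n, [0] * n
    for fb in range(0, n, block):            # fused blocking loop
        hi = min(fb + block, n)
        for i in range(fb, hi):              # internal loop of loop 1
            b[i] = 2 * a[i]
        for j in range(fb, hi):              # internal loop of loop 2
            c[j] = b[j] + 1
    return c
```

In the fused version, each block of `b` is consumed right after it is produced, while it is still likely to be resident in the cache. Both versions compute the same result.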