Patent classifications
G06F8/4441
Hardware acceleration method, compiler, and device
A hardware acceleration method includes: obtaining compilation policy information and source code, where the compilation policy information indicates that a first code type matches a first processor and a second code type matches a second processor; analyzing a code segment in the source code according to the compilation policy information; determining a first code segment belonging to the first code type or a second code segment belonging to the second code type; compiling the first code segment into first executable code and sending it to the first processor; and compiling the second code segment into second executable code and sending it to the second processor.
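The policy-driven dispatch described in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the `Segment` class, the `POLICY` table, the code type names, the processor names, and the `compile_for` stand-in are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str
    code_type: str  # hypothetical label produced by source analysis


# Hypothetical compilation policy: code type -> matching processor.
POLICY = {"control": "cpu", "vectorizable": "fpga"}


def compile_for(segment, processor):
    # Stand-in for a real compiler back end (assumption, not a real API).
    return f"{processor}-binary({segment.name})"


def dispatch(segments, policy):
    """Compile each segment for the processor its code type matches.

    Returns {processor: [executable, ...]} to represent what is sent where.
    """
    sent = {}
    for seg in segments:
        processor = policy.get(seg.code_type)
        if processor is None:
            continue  # segment matches no code type in the policy
        sent.setdefault(processor, []).append(compile_for(seg, processor))
    return sent


segments = [Segment("parse_args", "control"), Segment("dot_product", "vectorizable")]
result = dispatch(segments, POLICY)
```

Here `result` maps each processor name to the executables compiled for it; a segment whose type is absent from the policy is skipped in this sketch.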
Scheduling apparatus, training apparatus, scheduler and generation method
A scheduling apparatus includes at least one memory and at least one processor, and the at least one processor is configured to generate a schedule from a state specified based on received information. The generating includes causing the state to transition such that a process of transferring data from a memory is replaced with a recomputation process that obtains the data.
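One such state transition might look like the sketch below. The schedule representation (a list of `(kind, tensor)` tuples) and the `recomputable` set are assumptions for illustration; the abstract does not say how a transition is selected or scored.

```python
def replace_transfer(schedule, index, recomputable):
    """Return a new schedule in which the transfer at `index` is replaced
    by a recomputation of the same data. The input schedule is not mutated."""
    kind, tensor = schedule[index]
    if kind != "transfer" or tensor not in recomputable:
        return list(schedule)  # transition not applicable; state unchanged
    return schedule[:index] + [("recompute", tensor)] + schedule[index + 1:]


# Hypothetical schedule: "a" is offloaded, then transferred back from memory.
state = [
    ("compute", "a"),
    ("offload", "a"),
    ("compute", "b"),
    ("transfer", "a"),
    ("compute", "c"),
]
next_state = replace_transfer(state, 3, recomputable={"a"})
```

A search procedure or learned policy would choose among transitions like this one; that choice is outside the scope of the sketch.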
Methods, apparatus, systems and computer readable media for use in association with partitioning and/or rendering
In one embodiment, a method includes: receiving an application that includes a rendering portion; receiving code implementing a plurality of functions used by the application; defining a plurality of subsets of the plurality of functions, each of the plurality of subsets including at least one of the plurality of functions; monitoring which ones of the plurality of subsets have one or more of the at least one functions included therein invoked during execution of a portion of the application that includes the rendering portion; generating information indicating which ones of the plurality of subsets had one or more of the at least one functions included therein invoked during the execution of the portion of the application; and generating a first set of one or more files that includes: (i) code implementing ones of the plurality of functions that are included in one or more of the plurality of subsets that had one or more of the at least one functions included therein invoked during execution of the portion of the application.
Code caching system
Systems and methods for code caching are provided. A first indication of primary source code awaiting execution is received. A resource cache is checked for cached data corresponding to the primary source code. Upon a cache miss in the resource cache, a first executable code compiled from the primary source code is obtained. A secondary source code referenced in the primary source code is selected. A second executable code compiled from the selected secondary source code is obtained. The first executable code and the second executable code are serialized into serialized code. The serialized code is stored as cached data in the resource cache.
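The cache-miss path in this abstract can be sketched in Python using the built-in `compile` and the standard `marshal` module, which can serialize code objects. The `load_or_compile` helper, the cache key, and the sample sources are hypothetical, and how a secondary source is selected is left to the caller here.

```python
import marshal

resource_cache = {}  # cache key -> serialized code (bytes)


def load_or_compile(key, primary_src, secondary_srcs):
    """Return (code_objects, cache_hit). On a miss, compile the primary
    source plus the selected secondary sources, serialize them together,
    and store the result in the resource cache."""
    cached = resource_cache.get(key)
    if cached is not None:
        return marshal.loads(cached), True
    bundle = [compile(primary_src, key, "exec")]
    for i, src in enumerate(secondary_srcs):
        bundle.append(compile(src, f"{key}:secondary{i}", "exec"))
    resource_cache[key] = marshal.dumps(bundle)
    return bundle, False


PRIMARY = "result = helper(20)"
HELPER = "def helper(x):\n    return x * 2 + 2\n"

first, first_hit = load_or_compile("app", PRIMARY, [HELPER])
second, second_hit = load_or_compile("app", PRIMARY, [HELPER])

namespace = {}
for code in reversed(second):  # define the helper before running the primary
    exec(code, namespace)
```

The second call is served from the serialized bundle, so neither source is recompiled.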
Predictive management of on-demand code execution
Systems and methods are described for monitoring code execution within an on-demand code execution environment or other distributed code execution environment. The distributed, asynchronous nature of such environments can make determining the interactions between code executions difficult relative to traditional, non-distributed systems. The present disclosure enables the interrelations between code executions to be monitored by injecting monitoring information into the calls between those code executions. The monitoring information may be propagated through calls, such that a “path” or “trace” of code executions and calls can be determined. Data generated based on the monitoring information can be used to generate a profile for a set of code, so that a developer or other user may easily debug or optimize execution of the code.
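Propagating monitoring information through calls can be illustrated with a small in-process sketch. The `CALLS` graph, the function names, and the shape of the trace record are assumptions; a real on-demand environment would carry this data in request metadata across separate executions.

```python
import uuid

# Hypothetical call graph: each code execution and the executions it calls.
CALLS = {"frontend": ["auth", "orders"], "orders": ["billing"]}


def invoke(name, trace, log):
    """Run one code execution, extending the injected trace with its own
    name and passing the extended trace on to every callee."""
    trace = {"trace_id": trace["trace_id"], "path": trace["path"] + [name]}
    log.append(trace)
    for callee in CALLS.get(name, []):
        invoke(callee, trace, log)


log = []
invoke("frontend", {"trace_id": str(uuid.uuid4()), "path": []}, log)
paths = [" -> ".join(entry["path"]) for entry in log]
```

Every record shares one `trace_id`, and each `path` gives the chain of calls that led to that execution, which is the kind of data a profile could be built from.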
Systems and methods for minimizing communications
A system for allocation of one or more data structures used in a program across a number of processing units takes into account a memory access pattern of the data structure, and the amount of total memory available for duplication across the several processing units. Using these parameters, duplication factors are determined for the one or more data structures such that the cost of remote communication is minimized when the data structures are duplicated according to the respective duplication factors while allowing parallel execution of the program.
Instances of just-in-time (JIT) compilation of code using different compilation settings
In some examples, just-in-time (JIT) control instructions upon execution cause a system to initiate a plurality of instances of JIT compilation of a first code called by a program, where the initiating of the plurality of instances of the JIT compilation of the first code is under control of the JIT control instructions that are outside the program, and the plurality of instances of the JIT compilation of the first code use respective different compilation settings, and are to produce respective JIT compiled instances of the first code.
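As a loose analogue only, the idea of producing several compiled instances of the same code under different settings can be shown with Python's built-in `compile`, whose `optimize` argument selects an optimization level. This is ahead-of-time bytecode compilation, not JIT compilation, and `compile_instances` and `first_code` are hypothetical names; the controlling code sits outside the program being compiled, as in the abstract.

```python
SOURCE = '''
def first_code(x):
    "Docstring kept only at lower optimization levels."
    assert x >= 0, "negative input"
    return x * 2
'''


def compile_instances(source, levels):
    """Compile the same source once per optimization level and return
    {level: compiled function}."""
    instances = {}
    for level in levels:
        code = compile(source, f"<opt{level}>", "exec", optimize=level)
        namespace = {}
        exec(code, namespace)
        instances[level] = namespace["first_code"]
    return instances


instances = compile_instances(SOURCE, [0, 1, 2])
```

Level 0 keeps the `assert`, level 1 removes it, and level 2 additionally drops the docstring, so the three instances behave differently on negative input and can be compared side by side.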
Software solution for cooperative memory-side and processor-side data prefetching
A solution for cooperative data prefetching that enables software control of a memory-side data prefetch and/or a processor-side data prefetch is provided. In one embodiment, the invention provides a solution for generating an application, in which access to application data for the application is improved (e.g., optimized) in program code for the application. In particular, a push request, for performing a memory-side data prefetch of the application data, and a prefetch request, for performing a processor-side data prefetch, are added to the program code. The memory-side data prefetch results in the application data being copied from a first data store to a second data store that is faster than the first data store while the processor-side data prefetch results in the application data being copied from the second data store to a third data store that is faster than the second data store.
Loop and library fusion
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating efficient compiled code. In an example method, a compilation system obtains an un-optimized computational graph comprising a plurality of nodes representing operations and directed edges representing data dependencies. The un-optimized computational graph is analyzed using pattern matching to determine fusable operations that can be fused together into a single fusion operation. The un-optimized computational graph is transformed into an optimized computational graph by replacing the nodes representing the fusable operations in the un-optimized computational graph with a fusion node representing the single fusion operation. The compilation system produces efficient code by translating the fusion node of the optimized computational graph as a call that performs the fused operations.
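A toy version of pattern-matched fusion is sketched below. The graph encoding (a dict mapping node name to `(op, inputs)`), the single `mul` followed by `add` pattern, and the `fused_mul_add` op name are assumptions chosen for illustration; they are not taken from the patent.

```python
def fuse_mul_add(graph):
    """Replace each `add` whose operand is a single-use `mul` with one
    `fused_mul_add` node, and drop the absorbed `mul` node."""
    # Count how many nodes consume each node's output.
    consumers = {}
    for _, inputs in graph.values():
        for inp in inputs:
            consumers[inp] = consumers.get(inp, 0) + 1

    optimized, absorbed = {}, set()
    for name, (op, inputs) in graph.items():
        optimized[name] = (op, inputs)
        if op != "add" or len(inputs) != 2:
            continue
        for idx, inp in enumerate(inputs):
            prod_op, prod_inputs = graph[inp]
            if prod_op == "mul" and consumers[inp] == 1:
                other = inputs[1 - idx]
                optimized[name] = ("fused_mul_add", (*prod_inputs, other))
                absorbed.add(inp)
                break
    return {n: v for n, v in optimized.items() if n not in absorbed}


graph = {
    "a": ("input", ()),
    "b": ("input", ()),
    "c": ("input", ()),
    "t": ("mul", ("a", "b")),
    "y": ("add", ("t", "c")),
}
optimized = fuse_mul_add(graph)
```

The single-consumer check matters: if another node also read `t`, removing the `mul` would break that dependency, so the sketch leaves such cases unfused.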
Vectorized representation method of software source code
The invention provides a vectorized representation method of a software source code. The vectorized representation method is an AST-based neural network which is a hierarchical vector representation method comprising the following implementation steps: step 1-1, converting an original software source code into an AST at the lowest layer, and then further dividing the AST according to source code statements to acquire a smaller statement tree sequence, wherein statement trees in the statement tree sequence are different in sequence, and the statement tree sequence is consistent with an original statement sequence; step 1-2, encoding the statement trees into statement vectors e_1, e_2, ..., e_t by a recursive neural encoder; step 1-3, enabling an acquired statement vector sequence to pass through a bidirectional recurrent neural network layer to extract dependency features between statements; and step 1-4, sampling multi-dimensional features of all time steps of the bidirectional recurrent neural network layer through a pooling layer to acquire a final vector representation.
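Step 1-1 alone can be approximated with Python's standard `ast` module, as in the sketch below. It covers only the statement splitting; the neural encoder, recurrent layer, and pooling of steps 1-2 to 1-4 are not shown. The `statement_trees` helper and the sample source are hypothetical, and as a simplification a nested statement also remains inside its parent's subtree.

```python
import ast

SOURCE = """
total = 0
for x in data:
    if x > 0:
        total += x
print(total)
"""


def statement_trees(source):
    """Return one AST subtree per statement, in pre-order, which matches
    the order in which the statements appear in the source."""
    trees = []

    def visit(node):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.stmt):
                trees.append(child)
            visit(child)

    visit(ast.parse(source))
    return trees


kinds = [type(tree).__name__ for tree in statement_trees(SOURCE)]
```

Each entry of `kinds` names the statement type at the root of one tree, so the list can be compared directly against the original statement order.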