G06F8/4441

Pointer alignment computation in program code according to code pattern analyses

Pointer alignment is computed in program code to obtain information enabling a compiler to optimize the program code. Equivalence classes of pointers are collected in a program using a flow-insensitive yet field-sensitive pointer analysis operation that iterates through the entire program code of the program. The equivalence classes of pointers, once collected, are mapped to and recorded in an equivalence class mapping table (ECTable). A portion of the collected equivalence classes of pointers is identified, from the ECTable, as pointer candidates for a pointer alignment computation according to a code pattern analysis of each pointer candidate. The code pattern analysis is based on available alignment information and on whether the alignment information would enable a compiler to optimize references of the pointer candidate. The pointer alignment computation is then performed for each identified pointer candidate to obtain the alignment information used to optimize execution of the program.
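The flow-insensitive collection of pointer equivalence classes described above can be sketched as a union-find pass over the program's pointer assignments. This is an illustrative sketch only; the class and method names (`ECTable`, `union`, `find`) are hypothetical and not taken from the patent, and field sensitivity (omitted here) could be modeled by keying pointers as (base, field) pairs.

```python
class ECTable:
    """Hypothetical equivalence-class mapping table: maps each pointer
    to its class representative via union-find."""

    def __init__(self):
        self.parent = {}

    def find(self, p):
        # Path-compressing find: returns the class representative of p.
        self.parent.setdefault(p, p)
        if self.parent[p] != p:
            self.parent[p] = self.find(self.parent[p])
        return self.parent[p]

    def union(self, p, q):
        # Merge the classes of p and q (e.g. on an assignment p = q).
        rp, rq = self.find(p), self.find(q)
        if rp != rq:
            self.parent[rq] = rp

    def classes(self):
        # Group all recorded pointers by their representative.
        groups = {}
        for p in self.parent:
            groups.setdefault(self.find(p), set()).add(p)
        return list(groups.values())

# Flow-insensitive pass: one sweep over the pointer assignments of the
# whole program, ignoring control flow and statement order.
table = ECTable()
assignments = [("p", "q"), ("q", "r"), ("s", "t")]  # p = q; q = r; s = t
for lhs, rhs in assignments:
    table.union(lhs, rhs)
```

After the sweep, `table.classes()` yields the two classes {p, q, r} and {s, t}; a later pass could then inspect each class to pick alignment-computation candidates.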

RUN-TIME PROFILE-GUIDED EXECUTION OF WORKLOADS
20230161576 · 2023-05-25 ·

Examples described herein relate to technologies to execute a compiler for a process to be executed by one or more graphics processing units (GPUs) to compile the process based on run-time profile-guided optimization (PGO). In some examples, the run-time PGO is based on profile data versioned by application, driver, and GPU version; previously generated profile data; a subset of draws to profile and optimize; or other factors.

SELECTING AN EPILOGUE VECTORIZATION FACTOR FOR USE IN COMPUTER PROCESSING
20230161573 · 2023-05-25 ·

A vectorization factor to be used in vectorization of an epilogue loop in program code is automatically selected. The automatically selecting includes selecting the vectorization factor from a plurality of candidate vectorization factors based on one or more considerations relating to vectorizing the epilogue loop. The vectorization factor that is automatically selected is used in vectorizing the epilogue loop.
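One possible shape of such a selection is sketched below. The heuristic shown (epilogue factor must be smaller than the main-loop factor and, when the trip count is known, no larger than the remainder iterations) is an assumption for illustration, not the patented selection criteria; the function name `select_epilogue_vf` is hypothetical.

```python
def select_epilogue_vf(main_vf, candidates, trip_count=None):
    """Pick a vectorization factor (VF) for the epilogue (remainder) loop.

    Sketch of one heuristic: the epilogue VF must be smaller than the
    main-loop VF, and when the trip count is known at compile time, it
    should not exceed the iterations left after the main vector loop.
    Falls back to 1 (scalar epilogue) if no candidate qualifies.
    """
    viable = [vf for vf in candidates if vf < main_vf]
    if trip_count is not None:
        remainder = trip_count % main_vf
        viable = [vf for vf in viable if vf <= remainder]
    return max(viable) if viable else 1
```

For example, with a main-loop VF of 8 and a known trip count of 13, the main loop covers 8 iterations, leaving 5; of the candidates [2, 4, 8], the largest that fits is 4.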

FIXED POINT EARLY EXIT OF A LOOP WITHIN COMPUTER CODE
20230161574 · 2023-05-25 ·

Early exit of a loop is performed. A determination is made as to whether a loop within computer code reaches a fixed point of processing, which is predefined. Based on determining that the loop reaches the fixed point of processing, at least one indication is included in the loop to perform an early exit of the loop prior to a last iteration of the loop.
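The transformation can be illustrated with a minimal sketch: a loop whose body would otherwise run a fixed number of iterations gains a check that exits as soon as the state stops changing (the predefined fixed point). The helper name `run_to_fixed_point` and the example step function are assumptions for illustration only.

```python
def run_to_fixed_point(values, step, max_iters):
    """Iterate `step` until the state stops changing (the predefined
    fixed point), exiting early instead of running all max_iters."""
    state = values
    for i in range(max_iters):
        new_state = step(state)
        if new_state == state:      # fixed point reached: early exit
            return state, i + 1
        state = new_state
    return state, max_iters

# Example step: clamp negatives to zero. The state stops changing after
# one pass, so the loop exits on iteration 2 instead of running 100 times.
state, iters = run_to_fixed_point(
    [-2, 3, -1], lambda xs: [max(0, x) for x in xs], 100
)
```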

TRANSFORMATION OF A LOOP WITHIN COMPUTER CODE TO MINIMIZE ITERATIONS
20230161575 · 2023-05-25 ·

A loop within computer code is transformed to minimize loop iterations. A determination is made, using statistical information relating to the loop, as to whether a loop that has an early exit indication is to be transformed to minimize iterations of the loop. Based on determining that the loop is to be transformed, the loop is transformed.

DYNAMIC COMPUTATION OFFLOADING TO GRAPHICS PROCESSING UNIT
20230061087 · 2023-03-02 ·

A method includes receiving source code of a program to be compiled and compiling the source code of the program. Compiling the source code includes identifying a first function in the source code of the program that is a candidate to be executed by a graphics processing unit (GPU), generating a first intermediate representation and a second intermediate representation for the first function, and inserting a second function in the program in place of the first function, wherein the second function is to select one of the first intermediate representation or the second intermediate representation to be executed. The method further includes providing a compiled program package including the second function, the first intermediate representation and the second intermediate representation.
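The role of the inserted "second function" can be sketched as a run-time dispatcher that picks one of the two compiled intermediate representations. The names below (`make_dispatcher`, the lambda IRs) are hypothetical stand-ins for illustration; real IRs would be compiled function bodies, not Python callables.

```python
def make_dispatcher(cpu_ir, gpu_ir, gpu_available):
    """Sketch of the inserted second function: at run time it selects
    which intermediate representation of the original function to run,
    based on a predicate such as GPU availability."""
    def dispatch(*args):
        ir = gpu_ir if gpu_available() else cpu_ir
        return ir(*args)
    return dispatch

# Hypothetical stand-ins for the two intermediate representations.
cpu_ir = lambda x: ("cpu", x + 1)
gpu_ir = lambda x: ("gpu", x + 1)
f = make_dispatcher(cpu_ir, gpu_ir, lambda: False)  # no GPU at run time
```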

Compiling models for dedicated hardware

The subject technology provides for receiving a neural network (NN) model to be executed on a target platform, the NN model including multiple layers that include operations, some of the operations being executable on multiple processors of the target platform. The subject technology further sorts the operations from the multiple layers in a particular order based at least in part on grouping the operations that are executable by a particular processor of the multiple processors. The subject technology determines, based at least in part on a cost of transferring the operations between the multiple processors, an assignment of one of the multiple processors for each of the sorted operations of each of the layers in a manner that minimizes a total cost of executing the operations. Further, for each layer of the NN model, the subject technology includes an annotation to indicate the processor assigned for each of the operations.
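A cost-minimizing assignment of this kind can be sketched as a dynamic program over the sorted operation sequence, charging a transfer cost whenever consecutive operations land on different processors. This is an illustrative simplification (a linear chain of operations with a flat transfer cost), not the patented method; all names are hypothetical.

```python
def assign_processors(ops, exec_cost, transfer_cost):
    """DP sketch: for a linear sequence of operations, choose a processor
    for each so that execution cost plus inter-processor transfer cost is
    minimized. exec_cost[op][proc] is the cost of op on proc; transfer_cost
    is charged whenever consecutive ops run on different processors."""
    procs = list(exec_cost[ops[0]].keys())
    # best[p] = minimal total cost of the prefix with the last op on p
    best = {p: exec_cost[ops[0]][p] for p in procs}
    choice = [{p: None for p in procs}]
    for op in ops[1:]:
        new_best, new_choice = {}, {}
        for p in procs:
            cands = {q: best[q] + (0 if q == p else transfer_cost)
                     for q in procs}
            q = min(cands, key=cands.get)
            new_best[p] = cands[q] + exec_cost[op][p]
            new_choice[p] = q
        best, choice = new_best, choice + [new_choice]
    # Backtrack from the cheapest final processor to recover the assignment.
    last = min(best, key=best.get)
    assignment = [last]
    for ch in reversed(choice[1:]):
        last = ch[last]
        assignment.append(last)
    assignment.reverse()
    return assignment, min(best.values())
```

With a high transfer cost the assignment stays on one processor even when per-op costs favor switching; with a low transfer cost it switches to whichever processor runs each op cheapest.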

Packing conditional branch operations

Disclosed in some examples are systems, methods, devices, and machine-readable media that use improved dynamic programming algorithms to pack conditional branch instructions. Conditional code branches may be modeled as directed acyclic graphs (DAGs), which have a topological ordering. These DAGs may be used to construct a dynamic programming table to find a partial mapping of one path onto the other path using dynamic programming algorithms.
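One concrete dynamic-programming table of this shape is a longest-common-subsequence table over the two topologically ordered instruction sequences, whose backtrace yields a partial mapping of matching (packable) instructions. This is a generic LCS sketch chosen for illustration, not necessarily the improved algorithm the abstract refers to; the function name is hypothetical.

```python
def partial_mapping(path_a, path_b):
    """DP sketch: find a partial mapping of one branch path onto the
    other by building a longest-common-subsequence table over the two
    topologically ordered instruction sequences, then backtracking."""
    n, m = len(path_a), len(path_b)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if path_a[i - 1] == path_b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Backtrack to recover the matched (packable) instruction pairs.
    pairs, i, j = [], n, m
    while i and j:
        if path_a[i - 1] == path_b[j - 1]:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return list(reversed(pairs))
```

For the two paths `["add", "mul", "ld"]` and `["mul", "ld", "st"]`, the mapping pairs up the shared `mul` and `ld` instructions, which a packer could then emit once for both branch arms.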

SYSTEM AND METHOD FOR AUTOMATIC SELECTION FOR DYNAMIC SITE COMPILATION WITHIN A CLOUD-BASED CONTENT HUB ENVIRONMENT
20220335095 · 2022-10-20 ·

Described herein are systems and methods for automatic selection for dynamic site compilation in a cloud-based content hub environment. In accordance with an embodiment, an artificial intelligence/machine learning (AI/ML) engine monitors and collects both content and consumption analytics associated with content items on a webpage. Based upon an analysis of such metrics, a content item can be automatically tagged so that it is either statically compiled with the website (optimized for viewing speed and user experience) or dynamically fetched/loaded when the website is loaded or refreshed.

Image recognition neural network processing method, device and system

An image recognition neural network processing method includes: a compiler segments an image recognition neural network to obtain tiles of at least one network layer group; classifies the tiles of each network layer group; and, for each network layer group, generates an assembly code and tile information of the network layer group according to a tiling result and a classification result of the network layer group. Tiles of the same type correspond to the same assembly function, and each assembly code includes a code segment of the assembly function corresponding to each type of tile. The tile information includes block information of each tile in the network layer group and is used to instruct a neural network processor to invoke, according to the block information therein, a corresponding code segment to process image data of the corresponding tile when a target image is recognized by the image recognition neural network.