G06F8/4441

Redundant Instance Variable Initialization Elision
20170293547 · 2017-10-12

A compiler, IDE, or other code analyzer may determine whether an instance variable declaration assignment is redundant and may take action based on that determination. The code analyzer may be able to determine with certainty that a particular instance variable initialization or assignment is definitely redundant, in which case it may cause a compiler to automatically elide the redundant assignment from compiled source code. The code analyzer may instead be able to determine with certainty that a particular assignment is definitely not redundant. In other cases, the code analyzer may not be able to determine with certainty whether an instance variable assignment is definitely redundant or definitely not redundant. The code analyzer may also report a warning or other informative message indicating the redundancy property of the assignment, thus alerting the programmer to a (possibly) redundant assignment.
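The three-way outcome described above (definitely redundant, definitely not redundant, undetermined) can be illustrated with a minimal sketch. It is not the method of the application: the `DEFAULTS` table (loosely modeled on Java field defaults), the `("const", value)` / `("expr", text)` initializer encoding, and the function names are all assumptions made for illustration.

```python
from enum import Enum


class Redundancy(Enum):
    DEFINITELY_REDUNDANT = "definitely redundant"
    DEFINITELY_NOT_REDUNDANT = "definitely not redundant"
    UNKNOWN = "unknown"


# Assumed default values per declared type (hypothetical table,
# loosely modeled on Java field defaults).
DEFAULTS = {"int": 0, "boolean": False, "Object": None}


def classify_initializer(declared_type, initializer):
    """Classify an instance variable initializer.

    `initializer` is ("const", value) for a compile-time constant or
    ("expr", text) for an expression this sketch cannot evaluate.
    """
    kind, value = initializer
    if kind != "const" or declared_type not in DEFAULTS:
        return Redundancy.UNKNOWN
    default = DEFAULTS[declared_type]
    # Be conservative when the constant's type does not match the
    # default's type (for example 0 assigned to a boolean).
    if type(value) is not type(default):
        return Redundancy.UNKNOWN
    if value == default:
        return Redundancy.DEFINITELY_REDUNDANT
    return Redundancy.DEFINITELY_NOT_REDUNDANT


def act(field_name, declared_type, initializer):
    """Map the classification to an action: elide, warn, or keep."""
    verdict = classify_initializer(declared_type, initializer)
    if verdict is Redundancy.DEFINITELY_REDUNDANT:
        return ("elide", f"{field_name}: redundant initializer elided")
    if verdict is Redundancy.UNKNOWN:
        return ("warn", f"{field_name}: initializer is possibly redundant")
    return ("keep", f"{field_name}: initializer kept")
```

In this sketch, `act("count", "int", ("const", 0))` yields an `"elide"` action, while an initializer the analyzer cannot evaluate, such as `("expr", "compute()")`, yields a warning instead.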

Methods and apparatus to eliminate partial-redundant vector loads

Methods, apparatus, systems and articles of manufacture are disclosed to eliminate partial-redundant vector loads. An example apparatus includes a node group to associate a vector operation with a node group based on a load type of the vector operation. The example apparatus also includes a candidate identifier to identify a candidate in the node group, the candidate to include a subset of vector operations of the node group. The example apparatus also includes a code optimizer to determine replacement code based on a characteristic of the candidate, and to compare an estimated cost associated with executing the replacement code to a threshold cost relative to a cost of executing the candidate. The example apparatus also includes a code generator to generate machine code using the replacement code when the estimated cost of executing the replacement code satisfies the threshold cost.

Applying multiple rewriting without collision for semi-automatic program rewriting system

A system and method for applying multiple rewritings without contention in a semi-automatic program rewriting system. The method includes: finding dependent ranges of a variable and a modification affecting range of the variable in a target program; determining at least two solutions for target program modification; detecting whether a collision condition exists amongst the one or more solutions; and modifying the program with said one or more solutions if no collision condition exists, while disabling the other solution if a collision condition is detected. A solution includes a rewriting of a segment of the target program code, and the method applies one or both of: multiple rewritings in a single solution and multiple rewritings in multiple regions of the target program. When multiple solutions are applied, the second and later solutions are applied to the already rewritten program, and the correct application regions of the second and later solutions are identified.
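One simple way to picture the collision check is as an overlap test between the program ranges each solution affects. The sketch below assumes half-open `(start, end)` ranges and a dictionary layout with `name` and `ranges` keys; both are hypothetical and are not taken from the abstract, which does not define how a collision condition is computed.

```python
def ranges_overlap(a, b):
    """Half-open ranges (start, end) overlap when each starts before the other ends."""
    return a[0] < b[1] and b[0] < a[1]


def collides(solution_a, solution_b):
    """Two solutions collide if any of their affected ranges overlap."""
    return any(
        ranges_overlap(ra, rb)
        for ra in solution_a["ranges"]
        for rb in solution_b["ranges"]
    )


def select_solutions(solutions):
    """Accept solutions in order; disable any that collides with an accepted one."""
    accepted, disabled = [], []
    for solution in solutions:
        if any(collides(solution, kept) for kept in accepted):
            disabled.append(solution["name"])
        else:
            accepted.append(solution)
    return [s["name"] for s in accepted], disabled


solutions = [
    {"name": "A", "ranges": [(10, 20)]},
    {"name": "B", "ranges": [(15, 25)]},  # overlaps A, so it is disabled
    {"name": "C", "ranges": [(30, 40)]},
]
accepted_names, disabled_names = select_solutions(solutions)
```

The sketch covers only the detection step. Recomputing the application regions of later solutions after an earlier rewrite has shifted the program text is not modeled here.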

ACCELERATION TECHNIQUES FOR GRAPH ANALYSIS PROGRAMS

Source code of a graph analysis program expressed in a platform-independent language which supports linear algebra primitives is obtained. An executable version of the program is generated, which includes an invocation of a function of a parallel programming library optimized for a particular hardware platform. A result of executing the program is stored.

Methods and systems for analyzing and improving performance of computer codes
09753731 · 2017-09-05

Methods and systems for analyzing and improving performance of computer codes. In some embodiments, a method comprises executing, via one or more processors, program code; collecting, via the one or more processors, one or more hardware dependent metrics for the program code; identifying an execution anomaly based on the one or more hardware dependent metrics, wherein the execution anomaly is present when executing the program code; and designing a modification of the program code via the one or more processors, wherein the modification addresses the execution anomaly. In some other embodiments, a method comprises collecting one or more hardware independent metrics for program code; receiving one or more characteristics of a computing device; and estimating, based on the one or more hardware independent metrics and the one or more characteristics, a duration for execution of the program code on the computing device.

COMPILER FOR TRANSLATING BETWEEN A VIRTUAL IMAGE PROCESSOR INSTRUCTION SET ARCHITECTURE (ISA) AND TARGET HARDWARE HAVING A TWO-DIMENSIONAL SHIFT ARRAY STRUCTURE
20170242669 · 2017-08-24

A method is described that includes translating higher level program code including higher level instructions having an instruction format that identifies pixels to be accessed from a memory with first and second coordinates from an orthogonal coordinate system into lower level instructions that target a hardware architecture having an array of execution lanes and a shift register array structure that is able to shift data along two different axes. The translating includes replacing the higher level instructions having the instruction format with lower level shift instructions that shift data within the shift register array structure.

METHOD OF REORDERING CONDITION CHECKS
20170242776 · 2017-08-24

Described is a computer-implemented method of reordering condition checks. Two or more condition checks in computer code that may be reordered within the code are identified. It is determined that a later one of the condition checks is satisfied at a greater frequency than a preceding one of the condition checks. It is determined that there is an absence of side effects in the two or more condition checks. The values of the condition checks are propagated and abstract interpretation is performed on the values that are propagated. It is determined that the condition checks are exclusive of each other, and the condition checks are reordered within the computer code.
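The reordering decision can be sketched as follows. This is a minimal illustration, not the claimed method: the `hit_frequency` and `has_side_effects` fields and the `mutually_exclusive` flag are hypothetical inputs assumed to come from profiling and from the abstract-interpretation step, neither of which is implemented here.

```python
def reorder_checks(checks, mutually_exclusive):
    """Return checks ordered by descending hit frequency when reordering is safe.

    Reordering is treated as safe only if the checks are mutually exclusive
    and none of them has side effects; otherwise the original order is kept.
    """
    if not mutually_exclusive:
        return list(checks)
    if any(check["has_side_effects"] for check in checks):
        return list(checks)
    # sorted() is stable, so checks with equal frequency keep their order.
    return sorted(checks, key=lambda check: check["hit_frequency"], reverse=True)


checks = [
    {"name": "x == 0", "hit_frequency": 0.05, "has_side_effects": False},
    {"name": "x == 1", "hit_frequency": 0.90, "has_side_effects": False},
    {"name": "x == 2", "hit_frequency": 0.05, "has_side_effects": False},
]
reordered = [c["name"] for c in reorder_checks(checks, mutually_exclusive=True)]
```

Because the checks are exclusive of each other, at most one of them can be satisfied for a given input, so testing the most frequently satisfied check first does not change which branch is taken.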

HETEROGENEOUS COMPUTER SYSTEM OPTIMIZATION
20170242672 · 2017-08-24

A method and system are provided for identifying a processing element for executing a computer program code module. The method includes: calculating a cyclomatic complexity score for the module; selecting one of a first or second processing element based on the calculated complexity score, the first processing element having a first architecture and the second processing element having a second architecture different from the first architecture, the first and second processing elements forming part of a heterogeneous computer system; running the module on the selected processing element to determine a first run time, and subsequently running the module on the non-selected processing element to determine a second run time; comparing the first and second run times to identify the shortest run time; and identifying the processing element producing the shortest run time as the processing element for executing the computer program code module.
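The selection flow can be sketched under a few stated assumptions. The cyclomatic complexity formula M = E - N + 2P is the standard McCabe definition. The element names `pe_a` and `pe_b`, the `threshold` value, and the `runners` callables that return a run time are hypothetical stand-ins, since the abstract does not say how the score maps to an architecture.

```python
def cyclomatic_complexity(edges, nodes, components=1):
    """McCabe cyclomatic complexity of a control-flow graph: M = E - N + 2P."""
    return edges - nodes + 2 * components


def choose_processing_element(module, complexity, runners, threshold=10):
    """Pick an initial element from the score, time both, return the faster one.

    `runners` maps an element name to a callable returning a run time for
    `module`. The threshold and element names are illustrative assumptions.
    """
    selected = "pe_a" if complexity > threshold else "pe_b"
    other = "pe_b" if selected == "pe_a" else "pe_a"
    first_time = runners[selected](module)   # run on the selected element first
    second_time = runners[other](module)     # then on the non-selected element
    best = selected if first_time <= second_time else other
    return {"initial": selected, "best": best,
            "times": {selected: first_time, other: second_time}}


# Stub runners with fixed run times in seconds, for illustration only.
runners = {"pe_a": lambda module: 2.0, "pe_b": lambda module: 1.0}
score = cyclomatic_complexity(edges=9, nodes=8)
result = choose_processing_element("module_x", score, runners)
```

The complexity score only decides which element is tried first; the measured run times decide which element is finally identified.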

System and method of loop vectorization by compressing indexes and data elements from iterations based on a control mask

Loop vectorization methods and apparatus are disclosed. An example method includes generating a first control mask for a set of iterations of a loop by evaluating a condition of the loop, wherein generating the first control mask includes setting a bit of the first control mask to a first value when the condition indicates that an operation of the loop is to be executed, and setting the bit of the first control mask to a second value when the condition indicates that the operation of the loop is to be bypassed. The example method also includes compressing indexes corresponding to the set of iterations of the loop according to the first control mask.
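A scalar sketch of the mask-and-compress idea follows. It assumes the first value is 1 and the second value is 0, and it uses plain Python lists in place of vector registers; the function names are illustrative and do not correspond to any particular instruction set.

```python
def control_mask(values, condition):
    """Build one mask bit per iteration: 1 = execute the operation, 0 = bypass it."""
    return [1 if condition(value) else 0 for value in values]


def compress(items, mask):
    """Pack the items whose mask bit is 1 into contiguous positions."""
    return [item for item, bit in zip(items, mask) if bit == 1]


data = [3, -1, 4, -5, 9]
mask = control_mask(data, lambda x: x > 0)           # [1, 0, 1, 0, 1]
active_indexes = compress(range(len(data)), mask)    # [0, 2, 4]
active_values = compress(data, mask)                 # [3, 4, 9]
```

After compression, the loop body only needs to process the packed indexes and data elements, so bypassed iterations do not occupy lanes.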

High throughput synchronous resource-constrained scheduling for model-based design
09740529 · 2017-08-22

A system and method for optimizing a system design that includes two or more components, where at least one component is to be implemented using a constrained resource. From an initial schedule, the resource having a longest span time between a start busy time slot and an end busy time slot is identified. The schedule for the other resources is then also extended to the span time. The resulting design can be made synchronous by inserting up-sampler and down-sampler function blocks before and after any strongly connected components.