G06F9/45

Extensible data parallel semantics

A high level programming language provides extensible data parallel semantics. User code specifies hardware and software resources for executing data parallel code using a compute device object and a resource view object. The user code uses the objects and semantic metadata to allow execution by new and/or updated types of compute nodes and new and/or updated types of runtime libraries. The extensible data parallel semantics allow the user code to be executed by the new and/or updated types of compute nodes and runtime libraries.

Method and apparatus for performing register allocation

A method is provided of performing register allocation for at least one program code module. The method includes constructing a restriction graph for program variables within at least one program instruction, and determining whether the constructed restriction graph is colorable. If it is determined that the constructed restriction graph is not colorable, then the method determines whether at least one alternative form of the at least one program instruction is available, and modifies the at least one program instruction to comprise an alternative form if it is determined that at least one alternative form is available.
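The fallback described above can be sketched as follows. This is a minimal illustration, not the claimed method: the abstract does not say how the restriction graph is built or colored, so the backtracking colorability test, the `forms` data layout, and the names `find_coloring` and `allocate` are all assumptions made for the example.

```python
def find_coloring(variables, conflicts, num_registers):
    """Return a variable -> register mapping if the graph is colorable, else None."""
    variables = list(variables)
    neighbors = {v: set() for v in variables}
    for a, b in conflicts:
        neighbors[a].add(b)
        neighbors[b].add(a)
    coloring = {}

    def assign(i):
        # Simple backtracking: try each register for variable i in turn.
        if i == len(variables):
            return True
        v = variables[i]
        for reg in range(num_registers):
            if all(coloring.get(n) != reg for n in neighbors[v]):
                coloring[v] = reg
                if assign(i + 1):
                    return True
                del coloring[v]
        return False

    return dict(coloring) if assign(0) else None


def allocate(forms, num_registers):
    """Try the original instruction form first, then each alternative form."""
    for form in forms:
        coloring = find_coloring(form["vars"], form["conflicts"], num_registers)
        if coloring is not None:
            return form["name"], coloring
    return None  # no available form is colorable


# Hypothetical instruction with one alternative form and two registers.
forms = [
    {"name": "original", "vars": "abc",
     "conflicts": [("a", "b"), ("b", "c"), ("a", "c")]},
    {"name": "alternative", "vars": "abc",
     "conflicts": [("a", "b"), ("b", "c")]},
]
chosen, registers = allocate(forms, num_registers=2)
```

With these inputs the original form needs three registers, so the sketch falls through to the alternative form, in which `a` and `c` can share a register.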

Information processing apparatus, communication method and information processing system for communication of global data shared by information processing apparatuses
09841919 · 2017-12-12 · ·

An information processing apparatus, among a plurality of information processing apparatuses, to which one of pieces of local data is assigned, the pieces of local data having been obtained by dividing global data shared by the plurality of information processing apparatuses, includes: a storage unit that includes a first storage area sectioned into prescribed units, and stores local data; a processor that executes a process including: detecting a plurality of continuous sections to which the target local data is to be written in a second storage area that is sectioned into the prescribed units in the different information processing apparatus, on the basis of storage area information that identifies data to which the target local data corresponds in the global data; and extracting as many pieces of local data as specified by the number of the continuous sections and transmitting the data to the different information processing apparatus.
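One way to picture the detection of continuous sections is to group destination section indices into runs, so that each run can be sent in a single transfer. The sketch below assumes the destination sections are already known as unique integer indices; the function name `contiguous_runs` and the `(start, length)` representation are hypothetical and do not come from the abstract.

```python
def contiguous_runs(dest_sections):
    """Group destination section indices into (start, length) runs."""
    runs = []
    for idx in sorted(dest_sections):
        if runs and idx == runs[-1][0] + runs[-1][1]:
            # idx directly follows the last run, so extend that run by one section.
            start, length = runs[-1]
            runs[-1] = (start, length + 1)
        else:
            runs.append((idx, 1))
    return runs


# Six sections collapse into three transfers.
runs = contiguous_runs([4, 5, 6, 10, 11, 20])
```

Each run's length corresponds to the number of pieces of local data that would be extracted and transmitted together.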

Fine-grained demand driven IPO infrastructure

Provided are methods and systems for inter-procedural optimization (IPO). A new IPO architecture (referred to as “ThinLTO”) is designed to address the weaknesses and limitations of existing IPO approaches, such as traditional Link Time Optimization (LTO) and Lightweight Inter-Procedural Optimization (LIPO), and to become a new link-time-optimization standard. With ThinLTO, demand-driven, summary-based, fine-grained importing maximizes the potential of Cross-Module Optimization (CMO), enabling as much useful CMO as possible. ThinLTO also provides for global indexing, which enables fast function importing; parallelizes some performance-critical but expensive inter-procedural analyses and transformations; utilizes demand-driven, lazy importing of debug information that minimizes memory consumption for the debug build; and allows easy integration of third-party distributed build systems. In addition, ThinLTO may also be implemented using an IPO server, thereby removing the need for the serial step.
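A rough sketch of demand-driven, summary-based importing over a global index might look like the following. The summary fields (`module`, `size`, `callees`), the size threshold, and the function name `compute_imports` are assumptions chosen for illustration; they are not taken from the abstract or from any particular compiler's implementation.

```python
from collections import deque


def compute_imports(index, module, threshold):
    """Return names of functions worth importing into `module`, using summaries only."""
    imports = set()
    # Start from the callees of every function defined in the module itself.
    work = deque(
        callee
        for summary in index.values() if summary["module"] == module
        for callee in summary["callees"]
    )
    while work:
        name = work.popleft()
        summary = index.get(name)
        if summary is None or summary["module"] == module or name in imports:
            continue
        if summary["size"] > threshold:
            continue  # too large to be worth importing
        imports.add(name)
        # Demand-driven: only callees of imported functions are examined.
        work.extend(summary["callees"])
    return sorted(imports)


# Hypothetical global index built from per-module summaries.
index = {
    "main":   {"module": "a", "size": 10,  "callees": ["helper", "big"]},
    "helper": {"module": "b", "size": 5,   "callees": ["leaf"]},
    "big":    {"module": "b", "size": 500, "callees": []},
    "leaf":   {"module": "c", "size": 3,   "callees": []},
}
selected = compute_imports(index, "a", threshold=50)
```

Because the decision uses only the index, no function body needs to be loaded until a function has actually been selected for import.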

Apparatus and method for handling registers in pipeline processing
09841957 · 2017-12-12 · ·

An apparatus stores a program including a description of loop processing that iterates a plurality of instructions, and rearranges an execution sequence of the plurality of instructions in the program such that the loop processing is pipelined by software pipelining. The apparatus inserts an instruction that uses a register for a single instruction, multiple data (SIMD) extension instruction into the description of the loop processing in the program.

Method and system for automatic code generation

A method is provided for generating production code from a block diagram in a technical computing environment on a host computer. A first block receives a first input signal that has a plurality of elements. A size of a first required signal of an external function is determined and compared to a size of the first input signal. When the size of the first required signal corresponds to the size of an element in the first input signal, production code is generated that encloses a call of the external function in a loop consecutively addressing each of the plurality of elements in the first input signal. When the size of the first required signal corresponds to the size of the first input signal, production code is generated that calls the external function without an enclosing loop over the elements in the first input signal.
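The size comparison can be illustrated with a small generator that emits C-like text. This is a sketch under stated assumptions: sizes are counted in scalar entries, the signal is a flat array, and the names `generate_call`, `ext_fn`, and `u` are hypothetical rather than part of the described method.

```python
def generate_call(func, required_size, signal, element_size, element_count):
    """Emit a direct call or a loop around the call, depending on the sizes."""
    total_size = element_size * element_count
    if required_size == total_size:
        # The function consumes the whole signal: no enclosing loop.
        return f"{func}({signal});"
    if required_size == element_size:
        # The function consumes one element: loop over all elements.
        return (
            f"for (int i = 0; i < {element_count}; i++) {{\n"
            f"    {func}(&{signal}[i * {element_size}]);\n"
            f"}}"
        )
    raise ValueError("signal size does not match the external function")


looped = generate_call("ext_fn", 3, "u", element_size=3, element_count=4)
direct = generate_call("ext_fn", 12, "u", element_size=3, element_count=4)
```

Here `looped` holds a four-iteration loop that passes one three-entry element per call, while `direct` holds a single call on the whole signal.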

Embedding software updates into content retrieved by applications
09841971 · 2017-12-12 · ·

The disclosed embodiments provide a system that facilitates execution of an application that is updated after undergoing an approval process with a digital application distribution platform on an electronic device. During operation, the system obtains content for display within the application from a server. Next, the system identifies, within the content, an update to the application. The system then modifies execution of the application during runtime of the application by applying the update without reloading the application on the electronic device and without downloading the update from the digital application distribution platform.

FORMAT-SPECIFIC DATA PROCESSING OPERATIONS
20170351494 · 2017-12-07 ·

A method includes analyzing, by a processor, a first version of a computer program, the analyzing including identifying a first process included in the first version of the computer program, the first process configured to perform an operation on data having a first format; and by a processor, generating a second version of at least a portion of the computer program, including omitting the first process and including in the second version of the at least portion of the computer program one or more second processes configured to perform a second operation on data of a second format different from the first format, wherein the second operation is based on the first operation.

PROCESSOR THAT INCLUDES A SPECIAL STORE INSTRUCTION USED IN REGIONS OF A COMPUTER PROGRAM WHERE MEMORY ALIASING MAY OCCUR
20170351496 · 2017-12-07 ·

Processor hardware detects when memory aliasing occurs, and assures proper operation of the code even in the presence of memory aliasing. The processor defines a special store instruction that is different from a regular store instruction. The special store instruction is used in regions of the computer program where memory aliasing may occur. Because the hardware can detect and correct for memory aliasing, this allows a compiler to make optimizations such as register promotion even in regions of the code where memory aliasing may occur.

GENERATING EXECUTABLE FILES THROUGH COMPILER OPTIMIZATION
20170351499 · 2017-12-07 ·

Embodiments of the present invention may track a user's interaction trajectory associated with a problem that occurred on a website. According to an embodiment of the present invention, a first symbol of a first definition associated with a first object file is obtained. Then, in response to the first symbol matching a second symbol of a second definition associated with a second object file, the first object file is optimized based on a first segment associated with the first definition in the first object file, and an optimization to the second object file is skipped. Next, an executable file is generated based on the optimized first object file and the second object file.