G06F8/456

Information processing method and computer-readable recording medium having stored therein optimization program
11169814 · 2021-11-09

An information processing method executed by a computer includes: executing a target program to acquire the number of executions of each of a plurality of program codes; selecting, based on the acquired numbers of executions, a combination of program codes related to a plurality of assignment statements from among program codes related to assignment statements having a higher number of executions; when the target program is changed so that parallel processing using a SIMD operation function is executed for each of the program codes related to the plurality of assignment statements included in the selected combination, executing the changed target program to calculate an execution accuracy and an operation time; and searching for the combination so that the calculated execution accuracy and operation time satisfy a predetermined condition.
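
The search described above can be illustrated with a minimal sketch. It assumes the profile is a dictionary of execution counts and that a hypothetical `evaluate` callback stands in for changing and re-running the target program; the names `search_combination`, `fake_evaluate`, `top_k` and `max_error` are illustrative assumptions and are not taken from the abstract. The predetermined condition is assumed here to be "error within a tolerance, shortest time".

```python
from itertools import combinations

def search_combination(exec_counts, evaluate, top_k=3, max_error=1e-6):
    """Pick the fastest combination of hot statements within an error bound.

    exec_counts: dict mapping statement id -> profiled execution count.
    evaluate:    hypothetical callback; given a tuple of statement ids to
                 vectorize, returns (error, seconds) for the changed program.
    """
    # Keep only the statements with the highest execution counts.
    hot = sorted(exec_counts, key=exec_counts.get, reverse=True)[:top_k]
    best = None
    for size in range(1, len(hot) + 1):
        for combo in combinations(hot, size):
            error, seconds = evaluate(combo)
            # Assumed condition: accuracy within tolerance, shortest time wins.
            if error <= max_error and (best is None or seconds < best[1]):
                best = (combo, seconds)
    return best

def fake_evaluate(combo):
    """Toy stand-in for executing the changed target program."""
    savings = {"s1": 4.0, "s2": 3.0, "s3": 0.5}
    errors = {"s1": 0.0, "s2": 1e-3, "s3": 1e-9}
    seconds = 10.0 - sum(savings[s] for s in combo)
    error = sum(errors[s] for s in combo)
    return error, seconds

counts = {"s1": 900, "s2": 800, "s3": 700, "s4": 5}
result = search_combination(counts, fake_evaluate)
print(result)
```

In this toy data, statement `s2` gives a large speed-up but breaks the error bound, so every combination containing it is rejected. An exhaustive search is used only because the candidate set is tiny; a real search would likely need pruning.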

SPARSITY UNIFORMITY ENFORCEMENT FOR MULTICORE PROCESSOR

Methods and systems relating to the field of parallel computing are disclosed herein. The methods and systems disclosed include approaches for sparsity uniformity enforcement for a set of computational nodes which are used to execute a complex computation. A disclosed method includes determining a sparsity distribution in a set of operand data, and generating, using a compiler, a set of instructions for executing, using the set of operand data and a set of processing cores, a complex computation. Alternatively, the method includes altering the operand data. The method also includes distributing the set of operand data to the set of processing cores for use in executing the complex computation in accordance with the set of instructions. Either the altering is conducted to, or the compiler is programmed to, balance the sparsity distribution among the set of processing cores.
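
One way to picture the balancing step is a greedy assignment of operand blocks to cores by non-zero count. The sketch below is an assumption for illustration only: the function `balance_sparsity`, the block representation as flat lists, and the largest-first heuristic are not described in the abstract.

```python
import heapq

def balance_sparsity(blocks, num_cores):
    """Assign blocks to cores so that non-zero counts are roughly even.

    blocks: list of flat lists of numbers (hypothetical operand tiles).
    Returns (assignment, loads): block indices per core, non-zeros per core.
    """
    # Non-zero count per block, largest first.
    nnz = [(sum(1 for v in block if v != 0), i) for i, block in enumerate(blocks)]
    nnz.sort(reverse=True)
    # Min-heap of (current non-zero load, core id).
    heap = [(0, core) for core in range(num_cores)]
    assignment = {core: [] for core in range(num_cores)}
    for count, idx in nnz:
        load, core = heapq.heappop(heap)  # least-loaded core so far
        assignment[core].append(idx)
        heapq.heappush(heap, (load + count, core))
    loads = {core: load for load, core in heap}
    return assignment, loads

blocks = [[1, 2, 3, 4], [0, 0, 0, 1], [1, 1, 0, 0],
          [0, 0, 0, 0], [5, 0, 6, 7], [0, 1, 0, 0]]
assignment, loads = balance_sparsity(blocks, num_cores=2)
print(assignment, loads)
```

Each core receives three blocks here, and the non-zero work differs by only one element between the two cores.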

COMPUTER SYSTEM AND METHOD FOR VALIDATION OF PARALLELIZED COMPUTER PROGRAMS
20230153112 · 2023-05-18

Validation of correct derivation of a parallel program from a sequential program for deployment of the parallel program to a plurality of processing units is described. The system receives the program code of the sequential program and the program code of the parallel program. A static analysis component computes a first control flow graph, and determines dependencies within the sequential program code. It further computes a further control flow graph for each thread or process of the parallel program and determines dependencies within the further control flow graphs. A checking component checks if the sequential program and the derived parallel program are semantically equivalent by comparing the respective first and further control flow graphs and respective dependencies. A release component declares a correct derivation state for the parallel program to qualify the parallel program for deployment if the derived parallel program and the sequential program are semantically equivalent.
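
A much-simplified sketch of the dependency comparison follows. It assumes statements carry identifiers shared between the sequential and parallel programs, that each thread is an ordered list of statements, and that cross-thread ordering is given as explicit synchronization pairs. The function `dependencies_preserved` and this data model are illustrative assumptions. The sketch checks only direct ordering, not transitive synchronization or full semantic equivalence.

```python
def dependencies_preserved(seq_deps, threads, sync_edges):
    """Check that every sequential dependency is still ordered in parallel code.

    seq_deps:   set of (a, b) pairs: statement b depends on statement a.
    threads:    list of per-thread statement lists, in program order.
    sync_edges: set of (a, b) pairs: a is guaranteed to finish before b starts.
    """
    # Map each statement to (thread index, position within that thread).
    position = {}
    for t, stmts in enumerate(threads):
        for i, stmt in enumerate(stmts):
            position[stmt] = (t, i)
    for a, b in seq_deps:
        if a not in position or b not in position:
            return False  # a statement was lost during parallelization
        (ta, ia), (tb, ib) = position[a], position[b]
        if ta == tb:
            if ia >= ib:
                return False  # order reversed inside one thread
        elif (a, b) not in sync_edges:
            return False  # cross-thread dependency without synchronization
    return True

seq_deps = {("s1", "s2"), ("s1", "s3")}
threads = [["s1", "s2"], ["s3"]]
ok = dependencies_preserved(seq_deps, threads, {("s1", "s3")})
missing_sync = dependencies_preserved(seq_deps, threads, set())
print(ok, missing_sync)
```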

Offload computing protocol

Systems and methods are provided for offloading computing tasks from constrained devices. An example apparatus includes an offload computing protocol (OCP) enabled device. The OCP enabled device includes OCP extensions to the operating system to enable the offloading of computing tasks. A proximity locator may use a radio transceiver to locate an OCP device that can accept a computing task. The OCP enabled device may include an OCP bundle comprising code and data, wherein the OCP bundle is to be sent to the OCP device.

EDIT AUTOMATION USING AN ANCHOR TARGET LIST

Edit automation functionality generalizes edits performed by a user in a document, locates similar text, and recommends or applies transforms while staying within a current workflow. Source code edits such as refactoring are automated. The functionality uses or provides anchor target lists, temporal edit patterns, edit graphs, automatable edit sequence libraries, and other data structures and computational techniques for identifying locations appropriate for particular edits, for getting transforms, for selecting optimal transforms, for leveraging transforms in an editing session or later, and for displaying transform recommendations and results. The edit automation functionality enhances automation subtool generation, discoverability, and flexibility, for refactoring, snippet insertion, quick actions in an integrated development environment, and other automatable edit sequences.

Instruction Set

The invention relates to a computer program comprising a sequence of instructions for execution on a processing unit having instruction storage for holding the computer program, an execution unit for executing the computer program and data storage for holding data, the computer program comprising one or more computer executable instructions which, when executed, implement: a send function which causes a data packet destined for a recipient processing unit to be transmitted on a set of connection wires connected to the processing unit, the data packet having no destination identifier but being transmitted at a predetermined transmit time; and a switch control function which causes the processing unit to control switching circuitry to connect a set of connection wires of the processing unit to a switching fabric to receive a data packet at a predetermined receive time.

Automated expression parallelization

A system is capable of automatically adjusting or reconstructing a baseline expression to generate a parallelized expression. Evaluation of the parallelized expression provides substantially the same output as evaluation of the baseline query, in a more efficient manner. In some implementations, data indicating an expression to be evaluated on a primary thread of the one or more processors is obtained. Elements of the expression are identified. The elements are grouped into a parse tree representation. Elements of the expression are classified as belonging to either a first category that includes elements that are eligible for parallel processing or a second category that includes elements that are not eligible for parallel processing. A particular element that is classified as belonging to the first category is identified and evaluated on a non-primary thread of the one or more processors. The non-primary thread is evaluated in parallel with the primary thread.
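
The parse, classify and offload steps can be sketched with Python's standard `ast` and `concurrent.futures` modules. The eligibility rule used here (a call to a function named in a caller-supplied `safe_names` set) and the helper names `is_eligible`, `evaluate` and `evaluate_parallel` are assumptions made for illustration; the abstract does not specify how elements are classified.

```python
import ast
import operator
from concurrent.futures import ThreadPoolExecutor

OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

def is_eligible(node, safe_names):
    """First category: a call to a function assumed safe to run in parallel."""
    return (isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in safe_names)

def evaluate(node, env):
    """Evaluate one parse-tree element on the calling thread."""
    code = compile(ast.Expression(body=node), "<expr>", "eval")
    return eval(code, {"__builtins__": {}}, env)

def evaluate_parallel(source, env, safe_names):
    tree = ast.parse(source, mode="eval").body
    if not (isinstance(tree, ast.BinOp) and type(tree.op) in OPS):
        return evaluate(tree, env)  # nothing to split; stay on primary thread
    combine = OPS[type(tree.op)]
    if is_eligible(tree.right, safe_names):
        with ThreadPoolExecutor(max_workers=1) as pool:
            future = pool.submit(evaluate, tree.right, env)  # non-primary thread
            left = evaluate(tree.left, env)                  # primary thread
            return combine(left, future.result())
    return combine(evaluate(tree.left, env), evaluate(tree.right, env))

env = {"total": lambda xs: sum(xs), "a": [1, 2, 3], "b": [4, 5]}
value = evaluate_parallel("total(a) + total(b)", env, {"total"})
print(value)
```

Only the top-level binary operation is split in this sketch; a fuller version would walk the whole parse tree and classify every element.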

System and method of optimizing instructions for quantum computers

A quantum computing system includes a quantum processor having a plurality of qubits, a classical memory, and a classical processor. The classical processor is configured to compile a quantum program into logical assembly instructions in an intermediate language, aggregate the logical assembly instructions together into a plurality of logical blocks of instructions, generate a logical schedule for the quantum program based on commutativity between the plurality of logical blocks, generate a tentative physical schedule based on the logical schedule, the tentative physical schedule including a mapping of the logical assembly instructions in the logical schedule onto the plurality of qubits of the quantum processor, aggregate instructions together within the tentative physical schedule that do not reduce parallelism, thereby generating an updated physical schedule, generate optimized control pulses for the aggregated instructions, and execute the quantum program on the quantum processor with the optimized control pulses and the updated physical schedule.
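
As a rough illustration of the scheduling idea only, the sketch below places instructions into layers as early as possible, treating two instructions as reorderable when they act on disjoint qubits. That test is a simplifying assumption: it is a sufficient condition for commutativity, not the commutativity analysis of the abstract, and the function name `asap_layers` and the gate tuples are hypothetical.

```python
def asap_layers(gates):
    """Place each gate in the earliest layer where all of its qubits are free.

    gates: list of (name, qubits) in program order.
    Returns a list of layers, each a list of gate names that can run together.
    """
    ready_at = {}  # qubit -> index of the first layer where it is free
    layers = []
    for name, qubits in gates:
        layer = max((ready_at.get(q, 0) for q in qubits), default=0)
        if layer == len(layers):
            layers.append([])
        layers[layer].append(name)
        for q in qubits:
            ready_at[q] = layer + 1
    return layers

program = [("h0", (0,)), ("h1", (1,)), ("cx01", (0, 1)),
           ("x2", (2,)), ("cx12", (1, 2))]
schedule = asap_layers(program)
print(schedule)
```

The gate `x2` moves ahead of `cx01` into the first layer because it shares no qubit with any earlier gate, which shortens the schedule from five sequential steps to three layers.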

BACKGROUND PROCESSING DURING REMOTE MEMORY ACCESS
20220214827 · 2022-07-07

An apparatus for executing a software program, comprising at least one hardware processor configured for: identifying in a plurality of computer instructions at least one remote memory access instruction and a following instruction following the at least one remote memory access instruction; executing after the at least one remote memory access instruction a sequence of other instructions, where the sequence of other instructions comprises a return instruction to execute the following instruction; and executing the following instruction; wherein executing the sequence of other instructions comprises executing an updated plurality of computer instructions produced by at least one of: inserting into the plurality of computer instructions the sequence of other instructions or at least one flow-control instruction to execute the sequence of other instructions; and replacing the at least one remote memory access instruction with at least one non-blocking memory access instruction.
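
The effect of replacing a blocking remote access with a non-blocking one can be imitated in software. The sketch below is an analogy under stated assumptions rather than the claimed apparatus: `remote_read` is a hypothetical stand-in for a slow remote memory access, a thread pool plays the role of the non-blocking access instruction, and `background_tasks` plays the role of the sequence of other instructions.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def remote_read(address):
    """Hypothetical slow remote memory access."""
    time.sleep(0.05)
    return address * 2

def run_with_background_work(address, background_tasks):
    """Issue the access, run other work while it is pending, then resume."""
    log = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(remote_read, address)  # non-blocking access issued
        for task in background_tasks:                # sequence of other instructions
            log.append(task())
        value = pending.result()                     # return to the original flow
    log.append(("following", value + 1))             # the following instruction
    return log

log = run_with_background_work(21, [lambda: "a", lambda: "b"])
print(log)
```

The background tasks complete while the remote read is still in flight, and the following instruction runs only once the value is available.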