G06F8/453

Method and apparatus for compiling code based on a dependency tree
09823911 · 2017-11-21

A compiling apparatus generates a dependency tree representing dependency relations among a plurality of instructions included in first code. The compiling apparatus detects, from the dependency tree, a partial tree including a first instruction, a second instruction, and a third instruction that depends on the operation results of the first and second instructions, and rewrites the instructions corresponding to the partial tree to a set of instructions including a plurality of complex instructions each of which causes a processor to perform a complex operation including a plurality of operations. The compiling apparatus generates second code on the basis of the dependency tree and the set of instructions.
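The detection-and-rewrite step described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Node` class, the function names, and the choice of a fused multiply-add (`fmadd`) as the "complex instruction" are hypothetical, and the abstract does not name any specific instruction set or matching rule.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """One node of a toy dependency tree (hypothetical representation)."""
    op: str                                   # "add", "mul", "fmadd", or "leaf"
    operands: List["Node"] = field(default_factory=list)
    name: Optional[str] = None                # variable name or literal for leaves


def find_partial_trees(root: Node) -> List[Node]:
    """Return nodes that depend on the results of two other instructions."""
    found, stack = [], [root]
    while stack:
        node = stack.pop()
        # A match is a node with two operands that are themselves instructions.
        if len(node.operands) == 2 and all(c.operands for c in node.operands):
            found.append(node)
        stack.extend(node.operands)
    return found


def rewrite(node: Node) -> Node:
    """Rewrite add(mul(a, b), mul(c, d)) into two chained fmadd instructions.

    Assumption: fmadd(x, y, z) computes x * y + z in one instruction.
    """
    if node.op == "add" and len(node.operands) == 2:
        first, second = node.operands
        if first.op == "mul" and second.op == "mul":
            a, b = first.operands
            c, d = second.operands
            zero = Node("leaf", name="0")
            inner = Node("fmadd", [a, b, zero])    # a * b + 0
            return Node("fmadd", [c, d, inner])    # c * d + (a * b + 0)
    return node  # no matching pattern: leave the instruction unchanged


def evaluate(node: Node, env: Dict[str, float]) -> float:
    """Interpret the tree so the rewrite can be checked for equivalence."""
    if not node.operands:
        return env[node.name] if node.name in env else float(node.name)
    vals = [evaluate(c, env) for c in node.operands]
    if node.op == "add":
        return vals[0] + vals[1]
    if node.op == "mul":
        return vals[0] * vals[1]
    if node.op == "fmadd":
        return vals[0] * vals[1] + vals[2]
    raise ValueError(f"unknown op: {node.op}")


def leaf(name: str) -> Node:
    return Node("leaf", name=name)


tree = Node("add", [Node("mul", [leaf("a"), leaf("b")]),
                    Node("mul", [leaf("c"), leaf("d")])])
env = {"a": 2.0, "b": 3.0, "c": 4.0, "d": 5.0}
matches = find_partial_trees(tree)
rewritten = rewrite(matches[0])
```

In this toy case the three original instructions (two multiplies and one add) become two instructions that each perform a multiply and an add, and `evaluate` gives the same result for both trees.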

Hardware acceleration method, compiler, and device

A hardware acceleration method includes: obtaining compilation policy information and source code, where the compilation policy information indicates that a first code type matches a first processor and a second code type matches a second processor; analyzing a code segment in the source code according to the compilation policy information; determining a first code segment belonging to the first code type or a second code segment belonging to the second code type; compiling the first code segment into first executable code; sending the first executable code to the first processor; compiling the second code segment into second executable code; and sending the second executable code to the second processor.
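The policy-driven routing in this abstract can be sketched as follows. The policy table, the code-type names, the loop-based classification heuristic, and the stub compiler are all hypothetical; the abstract does not say how code types are recognized or which processors are involved.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical compilation policy: code type -> matching processor.
POLICY: Dict[str, str] = {"control": "cpu", "data_parallel": "accelerator"}


def classify(segment: str) -> str:
    """Toy heuristic (an assumption): a segment with a loop is data-parallel."""
    return "data_parallel" if "for " in segment else "control"


def stub_compile(segment: str, processor: str) -> str:
    """Stand-in for a real per-target compiler back end."""
    return f"{processor}-binary({len(segment)} chars)"


def dispatch(segments: List[str],
             policy: Dict[str, str],
             compile_fn: Callable[[str, str], str]) -> Dict[str, List[str]]:
    """Compile each segment for the processor its code type maps to."""
    outbox: Dict[str, List[str]] = defaultdict(list)
    for segment in segments:
        processor = policy[classify(segment)]
        # "Sending" is modeled as appending to that processor's outbox.
        outbox[processor].append(compile_fn(segment, processor))
    return dict(outbox)


segments = ["x = read_config()", "for i in range(n): y[i] = a * x[i]"]
result = dispatch(segments, POLICY, stub_compile)
```

Keeping the policy as data rather than code means the mapping from code type to processor can change without touching the analysis or dispatch logic.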

SYSTEMS AND METHODS FOR MINIMIZING COMMUNICATIONS

A system for allocating one or more data structures used in a program across a number of processing units takes into account the memory access pattern of each data structure and the total amount of memory available for duplication across the processing units. Using these parameters, duplication factors are determined for the one or more data structures such that the cost of remote communication is minimized when the data structures are duplicated according to their respective duplication factors, while still allowing parallel execution of the program.
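As a rough illustration of choosing duplication factors under a memory budget, the sketch below searches all combinations by brute force. The cost model is an assumption made for this example only: with `f` copies spread over `P` units, a fraction `(P - f) / P` of accesses is treated as remote. The abstract does not specify the cost function or the search method.

```python
from itertools import product
from typing import Dict, Optional, Tuple


def choose_duplication_factors(
    structures: Dict[str, Tuple[int, int]],   # name -> (size, access count)
    num_units: int,
    memory_budget: int,
) -> Optional[Tuple[float, Dict[str, int]]]:
    """Return (cost, factors) minimizing the toy remote-access cost.

    Returns None when no combination fits in the memory budget.
    """
    names = list(structures)
    best: Optional[Tuple[float, Dict[str, int]]] = None
    for factors in product(range(1, num_units + 1), repeat=len(names)):
        memory = sum(structures[n][0] * f for n, f in zip(names, factors))
        if memory > memory_budget:
            continue  # this combination exceeds the memory available
        cost = sum(structures[n][1] * (num_units - f) / num_units
                   for n, f in zip(names, factors))
        if best is None or cost < best[0]:
            best = (cost, dict(zip(names, factors)))
    return best


# A small, frequently accessed structure and a large, rarely accessed one.
structures = {"A": (10, 100), "B": (40, 10)}
best = choose_duplication_factors(structures, num_units=4, memory_budget=80)
```

With these numbers the search fully replicates the small, hot structure and keeps a single copy of the large, cold one. Brute force is only practical for a handful of structures; a real allocator would need a scalable optimization method.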

PARALLEL DECOMPOSITION AND RESTORATION OF DATA CHUNKS
20220043806 · 2022-02-10

A system for parallel decomposition and restoration of data chunks is provided, wherein a decomposable transformer service module analyzes and decomposes data into data chunks and transformations that restore the original data from the chunks, enabling efficient storage, modification, and restoration of program code across a number of target devices using a central repository.

Accelerating application modernization

Various embodiments of the present technology generally relate to the characterization and improvement of software applications. More specifically, some embodiments relate to systems and methods for modeling code behavior and generating new versions of the code based on the code behavior models. In some embodiments, a method of improving a codebase includes recording a run of the existing code, characterizing the code behavior via one or more models, prototyping new code according to a target language and target environment, deploying the new code to the target environment, and comparing the behavior of the new code to the behavior of the existing code. In some implementations, generating new code based on the behavior models includes using one or more machine learning techniques for code generation based on the target language and environment.

Cognitive automation-based engine to propagate data across systems

Aspects of the disclosure relate to cognitive automation-based engine processing to propagate data across multiple systems via a private network to overcome technical system, resource consumption, and architecture limitations. Data to be propagated can be manually input or extracted from a digital file. The data can be parsed by analyzing for correct syntax, normalized into first through sixth normal forms, segmented into packets for efficient data transmission, validated to ensure that the data satisfies defined formats and input criteria, and distributed into a plurality of data stores coupled to the private network, thereby propagating data without repetitive manual entry. The data may also be enriched by, for example, correcting for any errors or linking with other potentially related data. Based on data enrichment, recommendations of additional target(s) for propagation of data can be identified. Reports may also be generated. The cognitive automation may be performed in real-time to expedite processing.

Methods and apparatus for automatic communication optimizations in a compiler based on a polyhedral representation

Methods, apparatus, and computer software products for source code optimization are provided. In an exemplary embodiment, a first custom computing apparatus is used to optimize the execution of source code on a second computing apparatus. In this embodiment, the first custom computing apparatus contains a memory, a storage medium and at least one processor with at least one multi-stage execution unit. The second computing apparatus contains at least one local memory unit that allows for data reuse opportunities. The first custom computing apparatus optimizes the code for reduced-communication execution on the second computing apparatus.

CLOCK GATING AND CLOCK SCALING BASED ON RUNTIME APPLICATION TASK GRAPH INFORMATION

An apparatus to facilitate clock gating and clock scaling based on runtime application task graph information is disclosed. The apparatus includes a processor to: receive, from a compiler, a bitstream generated from code of an application, the bitstream related to a workload of the application; generate a task graph of the application using at least part of the bitstream, the task graph to represent one of a relationship and dependency of the code; program the bitstream to an accelerator device, wherein the bitstream is to configure the accelerator device to support the workload of the application; execute one or more kernels of the code using the accelerator device; identify one or more optimizations for the accelerator device based on the task graph of the application; and transmit a command to cause the one or more optimizations to be implemented in at least one region of the accelerator device.

REMOTE APPLICATION MODERNIZATION

Various embodiments of the present technology generally relate to the characterization and improvement of software applications. More specifically, some embodiments relate to systems and methods for modeling code behavior and generating new versions of the code based on the code behavior models. In some embodiments, a method of improving a codebase includes recording a run of the existing code, characterizing the code behavior via one or more models, prototyping new code according to a target language and target environment, deploying the new code to the target environment, and comparing the behavior of the new code to the behavior of the existing code. In some implementations, generating new code based on the behavior models includes using one or more machine learning techniques for code generation based on the target language and environment.

Head Of Line Blocking Mitigation In A Reconfigurable Data Processor

A coarse-grained reconfigurable (CGR) processor comprises a first network and a second network; a plurality of agents coupled to the first network; an array of CGR units coupled together by the second network; and a tile agent coupled between the first network and the second network. The tile agent comprises a plurality of links, a plurality of credit counters associated with respective agents of the plurality of agents, a plurality of credit-hog counters associated with respective links of the plurality of links, and an arbiter to manage access to the first network from the plurality of links based on their associated credit-hog counters. Furthermore, a credit-hog counter of the plurality of credit-hog counters changes in response to processing a request for a transaction from its associated link.
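The interaction between credit counters, credit-hog counters, and the arbiter can be sketched in software as follows. The arbitration rule used here (grant the eligible link with the lowest credit-hog counter, then increment that counter) is an assumption chosen for illustration; the abstract only states that arbitration is based on the credit-hog counters and that a counter changes when a request from its link is processed. The class and method names are hypothetical.

```python
from typing import Dict, List, Optional


class TileAgentArbiter:
    """Toy model of a tile agent arbitrating links onto the first network."""

    def __init__(self, num_links: int, agent_credits: Dict[str, int]) -> None:
        self.hog: List[int] = [0] * num_links          # one counter per link
        self.credits: Dict[str, int] = dict(agent_credits)  # one per agent

    def grant(self, requests: Dict[int, str]) -> Optional[int]:
        """Pick one requesting link; requests maps link -> target agent."""
        # A link is eligible only if its target agent still has credits, so a
        # link waiting on an exhausted agent does not block the other links.
        eligible = [link for link, agent in requests.items()
                    if self.credits.get(agent, 0) > 0]
        if not eligible:
            return None
        # Assumed policy: favor the link that has consumed the fewest grants.
        link = min(eligible, key=lambda l: (self.hog[l], l))
        self.credits[requests[link]] -= 1   # consume one credit of the agent
        self.hog[link] += 1                 # counter changes on processing
        return link

    def return_credit(self, agent: str) -> None:
        """Model the agent completing a transaction and freeing a credit."""
        self.credits[agent] += 1


arbiter = TileAgentArbiter(num_links=2, agent_credits={"A": 1, "B": 2})
requests = {0: "A", 1: "B"}   # link 0 targets agent A, link 1 targets agent B
grants = [arbiter.grant(requests) for _ in range(4)]
arbiter.return_credit("A")
after_return = arbiter.grant(requests)
```

In this run, link 0 is granted once and then stalls because agent A is out of credits, while link 1 continues to be served until agent B is also exhausted. Once a credit for A is returned, link 0 is granted again.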