G06F8/45

SYSTEM AND METHOD FOR OPTIMIZING ASSESSMENT AND IMPLEMENTATION OF MICROSERVICES CODE FOR CLOUD PLATFORMS

A system and a method for application transformation to cloud by converting an application source code to a cloud-native code are provided. First and second transformation recommendation paths are received, and a set of remediation templates is applied based on the first and second transformation recommendation paths, where the set of remediation templates comprises pre-defined parameterized actions. The system comprises a microservices unit configured to optimize assessment and implementation of microservices code for multiple target cloud platforms by determining a count of microservices anti-patterns in the microservices code, wherein the anti-patterns represent patterns of the microservices code, and by ascertaining a current state of the microservices code through a maturity score. A set of repeatable steps associated with microservices code development is provided in a bundled form for accelerated implementation of changes in the microservices code for deployment on the multiple target cloud platforms.
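The assessment step can be pictured as a simple scorer: count anti-pattern occurrences in the code and derive a maturity score from that count. The pattern signatures, penalty weight, and score scale below are illustrative assumptions, not the patented method.

```python
import re

# Illustrative anti-pattern signatures (assumed, not from the patent).
ANTI_PATTERNS = {
    "hardcoded_endpoint": re.compile(r"http://[\w.:/-]+"),
    "shared_database": re.compile(r"\bSHARED_DB\b"),
    "synchronous_chain": re.compile(r"\.call_service\("),
}

def count_anti_patterns(source: str) -> dict:
    """Count occurrences of each anti-pattern in the microservices code."""
    return {name: len(rx.findall(source)) for name, rx in ANTI_PATTERNS.items()}

def maturity_score(counts: dict, max_score: int = 100, penalty: int = 10) -> int:
    """Ascertain a current-state maturity score: each hit costs `penalty` points."""
    return max(0, max_score - penalty * sum(counts.values()))

code = 'db = connect(SHARED_DB)\nresp = client.call_service("http://inventory:8080")'
counts = count_anti_patterns(code)
score = maturity_score(counts)   # 3 anti-pattern hits -> score 70
```

A real assessment unit would draw its patterns from a remediation template catalog rather than hard-coded regexes; the shape of the computation (count, then score) is the point here.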

APPARATUS AND METHOD FOR SECONDARY OFFLOADS IN GRAPHICS PROCESSING UNIT

The invention relates to an apparatus for secondary offloads in a graphics processing unit (GPU). The apparatus includes an engine and a compute unit (CU). The CU is arranged operably to: fetch execution codes; when an execution code is suitable to be executed by the CU, execute it; and when an execution code is not suitable to be executed by the CU, generate a corresponding entry and send a request with the corresponding entry to the engine, instructing the engine to allow a component inside or outside of the GPU to complete an operation in accordance with the corresponding entry.
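The fetch-and-decide loop described above can be sketched as follows; the set of CU-supported opcodes, the entry layout, and the engine behavior are invented stand-ins, not the hardware design.

```python
from dataclasses import dataclass, field

@dataclass
class Engine:
    """Stand-in for the offload engine: records entries it is asked to complete."""
    completed: list = field(default_factory=list)

    def request(self, entry):
        # In hardware the engine would route the entry to a component inside
        # or outside the GPU; here we simply record it as completed.
        self.completed.append(entry)

CU_SUPPORTED = {"fmul", "fadd"}   # codes the compute unit can run itself (assumed)

def run(codes, engine):
    """Fetch each code; execute locally if suitable, otherwise offload."""
    executed = []
    for op, *args in codes:
        if op in CU_SUPPORTED:
            executed.append((op, args))       # suitable: execute on the CU
        else:
            entry = {"op": op, "args": args}  # not suitable: build an entry
            engine.request(entry)             # and send a request to the engine
    return executed

engine = Engine()
done = run([("fmul", 2, 3), ("atomic_global", 7), ("fadd", 1, 1)], engine)
```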

Large lookup tables for an image processor
11321802 · 2022-05-03

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for supporting large lookup tables on an image processor. One of the methods includes receiving an input kernel program for an image processor having a two-dimensional array of execution lanes, a shift-register array, and a plurality of memory banks. If the kernel program has an instruction that reads a lookup table value for a lookup table partitioned across the plurality of memory banks, the instruction in the kernel program is replaced with a sequence of instructions that, when executed by an execution lane, causes the execution lane to read a first value from a local memory bank and a second value from the local memory bank on behalf of another execution lane belonging to a different group of execution lanes.
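A minimal model of the partitioned-table access might look like the sketch below: each lane group owns one bank's partition, and an index outside the local partition is modeled as being read by a lane local to the home bank on the requester's behalf. Bank count, table size, and partitioning scheme are all assumptions for illustration.

```python
NUM_BANKS = 2
TABLE = list(range(100, 108))           # the full lookup table (8 entries)
PART = len(TABLE) // NUM_BANKS          # entries held by each bank partition
banks = [TABLE[i * PART:(i + 1) * PART] for i in range(NUM_BANKS)]

def lookup(group: int, index: int) -> int:
    """A lane in `group` wants TABLE[index]; every read targets a single bank."""
    home_bank, offset = divmod(index, PART)
    if home_bank == group:
        return banks[group][offset]     # first value: the lane's own local read
    # second value: modeled as a lane local to `home_bank` reading on behalf
    # of the requesting lane, with the result shifted across lane groups
    return banks[home_bank][offset]

assert lookup(0, 1) == 101   # index lives in the lane's own partition
assert lookup(0, 5) == 105   # served by a lane in the other group
```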

APPARATUS AND METHOD FOR SECONDARY OFFLOADS IN GRAPHICS PROCESSING UNIT

The invention relates to an apparatus for secondary offloads in a graphics processing unit (GPU). The apparatus includes an engine and a compute unit (CU). The engine is arranged operably to store an operation table including entries. The CU is arranged operably to: fetch computation codes, including execution codes and synchronization requests; execute each execution code; and send requests to the engine in accordance with the synchronization requests, instructing the engine to allow components inside or outside of the GPU to complete operations in accordance with the entries of the operation table.

Merging Skip-Buffers

A method in a reconfigurable computing system includes connecting a plurality of tensor consumers to their corresponding tensor producers via skip-buffers, thereby generating a plurality of skip-buffers. The method includes determining that at least one skip-buffer of the plurality of skip-buffers corresponding to a first set of tensor consumers and at least one skip-buffer of the plurality of skip-buffers corresponding to a second set of tensor consumers are compatible to wholly or partially merge. The method also includes merging, wholly or partially, the compatible skip-buffers to produce a merged skip-buffer having a minimal buffer depth. The described method may reduce memory unit consumption and latency.
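The merge step can be sketched as follows. Here the compatibility criterion (buffers sharing a producer) and the depth rule (the minimal depth that still satisfies every consumer is the maximum of the member depths) are simplifying assumptions, not the patent's actual tests.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class SkipBuffer:
    producer: str
    consumer: str
    depth: int          # stages of delay this consumer needs

def merge_skip_buffers(buffers):
    """Merge buffers that share a producer; the merged buffer keeps the
    minimal depth that still satisfies every consumer (the group maximum)."""
    by_producer = defaultdict(list)
    for b in buffers:
        by_producer[b.producer].append(b)
    merged = []
    for producer, group in by_producer.items():
        depth = max(b.depth for b in group)
        consumers = tuple(sorted(b.consumer for b in group))
        merged.append((producer, consumers, depth))
    return merged

buffers = [
    SkipBuffer("conv1", "add", 3),
    SkipBuffer("conv1", "concat", 5),
    SkipBuffer("conv2", "add", 2),
]
plan = merge_skip_buffers(buffers)   # conv1's two buffers merge at depth 5
```

In this toy example the total buffered depth drops from 10 to 7, which is the memory-consumption saving the abstract alludes to.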

Compiling method, compiling device, execution method, computer-readable storage medium and computer device

A Flutter-based compiling method, a compiling device, an executing method, a computer-readable storage medium, and a computer device are provided. The Flutter-based compiling method includes: receiving configuration content; and, in response to the configuration content, compiling and generating an executable file, where the executable file includes at least two of a Native component, a Flutter Native component, and a Flutter dynamic component, and is configured to generate a routing table at runtime to enable the Native component, the Flutter Native component, and the Flutter dynamic component to communicate with each other through the routing table.
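The runtime routing table can be pictured as a name-to-handler map that components populate and then use to reach each other. The route names and handlers below are invented for illustration; a real Flutter router would dispatch to platform channels rather than Python callables.

```python
class Router:
    """Toy routing table mapping route names to component handlers."""
    def __init__(self):
        self.table = {}

    def register(self, route, handler):
        self.table[route] = handler

    def navigate(self, route, payload):
        return self.table[route](payload)

router = Router()
# Each component type registers its pages under its own prefix (invented names).
router.register("native/profile", lambda p: f"native:{p}")
router.register("flutter/cart", lambda p: f"flutter:{p}")

assert router.navigate("flutter/cart", "item42") == "flutter:item42"
```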

Allocating variables to computer memory
11762641 · 2023-09-19

A method of allocating a plurality of variables to computer memory includes determining at compile time when each of the plurality of variables is live in a memory region, and allocating a memory region to each variable, wherein at least some variables are allocated at compile time to overlapping memory regions, to be stored in those memory regions at runtime at non-overlapping times.
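The idea of sharing a region between variables whose live ranges never overlap can be sketched with a greedy scan over compile-time live ranges. The interval representation and reuse policy below are illustrative assumptions, in the spirit of linear-scan register allocation, not the patented method.

```python
def allocate(live_ranges):
    """live_ranges: {var: (first_use, last_use)} -> {var: region index}.
    Variables whose live ranges do not overlap may share the same region."""
    region_ends = []        # last point up to which each region is occupied
    alloc = {}
    for var, (start, end) in sorted(live_ranges.items(), key=lambda kv: kv[1]):
        for i, last_end in enumerate(region_ends):
            if start > last_end:          # no overlap: reuse this region
                region_ends[i] = end
                alloc[var] = i
                break
        else:
            region_ends.append(end)       # all regions busy: open a new one
            alloc[var] = len(region_ends) - 1
    return alloc

ranges = {"a": (0, 3), "b": (4, 7), "c": (2, 6)}
mapping = allocate(ranges)   # "a" and "b" share a region; "c" overlaps both
```

Two regions suffice for three variables here, which is exactly the overlap-in-space, non-overlap-in-time property the abstract describes.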

MEMORY-BASED DISTRIBUTED PROCESSOR ARCHITECTURE
20210365334 · 2021-11-25

Distributed processors and methods for compiling code for execution by distributed processors are disclosed. In one implementation, a distributed processor may include a substrate; a memory array disposed on the substrate; and a processing array disposed on the substrate. The memory array may include a plurality of discrete memory banks, and the processing array may include a plurality of processor subunits, each one of the processor subunits being associated with a corresponding, dedicated one of the plurality of discrete memory banks. The distributed processor may further include a first plurality of buses, each connecting one of the plurality of processor subunits to its corresponding, dedicated memory bank, and a second plurality of buses, each connecting one of the plurality of processor subunits to another of the plurality of processor subunits.

Bandwidth-Aware Computational Graph Mapping
20230297349 · 2023-09-21

A computer-implemented method of transforming a high-level program for mapping onto a coarse-grained reconfigurable (CGR) processor with an array of CGR units, including: sectioning a dataflow graph into a plurality of sections; extracting performance information for each of the plurality of sections; on a CGR unit, assigning to a section at least two computations dependent on a first data element; scheduling an additional load of the first data element in response to available memory bandwidth for that section; eliminating a buffer between the additional load of the first data element and one of the two computations for that section; generating configuration data for the CGR units and communication channels, wherein the configuration data, when loaded onto an instance of the array of CGR units, causes the array of CGR units to implement the dataflow graph; and storing the configuration data in a non-transitory computer-readable storage medium.
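The per-section trade-off (spend spare bandwidth on a redundant load to eliminate a buffer) can be sketched as below; the bandwidth units, cost model, and baseline topology are invented for illustration.

```python
def plan_section(consumers, used_bw, bw_limit, load_cost=1):
    """If spare memory bandwidth allows, schedule a second load of the shared
    data element so one consumer reads it directly, eliminating one buffer."""
    loads, buffers = 1, len(consumers)   # baseline: one load feeds all via buffers
    if used_bw + load_cost <= bw_limit:
        loads += 1                        # additional load of the data element
        buffers -= 1                      # buffer to one computation eliminated
    return loads, buffers

# Spare bandwidth: trade a load for a buffer.
assert plan_section(["matmul", "softmax"], used_bw=3, bw_limit=5) == (2, 1)
# Bandwidth exhausted: keep the single load and both buffers.
assert plan_section(["matmul", "softmax"], used_bw=5, bw_limit=5) == (1, 2)
```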

HETEROGENEITY-AGNOSTIC AND TOPOLOGY-AGNOSTIC DATA PLANE PROGRAMMING
20210365253 · 2021-11-25

The present disclosure provides a compiler operative to convert computer-executable instructions for a network data plane written in a heterogeneity-agnostic and topology-agnostic programming language into an intermediate representation, then compile the intermediate representation into multiple executable representations according to topological constraints of the network. Users may develop software-defined network functionality for a data center network composed of heterogeneous network devices by writing code in a programming language implementing heterogeneity-agnostic and topology-agnostic abstractions, while the compiler synthesizes heterogeneity-dependent and topology-dependent computer-executable object code implementing the software-defined network functionality across network devices of the data center network by analyzing logical dependencies and network topology to determine dependency constraints and resource constraints.
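The two-stage flow (device-agnostic source lowered to one intermediate representation, then per-device back ends emitting topology-specific executables) can be sketched as follows; the IR shape, device names, and target prefixes are invented stand-ins, not the disclosed compiler.

```python
def to_ir(program):
    """Front end: lower heterogeneity-agnostic statements to IR tuples."""
    return [tuple(stmt.split()) for stmt in program]

# Per-device back ends emitting device-specific "object code" (invented targets).
BACKENDS = {
    "tofino_switch": lambda op: "p4:" + "_".join(op),
    "smartnic":      lambda op: "ebpf:" + "_".join(op),
}

def compile_for_topology(program, devices):
    """One IR, multiple executable representations per the network's devices."""
    ir = to_ir(program)
    return {dev: [BACKENDS[dev](op) for op in ir] for dev in devices}

images = compile_for_topology(["count flows", "drop spoofed"],
                              ["tofino_switch", "smartnic"])
```

A real implementation would also consult network topology and dependency analysis to decide which device runs which IR fragment; here every device receives the whole program purely to show the one-IR, many-back-ends structure.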