G06F8/451

Compiler-initiated tile replacement to enable hardware acceleration resources

A processing system includes a compiler that automatically identifies sequences of instructions of tileable source code that can be replaced with tensor operations. The compiler generates enhanced code that replaces the identified sequences of instructions with tensor operations that invoke a special-purpose hardware accelerator. By automatically replacing instructions with tensor operations that invoke the special-purpose hardware accelerator, the compiler makes the performance improvements achievable through the special-purpose hardware accelerator available to programmers using high-level programming languages.
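As a rough illustration of the kind of rewrite this abstract describes, the sketch below contrasts a plain triple loop with a tiled form in which each tile is handed to a single call. The function names (`matmul_loops`, `tile_mma`, `matmul_tiled`) and the tile size are assumptions made for this example, not taken from the patent. `tile_mma` is a software stand-in for a tensor operation that a real compiler would lower to the special-purpose hardware accelerator.

```python
def matmul_loops(a, b):
    """Tileable source code: the plain triple loop a compiler might recognize."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for p in range(k):
                c[i][j] += a[i][p] * b[p][j]
    return c


def tile_mma(c, a, b, i0, j0, p0, t):
    """Hypothetical stand-in for one tensor operation on a t-by-t tile:
    C[i0:i0+t, j0:j0+t] += A[i0:i0+t, p0:p0+t] @ B[p0:p0+t, j0:j0+t].
    Here it is emulated in software; edge tiles are clipped to the matrix bounds."""
    for i in range(i0, min(i0 + t, len(a))):
        for j in range(j0, min(j0 + t, len(b[0]))):
            for p in range(p0, min(p0 + t, len(b))):
                c[i][j] += a[i][p] * b[p][j]


def matmul_tiled(a, b, t=2):
    """Enhanced code: the same computation expressed as a sequence of tile operations."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    for i0 in range(0, n, t):
        for j0 in range(0, m, t):
            for p0 in range(0, k, t):
                tile_mma(c, a, b, i0, j0, p0, t)
    return c


a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
b = [[1, 0], [0, 1], [2, 3]]
print(matmul_tiled(a, b) == matmul_loops(a, b))
```

Both versions compute the same product; the tiled form only changes the granularity at which work is issued, which is what makes substitution by a hardware tensor operation possible.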

METHOD AND SYSTEM FOR PROTOCOL PROCESSING
20230273778 · 2023-08-31

Disclosed herein is a method of protocol processing in which a main program code has one or more code segments and instructions for processing different protocol elements of a data packet stream of a transport protocol. The method includes assigning a latency requirement and/or bandwidth requirement to one or more of the code segments of the main program code; and compiling each of the code segments, according to its assigned latency and/or bandwidth requirement, into a respective target code so that the target codes can be executed by different processors.

AUTOMATED DESIGN OF FIELD PROGRAMMABLE GATE ARRAY OR OTHER LOGIC DEVICE BASED ON ARTIFICIAL INTELLIGENCE AND VECTORIZATION OF BEHAVIORAL SOURCE CODE
20220164510 · 2022-05-26

A method includes obtaining behavioral source code defining logic to be performed using at least one logic device, hardware information associated with the at least one logic device, and constraints identifying user requirements associated with the at least one logic device. The method also includes generating a design for the at least one logic device using the behavioral source code, the hardware information, and the constraints. The design enables the at least one logic device to execute the logic while satisfying the user requirements. The design is generated using a machine learning/artificial intelligence (ML/AI) algorithm that iteratively modifies potential designs to meet the user requirements.

Assignment of tasks between a plurality of devices

A method includes defining a functionality of a system, wherein the system includes at least a first device in a first device category and a second device in a second device category; defining a configuration of the system; determining a plurality of tasks comprised in an application, wherein the application is to be executed by the system; simulating, using a simulation environment, execution of the application in the system when the tasks comprised in the application are assigned to be executed by the first device or the second device; based on the simulation, modifying the assignment of the tasks; distributing the application to the system; and receiving data regarding the execution of the application.

SYSTEM FOR SIMPLIFYING EXECUTABLE INSTRUCTIONS FOR OPTIMISED VERIFIABLE COMPUTATION

The invention relates to distributed ledger technologies such as consensus-based blockchains. Computer-implemented methods for reducing arithmetic circuits derived from smart contracts are described. The invention is implemented using a blockchain network, which may be, for example, a Bitcoin blockchain. A set of conditions encoded in a first programming language is obtained. The set of conditions is converted into a programmatic set of conditions encoded in a second programming language. The programmatic set of conditions is precompiled into precompiled program code. The precompiled program code is transformed into an arithmetic circuit. The arithmetic circuit is reduced to form a reduced arithmetic circuit, and the reduced arithmetic circuit is stored.

METHOD FOR REALIZING NGRAPH FRAMEWORK SUPPORTING FPGA REAR-END DEVICE
20230267024 · 2023-08-24

Disclosed are a method for realizing an nGraph framework supporting an FPGA backend device, and a related apparatus. The method includes: integrating an OpenCL standard API library into an nGraph framework; creating, in the nGraph framework, an FPGA backend device creation module for registering an FPGA backend device, initializing an OpenCL environment and acquiring the FPGA backend device; creating, in the nGraph framework, an FPGA buffer space processing module for opening up an FPGA buffer space and for reading and writing an FPGA cache; creating, in the nGraph framework, an OP kernel implementation module for creating an OP kernel and compiling the OP kernel; and creating, in the nGraph framework, an FPGA compiling execution module for registering, scheduling and executing the OP kernel.

Decentralized data processing architecture

A system and method for decentralized data processing includes receiving, by a first data processing unit of a data processing unit array, a user request and sending, by the first data processing unit, the user request to at least one of the other data processing units of the data processing unit array. Each of the first data processing unit and the other data processing units includes a dedicated non-volatile memory. The system and method also include receiving, by the first data processing unit, a code of execution results from each of the other data processing units that execute the user request, combining, by the first data processing unit, the code of execution results from each of the other data processing units that execute the user request, and responding, by the first data processing unit, to the user request by transmitting the combined code of execution results.

DISTRIBUTABLE RUNTIME SNAPSHOTS
20230259341 · 2023-08-17

A cloud service computing system can provide a runtime snapshot for an application in response to a request from a client computing system for the application. The cloud service computing system executes the application to an execution state and creates a snapshot that includes information indicating the application objects created by the application during execution of the application and the state of the application objects at the execution state. The snapshot further includes bytecode for the application and may also include configuration settings for the runtime under which the application was executed by the cloud service computing system to generate the snapshot. The client computing system can place the application in a ready-to-service state by initializing a managed heap with the bytecode and the heap objects based on information contained in the snapshot and placing the heap objects into a state indicated by information contained in the snapshot.

SOFTWARE UPDATE MANAGEMENT DEVICE AND SOFTWARE UPDATE MANAGEMENT METHOD

The aim is to ensure efficiency of software update operations while maintaining reliability of an entire network during the software update operations. A software update management apparatus 1 includes a storage unit 12 adapted to divide a network into one or more blocks and store block management information 100 indicating whether each of the network devices belonging to each of the resulting blocks is an active device or a standby device; an update instruction receiving unit 101 adapted to receive software update instructions; a software update information generating unit 102 adapted to generate software update information 200; and a software updating unit 103 adapted to, when the network devices are determined to be active devices according to the software update information 200, perform software update processes after transferring traffic to standby devices in the same blocks as the respective active devices, and thereby perform the software update processes for active devices or standby devices in different blocks in parallel.
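The block-wise update flow in this abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the patented apparatus: the data layout (a dictionary of blocks mapping device names to `"active"` or `"standby"`), the step names, and the functions `plan_block` and `plan_network` are all hypothetical. The sketch only builds a per-block list of steps and does not model the role change that follows a traffic transfer.

```python
from concurrent.futures import ThreadPoolExecutor


def plan_block(block_name, devices, targets):
    """Build the ordered update steps for one block.

    devices: {device_name: "active" | "standby"} (hypothetical layout).
    targets: device names that should receive the software update.
    """
    standby = [name for name, role in devices.items() if role == "standby"]
    steps = []
    for name in targets:
        if name not in devices:
            continue  # device belongs to another block
        if devices[name] == "active":
            if not standby:
                raise ValueError(f"block {block_name} has no standby device")
            # Move traffic to a standby device in the same block first.
            steps.append(("transfer_traffic", name, standby[0]))
        steps.append(("update", name))
    return block_name, steps


def plan_network(blocks, targets):
    """Plan every block concurrently; blocks are independent of each other."""
    with ThreadPoolExecutor() as pool:
        results = pool.map(
            lambda item: plan_block(item[0], item[1], targets), blocks.items()
        )
        return dict(results)


blocks = {
    "b1": {"r1": "active", "r2": "standby"},
    "b2": {"r3": "active", "r4": "standby"},
}
print(plan_network(blocks, ["r1", "r4"]))
```

Keeping each block's steps in a separate list reflects the idea that traffic only ever moves within a block, so work on different blocks can proceed in parallel without affecting each other.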

Processor for performing dynamic programming according to an instruction, and a method for configuring a processor for dynamic programming via an instruction
11726757 · 2023-08-15

The disclosure provides processors that are configured to perform dynamic programming according to an instruction, a method for configuring a processor for dynamic programming according to an instruction, and a method of computing a modified Smith-Waterman algorithm employing an instruction for configuring a parallel processing unit. In one example, the method for configuring includes: (1) receiving, by execution cores of the processor, an instruction that directs the execution cores to compute a set of recurrence equations employing a matrix, (2) configuring the execution cores, according to the set of recurrence equations, to compute states for elements of the matrix, and (3) storing the computed states for current elements of the matrix in registers of the execution cores, wherein the computed states are determined based on the set of recurrence equations and input data.
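For readers unfamiliar with the kind of recurrence equations involved, the sketch below computes the textbook Smith-Waterman local alignment score in software. It is not the modified algorithm or the hardware instruction from the disclosure. The scoring values (`match`, `mismatch`, `gap`) and the linear gap penalty are assumptions chosen for illustration.

```python
def smith_waterman(s, t, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between strings s and t.

    Each matrix element h[i][j] is computed from its upper, left, and
    upper-left neighbors, which is the recurrence pattern that dynamic
    programming hardware would evaluate.
    """
    rows, cols = len(s) + 1, len(t) + 1
    h = [[0] * cols for _ in range(rows)]  # first row and column stay 0
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            substitution = match if s[i - 1] == t[j - 1] else mismatch
            h[i][j] = max(
                0,                               # start a new local alignment
                h[i - 1][j - 1] + substitution,  # align s[i-1] with t[j-1]
                h[i - 1][j] + gap,               # gap in t
                h[i][j - 1] + gap,               # gap in s
            )
            best = max(best, h[i][j])
    return best


print(smith_waterman("GGACGTCC", "TTACGTAA"))
```

Elements on the same anti-diagonal of the matrix depend only on earlier anti-diagonals, which is why this computation maps well onto parallel execution cores.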