Patent classifications
G06F9/3005
Dynamic allocation of executable code for multiarchitecture heterogeneous computing
An apparatus for executing a software program, comprising processing units and a hardware processor adapted for: in an intermediate representation of the software program, where the intermediate representation comprises blocks, each associated with an execution block of the software program and comprising intermediate instructions, identifying a calling block and a target block, where the calling block comprises a control-flow intermediate instruction to execute a target intermediate instruction of the target block; generating target instructions using the target block; generating calling instructions using the calling block and a computer control instruction for invoking the target instructions, when the calling instructions are executed by a calling processing unit and the target instructions are executed by a target processing unit; configuring the calling processing unit for executing the calling instructions; and configuring the target processing unit for executing the target instructions.
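The steps in this abstract can be loosely sketched in Python. This is a minimal illustration, not the patented method: the `Block` type, the tuple-based intermediate instructions, the `invoke_remote` pseudo-instruction, and the `cpu`/`gpu` unit names are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Block:
    name: str
    instructions: list  # intermediate instructions as (opcode, operand) tuples

def find_call_pairs(blocks):
    """Return (calling_block, target_block) pairs linked by a 'call' instruction."""
    by_name = {b.name: b for b in blocks}
    pairs = []
    for block in blocks:
        for opcode, operand in block.instructions:
            if opcode == "call" and operand in by_name:
                pairs.append((block, by_name[operand]))
    return pairs

def lower(block, arch, remote_targets=()):
    """Emit pseudo-instructions for one architecture (assumed text format).

    Calls to blocks placed on another unit become a hypothetical
    'invoke_remote' control instruction instead of a plain call.
    """
    out = []
    for opcode, operand in block.instructions:
        if opcode == "call" and operand in remote_targets:
            out.append(f"{arch}:invoke_remote {operand}")
        else:
            out.append(f"{arch}:{opcode} {operand}".strip())
    return out

blocks = [
    Block("main", [("add", "r1"), ("call", "kernel"), ("ret", "")]),
    Block("kernel", [("mul", "r2"), ("ret", "")]),
]
calling_block, target_block = find_call_pairs(blocks)[0]
target_code = lower(target_block, "gpu")
calling_code = lower(calling_block, "cpu", remote_targets={target_block.name})

# "Configuring" each processing unit is modeled as handing it its code.
units = {"cpu": calling_code, "gpu": target_code}
```

The point of the sketch is only that the same intermediate representation is lowered twice, once per processing unit, and that the cross-unit call is rewritten in the calling code.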
Multi-level workflow scheduling using metaheuristic and heuristic algorithms
Techniques described herein relate to a method for deploying workflows. The method may include receiving, by a global orchestrator of a device ecosystem, a request to execute a workflow; decomposing, by the global orchestrator, the workflow into a plurality of workflow portions; executing, by the global orchestrator, a metaheuristic algorithm to generate a result comprising a plurality of domains of the device ecosystem in which to execute the plurality of workflow portions; and providing, by the global orchestrator, the plurality of workflow portions to respective local orchestrators of the plurality of domains based on the result of executing the metaheuristic algorithm.
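As a rough illustration of the flow in this abstract, the Python sketch below decomposes a workflow into portions and picks a domain for each. The abstract does not name a specific metaheuristic, so a simple randomized hill climb stands in for it here; the cost table, domain names, and function names are hypothetical.

```python
import random

# Hypothetical cost of running each workflow portion in each domain.
COST = {
    "ingest":    {"edge": 1.0, "core": 4.0, "cloud": 6.0},
    "transform": {"edge": 5.0, "core": 2.0, "cloud": 3.0},
    "train":     {"edge": 9.0, "core": 5.0, "cloud": 1.5},
}

def decompose(workflow):
    """Split a workflow into portions (here, one portion per step)."""
    return list(workflow)

def total_cost(assignment):
    """Sum the per-portion costs of a portion-to-domain assignment."""
    return sum(COST[portion][domain] for portion, domain in assignment.items())

def metaheuristic_place(portions, domains, iterations=500, seed=0):
    """Randomized hill climb: an assumed stand-in for the metaheuristic."""
    rng = random.Random(seed)
    current = {p: rng.choice(domains) for p in portions}
    for _ in range(iterations):
        candidate = dict(current)
        candidate[rng.choice(portions)] = rng.choice(domains)
        if total_cost(candidate) <= total_cost(current):
            current = candidate
    return current

def dispatch(assignment):
    """Group portions by domain, as they would be sent to local orchestrators."""
    per_domain = {}
    for portion, domain in assignment.items():
        per_domain.setdefault(domain, []).append(portion)
    return per_domain

portions = decompose(["ingest", "transform", "train"])
placement = metaheuristic_place(portions, ["edge", "core", "cloud"])
plan = dispatch(placement)
```

In this toy setup the cost is independent per portion, so the hill climb settles on the cheapest domain for each one. A real device ecosystem would have coupled constraints, which is where a metaheuristic earns its keep.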
Automated management of data transformation flows based on semantics
Various embodiments are provided for intelligent management of data flows in a computing environment by a processor. One or more templates for data transformations in time-series data applications may be created and managed according to concepts, one or more instances of the concepts, relationships between the concepts, and a mapping of the concepts to one or more data sources.
Function evaluation using multiple values loaded into registers by a single instruction
A technique for efficient calling of functions on a processor generates an executable program having a function call by analysing an interface for the function that defines an argument expression and an internal value used solely within the function, and an argument declaration defining an argument value to be provided to the function when the program is run. A data structure is generated including the internal value and a resolved argument value derived from the argument expression and the argument value. A single instruction is encoded in the program to utilise the data structure. When the program is executed on a processor, the single instruction causes the processor to load the argument value and internal value from the data structure into registers in the processor, prior to evaluating the function. The function can then be executed without further register loads being performed.
Method and system for instruction block to execution unit grouping
A method for emulating a guest centralized flag architecture by using a native distributed flag architecture. The method includes receiving an incoming instruction sequence using a global front end; grouping the instructions to form instruction blocks, wherein each of the instruction blocks comprises two half blocks; scheduling the instructions of the instruction block to execute in accordance with a scheduler; and using a distributed flag architecture to emulate a centralized flag architecture for the emulation of guest instruction execution.
Overlay layer for network of processor cores
Methods and systems related to the efficient execution of complex computations by a multicore processor and the movement of data among the various processing cores in the multicore processor are disclosed. A multicore processor stack for the multicore processor can include a computation layer, for conducting computations using the processing cores in the multicore processor, with executable instructions for processing pipelines in the processing cores. The multicore processor stack can also include a network-on-chip layer, for connecting the processing cores in the multicore processor, with executable instructions for routers and network interface units in the multicore processor. The computation layer and the network-on-chip layer can be logically isolated by a network-on-chip overlay layer.
Framework integration for instance-attachable accelerator
- Sudipta Sengupta
- Poorna Chand Srinivas Perumalla
- Jalaja Kurubarahalli
- Samuel Oshin
- Cory Pruce
- Jun Wu
- Eftiquar Shaikh
- Pragya Agarwal
- David Thomas
- Karan Kothari
- Daniel Evans
- Umang Wadhwa
- Mark Klunder
- Rahul Sharma
- Zdravko Pantic
- Dominic Rajeev Divakaruni
- Andrea Olgiati
- Leo Dirac
- Nafea Bshara
- Bratin Saha
- Matthew Wood
- Swaminathan Sivasubramanian
- Rajankumar Singh
Techniques for partitioning data flow operations between execution on a compute instance and an attached accelerator instance are described. A set of operations supported by the accelerator is obtained. A set of operations associated with the data flow is obtained. A first operation in the set of operations associated with the data flow is identified based on the set of operations supported by the accelerator. The accelerator executes the first operation.
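The partitioning step can be sketched as a simple membership test against the accelerator's supported set. This is a minimal sketch under assumptions: the operation names and the `partition_operations` helper are hypothetical, and a real framework would partition a graph rather than a flat list.

```python
def partition_operations(dataflow_ops, accelerator_supported):
    """Split data flow operations into accelerator-run and instance-run lists."""
    on_accelerator, on_instance = [], []
    for op in dataflow_ops:
        # Operations the accelerator supports are offloaded; the rest stay local.
        target = on_accelerator if op in accelerator_supported else on_instance
        target.append(op)
    return on_accelerator, on_instance

# Hypothetical operation names for illustration only.
supported = {"conv2d", "matmul", "relu"}
dataflow = ["decode_jpeg", "conv2d", "relu", "matmul", "top_k"]
accel_ops, host_ops = partition_operations(dataflow, supported)
```

Here `accel_ops` holds the operations sent to the attached accelerator and `host_ops` holds those left on the compute instance.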
Apparatus for analyzing non-informative firmware and method using the same
Disclosed herein are an apparatus for analyzing non-informative firmware and a method using the apparatus. The method includes detecting a target instruction for firmware analysis in a memory map in non-informative firmware, generating an analysis list based on memory map information corresponding to the target instruction, and generating a visualized analysis result corresponding to the firmware by grouping the entries of the analysis list by preset reference bytes.
Dynamic action-driven visual task flow
Architectures and techniques are presented that provide a dynamic, action-driven visual task flow customization to a software-as-a-service (SaaS) platform. A task element expressing a task workflow can be modeled as a record of an order data model of the development environment. The task workflow can be executed according to a flow state management procedure, e.g., to manage its lifecycle. In response to the flow state management procedure, the record can be updated according to a current state of the task element. These task elements can be correlated to an action or other operation of a flow designer module of the SaaS platform, which can leverage native automation.
Dynamic multi-mode CNN accelerator and operating methods
A convolutional neural network (CNN) operation accelerator comprising a first sub-accelerator and a second sub-accelerator is provided. The first sub-accelerator comprises I units of CNN processor cores, J units of element-wise & quantize processors, and K units of pool and nonlinear function processors. The second sub-accelerator comprises X units of CNN processor cores, Y units of element-wise & quantize processors, and Z units of pool and nonlinear function processors. The variables I, J, K, X, Y, and Z are all greater than 0, and at least one of the three relations, namely, “I is different from X”, “J is different from Y”, and “K is different from Z”, is satisfied. A to-be-performed CNN operation comprises a first partial CNN operation and a second partial CNN operation. The first sub-accelerator and the second sub-accelerator perform the first partial CNN operation and the second partial CNN operation, respectively.
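The constraints on the unit counts can be written down directly. The Python sketch below is illustrative only: the `SubAccelerator` type, the `is_valid_pair` and `split_operation` helpers, the layer names, and the split point are assumptions, not details from the abstract.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SubAccelerator:
    cnn_cores: int              # I or X
    elementwise_quantize: int   # J or Y
    pool_nonlinear: int         # K or Z

    def __post_init__(self):
        counts = (self.cnn_cores, self.elementwise_quantize, self.pool_nonlinear)
        if min(counts) <= 0:
            raise ValueError("every unit count must be greater than 0")

def is_valid_pair(first, second):
    """True when at least one of I != X, J != Y, K != Z holds."""
    return first != second

def split_operation(layers, split_at):
    """Split a CNN operation into a first and a second partial operation."""
    return layers[:split_at], layers[split_at:]

first = SubAccelerator(cnn_cores=4, elementwise_quantize=2, pool_nonlinear=2)
second = SubAccelerator(cnn_cores=2, elementwise_quantize=2, pool_nonlinear=2)
assert is_valid_pair(first, second)

# Hypothetical layer list and split point.
first_part, second_part = split_operation(["conv1", "conv2", "conv3", "fc"], 3)
```

Because the dataclass compares field by field, `first != second` is exactly the "at least one relation is satisfied" condition.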