Patent classification: G06F9/30123
System-on-a-Chip (SoC) Architecture for Low Power State Communication
An apparatus to facilitate a system-on-a-chip (SoC) architecture for low power state communication is disclosed. The apparatus includes a low power state fabric to provide a low power state path that avoids compute processing resources of the apparatus, and low power state agent circuitry communicably coupled to the low power state fabric. In response to initiation of a low power state in the apparatus, the low power state agent circuitry updates a configuration of routers of the low power state fabric to utilize the low power state path, and routes memory transactions to the low power state path while the apparatus is in the low power state.
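As a rough illustration of the mechanism described above, the following toy model shows an agent reprogramming a routing table on low power entry so that memory transactions take a path that avoids the compute fabric. All class, route, and field names here are illustrative assumptions, not the patent's design:

```python
class LowPowerFabric:
    """Toy model of the low power state agent: on entry to a low power
    state it updates the router configuration so memory transactions
    use the low power path instead of the compute-fabric path."""

    NORMAL_PATH = ["cpu_fabric", "memory"]      # path through compute resources
    LOW_POWER_PATH = ["lp_fabric", "memory"]    # path that avoids them

    def __init__(self):
        self.low_power = False
        self.routes = {"mem_txn": self.NORMAL_PATH}

    def enter_low_power(self):
        # The agent circuitry reconfigures the routers so the compute
        # complex can remain power-gated while memory traffic flows.
        self.low_power = True
        self.routes["mem_txn"] = self.LOW_POWER_PATH

    def exit_low_power(self):
        self.low_power = False
        self.routes["mem_txn"] = self.NORMAL_PATH

    def route(self, txn_type):
        """Return the hops a transaction of this type traverses."""
        return self.routes[txn_type]
```

In this sketch the "configuration of routers" is collapsed into a single route table; a real SoC would program per-router registers along the path.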
STOCHASTIC COMPILATION OF MULTIPLEXED QUANTUM ROTATIONS
A quantum-computation method comprises (a) sampling a random bit string from a predetermined distribution of bit strings, where each bit of the random bit string enables or disables a corresponding fixed-angle rotation of a state vector and where the product of the enabled fixed-angle rotations approximates an arbitrary rotation of the state vector through an angle of a multiplexed-rotation gate; and (b) enacting on the state vector each of the fixed-angle rotations enabled by a corresponding bit of the bit string.
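Because rotations about a common axis commute and their angles add, the sampling step above can be sketched as stochastic rounding over a ladder of fixed angles. The dyadic ladder and the trailing stochastic-rounding bit below are illustrative assumptions; the abstract only requires a predetermined distribution whose enabled rotations approximate the target angle:

```python
import math
import random

def sample_bit_string(target, fixed_angles, rng=random):
    """Sample a bit string whose enabled fixed-angle rotations sum to
    approximately `target`. `fixed_angles` is assumed descending and
    dyadic, e.g. [pi/2, pi/4, ..., pi/2**k], with 0 <= target < pi.
    A final extra bit stochastically rounds the residual so that the
    *expected* enacted angle equals the target exactly."""
    bits, remaining = [], target
    for theta in fixed_angles:
        if remaining >= theta:
            bits.append(1)
            remaining -= theta
        else:
            bits.append(0)
    # Residual is now in [0, theta_min); enable one extra smallest-angle
    # rotation with probability residual / theta_min (unbiased rounding).
    theta_min = fixed_angles[-1]
    bits.append(1 if rng.random() < remaining / theta_min else 0)
    return bits

def enact(bits, fixed_angles):
    """Total angle produced by the enabled fixed rotations (same-axis
    rotations multiply into a single rotation by the sum of angles)."""
    angles = fixed_angles + [fixed_angles[-1]]  # extra rounding slot
    return sum(theta for b, theta in zip(bits, angles) if b)
```

Each sample errs by less than the smallest fixed angle, and averaging many samples recovers the target angle, which is the sense in which the stochastic compilation "approximates an arbitrary rotation".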
INDUSTRIAL INTERNET OF THINGS AIOPS WORKFLOWS
Data flows and data processing modules are provided to implement contextualized data collection, scalable analytics, and machine learning operations quality assurance. These modules may be deployed as standalone components and/or bundled as a group of coordinated microservices. This streamlines the conversion of raw domain expertise and knowledge into data processors, analytics, and visualization modules for industry-oriented Industrial Internet of Things (IIoT) solutions.
Handling exceptions in a multi-tile processing arrangement
A multitile processing system has an execution unit on each tile, and an interconnect which conducts communications between the tiles according to a bulk synchronous parallel scheme. Each tile performs an on-tile compute phase followed by an intertile exchange phase, where the exchange phase is held back until all tiles in a particular group have completed the compute phase. On completion of the compute phase, each tile generates a synchronisation request and pauses instruction issue until it receives a synchronisation acknowledgement. If a tile attains an excepted state, it raises an exception signal and pauses instruction issue until the excepted state has been resolved. However, tiles which are not in the excepted state can continue to perform their on-tile compute phase, and will issue their own synchronisation request in their own normal time frame. Synchronisation acknowledgements will not be received by the tiles in the group until the excepted state has been resolved on the tile with the excepted state.
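The barrier behaviour described above can be sketched as a single BSP superstep: every tile must raise a synchronisation request before the acknowledgement (and hence the exchange phase) is released, so an excepted tile holds back the whole group. The tile representation below is an illustrative assumption:

```python
def bsp_sync(tiles):
    """One BSP synchronisation point over a group of tiles.

    Each non-excepted tile finishes its compute phase and raises a
    sync request; the acknowledgement is released only when every
    tile in the group has requested.  Returns (requesting tile ids,
    whether the acknowledgement was released)."""
    requested = set()
    for tile in tiles:
        if tile["excepted"]:
            continue                  # paused until the exception is resolved
        requested.add(tile["id"])     # normal tiles request in their own time
    acked = len(requested) == len(tiles)
    return requested, acked
```

Running this once with an excepted tile shows the group stalled at the barrier; resolving the exception (e.g. via a debugger) and re-running releases the acknowledgement.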
SYSTEM AND METHOD FOR USING VIRTUAL VECTOR REGISTER FILES
Described is a system and method for using virtual vector register files. In particular, a graphics processor includes a logic unit, a virtual vector register file coupled to the logic unit, a vector register backing store coupled to the virtual vector register file, and a virtual vector register file controller coupled to the virtual vector register file. The virtual vector register file includes an N-deep vector register file and an M-deep vector register file, where N is less than M. The virtual vector register file controller performs eviction and allocation among the N-deep vector register file, the M-deep vector register file, and the vector register backing store, dependent on at least access requests for certain vector registers.
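The eviction/allocation traffic between the two register files and the backing store behaves much like a two-level cache. The following toy model is a minimal sketch under that reading; the LRU policy and all names are illustrative assumptions, not the patent's controller design:

```python
from collections import OrderedDict

class VirtualVectorRegFile:
    """Two-level virtual vector register file: a small fast file
    (N entries), a larger file (M entries, N < M), and an unbounded
    backing store.  On access a register is promoted to the fast
    file; least-recently-used entries are demoted one level."""

    def __init__(self, n, m):
        assert n < m
        self.n, self.m = n, m
        self.fast = OrderedDict()   # N-deep vector register file
        self.slow = OrderedDict()   # M-deep vector register file
        self.backing = {}           # vector register backing store

    def access(self, reg):
        """Return (value, level the register was found at)."""
        if reg in self.fast:
            self.fast.move_to_end(reg)        # refresh LRU order
            return self.fast[reg], "fast"
        value = self.slow.pop(reg, None)
        level = "slow"
        if value is None:
            value = self.backing.pop(reg, 0)  # untouched registers read as 0
            level = "backing"
        self._demote_if_full()
        self.fast[reg] = value
        return value, level

    def _demote_if_full(self):
        if len(self.fast) >= self.n:          # evict LRU fast entry
            victim, val = self.fast.popitem(last=False)
            self.slow[victim] = val
        if len(self.slow) > self.m:           # spill LRU slow entry
            victim, val = self.slow.popitem(last=False)
            self.backing[victim] = val
```

With N=2 and M=4, a register accessed after two intervening accesses is found in the slow file, and a re-access immediately after hits the fast file.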
Elevated Isolation of Reconfigurable Data Flow Resources in Cloud Computing
A data processing system includes a runtime processor and a pool of reconfigurable data flow resources with memory units, busses, and arrays of physical configurable units. The runtime processor is operatively coupled to the pool of reconfigurable data flow resources and configured to load first and second configuration files for executing first and second user applications on first and second subsets of the arrays of physical configurable units and to assign first and second subsets of the memory units to the first and second user applications. The runtime processor starts execution of the first and second user applications on the first and second subsets of the arrays of physical configurable units, prevents the first user application from accessing the resources allocated to the second user application, and prevents the second user application from accessing resources allocated to the first user application.
Virtual network pre-arbitration for deadlock avoidance and enhanced performance
A device includes a data path, a first interface configured to receive a first memory access request from a first peripheral device, and a second interface configured to receive a second memory access request from a second peripheral device. The device further includes an arbiter circuit configured to select, in a first clock cycle, a pre-arbitration winner between the first memory access request and the second memory access request based on a first number of credits allocated to a first destination device and a second number of credits allocated to a second destination device. The arbiter circuit is further configured to, in a second clock cycle, select a final arbitration winner from among the pre-arbitration winner and a subsequent memory access request based on a comparison of a priority of the pre-arbitration winner and a priority of the subsequent memory access request.
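The two-cycle arbitration above can be sketched as two small functions: a credit comparison in cycle one and a priority comparison in cycle two. The request/credit representations and the tie-breaking choices are illustrative assumptions:

```python
def pre_arbitrate(req_a, req_b, credits):
    """Cycle 1: pick a pre-arbitration winner by comparing the credits
    available at each request's destination device (more credits wins;
    ties go to the first request)."""
    if credits[req_a["dest"]] >= credits[req_b["dest"]]:
        return req_a
    return req_b

def final_arbitrate(pre_winner, subsequent):
    """Cycle 2: the pre-arbitration winner still has to beat any
    subsequent request on priority (higher value = higher priority)."""
    if subsequent is not None and subsequent["prio"] > pre_winner["prio"]:
        return subsequent
    return pre_winner
```

Credit-based pre-arbitration keeps a request from winning the data path when its destination cannot accept it (the deadlock-avoidance angle), while the second cycle lets a late, higher-priority request overtake the provisional winner.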
METHOD AND APPARATUS FOR EFFICIENT PROGRAMMABLE INSTRUCTIONS IN COMPUTER SYSTEMS
Systems, apparatuses, and methods for implementing as part of a processor pipeline a reprogrammable execution unit capable of executing specialized instructions are disclosed. A processor includes one or more reprogrammable execution units which can be programmed to execute different types of customized instructions. When the processor loads a program for execution, the processor loads a bitfile associated with the program. The processor programs a reprogrammable execution unit with the bitfile so that the reprogrammable execution unit is capable of executing specialized instructions associated with the program. During execution, a dispatch unit dispatches the specialized instructions to the reprogrammable execution unit for execution. The results of other instructions, such as integer and floating point instructions, are available immediately to instructions executing on the reprogrammable execution unit since the reprogrammable execution unit shares the processor registers with the integer and floating point execution units.
Register file segments for supporting code block execution by using virtual cores instantiated by partitionable engines
A system for executing instructions using a plurality of register file segments for a processor. The system includes a global front end scheduler for receiving an incoming instruction sequence, wherein the global front end scheduler partitions the incoming instruction sequence into a plurality of code blocks of instructions and generates a plurality of inheritance vectors describing interdependencies between instructions of the code blocks. The system further includes a plurality of virtual cores of the processor coupled to receive code blocks allocated by the global front end scheduler, wherein each virtual core comprises a respective subset of resources of a plurality of partitionable engines, wherein the code blocks are executed by using the partitionable engines in accordance with a virtual core mode and in accordance with the respective inheritance vectors. A plurality of register file segments is coupled to the partitionable engines for providing data storage.
Scheduling resuming of ready to run virtual processors in a distributed system
Dynamic scheduling is disclosed. A plurality of physical nodes is included in a computer system. Each node includes a plurality of processors. Each processor includes a plurality of hyperthreads. In response to receiving an indication of an event occurring, a search is performed for a queue in a set of queues on which to place a virtual processor that had been waiting on the event. Queues in the set of queues correspond to hyperthreads in a physical node in the plurality of physical nodes. The queues in the set of queues are visited according to a predetermined traversal order.
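The wake-up path described above can be sketched as a search over per-hyperthread queues in a fixed traversal order. The particular order used below (last hyperthread first, then its siblings, then the node's other processors) and the load threshold are illustrative assumptions; the abstract only requires that the queues be visited in a predetermined order:

```python
from collections import deque

def traversal_order(node, start_cpu, start_ht):
    """Predetermined traversal order: the hyperthread the vCPU last ran
    on, then its sibling hyperthreads, then the hyperthreads of the
    other processors on the node."""
    order = [(start_cpu, start_ht)]
    order += [(start_cpu, ht) for ht in range(node.ht_per_cpu)
              if ht != start_ht]
    for cpu in range(node.n_cpus):
        if cpu != start_cpu:
            order += [(cpu, ht) for ht in range(node.ht_per_cpu)]
    return order

class Node:
    """One physical node: n_cpus processors, ht_per_cpu hyperthreads
    each, and one run queue per hyperthread."""

    def __init__(self, n_cpus, ht_per_cpu, max_queue=2):
        self.n_cpus, self.ht_per_cpu = n_cpus, ht_per_cpu
        self.max_queue = max_queue
        self.queues = {(c, h): deque()
                       for c in range(n_cpus) for h in range(ht_per_cpu)}

    def wake(self, vcpu, last_cpu, last_ht):
        """On an event, place the now-ready virtual processor on the
        first queue in traversal order with room below the threshold."""
        for key in traversal_order(self, last_cpu, last_ht):
            if len(self.queues[key]) < self.max_queue:
                self.queues[key].append(vcpu)
                return key
        # Every queue is at the threshold: fall back to the last queue.
        self.queues[(last_cpu, last_ht)].append(vcpu)
        return (last_cpu, last_ht)
```

Starting the traversal at the hyperthread the virtual processor last ran on is a cache-affinity heuristic; a real scheduler would also weigh queue depth and NUMA distance.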