Patent classifications
G06F15/8061
TECHNIQUES TO CONTROL SYSTEM UPDATES AND CONFIGURATION CHANGES VIA THE CLOUD
Embodiments are generally directed to apparatuses, methods, and techniques to determine an access level of operation based on an indication received via one or more network links from a pod management controller, and to enable or disable a firmware update capability for a firmware device based on the access level of operation, the firmware update capability being the ability to change firmware for the firmware device. Embodiments may also include determining, based on the access level of operation, which of a plurality of configuration settings to enable for configuration, and enabling configuration of those settings.
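The gating described in this abstract can be illustrated with a minimal sketch. The access level names, the setting names, and the rule that only the highest level permits firmware updates are all assumptions made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class AccessLevel(IntEnum):
    # Hypothetical levels; the abstract only says "access level of operation".
    LOCKED = 0
    CONFIG_ONLY = 1
    FULL = 2


# Hypothetical mapping: minimum access level needed to configure each setting.
SETTING_MIN_LEVEL = {
    "boot_order": AccessLevel.CONFIG_ONLY,
    "fan_profile": AccessLevel.CONFIG_ONLY,
    "secure_boot": AccessLevel.FULL,
}


@dataclass
class FirmwareDevice:
    firmware_update_enabled: bool = False
    configurable_settings: set = field(default_factory=set)


def apply_access_level(device: FirmwareDevice, level: AccessLevel) -> FirmwareDevice:
    """Enable or disable capabilities on the device for the given access level."""
    # Assumption: only the highest level may change firmware.
    device.firmware_update_enabled = level >= AccessLevel.FULL
    # Enable exactly the settings whose minimum level is satisfied.
    device.configurable_settings = {
        name for name, min_level in SETTING_MIN_LEVEL.items() if level >= min_level
    }
    return device
```

In this sketch the level would come from the indication sent by the pod management controller; how that indication is transported and authenticated is outside what the example models.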
Parallel merge sorter circuit
A merge sort circuit can include a parallel merge sort core that performs a partial merge on two input tuples, each containing a number P of data elements sorted according to a sort key, to produce a sorted output tuple of P data elements. Input data blocks to be merged can be stored in first and second block buffers. The block buffers can receive data from a vector memory read interface that reads groups of at least P data elements at a time. Loading of data elements into the block buffers can be based on respective fill levels of the block buffers.
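A software sketch may help show how a partial merge on P-element tuples can drive a full merge of two sorted blocks. It is not the circuit from the abstract: `partial_merge` and `merge_blocks` are hypothetical names, `heapq.merge` stands in for the hardware merge core, block lengths are assumed to be multiples of P, and the fill-level-based buffer loading is not modeled.

```python
import heapq
from typing import List, Tuple


def partial_merge(a: List[int], b: List[int]) -> Tuple[List[int], List[int]]:
    """Merge two sorted P-tuples; return (P smallest, P largest), each sorted."""
    assert len(a) == len(b)
    p = len(a)
    merged = list(heapq.merge(a, b))  # stand-in for the parallel merge core
    return merged[:p], merged[p:]


def merge_blocks(x: List[int], y: List[int], p: int) -> List[int]:
    """Merge two sorted blocks by consuming P elements at a time."""
    assert len(x) % p == 0 and len(y) % p == 0
    i = j = 0
    carry = None  # the P largest elements seen so far, not yet emitted
    out: List[int] = []
    while i < len(x) or j < len(y):
        # Read the next tuple from whichever block has the smaller head element.
        take_x = j >= len(y) or (i < len(x) and x[i] <= y[j])
        if take_x:
            nxt, i = x[i:i + p], i + p
        else:
            nxt, j = y[j:j + p], j + p
        if carry is None:
            carry = nxt
        else:
            low, carry = partial_merge(carry, nxt)
            out.extend(low)  # the low half is final and can be emitted
    if carry is not None:
        out.extend(carry)
    return out
```

Choosing the next tuple by comparing head elements is what makes the emitted low half safe: every element still unread is at least as large as anything emitted.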
Methods and apparatus for a vector memory subsystem for use with a programmable mixed-radix DFT/IDFT processor
A vector memory subsystem is described for use with a programmable mixed-radix vector processor (“PVP”) capable of calculating discrete Fourier transform and inverse discrete Fourier transform (“DFT/IDFT”) values. In an exemplary embodiment, an apparatus includes a vector memory bank and a vector memory system (VMS) that generates input memory addresses that are used to store input data into the vector memory bank. The VMS also generates output memory addresses that are used to unload vector data from the vector memory bank. The input memory addresses are used to shuffle the input data in the memory bank based on a radix factorization associated with an N-point DFT, and the output memory addresses are used to unload the vector data from the memory bank to compute radix factors of the radix factorization.
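One common way to shuffle input data according to a radix factorization is a mixed-radix digit-reversal permutation. The sketch below shows that generic technique only; it is an assumption that the address generation in this abstract resembles it, and `digit_reversed_index` and `input_addresses` are hypothetical names.

```python
from math import prod
from typing import List, Sequence


def digit_reversed_index(n: int, radices: Sequence[int]) -> int:
    """Map input index n to its mixed-radix digit-reversed address.

    radices lists the factors of N, least-significant digit first.
    """
    rev = 0
    for r in radices:
        n, digit = divmod(n, r)  # peel off the next least-significant digit
        rev = rev * r + digit    # earlier digits end up most significant
    return rev


def input_addresses(radices: Sequence[int]) -> List[int]:
    """Address at which each of the N = prod(radices) input samples is stored."""
    n_points = prod(radices)
    return [digit_reversed_index(n, radices) for n in range(n_points)]
```

For example, `input_addresses([2, 3])` gives the store addresses for a 6-point transform factored as 2 x 3, and `input_addresses([2, 2, 2])` reduces to the familiar radix-2 bit reversal for 8 points.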
Tensor partitioning and partition access order
A method of processing partitions of a tensor in a target order includes receiving, by a reorder unit and from two or more producer units, a plurality of partitions of a tensor in a first order that is different from the target order, storing the plurality of partitions in the reorder unit, and providing, from the reorder unit, the plurality of partitions in the target order to one or more consumer units. In an example, the one or more consumer units process the plurality of partitions in the target order.
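The reorder behavior can be sketched in a few lines. This is an illustrative model, not the claimed method: `ReorderUnit`, `receive`, and `release` are hypothetical names, and releasing partitions as soon as the next one in the target order is available is a design choice made for the example.

```python
from typing import Any, Dict, Hashable, List, Sequence, Tuple


class ReorderUnit:
    """Buffers partitions that arrive out of order and releases them in target order."""

    def __init__(self, target_order: Sequence[Hashable]) -> None:
        self._target = list(target_order)
        self._next = 0                        # position in the target order
        self._stored: Dict[Hashable, Any] = {}

    def receive(self, partition_id: Hashable, data: Any) -> None:
        """Store a partition delivered by any producer unit."""
        self._stored[partition_id] = data

    def release(self) -> List[Tuple[Hashable, Any]]:
        """Return every stored partition that is next in the target order."""
        out = []
        while (self._next < len(self._target)
               and self._target[self._next] in self._stored):
            pid = self._target[self._next]
            out.append((pid, self._stored.pop(pid)))
            self._next += 1
        return out
```

A consumer would call `release()` after each arrival and process whatever comes back; partitions that arrive early simply wait in the buffer until their turn.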
TECHNOLOGIES FOR DYNAMICALLY MANAGING RESOURCES IN DISAGGREGATED ACCELERATORS
Technologies for dynamically managing resources in disaggregated accelerators include an accelerator. The accelerator includes acceleration circuitry with multiple logic portions, each capable of executing a different workload. Additionally, the accelerator includes communication circuitry to receive a workload to be executed by a logic portion of the accelerator and a dynamic resource allocation logic unit to identify a resource utilization threshold associated with one or more shared resources of the accelerator to be used by a logic portion in the execution of the workload, limit, as a function of the resource utilization threshold, the utilization of the one or more shared resources by the logic portion as the logic portion executes the workload, and subsequently adjust the resource utilization threshold as the workload is executed. Other embodiments are also described and claimed.
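The identify, limit, and adjust steps can be sketched as a small limiter. Everything here is assumed for illustration: `SharedResourceLimiter` is a hypothetical name, utilization is expressed as a fraction of one shared resource, and the adjustment simply clamps a new value supplied by the caller rather than following any policy from the abstract.

```python
class SharedResourceLimiter:
    """Caps one logic portion's use of a shared resource at a threshold."""

    def __init__(self, threshold: float) -> None:
        # Threshold is a fraction of the shared resource, between 0.0 and 1.0.
        self.threshold = self._clamp(threshold)

    @staticmethod
    def _clamp(value: float) -> float:
        return max(0.0, min(1.0, value))

    def limit(self, requested: float) -> float:
        """Return the utilization actually granted for a request."""
        return min(requested, self.threshold)

    def adjust(self, new_threshold: float) -> None:
        """Change the threshold while the workload is still executing."""
        self.threshold = self._clamp(new_threshold)
```

For instance, a portion limited to 0.5 that requests 0.8 is granted 0.5; after `adjust(0.9)` the same request is granted in full.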
DATA PROCESSING ENGINE TILE ARCHITECTURE FOR AN INTEGRATED CIRCUIT
An example data processing engine (DPE) for a DPE array in an integrated circuit (IC) includes: a core; a memory including a data memory and a program memory, the program memory coupled to the core, the data memory coupled to the core and including at least one connection to a respective at least one additional core external to the DPE; support circuitry including hardware synchronization circuitry and direct memory access (DMA) circuitry each coupled to the data memory; streaming interconnect coupled to the DMA circuitry and the core; and memory-mapped interconnect coupled to the core, the memory, and the support circuitry.
Robotically serviceable computing rack and sleds
Examples may include racks for a data center and sleds for the racks, the sleds arranged to house physical resources for the data center. The sleds and racks can be arranged to be autonomously manipulated, such as by a robot. The sleds and racks can include features to facilitate automated installation, removal, maintenance, and manipulation by a robot.
TECHNIQUES TO CONFIGURE PHYSICAL COMPUTE RESOURCES FOR WORKLOADS VIA CIRCUIT SWITCHING
Embodiments are generally directed to apparatuses, methods, and techniques to select two or more processing units of a plurality of processing units to process a workload, and to configure a circuit switch to link the two or more processing units to process the workload, the two or more processing units being linked to each other via paths of communication and the circuit switch.
Reconfigurable Parallel Processing
Processors, systems and methods are provided for thread level parallel processing. A processor may comprise a plurality of processing elements (PEs) that each may comprise a configuration buffer, a sequencer coupled to the configuration buffer of each of the plurality of PEs and configured to distribute one or more PE configurations to the plurality of PEs, and a gasket memory coupled to the plurality of PEs and being configured to store at least one PE execution result to be used by at least one of the plurality of PEs during a next PE configuration.