Patent classifications
H03K19/1731
CLOUD-BASED SCALE-UP SYSTEM COMPOSITION
Technologies for composing a managed node with multiple processors on multiple compute sleds to cooperatively execute a workload include a compute sled having a memory, one or more processors connected to the memory, and an accelerator. The accelerator further includes a coherence logic unit that is configured to receive a node configuration request to execute a workload. The node configuration request identifies the compute sled and a second compute sled to be included in a managed node. The coherence logic unit is further configured to modify a portion of local working data associated with the workload on the compute sled in the memory with the one or more processors of the compute sled, determine coherence data indicative of the modification made by the one or more processors of the compute sled to the local working data in the memory, and send the coherence data to the second compute sled of the managed node.
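As a rough behavioral sketch of the coherence flow this abstract describes (the abstract discloses no implementation, and every name below is hypothetical), two sleds can be modeled as objects that track dirty addresses and exchange an update map:

```python
# Minimal sketch, assuming coherence data is a map of modified
# addresses to values; ComputeSled and its methods are invented here.

class ComputeSled:
    def __init__(self, sled_id):
        self.sled_id = sled_id
        self.memory = {}     # local working data: address -> value
        self.dirty = set()   # addresses modified since the last sync

    def modify(self, address, value):
        """Processor-side write to the local working data."""
        self.memory[address] = value
        self.dirty.add(address)

    def collect_coherence_data(self):
        """Coherence logic unit: capture local modifications."""
        update = {addr: self.memory[addr] for addr in self.dirty}
        self.dirty.clear()
        return update

    def apply_coherence_data(self, update):
        """Peer sled applies the update so both copies stay coherent."""
        self.memory.update(update)

# Compose a two-sled managed node and propagate one modification.
primary, secondary = ComputeSled("sled-0"), ComputeSled("sled-1")
primary.modify(0x1000, "partial-result")
secondary.apply_coherence_data(primary.collect_coherence_data())
assert secondary.memory[0x1000] == "partial-result"
```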
Overlay architecture for programming FPGAs
An overlay architecture and an associated method are disclosed that use datapath merging to provide minimal-overhead support for multiple source netlists and optionally provide an adjustable amount of flexibility through a secondary interconnect network.
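For intuition only, datapath merging can be reduced to a toy resource-union problem: functional units that both source netlists need are shared in the overlay, while differing connections would be reconciled by multiplexers in the secondary interconnect. The Python sketch below is a simplification under that assumption, not the disclosed method:

```python
# Toy sketch: each netlist is reduced to a count of functional-unit
# types; the merged overlay needs max(count_a, count_b) of each type.
from collections import Counter

def merge_datapaths(netlist_a, netlist_b):
    merged = Counter()
    for op in set(netlist_a) | set(netlist_b):
        merged[op] = max(netlist_a[op], netlist_b[op])
    return merged

a = Counter({"add": 2, "mul": 1})
b = Counter({"add": 1, "mul": 2, "shift": 1})
print(sorted(merge_datapaths(a, b).items()))
# [('add', 2), ('mul', 2), ('shift', 1)]
```

Counter's built-in union (`a | b`) computes the same per-type maximum; the explicit loop is kept for clarity. A real merger operates on graphs and must also choose port assignments, which is where the adjustable secondary interconnect would come in.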
Technologies for dividing work across accelerator devices
Technologies for dividing work across one or more accelerator devices include a compute device. The compute device is to determine a configuration of each of multiple accelerator devices of the compute device, receive a job to be accelerated from a requester device remote from the compute device, and divide the job into multiple tasks for parallelization among the one or more accelerator devices, as a function of a job analysis of the job and the configuration of each accelerator device. The compute device is further to schedule the tasks to the one or more accelerator devices based on the job analysis and execute the tasks on the one or more accelerator devices in parallel to obtain an output of the job.
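The abstract leaves the job-analysis and scheduling policies open; as one hedged illustration, a job could be split proportionally to each accelerator's capability. All field names in this sketch are invented:

```python
# Sketch of a proportional divide step, assuming a scalar "capability"
# per accelerator summarizes its determined configuration.
def divide_job(job_size, accelerators):
    total = sum(a["capability"] for a in accelerators)
    tasks, offset = [], 0
    for a in accelerators:
        share = job_size * a["capability"] // total
        tasks.append({"device": a["id"], "start": offset, "size": share})
        offset += share
    tasks[-1]["size"] += job_size - offset  # absorb rounding remainder
    return tasks

accels = [{"id": "fpga0", "capability": 3}, {"id": "gpu0", "capability": 1}]
print(divide_job(1000, accels))
# [{'device': 'fpga0', 'start': 0, 'size': 750},
#  {'device': 'gpu0', 'start': 750, 'size': 250}]
```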
TECHNOLOGY MAPPING METHOD OF AN FPGA
A technology mapping method for an FPGA includes converting a gate-level netlist into an AND-Inverter Graph (AIG) netlist, selecting a node among the nodes included in the AIG netlist, generating a cut set including one or more cuts corresponding to the selected node, selecting a best cut by sorting the cuts included in the cut set according to predetermined criteria, and outputting a LUT netlist including the best cut, wherein the predetermined criteria include, as a first criterion, a maximum difference of the levels of the sub-cuts connected in each cut.
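The abstract names only the first criterion. The sketch below ranks candidate cuts by the level spread of their sub-cuts, assuming a smaller spread is preferred; the tie-breakers are invented, and cut enumeration over the AIG is omitted:

```python
# Each candidate cut is reduced to the levels of its sub-cuts.
def rank_cuts(cuts):
    """cuts: list of (cut_id, [sub_cut_levels]) pairs."""
    def key(cut):
        _, levels = cut
        spread = max(levels) - min(levels)          # first criterion
        return (spread, max(levels), len(levels))   # assumed tie-breakers
    return sorted(cuts, key=key)

candidates = [("c0", [3, 1]), ("c1", [2, 2, 2]), ("c2", [4, 3])]
print(rank_cuts(candidates)[0])
# ('c1', [2, 2, 2]): its sub-cuts sit at equal levels, so spread is 0
```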
TECHNOLOGIES FOR MILLIMETER WAVE RACK INTERCONNECTS
Racks and rack pods to support a plurality of sleds, and switches for use in the rack pods, are disclosed herein. A rack comprises a plurality of sleds and a plurality of electromagnetic waveguides. The plurality of sleds are vertically spaced from one another. The plurality of electromagnetic waveguides communicate data signals between the plurality of sleds.
Look up table including magnetic element, FPGA including the look up table, and technology mapping method of the FPGA
A look up table (LUT) includes a decoder configured to decode input signals and to output decoded signals, a storage unit including a plurality of magnetic elements and being configured to select one or more of the plurality of magnetic elements in response to the decoded signals, and a signal input/output (IO) unit configured to output an output signal corresponding to the selected one or more magnetic elements and to program the selected one or more magnetic elements by receiving a write signal.
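Behaviorally, such a LUT can be modeled with the magnetic elements as plain bits: the decoder selects one element from the input signals, and the IO unit either reads it or programs it with the write signal. A minimal sketch with hypothetical names:

```python
# Behavioral model only: real magnetic (e.g., MTJ) elements store
# nonvolatile resistance states, modeled here as 0/1 values.
class MagneticLUT:
    def __init__(self, num_inputs):
        self.elements = [0] * (2 ** num_inputs)  # one element per row

    def _decode(self, inputs):
        """Decoder: map the input signals to one element index."""
        return sum(bit << i for i, bit in enumerate(inputs))

    def read(self, inputs):
        """IO unit: output the state of the selected element."""
        return self.elements[self._decode(inputs)]

    def program(self, inputs, write_signal):
        """IO unit: program the selected element with the write signal."""
        self.elements[self._decode(inputs)] = write_signal

# Configure a 2-input LUT as an AND gate, then evaluate it.
lut = MagneticLUT(2)
lut.program((1, 1), 1)
print(lut.read((1, 1)), lut.read((0, 1)))  # 1 0
```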
Technologies for providing accelerated functions as a service in a disaggregated architecture
Technologies for providing accelerated functions as a service in a disaggregated architecture include a compute device that is to receive a request for an accelerated task. The task is associated with a kernel usable by an accelerator sled communicatively coupled to the compute device to execute the task. The compute device is further to determine, in response to the request and with a database indicative of kernels and associated accelerator sleds, an accelerator sled that includes an accelerator device configured with the kernel associated with the request. Additionally, the compute device is to assign the task to the determined accelerator sled for execution. Other embodiments are also described and claimed.
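The core step is a database lookup from a requested kernel to a sled already configured with it. A minimal sketch follows; the schema and the first-match selection policy are assumptions, since the abstract leaves both open:

```python
# Hypothetical kernel -> configured-sleds database.
kernel_db = {
    "crypto-aes": ["sled-3", "sled-7"],
    "compress-lz": ["sled-2"],
}

def assign_task(request):
    sleds = kernel_db.get(request["kernel"])
    if not sleds:
        raise LookupError(f"no sled configured with {request['kernel']!r}")
    # Selection policy (load, locality, ...) is unspecified; take the first.
    return {"sled": sleds[0], "task": request["task"]}

print(assign_task({"kernel": "crypto-aes", "task": "encrypt-batch-42"}))
# {'sled': 'sled-3', 'task': 'encrypt-batch-42'}
```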
Technologies for deterministic constant-time data compression
A compute device to generate deterministic compressed streams receives a current string to be matched to one or more prior instances of the current string, the current string being located within an input buffer and the one or more prior instances within a history buffer. The compute device identifies a limited subset of index memory designated for storing pointers to the prior instances and identifies a reserved slop region in the index memory. It then compares the current string to at least one prior instance, locating that prior instance using at least one pointer stored within the limited subset of the index memory; the compute device prohibits use of any pointers stored in the reserved slop region of the index memory. Other embodiments are described and claimed.
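One way to read the mechanism: the index memory holds fixed-capacity pointer buckets, the tail of each bucket is the reserved slop region, and the match search never follows slop pointers, which bounds the work per input position and keeps compression time deterministic. The sketch below makes those assumptions concrete; the sizes and the hash are illustrative:

```python
import zlib

BUCKET_SIZE = 8   # total pointer slots per hash bucket
SLOP_SLOTS = 2    # reserved tail slots: stored but never dereferenced

def find_match(history, current, index, pos):
    """Return the position of a prior instance of `current`, or None,
    inspecting only the usable (non-slop) slots of one bucket."""
    bucket = index.setdefault(zlib.crc32(current[:4]) % 1024, [])
    usable = bucket[: BUCKET_SIZE - SLOP_SLOTS]   # prohibit slop pointers
    found = next((p for p in usable
                  if history[p : p + len(current)] == current), None)
    bucket.insert(0, pos)      # newest pointer first; old ones drift into slop
    del bucket[BUCKET_SIZE:]   # fixed capacity keeps per-position work constant
    return found

data = b"abcabcabc"
index = {}
find_match(data, b"abc", index, 0)         # index position 0; no match yet
print(find_match(data, b"abc", index, 3))  # 0: prior instance found
```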