G06F15/7889

Flexible physical function and virtual function mapping

Techniques and mechanisms provide a flexible mapping for physical functions and virtual functions in an environment including virtual machines.

Method, apparatus and system for data stream processing with a programmable accelerator
09830154 · 2017-11-28 · ·

Techniques and mechanisms for programming an accelerator device to enable performance of a data processing algorithm. In an embodiment, an accelerator of a computer platform is programmed based on programming information received from a host processor of the computer platform. In another embodiment, programming of the accelerator is to enable data driven execution of an instruction by a data stream processing engine of the accelerator.

MULTIPLE DIES HARDWARE PROCESSORS AND METHODS

Methods and apparatuses relating to hardware processors with multiple interconnected dies are described. In one embodiment, a hardware processor includes a plurality of physically separate dies, and an interconnect to electrically couple the plurality of physically separate dies together. In another embodiment, a method to create a hardware processor includes providing a plurality of physically separate dies, and electrically coupling the plurality of physically separate dies together with an interconnect.

System Having a Hybrid Threading Processor, a Hybrid Threading Fabric Having Configurable Computing Elements, and a Hybrid Interconnection Network
20230168897 · 2023-06-01 ·

Representative apparatus, method, and system embodiments are disclosed for configurable computing. In a representative embodiment, a system includes an interconnection network, a processor, a host interface, and a configurable circuit cluster. The configurable circuit cluster may include a plurality of configurable circuits arranged in an array; an asynchronous packet network and a synchronous network coupled to each configurable circuit of the array; and a memory interface circuit and a dispatch interface circuit coupled to the asynchronous packet network and to the interconnection network. Each configurable circuit includes instruction or configuration memories for selection of a current data path configuration, a master synchronous network input, and a data path configuration for a next configurable circuit.

RISC-V-BASED 3D INTERCONNECTED MULTI-CORE PROCESSOR ARCHITECTURE AND WORKING METHOD THEREOF

An RISC-V-based 3D interconnected multi-core processor architecture and a working method thereof. The RISC-V-based 3D interconnected multi-core processor architecture includes a main control layer, a micro core array layer and an accelerator layer, wherein the main control layer includes a plurality of main cores which are RISC-V instruction set CPU cores, the micro core array layer includes a plurality of micro unit groups including a micro core, a data storage unit, an instruction storage unit and a linking controller, wherein the micro core is an RISC-V instruction set CPU core that executes partial functions of the main core; the accelerator layer is configured to optimize a running speed of space utilization for accelerators meeting specific requirements, wherein some main cores in the main control layer perform data interaction with the accelerator layer, the other main cores interact with the micro core array layer.

FPGA coprocessor with sparsity and density modules for execution of low and high parallelism portions of graph traversals

An FPGA-based graph data processing method is provided for executing graph traversals on a graph having characteristics of a small-world network by using a first processor being a CPU and a second processor that is a FPGA and is in communicative connection with the first processor, wherein the first processor sends graph data to be traversed to the second processor, and obtains result data of the graph traversals from the second processor for result output after the second processor has completed the graph traversals of the graph data by executing level traversals, and the second processor comprises a sparsity processing module and a density processing module, the sparsity processing module operates in a beginning stage and/or an ending stage of the graph traversals, and the density processing module with a higher degree of parallelism than the sparsity processing module operates in the intermediate stage of the graph traversals.

COMPUTING ACCELERATION FRAMEWORK
20220057997 · 2022-02-24 · ·

A processing acceleration system including at least one gate array that performs finite field arithmetic and at least one controller that sends information to the gate array(s) upon a determination that sending the information, performing the finite field arithmetic by the gate array(s), and sending results of the finite field arithmetic to at least one destination is more efficient than general-purpose computing processor(s) performing the finite field arithmetic and sending the results to the at least one destination. The gate array(s) may include field programmable gate array(s), and the destination(s) may include the general-purpose computing processor(s) or storage devices. The finite field arithmetic may include galois field arithmetic such as modular arithmetic, for example as may be used with respect to erasure coding for storage device(s).

Hardware accelerator configuration by a translation of configuration data
09785444 · 2017-10-10 · ·

A microprocessor circuit may include a software programmable microprocessor core and a data memory accessible via a data memory bus. The data memory may include sets of configuration data structured according to respective predetermined data structure specifications for configurable math hardware accelerators, and sets of input data for configurable math hardware accelerators, each configured to apply a predetermined signal processing function to the set of input data according to received configuration data. A configuration controller is coupled to the data memory via the data memory bus and to the configurable math hardware accelerators. The configuration controller may fetch the configuration data for each math hardware accelerator from the data memory and translate the configuration data. The configuration controller may transmit each set of configuration data to the corresponding configurable math hardware accelerator and write the configuration data to configuration registers of the math hardware accelerator.

MULTIPLE DIES HARDWARE PROCESSORS AND METHODS

Methods and apparatuses relating to hardware processors with multiple interconnected dies are described. In one embodiment, a hardware processor includes a plurality of physically separate dies, and an interconnect to electrically couple the plurality of physically separate dies together. In another embodiment, a method to create a hardware processor includes providing a plurality of physically separate dies, and electrically coupling the plurality of physically separate dies together with an interconnect.

Time-Multiplexed use of Reconfigurable Hardware

A method for executing applications in a system comprising general hardware and reconfigurable hardware includes accessing a first execution file comprising metadata storing a first priority indicator associated with a first application, and a second execution file comprising metadata storing a second priority indicator associated with a second application. In an example, use of the reconfigurable hardware is interleaved between the first application and the second application, and the interleaving is scheduled to take into account (i) workload of the reconfigurable hardware and (ii) the first priority indicator and the second priority indicator associated with the first application and the second application, respectively. In an example, when the reconfigurable hardware is used by one of the first and second applications, the general hardware is used by another of the first and second applications.