G06F15/17337

MANAGING RESOURCE SHARING IN A MULTI-CORE DATA PROCESSING FABRIC
20210055965 · 2021-02-25

Systems and methods provide an extensible, multi-stage, real-time, processing-load-adaptive manycore data processing architecture that is shared dynamically among instances of parallelized and pipelined application software programs, according to the processing load variations of those programs, their tasks, and their instances, as well as contractual policies. The invented techniques simultaneously provide application software development productivity, by presenting software with a simple, static virtual view of the dynamically allocated and assigned processing hardware resources; high program runtime performance, through scalable pipelined and parallelized program execution with minimized overhead; and high resource efficiency, through adaptively optimized allocation of processing resources.
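
The allocation policy described here (demand-driven sharing of cores, constrained by contractual entitlements) can be illustrated with a minimal sketch. This is not the patented implementation; the function name, the two-phase split, and the one-core-at-a-time grant loop are illustrative assumptions:

```python
def allocate_cores(total_cores, demands, entitlements):
    """Sketch of load-adaptive core allocation: satisfy each
    application's contractual minimum first (capped by its actual
    demand), then hand spare cores to the application with the
    largest unmet demand, one core at a time."""
    # Phase 1: contractual entitlements, capped by demand.
    alloc = {a: min(demands[a], entitlements.get(a, 0)) for a in demands}
    spare = total_cores - sum(alloc.values())
    # Phase 2: adaptively grant remaining cores by unmet demand.
    while spare > 0:
        unmet = {a: demands[a] - alloc[a] for a in demands}
        best = max(unmet, key=unmet.get)
        if unmet[best] <= 0:
            break  # all demand satisfied; leave leftover cores idle
        alloc[best] += 1
        spare -= 1
    return alloc
```

Re-running this on each allocation cycle, as loads change, gives the dynamic sharing behavior the abstract describes while software sees only its own (virtually static) core set.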

Managing resource sharing in a multi-core data processing fabric
10963306 · 2021-03-30

Systems and methods provide an extensible, multi-stage, real-time, processing-load-adaptive manycore data processing architecture that is shared dynamically among instances of parallelized and pipelined application software programs, according to the processing load variations of those programs, their tasks, and their instances, as well as contractual policies. The invented techniques simultaneously provide application software development productivity, by presenting software with a simple, static virtual view of the dynamically allocated and assigned processing hardware resources; high program runtime performance, through scalable pipelined and parallelized program execution with minimized overhead; and high resource efficiency, through adaptively optimized allocation of processing resources.

INFORMATION PROCESSING APPARATUS AND INFORMATION PROCESSING METHOD
20210089343 · 2021-03-25

An information processing apparatus includes a memory and a processor coupled to the memory. The memory is configured to include a reception buffer into which data destined for a virtual machine operating in the apparatus is written. The processor is configured to continuously allocate a first storage area of the reception buffer to a first coprocessor, which is an offload destination of a relay process of a virtual switch, and to allocate a second storage area of the reception buffer to a second coprocessor, which is an offload destination of an extension process of the virtual switch, when an allocation request for the reception buffer is received from the second coprocessor.
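
The buffer-management scheme in this abstract (one area held continuously, another granted on request) can be sketched as follows. The class name, bump-pointer allocation, and sizes are assumptions for illustration only:

```python
class ReceptionBuffer:
    """Sketch of the described allocation: a fixed first area is
    reserved continuously for the relay-offload coprocessor, while
    further areas are granted only when a coprocessor asks."""

    def __init__(self, size, relay_area_len):
        self.size = size
        self.relay_area = (0, relay_area_len)  # always held for coprocessor 1
        self.next_free = relay_area_len
        self.grants = {}

    def request(self, coproc_id, length):
        """Allocate a storage area on request (e.g. from the
        extension-offload coprocessor)."""
        if self.next_free + length > self.size:
            raise MemoryError("reception buffer exhausted")
        area = (self.next_free, self.next_free + length)
        self.grants[coproc_id] = area
        self.next_free += length
        return area
```

The asymmetry mirrors the claim: the relay path gets a persistent region at construction time, while the extension path triggers allocation dynamically.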

PROCESSING CIRCUIT, INFORMATION PROCESSING APPARATUS, AND INFORMATION PROCESSING METHOD
20210011716 · 2021-01-14

An information processing circuit includes an accelerator function unit (AFU), an FPGA interface unit (FIU), a tag check unit, and an output control unit. The AFU sequentially obtains write control instructions for a plurality of kinds of data, including an output-waiting instruction that stops output of a subsequent instruction. The FIU sequentially outputs the write control instructions via a first path or a second path. The tag check unit receives responses to the write control instructions output from the FIU. The output control unit selects one of the first path and the second path based on the storage address of each write control instruction, determines whether write control instructions need to be mixed, mixes them as needed, and causes the FIU to output the result.
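
The output control unit's two decisions (path selection by storage address, and mixing of write control instructions) can be modeled in a small sketch. The address boundary and the contiguous-address merge rule are assumptions, not details from the patent:

```python
def route_and_mix(instrs, path_boundary=0x8000):
    """Sketch: choose path 1 or 2 from each instruction's storage
    address, and mix (merge) back-to-back writes whose addresses are
    contiguous on the same path before handing them to the FIU."""
    out = []
    for addr, data in instrs:
        path = 1 if addr < path_boundary else 2
        if out and out[-1][0] == path and out[-1][1] + len(out[-1][2]) == addr:
            prev_path, prev_addr, prev_data = out[-1]
            out[-1] = (prev_path, prev_addr, prev_data + data)  # mix
        else:
            out.append((path, addr, data))
    return out
```

Mixing contiguous writes reduces the number of instructions the FIU must emit, which is the efficiency angle such output-control logic typically targets.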

DATA PROCESSING UNIT FOR STREAM PROCESSING

A new processing architecture is described that utilizes a data processing unit (DPU). Unlike conventional compute models, which are centered around a central processing unit (CPU), the DPU is designed for a data-centric computing model in which data processing tasks are centered around the DPU. The DPU may be viewed as a highly programmable, high-performance I/O and data-processing hub designed to aggregate and process network and storage I/O to and from other devices. The DPU comprises a network interface to connect to a network, one or more host interfaces to connect to one or more application processors or storage devices, and a multi-core processor with two or more processing cores executing a run-to-completion data plane operating system and one or more processing cores executing a multi-tasking control plane operating system. The data plane operating system is configured to support software functions for performing the data processing tasks.
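
The run-to-completion model that the data-plane cores execute can be illustrated with a minimal sketch: each work unit is carried through every processing stage without preemption before the next unit is dequeued. The function name and stage pipeline are illustrative assumptions:

```python
def run_to_completion_core(rx_queue, handlers):
    """Sketch of a data-plane core's loop: dequeue one work unit and
    process it fully through all stages (no preemption, no task
    switch) before touching the next unit."""
    results = []
    while rx_queue:
        pkt = rx_queue.pop(0)
        for stage in handlers:   # e.g. parse -> process -> forward
            pkt = stage(pkt)
        results.append(pkt)
    return results
```

This contrasts with the multi-tasking control plane cores, which would time-slice between management tasks rather than pinning one unit of work to completion.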

Maximizing high link bandwidth utilization through efficient component communication in disaggregated datacenters

Embodiments are provided herein for facilitating high link bandwidth utilization in a disaggregated computing system. A plurality of general purpose links are used to connect respective pluralities of computing elements. A traffic pattern between respective ones of a first plurality of computing elements of a first type and respective ones of a second plurality of computing elements of a second type is detected. The first and second pluralities of computing elements are dynamically connected through the respective ones of the plurality of general purpose links according to the detected traffic pattern.
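
The connection step (dividing a fixed pool of general-purpose links among element pairs according to detected traffic) can be sketched as a proportional assignment. The function, the proportional rule, and the one-link minimum per active pair are assumptions for illustration:

```python
def assign_links(traffic, n_links):
    """Sketch: share general-purpose links among (element, element)
    pairs in proportion to observed traffic, heaviest pairs first,
    giving each active pair at least one link while links remain."""
    total = sum(traffic.values())
    links = {}
    remaining = n_links
    for pair in sorted(traffic, key=traffic.get, reverse=True):
        if remaining == 0:
            break
        want = max(1, round(n_links * traffic[pair] / total))
        links[pair] = min(want, remaining)
        remaining -= links[pair]
    return links
```

Re-evaluating this whenever the detected traffic pattern shifts yields the dynamic reconnection behavior the embodiment describes.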

STREAMING FABRIC INTERFACE
20200327088 · 2020-10-15

An interface for coupling an agent to a fabric supports a load/store interconnect protocol and includes a header channel implemented on a first subset of a plurality of physical lanes, the first subset of lanes including first lanes to carry a header of a packet based on the interconnect protocol and second lanes to carry metadata for the header. The interface additionally includes a data channel implemented on a separate second subset of the plurality of physical lanes, the second subset of lanes including third lanes to carry a payload of the packet and fourth lanes to carry metadata for the payload.
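
The lane split this abstract describes (header plus header metadata on one subset of physical lanes, payload plus payload metadata on a disjoint subset) can be sketched as a byte-to-lane mapping. The lane counts and padding behavior are assumptions, not figures from the patent:

```python
def map_to_lanes(header, hdr_meta, payload, pld_meta,
                 hdr_lanes=16, data_lanes=48):
    """Sketch: drive the header channel (header bytes + header
    metadata) on one physical-lane subset and the data channel
    (payload bytes + payload metadata) on a separate subset."""
    hdr_channel = list(header) + list(hdr_meta)
    data_channel = list(payload) + list(pld_meta)
    assert len(hdr_channel) <= hdr_lanes and len(data_channel) <= data_lanes
    # Pad unused lanes so each channel drives its full lane subset.
    hdr_channel += [0] * (hdr_lanes - len(hdr_channel))
    data_channel += [0] * (data_lanes - len(data_channel))
    return hdr_channel, data_channel
```

Keeping the two channels on disjoint lane subsets lets header processing and payload transfer proceed independently, which is the usual motivation for this kind of split.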

Dynamic memory-based communication in disaggregated datacenters

Embodiments are provided herein for dynamic memory-based communication in a disaggregated computing system. A pool of similar computing elements is configured as a large address space, the large address space segmented by an identifier. Data travel distances are optimized depending on a historical or expected use of a data object by using a grouping and amortization algorithm to relocate the data object within the pool of similar computing elements at a particular address within the large address space according to the historical or expected use.
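
The addressing scheme (one large address space over the pool, segmented by a computing-element identifier) can be sketched with the identifier in the high-order bits. The bit width and the offset-preserving relocation are illustrative assumptions:

```python
SEG_BITS = 32  # assumed per-element offset width

def global_addr(element_id, offset):
    """Sketch: form a pool-wide address by placing the computing-
    element identifier in the high-order bits above the offset."""
    return (element_id << SEG_BITS) | offset

def relocate(obj_addr, target_element):
    """Move a data object to another element in the pool, keeping
    its offset -- a stand-in for the grouping/amortization step that
    places hot objects nearer their users."""
    offset = obj_addr & ((1 << SEG_BITS) - 1)
    return global_addr(target_element, offset)
```

A grouping/amortization algorithm would call something like `relocate` for objects whose historical or expected use indicates a shorter data travel distance from a different element.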

Task switching and inter-task communications for coordination of applications executing on a multi-user parallel processing architecture
10789099 · 2020-09-29

Systems and methods for managing execution of software programs on an array of processing units may involve: monitoring an amount of processing input at one or more input buffers that buffer processing input for each program; assigning task instances of each program to the array for concurrent processing of the programs' processing input; adjusting the relative portion of the processing input to be processed by each assigned task instance of a given program, based on whether more or fewer task instances of that program had been assigned to the array on a prior assignment cycle; and connecting, in accordance with the assignment, the processing input from each input buffer to a different processing unit, so as to deliver the processing input to the appropriate program.
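
One assignment cycle of this scheme can be sketched as follows: each program's buffered input is split over its assigned instances (so per-instance shares grow when fewer instances are assigned and shrink when more are), and each instance is wired to a distinct processing unit. The even split and the sequential unit assignment are illustrative assumptions:

```python
def assign_and_split(input_backlog, instance_counts, n_units):
    """Sketch of one cycle: divide each program's buffered input
    evenly over its assigned task instances, then connect every
    instance to a different processing unit of the array."""
    assert sum(instance_counts.values()) <= n_units
    unit = 0
    plan = []
    for prog, n_inst in instance_counts.items():
        per_instance = input_backlog[prog] / n_inst
        for i in range(n_inst):
            plan.append((prog, i, per_instance, unit))
            unit += 1
    return plan
```

Between cycles, a scheduler would recompute `instance_counts` from the monitored backlogs; the per-instance share then adjusts automatically relative to the prior cycle, as the claim describes.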

TASK SWITCHING AND INTER-TASK COMMUNICATIONS FOR COORDINATION OF APPLICATIONS EXECUTING ON A MULTI-USER PARALLEL PROCESSING ARCHITECTURE
20200285517 · 2020-09-10

Systems and methods provide an extensible, multi-stage, real-time, processing-load-adaptive manycore data processing architecture that is shared dynamically among instances of parallelized and pipelined application software programs, according to the processing load variations of those programs, their tasks, and their instances, as well as contractual policies. The invented techniques simultaneously provide application software development productivity, by presenting software with a simple, static virtual view of the dynamically allocated and assigned processing hardware resources; high program runtime performance, through scalable pipelined and parallelized program execution with minimized overhead; and high resource efficiency, through adaptively optimized allocation of processing resources.