Patent classifications
G06F9/544
MEMORY ALLOCATION FOR DISTRIBUTED PROCESSING DEVICES
Examples described herein relate to an offload processor to receive data for transmission using a network interface or received in a packet by a network interface. In some examples, the offload processor can include a packet storage controller to determine whether to store data in a buffer of the offload processor or a system memory after processing by the offload processor. In some examples, the determination of whether to store data in the buffer of the offload processor or the system memory is based on one or more of: available buffer space, latency limit associated with the data, priority associated with the data, or available bandwidth through an interface between the buffer and the system memory. In some examples, the offload processor is to receive a descriptor and specify a storage location of data in the descriptor, wherein the storage location is within the buffer or the system memory.
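The storage decision described in this abstract can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the `PacketInfo` and `Descriptor` types, the thresholds, and the order in which the four criteria are checked are assumptions chosen only to show how such a policy might be combined with a descriptor update.

```python
from dataclasses import dataclass


@dataclass
class PacketInfo:
    size_bytes: int
    latency_limit_us: float  # hypothetical: delivery deadline for the data
    priority: int            # hypothetical: larger value means more important


@dataclass
class Descriptor:
    location: str = ""       # filled in as "buffer" or "system_memory"


def choose_storage(pkt, free_buffer_bytes, link_free_fraction,
                   latency_threshold_us=10.0, high_priority=5):
    """Return 'buffer' or 'system_memory' for data that has been processed.

    The ordering of the checks is an assumption made for illustration.
    """
    # Available buffer space: data that does not fit must go to system memory.
    if pkt.size_bytes > free_buffer_bytes:
        return "system_memory"
    # Latency limit or priority: keep urgent data in the local buffer.
    if pkt.latency_limit_us <= latency_threshold_us or pkt.priority >= high_priority:
        return "buffer"
    # Available bandwidth: avoid the transfer while the interface is congested.
    if link_free_fraction < 0.2:
        return "buffer"
    return "system_memory"


def fill_descriptor(desc, pkt, free_buffer_bytes, link_free_fraction):
    """Record the chosen storage location in the descriptor."""
    desc.location = choose_storage(pkt, free_buffer_bytes, link_free_fraction)
    return desc


urgent = fill_descriptor(Descriptor(), PacketInfo(1500, 5.0, 1), 4096, 0.9)
bulk = fill_descriptor(Descriptor(), PacketInfo(1500, 100.0, 1), 4096, 0.9)
print(urgent.location, bulk.location)  # buffer system_memory
```

A real controller would likely weigh these criteria together rather than test them in a fixed order; the fixed order is used here only to keep the sketch short.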
EVENT-DRIVEN PROGRAMMING MODEL BASED ON ASYNCHRONOUS, MASSIVELY PARALLEL DATAFLOW PROCESSES FOR HIGHLY-SCALABLE DISTRIBUTED APPLICATIONS
An example method comprises receiving one or more published events by an event hook application program interface (API) from one or more client applications, passing a model to a web server configured to generate web containers in concurrent threads, receiving, by any number of worker nodes, each web container, each of the worker nodes including a system agent program for dynamically assigned functions, the web containers being provided to the any number of worker nodes for logical isolation of system agent execution in memory, and performing the dynamically assigned functions by the system agent program in a blackboard memory, the blackboard memory being a shared memory with non-blocking reads and writes and performing functionality, the dynamically assigned functions being executed in parallel and at least two of the dynamically assigned functions sharing context between inter-dependent processes.
EXTENSIBLE MULTI-PRECISION DATA PIPELINE FOR COMPUTING NON-LINEAR AND ARITHMETIC FUNCTIONS IN ARTIFICIAL NEURAL NETWORKS
An extensible multi-precision data pipeline system, comprising: a local buffer that stores an input local data set in a local storage format; an input tensor shaper coupled to the local buffer that reads the input local data set and converts the input local data set into an input tensor data set having a tensor format of vector width N by tensor length L; a cascaded pipeline coupled to the input tensor shaper that routes the input tensor data set through at least one function stage, resulting in an output tensor data set; and an output tensor shaper coupled to the cascaded pipeline that converts the output tensor data set into an output local data set having the local storage format, wherein the output tensor shaper writes the output local data set to the local buffer.
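The data flow in this abstract (local buffer, input shaper, cascaded stages, output shaper, local buffer) can be sketched in software as follows. This is only an illustration under stated assumptions: the local storage format is taken to be a flat list, zero padding is assumed for the last row, the function names are hypothetical, and the multi-precision aspect of the claimed hardware is not modeled.

```python
def to_tensor(local_data, n):
    """Input shaper: flat list -> L rows of vector width n (zero-padded)."""
    if n <= 0:
        raise ValueError("vector width must be positive")
    padded = list(local_data) + [0.0] * (-len(local_data) % n)
    return [padded[i:i + n] for i in range(0, len(padded), n)]


def run_pipeline(tensor, stages):
    """Cascaded pipeline: apply each function stage element-wise, in order."""
    for stage in stages:
        tensor = [[stage(x) for x in row] for row in tensor]
    return tensor


def to_local(tensor, count):
    """Output shaper: rows -> flat list, dropping the padding elements."""
    flat = [x for row in tensor for x in row]
    return flat[:count]


def relu(x):
    return x if x > 0 else 0.0


def scale2(x):
    return 2.0 * x


local_buffer = [-1.0, 2.0, 3.0, -4.0, 5.0]
tensor_in = to_tensor(local_buffer, 4)            # N = 4, so L = 2
tensor_out = run_pipeline(tensor_in, [relu, scale2])
local_buffer = to_local(tensor_out, 5)            # written back to the buffer
print(local_buffer)  # [0.0, 4.0, 6.0, 0.0, 10.0]
```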
ACCELERATION CIRCUITRY FOR POSIT OPERATIONS
Systems, apparatuses, and methods related to acceleration circuitry for posit operations are described. Signaling indicative of performance of an operation to write a first bit string to a first buffer resident on acceleration circuitry and a second bit string resident on the acceleration circuitry can be received at a DMA controller couplable to the acceleration circuitry. The acceleration circuitry can be configured to perform arithmetic operations, logical operations, or both on bit strings formatted in a unum or posit format. Signaling indicative of an arithmetic operation, a logical operation, or both, to be performed using the first and second bit strings can be transmitted to the acceleration circuitry. The arithmetic operation, the logical operation, or both can be performed via the acceleration circuitry and according to the signaling. Signaling indicative of a result of the arithmetic operation, the logical operation, or both can be transmitted to the DMA controller.
MULTI-CHANNEL DATA PATH CIRCUITRY
Techniques are disclosed relating to sharing datapath circuitry among multiple SIMD groups. In some embodiments, pipeline circuitry is configured to perform operations specified by instructions of first and second assigned SIMD groups. The pipeline circuitry may include first and second front-end circuitry configured to decode instructions of the respective SIMD groups. The pipeline circuitry may include shared execution circuitry configured to perform operations specified by the first and second assigned SIMD groups and arbitration circuitry configured to select an instruction from among at least the first and second front-end circuitry for assignment to the shared execution circuitry in a current cycle. The arbitration circuitry may select an instruction based on one or more of: stall counts, whether available instructions are being speculatively executed, whether ones of available instructions target a particular portion of the shared execution circuitry, numbers of execution cycles, and SIMD group ages.
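As a rough software analogy for the arbitration step, the sketch below picks one candidate instruction per cycle using three of the listed criteria. The `Candidate` type and the priority order (non-speculative first, then highest stall count, then oldest SIMD group) are assumptions made for illustration; the abstract lists the criteria but does not say how they are combined.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    simd_group: int
    stall_count: int          # cycles this front-end has waited
    group_age: int            # larger value means an older SIMD group
    speculative: bool = False


def arbitrate(candidates):
    """Select one candidate for the shared execution circuitry this cycle.

    Assumed order: prefer non-speculative work, then the most-stalled
    front-end, then the oldest SIMD group. Returns None if nothing is ready.
    """
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda c: (not c.speculative, c.stall_count, c.group_age),
    )


ready = [Candidate(simd_group=0, stall_count=3, group_age=10),
         Candidate(simd_group=1, stall_count=5, group_age=2)]
print(arbitrate(ready).simd_group)  # 1
```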
AUTOMATICALLY INTRODUCING REGISTER DEPENDENCIES TO TESTS
Method, apparatus and product for automatically introducing register dependency into tests. A test template represents an abstract test scenario to be utilized for testing a target processor. The abstract test scenario requires that a value be assigned to a register. A test that implements the abstract test scenario is generated. The test is a set of instructions that are executable by the target processor. The generation of the test comprises: determining a memory address to retain the value in a memory that is accessible to the target processor; and adding to the test an instruction to load to the register the value from the memory address, thereby adding a register dependency to the test that is not required by the abstract test scenario. The test can be executed on the target processor or simulation thereof.
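The generation step can be illustrated with a small sketch. The template format, the `LOAD` mnemonic, and the address allocation scheme are all hypothetical; the only point shown is that a "set register to value" requirement is realized by placing the value in memory and emitting a load, which makes later users of the register depend on that load.

```python
def generate_test(template, base_address=0x1000, word_size=8):
    """Expand an abstract template into (memory_image, instructions).

    Each 'set_reg' step is implemented as a load from a freshly chosen
    memory address rather than as an immediate move (illustrative only).
    """
    memory = {}
    instructions = []
    next_addr = base_address
    for step in template:
        if step["op"] == "set_reg":
            # Determine a memory address to hold the required value.
            memory[next_addr] = step["value"]
            # Add a load from that address into the register.
            instructions.append(f"LOAD {step['reg']}, [{next_addr:#x}]")
            next_addr += word_size
        else:
            instructions.append(step["text"])
    return memory, instructions


template = [
    {"op": "set_reg", "reg": "r1", "value": 42},
    {"op": "instr", "text": "ADD r2, r1, r1"},
]
memory, instructions = generate_test(template)
print(instructions)  # ['LOAD r1, [0x1000]', 'ADD r2, r1, r1']
```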
METHOD AND SYSTEM FOR FACILITATING DATA PLACEMENT AND CONTROL OF PHYSICAL ADDRESSES WITH MULTI-QUEUE I/O BLOCKS
A system is provided to receive a request to write a sector of data to a non-volatile storage device, wherein the request is associated with a physical address in the non-volatile storage device at which the sector of data is to be written. The system identifies, based on the physical address, a channel buffer to which the sector of data is to be transmitted, and stores the sector of data in the channel buffer. Responsive to determining that the channel buffer stores other sectors, the system writes the sector of data and the other sectors of data to the non-volatile storage device based on the physical address.
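A minimal sketch of the write path described in this abstract follows. The `ChannelWriter` class, the modulo mapping from physical address to channel, and the dictionary standing in for the non-volatile device are assumptions; the abstract says only that the channel buffer is identified from the physical address.

```python
from collections import defaultdict


class ChannelWriter:
    """Buffers sectors per channel and writes them out in batches."""

    def __init__(self, num_channels):
        self.num_channels = num_channels
        self.buffers = defaultdict(list)  # channel -> [(address, sector)]
        self.device = {}                  # stand-in for the storage device

    def channel_for(self, physical_address):
        # Assumption: channels are interleaved on the low-order address bits.
        return physical_address % self.num_channels

    def write(self, physical_address, sector):
        channel = self.channel_for(physical_address)
        has_other_sectors = bool(self.buffers[channel])
        self.buffers[channel].append((physical_address, sector))
        # If the buffer already held other sectors, write them all together.
        if has_other_sectors:
            self.flush(channel)

    def flush(self, channel):
        for address, sector in self.buffers[channel]:
            self.device[address] = sector
        self.buffers[channel].clear()


writer = ChannelWriter(num_channels=4)
writer.write(8, b"a")    # channel 0, held in the buffer
writer.write(5, b"b")    # channel 1, held in the buffer
writer.write(12, b"c")   # channel 0 again, so both sectors are written
print(sorted(writer.device))  # [8, 12]
```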
COMPUTING SYSTEM FOR MACRO GENERATION, MODIFICATION, VERIFICATION, AND EXECUTION
An automation application is described herein. The automation application executes on a computing device and accesses a macro for a target application. The macro has been generated based upon a sequence of inputs from a user received by the target application that causes the target application to perform an action, screen states of the target application as the target application receives the sequence of inputs from the user, operating system processes that are performed by an operating system as the target application receives the sequence of inputs from the user, and evidence events representing information obtained from the operating system processes. The automation application executes the macro, wherein executing the macro causes the automation application to mimic the sequence of inputs to the target application, thereby causing the target application to perform the action.
SYSTEM AND METHOD FOR EFFICIENT MULTI-GPU RENDERING OF GEOMETRY BY SUBDIVIDING GEOMETRY
A method for graphics processing is described. The method includes rendering graphics for an application using a plurality of graphics processing units (GPUs). The method includes using the plurality of GPUs in collaboration to render an image frame including a plurality of pieces of geometry. The method includes, during the rendering of the image frame, subdividing one or more of the plurality of pieces of geometry into smaller pieces, and dividing the responsibility for rendering these smaller pieces of geometry among the plurality of GPUs, wherein each of the smaller pieces of geometry is processed by a corresponding GPU. The method includes, for those pieces of geometry that are not subdivided, dividing the responsibility for rendering the pieces of geometry among the plurality of GPUs, wherein each of these pieces of geometry is processed by a corresponding GPU.
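The division of work can be sketched as follows. This is a simplified illustration, not the claimed method: pieces are modeled as hypothetical `(name, triangle_count)` pairs, a piece is subdivided when it exceeds an assumed size threshold, and responsibility is handed out round-robin.

```python
def subdivide(piece, parts):
    """Split one piece into `parts` smaller pieces of near-equal size."""
    name, triangles = piece
    base, extra = divmod(triangles, parts)
    return [(f"{name}.{i}", base + (1 if i < extra else 0))
            for i in range(parts)]


def assign(pieces, num_gpus, split_threshold):
    """Return one work list per GPU.

    Pieces above `split_threshold` triangles are subdivided first (an
    assumed criterion); every resulting piece goes to exactly one GPU.
    """
    units = []
    for piece in pieces:
        if piece[1] > split_threshold:
            units.extend(subdivide(piece, num_gpus))
        else:
            units.append(piece)
    per_gpu = [[] for _ in range(num_gpus)]
    for i, unit in enumerate(units):
        per_gpu[i % num_gpus].append(unit)
    return per_gpu


work = assign([("rock", 100), ("terrain", 1001)], num_gpus=2, split_threshold=500)
print(work[0])  # [('rock', 100), ('terrain.1', 500)]
print(work[1])  # [('terrain.0', 501)]
```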
MEMORY SYSTEM AND OPERATING METHOD THEREOF
A memory system is provided to include a first virtual function controller in communication with a first virtual machine of a host and configured to receive, from the first virtual machine, a command for accessing a namespace and provide, to the first virtual machine, a response to the command; a second virtual function controller in communication with a second virtual machine of the host and configured to be coupled to the namespace and receive the command from the first virtual function controller based on status information of the first virtual function controller and the second virtual function controller; a buffer memory configured to provide an area for data corresponding to the command; and a memory controller configured to access the namespace based on the command and provide the buffer memory with the data.