Patent classifications
G06F9/544
Method and apparatus for a step-enabled workflow
Aspects of the disclosure provide methods and an apparatus including processing circuitry configured to receive workflow information of a workflow. The processing circuitry generates, based on the workflow information, the workflow including a first buffering task and a plurality of processing tasks that includes a first processing task and a second processing task. The first processing task is caused to enter a running state in which a subset of input data is processed and output to the first buffering task as first processed subset data. The first processing task is caused to transition from the running state to a non-running state based on an amount of the first processed subset data in the first buffering task being equal to a first threshold. Subsequently, the second processing task is caused to enter a running state in which the first processed subset data in the first buffering task is processed.
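The step-wise hand-off described in this abstract can be sketched as follows. This is a minimal single-threaded illustration, not the claimed implementation: the function name `run_step_workflow`, the `trace` labels, and the two arithmetic "processing" steps are hypothetical placeholders, and the buffering task is modeled as a plain in-memory queue.

```python
from collections import deque


def run_step_workflow(input_data, threshold):
    """Alternate two tasks around a bounded buffer; return (results, trace)."""
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    pending = deque(input_data)  # input not yet seen by the first task
    buffer = deque()             # stands in for the first buffering task
    results, trace = [], []
    while pending or buffer:
        # First task runs until the buffer holds `threshold` items
        # (or the input is exhausted), then leaves the running state.
        while pending and len(buffer) < threshold:
            buffer.append(pending.popleft() * 2)  # hypothetical processing
        trace.append(("task1_paused", len(buffer)))
        # Second task then runs and consumes the buffered subset.
        while buffer:
            results.append(buffer.popleft() + 1)  # hypothetical processing
        trace.append(("task2_paused", len(results)))
    return results, trace
```

Calling `run_step_workflow([1, 2, 3, 4, 5], threshold=2)` processes the input in subsets of at most two items, with only one task in the running state at a time.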
GRAPHICS PROCESSING UNIT INVOKING METHOD, CENTRAL PROCESSING UNIT AND APPARATUS
The present application provides a method for invoking a graphics processing unit, a central processing unit, and an apparatus. The method is applied to the central processing unit, which has a first process and a second process running therein. The method comprises: in response to an invoking instruction for invoking a programming interface corresponding to an execution task of the first process, invoking, by the first process, hijacking code corresponding to the programming interface; running, by the first process, the hijacking code to send a running request to the second process, wherein the running request instructs the second process to invoke the programming interface; and invoking, by the second process, the graphics processing unit by invoking the programming interface in response to the running request, after which the graphics processing unit processes the execution task.
Host apparatus, heterogeneous system architecture device, and heterogeneous system based on unified virtual memory
Disclosed herein is a heterogeneous system based on unified virtual memory. The heterogeneous system may include a host that compiles a kernel program, which is source code of a user application, into binary form and delivers the compiled kernel program to a heterogeneous system architecture device; the heterogeneous system architecture device, which processes operations of the kernel program delivered from the host in parallel using two or more different types of processing elements; and unified virtual memory shared between the host and the heterogeneous system architecture device.
Methods and apparatus for reordering signals
Various embodiments of the present technology may provide methods and apparatus for reordering signals that are generated by a sensor. The apparatus may receive the generated signals in the form of a plurality of X-bit input signals and generate a plurality of output signals according to an exemplary reordering scheme. The apparatus may perform the exemplary reordering scheme based on one or more states of a state machine.
Scheduling off-chip memory access for programs with predictable execution
A machine learning network is implemented by executing a computer program of instructions on a machine learning accelerator (MLA) comprising a plurality of interconnected storage elements (SEs) and processing elements (PEs). The instructions are partitioned into blocks, which are retrieved from off-chip memory. Each block includes a set of deterministic instructions (MLA instructions) to be executed by on-chip storage elements and/or processing elements according to a static schedule from a compiler. The MLA instructions may require data retrieved from off-chip memory by memory access instructions contained in prior blocks. The compiler also schedules the memory access instructions in a manner that avoids contention for access to the off-chip memory. By avoiding contention, the execution time of off-chip memory accesses becomes predictable enough and short enough that the memory access instructions may be scheduled so that they are known to complete before the retrieved data is required.
HETEROGENEOUS COMPUTE DOMAINS WITH AN EMBEDDED OPERATING SYSTEM IN AN INFORMATION HANDLING SYSTEM
An information handling system includes a memory device, a memory, a chipset, and a basic input/output system (BIOS). The chipset includes a main processor and a hybrid processor. During a first pre-boot phase, the BIOS memory maps the hybrid processor to a first portion of the memory device, and stores an embedded operating system in the memory. During a second pre-boot phase, the BIOS memory maps the main processor to a second portion of the memory device, stores a host operating system in the memory, and loads the embedded operating system on the hybrid processor. The second portion is a larger portion of the memory device than the first portion.
VERSIONED PROGRESSIVE CHUNKED QUEUE FOR A SCALABLE MULTI-PRODUCER AND MULTI-CONSUMER QUEUE
A method includes receiving, by a producer thread of a plurality of producer threads, an offer request associated with an item. The producer thread increases a sequence and determines (i) a chunk identifier of a memory chunk from a pool of memory chunks and (ii) a first slot position in the memory chunk to offer the item. The producer thread also writes the item into the memory chunk at the first slot position. Then, a first consumer thread of a plurality of consumer threads determines the first slot position of the item and consumes the item at the first slot position. A second consumer thread consumes another item at a second slot position in the memory chunk and recycles the memory chunk.
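The mapping from a producer sequence to a chunk identifier and slot position can be illustrated with a short sketch. It is single-threaded and leaves out the atomic operations and versioning a real multi-producer, multi-consumer queue would need. The class name `ChunkedQueue`, the division and modulo mapping, and the `recycled` counter are assumptions made for illustration, not details taken from the abstract.

```python
class ChunkedQueue:
    """Single-threaded sketch of a queue backed by a pool of fixed-size chunks."""

    def __init__(self, chunk_size, pool_size):
        self.chunk_size = chunk_size
        self.pool = [[None] * chunk_size for _ in range(pool_size)]
        self.producer_seq = 0  # next sequence a producer will claim
        self.consumer_seq = 0  # next sequence a consumer will read
        self.recycled = 0      # number of chunks handed back to the pool

    def _locate(self, seq):
        # Assumed mapping: consecutive sequences fill one chunk, then the next.
        chunk_id = (seq // self.chunk_size) % len(self.pool)
        slot = seq % self.chunk_size
        return chunk_id, slot

    def offer(self, item):
        capacity = self.chunk_size * len(self.pool)
        if self.producer_seq - self.consumer_seq >= capacity:
            return False  # every chunk in the pool is still in use
        seq = self.producer_seq
        self.producer_seq += 1  # "increase the sequence"
        chunk_id, slot = self._locate(seq)
        self.pool[chunk_id][slot] = item
        return True

    def poll(self):
        if self.consumer_seq == self.producer_seq:
            return None  # queue is empty
        seq = self.consumer_seq
        self.consumer_seq += 1
        chunk_id, slot = self._locate(seq)
        item = self.pool[chunk_id][slot]
        self.pool[chunk_id][slot] = None
        if slot == self.chunk_size - 1:
            self.recycled += 1  # last slot consumed: chunk can be reused
        return item
```

In this sketch the consumer that reads the last slot of a chunk is the one that recycles it, which mirrors the role of the second consumer thread in the abstract.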
FLUSHING DIRTY PAGES FROM PAGE BUFFERS INDICATED BY NON-SEQUENTIAL PAGE DESCRIPTORS
Dirty pages of cached user data are persistently stored to page buffers that are allocated from a page buffer pool in a persistent data storage resource of a data storage system, and are indicated by page descriptors that are stored at a head of a temporally ordered page descriptor ring as the dirty pages are stored to the page buffers. The disclosed technology performs a flush operation by selecting a work-set of non-sequential page descriptors within the page descriptor ring, flushing dirty pages from page buffers indicated by the page descriptors in the work-set to non-volatile data storage drives of the data storage system, and storing, for each one of the page buffers indicated by the page descriptors in the work-set, an indication that the page buffer is available for re-use.
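The flush operation over a non-sequential work-set can be sketched in a few lines. The data layout here is hypothetical: descriptors are dictionaries with `page`, `buffer`, and `flushed` keys, the drives are a plain dictionary, and the `select` predicate stands in for whatever criterion the disclosed technology uses to choose the work-set.

```python
def flush_work_set(ring, page_buffers, drive, free_buffers, select):
    """Flush the pages named by a (possibly non-sequential) work-set.

    ring         -- temporally ordered list of page descriptors (dicts)
    page_buffers -- maps buffer id to the dirty page data it holds
    drive        -- dict standing in for the non-volatile storage drives
    free_buffers -- set of buffer ids marked available for re-use
    select       -- hypothetical predicate that picks the work-set
    """
    # The work-set need not be contiguous within the ring.
    work_set = [d for d in ring if not d["flushed"] and select(d)]
    for desc in work_set:
        buf = desc["buffer"]
        drive[desc["page"]] = page_buffers[buf]  # write the dirty page out
        desc["flushed"] = True
        free_buffers.add(buf)  # record that the buffer can be re-used
    return [d["buffer"] for d in work_set]
```

Descriptors that the predicate skips stay in the ring untouched, so a later flush with a different work-set can pick them up.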
Graphics layer processing in a multiple operating systems framework
Graphics layer processing in a multiple operating systems framework is disclosed, including: presenting, at a display, a first composition including a sub-graphics layer object associated with a graphical interface corresponding to an application, wherein the application is executed in a guest subsystem of a system; receiving a content-related compositing request corresponding to a guest server graphics layer object in the guest subsystem; using the guest server graphics layer object to obtain a host server graphics layer object that corresponds to the guest server graphics layer object, wherein the host server graphics layer object is in a host subsystem of the system; obtaining a buffer corresponding to the guest server graphics layer object; and generating a second composition including the sub-graphics layer object, wherein the second composition is to be presented at the display.
Handling an input/output store instruction
An input/output store instruction is handled. A data processing system includes a system nest communicatively coupled to at least one input/output bus by an input/output bus controller. The data processing system further includes at least a data processing unit including a core, system firmware and an asynchronous core-nest interface. The data processing unit is communicatively coupled to the system nest via an aggregation buffer. The system nest is configured to asynchronously load from and/or store data to an external device which is communicatively coupled to the input/output bus. The data processing unit is configured to complete the input/output store instruction before an execution of the input/output store instruction in the system nest is completed.