G06F9/544

Network switch with endpoint and direct memory access controllers for in-vehicle data transfers

A network switch includes a data bus, a register, an endpoint controller and a direct memory access controller. The endpoint controller is configured to receive a descriptor generated by a device driver of a host system, store the descriptor in the register, and transfer data between a root complex controller of the host system and the data bus. The descriptor identifies an address of a buffer in a memory of the host system. The direct memory access controller is configured to receive the address of the buffer from the endpoint controller or the register and, based on the address and an indication generated by the device driver, independently control transfer of the data between the memory of the host system and a network device connected to the network switch. The direct memory access controller is a receive direct memory access controller or a transmit direct memory access controller.

ASYNCHRONOUS DATA MOVEMENT PIPELINE
20220244986 · 2022-08-04 ·

Apparatuses, systems, and techniques to parallelize operations in one or more programs with data copies from global memory to shared memory in each of the one or more programs. In at least one embodiment, a program performs operations on shared data and then asynchronously copies shared data to shared memory, and continues performing additional operations in parallel while the shared data is copied to shared memory until an indicator provided by an application programming interface to facilitate parallel computing, such as CUDA, informs said program that shared data has been copied to shared memory.

Multiple independent synchonization named barrier within a thread group

An apparatus to facilitate thread barrier synchronization is disclosed. The apparatus includes a plurality of processing resources to execute a plurality of execution threads included in a thread workgroup and barrier synchronization hardware to assign a first named barrier to a first set of the plurality of execution threads in the thread workgroup, assign a second named barrier to a second set of the plurality of execution threads in the thread workgroup, synchronize execution of the first set of execution threads via the first named barrier and synchronize execution of the second set of execution threads via the second named barrier.

COMMAND-AWARE HARDWARE ARCHITECTURE

In an embodiment, responsive to determining: (a) a first command is not of a particular command type associated with one or more hardware modules associated with a particular routing node, or (b) at least one argument used for executing the first command is not available: transmitting the first command to another routing node in the hardware routing mesh. Upon receiving a second command of the command bundle and determining: (a) the second command is of the particular command type associated with the hardware module(s), and (b) arguments used by the second command are available: transmitting the second command to the hardware module(s) associated with the particular routing node for execution by the hardware module(s). Thereafter, the command bundle is modified based on execution of the second command by at least refraining from transmitting the second command of the command bundle to any other routing nodes in the hardware routing mesh.

Migrating logical volumes from a thick provisioned layout to a thin provisioned layout

The present disclosure is directed to migrating logical volumes from a thick provisioned layout to a thin provisioned layout, and includes one or more processors and one or more computer-readable non-transitory storage media comprising instructions that, when executed by the one or more processors, cause one or more components of the system to perform operations comprising creating an abstraction layer on top of a logical volume in a storage device, the abstraction layer for accessing the logical volume, the logical volume one of a plurality of logical volumes in a volume group of the storage device; allocating a thin pool from remaining storage space in the volume group of the storage device; creating a snapshot of the logical volume; adding a thin virtual volume corresponding to the logical volume to the thin pool; and copying data from the snapshot to the thin virtual volume.

De-prioritization supporting frame buffer caching
11403223 · 2022-08-02 · ·

Systems, methods, and computer readable media to manage memory cache for graphics processing are described. A processor creates a resource group for a plurality of graphics application program interface (API) resources. The processor subsequently encodes a set command that references the resource group within a command buffer and assigns a data set identifier (DSID) to the resource group. The processor also encodes a write command within the command buffer that causes the graphics processor to write data within a cache line and mark the written cache line with the DSID, a read command that causes the graphics processor to read data written into the resource group, and a de-prioritize command that causes the graphics processor to notify the memory cache to later flush content from the cache line associated with the DSID and to later invalidate the cache line when higher priority content is received.

Storing a result of a first instruction of an execute packet in a holding register prior to completion of a second instruction of the execute packet

A method includes receiving an execute packet that includes a first instruction and a second instruction and executing the first instruction and the second instruction using a pipeline. Executing the first and second instructions includes storing a result of the first instruction in a holding register; determining whether an event that interrupts execution of the execute packet occurs prior to completion of the executing of the second instruction; and based on the event not occurring, committing the result of the first instruction after completion of the executing of the second instruction.

Data processing system and operating method thereof
11403236 · 2022-08-02 · ·

A data processing system may be configured to include a memory device, a controller configured to access the memory device when a host requests offload processing of an application, and process the application, and a sharing memory management component within the controller and configured to: set controller owning rights of access to a target region of the memory device in response to the host stores, in the target region, data used for the requested offload processing of the application; and set the controller owning rights of access or the host owning rights of access to the target region based on a processing state of the application.

Firmware update patch

A computing system is provided, including a processor and memory storing instructions that, when executed, cause the processor to store a firmware update patch in a runtime buffer included in the memory. The runtime buffer may be accessible by firmware and an operating system of the computing system. The processor may perform a first verification check on the firmware update patch. When the firmware update patch passes the first verification check, the processor may copy the firmware update patch to a system management random access memory (SMRAM) buffer included in the memory. The SMRAM buffer may be accessible by the firmware and inaccessible by the operating system. The processor may perform a second verification check on the copy of the firmware update patch. When the copy of the firmware update patch passes the second verification check, the processor may execute the copy of the firmware update patch.

Processor circuit and data processing method

A processor circuit is provided. The processor circuit includes an instruction decode unit, an instruction detector, an address generator and a data buffer. The instruction decode unit is configured to decode a load instruction to generate a decoding result. The instruction detector, coupled to the instruction decode unit, is configured to detect if the load instruction is in a load-use scenario. The address generator, coupled to the instruction decode unit, is configured to generate a first address requested by the load instruction according to the decoding result. The data buffer is coupled to the instruction detector and the address generator. When the instruction detector detects that the load instruction is in the load-use scenario, the data buffer is configured to store the first address generated from the address generator, and store data requested by the load instruction according to the first address.