G06F9/544

Systems and methods for inter-process communication within a robot

A method includes creating a publisher configured to send messages over a channel having a shared memory that includes a plurality of memory slots. The method includes creating at least one subscriber configured to receive the messages over the channel by sequentially referencing memory slots of the plurality of memory slots. The method includes determining that the next sequential memory slot is currently referenced by a subscriber. The method includes delaying sending the message by the publisher based on determining that the next sequential memory slot is currently referenced by the subscriber. The method includes receiving an event trigger indicative of message reading by the subscriber. The method includes, responsive to receiving the event trigger, determining that the next sequential memory slot is not currently referenced. The method includes sending the message to the next sequential memory slot based on determining that the next sequential memory slot is not currently referenced.
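
The publisher-side flow described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: it assumes an in-process list in place of real shared memory, a per-slot reference count, and a `threading.Condition` standing in for the event trigger. The names `Channel`, `publish`, `acquire`, and `release` are hypothetical.

```python
import threading


class Channel:
    """Hypothetical in-process stand-in for a shared-memory channel."""

    def __init__(self, num_slots):
        self.slots = [None] * num_slots
        self.refs = [0] * num_slots          # subscribers referencing each slot
        self.next_slot = 0                   # publisher's next sequential slot
        self.cond = threading.Condition()    # assumed form of the event trigger

    def publish(self, message, timeout=None):
        """Write to the next sequential slot, delaying while it is referenced.

        Returns the slot index, or None if the slot stayed referenced
        until the timeout expired.
        """
        with self.cond:
            free = self.cond.wait_for(
                lambda: self.refs[self.next_slot] == 0, timeout
            )
            if not free:
                return None
            index = self.next_slot
            self.slots[index] = message
            self.next_slot = (index + 1) % len(self.slots)
            return index

    def acquire(self, index):
        """Subscriber starts referencing a slot and reads its message."""
        with self.cond:
            self.refs[index] += 1
            return self.slots[index]

    def release(self, index):
        """Subscriber finishes reading; this wakes a delayed publisher."""
        with self.cond:
            self.refs[index] -= 1
            self.cond.notify_all()
```

With two slots, a third `publish` returns `None` while a subscriber still holds slot 0, and succeeds once `release(0)` has been called.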

Non-favored volume cache starvation prevention

A method to prevent starvation of non-favored volumes in cache is disclosed. In one embodiment, such a method includes storing, in a cache of a storage system, non-favored storage elements and favored storage elements. A cache demotion algorithm is used to retain the favored storage elements in the cache longer than the non-favored storage elements. The method designates a maximum amount of storage space that the favored storage elements are permitted to consume in the cache. In preparation to free storage space in the cache, the method determines whether an amount of storage space consumed by the favored storage elements in the cache has reached the maximum amount. If so, the method frees storage space in the cache by demoting favored storage elements. If not, the method frees storage space in the cache in accordance with the cache demotion algorithm. A corresponding system and computer program product are also disclosed.
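
The decision made when freeing space can be sketched as follows. This is a rough sketch under stated assumptions: storage space is counted in elements rather than bytes, and the underlying demotion algorithm is assumed to evict the least recently used non-favored element first, falling back to favored elements only when no non-favored ones remain. The names `FavoredCache`, `favored_max`, and `free_one` are hypothetical.

```python
from collections import OrderedDict


class FavoredCache:
    """Hypothetical cache that caps the space favored elements may consume."""

    def __init__(self, capacity, favored_max):
        self.capacity = capacity          # total elements the cache can hold
        self.favored_max = favored_max    # cap on favored elements (assumed unit)
        self.favored = OrderedDict()      # oldest entry first
        self.non_favored = OrderedDict()  # oldest entry first

    def _default_demote(self):
        # Assumed demotion algorithm: prefer evicting non-favored elements,
        # so favored elements stay in the cache longer.
        victims = self.non_favored if self.non_favored else self.favored
        return victims.popitem(last=False)[0]

    def free_one(self):
        """Free one element, demoting a favored one if the cap is reached."""
        if self.favored and len(self.favored) >= self.favored_max:
            return self.favored.popitem(last=False)[0]
        return self._default_demote()

    def put(self, key, value, favored=False):
        # Drop any stale copy so the key lives in exactly one list.
        self.favored.pop(key, None)
        self.non_favored.pop(key, None)
        while len(self.favored) + len(self.non_favored) >= self.capacity:
            self.free_one()
        (self.favored if favored else self.non_favored)[key] = value
```

Without the cap check in `free_one`, this assumed demotion algorithm would keep evicting non-favored elements whenever any exist, which is the starvation the abstract sets out to prevent.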

DYNAMIC INSTRUMENTATION TO CAPTURE CLEARTEXT FROM TRANSFORMED COMMUNICATIONS
20230136524 · 2023-05-04

Techniques for dynamically instrumenting code to capture cleartext from transformed communications are provided. In one technique, an operating system (OS) mechanism receives an OS call. The OS mechanism determines whether the OS call is of a particular type. In response to determining that the OS call is of the particular type, a certain location within executable code of a function is identified. A user-level collection mechanism is inserted at the certain location. After inserting the user-level collection mechanism, code at the certain location is executed that causes the user-level collection mechanism to be executed.

Managing services across containers

Services can be managed across containers. A management service can obtain or compile configuration information for containerized applications and containerized services that are hosted on a computing device. The configuration information can define how a containerized application is dependent on a containerized service. Using the configuration information, the management service can establish data paths between containers to enable container services running in the containers to perform cross-container communications by which a containerized application in one container can access a containerized service in another container. The management service may also enable a container service to perform communications by which a containerized application can access services provided by the host operating system.

Tensor accelerator capable of increasing efficiency of data sharing

A tensor accelerator includes two tile execution units and a bidirectional queue. Each of the tile execution units includes a buffer, a plurality of arithmetic logic units, a network, and a selector. The buffer includes a plurality of memory cells. The network is coupled to the plurality of memory cells. The selector is coupled to the network and the plurality of arithmetic logic units. The bidirectional queue is coupled between the selectors of the tile execution units.

Efficient Processing For Artificial Reality

In one embodiment, a method includes accessing a first image corresponding to a first frame of a video stream, rendering a first area of a second image corresponding to a second frame of the video stream, generating a second area of the second image corresponding to the second frame of the video stream by re-projecting the second area of the first image according to one or more warping parameters, and constructing the second image corresponding to the second frame by compositing the rendered first area and the generated second area of the second image.

In another embodiment, a method includes an operating system receiving a set of data associated with an object from a first application, storing the set of data on the operating system, receiving a command to share the object with a second application, and allowing the second application to access the portion of the data associated with the object that it needs.

DEVICE SELECTION FOR WORKLOAD EXECUTION

Examples described herein relate to a network interface device. In some examples, the network interface device includes circuitry to provide access to an accelerator device on a second platform to perform a workload in response to communication with a device driver executed by a first platform. In some examples, the first platform and second platform are connected by a network, the accelerator device satisfies selection criteria, and the selection criteria comprise a device type. In some examples, the accelerator device on the second platform is accessible to an application via the device driver.

THREAD SPECIALIZATION FOR COLLABORATIVE DATA TRANSFER AND COMPUTATION
20230140934 · 2023-05-11

Apparatuses, systems, and techniques to perform a matrix multiplication using parallel processing. In at least one embodiment, a matrix multiplication is divided into a set of tiles, with each tile processed with a prolog task, a calculation task, and an epilog task. The prolog tasks are performed by a dedicated set of threads, with the remaining tasks performed in an interleaved manner using two or more thread groups.
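
The division of work among specialized threads might look roughly like the following CPU-side sketch. It is only an analogy for the parallel-processor technique: it assumes the prolog task is loading a tile's operands, the calculation task is the multiply-accumulate, and the epilog task is storing the result, and it uses individual worker threads in place of thread groups. The names `matmul_tiled`, `TILE`, and `workers` are hypothetical.

```python
import queue
import threading

TILE = 2  # assumed tile edge length


def matmul_tiled(a, b, workers=2):
    """Multiply matrices a (n x k) and b (k x m) tile by tile."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    work = queue.Queue()

    def prolog_thread():
        # Dedicated thread: gathers the operands for every output tile.
        for i0 in range(0, n, TILE):
            for j0 in range(0, m, TILE):
                rows = [a[i] for i in range(i0, min(i0 + TILE, n))]
                cols = [[b[x][j] for x in range(k)]
                        for j in range(j0, min(j0 + TILE, m))]
                work.put((i0, j0, rows, cols))
        for _ in range(workers):
            work.put(None)  # one stop marker per worker

    def worker_thread():
        while True:
            task = work.get()
            if task is None:
                return
            i0, j0, rows, cols = task
            # Calculation task: dot products for this tile.
            tile = [[sum(x * y for x, y in zip(row, col)) for col in cols]
                    for row in rows]
            # Epilog task: store the tile; tiles cover disjoint cells of c.
            for di, tile_row in enumerate(tile):
                for dj, value in enumerate(tile_row):
                    c[i0 + di][j0 + dj] = value

    threads = [threading.Thread(target=prolog_thread)]
    threads += [threading.Thread(target=worker_thread) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return c
```

Because each tile writes a disjoint region of the output, the workers need no lock around the epilog step in this sketch.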

Quiescent state-based reclaiming strategy for progressive chunked queue
11797344 · 2023-10-24

A system includes a memory for storing a plurality of memory chunks and a processor for executing a plurality of producer threads. A producer thread increases a producer sequence and determines (i) a first chunk identifier associated with the producer sequence of an identified memory chunk and (ii) a position from the producer sequence to offer an item. The producer thread determines a second chunk identifier of a last created/appended memory chunk and determines whether the second chunk identifier is valid (e.g., matches the first chunk identifier). The producer thread reads a current memory chunk and determines whether a third chunk identifier associated with the current memory chunk is valid (e.g., matches the first chunk identifier). The producer thread writes the item into the identified memory chunk at the position.
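
One plausible way to derive the chunk identifier and position from a producer sequence is integer division by a fixed chunk size, sketched below. This is an assumption for illustration: the abstract does not state the mapping, and the sketch uses a lock instead of atomic operations and leaves out the validity checks on the last appended chunk and the quiescent state-based reclamation. The names `ChunkedQueue`, `CHUNK_SIZE`, and `offer` are hypothetical.

```python
import threading

CHUNK_SIZE = 4  # assumed fixed number of items per memory chunk


class ChunkedQueue:
    """Hypothetical, lock-based sketch of the producer side only."""

    def __init__(self):
        self._producer_seq = 0
        self._lock = threading.Lock()   # stands in for atomic operations
        self.chunks = {}                # chunk identifier -> list of items

    def offer(self, item):
        """Claim the next sequence number and write the item at its position."""
        with self._lock:
            seq = self._producer_seq
            self._producer_seq += 1     # the producer increases the sequence
            # Assumed mapping from sequence to (chunk identifier, position).
            chunk_id, position = divmod(seq, CHUNK_SIZE)
            # Append a new chunk if the identified one does not exist yet.
            chunk = self.chunks.setdefault(chunk_id, [None] * CHUNK_SIZE)
            chunk[position] = item
        return chunk_id, position
```

With `CHUNK_SIZE = 4`, the fifth offered item (sequence 4) lands in chunk 1 at position 0.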

One-sided reliable remote direct memory operations

Techniques are provided to allow more sophisticated operations to be performed remotely by machines that are not fully functional. Operations that can be performed reliably by a machine that has experienced a hardware and/or software error are referred to herein as Remote Direct Memory Operations or “RDMOs”. Unlike RDMAs, which typically involve trivially simple operations such as the retrieval of a single value from the memory of a remote machine, RDMOs may be arbitrarily complex. The techniques described herein can help applications run without interruption when there are software faults or glitches on a remote system with which they interact.