G06F9/544

Systems and methods for task switching in neural network processor
11740932 · 2023-08-29

Embodiments relate to managing tasks that, when executed by a neural processor circuit, instantiate a neural network. A neural task manager circuit within the neural processor circuit can switch between tasks in different task queues. Each task queue is configured to store a reference to a task list of tasks for instantiating a neural network. Each task queue can also be assigned a priority parameter. While the neural processor circuit is executing tasks of a first task list and prior to completion of each task, the neural task manager circuit can switch between task queues according to the priority parameters for execution of tasks of a second task list by the neural processor circuit. The neural processor circuit includes one or more neural engine circuits that are configured to perform neural operations by executing the tasks assigned by the neural task manager circuit.
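The priority-based switching described above can be sketched in software. This is a minimal model under stated assumptions, not the circuit itself: `TaskQueue`, `run`, and the `arrivals` argument are hypothetical names, a larger priority value is assumed to win, and switching is assumed to happen only at task boundaries.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class TaskQueue:
    # Hypothetical software stand-in for one hardware task queue.
    name: str
    priority: int  # assumption: a larger value means higher priority
    tasks: deque = field(default_factory=deque)


def run(queues, arrivals=None):
    """Execute one task per step, re-selecting the queue before every task.

    `arrivals` maps a step index to a (queue_name, task) pair that is
    enqueued just before that step, so a higher-priority queue can take
    over while a lower-priority task list is only partly executed.
    """
    arrivals = arrivals or {}
    by_name = {q.name: q for q in queues}
    trace = []
    step = 0
    while True:
        if step in arrivals:
            name, task = arrivals[step]
            by_name[name].tasks.append(task)
        ready = [q for q in queues if q.tasks]
        if not ready:
            if any(s > step for s in arrivals):
                step += 1  # idle until the next arrival
                continue
            break
        chosen = max(ready, key=lambda q: q.priority)
        trace.append((chosen.name, chosen.tasks.popleft()))
        step += 1
    return trace
```

In this model, a task that arrives in the higher-priority queue midway through the lower-priority list runs next, after which the lower-priority list resumes where it stopped.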

Encaching and sharing transformed libraries

Embodiments disclosed herein are directed at applying transformations to computer code residing in original libraries for protection against cyberattacks. For example, the transformations applied to original libraries cause random reorganization of the computer code, resulting in a transformed version of an original library. Although a malicious attacker can utilize a known exploit of the original library and launch a cyberattack, such knowledge is of no use on the transformed version of the original library. In some embodiments, the transformed version of the original library is stored in cache memory and shared by multiple executable programs to facilitate efficient memory utilization. When the executable program attempts to access the functional blocks of the original library, the connection between the transformed version of the original library and the executable program is established by making updates to information within the memory occupied by the executable program. The original library can then be released from memory.

Command-aware hardware architecture

In an embodiment, responsive to determining: (a) a first command is not of a particular command type associated with one or more hardware modules associated with a particular routing node, or (b) at least one argument used for executing the first command is not available: transmitting the first command to another routing node in the hardware routing mesh. Upon receiving a second command of the command bundle and determining: (a) the second command is of the particular command type associated with the hardware module(s), and (b) arguments used by the second command are available: transmitting the second command to the hardware module(s) associated with the particular routing node for execution by the hardware module(s). Thereafter, the command bundle is modified based on execution of the second command by at least refraining from transmitting the second command of the command bundle to any other routing nodes in the hardware routing mesh.
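The per-node decision above (execute locally or pass along) can be sketched as follows. This is an illustrative software model only: `Command`, `RoutingNode`, and the `values` dictionary standing in for available arguments are hypothetical, and the mesh is reduced to handing the remaining bundle to the next node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    kind: str        # command type, e.g. "add"
    needs: tuple     # names of the arguments the command consumes
    produces: str    # name under which the result becomes available


class RoutingNode:
    def __init__(self, name, ops):
        self.name = name
        self.ops = ops  # command type -> callable for the attached hardware module

    def handle(self, bundle, values):
        """Execute what this node can; return the commands to forward.

        A command is forwarded if it is not of this node's type or if any
        of its arguments is not yet available. An executed command is
        dropped from the bundle, so no other node receives it.
        """
        forwarded = []
        for cmd in bundle:
            ready = all(n in values for n in cmd.needs)
            if cmd.kind in self.ops and ready:
                args = [values[n] for n in cmd.needs]
                values[cmd.produces] = self.ops[cmd.kind](*args)
            else:
                forwarded.append(cmd)
        return forwarded
```

Passing a bundle through a node that handles `add` and then one that handles `mul` shrinks the bundle at each hop as results become available to later commands.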

High availability events in a layered architecture

Techniques are provided for high availability events in a layered architecture. In an example, two computing nodes coordinate to provide a computing service, where each node has a base operating system configured to fence the other base operating system, and an application configured to fence the other application. In some examples, fencing requests by an application are routed through its base operating system, which coordinates application-level fencing requests and operating system-level fencing requests.

Global coherence operations

A method includes receiving, by a L2 controller, a request to perform a global operation on a L2 cache and preventing new blocking transactions from entering a pipeline coupled to the L2 cache while permitting new non-blocking transactions to enter the pipeline. Blocking transactions include read transactions and non-victim write transactions. Non-blocking transactions include response transactions, snoop transactions, and victim transactions. The method further includes, in response to an indication that the pipeline does not contain any pending blocking transactions, preventing new snoop transactions from entering the pipeline while permitting new response transactions and victim transactions to enter the pipeline; in response to an indication that the pipeline does not contain any pending snoop transactions, preventing all new transactions from entering the pipeline; and, in response to an indication that the pipeline does not contain any pending transactions, performing the global operation on the L2 cache.
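The staged draining described above can be modeled as a small state machine. This is a sketch under assumptions: the class and phase names are hypothetical, `"write"` stands for a non-victim write, and the model stops once the global operation would be performed.

```python
from enum import Enum, auto

BLOCKING = {"read", "write"}                 # "write" = non-victim write (assumed label)
NON_BLOCKING = {"response", "snoop", "victim"}


class Phase(Enum):
    NORMAL = auto()
    BLOCK_BLOCKING = auto()   # new blocking transactions refused
    BLOCK_SNOOPS = auto()     # new snoops refused as well
    BLOCK_ALL = auto()        # nothing new admitted
    DONE = auto()             # global operation performed


ALLOWED = {
    Phase.NORMAL: BLOCKING | NON_BLOCKING,
    Phase.BLOCK_BLOCKING: NON_BLOCKING,
    Phase.BLOCK_SNOOPS: {"response", "victim"},
    Phase.BLOCK_ALL: set(),
    Phase.DONE: set(),
}


class L2Pipeline:
    def __init__(self):
        self.phase = Phase.NORMAL
        self.pending = []

    def request_global_op(self):
        self.phase = Phase.BLOCK_BLOCKING
        self._advance()

    def admit(self, kind):
        """Try to enter a transaction; return True if it was accepted."""
        if kind in ALLOWED[self.phase]:
            self.pending.append(kind)
            return True
        return False

    def retire(self, kind):
        self.pending.remove(kind)
        self._advance()

    def _advance(self):
        # Each stage tightens admission once the relevant class has drained.
        if self.phase is Phase.BLOCK_BLOCKING and not any(k in BLOCKING for k in self.pending):
            self.phase = Phase.BLOCK_SNOOPS
        if self.phase is Phase.BLOCK_SNOOPS and "snoop" not in self.pending:
            self.phase = Phase.BLOCK_ALL
        if self.phase is Phase.BLOCK_ALL and not self.pending:
            self.phase = Phase.DONE
```

Tightening admission in stages lets responses and victims keep flowing while blocking transactions drain, which is presumably what allows those in-flight transactions to complete.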

EFFICIENT AND RELIABLE HOST DISTRIBUTION OF TOTALLY ORDERED GLOBAL STATE
20220159061 · 2022-05-19

An asynchronous distributed computing system with a plurality of computing nodes is provided. One of the computing nodes includes a sequencer service that receives updates from the plurality of computing nodes. The sequencer service maintains or annotates messages added to the global state of the system. Updates to the global state are published to the plurality of computing nodes. Monitoring services on the other computing nodes write the updates into a locally maintained copy of the global state that exists in shared memory on each one of the nodes. Client computer processes on the nodes may then subscribe to have updates “delivered” to the respective client computer processes.
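The sequencer-and-monitor arrangement above can be illustrated with a single-process sketch. The names `Sequencer`, `Monitor`, `submit`, and `subscribe` are hypothetical, publication is modeled as a direct method call rather than a network transport, and annotation is reduced to stamping each message with a sequence number.

```python
class Monitor:
    """Per-node service that keeps a local copy of the global state."""

    def __init__(self):
        self.local_state = []   # stand-in for the copy held in shared memory
        self.clients = []

    def subscribe(self, callback):
        self.clients.append(callback)

    def on_publish(self, message):
        self.local_state.append(message)
        for deliver in self.clients:
            deliver(message)


class Sequencer:
    """Single service that assigns a total order to incoming updates."""

    def __init__(self, monitors):
        self.monitors = monitors
        self.next_seq = 0

    def submit(self, origin, payload):
        # Annotate the update with its position in the global order,
        # then publish it to every node's monitor.
        message = (self.next_seq, origin, payload)
        self.next_seq += 1
        for monitor in self.monitors:
            monitor.on_publish(message)
        return message
```

Because only the sequencer assigns sequence numbers, every monitor's local copy lists the same updates in the same order regardless of which node submitted them.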

Systems and methods for determining a dependency of instructions
11740907 · 2023-08-29

In a particular implementation, a method includes: receiving, at a central processing unit (CPU), first and second instructions of a plurality of instructions obtained from a memory, where the first instruction corresponds to a preceding instruction of the second instruction, and where the second instruction corresponds to a succeeding instruction of the first instruction; determining a dependency of the first and second instructions; sending the first and second instructions to an issue queue of the CPU; executing, at the CPU, the first and second instructions; and completing, at the CPU, the first and second instructions.
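The abstract does not say how the dependency is determined. As a generic illustration only, and not the claimed method, the sketch below classifies register dependencies between a preceding and a succeeding instruction using the textbook read-after-write, write-after-read, and write-after-write categories. `Instr` and `dependency` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Instr:
    dest: Optional[str]      # destination register, or None if nothing is written
    srcs: Tuple[str, ...]    # source registers


def dependency(first, second):
    """Return the set of register hazards from `first` (older) to `second`."""
    kinds = set()
    if first.dest is not None and first.dest in second.srcs:
        kinds.add("RAW")   # second reads what first writes
    if second.dest is not None and second.dest in first.srcs:
        kinds.add("WAR")   # second overwrites what first reads
    if first.dest is not None and first.dest == second.dest:
        kinds.add("WAW")   # both write the same register
    return kinds
```

An empty result means the two instructions touch no common register in a conflicting way, so in this simplified view they could issue in either order.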

CACHING STREAMS OF MEMORY REQUESTS
20220156198 · 2022-05-19

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for allocating cache resources according to page-level attribute values. In one implementation, the system includes one or more integrated client devices and a cache. Each client device is configured to generate at least a memory request. Each memory request has a respective physical address and a respective page descriptor of a page to which the physical address belongs. The cache is configured to cache memory requests for each of the one or more integrated client devices. The cache comprises a cache memory having multiple ways. The cache is configured to distinguish different memory requests using page-level attributes of respective page descriptors of the memory requests, and to allocate different portions of the cache memory to different respective memory requests.
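Allocating cache ways by page-level attribute can be sketched as below. This is a simplified model under assumptions: `PartitionedCache`, the attribute labels, the 64-byte line size, and the per-partition round-robin replacement are all hypothetical choices. Lookups search every way, while allocation is confined to the ways assigned to the request's attribute value.

```python
class PartitionedCache:
    def __init__(self, num_sets, num_ways, partitions, line_size=64):
        self.num_sets = num_sets
        self.line_size = line_size
        self.partitions = partitions  # attribute value -> list of way indices
        self.tags = [[None] * num_ways for _ in range(num_sets)]
        # Simplification: one round-robin pointer per partition, shared by all sets.
        self.next_victim = {attr: 0 for attr in partitions}

    def access(self, phys_addr, page_attr):
        """Return True on a hit; on a miss, fill a way from the attribute's partition."""
        line = phys_addr // self.line_size
        set_index = line % self.num_sets
        tag = line // self.num_sets
        if tag in self.tags[set_index]:
            return True
        ways = self.partitions[page_attr]
        slot = self.next_victim[page_attr]
        self.tags[set_index][ways[slot]] = tag
        self.next_victim[page_attr] = (slot + 1) % len(ways)
        return False
```

With one way reserved for a streaming attribute and the rest for everything else, a long stream can only evict its own lines, so data cached under the other attribute survives.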

Cloudified user-level tracing
11743131 · 2023-08-29

Some embodiments provide a method for performing radio access network (RAN) functions in a cloud at a user-level tracing application that executes on a machine deployed on a host computer in the cloud. The method receives data, via a RAN intelligent controller (RIC), from a RAN component. The method uses the received data to generate information related to traffic performance for at least one user. The method provides the generated information to the RIC.

COMPUTING DEVICE, COMPUTING EQUIPMENT AND PROGRAMMABLE SCHEDULING METHOD

The embodiments of the disclosure relate to a computing device, computing equipment, and a programmable scheduling method for data loading and execution, and relate to the field of computers. The computing device is coupled to a first computing core and a first memory. The computing device includes a scratchpad memory, a second computing core, a first hardware queue, a second hardware queue and a synchronization unit. The second computing core is configured for acceleration in a specific field. The first hardware queue receives a load request from the first computing core. The second hardware queue receives an execution request from the first computing core. The synchronization unit is configured to make the triggering of the load request and the execution request cooperate with each other. In this manner, flexibility, throughput, and overall performance can be enhanced.
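The interplay of the two hardware queues and the synchronization unit can be sketched in software. This is an illustrative model only: `Device`, `submit_load`, `submit_exec`, and `step` are hypothetical names, the synchronization unit is reduced to a set of buffers whose load has completed, and each step is assumed to handle at most one execution and one load.

```python
from collections import deque


class Device:
    def __init__(self, memory):
        self.memory = memory      # stand-in for the first memory
        self.scratchpad = {}      # buffer name -> loaded data
        self.load_q = deque()     # first hardware queue: load requests
        self.exec_q = deque()     # second hardware queue: execution requests
        self.loaded = set()       # synchronization state: buffers ready to use
        self.results = []

    def submit_load(self, buf, addr):
        self.load_q.append((buf, addr))

    def submit_exec(self, buf, fn):
        self.exec_q.append((buf, fn))

    def step(self):
        """Advance one step; return False when nothing could progress."""
        progressed = False
        # An execution request is triggered only after its buffer is loaded.
        if self.exec_q and self.exec_q[0][0] in self.loaded:
            buf, fn = self.exec_q.popleft()
            self.results.append(fn(self.scratchpad[buf]))
            progressed = True
        # A load can proceed in the same step, overlapping with execution.
        if self.load_q:
            buf, addr = self.load_q.popleft()
            self.scratchpad[buf] = self.memory[addr]
            self.loaded.add(buf)
            progressed = True
        return progressed
```

Keeping loads and executions in separate queues lets the load for the next buffer overlap with execution on the current one, while the readiness check prevents an execution request from running on data that has not arrived.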