G06F2209/521

Organizing tasks by a hierarchical task scheduler for execution in a multi-threaded processing system
11249807 · 2022-02-15

A method for scheduling tasks from a program executed by a multi-processor core system is disclosed. The method includes a scheduler that groups a plurality of tasks, each having an assigned priority, by priority in a task group. The task group is assembled with other task groups having identical priorities in a task group queue. A hierarchy of task group queues is established based on priority levels of the assigned tasks. Task groups are assigned to one of a plurality of worker threads based on the hierarchy of task group queues. Each of the worker threads is associated with a processor in the multi-processor system. The tasks of the task groups are executed via the worker threads according to the order in the hierarchy.
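The grouping and assignment steps in this abstract can be illustrated with a minimal sketch. It is not the patented scheduler: the function names, the fixed `group_size`, the round-robin assignment, and the convention that a lower number means higher priority are all assumptions made for illustration.

```python
from collections import defaultdict, deque

def build_hierarchy(tasks, group_size=2):
    """Group (name, priority) tasks into per-priority task group queues.

    Assumption: a lower priority number means the task runs earlier.
    """
    by_priority = defaultdict(list)
    for name, priority in tasks:
        by_priority[priority].append(name)
    hierarchy = []
    for priority in sorted(by_priority):
        names = by_priority[priority]
        # Each queue holds task groups that share one priority level.
        groups = deque(names[i:i + group_size]
                       for i in range(0, len(names), group_size))
        hierarchy.append((priority, groups))
    return hierarchy

def assign_to_workers(hierarchy, num_workers):
    """Hand task groups to workers round-robin, top of the hierarchy first."""
    workers = [[] for _ in range(num_workers)]
    slot = 0
    for priority, groups in hierarchy:
        for group in groups:
            workers[slot % num_workers].append((priority, group))
            slot += 1
    return workers

tasks = [("render", 1), ("physics", 0), ("audio", 1), ("input", 0), ("log", 2)]
hierarchy = build_hierarchy(tasks)
workers = assign_to_workers(hierarchy, num_workers=2)
```

Here each entry of `workers` stands in for one worker thread bound to a processor; a real scheduler would run the groups concurrently rather than just listing them.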

Methods for single-owner multi-consumer work queues for repeatable tasks

There are provided methods for single-owner multi-consumer work queues for repeatable tasks. A method includes permitting a single owner thread of a single owner, multi-consumer, work queue to access the work queue using atomic instructions limited to only a single access and using non-atomic operations. The method further includes restricting the single owner thread from accessing the work queue using atomic instructions involving more than one access. The method also includes synchronizing amongst other threads with respect to their respective accesses to the work queue.

Private memory regions and coherency optimization by controlling snoop traffic volume in multi-level cache hierarchy

A system for optimizing cache coherence message traffic volume is disclosed. The system includes a plurality of caches in a multi-level memory hierarchy and a plurality of agents. Each agent is associated with a cache. The system includes one or more monitoring engines. Each agent in the plurality of agents is associated with a monitoring engine. The agents can execute a processor level software instruction causing a memory region to be private to the agent. Each of the agents is configured to execute a memory access for data on an associated cache and to send a request for data up the hierarchy on a cache miss. The monitoring engine is configured to intercept a request for data from an agent and to prevent snooping for the cache line in peer caches when the cache line is associated with a memory region represented as private to the agent.
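The snoop-suppression idea can be modelled in a few lines. This is a software sketch under stated assumptions, not the hardware described: the `MonitoringEngine` class, its method names, and the representation of private regions as half-open address ranges are hypothetical.

```python
class MonitoringEngine:
    """Toy model: skips peer snoops for misses that fall in a private region."""

    def __init__(self):
        self.private_regions = {}  # agent id -> list of (start, end) ranges
        self.snoops_sent = 0

    def mark_private(self, agent, start, end):
        # Stands in for the processor-level instruction in the abstract.
        self.private_regions.setdefault(agent, []).append((start, end))

    def is_private(self, agent, address):
        ranges = self.private_regions.get(agent, [])
        return any(start <= address < end for start, end in ranges)

    def handle_miss(self, agent, address, peers):
        """Return the peers that are snooped for this cache miss."""
        if self.is_private(agent, address):
            return []  # no peer can hold the line, so no snoop traffic
        self.snoops_sent += len(peers)
        return list(peers)

engine = MonitoringEngine()
engine.mark_private("core0", 0x1000, 0x2000)
private_miss = engine.handle_miss("core0", 0x1800, ["core1", "core2"])
shared_miss = engine.handle_miss("core0", 0x3000, ["core1", "core2"])
```

Only the miss outside the private range generates snoops, which is the traffic reduction the abstract is after.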

System and method for managing multi-core accesses to shared ports

A port is provided that utilizes various techniques to manage contention for the port by controlling data that is written to and read from the port in a multi-core assembly within a usable computing system. When the port is a sampling port, the assembly may include at least two cores, a plurality of buffers in operative communication with the at least one sampling port, and a non-blocking contention management unit comprising a plurality of pointers that collectively operate to manage contention of shared ports in a multi-core computing system. When the port is a queuing port, the assembly may include buffers in communication with the queuing port, and the buffers are configured to hold multiple messages in the queuing port. The assembly may manage contention of shared queuing ports in a multi-core computing system.

Novel RTOS/OS Architecture for Context Switching Without Disabling Interrupts
20210397571 · 2021-12-23

The present invention is a novel RTOS/OS architecture that changes the fundamental way that context switching is performed. In all prior operating system implementations, context switching required disabling of interrupts. This opens the possibility that data can be lost. This novel approach consists of a context switching method in which interrupts are never disabled. Two implementations are presented. In the first implementation, the cost is a negligible amount of memory. In the second, the cost is only a minimal impact on the context switching time. This RTOS/OS architecture requires specialized hardware. Concretely, an advanced interrupt controller that supports nesting and tail chaining of prioritized interrupts is needed (e.g. the Nested Vectored Interrupt Controller (NVIC) found on many ARM processors). The novel RTOS/OS architecture redefines how task synchronization primitives such as semaphores and mutexes are released. Whereas previous architectures directly accessed internal structures, this architecture does so indirectly by saving information in shared buffers or setting flags, and then activating a low priority software interrupt that subsequently interprets this data and performs all context switching logic. The software interrupt must be set as the single lowest priority interrupt in the system.
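The indirect release path described above (record the request, pend a low-priority software interrupt, let that handler do the scheduling work) can be sketched as a single-threaded simulation. The `Kernel` class, its fields, and the semaphore and task names are hypothetical; real hardware would use something like a pended software exception rather than a boolean flag.

```python
from collections import deque

class Kernel:
    def __init__(self, waiters):
        self.waiters = waiters          # semaphore name -> blocked task names
        self.release_buffer = deque()   # shared buffer written by ISRs
        self.soft_irq_pending = False   # stands in for pending the software interrupt
        self.ready = []                 # tasks made ready to run

    def release_from_isr(self, semaphore):
        # The ISR never touches scheduler structures directly. It only
        # records the request and pends the lowest-priority interrupt.
        self.release_buffer.append(semaphore)
        self.soft_irq_pending = True

    def software_interrupt(self):
        # Runs at the lowest priority, so any other interrupt may preempt it.
        # The flag is cleared first so a release arriving mid-drain re-pends it.
        self.soft_irq_pending = False
        while self.release_buffer:
            semaphore = self.release_buffer.popleft()
            blocked = self.waiters.get(semaphore, [])
            if blocked:
                self.ready.append(blocked.pop(0))

kernel = Kernel({"uart_sem": ["uart_task"], "adc_sem": ["adc_task"]})
kernel.release_from_isr("uart_sem")
kernel.release_from_isr("adc_sem")
pended = kernel.soft_irq_pending
kernel.software_interrupt()
```

Because all scheduler state is mutated in one handler that nothing else shares, there is no critical section that would need interrupts disabled in this model.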

Method and Apparatus for Secure and Verifiable Composite Service Execution and Fault Management on Blockchain

A method is implemented by one or more network devices to identify an originating point of failure in a composite service executed in a cloud computing environment. The execution of the composite service includes execution of a plurality of atomic services in an ordered sequence. For each atomic service that is executed, an execution trace for that atomic service is stored in a blockchain to form an ordered sequence of execution traces, where the execution trace for a given atomic service is signed using the private key associated with that atomic service. The method includes analyzing one or more of the ordered sequence of execution traces to determine which of the plurality of atomic services originated the failure, where each execution trace that is analyzed is authenticated using the public key that corresponds to the private key associated with the atomic service that generated that execution trace.

RTOS/OS architecture for context switching that solves the diminishing bandwidth problem and the RTOS response time problem using unsorted ready lists
11734051 · 2023-08-22

The present invention is a novel RTOS/OS architecture that changes the fundamental way that data is organized and context switching is performed. This novel approach consists of a context switching method in which interrupts are never disabled. This RTOS/OS architecture requires specialized hardware. Concretely, an advanced interrupt controller that supports nesting and tail chaining of prioritized interrupts is required (e.g. the Nested Vectored Interrupt Controller (NVIC) found on many ARM processors). The novel RTOS/OS architecture does not keep the list of tasks ready to run in sorted order, which allows O(1) insertion time, and it uses a barrier variable to allow safe O(n) insertion of tasks into the priority-sorted list of blocked tasks without disabling interrupts. The advanced interrupt controller allows any new interrupt to preempt the software exception handler, thereby ensuring no data loss. This novel RTOS/OS architecture eliminates the diabolical deficiency present in current architectures, which creates a superficial dependency between the number of tasks in the system and the maximum bandwidth that can be sustained at some peripheral. That is, this architecture ensures that the maximum bandwidth never decreases as more tasks are added to the system.
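The trade-off behind an unsorted ready list can be shown with a small sketch: insertion does no searching, and the cost of finding the highest-priority task moves to selection time. The `ReadyList` class and the convention that a larger number means higher priority are assumptions; the barrier-variable mechanism for the blocked list is not modelled here.

```python
class ReadyList:
    """Unsorted ready list: constant-time insert, linear-time selection."""

    def __init__(self):
        self._tasks = []  # (priority, name) pairs in arrival order

    def insert(self, name, priority):
        # Appending does not depend on how many tasks are already ready.
        self._tasks.append((priority, name))

    def pop_highest(self):
        # Linear scan; assumption: a larger number means higher priority.
        index = max(range(len(self._tasks)), key=lambda i: self._tasks[i][0])
        return self._tasks.pop(index)[1]

ready = ReadyList()
ready.insert("idle", 0)
ready.insert("comms", 5)
ready.insert("logger", 2)
order = [ready.pop_highest() for _ in range(3)]
```

The point of the design is that the insert path, which may run in response to an interrupt, stays short regardless of the task count.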

QUIESCENT STATE-BASED RECLAIMING STRATEGY FOR PROGRESSIVE CHUNKED QUEUE
20220138010 · 2022-05-05

A system includes a memory for storing a plurality of memory chunks and a processor for executing a plurality of producer threads. A producer thread increases a producer sequence and determines (i) a first chunk identifier associated with the producer sequence of an identified memory chunk and (ii) a position from the producer sequence to offer an item. The producer thread determines a second chunk identifier of a last created/appended memory chunk and determines whether the second chunk identifier is valid (e.g., matches the first chunk identifier). The producer thread reads a current memory chunk and determines whether a third chunk identifier associated with the current memory chunk is valid (e.g., matches the first chunk identifier). The producer thread writes the item into the identified memory chunk at the position.
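One common way to derive a chunk identifier and a slot position from a producer sequence is integer division and remainder by a fixed chunk size. The sketch below assumes that mapping; the abstract does not state it. `CHUNK_SIZE`, `offer`, and the dictionary of chunks are hypothetical, the code is single-threaded, and the validity checks and reclamation described in the abstract are omitted.

```python
import itertools

CHUNK_SIZE = 4                   # assumed fixed number of slots per chunk
_sequence = itertools.count()    # stands in for an atomic fetch-and-add
chunks = {}                      # chunk id -> list of slots

def offer(item):
    """Claim the next sequence number and write the item into its slot."""
    seq = next(_sequence)
    chunk_id, position = divmod(seq, CHUNK_SIZE)
    # Create the chunk lazily the first time a sequence number lands in it.
    chunk = chunks.setdefault(chunk_id, [None] * CHUNK_SIZE)
    chunk[position] = item
    return chunk_id, position

slots = [offer(letter) for letter in "abcdef"]
```

In a concurrent implementation the sequence increment would be the only contended step, since each producer then owns a distinct slot.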

Executing an atomic primitive in a multi-core processor system

The present disclosure relates to a method for a computer system comprising a plurality of processor cores, including a first processor core and a second processor core, wherein a cached data item is assigned to a first processor core, of the plurality of processor cores, for exclusively executing an atomic primitive. The method includes receiving, from a second processor core at a cache controller, a request for accessing the data item, and in response to determining that the execution of the atomic primitive is not completed by the first processor core, returning a rejection message to the second processor core.
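The rejection behaviour can be modelled as a cache controller that tracks which core is in the middle of an atomic primitive on a given data item. This is a toy model, not the claimed hardware: the `CacheController` class, its method names, and the `"GRANT"`/`"REJECT"` strings are hypothetical.

```python
class CacheController:
    def __init__(self):
        # data item -> core currently executing an atomic primitive on it
        self.atomic_owner = {}

    def begin_atomic(self, core, item):
        self.atomic_owner[item] = core

    def end_atomic(self, core, item):
        if self.atomic_owner.get(item) == core:
            del self.atomic_owner[item]

    def request(self, core, item):
        """Answer an access request from a core for a data item."""
        owner = self.atomic_owner.get(item)
        if owner is not None and owner != core:
            return "REJECT"  # atomic primitive not yet completed by the owner
        return "GRANT"

controller = CacheController()
controller.begin_atomic("core0", "counter")
during = controller.request("core1", "counter")
controller.end_atomic("core0", "counter")
after = controller.request("core1", "counter")
```

The requesting core is expected to retry after a rejection, so the item stays with the first core until its primitive finishes.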

Processing of plural-register-load instruction
11314509 · 2022-04-26

An apparatus comprises processing circuitry to issue load operations to load data from memory. In response to a plural-register-load instruction specifying at least two destination registers to be loaded with data from respective target addresses, the processing circuitry permits issuing of separate load operations corresponding to the plural-register-load instruction. Load tracking circuitry maintains tracking information for one or more issued load operations. When the plural-register-load instruction is subject to an atomicity requirement and the plurality of load operations are issued separately, the load tracking circuitry detects, based on the tracking information, whether a loss-of-atomicity condition has occurred for the load operations corresponding to the plural-register-load instruction, and requests re-processing of the plural-register-load instruction when the loss-of-atomicity condition is detected.
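The detect-and-replay idea can be simulated in software. In this sketch, a loss-of-atomicity condition is assumed to mean that another agent overwrote an address this instruction had already loaded while a later load was still outstanding; the abstract does not define the condition, so that rule, the function name, and the `interfering_writes` parameter are all hypothetical.

```python
def plural_register_load(memory, addresses, interfering_writes=()):
    """Load several addresses with separate loads, replaying on interference.

    interfering_writes: hypothetical (after_load_index, address, value)
    tuples, each applied once to simulate another agent's store.
    Returns (values, attempts).
    """
    pending = list(interfering_writes)
    attempts = 0
    while True:
        attempts += 1
        tracked = set()          # tracking info: addresses already loaded
        values = []
        lost_atomicity = False
        for i, addr in enumerate(addresses):
            values.append(memory[addr])
            tracked.add(addr)
            for write in [w for w in pending if w[0] == i]:
                pending.remove(write)
                _, waddr, wvalue = write
                memory[waddr] = wvalue
                # An already-loaded value went stale before the last load.
                if waddr in tracked and i < len(addresses) - 1:
                    lost_atomicity = True
        if not lost_atomicity:
            return values, attempts

memory = {0x10: 1, 0x18: 2}
values, attempts = plural_register_load(
    memory, [0x10, 0x18], interfering_writes=[(0, 0x10, 99)])
```

The first attempt would have returned a stale value for `0x10`, so the instruction is re-processed and the second attempt observes a consistent pair.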