G06F15/7814

System and method for continuous low-overhead monitoring of distributed applications running on a cluster of data processing nodes

Embodiments of the present invention provide an improvement over known approaches for monitoring of and taking action on observations associated with distributed applications. Application event reporting and application resource monitoring are unified in a manner that significantly reduces storage and aggregation overhead. For example, embodiments of the present invention can employ hardware and/or software support that reduces storage and aggregation overhead. In addition to providing for fine-grained, continuous, decentralized monitoring of application activity and resource consumption, embodiments of the present invention can also provide for decentralized filtering, statistical analysis, and derived data streaming. Furthermore, embodiments of the present invention are securely implemented (e.g., for use solely under the control of an operator) and can use a separate security domain for network traffic.

Real-time GPU rendering with performance guaranteed power management

Systems, apparatuses, and methods for performing real-time video rendering with performance guaranteed power management are disclosed. A system includes at least a software driver, a power management unit, and a plurality of processing elements for performing rendering tasks. The system receives inputs which correspond to rendering tasks which need to be performed. The software driver monitors the inputs that are received and the number of rendering tasks to which they correspond. The software driver also monitors the amount of time remaining until the next video synchronization signal. The software driver determines which performance setting will minimize power consumption while still allowing enough time to finish the rendering tasks for the current frame before the next video synchronization signal. Then, the software driver causes the power management unit to provide this performance setting to the plurality of processing elements as they perform the rendering tasks for the current frame.
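
The selection step described in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each performance setting has a known rendering throughput and power draw; the names PerfSetting, tasks_per_ms, power_watts, and pick_setting are hypothetical and do not come from the patent. The fallback to the fastest setting when no setting meets the deadline is also an assumption.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PerfSetting:
    """Hypothetical performance setting (all fields are illustrative assumptions)."""
    name: str
    tasks_per_ms: float  # assumed rendering throughput at this setting
    power_watts: float   # assumed power draw at this setting


def pick_setting(settings: Sequence[PerfSetting],
                 pending_tasks: int,
                 ms_until_vsync: float) -> PerfSetting:
    """Return the lowest-power setting that finishes the frame before the next vsync."""
    # Keep only settings fast enough to finish the pending tasks in the time left.
    feasible = [s for s in settings
                if pending_tasks / s.tasks_per_ms <= ms_until_vsync]
    if not feasible:
        # Assumption: when no setting meets the deadline, use the fastest one.
        return max(settings, key=lambda s: s.tasks_per_ms)
    return min(feasible, key=lambda s: s.power_watts)


SETTINGS = [
    PerfSetting("low", tasks_per_ms=2.0, power_watts=5.0),
    PerfSetting("mid", tasks_per_ms=5.0, power_watts=12.0),
    PerfSetting("high", tasks_per_ms=10.0, power_watts=30.0),
]

# 40 tasks with 10 ms left: "low" would need 20 ms, "mid" needs 8 ms.
chosen = pick_setting(SETTINGS, pending_tasks=40, ms_until_vsync=10.0)
print(chosen.name)
```

In this sketch a driver would call pick_setting once per frame with the current task count and the time remaining, then hand the result to the power management unit.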

Data transmission between memory and on chip memory of inference engine for machine learning via a single data gathering instruction
10891136 · 2021-01-12

A system to support data gathering for a machine learning (ML) operation comprises a memory unit configured to maintain data for the ML operation in a plurality of memory blocks each accessible via a memory address. The system further comprises an inference engine comprising a plurality of processing tiles each comprising one or more of an on-chip memory (OCM) configured to load and maintain data for local access by components in the processing tile. The system also comprises a core configured to program components of the processing tiles of the inference engine according to an instruction set architecture (ISA) and a data streaming engine configured to stream data between the memory unit and the OCMs of the processing tiles of the inference engine, wherein the data streaming engine is configured to perform a data gathering operation via a single data gathering instruction of the ISA at the same time.

REAL-TIME GPU RENDERING WITH PERFORMANCE GUARANTEED POWER MANAGEMENT

Systems, apparatuses, and methods for performing real-time video rendering with performance guaranteed power management are disclosed. A system includes at least a software driver, a power management unit, and a plurality of processing elements for performing rendering tasks. The system receives inputs which correspond to rendering tasks which need to be performed. The software driver monitors the inputs that are received and the number of rendering tasks to which they correspond. The software driver also monitors the amount of time remaining until the next video synchronization signal. The software driver determines which performance setting will minimize power consumption while still allowing enough time to finish the rendering tasks for the current frame before the next video synchronization signal. Then, the software driver causes the power management unit to provide this performance setting to the plurality of processing elements as they perform the rendering tasks for the current frame.

Nonlinear, decentralized processing unit and related systems or methodologies
11868305 · 2024-01-09

Disclosed is a processor chip that includes on-chip and off-chip software. The chip is optimized for hyperdimensional, fixed-point vector algebra to efficiently store, process, and retrieve information. A specialized on-chip data-embedding algorithm uses algebraic logic gates to convert off-chip normal data, such as images and spreadsheets, into discrete, abstract vector space where information is processed with off-chip software and on-chip accelerated computation via a desaturation method. Information is retrieved using an on-chip optimized decoding algorithm. Additional software provides an interface between a CPU and the processor chip to manage information processing instructions for efficient data transfer on- and off-chip in addition to providing intelligent processing that associates input information to allow for suggestive outputs.

HARDWARE FOR SUPPORTING TIME TRIGGERED LOAD ANTICIPATION IN THE CONTEXT OF A REAL TIME OS

An integrated circuit is disclosed that includes a central processing unit (CPU), a random access memory (RAM) configured for storing data and CPU executable instructions, a first peripheral circuit for accessing memory that is external to the integrated circuit, a second peripheral circuit, and a communication bus coupled to the CPU, the RAM, the first peripheral circuit and the second peripheral circuit. The second peripheral circuit includes a first preload register configured to receive and store a first preload value, a first register configured to store first information that directly or indirectly identifies a first location where first instructions of a first task can be found in memory that is external to the integrated circuit, and a counter circuit that includes a counter value. The counter circuit can increment or decrement the counter value with time when the counter circuit is started. A first compare circuit is also included and can compare the counter value to the first preload value. The first compare circuit is configured to assert a first match signal in response to detecting a match between the counter value and the first preload value. The second peripheral circuit is configured to send a first preload request to the first peripheral circuit in response to an assertion of the first match signal. The first preload request identifies the location where the first instructions of the first task can be found in the external memory.
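
The counter-and-compare behavior of the second peripheral circuit can be modeled in software as a rough sketch. The class PreloadTimer, its parameters, and the callback standing in for the first peripheral circuit are hypothetical names chosen for illustration; the abstract describes hardware registers, not this API.

```python
from typing import Callable, List


class PreloadTimer:
    """Software model of the counter, preload register, and compare circuit.

    All names here are illustrative assumptions, not taken from the patent.
    """

    def __init__(self, preload_value: int, task_location: int,
                 send_preload_request: Callable[[int], None],
                 start: int = 0, step: int = 1) -> None:
        self.preload_value = preload_value   # first preload register
        self.task_location = task_location   # identifies where the task's instructions live
        self.send_preload_request = send_preload_request
        self.counter = start                 # counter value
        self.step = step                     # +1 to increment, -1 to decrement

    def tick(self) -> bool:
        """Advance the counter by one step; return True if the match signal is asserted."""
        self.counter += self.step
        if self.counter == self.preload_value:
            # Match: ask the memory-access peripheral to preload the task's instructions.
            self.send_preload_request(self.task_location)
            return True
        return False


requests: List[int] = []  # stands in for the first peripheral circuit's request queue
timer = PreloadTimer(preload_value=3, task_location=0x8000,
                     send_preload_request=requests.append)
matches = [timer.tick() for _ in range(5)]
print(matches, [hex(r) for r in requests])
```

The model issues exactly one preload request, on the tick at which the counter equals the preload value, which is the point of anticipating the task load before the task is scheduled.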

SYSTEMS AND METHODS FOR IMPLEMENTING AN INTELLIGENCE PROCESSING COMPUTING ARCHITECTURE

A system and method for automated data propagation and automated data processing within an integrated circuit includes an intelligence processing integrated circuit comprising at least one intelligence processing pipeline, wherein the at least one intelligence processing pipeline includes: a main data buffer that stores input data; a plurality of distinct intelligence processing tiles, wherein each distinct intelligence processing tile includes a computing circuit and a local data buffer; and a token-based governance module, the token-based governance module implementing: a first token-based control data structure; and a second token-based control data structure, wherein the first token-based control data structure and the second token-based control data structure operate in cooperation to control an automated flow of the input data and/or an automated processing of the input data through the at least one intelligence processing pipeline.

Systems and methods for implementing an intelligence processing computing architecture

A system and method for automated data propagation and automated data processing within an integrated circuit includes an intelligence processing integrated circuit comprising at least one intelligence processing pipeline, wherein the at least one intelligence processing pipeline includes: a main data buffer that stores input data; a plurality of distinct intelligence processing tiles, wherein each distinct intelligence processing tile includes a computing circuit and a local data buffer; and a token-based governance module, the token-based governance module implementing: a first token-based control data structure; and a second token-based control data structure, wherein the first token-based control data structure and the second token-based control data structure operate in cooperation to control an automated flow of the input data and/or an automated processing of the input data through the at least one intelligence processing pipeline.

RENDERING SYSTEM AND METHOD BASED ON SYSTEM-ON-CHIP (SOC) PLATFORM
20240029335 · 2024-01-25

A rendering system based on a system-on-chip (SOC) platform includes a plurality of processors, a plurality of graphics processors, an inter-core communication circuit, and a shared memory. The plurality of processors are divided into more than two processor groups. A plurality of operating systems are run by different processor groups. The plurality of graphics processors receive rendering tasks sent by the plurality of operating systems, respectively. The inter-core communication circuit is configured to cause the processors of the plurality of processor groups to communicate with each other. The plurality of graphics processors read image data in the shared memory, and rendered image data is transmitted to the shared memory.

Apparatus and method for configuring a microcontroller system
10572300 · 2020-02-25

A microcontroller system and a method of configuring a microcontroller system are provided. The microcontroller system comprises a central processing unit, memory associated with the microcontroller system, event receiving means operable to receive an event, and configuration control means. The configuration control means is operable to collect one or a plurality of sets of configuration data from said memory, each of said sets of configuration data defining a configuration related to at least one operational unit of the microcontroller system, wherein said sets of configuration data are associated with a first operation mode associated with the microcontroller system. Moreover, the configuration control means is operable to determine a set of configuration data of the sets of configuration data and to configure the microcontroller system corresponding to said determined set of configuration data.