Patent classifications
G06F9/30079
PROCESSING PIPELINE WITH ZERO LOOP OVERHEAD
Techniques are disclosed for reducing or eliminating loop overhead caused by function calls in processors that form part of a pipeline architecture. The processors in the pipeline process data blocks iteratively, with each processor in the pipeline completing one of several iterations associated with a processing loop for a commonly-executed function. The described techniques leverage message passing between pipelined processors to enable an upstream processor to signal to a downstream processor when processing has been completed and a data block is thus ready for further processing in accordance with the next loop iteration. The described techniques facilitate a zero-loop-overhead architecture, enable continuous data block processing, and allow the processing pipeline to function indefinitely within the main body of the processing loop associated with the commonly-executed function, where efficiency is greatest.
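The idea lends itself to a software analogy. Below is a minimal sketch, assuming POSIX threads, in which each stage thread performs one iteration of a common function and a single-slot mailbox carries the "block ready" message downstream, so no stage ever executes loop entry/exit overhead between blocks. All names are illustrative, not taken from the patent.

/* build: cc -pthread zero_loop.c */
#include <pthread.h>
#include <stdio.h>

#define STAGES 4

typedef struct {                 /* single-slot mailbox between stages */
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             full;
    int             block;       /* the data block being passed along */
} mailbox_t;

static mailbox_t mbox[STAGES + 1];

static void put(mailbox_t *m, int block) {
    pthread_mutex_lock(&m->lock);
    while (m->full)
        pthread_cond_wait(&m->cond, &m->lock);
    m->block = block;
    m->full = 1;
    pthread_cond_broadcast(&m->cond);   /* "block is ready" message */
    pthread_mutex_unlock(&m->lock);
}

static int get(mailbox_t *m) {
    pthread_mutex_lock(&m->lock);
    while (!m->full)
        pthread_cond_wait(&m->cond, &m->lock);
    int block = m->block;
    m->full = 0;
    pthread_cond_broadcast(&m->cond);
    pthread_mutex_unlock(&m->lock);
    return block;
}

static void *stage(void *arg) {
    long id = (long)arg;
    for (;;) {                          /* never leaves the loop body */
        int block = get(&mbox[id]);     /* wait for upstream's message */
        block += 1;                     /* one iteration of the common function */
        put(&mbox[id + 1], block);      /* signal downstream: block is ready */
    }
    return NULL;
}

int main(void) {
    pthread_t t[STAGES];
    for (int i = 0; i <= STAGES; i++) {
        pthread_mutex_init(&mbox[i].lock, NULL);
        pthread_cond_init(&mbox[i].cond, NULL);
    }
    for (long i = 0; i < STAGES; i++)
        pthread_create(&t[i], NULL, stage, (void *)i);
    for (int b = 0; b < 4; b++) {       /* feed blocks continuously */
        put(&mbox[0], b * 100);
        printf("block out: %d\n", get(&mbox[STAGES]));
    }
    return 0;
}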
DATA PIPELINE CIRCUIT SUPPORTING INCREASED DATA TRANSFER INTERFACE FREQUENCY WITH REDUCED POWER CONSUMPTION, AND RELATED METHODS
A data pipeline circuit includes an upstream interface circuit that receives sequential data and a downstream interface circuit that transfers the sequential data to a downstream circuit. A ready signal indicates that the downstream circuit is ready to receive the sequential data. The data pipeline circuit includes a first data latch, a second data latch, and a first status latch. The first data latch receives the sequential data. The first status latch generates an available signal that is asserted to indicate that the second data latch is available to receive the sequential data. The second data latch receives the sequential data in response to the available signal being asserted and the ready signal indicating that the downstream circuit is not ready to receive the sequential data at the data output. Limiting the conditions in which the sequential data is stored in the second data latch significantly reduces the power consumption of the data pipeline circuit.
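The arrangement described resembles a classic two-deep "skid buffer"; the following behavioral C model is an assumption in that spirit, not RTL for the claimed circuit. It shows why the second latch rarely toggles: it captures data only when the first latch is still occupied because the downstream is stalled.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint32_t d1, d2;        /* first data latch; second (skid) data latch */
    bool     d1_valid;
    bool     d2_avail;      /* first status latch: second latch is free */
} pipe_t;

/* One clock tick. Returns true if the input word was accepted. */
static bool tick(pipe_t *p, bool in_valid, uint32_t in, bool ready,
                 bool *out_valid, uint32_t *out) {
    /* Drain phase: downstream pops the oldest entry when ready. */
    *out_valid = false;
    if (ready) {
        if (!p->d2_avail)     { *out = p->d2; *out_valid = true; p->d2_avail = true; }
        else if (p->d1_valid) { *out = p->d1; *out_valid = true; p->d1_valid = false; }
    }
    /* Fill phase: new data lands in the first latch. The second latch
     * toggles ONLY when the first is still occupied because the
     * downstream stalled -- the power-saving condition in the abstract. */
    if (!in_valid) return false;
    if (p->d1_valid) {
        if (!p->d2_avail) return false;   /* both full: upstream must hold */
        p->d2 = p->d1;                    /* spill the older word */
        p->d2_avail = false;
    }
    p->d1 = in;
    p->d1_valid = true;
    return true;
}

int main(void) {
    pipe_t p = { .d2_avail = true };
    bool v; uint32_t o;
    tick(&p, true, 1, false, &v, &o);   /* stalled: 1 enters first latch */
    tick(&p, true, 2, false, &v, &o);   /* stalled: 1 spills to skid latch */
    tick(&p, false, 0, true, &v, &o);   /* ready: oldest word (1) pops */
    printf("out_valid=%d out=%u\n", v, o);
    return 0;
}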
METHODS FOR CONFIGURING SPAN OF CONTROL UNDER VARYING TEMPERATURE
A method may include, in response to a change in an operating parameter of a processing unit, modifying a signal pathway to a processing circuit component of the processing unit, and communicating with the processing circuit component via the signal pathway.
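The claim is broad, but the mechanism can be sketched: when an operating parameter (temperature, here) crosses a threshold, switch which signal pathway is used to reach the processing circuit component. The threshold value and path names below are illustrative assumptions.

#include <stdio.h>

enum pathway { PATH_FAST, PATH_LOW_POWER };

static enum pathway active_path = PATH_FAST;

/* In response to a change in an operating parameter, modify the pathway. */
static void on_parameter_change(int temp_c) {
    active_path = (temp_c > 85) ? PATH_LOW_POWER : PATH_FAST;
}

/* Communicate with the processing circuit component via the pathway. */
static void communicate(int payload) {
    if (active_path == PATH_FAST)
        printf("fast path: %d\n", payload);
    else
        printf("low-power path: %d\n", payload);
}

int main(void) {
    communicate(1);            /* nominal temperature: fast pathway */
    on_parameter_change(95);   /* temperature crossed the threshold */
    communicate(2);            /* re-routed via the alternate pathway */
    return 0;
}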
ADAPTIVE PIPELINE SELECTION FOR ACCELERATING MEMORY COPY OPERATIONS
Examples include a computing system having a direct memory access (DMA) engine pipeline, a plurality of processing cores, each processing core including a core pipeline, and a memory coupled to the DMA engine pipeline and the plurality of processing cores. The computing system includes a pipeline selector coupled to the plurality of processing cores and the DMA engine pipeline, the pipeline selector to, during initialization, determine at least one threshold for pipeline selection for the computing system, and during runtime, select one of the core pipelines or the DMA engine pipeline to execute a memory copy operation in the memory based at least in part on the at least one threshold.
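A hedged sketch of the selection policy: the threshold is fixed at initialization and each copy is routed at runtime. The dma_copy() entry point is hypothetical and stubbed with memcpy here; a real system would program a DMA engine through a driver.

#include <stddef.h>
#include <string.h>

static size_t dma_threshold;   /* bytes; fixed during initialization */

/* Hypothetical DMA engine entry point, stubbed with memcpy. */
static void dma_copy(void *dst, const void *src, size_t n) {
    memcpy(dst, src, n);       /* stands in for a real DMA driver call */
}

/* During initialization: determine the crossover threshold (measured or
 * looked up per platform; the value here is only illustrative). */
static void init_pipeline_selector(void) {
    dma_threshold = 64 * 1024;
}

/* During runtime: route each copy to a pipeline based on the threshold. */
static void smart_copy(void *dst, const void *src, size_t n) {
    if (n >= dma_threshold)
        dma_copy(dst, src, n);   /* DMA engine pipeline for large copies */
    else
        memcpy(dst, src, n);     /* core pipeline for small copies */
}

int main(void) {
    static char src[128 * 1024], dst[128 * 1024];
    init_pipeline_selector();
    smart_copy(dst, src, sizeof src);   /* large: DMA engine pipeline */
    smart_copy(dst, src, 256);          /* small: core pipeline */
    return 0;
}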
Processing-in-memory implementations of parsing strings against context-free grammars
An example system implementing a processing-in-memory pipeline includes: a memory array to store a plurality of look-up tables (LUTs) and data comprising an input string; a logic array coupled to the memory array, the logic array to perform a set of logic operations on the data and the LUTs, the set of logic operations implementing a set of production rules of a context-free grammar to translate the input string into one or more symbols; and a control block coupled to the memory array and the logic array, the control block to control a computational pipeline by activating one or more LUTs of the plurality of LUTs, the computational pipeline implementing a parser evaluating the input string against the context-free grammar.
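The LUT-driven approach maps naturally onto CYK-style parsing of a grammar in Chomsky normal form: a unit LUT covers terminal rules and a pair LUT covers binary production rules. The toy balanced-parentheses grammar below is an illustration, not the patent's grammar, and runs on a CPU rather than in memory.

#include <stdio.h>
#include <string.h>

enum { S, X, A, B, NT };               /* nonterminal indices */

/* Pair LUT: pair_lut[l][r] is a bitmask of nonterminals deriving l r. */
static unsigned pair_lut[NT][NT];

static unsigned unit_lut(char c) {     /* terminal LUT */
    if (c == '(') return 1u << A;
    if (c == ')') return 1u << B;
    return 0;
}

static int recognizes(const char *s) {
    int n = (int)strlen(s);
    if (n == 0 || n > 32) return 0;
    unsigned cell[32][32] = {0};       /* cell[i][len-1]: derivations of s[i..i+len) */
    for (int i = 0; i < n; i++) cell[i][0] = unit_lut(s[i]);
    for (int len = 2; len <= n; len++)
        for (int i = 0; i + len <= n; i++)
            for (int k = 1; k < len; k++)              /* split point */
                for (int l = 0; l < NT; l++)
                    for (int r = 0; r < NT; r++)
                        if ((cell[i][k - 1] >> l & 1u) &&
                            (cell[i + k][len - k - 1] >> r & 1u))
                            cell[i][len - 1] |= pair_lut[l][r];
    return cell[0][n - 1] >> S & 1;
}

int main(void) {
    /* Production rules written into the LUT (CNF balanced parentheses). */
    pair_lut[A][B] |= 1u << S;   /* S -> A B   matches "()"   */
    pair_lut[A][X] |= 1u << S;   /* S -> A X   matches "(S)"  */
    pair_lut[S][S] |= 1u << S;   /* S -> S S                  */
    pair_lut[S][B] |= 1u << X;   /* X -> S B                  */
    printf("%d %d\n", recognizes("(())()"), recognizes("(("));  /* 1 0 */
    return 0;
}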
PERFORMING CYCLIC REDUNDANCY CHECKS USING PARALLEL COMPUTING ARCHITECTURES
Apparatuses, systems, and techniques use a graphics processing unit (GPU) to compute cyclic redundancy checks (CRCs). For example, in at least one embodiment, an input data sequence is distributed among GPU threads for parallel calculation of an overall CRC value for the input data sequence according to various novel techniques described herein.
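The key enabler for parallelizing a CRC is that partial CRCs over disjoint chunks can be merged afterward. The sketch below demonstrates that algebra on a CPU using zlib's crc32() and crc32_combine(); a GPU version would assign the chunks to threads instead of loop iterations. Link with -lz.

/* build: cc crc.c -lz */
#include <stdio.h>
#include <zlib.h>

int main(void) {
    const unsigned char data[] = "an input data sequence split across workers";
    size_t n = sizeof data - 1, step = 12;

    uLong total = crc32(0L, Z_NULL, 0);                /* initial CRC */
    for (size_t i = 0; i < n; i += step) {
        size_t len = (i + step > n) ? n - i : step;
        /* Each chunk's CRC is independent -- this is the per-thread work. */
        uLong part = crc32(crc32(0L, Z_NULL, 0), data + i, (uInt)len);
        /* Merge the partials; order-preserving, so a reduction works too. */
        total = crc32_combine(total, part, (z_off_t)len);
    }
    uLong ref = crc32(crc32(0L, Z_NULL, 0), data, (uInt)n);
    printf("combined=%08lx reference=%08lx\n", total, ref);  /* equal */
    return 0;
}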
Methods and apparatus for data pipelines between cloud computing platforms
Methods, apparatus, systems, and articles of manufacture are disclosed to establish a data pipeline between cloud computing platforms. An apparatus includes a producer registration controller to register a data producer with a data pipeline service in a public cloud network, the data producer associated with a private cloud network; a consumer registration controller to register a data consumer with the data pipeline service; and a communication controller to, in response to the registration of the data consumer, transmit data generated by the data producer to a data buffer via a first data plane gateway and, in response to a validation of the data consumer, transmit the data from the data buffer to the data consumer via a second data plane gateway, the first data plane gateway being different from the second data plane gateway.
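The control flow can be reduced to plain C state with no real cloud APIs (every name here is an illustrative stand-in): registration admits producer data into a buffer through one gateway, and validation releases it through a different one.

#include <stdbool.h>
#include <stdio.h>

typedef struct { bool registered; bool validated; } party_t;
typedef struct { int data; bool full; } buffer_t;

static buffer_t buf;                       /* the data buffer in the service */

static void gateway1_ingest(int data) { buf.data = data; buf.full = true; }
static int  gateway2_egress(void)     { buf.full = false; return buf.data; }

int main(void) {
    party_t producer = {0}, consumer = {0};
    producer.registered = true;            /* producer registration controller */
    consumer.registered = true;            /* consumer registration controller */

    /* Registration of the consumer triggers the ingest leg: producer
     * data enters the buffer via the first data plane gateway. */
    if (producer.registered && consumer.registered)
        gateway1_ingest(42);

    /* Validation of the consumer triggers the egress leg via the
     * second, distinct data plane gateway. */
    consumer.validated = true;
    if (consumer.validated && buf.full)
        printf("delivered: %d\n", gateway2_egress());
    return 0;
}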
SYSTEM AND METHOD FOR PROVIDING GRANULAR PROCESSOR PERFORMANCE CONTROL
A basic input/output system provides an interface for a core aggregation layout that identifies a grouping of processor cores into core aggregations, wherein each of the core aggregations is associated with a maximum allowable C-state. A processor may monitor an information handling system during operation of an application to gather data associated with latency sensitivity of the application, update the core aggregation layout based on the data gathered during the operation of the application, and pin a thread for execution to one of the processor cores based on the latency sensitivity of the application and the maximum allowable C-state.
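Pinning a thread by latency sensitivity can be sketched with standard Linux affinity calls; the core grouping and its C-state cap below are illustrative stand-ins for whatever the BIOS-provided core aggregation layout would describe.

/* build: cc -pthread pin.c (Linux) */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

/* Illustrative core aggregation layout: cores 0-3 capped at a shallow
 * C-state (fast wake), cores 4-7 allowed a deep C-state (slow wake). */
static const int low_latency_cores[] = {0, 1, 2, 3};

static int pin_to_aggregation(pthread_t t, const int *cores, int count) {
    cpu_set_t set;
    CPU_ZERO(&set);
    for (int i = 0; i < count; i++)
        CPU_SET(cores[i], &set);
    return pthread_setaffinity_np(t, sizeof set, &set);
}

static void *work(void *arg) { (void)arg; return NULL; }

int main(void) {
    pthread_t t;
    pthread_create(&t, NULL, work, NULL);
    /* Latency-sensitive thread: keep it on the shallow-C-state cores. */
    if (pin_to_aggregation(t, low_latency_cores, 4) != 0)
        fprintf(stderr, "failed to set affinity\n");
    pthread_join(t, NULL);
    return 0;
}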
Circuit for Fast Interrupt Handling
A circuit for fast interrupt handling is disclosed. An apparatus includes a processor circuit having an execution pipeline and a table configured to store a plurality of pointers that correspond to interrupt routines stored in a memory circuit. The apparatus further includes an interrupt redirect circuit configured to receive a plurality of interrupt requests. The interrupt redirect circuit may select a first interrupt request from among a plurality of interrupt requests of a first type. The interrupt redirect circuit retrieves a pointer from the table using information associated with the selected request. Using the pointer, the execution pipeline retrieves first program instructions from the memory circuit to execute a particular interrupt routine.
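In software terms, the dispatch path is a pointer table indexed by the selected request, with no triage loop in the handler itself. The selection policy below (lowest-numbered pending source wins) and all names are assumed simplifications.

#include <stdint.h>
#include <stdio.h>

typedef void (*isr_t)(void);

static void timer_isr(void) { puts("timer"); }
static void uart_isr(void)  { puts("uart");  }

/* The pointer table; entries correspond to routines stored in memory. */
static isr_t vector_table[32] = { [0] = timer_isr, [5] = uart_isr };

/* Redirect logic: select a request from the pending mask, fetch its
 * pointer, and enter the routine directly. */
static void dispatch(uint32_t pending) {
    while (pending) {
        int irq = __builtin_ctz(pending);   /* lowest pending source wins */
        pending &= pending - 1u;            /* clear the serviced request */
        if (vector_table[irq])
            vector_table[irq]();
    }
}

int main(void) {
    dispatch((1u << 5) | (1u << 0));        /* two requests arrive together */
    return 0;
}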
Global coherence operations
A method includes receiving, by an L2 controller, a request to perform a global operation on an L2 cache and preventing new blocking transactions from entering a pipeline coupled to the L2 cache while permitting new non-blocking transactions to enter the pipeline. Blocking transactions include read transactions and non-victim write transactions. Non-blocking transactions include response transactions, snoop transactions, and victim transactions. The method further includes, in response to an indication that the pipeline does not contain any pending blocking transactions, preventing new snoop transactions from entering the pipeline while permitting new response transactions and victim transactions to enter the pipeline; in response to an indication that the pipeline does not contain any pending snoop transactions, preventing all new transactions from entering the pipeline; and, in response to an indication that the pipeline does not contain any pending transactions, performing the global operation on the L2 cache.
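The staged drain can be summarized as a small state machine: admission tightens phase by phase as each class of pending transactions empties, and the global operation runs only once the pipeline is empty. A sketch, with counters standing in for the pipeline's pending-transaction tracking; all names are illustrative.

#include <stdbool.h>
#include <stdio.h>

typedef enum { TX_READ, TX_WRITE, TX_SNOOP, TX_RESPONSE, TX_VICTIM } tx_t;
typedef enum { RUN, DRAIN_BLOCKING, DRAIN_SNOOPS, DRAIN_ALL, GLOBAL_OP } phase_t;

static phase_t phase = RUN;
static int pending_blocking, pending_snoops, pending_total;

/* Admission control: which transaction types may enter the pipeline. */
static bool admit(tx_t t) {
    switch (phase) {
    case RUN:
        return true;
    case DRAIN_BLOCKING:          /* reads and non-victim writes blocked */
        return t == TX_RESPONSE || t == TX_SNOOP || t == TX_VICTIM;
    case DRAIN_SNOOPS:            /* snoops now blocked as well */
        return t == TX_RESPONSE || t == TX_VICTIM;
    default:
        return false;             /* nothing enters any more */
    }
}

/* Advance a phase each time the corresponding pending count drains to 0. */
static void advance(void) {
    if (phase == DRAIN_BLOCKING && pending_blocking == 0) phase = DRAIN_SNOOPS;
    if (phase == DRAIN_SNOOPS   && pending_snoops  == 0) phase = DRAIN_ALL;
    if (phase == DRAIN_ALL      && pending_total   == 0) phase = GLOBAL_OP;
}

int main(void) {
    phase = DRAIN_BLOCKING;       /* global operation requested */
    advance();                    /* pending counts already zero here */
    printf("phase=%d admit(read)=%d\n", phase, admit(TX_READ));  /* 4 0 */
    return 0;
}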