Patent classifications
G06F15/17325
Multiple master, multi-slave serial peripheral interface
Systems, methods, and apparatus provide a multi-master serial peripheral interface (SPI). An apparatus is coupled to a plurality of master and slave devices through an interconnect circuit using individual point-to-point SPI links. The interconnect circuit may be configured to couple pairs of devices selected from the plurality of devices through their individual point-to-point SPI links, enable a first transaction to be completed between a first pair of devices after a first master device in the first pair of devices initiates the first transaction, enable a second transaction to be completed between a second pair of devices after a second master device in the second pair of devices initiates the second transaction, and prevent a collision between the first master device and the second master device while the first pair of devices is engaged in the first transaction. The pairs of devices may be selected when they are participants in one or more transactions.
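The pairing behaviour described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the class name SpiInterconnect, the methods begin and end, and the policy of refusing a request when either device is busy are all hypothetical and are not taken from the patent, which does not say how collisions are prevented.

```python
class SpiInterconnect:
    """Toy model of an interconnect that pairs devices over point-to-point links.

    Hypothetical illustration only; the names and the refuse-when-busy policy
    are assumptions, not the patented mechanism.
    """

    def __init__(self, masters, slaves):
        self.masters = set(masters)
        self.slaves = set(slaves)
        self.active = {}  # master -> slave for each transaction in flight

    def begin(self, master, slave):
        """Try to couple master to slave; return True if the pair was granted."""
        if master not in self.masters or slave not in self.slaves:
            raise ValueError("unknown device")
        # Refuse if this master is already busy or another master holds the slave.
        if master in self.active or slave in self.active.values():
            return False
        self.active[master] = slave
        return True

    def end(self, master):
        """Complete the transaction started by master and free both devices."""
        self.active.pop(master, None)
```

In this model two masters can run transactions at the same time as long as they target different slaves, while a second request for a slave that is already in use is refused until the first transaction ends.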
System and method for managing multi-core accesses to shared ports
A port is provided that utilizes various techniques to manage contention for the port by controlling data that is written to and read from the port in a multi-core assembly within a usable computing system. When the port is a sampling port, the assembly may include at least two cores, a plurality of buffers in operative communication with the sampling port, and a non-blocking contention management unit comprising a plurality of pointers that collectively operate to manage contention for shared ports in a multi-core computing system. When the port is a queuing port, the assembly may include buffers in communication with the queuing port, and the buffers are configured to hold multiple messages in the queuing port. The assembly may manage contention for shared queuing ports in a multi-core computing system.
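One common way to let a writer and a reader share a sampling port without blocking is to rotate among several buffers and track them with index "pointers". The sketch below assumes three buffers, one writer, and one reader; the class SamplingPort and its fields head and reading are hypothetical names, and the abstract does not say that the patent uses this particular scheme. It is a single-threaded model of the bookkeeping only; a real multi-core version would need atomic updates of the indices.

```python
class SamplingPort:
    """Single-writer, single-reader sampling port model (hypothetical sketch)."""

    def __init__(self, n_buffers=3):
        if n_buffers < 3:
            raise ValueError("need at least 3 buffers for one writer and one reader")
        self.buffers = [None] * n_buffers
        self.head = None     # index of the most recently completed write
        self.reading = None  # index currently held by the reader, if any

    def write(self, message):
        """Store message in a buffer that is neither the latest nor being read."""
        for i in range(len(self.buffers)):
            if i != self.head and i != self.reading:
                self.buffers[i] = message
                self.head = i  # publish only after the buffer is fully written
                return i

    def begin_read(self):
        """Claim the latest buffer and return its message (None if empty)."""
        self.reading = self.head
        return None if self.head is None else self.buffers[self.head]

    def end_read(self):
        """Release the buffer claimed by begin_read."""
        self.reading = None
```

Because the writer never touches the buffer the reader holds, a slow reader keeps a consistent message while newer messages are written elsewhere, and the next read picks up the freshest one.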
Memory system architecture for multi-threaded processors
Disclosed embodiments relate to an improved memory system architecture for multi-threaded processors. In one example, a system includes a multi-threaded processor core (MTPC), the MTPC comprising: P pipelines, each to concurrently process T threads; a crossbar to communicatively couple the P pipelines; a memory for use by the P pipelines; a scheduler to optimize reduction operations by assigning multiple threads to generate results of commutative arithmetic operations and then accumulate the generated results; and a memory controller (MC) to connect with external storage and other MTPCs, the MC further comprising at least one optimization selected from: an instruction set architecture including a dual-memory operation; a direct memory access (DMA) engine; a buffer to store multiple pending instruction cache requests; multiple channels across which to stripe memory requests; and a shadow-tag coherency management unit.
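The reduction strategy mentioned in this abstract (several threads each produce a partial result of a commutative operation, and the partial results are then accumulated) can be sketched in software. This is an illustration only: the function parallel_sum and its parameter n_threads are hypothetical, addition stands in for any commutative operation, and a Python thread pool stands in for the hardware scheduler.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_sum(values, n_threads=4):
    """Sum values by giving each worker thread a slice, then adding the partials.

    Splitting is valid because addition is commutative and associative, so the
    order in which partial sums are produced does not affect the result.
    """
    # Strided slices: worker k gets elements k, k + n_threads, k + 2*n_threads, ...
    chunks = [values[k::n_threads] for k in range(n_threads)]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        partials = list(pool.map(sum, chunks))
    # Final accumulation step over the per-thread results.
    return sum(partials)
```

For example, parallel_sum(list(range(101))) splits the hundred and one integers across four workers and combines their four partial sums.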
Receive-side timestamp accuracy
In one embodiment, a network device includes a network interface port configured to receive data symbols from a network node over a packet data network, at least some of the symbols being included in data packets, and controller circuitry including physical layer (PHY) circuitry, which includes receive PHY pipeline circuitry configured to process the received data symbols, and a counter configured to maintain a counter value indicative of a number of the data symbols in the receive PHY pipeline circuitry.
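A counter of this kind can be modelled as a value that rises when a symbol enters the receive pipeline and falls when one leaves. The sketch below is a hypothetical illustration: the class RxPipelineCounter and the function estimate_arrival_ns are invented names, and the idea of using the count to correct a timestamp by subtracting the pipeline occupancy times a fixed symbol period is an assumption suggested by the title, not something the abstract states.

```python
class RxPipelineCounter:
    """Tracks how many data symbols are currently inside the receive pipeline."""

    def __init__(self):
        self.in_flight = 0

    def symbol_entered(self):
        """Call when a symbol enters the pipeline."""
        self.in_flight += 1

    def symbol_left(self):
        """Call when a symbol exits the pipeline."""
        if self.in_flight == 0:
            raise RuntimeError("no symbols in the pipeline")
        self.in_flight -= 1


def estimate_arrival_ns(exit_time_ns, in_flight, symbol_period_ns):
    """Assumed correction: back-date the exit time by the pipeline occupancy.

    Assumes every symbol in the pipeline accounts for one fixed symbol period
    of delay, which is a simplification made for this example.
    """
    return exit_time_ns - in_flight * symbol_period_ns
```

With three symbols in the pipeline and a 10 ns symbol period, a timestamp taken at 1000 ns would be back-dated by 30 ns under this assumed model.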
Broadcast synchronization for dynamically adaptable arrays
An array processor includes processor element arrays (PEAs) distributed in rows and columns. The PEAs are configured to perform operations on parameter values. A first sequencer receives a first direct memory access (DMA) instruction that includes a request to read data from at least one address in memory. A texture address (TA) engine requests the data from the memory based on the at least one address, and a texture data (TD) engine provides the data to the PEAs. The PEAs provide first synchronization signals to the TD engine to indicate availability of registers for receiving the data. The TD engine provides second synchronization signals to the first sequencer in response to receiving acknowledgments that the PEAs have consumed the data.
Data processing engine tile architecture for an integrated circuit
An example data processing engine (DPE) for a DPE array in an integrated circuit (IC) includes: a core; a memory including a data memory and a program memory, the program memory coupled to the core, the data memory coupled to the core and including at least one connection to a respective at least one additional core external to the DPE; support circuitry including hardware synchronization circuitry and direct memory access (DMA) circuitry each coupled to the data memory; streaming interconnect coupled to the DMA circuitry and the core; and memory-mapped interconnect coupled to the core, the memory, and the support circuitry.
Embedding rings on a toroid computer network
A computer comprising a plurality of interconnected processing nodes arranged in a configuration with multiple layers, arranged along an axis, comprising first and second endmost layers and at least one intermediate layer between the first and second endmost layers is provided. Each layer comprises a plurality of processing nodes connected in a ring by an intralayer respective set of links between each pair of neighbouring processing nodes, the links adapted to operate simultaneously. Nodes in each layer are connected to respective corresponding nodes in each adjacent layer by an interlayer link. Each processing node in the first endmost layer is connected to a corresponding node in the second endmost layer. Data is transmitted around a plurality of embedded one-dimensional logical rings with an asymmetric bandwidth utilisation, each logical ring using all processing nodes of the computer in such a manner that the plurality of embedded one-dimensional logical rings operate simultaneously.
Systems and methods for multi-architecture computing
Disclosed herein are systems and methods for multi-architecture computing. For example, in some embodiments, a computing system may include: a processor system including at least one first processor core having a first instruction set architecture (ISA); a memory device coupled to the processor system, wherein the memory device has stored thereon a first binary representation of a program for the first ISA; and control logic to suspend execution of the program by the at least one first processor core and cause at least one second processor core to resume execution of the program, wherein the at least one second processor core has a second ISA different from the first ISA; wherein the program is to generate data having an in-memory representation compatible with both the first ISA and the second ISA.
Data exchange pathways between pairs of processing units in columns in a computer
A time deterministic computer is architected so that exchange code compiled for one set of tiles, e.g., a column, can be reused on other sets. The computer comprises: a plurality of processing units each having an input interface with a set of input wires, and an output interface with a set of output wires; a switching fabric connected to each of the processing units by the respective set of output wires and connectable to each of the processing units by the respective input wires via switching circuitry controllable by its associated processing unit; the processing units arranged in columns, each column having a base processing unit proximate the switching fabric and multiple processing units one adjacent the other in respective positions in the direction of the column.
Loop Execution Control for a Multi-Threaded, Self-Scheduling Reconfigurable Computing Fabric Using a Reenter Queue
Representative apparatus, method, and system embodiments are disclosed for configurable computing. A representative system includes an interconnection network; a processor; and a plurality of configurable circuit clusters. Each configurable circuit cluster includes a plurality of configurable circuits arranged in an array; a synchronous network coupled to each configurable circuit of the array; and an asynchronous packet network coupled to each configurable circuit of the array. A representative configurable circuit includes a configurable computation circuit and a configuration memory having a first, instruction memory storing a plurality of data path configuration instructions to configure a data path of the configurable computation circuit; and a second, instruction and instruction index memory storing a plurality of spoke instructions and data path configuration instruction indices for selection of a master synchronous input, a current data path configuration instruction, and a next data path configuration instruction for a next configurable computation circuit.