Patent classifications
G06F15/781
DATA LINK STABILITY DETECTION USING COMPUTER VISION-BASED DATA EYE ANALYSIS
The reliability of a data communication link may be analyzed and otherwise maintained by collecting a two-dimensional array representing a functional data eye, and using a convolutional neural network to determine a score of the functional data eye. The determined score may be compared with a threshold, and an action may be initiated based on the result of the comparison.
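The pipeline described here (collect a two-dimensional eye array, score it with a convolutional neural network, compare the score with a threshold, then act) can be sketched behaviorally. The snippet below is a minimal stand-in and not the patented implementation: a hand-rolled 3x3 convolution plus global pooling plays the role of the CNN, and the kernel weights, threshold, and action names are illustrative assumptions.

import numpy as np

def score_eye(eye, kernel):
    # Stand-in for the CNN scorer: one valid 2D convolution followed by
    # global average pooling, squashed into a (0, 1) score.
    kh, kw = kernel.shape
    h, w = eye.shape
    conv = np.empty((h - kh + 1, w - kw + 1))
    for i in range(conv.shape[0]):
        for j in range(conv.shape[1]):
            conv[i, j] = np.sum(eye[i:i + kh, j:j + kw] * kernel)
    pooled = conv.mean()                      # global average pooling
    return 1.0 / (1.0 + np.exp(-pooled))      # sigmoid -> score

def check_link(eye, threshold=0.5):
    # Compare the eye score with a threshold and pick an action.
    kernel = np.ones((3, 3)) / 9.0            # placeholder for trained weights
    score = score_eye(eye, kernel)
    return "retrain_link" if score < threshold else "link_ok"   # illustrative actions

eye = np.random.default_rng(0).random((16, 16))   # 2D array standing in for a captured data eye
print(check_link(eye))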
Method and apparatus for dual issue multiply instructions
A method is provided that includes performing, by a processor in response to a dual issue multiply instruction, multiplication of operands of the dual issue multiply instruction using multiplication units comprised in a data path of the processor and configured to operate together to determine a product of the operands, and storing, by the processor, the product in a storage location indicated by the dual issue multiply instruction.
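As a rough behavioral sketch of two multiplication units in a data path cooperating on one product (the operand split, widths, and combine step below are assumptions, not the patented datapath):

MASK16 = (1 << 16) - 1

def mul_unit(a, b):
    # Model of one hardware multiplier in the data path.
    return a * b

def dual_issue_mul32(x, y):
    # Two multiplier units cooperate on a single 32x32 -> 64-bit product:
    # unit 0 takes the low half of x, unit 1 the high half, and the
    # partial products are combined with a shift-add.
    lo = mul_unit(x & MASK16, y)            # issued to multiplication unit 0
    hi = mul_unit((x >> 16) & MASK16, y)    # issued to multiplication unit 1
    return (hi << 16) + lo

x, y = 0xDEADBEEF, 0x12345678
assert dual_issue_mul32(x, y) == x * y
print(hex(dual_issue_mul32(x, y)))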
Address interleaving for machine learning
A system includes a memory, an inference engine, and a master. The memory is configured to store data. The inference engine is configured to receive the data and to perform one or more computation tasks of a machine learning (ML) operation associated with the data. The master is configured to interleave an address associated with a memory access transaction for accessing the memory. The master is further configured to provide content associated with the access to the inference engine.
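One way to picture the interleaving the master performs is a simple cache-line-granular spread of addresses across memory channels; the channel count, line size, and mapping below are assumptions for illustration only.

def interleave_address(addr, num_channels=4, line_bytes=64):
    # Hypothetical mapping: consecutive cache lines rotate across channels,
    # and the remaining bits form the address within the chosen channel.
    line = addr // line_bytes
    channel = line % num_channels
    local = (line // num_channels) * line_bytes + (addr % line_bytes)
    return channel, local

for addr in (0x0000, 0x0040, 0x0080, 0x00C0, 0x0100):
    channel, local = interleave_address(addr)
    print(f"addr {addr:#06x} -> channel {channel}, local {local:#06x}")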
DSB operation with excluded region
Techniques are disclosed relating to data synchronization barrier operations. A system includes a first processor that may receive a data barrier operation request from a second processor included in the system. Based on receiving that data barrier operation request from the second processor, the first processor may ensure that outstanding load/store operations executed by the first processor that are directed to addresses outside of an exclusion region have been completed. The first processor may respond to the second processor that the data barrier operation request is complete at the first processor, even if one or more load/store operations directed to addresses within the exclusion region are still outstanding and not complete when the first processor responds that the data barrier operation request is complete.
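A behavioral sketch of that completion rule may help (the data layout and region bounds are assumptions): the first processor may acknowledge the barrier once every outstanding load/store outside the exclusion region has drained, even if in-region operations are still pending.

from dataclasses import dataclass

@dataclass
class Op:
    addr: int
    done: bool

def dsb_can_complete(outstanding, excl_lo, excl_hi):
    # Acknowledge the barrier only when every outstanding load/store that
    # targets an address OUTSIDE the exclusion region has completed;
    # in-region operations are allowed to remain outstanding.
    return all(op.done or excl_lo <= op.addr < excl_hi for op in outstanding)

ops = [Op(0x1000, True), Op(0x2000, True), Op(0x8010, False)]   # last one is in-region
print(dsb_can_complete(ops, excl_lo=0x8000, excl_hi=0x9000))    # True: barrier may be acknowledged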
Hardware for supporting time triggered load anticipation in the context of a real time OS
An integrated circuit is disclosed that includes a central processing unit (CPU), a random access memory (RAM) configured for storing data and CPU executable instructions, a first peripheral circuit for accessing memory that is external to the integrated circuit, a second peripheral circuit, and a communication bus coupled to the CPU, the RAM, the first peripheral circuit and the second peripheral circuit. The second peripheral circuit includes a first preload register configured to receive and store a first preload value, a first register configured to store first information that directly or indirectly identifies a first location where first instructions of a first task can be found in memory that is external to the integrated circuit, and a counter circuit that includes a counter value. The counter circuit can increment or decrement the counter value with time when the counter circuit is started. A first compare circuit is also included and can compare the counter value to the first preload value. The first compare circuit is configured to assert a first match signal in response to detecting a match between the counter value and the first preload value. The second peripheral circuit is configured to send a first preload request to the first peripheral circuit in response to an assertion of the first match signal. The first preload request identifies the location where the first instructions of the first task can be found in the external memory.
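A software model of the second peripheral circuit's trigger path may make this easier to follow: a counter is compared against a preload register and, on a match, a preload request naming the task's code location is sent toward the external-memory peripheral. The register contents, the callback, and the address below are illustrative assumptions.

class PreloadTimer:
    # Behavioral model of the second peripheral circuit: counter, preload
    # register, compare circuit, and the preload request emitted on a match.
    def __init__(self, preload_value, task_location, send_request):
        self.preload_value = preload_value      # first preload register
        self.task_location = task_location      # where the first task's instructions live
        self.send_request = send_request        # stands in for the request to the first peripheral
        self.counter = 0

    def tick(self):
        self.counter += 1                       # counter value advances with time
        if self.counter == self.preload_value:  # first compare circuit detects a match
            self.send_request(self.task_location)

requests = []
timer = PreloadTimer(preload_value=3, task_location=0x0800_0000, send_request=requests.append)
for _ in range(5):
    timer.tick()
print([hex(r) for r in requests])               # preload requested at the third tick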
Network On Layer Enabled Architectures
The technology relates to a system on chip (SoC). The SoC may include a network on layer including one or more routers and an application specific integrated circuit (ASIC) layer bonded to the network layer, the ASIC layer including one or more components. In some instances, the network layer and the ASIC layer each include an active surface and a second surface opposite the active surface. The active surface of the ASIC layer and the second surface of the network layer may each include one or more contacts, and the network layer may be bonded to the ASIC layer via bonds formed between the one or more contacts on the second surface of the network layer and the one or more contacts on the active surface of the ASIC layer.
HARDWARE ACCELERATION OF REINFORCEMENT LEARNING WITHIN NETWORK DEVICES
A network interface device includes a memory to store configuration values associated with a reinforcement learning (RL) routine and a set of RL-related parameters associated with the RL routine, packet processing circuitry to receive network packets, and accelerator circuitry coupled to the memory and the packet processing circuitry. The accelerator circuitry is to: detect a network packet that meets a particular criterion; and execute the RL routine, using the configuration values and in response to detecting the network packet, to employ observation information derived from or associated with the network packet to perform an RL-related action.
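To make the flow concrete, here is a minimal stand-in for the accelerator's RL routine using tabular Q-learning and an invented packet criterion; the configuration values, observation, reward, and action names are all assumptions, not the device's actual parameters.

import random

CONFIG = {"alpha": 0.1, "gamma": 0.9, "epsilon": 0.1}   # assumed configuration values
ACTIONS = ["route_path_a", "route_path_b"]
q_table = {}                                            # RL-related parameters held in memory

def matches_criterion(packet):
    # Stand-in for the packet-detection logic; the criterion is illustrative.
    return packet.get("dscp") == 46

def rl_step(observation, reward):
    # One tabular Q-learning update plus epsilon-greedy action selection,
    # standing in for the RL routine executed by the accelerator circuitry.
    q = q_table.setdefault(observation, {a: 0.0 for a in ACTIONS})
    action = random.choice(ACTIONS) if random.random() < CONFIG["epsilon"] else max(q, key=q.get)
    best_next = max(q.values())
    q[action] += CONFIG["alpha"] * (reward + CONFIG["gamma"] * best_next - q[action])
    return action

packet = {"dscp": 46, "queue_delay_us": 120}
if matches_criterion(packet):
    observation = "high_delay" if packet["queue_delay_us"] > 100 else "low_delay"
    print(rl_step(observation, reward=-packet["queue_delay_us"] / 1000))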
METHOD AND APPARATUS FOR VECTOR PERMUTATION
A method is provided that includes performing, by a processor in response to a vector permutation instruction, permutation of values stored in lanes of a vector to generate a permuted vector, wherein the permutation is responsive to a control storage location storing permute control input for each lane of the permuted vector, wherein the permute control input corresponding to each lane of the permuted vector indicates a value to be stored in the lane of the permuted vector, wherein the permute control input for at least one lane of the permuted vector indicates a value of a selected lane of the vector is to be stored in the at least one lane, and storing the permuted vector in a storage location indicated by an operand of the vector permutation instruction.
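The lane-selection behavior reads more easily as a few lines of code. This sketch covers only the "copy a selected source lane" case of the permute control input; the lane count and values are arbitrary.

def vperm(src, control):
    # Each control entry names the source lane whose value is written into
    # the corresponding lane of the permuted vector (duplication is allowed).
    return [src[c] for c in control]

src = [10, 20, 30, 40, 50, 60, 70, 80]
control = [7, 7, 0, 1, 2, 3, 4, 5]          # lanes 0 and 1 both take source lane 7
print(vperm(src, control))                   # [80, 80, 10, 20, 30, 40, 50, 60]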
Memory system and SoC including linear address remapping logic
A system-on-chip is connected to a first memory device and a second memory device. The system-on-chip comprises a memory controller configured to control an interleaving access operation on the first and second memory devices. A modem processor is configured to provide an address for accessing the first or second memory devices. A linear address remapping logic is configured to remap an address received from the modem processor and to provide the remapped address to the memory controller. The memory controller performs a linear access operation on the first or second memory device in response to receiving the remapped address.
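A plausible reading of the remapping step, with an entirely assumed address map: modem traffic aimed at one window is shifted into a window that the memory controller accesses linearly rather than interleaving it across the devices.

LINEAR_BASE = 0xC000_0000    # window the controller accesses linearly (assumed)
MODEM_BASE = 0x4000_0000     # window the modem processor targets (assumed)
REGION_SIZE = 0x0100_0000

def remap_modem_address(addr):
    # Linear address remapping logic: move a modem-processor address into the
    # linear window; other addresses pass through and stay interleaved.
    if MODEM_BASE <= addr < MODEM_BASE + REGION_SIZE:
        return addr - MODEM_BASE + LINEAR_BASE
    return addr

print(hex(remap_modem_address(0x4000_1230)))   # 0xc0001230 -> linear access
print(hex(remap_modem_address(0x8000_0000)))   # unchanged  -> interleaved access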
Configurable heterogeneous AI processor with distributed task queues allowing parallel task execution
Embodiments described herein provide a configurable heterogeneous Artificial Intelligence (AI) processor comprising at least two different architectural types of computation units, a storage unit and a controller. Each of the computation units has a respective task queue. The controller is configured to partition a computation graph of a neural network into a plurality of computation subtasks and distribute the computation subtasks to the task queues of the computation units. The controller is also configured to set a dependency among the computation subtasks, synchronize the computation subtasks according to the set dependency, and control access to data involved in the computation subtasks. Different application tasks are processed by uniformly managing and scheduling the various architectural types of computation units in an on-chip heterogeneous manner, so that the AI processor can flexibly adapt to different application scenarios.
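The queueing and synchronization idea can be sketched with a toy scheduler; the subtask names, the two unit types, and the dependency graph are invented for illustration, and the "parallel" units are stepped round-robin in software rather than running concurrently.

from collections import deque

subtasks = {                               # assumed partition of a small network
    "conv1": {"unit": "matrix", "deps": []},
    "act1":  {"unit": "vector", "deps": ["conv1"]},
    "conv2": {"unit": "matrix", "deps": ["act1"]},
    "pool":  {"unit": "vector", "deps": ["conv2"]},
}

queues = {"matrix": deque(), "vector": deque()}   # one task queue per unit type
for name, task in subtasks.items():
    queues[task["unit"]].append(name)

done = set()
while any(queues.values()):
    progressed = False
    for unit, queue in queues.items():
        # A unit starts its next queued subtask only once the subtasks it
        # depends on have completed (the controller's synchronization rule).
        if queue and all(dep in done for dep in subtasks[queue[0]]["deps"]):
            task = queue.popleft()
            print(f"{unit} unit executes {task}")
            done.add(task)
            progressed = True
    assert progressed, "dependency cycle"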