G06F9/38

Built-in self-test for a programmable vision accelerator of a system on a chip

In various examples, a VPU and associated components may be optimized to improve VPU performance and throughput. For example, the VPU may include a min/max collector, automatic store predication functionality, a SIMD data path organization that allows for inter-lane sharing, a transposed load/store with stride parameter functionality, a load with permute and zero insertion functionality, hardware, logic, and memory layout functionality to allow for two point and two by two point lookups, and per memory bank load caching capabilities. In addition, decoupled accelerators may be used to offload VPU processing tasks to increase throughput and performance, and a hardware sequencer may be included in a DMA system to reduce programming complexity of the VPU and the DMA system. The DMA and VPU may execute a VPU configuration mode that allows the VPU and DMA to operate without a processing controller for performing dynamic region based data movement operations.

Vector SIMD VLIW data path architecture

A Very Long Instruction Word (VLIW) digital signal processor is particularly adapted for single instruction multiple data (SIMD) operation on various operand widths and data sizes. A vector compare instruction compares first and second operands and stores compare bits. A companion vector conditional instruction performs conditional operations based upon the state of a corresponding predicate data register bit. A predicate unit performs data processing operations on data in at least one predicate data register, including unary operations and binary operations. The predicate unit may also transfer data between a general data register file and the predicate data register file.
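The compare-then-predicate pattern described above can be sketched in Python by modeling each vector as a list of lanes. This is a minimal illustration under stated assumptions, not the patented data path: the function names, the greater-than comparison, and the use of addition as the conditional operation are all hypothetical choices made for the example.

```python
from typing import List


def vector_compare_gt(a: List[int], b: List[int]) -> List[bool]:
    """Compare two vectors lane by lane; return one predicate bit per lane."""
    if len(a) != len(b):
        raise ValueError("operands must have the same number of lanes")
    return [x > y for x, y in zip(a, b)]


def vector_conditional_add(
    pred: List[bool], a: List[int], b: List[int], dest: List[int]
) -> List[int]:
    """Write a[i] + b[i] to lanes whose predicate bit is set; keep dest[i] elsewhere."""
    if not (len(pred) == len(a) == len(b) == len(dest)):
        raise ValueError("all vectors must have the same number of lanes")
    return [x + y if p else d for p, x, y, d in zip(pred, a, b, dest)]


def predicate_not(p: List[bool]) -> List[bool]:
    """Unary predicate operation (illustrative): invert every bit."""
    return [not bit for bit in p]


def predicate_and(p: List[bool], q: List[bool]) -> List[bool]:
    """Binary predicate operation (illustrative): lane-wise AND."""
    if len(p) != len(q):
        raise ValueError("predicates must have the same number of lanes")
    return [x and y for x, y in zip(p, q)]
```

In this sketch the predicate is an ordinary list, so the unary and binary helpers play the role of the predicate unit operating on predicate registers.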

Security enhancement in hierarchical protection domains

Methods and systems for allowing software components that operate at a specific exception level (e.g., EL-3 to EL-1) to repeatedly or continuously observe or evaluate the integrity of software components operating at a lower exception level (e.g., EL-2 to EL-0) to ensure that the software components have not been corrupted or compromised (e.g., subjected to malware, cyberattacks, etc.) include a computing device that identifies, by a component operating at a higher exception level (“HEL component”), at least one of a current vector base address (VBA), an exception raising instruction (ERI) address, or a control and system register value associated with a component operating at a lower exception level (“LEL component”). The computing device may perform a responsive action in response to determining that the current VBA, the ERI address, or the control and system register value does not match the corresponding reference data.
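The comparison against reference data can be sketched as follows. This is a minimal model, assuming the HEL component has already read the values into a plain record; the `LelState` layout, the function names, and the callback used as the responsive action are hypothetical and not taken from the source.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class LelState:
    """Values observed for one LEL component (hypothetical layout)."""
    vba: int
    eri_address: int
    control_regs: Dict[str, int] = field(default_factory=dict)


def find_mismatches(current: LelState, reference: LelState) -> List[str]:
    """Return the names of the observed values that differ from the reference data."""
    mismatches: List[str] = []
    if current.vba != reference.vba:
        mismatches.append("vba")
    if current.eri_address != reference.eri_address:
        mismatches.append("eri_address")
    for name, expected in reference.control_regs.items():
        if current.control_regs.get(name) != expected:
            mismatches.append(f"reg:{name}")
    return mismatches


def evaluate(
    current: LelState,
    reference: LelState,
    responsive_action: Callable[[List[str]], None],
) -> bool:
    """Run the check once; invoke the responsive action on any mismatch."""
    mismatches = find_mismatches(current, reference)
    if mismatches:
        responsive_action(mismatches)
        return False
    return True
```

A caller would invoke `evaluate` repeatedly, for example on a timer, to approximate the "repeatedly or continuously" observation the abstract describes.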

Scalable runtime validation for on-device design rule checks

An apparatus to facilitate scalable runtime validation for on-device design rule checks is disclosed. The apparatus includes a memory to store a contention set, one or more multiplexers, and a validator communicably coupled to the memory. In one implementation, the validator is to: receive design rule information for the one or more multiplexers, the design rule information referencing the contention set; analyze, using the design rule information, a user bitstream against the contention set at a programming time of the apparatus, the user bitstream for programming the one or more multiplexers; and provide an error indication responsive to identifying a match between the user bitstream and the contention set.
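One way to picture the validator is as a subset test between the settings in the user bitstream and each entry of the contention set. The sketch below assumes a representation the source does not specify: the bitstream is a mapping from multiplexer id to select value, and each contention entry is a set of (multiplexer id, select value) pairs that must not all be programmed together. All names are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

MuxSetting = Tuple[str, int]  # (multiplexer id, select value), an assumed encoding


def find_contention(
    user_bitstream: Dict[str, int],
    contention_set: Iterable[FrozenSet[MuxSetting]],
) -> List[FrozenSet[MuxSetting]]:
    """Return every contention entry fully matched by the user bitstream."""
    configured = set(user_bitstream.items())
    return [entry for entry in contention_set if entry <= configured]


def validate_at_programming_time(
    user_bitstream: Dict[str, int],
    contention_set: Iterable[FrozenSet[MuxSetting]],
) -> None:
    """Raise an error (the 'error indication') if any contention entry matches."""
    matches = find_contention(user_bitstream, contention_set)
    if matches:
        raise ValueError(f"bitstream matches {len(matches)} contention entries")
```

Raising an exception is only one possible error indication; a status flag or return code would fit the abstract equally well.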

Traversing a large connected component on a distributed file-based data structure

A distributed system includes multiple processing nodes. The distributed system can perform certain acts. The acts can include receiving a set of input nodes and a set of criteria. The acts can include obtaining an adjacency list representing a large connected component. The large connected component can include nodes, edges, and edge metadata. A quantity of the nodes of the large connected component can exceed 1 billion. The adjacency list can be distributed across the multiple processing nodes. The nodes of the large connected component can include the input nodes. The acts also can include performing one or more iterations of traversing the large connected component until a stopping condition is satisfied. Each iteration can include processing a set of input nodes at the multiple processing nodes using the set of criteria to generate first data at the multiple processing nodes, determining a set of output nodes such that each output node of the set of output nodes is one hop from a respective input node of the set of input nodes, consolidating the first data from the multiple processing nodes to a first processing node of the multiple processing nodes, processing the first data at the first processing node, and assigning the set of input nodes for a subsequent iteration of the one or more iterations based on the set of output nodes when the stopping condition is not satisfied. The acts further can include outputting second data based on the first data received and processed at the first processing node during the one or more iterations. Other embodiments are disclosed.
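The iteration loop above can be sketched in a single process, with a list of dictionaries standing in for the adjacency list distributed across processing nodes. Several details here are assumptions made for illustration: the stopping condition (empty frontier or an iteration cap), the exclusion of already-visited nodes, the criteria being a simple per-node predicate, and the omission of edge metadata.

```python
from typing import Callable, Dict, Iterable, List, Set

Shard = Dict[int, List[int]]  # node -> neighbors, for the nodes this shard owns


def traverse(
    shards: List[Shard],
    input_nodes: Iterable[int],
    criteria: Callable[[int], bool],
    max_iterations: int,
) -> List[int]:
    """Hop-by-hop traversal; returns the consolidated data from all iterations."""
    frontier: Set[int] = set(input_nodes)
    visited: Set[int] = set()
    second_data: List[int] = []
    for _ in range(max_iterations):
        if not frontier:  # assumed stopping condition: nothing left to visit
            break
        first_data: List[int] = []
        output_nodes: Set[int] = set()
        for shard in shards:  # stands in for parallel work on each processing node
            local = [n for n in frontier if n in shard]
            first_data.extend(n for n in local if criteria(n))
            for n in local:
                output_nodes.update(shard[n])  # one hop from an input node
        # Consolidate per-shard results at a single place (the "first processing node").
        second_data.extend(sorted(first_data))
        visited |= frontier
        frontier = output_nodes - visited  # input nodes for the next iteration
    return second_data
```

In a real deployment each shard would run on its own machine and only `first_data` and `output_nodes` would cross the network; the loop structure stays the same.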

Data path for scalable matrix node engine with mixed data formats
11556615 · 2023-01-17

A microprocessor system comprises a matrix computational unit and a control unit. The matrix computational unit includes a plurality of processing elements. The control unit is configured to provide a matrix processor instruction to the matrix computational unit. The matrix processor instruction specifies a floating-point operand formatted using a first floating-point representation format. The matrix computational unit accumulates an intermediate result value calculated using the floating-point operand. The intermediate result value is in a second floating-point representation format.
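The idea of taking operands in one floating-point format while accumulating in another can be sketched with the standard library alone. The sketch assumes IEEE half precision for the operands and single precision for the accumulator; the source does not name the two formats, and the helper names are hypothetical. Rounding is emulated by packing and unpacking with `struct`, so it is a software approximation, not a model of the hardware.

```python
import struct
from typing import Sequence


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot_mixed(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product with half-precision operands and a single-precision accumulator."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    acc = 0.0
    for x, y in zip(a, b):
        # Operands are rounded to the narrow format; the running sum
        # (the intermediate result) is kept in the wider format.
        acc = to_fp32(acc + to_fp16(x) * to_fp16(y))
    return acc
```

The wider accumulator matters when a small term is added to a large running sum: 2049 is not representable in half precision, but it survives in the single-precision accumulator.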

Detecting and recovering lost adjunct processor messages

A method, computer program product, and computer system are provided. An operating system (OS) receives a status at completion of a cryptographic adjunct processor (AP) instruction directed to an AP message queue on a cryptographic AP. The status includes a return code, a reason code, a queue full indicator, a queue empty indicator, and a count of enqueued request messages on the AP message queue. The OS determines a number of lost request messages on the AP message queue, based on the count of enqueued request messages received in the status. The OS re-enqueues the number of lost request messages to the AP message queue. The OS recovers the number of lost request messages on the AP message queue.
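The lost-message calculation can be sketched as the difference between what the OS believes is outstanding and the count reported in the status. The record layout, the function names, and especially the policy for choosing which messages to re-enqueue (here, the most recently sent ones) are assumptions; the abstract only says that the number of lost messages is derived from the reported count.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ApStatus:
    """Status fields listed in the abstract (types are assumed)."""
    return_code: int
    reason_code: int
    queue_full: bool
    queue_empty: bool
    enqueued_count: int


def count_lost(outstanding: int, status: ApStatus) -> int:
    """Messages the OS expects on the queue minus messages the AP reports."""
    if status.enqueued_count < 0:
        raise ValueError("enqueued_count must be non-negative")
    return max(0, outstanding - status.enqueued_count)


def recover(
    outstanding_requests: List[bytes],
    status: ApStatus,
    enqueue: Callable[[bytes], None],
) -> int:
    """Re-enqueue as many requests as were lost; return that number."""
    lost = count_lost(len(outstanding_requests), status)
    # Assumption: the newest requests are the ones that went missing.
    for message in outstanding_requests[len(outstanding_requests) - lost:]:
        enqueue(message)
    return lost
```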

Automated local scaling of compute instances

At a first compute instance run on a virtualization host, a local instance scaling manager is launched. The scaling manager determines, based on metrics collected at the host, that a triggering condition for redistributing one or more types of resources of the first compute instance has been met. The scaling manager causes virtualization management components to allocate a subset of the first compute instance's resources to a second compute instance at the host.
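A local scaling decision of this kind can be sketched as a threshold check followed by a transfer of resources between two instances on the same host. The metric (CPU utilization), the trigger (utilization below an idle threshold), and the resource type (vCPUs) are assumptions chosen for the example; the abstract leaves all three open.

```python
from dataclasses import dataclass


@dataclass
class ComputeInstance:
    name: str
    vcpus: int


def rebalance_if_triggered(
    first: ComputeInstance,
    second: ComputeInstance,
    first_cpu_utilization: float,
    idle_threshold: float,
    vcpus_to_move: int,
) -> bool:
    """Move vCPUs from `first` to `second` when `first` is under-utilized."""
    if first_cpu_utilization >= idle_threshold:
        return False  # triggering condition not met
    if not 0 < vcpus_to_move < first.vcpus:
        raise ValueError("must move a proper, non-empty subset of vCPUs")
    # Stands in for asking the virtualization management components to reallocate.
    first.vcpus -= vcpus_to_move
    second.vcpus += vcpus_to_move
    return True
```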

Method for selectively deploying sensors within an agricultural facility

One variation of a method for deploying sensors within an agricultural facility includes: accessing scan data of a set of modules deployed within the agricultural facility; extracting characteristics of plants occupying the set of modules from the scan data; selecting a first subset of target modules from the set of modules, each target module in the first subset of target modules containing a group of plants exhibiting characteristics representative of plants occupying modules neighboring the target module; for each target module, scheduling a robotic manipulator within the agricultural facility to remove a particular plant from a particular plant slot in the target module and load the particular plant slot with a sensor pod from a population of sensor pods deployed in the agricultural facility; and monitoring environmental conditions at target modules in the first subset of target modules based on sensor data recorded by the population of sensor pods.

Intermodal calling branch instruction
20230010863 · 2023-01-12

Processing circuitry has a handler mode and a thread mode. In response to an exception condition, a switch to handler mode is made. In response to an intermodal calling branch instruction specifying a branch target address when the processing circuitry is in the handler mode, an instruction decoder controls the processing circuitry to save a function return address to a function return address storage location; switch a current mode of the processing circuitry to the thread mode; and branch to an instruction identified by the branch target address. This can be useful for deprivileging of exceptions.
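The three effects of the instruction (save a return address, switch mode, branch) can be sketched as a small state machine. The `Core` record, the fixed instruction size used to compute the return address, and the choice to raise an error when the instruction is executed outside handler mode are all assumptions; the abstract only describes the handler-mode behavior.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    THREAD = "thread"
    HANDLER = "handler"


@dataclass
class Core:
    pc: int = 0
    mode: Mode = Mode.THREAD
    return_address: int = 0  # stands in for the function return address storage location


def take_exception(core: Core, handler_address: int) -> None:
    """An exception condition switches the core to handler mode."""
    core.mode = Mode.HANDLER
    core.pc = handler_address


def intermodal_calling_branch(core: Core, target: int, instruction_size: int = 4) -> None:
    """Save the return address, drop to thread mode, and branch to `target`."""
    if core.mode is not Mode.HANDLER:
        # Assumption: behavior outside handler mode is not covered by this sketch.
        raise RuntimeError("sketch only models execution in handler mode")
    core.return_address = core.pc + instruction_size
    core.mode = Mode.THREAD
    core.pc = target
```

Because the branch target runs in thread mode, the body of the exception handler can execute with thread-mode restrictions, which is the deprivileging use the abstract mentions.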