Patent classifications
G06F9/3897
Path simplification for computer graphics applications
Systems and methods provide for efficiently and accurately determining a simplified path that conforms to the geometry of an original path by simultaneously minimizing the deviation from the original path and reducing the number of anchor points in the simplified path. A simplified path may be iteratively generated by updating parametric values and anchor points for candidate simplified paths at epochs. A deviation in distance between points on the original path and corresponding points on candidate paths may be iteratively decreased to ensure that the resulting simplified path follows the geometry of the original path to a predetermined threshold. Continuity constraints can also be applied to ensure smoothness of the simplified path.
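To illustrate the trade-off between deviation and anchor count, the following is a minimal sketch that uses a much simpler stand-in technique: greedy removal of anchor points from a polyline while the maximum deviation stays within a threshold. It is not the method of the abstract, which updates parametric values and anchor points at epochs and can apply continuity constraints. The function names and the sample path are assumptions made for illustration only.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (all 2D tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def max_deviation(original, candidate):
    """Largest distance from any original point to the candidate polyline."""
    segments = list(zip(candidate, candidate[1:]))
    return max(
        min(point_segment_distance(p, a, b) for a, b in segments)
        for p in original
    )


def simplify(original, threshold):
    """Greedily drop interior anchor points while deviation <= threshold."""
    candidate = list(original)
    changed = True
    while changed and len(candidate) > 2:
        changed = False
        for i in range(1, len(candidate) - 1):
            trial = candidate[:i] + candidate[i + 1:]
            if max_deviation(original, trial) <= threshold:
                candidate = trial
                changed = True
                break
    return candidate


# Hypothetical sample path: the second point is nearly collinear with its neighbours.
path = [(0.0, 0.0), (1.0, 0.01), (2.0, 0.0), (3.0, 1.0)]
simplified = simplify(path, threshold=0.05)
```

With this sample input, the near-collinear point is removed and the corner at (2.0, 0.0) is kept, because dropping the corner would push the deviation past the threshold.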
Apparatus and method to switch configurable logic units
Examples described herein include systems and methods which include an apparatus comprising a plurality of configurable logic units and a plurality of switches, with each switch being coupled to at least one configurable logic unit of the plurality of configurable logic units. The apparatus further includes an instruction register configured to provide respective switch instructions of a plurality of switch instructions to each switch based on a computation to be implemented among the plurality of configurable logic units. For example, the switch instructions may include allocating the plurality of configurable logic units to perform the computation and activating an input of the switch and an output of the switch to couple at least a first configurable logic unit and a second configurable logic unit. In various embodiments, configurable logic units can include arithmetic logic units (ALUs), bit manipulation units (BMUs), and multiplier-accumulator units (MACs).
Systems and methods for virtually partitioning a machine perception and dense algorithm integrated circuit
Systems and methods for virtually partitioning an integrated circuit may include identifying dimensional attributes of a target input dataset and selecting a data partitioning scheme from a plurality of distinct data partitioning schemes for the target input dataset based on the dimensional attributes of the target dataset and architectural attributes of an integrated circuit. The methods described herein may also include disintegrating the target dataset into a plurality of distinct subsets of data based on the selected data partitioning scheme and identifying a virtual processing core partitioning scheme from a plurality of distinct processing core partitioning schemes for an architecture of the integrated circuit based on the disintegration of the target input dataset. Additionally, the architecture of the integrated circuit may be virtually partitioned into a plurality of distinct partitions of processing cores and each of the plurality of distinct subsets of data may be mapped to one of the plurality of distinct partitions of processing cores.
Synchronization amongst processor tiles
A processing system comprising an arrangement of tiles and an interconnect between the tiles. The interconnect comprises synchronization logic for coordinating a barrier synchronization to be performed between a group of the tiles. The instruction set comprises a synchronization instruction taking an operand which selects one of a plurality of available modes each specifying a different membership of the group. Execution of the synchronization instruction causes a synchronization request to be transmitted from the respective tile to the synchronization logic, and instruction issue to be suspended on the respective tile pending a synchronization acknowledgment being received back from the synchronization logic. In response to receiving the synchronization request from all the tiles in the group as specified by the operand of the synchronization instruction, the synchronization logic returns the synchronization acknowledgment to the tiles in the specified group.
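The request and acknowledgment exchange described above can be modelled in software. The following is a minimal sketch, assuming a hypothetical SyncLogic class, invented mode names ("pair", "all"), and integer tile identifiers; none of these come from the abstract, and the real mechanism is hardware in the interconnect rather than a Python object.

```python
class SyncLogic:
    """Hypothetical software model of the interconnect's synchronization logic."""

    def __init__(self, groups):
        # groups maps a mode (the instruction operand) to the tiles in that group.
        self.groups = {mode: frozenset(tiles) for mode, tiles in groups.items()}
        self.pending = {mode: set() for mode in groups}

    def request(self, tile, mode):
        """Record a sync request and return the tiles to acknowledge, if any."""
        if tile not in self.groups[mode]:
            raise ValueError(f"tile {tile} is not a member of group {mode!r}")
        self.pending[mode].add(tile)
        if self.pending[mode] == self.groups[mode]:
            # Every member has requested: release the whole group.
            self.pending[mode] = set()
            return sorted(self.groups[mode])
        # Otherwise the requesting tile stays suspended.
        return []


logic = SyncLogic({"pair": {0, 1}, "all": {0, 1, 2, 3}})
first = logic.request(0, "pair")   # tile 0 waits
second = logic.request(1, "pair")  # group complete, both tiles acknowledged
```

An empty return value stands for "instruction issue remains suspended"; a non-empty list stands for the acknowledgment returned to every tile in the selected group.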
Multiple output fusion for operations performed in a multi-dimensional array of processing units
Methods, systems, and apparatus, including instructions encoded on storage media, for performing reduction of gradient vectors and similarly structured data that are generated in parallel, for example, on nodes organized in a mesh or torus topology defined by connections in at least two dimensions between the nodes. The methods provide parallel computation and communication between nodes in the topology.
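One common way to reduce data across a two-dimensional arrangement of nodes is to reduce along one dimension and then along the other. The sketch below shows that generic dimension-wise scheme on plain Python lists. It is an assumption for illustration, not necessarily the scheme claimed in the abstract, and the function name all_reduce_2d and the sample grid are hypothetical.

```python
def all_reduce_2d(grid):
    """Sum gradient vectors over a 2D grid of nodes.

    grid[r][c] is the gradient vector (a list of floats) held by node (r, c).
    Returns a grid of the same shape in which every node holds the global sum.
    """
    rows, cols = len(grid), len(grid[0])
    # Phase 1: each row reduces independently, so rows can work in parallel.
    row_sums = [[sum(vals) for vals in zip(*grid[r])] for r in range(rows)]
    # Phase 2: the per-row results are reduced along the column dimension.
    total = [sum(vals) for vals in zip(*row_sums)]
    # Every node receives its own copy of the reduced vector.
    return [[list(total) for _ in range(cols)] for _ in range(rows)]


# Hypothetical 2 x 2 grid, each node holding a 2-element gradient vector.
grid = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[5.0, 6.0], [7.0, 8.0]],
]
reduced = all_reduce_2d(grid)
```

After the call, every node holds [16.0, 20.0], the element-wise sum of all four vectors.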
SYSTEMS AND METHODS FOR VIRTUALLY PARTITIONING A MACHINE PERCEPTION AND DENSE ALGORITHM INTEGRATED CIRCUIT
Systems and methods for virtually partitioning an integrated circuit may include identifying dimensional attributes of a target input dataset and selecting a data partitioning scheme from a plurality of distinct data partitioning schemes for the target input dataset based on the dimensional attributes of the target dataset and architectural attributes of an integrated circuit. The methods described herein may also include disintegrating the target dataset into a plurality of distinct subsets of data based on the selected data partitioning scheme and identifying a virtual processing core partitioning scheme from a plurality of distinct processing core partitioning schemes for an architecture of the integrated circuit based on the disintegration of the target input dataset. Additionally, the architecture of the integrated circuit may be virtually partitioned into a plurality of distinct partitions of processing cores and each of the plurality of distinct subsets of data may be mapped to one of the plurality of distinct partitions of processing cores.
METHODS, SYSTEMS AND APPARATUS TO IMPROVE CONVOLUTION EFFICIENCY
Methods, apparatus, systems, and articles of manufacture are disclosed to improve convolution efficiency of a convolution neural network (CNN) accelerator. An example hardware accelerator includes a hardware data path element (DPE) in a DPE array, the hardware DPE including an accumulator, and a multiplier coupled to the accumulator, the multiplier to multiply first inputs including an activation value and a filter coefficient value to generate a first convolution output when the hardware DPE is in a convolution mode, and a controller coupled to the DPE array, the controller to adjust the hardware DPE from the convolution mode to a pooling mode by causing at least one of the multiplier or the accumulator to generate a second convolution output based on second inputs, the second inputs including an output location value of a pool area, at least one of the first inputs different from at least one of the second inputs.
Processor Repair
A processor comprises at least one delay stage for each processing circuit and switching circuitry for selectively switching the delay stage into or out of a communication path involved in message exchange. For processing circuits up to a defective processing circuit in the column, the delay stage is switched into the communication path, and for processing circuits above the defective processing circuit in the column, including a repairing processing circuit which repairs the defective processing circuit, the delay stage is switched out of the communication path, whereby the fixed transmission time of processing circuits is preserved in the event of a repair of the column.
System and method for acquiring data
A method of acquiring data, a computer program product for implementing the method, a system for acquiring data, and a vehicle including the system. The method includes determining one or more data types and virtual channels required for one or more applications. The method also includes allocating a plurality of circular buffers in memory according to the determined data type(s) and virtual channel(s). One or more of the circular buffers are allocated to safety data lines. The remaining circular buffers are allocated to functional data lines. The method further includes storing at least one functional data line in a circular buffer allocated to functional data lines according to a data type and virtual channel of the functional data line. The method also includes storing at least one safety data line in a circular buffer allocated to safety data lines.
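The buffer allocation and routing steps above can be sketched as follows. This is a minimal illustration that uses collections.deque with a maxlen as the circular buffer. The function names, the dictionary layout of a data line, the capacities, and the "RAW12" data type label are all assumptions and do not come from the abstract.

```python
from collections import deque


def allocate_buffers(requirements, safety_capacity=4, functional_capacity=16):
    """Allocate one circular buffer per (data_type, virtual_channel) pair,
    plus one circular buffer reserved for safety data lines."""
    buffers = {("safety",): deque(maxlen=safety_capacity)}
    for data_type, channel in requirements:
        key = ("functional", data_type, channel)
        buffers[key] = deque(maxlen=functional_capacity)
    return buffers


def store_line(buffers, line):
    """Route a data line to the buffer that matches its kind."""
    if line["safety"]:
        key = ("safety",)
    else:
        key = ("functional", line["data_type"], line["virtual_channel"])
    # A full deque with maxlen silently drops its oldest entry on append.
    buffers[key].append(line["payload"])


# Hypothetical application that needs one data type on virtual channel 0.
buffers = allocate_buffers([("RAW12", 0)], safety_capacity=2, functional_capacity=2)
for payload in (b"a", b"b", b"c"):
    store_line(buffers, {"safety": False, "data_type": "RAW12",
                         "virtual_channel": 0, "payload": payload})
store_line(buffers, {"safety": True, "payload": b"s"})
```

Because the functional buffer holds only two entries in this example, the third line overwrites the oldest one, which is the wrap-around behaviour expected of a circular buffer.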
PROCESS EXECUTION ORDER DETERMINING PROGRAM AND PROCESS EXECUTION ORDER DETERMINING METHOD
The present disclosure provides a process execution order determining program and a process execution order determining method for supporting work in determining the execution order of a plurality of processes when designing the plurality of processes to be executed in controlled devices. The process execution order determining program causes a computer to execute a decision process for determining an execution order of a plurality of processes to be executed in controlled devices; and a first determination process for monitoring elements that appear in a connection path formed in the decision process, and determining whether the same elements appear in the connection path.
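The first determination process amounts to checking whether an element occurs more than once along a connection path. A minimal sketch of such a check is shown below, assuming the path is available as an ordered list of hashable element identifiers. The function name and the element names are hypothetical.

```python
def has_repeated_element(path):
    """Return True if any element appears more than once in the path."""
    seen = set()
    for element in path:
        if element in seen:
            return True
        seen.add(element)
    return False


# Hypothetical connection paths built while deciding an execution order.
loop_free = has_repeated_element(["sensor", "filter", "controller", "actuator"])
looping = has_repeated_element(["sensor", "filter", "controller", "filter"])
```

A repeated element suggests that the path revisits a process, which a design tool might flag as a loop in the execution order.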