Patent classifications
G06F15/7885
IN-FLIGHT INTEGRATED MODULAR AVIONICS (IMA) RECONFIGURATION
A real-time computer network application in the aeronautic industry for in-flight reconfiguring of the integrated modular avionics modules onboard aircraft, via spatial and temporal partitioning of modules conforming a data processing system including a computer-implemented method for in-flight Integrated Modular Avionics reconfiguration, a data processing system for the mentioned reconfiguration, a computer program product and a computer-readable medium, both to perform such reconfiguration.
Analyzing data using a hierarchical structure
Apparatus, systems, and methods for analyzing data are described. The data can be analyzed using a hierarchical structure. One such hierarchical structure can comprise a plurality of layers, where each layer performs an analysis on input data and provides an output based on the analysis. The output from lower layers in the hierarchical structure can be provided as inputs to higher layers. In this manner, lower layers can perform a lower level of analysis (e.g., more basic/fundamental analysis), while a higher layer can perform a higher level of analysis (e.g., more complex analysis) using the outputs from one or more lower layers. In an example, the hierarchical structure performs pattern recognition.
Lossless tiling in convolution networks—padding before tiling, location-based tiling, and zeroing-out
Disclosed is a data processing system to receive a processing graph of an application. A compile time logic is configured to modify the processing graph and generate a modified processing graph. The modified processing graph is configured to apply a post-padding tiling after applying a cumulative input padding that confines padding to an input. The cumulative input padding pads the input into a padded input. The post-padding tiling tiles the padded input into a set of pre-padded input tiles with a same tile size, tiles intermediate representation of the input into a set of intermediate tiles with a same tile size, and tiles output representation of the input into a set of non-overlapping output tiles with a same tile size. Runtime logic is configured with the compile time logic to execute the modified processing graph to execute the application.
Analyzing data using a hierarchical structure
Apparatus, systems, and methods for analyzing data are described. The data can be analyzed using a hierarchical structure. One such hierarchical structure can comprise a plurality of layers, where each layer performs an analysis on input data and provides an output based on the analysis. The output from lower layers in the hierarchical structure can be provided as inputs to higher layers. In this manner, lower layers can perform a lower level of analysis (e.g., more basic/fundamental analysis), while a higher layer can perform a higher level of analysis (e.g., more complex analysis) using the outputs from one or more lower layers. In an example, the hierarchical structure performs pattern recognition.
Lossless Tiling in Convolution Networks - Materialization of Tensors
Disclosed is a data processing system that includes a plurality of reconfigurable processors and processor memory. Runtime logic, operatively coupled to the plurality of reconfigurable processors and the processor memory, is configured to configure at least one reconfigurable processor in the plurality of reconfigurable processors with a first subgraph in a sequence of subgraphs of a graph; load an input onto the processor memory; on a tile-by-tile basis, process a first set of input tiles from the input through the first subgraph and generate a first set of intermediate tiles, load the first set of intermediate tiles onto the processor memory, and process the first set of intermediate tiles through the first subgraph and generate a first set of output tiles; and compose output tiles in the first set of output tiles into a first composed input, and load the first composed input onto the processor memory.
Reconfigurable Parallel Processing
Processors, systems and methods are provided for thread level parallel processing. A processor may comprise a plurality of processing elements (PEs) that each may comprise a configuration buffer, a sequencer coupled to the configuration buffer of each of the plurality of PEs and configured to distribute one or more PE configurations to the plurality of PEs, and a gasket memory coupled to the plurality of PEs and being configured to store at least one PE execution result to be used by at least one of the plurality of PEs during a next PE configuration.
Multi-headed multi-buffer for buffering data for processing
An integrated circuit includes a plurality of configurable units, each configurable unit having two or more corresponding sections. The plurality of configurable units is arranged in a serial arrangement to form a chain of sections of the configurable units. A data bus is connected to the plurality of configurable units which communicates data at a clock rate. The chain of sections is to receive and write a series of tensors at the clock rate at a first end section of the chain of sections, and sequentially propagate the series of tensors through individual sections within the chain of sections at the clock rate. The chain of sections is to output the series of tensors at a second end section of the chain of sections. The chain of sections is to also output the series of tensors at an intermediate section of the chain of sections.
Compiler for a command-aware hardware architecture
In an embodiment, a compiler for generating command bundles is configured to receive an execution definition that includes operations for execution. The compiler determines an ordered set of hardware functions corresponding to a hardware architecture to execute at least one operation. The hardware architecture may be selected from typical processor types or a command-aware hardware processor. The compiler generates a command bundle that includes a set of logically independent commands based on hardware functions and functionality of the hardware architecture to optimize execution of the operations. A command-aware hardware processor includes a hardware routing mesh that includes sets of routing nodes that form one or more hardware pipelines. Many hardware pipelines may be included in the hardware routing mesh. A command bundle is transmitted through a selected hardware pipeline via a control path, and is modified by the routing nodes based on execution of commands to achieve a desired outcome.
Fast Argument Load in a Reconfigurable Data Processor
A reconfigurable processor includes an array of configurable units connected by a bus system. Each configurable unit has a configuration data store, organized as a shift register, to store configuration data. The configuration data store also includes individually addressable argument registers respectively made up of word-sized portions of the shift register to provide arguments to the configurable unit. The configurable unit also includes program load logic shift data into the configuration data store, and argument load logic to directly load data into the argument registers without shifting the received argument data through the shift register. A program load controller is associated with the array to respond to a program load command by executing a program load process, and a fast argument load (FAL) controller is associated with the array to respond to an FAL command by executing an FAL process.
Reconfigurable parallel processing
Processors, systems and methods are provided for thread level parallel processing. A processor may comprise a plurality of processing elements (PEs) that each may comprise a configuration buffer, a sequencer coupled to the configuration buffer of each of the plurality of PEs and configured to distribute one or more PE configurations to the plurality of PEs, and a gasket memory coupled to the plurality of PEs and being configured to store at least one PE execution result to be used by at least one of the plurality of PEs during a next PE configuration.