Patent classifications
G06F9/3881
ELECTRONIC APPARATUS AND METHOD FOR REDUCING NUMBER OF COMMANDS
An electronic apparatus and a method for reducing the number of commands are provided. The electronic apparatus includes a central processor and a co-processor. The central processor generates a plurality of original register setting commands to set at least one bit of at least one register of the co-processor. The original register setting commands include a plurality of first original register setting commands, and a plurality of setting targets of the first original register setting commands have address continuity. The central processor merges the first original register setting commands to generate at least one merged register setting command. The central processor transmits the at least one merged register setting command to the co-processor.
Data processing method and apparatus, and related product
The present disclosure provides a data processing method and an apparatus and a related product. The products include a control module including an instruction caching unit, an instruction processing unit, and a storage queue unit. The instruction caching unit is configured to store computation instructions associated with an artificial neural network operation; the instruction processing unit is configured to parse the computation instructions to obtain a plurality of operation instructions; and the storage queue unit is configured to store an instruction queue, where the instruction queue includes a plurality of operation instructions or computation instructions to be executed in the sequence of the queue. By utilizing the above-mentioned method, the present disclosure can improve the operation efficiency of related products when performing operations of a neural network model.
Processor, accelerator, and direct memory access controller within a processor core that each reads/writes a local synchronization flag area for parallel execution
It is provided a processor system comprising at least one processor core including a processor, a memory and an accelerator. The memory includes an instruction area, a synchronization flag area and a data area. The accelerator starts, even if the processor is executing another processing, acceleration processing and executes read instruction in a case where the read instruction is a flag checking instruction and a flag indicating the completion of predetermined processing has been written; and stores the data subjected to the acceleration processing after completion of the acceleration processing, and further writes a flag indicating the completion of the acceleration processing. The processor starts, even if the accelerator is executing another processing, read instruction corresponding to a flag in a case where the read instruction is the flag checking instruction and it is confirmed that the flag indicating the completion of the acceleration processing has been written.
ACCELERATION CIRCUITRY FOR POSIT OPERATIONS
Systems, apparatuses, and methods related to acceleration circuitry for posit operations are described. Signaling indicative of performance of an operation to write a first bit string to a first buffer resident on acceleration circuitry and a second bit string resident on the acceleration circuitry can be received at an DMA controller couplable to the acceleration circuitry. The acceleration circuitry can be configured to perform arithmetic operations, logical operations, or both on bit strings formatted in a unum or posit format. Signaling indicative of an arithmetic operation, a logical operation, or both, to be performed using the first and second bit strings can be transmitted to the acceleration circuitry. The arithmetic operation, the logical operation, or both can be performed via the acceleration circuitry and according to the signaling. Signaling indicative of a result of the arithmetic operation, the logical operation, or both can be transmitting to the DMA controller.
Method, apparatus and system for data stream processing with a programmable accelerator
Techniques and mechanisms for programming an accelerator device to enable performance of a data processing algorithm. In an embodiment, an accelerator of a computer platform is programmed based on programming information received from a host processor of the computer platform. In another embodiment, programming of the accelerator is to enable data driven execution of an instruction by a data stream processing engine of the accelerator.
Vector processor storage
A method comprising: receiving, at a vector processor, a request to store data; performing, by the vector processor, one or more transforms on the data; and directly instructing, by the vector processor, one or more storage device to store the data; wherein performing one or more transforms on the data comprises: erasure encoding the data to generate n data fragments configured such that any k of the data fragments are usable to regenerate the data, where k is less than n; and wherein directly instructing one or more storage device to store the data comprises: directly instructing the one or more storage devices to store the plurality of data fragments.
COMPUTING MACHINE, METHOD AND NON-TRANSITORY COMPUTER-READABLE MEDIUM
A computing machine according to the present disclosure includes: control managing means for controlling communication between a plurality of computing machines; and management register means for managing communication setting information which sets communication between the computing machines and communication state information which indicates a state of the communication. The computing machine includes edge control means for, upon receiving a signal from one of the computing machines, sorting the signal into a control signal and data based on the communication setting information set in the management register means and in accordance with the number of clocks for processing the signal.
Coprocessor context priority
A system may include a plurality of processors and a coprocessor. A plurality of coprocessor context priority registers corresponding to a plurality of contexts supported by the coprocessor may be included. The plurality of processors may use the plurality of contexts, and may program the coprocessor context priority register corresponding to a context with a value specifying a priority of the context relative to other contexts. An arbiter may arbitrate among instructions issued by the plurality of processors based on the priorities in the plurality of coprocessor context priority registers. In one embodiment, real-time threads may be assigned higher priorities than bulk processing tasks, improving bandwidth allocated to the real-time threads as compared to the bulk tasks.
ROBUST, EFFICIENT MULTIPROCESSOR-COPROCESSOR INTERFACE
Systems and methods for an efficient and robust multiprocessor-coprocessor interface that may be used between a streaming multiprocessor and an acceleration coprocessor in a GPU are provided. According to an example implementation, in order to perform an acceleration of a particular operation using the coprocessor, the multiprocessor: issues a series of write instructions to write input data for the operation into coprocessor-accessible storage locations, issues an operation instruction to cause the coprocessor to execute the particular operation; and then issues a series of read instructions to read result data of the operation from coprocessor-accessible storage locations to multiprocessor-accessible storage locations.
STREAMING ENGINE WITH ERROR DETECTION, CORRECTION AND RESTART
Disclosed embodiments relate to a streaming engine employed in, for example, a digital signal processor. A fixed data stream sequence including plural nested loops is specified by a control register. The streaming engine includes an address generator producing addresses of data elements and a steam head register storing data elements next to be supplied as operands. The streaming engine fetches stream data ahead of use by the central processing unit core in a stream buffer. Parity bits are formed upon storage of data in the stream buffer which are stored with the corresponding data. Upon transfer to the stream head register a second parity is calculated and compared with the stored parity. The streaming engine signals a parity fault if the parities do not match. The streaming engine preferably restarts fetching the data stream at the data element generating a parity fault.