G06F13/4013

Layered vector architecture compatibility for cross-system portability

An application that includes intrinsics defined in one architecture is to execute without change on a different architecture. Program code that depends on vector element ordering is obtained, and that program code is part of an application including one or more intrinsics. The one or more intrinsics are mapped from a first system architecture for which the application was written to a second system architecture. One or more operations of the program code are then converted from a first data layout to a second data layout. The application, including the mapped intrinsics and the converted data layout, is to be executed on a processor of the different architecture.
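
As a minimal sketch of the data layout conversion described above, assume a vector is modeled as a Python list and that the two layouts differ only in how elements are numbered (element 0 at one end in the first layout, at the other end in the second). The names `to_second_layout` and `map_extract_index` are hypothetical and are not taken from the abstract.

```python
def to_second_layout(vector):
    """Convert a vector from the first data layout to the second.

    Assumption: the layouts differ only in element order, so the
    conversion is a reversal of the elements.
    """
    return list(reversed(vector))


def map_extract_index(index, lane_count):
    """Map an element index used by a first-architecture intrinsic to
    the index that selects the same value in the second layout."""
    return lane_count - 1 - index


first = [10, 20, 30, 40]
second = to_second_layout(first)

# Element 1 in the first layout and its mapped index in the second
# layout refer to the same value.
same = first[1] == second[map_extract_index(1, len(first))]
```

Under these assumptions, code that depends on element ordering keeps its meaning as long as every index it uses goes through the same mapping as the data.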

MULTI-PACKET PROCESSING WITH ORDERING RULE ENFORCEMENT
20180089124 · 2018-03-29

A system includes an input/output adapter operable to receive a plurality of packets in a single clock cycle. The system includes a controller operatively connected to the input/output adapter. The controller is operable to receive a first packet at a data link layer and determine a state of a first output indicator to maintain packet ordering. Based on determining that a first receiver formatting interface is selected by the first output indicator, the controller performs an alignment adjustment and output of the first packet by the first receiver formatting interface. Based on determining that a second receiver formatting interface is selected by the first output indicator, the controller performs the alignment adjustment and output of the first packet by the second receiver formatting interface.

DATA ITEM ORDER RESTORATION
20180081624 · 2018-03-22

An apparatus and a corresponding method for processing a sequence of received data items are disclosed. The processing is performed by multiple processing elements. A reorder buffer comprising multiple slots is used to maintain the order of the received data items: a processing element reserves the next available slot in the reorder buffer before it begins processing the next data item of the sequence. On completion of the processing, the processing element reads a buffer change indicator value when seeking to insert the processed data item into the reserved slot. If the buffer change indicator changes during the course of the insertion process, this indicates to the processing element that another processing element is modifying the content of the reorder buffer in parallel. A check may then be repeated for at least one subsequent already-processed data item, since that data item may have become ready to be retired from the reorder buffer.
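
The slot reservation and in-order retirement can be sketched as follows. This is a simplified, single-threaded illustration: it omits the buffer change indicator, which only matters when several processing elements modify the buffer in parallel. The class and method names are hypothetical.

```python
class ReorderBuffer:
    """Circular reorder buffer: slots are reserved in arrival order and
    retired in that same order, whatever the completion order."""

    def __init__(self, size):
        self.size = size
        self.slots = [None] * size
        self.ready = [False] * size
        self.head = 0   # oldest slot not yet retired
        self.tail = 0   # next slot to reserve
        self.count = 0  # number of reserved, unretired slots

    def reserve(self):
        """Reserve the next available slot before processing starts."""
        if self.count == self.size:
            raise RuntimeError("reorder buffer full")
        slot = self.tail
        self.tail = (self.tail + 1) % self.size
        self.count += 1
        return slot

    def complete(self, slot, item):
        """Insert a processed item, then retire whatever is now ready."""
        self.slots[slot] = item
        self.ready[slot] = True
        return self.retire()

    def retire(self):
        """Retire ready items from the head, repeating the check for
        each subsequent slot that has already been filled."""
        retired = []
        while self.count and self.ready[self.head]:
            retired.append(self.slots[self.head])
            self.slots[self.head] = None
            self.ready[self.head] = False
            self.head = (self.head + 1) % self.size
            self.count -= 1
        return retired


rb = ReorderBuffer(4)
a, b, c = rb.reserve(), rb.reserve(), rb.reserve()
out_c = rb.complete(c, "C")  # finished first, but must wait for A and B
out_a = rb.complete(a, "A")  # head of the buffer, retires immediately
out_b = rb.complete(b, "B")  # retires, and releases the waiting C
```

Completing the item at the head can release several later items at once, which is why the retire check loops rather than running once.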

SHUFFLER CIRCUIT FOR LANE SHUFFLE IN SIMD ARCHITECTURE

Techniques are described to perform a shuffle operation. Rather than using an all-lane to all-lane cross bar, a shuffler circuit having a smaller cross bar is described. The shuffler circuit performs the shuffle operation piecewise by reordering data received from processing lanes and outputting the reordered data.
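
One way to picture the piecewise shuffle is sketched below, assuming the smaller crossbar sees one bank of `bank_width` source lanes per pass and each destination lane picks up its value during the pass that covers its source. The function name and the bank-per-pass scheme are illustrative assumptions, not details taken from the abstract.

```python
def shuffle_piecewise(lanes, src_index, bank_width):
    """Compute out[d] = lanes[src_index[d]] one source bank at a time.

    Each pass only routes data from `bank_width` source lanes, which is
    all a crossbar of that width could handle in one step.
    """
    n = len(lanes)
    out = [None] * n
    for base in range(0, n, bank_width):
        bank = lanes[base:base + bank_width]  # data visible in this pass
        for dest in range(n):
            offset = src_index[dest] - base
            if 0 <= offset < len(bank):       # source lies in this bank
                out[dest] = bank[offset]
    return out


lanes = list("abcdefgh")
reverse = [7, 6, 5, 4, 3, 2, 1, 0]
result = shuffle_piecewise(lanes, reverse, bank_width=4)
```

The result matches a full all-lane to all-lane shuffle; the trade-off in this sketch is more passes in exchange for a narrower crossbar.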

Power consumption control based on random bus inversion
20240411716 · 2024-12-12

An electronic device includes circuitry and a plurality of ports. The plurality of ports includes an input port and an output port, configured to communicate data units with one or more other devices across a fabric of a System on a Chip (SoC), each data unit including N data bits, N being an integer larger than 1. The circuitry is configured to receive an input data unit via the input port, to make a random decision of whether to invert the N data bits in the input data unit, to produce an output data unit by retaining or inverting the N data bits of the input data unit based on the random decision, and to send the output data unit via the output port.
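
The random inversion step can be sketched as below. The abstract does not say how a receiver learns whether a unit was inverted; returning an explicit `inverted` flag next to the data is an assumption made here so that the example can round-trip. The function names are hypothetical.

```python
import random


def random_bus_invert(data, n_bits, rng=random):
    """Randomly retain or invert the N data bits of one data unit.

    Returns (output_bits, inverted). The flag is an assumption of this
    sketch, not something stated in the abstract.
    """
    mask = (1 << n_bits) - 1
    inverted = rng.random() < 0.5          # the random decision
    output = (~data & mask) if inverted else (data & mask)
    return output, inverted


def restore(output, inverted, n_bits):
    """Undo the inversion on the receiving side."""
    mask = (1 << n_bits) - 1
    return (~output & mask) if inverted else output


unit = 0b1011_0001
sent, flag = random_bus_invert(unit, n_bits=8)
received = restore(sent, flag, n_bits=8)
```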

Methods, apparatus, and articles of manufacture to reorder N-dimensional sparse data into groups of data elements that can be collocated in a memory

Exemplary embodiments maintain spatial locality of the data being processed by a sparse CNN by reordering the data. The reordering may be performed on individual data elements and on groups of co-located data elements, referred to herein as chunks. Thus, the data may be reordered into chunks, where each chunk contains data for spatially co-located data elements, and the chunks themselves may be organized so that spatially co-located chunks are kept together. The use of chunks helps to reduce the need to re-fetch data during processing. Chunk sizes may be chosen based on the memory constraints of the processing logic (e.g., cache sizes).
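
A minimal sketch of chunked reordering follows, assuming sparse data is a list of `(coordinate, value)` pairs and that a chunk is the cube of side `chunk_size` containing the coordinate. Sorting by chunk first and by coordinate second is one simple choice of ordering; the abstract does not prescribe a specific one.

```python
def reorder_into_chunks(points, chunk_size):
    """Reorder sparse N-dimensional data so that elements of the same
    chunk are adjacent and chunks appear in coordinate order."""

    def sort_key(item):
        coord, _value = item
        chunk = tuple(c // chunk_size for c in coord)  # owning chunk
        return (chunk, coord)

    return sorted(points, key=sort_key)


points = [((5, 5), "a"), ((0, 1), "b"), ((4, 4), "c"), ((1, 0), "d")]
ordered = reorder_into_chunks(points, chunk_size=4)
values = [value for _coord, value in ordered]
```

With `chunk_size=4`, the elements at (0, 1) and (1, 0) share chunk (0, 0) and end up next to each other, as do the two elements in chunk (1, 1).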

SEMICONDUCTOR DEVICE AND DATA PROCESSING SYSTEM SELECTIVELY OPERATING AS ONE OF A BIG ENDIAN OR LITTLE ENDIAN SYSTEM
20170345401 · 2017-11-30

The invention provides a semiconductor device whose endianness can be switched correctly from the outside even when the outside does not know the current endianness of the parallel interface. The semiconductor device includes a switching circuit and a first register. The switching circuit selects whether the parallel interface with the outside is used as big endian or little endian. The first register holds the control data of the switching circuit. The switching circuit treats the parallel interface as little endian when first predetermined control information is supplied to the first register, and as big endian when second predetermined control information is supplied to the first register; in both cases the values at specific bit positions of the control information are unchanged even if its high-order and low-order bit positions are transposed. The control information can therefore be input correctly regardless of the current endian setting.
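
The idea of control information that survives a transposition can be sketched as follows, assuming a 16-bit register whose halves may arrive swapped and a select bit mirrored at bit 0 and bit 8. The register width, the bit positions, and the two control values are illustrative assumptions, not values from the abstract.

```python
def swap_halves(value, width=16):
    """Transpose the high-order and low-order halves of a word, as an
    interface with the wrong endian setting would."""
    half = width // 2
    mask = (1 << half) - 1
    return ((value & mask) << half) | (value >> half)


# Hypothetical control words: the select bit is mirrored in both bytes,
# so swapping the bytes leaves the words unchanged.
LITTLE_CTRL = 0x0101
BIG_CTRL = 0x0000


def decode_endian(register_value):
    """Read the select bit at its two mirrored positions (bit 0, bit 8)."""
    low = register_value & 1
    high = (register_value >> 8) & 1
    if low != high:
        return None  # not one of the transposition-safe control words
    return "little" if low else "big"


seen_straight = decode_endian(LITTLE_CTRL)
seen_swapped = decode_endian(swap_halves(LITTLE_CTRL))
```

Both reads decode to the same setting, which is the property that lets the outside program the register without knowing the current endianness.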

Endian configuration memory and ECC protecting processor endianess mode circuit

An electronic circuit includes a microcontroller processor (410), a peripheral (420) coupled with the processor, an endian circuit (470) coupled with the processor and the peripheral to selectively provide different endianness modes of operation, and a detection circuit (140) to detect a failure to select a given endianness, whereby an inadvertent switch of endianness due to faults is avoided. Other circuits, devices, systems, methods of operation and processes of manufacture are also disclosed.

Methods for generating a synthetic backup and for consolidating a chain of backups independent of endianness

Systems and methods for generating synthetic backups and for consolidating a chain of related backups. The chain of related backups is merged on the fly to create a backup stream. A block allocation table (BAT) may be identified for each backup to be consolidated into a synthetic backup, and BAT entries from each backup may be merged or combined to create a new BAT table associated with the synthetic backup. The data included in the related backups may be reformatted on the fly from big endian to little endian or vice versa. The backup stream is stored on a target device or volume.
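
The BAT merge and the endian reformatting can be sketched as below. The 32-bit entry width, the `UNUSED` marker value, and the rule that a newer backup's entry overrides an older one are assumptions made for illustration; the abstract does not specify them.

```python
import struct

UNUSED = 0xFFFFFFFF  # hypothetical marker for "block not present"


def merge_bats(bats):
    """Merge BATs ordered oldest to newest into one table.

    Assumption: a newer backup's entry for a block replaces the older
    one unless the newer entry is UNUSED.
    """
    merged = [UNUSED] * max(len(bat) for bat in bats)
    for bat in bats:
        for index, entry in enumerate(bat):
            if entry != UNUSED:
                merged[index] = entry
    return merged


def swap_bat_endianness(raw, big_to_little=True):
    """Reformat packed 32-bit BAT entries between big and little endian."""
    if len(raw) % 4:
        raise ValueError("BAT length must be a multiple of 4 bytes")
    count = len(raw) // 4
    src, dst = (">", "<") if big_to_little else ("<", ">")
    entries = struct.unpack(f"{src}{count}I", raw)
    return struct.pack(f"{dst}{count}I", *entries)


merged = merge_bats([[1, UNUSED, 3], [UNUSED, 5, UNUSED]])
big = bytes.fromhex("0000000112345678")
little = swap_bat_endianness(big)
```

In a real synthetic backup, merged entries would also have to be rewritten to point at the blocks' locations in the new stream; this sketch leaves that step out.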