Patent classifications
G06F15/781
Method and apparatus to sort a vector for a bitonic sorting algorithm
A method is provided that includes performing, by a processor in response to a vector sort instruction, sorting of values stored in lanes of the vector to generate a sorted vector, wherein the values in a first portion of the lanes are sorted in a first order indicated by the vector sort instruction and the values in a second portion of the lanes are sorted in a second order indicated by the vector sort instruction; and storing the sorted vector in a storage location.
Method and apparatus for dual issue multiply instructions
A method is provided that includes performing, by a processor in response to a dual issue multiply instruction, multiplication of operands of the dual issue multiply instruction using multiplication units comprised in a data path of the processor and configured to operate together to determine a product of the operands, and storing, by the processor, the product in a storage location indicated by the dual issue multiply instruction.
Method and apparatus for vector permutation
A method is provided that includes performing, by a processor in response to a vector permutation instruction, permutation of values stored in lanes of a vector to generate a permuted vector, wherein the permutation is responsive to a control storage location storing permute control input for each lane of the permuted vector, wherein the permute control input corresponding to each lane of the permuted vector indicates a value to be stored in the lane of the permuted vector, wherein the permute control input for at least one lane of the permuted vector indicates a value of a selected lane of the vector is to be stored in the at least one lane, and storing the permuted vector in a storage location indicated by an operand of the vector permutation instruction.
Configurable peripherals
A system-on-chip (SOC), includes a memory, a partition access module coupled to the memory, a partition requesting unit coupled to the partition access module, and a first input-output (IO) device coupled to the partition access module. The partition access module creates a first partition of the SOC. The first partition includes a first portion of a first processor, the first IO device, and a first portion of the memory. Based upon a partition request, the partition access module repartitions the SOC to create a dynamic partition. The dynamic partition includes the first portion of the first processor, the first input-output (IO) device, the first portion of the memory, and a second IO device not included in the first partition.
DSB Operation with Excluded Region
Techniques are disclosed relating to data synchronization barrier operations. A system includes a first processor that may receive a data barrier operation request from a second processor include in the system. Based on receiving that data barrier operation request from the second processor, the first processor may ensure that outstanding load/store operations executed by the first processor that are directed to addresses outside of an exclusion region have been completed. The first processor may respond to the second processor that the data barrier operation request is complete at the first processor, even in the case that one or more load/store operations that are directed to addresses within the exclusion region are outstanding and not complete when the first processor responds that the data barrier operation request is complete.
Symbiotic network on layers
The technology relates to a system on chip (SoC). The SoC may include a plurality of network layers which may assist electrical communications either horizontally or vertically among components from different device layers. In one embodiment, a system on chip (SoC) includes a plurality of network layers, each network layer including one or more routers, and more than one device layers, each of the plurality of network layers respectively bonded to one of the device layers. In another embodiment, a method for forming a system on chip (SoC) includes forming a plurality of network layers in an interconnect, wherein each network layer is bonded to an active surface of a respective device layer in a plurality of device layer.
Common platform for one-level memory architecture and two-level memory architecture
A processor includes a first memory interface to be coupled to a plurality of memory module sockets located off-package, a second memory interface to be coupled to a non-volatile memory (NVM) socket located off-package, and a multi-level memory controller (MLMC). The MLMC is to: control the memory modules disposed in the plurality of memory module sockets as main memory in a one-level memory (1LM) configuration; detect a switch from a 1LM mode of operation to a two-level memory (2LM) mode of operation in response to a basic input/output system (BIOS) detection of a low-power memory module disposed in one of the memory module sockets and a NVM device disposed in the NVM socket in a 2LM configuration; and control the low-power memory module as cache in the 2LM configuration in response to detection of the switch from the 1LM mode of operation to the 2LM mode of operation.
HIGHLY INTEGRATED SCALABLE, FLEXIBLE DSP MEGAMODULE ARCHITECTURE
Disclosed embodiments include an electronic device having a processor core, a memory, a register, and a data load unit to receive a plurality of data elements stored in the memory in response to an instruction. All of the data elements hare the same data size, which is specified by one or more coding bits. The data load unit includes an address generator to generate addresses corresponding to locations in the memory at which the data elements are located, and a formatting unit to format the data elements. The register is configured to store the formatted data elements, and the processor core is configured to receive the formatted data elements from the register.
Multiple transaction data flow control unit for high-speed interconnect
Methods, apparatus, and systems, for transporting data units comprising multiple pieces of transaction data over high-speed interconnects. A flow control unit, called a KTI (Keizer Technology Interface) Flit, is implemented in a coherent multi-layer protocol supporting coherent memory transactions. The KTI Flit has a basic format that supports use of configurable fields to implement KTI Flits with specific formats that may be used for corresponding transactions. In one aspect, the KTI Flit may be formatted as multiple slots used to support transfer of multiple respective pieces of transaction data in a single Flit. The KTI Flit can also be configured to support various types of transactions and multiple KTI Flits may be combined into packets to support transfer of data such as cache line transfers.
Supporting access to accelerators on a programmable integrated circuit by multiple host processes
Supporting multiple clients on a single programmable integrated circuit (IC) can include implementing a first image within the programmable IC in response to a first request for processing to be performed by the programmable IC, wherein the request is from a first process executing in a host data processing system coupled to the programmable IC, receiving, using a processor of the host data processing system, a second request for processing to be performed on the programmable IC from a second and different process executing in the host data processing system while the programmable IC still implements the first image, comparing, using the processor, a second image specified by the second request to the first image, and, in response to determining that the second image matches the first image based on the comparing, granting, using the processor, the second request for processing to be performed by the programmable IC.