G06F9/35

Apparatus and method for performing operations on capability metadata
11481384 · 2022-10-25

An apparatus is provided comprising storage elements to store data blocks, where each data block has capability metadata associated therewith identifying whether the data block specifies a capability, at least one capability type being a bounded pointer. Processing circuitry is arranged to be responsive to a bulk capability metadata operation identifying a plurality of the storage elements, to perform an operation on the capability metadata associated with each data block stored in the plurality of storage elements. A single specified operation thus enables query and/or modification operations to be performed on multiple items of capability metadata, providing more efficient access to such capability metadata.
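
As a rough software model of this idea (not the claimed circuitry), the sketch below keeps one tag bit per data block and exposes single calls that query or clear the tags of a whole range of blocks. The class name `TaggedMemory` and the methods `bulk_query_tags` and `bulk_clear_tags` are hypothetical names chosen for illustration, and packing the queried tags into a bitmask is an assumption.

```python
class TaggedMemory:
    """Toy model: each storage element holds a data block plus a 1-bit capability tag."""

    def __init__(self, num_blocks):
        self.blocks = [0] * num_blocks
        self.tags = [False] * num_blocks  # True means the block holds a capability

    def store(self, index, value, is_capability=False):
        self.blocks[index] = value
        self.tags[index] = is_capability

    def bulk_query_tags(self, start, count):
        """Return the tags of `count` consecutive blocks packed into one bitmask (assumed format)."""
        mask = 0
        for offset in range(count):
            if self.tags[start + offset]:
                mask |= 1 << offset
        return mask

    def bulk_clear_tags(self, start, count):
        """Clear the tags of `count` consecutive blocks with one call."""
        for offset in range(count):
            self.tags[start + offset] = False


mem = TaggedMemory(8)
mem.store(1, 0x1000, is_capability=True)
mem.store(3, 0x2000, is_capability=True)
mask = mem.bulk_query_tags(0, 4)  # bits 1 and 3 are set
```

One call covers many blocks, which is the efficiency argument the abstract makes against touching each item of capability metadata separately.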

Hardware-implemented universal floating-point instruction set architecture for computing directly with human-readable decimal character sequence floating-point representation operands
11635957 · 2023-04-25

A universal floating-point Instruction Set Architecture (ISA) compute engine implemented entirely in hardware. The ISA compute engine computes directly with human-readable decimal character sequence floating-point representation operands without first having to explicitly perform a conversion-to-binary-format process in software. A fully pipelined convertToBinaryFromDecimalCharacter hardware operator logic circuit converts one or more human-readable decimal character sequence floating-point representations to IEEE 754-2008 binary floating-point representations every clock cycle. Following computations by at least one hardware floating-point operator, a convertToDecimalCharacterFromBinary hardware conversion circuit converts the result back to a human-readable decimal character sequence floating-point representation.
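
The data flow described above (decimal characters in, binary floating-point computation, decimal characters out) can be imitated in software as a minimal sketch. The function names below only echo the operator names in the abstract; they are hypothetical Python stand-ins, not the hardware operators, and the sketch assumes Python's `float`, which is an IEEE 754 binary64 value on mainstream platforms.

```python
import struct


def convert_to_binary_from_decimal_character(text):
    """Parse a human-readable decimal string into a binary64 value."""
    return float(text)


def convert_to_decimal_character_from_binary(value):
    """Render a binary64 value as its shortest round-tripping decimal string."""
    return repr(value)


def binary64_bits(value):
    """Expose the raw 64-bit encoding of a binary64 value."""
    return struct.unpack(">Q", struct.pack(">d", value))[0]


def add_decimal_strings(a, b):
    """Decimal strings in, binary addition in the middle, decimal string out."""
    x = convert_to_binary_from_decimal_character(a)
    y = convert_to_binary_from_decimal_character(b)
    return convert_to_decimal_character_from_binary(x + y)


result = add_decimal_strings("1.5", "2.25")
```

Here the conversions are explicit software calls; the abstract's point is that the hardware performs them so that programs need no such conversion step.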

SAVING AND RESTORING REGISTERS
20230069266 · 2023-03-02

There is provided a data processing apparatus comprising a plurality of registers, each of the registers having data bits to store data and metadata bits to store metadata. Each of the registers is adapted to operate in a metadata mode in which the metadata bits and the data bits are valid, and a data mode in which the data bits are valid and the metadata bits are invalid. Mode bit storage circuitry indicates whether each of the registers is in the data mode or the metadata mode. Execution circuitry is responsive to a memory operation that is a store operation on one or more given registers.
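
The register organization described above can be pictured with a small model. This is only a sketch of the stated structure (data bits, metadata bits, and a per-register mode bit). The class and method names are hypothetical, and the rule that a plain data write puts a register into data mode is an assumption. The behavior of the store operation is not modeled because the abstract does not state it here.

```python
from enum import Enum


class Mode(Enum):
    DATA = 0      # data bits valid, metadata bits invalid
    METADATA = 1  # data bits and metadata bits both valid


class RegisterFile:
    def __init__(self, num_registers):
        self.data = [0] * num_registers
        self.metadata = [0] * num_registers
        # Stand-in for the mode bit storage circuitry: one mode per register.
        self.mode = [Mode.DATA] * num_registers

    def write_data(self, reg, value):
        # Assumption: a plain data write leaves the register in data mode.
        self.data[reg] = value
        self.mode[reg] = Mode.DATA

    def write_with_metadata(self, reg, value, metadata):
        self.data[reg] = value
        self.metadata[reg] = metadata
        self.mode[reg] = Mode.METADATA

    def read_metadata(self, reg):
        # Metadata bits are only meaningful in metadata mode.
        if self.mode[reg] is not Mode.METADATA:
            return None
        return self.metadata[reg]


regs = RegisterFile(4)
regs.write_with_metadata(0, 0x1000, 0b101)
regs.write_data(1, 42)
```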

APPARATUS AND METHOD FOR CAPABILITY-BASED PROCESSING

Apparatus comprises a processor to execute program instructions stored at respective memory addresses, processing of the program instructions being constrained by a prevailing capability defining at least access permissions to a set of one or more memory addresses; the processor comprising: control flow change handling circuitry to perform a control flow change operation, the control flow change operation defining a control flow change target address indicating the address of a program instruction for execution after the control flow change operation; and capability generating circuitry to determine, in dependence on the control flow change target address, an address at which capability access permissions data is stored; the capability generating circuitry being configured to retrieve the capability access permissions data and to generate a capability for use as a next prevailing capability in dependence upon at least the capability access permissions data.
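
One way to picture the lookup is a table of permissions entries indexed by the region that contains the control flow change target. The sketch below is illustrative only: the table base `TABLE_BASE`, the 4 KiB granule, the `Capability` fields, and the dictionary standing in for memory are all assumptions, not details taken from the abstract.

```python
from dataclasses import dataclass

GRANULE_SHIFT = 12   # assumption: one permissions entry per 4 KiB region
TABLE_BASE = 0x8000  # assumption: where the permissions entries are stored


@dataclass(frozen=True)
class Capability:
    base: int
    limit: int
    permissions: frozenset


def permissions_address(target_address):
    """Derive the address of the permissions entry from the branch target."""
    return TABLE_BASE + (target_address >> GRANULE_SHIFT)


def generate_capability(memory, target_address):
    """Fetch the permissions entry and build the next prevailing capability."""
    entry = memory[permissions_address(target_address)]
    base = (target_address >> GRANULE_SHIFT) << GRANULE_SHIFT
    return Capability(base, base + (1 << GRANULE_SHIFT), frozenset(entry))


# A dictionary stands in for memory; the target 0x3010 lies in region 3.
memory = {permissions_address(0x3010): {"execute", "read"}}
cap = generate_capability(memory, 0x3010)
```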

Memory system architecture for multi-threaded processors

Disclosed embodiments relate to an improved memory system architecture for multi-threaded processors. In one example, a system comprises a multi-threaded processor core (MTPC), the MTPC comprising: P pipelines, each to concurrently process T threads; a crossbar to communicatively couple the P pipelines; a memory for use by the P pipelines; a scheduler to optimize reduction operations by assigning multiple threads to generate results of commutative arithmetic operations, and then accumulate the generated results; and a memory controller (MC) to connect with external storage and other MTPCs, the MC further comprising at least one optimization selected from: an instruction set architecture including a dual-memory operation; a direct memory access (DMA) engine; a buffer to store multiple pending instruction cache requests; multiple channels across which to stripe memory requests; and a shadow-tag coherency management unit.
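
The reduction strategy mentioned above (several threads each produce a partial result of a commutative operation, and the partial results are then accumulated) can be sketched in software. This is an analogy using Python's standard `concurrent.futures` thread pool, not the hardware scheduler; `parallel_reduce` and its parameters are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_reduce(values, num_threads, op, identity):
    """Reduce `values` with `op`, which must be commutative and associative."""
    # Strided split: element order is not preserved, hence the commutativity requirement.
    chunks = [values[i::num_threads] for i in range(num_threads)]

    def reduce_chunk(chunk):
        acc = identity
        for value in chunk:
            acc = op(acc, value)
        return acc

    # Each worker thread generates one partial result.
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        partials = list(pool.map(reduce_chunk, chunks))

    # Accumulate the partial results into the final value.
    total = identity
    for partial in partials:
        total = op(total, partial)
    return total


total = parallel_reduce(list(range(1, 101)), 4, lambda a, b: a + b, 0)
```

Because the elements are regrouped and reordered across threads, the result matches a sequential reduction only for operations such as integer addition, where operand order and grouping do not matter.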

MULTI-TABLE INSTRUCTION PREFETCH UNIT FOR MICROPROCESSOR
20230205543 · 2023-06-29

A method, programming product, and/or system for prefetching instructions includes an instruction prefetch table that has a plurality of entries, each entry for storing a first portion of an indirect branch instruction address and a target address, wherein the indirect branch instruction has multiple target addresses and the instruction prefetch table is accessed by an index obtained by hashing a second portion of bits of the indirect branch instruction address with an information vector of the indirect branch instruction. A further embodiment includes a first prefetch table for uni-target branch instructions and a second prefetch table for multi-target branch instructions. In operation it is determined whether a branch instruction hits in one of the multiple prefetch tables; a target address for the branch instruction is read from the respective prefetch table in which the branch instruction hit; and the branch instruction is prefetched to an instruction cache.
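
The indexing scheme can be illustrated with a small model: one portion of the branch address serves as a tag, another portion is hashed with an information vector to pick the table entry. In the sketch below the table size, the XOR hash, the low-bits/high-bits split, and the names `PrefetchTable` and `prefetch_target` are all assumptions made for illustration.

```python
INDEX_BITS = 4
TABLE_SIZE = 1 << INDEX_BITS


def split_address(addr):
    """Split a branch address into a tag (first portion) and index bits (second portion)."""
    return addr >> INDEX_BITS, addr & (TABLE_SIZE - 1)


class PrefetchTable:
    def __init__(self):
        self.entries = [None] * TABLE_SIZE

    def _index(self, index_part, info_vector):
        # Assumed hash: XOR the index bits with the information vector.
        return (index_part ^ info_vector) & (TABLE_SIZE - 1)

    def insert(self, addr, target, info_vector=0):
        tag, index_part = split_address(addr)
        self.entries[self._index(index_part, info_vector)] = (tag, target)

    def lookup(self, addr, info_vector=0):
        tag, index_part = split_address(addr)
        entry = self.entries[self._index(index_part, info_vector)]
        if entry is not None and entry[0] == tag:
            return entry[1]  # hit: return the stored target address
        return None


def prefetch_target(uni_table, multi_table, addr, info_vector):
    """Check the uni-target table first, then the multi-target table."""
    target = uni_table.lookup(addr)
    if target is None:
        target = multi_table.lookup(addr, info_vector)
    return target


uni, multi = PrefetchTable(), PrefetchTable()
uni.insert(0x40, 0x100)
multi.insert(0x54, 0x200, info_vector=0b0011)
multi.insert(0x54, 0x300, info_vector=0b0101)
```

Hashing with the information vector lets the same multi-target branch at `0x54` occupy two entries, one per target.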

CIRCUITRY AND METHODS FOR IMPLEMENTING CAPABILITIES USING NARROW REGISTERS
20230195461 · 2023-06-22

Systems, methods, and apparatuses for implementing capabilities using narrow registers are described. In certain examples, a hardware processor core comprises a capability management circuit to check a capability for a memory access request, the capability comprising an address field, a validity field, and a bounds field that is to indicate a lower bound and an upper bound of an address space to which the capability authorizes access; a decoder circuit to decode a single instruction into a decoded single instruction, the single instruction comprising fields to indicate a memory address that stores the capability and a single destination register, and an opcode to indicate that an execution circuit is to load a first proper subset of the capability from the memory address into the single destination register and load a second proper subset of the capability from the memory address into an implicit second destination register; and the execution circuit to execute the decoded single instruction according to the opcode.
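
A minimal sketch of the load described above: one instruction names a memory address and one destination register, and the capability is split across that register and an implicit second register. The 64-bit register width, the 128-bit packed layout, and the choice of `dest + 1` as the implicit second register are assumptions for illustration, and `load_capability` is a hypothetical name.

```python
REG_BITS = 64
REG_MASK = (1 << REG_BITS) - 1


def load_capability(registers, memory, mem_addr, dest):
    """Load one wide capability from memory into two narrow registers."""
    capability = memory[mem_addr]
    # First proper subset goes to the named destination register.
    registers[dest] = capability & REG_MASK
    # Second proper subset goes to the implicit register (assumed to be dest + 1).
    registers[dest + 1] = (capability >> REG_BITS) & REG_MASK


# Assumed layout: low 64 bits hold the address field, high 64 bits hold the rest.
memory = {0x100: (0xABCD << REG_BITS) | 0x1234}
registers = [0] * 8
load_capability(registers, memory, 0x100, 2)
```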