Patent classifications
G06F9/3004
CONTENT-ADDRESSABLE PROCESSING ENGINE
A content-addressable processing engine, also referred to herein as CAPE, is provided. Processing-in-memory (PIM) architectures attempt to overcome the von Neumann bottleneck by combining computation and storage logic into a single component. CAPE is a general-purpose PIM microarchitecture that accelerates vector operations while remaining programmable with standard reduced instruction set computing (RISC) instructions, such as RISC-V instructions with standard vector extensions. CAPE can be implemented as a standalone core that specializes in associative computing and that can be integrated in a tiled multicore chip alongside other types of compute engines. Certain embodiments of CAPE achieve average speedups of 14× (up to 254×) over an area-equivalent out-of-order processor core tile with three levels of caches across a diverse set of representative applications.
Accelerator circuit for mathematical operations with immediate values table
Embodiments of the present disclosure relate to an accelerator circuit with a dynamic immediate values table (IVT). The accelerator circuit includes an instruction memory, a data memory, and a vector circuit with the IVT storing multiple immediate values at multiple entries. The vector circuit reads a subset of instructions from the instruction memory, each instruction including at least one corresponding pointer to at least one corresponding entry in the IVT. The vector circuit further receives a subset of input data from the data memory corresponding to the subset of instructions. The vector circuit performs a respective operation in accordance with each instruction from the subset of instructions using a corresponding data vector of the received subset of input data identified in each instruction and at least one corresponding immediate value from the IVT pointed by the at least one corresponding pointer to generate corresponding output data.
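The pointer indirection described above can be illustrated with a minimal software sketch. Everything here is an assumption made for illustration: the `Instruction` fields, the two opcodes `add` and `mul`, and the list-based memories are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Instruction:
    op: str       # hypothetical opcode: "add" or "mul"
    src: int      # index of the input data vector in data memory
    imm_ptr: int  # pointer to an entry in the immediate values table (IVT)


def run_vector_circuit(program: List[Instruction],
                       data_memory: List[List[int]],
                       ivt: List[int]) -> List[List[int]]:
    """Apply each instruction to its data vector using the IVT entry it points to."""
    outputs = []
    for ins in program:
        vec = data_memory[ins.src]
        imm = ivt[ins.imm_ptr]  # the immediate is looked up, not encoded inline
        if ins.op == "add":
            outputs.append([x + imm for x in vec])
        elif ins.op == "mul":
            outputs.append([x * imm for x in vec])
        else:
            raise ValueError(f"unknown opcode: {ins.op}")
    return outputs
```

Because instructions carry only a pointer, changing an IVT entry changes the immediate used by every instruction that references it without rewriting the instruction memory.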
Matrix barcode having a plurality of colors and an ultraviolet layer for conveying spatial information
A matrix barcode on a surface may comprise a plurality of colors and an ultraviolet layer. The matrix barcode may be a fiducial marker for conveying spatial information. The conveyed spatial information may stem at least in part from the ultraviolet layer.
Extended asynchronous data mover functions compatibility indication
A method is provided that is executable by a processor of a computer. The processor is communicatively coupled to a memory of the computer, and the memory stores a response block of a call command. In implementing the method, the processor defines a sub-functions field in the response block of the call command. Further, the processor indicates that a set of functions of a set of instructions is installed and available at an interface based on a corresponding sub-functions flag within the sub-functions field being set. The interface is also executed on the computer, and the set of functions is represented by the corresponding sub-functions flag. The processor further indicates that the set of functions of the set of instructions is not installed based on the corresponding sub-functions flag not being set.
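The flag test described above might be sketched as follows. The abstract does not define the layout of the sub-functions field, so this sketch assumes it is a plain integer bitmask in which bit `flag_bit` represents one set of functions; the helper names are hypothetical.

```python
def set_sub_functions_flag(field: int, flag_bit: int) -> int:
    """Return the field with the given flag set (assumed bitmask layout)."""
    return field | (1 << flag_bit)


def is_installed(field: int, flag_bit: int) -> bool:
    """A set of functions is reported as installed only if its flag is set."""
    return bool((field >> flag_bit) & 1)
```

A caller would read the sub-functions field from the response block and query each flag before using the corresponding functions.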
Computer-readable recording medium recording appearance frequency calculation program, information processing apparatus, and appearance frequency calculation method
A recording medium recording an appearance frequency calculation program for causing an information processing apparatus to execute processing includes: construction processing of constructing thread groups each including threads; acquisition processing in which the thread group acquires a data group including a same number of pieces of data as a number of threads constituting the thread group, each thread being responsible for one piece of data of the data group; and addition processing in which the thread adds one to a first storage area that stores an appearance frequency of a first numerical value, and, when the first numerical value is duplicated in the data group, a duplication number indicating the number of duplicates is added to the first storage area by the thread that is the sole representative thread, within the thread group, of the threads responsible for the data of the first numerical value.
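A sequential simulation can make the representative-thread idea concrete. This is a sketch under assumptions: threads are modeled as lanes of a Python list, the representative is taken to be the lowest-numbered lane holding a given value (the abstract does not say how it is chosen), and the function name is hypothetical.

```python
from typing import List


def count_frequencies(data: List[int], group_size: int, num_values: int) -> List[int]:
    """Histogram of values in data, updated once per distinct value per group."""
    hist = [0] * num_values
    for start in range(0, len(data), group_size):
        group = data[start:start + group_size]  # one piece of data per lane
        for lane, value in enumerate(group):
            # Lanes in this group that hold the same value as this lane.
            same = [i for i, v in enumerate(group) if v == value]
            # Assumed rule: the lowest such lane is the representative.
            if same[0] == lane:
                # Adds 1 for a unique value, or the duplication number otherwise.
                hist[value] += len(same)
    return hist
```

On a real parallel device the benefit is that only one thread per distinct value touches the shared counter, which reduces contending updates within a group.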
Methods and systems for invalidating memory ranges in fabric-based architectures
Embodiments of the invention include a machine-readable medium having stored thereon at least one instruction, which, if performed by a machine, causes the machine to perform a method that includes decoding, with a node, an invalidate instruction; and executing, with the node, the invalidate instruction for invalidating a memory range specified across a fabric interconnect.
Matrix data broadcast architecture
Systems, apparatuses, and methods for efficient parallel execution of multiple work units in a processor by reducing a number of memory accesses are disclosed. A computing system includes a processor core with a parallel data architecture. The processor core executes a software application with matrix operations. The processor core supports the broadcast of shared data to multiple compute units of the processor core. A compiler or other code assigns thread groups to compute units based on detecting shared data among the compute units. Rather than send multiple read accesses to a memory subsystem for the shared data, the processor core generates a single access request. The single access request includes information to identify the multiple compute units for receiving the shared data when broadcasted by the processor core.
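The merging of reads into a single broadcast request can be sketched as below. The request format is an assumption: each request is modeled as an `(address, mask)` pair in which bit `i` of `mask` marks compute unit `i` as a receiver. The abstract only says the request carries information identifying the receiving compute units.

```python
from typing import Dict, List, Tuple


def build_requests(reads: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Collapse (compute_unit, address) reads into one request per address."""
    masks: Dict[int, int] = {}
    for compute_unit, address in reads:
        # OR the unit's bit into the mask for this address.
        masks[address] = masks.get(address, 0) | (1 << compute_unit)
    # Dicts preserve insertion order, so requests follow first-seen order.
    return list(masks.items())
```

In this model, three compute units reading the same address produce one memory access instead of three.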
METHOD AND APPARATUS FOR IMPLIED BIT HANDLING IN FLOATING POINT MULTIPLICATION
A method is provided that includes performing, by a processor in response to a floating point multiply instruction, multiplication of floating point numbers, wherein determination of values of implied bits of leading bit encoded mantissas of the floating point numbers is performed in parallel with multiplication of the encoded mantissas, and storing, by the processor, a result of the floating point multiply instruction in a storage location indicated by the floating point multiply instruction.
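One way to see why the implied bits need not be known before the multiplication starts is the identity $(i_a 2^n + f_a)(i_b 2^n + f_b) = f_a f_b + (i_a f_b + i_b f_a)2^n + i_a i_b 2^{2n}$, where $f$ is the encoded fraction, $i$ the implied bit, and $n$ the fraction width. The sketch below follows that identity. It is an illustration, not the claimed circuit: it assumes an IEEE-754-like encoding in which the implied bit is 1 when the exponent field is nonzero, and the function name is hypothetical.

```python
def mantissa_product(exp_a: int, frac_a: int,
                     exp_b: int, frac_b: int,
                     frac_bits: int = 23) -> int:
    """Full mantissa product built from the encoded-fraction product."""
    # Can start immediately: depends only on the encoded mantissas.
    partial = frac_a * frac_b
    # In hardware these would be determined in parallel with the multiply.
    # Assumption: implied bit is 0 for a zero exponent field, else 1.
    implied_a = 1 if exp_a != 0 else 0
    implied_b = 1 if exp_b != 0 else 0
    # Correction terms that fold the implied bits into the product.
    cross = (implied_a * frac_b + implied_b * frac_a) << frac_bits
    top = (implied_a * implied_b) << (2 * frac_bits)
    return partial + cross + top
```

Exponent handling, rounding, and normalization are omitted; only the mantissa path is modeled.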
System of Multiple Stacks in a Processor Devoid of an Effective Address Generator
In one implementation devoid of an effective address generator, a method of call operation comprises pushing one or more parameters onto a first stack, pushing the contents of one or more registers onto a second stack, popping off the first stack one or more of the parameters into one or more of the registers whose contents were pushed onto the second stack, performing register-to-register operations on the one or more registers whose contents were pushed onto the second stack with a result of the register-to-register operations being stored in a result register, the result register being one of the one or more registers whose contents were pushed onto the second stack, popping off the second stack the contents of all the one or more registers into their respective registers from which they came, and returning control to an instruction following the call.
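The call sequence above can be traced with a small simulation. Several details are assumptions: the register names `r0` and `r1`, the restriction to two parameters, and in particular the step that pushes the result back onto the first stack before the registers are restored. The abstract does not say how the result survives the restore, so that step is only one plausible reading.

```python
from typing import Callable, Dict


def call_binary(regs: Dict[str, int], a: int, b: int,
                op: Callable[[int, int], int]) -> int:
    """Simulate a two-parameter call using a parameter stack and a save stack."""
    first_stack = []   # parameters
    second_stack = []  # saved register contents

    first_stack.append(a)
    first_stack.append(b)
    for name in ("r0", "r1"):
        second_stack.append(regs[name])      # save caller's registers

    regs["r1"] = first_stack.pop()           # b
    regs["r0"] = first_stack.pop()           # a
    regs["r0"] = op(regs["r0"], regs["r1"])  # register-to-register operation

    first_stack.append(regs["r0"])           # assumption: hand result back here
    for name in ("r1", "r0"):                # restore in reverse order
        regs[name] = second_stack.pop()
    return first_stack.pop()
```

No memory address is computed at any point; every transfer is a push, a pop, or a register-to-register operation.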
DETECTION OF IMAGES IN RELATION TO TARGETS BASED ON COLORSPACE TRANSFORMATION TECHNIQUES AND UTILIZING ULTRAVIOLET LIGHT
Techniques are provided to improve detection and security of images, including formation and detection of matrix-based images. Some techniques include logic to process image data, generate one or more colorspaces associated with that data, and perform colorspace conversions based on the generated colorspace. The logic may be further configured to generate an image based on the colorspace conversions, including but not limited to a matrix barcode. The logic may be further configured to apply one or both of an ultraviolet layer and an infrared layer to the image, e.g., a matrix barcode, generated from the colorspace conversion(s). Other embodiments are described and claimed.