Patent classifications
G06F12/0891
SELECTIVE MEMORY ENCRYPTION
In one example in accordance with the present disclosure, a method may include receiving, by a processor on a system on a chip (SoC), a request to encrypt a subset of data accessed by a process. The method may also include receiving, at a page encryption hardware unit of the SoC, a system call from an operating system on behalf of the process, to generate an encrypted memory page corresponding to the subset of data. The method may also include generating, by the page encryption hardware unit, an encryption/decryption key for a physical memory address. The encryption/decryption key may not be accessible by the operating system. The method may also include encrypting, by the page encryption hardware unit, the subset of data to the physical memory address using the encryption/decryption key and storing, by the page encryption hardware unit, the encryption/decryption key in a key store.
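The flow in this abstract (generate a per-page key, encrypt to a physical address, keep the key in a store the operating system cannot read) can be illustrated with a small software model. This is only a sketch under stated assumptions: the class PageEncryptionUnit, its method names, and the dictionary-backed memory are hypothetical, and the SHA-256 counter keystream is a toy stand-in for whatever cipher the hardware unit would use. It is not secure and is not taken from the patent.

```python
import hashlib
import secrets


class PageEncryptionUnit:
    """Toy model of a page encryption unit (hypothetical names throughout)."""

    def __init__(self):
        # Key store: physical address -> key. Private to the unit; the
        # "operating system" in this model only ever sees self.memory.
        self._key_store = {}
        # Simulated physical memory: physical address -> ciphertext bytes.
        self.memory = {}

    @staticmethod
    def _keystream(key, length):
        # Toy keystream built from SHA-256 in counter mode. Illustration only.
        out = bytearray()
        counter = 0
        while len(out) < length:
            out += hashlib.sha256(key + counter.to_bytes(8, "little")).digest()
            counter += 1
        return bytes(out[:length])

    def encrypt_page(self, phys_addr, plaintext):
        # Generate a fresh key for this physical address, encrypt the data
        # to that address, then record the key in the key store.
        key = secrets.token_bytes(32)
        stream = self._keystream(key, len(plaintext))
        self.memory[phys_addr] = bytes(p ^ s for p, s in zip(plaintext, stream))
        self._key_store[phys_addr] = key

    def decrypt_page(self, phys_addr):
        # The same key decrypts, since XOR with the keystream is its own inverse.
        key = self._key_store[phys_addr]
        ciphertext = self.memory[phys_addr]
        stream = self._keystream(key, len(ciphertext))
        return bytes(c ^ s for c, s in zip(ciphertext, stream))
```

In this model only the selected subset of data is passed to encrypt_page; any other page would be written to memory unchanged, which is the "selective" part of the title.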
Graphics processors and graphics processing units having dot product accumulate instruction for hybrid floating point format
Described herein is a graphics processing unit (GPU) comprising a first processing cluster to perform parallel processing operations, the parallel processing operations including a ray tracing operation and a matrix multiply operation; and a second processing cluster coupled to the first processing cluster, wherein the first processing cluster includes a floating-point unit to perform floating point operations, the floating-point unit is configured to process an instruction using a bfloat16 (BF16) format with a multiplier to multiply second and third source operands while an accumulator adds a first source operand with output from the multiplier.
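The multiply-then-accumulate behaviour described above can be modelled in software. The sketch below assumes that the second and third source operands are rounded to BF16 (round to nearest, ties to even) and that the first source operand and the result are held in FP32; the abstract does not state the accumulator width, so that choice and all function names are hypothetical. NaN and overflow handling are left out.

```python
import struct


def to_bf16_bits(x):
    """Round a float to bfloat16 (nearest, ties to even); return 16 bits."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Adding 0x7FFF plus the lowest kept bit implements ties-to-even.
    rounding = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding) >> 16) & 0xFFFF


def from_bf16_bits(b):
    """Expand a 16-bit bfloat16 pattern to a Python float."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]


def to_f32(x):
    """Round a Python float to FP32 precision (assumed accumulator format)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def bf16_fma(src0, src1, src2):
    """dst = src0 + src1 * src2, with src1 and src2 rounded to BF16."""
    product = from_bf16_bits(to_bf16_bits(src1)) * from_bf16_bits(to_bf16_bits(src2))
    return to_f32(src0 + product)


def bf16_dot_accumulate(acc, xs, ys):
    """Accumulate a dot product of two equal-length sequences onto acc."""
    for x, y in zip(xs, ys):
        acc = bf16_fma(acc, x, y)
    return acc
```

For example, bf16_fma(1.0, 2.0, 3.0) evaluates 1.0 + 2.0 * 3.0 with both multiplicands first rounded to BF16.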
METHODS AND APPARATUS TO FACILITATE READ-MODIFY-WRITE SUPPORT IN A COHERENT VICTIM CACHE WITH PARALLEL DATA PATHS
Methods, apparatus, systems and articles of manufacture are disclosed to facilitate read-modify-write support in a coherent victim cache with parallel data paths. An example apparatus includes a random-access memory configured to be coupled to a central processing unit via a first interface and a second interface, the random-access memory configured to obtain a read request indicating a first address to read via a snoop interface, an address encoder coupled to the random-access memory, the address encoder to, when the random-access memory indicates a hit of the read request, generate a second address corresponding to a victim cache based on the first address, and a multiplexer coupled to the victim cache to transmit a response including data obtained from the second address of the victim cache.
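The snoop read path in this abstract (tag lookup, address encoding on a hit, then a data read from the victim cache) can be sketched as a small model. It assumes a fully associative tag memory and treats the "second address" as the index of the matching entry; both are assumptions for illustration, and the class and method names are hypothetical.

```python
class VictimCacheSnoopModel:
    """Toy model of a snoop read against a victim cache (hypothetical)."""

    def __init__(self, num_entries):
        # Tag memory: one stored first address per victim cache entry.
        self.tag_ram = [None] * num_entries
        # Victim cache data storage, indexed by the encoded second address.
        self.victim_data = [None] * num_entries

    def install(self, entry, address, data):
        # Place a line into a chosen entry (placement policy not modelled).
        self.tag_ram[entry] = address
        self.victim_data[entry] = data

    def snoop_read(self, first_address):
        # Compare the snooped address against every tag (assumed fully
        # associative), producing a one-hot hit vector.
        hits = [tag == first_address for tag in self.tag_ram]
        if not any(hits):
            return None  # miss: no response from the victim cache
        # Address encoder: turn the one-hot hit vector into an entry index.
        second_address = hits.index(True)
        # Multiplexer: select the data at that entry for the response.
        return self.victim_data[second_address]
```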
INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND PROGRAM
Provided are an information processing device and an information processing method that execute system call processing with improved processing efficiency without compromising security level. A kernel as a data processor that carries out system call execution control determines reliability of an application that executes system call invocation and reliability of processing data, and selects and executes either a safety-oriented system call A or a throughput-oriented system call B according to a result of the determination. With the safety-oriented system call A, confirmation of permission to execute a system call and cache flush are executed, but with the throughput-oriented system call B, the confirmation of permission to execute a system call and the cache flush are skipped.
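The selection between the two system call variants can be sketched as a dispatcher. The abstract does not give the exact decision rule, so the sketch assumes variant B is chosen only when both the application and the data are judged reliable, and it assumes the cache flush follows execution; the Request type and the step labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Request:
    app_trusted: bool       # result of the application reliability check
    data_trusted: bool      # result of the processing-data reliability check
    permitted: bool = True  # whether the caller may execute this system call


def dispatch(req):
    """Return the ordered list of steps the kernel model would perform."""
    if req.app_trusted and req.data_trusted:
        # Throughput-oriented system call B: permission check and cache
        # flush are skipped.
        return ["select B", "execute"]
    # Safety-oriented system call A: confirm permission, execute, flush.
    steps = ["select A", "check permission"]
    if not req.permitted:
        steps.append("deny")
        return steps
    steps += ["execute", "cache flush"]
    return steps
```

A trusted application working on trusted data therefore takes the two-step path, while any other combination pays for the permission check and the flush.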
AGGRESSIVE WRITE FLUSH SCHEME FOR A VICTIM CACHE
A caching system including a first sub-cache and a second sub-cache in parallel with the first sub-cache, wherein the second sub-cache includes: line type bits configured to store an indication that a corresponding cache line of the second sub-cache is configured to store write-miss data, and an eviction controller configured to evict a cache line of the second sub-cache storing write-miss data based on an indication that the cache line has been fully written.
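The eviction rule described above can be modelled with a per-line byte-valid mask: a line marked as write-miss data is evicted as soon as every byte in it has been written. The 8-byte line size, the class names, and the list used to record evictions are assumptions made for this sketch and do not come from the abstract.

```python
LINE_BYTES = 8                       # hypothetical line size for the sketch
FULL_MASK = (1 << LINE_BYTES) - 1    # all bytes of a line written


class WriteMissLine:
    def __init__(self, base):
        self.base = base
        self.is_write_miss = True    # line type bit: holds write-miss data
        self.data = bytearray(LINE_BYTES)
        self.byte_valid = 0          # one bit per byte written so far


class SecondSubCache:
    """Toy model of the sub-cache that buffers write-miss data."""

    def __init__(self):
        self.lines = {}
        self.evicted = []            # (base address, data) sent downstream

    def write_miss(self, addr, payload):
        base = addr - addr % LINE_BYTES
        offset = addr - base
        if offset + len(payload) > LINE_BYTES:
            raise ValueError("write crosses a line boundary")
        line = self.lines.setdefault(base, WriteMissLine(base))
        for i, byte in enumerate(payload):
            line.data[offset + i] = byte
            line.byte_valid |= 1 << (offset + i)
        # Eviction controller: flush a write-miss line once fully written.
        if line.is_write_miss and line.byte_valid == FULL_MASK:
            self.evicted.append((base, bytes(line.data)))
            del self.lines[base]
```

Writing the first half of a line leaves it resident; writing the second half completes the mask and triggers the eviction.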
ZERO BITS IN L3 TAGS
In one embodiment, a microprocessor, comprising: plural cores, each of the cores comprising a level 1 (L1) cache and a level 2 (L2) cache; and a shared level 3 (L3) cache comprising plural L3 tag array entries, wherein a first portion of the plural L3 tag array entries is associated with data and a second portion of the plural L3 tag array entries is decoupled from data, wherein each L3 tag array entry comprises tag information and data zero information, the data zero information indicating whether any data associated with the tag information is known to be zero or not.
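The idea of a tag entry carrying data zero information can be sketched as follows. The model assumes that a line known to be all zero is tracked by a tag entry with no data array slot (a decoupled entry) and that a read of such a line is answered without touching the data array. That pairing, the 64-byte line size, and every name below are assumptions for illustration; the abstract states only that each entry holds tag information and data zero information.

```python
from dataclasses import dataclass
from typing import Optional

LINE_BYTES = 64  # hypothetical line size


@dataclass
class L3TagEntry:
    tag: int
    data_zero: bool             # True when the line is known to be all zero
    data_index: Optional[int]   # None when the entry is decoupled from data


class L3CacheModel:
    def __init__(self):
        self.tags = {}
        self.data_array = []
        self.data_reads = 0     # counts accesses to the data array

    def fill(self, tag, line):
        if not any(line):
            # All-zero line: record it in the tag only, with no data slot.
            self.tags[tag] = L3TagEntry(tag, True, None)
        else:
            self.data_array.append(bytes(line))
            self.tags[tag] = L3TagEntry(tag, False, len(self.data_array) - 1)

    def read(self, tag):
        entry = self.tags.get(tag)
        if entry is None:
            return None                 # L3 miss
        if entry.data_zero:
            return bytes(LINE_BYTES)    # answered from the tag alone
        self.data_reads += 1
        return self.data_array[entry.data_index]
```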