Patent classifications
G06F12/0808
Mergeable counter system and method
A system includes a first counter configured to increment or decrement in response to a triggering event. The first counter is sized to overflow. The system also includes a second counter configured to increment or decrement in response to a triggering event. The first counter and the second counter are merged to form a third counter in response to detecting an overflow triggering event for the first counter. A merge bit indicative of whether the first counter and the second counter are merged changes value in response to merging the first counter and the second counter.
Mergeable counter system and method
A system includes a first counter configured to increment or decrement in response to a triggering event. The first counter is sized to overflow. The system also includes a second counter configured to increment or decrement in response to a triggering event. The first counter and the second counter are merged to form a third counter in response to detecting an overflow triggering event for the first counter. A merge bit indicative of whether the first counter and the second counter are merged changes value in response to merging the first counter and the second counter.
Cache system with a primary cache and an overflow cache that use different indexing schemes
A cache memory system including a primary cache and an overflow cache that are searched together using a search address. The overflow cache operates as an eviction array for the primary cache. The primary cache is addressed using bits of the search address, and the overflow cache is addressed by a hash index generated by a hash function applied to bits of the search address. The hash function operates to distribute victims evicted from the primary cache to different sets of the overflow cache to improve overall cache utilization. A hash generator may be included to perform the hash function. A hash table may be included to store hash indexes of valid entries in the primary cache. The cache memory system may be used to implement a translation lookaside buffer for a microprocessor.
Cache system with a primary cache and an overflow cache that use different indexing schemes
A cache memory system including a primary cache and an overflow cache that are searched together using a search address. The overflow cache operates as an eviction array for the primary cache. The primary cache is addressed using bits of the search address, and the overflow cache is addressed by a hash index generated by a hash function applied to bits of the search address. The hash function operates to distribute victims evicted from the primary cache to different sets of the overflow cache to improve overall cache utilization. A hash generator may be included to perform the hash function. A hash table may be included to store hash indexes of valid entries in the primary cache. The cache memory system may be used to implement a translation lookaside buffer for a microprocessor.
TARGETED PER-LINE OPERATIONS FOR REMOTE SCOPE PROMOTION
A processing system includes one or more first caches and one or more first lock tables associated with the one or more first caches. The processing system also includes one or more processing units that each include a plurality of compute units for concurrently executing work-groups of work items, a plurality of second caches associated with the plurality of compute units and configured in a hierarchy with the one or more first caches, and a plurality of second lock tables associated with the plurality of second caches. The first and second lock tables indicate locking states of addresses of cache lines in the corresponding first and second caches on a per-line basis.
TARGETED PER-LINE OPERATIONS FOR REMOTE SCOPE PROMOTION
A processing system includes one or more first caches and one or more first lock tables associated with the one or more first caches. The processing system also includes one or more processing units that each include a plurality of compute units for concurrently executing work-groups of work items, a plurality of second caches associated with the plurality of compute units and configured in a hierarchy with the one or more first caches, and a plurality of second lock tables associated with the plurality of second caches. The first and second lock tables indicate locking states of addresses of cache lines in the corresponding first and second caches on a per-line basis.
TECHNIQUES FOR MAINTAINING CONSISTENCY BETWEEN ADDRESS TRANSLATIONS IN A DATA PROCESSING SYSTEM
A technique for operating a memory management unit (MMU) of a processor includes the MMU detecting that one or more address translation invalidation requests are indicated for an accelerator unit (AU). In response to detecting that the invalidation requests are indicated, the MMU issues a raise barrier request for the AU. In response to detecting a raise barrier response from the AU to the raise barrier request the MMU issues the invalidation requests to the AU. In response to detecting an address translation invalidation response from the AU to each of the invalidation requests, the MMU issues a lower barrier request to the AU. In response to detecting a lower barrier response from the AU to the lower barrier request, the MMU resumes handling address translation check-in and check-out requests received from the AU.
TECHNIQUES FOR MAINTAINING CONSISTENCY BETWEEN ADDRESS TRANSLATIONS IN A DATA PROCESSING SYSTEM
A technique for operating a memory management unit (MMU) of a processor includes the MMU detecting that one or more address translation invalidation requests are indicated for an accelerator unit (AU). In response to detecting that the invalidation requests are indicated, the MMU issues a raise barrier request for the AU. In response to detecting a raise barrier response from the AU to the raise barrier request the MMU issues the invalidation requests to the AU. In response to detecting an address translation invalidation response from the AU to each of the invalidation requests, the MMU issues a lower barrier request to the AU. In response to detecting a lower barrier response from the AU to the lower barrier request, the MMU resumes handling address translation check-in and check-out requests received from the AU.
DEAD SURFACE INVALIDATION
Systems, apparatuses, and methods for performing dead surface invalidation are disclosed. An application sends draw call commands to a graphics processing unit (GPU) via a driver, with the draw call commands rendering to surfaces. After it is determined that a given surface will no longer be accessed by subsequent draw calls, the application sends a surface invalidation command for the given surface to a command processor of the GPU. After the command processor receives the surface invalidation command, the command processor waits for a shader engine to send a draw call completion message for a last draw call to access the given surface. Once the command processor receives the draw call completion message, the command processor sends a surface invalidation command to a cache to invalidate cache lines for the given surface to free up space in the cache for other data.
Application of a default shared state cache coherency protocol
Example implementations relate to cache coherency protocols as applied to a memory block range. Exclusive ownership of a range of blocks of memory in a default shared state may be tracked by a directory. The directory may be associated with a first processor of a set of processors. When a request is received from a second processor of the set of processors to read one or more blocks of memory absent from the directory, one or more blocks may be transmitted in the default shared state to the second processor. The blocks absent from the directory may not be tracked in the directory.