Patent classifications
G06F12/0837
INTERLEAVED CACHE CONTROLLERS WITH SHARED METADATA AND RELATED DEVICES AND SYSTEMS
Interleaved cache controllers with shared metadata are disclosed and described. A memory system may comprise a plurality of cache controllers and a metadata store interconnected by a metadata store fabric. The metadata store receives information from at least one of the plurality of cache controllers, a portion of which is stored as shared distributed metadata. The metadata store provides shared access of the shared distributed metadata hosted to the plurality of cache controllers
FLEXIBLE FRAMEWORK TO SUPPORT MEMORY SYNCHRONIZATION OPERATIONS
A method of performing memory synchronization operations is provided that includes receiving, at a programmable cache controller in communication with one or more caches, an instruction in a first language to perform a memory synchronization operation of synchronizing a plurality of instruction sequences executing on a processor, mapping the received instruction in the first language to one or more selected cache operations in a second language executable by the cache controller and executing the one or more cache operations to perform the memory synchronization operation. The method further comprises receiving a second mapping that provides mapping instructions to map the received instruction to one or more other cache operations, mapping the received instruction to one or more other cache operations and executing the one or more other cache operations to perform the memory synchronization operation.
Intelligent content migration with borrowed memory
Systems, methods and apparatuses to intelligently migrate content involving borrowed memory are described. For example, after the prediction of a time period during which a network connection between computing devices having borrowed memory degrades, the computing devices can make a migration decision for content of a virtual memory address region, based at least in part on a predicted usage of content, a scheduled operation, a predicted operation, a battery level, etc. The migration decision can be made based on a memory usage history, a battery usage history, a location history, etc. using an artificial neural network; and the content migration can be performed by remapping virtual memory regions in the memory maps of the computing devices.
Intelligent content migration with borrowed memory
Systems, methods and apparatuses to intelligently migrate content involving borrowed memory are described. For example, after the prediction of a time period during which a network connection between computing devices having borrowed memory degrades, the computing devices can make a migration decision for content of a virtual memory address region, based at least in part on a predicted usage of content, a scheduled operation, a predicted operation, a battery level, etc. The migration decision can be made based on a memory usage history, a battery usage history, a location history, etc. using an artificial neural network; and the content migration can be performed by remapping virtual memory regions in the memory maps of the computing devices.
Translation entry invalidation in a multithreaded data processing system
In a multithreaded data processing system including a plurality of processor cores, storage-modifying requests, including a translation invalidation request of an initiating hardware thread, are received in a shared queue. The translation invalidation request is broadcast so that it is received and processed by the plurality of processor cores. In response to confirmation of the broadcast, the address translated by the translation entry is stored in a queue. Once the address is stored, the initiating processor core resumes dispatch of instructions within the initiating hardware thread. In response to a request from one of the plurality of processor cores, an effective address translated by a translation entry being invalidated is accessed in the queue. A synchronization request for the address is broadcast to ensure completion of processing of any translation invalidation request for the address.
Translation entry invalidation in a multithreaded data processing system
In a multithreaded data processing system including a plurality of processor cores, storage-modifying requests, including a translation invalidation request of an initiating hardware thread, are received in a shared queue. The translation invalidation request is broadcast so that it is received and processed by the plurality of processor cores. In response to confirmation of the broadcast, the address translated by the translation entry is stored in a queue. Once the address is stored, the initiating processor core resumes dispatch of instructions within the initiating hardware thread. In response to a request from one of the plurality of processor cores, an effective address translated by a translation entry being invalidated is accessed in the queue. A synchronization request for the address is broadcast to ensure completion of processing of any translation invalidation request for the address.
HIGH PERFORMANCE CONSTANT CACHE AND CONSTANT ACCESS MECHANISMS
A graphics processing apparatus includes a graphics processor and a constant cache. The graphics processor has a number of execution instances that will generate requests for constant data from the constant cache. The constant cache stores constants of multiple constant types. The constant cache has a single level of hierarchy to store the constant data. The constant cache has a banking structure based on the number of execution instances, where the execution instances generate requests for the constant data with unified messaging that is the same for the different types of constant data.
HIGH PERFORMANCE CONSTANT CACHE AND CONSTANT ACCESS MECHANISMS
A graphics processing apparatus includes a graphics processor and a constant cache. The graphics processor has a number of execution instances that will generate requests for constant data from the constant cache. The constant cache stores constants of multiple constant types. The constant cache has a single level of hierarchy to store the constant data. The constant cache has a banking structure based on the number of execution instances, where the execution instances generate requests for the constant data with unified messaging that is the same for the different types of constant data.
Assymetric coherent caching for heterogeneous computing
A method of caching data in the memory of electronic processor units including compiling, in a first processor configured to perform data-parallel computation, a set of asymmetric coherent caching rules. The set of rules configure the first processor to be: inoperable to cache, in a second level memory cache of the first electronic processor unit, data whose home location is in a final memory store of a second electronic processor unit; operable to cache, in the second level memory cache of the first electronic processor unit, the data whose home location is in a final memory store of the first electronic processor unit; and operable to cache, in a first level memory cache of the first electronic processor unit, the data, regardless of a home location of the data.
Assymetric coherent caching for heterogeneous computing
A method of caching data in the memory of electronic processor units including compiling, in a first processor configured to perform data-parallel computation, a set of asymmetric coherent caching rules. The set of rules configure the first processor to be: inoperable to cache, in a second level memory cache of the first electronic processor unit, data whose home location is in a final memory store of a second electronic processor unit; operable to cache, in the second level memory cache of the first electronic processor unit, the data whose home location is in a final memory store of the first electronic processor unit; and operable to cache, in a first level memory cache of the first electronic processor unit, the data, regardless of a home location of the data.