G06F12/0844

VEHICULAR DEVICE AND CONTROL METHOD FOR VEHICULAR DEVICE
20220027276 · 2022-01-27

A vehicular device includes multiple CPU modules, multiple cache memories allocated to the CPU modules, respectively, and a memory synchronization unit configured to synchronize multiple surfaces drawn in the multiple cache memories. The memory synchronization unit divides the surfaces to be synchronized into multiple tiles, and sequentially synchronizes the divided tiles, beginning with tiles for which drawing has been completed.
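The tile-wise scheme in this abstract can be sketched as follows. This is an illustrative model only: the patent does not specify an API, so `split_into_tiles`, `synchronize`, and their parameters are assumed names, and the actual device would perform the copies in hardware between cache memories.

```python
def split_into_tiles(width, height, tile_size):
    """Return (x, y, w, h) rectangles covering a width x height surface."""
    tiles = []
    for y in range(0, height, tile_size):
        for x in range(0, width, tile_size):
            tiles.append((x, y,
                          min(tile_size, width - x),
                          min(tile_size, height - y)))
    return tiles

def synchronize(tiles, drawing_done, copy_tile):
    """Copy tiles in the order their drawing completes.

    drawing_done(tile) -> bool reports whether the tile is fully drawn;
    copy_tile(tile) transfers one tile to the peer memory.
    """
    pending = list(tiles)
    synced = []
    while pending:
        # Pick out only the tiles whose drawing has already completed,
        # so finished tiles are transferred without waiting for the rest.
        ready = [t for t in pending if drawing_done(t)]
        for t in ready:
            copy_tile(t)
            synced.append(t)
            pending.remove(t)
    return synced
```

The point of the structure is that a partially drawn surface can start synchronizing early: completed tiles move first, instead of waiting for the whole surface.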

Methods and systems for distributing memory requests

A memory request, including an address, is accessed. The memory request also specifies a type of an operation (e.g., a read or write) associated with an instance (e.g., a block) of data. A group of caches is selected using a bit or bits in the address. A first hash of the address is performed to select a cache in the group. A second hash of the address is performed to select a set of cache lines in the cache. Unless the operation results in a cache miss, the memory request is processed at the selected cache. When there is a cache miss, a third hash of the address is performed to select a memory controller, and a fourth hash of the address is performed to select a bank group and a bank in memory.
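The routing described above can be modeled as a chain of hashes over the same address. A minimal sketch follows; the bit positions, hash function, and structure sizes are assumptions for illustration, not values fixed by the patent.

```python
def _mix(address, salt):
    # Deterministic integer mixing; real hardware would typically
    # XOR-fold selected address bits instead.
    x = (address ^ salt) * 0x9E3779B97F4A7C15
    return (x >> 32) & 0xFFFFFFFF

def route_request(address):
    """Map one address to a cache location and, for misses, a DRAM location."""
    group = (address >> 6) & 0x1          # cache group chosen by an address bit
    cache = _mix(address, 1) % 4          # first hash: cache within the group
    cache_set = _mix(address, 2) % 64     # second hash: set of cache lines
    controller = _mix(address, 3) % 4     # third hash (miss path): memory controller
    bank_group = _mix(address, 4) % 4     # fourth hash (miss path): bank group
    bank = _mix(address, 5) % 4           # ...and bank within that group
    return {"group": group, "cache": cache, "cache_set": cache_set,
            "controller": controller, "bank_group": bank_group, "bank": bank}
```

Because every level is a pure function of the address, any requester computes the same route for the same address, which is what lets requests be distributed without a central directory.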

MICROPROCESSOR ARCHITECTURE HAVING ALTERNATIVE MEMORY ACCESS PATHS
20210365381 · 2021-11-25

The present invention is directed to a system and method which employ two memory access paths: 1) a cache-access path in which block data is fetched from main memory for loading to a cache, and 2) a direct-access path in which individually-addressed data is fetched from main memory. The system may comprise one or more processor cores that utilize the cache-access path for accessing data. The system may further comprise at least one heterogeneous functional unit that is operable to utilize the direct-access path for accessing data. In certain embodiments, the one or more processor cores, cache, and the at least one heterogeneous functional unit may be included on a common semiconductor die (e.g., as part of an integrated circuit). Embodiments of the present invention enable improved system performance by selectively employing the cache-access path for certain instructions while selectively employing the direct-access path for other instructions.
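The two paths can be illustrated with a toy model: a cache-access path that fetches a whole block on a miss, and a direct-access path that reads the individual location and bypasses the cache. Class and method names here are invented for the sketch; the patent describes hardware, not this API.

```python
class Memory:
    """Backing main memory, modeled as a flat byte array."""
    def __init__(self, size):
        self.data = bytearray(size)

class DualPathMemory:
    BLOCK = 64  # assumed cache-line size

    def __init__(self, memory):
        self.memory = memory
        self.cache = {}  # block-aligned base address -> cached block bytes

    def cached_read(self, addr):
        """Cache-access path: fetch the whole block from memory on a miss."""
        base = addr - addr % self.BLOCK
        if base not in self.cache:
            self.cache[base] = bytes(self.memory.data[base:base + self.BLOCK])
        return self.cache[base][addr - base]

    def direct_read(self, addr):
        """Direct-access path: read the individual byte, bypassing the cache."""
        return self.memory.data[addr]
```

The model also shows why the choice matters per instruction: a processor core with spatial locality benefits from `cached_read`, while a functional unit doing sparse, individually-addressed accesses avoids polluting the cache via `direct_read`.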

Techniques for handling cache coherency traffic for contended semaphores

The techniques described herein improve cache traffic performance in the context of contended lock instructions. More specifically, each core maintains a lock address contention table that stores addresses corresponding to contended lock instructions. The lock address contention table also includes a state value that indicates progress through a series of states meant to track whether a load by the core in a spin-loop associated with semaphore acquisition has obtained the semaphore in an exclusive state. Upon detecting that a load in a spin-loop has obtained the semaphore in an exclusive state, the core responds to incoming requests for access to the semaphore with negative acknowledgments. This allows the core to maintain the semaphore cache line in an exclusive state, which allows it to acquire the semaphore faster and to avoid transmitting that cache line to other cores unnecessarily.
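The per-core state machine described above can be sketched as a small table keyed by lock address. This is a behavioral model under assumed names and a reduced two-state version of the progression; the actual mechanism operates on cache-coherence probes in hardware.

```python
WAITING, EXCLUSIVE = "waiting", "exclusive"

class Core:
    def __init__(self):
        # Lock address contention table: contended lock address -> state.
        self.contention_table = {}

    def note_contended_lock(self, addr):
        """Record that a lock instruction at this address is contended."""
        self.contention_table[addr] = WAITING

    def on_load_exclusive(self, addr):
        """A spin-loop load has obtained the semaphore line in exclusive state."""
        if addr in self.contention_table:
            self.contention_table[addr] = EXCLUSIVE

    def handle_probe(self, addr):
        """Respond to another core's request for the line.

        NACK while holding the contended line exclusively, so the core can
        acquire the semaphore before surrendering the cache line.
        """
        if self.contention_table.get(addr) == EXCLUSIVE:
            return "nack"
        return "ack"
```

The design choice being modeled: the NACKs are temporary and targeted (only for tracked, contended addresses in the exclusive state), so ordinary coherence traffic is unaffected.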

Vehicular device and control method for vehicular device
11640358 · 2023-05-02

A vehicular device includes multiple CPU modules, multiple cache memories allocated to the CPU modules, respectively, and a memory synchronization unit configured to synchronize multiple surfaces drawn in the multiple cache memories. The memory synchronization unit divides the surfaces to be synchronized into multiple tiles, and sequentially synchronizes the divided tiles, beginning with tiles for which drawing has been completed.
