G06F12/084

Method and apparatus for using a storage system as main memory
11556469 · 2023-01-17 ·

A data access system including a processor, multiple cache modules that function as the main memory, and a storage drive. Each cache module includes an FLC controller and a main memory cache. The processor sends read/write requests (with a physical address) to a cache module. Each cache module includes two or more stages, each stage comprising an FLC controller and DRAM (with an associated controller). If the first-stage FLC module does not contain the physical address, the request is forwarded to a second-stage FLC module. If the second-stage FLC module does not contain the physical address either, the request is forwarded to a partition of the storage drive reserved for main memory. The first-stage FLC module offers high-speed, lower-power operation, while the second-stage FLC is a low-cost implementation. Multiple FLC modules may connect to the processor in parallel.
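The cascaded lookup described above can be sketched as follows. This is a minimal illustrative model, not the patented implementation; the class names (`FLCStage`, `DataAccessSystem`) and the fill-on-miss policy are assumptions.

```python
class FLCStage:
    """One FLC module: a controller fronting a DRAM cache, keyed by physical address."""
    def __init__(self, name):
        self.name = name
        self.lines = {}  # physical address -> cached data

    def lookup(self, addr):
        return self.lines.get(addr)

    def fill(self, addr, data):
        self.lines[addr] = data


class DataAccessSystem:
    """Processor-visible main memory built from two cascaded FLC stages plus a
    storage-drive partition reserved for main memory as the final backing store."""
    def __init__(self, drive_partition):
        self.stage1 = FLCStage("fast/low-power")  # hit here is the fast path
        self.stage2 = FLCStage("low-cost")        # larger, cheaper second stage
        self.drive_partition = drive_partition    # dict modeling the reserved partition

    def read(self, addr):
        # A miss in stage 1 forwards the request to stage 2; a miss there
        # goes to the storage drive, filling both stages on the way back.
        data = self.stage1.lookup(addr)
        if data is not None:
            return data, "stage1"
        data = self.stage2.lookup(addr)
        if data is not None:
            self.stage1.fill(addr, data)
            return data, "stage2"
        data = self.drive_partition[addr]
        self.stage2.fill(addr, data)
        self.stage1.fill(addr, data)
        return data, "drive"
```

A first access falls through to the drive; a repeated access to the same physical address is then served by the first stage.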

COMPUTATIONAL ACCELERATION FOR DISTRIBUTED CACHE
20230221867 · 2023-07-13 ·

A client device includes at least one memory configured to be used at least in part as a shared cache in a distributed cache. A network interface of the client device is configured to communicate with one or more other devices on a network each configured to provide a respective shared cache for the distributed cache. A Non-Volatile Memory Express (NVMe) controller of the client device receives a command from a processor to access data in the shared cache and executes a program to use data read from the shared cache or data to be written to the shared cache to perform at least one computational operation. In another aspect, data is accessed in the shared cache using a kernel and data read from the shared cache or data to be written to the shared cache is used to perform at least one computational operation by the kernel.
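The compute-on-access flow might look like the following sketch: a controller that, on a read or write command, runs a registered program against the data rather than merely moving it. All names here (`ComputationalController`, the command dictionary layout) are illustrative assumptions, not the claimed interface.

```python
class ComputationalController:
    """Toy model of an NVMe-style controller that performs a computational
    operation on data read from, or written to, its shared-cache slice."""
    def __init__(self, shared_cache):
        self.shared_cache = shared_cache  # this device's portion of the distributed cache
        self.programs = {}                # program id -> callable

    def register_program(self, prog_id, fn):
        self.programs[prog_id] = fn

    def execute(self, command):
        # command: {"op": "read"|"write", "key": ..., "program": ..., ["data": ...]}
        fn = self.programs[command["program"]]
        if command["op"] == "read":
            # Use data read from the shared cache as input to the computation.
            return fn(self.shared_cache[command["key"]])
        else:
            # Transform data to be written before it lands in the shared cache.
            result = fn(command["data"])
            self.shared_cache[command["key"]] = result
            return result
```

The point of the arrangement is that the computation happens at the device holding the cache, so only the result (not the raw data) crosses the network.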

Saturating local cache in memory-compute systems

Latency in a node-based compute-near-memory system can be problematic. A solution to the problem can include or use a dedicated software-based cache at each node. The cache can be configured to store information received from each of the other nodes in the system. In an example, the cache can be populated during a breadth-first search algorithm to store frontier information from each of the other nodes.
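A minimal sketch of the example above, under heavy assumptions: each vertex of the graph is owned by some compute node, and when BFS discovers a vertex across a node boundary, the owning node records it in a local software cache so later iterations need not re-fetch it over the fabric. The function name and the caching rule are illustrative.

```python
from collections import deque

def bfs_with_frontier_cache(graph, start, owner_of):
    """graph: adjacency dict; owner_of(v) -> id of the compute node owning vertex v.
    Returns the visited set and per-compute-node caches of frontier vertices
    received from other nodes during the traversal."""
    caches = {}           # compute-node id -> set of remote frontier vertices seen
    visited = {start}
    frontier = deque([start])
    while frontier:
        v = frontier.popleft()
        for w in graph.get(v, ()):
            if w not in visited:
                visited.add(w)
                frontier.append(w)
                # Crossing a node boundary: the owning node caches the
                # frontier vertex it just received from another node.
                if owner_of(w) != owner_of(v):
                    caches.setdefault(owner_of(w), set()).add(w)
    return visited, caches
```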

ARITHMETIC PROCESSOR AND METHOD FOR OPERATING ARITHMETIC PROCESSOR
20230010353 · 2023-01-12 ·

An arithmetic processor including a plurality of core groups, each including a plurality of cores and a cache unit, and a plurality of home agents, each including a tag directory and a store command queue. The store command queue enters received store requests into its entry queue in order of reception, and the cache unit stores the data of each store request in a data RAM. The store command queue sets a store request's data ownership acquisition flag to valid when it obtains data ownership for that request, and issues a top-of-queue notification to the cache control unit when the flag of the top-of-queue entry is valid. In response to the top-of-queue notification, the cache unit updates the cache tag to the modified state and issues a store request completion notification.
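The ordering mechanism can be sketched as follows: store requests queue in reception order, each entry's ownership flag is set asynchronously, and the cache unit is notified only while the head-of-queue entry's flag is valid, so completions occur strictly in queue order even when ownership arrives out of order. Class and method names are illustrative, not the patent's.

```python
from collections import deque

class CacheUnit:
    def __init__(self):
        self.completed = []  # addresses, in completion order

    def top_of_queue_notification(self, store_req):
        # Update the cache tag to the Modified state and report completion.
        store_req["tag_state"] = "M"
        self.completed.append(store_req["addr"])


class StoreCommandQueue:
    def __init__(self, cache_unit):
        self.entries = deque()      # entries kept in order of reception
        self.cache_unit = cache_unit

    def enqueue(self, store_req):
        self.entries.append({"req": store_req, "ownership": False})

    def ownership_acquired(self, store_req):
        # Set the data-ownership-acquisition flag for this request to valid.
        for e in self.entries:
            if e["req"] is store_req:
                e["ownership"] = True
        self._drain()

    def _drain(self):
        # Notify the cache unit only while the head entry's flag is valid,
        # so stores complete strictly in reception order.
        while self.entries and self.entries[0]["ownership"]:
            head = self.entries.popleft()
            self.cache_unit.top_of_queue_notification(head["req"])
```

Note that a later request acquiring ownership first does not complete early; it waits behind the head of the queue.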

High-Throughput Algorithm For Multiversion Concurrency Control With Globally Synchronized Time
20230216921 · 2023-07-06 ·

Throughput is preserved in a distributed system while maintaining concurrency by pushing a commit wait period to client commit paths and to future readers. As opposed to servers performing commit waits, the servers assign timestamps, which are used to ensure that causality is preserved. When a server executes a transaction that writes data to a distributed database, the server acquires a user-level lock, and assigns the transaction a timestamp equal to a current time plus an interval corresponding to bounds of uncertainty of clocks in the distributed system. After assigning the timestamp, the server releases the user-level lock. Any client devices, before performing a read of the written data, must wait until the assigned timestamp is in the past.
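The timestamping scheme can be sketched in a few lines: the writer assigns commit_ts = now + epsilon (the clock-uncertainty bound) and releases its lock immediately, and the reader enforces causality by waiting until commit_ts is in the past. A single-process toy, of course; the epsilon value and the use of one local clock for both sides are assumptions standing in for globally synchronized time.

```python
import time

EPSILON = 0.005  # assumed bound on clock uncertainty across the system, in seconds

def commit_write(db, key, value):
    # Server side: assign the timestamp and return without any commit wait;
    # the user-level lock would be released right after this assignment.
    commit_ts = time.monotonic() + EPSILON
    db[key] = (value, commit_ts)
    return commit_ts

def read(db, key):
    # Client side: the commit wait is pushed here, onto the read path.
    value, commit_ts = db[key]
    delay = commit_ts - time.monotonic()
    if delay > 0:
        time.sleep(delay)
    return value
```

The server's write path thus stays short regardless of the uncertainty bound; only readers that arrive inside the uncertainty window pay the wait.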

TECHNOLOGIES FOR CROSS-DEVICE SHARED WEB RESOURCE CACHE
20230214438 · 2023-07-06 ·

Technologies for cross-device shared web resource caching include a client device and a shared cache device. The client device scans for a shared cache device in local proximity to the client device and, in response to the scan, registers with the shared cache device. After registering, the client device requests a cached web resource from the shared cache device. The shared cache device determines whether a cached web resource that matches the request is installed in a shared cache. The shared cache device may determine whether an origin of the request matches the origin of the cached web resource. If installed, the shared cache device sends a found response and the cached web resource to the client device. If not installed, the shared cache device sends a not-found response and the client device may request the web resource from a remote web server. Other embodiments are described and claimed.
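The register/request/fallback exchange might be modeled as below. The class names, the message shapes, and the origin-matching rule (keying the cache by origin and path) are illustrative assumptions.

```python
class SharedCacheDevice:
    """Toy model of the nearby device holding the shared web-resource cache."""
    def __init__(self):
        self.clients = set()
        self.cache = {}  # (origin, path) -> resource bytes

    def register(self, client_id):
        self.clients.add(client_id)

    def request(self, client_id, origin, path):
        if client_id not in self.clients:
            return ("error", None)
        # Serve the resource only if the request's origin matches the origin
        # the resource was cached under.
        resource = self.cache.get((origin, path))
        if resource is not None:
            return ("found", resource)
        return ("not-found", None)


def fetch(client_id, device, origin, path, fetch_remote):
    """Client side: register, try the nearby shared cache first, then fall
    back to the remote web server on a not-found response."""
    device.register(client_id)
    status, body = device.request(client_id, origin, path)
    if status == "found":
        return body
    return fetch_remote(origin, path)
```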