G06F2212/152

Technologies for assigning workloads to balance multiple resource allocation objectives

Technologies for allocating resources of managed nodes to workloads to balance multiple resource allocation objectives include an orchestrator server to receive resource allocation objective data indicative of multiple resource allocation objectives to be satisfied. The orchestrator server is additionally to determine an initial assignment of a set of workloads among the managed nodes and receive telemetry data from the managed nodes. The orchestrator server is further to determine, as a function of the telemetry data and the resource allocation objective data, an adjustment to the assignment of the workloads to increase an achievement of at least one of the resource allocation objectives without decreasing an achievement of another of the resource allocation objectives, and apply the adjustments to the assignments of the workloads among the managed nodes as the workloads are performed. Other embodiments are also described and claimed.

Real time input/output address translation for virtualized systems

In an example, a device includes a memory and a processor core coupled to the memory via a memory management unit (MMU). The device also includes a system MMU (SMMU) cross-referencing virtual addresses (VAs) with intermediate physical addresses (IPAs) and IPAs with physical addresses (PAs). The device further includes a physical address table (PAT) cross-referencing IPAs with each other and cross-referencing PAs with each other. The device also includes a peripheral virtualization unit (PVU) cross-referencing IPAs with PAs, and a routing circuit coupled to the memory, the SMMU, the PAT, and the PVU. The routing circuit is configured to receive a request comprising an address and an attribute and to route the request through at least one of the SMMU, the PAT, or the PVU based on the address and the attribute.

SMART PREFETCHING FOR REMOTE MEMORY

Memory pages of a local application program are prefetched from a memory of a remote host. A method of prefetching the memory pages from the remote memory includes detecting that a cache-line access made by a processor executing the local application program is an access to a cache line containing page table data of the local application program, identifying data pages that are referenced by the page table data, and fetching the identified data pages from the remote memory and storing the fetched data pages in a local memory.

VIRTUALIZATION ENGINE FOR VIRTUALIZATION OPERATIONS IN A VIRTUALIZATION SYSTEM
20220413888 · 2022-12-29 ·

Methods, systems, and computer storage media for providing virtualization operations—including an activate operation, suspend operation, and resume operation for virtualization in a virtualization system. In operation, a unique identifier and file metadata associated with a first file stored in a cache engine. The cache engine manages the first file of an application running on the virtual machine to circumvent writing file data of the first file to an OS disk during a suspend operation of the virtual machine and circumvents reading file data of the first file from the OS disk during a resume operation of the virtual machine. Based on a resume operation associated with the virtual machine and the file metadata, file data of the first file that is stored in the cache engine is accessed. The file data is communicated to the virtual machine, the virtual machine is associated with the suspend and the resume operation.

METHOD AND SYSTEM FOR TRACKING STATE OF CACHE LINES

The state of cache lines transferred into an out of caches of processing hardware is tracked by monitoring hardware. The method of tracking includes monitoring the processing hardware for cache coherence events on a coherence interconnect between the processing hardware and monitoring hardware, determining that the state of a cache line has changed, and updating a hierarchical data structure to indicate the change in the state of said cache line. The hierarchical data structure includes a first level data structure including first bits, and a second level data structure including second bits, each of the first bits associated with a group of second bits. The step of updating includes setting one of the first bits and one of the second bits in the group corresponding to the first bit that is being set, according to an address of said cache line.

APPARATUS, SYSTEM, AND METHOD OF BYTE ADDRESSABLE AND BLOCK ADDRESSABLE STORAGE AND RETRIEVAL OF DATA TO AND FROM NON-VOLATILE STORAGE MEMORY
20220404975 · 2022-12-22 ·

A hybrid memory system provides rapid, persistent byte-addressable and block-addressable memory access to a host computer system by providing direct access to a both a volatile byte-addressable memory and a volatile block-addressable memory via the same parallel memory interface. The hybrid memory system also has at least a non-volatile block-addressable memory that allows the system to persist data even through a power-loss state. The hybrid memory system can copy and move data between any of the memories using local memory controllers to free up host system resources for other tasks.

Trusted local memory management in a virtualized GPU

Embodiments are directed to trusted local memory management in a virtualized GPU. An embodiment of an apparatus includes one or more processors including a trusted execution environment (TEE); a GPU including a trusted agent; and a memory, the memory including GPU local memory, the trusted agent to ensure proper allocation/deallocation of the local memory and verify translations between graphics physical addresses (PAs) and PAs for the apparatus, wherein the local memory is partitioned into protection regions including a protected region and an unprotected region, and wherein the protected region to store a memory permission table maintained by the trusted agent, the memory permission table to include any virtual function assigned to a trusted domain, a per process graphics translation table to translate between graphics virtual address (VA) to graphics guest PA (GPA), and a local memory translation table to translate between graphics GPAs and PAs for the local memory.

Saving virtual memory space in a clone environment

Virtual memory space may be saved in a clone environment by leveraging the similarity of the data signatures in swap files when a chain of virtual machines (VMs) includes clones spawned from a common parent and executing common applications. Deduplication is performed across the chain, rather than merely within each VM. Examples include generating a common deduplication identifier (ID) for the chain; generating a logical addressing table linked to the deduplication ID, for each of the VMs in the chain; and generating a hash table for the chain. Examples further include, based at least on a swap out request, generating a hash value for a block of memory to be written to a storage medium; and based at least on finding the hash value within the hash table, updating the logical addressing table to indicate a location of a prior-existing duplicate of the block on the storage medium.

APPARATUSES, SYSTEMS, AND METHODS FOR CONFIGURING COMBINED PRIVATE AND SHARED CACHE LEVELS IN A PROCESSOR-BASED SYSTEM

Apparatuses, systems, and methods for configuring combined private and shared cache levels in a processor-based system. The processor-based system includes a processor that includes a plurality of processing cores each including execution circuits which are coupled to respective cache(s) and a configurable combined private and shared cache, and which may receive instructions and data on which to perform operations from the cache(s) and the combined private and shared cache. A shared cache portion of each configurable combined private and shared cache can be treated as an independently-assignable portion of the overall shared cache, which is effectively the shared cache portions of all of the processing cores. Each independently-assignable portion of the overall shared cache can be associated with a particular client running on the processor as an example. This approach can provide greater granularity of cache partitioning of a shared cache between particular clients running on a processor.

USER-SPACE REMOTE MEMORY PAGING
20220398199 · 2022-12-15 ·

Techniques for implementing user-space remote memory paging are provided. In one set of embodiments, these techniques include a user-space remote memory paging (RMP) runtime that can: (1) pre-allocate one or more regions of remote memory for use by an application; (2) at a time of receiving/intercepting a memory allocation function call invoked by the application, map the virtual memory address range of the allocated local memory to a portion of the pre-allocated remote memory; (3) at a time of detecting a page fault directed to a page that is mapped to remote memory, retrieve the page via Remote Direct Memory Access (RDMA) from its remote memory location and store the retrieved page in a local main memory cache; and (4) on a periodic basis, identify pages in the local main memory cache that are candidates for eviction and write out the identified pages via RDMA to their mapped remote memory locations if they have been modified.