Patent classifications
G06F12/1072
SYSTEM AND METHOD FOR IMPLEMENTING A NETWORK-INTERFACE-BASED ALLREDUCE OPERATION
An apparatus is provided that includes a network interface to transmit and receive data packets over a network; a memory including one or more buffers; an arithmetic logic unit to perform arithmetic operations for organizing and combining the data packets; and a circuitry to receive, via the network interface, data packets from the network; aggregate, via the arithmetic logic unit, the received data packets in the one or more buffers at a network rate; and transmit, via the network interface, the aggregated data packets to one or more compute nodes in the network, thereby optimizing latency incurred in combining the received data packets and transmitting the aggregated data packets, and hence accelerating a bulk data allreduce operation. One embodiment provides a system and method for performing the allreduce operation. During operation, the system performs the allreduce operation by pacing network operations for enhancing performance of the allreduce operation.
Data migration based on performance characteristics of memory blocks
A performance manager (400, 500) and a method (200) performed thereby are provided, for managing the performance of a logical server of a data center. The data center comprises at least one memory pool in which a memory block has been allocated to the logical server. The method (200) comprises determining (230) performance characteristics associated with a first portion of the memory block, comprised in a first memory unit of the at least one memory pool; and identifying (240) a second portion of the memory block, comprised in a second memory unit of the at least one memory pool, to which data of the first portion of the memory block may be migrated to apply performance characteristics associated with the second portion. The method (200) further comprises initiating migration (250) of the data to the second portion of the memory block.
Hashing with Soft Memory Folding
In an embodiment, a system may support programmable hashing of address bits at a plurality of levels of granularity to map memory addresses to memory controllers and ultimately at least to memory devices. The hashing may be programmed to distribute pages of memory across the memory controllers, and consecutive blocks of the page may be mapped to physically distant memory controllers. In an embodiment, address bits may be dropped from each level of granularity, forming a compacted pipe address to save power within the memory controller. In an embodiment, a memory folding scheme may be employed to reduce the number of active memory devices and/or memory controllers in the system when the full complement of memory is not needed.
HARDWARE OFFLOADING FOR AN EMULATED IOMMU DEVICE
Disclosed is a method of managing memory of a virtual machine (VM), including receiving, at a physical input-output memory management unit (IOMMU) of a processing device operating the VM, a request from a VM IOMMU for VM memory address translation for a VM peripheral component interconnect (PCI) device created on the VM; determining, by the physical IOMMU, a corresponding VM memory address translation result based on the request as received and a memory translation table; and transmitting, by the physical IOMMU to the VM IOMMU, the corresponding VM memory address translation result for servicing the request for VM memory address translation of the VM PCI device.
HARDWARE OFFLOADING FOR AN EMULATED IOMMU DEVICE
Disclosed is a method of managing memory of a virtual machine (VM), including receiving, at a physical input-output memory management unit (IOMMU) of a processing device operating the VM, a request from a VM IOMMU for VM memory address translation for a VM peripheral component interconnect (PCI) device created on the VM; determining, by the physical IOMMU, a corresponding VM memory address translation result based on the request as received and a memory translation table; and transmitting, by the physical IOMMU to the VM IOMMU, the corresponding VM memory address translation result for servicing the request for VM memory address translation of the VM PCI device.
SOFTWARE-DRIVEN REMAPPING HARDWARE CACHE QUALITY-OF-SERVICE POLICY BASED ON VIRTUAL MACHINE PRIORITY
Systems, methods, and devices for software-driven resource reservation of an input/output memory management unit (IOMMU) are provided. A system may include a peripheral device and a processing device. The peripheral device may be accessible to a virtual machine running on the processing device via direct memory access (DMA) that is translated by an IOMMU). The processing device may run the virtual machine and a virtual machine manager. The processing device also includes the IOMMU, which is configurable to reserve a subset of resources of the IOMMU to the virtual machine based on a descriptor provided by the virtual machine manager.
PREVENTING UNAUTHORIZED TRANSLATED ACCESS USING ADDRESS SIGNING
A host may use address translation to convert virtual addresses to physical addresses for endpoints, which may then submit memory access requests for physical addresses. The host may incorporate the physical address and a signature of the physical address generated using a private key into a translated address field of a response to a translation request. An endpoint may treat the combination as a translated address by storing it in an entry of a translation cache, and accessing the entry for inclusion in a memory access request. The host may generate a signature of the translated address from the request using the private key, with the result being compared to the signature from the request. The memory access request may be verified when the compared values match, and the memory access may be performed using the translated address.
Apparatuses, methods, and systems for verification of input-output memory management unit to device attachment
Systems, methods, and apparatuses relating to performing an attachment of an input-output memory management unit (IOMMU) to a device, and a verification of the attachment. In one embodiment, a protocol and IOMMU extensions are used by a secure arbitration mode (SEAM) module and/or circuitry to determine if the IOMMU that is attached to the device requested to be mapped to a trusted domain.
Accelerated migration of compute instances using offload cards
As part of a compute instance migration, a compute instance which was executing at a first server begins execution at a second server before at least some state information of the compute instance has reached the second server. In response to a determination that a particular page of state information is not present at the second server, a migration manager running at one or more offload cards of the second server causes the particular page to be transferred to the second server via a network channel set up between the offload cards of both servers, and stores the page into main memory of the second server.
Hardware offloading for an emulated IOMMU device
Disclosed is a method of managing memory of a virtual machine (VM), including receiving, at a physical input-output memory management unit (IOMMU) of a processing device operating the VM, a request from a VM IOMMU for VM memory address translation for a VM peripheral component interconnect (PCI) device created on the VM; determining, by the physical IOMMU, a corresponding VM memory address translation result based on the request as received and a memory translation table; and transmitting, by the physical IOMMU to the VM IOMMU, the corresponding VM memory address translation result for servicing the request for VM memory address translation of the VM PCI device.