G06F2212/656

Universal pointers for data exchange in a computer system having independent processors
11544069 · 2023-01-03

A system, method and apparatus to facilitate data exchange via pointers. For example, in a computing system having a first processor and a second processor that is separate and independent from the first processor, the first processor can run a program configured to use a pointer identifying a virtual memory address having an ID of an object and an offset within the object. The first processor can use the virtual memory address to store data at a memory location in the computing system and/or identify a routine at the memory location for execution by the second processor. After the pointer is communicated from the first processor to the second processor, the second processor can access the same memory location identified by the virtual memory address. The second processor may operate on the data stored at the memory location or load the routine from the memory location for execution.
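
A minimal C sketch of how such a pointer might be encoded as a single 64-bit value, assuming an illustrative 40-bit object ID and 24-bit offset split; the patent does not specify a layout, and the helper names are hypothetical:

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical 64-bit universal pointer: upper bits carry an object ID,
 * lower bits an offset within that object. The 40/24 split is an
 * illustrative assumption, not the patent's layout. */
#define OFFSET_BITS 24
#define OFFSET_MASK ((1ULL << OFFSET_BITS) - 1)

typedef uint64_t uptr_t;

static uptr_t uptr_make(uint64_t object_id, uint64_t offset) {
    return (object_id << OFFSET_BITS) | (offset & OFFSET_MASK);
}

static uint64_t uptr_object(uptr_t p) { return p >> OFFSET_BITS; }
static uint64_t uptr_offset(uptr_t p) { return p & OFFSET_MASK; }

int main(void) {
    /* Either processor decodes the same pointer to the same
     * (object, offset) pair, so the virtual address names the same
     * memory location on both sides of the exchange. */
    uptr_t p = uptr_make(0x1234, 0x100);
    printf("object=%llx offset=%llx\n",
           (unsigned long long)uptr_object(p),
           (unsigned long long)uptr_offset(p));
    return 0;
}
```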

Coherence-based cache-line Copy-on-Write

A method of performing a copy-on-write on a shared memory page is carried out by a device communicating with a processor via a coherence interconnect. The method includes: adding a page table entry so that a request to read a first cache line of the shared memory page includes a cache-line address of the shared memory page and a request to write to a second cache line of the shared memory page includes a cache-line address of a new memory page; in response to the request to write to the second cache line, storing new data of the second cache line in a second memory and associating the second cache-line address with the new data stored in the second memory; and in response to a request to read the second cache line, reading the new data of the second cache line from the second memory.
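
A toy C sketch of the read/write paths, assuming an in-memory overlay keyed by line index stands in for the device's association of cache-line addresses with copied data (all names and sizes are illustrative):

```c
#include <stdint.h>
#include <string.h>
#include <stdio.h>

#define LINE_SIZE 64
#define NUM_LINES 8   /* toy page: 8 cache lines */

/* Original (shared) page and a second memory region holding new data
 * for written lines. written[i] stands in for associating a cache-line
 * address with its copy; a real device would key on physical addresses. */
static uint8_t shared_page[NUM_LINES][LINE_SIZE];
static uint8_t second_mem[NUM_LINES][LINE_SIZE];
static int     written[NUM_LINES];

/* Write path: store the new line data in the second memory and record
 * the association, leaving the shared page untouched. */
static void cow_write(int line, const uint8_t *data) {
    memcpy(second_mem[line], data, LINE_SIZE);
    written[line] = 1;
}

/* Read path: serve written lines from the second memory, everything
 * else from the still-shared page. */
static const uint8_t *cow_read(int line) {
    return written[line] ? second_mem[line] : shared_page[line];
}

int main(void) {
    uint8_t d[LINE_SIZE] = { 0xAB };
    cow_write(2, d);   /* copy-on-write a single cache line */
    printf("line 2: %02x, line 0: %02x\n", cow_read(2)[0], cow_read(0)[0]);
    return 0;
}
```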

Memory sharing via a unified memory architecture
11531623 · 2022-12-20

A method and system for sharing memory between a central processing unit (CPU) and a graphics processing unit (GPU) of a computing device are disclosed herein. The method includes allocating a surface within a physical memory and mapping the surface to a plurality of virtual memory addresses within a CPU page table. The method also includes mapping the surface to a plurality of graphics virtual memory addresses within an I/O device page table.
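
A simplified C sketch of the dual-mapping idea, with flat arrays standing in for the CPU page table and the I/O device page table (all sizes and names are assumptions):

```c
#include <stdint.h>
#include <stdio.h>

#define PAGES 4

/* Toy page tables: index = virtual page number, value = physical page
 * number of the shared surface. */
static uint32_t cpu_page_table[PAGES];
static uint32_t gpu_page_table[PAGES];

/* Map the same run of physical pages (the "surface") into both tables,
 * so CPU and GPU virtual addresses resolve to the same physical memory. */
static void map_surface(uint32_t phys_base) {
    for (uint32_t i = 0; i < PAGES; i++) {
        cpu_page_table[i] = phys_base + i;  /* CPU virtual pages 0..3 */
        gpu_page_table[i] = phys_base + i;  /* GPU virtual pages 0..3 */
    }
}

int main(void) {
    map_surface(100);
    /* Both translations land on the same physical page, so CPU writes
     * are visible to the GPU without a copy. */
    printf("cpu vpn 1 -> ppn %u, gpu vpn 1 -> ppn %u\n",
           cpu_page_table[1], gpu_page_table[1]);
    return 0;
}
```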

INDEPENDENTLY CONTROLLED DMA AND CPU ACCESS TO A SHARED MEMORY REGION

An embodiment of an integrated circuit comprises circuitry to share page tables associated with a page between a processor memory management unit (MMU) and an input/output memory management unit (IOMMU), store a page table entry in the memory associated with the page, and separately control access to the page from a processor and from a direct memory access (DMA) request based on one or more fields of the stored page table entry. Other embodiments are disclosed and claimed.
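
A hedged C sketch of a page table entry carrying separate processor and DMA permission fields, with one check routine consulted for either requester; the flag layout is an illustrative assumption:

```c
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical PTE with separate permission fields for processor access
 * and DMA access; bit positions are illustrative assumptions. */
#define PTE_CPU_READ   (1u << 0)
#define PTE_CPU_WRITE  (1u << 1)
#define PTE_DMA_READ   (1u << 2)
#define PTE_DMA_WRITE  (1u << 3)

typedef struct { uint64_t ppn; uint32_t flags; } pte_t;

/* One shared entry consulted by both the MMU and the IOMMU, but gated
 * by requester-specific fields. */
static bool access_ok(const pte_t *pte, bool is_dma, bool is_write) {
    uint32_t need = is_dma ? (is_write ? PTE_DMA_WRITE : PTE_DMA_READ)
                           : (is_write ? PTE_CPU_WRITE : PTE_CPU_READ);
    return (pte->flags & need) != 0;
}

int main(void) {
    /* The CPU may read and write; DMA may only read the same page. */
    pte_t pte = { .ppn = 42,
                  .flags = PTE_CPU_READ | PTE_CPU_WRITE | PTE_DMA_READ };
    printf("cpu write ok: %d, dma write ok: %d\n",
           access_ok(&pte, false, true), access_ok(&pte, true, true));
    return 0;
}
```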

RESET DYNAMIC ADDRESS TRANSLATION PROTECTION INSTRUCTION

An instruction is provided to perform a reset address translation protection operation when executed. Executing the instruction includes determining, by a processor, that an address translation protection bit in a specified translation table entry associated with a storage block is to be reset. Based on determining that the address translation protection bit is to be reset, executing the instruction includes resetting the address translation protection bit to deactivate write protection for the storage block. The resetting is performed without waiting for an action by one or more other processors of the computing environment.
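
A rough C11 sketch of the "reset without waiting" behavior as a single atomic bit-clear on the translation table entry; the bit position and entry layout are assumptions, not the architecture's actual format:

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative translation-table entry with a write-protect bit. */
#define DAT_PROTECT (1u << 9)

static _Atomic uint64_t table_entry = 0x1000 | DAT_PROTECT;

/* Reset the protection bit with one atomic AND: no inter-processor
 * signal and no wait for other processors to acknowledge. Stale TLB
 * copies can be tolerated until they age out, since clearing
 * protection only grants access, never revokes it. */
static void reset_dat_protection(_Atomic uint64_t *pte) {
    atomic_fetch_and(pte, ~(uint64_t)DAT_PROTECT);
}

int main(void) {
    reset_dat_protection(&table_entry);
    printf("protect bit now: %llu\n",
           (unsigned long long)(table_entry & DAT_PROTECT));
    return 0;
}
```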

INCREASING PAGE SHARING ON NON-UNIFORM MEMORY ACCESS (NUMA)-ENABLED HOST SYSTEMS

In one set of embodiments, a hypervisor of a host system can determine that a delta between local and remote memory access latencies for each of a subset of NUMA nodes of the host system is less than a threshold. In response, the hypervisor can enable page sharing across the subset of NUMA nodes, where enabling page sharing comprises associating the subset of NUMA nodes with a single page sharing table, and where the single page sharing table holds entries identifying host physical memory pages of the host system that are shared by virtual machines (VMs) placed on the subset of NUMA nodes.
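
A small C sketch of the latency-delta test that gates page sharing across nodes; the latency figures and threshold are made-up illustrative values:

```c
#include <stdio.h>

#define NODES     4
#define THRESHOLD 20   /* ns; illustrative threshold */

/* Measured local and remote access latencies per node, in ns
 * (illustrative numbers). */
static int local_lat[NODES]  = { 80,  82,  79,  81 };
static int remote_lat[NODES] = { 95, 140,  96, 139 };

int main(void) {
    /* Nodes whose remote-minus-local delta is under the threshold are
     * grouped under one page sharing table; the rest keep per-node
     * tables, trading sharing opportunity for memory locality. */
    printf("single shared table for nodes:");
    for (int n = 0; n < NODES; n++) {
        if (remote_lat[n] - local_lat[n] < THRESHOLD)
            printf(" %d", n);
    }
    printf("\n");
    return 0;
}
```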

VMID AS A GPU TASK CONTAINER FOR VIRTUALIZATION

Systems, apparatuses, and methods for abstracting tasks in virtual memory identifier (VMID) containers are disclosed. A processor coupled to a memory executes a plurality of concurrent tasks including a first task. Responsive to detecting one or more instructions of the first task which correspond to a first operation, the processor retrieves a first identifier (ID) which is used to uniquely identify the first task, wherein the first ID is transparent to the first task. Then, the processor maps the first ID to a second ID and/or a third ID. The processor completes the first operation by using the second ID and/or the third ID to identify the first task to at least a first data structure. In one implementation, the first operation is a memory access operation and the first data structure is a set of page tables. Also, in one implementation, the second ID identifies a first application of the first task and the third ID identifies a first operating system (OS) of the first task.
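
A minimal C sketch of resolving the task-transparent first ID into the application and OS IDs used to select page tables; the table layout and field names are hypothetical:

```c
#include <stdint.h>
#include <stdio.h>

#define MAX_TASKS 16

/* Per-task mapping from the container ID (first ID, opaque to the task)
 * to an application ID and an OS ID. */
typedef struct {
    uint16_t app_id;   /* second ID: identifies the task's application */
    uint16_t os_id;    /* third ID: identifies the task's guest OS     */
} id_map_t;

static id_map_t id_table[MAX_TASKS];

/* On a memory access, the processor resolves the first ID to the pair
 * of IDs actually used to pick the right set of page tables. */
static id_map_t resolve(uint16_t first_id) {
    return id_table[first_id % MAX_TASKS];
}

int main(void) {
    id_table[3] = (id_map_t){ .app_id = 7, .os_id = 2 };
    id_map_t m = resolve(3);
    printf("task 3 -> app %u, os %u\n", m.app_id, m.os_id);
    return 0;
}
```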

Methods and apparatuses for addressing memory caches
11500781 · 2022-11-15

A cache memory includes cache lines to store information. The stored information is associated with physical addresses that include first, second, and third distinct portions. The cache lines are indexed by the second portions of respective physical addresses associated with the stored information. The cache memory also includes one or more tables, each of which includes respective table entries that are indexed by the first portions of the respective physical addresses. The respective table entries in each of the one or more tables are to store indications of the second portions of respective physical addresses associated with the stored information.
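
A short C sketch of splitting a physical address into the three portions, with assumed illustrative widths: the second portion indexes the cache lines, the first portion indexes the side tables that record which second portions are resident:

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative split of a physical address into three portions;
 * the widths are assumptions. */
#define P3_BITS 6    /* third portion: byte offset within a line */
#define P2_BITS 8    /* second portion: cache-line index         */

static uint64_t portion3(uint64_t pa) { return pa & ((1u << P3_BITS) - 1); }
static uint64_t portion2(uint64_t pa) {
    return (pa >> P3_BITS) & ((1u << P2_BITS) - 1);
}
static uint64_t portion1(uint64_t pa) { return pa >> (P3_BITS + P2_BITS); }

int main(void) {
    uint64_t pa = 0x12345678;
    printf("p1=%llx (table index) p2=%llx (line index) p3=%llx (offset)\n",
           (unsigned long long)portion1(pa),
           (unsigned long long)portion2(pa),
           (unsigned long long)portion3(pa));
    return 0;
}
```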

Scatter gather engine

In an example, an apparatus comprises a plurality of execution units, and logic, at least partially including hardware logic, to create a scatter gather list in memory and collect a plurality of operating statistics for the plurality of execution units using the scatter gather list. Other embodiments are also disclosed and claimed.
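
A compact C sketch of building a scatter gather list in memory and walking it to collect per-execution-unit statistics; the entry layout and counter values are illustrative assumptions:

```c
#include <stdint.h>
#include <stdio.h>

#define NUM_EUS 4

/* One scatter-gather entry per execution unit: where that EU's counter
 * lives and how many bytes it occupies. */
typedef struct { uint64_t *addr; uint32_t len; } sg_entry_t;

static uint64_t eu_counters[NUM_EUS] = { 11, 22, 33, 44 };

int main(void) {
    /* Build the scatter-gather list in memory... */
    sg_entry_t sg_list[NUM_EUS];
    for (int i = 0; i < NUM_EUS; i++)
        sg_list[i] = (sg_entry_t){ &eu_counters[i], sizeof(uint64_t) };

    /* ...then walk it to gather the scattered per-EU statistics into
     * one contiguous buffer, as a DMA engine would. */
    uint64_t gathered[NUM_EUS];
    for (int i = 0; i < NUM_EUS; i++)
        gathered[i] = *sg_list[i].addr;

    for (int i = 0; i < NUM_EUS; i++)
        printf("eu%d: %llu\n", i, (unsigned long long)gathered[i]);
    return 0;
}
```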

Method and apparatus for buffer sharing

Embodiments are generally directed to methods and apparatuses for buffer sharing. An embodiment of a method comprises: receiving a plurality of graphics data comprising a first graphics data, each of the plurality of graphics data mapped to a corresponding buffer in a Graphics Processing Unit (GPU) memory, wherein the first graphics data is mapped to a first buffer in the GPU memory; receiving a second graphics data mapped to a second buffer in the GPU memory; comparing the first buffer mapped by the first graphics data with the second buffer mapped by the second graphics data; and remapping the second graphics data to the first buffer if the first buffer is identical to the second buffer.
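
A toy C sketch of the compare-and-remap step, with memcmp standing in for whatever buffer comparison an embodiment uses; all names are hypothetical:

```c
#include <stdint.h>
#include <string.h>
#include <stdio.h>

#define BUF_SIZE 16

/* Toy "GPU memory" buffers and a mapping from graphics data to buffer.
 * A driver might compare buffer metadata or contents depending on the
 * embodiment. */
static uint8_t buf_a[BUF_SIZE] = { 1, 2, 3 };
static uint8_t buf_b[BUF_SIZE] = { 1, 2, 3 };

typedef struct { uint8_t *buffer; } graphics_data_t;

int main(void) {
    graphics_data_t first  = { buf_a };
    graphics_data_t second = { buf_b };

    /* Compare the two mapped buffers; if identical, remap the second
     * graphics data onto the first buffer so the duplicate can be
     * released. */
    if (memcmp(first.buffer, second.buffer, BUF_SIZE) == 0)
        second.buffer = first.buffer;

    printf("buffers shared: %d\n", second.buffer == first.buffer);
    return 0;
}
```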