SYSTEM AND METHOD FOR MEMORY SYNCHRONIZATION OF A MULTI-CORE SYSTEM

20170337256 · 2017-11-23

Abstract

A system for memory synchronization of a multi-core system is provided, the system comprising: an assigning module which is configured to assign at least one memory partition to at least one core of the multi-core system; a mapping module which is configured to provide information for translation lookaside buffer shootdown for the multi-core system leveraged by sending an interrupt to the at least one core of the multi-core system, if a page table entry associated with the memory partition assigned to the at least one core is modified; and an interface module which is configured to provide an interface to the assigning module from user-space.

Claims

1. A system for memory synchronization of a multi-core system, the system comprising: a processor configured to execute instructions stored in memory to: operate as an assigning module to assign at least one memory partition to at least one core of the multi-core system; operate as a mapping module to provide information for a translation lookaside buffer (TLB) shootdown for the multi-core system by sending an interrupt to the at least one core of the multi-core system if a page table entry associated with the memory partition assigned to the at least one core is modified; and operate as an interface module to provide an interface for the assigning module operations from user-space.

2. The system according to claim 1, wherein when the processor is operating as the mapping module, the processor provides the information for the TLB shootdown during a copy-on-write page sharing of the at least one memory partition.

3. The system according to claim 1, wherein when the processor is operating as the interface module, the processor provides the interface to the assigning module for the user-space by a set of system calls for controlling a binding of the at least one memory partition to the at least one core.

4. The system according to claim 3, wherein when the processor is operating as the interface module, the processor uses the set of system calls to receive and adapt the TLB shootdown information.

5. A database, comprising: a multi-core system; a memory system with at least one memory partition; and a system for memory synchronization of the multi-core system, the system comprising a processor configured to execute instructions stored in memory to: operate as an assigning module configured to assign at least one memory partition to at least one core of the multi-core system; operate as a mapping module configured to provide information for a translation lookaside buffer (TLB) shootdown for the multi-core system by sending an interrupt to the at least one core of the multi-core system, if a page table entry associated with the memory partition assigned to the at least one core is modified; and operate as an interface module which is configured to provide an interface to the assigning module from user-space.

6. The database according to claim 5, wherein the database is a hybrid online transactional processing and online analytical processing database.

7. The database according to claim 6, wherein the database is configured to perform online transaction processing by ensuring that at least one online transactional processing thread operates on one or more of the at least one memory partition.

8. The database according to claim 5, wherein the database is enabled to provide a controlled dispatching of the TLB shootdown by using a data structure indicating which of the at least one core is bound to which of the at least one memory partition.

9. The database according to claim 5, wherein the interface module is configured to communicate a binding of the at least one memory partition to the at least one core.

10. The database according to claim 9, wherein the interface module is configured to adapt the TLB shootdown by information as received by using the set of system calls.

11. A method for memory synchronization of a multi-core system performed by a processor executing computer instructions stored in memory, the method comprising: operating as an assigning module and assigning (S1) at least one memory partition to at least one core of the multi-core system; providing (S2) a translation lookaside buffer (TLB) shootdown for the multi-core system using a mapping module by sending (S3) an interrupt to the at least one core of the multi-core system, if a page table entry associated with the memory partition assigned to the at least one core is modified; and providing (S4) an interface to the assigning module to a user-space.

12. The method according to claim 11, wherein the step of providing (S2) the TLB shootdown is performed during a copy-on-write page sharing of the at least one memory partition.

13. The method according to claim 11, wherein the step of providing (S2) the TLB shootdown is performed by a set of system calls for controlling a binding of the at least one memory partition to the at least one core.

14. The method according to claim 13, wherein the TLB shootdown is adapted by information as received by using the set of system calls.

15. The method of claim 12, wherein program instructions that define the operations to be performed by the processor are stored in a non-transitory computer readable medium.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0047] Further embodiments of the present disclosure will be described with respect to the following figures, in which:

[0048] FIG. 1 shows a schematic diagram of a system for memory synchronization of a multi-core system according to one embodiment of the present disclosure;

[0049] FIG. 2 shows a database system for memory synchronization of a multi-core system according to an embodiment of the disclosure;

[0050] FIG. 3 is a flowchart diagram of a method for memory synchronization of a multi-core system according to one embodiment of the present disclosure;

[0051] FIG. 4 is a diagram of an operating system kernel according to one embodiment of the present disclosure; and

[0052] FIG. 5 shows a schematic diagram of an operating system kernel according to one embodiment of the present disclosure.

DETAILED DESCRIPTION OF EMBODIMENTS OF THE DISCLOSURE

[0053] In the associated figures, identical reference signs denote identical or at least equivalent elements, parts, units or steps. In addition, it should be noted that the accompanying drawings are not to scale.

[0054] The technical solutions in the embodiments of the present disclosure are described in the following text with detailed reference to the accompanying drawings in the embodiments of the present disclosure.

[0055] The described embodiments are only some embodiments of the present disclosure, rather than all embodiments. Based on the described embodiments of the present disclosure, all other embodiments obtained by persons of ordinary skill in the art without making any creative effort shall fall within the protection scope of the present disclosure.

[0056] FIG. 1 shows a schematic diagram of a system for memory synchronization of a multi-core system according to one embodiment of the present disclosure.

[0057] FIG. 1 shows an embodiment of a system 100 for memory synchronization of a multi-core system 1000, the system 100 comprising: an assigning module 10, a mapping module 20, and an interface module 30.

[0058] The assigning module 10 may be configured to assign at least one memory partition 200-1, . . . , 200-n to at least one core 1010-1, . . . , 1010-n of the multi-core system 1000.

[0059] The mapping module 20 may be configured to provide information for translation lookaside buffer shootdown for the multi-core system 1000 leveraged by sending an interrupt to the at least one core 1010-1, . . . , 1010-n of the multi-core system 1000, if a page table entry associated with the memory partition 200-1, . . . , 200-n assigned to the at least one core 1010-1, . . . , 1010-n is modified.

[0060] The mapping module 20 may provide the mapping information that will be used during TLB shootdown by the operating system kernel 3000, which performs the TLB shootdown.

[0061] The interface module 30 may be configured to provide an interface to the assigning module 10 from user-space.

[0062] FIG. 2 shows a schematic diagram of a database comprising a system for memory synchronization of a multi-core system according to an embodiment of the disclosure.

[0063] FIG. 2 shows a schematic diagram of a database 2000 comprising a multi-core system 1000 with at least one core 1010-1, . . . , 1010-n; a memory system 200 with at least one memory partition 200-1, . . . , 200-n; and a system 100 for memory synchronization of the multi-core system 1000.

[0064] The system 100 may comprise an assigning module 10, a mapping module 20, and an interface module 30.

[0065] FIG. 3 is a flowchart diagram of a method for memory synchronization of a multi-core system according to one embodiment of the present disclosure.

[0066] As a first step of the method, assigning S1 at least one memory partition 200-1, . . . , 200-n to at least one core 1010-1, . . . , 1010-n of the multi-core system 1000 using an assigning module 10 is conducted.

[0067] As a second step of the method, providing S2 a translation lookaside buffer shootdown for the multi-core system 1000 using a mapping module 20 is conducted.

[0068] As a third step of the method, sending S3 an interrupt to the at least one core 1010-1, . . . , 1010-n of the multi-core system 1000 is conducted, if a page table entry associated with the memory partition 200-1, . . . , 200-n assigned to the at least one core 1010-1, . . . , 1010-n is modified.

[0069] As a fourth step of the method, providing S4 an interface to the assigning module 10 to a user-space using an interface module 30 is conducted.

[0070] FIG. 4 is a diagram of an operating system kernel according to one embodiment of the present disclosure.

[0071] FIG. 4 shows that the operating system kernel 3000 issues TLB shootdown inter-processor interrupts (IPIs) towards all hardware processing cores that share the parent address space when an online transactional processing (OLTP) operation attempts to update memory. This happens during copy-on-write (CoW) page-fault handling by the operating system, induced by an exception raised from the protection unit of the hardware memory management unit (MMU).

[0072] According to an embodiment of the present disclosure, the implementation may include modifications or enhancements to the operating system kernel 3000 in order to expose the required interface to the user-space/database applications that control the binding of memory partitions to particular CPU cores for the purposes of TLB shootdown (e.g., by introducing a set of new system calls to the operating system kernel).

[0073] A further advantage is provided by correspondingly enhancing the TLB shootdown functionality of the operating system kernel 3000 that is invoked at page-fault time for copy-on-write pages, so that it consults the information that has been passed from the database 2000 to the operating system kernel 3000 via the new set of system calls.

[0074] In addition, the database 2000 makes better use of the mechanism by ensuring that every OLTP thread operates on one or more memory partitions, and by conveying this information to the kernel 3000 by means of the new system calls.

[0075] According to an embodiment of the present disclosure, in a scenario where a hybrid OLTP/OLAP in-memory database 2000 uses the MMU-based CoW snapshotting mechanism, the operating system kernel 3000 necessarily broadcasts TLB shootdown requests via IPIs to all of the hardware processing cores 1010-1, . . . , 1010-n that share the parent address space by means of threads whenever an OLTP transaction attempts to update any part of the memory that is currently being snapshotted (during CoW page-fault handling), as shown in FIG. 4.

[0076] According to an embodiment of the present disclosure, a controlled dispatching of TLB shootdown IPIs may be installed, which is influenced by user space. The database/application is responsible for reducing the set of cores 1010-1, . . . , 1010-n that receive an IPI by periodically (or on-demand) providing a restricted CPU mask to the kernel by means of a set of newly introduced system calls.

[0077] According to an embodiment of the present disclosure, the CPU mask is a data structure that indicates which CPU cores 1010-1, . . . , 1010-n of the multi-core system 1000 should be included in, and which should be excluded from, the binding. The kernel's invalidation-related functions are modified to use the provided CPU mask for selectively dispatching IPIs during CoW page faults.
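
As an illustration of the CPU mask described above, the following is a minimal user-space sketch in C. The type `sketch_cpumask_t`, its helper functions, and the 64-core limit are assumptions made for this example only; the disclosure does not specify the layout of the mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a CPU mask: one bit per core.
 * Limited to 64 cores to keep the sketch short. */
#define SKETCH_MAX_CPUS 64

typedef struct {
    uint64_t bits;
} sketch_cpumask_t;

/* Exclude every core from the binding. */
static inline void sketch_cpumask_clear(sketch_cpumask_t *m)
{
    m->bits = 0;
}

/* Include one core in the binding. */
static inline void sketch_cpumask_set(sketch_cpumask_t *m, unsigned cpu)
{
    assert(cpu < SKETCH_MAX_CPUS);
    m->bits |= (uint64_t)1 << cpu;
}

/* Report whether a core is included in the binding. */
static inline bool sketch_cpumask_test(const sketch_cpumask_t *m, unsigned cpu)
{
    assert(cpu < SKETCH_MAX_CPUS);
    return (m->bits >> cpu) & 1u;
}

/* Count the included cores by clearing the lowest set bit repeatedly. */
static inline unsigned sketch_cpumask_weight(const sketch_cpumask_t *m)
{
    unsigned n = 0;
    for (uint64_t b = m->bits; b != 0; b &= b - 1)
        n++;
    return n;
}
```

A mask with cores 1 and 3 set would then include exactly two cores and exclude all others.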

[0078] According to an embodiment of the present disclosure, the database 2000 or any application is further responsible for ensuring that its user space threads (doing OLTP transactions) respect the bindings that have been passed to the kernel 3000, by restricting their operation to the corresponding memory region and by operating from the specific cores 1010-1, . . . , 1010-n that are bound to that region.
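
One way the database could derive the restricted mask from its own thread placement is sketched below. The `worker_placement` record and the function `mask_for_partition` are hypothetical names introduced for this example, and the sketch assumes at most 64 cores and that each OLTP worker is pinned to one core and confined to one partition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical placement record: one OLTP worker pinned to one core
 * and confined to one memory partition. */
struct worker_placement {
    unsigned core;       /* core the worker thread runs on */
    unsigned partition;  /* index of the partition it updates */
};

/* Build the mask of cores whose workers operate on the given partition.
 * Bit i of the result is set if some worker on core i uses the partition. */
static uint64_t mask_for_partition(const struct worker_placement *w, size_t n,
                                   unsigned partition)
{
    uint64_t mask = 0;
    for (size_t i = 0; i < n; i++) {
        assert(w[i].core < 64);
        if (w[i].partition == partition)
            mask |= (uint64_t)1 << w[i].core;
    }
    return mask;
}
```

With workers on cores 0 and 1 assigned to partition 0, the function returns a mask with only bits 0 and 1 set for that partition. The resulting mask is what the application would hand to the kernel for the partition's memory region.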

[0079] The system calls that control the TLB-invalidating IPIs operate by specifying a memory block, identified by its starting virtual memory address, and by providing a CPU mask that identifies which cores 1010-1, . . . , 1010-n of the multi-core system 1000 should receive TLB shootdown IPIs when a CoW page fault occurs in that particular block of memory.

[0080] According to an embodiment of the present disclosure, the memory block is automatically inferred by the operating system (there is no need to provide an address-length pair from the perspective of the user space), since the operating system keeps track of allocated regions internally and can easily locate the containing memory block to which the supplied virtual memory address belongs.

[0081] For example, the following system calls can be introduced to an operating system kernel:

[0082] mem_region_set_cpumask(void *addr, cpumask_t *cpumask);

[0083] mem_region_get_cpumask(void *addr, cpumask_t *cpumask);

[0084] The internal implementation locates the containing memory region for the supplied virtual memory address (addr) by consulting a per-process, kernel-internal bookkeeping structure for memory allocations.
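
The lookup of the containing region can be sketched as follows. The `sketch_region` record and the linear scan are assumptions for illustration only; the disclosure does not name the bookkeeping structure, and a real kernel would more likely index its regions with a tree or similar structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping record for one allocated memory block. */
struct sketch_region {
    uintptr_t start;    /* address of the first byte of the block */
    size_t    length;   /* size of the block in bytes */
    uint64_t  cpumask;  /* one bit per core; storage for a supplied mask */
};

/* Return the region that contains addr, or NULL if no region does.
 * The caller supplies only an address; the length comes from the record. */
static struct sketch_region *
sketch_find_region(struct sketch_region *regions, size_t count,
                   const void *addr)
{
    uintptr_t a = (uintptr_t)addr;

    assert(regions != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        /* The subtraction form avoids overflow in start + length. */
        if (a >= regions[i].start && a - regions[i].start < regions[i].length)
            return &regions[i];
    }
    return NULL;
}
```

Any address inside a block resolves to that block's record, which is why user space does not need to pass an address-length pair.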

[0085] According to an embodiment of the present disclosure, the kernel internal structure that corresponds to an allocated memory block is also enhanced in order to store a CPU mask structure, as supplied from the system calls.

[0086] According to an embodiment of the present disclosure, the system calls, after locating the particular structure for the containing memory region, copy the CPU mask from user-space memory to kernel-space memory in the case of mem_region_set_cpumask, or from kernel-space memory to user-space memory in the case of mem_region_get_cpumask. In the case of mem_region_set_cpumask, the CPU mask is copied into kernel space and stored in the structure that identifies the allocated memory block.
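
The set and get semantics can be modeled in user space as sketched below. Everything prefixed with `sim_` is hypothetical: the fixed region table stands in for the kernel's bookkeeping, `memcpy` stands in for the copy across the user/kernel boundary, and the return codes and the treatment of a NULL mask as "restore the default" are assumptions of this sketch, not details given in the disclosure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct { uint64_t bits; } sim_cpumask_t;  /* hypothetical mask type */

struct sim_region {
    uintptr_t     start;
    size_t        length;
    int           has_mask;  /* 0 means default (broadcast) behavior */
    sim_cpumask_t mask;      /* meaningful only when has_mask is nonzero */
};

/* A small fixed table standing in for per-process bookkeeping. */
static struct sim_region sim_regions[] = {
    { 0x10000, 0x4000, 0, { 0 } },
    { 0x20000, 0x8000, 0, { 0 } },
};

static struct sim_region *sim_lookup(const void *addr)
{
    uintptr_t a = (uintptr_t)addr;
    for (size_t i = 0; i < sizeof sim_regions / sizeof sim_regions[0]; i++)
        if (a >= sim_regions[i].start &&
            a - sim_regions[i].start < sim_regions[i].length)
            return &sim_regions[i];
    return NULL;
}

/* Simulated "set": store a copy of the caller's mask in the region record.
 * Returns 0 on success, -1 if addr lies in no known region. */
static int sim_mem_region_set_cpumask(void *addr, const sim_cpumask_t *mask)
{
    struct sim_region *r = sim_lookup(addr);
    if (r == NULL)
        return -1;
    if (mask == NULL) {          /* assumption: NULL restores the default */
        r->has_mask = 0;
        return 0;
    }
    memcpy(&r->mask, mask, sizeof *mask);  /* models the user-to-kernel copy */
    r->has_mask = 1;
    return 0;
}

/* Simulated "get": copy the stored mask out to the caller.
 * Returns 0 on success, -1 if there is no region or no stored mask. */
static int sim_mem_region_get_cpumask(void *addr, sim_cpumask_t *mask)
{
    struct sim_region *r = sim_lookup(addr);
    assert(mask != NULL);
    if (r == NULL || !r->has_mask)
        return -1;
    memcpy(mask, &r->mask, sizeof *mask);  /* models the kernel-to-user copy */
    return 0;
}
```

Because the mask is stored per region, a mask set through one address can be read back through any other address in the same block.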

[0087] FIG. 5 shows a schematic diagram of an operating system kernel during CoW page-fault handling, according to one embodiment of the present disclosure.

[0088] FIG. 5 shows that the operating system kernel 3000, during CoW page-fault handling, can use the binding information provided earlier by the database 2000 (via a set of new system calls) in order to issue TLB shootdown IPIs selectively, instead of broadcasting them to every available core 1010-1, . . . , 1010-n that shares the address space.

[0089] According to an embodiment of the present disclosure, while handling a copy-on-write page fault, the kernel 3000 has enough information (the hardware exception being handled provides the faulting address) to locate the containing memory region and obtain the CPU mask that is associated with that region.

[0090] According to an embodiment of the present disclosure, the kernel 3000 can subsequently use the CPU mask in order to selectively issue TLB shootdown IPIs, directed only towards the cores 1010-1, . . . , 1010-n that are indicated within the corresponding CPU mask, as shown in FIG. 5.

[0091] By default, if no CPU mask has been supplied by the database, the operating system kernel handles the TLB shootdown by broadcasting IPIs to all the cores 1010-1, . . . , 1010-n that share the address space, assuming that there is no information supplied by the user space/database that could be leveraged to narrow the set of cores 1010-1, . . . , 1010-n to receive TLB shootdowns during the CoW page fault.

[0092] According to an embodiment of the present disclosure, in case the application/database needs to revert to the “default” operating system behavior after having supplied a CPU mask via the related system call, it can provide a “null” CPU mask in order to restore the default behavior.
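
The selection of IPI targets during a CoW page fault, including the default broadcast, can be sketched as a single function. The name `ipi_targets`, the 64-bit mask representation, and the exclusion of the faulting core (on the assumption that it flushes its own TLB locally without an IPI) are choices made for this sketch and are not stated in the disclosure.

```c
#include <assert.h>
#include <stdint.h>

/* Compute the set of cores that should receive a TLB shootdown IPI.
 *   sharers     : cores running threads of the faulting address space
 *   region_mask : mask bound to the faulting region (ignored if !has_mask)
 *   has_mask    : nonzero if user space supplied a mask for the region
 *   self        : core handling the fault; assumed to need no IPI
 */
static uint64_t ipi_targets(uint64_t sharers, uint64_t region_mask,
                            int has_mask, unsigned self)
{
    assert(self < 64);

    /* Default: broadcast to every sharer. With a mask: only bound sharers. */
    uint64_t targets = has_mask ? (sharers & region_mask) : sharers;

    /* Never send an IPI to the core that is already handling the fault. */
    return targets & ~((uint64_t)1 << self);
}
```

With eight sharing cores and no mask, a fault on core 0 targets the other seven cores. With a mask that binds the region to cores 1 and 2, a fault on core 1 targets core 2 only.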

[0093] This reference implementation serves illustrative purposes only. The same scheme could be implemented via different kinds of interfaces and semantics between the database/application and the operating system kernel 3000.

[0094] According to an embodiment of the present disclosure, the memory management related operating system kernel calls (e.g., mmap or a new system call) could be enhanced to accommodate a parameter that indicates the CPU mask (i.e., the binding of a memory region to a set of processing cores 1010-1, . . . , 1010-n).

[0095] Another example would be a modification and extension of existing operating system kernel calls (e.g., mmap) that introduces a new flag (e.g., MAP_BIND). The flag would indicate implicitly that the memory region to be allocated by the call should be bound, for the purposes of TLB shootdown IPIs, to the CPU on which the system call was invoked (e.g., the CPU of the thread allocating the memory region).

[0096] The reference implementation, however, offers more flexibility, since it allows for explicit binding of memory regions to processing cores 1010-1, . . . , 1010-n, and for modification of the bindings throughout the runtime of the database system 2000.

[0097] The present disclosure also supports a computer program product including computer executable code or computer executable instructions that, when executed, cause at least one computer to execute the performing and computing steps described herein.

[0098] Many alternatives, modifications, and variations will be apparent to those skilled in the art in light of the above teachings. Of course, those skilled in the art readily recognize that there are numerous applications of the disclosure beyond those described herein.

[0099] While the present disclosure has been described with reference to one or more particular embodiments, those skilled in the art recognize that many changes may be made thereto without departing from the scope of the present disclosure. It is therefore to be understood that within the scope of the appended claims and their equivalents, the disclosure may be practiced otherwise than as specifically described herein.

[0100] In the claims, the word “comprising” does not exclude other elements or steps, and the indefinite article “a” or “an” does not exclude a plurality. A single processor or other unit may fulfill the functions of several items recited in the claims.

[0101] The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage. A computer program may be stored or distributed on a suitable medium, such as an optical storage medium or a solid-state medium supplied together with or as part of other hardware, but may also be distributed in other forms, such as via the Internet or other wired or wireless telecommunication systems.