DISTRIBUTED METADATA MANAGEMENT CONSISTENCY ASSURANCE METHOD, DEVICE, SYSTEM AND APPLICATION

20220050809 · 2022-02-17

Abstract

A distributed metadata management consistency assurance method, device, system and application are provided. A consistent node is deployed in a metadata cluster, the client sends a metadata update request to the consistent node, and the consistent node returns a metadata update success message to the client, sequentially records the metadata update request, marks old metadata as invalidated, and deletes the invalidation mark after asynchronous data synchronization with the metadata server. The client sends a metadata read operation to the metadata server. If an object of the metadata read operation is marked as invalidated, read data that has not yet completed asynchronous data synchronization is returned via the consistent node; otherwise, the read data is directly returned via the metadata server where the metadata is located. The disclosure can ensure consistency of distributed metadata management, and improve metadata access performance as far as possible while ensuring the consistency of metadata update.

Claims

1. A distributed metadata management consistency assurance method, characterized by comprising the following implementation steps: 1) intercepting a metadata operation request from a client, turning to step 2) if the metadata operation request is a metadata update operation; or turning to step 3) if the metadata operation request is a metadata read operation for marked invalidated metadata; 2) returning a metadata update success message to the client, sequentially recording the metadata update request, marking old metadata stored in a metadata server where the metadata is located as invalidated, asynchronously synchronizing the sequentially recorded metadata update request to the metadata server where the metadata is located, and deleting an invalidation mark of the synchronized metadata; and quitting; and 3) returning the metadata that has not completed asynchronous synchronization to the client and quitting.

2. The distributed metadata management consistency assurance method according to claim 1, characterized in that detailed steps of the step 2) comprise: 2.1) returning the metadata update success message to the client; 2.2) encapsulating the metadata update operation into a log, and persisting the log to a storage device with an atomic write operation so that the metadata update request has been persisted to a metadata cluster under the condition of ensuring consistency; 2.3) sending an invalidation message to the metadata server where the metadata is located, and marking the old metadata stored in the metadata server where the metadata is located as invalidated; and 2.4) asynchronously synchronizing the sequentially recorded metadata update request to the metadata server where the metadata is located periodically, and deleting the invalidation mark of the synchronized metadata; and quitting.

3. A distributed metadata management consistency assurance device, characterized in that, comprising: an operation request judgment program unit configured to intercept a metadata operation request from a client, turn to execute an update operation processing program unit if the metadata operation request is a metadata update operation, or turn to execute a read operation processing program unit if the metadata operation request is a metadata read operation; the update operation processing program unit configured to return a metadata update success message to the client, sequentially record the metadata update request, and mark old metadata stored in a metadata server where the metadata is located as invalidated; and asynchronously synchronize the sequentially recorded metadata update request to the metadata server where the metadata is located, and delete an invalidation mark of the synchronized metadata; and the read operation processing program unit configured to return the metadata that has not completed asynchronous synchronization to the client.

4. A distributed metadata management consistency assurance device, comprising a consistent node composed of at least one computer installation, characterized in that, the consistent node is programmed to execute following implementation steps: 1) intercepting a metadata operation request from a client, turning to step 2) if the metadata operation request is a metadata update operation; or turning to step 3) if the metadata operation request is a metadata read operation for marked invalidated metadata; 2) returning a metadata update success message to the client, sequentially recording a metadata update request, marking old metadata stored in a metadata server where the metadata is located as invalidated, asynchronously synchronizing the sequentially recorded metadata update request to the metadata server where the metadata is located, and deleting an invalidation mark of the synchronized metadata; and quitting; and 3) returning the metadata that has not completed asynchronous synchronization to the client and quitting.

5. A distributed metadata management consistency assurance system, comprising a client and at least one metadata server, wherein the client and the metadata server are connected with the consistent node according to claim 4.

6. An application method of the distributed metadata management consistency assurance system, wherein the distributed metadata management consistency assurance system comprises a client and at least one metadata server, and the application method comprises following implementation steps: S1) the client judges type of a metadata operation request to be initiated, turns to execute step S2) if the metadata operation request is a metadata update operation, or turns to execute step S3) if the metadata operation request is a metadata read operation; S2) the client selects one consistent node and sends the metadata update operation to the selected consistent node, and ends and quits after receiving a metadata update success message from the consistent node; S3) the client sends the metadata read operation to a target metadata server of the metadata read operation; S4) the target metadata server judges whether there is an invalidation mark on target metadata of the metadata read operation, returns client target metadata to the client if there is no invalidation mark on the target metadata, and the client ends and quits after receiving the returned target metadata; the target metadata server returns a target metadata invalidation message to the client if there is the invalidation mark on the target metadata; and the client turns to execute step S5) after receiving the returned target metadata invalidation message; and S5) the client selects another one consistent node, sends the metadata read operation for the marked invalidated metadata to the selected another one consistent node, and ends and quits after receiving the metadata returned by the another one consistent node to the client that has not completed asynchronous synchronization.

7. The application method of the distributed metadata management consistency assurance system according to claim 6, wherein when the client selects the one consistent node in the step S2) and selects the another one consistent node in the step S5), the client selects a corresponding consistent node according to a filename of the metadata update operation or the metadata read operation.

8. The application method of the distributed metadata management consistency assurance system according to claim 7, wherein the step of selecting the corresponding consistent node according to the filename of the metadata update operation or the metadata read operation specifically refers to using a Hash function h(x) to select the consistent node numbered as h(filename)%N for the filename of the metadata update operation or the metadata read operation, where % is a remainder symbol and N is the number of consistent nodes in a metadata cluster.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0035] FIG. 1 is a structural diagram of a metadata cluster in an embodiment of the present invention.

[0036] FIG. 2 is a schematic diagram showing the basic process of a method in an embodiment of the present invention.

[0037] FIG. 3 is a schematic diagram showing the metadata update process in an embodiment of the present invention.

[0038] FIG. 4 is a schematic diagram showing the metadata read process in an embodiment of the present invention.

DESCRIPTION OF THE EMBODIMENTS

[0039] Traditional large-scale file systems consist of three types of nodes, namely clients, metadata servers and data servers. The client is the initiator of all read-write requests. The metadata server is responsible for storing file metadata and responding to metadata requests from the client. The data server is responsible for storing the file data and responding to the data requests from the client. In the present embodiment, a new type of node, named the consistent node, is introduced on the basis of the above infrastructure of the traditional large-scale file system to execute metadata management consistency assurance, and the specific architecture is shown in FIG. 1. FIG. 1 shows the architecture of a file system consisting of 4 clients, 3 metadata servers and 2 consistent nodes, in which "dirty" represents dirty data, and the data server is omitted as it is not relevant to the present invention. In the architecture, the functions of the client and the metadata server are the same as those of the traditional file systems, the consistent nodes and the metadata servers jointly form a metadata cluster, and the consistent nodes are mainly configured to ensure metadata consistency at low overhead during metadata update.

[0040] As shown in FIG. 2, a distributed metadata management consistency assurance method in the present embodiment comprises the following implementation steps:

[0041] 1) intercepting a metadata operation request from a client, turning to step 2) if the metadata operation request is a metadata update operation; or turning to step 3) if the metadata operation request is a metadata read operation for marked invalidated metadata;

[0042] 2) returning a metadata update success message to the client, sequentially recording the metadata update request, marking old metadata stored in a metadata server where the metadata is located as invalidated, asynchronously synchronizing the sequentially recorded metadata update request to the metadata server where the metadata is located, and deleting an invalidation mark of the synchronized metadata; and quitting; and

[0043] 3) returning the metadata that has not completed asynchronous synchronization to the client and quitting.

It should be noted that in the present embodiment, the metadata read operation of the client is sent to the metadata server first. The client sends the metadata read operation to the consistent node only after the metadata server confirms that the metadata is marked as invalidated.

[0044] The distributed metadata management consistency assurance method of the present embodiment first proposes the above large-scale file system architecture, which introduces consistent nodes, and on that basis designs a new metadata read-write process that takes both consistency and high performance into account.

[0045] As shown in FIG. 3, detailed steps of the step 2) comprise:

[0046] 2.1) returning the metadata update success message to the client;

[0047] 2.2) encapsulating the metadata update operation into a log, and persisting the log to a storage device with an atomic write operation so that the metadata update request has been persisted to a metadata cluster under the condition of ensuring consistency;

[0048] 2.3) sending an invalidation message to the metadata server where the metadata is located, and marking the old metadata stored in the metadata server where the metadata is located as invalidated (this step does not require reading or writing a storage device); and

[0049] 2.4) asynchronously synchronizing the sequentially recorded metadata update request to the metadata server where the metadata is located periodically, and deleting the invalidation mark of the synchronized metadata; and quitting.
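
By way of illustration only, steps 2.1) to 2.4), together with step 3), might be sketched as follows. The class names `MetadataServer` and `ConsistentNode`, the method names, and the use of an in-memory list as a stand-in for the persisted log are all assumptions of this sketch and do not appear in the embodiment. The sketch sends the success message after the log append, which is one possible ordering.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataServer:
    """Hypothetical in-memory stand-in for a metadata server."""
    store: dict = field(default_factory=dict)   # filename -> metadata
    invalid: set = field(default_factory=set)   # filenames marked invalidated

    def mark_invalid(self, filename):
        self.invalid.add(filename)              # kept in memory only

    def apply(self, filename, metadata):
        self.store[filename] = metadata
        self.invalid.discard(filename)          # delete the invalidation mark


@dataclass
class ConsistentNode:
    """Hypothetical consistent node holding a sequential update log."""
    server: MetadataServer
    log: list = field(default_factory=list)     # stand-in for the persisted log

    def update(self, filename, metadata):
        self.log.append((filename, metadata))   # step 2.2: sequential record
        self.server.mark_invalid(filename)      # step 2.3: invalidate old copy
        return "success"                        # step 2.1: acknowledge the client

    def read_unsynced(self, filename):
        # step 3): newest logged value that has not been synchronized yet
        for name, metadata in reversed(self.log):
            if name == filename:
                return metadata
        return None

    def sync(self):
        # step 2.4: replay the log in order, then drop the applied entries
        for filename, metadata in self.log:
            self.server.apply(filename, metadata)
        self.log.clear()
```

In this sketch `sync()` would be called periodically by whatever scheduler the deployment uses; between an `update()` and the next `sync()`, the metadata server still holds the old value but carries the invalidation mark.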

[0050] Correspondingly, the present embodiment also provides a distributed metadata management consistency assurance device, comprising:

[0051] an operation request judgment program unit configured to intercept a metadata operation request from a client, turn to execute an update operation processing program unit if the metadata operation request is a metadata update operation, or turn to execute a read operation processing program unit if the metadata operation request is a metadata read operation;

[0052] the update operation processing program unit configured to return a metadata update success message to the client, sequentially record the metadata update request, and mark old metadata stored in a metadata server where the metadata is located as invalidated; and asynchronously synchronize the sequentially recorded metadata update request to the metadata server where the metadata is located, and delete an invalidation mark of the synchronized metadata; and

[0053] the read operation processing program unit configured to return the metadata that has not completed asynchronous synchronization to the client.

[0054] As shown in FIG. 2, the present embodiment also provides a distributed metadata management consistency assurance device, comprising a consistent node composed of at least one computer installation, and the consistent node is programmed to execute the steps of the distributed metadata management consistency assurance method of the present embodiment.

[0055] As shown in FIG. 3 and FIG. 4, the present embodiment also provides an application method of the distributed metadata management consistency assurance system, comprising the following implementation steps:

[0056] S1) the client judges type of a metadata operation request to be initiated, turns to execute step S2) if the metadata operation request is a metadata update operation, or turns to execute step S3) if the metadata operation request is a metadata read operation;

[0057] S2) the client selects one consistent node and sends the metadata update operation to the selected consistent node, and ends and quits after receiving a metadata update success message from the consistent node;

[0058] S3) the client sends the metadata read operation to a target metadata server of the metadata read operation;

[0059] S4) the target metadata server judges whether there is an invalidation mark on the target metadata of the metadata read operation; if there is no invalidation mark on the target metadata, the target metadata server returns the target metadata to the client, and the client ends and quits after receiving the returned target metadata; if there is an invalidation mark on the target metadata, the target metadata server returns a target metadata invalidation message to the client, and the client turns to execute step S5) after receiving the returned target metadata invalidation message; and

[0060] S5) the client selects one consistent node, sends the metadata read operation for the marked invalidated metadata to the selected consistent node, and ends and quits after receiving, from the consistent node, the metadata that has not completed asynchronous synchronization.
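
By way of illustration only, the client-side flow of steps S1) to S5) might be sketched as follows. The stub classes, the `INVALIDATED` sentinel standing in for the target metadata invalidation message, and the `pick_node` callback are assumptions of this sketch, not elements of the embodiment.

```python
INVALIDATED = object()  # hypothetical stand-in for the invalidation message


class StubMetadataServer:
    def __init__(self):
        self.store, self.invalid = {}, set()

    def read(self, filename):
        # S4: report the invalidation mark instead of returning old metadata
        if filename in self.invalid:
            return INVALIDATED
        return self.store.get(filename)


class StubConsistentNode:
    def __init__(self, server):
        self.server, self.pending = server, {}

    def update(self, filename, metadata):
        self.pending[filename] = metadata      # not yet synchronized
        self.server.invalid.add(filename)      # mark the old copy invalidated
        return "success"

    def read(self, filename):
        return self.pending.get(filename)


def client_request(op, filename, server, pick_node, metadata=None):
    """Route one metadata operation the way steps S1) to S5) describe."""
    if op == "update":                         # S1 -> S2
        return pick_node(filename).update(filename, metadata)
    result = server.read(filename)             # S1 -> S3
    if result is INVALIDATED:                  # S4 -> S5
        return pick_node(filename).read(filename)
    return result                              # S4: server copy is current
```

A read therefore costs one round trip when the metadata server holds the current version, and a second round trip to a consistent node only when the target metadata carries an invalidation mark.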

[0061] In the present embodiment, when the client selects one consistent node in the step S2) and the step S5), the client specifically selects the corresponding consistent node according to the filename of the metadata update operation or the metadata read operation, by which load balancing across multiple consistent nodes can be implemented.

[0062] In the present embodiment, the step of selecting the corresponding consistent node according to the filename of the metadata update operation or the metadata read operation specifically refers to using a Hash function h(x) to select the consistent node numbered as h(filename)%N for the filename of the metadata update operation or the metadata read operation, where % is a remainder symbol and N is the number of consistent nodes in a metadata cluster. The advantage of this method is as follows: when the client reads the metadata cluster and finds that the requested metadata is located on the consistent node but has not been synchronized to the metadata server, the client can determine the consistent node holding the latest update by hashing the filename, and can thereby obtain the latest metadata. Because the latest metadata is located from the filename alone, this method incurs no additional storage or IO overhead.
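
By way of illustration only, the selection rule h(filename)%N might be sketched as follows. The embodiment does not specify the Hash function h(x); the use of SHA-256 truncated to 64 bits and the function name `select_consistent_node` are assumptions of this sketch.

```python
import hashlib


def select_consistent_node(filename: str, n_nodes: int) -> int:
    """Return the index h(filename) % N of the consistent node for a filename."""
    if n_nodes <= 0:
        raise ValueError("n_nodes must be positive")
    digest = hashlib.sha256(filename.encode("utf-8")).digest()
    h = int.from_bytes(digest[:8], "big")  # h(x): assumed hash, not from the source
    return h % n_nodes
```

A digest-based hash is used here rather than Python's built-in `hash()`, because the built-in string hash is randomized per process by default, and the client that updates a file and the client that later reads it must arrive at the same node number.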

[0063] In the present embodiment, an application program of the client sends out the metadata update request through a system call which is embedded into the client of the distributed file system as designed by the present invention through a virtual file system.

[0064] In the distributed file system with the consistent node as designed in the present embodiment, when the client sends out the metadata update request, an update log is first sent to the consistent node (in contrast, the traditional distributed file system sends the update request directly from the client to the metadata server). After receiving the update log, the consistent node quickly persists the update log to a local storage device. Once the persistence operation succeeds, a confirmation message of metadata update operation success is returned to the client. Two features of the metadata update process allow the metadata update to achieve higher performance: first, no matter how many metadata servers are involved in the metadata update operation, the client only needs to interact with one consistent node through a single network interaction, so network delay is significantly reduced; and second, the persistence operation on the consistent node is a sequential write of the log. Sequential writes perform well on all storage devices, which further reduces the delay of persisting the metadata update. At this point, the metadata update sent by the client is only reflected in the consistent node, and has not been committed to the metadata server. However, from the perspective of the whole metadata cluster, the update status has been recorded under the condition of ensuring consistency and persistence, and the remaining work is to implement the data synchronization between the consistent node and the metadata server in the metadata cluster. As the consistent node has informed the client that the updated metadata has been persisted to the metadata cluster, the client does not have to wait for the updated metadata to be synchronized from the consistent node to the metadata server, and can directly turn to execute other tasks.
Therefore, the data synchronization from the consistent node to the metadata server does not occur on the critical IO path and can be executed asynchronously. In the present embodiment, the update log on the consistent node is committed to the metadata server when the metadata server has a relatively light load. As the data synchronization from the consistent node to the metadata server is asynchronous, the update status of the metadata server lags slightly behind that of the consistent node. During the period in which the consistent node and the metadata server are not yet synchronized, a client that reads the metadata server cannot get the latest metadata from it. In order to reduce the negative impact of asynchronous data update, in the present embodiment, after the metadata update log is persisted to the consistent node, the consistent node immediately sends a notification to the metadata server, informing it that the metadata update has been persisted to the consistent node but will be synchronized to the metadata server later. The above notification can be completed through a single network interaction, and the metadata server only needs to record this information in memory after receiving the notification, without reading or writing a storage device, so the notification adds little overhead. In the present embodiment, as the sequentially recorded metadata update request is synchronized asynchronously to the metadata server where the metadata is located, the metadata update request can be synchronized to the metadata server when the metadata server is relatively idle.
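
By way of illustration only, the sequential persistence of the update log on the consistent node might be sketched as follows. The one-JSON-record-per-line format, the function names, and the log path parameter are assumptions of this sketch; it also does not address how a record torn by a crash in the middle of a write would be detected, which an atomic write operation would have to handle.

```python
import json
import os


def append_update_log(log_path: str, filename: str, metadata: dict) -> str:
    """Append one update record to the log and force it to stable storage."""
    record = json.dumps({"file": filename, "meta": metadata}) + "\n"
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(record)        # append-only, so records are written in order
        log.flush()              # push Python's buffer to the operating system
        os.fsync(log.fileno())   # ask the OS to flush the file to the device
    return "success"             # confirmation is produced only after persistence


def replay_update_log(log_path: str) -> list:
    """Read the records back in the order they were written."""
    with open(log_path, encoding="utf-8") as log:
        return [json.loads(line) for line in log if line.strip()]
```

Because every record is appended to the end of a single file, the consistent node never seeks within the log on the update path, and `replay_update_log` yields the records in the order needed for the later synchronization to the metadata server.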

[0065] The metadata cluster designed according to the present embodiment comprises a consistent node and a metadata server, and the data on these two nodes may be out of sync. However, such divergence can only affect recently updated metadata, and the latest version of most metadata is still saved on the metadata server. Therefore, the client still interacts with the metadata server first when the client initiates a metadata read request. In the memory of the metadata server, there are marks to indicate which metadata has been persisted to the consistent node but has not yet been synchronized to the metadata server. If the metadata requested by the client falls into this class, the metadata server will take the initiative to obtain the latest metadata from the consistent node, return the latest metadata to the client, and update the metadata saved by itself to the latest state. If the metadata saved by the metadata server itself is the latest version (that is, there is no mark in the memory indicating that the latest version of the metadata is on the consistent node), the latest version can be returned directly to the client.
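
By way of illustration only, the read path of this paragraph, in which the metadata server itself obtains the latest metadata from the consistent node, might be sketched as follows. The class names `PendingStore` and `ReadThroughMetadataServer` and the method `fetch_latest` are assumptions of this sketch. The sketch leaves the in-memory mark in place after a read, on the assumption that the mark is deleted only by the regular asynchronous synchronization.

```python
class PendingStore:
    """Hypothetical consistent-node view: updates not yet synchronized."""

    def __init__(self):
        self.pending = {}  # filename -> latest metadata

    def fetch_latest(self, filename):
        return self.pending[filename]


class ReadThroughMetadataServer:
    """Hypothetical metadata server that consults the consistent node on reads."""

    def __init__(self, consistent_node):
        self.node = consistent_node
        self.store = {}     # metadata saved on this server
        self.marks = set()  # in-memory marks: newer version is on the node

    def read(self, filename):
        if filename in self.marks:
            latest = self.node.fetch_latest(filename)
            self.store[filename] = latest  # bring the saved copy up to date
            return latest
        return self.store.get(filename)    # saved copy is already the latest
```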

[0066] The above are only preferred embodiments of the present invention, and the protection scope of the present invention is not limited to the embodiments mentioned above. All technical solutions under the ideas of the present invention fall into the protection scope of the present invention. It should be pointed out that, for a person of ordinary skill in the art, improvements and modifications made without departing from the principle of the present invention shall also be deemed to fall within the protection scope of the present invention.