One-dimensional partial Fourier parallel magnetic resonance imaging method based on deep convolutional network

11327137 · 2022-05-10

Abstract

The present disclosure relates to a 1D partial Fourier parallel magnetic resonance imaging method with a deep convolutional network and belongs to the technical field of magnetic resonance imaging. The method includes steps of: creating a sample set and a sample label set for training; constructing an initial deep convolutional network model; inputting a training sample of the sample set to the initial deep convolutional network model for a forward process, comparing an output result of the forward process with an expected result in the sample label set, and performing training with a gradient descent method until a parameter of each layer that maximizes consistency between the output result and the expected result is obtained; creating an optimal deep convolutional network model by using the obtained parameter of each layer; and inputting a multi-coil undersampled image sampled online to the optimal deep convolutional network model, performing the forward process on the optimal deep convolutional network model, and outputting a reconstructed single-channel full-sampled image. The present disclosure can effectively remove noise from the reconstructed image, reconstructs a magnetic resonance image with a better visual effect, and has high practical value.

Claims

1. A one-dimensional partial Fourier parallel magnetic resonance imaging method based on a deep convolutional network, comprising: creating, based on an existing undersampled multi-channel magnetic resonance image, a sample set and a sample label set for training; constructing an initial deep convolutional network model comprising an input layer, L convolutional layers and an output layer which are sequentially connected; inputting a training sample (x, y) of the sample set to the initial deep convolutional network model for a forward process, comparing an output result of the forward process with an expected result in the sample label set, and training with a gradient descent method until a parameter of each layer which maximizes consistency between the output result and the expected result is obtained; creating an optimal deep convolutional network model by using the obtained parameter of each layer; and inputting a multi-coil undersampled image sampled online to the optimal deep convolutional network model, performing the forward process on the optimal deep convolutional network model, and outputting a reconstructed full-sampled image; wherein the gradient descent method comprises: for the training sample (x, y), calculating a gradient of a last convolutional layer C_L according to the following formula: δ^L = ∂J/∂b_L = (∂J/∂D_L)(∂D_L/∂b_L) = C_L − y, wherein ∂D_l/∂b_l = 1 and C_l = σ(D_l); updating a gradient δ^l of an lth-layer nonlinear mapping layer by the following formula: δ^l = ∂J/∂b_l = (∂J/∂D_{l+1})(∂D_{l+1}/∂C_l)(∂C_l/∂D_l) = (δ^{l+1} * W_{l+1}) ∘ ∂σ(D_l)/∂D_l, wherein * denotes a cross-correlation operation, and ∘ denotes that array elements are multiplied element by element; obtaining a gradient of each of the L convolutional layers as: ∂J/∂W_l = (∂J/∂D_l)(∂D_l/∂W_l) = δ^l * D_{l−1} and ∂J/∂b_l = (∂J/∂D_l)(∂D_l/∂b_l) = ∂J/∂D_l = δ^l; and updating a parameter of each of the L convolutional layers based on the calculated gradient of each of the L convolutional layers.

2. The one-dimensional partial Fourier parallel magnetic resonance imaging method based on the deep convolutional network according to claim 1, wherein the training sample in the sample set is a coincident undersampled image extraction patch extracted from an existing offline multi-coil undersampled image, and a sample label in the label set is a square root of a sum of squares of a full-sampled multi-channel image extraction patch corresponding to the undersampled image extraction patch.

3. The one-dimensional partial Fourier parallel magnetic resonance imaging method based on the deep convolutional network according to claim 2, wherein the undersampled image extraction patch used as the training sample is obtained according to the following formula: arg min_Θ { (1/(2TN)) Σ_{t=1}^{T} Σ_{n=1}^{N} ‖C(x_{t,n}; Θ) − y_{t,n}‖₂² } wherein x is the undersampled image extraction patch, y is the corresponding full-sampled image extraction patch, C is an end-to-end mapping relationship estimated with a hidden layer parameter Θ = {(W_1, b_1), . . . , (W_l, b_l), . . . , (W_L, b_L)}, T is a number of samples extracted from an image, and N is a total number of images.

4. The one-dimensional partial Fourier parallel magnetic resonance imaging method based on the deep convolutional network according to claim 1, wherein the existing offline multi-coil undersampled image is obtained by undersampling a K-space multi-coil full-sampled image.

5. The one-dimensional partial Fourier parallel magnetic resonance imaging method based on the deep convolutional network according to claim 1, wherein the L convolutional layers of the initial deep convolutional network model are created in the following manner: C_0 = x; C_1 = σ(W_1 * x + b_1); C_l = σ(W_l * C_{l−1} + b_l) for l ∈ {2, . . . , L−1}; and C_L = σ(W_L * C_{L−1} + b_L); wherein C denotes a convolutional layer and x denotes an input sample; W_1 is a convolution operator of a first convolutional layer C_1 with a size of c×M_1×M_1×n_1, b_1 is an element-related n_1-dimensional offset, c is a number of image channels, M_1 is a filter size and n_1 is a number of filters; W_l is a convolution operator of an l-th convolutional layer C_l with a size of n_{l−1}×M_l×M_l×n_l, b_l is an element-related n_l-dimensional offset, M_l is a filter size and n_l is a number of filters; and W_L is a convolution operator of a last convolutional layer C_L with a size of n_{L−1}×M_L×M_L×c, wherein b_L is an element-related n_L-dimensional offset, c is a number of image channels, M_L is a filter size, and n_L is a number of filters.

6. The one-dimensional partial Fourier parallel magnetic resonance imaging method based on the deep convolutional network according to claim 5, wherein the initial deep convolutional network model further comprises activation layers connected to one or more of the L convolutional layers.

7. The one-dimensional partial Fourier parallel magnetic resonance imaging method based on the deep convolutional network according to claim 6, wherein the activation layers use a ReLU activation function.

8. The one-dimensional partial Fourier parallel magnetic resonance imaging method based on the deep convolutional network according to claim 5, wherein the initial deep convolutional network model comprises the input layer, a first convolutional layer, a second convolutional layer, a third convolutional layer and the output layer, wherein the first convolutional layer is connected to a first activation layer and the second convolutional layer is connected to a second activation layer.

9. The one-dimensional partial Fourier parallel magnetic resonance imaging method based on the deep convolutional network according to claim 8, wherein the output layer uses a EuclideanLoss function.

10. A one-dimensional partial Fourier parallel magnetic resonance imaging apparatus based on a deep convolutional network, comprising: a module configured to create, based on an existing undersampled multi-channel magnetic resonance image, a sample set and a sample label set for training; a module configured to construct an initial deep convolutional network model comprising an input layer, L convolutional layers and an output layer which are sequentially connected; a module configured to input a training sample (x, y) of the sample set to the initial deep convolutional network model for a forward process, compare an output result of the forward process with an expected result in the sample label set, and perform training with a gradient descent method until a parameter of each layer which maximizes consistency between the output result and the expected result is obtained; a module configured to create an optimal deep convolutional network model by using the obtained parameter of each layer; and a module configured to input a multi-coil undersampled image sampled online to the optimal deep convolutional network model, perform the forward process on the optimal deep convolutional network model, and output a reconstructed full-sampled image; wherein the gradient descent method comprises: for the training sample (x, y), calculating a gradient of a last convolutional layer C_L according to the following formula: δ^L = ∂J/∂b_L = (∂J/∂D_L)(∂D_L/∂b_L) = C_L − y, wherein ∂D_l/∂b_l = 1 and C_l = σ(D_l); updating a gradient δ^l of an lth-layer nonlinear mapping layer by the following formula: δ^l = ∂J/∂b_l = (∂J/∂D_{l+1})(∂D_{l+1}/∂C_l)(∂C_l/∂D_l) = (δ^{l+1} * W_{l+1}) ∘ ∂σ(D_l)/∂D_l, wherein * denotes a cross-correlation operation, and ∘ denotes that array elements are multiplied element by element; obtaining a gradient of each of the L convolutional layers as: ∂J/∂W_l = (∂J/∂D_l)(∂D_l/∂W_l) = δ^l * D_{l−1} and ∂J/∂b_l = (∂J/∂D_l)(∂D_l/∂b_l) = ∂J/∂D_l = δ^l; and updating a parameter of each of the L convolutional layers based on the calculated gradient of each of the L convolutional layers.

11. A non-transitory computer readable medium, which is configured to store programs, wherein the programs are computer-executable and cause a computer to perform processing comprising: creating, based on an existing undersampled multi-channel magnetic resonance image, a sample set and a sample label set for training; constructing an initial deep convolutional network model comprising an input layer, L convolutional layers and an output layer which are sequentially connected; inputting a training sample (x, y) of the sample set to the initial deep convolutional network model for a forward process, comparing an output result of the forward process with an expected result in the sample label set, and performing training with a gradient descent method until a parameter of each layer which maximizes consistency between the output result and the expected result is obtained; creating an optimal deep convolutional network model by using the obtained parameter of each layer; and inputting a multi-coil undersampled image sampled online to the optimal deep convolutional network model, performing the forward process on the optimal deep convolutional network model, and outputting a reconstructed full-sampled image; wherein the gradient descent method comprises: for the training sample (x, y), calculating a gradient of a last convolutional layer C_L according to the following formula: δ^L = ∂J/∂b_L = (∂J/∂D_L)(∂D_L/∂b_L) = C_L − y, wherein ∂D_l/∂b_l = 1 and C_l = σ(D_l); updating a gradient δ^l of an lth-layer nonlinear mapping layer by the following formula: δ^l = ∂J/∂b_l = (∂J/∂D_{l+1})(∂D_{l+1}/∂C_l)(∂C_l/∂D_l) = (δ^{l+1} * W_{l+1}) ∘ ∂σ(D_l)/∂D_l, wherein * denotes a cross-correlation operation, and ∘ denotes that array elements are multiplied element by element; obtaining a gradient of each of the L convolutional layers as: ∂J/∂W_l = (∂J/∂D_l)(∂D_l/∂W_l) = δ^l * D_{l−1} and ∂J/∂b_l = (∂J/∂D_l)(∂D_l/∂b_l) = ∂J/∂D_l = δ^l; and updating a parameter of each of the L convolutional layers based on the calculated gradient of each of the L convolutional layers.

12. The computer readable medium according to claim 11, wherein a training sample in the sample set is a coincident undersampled image extraction patch extracted from an existing offline multi-coil undersampled image, and a sample label in the label set is a square root of a sum of squares of a full-sampled multi-channel image extraction patch corresponding to the undersampled image extraction patch.

13. The computer readable medium according to claim 12, wherein the undersampled image extraction patch as the training sample is obtained according to the following formula: arg min_Θ { (1/(2TN)) Σ_{t=1}^{T} Σ_{n=1}^{N} ‖C(x_{t,n}; Θ) − y_{t,n}‖₂² } wherein x is the undersampled image extraction patch, y is the corresponding full-sampled image extraction patch, C is an end-to-end mapping relationship estimated with a hidden layer parameter Θ = {(W_1, b_1), . . . , (W_l, b_l), . . . , (W_L, b_L)}, T is a number of samples extracted from an image, and N is a total number of images.

14. The computer readable medium according to claim 11, wherein the existing offline multi-coil undersampled image is obtained by undersampling a K-space multi-coil full-sampled image.

15. The computer readable medium according to claim 11, wherein the L convolutional layers of the initial deep convolutional network model are created in the following manner: C_0 = x; C_1 = σ(W_1 * x + b_1); C_l = σ(W_l * C_{l−1} + b_l) for l ∈ {2, . . . , L−1}; and C_L = σ(W_L * C_{L−1} + b_L); wherein C denotes a convolutional layer and x denotes an input sample; W_1 is a convolution operator of a first convolutional layer C_1 with a size of c×M_1×M_1×n_1, b_1 is an element-related n_1-dimensional offset, c is a number of image channels, M_1 is a filter size and n_1 is a number of filters; W_l is a convolution operator of an l-th convolutional layer C_l with a size of n_{l−1}×M_l×M_l×n_l, b_l is an element-related n_l-dimensional offset, M_l is a filter size and n_l is a number of filters; and W_L is a convolution operator of a last convolutional layer C_L with a size of n_{L−1}×M_L×M_L×c, wherein b_L is an element-related n_L-dimensional offset, c is a number of image channels, M_L is a filter size, and n_L is a number of filters.

16. The computer readable medium according to claim 15, wherein the initial deep convolutional network model further comprises activation layers connected to one or more of the L convolutional layers.

17. The computer readable medium according to claim 16, wherein the activation layers use a ReLU activation function, and the output layer uses a EuclideanLoss function.

18. The computer readable medium according to claim 15, wherein the initial deep convolutional network model comprises the input layer, a first convolutional layer, a second convolutional layer, a third convolutional layer and the output layer, wherein the first convolutional layer is connected to a first activation layer and the second convolutional layer is connected to a second activation layer.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

(1) FIG. 1 is a general concept of a one-dimensional partial Fourier parallel magnetic resonance imaging method based on a deep convolutional network according to the present disclosure;

(2) FIG. 2 is a flowchart of a method of the present disclosure;

(3) FIG. 3A is a forward conduction process of a sample over a deep convolutional network offline, and FIG. 3B is an example of a training framework of a deep convolutional network; and

(4) FIG. 4A is a full-sampled image, FIG. 4B is a one-dimensional uniform undersampling pattern used by GRAPPA and SPIRiT, FIG. 4C shows a Hamming filtered 1D low frequency undersampling pattern at an acceleration factor of 3 (left shifted 23 columns from the k-space center), FIG. 4D is a reconstruction visual effect obtained by using SPIRiT, FIG. 4E is a reconstruction visual effect obtained by using GRAPPA, and FIG. 4F is a reconstruction visual effect obtained by using the method of the present disclosure.

DETAILED DESCRIPTION

(5) Specific embodiments of the present disclosure will be described below in conjunction with the drawings. In the specific embodiments described hereinafter, some specific features are described for a better understanding of the present disclosure, but, as is apparent to those skilled in the art, not all of these specific features are essential for implementing the present disclosure. The specific embodiments described hereinafter are merely exemplary embodiments of the present disclosure and are not intended to limit the present disclosure.

(6) FIG. 1 is a general concept of a one-dimensional partial Fourier parallel magnetic resonance imaging method based on a deep convolutional network according to the present disclosure, which mainly includes two parts: offline training of a deep convolutional network model and online reconstruction of a magnetic resonance image.

(7) Firstly, the samples in the training sample set and the corresponding labels in the sample label set are input into the created deep convolutional network model for training. The deep convolutional network is trained to learn a nonlinear mapping relationship between undersampled images and fully sampled images, that is, an optimal deep convolutional network model is established; the optimal deep convolutional network model is then used as a predictor to reconstruct magnetic resonance images online.

(8) Training the deep convolutional network offline includes constructing a deep convolutional network model and training it with samples. The construction of the deep convolutional network model and the offline training process are described hereinafter in detail in conjunction with the drawings. The construction and training of the deep convolutional network model of the present disclosure basically include the steps described below.

(9) (1) A training sample set and a corresponding sample label set are established.

(10) The sample set and the sample label set for training are created based on a large number of existing undersampled multi-channel magnetic resonance images.

(11) In a preferred embodiment, a training sample may be a coincident undersampled image extraction patch extracted from an existing offline multi-coil undersampled image, and a sample label may be a square root of a sum of squares of a full-sampled multi-channel image extraction patch corresponding to the undersampled image extraction patch.

(12) In a specific embodiment, a size of the image extraction patch extracted as a sample may be 33×33×12 and a size of the label may be 17×17, but the present disclosure is not limited thereto, and image extraction patches in other sizes and labels may also be used as samples.
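
As an illustration of the sample and label construction described above, the following Python sketch extracts overlapping 33×33×12 multi-coil patches and the corresponding root-sum-of-squares 17×17 labels. The function name, the stride, and the use of the patch center for the label are assumptions for illustration, not details taken from the disclosure.

```python
import numpy as np

def extract_patches(und_img, full_img, patch=33, label=17, stride=16):
    """Extract overlapping multi-coil patches and root-sum-of-squares labels.

    und_img:  undersampled multi-coil image, shape (H, W, 12)
    full_img: full-sampled multi-coil image, shape (H, W, 12)
    """
    off = (patch - label) // 2          # label taken as the 17x17 center of the patch
    samples, labels = [], []
    H, W, _ = und_img.shape
    for i in range(0, H - patch + 1, stride):
        for j in range(0, W - patch + 1, stride):
            x = und_img[i:i + patch, j:j + patch, :]
            y_patch = full_img[i + off:i + off + label, j + off:j + off + label, :]
            # label: square root of the sum of squares across coil channels
            y = np.sqrt(np.sum(np.abs(y_patch) ** 2, axis=-1))
            samples.append(x)
            labels.append(y)
    return np.stack(samples), np.stack(labels)
```

Applied to a 64×64 12-coil image with stride 16, this yields four 33×33×12 samples with four 17×17 labels.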

(13) The selection of a sample set is crucial to the construction of the optimal deep convolutional network. Therefore, in a preferred embodiment of the present disclosure, the field of view of the K space is provided with an asymmetric undersampled mask along a predetermined dimension, and Hamming filtering is performed on the undersampled mask to obtain a corresponding undersampled trajectory.

(14) In another preferred embodiment of the present disclosure, the above multi-coil undersampled image is obtained by undersampling a multi-coil full-sampled image in K space by using a Hamming filtered 1D low frequency undersampling pattern at an acceleration factor of 3 (left shifted 23 columns from the k-space center). FIG. 4B is a schematic diagram illustrating a one-dimensional uniform undersampling pattern used by GRAPPA and SPIRiT, and FIG. 4C is a schematic diagram illustrating a Hamming filtered 1D low frequency undersampling pattern at an acceleration factor of 3 (left shifted 23 columns from the k-space center).

(15) Compared with the conventional one-dimensional uniform undersampling pattern used by GRAPPA and SPIRiT, the Hamming filtered 1D low frequency undersampling pattern at an acceleration factor of 3 (left shifted 23 columns from the k-space center) has the advantages that an undersampled image sample set with a higher quality can be obtained. The higher the quality of the training sample set, the more favorable it is for training the deep convolutional network model.
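
A minimal sketch of how such a Hamming filtered 1D low frequency undersampling pattern could be built is shown below; the band placement and the way the Hamming window is applied are assumptions, since the disclosure specifies only the acceleration factor of 3 and the 23-column left shift.

```python
import numpy as np

def lowfreq_mask(width=256, accel=3, shift=23):
    """Sketch of a Hamming filtered 1D low frequency undersampling pattern.

    Keeps a contiguous band of width // accel phase-encode columns whose
    center sits `shift` columns to the left of the k-space center, weighted
    by a Hamming window over the sampled band (details are assumptions).
    """
    keep = width // accel                         # acceleration factor 3 keeps ~1/3
    center = width // 2 - shift                   # band center, left of k-space center
    start = center - keep // 2
    mask = np.zeros(width)
    mask[start:start + keep] = np.hamming(keep)   # Hamming filtering of the band
    return mask
```

For a 256-column k-space, the resulting mask peaks at column 105, i.e., 23 columns to the left of the center column 128.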

(16) In addition, the greater the number of samples, the better the accuracy of the trained deep convolutional network. Thus, in a specific embodiment of the present disclosure, a training set including a large number of samples is used, which contains approximately 650,000 labeled samples and is up to 34.8 GB in size.

(17) Data in a sample needs to be processed before the sample is input into the network model.

(18) Firstly, the undersampling K space is defined as:
f=PFu  (2)

(19) In the formula, P denotes a diagonal matrix of an undersampling pattern, F is a full-sampling Fourier encoding matrix normalized by a formula F.sup.HF=I, u denotes a vector matrix of an original image or an offline image, and Fu denotes full-sampling K-space data.

(20) The superscript H denotes the Hermitian transpose. A zero-padded magnetic resonance image z can be obtained by direct inverse transformation of the observed data, and the expression is as follows:
z=F.sup.HPFu  (3)

(21) According to the theory of linear algebra, the cyclic convolution of a signal u with a signal p can be expressed as F.sup.HPFu, where the diagonal entries of P are the non-zero Fourier transform values of p.
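
The zero-padded reconstruction z = F^H P F u of equation (3) can be sketched with orthonormal FFTs (so that F^H F = I); the 1D column-mask convention is an assumption for illustration.

```python
import numpy as np

def zero_filled(u, mask):
    """Zero-filled image z = F^H P F u for a 1D column sampling pattern.

    u:    image of shape (H, W), possibly complex
    mask: length-W pattern applied along the second axis of centered k-space
    """
    k = np.fft.fftshift(np.fft.fft2(u, norm='ortho'))             # F u (centered)
    k_under = k * mask[np.newaxis, :]                             # P F u
    return np.fft.ifft2(np.fft.ifftshift(k_under), norm='ortho')  # F^H P F u
```

With an all-ones mask, z equals u exactly, reflecting the normalization F^H F = I.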

(22) Further, in order to achieve the objective of the present disclosure, a global convolutional neural network needs to be learned from the undersampled Fourier data to reconstruct the magnetic resonance image. However, considering that the magnetic resonance image data obtained in advance offline may be true or corrupted, the error needs to be minimized by the following objective function:

(23) arg min_Θ { (1/(2T)) Σ_{t=1}^{T} ‖C(z_t; Θ) − u_t‖₂² }  (4)

(24) C is an end-to-end mapping relationship estimated with a hidden layer parameter Θ={(W.sub.1,b.sub.1), . . . (W.sub.l,b.sub.l), . . . (W.sub.L,b.sub.L)}, and T is a number of samples extracted from an image.

(25) To increase the robustness of the network, in an embodiment, more training samples may be obtained according to the following formula:

(26) arg min_Θ { (1/(2TN)) Σ_{t=1}^{T} Σ_{n=1}^{N} ‖C(x_{t,n}; Θ) − y_{t,n}‖₂² }.  (5)

(27) C is an end-to-end mapping relationship estimated with a hidden layer parameter Θ={(W.sub.1,b.sub.1), . . . (W.sub.l,b.sub.l), . . . (W.sub.L,b.sub.L)}, T is a number of samples extracted from an image, and N is a total number of images.

(28) In the following description, merely one pair (x, y) is used as a training sample for convenience of expression.

(29) (2) A deep convolutional network model is constructed.

(30) In an example of the deep convolutional network model of the present disclosure, a convolutional neural network model having an input layer, L convolutional layers, and an output layer is created as follows.

(31) The first convolutional layer of the convolutional neural network model is defined as:
C.sub.1=σ(W.sub.1*x+b.sub.1)  (6)

(32) W.sub.1 is a convolution operator with a size of c×M.sub.1×M.sub.1×n.sub.1, b.sub.1 is an element-related n.sub.1-dimensional offset, c is the number of image channels, M.sub.1 is a filter size and n.sub.1 is the number of filters.

(33) For a nonlinear response, a rectified linear unit such as a ReLU function or the like is used for more efficient calculation.

(34) Next, a nonlinear mapping is further performed, i.e., a mapping from n.sub.l-1 dimensions to n.sub.l dimensions, and image features and structures representing the reconstructed image are defined by the following formula:
C.sub.l=σ(W.sub.l*C.sub.l-1+b.sub.l)  (7)

(35) W.sub.l is a convolution operator with a size of n.sub.l-1×M.sub.l×M.sub.l×n.sub.l, b.sub.l is an element-related n.sub.l-dimensional offset, M.sub.l is a filter size and n.sub.l is the number of filters.

(36) Considering the convolution problem of the last layer, in order to reconstruct the final predicted image from the convolutional neural network, another layer of convolution needs to be constructed, and the final image is predicted by the last-layer activation function.
C.sub.L=σ(W.sub.L*C.sub.L-1+b.sub.L)  (8).

(37) W.sub.L is a convolution operator and is equal to n.sub.L-1×M.sub.L×M.sub.L×c, b.sub.L is an element-related n.sub.L-dimensional offset, c is the number of image channels, M.sub.L is a filter size and n.sub.L is the number of filters.

(38) Finally, a convolutional neural network with L convolutional layers is designed to learn the mapping relationship:

(39)
C_0 = x
C_1 = σ(W_1 * x + b_1)
C_l = σ(W_l * C_{l−1} + b_l), l ∈ {2, . . . , L−1}
C_L = σ(W_L * C_{L−1} + b_L)  (9)
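
The forward process of equation (9) can be sketched directly in Python; the 'valid' (non-extended-edge) cross-correlation and the use of ReLU for σ follow the embodiments described later, and the helper names are illustrative assumptions.

```python
import numpy as np

def relu(d):
    """Rectified linear unit used as the activation sigma."""
    return np.maximum(d, 0.0)

def conv_valid(x, w, b):
    """'Valid' multi-channel cross-correlation (no edge extension, stride 1).

    x: (H, W, c_in) input, w: (M, M, c_in, c_out) filters, b: (c_out,) offsets.
    """
    M = w.shape[0]
    H_out, W_out = x.shape[0] - M + 1, x.shape[1] - M + 1
    out = np.empty((H_out, W_out, w.shape[3]))
    for i in range(H_out):
        for j in range(W_out):
            # sum over the M x M window and all input channels for every filter
            out[i, j, :] = np.tensordot(x[i:i + M, j:j + M, :], w,
                                        axes=([0, 1, 2], [0, 1, 2])) + b
    return out

def forward(x, params):
    """Forward process of equation (9): C_0 = x, C_l = relu(W_l * C_{l-1} + b_l)."""
    C = x
    for w, b in params:
        C = relu(conv_valid(C, w, b))
    return C
```

For instance, a 9×9×2 input passed through a 3×3×2×4 layer and then a 3×3×4×1 layer yields a 5×5×1 output.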

(40) Next, a detailed description will be given with reference to the drawings. FIG. 3A and FIG. 3B show an example of a deep convolutional network constructed by the present disclosure, and show the forward conduction process of a sample over the deep convolutional network and the training framework built in a deep learning architecture.

(41) The deep convolutional network model created in FIGS. 3A and 3B includes an input layer (data), a first convolutional layer (conv1), a second convolutional layer (conv2), a third convolutional layer (conv3) and an output layer including a loss function. The input layer, the first convolutional layer, the second convolutional layer, the third convolutional layer, and the output layer are sequentially connected.

(42) The structure of the deep convolutional network of the present disclosure is not limited to the examples in FIGS. 3A and 3B. For example, the deep convolutional network model of the present disclosure may include more than three convolutional layers, or may include merely two convolutional layers.

(43) Furthermore, the model further includes a rectified linear unit, for example, a first activation layer relu1 connected to the first convolutional layer and a second activation layer relu2 connected to the second convolutional layer to linearly correct the output of each convolutional layer.

(44) In the embodiment of FIG. 3B, both the first and second activation layers use the ReLU activation function, while the loss layer uses the EuclideanLoss function. ReLU is one type of activation function, and sigmoid, ReLU and other nonlinear activation functions are commonly used in the convolutional neural network.

(45) In this embodiment of the present disclosure, the reason why ReLU is used as the activation function is that saturation of the sigmoid function (i.e., gradient descent is slow in a gentle region where the sigmoid function approaches 1) can be avoided by using the ReLU function, training speed is increased to accelerate image reconstruction, gradient diffusion can be avoided, and accuracy is higher.
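
The saturation argument can be checked numerically: for a large pre-activation, the sigmoid derivative nearly vanishes while the ReLU derivative stays at 1. This is a generic illustration, not code from the disclosure.

```python
import numpy as np

def sigmoid_grad(x):
    """Derivative of the sigmoid: s * (1 - s); vanishes for large |x| (saturation)."""
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)

def relu_grad(x):
    """Derivative of ReLU: 1 on the active side, 0 elsewhere."""
    return np.where(np.asarray(x) > 0, 1.0, 0.0)

print(sigmoid_grad(10.0))  # about 4.5e-05: the gradient nearly vanishes
print(relu_grad(10.0))     # 1.0: the gradient passes through unchanged
```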

(46) (3) A deep convolutional network model is trained.

(47) Next, the training samples are input into the created deep convolutional network model to start the training process.

(48) Firstly, a training sample in the training sample set is input to the network model for the forward process, and the output result of the forward process is compared with data in a sample label.

(49) The forward process and training process of the sample in the deep convolutional network model shown in FIGS. 3A and 3B are further explained below, taking the sample size of 33×33×12 and the label size of 17×17 as examples.

(50) In FIG. 3A, D denotes the number of channels of the multi-channel coil, the extracted image extraction patch of W0×H0×D is input as a sample to the input layer, and the corresponding sample label is input to the loss layer as shown in FIG. 3B.

(51) In the first convolutional layer, convolution extraction is performed on the input image sample through K1 convolution kernels, each of size a. As shown in FIG. 3A, after the input image sample passes through the first convolutional layer, an image feature of W1×H1×K1 is obtained by convolution extraction on the input sample image.

(52) In the embodiment of FIG. 3B, the first convolutional layer conv1 uses a convolution kernel with a weight size of 9×9×12×64 and an offset size of 64×1 and selects a stride of 1 to perform processing in the manner of a non-extended edge (an extended edge value of 0). Here, the obtained image feature can also be linearly corrected through the first activation layer relu1, and the corrected image feature is sent to the next processing layer.

(53) Next, the obtained W1×H1×k1 image feature is subjected to a second convolution extraction at the second convolutional layer.

(54) As shown in FIG. 3A, an image feature of W2×H2×k2 is obtained after the second convolutional layer is passed through. In the embodiment of FIG. 3B, the second convolutional layer conv2 uses a convolution kernel with a weight size of 5×5×64×32, and an offset size of 32×1 and selects a stride of 1 for the second convolution extraction in the manner of a non-extended edge (i.e., an extended edge value of 0). Here, the obtained image feature can be linearly corrected through the second activation layer relu2 and the corrected image feature is sent to the next processing layer.

(55) Next, the obtained image feature of size W2×H2×k2 is sent into the third convolutional layer for similar convolution extraction.

(56) As shown in FIG. 3A, after the third convolutional layer is passed through, a single channel output image of W3×H3 is obtained. In the embodiment of FIG. 3B, the third convolutional layer conv3 uses a convolution kernel with a weight size of 5×5×32×1 and an offset size of 1, and selects a stride of 1 for the third convolution extraction in the manner of a non-extended edge (i.e., an extended edge value of 0), thereby obtaining the output result of forward process.
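
The spatial sizes in this walkthrough follow from stride-1 convolution with no edge extension (output size = input size − kernel size + 1), which is why a 33×33 input patch yields an output matching the 17×17 label:

```python
# 'valid' stride-1 convolution: output size = input size - kernel size + 1
size = 33                 # spatial size of the 33x33x12 input patch
for k in (9, 5, 5):       # conv1 9x9, conv2 5x5, conv3 5x5, no edge extension
    size = size - k + 1
print(size)               # 17: matches the 17x17 label size
```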

(57) Next, after the end of the forward process, the output result will be sent to the output layer for comparison with the expected value.

(58) As shown in FIG. 3B, the output image obtained from the third convolutional layer is sent to the loss function (also referred to as an error function) of the output layer so that the output value is compared with the data in the label.

(59) In the deep convolutional network model, the loss layer (loss function) is used for estimating the degree of inconsistency (deviation or error) between a prediction result of an output sample and an ideal result (input label information) to which the sample corresponds. Generally, the smaller the loss function value, the more consistent the predicted result with the ideal result, and the better the robustness of the model. In fact, the entire process of training samples is the process of finding parameters of each layer in the model that minimizes the loss function, and the parameters include the weight and offset parameters of each layer of the network.
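
For reference, a Euclidean loss of the kind used in the output layer can be sketched as below; the division by twice the batch size follows the common Caffe EuclideanLoss convention and is an assumption here.

```python
import numpy as np

def euclidean_loss(pred, label):
    """Sum of squared differences over the batch, divided by twice the batch size."""
    n = pred.shape[0]
    return float(np.sum((pred - label) ** 2) / (2.0 * n))
```

Applied to identical arrays it returns 0, and the value shrinks as the prediction approaches the label, matching the "smaller loss, more consistent result" behavior described above.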

(60) In an embodiment of the present disclosure, error backward propagation is performed by using a gradient descent method based on the comparison result (deviation or error). Generally, in the calculation process of the gradient descent method, the gradients of the error function (loss function) with respect to all weights and offset values are calculated in a manner of error backward propagation. Specifically, the method starts from any point, moves a distance in the opposite direction of the gradient at that point, then moves a distance in the opposite direction of the gradient at the new position, and iterates in this way toward a minimum point of the function.

(61) For example, for a pair of training samples (x, y), the output values of the forward process are calculated by equations (6) to (8). In order to update the parameters of each layer of the network, the corresponding gradient is calculated by error backward propagation.

(62) A single pair of targets (x, y) is considered firstly. Equation (4) can be expressed by the following formula:

(63) J(Θ) = arg min_Θ { (1/2) ‖C(x; Θ) − y‖₂² }.  (10)

(64) In the formula, C(x; Θ) is the network output for the input x, where D_l = W_l * C_{l−1} + b_l and C_l = σ(D_l).
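The forward process and the loss of equation (10) can be sketched as follows. This is a minimal one-dimensional, single-channel illustration; the choice of ReLU for σ and the `mode="same"` padding are assumptions of this sketch, not the patent's configuration:

```python
import numpy as np

def sigma(d):
    # ReLU as the nonlinear mapping sigma (an assumed choice; the patent
    # does not fix a particular activation in this passage).
    return np.maximum(d, 0.0)

def forward(x, weights, biases):
    # C_0 = x; D_l = W_l cross-correlated with C_{l-1}, plus b_l; C_l = sigma(D_l).
    c = x
    for w, b in zip(weights, biases):
        d = np.correlate(c, w, mode="same") + b
        c = sigma(d)
    return c

def loss(x, y, weights, biases):
    # J(Theta) = 0.5 * ||C(x; Theta) - y||_2^2, as in equation (10).
    r = forward(x, weights, biases) - y
    return 0.5 * float(r @ r)
```

With a length-1 identity kernel and zero offset, the network reduces to ReLU applied to the input, which makes the sketch easy to check by hand.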

(65) δ^l denotes the gradient with respect to the offset term b_l in the backward propagation; for the last convolutional layer L, it is calculated by the following formula:

(66) δ^L = ∂J/∂b_L = (∂J/∂D_L)(∂D_L/∂b_L) = C_L − y. (11)

(67) Since

(68) ∂D_l/∂b_l = 1 and C_l = σ(D_l),

the gradient δ^l of the lth-layer nonlinear mapping layer can be updated by the following formula:

(69) δ^l = ∂J/∂b_l = (∂J/∂D_{l+1})(∂D_{l+1}/∂C_l)(∂C_l/∂D_l) = (δ^{l+1} * W_{l+1}) ∘ (∂σ(D_l)/∂D_l). (12)

(70) In the formula, * denotes the cross-correlation operation, and ∘ denotes that array elements are multiplied elementwise.

(71) The gradient of each layer is then obtained as:

(72) ∂J/∂W_l = (∂J/∂D_l)(∂D_l/∂W_l) = δ^l * C_{l−1}; ∂J/∂b_l = (∂J/∂D_l)(∂D_l/∂b_l) = δ^l. (13)

(73) The parameters of each layer are then updated along the negative direction of the gradient ∂J(Θ)/∂Θ.

(74) During training, it is customary to calculate the stochastic gradient, i.e., the gradient evaluated on a randomly selected subset of the training samples rather than on the full set.
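Equations (11) to (13) can be sketched as follows. This is a deliberately simplified illustration: the kernels are length-1 scalars, so each cross-correlation * reduces to an elementwise product, and σ is taken as the identity so that ∂σ(D_l)/∂D_l = 1. Both simplifications are assumptions of this sketch, not the patent's network:

```python
import numpy as np

def backprop_scalar(x, y, W, b):
    # Forward pass: C_0 = x; with scalar kernels and identity sigma,
    # C_l = D_l = W_l * C_{l-1} + b_l.
    C = [np.asarray(x, dtype=float)]
    for Wl, bl in zip(W, b):
        C.append(Wl * C[-1] + bl)
    L = len(W)
    # Equation (11): delta^L = C_L - y.
    delta = C[L] - np.asarray(y, dtype=float)
    dW, db = [0.0] * L, [0.0] * L
    for l in range(L - 1, -1, -1):
        # Equation (13): dJ/dW_l = delta^l * C_{l-1}, dJ/db_l = delta^l
        # (summed over array elements for scalar parameters).
        dW[l] = float(np.sum(delta * C[l]))
        db[l] = float(np.sum(delta))
        # Equation (12): delta^l = (delta^{l+1} * W_{l+1}) ∘ sigma'(D_l),
        # with sigma'(D_l) = 1 under the identity assumption.
        delta = delta * W[l]
    return dW, db
```

For a two-layer chain the returned gradients can be verified against the closed-form derivative of 0.5‖W₂W₁x − y‖², since the whole network collapses to a single scalar gain.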

(75) (4) An optimal deep convolutional network model is created.

(76) Based on the calculated gradient of each layer, the weight and offset parameters of each layer of the network are determined; that is, the calculated gradients are used to update the parameters W_l and b_l by the gradient descent method, thus acquiring the nonlinear mapping relationship from the undersampled image to the fully-sampled image. In other words, the optimal deep convolutional network is created by using the weight and offset parameters obtained from the training in step (3), and can serve as a predictor.
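The update step can be sketched as follows; the learning rate `lr` is an assumed hyperparameter, as the patent does not specify its value:

```python
def update_parameters(W, b, dW, db, lr=0.01):
    # One gradient-descent step on the weight and offset parameters:
    #   W_l <- W_l - lr * dJ/dW_l,   b_l <- b_l - lr * dJ/db_l
    W_new = [w - lr * gw for w, gw in zip(W, dW)]
    b_new = [v - lr * gv for v, gv in zip(b, db)]
    return W_new, b_new
```

Repeating backward propagation and this update over the training set drives the loss down until the parameters that best fit the label set are obtained.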

(77) For example, in the embodiment of FIG. 3B, the output is compared with the input labels (i.e., vectors related to the corresponding fully-sampled sample image), and a gradient descent method is used to minimize the loss function. This determines the nonlinear mapping relationship between the input undersampled image sample and the corresponding fully-sampled image, i.e., the weight and offset parameters of each layer of the network that minimize the loss function, and the obtained weight and offset parameters are used to create an optimal deep convolutional network model.

(78) (5) A magnetic resonance image is reconstructed online by using the optimal deep convolutional network model.

(79) A magnetic resonance image can be reconstructed online by using the optimal deep convolutional network model created in step (4): an undersampled multi-coil image sampled online is input into the optimal deep convolutional network for forward conduction, and a full-sampled image is output.

(80) As in the embodiment shown in FIG. 3B, in the forward-process reconstruction of the magnetic resonance image, the image input into the deep convolutional network model is no longer a patch extracted from a segmented image, but the entire multi-coil undersampled image.
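The full image can replace the training patches because a convolutional layer is a sliding filter: the same trained kernel applies to an input of any size. A minimal one-dimensional sketch follows; the kernel values, image sizes, and `mode="same"` padding are assumptions of this illustration:

```python
import numpy as np

def filter_image(img, w, b):
    # Forward conduction of one convolutional layer: cross-correlate the
    # trained kernel w over the whole input and add the offset b.
    return np.correlate(img, w, mode="same") + b

patch = np.arange(8.0)      # training-time patch (hypothetical size)
full = np.arange(256.0)     # full undersampled image row (hypothetical size)
w, b = np.array([0.25, 0.5, 0.25]), 0.0

out_patch = filter_image(patch, w, b)
out_full = filter_image(full, w, b)  # same kernel, larger input, no retraining
```

Away from the boundaries the two outputs agree wherever the inputs agree, which is exactly why a network trained on patches can be applied to the entire multi-coil image at reconstruction time.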

(81) FIGS. 4D to 4F show a comparison of image reconstruction results respectively obtained by using SPIRiT, GRAPPA and the method of the present disclosure.

(82) The results show that the currently popular methods GRAPPA and SPIRiT reconstruct the magnetic resonance image in the K space and introduce noise into the reconstructed image. By contrast, the one-dimensional partial Fourier parallel magnetic resonance imaging method based on a deep convolutional network provided in the present disclosure performs undersampling in the K space but uses the trained weights and offsets in the spatial domain to reconstruct the magnetic resonance image through forward conduction, which in effect filters the whole multi-channel undersampled image with a filter whose coefficients are the trained weights. Therefore, compared with GRAPPA and SPIRiT, the present disclosure can better remove noise from the reconstructed image and reconstruct a magnetic resonance image with a better visual effect.

(83) In addition, when an image is reconstructed, the weight and offset parameters obtained by training the deep network are used for forward conduction, and forward conduction with parallel computation is itself very fast, so high-speed reconstruction of a magnetic resonance image is another advantage of the present disclosure.

(84) Although the present disclosure is described through the preferred embodiments, modifications, permutations and various equivalent substitutions are possible within the scope of the present disclosure. It is to be noted that there are many alternative ways of implementing the method and system of the present disclosure. Therefore, it is intended that the appended claims shall be construed as including all the modifications, permutations and various equivalent substitutions within the spirit and scope of the present disclosure.