Three-dimensional automatic location system for epileptogenic focus based on deep learning
11645748 · 2023-05-09
Assignee
Inventors
- Cheng Zhuo (Hangzhou, CN)
- Mei Tian (Hangzhou, CN)
- Hong Zhang (Hangzhou, CN)
- Qinming Zhang (Hangzhou, CN)
- Teng ZHANG (Hangzhou, CN)
- Yi Liao (Hangzhou, CN)
- Xiawan Wang (Hangzhou, CN)
- Jianhua Feng (Hangzhou, CN)
CPC classification
G06V10/255
PHYSICS
G06V10/454
PHYSICS
A61B6/501
HUMAN NECESSITIES
A61B6/5217
HUMAN NECESSITIES
A61B6/5211
HUMAN NECESSITIES
G06N3/082
PHYSICS
International classification
Abstract
The present disclosure provides a three-dimensional automatic location system for an epileptogenic focus based on deep learning. The system includes: a PET image acquisition and labelling module; a registration module mapping the PET image to a standard symmetric brain template; a PET image preprocessing module generating mirror image pairs of left and right brain image blocks; a SiameseNet training module containing two deep residual convolutional neural networks that share weight parameters, with an output layer connecting a multilayer perceptron and a softmax layer, trained on a set of epileptogenic focus images and normal images to obtain a network model; and a classification and epileptogenic focus location module, which uses the trained network model to generate a probability heatmap for a newly input PET image, a classifier determining whether the image is a normal or an epileptogenic focus sample, and then predicting the position of the epileptogenic focus region.
Claims
1. A three-dimensional automatic location system for an epileptogenic focus based on deep learning, wherein the system comprises the following modules: (1) a PET image acquisition and labelling module, for image acquisition and epileptogenic focus region labelling: 1.1) acquiring an image: acquiring a 3D PET brain image of a subject on a PET scanner, the subject maintaining the same posture throughout the acquisition process; 1.2) labelling samples: dividing the PET images into a normal sample set and an epileptogenic focus sample set, and manually labelling an epileptogenic focus region for the epileptogenic focus sample set, where the epileptogenic focus region is labelled as 1, and remaining regions are labelled as 0; (2) a PET image registration module: using cross-correlation as the similarity measure between an original image and a registration image, using a symmetric diffeomorphic (SyN) algorithm to register all of the PET images and the labelled images thereof into the same symmetric standard space, to achieve the registration from the acquired PET images and the labelled images to standard symmetric brain templates; (3) adopting a deep learning system based on symmetry, comprising the following modules: 3.1) a data preprocessing module: 3.1.1) data enhancement: performing radial distortion and image intensity enhancement on the registered image and the label to obtain a newly generated image and label; 3.1.2) image block division: performing image block division on the enhanced image data, using a three-dimensional sliding window to divide the left and right hemispheres L and R of the PET image into mirror image pairs of image blocks, and dividing the mirror image pair data into a training set and a test set according to proportions; the training set and the test set both contain two types of PET image block data: epileptogenic focus and normal; 3.2) a network building module: building a deep twin network SiameseNet, wherein this network contains two identical convolutional neural networks, a fully connected layer and an output layer; the SiameseNet inputs a pair of mirror image blocks to the two convolutional neural networks, which share a weight parameter θ in each layer, to obtain two high-dimensional image features L_feature and R_feature; an absolute difference of the two high-dimensional image features is calculated, d = |L_feature − R_feature|, and transmitted to a multi-layer perceptron of the fully connected layer for probability regression, and the output layer uses a softmax regression function to output a classification probability, that is, a probability that the image block carries the epileptogenic focus or is normal; 3.3) a test image detection module: image classification: using the trained network to calculate a probability heatmap of each PET image of the test set, and using a logistic regression algorithm to classify the probability heatmap corresponding to each PET image, to obtain a classification result, that is, a normal PET image or an epileptogenic focus PET image; locating the epileptogenic focus: performing bilinear interpolation on the probability heatmap identified as an epileptogenic focus PET image, resizing the probability heatmap to the size of the original image, and predicting a region larger than a probability threshold as the epileptogenic focus region.
2. The three-dimensional automatic location system for the epileptogenic focus based on the deep learning according to claim 1, wherein 1.1) in the process of acquiring the image, format conversion is performed on the acquired PET brain image, that is, the originally acquired image in a DICOM format is converted into an image in a NIFTI format.
3. The three-dimensional automatic location system for the epileptogenic focus based on the deep learning according to claim 1, wherein (2) in the image registration module, a Gaussian smoothing algorithm is used to reduce registration errors, the Gaussian smoothing process selects the full width at half maximum (FWHM) of the Gaussian function to be 5 to 15 mm, and Z-score normalization is performed on the smoothed image.
4. The three-dimensional automatic location system for the epileptogenic focus based on the deep learning according to claim 1, wherein 3.1.1) in the process of the data enhancement, the radial distortion is specifically as follows: each image pixel point is displaced along the radial direction from a distortion center, and a calculation process of the radial distortion is:
P_u = P_d + (P_d − P_c)(k_1·r^2 + k_2·r^4 + k_3·r^6 + …), where P_u is a pixel point of the original image, P_d is a pixel point of the distorted image, P_c is the distortion center, k_i (i = 1, 2, 3, …) are distortion coefficients of the radial distortion, and r is the distance between P_d and P_c in the vector space.
5. The three-dimensional automatic location system for the epileptogenic focus based on the deep learning according to claim 1, wherein 3.1.1) in the process of the data enhancement, the image intensity enhancement is specifically as follows: the image intensity enhancement comprises filter processing, image noise-adding processing, and multiplicative and additive transformation of image gray values in the space, and a formula for the image intensity enhancement is:
P_a = g_mult × P_u + g_add, where P_a is an image pixel point after the image intensity enhancement, g_mult is an image pixel point of a multiplicative Gaussian bias field, and g_add is an image pixel point of an additive Gaussian bias field.
6. The three-dimensional automatic location system for the epileptogenic focus based on the deep learning according to claim 1, wherein 3.1.2) in the division of the image block, resolution of each PET image data in the image data set is X×Y×Z pixels, a size of the sliding scanning window block is set to m×m×m, and a sliding step length is set to t, then the size of each image block is m×m×m, and the left and right hemispheres of a PET image can be divided into ((X/2 − m)/t + 1) × ((Y − m)/t + 1) × ((Z − m)/t + 1) mirror image pairs of image blocks.
7. The three-dimensional automatic location system for the epileptogenic focus based on the deep learning according to claim 1, wherein 3.2) in the network building module, each of the convolutional neural networks of the SiameseNet has a structure of ten layers, in which the first layer comprises one convolution layer, one batch normalization operation unit, one ReLU function, and one pool layer that are connected in sequence; each of the second to the ninth layers is a ResBlock, and each ResBlock contains two convolution layers, two normalization operations and one ReLU function that are connected in sequence; the tenth layer is one convolution layer, and the outputs of the tenth layers of the two convolutional neural networks are connected to one fully connected layer for nonlinear transformation, dimensions of the fully connected layer vectors being 2048, 1024, 512 and 2 in sequence; finally, one output layer is connected.
8. The three-dimensional automatic location system for the epileptogenic focus based on the deep learning according to claim 7, wherein 3.2) in the network building module, in the model training, a cross entropy function is used as a loss function of the network, and a calculation formula of the cross entropy Loss(a, b) is:
Loss(a, b) = −Σ_{i=1}^{n} a_i ln b_i, where n represents the number of samples, a is the correct probability distribution, and b is the probability distribution predicted by the network model; standard stochastic gradient descent is used to update the weight parameter θ, and a formula thereof is:
θ^{k+1} = θ^k − η·∂Loss/∂θ^k, where η is a learning rate and θ^k is the k-th weight parameter.
9. The three-dimensional automatic location system for the epileptogenic focus based on the deep learning according to claim 1, wherein 3.2) in the network building module, in the SiameseNet network model, a calculation process of the batch normalization operation is: x̂ = (input_norm − μ)/√(σ + ϵ), output_norm = γ·x̂ + β, where input_norm is each batch of input data, x̂ is the normalized data, output_norm is the batch data output by the batch normalization operation, μ and σ are respectively the mean and variance of each batch of data, γ and β are respectively scaling and translation variables, and ϵ is a small constant added to increase training stability.
Description
BRIEF DESCRIPTION OF DRAWINGS
DESCRIPTION OF EMBODIMENTS
(6) The present disclosure will be further described in detail below with reference to the drawings and specific embodiments.
(7) As shown in
(8) (1) a PET image acquisition and labelling module, including image acquisition and epileptogenic focus region labelling:
(9) 1.1) acquiring an image: using a 3D PET/CT scanner to acquire a PET image of the brain, the subject maintaining the same posture during the acquisition process. After the image is acquired, image format conversion is performed, that is, the originally acquired image sequence in the DICOM format is converted into an easy-to-process image in the NIFTI format.
(10) 1.2) labelling samples: dividing the PET images into a normal sample set and an epileptogenic focus sample set, and manually labelling the epileptogenic focus region for the epileptogenic focus sample set, where the epileptogenic focus region is labelled as 1, and the remaining regions are labelled as 0.
(11) (2) a PET image registration module: using cross-correlation as the similarity measure between images, using a symmetric diffeomorphic (SyN) algorithm to deform all PET images and the labelled images thereof into the same symmetric standard space, in order to achieve the registration of the acquired PET images, the labelled images and standard symmetric brain templates. For deforming an original image I to an image J, the following objective function is minimized:
(12) E(I, J) = ∫₀¹ ‖L v_t‖² dt + λ·C(I∘φ, J)
(13) The first term is a smoothing term, in which L is a smoothing operator and v is a velocity field. λ in the second term controls accuracy of matching. C(I,J) is a similarity measure, where C(I,J) can be expressed as:
(14) C(I, J) = ⟨I − μ_I, J − μ_J⟩² / (‖I − μ_I‖² · ‖J − μ_J‖²), where μ_I and μ_J are the local means of I and J.
(15) After registration, a Gaussian smoothing algorithm is used to reduce registration errors caused by individual differences. The Gaussian smoothing process selects the full width at half maximum (FWHM) of the Gaussian function to be 5 to 15 mm. Z-score normalization is performed on the smoothed image:
(16) J′ = (J − μ)/σ
(17) where J′ is the normalized image, μ is the mean of the registered image J, and σ is the standard deviation of the image.
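The z-score step above can be sketched in NumPy (the function name and the epsilon guard are illustrative, not the patent's code):

```python
import numpy as np

def z_score_normalize(img, eps=1e-8):
    """Z-score normalize a registered image: subtract the mean of all
    voxels and divide by their standard deviation."""
    mu = img.mean()
    sigma = img.std()
    return (img - mu) / (sigma + eps)

# Example: a toy 3-D "PET volume"
vol = np.arange(8, dtype=float).reshape(2, 2, 2)
norm = z_score_normalize(vol)
```

After normalization the volume has zero mean and unit standard deviation, which puts all registered images on a comparable intensity scale.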
(18) (3) adopting a deep learning system based on symmetry, including following modules:
(19) 3.1) a data preprocessing module:
(20) 3.1.1) data enhancement: performing radial distortion and image intensity enhancement on the registered image and the label to obtain a newly generated image and label. In the radial distortion, each image pixel point is displaced along the radial direction from a distortion center, and the calculation process of the radial distortion is:
P_u = P_d + (P_d − P_c)(k_1·r^2 + k_2·r^4 + k_3·r^6 + …)
(21) where P_u is a pixel point of the original image, P_d is a pixel point of the distorted image, P_c is the distortion center, k_i (i = 1, 2, 3, …) are distortion coefficients of the radial distortion, and r is the distance between P_d and P_c in the vector space.
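A minimal NumPy sketch of this distortion applied to pixel coordinates (the function name and the coefficient values are illustrative assumptions; truncating the series at k_3 follows the formula above):

```python
import numpy as np

def radial_distort(points, center, k=(1e-3, 0.0, 0.0)):
    """P_u = P_d + (P_d - P_c) * (k1*r^2 + k2*r^4 + k3*r^6):
    each point is displaced along the radial direction from the
    distortion center, proportionally to its distance r."""
    points = np.asarray(points, dtype=float)
    center = np.asarray(center, dtype=float)
    r = np.linalg.norm(points - center, axis=-1, keepdims=True)
    factor = k[0] * r**2 + k[1] * r**4 + k[2] * r**6
    return points + (points - center) * factor

pts = np.array([[12.0, 10.0], [10.0, 10.0]])
out = radial_distort(pts, center=(10.0, 10.0))
```

The distortion center itself is a fixed point (r = 0), while points away from the center are pushed outward for positive coefficients.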
(22) The image intensity enhancement includes filter processing, image noise-adding processing, and multiplicative and additive transformation of image gray values in the space, and a formula for the image intensity enhancement is:
P_a = g_mult × P_u + g_add
(23) where P_a is an image pixel point after the image intensity enhancement, g_mult is an image pixel point of a multiplicative Gaussian bias field, and g_add is an image pixel point of an additive Gaussian bias field.
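The multiplicative/additive transform can be sketched as follows (the field standard deviations are illustrative assumptions; a real bias field would also be spatially smoothed, which is omitted here for brevity):

```python
import numpy as np

def intensity_augment(img, sigma_mult=0.1, sigma_add=0.05, seed=0):
    """P_a = g_mult * P_u + g_add with random Gaussian bias fields.
    g_mult fluctuates around 1, g_add around 0."""
    rng = np.random.default_rng(seed)
    g_mult = 1.0 + rng.normal(0.0, sigma_mult, img.shape)
    g_add = rng.normal(0.0, sigma_add, img.shape)
    return g_mult * img + g_add

img = np.ones((4, 4, 4))
aug = intensity_augment(img)
```

Each call with a different seed yields a new intensity variant of the same volume, enlarging the training set.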
(24) 3.1.2) image block division: performing image block division on the enhanced image data, using a three-dimensional sliding window to divide the left and right hemispheres L and R of the PET image into mirror image pairs of image blocks, and dividing the mirror image pair data into a training set, a verification set and a test set according to proportions; the training set, the verification set and the test set all contain two types of PET image block data: epileptogenic focus and normal. In the image data set, the resolution of each PET image is X×Y×Z pixels, the size of the sliding scanning window is set to m×m×m, and the sliding step length is set to t, so each image block has the size m×m×m. The left and right hemispheres of a PET image can then be divided into
(25) ((X/2 − m)/t + 1) × ((Y − m)/t + 1) × ((Z − m)/t + 1)
pairs of image blocks.
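A minimal NumPy sketch of the mirror-pair extraction (the hemisphere split along the first axis and the flip about the mid-sagittal plane are illustrative assumptions about the geometry):

```python
import numpy as np

def mirror_block_pairs(vol, m=2, t=2):
    """Split a volume into mirror pairs of m*m*m blocks from the left
    and right hemispheres, using a sliding window with step t. The
    right hemisphere is flipped so each pair is spatially mirrored."""
    X, Y, Z = vol.shape
    left = vol[: X // 2]
    right = vol[X // 2 :][::-1]      # mirror about the mid-sagittal plane
    pairs = []
    for x in range(0, X // 2 - m + 1, t):
        for y in range(0, Y - m + 1, t):
            for z in range(0, Z - m + 1, t):
                pairs.append((left[x:x + m, y:y + m, z:z + m],
                              right[x:x + m, y:y + m, z:z + m]))
    return pairs

vol = np.arange(4 * 4 * 4, dtype=float).reshape(4, 4, 4)
pairs = mirror_block_pairs(vol, m=2, t=2)
# count = ((X/2-m)/t+1) * ((Y-m)/t+1) * ((Z-m)/t+1) = 1 * 2 * 2 = 4
```

Exploiting left-right symmetry this way is what lets the twin network compare homologous brain regions.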
(26) 3.2) a network building module: building a deep twin network SiameseNet. This network contains two identical convolutional neural networks, a fully connected layer and an output layer. Each of the convolutional neural networks has a structure of ten layers, in which the first layer includes one convolution layer (conv), one batch normalization operation unit (batch normalization), one ReLU function, and one pool layer (pool) that are connected in sequence; each of the second to the ninth layers is a ResBlock, and each ResBlock contains two convolution layers, two normalization operations and one ReLU function that are connected in sequence; the tenth layer is one convolution layer, and the outputs of the tenth layers of the two convolutional neural networks are connected to one fully connected layer (fc) for nonlinear transformation. Finally, one output layer is connected. The parameter of one random dropout can be set to 0.5.
(27) In the SiameseNet network model, a calculation process of the convolution layer operation is:
(28) output_conv = (input_conv + 2·pad − kernel)/stride + 1
(29) where output_conv is the three-dimensional size of the output image data of each convolution layer (length, width and depth of the image), input_conv is the three-dimensional size of the input image, pad means filling pixels around the image, kernel is the three-dimensional size of the convolution kernel, and stride is the step length of the convolution kernel.
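The size arithmetic can be checked with a small helper (the function name and the example sizes are illustrative):

```python
def conv_output_size(input_size, kernel, pad, stride):
    """output_conv = (input_conv + 2*pad - kernel) // stride + 1,
    applied independently to each spatial dimension."""
    return tuple((i + 2 * pad - kernel) // stride + 1 for i in input_size)

# e.g. a 32x32x32 block through a 3x3x3 convolution, pad 1, stride 1
size = conv_output_size((32, 32, 32), kernel=3, pad=1, stride=1)
```

With pad 1 and stride 1 a 3×3×3 kernel preserves the spatial size, which is the usual choice inside residual blocks.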
(30) For each of the convolution layers, the batch normalization operation is used to accelerate the convergence speed and improve the stability of the network, and a formula for the batch normalization operation is:
(31) x̂ = (input_norm − μ)/√(σ + ϵ), output_norm = γ·x̂ + β
(32) where input_norm is each batch of input data, x̂ is the normalized data, output_norm is the batch data output by the batch normalization operation, μ and σ are respectively the mean and variance of each batch of data, γ and β are respectively scaling and translation variables, and ϵ is a small constant added to increase training stability;
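The batch normalization formula can be sketched in NumPy (a minimal version over a toy batch; per-channel statistics and running averages used at inference are omitted):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """x_hat = (x - mu) / sqrt(var + eps); output = gamma * x_hat + beta,
    with mean and variance computed over the batch axis."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

batch = np.array([[1.0, 2.0],
                  [3.0, 4.0]])
out = batch_norm(batch)
```

After the operation each feature has approximately zero mean across the batch; γ and β let the network undo the normalization where useful.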
(33) the activation function connected to each of the convolution layers is the ReLU function, which can shorten the training period, and the calculation method of the ReLU function is:
output_relu = max(input_relu, 0)
(34) where input_relu is the input data of the ReLU function, and output_relu is the output data of the ReLU function.
(35) The two convolutional neural networks of the SiameseNet share the same weight parameter θ in each layer, and the inputs of the network are the mirror image pairs of a pair of image blocks. The absolute difference of the two high-dimensional features, d = |L_feature − R_feature|, is transmitted to the multi-layer perceptron of the fully connected layer for probability regression, and the output layer uses the softmax regression function to compute the classification probabilities:
(36) p_j = e^{d_j} / Σ_{i=1}^{g} e^{d_i}
(37) where d_j represents the output of the j-th category, g represents the number of categories, and j = 1, 2, …, g.
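The feature-difference and softmax steps can be illustrated with toy vectors (the feature values and the one-line stand-in for the MLP head are illustrative assumptions, not the trained network):

```python
import numpy as np

def softmax(d):
    """p_j = exp(d_j) / sum_i exp(d_i), stabilized by subtracting the max."""
    e = np.exp(d - d.max())
    return e / e.sum()

# Absolute difference between the two branch outputs, then a 2-way softmax
L_feature = np.array([0.2, 0.9, 0.4])
R_feature = np.array([0.1, 0.1, 0.5])
d = np.abs(L_feature - R_feature)        # d = |L_feature - R_feature|
logits = np.array([d.sum(), -d.sum()])   # stand-in for the MLP head
probs = softmax(logits)                  # (p_focus, p_normal)
```

A large left-right asymmetry drives the first logit up, so the "epileptogenic focus" probability dominates; symmetric blocks give near-equal probabilities.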
(38) In the model training, a cross entropy function is used as a loss function of the network. A calculation method of the cross entropy Loss(a, b) is:
(39) Loss(a, b) = −Σ_{i=1}^{n} a_i ln b_i
(40) where n represents the number of samples, a is correct probability distribution, and b is probability distribution predicted by the network model. Standard stochastic gradient descent (SGD) is used to update the weight parameter θ, and a formula thereof is:
(41) θ^{k+1} = θ^k − η·∂Loss/∂θ^k
(42) where η is a learning rate and θ^k is the k-th weight parameter.
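The loss and the update rule can be sketched with scalar toy values (the one-hot label, the prediction, and the gradient value are illustrative, not taken from a trained model):

```python
import numpy as np

def cross_entropy(a, b, eps=1e-12):
    """Loss(a, b) = -sum_i a_i * ln(b_i); eps guards against log(0)."""
    return -np.sum(a * np.log(b + eps))

def sgd_step(theta, grad, eta=0.1):
    """theta^{k+1} = theta^k - eta * grad."""
    return theta - eta * grad

a = np.array([1.0, 0.0])        # correct distribution (one-hot label)
b = np.array([0.8, 0.2])        # probability predicted by the network
loss = cross_entropy(a, b)      # -ln(0.8)
theta = sgd_step(np.array([0.5]), np.array([0.3]))
```

In a real training loop the gradient would come from backpropagation through the shared-weight branches; here it is a placeholder scalar.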
(43) In an example of the present disclosure, flowcharts of the training phase and the test phase are as shown in
(44) 3.3) a test image detection module:
(45) image classification: using the trained model to calculate a probability heatmap of the PET image of the test set. As shown in
Afterwards, a logistic regression algorithm is used to classify the probability heatmap corresponding to each PET image, to obtain a classification result, that is, the normal PET image or the epileptogenic focus PET image.
(47) Locating of the epileptogenic focus: performing bilinear interpolation on the probabilistic heatmap identified as the epileptogenic focus PET image, changing the probability heatmap to a heatmap having the same size as that of the original image, and predicting a region larger than a probability threshold as the epileptogenic focus region. A calculation formula of the bilinear interpolation is:
f(m+u, n+v)=(1−u)(1−v)f(m, n)+u(1−v)f(m+1, n)+(1−u)vf(m, n+1)+uvf(m+1, n+1)
(48) where f(m+u, n+v) is the newly calculated pixel value, f(m, n), f(m+1, n), f(m, n+1) and f(m+1, n+1) are respectively the four original pixel values around the new pixel value, and u and v are the distances between the new pixel point and the original pixel point along the two axes (0 ≤ u, v < 1). By setting a threshold k and keeping the region where heatmap ≥ heatmap_max × k, in which heatmap_max is the maximum value of the heatmap, the predicted epileptogenic focus region is finally obtained.
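Both steps can be illustrated on a toy heatmap (the 2×2 values and the threshold k = 0.5 are illustrative assumptions):

```python
import numpy as np

def bilinear(f, m, n, u, v):
    """f(m+u, n+v) = (1-u)(1-v) f(m,n) + u(1-v) f(m+1,n)
                   + (1-u) v f(m,n+1) + u v f(m+1,n+1)."""
    return ((1 - u) * (1 - v) * f[m, n] + u * (1 - v) * f[m + 1, n]
            + (1 - u) * v * f[m, n + 1] + u * v * f[m + 1, n + 1])

heatmap = np.array([[0.1, 0.2],
                    [0.3, 0.9]])
val = bilinear(heatmap, 0, 0, 0.5, 0.5)   # midpoint of the four pixels

# Threshold: predicted focus region is heatmap >= heatmap_max * k
k = 0.5
region = heatmap >= heatmap.max() * k
```

At u = v = 0.5 the interpolation reduces to the average of the four neighbours; the boolean mask then marks only the voxels near the heatmap peak as the predicted focus region.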
(49) In a specific case where the system of this embodiment is applied, as shown in
This patent is not limited to the preferred embodiment above. Inspired by this patent, one may derive various other forms of epileptogenic focus location systems based on deep learning, and all changes and modifications made within the scope of the claims of the present disclosure shall fall within the scope of this patent.