Tomographic image machine learning device and method
11494586 · 2022-11-08
Assignee
Inventors
Cpc classification
G06F18/214
PHYSICS
G06T2211/441
PHYSICS
G06V10/454
PHYSICS
G16H50/20
PHYSICS
A61B6/5217
HUMAN NECESSITIES
G06T11/008
PHYSICS
G16H50/70
PHYSICS
International classification
Abstract
There are provided a machine learning device and method which can prepare divided data suitable for machine learning from volume data for learning. A machine learning unit (15) calculates the detection accuracy of each organ O(j,i) in a predicted mask Pj using a loss function Loss. However, the detection accuracy of an organ O(k,i) with a volume ratio A(k,i)&lt;Th is not calculated. That is, in the predicted mask Pk, the detection accuracy of an organ O(k,i) whose volume ratio is sufficiently small is ignored. The machine learning unit (15) changes each connection load of a neural network (16) from an output layer side to an input layer side according to the loss function Loss.
Claims
1. A machine learning device comprising: a neural network; a machine learning unit; a processor configured to function as: a learning data input unit that receives an input of learning data, the learning data including volume data of a tomographic image and labeling of a first region in the volume data; a division unit that divides the learning data received by the learning data input unit to create divided learning data, wherein the divided learning data includes a second region, and a labeling of the second region corresponds to the labeling of the first region; a learning exclusion target region discrimination unit that discriminates a learning exclusion target region which is a region to be excluded from a target of learning, from the divided learning data created by the division unit and the learning data; wherein the machine learning unit trains the neural network to perform labeling of a region other than the learning exclusion target region discriminated by the learning exclusion target region discrimination unit by machine learning, on the basis of the divided learning data created by the division unit, wherein the learning exclusion target region discrimination unit compares a volume of the second region labeled in the divided learning data to a volume of the first region labeled in the learning data, and sets the second region as the learning exclusion target region to be excluded from the target of learning when a volume ratio of the second region to the first region is equal to or less than a predetermined threshold value.
2. The machine learning device according to claim 1, wherein the processor is further configured to function as: a detection accuracy calculation unit that calculates detection accuracy of a region other than the learning exclusion target region discriminated by the learning exclusion target region discrimination unit, wherein the machine learning unit performs the labeling of the region other than the learning exclusion target region by machine learning, on the basis of the divided learning data created by the division unit and the detection accuracy calculated by the detection accuracy calculation unit.
3. The machine learning device according to claim 2, wherein the detection accuracy calculation unit calculates the detection accuracy on the basis of an average of Intersection over Union (IoU) between a predicted label and a ground truth label of each region.
4. The machine learning device according to claim 1, wherein the division unit re-divides the learning data to create another piece of divided learning data in which the number of learning exclusion target regions is reduced.
5. The machine learning device according to claim 1, wherein the division unit creates pieces of divided learning data having an overlapping portion.
6. The machine learning device according to claim 1, wherein the tomographic image is a three-dimensional medical tomographic image, and the first region includes an organ.
7. A machine learning method executed by a computer, the machine learning method comprising: a step of receiving an input of learning data including volume data of a tomographic image and labeling of a first region in the volume data; a step of dividing the learning data to create divided learning data, wherein the divided learning data includes a second region, and a labeling of the second region corresponds to the labeling of the first region; a step of discriminating a learning exclusion target region which is a region to be excluded from a target of learning, from the divided learning data and the learning data; and a step of training a neural network to perform labeling of a region other than the learning exclusion target region by machine learning, on the basis of the divided learning data, wherein a volume of the second region labeled in the divided learning data is compared to a volume of the first region labeled in the learning data, and the second region is set as the learning exclusion target region to be excluded from the target of learning when a volume ratio of the second region to the first region is equal to or less than a predetermined threshold value.
8. A machine-learned model obtained by machine learning by the machine learning method according to claim 7.
9. A non-transitory computer-readable recording medium that records thereon, computer commands which cause a computer to execute the machine learning method according to claim 7 in a case where the computer commands are read by the computer.
10. A machine learning device comprising: a neural network; a machine learning unit; a processor configured to function as: a learning data input unit that receives an input of learning data, the learning data including volume data of a tomographic image and labeling of a first region in the volume data; a division unit that divides the learning data received by the learning data input unit to create divided learning data, wherein the divided learning data includes a second region, and a labeling of the second region corresponds to the labeling of the first region; a learning exclusion target region discrimination unit that discriminates a learning exclusion target region which is a region to be excluded from a target of learning, from the divided learning data created by the division unit and the learning data; wherein the machine learning unit trains the neural network to perform labeling of a region other than the learning exclusion target region discriminated by the learning exclusion target region discrimination unit by machine learning, on the basis of the divided learning data created by the division unit, wherein the learning exclusion target region discrimination unit compares an area of the second region labeled in the divided learning data to an area of the first region labeled in the learning data, and sets the second region as the learning exclusion target region to be excluded from the target of learning when an area ratio of the second region to the first region is equal to or less than a predetermined threshold value.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
DESCRIPTION OF THE PREFERRED EMBODIMENTS
(8) The original learning data input unit 11 receives an input of sets (original learning data) each consisting of volume data V, which is made up of a number of axial tomographic images (multi-slice images), and a ground truth mask G in which each voxel is classified into the type (class) of an anatomical structure, a doctor or the like having manually assigned (labeled) a ground truth label such as “lung”, “bronchi”, “blood vessel”, “air filling pattern”, or “others (background)” to each voxel included in the volume data.
(9) The original learning data division unit 12 divides (crops) the original learning data of which the input is received by the original learning data input unit 11, in the axial direction by a predetermined unit to create N pieces of divided learning data D1, D2, D3, . . . , and DN consisting of divided volume data V1, V2, V3, . . . , and VN and divided ground truth masks G1, G2, G3, . . . , and GN (refer to
(10) Two different pieces of divided learning data may include an overlapping portion. Further, the original learning data may be divided not only in the axial direction but also in a sagittal direction or a coronal direction.
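As a hedged illustration of this division scheme, the following Python sketch computes overlapping axial slice ranges. The function name `axial_chunks` and its parameters are illustrative conveniences, not terminology from this description.

```python
def axial_chunks(depth, unit, overlap=0):
    """Split axial slice indices [0, depth) into chunks of `unit` slices,
    each overlapping the previous chunk by `overlap` slices.

    Returns a list of (start, stop) half-open ranges covering the volume.
    """
    step = unit - overlap
    # stop generating starts once the remaining slices are already covered
    starts = range(0, max(depth - overlap, 1), step)
    return [(s, min(s + unit, depth)) for s in starts]
```

For example, a 10-slice volume divided into 4-slice units with a 1-slice overlap yields the ranges (0, 4), (3, 7), and (6, 10), so adjacent pieces of divided learning data share one slice.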
(11) The learning exclusion target discrimination unit 13 calculates, from the volume data V, a total volume Vi of each organ Oi (i is an integer of 1 or more), and calculates a volume V(j,i) (i=1 to n(j)) of the n(j) organs O(j,i) included in the divided ground truth mask Gj. O(j,i) is assigned the same organ label as Oi. However, O(j,i) and Oi do not necessarily match exactly, depending on the position of the division.
(12) For example, as illustrated in
(13) The learning exclusion target discrimination unit 13 calculates a volume ratio A(j,i)=V(j,i)/Vi(<1) of the organ O(j,i) in the divided ground truth mask Gj to the organ Oi in the ground truth mask G.
(14) The learning exclusion target discrimination unit 13 discriminates whether the entire organ Oi is included in the divided learning data Dj or only a part of the organ Oi is included, on the basis of the volume ratio A(j,i)=V(j,i)/Vi of the organ Oi in the divided ground truth mask Gj.
(15) Specifically, the learning exclusion target discrimination unit 13 discriminates whether the volume ratio A(j,i) falls below a predetermined threshold value Th (for example, Th=0.9, or more generally a value at or near 1). The learning exclusion target discrimination unit 13 discriminates the divided learning data Dk having a subscript k with A(k,i)&lt;Th as a learning exclusion target of the organ Oi. Hereinafter, the organ Oi with A(k,i)&lt;Th in the divided learning data Dk is expressed as O(k,i).
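A minimal sketch of this exclusion rule, assuming each organ's region is represented as a set of voxel coordinates keyed by organ label (the function and variable names here are hypothetical, chosen only to mirror Vi, V(j,i), and Th from the description):

```python
def learning_exclusion_targets(full_mask, divided_mask, th=0.9):
    """Return the organ labels whose volume ratio
    A(j,i) = V(j,i) / Vi falls below the threshold Th
    in one divided ground truth mask.

    full_mask, divided_mask: dict mapping organ label -> set of voxels.
    """
    excluded = set()
    for organ, voxels in full_mask.items():
        v_full = len(voxels)                           # Vi
        v_part = len(divided_mask.get(organ, set()))   # V(j,i)
        if v_full and v_part / v_full < th:
            excluded.add(organ)
    return excluded
```

An organ that is entirely absent from the divided mask has V(j,i)=0 and is therefore also excluded, which matches the intent of ignoring organs that were cut away by the division.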
(16) Instead of the volume ratio, an area ratio of the organ Oi included in the divided learning data Dj may be calculated from the divided learning data Dj in the sagittal direction or coronal direction, and whether only a part of the organ Oi is included in the divided learning data may be discriminated on the basis of the area ratio.
(17) The divided learning data output unit 14 outputs the divided learning data Dj subjected to the discrimination of the learning exclusion target discrimination unit 13, to the machine learning unit 15.
(18) The machine learning unit 15 causes the neural network 16 to perform machine learning on the basis of the divided learning data Dj output from the divided learning data output unit 14.
(19) The neural network 16 is a multi-layer classifier configured by a convolutional neural network (CNN) or the like.
(20) The machine learning of the neural network 16 by the machine learning unit 15 uses backpropagation (the error backpropagation method). Backpropagation is a method of comparing the teacher data for the input data with the actual output data obtained from the neural network 16, and changing each connection load from the output layer side to the input layer side on the basis of the error.
(21) Specifically, as illustrated in
(22) The machine learning unit 15 compares the predicted mask Pj with the divided ground truth mask Gj as the teacher data to perform backpropagation of the neural network 16 on the basis of the error. That is, the backpropagation of the neural network 16 is performed for each divided learning data Dj.
(23) However, in a case where the organ O(k,i) as the learning exclusion target is included in a predicted mask Pk corresponding to the divided learning data Dk, the machine learning unit 15 does not perform backpropagation of labeling of the organ O(k,i). The details will be described below.
(25) First, in S1 (divided learning data creation step), the original learning data division unit 12 creates N pieces of divided learning data D1, D2, . . . , and DN from the original learning data received by the original learning data input unit 11. N is an integer of 2 or more.
(26) In S2 (volume calculation step), the learning exclusion target discrimination unit 13 calculates a volume ratio A(j,i)=V(j,i)/Vi of each organ O(j,i) from the divided ground truth mask Gj and the ground truth mask G.
(27) In S3 (exclusion target specifying step), the learning exclusion target discrimination unit 13 determines whether the volume ratio A(j,i) is less than the predetermined threshold value Th. For each subscript j=k with A(k,i)&lt;Th, the learning exclusion target discrimination unit 13 discriminates the organ O(k,i) as the learning exclusion target in the divided learning data Dk.
(28) In S4 (predicted mask creation step), the neural network 16 receives the divided volume data Vj of the divided learning data Dj as input and creates the predicted mask Pj of each of the n(j) organs O(j,i). Here, j=1, 2, . . . , N.
(29) In S5 (loss calculation step), the machine learning unit 15 calculates the detection accuracy of each organ O(j,i) in the predicted mask Pj using a loss function Loss. However, when j=k, the detection accuracy of the organ O(k,i) is not calculated. That is, in the predicted mask Pk, the detection accuracy of the organ O(k,i), whose volume ratio is below the threshold value, is ignored.
(30) Specifically, in the predicted mask Pj, the detection accuracy acc(j,i) is calculated for each of n(j) types of organs O(j,i) except for the organ O(k,i) as the learning exclusion target, and the average value thereof is regarded as the loss function Loss(j) corresponding to the divided learning data Dj.
(31) Loss(j)=Avg(acc(j,i)) (i=1, 2, . . . , n(j), excluding the learning exclusion target O(k,i)). acc(j,i) is the Intersection over Union (IoU) of each organ O(j,i) in the predicted mask Pj. That is, the IoU is a value obtained by dividing the number of voxels in the intersection of the set Pr(i) of the organ O(j,i) in the predicted mask Pj and the set Ht of the organ O(j,i) in the divided ground truth mask Gj, by the number of voxels in the union of the set Pr(i) and the set Ht. As the detection accuracy of each organ O(j,i) in the divided volume data Vj increases, acc(j,i) approaches 1. However, in a case where there are many organs with a low detection accuracy, the loss function Loss does not approach 1 even when the detection accuracy of another organ is high. The detection accuracy of an organ whose volume ratio is less than the threshold value is not reflected in the value of the loss function Loss in the first place.
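The IoU-averaging step above can be sketched directly with Python sets, since Pr(i) and Ht are defined as voxel sets. This is an illustrative reading, not the patented implementation; the names `iou` and `loss_j` and the dict representation are assumptions.

```python
def iou(pred, truth):
    """Intersection over Union of two voxel sets Pr(i) and Ht."""
    union = pred | truth
    return len(pred & truth) / len(union) if union else 0.0

def loss_j(pred_mask, truth_mask, excluded):
    """Average IoU over the organs in one divided mask, skipping the
    learning exclusion targets O(k,i). Approaches 1 as accuracy improves,
    matching the description's convention for Loss(j).
    """
    accs = [iou(pred_mask.get(organ, set()), voxels)
            for organ, voxels in truth_mask.items()
            if organ not in excluded]
    return sum(accs) / len(accs) if accs else 0.0
```

For instance, with a lung predicted at IoU 0.75 and a cut-off bronchus in the exclusion set, Loss(j) is 0.75; without the exclusion, the poorly detectable bronchus fragment would drag the average down.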
(32) The expression for calculating the detection accuracy is not limited to the above description. In general, the expression can be represented by Expression (1).
acc(i)=f1(Pr(i)∩Ht)/f2(Pr(i)∪Ht) (1)
(33) f1 is a function using Expression (2) as a parameter.
Pr(i)∩Ht (2)
(34) f2 is a function using Expression (3) as a parameter.
Pr(i)∪Ht (3)
(35) For example, a value obtained by multiplying the IoU by a constant (such as 100), or a Dice coefficient, may be used as acc(i).
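As a sketch of this alternative, the Dice coefficient fits Expression (1) with f1 doubling the intersection size and f2 replaced by the sum of the two set sizes (a standard variant; the function name is illustrative):

```python
def dice(pred, truth):
    """Dice coefficient 2*|Pr(i) ∩ Ht| / (|Pr(i)| + |Ht|) of two voxel sets."""
    denom = len(pred) + len(truth)
    return 2 * len(pred & truth) / denom if denom else 0.0
```

Like the IoU, it approaches 1 as the predicted region approaches the ground truth region, but it weights the intersection more heavily.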
(36) In S6 (backpropagation step), the machine learning unit 15 changes each connection load of the neural network 16 from the output layer side to the input layer side according to the loss function Loss.
(37) In S7 (divided learning data creation step), the original learning data division unit 12 re-creates the divided learning data Dk. In this case, the original learning data division unit 12 re-divides the original learning data such that the entire organ O(k,i) is included in the divided learning data Dk. However, the unit for re-division is also constrained by hardware resources. The process then returns to S2, and for the re-divided learning data Dk, the predicted mask Pk of each organ, including the organ O(k,i), is created.
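One plausible way to realize this re-division, sketched here under the assumption that the cut organ's axial extent is known as a (z_min, z_max) slice range (the function name and the `None` convention for the hardware-constrained failure case are hypothetical):

```python
def redivide_for_organ(organ_z, depth, max_unit):
    """Choose an axial range [start, stop) of at most max_unit slices
    that contains the whole organ, whose z-extent is
    organ_z = (z_min, z_max), inclusive.

    Returns None when the organ spans more slices than the
    hardware-limited unit allows, i.e. it cannot be made whole.
    """
    z_min, z_max = organ_z
    if z_max - z_min + 1 > max_unit:
        return None
    # shift the window left only as far as the volume boundary requires
    start = max(0, min(z_min, depth - max_unit))
    return (start, min(start + max_unit, depth))
```

For a 20-slice volume and an 8-slice unit, an organ occupying slices 15 to 19 yields the window (12, 20), so the re-divided piece Dk contains the entire organ and its mask can re-enter the loss calculation.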
(39) Thus, it is possible to perform backpropagation so as to improve the detection accuracy of any organ regardless of the volume of the organ. However, in a case where an organ having a volume ratio smaller than the threshold value is included in the divided learning data, the detection accuracy of that organ is not reflected in the loss function. Therefore, it is possible to prevent the detection accuracy of a part of an organ that is cut by the division of the original learning data from being reflected in the loss function and adversely affecting the backpropagation.
(40) Further, an organ cut by the division of the learning data can be subjected to the calculation of detection accuracy and to backpropagation through re-division of the learning data.
EXPLANATION OF REFERENCES
(41) 1: machine learning device 11: original learning data input unit 12: original learning data division unit 13: learning exclusion target discrimination unit 14: divided learning data output unit 15: machine learning unit 16: neural network