FUNCTIONAL MEASUREMENTS IN ECHOCARDIOGRAPHY
20200074625 · 2020-03-05
Abstract
A method for processing echocardiography data to enable automatic functional measurements based on cardiac ultrasound images as an input, including (i) classification of the cardiac ultrasound images to ensure that relevant images are passed on to the next steps, optionally utilising a first neural network, such as a convolutional neural network, (ii) segmentation and semantic partitioning of the left ventricle (LV) myocardium to extract relevant parts of the image, optionally by using a second neural network, (iii) regional motion estimates to determine a mapping of displacements in the extracted parts of the image and to output estimated tissue motion vectors for the extracted parts of the image, optionally using a third neural network, and (iv) fusion of measurements via state estimation applied to the tissue motion vectors and thereby incorporating a temporal domain to produce data showing variation of the estimated measurements over time.
Claims
1. A method for processing echocardiography data in order to enable automatic functional measurements based on cardiac ultrasound images as an input, the method comprising: (i) classification of the cardiac ultrasound images in order to ensure that relevant images are passed on to the next steps; (ii) segmentation and semantic partitioning of the left ventricle (LV) myocardium to extract relevant parts of the images; (iii) regional motion estimates to determine a mapping of displacements in the extracted parts of the image and to output estimated tissue motion vectors for the extracted parts of the image; and (iv) fusion of measurements via state estimation applied to the tissue motion vectors and thereby incorporating a temporal domain to produce data showing variation of the estimated measurements over time; wherein at least one of steps (i), (ii) and (iii) is done using a neural network, or a part of a neural network.
2. A method as claimed in claim 1, wherein step (i) uses a first neural network, or a first part of a neural network, step (ii) uses a second neural network, or second part of a neural network, and step (iii) uses a third neural network, or third part of a neural network.
3. A method as claimed in claim 1, wherein step (i) uses a convolutional neural network (CNN) in the form of a feed-forward CNN composed of inception blocks and a dense connectivity pattern.
4. A method as claimed in claim 1, wherein the segmentation at step (ii) uses a U-Net type of CNN to classify the LV myocardium, with the pixel map of the segmentation being processed and used to define regions of interest (ROI) and the basis of measurement kernels.
5. A method as claimed in claim 3, wherein the segmentation at step (ii) is used for masking to extract the relevant parts of the images and also used for centerline extraction.
6. A method as claimed in claim 5 wherein the contour of the segmentation mask is used to define the endo- and epicardial borders, and the centerline is sampled along the myocard, with the latter being passed to the state estimation at step (iv).
7. A method as claimed in claim 1, wherein the motion estimation at step (iii) uses the neural networks referred to as FlowNets and includes stacking of multiple U-Net architectures with image warping of intermediate motion and propagation of brightness error.
8. A method as claimed in claim 7, wherein step (iii) includes two parallel routes for the motion estimation in order to tackle larger and smaller displacements separately.
9. A method as claimed in claim 8, wherein: larger displacements are predicted by stacking three U-Net architectures, the first of which includes explicit correlation of feature maps, while the two succeeding are standard U-Net architectures without custom layers; and wherein for smaller displacements only one U-Net is used, but compared to the networks for the larger displacements, the kernel size and stride of the first layer are reduced.
10. A method as claimed in claim 1, including a further step (v), which comprises calculation of clinical indices.
11. A method as claimed in claim 10, wherein step (v) uses the updated point-velocity components from step (iv) in order to calculate one or more clinical indices selected from: global or regional longitudinal myocardial strain; global or regional longitudinal myocardial strain rate; regional myocardial velocity; and/or regional myocardial displacement.
12. A method as claimed in claim 1, wherein the input cardiac images are cardiac B-mode ultrasound data.
13. A method as claimed in claim 1, comprising automatically providing an output comprising regional numeric deformation values for all segments of the left ventricle.
14. A method as claimed in claim 1, wherein cardiac cycles which have inferior quality are automatically discarded during initial processing at step (i).
15. A method as claimed in claim 14 wherein the automatic discarding of ultrasound data occurs: when view recognition fails due to view recognition confidence being lower than a threshold; when segmentation indicates that apex is moving away and towards the probe during the cardiac cycle; and/or when regularization is failing.
16. A method as claimed in claim 1, including outputting information via a user interface that displays a trend graph for regional and/or global values based on processing of a current exam and optionally also prior exams for the same patient.
17. A method as claimed in claim 1, wherein at the end of an exam, the user is automatically presented with a Left ventricle segmental bullseye generated in background based on processing recorded images while the user was completing the exam.
18. A computer programme product containing instructions that, when executed, will configure a computer device to carry out the method of claim 1.
19. A computer device configured to carry out the method of claim 1.
20. An ultrasound imaging system comprising a computer device as claimed in claim 19 for processing ultrasound images obtained via the imaging system.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0048] An example embodiment of the invention will now be described by way of example only and with reference to the accompanying drawings, in which:
[0049]
[0050]
[0051]
DETAILED DESCRIPTION
[0052] As described in further detail below, a method has been developed that enables automatic functional measurements in 2D echocardiography. The system works in an end-to-end fashion with standard cardiac ultrasound images as input, and several clinical measures, as well as motion estimates and regional masks as direct output.
[0053] The core of the method comprises five components: (i) classification of cardiac view, (ii) segmentation and semantic partitioning of the left ventricle (LV) myocardium, (iii) regional motion estimates, (iv) fusion of measurements and (v) calculation of clinical indices. An example setup for measuring global longitudinal strain after view classification is illustrated in
[0054] The steps (i) to (iv) summarised above are implemented as follows:
[0055] (i) Cardiac View Classification:
[0056] Utilizes a convolutional neural network (CNN) to classify cardiac views. This is used to assure that relevant images obtained in the apical acoustic window are forwarded through the measurement pipeline.
[0057] The view classification is the first essential step in the automatic pipeline, and is used to quality assure and sort incoming data. We employ a feed-forward CNN composed of inception blocks and a dense connectivity pattern. Initially, input is propagated through two component blocks with (3×3) convolution kernels followed by max pooling. The first and second convolution layers have 16 and 32 filters respectively. We use pooling with size (2×2) and equal strides. After the second pooling layer, data is processed through an inception module with three parallel routes. Each route consists of a bottleneck, two of which are followed by blocks with larger convolution kernels, i.e. (3×3) and (5×5) respectively. The input of the inception module is concatenated with the output and processed into a transition module with bottleneck and max pooling. This step is repeated three times, and we double the number of filters before every new pooling layer. The dense connectivity pattern alleviates the vanishing gradient problem, and can enhance feature propagation and reusability. After the third transition, the data is processed through two inception blocks with a constant number of filters and no pooling. The route with (5×5) convolution kernels is omitted in these modules, and dropout regularization is used between them. The final classification block consists of a compressing convolution layer with (1×1) kernels and a number of filters equal to the class count. This is activated with another PReLU, before features are spatially averaged and fed into a softmax activation.
[0058] Training is performed from scratch with Adam optimizer and categorical cross entropy loss, with input size of (128×128) greyscale. A total of eight classes were used for training: the apical four-chamber, two-chamber and long-axis, the parasternal long- and short-axis, subcostal four-chamber and vena cava inferior, as well as a class for unknown data. The final network classifies the different cardiac views, and if applicable, i.e. high confidence of apical four-chamber, the image is processed into the remaining processing chain.
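The gating at the end of the classification step can be sketched as follows. This is a minimal illustration, not the trained network: it assumes the classifier already yields one raw score per class, and the class identifiers, the function name `gate_frame` and the confidence threshold of 0.9 are hypothetical choices that are not specified in the text.

```python
import math

# The eight classes named in the text; the identifier spellings are illustrative.
VIEWS = [
    "apical_4ch", "apical_2ch", "apical_long_axis",
    "parasternal_long_axis", "parasternal_short_axis",
    "subcostal_4ch", "subcostal_vena_cava_inferior", "unknown",
]


def softmax(scores):
    """Numerically stable softmax over a list of raw class scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gate_frame(scores, target="apical_4ch", min_confidence=0.9):
    """Return (forward, view, confidence) for one frame.

    `min_confidence` is an assumed threshold; the text only asks for
    "high confidence" of the apical four-chamber view.
    """
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    view, confidence = VIEWS[best], probs[best]
    forward = view == target and confidence >= min_confidence
    return forward, view, confidence
```

A frame is passed on to segmentation only when `forward` is true; all other frames are sorted by their predicted view or dropped.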
[0059] (ii) Segmentation: Utilizes a U-Net type of CNN to classify the LV myocardium. The pixel map of the segmentation is processed and used to define regions of interest (ROI) and the basis of measurement kernels. The implementation allows high modularity within the segmentation masks in terms of measurement kernels, e.g. regional (base, mid, apical), centerlines with different widths, points, epicard and endocard borderlines. See
[0060] A standard U-Net type of CNN is utilized. The architecture consists of a downsampling part and an upsampling part of five levels, with concatenating cross-over connections between equally sized feature maps. Each level has two convolution layers with the same number of filters, ranging from 32 to 128 from top to bottom respectively. All filters have a size of (3×3). Max pooling with size (2×2) and equal strides was used for downsampling and nearest neighbour for upsampling. Training was performed with Adam optimizer and Dice loss, and the size of the input image was set to (256×256) greyscale. The output of the network is a segmentation mask.
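For reference, the Dice measure used for training and evaluation of the segmentation can be written out as below. This is a sketch on plain nested lists rather than the training code itself; the function names and the small smoothing constant `eps` are assumptions added to keep the ratio defined for empty masks.

```python
def dice_score(pred, truth, eps=1e-7):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two masks of equal shape.

    Masks are nested lists of 0/1 values (or probabilities in [0, 1]).
    """
    overlap = sum(p * t
                  for pred_row, truth_row in zip(pred, truth)
                  for p, t in zip(pred_row, truth_row))
    total = sum(map(sum, pred)) + sum(map(sum, truth))
    return (2.0 * overlap + eps) / (total + eps)


def dice_loss(pred, truth):
    """Loss that reaches 0 when prediction and reference coincide."""
    return 1.0 - dice_score(pred, truth)
```

For example, a predicted mask with two foreground pixels that overlaps a one-pixel reference mask in a single pixel gives a Dice score of about 2/3.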
[0061] The segmentation is used as a basis for two different tasks: masking the input of the motion estimation network I.sub.m, and centerline extraction. We mask the ultrasound image I to remove redundant input signal. The contour of the segmentation was used to define the endo- and epicardial borders, and further the centerline C=[(x,y).sub.1, . . . , (x,y).sub.N] was sampled between them with N=120 equally spaced points along the myocard. The latter is passed to the Kalman filter.
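The equally spaced sampling of the centerline can be illustrated with a short sketch. It assumes that a centerline polyline has already been derived from the endo- and epicardial contours (that derivation is not shown), and the function name `resample_polyline` is hypothetical.

```python
import math


def resample_polyline(points, n=120):
    """Return n points equally spaced in arc length along a polyline.

    `points` is a list of (x, y) tuples ordered along the myocard.
    """
    if len(points) < 2 or n < 2:
        raise ValueError("need at least two input points and n >= 2")
    # Cumulative arc length at every input vertex.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    out, seg = [], 0
    for i in range(n):
        s = total * i / (n - 1)  # target arc length of the i-th sample
        # Advance to the segment that contains arc length s.
        while seg < len(points) - 2 and cum[seg + 1] < s:
            seg += 1
        seg_len = cum[seg + 1] - cum[seg]
        w = 0.0 if seg_len == 0 else min(1.0, (s - cum[seg]) / seg_len)
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        out.append((x0 + w * (x1 - x0), y0 + w * (y1 - y0)))
    return out
```

Calling `resample_polyline(contour_midline, 120)` yields the N=120 points of C; the first and last samples coincide with the two ends of the input polyline.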
[0062] Both networks are trained on data from up to 500 patients annotated by experts, and the results have been extensively evaluated, showing state-of-the-art accuracy. The two preceding steps hence include the following advances:
[0063] Real-time classification of standard views in transthoracic echocardiography using convolutional neural networks.
[0064] Automatic, fast and accurate cardiac ultrasound segmentation using deep learning.
[0065] (iii) Motion estimation: the motion estimation step has been developed based on the work done by Ilg et al in the paper entitled Flownet 2.0: Evolution of optical flow estimation with deep networks, IEEE conference on computer vision and pattern recognition (CVPR), Vol. 2, 2017. It thus uses the networks referred to as FlowNets. The design involves stacking of multiple U-Net architectures with image warping of intermediate motion and propagation of brightness error. The output prediction of the network is dense tissue motion vectors in the masked area.
[0066] Two parallel routes are created to tackle large and small displacements separately. The former is handled by stacking three U-Net architectures, the first of which includes explicit correlation of feature maps, while the two succeeding are standard U-Net architectures without custom layers. For small displacements, only one U-Net is used, but compared to the networks for large displacements, the kernel size and stride of the first layer are reduced. At the end, the two routes are fused together with a simple CNN. Proper handling of variable motion is essential in echocardiography, and such a complex setup can learn to reproduce motion from the whole cardiac cycle. The networks are trained separately, in a schedule consisting of different synthetic datasets with a wide range of motion vector representations. The small displacement network is fine-tuned on a dataset modified for subpixel motion. Adam optimizer and endpoint error loss are used while training all the networks. In this example the input size of the network was kept the same as in the original implementation, i.e. (512×384).
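The endpoint error used as the training loss is the mean Euclidean distance between predicted and reference motion vectors. A minimal sketch on flat lists of (v.sub.x, v.sub.y) pairs follows; the function name is illustrative and the batching used during training is not shown.

```python
import math


def endpoint_error(pred, truth):
    """Mean Euclidean distance between predicted and reference flow vectors.

    Both arguments are equally long lists of (vx, vy) tuples, one per pixel.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    distances = (math.hypot(pu - tu, pv - tv)
                 for (pu, pv), (tu, tv) in zip(pred, truth))
    return sum(distances) / len(pred)
```

A prediction that is off by (3, 4) pixels in one of two locations and exact in the other has an endpoint error of 2.5 pixels.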
[0067] The output prediction of the network is dense tissue motion vectors in the masked ultrasound area. The centerline C of the current segmentation is used to extract the corresponding set of motion vectors M={(v.sub.x,v.sub.y).sub.1, . . . , (v.sub.x,v.sub.y).sub.N}. Apart from the fundamentals of motion, the training datasets bear no resemblance to echocardiography. We assume that more relevant data with ground truth information, such as simulations, can improve the results.
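Extraction of the set M from the dense motion field can be sketched as a lookup at the centerline points. The text does not state how the field is sampled, so nearest-neighbour lookup with clamping at the image border is an assumption here, as are the function name and the `flow[row][column]` layout.

```python
def sample_flow(flow, centerline):
    """Pick one motion vector per centerline point from a dense flow field.

    `flow[row][col]` holds a (vx, vy) tuple; `centerline` holds (x, y)
    points with x along columns and y along rows.
    """
    rows, cols = len(flow), len(flow[0])
    vectors = []
    for x, y in centerline:
        # Nearest pixel, clamped so points on the border stay inside the field.
        col = min(max(int(round(x)), 0), cols - 1)
        row = min(max(int(round(y)), 0), rows - 1)
        vectors.append(flow[row][col])
    return vectors
```

Bilinear interpolation would be a natural refinement for subpixel centerline coordinates.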
[0068] (iv) Fusion of estimations: Performed by employing a Kalman filter on every point-velocity component in the measurement kernel with a constant acceleration model. In addition to the motion estimation, this serves as a simple method for further incorporating the temporal domain, which is natural in the context of echocardiography. It adds temporal smoothing, reducing potential burst noise detectable in image-to-image measurements. The updated point-velocity components are then used to calculate several clinical indices. The updated centerline C is used to calculate the longitudinal ventricular length $\ell(t)$, i.e. the arc length, for each timestep t. Further, this is used to estimate the global longitudinal strain $\epsilon(t) = (\ell(t) - \ell_0)/\ell_0$ along the center of the myocard.
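A Kalman filter of this kind, applied to a single point-velocity component, might look as follows. This is a sketch under stated assumptions: the state is taken to be (velocity, acceleration) with the acceleration held constant between frames, and the frame interval `dt`, process noise `q`, measurement noise `r` and the initial covariance are hypothetical values, since the text does not give the filter parameters.

```python
def kalman_smooth(measurements, dt=0.02, q=1e-3, r=0.1):
    """Filter one velocity component and return the updated value per frame."""
    if not measurements:
        return []
    v, a = float(measurements[0]), 0.0        # state: velocity, acceleration
    p00, p01, p10, p11 = 1.0, 0.0, 0.0, 1.0   # state covariance P
    out = []
    for z in measurements:
        # Predict with F = [[1, dt], [0, 1]]: acceleration is held constant.
        v += dt * a
        n00 = p00 + dt * (p01 + p10) + dt * dt * p11 + q
        n01 = p01 + dt * p11
        n10 = p10 + dt * p11
        n11 = p11 + q
        # Update with the measured velocity z (H = [1, 0]).
        s = n00 + r                  # innovation variance
        k0, k1 = n00 / s, n10 / s    # Kalman gain
        y = z - v                    # innovation
        v += k0 * y
        a += k1 * y
        p00, p01 = (1.0 - k0) * n00, (1.0 - k0) * n01
        p10, p11 = n10 - k1 * n00, n11 - k1 * n01
        out.append(v)
    return out
```

Running the filter once per component and per centerline point gives the updated point-velocity set. An isolated outlier in an otherwise steady sequence is pulled back towards the prediction, which is the burst-noise suppression described above.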
[0069] Subsequent to steps (i) to (iv) the output estimations may be used in the calculation of various clinical indices, and hence a further step may be:
[0070] (v) Calculation of clinical indices: The current implementation features calculations of the following indices of LV function:
[0071] Global and regional longitudinal myocardial strain
[0072] Global and regional longitudinal myocardial strain rate
[0073] Regional myocardial velocity
[0074] Regional myocardial displacement
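As a worked illustration of the strain indices, the sketch below computes longitudinal strain as the relative change in centerline arc length, and strain rate as its time derivative. The choice of the first frame as the reference length and the forward finite difference are assumptions, and the function names are hypothetical.

```python
import math


def arc_length(centerline):
    """Length of a polyline given as a list of (x, y) points."""
    return sum(math.hypot(x1 - x0, y1 - y0)
               for (x0, y0), (x1, y1) in zip(centerline, centerline[1:]))


def longitudinal_strain(centerlines, ref_index=0):
    """Strain per frame: (length - reference length) / reference length.

    `centerlines` holds one centerline per frame; `ref_index` selects the
    reference frame (assumed here to be the first frame of the cycle).
    """
    lengths = [arc_length(c) for c in centerlines]
    ref = lengths[ref_index]
    return [(length - ref) / ref for length in lengths]


def strain_rate(strain, dt):
    """Forward finite difference of strain with frame interval dt (seconds)."""
    return [(b - a) / dt for a, b in zip(strain, strain[1:])]
```

A centerline that shortens from 10 to 8 length units gives a strain of -0.2, i.e. -20%. Regional values follow by restricting each centerline to the points of one segment before taking the arc length.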
[0075] In summary, the method may provide one or more of the following:
[0076] A modular implementation and extensive details about the methods.
[0077] A setup for training the convolutional neural networks with own data or existing methods.
[0078] Extraction of specific components for use in other pipelines and methods (e.g. motion estimation).
[0079] Deployable models with trained parameters ready for use (current implementation supports TensorFlow backend).
[0080] The different CNNs used in this system are based on publicly available architectures, and thus each is readily implemented with appropriate knowledge of the teaching presented herein. However, the pipeline composition and corresponding synergy together with traditional methods has not previously been attempted. The combination of features set out herein facilitates a simple and standardized approach to cardiac functional imaging, enabling fully automated processing of ultrasound data making use of the advantages provided by deep learning techniques. Some potential key points for this innovation include:
[0081] Fusion of segmentation and motion estimation. Combining sensory data from disparate sources will potentially result in less uncertainty than the sources being used individually.
[0082] Regularization of motion measurements by removing signal outside the region of interest based on segmentation of the myocard.
[0083] The motion estimation network can learn highly non-linear motion, and can overcome the limitations and rigidity of model based methods.
[0084] The setup can learn to reproduce results of existing methods, such as speckle tracking or simulators.
[0085] The method is fully automatic and requires no interaction by the user other than supplying ultrasound data.
[0086] As an example of the effectiveness of the proposed method
[0087] For validation of GLS, 21 subjects referred for evaluation of cardiac disease in two clinical studies were included. Both studies are REB approved, and informed consent was given. Two specialists in cardiology performed standard strain measurements using a semi-automatic method implemented in GE EchoPAC. The method uses speckle tracking to estimate myocardial deformation, but the methodology is unknown. The results were used as a reference for evaluating the implemented pipeline.
[0088] GLS was obtained successfully in all patients. The results for apical four-chamber views are displayed in
[0089] The view classification achieved an image-wise F.sub.1 score of 97.9% on four-chamber data of 260 patients, and the segmentation a Dice score of (0.87±0.03) on 50 patients, all unknown and independent from the training set. The system was implemented as a TensorFlow dataflow graph, enabling easy deployment and optimized inference. Using a modern laptop with an Nvidia GTX 1070 GPU, the average inference time was estimated to 1151 ms per frame, where flow prediction accounts for approximately 70% of the runtime.
[0090] Compared to the reference method, the measurements from the proposed pipeline were slightly underestimated. The reference method is not a gold standard for GLS and might not necessarily yield correct results for all cases. Speckle tracking can fail where noise hampers the echogenicity. We could identify poor tracking in the apical area due to noise for some subjects, and this would in turn result in larger strain. Further, the vendor comparison study shows that the commercial system used in this study on average overestimates the mean of all vendors by 1.7%. With this in mind, we note that the results from the current implementation are in the expected range. For individual cases, the deformation curves have overlapping and synchronized trends, as is evident from
[0091] The proposed pipeline involves several sources of error, especially the segmentation and motion networks, which are the fundamental building blocks of the measurements. Using the segmentation mask to remove redundant signal in the ultrasound image seems feasible and useful for removing some noise in the motion network. However, it is not essential when measuring the components of the centerline, as they are far from the borders of the myocard, where the effect is noticeable.
[0092] Future developments and potential refinements to the method include the addition of multiple views, e.g. apical two-chamber and long-axis, allowing average GLS. This is considered a more robust metric, less prone to regional noise. Also, the fusion of models is currently naive, and we expect results to improve by introducing models with more relevance to cardiac motion. The same holds for the motion estimation, i.e. the network could benefit from training on more relevant data. Further, we wish to do this for regional strain measurements. For clinical validation, we need to systematically include the subject condition and a larger test material.
[0093] Thus, it will be apparent from the above that the proposed method can provide the following advantages and/or may include the features of the following clauses as alternatives to or in addition to the features set out in the appended claims: [0094] An ultrasound processing method comprising: accessing cardiac B-mode ultrasound data; processing the ultrasound data through a neural network (Deep Learning) pipeline; at end of processing, providing regional numeric deformation values for all segments of the left ventricle.
[0095] OR [0096] An ultrasound imaging system comprising: accessing cardiac B-mode ultrasound data; processing the ultrasound data through a neural network (Deep Learning) pipeline; at end of processing, providing regional numeric deformation values for all segments of the left ventricle.
[0097] Optionally also one or more of:
[0098] The pipeline is fully automatic.
[0099] Cardiac cycles which have inferior quality for robust deformation estimation are automatically discarded and/or a warning is issued.
[0100] The pipeline consists of a view recognition step, a segmentation step, a motion estimation step and a regularization step.
[0101] View recognition is performed by a convolutional neural network (CNN).
[0102] Segmentation is performed by a U-Net type CNN.
[0103] Motion estimation is performed by a FlowNet type network.
[0104] Automatic discarding of ultrasound data from a cardiac cycle, or issuance of a warning, happens when view recognition fails due to view recognition confidence being lower than a threshold.
[0105] The view recognition confidence is calculated as the percentage of image frames classified to the same view compared to all frames classified from that cycle.
[0106] Automatic discard of ultrasound data from a cardiac cycle happens when segmentation indicates that apex is moving away and towards the probe during the cardiac cycle. (This indicates fore-shortening.)
[0107] Automatic discard of ultrasound data from a cardiac cycle happens when regularization is failing.
[0108] Processing is applied to several stored cardiac cycles from one patient.
[0109] Processing is applied to several stored cardiac cycles from different patients.
[0110] Results from several stored cardiac cycles from different patients are used to compare regional strain between patient groups.
[0111] A user interface displays a trend graph for regional and/or global values based on processing of current and prior exams for the same patient.
[0112] Processing is applied to images from different stress levels and the results are compared to find segments where regional deformation diverges between stress levels.
[0113] A user selects exams to be processed. Later (after processing is completed) the user is presented with results showing average regional and global deformation values for all selected exams together with the variance of the results.
[0114] Deformation values are used to calculate strain.
[0115] Deformation values are used to calculate strain rate.
[0116] Automatically, at the end of an examination, the user is presented with a Left ventricle segmental bullseye generated in background based on processing recorded images while the user was completing the exam. Thus, processing occurs automatically on apical B-mode images while the user continues to record colorflow and Doppler images.
[0117] Automatically, in background while the user continues to record images, processing of already recorded images is performed. If a left ventricle segment falls outside normal deformation value ranges, a warning highlighting this segment and its non-normal value is presented to the user.
[0118] Results are presented as bullseye plots.
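The per-cycle view recognition confidence and the resulting discard decision can be sketched as follows. The confidence is the share of frames in the cycle assigned to the most frequent view, as described above; the view labels, the function names and the threshold value of 0.8 are hypothetical.

```python
from collections import Counter


def cycle_view_confidence(frame_views):
    """Return (dominant view, share of frames classified to that view)."""
    if not frame_views:
        raise ValueError("cycle contains no classified frames")
    view, count = Counter(frame_views).most_common(1)[0]
    return view, count / len(frame_views)


def keep_cycle(frame_views, target="apical_4ch", threshold=0.8):
    """Keep a cardiac cycle only if it is confidently the target view.

    `threshold` is an assumed value; cycles below it would be discarded
    or flagged with a warning.
    """
    view, confidence = cycle_view_confidence(frame_views)
    return view == target and confidence >= threshold
```

For a cycle of ten frames in which nine are classified as apical four-chamber, the confidence is 0.9 and the cycle is kept; a cycle split evenly between two labels has a confidence of 0.5 and is discarded.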
[0119] It should be apparent that the foregoing relates only to the preferred embodiments of the present application and the resultant patent. Numerous changes and modifications may be made herein by one of ordinary skill in the art without departing from the general spirit and scope of the invention as defined by the following claims and the equivalents thereof.