GPU BASED IMPLEMENTATION OF SENSE (A PARALLEL MRI ALGORITHM) USING LEFT INVERSE METHOD
20170371020 · 2017-12-28
Cpc classification
G01R33/5611
PHYSICS
Abstract
A method including: constructing a coil sensitivity encoding matrix; inverting the coil sensitivity encoding matrix using a Left Inverse method; and multiplying the inverse of the coil sensitivity encoding matrix with an under-sampled data matrix using a GPU residing on a host computer.
Claims
1. A method, comprising: constructing a coil sensitivity encoding matrix; inversing of the coil sensitivity encoding matrix using a Left Inverse method; and multiplying an inverse of the coil sensitivity encoding matrix with an under-sampled data matrix using a Graphical Processing Unit (GPU) residing on a host computer.
2. The method of claim 1 which is implemented on GPU to exploit maximum parallelism using a parallel approach.
3. The method of claim 2, further comprising computing all independent tasks on GPU by utilizing a maximum number of threads.
4. The method of claim 1, further comprising acquiring the under-sampled data by skipping k-space lines.
5. The method of claim 1 further comprising reconstructing Magnetic Resonance (MR) images by performing the inversion of coil sensitivity information.
6. The method of claim 1, further comprising reconstructing Magnetic Resonance (MR) images from the under-sampled data acquired from an MRI scanner having multiple receiver coils.
7. The method of claim 6, wherein an acceleration factor is less than the number of the multiple receiver coils.
8. The method in claim 1, wherein MR signals are used and are acquired by Cartesian sampling.
9. A system, comprising: a Magnetic Resonance Imaging (MRI) scanner and the host computer comprising the GPU, wherein the data acquired from the MRI scanner is processed by the GPU by applying the method of claim 1.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0026] The accompanying drawings, which are included to provide a further understanding of the inventive concept, and are incorporated in and constitute a part of this specification, illustrate exemplary embodiments of the inventive concept, and, together with the description, serve to explain principles of the inventive concept.
DETAILED DESCRIPTION OF ILLUSTRATED EMBODIMENTS
[0039] In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of various exemplary embodiments. It is apparent, however, that various exemplary embodiments may be practiced without these specific details or with one or more equivalent arrangements. In other instances, well-known structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring various exemplary embodiments.
[0040] Unless otherwise specified, the illustrated exemplary embodiments are to be understood as providing exemplary features of varying detail of various exemplary embodiments. Therefore, unless otherwise specified, the features, components, modules, layers, films, panels, regions, and/or aspects of the various illustrations may be otherwise combined, separated, interchanged, and/or rearranged without departing from the disclosed exemplary embodiments. Further, in the accompanying figures, the size and relative sizes of layers, films, panels, regions, etc., may be exaggerated for clarity and descriptive purposes. When an exemplary embodiment may be implemented differently, a specific process order may be performed differently from the described order. For example, two consecutively described processes may be performed substantially at the same time or performed in an order opposite to the described order. Also, like reference numerals denote like elements.
[0041] For the purposes of this disclosure, “at least one of X, Y, and Z” and “at least one selected from the group consisting of X, Y, and Z” may be construed as X only, Y only, Z only, or any combination of two or more of X, Y, and Z, such as, for instance, XYZ, XYY, YZ, and ZZ. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items.
[0042] In exemplary embodiments, modules and/or one or more components thereof, may be implemented via one or more general purpose and/or special purpose components, such as one or more discrete circuits, digital signal processing chips, integrated circuits, application specific integrated circuits, microprocessors, processors, programmable arrays, field programmable arrays, instruction set processors, and/or the like.
[0043] According to one or more exemplary embodiments, the features, functions, processes, etc., described herein may be implemented via software, hardware (e.g., general processor, digital signal processing (DSP) chip, an application specific integrated circuit (ASIC), field programmable gate arrays (FPGAs), etc.), firmware, or a combination thereof. In this manner, modules and/or one or more components thereof may include or otherwise be associated with one or more memories (not shown) including code (e.g., instructions) configured to cause modules, processors, and/or one or more components thereof to perform one or more of the features, functions, processes, etc., described herein.
[0044] The memories may be any medium that participates in providing code to the one or more software, hardware, and/or firmware components for execution. Such memories may be implemented in any suitable form, including, but not limited to, non-volatile media, volatile media, and transmission media. Non-volatile media include, for example, optical or magnetic disks. Volatile media include dynamic memory. Transmission media include coaxial cables, copper wire and fiber optics. Transmission media can also take the form of acoustic, optical, or electromagnetic waves. Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, any other magnetic medium, a compact disk-read only memory (CD-ROM), a rewriteable compact disk (CD-RW), a digital video disk (DVD), a rewriteable DVD (DVD-RW), any other optical medium, punch cards, paper tape, optical mark sheets, any other physical medium with patterns of holes or other optically recognizable indicia, a random-access memory (RAM), a programmable read only memory (PROM), and erasable programmable read only memory (EPROM), a FLASH-EPROM, any other memory chip or cartridge, a carrier wave, or any other medium from which information may be read by, for example, a controller/processor.
[0045] The idea behind SENSE is to reconstruct an un-aliased image from under-sampled data using the sensitivity encoding matrix obtained from the receiver coils. To implement SENSE, two requirements need to be fulfilled: (1) the coil sensitivity maps must be known, which may be obtained using a pre-scan method; and (2) to accelerate the data acquisition in MRI, some phase encoding steps need to be skipped. The number of skipped phase encoding steps (k-space lines) determines the acceleration factor, which in turn reduces the field of view (FOV) and causes the image to alias.
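The effect of skipping phase encoding steps can be illustrated with a minimal one-dimensional sketch. It assumes a hypothetical eight-sample intensity profile along the phase-encoding direction and a naive discrete Fourier transform; the names `dft`, `idft`, and `fold_by_undersampling` are illustrative and do not come from the described embodiments. Keeping only every second k-space line halves the FOV, so each reconstructed sample is the sum of two original samples that lie half a FOV apart.

```python
import cmath

def dft(x):
    """Naive forward discrete Fourier transform (O(N^2))."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * k * j / n) for j in range(n))
            for k in range(n)]

def idft(X):
    """Naive inverse discrete Fourier transform (O(N^2))."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * j / n) for k in range(n)) / n
            for j in range(n)]

def fold_by_undersampling(profile, accel):
    """Keep every accel-th k-space line and reconstruct the reduced-FOV profile."""
    kspace = dft(profile)
    return idft(kspace[::accel])

# Hypothetical object profile along the phase-encoding direction (assumption).
profile = [0.0, 1.0, 3.0, 2.0, 5.0, 4.0, 1.0, 0.0]
accel = 2  # acceleration factor: every second line is skipped

aliased = fold_by_undersampling(profile, accel)

# With accel = 2, sample j of the aliased profile is profile[j] + profile[j + N/2].
half = len(profile) // accel
expected = [profile[j] + profile[j + half] for j in range(half)]
```

The aliased profile has half as many samples as the original, which is the folding that the coil sensitivity information is later used to undo.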
[0046] The SENSE equation is given as:
$$ M = \left[ \left( C^{t} \psi^{-1} C \right)^{-1} C^{t} \psi^{-1} \right] I \qquad \text{(Equation 1)} $$
[0047] where M is the image matrix to be reconstructed, C is the encoding matrix, ψ is the noise correlation matrix, I is the aliased image matrix, and the superscript t denotes the conjugate transpose. The SENSE reconstruction algorithm requires inverting a large number of small, independent encoding matrices (C), which is time consuming if done sequentially. However, GPUs can perform these inversions in parallel, thus reducing the time for SENSE reconstruction. The MR signals used are acquired by Cartesian sampling.
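For a single aliased pixel location, the unfolding step can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes ψ is the identity matrix (so that the bracketed term of Equation 1 reduces to the left inverse of C), an acceleration factor of 2, four receiver coils, and made-up complex sensitivity values. All function names are hypothetical.

```python
def conj_transpose(a):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[a[i][j].conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]

def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def inv2x2(m):
    """Closed-form inverse of a 2x2 matrix (acceleration factor 2)."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def left_inverse(c):
    """Left inverse (C^t C)^-1 C^t of a tall (coils x 2) matrix, psi = identity."""
    ct = conj_transpose(c)
    return matmul(inv2x2(matmul(ct, c)), ct)

def unfold_pixel(c, aliased):
    """Recover the two superimposed pixel values from one aliased value per coil."""
    column = [[v] for v in aliased]
    return [row[0] for row in matmul(left_inverse(c), column)]

# Hypothetical sensitivities of 4 coils at the 2 superimposed locations.
C = [[1.0 + 0.2j, 0.3 - 0.1j],
     [0.8 - 0.3j, 0.5 + 0.4j],
     [0.2 + 0.6j, 0.9 - 0.2j],
     [0.4 + 0.1j, 1.1 + 0.3j]]
true_pixels = [2.0 + 1.0j, -1.0 + 0.5j]

# Forward model: each coil sees a sensitivity-weighted sum of the two pixels.
aliased_values = [row[0] * true_pixels[0] + row[1] * true_pixels[1] for row in C]

recovered = unfold_pixel(C, aliased_values)
```

In a full reconstruction the same small computation is repeated for every pixel location of the aliased image, each with its own C, which is why the work parallelizes well.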
[0050] The CPU implementation is very similar to the GPU implementation, and the number of operations is exactly the same in both cases. The CPU implementation executes the code sequentially using for loops, whereas the GPU implementation splits the work into independent tasks and launches threads that execute those tasks in parallel.
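The difference between the two execution styles can be sketched on the CPU as an analogy. The example below is not GPU code: it uses a thread pool as a stand-in for GPU threads, and it assumes the simplest case of two coils and an acceleration factor of 2, in which each task is an ordinary 2×2 solve. The task data and the names `unfold_task` and `make_tasks` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def unfold_task(task):
    """Solve one independent 2x2 system C m = a by Cramer's rule."""
    (c00, c01), (c10, c11) = task[0]
    a0, a1 = task[1]
    det = c00 * c11 - c01 * c10
    return ((c11 * a0 - c01 * a1) / det, (c00 * a1 - c10 * a0) / det)

def make_tasks(count):
    """Build hypothetical per-pixel tasks with known solutions (p, p + 1)."""
    tasks = []
    for p in range(count):
        c = ((1.0 + 0.01 * p, 0.5), (0.25, 1.0 + 0.02 * p))
        m = (float(p), float(p) + 1.0)
        a = (c[0][0] * m[0] + c[0][1] * m[1],
             c[1][0] * m[0] + c[1][1] * m[1])
        tasks.append((c, a))
    return tasks

tasks = make_tasks(64)

# Sequential style: one for loop visits every pixel location in turn.
sequential = []
for task in tasks:
    sequential.append(unfold_task(task))

# Parallel style: the same function is mapped over all tasks by worker threads.
with ThreadPoolExecutor(max_workers=8) as pool:
    parallel = list(pool.map(unfold_task, tasks))
```

Because no task reads the result of another, both styles perform the same arithmetic and produce the same values; only the scheduling differs.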
[0052] To monitor the reconstructed image quality for both the CPU and GPU implementations, artefact power (AP) is used as the quantifying parameter. The mean SNR is calculated using the pseudo multiple replica method.
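The text does not spell out the artefact power formula. One commonly used definition, assumed here, is the sum of squared differences between the magnitudes of a reference image and the reconstructed image, normalized by the sum of squared magnitudes of the reference. The sketch below applies that assumed definition to a tiny made-up image; `artefact_power` is an illustrative name.

```python
def artefact_power(reference, reconstructed):
    """Assumed definition: sum(| |ref| - |rec| |^2) / sum(|ref|^2)."""
    numerator = 0.0
    denominator = 0.0
    for ref_row, rec_row in zip(reference, reconstructed):
        for ref_px, rec_px in zip(ref_row, rec_row):
            numerator += (abs(ref_px) - abs(rec_px)) ** 2
            denominator += abs(ref_px) ** 2
    return numerator / denominator

# Hypothetical 2x2 reference image and a reconstruction with one slightly wrong pixel.
reference = [[1.0, 2.0], [3.0, 4.0]]
reconstructed = [[1.0, 2.0], [3.0, 4.1]]

ap = artefact_power(reference, reconstructed)
ap_identical = artefact_power(reference, reference)
```

Under this definition a perfect reconstruction gives an AP of zero, and smaller values indicate a reconstruction closer to the reference.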
[0053] The experiments are performed on two datasets, a phantom and a human head, acquired from a 1.5 Tesla MRI scanner. The dimensions of the aliased images (under-sampled by a factor of 2) are 128×256×2, 128×256×4, 128×256×6, and 128×256×8 for two, four, six, and eight receiver coils, respectively.
[0056] The mean SNR is found using the SNR map method and is shown in Table 3 for the phantom and human head images. The mean SNR of the reconstructed phantom image is more than 39 dB, and for the human head data it is approximately 29 dB, which shows that the reconstructed MR images possess good SNR.
[0057] The results according to one exemplary embodiment show that the GPU implementation of SENSE significantly reduces the computation time compared to the CPU implementation while maintaining the quality of the reconstructed image. The results also show that the computation time increases with the number of receiver coils, because the required number of operations (multiplications and additions) increases.
[0058] The exemplary embodiments present an implementation of the SENSE algorithm on a GPU using the left inverse method, together with a performance comparison of the GPU and multi-core CPU implementations. The rectangular matrix inversion is implemented in CUDA for the GPU implementation of SENSE. The results according to one exemplary embodiment show that the GPU provides an approximately 7× to 28× reduction in computation time compared to the CPU. Future work includes a performance comparison between CPU and GPU for higher acceleration factors in SENSE. Also, with new generations of graphics cards, it will be possible to further reduce the computation time with better optimized GPU programs.
TABLE 1. Summary of AP for phantom image and human head image for CPU and GPU implementation of SENSE algorithm from FIGS. 4 and 5.

| No. of receiver coils | AP for Phantom image using multi-core CPU | AP for Phantom image using GPU | AP for Human Head image using multi-core CPU | AP for Human Head image using GPU |
|---|---|---|---|---|
| 2 | 3.07 × 10^-31 | 2.45 × 10^-13 | 3.07 × 10^-31 | 5.94 × 10^-13 |
| 4 | 2.69 × 10^-31 | 2.60 × 10^-13 | 2.70 × 10^-31 | 5.99 × 10^-13 |
| 6 | 2.34 × 10^-31 | 2.54 × 10^-13 | 2.34 × 10^-31 | 6.06 × 10^-13 |
| 8 | 9.35 × 10^-32 | 2.53 × 10^-13 | 9.35 × 10^-32 | 5.99 × 10^-13 |
TABLE 2. Performance comparison of multi-core CPU and GPU.

| No. of receiver coils (acceleration factor = 2) | SENSE computation time by multi-core CPU | SENSE computation time by GPU (kernel and data transfer) | SENSE computation time by GPU (kernel only) |
|---|---|---|---|
| 2 | 14 ms | 1.3 ms | 0.5 ms |
| 4 | 18 ms | 2.6 ms | 1.0 ms |
| 6 | 31 ms | 3.1 ms | 1.4 ms |
| 8 | 47 ms | 4.7 ms | 1.7 ms |
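The speed-up range quoted in the description can be checked directly against the timings in Table 2. The short calculation below uses only the values from that table; the variable names are illustrative.

```python
# Timings from Table 2, in milliseconds, keyed by number of receiver coils.
cpu_ms = {2: 14.0, 4: 18.0, 6: 31.0, 8: 47.0}
gpu_total_ms = {2: 1.3, 4: 2.6, 6: 3.1, 8: 4.7}    # kernel and data transfer
gpu_kernel_ms = {2: 0.5, 4: 1.0, 6: 1.4, 8: 1.7}   # kernel only

# Speed-up = CPU time divided by GPU time for the same number of coils.
speedup_total = {n: cpu_ms[n] / gpu_total_ms[n] for n in cpu_ms}
speedup_kernel = {n: cpu_ms[n] / gpu_kernel_ms[n] for n in cpu_ms}

lowest = min(min(speedup_total.values()), min(speedup_kernel.values()))
highest = max(max(speedup_total.values()), max(speedup_kernel.values()))

for n in sorted(cpu_ms):
    print(f"{n} coils: {speedup_total[n]:.1f}x with transfer, "
          f"{speedup_kernel[n]:.1f}x kernel only")
```

The lowest ratio (about 6.9×) occurs for four coils when data transfer is included, and the highest (28×) for two coils when only the kernel is timed, consistent with the stated range of approximately 7× to 28×.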
TABLE 3. Mean SNR for phantom image and human head image for CPU and GPU implementation of SENSE algorithm.

| No. of receiver coils | Mean SNR for Phantom image using multi-core CPU (dB) | Mean SNR for Phantom image using GPU (dB) | Mean SNR for Human Head image using multi-core CPU (dB) | Mean SNR for Human Head image using GPU (dB) |
|---|---|---|---|---|
| 2 | 39.4721 | 39.4699 | 29.1428 | 29.2865 |
| 4 | 39.93 | 39.9256 | 28.9766 | 29.210 |
| 6 | 39.4723 | 39.4680 | 29.294 | 29.2858 |
| 8 | 39.4688 | 39.4742 | 29.2869 | 29.2876 |
[0059] Although certain exemplary embodiments and implementations have been described herein, other embodiments and modifications will be apparent from this description. Accordingly, the inventive concept is not limited to such embodiments, but rather to the broader scope of the presented claims and various obvious modifications and equivalent arrangements.