Method, device and system for remote deep learning for microscopic image reconstruction and segmentation
11482400 · 2022-10-25
Inventors
- Remco Schoenmakers (Best, NL)
- Maurice Peeman (Rijsbergen, NL)
- Faysal Boughorbel (Eindhoven, NL)
- Pavel Potocek (Eindhoven, NL)
Abstract
The present invention relates to a method of training a network for reconstructing and/or segmenting microscopic images, comprising the step of training the network in the cloud. Further, for training the network in the cloud, training data comprising microscopic images can be uploaded into the cloud and a network is trained on the microscopic images. Moreover, for training the network, the network can be benchmarked after the reconstructing and/or segmenting of the microscopic images, wherein for benchmarking the network, the quality of the image(s) having undergone the reconstructing and/or segmenting by the network can be compared with the quality of the image(s) having undergone reconstructing and/or segmenting by an already known algorithm and/or a second network.
Claims
1. A method for processing microscopic images including: uploading one or more first microscopic images from a first microscopic system to a cloud; selecting a pre-trained network from multiple pre-trained networks in the cloud by benchmarking the multiple pre-trained networks with the one or more first microscopic images; training the selected pre-trained network to obtain a trained network; downloading the trained network to a second microscopic system; acquiring one or more second microscopic images from the second microscopic system; and processing the one or more second microscopic images with the downloaded trained network.
2. The method of claim 1, wherein benchmarking the multiple pre-trained networks with the one or more first microscopic images includes processing the one or more first microscopic images with the multiple pre-trained networks and comparing qualities of the processed one or more first microscopic images.
3. The method of claim 1, wherein training the selected pre-trained network includes training the selected pre-trained network with one or more third microscopic images that are different from the one or more first microscopic images.
4. The method of claim 1, wherein the first microscopic system and the second microscopic system are the same.
5. The method of claim 1, further comprising checking a quality of the trained network after acquiring every predetermined number of microscopic images.
6. The method of claim 5, wherein checking the quality of the trained network includes processing one or more last acquired microscopic images with the trained network and a classical algorithm, and comparing outcomes of the trained network and the classical algorithm.
7. The method of claim 1, wherein the one or more first microscopic images and the one or more second microscopic images are acquired for at least one of single particle analysis, cryo-electron microscopy, volume scope data acquisition, neuron reconstruction, image super-resolution, single-image resolution, and single-image super-resolution.
8. The method of claim 1, further comprising uploading training data to the cloud and training one or more networks with the training data to obtain one or more of the multiple pre-trained networks.
9. The method of claim 8, wherein the one or more networks includes one or more of a deep convolutional network, an enhanced deep residual network (EDSR), and a multi-scale deep super-resolution system (MDSR).
10. The method of claim 1, wherein processing the one or more second microscopic images includes reconstructing and/or segmenting the one or more second microscopic images.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1)-(3) [Brief descriptions of the drawings are not reproduced in this text.]
DESCRIPTION OF EMBODIMENTS
(4)
(5) A communication channel 3 (hereinafter used for 3a-3c) can unidirectionally or bidirectionally connect the microscopic device(s) with a remote cloud 4. The communication channels can differ from one another and can comprise wired and/or wireless communication sections.
(6)
(7) One of a plurality of examples could be a SEM microscope configured for high quality imaging that requires a long acquisition time. For example, it takes a stack of 4096×4096 point images with a dwell time of 10 μs, requiring 168 seconds per image. For scenarios where a large volume must be acquired, i.e. a large number of 4096×4096 frames, the acquisition time constitutes a severe bottleneck. To speed up acquisition, compressive sensing techniques can be used for fast acquisitions. While maintaining the same field of view, the number of points can be reduced to 1024×1024, which reduces the acquisition time by a factor of 16. Additionally, a reduced dwell time of 0.1 μs per point can be used; as a result, the acquisition data rate increases to D_r = 160 Mbit/s and, more importantly, the acquisition time reduces to a more practical 0.1 seconds per frame.
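For illustration, these figures can be reproduced directly from the scan parameters. The following is a minimal Python sketch, assuming 16 bits per acquired point (an assumption consistent with the stated 160 Mbit/s rate):

```python
# Minimal sketch reproducing the acquisition figures above.
# Assumption: 16 bits per acquired point (consistent with 160 Mbit/s at 0.1 us dwell).
BITS_PER_POINT = 16

def frame_time_s(points: int, dwell_s: float) -> float:
    """Acquisition time of one frame: one dwell period per scanned point."""
    return points * dwell_s

full_frame = frame_time_s(4096 * 4096, 10e-6)     # ~168 s per full frame
sparse_frame = frame_time_s(1024 * 1024, 0.1e-6)  # ~0.1 s per sparse frame
data_rate = BITS_PER_POINT / 0.1e-6               # 160e6 bit/s = 160 Mbit/s

print(f"{full_frame:.0f} s, {sparse_frame:.2f} s, {data_rate / 1e6:.0f} Mbit/s")
```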
(8) In a first step 10, training data comprising image data is uploaded into the cloud, either after its generation by one or more microscopes and optionally associated devices, or from a respective collection of images gathered in the past. The data can originate from one or more kinds of sources, such as one or more particle analyzing microscope(s). The uploading can take place by any known means as long as the uploading speed is appropriate.
(9) On the basis of the image data a network can be trained to obtain a pre-trained network in a step 20. As a starting point, an already known network or a newly developed network can be taken, such as described in the art mentioned before. Additionally, an already pre-trained network or a plurality of pre-trained networks can be further developed or trained. These can be one or more network(s) resulting from previous analyses and/or already existing networks. In the case of a plurality of networks, they can be stored in one or more libraries and labeled according to their kinds, fields of application, strengths and/or weaknesses. E.g., they could be assigned to certain protein structures, viruses, chemical structures, semiconductors, minerals etc.
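A library of labeled networks as described above could, for instance, be organized as in the following sketch; the entry names, storage locations and labels are purely hypothetical examples:

```python
# Illustrative sketch of a labeled library of (pre-)trained networks.
# The names, storage URIs and labels are hypothetical examples only.
from dataclasses import dataclass, field

@dataclass
class NetworkEntry:
    name: str                                   # identifier of the stored network
    weights_uri: str                            # location of the parameters in the cloud
    labels: set = field(default_factory=set)    # kinds / fields of application

library = [
    NetworkEntry("edsr_cryo_v1", "cloud://models/edsr_cryo_v1", {"cryo-em", "protein structures"}),
    NetworkEntry("mdsr_semi_v2", "cloud://models/mdsr_semi_v2", {"semiconductors"}),
    NetworkEntry("cnn_minerals_v1", "cloud://models/cnn_minerals_v1", {"minerals"}),
]

def candidates_for(application: str) -> list:
    """Return all library entries labeled for a given field of application."""
    return [entry for entry in library if application in entry.labels]

print([entry.name for entry in candidates_for("semiconductors")])
```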
(10) The training can be done sequentially and/or by parallel computing. As it is done in the cloud, remote from the local microscope system and potentially associated infrastructure, the training and/or testing in the cloud can be performed in parallel to the operation of the local microscope(s).
(11) The network(s) or pre-trained networks can be further trained. One example is to upload the first image(s) acquired in a step 30. These first images can then be processed by alternative networks and/or by classical algorithms in a step 41. Alternatively or additionally, the factors in the network can be modified and optimized to improve the output of the network.
(12) Thereafter the quality of the results can be checked and compared, and the most appropriate approach, i.e. the most appropriate network, is selected. The quality of the results can be based on one or more criteria, such as the capturing of the image, the degree of improvement of the image, the quality of the segmentation, the signal-to-noise ratio, edge criteria etc. The tests can comprise negative and/or positive selection criteria.
(13) As one possible result, the most appropriate or best pre-trained network will be selected in a step 50. The selection can depend on the criteria and/or can be performed manually, semi-automatically and/or automatically.
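The benchmarking and selection of paragraphs (11) to (13) could, for instance, be organized as in the sketch below; the PSNR metric and the callable-network interface are assumptions, and any of the quality criteria of paragraph (12) could be substituted:

```python
# Sketch: benchmark candidate networks on the first uploaded image(s) and
# select the best one. PSNR against a reference is an assumed criterion.
import numpy as np

def psnr(result: np.ndarray, reference: np.ndarray) -> float:
    """Peak signal-to-noise ratio in dB (higher is better) for 8-bit images."""
    mse = np.mean((result.astype(np.float64) - reference.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(255.0 ** 2 / mse)

def select_network(candidates, first_images, references):
    """Run each candidate network on the first images; keep the best scorer."""
    def mean_score(net):
        return np.mean([psnr(net(img), ref) for img, ref in zip(first_images, references)])
    return max(candidates, key=mean_score)
```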
(14) In a next step 60, a further image, a set of images or the last portion of the images is uploaded from the microscope or the microscope system or environment in order to further train and run the network. The training and running of the network take place in a step 70. In parallel, the data can undergo one or more classical algorithms in a step 71 in order to benchmark the network's performance.
(15) Thereafter, the qualities of the outcomes of the network in step 70 and of the algorithm in step 71 are compared in a step 80. This can be done in the manner already described before. Should the quality of the network's result be inferior to that of the classical algorithm, the network can undergo further training, e.g. on the basis of further image(s). For that, the method can go back to step 60, where another network can be selected and/or further images can be uploaded for the further training of the previous network or the training of another network.
(16) Should the result of the selected network in step 80 be better in quality than that of the classical algorithm or of the compared network, the network can be downloaded to the microscope system or to a plurality of microscope systems. In this context the term microscope comprises the microscope, any local infrastructure and/or any computer-implemented system locally assigned to the microscope. Thus, the network is transferred from the cloud to the local microscope system in a step 85 to run locally at or close to the microscope system.
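The loop of steps 60 to 85 could be sketched as follows; all callables are illustrative placeholders for the cloud-side operations rather than actual APIs:

```python
# Sketch of the control flow of steps 60-85: train in the cloud, benchmark
# against a classical algorithm, download only once the network is at least
# as good. All callables are illustrative placeholders, not actual APIs.
def train_until_better(network, upload_images, train, classical, quality, download,
                       max_rounds=10):
    for _ in range(max_rounds):
        images = upload_images()                  # step 60: upload further image(s)
        network = train(network, images)          # step 70: further train/run the network
        net_results = [network(img) for img in images]
        ref_results = [classical(img) for img in images]   # step 71: classical benchmark
        if quality(net_results) >= quality(ref_results):   # step 80: compare quality
            download(network)                     # step 85: transfer to the microscope system
            return network
    raise RuntimeError("network did not reach the quality of the classical algorithm")
```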
(17) In a step 90, the next image(s) is/are acquired. This image or these images are reconstructed in a step 100 on the microscope. This can also comprise the single particle image approach. In a step 120 the reconstructed image(s) is/are displayed and stored. The storage can be done locally and/or in the cloud.
(18) In steps 140, 150, 151, 160 and 170, steps 60 to 80 are performed in a similar manner. That is, the last image(s) acquired by the microscope are uploaded in a step 140; then the selected network is run and, alternatively or additionally, a classical algorithm and/or another network is run for benchmarking or testing the selected network.
(19) Comparative Examples
(20) In the following, two data processing scenarios around a microscope acquisition system (purely local processing, and processing in a compute cloud) are compared with a third, hybrid scenario that uses the cloud to obtain a faster approximate model locally.
(21) Scenario 1: Perform Data Processing Locally on the Acquisition System
(22) In the common case where the data rate D_r of the system is larger than the compute throughput C, the system can hardly perform live processing; when C « D_r, live processing is not possible at all.
(23) Scenario 2: Perform Data Processing in the Cloud
(24) One option is to perform the processing in a compute cloud with more computational resources. However, in case the data rate D_r of the system is larger than the bandwidth N to the cloud, the system cannot perform live processing. When N « D_r, live processing in the cloud is not possible. In practice the data rate requirements could be even larger in case the results must be transferred back through a shared transmission resource. This increases the data rate requirement D_r by a factor k that spans a domain 1 ≤ k < x depending on the application. For now we leave out this factor k for simplicity.
(25) Scenario 3: Hybrid Use of the Cloud to Obtain a Faster Approximate Model Locally
(26) The new scenario 3 proposes to replace the classical algorithm having throughput C with an approximate algorithm having throughput Ć.
(27) In case the throughput Ć > C, some interesting scenarios can occur: 1. If Ć > C but still Ć ≤ D_r, delayed processing can be performed; a speedup over the original is possible but live processing cannot be performed. 2. If Ć > C and Ć > D_r, a live processing scenario can be considered.
(28) To obtain an approximate algorithm with throughput Ć that delivers acceptable or similar quality compared to C, deep learning techniques are used that require training time. This training in the cloud introduces a latency penalty for the approximate algorithm. The combined latency and throughput define whether the new scenario 3 has an advantage over the existing scenarios 1 and 2.
(29) Processing Volume Comparisons
(30) The time for acquiring a data set of size S is given as
(31) T_D = S/D_r
and the time for processing would be
(32) T_P = S/C.
The difference between these two defines our processing margin, which is given by T_L = T_P − T_D. For the processing margin there are two possible scenarios: 1. T_L ≤ 0: over capacity; live processing is possible in scenario 1 by local computation. 2. T_L > 0: under capacity, so more data is generated than can be processed; live processing is not possible and delayed processing requires adequate buffering.
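For illustration, the margin can be evaluated in a few lines; the numbers below anticipate the SEM example of the next section, with an assumed frame size of 16 bit per point:

```python
# Sketch: processing margin T_L = T_P - T_D for a data set of S bits.
# Example numbers anticipate the SEM scenario below (16 bit/point assumed).
def processing_margin_s(s_bits: float, d_r_bit_s: float, c_bit_s: float) -> float:
    """T_L = S/C - S/D_r; T_L <= 0 means over capacity (live processing possible)."""
    return s_bits / c_bit_s - s_bits / d_r_bit_s

S = 4096 * 1024 * 1024 * 16                        # 4096 sparse frames
t_l = processing_margin_s(S, d_r_bit_s=160e6, c_bit_s=14e3)
print(f"T_L = {t_l / 3600:.0f} h under capacity")  # ~1363 h
```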
(33) Solve Under Capacity with a Deep Learning Based Solution
(34) In case of under capacity one can switch to a deep learning based approximate algorithm with higher throughput Ć. This requires a training task in the cloud and a portion of the data, S_train, to be transferred for training. Both tasks introduce additional latencies:
(35) T_Nt = S_train/N
for the transmission, and T_train for the training task. As a result, a new deep learning based processing margin can be computed by adding the terms T_Nt + T_train + T́_L, where T́_L = S/Ć − S/D_r is the processing margin of the approximate algorithm. Scenario 3 thus improves the overall system throughput when Equation (1) is valid.
T_Nt + T_train + T́_L < T_L (1)
(36) Again this results in three scenarios: 1. In case the inequality of Equation (1) is not valid, so T_Nt + T_train + T́_L ≥ T_L, there is no benefit in applying the deep learning solution. Although the deep learning approximate algorithm has more throughput than the original algorithm (Ć > C), the overhead of training can mean that the solution cannot process more data in a given amount of time. 2. In case T_Nt + T_train + T́_L ≤ 0, live processing is possible for at least a portion of the data collection procedure. In this scenario the approximate algorithm and the amount of time for the acquisition provide enough over capacity to fully absorb the penalty of training. Initially some buffer space is required, but over time the approximate solution will empty the buffer since it processes data faster than the production rate. When the last frame is produced the result will be ready with minimal latency. 3. When 0 < T_Nt + T_train + T́_L < T_L, there is a speedup over the original algorithm but live processing is not possible. In this scenario the buffer is not empty when the last frame is produced, so the system will require some additional time to show the last frame. However, the total delay is less than for the original algorithm.
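The three cases can be expressed compactly, as in the following illustrative sketch (symbols as defined above, with t_l_approx standing for T́_L):

```python
# Sketch: classify a job according to Equation (1) and the three cases above.
def classify(t_nt: float, t_train: float, t_l_approx: float, t_l: float) -> str:
    """t_l_approx is the margin of the approximate algorithm, t_l that of the original."""
    total = t_nt + t_train + t_l_approx
    if total >= t_l:
        return "case 1: no benefit, training overhead eats the throughput gain"
    if total <= 0:
        return "case 2: live processing, over capacity absorbs the training penalty"
    return "case 3: speedup, but the buffer drains only after the last frame"
```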
(37) Real-World Reconstruction Application
(38) A SEM microscope configured for high quality imaging requires a long acquisition time. For example, it takes a stack of 4096×4096 point images with a dwell time of 10 μs, requiring 168 seconds per image. For scenarios where a large volume must be acquired (many of these 4096×4096 frames) this acquisition time is a severe bottleneck. To speed up acquisition, compressive sensing techniques can be used for fast acquisitions. For instance, one could use a dwell time of 0.1 μs per point and, while maintaining the same field of view, reduce the number of points to 1024×1024. As a result, the acquisition data rate increases to D_r = 160 Mbit/s and, more importantly, the acquisition time reduces to a more practical 0.1 seconds per frame.
(39) Scenario 1: Local Processing
(40) Advanced reconstruction algorithms from the compressive sensing domain can reconstruct a clean 4096×4096 pixel image from the noisy sparse data. This may require 20 hours of processing per frame on a workstation, so the acquisition bottleneck is transformed into a compute bottleneck. By optimizing the reconstruction algorithm and mapping it to GPUs this can be sped up, e.g. to 20 minutes of processing time per frame. Note that even in the improved GPU scenario the local compute throughput of C = 14 kbit/s will prevent live processing and causes a buffering requirement that quickly grows for large acquisitions. Therefore, the local processing scenario 1 seems impossible, since C « D_r.
(41) Assume that a volume of 4096 frames can be scanned in compressive sensing mode, where each frame consists of 1024×1024 points (at 16 bit per point, S ≈ 68.7 Gbit in total). The classical algorithm then faces a huge under capacity:
(42) T_L = S/C − S/D_r ≈ 68.7 Gbit/14 kbit/s − 68.7 Gbit/160 Mbit/s ≈ 4.9·10^6 s.
For 4096 frames this explodes to 1363 additional processing hours after the last frame is acquired.
(43) Scenario 2: Data Processing in the Cloud
(44) A cloud infrastructure can provide much more compute power, so let us assume that C_cloud = 1 Gbit/s. However, the transport bandwidth N to the cloud is limited, e.g. 100 Mbit/s. As a result, streaming the acquisition data rate D_r = 160 Mbit/s live to the cloud would be difficult. In addition, it would be difficult to retrieve the processed and upscaled results (D_r · 16 = 2560 Mbit/s) in a live mode. Clearly, scenario 2 faces a data transport problem: N « k·D_r.
(45) In case we try an in-cloud scenario where the 4096 frames are processed in the cloud, the reconstructed results have to be retrieved over the limited bandwidth N:
(46) T_N = 16·S/N ≈ 16 · 68.7 Gbit/100 Mbit/s ≈ 1.1·10^4 s ≈ 3 h.
We assume no additional latency for the processing in the cloud. The acquisition of the 4096 frames requires only 7 minutes, so after acquiring the last frame a transfer time of approximately 3 hours is required to obtain the full reconstructed dataset.
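For illustration, the scenario 2 timing can be checked in a few lines (again assuming 16 bit per point):

```python
# Sketch: scenario 2 transfer times at N = 100 Mbit/s (16 bit/point assumed).
S = 4096 * 1024 * 1024 * 16                 # raw sparse job, ~68.7 Gbit
acquire = 4096 * 0.1                        # ~7 minutes of acquisition
download = 16 * S / 100e6                   # reconstructed results, ~3 hours
print(f"acquire: {acquire / 60:.0f} min, retrieve results: {download / 3600:.1f} h")
```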
(47) Scenario 3: Hybrid Use of the Cloud to Obtain a Faster Approximate Solution
(48) A trained deep convolutional network can perform the reconstruction task much faster than the compressive sensing solution. For example, a naïve GPU mapping of a trained convolutional network performs the frame reconstruction task in 2 seconds instead of 20 minutes. Here a speedup over the original algorithm, Ć > C, is achieved, but live processing is not possible: the processing cannot keep up with the acquisition data rate, Ć ≤ D_r, where Ć = 8.4 Mbit/s and D_r = 160 Mbit/s. However, the acquisition also has latencies (stage moves, cutting of sections etc.) which introduce sufficient delays to make this still semi-live.
(49) With more effort, highly tuned networks and matched mappings on FPGAs or GPUs can increase the inference speed by 40×, resulting in 0.05 seconds of processing time per frame. This enables live processing if the training latency can be hidden by the processing margin. In this scenario the local processing throughput increases to Ć = 336 Mbit/s.
(50) In case we use a deep learning solution we should first transfer a training set to the cloud, e.g. 128 frames. Since there is a transport rate limitation to the cloud this will give a latency penalty:
(51) T_Nt = S_train/N ≈ (128 · 16.8 Mbit)/(100 Mbit/s) ≈ 21 s.
In addition, cloud time for training is required; e.g. for the selection of a pre-trained network and fine tuning on our dataset, 20 minutes are added (T_train = 1200 s). The new processing margin for a naïve neural network mapping would be
(52) T́_L = S/Ć − S/D_r ≈ 68.7 Gbit/8.4 Mbit/s − 68.7 Gbit/160 Mbit/s ≈ 7751 s.
For 4096 frames this already gives a speedup over the classical scenario: T_Nt + T_train + T́_L = 21 s + 1200 s + 7751 s ≈ 2.5 h of additional processing delay after acquiring the last frame, which is a substantial improvement over the 1363 hours in the classical local case.
(53) Live processing is only achieved if a negative margin is obtained, so the throughput of the neural network is optimized with FPGAs or GPUs to Ć = 336 Mbit/s. In this improved scenario
(54) T́_L ≈ 4096 · (0.05 s − 0.1 s) ≈ −205 s.
This margin can be used to make up for the training latency. For 4096 frames it is not sufficient: T_Nt + T_train + T́_L = 21 s + 1200 s − 205 s ≈ 17 minutes of additional delay. However, from a job size of 24420 frames and more there would be no processing delay after the acquisition of the last frame: with a per-frame margin of −0.05 s, the total processing margin f·T́_L over a job of f frames must reach −1221 seconds to compensate the training latency, which requires f ≥ 1221 s/0.05 s ≈ 24420 frames. Note that these numbers change depending on the characteristics of the algorithm and the platform.
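For illustration, the scenario 3 figures quoted above can be reproduced in a few lines (same 16 bit per point assumption):

```python
# Sketch: reproduce the scenario 3 figures quoted above (16 bit/point assumed).
frame_bits = 1024 * 1024 * 16
S = 4096 * frame_bits                              # whole job of 4096 sparse frames

t_nt = 128 * frame_bits / 100e6                    # training-set upload at N = 100 Mbit/s, ~21 s
t_train = 1200.0                                   # assumed cloud training time, 20 min

t_l_naive = S / 8.4e6 - S / 160e6                  # naive mapping: ~7751 s margin
print(f"naive: {(t_nt + t_train + t_l_naive) / 3600:.1f} h delay")    # ~2.5 h

t_l_tuned = 4096 * (0.05 - 0.1)                    # tuned mapping: -0.05 s per frame
print(f"tuned: {(t_nt + t_train + t_l_tuned) / 60:.0f} min delay")    # ~17 min

frames_break_even = (t_nt + t_train) / 0.05        # job size hiding the training latency
print(f"break-even job size: {frames_break_even:.0f} frames")         # ~24,400
```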
(55) Whenever a relative term, such as "about", "substantially" or "approximately", is used in this specification, such a term should be construed to also include the exact term. That is, e.g., "substantially straight" should be construed to also include "(exactly) straight".
(56) Whenever steps were recited in the above or also in the appended claims, it should be noted that the order in which the steps are recited in this text may be the preferred order, but it may not be mandatory to carry out the steps in the recited order. That is, unless otherwise specified or unless clear to the skilled person, the order in which steps are recited may not be mandatory. That is, when the present document states, e.g., that a method comprises steps (A) and (B), this does not necessarily mean that step (A) precedes step (B), but it is also possible that step (A) is performed (at least partly) simultaneously with step (B) or that step (B) precedes step (A). Furthermore, when a step (X) is said to precede another step (Z), this does not imply that there is no step between steps (X) and (Z): That is, step (X) preceding step (Z) encompasses the situation that step (X) is performed directly before step (Z), but also the situation that (X) is performed before one or more steps (Y1), . . . , followed by step (Z). Corresponding considerations apply when terms like “after” or “before” are used.