METHODS AND SYSTEMS FOR CLASSIFYING A MALIGNANCY RISK OF A KIDNEY AND TRAINING THEREOF
20230326608 · 2023-10-12
Inventors
Cpc classification
G16H50/20
PHYSICS
International classification
Abstract
A computer-implemented method is provided for classifying a malignancy risk of a kidney, in particular a human kidney. Imaging data of an anatomy of a subject patient at least partially includes a representation of a kidney of the subject patient. A first neural network segments at least one region of the kidney representation based on the imaging data. A second neural network detects one or more suspected lesions of the segmented kidney representation. A third neural network classifies the detected suspected lesion with a malignancy risk. The third neural network is a deep profiler.
Claims
1. A method for classifying a malignancy risk of a kidney, the method comprising: providing imaging data of an anatomy of a subject patient, wherein the imaging data comprises at least partially a representation of the kidney of the subject patient; segmenting using a first neural network at least one region of the kidney representation based on the imaging data; detecting using a second neural network one or more suspected lesions of the segmented kidney representation; and classifying the detected suspected lesion with the malignancy risk using a third neural network, wherein the third neural network is a deep profiler.
2. The method of claim 1, wherein the third neural network classifies the malignancy risk based on imaging data and non-imaging data, wherein the non-imaging data comprises at least histopathologic data.
3. The method of claim 1, wherein classifying comprises classifying using the deep profiler, the deep profiler comprising an encoder for extracting imaging features.
4. The method of claim 3, wherein the encoder is a convolutional neural network.
5. The method of claim 3, wherein the deep profiler comprises a decoder, and wherein classifying comprises estimating at least one malignancy risk indicator by the decoder.
6. The method of claim 1, wherein the deep profiler comprises a task-specific network, and wherein classifying comprises generating at least one image signature for classifying the malignancy risk using the task-specific network.
7. The method of claim 1, further comprising: detecting anatomical landmarks using a fourth neural network based on the provided imaging data.
8. The method of claim 7 wherein the fourth neural network is a convolutional neural network using at least one universal non-linear function approximator, and wherein classifying comprises extracting an image feature by the convolutional neural network.
9. The method of claim 1, wherein the first neural network is a convolutional encoder-decoder architecture or a multi-level feature concatenation and deep supervision architecture.
10. The method of claim 1, wherein detecting the one or more suspected lesions comprises detecting based on a fully convolutional one-stage object detection of the second neural network.
11. The method of claim 1, wherein providing the imaging data comprises at least one of the following: providing based on computed tomography and/or magnetic resonance imaging, and/or providing at least partially a 3D illustration of the anatomy of the subject patient.
12. The method of claim 1, further comprising: converting the imaging data from at least a partially 3D illustration of the anatomy of the subject patient to a 2D illustration of the anatomy of the subject patient.
13. A system for classifying a malignancy risk scoring of a kidney, the system comprising: an interface configured to provide imaging data of an anatomy of a subject patient, wherein the imaging data comprises at least partially a representation of a kidney of the subject patient; a processor configured to use a first neural network to segment at least one region of the kidney representation which is based on the imaging data, configured to use a second neural network to detect one or more suspected lesions of the segmented kidney representation, and configured to implement a deep profiler to classify the detected suspected lesion with a malignancy risk.
14. The system of claim 13, wherein the deep profiler is configured to classify the malignancy risk based on imaging data and non-imaging data, wherein the non-imaging data comprises at least histopathologic data.
15. The system of claim 13, wherein the deep profiler comprises an encoder to extract imaging features.
16. The system of claim 13, wherein the deep profiler comprises a decoder configured to estimate at least one malignancy risk indicator.
17. The system of claim 13, wherein the deep profiler comprises a task-specific network configured to generate at least one image signature for classification of the malignancy risk using the task-specific network.
18. A method for training a machine learning algorithm to classify a malignancy risk of a kidney, the method comprising: training a first neural network with first training data including imaging data of an anatomy of at least one subject patient, wherein the imaging data comprises at least partially a representation of one or more kidneys; training a second neural network with second training data including one or more detected lesions of one or more segmented kidney representations; and training a third neural network with third training data of one or more lesions classified with a malignancy risk.
19. The method of claim 18, wherein training the third neural network comprises training a deep profiler.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0043] Further details and advantages can be taken from the following description of preferred examples in conjunction with the drawings, in which:
[0044]
[0045]
[0046]
[0047]
DETAILED DESCRIPTION OF THE EXAMPLES
[0048] The drawings are to be regarded as being schematic representations and elements illustrated in the drawings are not necessarily shown to scale. Rather, the various elements are represented such that their function and general purpose become apparent to a person skilled in the art. Any connection or coupling between functional blocks, devices, components, or other physical or functional units shown in the drawings or described herein may also be implemented by an indirect connection or coupling. A coupling between components may also be established over a wireless connection. Functional blocks may be implemented in hardware, firmware, software, or a combination thereof.
[0049] Various examples will now be described more fully with reference to the accompanying drawings in which only some examples are shown. Specific structural and functional details disclosed herein are merely representative for purposes of describing examples. Examples, however, may be embodied in various different forms, and should not be construed as being limited to only the illustrated examples. Rather, the illustrated examples are provided as examples so that this disclosure will be thorough and complete and will fully convey the concepts of this disclosure to those skilled in the art. Accordingly, known processes, elements, and techniques, may not be described with respect to some examples. Unless otherwise noted, like reference characters denote like elements throughout the attached drawings and written description, and thus descriptions will not be repeated. The present invention, however, may be embodied in many alternate forms and should not be construed as limited to only the examples set forth herein.
[0050] It will be understood that, although the terms first, second, etc. may be used herein to describe various elements, components, regions, layers, and/or sections, these elements, components, regions, layers, and/or sections should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of the examples of the present invention. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items. The phrase “at least one of” has the same meaning as “and/or”.
[0051]
[0052] In a first method act 101, imaging data ID of an anatomy of a subject patient is provided. The imaging data ID includes at least partially a representation of a kidney of the subject patient. The imaging data ID may be based on image capturing via computed tomography (CT) and/or magnetic resonance imaging (MRI).
[0053] Thus, the provided imaging data ID may include at least partially a 3D illustration of the anatomy of the subject patient. In particular, the 3D illustration is layered, which means that at least two layers can be extracted from the imaging data ID defining at least two different depths of the captured anatomy.
[0054] In a second method act 102, a fourth neural network is used to detect anatomical landmarks based on the provided imaging data ID. This act is particularly important since the method may require a precise, automatic detection of anatomical structures, preferably of at least a kidney, to initialize and constrain mathematical models for volumetric organ segmentation. As such, accurate and efficient anatomical landmark detection can support the method 100 in providing a more effective and streamlined image reading.
[0055] To this end, the fourth neural network may be designed as a convolutional neural network for image feature extraction. Furthermore, the fourth neural network may additionally use at least one universal non-linear function approximator.
[0056] The network may be parametrized by Θ=[W, b], where W denotes the inter-neural connection weights organized as (multichannel) filter kernels, and b defines the set of neuron bias values.
[0057] Convolutional layers may exploit local spatial correlations of image voxels to learn translation-invariant convolutional kernels, which capture discriminative image features. Consider a multi-channel signal representation $M_k$ in layer $k$, i.e., a channel-wise concatenation of signal representations $M_{k,c}$ with channel index $c$. A signal representation in layer $k+1$ may be generated as $M_{k+1,l} = \phi(M_k * w_{k,l} + b_{k,l})$, where $w_{k,l} \in W$ may represent a convolutional kernel with the same number of channels as $M_k$, the value $b_{k,l} \in b$ may represent the bias, $l$ denotes the channel index, and $*$ denotes a convolution operation.
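The layer rule above can be illustrated with a minimal sketch in plain Python. It is not the implementation of the fourth neural network: it uses 2D channels instead of volumetric data for brevity, ReLU as the activation, and a "valid" cross-correlation as commonly used in deep learning libraries. The names `conv_channel`, `m_k`, `w_kl` and `b_kl` are hypothetical.

```python
from typing import List

Image = List[List[float]]  # one 2D channel, indexed [row][col]


def relu(x: float) -> float:
    """Pointwise rectified linear unit."""
    return x if x > 0.0 else 0.0


def conv_channel(m_k: List[Image], w_kl: List[Image], b_kl: float) -> Image:
    """Compute one output channel M_{k+1,l} = relu(M_k * w_{k,l} + b_{k,l}).

    m_k  : the input channels of layer k
    w_kl : one kernel slice per input channel (same channel count as m_k)
    b_kl : scalar bias of output channel l
    """
    assert len(m_k) == len(w_kl), "kernel needs as many channels as the input"
    kh, kw = len(w_kl[0]), len(w_kl[0][0])
    out_h = len(m_k[0]) - kh + 1
    out_w = len(m_k[0][0]) - kw + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            acc = b_kl
            # Sum over all input channels and all kernel positions.
            for c in range(len(m_k)):
                for di in range(kh):
                    for dj in range(kw):
                        acc += m_k[c][i + di][j + dj] * w_kl[c][di][dj]
            out[i][j] = relu(acc)
    return out


m_k = [[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]]  # one channel
w_kl = [[[-1.0, 0.0], [0.0, 1.0]]]                           # one 2x2 kernel
result = conv_channel(m_k, w_kl, b_kl=0.5)
```

Each output value here is the difference between a pixel and its upper-left diagonal neighbour plus the bias, which makes the effect of the kernel easy to check by hand.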
[0058] The function $\phi$ may represent the nonlinear activation function, which is applied pointwise. Rectified linear unit (ReLU) activations may be used. The final network layers may typically be fully connected. In a supervised regression setup, given training data $D = [(X_1, y_1), \ldots, (X_N, y_N)]$, i.e., $N$ independent pairs of volumetric image observations with value assignments, one may define the network response function as $R(\cdot\,; \Theta)$ and use Maximum Likelihood Estimation to estimate the optimal network parameters ($L$ denotes the likelihood): $\hat{\Theta} = \arg\max_{\Theta} L(\Theta; D)$.
[0059] This optimization problem may be solved with stochastic gradient descent (SGD) combined with the backpropagation algorithm to compute the network gradients.
[0060] It may be beneficial to reformulate the anatomy detection as a cognitive learning task for an artificial agent. Given a volumetric image $I: \mathbb{Z}^3 \rightarrow \mathbb{R}$ and the location of an anatomical structure of interest $\vec{p}_{GT} \in \mathbb{R}^3$ within $I$, the task may be to learn a navigation strategy to $\vec{p}_{GT}$ in image space, i.e., in the voxel grid of the imaging data ID. In other words, the agent seeks voxel-based navigation trajectories from any arbitrary starting point $\vec{p}_0$ to $\vec{p}_k$ within image $I$, with the property that $\|\vec{p}_k - \vec{p}_{GT}\|$ is minimal. With reinforcement learning, this problem may be modelled using a Markov Decision Process (MDP) $M := (S, A, T, R, \gamma)$, where:
[0061] $S$ may represent a finite set of states, $s_t \in S$ being the state of the agent at time $t$. To encode the location of the agent in the imaged volumetric space at time $t$, one may define $s_t = I(\vec{p}_t)$, which may denote an axis-aligned box of image intensities extracted from $I$ and centered at the voxel position $\vec{p}_t$ in image space.
[0062] $A$ may represent a finite set of actions allowing the agent to interact with the environment defined by $I$, where $a_t \in A$ is the action the agent may perform at time $t$. A discrete voxel-wise navigation model may be used, allowing the agent to move from any voxel position $\vec{p}_t$ to an adjacent voxel position $\vec{p}_{t+1}$ in image space.
[0063] $T: S \times A \times S \rightarrow [0, 1]$ may be a stochastic transition function, where $T_{s,a}^{s'}$ may describe the probability of arriving in state $s'$ after performing action $a$ in state $s$.
[0064] $R: S \times A \times S \rightarrow \mathbb{R}$ may be a scalar reward function, which drives the behavior of the agent, where $R_{s,a}^{s'} \in \mathbb{R}$ may denote the expected reward after a state transition. For a state transition $s \rightarrow s'$ at time $t$ from $\vec{p}_t \rightarrow \vec{p}_{t+1}$, we define $R_{s,a}^{s'} = \|\vec{p}_t - \vec{p}_{GT}\|_2^2 - \|\vec{p}_{t+1} - \vec{p}_{GT}\|_2^2$. This may represent a distance-based feedback, which is positive if the agent gets closer to the target structure and negative otherwise. $\gamma$ may be the discount factor controlling the importance of future versus immediate rewards.
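The distance-based reward and the voxel-wise navigation model can be sketched as follows. This is an illustrative sketch only: the six-neighbour action set and the names `ACTIONS`, `step` and `reward` are assumptions, since the text only requires moves to an adjacent voxel.

```python
from typing import Dict, Tuple

Voxel = Tuple[int, int, int]

# Assumed action set: one-voxel moves along each axis (hypothetical names).
ACTIONS: Dict[str, Voxel] = {
    "x+": (1, 0, 0), "x-": (-1, 0, 0),
    "y+": (0, 1, 0), "y-": (0, -1, 0),
    "z+": (0, 0, 1), "z-": (0, 0, -1),
}


def squared_distance(p: Voxel, q: Voxel) -> int:
    """Squared Euclidean distance between two voxel positions."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def step(p_t: Voxel, action: str) -> Voxel:
    """Move the agent from p_t to the adjacent voxel selected by the action."""
    dx, dy, dz = ACTIONS[action]
    return (p_t[0] + dx, p_t[1] + dy, p_t[2] + dz)


def reward(p_t: Voxel, p_next: Voxel, p_gt: Voxel) -> int:
    """Squared distance to the target before the move minus the one after it."""
    return squared_distance(p_t, p_gt) - squared_distance(p_next, p_gt)


target = (5, 5, 5)
start = (2, 5, 5)
closer = reward(start, step(start, "x+"), target)   # agent approaches the target
farther = reward(start, step(start, "x-"), target)  # agent moves away
```

With these values, moving toward the target yields a positive reward and moving away yields a negative one, matching the sign convention described in the text.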
[0065] Furthermore, an optimal action-value function $Q^*(\cdot, \cdot)$ may be defined that encodes the maximum expected future discounted reward when starting in state $s$, performing action $a$, and acting optimally thereafter:
[0066] $Q^*(s, a) = \max_{\pi} E[R_t \mid s_t = s, a_t = a, \pi]$, where $\pi$ may be an action policy, in other words a probability distribution over actions in any given state. An important relation satisfied by the optimal action-value function $Q^*$ may be the Bellman optimality equation, which represents the following recursive formulation:
[0067] $Q^*(s, a) = \sum_{s'} T_{s,a}^{s'} \left( R_{s,a}^{s'} + \gamma \max_{a'} Q^*(s', a') \right) = E_{s'}\left( r + \gamma \max_{a'} Q^*(s', a') \right)$, where $s'$ defines a possible state visited after $s$, $a'$ the corresponding action, and $r = R_{s,a}^{s'}$ represents a compact notation for the current, immediate reward. In one example, this approach may be amended by deploying a deep Q-network (DQN), which is used as a non-linear approximator for the optimal action-value function. Accordingly, a deep Q-network can be trained in an RL setup using an iterative approach to minimize the mean squared error based on the Bellman optimality equation. At any training iteration $i$, it may be possible to approximate an optimal expected target value for the action-value function using a set of reference parameters based on a previous training iteration $i' < i$.
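As a minimal sketch of the training target described above, the Bellman optimality equation suggests regressing the current estimate Q(s, a) toward r + γ·max Q(s′, a′), where the maximum is evaluated with reference parameters from an earlier iteration. The helper names `td_target` and `squared_td_error` and the `terminal` flag are assumptions for illustration; the network itself is omitted and its outputs are passed in as plain numbers.

```python
from typing import Sequence


def td_target(r: float, next_q_values: Sequence[float], gamma: float,
              terminal: bool = False) -> float:
    """Target value r + gamma * max_a' Q(s', a') for one observed transition.

    next_q_values are the action values of the next state s', assumed to be
    computed with the reference parameters of a previous training iteration.
    """
    if terminal:
        return r  # no future reward after the final state of an episode
    return r + gamma * max(next_q_values)


def squared_td_error(q_sa: float, target: float) -> float:
    """Squared error between the current estimate Q(s, a) and the target."""
    return (target - q_sa) ** 2


target_value = td_target(r=1.0, next_q_values=[0.5, 2.0, -1.0], gamma=0.5)
loss = squared_td_error(q_sa=1.5, target=target_value)
```

Averaging `squared_td_error` over a batch of transitions gives the mean squared error that the iterative training would minimize.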
[0068] Learning the action-value function $Q^*$ may enable the agent to effectively search for objects in the image, as opposed to scanning the volumetric space exhaustively. This learning process may be based on an adequate exploration of the environment, which we ensure through an off-policy ε-greedy approach.
[0069] The variable ε ∈ [0, 1] controls the randomness in the exploration. This means that during training, actions are selected either uniformly at random with probability ε, or deterministically using the current policy with probability 1−ε. Another important strategy to ensure the training stability may be the decorrelation of the training samples using the concept of experience replay. During training, the agent maintains an active memory of episodic trajectories M = [T1; T2; . . . ], which is constantly expanded and uniformly sampled to estimate the learning gradient.
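The exploration rule and the replay memory can be sketched in a few lines. This is a sketch under assumptions: the memory stores individual transitions rather than whole trajectories, it is bounded by a hypothetical `capacity`, and the names `epsilon_greedy` and `ReplayMemory` are illustrative.

```python
import random
from collections import deque
from typing import Any, List, Sequence


def epsilon_greedy(q_values: Sequence[float], epsilon: float,
                   rng: random.Random) -> int:
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


class ReplayMemory:
    """Memory of past transitions that is sampled uniformly at random."""

    def __init__(self, capacity: int) -> None:
        # Oldest entries are dropped once the assumed capacity is reached.
        self.buffer: deque = deque(maxlen=capacity)

    def add(self, transition: Any) -> None:
        self.buffer.append(transition)

    def sample(self, batch_size: int, rng: random.Random) -> List[Any]:
        # Uniform sampling without replacement decorrelates the batch.
        return rng.sample(list(self.buffer), batch_size)


rng = random.Random(0)
memory = ReplayMemory(capacity=100)
for t in range(10):
    memory.add(("state", t))
batch = memory.sample(4, rng)
greedy_action = epsilon_greedy([0.1, 0.7, 0.2], epsilon=0.0, rng=rng)
```

Setting `epsilon` to 0 recovers the purely greedy policy, while a value of 1 gives uniformly random exploration.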
[0070] To further accelerate the training, it may be possible to use an adaptive episode length. Gradually reducing the episode length during training using linear decay may improve the exploration of the space, because increasing numbers of trajectories are sampled and stored in the active memory.
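A linear decay schedule for the episode length might look as follows. The start and end lengths and the function name `episode_length` are hypothetical; the text does not specify concrete values.

```python
def episode_length(iteration: int, total_iterations: int,
                   start_len: int, end_len: int) -> int:
    """Linearly interpolate the episode length from start_len to end_len."""
    # Clamp the progress to [0, 1] so late iterations stay at end_len.
    progress = min(max(iteration / total_iterations, 0.0), 1.0)
    return round(start_len + progress * (end_len - start_len))


# Hypothetical schedule: 1000 steps per episode at the start, 100 at the end.
schedule = [episode_length(i, 100, 1000, 100) for i in (0, 50, 100, 150)]
```

Shorter episodes later in training mean that more trajectories fit into the same number of agent steps.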
[0071] In a third method act 103, a first neural network is used to segment at least one region of the kidney representation based on the imaging data ID. To this end, the first neural network may be designed as a convolutional encoder-decoder architecture. Additionally, the first neural network may be designed as a multi-level feature concatenation and deep supervision architecture. Compared to non-learning-based approaches, such as those based on the statistical distribution of the intensity, including atlas-based, active shape model (ASM) based, level-set-based, or graph-cut-based methods, learning-based approaches may yield a better segmentation.
[0072] Fully convolutional networks (FCN) with deep supervision may be used, which can perform end-to-end learning and inference. The output of FCN may be refined with a fully connected conditional random field (CRF) approach. Furthermore, cascaded FCNs followed by CRF refinement may be applied.
[0073] Generative Adversarial Networks (GANs) may also be a powerful framework for this task. The GAN may include at least a generator and a discriminator. The generator tries to produce output that is close to the real samples, while the discriminator attempts to distinguish between real and generated samples.
[0074] An advanced approach may be an adversarial image-to-image network (DI2IN-AN), wherein a deep image-to-image network (DI2IN) may serve as a generator to produce a liver segmentation. It may employ a convolutional encoder-decoder architecture combined with multi-level feature concatenation and deep supervision. The network may try to optimize a conventional multi-class cross-entropy loss together with an adversarial term that aims to distinguish between the output of DI2IN and ground truth.
[0075] Ideally, the discriminator pushes the generator's output towards the distribution of ground truth, so that it may have the potential to enhance the generator's performance by refining its output. Since the discriminator is usually a CNN that takes the joint configuration of many input variables, it may embed the higher-order potentials into the network (the geometric difference between prediction and ground truth is represented by the trainable network model instead of heuristic hints). The proposed implementation also achieves higher computing efficiency since the discriminator does not need to be executed at inference.
[0076] DI2IN may take the 3D imaging data ID as input and output probability maps that indicate how likely voxels belong to the liver region. At least one block, in particular all blocks, in DI2IN may include 3D convolutional and bilinear upscaling layers.
[0077] In the encoder part of DI2IN, only convolution layers are used in all blocks. In order to increase the receptive field of neurons and lower the GPU memory consumption, the stride may be set to 2 at some layers, which reduces the size of the feature maps. Moreover, a larger receptive field may cover more contextual information and help to preserve liver shape information in the prediction. The decoder of DI2IN may include convolutional and bilinear upscaling layers.
[0078] To enable end-to-end prediction and training, the upscaling layers may be implemented as bilinear interpolation to enlarge the activation maps. All convolutional kernels may be of size 3×3×3. The upscaling factor in the decoder may be 2 for the x, y, and z dimensions. The leaky rectified linear unit (Leaky ReLU) and batch normalization may be adopted in all convolutional layers for proper gradient back-propagation.
[0079] In order to further improve the performance of DI2IN, several mainstream technologies may be adopted. First, a feature layer concatenation may be used in DI2IN. Fast bridges may be built directly from the encoder layers to the decoder layers. The bridges may pass the information from the encoder forward and then may concatenate it with the decoder feature layers. The combined feature may be used as the input for the next convolution layer. Following the acts above to explicitly combine advanced and low-level features, DI2IN may benefit from local and global contextual information.
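The interplay of strided encoder layers, factor-2 upscaling and encoder-to-decoder bridges can be traced with simple shape bookkeeping. The sketch below is an assumption-laden illustration, not the DI2IN architecture: the channel counts, the number of blocks, the "same" padding and the name `trace_bridges` are all hypothetical.

```python
from typing import List, Tuple

Shape = Tuple[int, ...]  # spatial size (x, y, z) of a feature map


def downsample(shape: Shape, stride: int) -> Shape:
    """Spatial size after a strided convolution with 'same' padding."""
    return tuple((d + stride - 1) // stride for d in shape)


def upscale(shape: Shape, factor: int = 2) -> Shape:
    """Spatial size after an upscaling layer with the given factor."""
    return tuple(d * factor for d in shape)


def trace_bridges(input_shape: Shape,
                  encoder_blocks: List[Tuple[int, int]],
                  decoder_channels: List[int]) -> List[Tuple[Shape, int]]:
    """Return (spatial size, channel count) of each concatenated feature.

    encoder_blocks   : (stride, output channels) per encoder block
    decoder_channels : output channels of each decoder convolution
    """
    encoder_maps = []
    shape = input_shape
    for stride, channels in encoder_blocks:
        shape = downsample(shape, stride)
        encoder_maps.append((shape, channels))
    shape, channels = encoder_maps[-1]
    combined = []
    skips = reversed(encoder_maps[:-1])
    for (skip_shape, skip_channels), out_channels in zip(skips, decoder_channels):
        shape = upscale(shape)
        assert shape == skip_shape, "a bridge needs matching spatial sizes"
        # Channel-wise concatenation feeds the next decoder convolution.
        combined.append((shape, channels + skip_channels))
        channels = out_channels
    return combined


features = trace_bridges((64, 64, 32), [(1, 16), (2, 32), (2, 64)], [32, 16])
```

The assertion makes the design constraint explicit: a bridge can only concatenate encoder and decoder features when the upscaled decoder map has the same spatial size as the stored encoder map.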
[0080] In a fourth method act 104, a second neural network is used to detect one or more suspected lesions of the segmented kidney representation. In addition, the third neural network may be configured to classify the malignancy risk MR based on imaging data ID and non-imaging data NID, wherein the non-imaging data NID includes at least histopathologic data.
[0081] The second neural network may be configured at least partly as a fully convolutional one-stage object detection (FCOS). This is preferably performed in a per-pixel prediction fashion, analogous to semantic segmentation. The FCOS may be anchor box free and/or proposal free. Thus, the FCOS may avoid any complicated computation related to anchor boxes, such as calculating overlaps during training. Additionally, FCOS may avoid hyper-parameters related to anchor boxes, to which the final detection performance may be sensitive.
[0082] An example of a fully convolutional one-stage object detector may be defined as follows: Let $F_i \in \mathbb{R}^{H \times W \times C}$ be the feature maps at layer $i$ of a backbone CNN and $s$ be the total stride until the layer. The ground-truth bounding boxes for the imaging data ID may be defined as $\{B_i\}$, where $B_i = (x_0^{(i)}, y_0^{(i)}, x_1^{(i)}, y_1^{(i)}, c^{(i)}) \in \mathbb{R}^4 \times \{1, 2, \ldots, C\}$. Here, $(x_0^{(i)}, y_0^{(i)})$ and $(x_1^{(i)}, y_1^{(i)})$ may denote the coordinates of the left-top and right-bottom corners of the bounding box. $c^{(i)}$ may be the class that the object in the bounding box belongs to. $C$ may be the number of classes. For each location $(x, y)$ on the feature map $F_i$, one may map it back onto the input image as $(\lfloor s/2 \rfloor + xs, \lfloor s/2 \rfloor + ys)$, which is near the center of the receptive field of the location $(x, y)$. The target bounding box may be directly regressed at the location.
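As a small illustration of the mapping from a feature-map location back to the input image, the sketch below assumes the mapping of the published FCOS detector, namely (⌊s/2⌋ + x·s, ⌊s/2⌋ + y·s) for total stride s. The function name `feature_to_image` is hypothetical.

```python
from typing import Tuple


def feature_to_image(x: int, y: int, stride: int) -> Tuple[int, int]:
    """Map feature-map location (x, y) to input-image coordinates.

    Assumes the FCOS convention: the offset stride // 2 places the point
    near the center of the receptive field of the location.
    """
    return (stride // 2 + x * stride, stride // 2 + y * stride)


origin = feature_to_image(0, 0, stride=8)
inner = feature_to_image(3, 2, stride=8)
```

With a total stride of 8, neighbouring feature-map locations map to image points that are 8 pixels apart.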
[0083] Location $(x, y)$ may be considered as a positive sample if it falls into any ground-truth box, and the class label $c^*$ of the location may be the class label of the ground-truth box. Otherwise, it may be a negative sample and $c^* = 0$ (background class). Furthermore, there may be a 4D real vector $\mathbf{t}^* = (l^*, t^*, r^*, b^*)$ being the regression targets for the location. $l^*$, $t^*$, $r^*$ and $b^*$ may be the distances from the location to the four sides of the bounding box. If a location falls into multiple bounding boxes, it may be considered as an ambiguous sample. The bounding box with minimal area may be chosen as its regression target.
[0084] If location $(x, y)$ is associated with a bounding box $B_i$, the training regression targets for the location may be formulated as $l^* = x - x_0^{(i)}$, $t^* = y - y_0^{(i)}$, $r^* = x_1^{(i)} - x$ and $b^* = y_1^{(i)} - y$. The FCOS may leverage as many foreground samples as possible to train the regressor.
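The assignment of class label and regression targets to a location, including the minimal-area rule for ambiguous samples, can be sketched as follows. The function name `regression_target` and the box tuple layout are assumptions for illustration.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float, int]   # (x0, y0, x1, y1, class)
Target = Tuple[float, float, float, float]     # (l*, t*, r*, b*)


def regression_target(x: float, y: float,
                      boxes: List[Box]) -> Tuple[int, Optional[Target]]:
    """Return (c*, (l*, t*, r*, b*)) for a location, or (0, None) if negative."""
    best = None
    for x0, y0, x1, y1, cls in boxes:
        if x0 <= x <= x1 and y0 <= y <= y1:
            area = (x1 - x0) * (y1 - y0)
            # Ambiguous sample: keep the enclosing box with minimal area.
            if best is None or area < best[0]:
                best = (area, cls, (x - x0, y - y0, x1 - x, y1 - y))
    if best is None:
        return 0, None  # background class, no regression target
    return best[1], best[2]


boxes = [(0, 0, 100, 100, 1), (40, 40, 60, 60, 2)]
ambiguous = regression_target(50, 50, boxes)   # inside both boxes
single = regression_target(10, 20, boxes)      # inside the large box only
negative = regression_target(200, 200, boxes)  # outside every box
```

The location (50, 50) lies in both boxes, so the smaller box of class 2 supplies its targets.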
[0085] Corresponding to the training targets, the final layer of the networks may predict an 80D vector $\mathbf{p}$ of classification labels and a 4D vector $\mathbf{t} = (l, t, r, b)$ of bounding box coordinates. In one example, $C$ binary classifiers may be trained. In one example, at least four convolutional layers may be added after the feature maps of the backbone networks for each of the classification and regression branches. Since the regression targets may be positive, $\exp(x)$ may be employed to map any real number to $(0, \infty)$ on top of the regression branch.
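[0086] A training loss function may be defined as follows:
$L(\{\mathbf{p}_{x,y}\}, \{\mathbf{t}_{x,y}\}) = \frac{1}{N_{pos}} \sum_{x,y} L_{cls}(\mathbf{p}_{x,y}, c^*_{x,y}) + \frac{\lambda}{N_{pos}} \sum_{x,y} \mathbb{1}_{\{c^*_{x,y} > 0\}} L_{reg}(\mathbf{t}_{x,y}, \mathbf{t}^*_{x,y})$,
where $L_{cls}$ may be the focal loss and $L_{reg}$ may be the IOU loss. $N_{pos}$ may denote the number of positive samples and $\lambda$, which may be set to 1, may be the balance weight for $L_{reg}$. The summation may be calculated over all locations on the feature maps $F_i$. $\mathbb{1}_{\{c^*_{x,y} > 0\}}$ may be the indicator function, being 1 if $c^*_{x,y} > 0$ and 0 otherwise.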
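A compact sketch of such a loss, restricted to a single foreground class for brevity, is given here. It assumes the standard binary focal loss (with the commonly used defaults α = 0.25 and γ = 2) and an IOU loss of the form −ln(IOU) computed from the four side distances; these concrete forms, the clamping constants, and the names `focal_loss`, `iou_loss` and `detection_loss` are assumptions, not details stated in the text.

```python
import math
from typing import List, Optional, Tuple

Sides = Tuple[float, float, float, float]  # (l, t, r, b)
Sample = Tuple[float, int, Optional[Sides], Optional[Sides]]


def focal_loss(p: float, positive: bool,
               alpha: float = 0.25, gamma: float = 2.0) -> float:
    """Binary focal loss for a predicted foreground probability p."""
    p = min(max(p, 1e-7), 1.0 - 1e-7)  # avoid log(0)
    if positive:
        return -alpha * (1.0 - p) ** gamma * math.log(p)
    return -(1.0 - alpha) * p ** gamma * math.log(1.0 - p)


def iou_loss(pred: Sides, target: Sides) -> float:
    """-ln(IOU) of two boxes given as distances from the same location."""
    l, t, r, b = pred
    lt, tt, rt, bt = target
    inter = (min(l, lt) + min(r, rt)) * (min(t, tt) + min(b, bt))
    union = (l + r) * (t + b) + (lt + rt) * (tt + bt) - inter
    return -math.log(max(inter / union, 1e-7))


def detection_loss(samples: List[Sample], lam: float = 1.0) -> float:
    """Classification plus lambda-weighted regression loss over N_pos.

    Each sample is (p, c_star, t_pred, t_star); the regression term is only
    evaluated where c_star > 0, which plays the role of the indicator.
    """
    n_pos = max(1, sum(1 for _, c, _, _ in samples if c > 0))
    cls = sum(focal_loss(p, c > 0) for p, c, _, _ in samples)
    reg = sum(iou_loss(tp, ts) for _, c, tp, ts in samples if c > 0)
    return (cls + lam * reg) / n_pos


good = [(0.9, 1, (10, 10, 10, 10), (10, 10, 10, 10)), (0.1, 0, None, None)]
bad = [(0.1, 1, (2, 2, 2, 2), (10, 10, 10, 10)), (0.9, 0, None, None)]
loss_good = detection_loss(good)
loss_bad = detection_loss(bad)
```

A confident, well-localized prediction yields a much smaller loss than a wrong and poorly localized one, which is the behaviour the training relies on.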
[0087] Given the imaging data ID, it may be forwarded through the network to obtain the classification scores $\mathbf{p}_{x,y}$ and the regression prediction $\mathbf{t}_{x,y}$ for each location on the feature maps $F_i$. In one example, the required IOU scores for positive anchor boxes may be lowered.
[0088] In one example, a single-layer branch may additionally be added, in parallel with the classification branch, in order to predict a “centerness” of a location. The centerness may depict the normalized distance from the location to the center of the object that the location is responsible for. Given the regression targets $l^*$, $t^*$, $r^*$ and $b^*$ for a location, the centerness target may be defined as $\mathrm{centerness}^* = \sqrt{\frac{\min(l^*, r^*)}{\max(l^*, r^*)} \times \frac{\min(t^*, b^*)}{\max(t^*, b^*)}}$. The centerness may range from 0 to 1 and may be trained with binary cross entropy (BCE) loss. The final score may be computed by multiplying the predicted centerness with the corresponding classification score.
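A minimal sketch of the centerness target and of the final score follows. It assumes the centerness definition of the published FCOS detector, sqrt(min(l*, r*)/max(l*, r*) · min(t*, b*)/max(t*, b*)), and strictly positive distances, i.e., a location inside the box; the function names are hypothetical.

```python
import math


def centerness(l: float, t: float, r: float, b: float) -> float:
    """Centerness target of a location from its four side distances.

    Assumes the FCOS definition and l, t, r, b > 0 (location inside the box).
    """
    return math.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))


def final_score(class_score: float, predicted_centerness: float) -> float:
    """Down-weight detections predicted far away from the object center."""
    return class_score * predicted_centerness


centered = centerness(10, 10, 10, 10)   # exactly at the box center
off_center = centerness(5, 10, 20, 10)  # shifted toward the left side
score = final_score(0.8, off_center)
```

A location at the exact center gets a centerness of 1, and the value falls toward 0 as the location approaches a box border.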
[0089] In a fifth method act 105, the detected suspected lesion may be classified with a malignancy risk MR using a third neural network, wherein the third neural network is a deep profiler. Therefore, the deep profiler may include an encoder for extracting imaging features. Furthermore, the encoder may be designed as a convolutional neural network (CNN), in particular a three-dimensional CNN. Additionally, the deep profiler may include a decoder for estimating at least one malignancy risk MR indicator and/or a task-specific network for generating at least one image signature for classifying at least one malignancy risk MR.
[0090] In one example, the imaging data ID may be quantified by intensity, geometry, texture and/or wavelet features. The geometry features may quantify the 2D or 3D shape characteristics of the kidney, the texture features may describe spatial distribution of voxel or pixel intensities, thereby quantifying a heterogeneity. Any intensity and texture features may be computed after applying wavelet transformations to the imaging data ID.
[0091] To find voxels of the imaging data ID that contribute the most toward the prediction, the derivative of the final partial likelihood loss with respect to the imaging data ID may be taken and evaluated.
[0092] In one example, the imaging data ID and/or non-imaging data NID may include context data relating to data recording information. In addition, the context data may include at least one of the following: data recording device information, contrast configuration information, brightness configuration information, recording direction information, total recording time information, projection type information (e.g., for CT: average intensity projection (AIP), maximum intensity projection (MIP) or minimum intensity projection (MinIP)), date and/or time of recording information.
[0093] In one example, the non-imaging data NID of the subject patient may include at least one of the following: age, weight, size, health condition information, gender, nutritional practice.
[0094] A sixth method act 106 may convert the imaging data ID from at least a partially 3D illustration of the anatomy of the subject patient to a 2D illustration of the anatomy of the subject patient. This method act may be performed at any time due to, e.g., memory or computational limitations.
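The text does not state how the 3D-to-2D conversion is carried out. One possible approach, shown here purely as an assumption, is an intensity projection along the depth axis, using the projection types named earlier (MIP, MinIP, AIP). The function name `project` and the `[z][y][x]` indexing are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

Volume = List[List[List[float]]]  # layered 3D data, indexed [z][y][x]
Image2D = List[List[float]]       # 2D result, indexed [y][x]

REDUCERS: Dict[str, Callable[[Sequence[float]], float]] = {
    "mip": max,                             # maximum intensity projection
    "minip": min,                           # minimum intensity projection
    "aip": lambda v: sum(v) / len(v),       # average intensity projection
}


def project(volume: Volume, mode: str = "mip") -> Image2D:
    """Collapse the z axis of a volume into a single 2D image."""
    reduce = REDUCERS[mode]
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    return [
        [reduce([volume[z][y][x] for z in range(depth)]) for x in range(width)]
        for y in range(height)
    ]


volume = [
    [[1, 2], [3, 4]],  # layer z = 0
    [[5, 0], [1, 8]],  # layer z = 1
]
mip = project(volume, "mip")
minip = project(volume, "minip")
aip = project(volume, "aip")
```

Selecting a single layer or re-slicing along another axis would be equally valid conversions; the projection is only one option.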
[0095]
[0096] The system 200 includes an interface 201 configured to provide imaging data ID of an anatomy of a subject patient, wherein the imaging data ID includes at least partially a representation of a kidney of the subject patient. Therefore, imaging data ID will be forwarded, in particular together with non-imaging data, to the interface.
[0097] A first analyzing unit 202 (processor with analyzing instructions) of the system 200 is configured to use a first neural network to segment at least one region of the kidney representation which is based on the imaging data. A second analyzing unit 203 (processor with analyzing instructions) is also part of the system 200, wherein the second analyzing unit 203 is configured to use a second neural network to detect one or more suspected lesions of the segmented kidney representation. The system includes additionally a deep profiler 204 (processor with deep profiler instructions) which is configured to classify the detected suspected lesion with a malignancy risk MR.
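The data flow between the units can be sketched as a simple composition of three callables. This is a structural illustration only: the class `KidneyRiskPipeline`, its field names, and the stub functions standing in for the trained networks are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Tuple


@dataclass
class KidneyRiskPipeline:
    """Chains segmentation, lesion detection and malignancy classification."""
    segment: Callable[[Any], Any]                          # first neural network
    detect: Callable[[Any, Any], List[Any]]                # second neural network
    classify: Callable[[Any, Any, Optional[dict]], float]  # deep profiler

    def run(self, imaging_data: Any,
            non_imaging_data: Optional[dict] = None) -> List[Tuple[Any, float]]:
        kidney_region = self.segment(imaging_data)
        lesions = self.detect(imaging_data, kidney_region)
        # One malignancy risk per detected suspected lesion.
        return [(lesion, self.classify(imaging_data, lesion, non_imaging_data))
                for lesion in lesions]


# Stub networks with fixed outputs, used only to show the data flow.
def stub_segment(image: Any) -> str:
    return "kidney_mask"


def stub_detect(image: Any, mask: Any) -> List[str]:
    return ["lesion_a", "lesion_b"]


def stub_classify(image: Any, lesion: Any, nid: Optional[dict]) -> float:
    return 0.9 if lesion == "lesion_a" else 0.1


pipeline = KidneyRiskPipeline(stub_segment, stub_detect, stub_classify)
report = pipeline.run(imaging_data="ct_volume", non_imaging_data={"age": 61})
```

Keeping the three stages as separate callables mirrors the separation into analyzing units and lets each network be trained or replaced independently.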
[0098] In one example, the deep profiler 204 is configured to classify the malignancy risk MR based on imaging data ID and non-imaging data NID, wherein the non-imaging data NID includes at least histopathologic data.
[0099] Whenever a malignancy risk MR score has been determined by the system 200, the value may be stored in the system 200 (storage in memory) or in a cloud-based memory storage system. In one example, at least parts of the system 200 may be deployed in a cloud-based system, which means that they do not have to be physically integrated in a box.
[0100] The system 200 may be configured such that each input data and/or output data can be forwarded to each of its units, namely its interface 201, its first analyzing unit 202, its second analyzing unit 203 and/or its deep profiler 204. It is further noted that the neural networks of the specific units may be designed analogously to the neural networks disclosed above relating to
[0101]
[0102] Therefore, the method 300 includes a first act 301 of training a first neural network with first training data including imaging data ID of an anatomy of at least one subject patient, wherein the imaging data ID includes at least partially a representation of one or more kidneys. An adversarial network may be utilized for training of the first neural network to discriminate the output from ground truth.
[0103] In a second method act 302, a second neural network is trained with second training data including one or more detected lesions of one or more segmented kidney representations.
[0104] In a third method act 303, a third neural network is trained with third training data of one or more lesions classified with a malignancy risk MR. The third training data are preferably based on imaging data ID and non-imaging data NID, wherein the non-imaging data NID includes at least histopathologic data. Furthermore, at least one ground-truth label of the deep profiler may be determined based on histopathologic data.
[0105] In a fourth method act 304, a fourth neural network is trained using deep reinforcement learning based on fourth training data related to detected anatomical landmarks, in particular landmarks relating to one or more kidney representations.
[0106] In one example, at least two different training data fragments of the first, second, third, or fourth training data may be based on the same subject patient. Furthermore, the two different training data fragments may be indicated as training data of the same subject patient. Additionally, the non-imaging data NID may include at least one indicator which is configured to serve for patient follow-up diagnosis.
[0107]
[0108] Therefore, the system 400 includes a first analysis unit 401 (processor with analyzing instructions) which is configured to train a first neural network with first training data including imaging data ID of an anatomy of at least one subject patient, wherein the imaging data ID includes at least partially a representation of one or more kidneys.
[0109] A second analysis unit 402 (processor with analyzing instructions) of the system 400 is configured to train a second neural network with second training data including one or more detected lesions of one or more segmented kidney representations. The system 400 further includes a third analysis unit 403 (processor with analyzing instructions) configured to train a third neural network with third training data of one or more lesions classified with a malignancy risk MR.
[0110] The system may additionally include a fourth analysis unit 404 (processor with analyzing instructions) configured to train a fourth neural network with fourth training data related to detected anatomical landmarks, in particular landmarks relating to one or more kidney representations.
[0111] The system 400 may be configured such that each input data and/or output data can be forwarded to each of its units, namely its first analysis unit 401, its second analysis unit 402, its third analysis unit 403 and/or its fourth analysis unit 404. It is further noted that the neural networks of the specific units may be designed analogously to the neural networks disclosed above relating to
[0112] According to one embodiment, the following clause is provided:
[0113] Clause 1: A computer-implemented method (100) for classifying a malignancy risk (MR) of a kidney, in particular a human kidney, comprising the following acts:
[0114] providing (101) imaging data (ID) of an anatomy of a subject patient, wherein the imaging data (ID) comprises at least partially a representation of a kidney of the subject patient;
[0115] using (102) a first neural network to segment at least one region of the kidney representation which is based on the imaging data (ID);
[0116] using (103) a second neural network to detect one or more suspected lesions of the segmented kidney representation; and
[0117] classifying (104) the detected suspected lesion with a malignancy risk (MR) using a third neural network,
[0118] wherein the third neural network is a deep profiler.
[0119] 2. The method of clause 1, wherein the imaging data (ID) and/or non-imaging data (NID) comprises context data relating to data recording information.
[0120] 3. The method of clause 1 or 2, wherein the context data comprises at least one of the following:
[0121] data recording device information, contrast configuration information, brightness configuration information, recording direction information, total recording time information, projection type information, date and/or time of recording information.
[0122] 4. The method of any of clauses 1 to 3, wherein the non-imaging data (NID) of the subject patient comprises at least one of the following:
[0123] age, weight, size, health condition information, gender, nutritional practice.
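The order of acts 102 to 104 in clause 1 can be sketched as a simple pipeline. The three networks are passed in as plain callables. The toy stand-ins, the `Lesion` type, and the threshold values are assumptions made only so that the example runs, and they say nothing about how the actual networks, in particular the deep profiler, are built.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Image = Sequence[float]  # stand-in for imaging data ID


@dataclass
class Lesion:
    index: int    # position of the suspected lesion within the image
    value: float  # intensity at that position (toy feature)


def classify_kidney(
    imaging_data: Image,
    segment: Callable[[Image], List[int]],
    detect: Callable[[Image, List[int]], List[Lesion]],
    classify: Callable[[Lesion], float],
) -> List[float]:
    """Run acts 102 to 104 in order; return one malignancy risk per lesion."""
    region = segment(imaging_data)            # act 102: first neural network
    lesions = detect(imaging_data, region)    # act 103: second neural network
    return [classify(lesion) for lesion in lesions]  # act 104: third network


# Toy stand-ins for the three trained networks (assumptions, not real models).
def toy_segment(img: Image) -> List[int]:
    return [i for i, v in enumerate(img) if v > 0.0]


def toy_detect(img: Image, region: List[int]) -> List[Lesion]:
    return [Lesion(i, img[i]) for i in region if img[i] > 0.5]


def toy_classify(lesion: Lesion) -> float:
    return min(1.0, lesion.value)


risks = classify_kidney([0.0, 0.2, 0.9, 0.6], toy_segment, toy_detect, toy_classify)
```

Passing the networks as arguments keeps the sketch independent of any particular framework; each stage only consumes the output of the stage before it.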
[0124] According to a 5th clause, the following is provided:
[0125] A computer-implemented method (300) for training a machine learning algorithm to classify a malignancy risk (MR) of a kidney, in particular a human kidney, comprising the following acts:
[0126] training (301) a first neural network with first training data including imaging data (ID) of an anatomy of at least one subject patient, wherein the imaging data (ID) comprises at least partially a representation of one or more kidneys;
[0127] training (302) a second neural network with second training data including one or more detected lesions of one or more segmented kidney representations; and training (303) a third neural network with third training data of one or more lesions classified with a malignancy risk.
[0128] 6. The method of clause 5, wherein the third training data are based on imaging data (ID) and non-imaging data (NID), wherein the non-imaging data (NID) comprises at least histopathologic data.
[0129] 7. The method of clause 5 or 6, wherein at least one ground-truth label of the deep profiler is determined based on histopathologic data.
[0130] 8. The method of any of clauses 5 to 7, comprising the following additional act:
[0131] training (304) a fourth neural network using deep reinforcement learning based on fourth training data related to detected anatomical landmarks, in particular landmarks relating to one or more kidney representations.
[0132] 9. The method of any of clauses 5 to 8, wherein an adversarial network is utilized for training of the first neural network to discriminate the output from ground truth.
[0133] 10. The method of any of clauses 5 to 9, wherein at least two different training data fragments are based on the same subject patient.
[0134] 11. The method of clause 10, wherein the two different training data fragments are indicated as training data of the same subject patient.
[0135] 12. The method of any of clauses 5 to 11, wherein the non-imaging data (NID) comprise at least one indicator which is configured to serve for patient follow-up diagnosis.
[0136] Although the present invention has been described in detail with reference to the preferred example, the present invention is not limited by the disclosed examples, from which the skilled person is able to derive other variations without departing from the scope of the invention. Examples being thus described, it will be obvious that the same may be varied in many ways. Such variations are not to be regarded as a departure from the spirit and scope of the present invention, and all such modifications as would be obvious to one skilled in the art are intended to be included within the scope of the following claims.