Rapid Digital Nuclear Reactor Design Using Machine Learning
20230237226 · 2023-07-27
Assignee
Inventors
Cpc classification
G06N7/01
PHYSICS
G06N5/01
PHYSICS
Y02E30/00
GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
G06N3/126
PHYSICS
G06F30/13
PHYSICS
International classification
Abstract
A method designs nuclear reactors using design variables and metric variables. A user specifies ranges for the design variables and threshold values for the metric variables and selects design parameter samples. For each sample, the method runs three processes, which compute metric variables for thermal-hydraulics, neutronics, and stress. The method applies a cost function to compute an aggregate residual of the metric variables compared to the threshold values. The method deploys optimization methods, either training a machine learning model using the samples and computed aggregate residuals, or using genetic algorithms, simulated annealing, or differential evolution. When using Bayesian optimization, the method shrinks the range for each design variable according to correlation between the respective design variable and estimated residuals using the machine learning model. These steps are repeated until a sample having a smallest residual is unchanged for multiple iterations. The final model assesses relative importance of each design variable.
Claims
1-20. (canceled)
21. A method of designing a nuclear reactor, comprising: identifying a plurality of design variables for the nuclear reactor; identifying a plurality of metric variables for the nuclear reactor, each of the plurality of metric variables measuring a respective thermal-hydraulic property, neutronics property, or stress property; receiving user input to specify a respective range of values for each of the plurality of design variables, thereby forming an initial trust region; receiving user input to specify a respective threshold value for each of the plurality of metric variables; (i) constructing a Latin hypercube comprising N samples of values for the plurality of design parameters within the initial trust region, wherein N is an integer greater than 1; (ii) computing each of the plurality of metric variables for each of the N samples in the Latin hypercube and applying a cost function to compute a respective residual of the respective metric variable compared to the respective threshold value; (iii) training a machine learning model according to the N samples and the corresponding computed residuals; shrinking the trust region, wherein the respective range for each of the plurality of design variables is shrunk according to a correlation between the respective design variable and estimated residuals using the machine learning model; repeating the (i) constructing, (ii) computing, and (iii) training until a sample having a smallest residual is unchanged for a predetermined number of iterations; using the machine learning model from the final iteration to assess relative importance of each of the plurality of design variables; and providing the assessment visually in a report.
22. The method of claim 21, wherein the machine learning model is a random forest of decision trees or a neural network for Bayesian optimization.
23. The method of claim 21, wherein the machine learning model uses one or more evolutionary methods, including genetic algorithms, coupled simulated anneal algorithms, and/or differential evolutionary algorithms.
24. The method of claim 21, wherein computing each of the plurality of metric variables is performed concurrently.
25. The method of claim 21, wherein computing each of the plurality of metric variables is performed serially.
26. The method of claim 21, wherein computing each of the plurality of metric variables includes a thermal-hydraulics analysis process, a neutronics analysis process, and a stress analysis process, and each of the thermal-hydraulics analysis process, the neutronics analysis process, and the stress analysis process is performed at a respective distinct computing subsystem.
27. The method of claim 21, further comprising during repeating of the (i) constructing, (ii) computing, and (iii) training: determining a sample having a smallest aggregate residual; determining that the sample has a value for a first design variable of the plurality of the design variables on a boundary of the trust region; and in response to the determination, expanding the trust region to include a range for the first design variable that was not previously in the trust region.
28. The method of claim 21, wherein one of the plurality of metric variables is the effective neutron multiplication factor (K_eff).
29. The method of claim 21, wherein the shrinking uses a learning rate multiplier specified by user input.
30. The method of claim 21, wherein the initial Latin hypercube is centered at average values for the user-specified ranges of the design variables.
31. The method of claim 21, wherein the plurality of design variables includes a first design variable that has discrete categorical values, the method further comprising: encoding each distinct categorical value as a numeric value in a continuous range to form a first replacement design variable; and substituting the first replacement design variable for the first design variable.
32. The method of claim 31, further comprising, during each repeating of the (i) constructing, (ii) computing, and (iii) training: for a sample having a smallest residual, estimating probabilities that switching to different categorical values would produce a smaller residual according to the cost function; and for an immediately subsequent repeating of the (i) constructing, (ii) computing, and (iii) training, using sampling rates for the Latin hypercube that are proportional to the estimated probabilities.
33. The method of claim 31, wherein the first design variable is fluid type or material type.
34. The method of claim 33, wherein the categorical values for fluid type or material type are selected from a database of materials including single phase gases, liquids, and solids.
35. A computing system, comprising: one or more computers, each having one or more processors and memory, wherein the memory stores one or more programs configured for execution by the one or more processors, the one or more programs comprising instructions for: identifying a plurality of metric variables for a nuclear reactor, each of the plurality of metric variables measuring a respective thermal-hydraulic property, neutronics property, or stress property; receiving user input to specify a respective range of values for each of the plurality of design variables, thereby forming an initial trust region; receiving user input to specify a respective threshold value for each of the plurality of metric variables; (i) constructing a Latin hypercube comprising N samples of values for the plurality of design parameters within the initial trust region, wherein N is an integer greater than 1; (ii) computing each of the plurality of metric variables for each of the N samples in the Latin hypercube and applying a cost function to compute a respective residual of the respective metric variable compared to the respective threshold value; (iii) training a machine learning model according to the N samples and the corresponding computed residuals; shrinking the trust region, wherein the respective range for each of the plurality of design variables is shrunk according to a correlation between the respective design variable and estimated residuals using the machine learning model; repeating the (i) constructing, (ii) computing, and (iii) training until a sample having a smallest residual is unchanged for a predetermined number of iterations; using the machine learning model from the final iteration to assess relative importance of each of the plurality of design variables; and providing the assessment visually in a report.
36. The computing system of claim 35, wherein the machine learning model is a random forest of decision trees or a neural network for Bayesian optimization.
37. The computing system of claim 35, wherein the machine learning model uses one or more evolutionary methods, including genetic algorithms, coupled simulated anneal algorithms, and/or differential evolutionary algorithms.
38. The computing system of claim 35, wherein one of the plurality of metric variables is the effective neutron multiplication factor (K_eff).
39. The computing system of claim 35, wherein the plurality of design variables includes a first design variable that has discrete categorical values, the method further comprising: encoding each distinct categorical value as a numeric value in a continuous range to form a first replacement design variable; and substituting the first replacement design variable for the first design variable.
40. The computing system of claim 39, further comprising, during each repeating of the (i) constructing, (ii) computing, and (iii) training: for a sample having a smallest residual, estimating probabilities that switching to different categorical values would produce a smaller residual according to the cost function; and for an immediately subsequent repeating of the (i) constructing, (ii) computing, and (iii) training, using sampling rates for the Latin hypercube that are proportional to the estimated probabilities.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0031] For a better understanding of the disclosed systems and methods, as well as additional systems and methods, reference should be made to the Description of Implementations below, in conjunction with the following drawings in which like reference numerals refer to corresponding parts throughout the figures.
[0032]
[0033]
[0034]
[0035]
[0036] Reference will now be made to implementations, examples of which are illustrated in the accompanying drawings. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will be apparent to one of ordinary skill in the art that the present invention may be practiced without requiring these specific details.
DESCRIPTION OF IMPLEMENTATIONS
[0037]
[0038] The algorithm 104 is iterative, starting with a trust region 120. From this trust region 120, a plurality of sample points are selected. In some implementations, the samples form a Latin hypercube (e.g., centered at a point whose coordinates are the average of each of the ranges). The algorithm selects the sample points so that the maximum multivariate correlation between the design variables is minimized. In some implementations, this is an iterative process that specifies a predetermined maximum value p_max, iterating until the maximum multivariate correlation is less than p_max.
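The sampling step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function names are hypothetical, the largest absolute pairwise correlation stands in for the "maximum multivariate correlation", and the whole hypercube is redrawn on each attempt until the bound p_max is met.

```python
import numpy as np

def latin_hypercube(bounds, n, rng):
    """Draw n points with exactly one point per stratum in every dimension."""
    bounds = np.asarray(bounds, dtype=float)   # shape (d, 2): [low, high] per variable
    d = bounds.shape[0]
    # For each dimension, shuffle the n strata and jitter inside each stratum.
    strata = np.stack([rng.permutation(n) for _ in range(d)], axis=1)
    unit = (strata + rng.random((n, d))) / n   # values in [0, 1)
    return bounds[:, 0] + unit * (bounds[:, 1] - bounds[:, 0])

def max_abs_correlation(samples):
    """Largest absolute pairwise correlation between columns (needs d >= 2)."""
    corr = np.corrcoef(samples, rowvar=False)
    off_diag = corr[~np.eye(corr.shape[0], dtype=bool)]
    return float(np.max(np.abs(off_diag)))

def low_correlation_hypercube(bounds, n, p_max, max_tries=1000, seed=0):
    """Redraw the hypercube until the correlation proxy falls below p_max."""
    rng = np.random.default_rng(seed)
    for _ in range(max_tries):
        samples = latin_hypercube(bounds, n, rng)
        if max_abs_correlation(samples) < p_max:
            return samples
    raise RuntimeError("no hypercube met the correlation bound; raise p_max or n")
```

A call such as `low_correlation_hypercube([[0, 1], [10, 20], [-5, 5]], n=20, p_max=0.3)` returns a 20-by-3 array of samples inside the trust region. The bounds here are placeholder numbers, not reactor design values.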
[0039] After the sample points are selected, algorithm 104 runs three processes to determine the quality of each sample point. Typically these processes are run in parallel, but some implementations perform an initial screening using one or more of the processes (e.g., the neutronics engine 136) to rule out some designs that are too far from workable. Each of the thermal-hydraulics engine 132, the mechanical stability (“stress”) engine 134, and the neutronics engine 136 computes values for one or more of the metric variables 112. For each sample, the algorithm computes a cost function that specifies the quality of the sample in terms of producing a viable model for a reactor core. In some implementations, the cost function computes a residual, which is a weighted polynomial of the amount by which each of the metric variables differs from its target constraint value (threshold). For example, suppose there are n metric variables y_1, y_2, . . . , y_n with target constraint values c_1, c_2, . . . , c_n and weights w_1, w_2, . . . , w_n. Some implementations specify the residual as the sum w_1(y_1 − c_1) + w_2(y_2 − c_2) + . . . + w_n(y_n − c_n).
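The weighted sum above translates directly into code. The sketch below is illustrative only: the function name and the numeric values in the usage note are hypothetical.

```python
import numpy as np

def residual(metrics, thresholds, weights):
    """Weighted sum of signed deviations: sum_i w_i * (y_i - c_i)."""
    y = np.asarray(metrics, dtype=float)      # computed metric values y_i
    c = np.asarray(thresholds, dtype=float)   # target constraint values c_i
    w = np.asarray(weights, dtype=float)      # weights w_i
    return float(np.sum(w * (y - c)))
```

For example, `residual([1.02, 900.0], [1.00, 1000.0], [10.0, 0.01])` evaluates 10(0.02) + 0.01(−100), which is approximately −0.8. Because the terms are signed, deviations in opposite directions can cancel in this linear form. Higher-order terms of the weighted polynomial would avoid that, but their exact form is not specified here.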
[0040] The sample points and computed residuals are input to the machine learning engine 140, which uses the data to build a machine learning model 262 (or multiple models). In some implementations, the model is a random forest of decision trees. In some implementations, the model is a neural network. Some implementations use Gaussian processes for the machine learning. In other implementations, memoryless methods for evolutionary optimization may be used. In some implementations, a suite of optimization methods is provided for selection to best meet the design problem. Based on the machine learning model 262, the algorithm 104 determines the relative importance of each of the design variables. Some of the design variables have a significant impact on the residual, but other design variables have little impact on the residual. Using this data, the algorithm 104 is able to shrink (150) the trust region 120 and repeat the iterative process with a smaller trust region. Because the trust region shrinks on each iteration, the algorithm 104 collects more and more information about smaller regions, and thus the data becomes more precise. Note that the cumulative sample points from all of the iterations are used when building the machine learning model 262 (e.g., if the same number of sample points are generated for the second iteration, the second iteration will have twice as many sample points in total to build the machine learning model).
[0041] In some implementations, the process uses a genetic algorithm instead of or in addition to the shrinking of the trust region in each iteration. When using a genetic algorithm, each of the sample points is weighted according to the residual. In this case, a sample with a smaller residual has a higher weight. In some implementations, a sample with a residual that is too high will have a weight of zero. These weights are used for constructing the samples used in the next iteration. For example, randomly select pairs of samples according to their weights, and for each pair, construct children samples for the next iteration using mutation and/or crossover. In some implementations, more than two samples are used to generate a child sample for the next iteration.
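One generation of the genetic-algorithm variant might look like the sketch below. The weighting rule, uniform crossover, Gaussian mutation, and the parameters `cutoff` and `mutation_scale` are illustrative assumptions, since the text does not fix these details.

```python
import numpy as np

def next_generation(samples, residuals, bounds, cutoff, rng, mutation_scale=0.05):
    """Build children for the next iteration from residual-weighted parents."""
    samples = np.asarray(samples, dtype=float)
    residuals = np.asarray(residuals, dtype=float)
    bounds = np.asarray(bounds, dtype=float)
    # Smaller residual -> larger weight; residuals above the cutoff get weight zero.
    weights = np.where(residuals <= cutoff, residuals.max() - residuals + 1e-12, 0.0)
    if weights.sum() == 0.0:
        raise ValueError("every sample exceeded the residual cutoff")
    probs = weights / weights.sum()
    n, d = samples.shape
    span = bounds[:, 1] - bounds[:, 0]
    children = np.empty_like(samples)
    for k in range(n):
        i, j = rng.choice(n, size=2, p=probs)      # parents drawn according to weight
        mask = rng.random(d) < 0.5                 # uniform crossover between parents
        child = np.where(mask, samples[i], samples[j])
        child = child + rng.normal(0.0, mutation_scale * span)  # Gaussian mutation
        children[k] = np.clip(child, bounds[:, 0], bounds[:, 1])
    return children
```

Parents are drawn with replacement, so a strongly favored sample can be paired with itself. In that case the child differs from its parent only through mutation.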
[0042] When the best sample point does not change after multiple iterations, the algorithm 104 exits the processing loop. The data from the final iteration is used to provide output analytics 106 for users. In some implementations, the output analytics 106 are provided as graphical data visualizations, such as the graphical data visualizations 160 and 162. The first graphical data visualization 160 shows the residual composition based on the input parameters (i.e., the design variables 110). This shows how much each of the design variables 110 contributed to the computed residuals. In this example, there are eight input parameters that contributed to the residuals, including core radius, core height, flow hole diameter, and flow rate. In some implementations, any input parameter whose contribution to the residual is less than a threshold value is omitted from the graphic 160. The second graphical data visualization 162 shows the residual composition based on the metric constraints 112. In this example data visualization 162, T_max and K_eff are the two most critical constraints, whereas Inlet temperature (Inletτ) and pressure drop dP/P were less significant. In some implementations, these data visualizations 160 and 162 are interactive, allowing a user to investigate the options.
[0043]
[0044] The computing device 200 may include a user interface 206 comprising a display device 208 and one or more input devices or mechanisms 210. In some implementations, the input device/mechanism includes a keyboard. In some implementations, the input device/mechanism includes a “soft” keyboard, which is displayed as needed on the display device 208, enabling a user to “press keys” that appear on the display 208. In some implementations, the display 208 and input device/mechanism 210 comprise a touch screen display (also called a touch sensitive display).
[0045] In some implementations, the memory 214 includes high-speed random access memory, such as DRAM, SRAM, DDR RAM or other random access solid state memory devices. In some implementations, the memory 214 includes non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid state storage devices. In some implementations, the memory 214 includes one or more storage devices remotely located from the CPU(s) 202. The memory 214, or alternatively the non-volatile memory device(s) within the memory 214, comprises a non-transitory computer readable storage medium. In some implementations, the memory 214, or the computer-readable storage medium of the memory 214, stores the following programs, modules, and data structures, or a subset thereof:
[0046] an operating system 216, which includes procedures for handling various basic system services and for performing hardware dependent tasks;
[0047] a communications module 218, which is used for connecting the computing device 200 to other computers and devices via the one or more communication network interfaces 204 (wired or wireless) and one or more communication networks, such as the Internet, other wide area networks, local area networks, metropolitan area networks, and so on;
[0048] a web browser 220 (or other application capable of displaying web pages), which enables a user to communicate over a network with remote computers or devices;
[0049] an input user interface 222, which allows a user to specify ranges 232 of allowed values for the design variables 110 and to specify target (threshold) values 242 for the constraints 112. The ranges for the design variables are used to specify an initial trust region 120;
[0050] an output user interface 224, which provides graphical analytics 226 about the designs constructed. Some examples of graphical analytics 226 are the data visualizations 160 and 162 in
[0058] Each of the above identified executable modules, applications, or sets of procedures may be stored in one or more of the previously mentioned memory devices, and corresponds to a set of instructions for performing a function described above. The above identified modules or programs (i.e., sets of instructions) need not be implemented as separate software programs, procedures, or modules, and thus various subsets of these modules may be combined or otherwise re-arranged in various implementations. In some implementations, the memory 214 stores a subset of the modules and data structures identified above. Furthermore, the memory 214 may store additional modules or data structures not described above.
[0059] Although
[0060]
[0061] Next, the process generates (306) a Latin hypercube (LH) of samples from the initial trust region. Generating the Latin hypercube is iterative, and the process iterates (306) on the cube size until the multivariate correlation between the design variables for the samples is less than a predefined constant p_max. Each iteration increases the size of the Latin hypercube and/or replaces at least one of the samples with a new sample. The initial stochastic process for building the Latin hypercube is (306) centered at the average of the trust region.
[0062] The process then runs (308) thermal-hydraulic (TH) calculations for each of the points in the Latin hypercube. Some implementations use a 3D Finite Element Method for calculations. Some implementations use a true 2D calculation, which includes both a 2-dimensional fluids calculation (axial and radial) and a 2-dimensional heat conduction calculation (axial and radial). However, some implementations are able to produce good quality results with less than a full 2D calculation. In general, the geometries create flows that really only result in axial changes that are relevant. Thus, some implementations solve only 1D axial flow without a radial component. The reverse is true for the conduction. The radial component is very important, but the axial is not as important. Thus, some implementations solve only 1D radial conduction. In this scenario, there are solutions that are both axial and radial, but they are not true 2D calculations. Computing only the 1D axial flow and only the 1D radial conduction is sometimes referred to as “1.5D”.
[0063] The output from the thermal-hydraulic calculations (e.g., the axial temperature distribution) is then used (310) as input to the neutronic calculations and the mechanical stability analysis. The thermal-hydraulic process, neutronics process, and mechanical stability analysis compute various metric variables. In particular, the process extracts (312) nominal K_eff and its standard error from the neutronic calculations. In some implementations, boxes 308 and 310 in this flow are looped until convergence before proceeding to box 312.
[0064] The computed metric values from the multiple physics processes are then used as an input to a cost function, which measures (314) the overall quality of a sample with respect to satisfying the constraints. Even if all of the samples fail to provide viable designs for a nuclear reactor, some of the designs are better than others, and can be used to iteratively locate better designs. The metrics used by the cost function generally include metrics based on neutronics, thermal-hydraulics, mechanical stability, mass, geometry, cost, and other factors. In some implementations, the cost function f is a weighted combination of differences between the target constraint value for each metric variable and the computed value for each metric variable, such as Σ_i w_i (y_i − c_i), where the w_i are the weights, the y_i are the measured metric values, and the c_i are the target constraint values.
[0065] Using the data from the cost function, the process applies (316) machine learning to build a model. In the example illustrated in this flowchart, the machine learning uses (316) a random forest (RF) of decision trees according to the samples of the design variables and the values of the cost function. The cost function simplifies the analysis by converting the data for multiple metrics into a single number (a single dimension rather than many dimensions). Some implementations fit the data according to non-Bayesian algorithms, using memoryless evolutionary optimization, such as genetic algorithms, coupled simulated annealing algorithms, or differential evolution algorithms.
[0066] Among the samples, the process identifies (318) a vector (i.e., a sample point) that creates the smallest residual (i.e., the smallest value for the cost function). The process uses the random forest and bootstraps to assess variability and confidence in the region around this vector. In particular, the process calculates (320) the statistical power at this sample point to determine the sample size n for the next Latin hypercube. Some implementations use Gaussian-based analysis to perform these steps. Using a Gaussian distribution often works well due to the central limit theorem. Some implementations use the following equation for a Gaussian distribution to determine the next sample size n:
where MSE = mean squared error of the model (e.g., using bootstrap on a random forest), Z_α = Gaussian quantile at the alpha confidence level, Z_β = Gaussian quantile at the beta power level, and error = a constant specified by a user to indicate how small a difference is important. The MSE specifies the confidence.
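As an illustration of how these quantities could combine, the sketch below assumes the textbook power-analysis form n = (Z_α + Z_β)² · MSE / error². That form matches the symbols defined above, but it is an assumption rather than the disclosed equation. The two-sided treatment of alpha and the default values are also assumptions.

```python
import math
from statistics import NormalDist

def next_sample_size(mse, error, alpha=0.05, power=0.80):
    """Assumed textbook form: n = (Z_alpha + Z_beta)^2 * MSE / error^2."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided confidence quantile
    z_beta = NormalDist().inv_cdf(power)               # power quantile
    return math.ceil((z_alpha + z_beta) ** 2 * mse / error ** 2)
```

Under this form, halving `error` roughly quadruples the sample size, while a lower model MSE reduces it. That is consistent with the statement that the MSE specifies the confidence.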
[0067] As an alternative, some implementations use a non-Gaussian distribution based on a binomial, which requires more complex equations.
[0068] Once the variability and confidence are known, the process shrinks (322) the trust region. In some implementations, the shrinkage is according to a user-supplied learning rate multiplier. The shrinkage for each of the design variables is according to variability of the cost function for that design variable. The shrinkage creates (322) a new trust region centered at the best vector, and the region is generally not a hypercube. Using this new trust region and the determined sample size, the process generates (324) a new Latin hypercube of samples, and then iterates the calculations (starting at box 308) for the new Latin hypercube.
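A minimal sketch of the shrink step follows. It assumes that each variable's range contracts by a factor of 1 − learning_rate × |correlation|, where the correlation is taken between that design variable and the model's estimated residuals. The specific shrink rule, the clipping to the previous region, and all names are illustrative assumptions.

```python
import numpy as np

def shrink_trust_region(bounds, samples, est_residuals, best, learning_rate=0.5):
    """Return a new per-variable range centered at the best vector."""
    bounds = np.asarray(bounds, dtype=float)
    samples = np.asarray(samples, dtype=float)
    est = np.asarray(est_residuals, dtype=float)
    best = np.asarray(best, dtype=float)
    # |correlation| between each design variable and the estimated residuals.
    strength = np.array([abs(np.corrcoef(samples[:, k], est)[0, 1])
                         for k in range(samples.shape[1])])
    strength = np.nan_to_num(strength)          # constant columns give NaN -> 0
    # Influential variables shrink most; the factor lies in [1 - learning_rate, 1].
    factor = 1.0 - learning_rate * strength
    half_width = 0.5 * factor * (bounds[:, 1] - bounds[:, 0])
    low = np.maximum(best - half_width, bounds[:, 0])   # stay inside the old region
    high = np.minimum(best + half_width, bounds[:, 1])
    return np.column_stack([low, high])
```

Because each variable receives its own factor, the resulting region has unequal side lengths, in line with the observation that it is generally not a hypercube.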
[0069] During this iterative process, the vector having the least residual can be (326) at a wall (aka border) of the trust region. When this occurs, the optimal solution may be outside the trust region, so the trust region is expanded (326) beyond the wall (e.g., expand in a direction normal to the boundary).
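The wall check and expansion can be sketched as below. The `growth` fraction and the tolerance `tol` used to decide that a coordinate lies on a wall are hypothetical parameters, not values taken from the text.

```python
import numpy as np

def expand_at_walls(bounds, best, growth=0.25, tol=1e-9):
    """Push out any wall of the trust region that the best vector sits on."""
    bounds = np.asarray(bounds, dtype=float).copy()
    best = np.asarray(best, dtype=float)
    span = bounds[:, 1] - bounds[:, 0]
    slack = tol * np.maximum(span, 1.0)
    at_low = np.abs(best - bounds[:, 0]) <= slack
    at_high = np.abs(best - bounds[:, 1]) <= slack
    # Expand normal to the touched wall, by a fraction of that variable's span.
    bounds[at_low, 0] -= growth * span[at_low]
    bounds[at_high, 1] += growth * span[at_high]
    return bounds
```

For instance, with bounds `[[0, 1], [10, 20]]` and a best vector `[1.0, 15.0]`, only the upper wall of the first variable moves, giving `[[0, 1.25], [10, 20]]`.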
[0070] The iterative process continues (328) until the best vector remains the same across multiple iterations. Some implementations specify a predetermined number of generations of unchanging best vector before the process stops. For example, some implementations set the predetermined number of iterations at 3, 5, or 10.
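The stopping rule can be expressed as a small driver loop. In this sketch, `iterate` is a hypothetical callable standing in for one full pass (sampling, physics calculations, model fitting) that returns that pass's best sample and residual, and `patience` plays the role of the predetermined number of iterations.

```python
def run_until_stable(iterate, patience=3, max_iters=100):
    """Stop once the best sample has not improved for `patience` iterations."""
    best_x, best_r, unchanged, it = None, float("inf"), 0, 0
    for it in range(1, max_iters + 1):
        x, r = iterate()          # best sample and residual found in this pass
        if r < best_r:
            best_x, best_r, unchanged = x, r, 0   # new best vector: reset counter
        else:
            unchanged += 1
        if unchanged >= patience:
            break
    return best_x, best_r, it
```

The `max_iters` cap is an added safeguard so the loop terminates even if the best vector keeps changing.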
[0071] Once the process is complete, data from the selected optimization method (e.g., data from the final random forest) is used to provide a user with analytic data about the design and metric variables. Some examples of output are illustrated by the graphics in
[0072]
[0073] Once the covariates are controlled, the process calculates (354) uncertainty at the best vector (based on the random forest). For this best vector, the process identifies (356) all permutations of the categorical variables. The number of categorical variables is usually small, so the number of permutations is not large. Then, the process applies (358) the random forest model with each permutation to assess the probability of the permutation producing a result that is better than the current best. For this step, the process essentially builds new samples that retain all of the non-categorical variables from the best vector, and include each of the permutations of the categorical variables. Each of these new samples is assessed according to the machine learning model (e.g., random forest) to estimate the probability of producing a better result than the current best vector.
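The enumeration and scoring of categorical permutations might be sketched as follows. The `predict` callable is a hypothetical surrogate that returns a mean and standard deviation for a candidate (for example, from the spread of per-tree predictions), and the Gaussian probability of improvement is an assumed way to turn those into a probability.

```python
import itertools
from statistics import NormalDist

def improvement_probabilities(best_continuous, categories, predict, best_residual):
    """Score every combination of categorical levels at the best vector.

    best_continuous: dict of non-categorical variable values at the best vector
    categories: dict mapping each categorical variable name to its levels
    predict: hypothetical surrogate, candidate dict -> (mean, std) of the residual
    """
    names = list(categories)
    probs = {}
    for combo in itertools.product(*(categories[n] for n in names)):
        # Keep the continuous values; swap in this combination of levels.
        candidate = dict(best_continuous, **dict(zip(names, combo)))
        mean, std = predict(candidate)
        # Gaussian assumption: P(candidate residual < current best residual).
        probs[combo] = NormalDist(mean, max(std, 1e-12)).cdf(best_residual)
    return probs
```

With two categorical variables of three levels each, this evaluates nine candidates, which stays cheap because only the surrogate model is queried rather than the physics engines.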
[0074] In general, there will not be the same number of sample points corresponding to each categorical value. Therefore, the process scales (360) standard error by the number of data points in each category in order to avoid mistakenly biased factors that have been downweighted. In some implementations, the probabilities are scaled (362) so that the sum of all the probabilities is 1. The process then uses (364) the probabilities as sampling rates. For example, if a certain categorical value has a higher probability of leading to a better result, it will be used in the sampling at a higher frequency than other categorical values.
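The normalization and use of probabilities as sampling rates can be sketched as below. It also shows one possible way to map a continuous Latin hypercube coordinate in [0, 1) onto categorical levels through cumulative rates. Both function names are hypothetical.

```python
import numpy as np

def sampling_rates(probabilities):
    """Scale the estimated probabilities so that they sum to 1."""
    p = np.asarray(probabilities, dtype=float)
    return p / p.sum()

def bin_levels(unit_column, rates):
    """Map hypercube coordinates in [0, 1) to category indices by cumulative rate."""
    edges = np.cumsum(rates)
    idx = np.searchsorted(edges, unit_column, side="right")
    return np.minimum(idx, len(rates) - 1)   # guard against round-off at the top edge
```

With estimated probabilities in the ratio 3:1:1, the rates become 0.6, 0.2, and 0.2. Ten evenly stratified coordinates are then split 6, 2, and 2 across the three levels, so the more promising level is sampled more often.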
[0075] The process uses (366) the computed sampling rates in the generation of the next Latin hypercube, which splits the levels of each category in proportion to the sampling rates. In other words, the process takes the Latin hypercube, which is set up for continuous values, and “bins” the samples based on the levels in proportion to the probability values calculated by the random forest. The process then repeats (368) until it converges, as described in
[0076] The calculations for the thermal-hydraulics engine 132, the mechanical stability analysis engine 134, and the neutronics engine 136 can be performed at varying levels of fidelity. In some implementations, there are three broad levels of fidelity. At the lowest level of fidelity, samples are tested for gross fitness, and many such samples can be discarded quickly because they are not even close to satisfying the constraints. At the second level of fidelity, calculations are slower, but reasonably accurate. By limiting the number of sample points that use the second level of fidelity, the overall algorithm is able to create design spaces in a matter of hours.
[0077] The third level of fidelity is applied outside the scope of the disclosed processes. This highest level of fidelity is used to validate a design so that it meets the strict requirements specified by government regulations. Because the disclosed processes produce very good design spaces, it is expected that the highest fidelity algorithms (which may take weeks or months to run) will validate designs that are within the generated design space.
[0078] Typical high fidelity CFD calculations incorporate the interaction between fluids and solid conduction through a conjugate heating calculation. However, CFD calculations resolve the fluid boundary layer, which can be time consuming. Instead, a 3D FEM conduction mesh can be coupled with an FDM 1D axial flow or 1D porous media calculation to accelerate convergence of a physics solution. Coupling is achieved by implementing a pseudo-time step and iterating on the FEM/FDM convergence. Many design calculations require either pressure equalization across all flow channels or a search on flow to achieve a desired boundary condition criterion. This flow searching or pressure equalization can also be very time consuming. Thus, the code leverages the pseudo-time step, iteratively modifying the searched parameter while converging on a solution, both to achieve a converged physics solution and to meet the goal of flow searching or pressure equalization. This process results in a significant decrease in overall run time. The same concept used to search on flow can also be applied to fluid density, static pressure, fluid temperatures, power, or any other parameter in the thermal-hydraulics or structural physics calculation.
[0079] Neutronics calculations can also utilize an FEM mesh to determine neutronics performance for complex designs not based on “primitive” shapes.
[0080]
[0081] In
[0082]
[0083]
[0084]
[0085] Some implementations also include batch correlations and plots showing what happened in the previous iteration. In this way, as the optimization is running, users can see what is going on (rather than waiting until the optimization is complete).
[0086] The terminology used in the description of the invention herein is for the purpose of describing particular implementations only and is not intended to be limiting of the invention. As used in the description of the invention and the appended claims, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the term “and/or” as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, elements, components, and/or groups thereof.
[0087] The foregoing description, for purpose of explanation, has been described with reference to specific implementations. However, the illustrative discussions above are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The implementations were chosen and described in order to best explain the principles of the invention and its practical applications, to thereby enable others skilled in the art to best utilize the invention and various implementations with various modifications as are suited to the particular use contemplated.