Decisions with big data

11501042 · 2022-11-15

Abstract

This invention presents a framework for applying artificial intelligence to aid with product design, mission planning or retail planning. The invention outlines a novel approach for applying predictive analytics to the training of a system model for product design, likens the definition of meta-data for design containers to that of labels for books in a library, and represents customers, requirements, components and assemblies in the form of database objects with relational dependence. By providing such a database representation of the design content, design information can be harvested for the purpose of improving decision fidelity for new designs. Further, a retrieval model is presented that operates on the archived design containers and yields results that are likely to satisfy user queries. This model, which is based on latent semantic analysis, predicts the degree of relevance between accessible design information and a query, and presents the most relevant previous design information to the user.

Claims

1. An apparatus for predictive analytics, one that utilizes vector entities archived from past design or planning projects, for the purpose of facilitating decision making on new design or planning projects, the apparatus comprising: one or more processors; memory coupled to the one or more processors and storing instructions which, when executed by the one or more processors, cause the one or more processors to perform operations comprising: operations for accessing a database for storing the vector entities, with one vector entity representing an input vector from a past design or planning project, and another vector entity representing a corresponding known good output vector, with the known good output vector being defined as an output vector that fulfills the original requirements imposed on the archived past design or planning project; operations involving a system model, where the system model relates the output vector to the input vector, where the system model is derived from the archived vector entities, where inputs to the system model represent intended features while outputs represent observed features, where the system model supports input vectors with elements taking on continuous or categorical values, whether or not the values are confined to a range, and where the system model approximates a generic continuous-valued system model with a two-layer feed-forward neural network; operations for prediction, to produce a guiding reference for a new design or planning project, where the guiding reference is obtained as output from the system model when applying an input vector from the new design or planning project as input; the database with relational dependence between objects, to support fast and efficient searches, even in case of large design or planning project archives, with fast and efficient searches being defined in terms of speed improvements resulting from the ability of relational databases to offer indexed searches with sub-linear-time lookups, compared to linear-time lookups for some non-relational databases, and with the large design or planning project archives defined as consisting of at least hundreds of design or planning projects archived, wherein the database has a hierarchical structure, and wherein the database is capable of storing at least hundreds of design or planning projects.

2. The apparatus according to claim 1, wherein the composition of said vector entities includes the input vector, which captures a product design specification, and which further comprises a requirement object associated with a customer object; and the output vector, which captures design data observed, and which further comprises component objects, sub-assembly objects, assembly objects, component or assembly design options, descriptors related to the properties of the component or assembly objects, metadata related to hierarchical dependence between the component and assembly objects, metadata related to design risks identified, metadata related to mitigation of design risks, metadata related to calculations or analysis of design components, sub-assemblies or assemblies, metadata related to results from design testing, metadata related to requirement validation, metadata related to a bill of materials for the design, metadata related to manufacturing options for the design, metadata related to cost of the design, metadata related to design reviews or design documentation, or metadata related to revisions of the design.

3. The apparatus according to claim 2, wherein the definition of the customer object includes a specification of a unique identifier for the customer, a specification of a name for the customer.

4. The apparatus according to claim 2, wherein the definition of the requirement object includes a specification of a unique identifier for the requirement, a specification of a name for the requirement, a specification of a customer with whom the requirement is associated.

5. The apparatus according to claim 2, wherein the definition of the assembly objects includes a specification of a unique identifier for each assembly object, a specification of a name for each assembly object, a specification of the requirement, or requirements, that each assembly object addresses.

6. The apparatus according to claim 2, wherein the definition of the component objects includes a specification of a unique identifier for each component object, a specification of a name for each component object, a specification either of the requirement, or requirements, that each component object addresses, or of the assembly, sub-assembly, assemblies or sub-assemblies, that the component corresponding to each component object is a part of.

7. The apparatus according to claim 2, wherein the objects in the database with the hierarchical structure include a master assembly object, which specifies the requirements that the master level object addresses; sub-assembly objects and components, which depend on the master assembly object, and which specify the requirements that the sub-assembly or component objects address; sub-sub-assembly objects and components, which depend on the sub-assembly objects, and which specify the requirements that these sub-sub-assembly or component objects address; and subsequent levels of hierarchical representation of assembly objects and components, together with the requirements that these objects address, as needed to represent a design with the accuracy and detail desired.

8. The apparatus according to claim 2, wherein the database is structured such that component options and risks are programmed into the database based on existing engineering knowledge.

9. The apparatus according to claim 2, wherein the apparatus interfaces with APACHE HADOOP, APACHE SPARK, APACHE FLINK, or another software framework for distributed processing or storage of large data sets, through application program interfaces provided by the software frameworks.

10. A method for predictive analytics, one that utilizes vector entities archived from past design or planning projects, for the purpose of facilitating decision making on new design or planning projects, the method comprising: a database access step, for accessing the vector entities, with one vector entity representing an input vector from a past design or planning project, and another vector entity representing a corresponding known good output vector, with the known good output vector being defined as an output vector that fulfills the original requirements imposed on the design or planning project archived, with relational dependence between objects stored in the database accessed, to support fast and efficient searches, even in case of large design or planning project archives, with fast and efficient searches being defined in terms of speed improvements resulting from the ability of relational databases to offer indexed searches with sub-linear-time lookups, compared to linear-time lookups for some non-relational databases, and with the large design or planning project archives defined as consisting of at least hundreds of design or planning projects archived, wherein the database has a hierarchical structure, and wherein the database is capable of storing at least hundreds of design or planning projects; a system model access step, for relating the output vector to the input vector, where the system model is derived from the archived vector entities, where inputs to the system model represent intended features while outputs represent observed features, where the system model supports input vectors with elements taking on continuous or categorical values, whether or not the values are confined to a continuous range, and where the system model approximates a generic continuous-valued system model with a two-layer feed-forward neural network; and a prediction step, for producing a guiding reference for a new design or planning project, the reference obtained as output from the system model when applying an input vector from the new design or planning project as input.

11. The method according to claim 10, wherein the composition of said vector entities includes the input vector, which captures a product design specification, and which further comprises a requirement object associated with a customer object; and the output vector, which captures design data observed, and which further comprises component objects, sub-assembly objects, assembly objects, component or assembly design options, descriptors related to the properties of the component or assembly objects, metadata related to hierarchical dependence between the component and assembly objects, metadata related to design risks identified, metadata related to mitigation of design risks, metadata related to calculations or analysis of design components, sub-assemblies or assemblies, metadata related to results from design testing, metadata related to requirement validation, metadata related to a bill of materials for the design, metadata related to manufacturing options for the design, metadata related to cost of the design, metadata related to design reviews or design documentation, or metadata related to revisions of the design.

12. The method according to claim 11, wherein the definition of the customer object includes a specification of a unique identifier for the customer, a specification of a name for the customer.

13. The method according to claim 11, wherein the definition of the requirement object includes a specification of a unique identifier for the requirement, a specification of a name for the requirement, a specification of a customer with whom the requirement is associated.

14. The method according to claim 11, wherein the definition of the assembly objects includes a specification of a unique identifier for each assembly object, a specification of a name for each assembly object, a specification of the requirement, or requirements, that each assembly object addresses.

15. The method according to claim 11, wherein the definition of the component objects includes a specification of a unique identifier for each component object, a specification of a name for each component object, a specification either of the requirement or requirements, that each component object addresses, or of the assembly, sub-assembly, assemblies or sub-assemblies, that the component corresponding to each component object is a part of.

16. The method according to claim 11, wherein the objects in the database with the hierarchical structure include a master assembly object, which specifies the requirements that the master level object addresses; sub-assembly objects and components, which depend on the master assembly object, and which specify the requirements that the sub-assembly or component objects address; sub-sub-assembly objects and components, which depend on the sub-assembly objects, and which specify the requirements that these sub-sub-assembly or component objects address; and subsequent levels of hierarchical representation of assembly objects and components, together with the requirements that these objects address, as needed to represent a design with the accuracy and detail desired.

17. The method according to claim 11, wherein the database is structured such that component options and risks are programmed into the database based on existing engineering knowledge.

18. The method according to claim 10, wherein the method employs comparison of the distribution of each data variable, either by using quantiles or another statistical test, to determine whether the distributions are significantly different, or counting of occurrences of each label or category and comparing the counts, or a distance measure, consisting of the Mahalanobis distance, a Euclidean distance, a Manhattan distance, a Minkowski distance, a Chebyshev distance, a Hamming distance, a Chi-square distance, a cosine similarity, a Jaccard similarity, a Sorensen-Dice index, a global distance metric or a local distance metric, and looking for large changes, or applying an absolute difference between new and old data, setting a threshold, and reporting everything exceeding the threshold, or a multi-dimensional technique, comprising a correlation matrix, principal components, relevant components, clustering, k-nearest neighbors, k-means or looking for changes, or a statistical or machine learning model, specialized for anomaly detection, consisting of supervised learning, unsupervised learning, regression, dimensionality reduction, ensemble methods, neural networks, principal component analysis, decision trees, support vector machines, t-distributed stochastic neighbor embedding, isolation forests, random forests, naïve Bayes, peer group analysis, break point analysis, transfer learning, or reinforcement learning.

19. The method according to claim 10, wherein the method utilizes an application program interface access step, for generation of an inspection plan from recognized features.

Description

DESCRIPTION OF THE DRAWINGS

(1) FIG. 1 outlines the framework for applying predictive analytics to specific areas of engineering product design, such as mechanical design. The same framework can be extended to other areas of engineering design. The input requirement vector (“features desired”) is expected to stay largely the same, but the composition of the output vector (“features observed”) will likely be revised in accordance with the new output.

(2) FIG. 2 presents a generic method for associating data files from Detailed Design with the engineering requirements. Such a method is needed for the purpose of informing the automatic assessment of which data files to examine in order to obtain the information needed for assessment of given engineering requirements.

(3) FIG. 3 presents the schematics of a logistic regression classifier in the form of a single-layer feed-forward neural network.

(4) FIG. 4 offers a neural network representation of Kolmogorov's theorem. A universal transformation M maps R^n into several uni-dimensional transformations (Hassoun 1995). We draw upon Kolmogorov's theorem to illustrate the generality of our system model. Kolmogorov's theorem provides the theoretical foundation for the capability of a two-layer feed-forward neural network to approximate any continuous function.

(5) FIG. 5 presents schematics of a multi-layer perceptron network.

(6) FIG. 6a and FIG. 6b provide visualization of the convex nature of the multivariate linear regression problem, for deriving the system model, when one or more input requirements are limited to a single continuous range (a range limited by a minimum and a maximum value).

(7) FIG. 7 highlights parallels between the strike mission planning process and the design process. The user interface for a multi-strike mission planning system may be crafted through minor modification of the Ecosystem interface for engineering design.

(8) FIG. 8 highlights the cost savings that can be derived from early identification of design oversights, in the context of the V-model for system engineering.

(9) FIG. 9 illustrates how the Ecosystem design framework can provide decision support across mission domains.

(10) FIG. 10 presents key elements from a generic retail management process.

(11) FIG. 11 presents further specifics for a generic inventory management system. The big data framework can help with the forecasting of category sales (Step 1).

(12) FIG. 12 provides an example of customer database objects. This drawing is adapted from (SteingrimssonYi 2017), but with additional context provided. The database objects for mechanical product design, along with their corresponding attributes, are defined based on function.

(13) FIG. 13 offers an example of a requirement database object, together with its relational association with a customer object. Here we assume, for simplicity, that a given requirement originates from a single customer.

(14) FIG. 14a and FIG. 14b provide an illustration of a design database with assembly and component objects, together with its relational association with a requirement object. Here we assume that a component or assembly object can address many requirements, and that a given requirement may appear in many components or assemblies. Again, the database objects for mechanical product design, along with their corresponding attributes, are defined based on function.

(15) FIG. 15 illustrates how a master assembly object (a Level 0 object), along with the requirements that it addresses, can be represented in relation with sub-assemblies and components (Level 1 objects), along with the requirements that the sub-assemblies or components address. FIG. 15 similarly shows how the Level 1 objects can be represented in relation to corresponding Level 2 objects (together with the requirements that the Level 2 objects address). FIG. 15 further provides an illustration of how component options and risks, for an overall design, can be programmed into databases based on existing engineering knowledge (for example, awareness, from machine design texts, of risks for certain components or uses). Analogous to FIG. 12-FIG. 14, this drawing is replicated from (SteingrimssonYi 2017), but with additional context provided.

(16) FIG. 16 provides an illustration of the process of retrieving previous e-design history relevant to a user query. We refer to this process as the querying engine. FIG. 16 also clarifies the relationship between the querying engine and the predictive analytics framework in FIG. 1.

(17) FIG. 17 illustrates how the big data framework can be applied to the Concept Design stage of a project involving the design of a single-person Go Kart lift stand.

(18) FIG. 18 presents detailed explanations related to application of the process of retrieving previous e-design history relevant to the user input query from FIG. 16.

(19) FIG. 19 outlines the high-level framework for applying predictive analytics to automated planning of aerial strike missions.

(20) FIG. 20 presents detailed explanations related to application of the process of retrieving previous e-design history relevant to an input query from an aerial strike mission planner (an input query associated with FIG. 19).

(21) FIG. 21 describes application of the framework for predictive analytics to automated surface or underwater mission planning.

(22) FIG. 22 presents detailed explanations related to application of the process of retrieving previous e-design history relevant to an input query from a surface or underwater mission planner (an input query associated with FIG. 21).

(23) FIG. 23 presents application of the framework for predictive analytics to retail planning and supply chain management.

(24) FIG. 24 presents the machine learning framework as a high-level framework of learning methods overseeing JMPS. The big data framework provides advice to JMPS, based on the outcomes and performance data provided.

(25) For the case of surface or underwater mission planning, FIG. 25 presents the big data framework as a high-level framework, containing machine learning methods that interface with the Shipboard MEDAL, the CMWC MEDAL, and the data warehouse.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

1. Definitions

(26) Table 2 captures the primary acronyms used in the patent.

(27) TABLE 2. Summary of the primary definitions and acronyms.

    AI      Artificial Intelligence
    CAG     Commander, Air Group
    CNN     Convolutional Neural Network
    DCC     Dewey Decimal Classification
    DL      Deep Learning
    DOD     Department of Defense
    FLOP    Floating Point Operation
    IR      Information Retrieval
    JMPS    Joint Mission Planning System
    LIDAR   Light Imaging Detection And Ranging
    LMS     Least Mean Square
    LSA     Latent Semantic Analysis
    LSI     Latent Semantic Indexing
    MEDAL   Mine Warfare Environmental Decision Aid Library
    ML      Machine Learning
    OEM     Original Equipment Manufacturer
    PDF     Probability Density Function
    PDS     Product Design Specification
    PLM     Product Lifecycle Management
    SLAM    Simultaneous Localization and Mapping
    SVD     Singular Value Decomposition
    SW      Software

(28) We define artificial intelligence as the use of computers to mimic the cognitive functions of humans. When machines carry out tasks based on algorithms in an “intelligent” manner, that is AI. Artificial intelligence is a broader concept than machine learning (DataScienceCentral 2018).

(29) We define machine learning as a subset of AI that focuses on the ability of machines to receive a set of data and learn for themselves, and change algorithms as they learn more about the information that they are processing (DataScienceCentral 2018).

(30) We refer to deep learning as a subset of machine learning. We define deep learning in terms of “deep neural networks”, i.e., neural networks comprising two or more layers. Deep learning networks need to see large quantities of items in order to be trained (DataScienceCentral 2018).

(31) We define a “known, good design” as a design that has satisfied all the requirements. This is a design that has the requirement vector fully specified, and where the design produced (the output vector) fulfills the design requirements.

(32) Supervised learning is a data mining task that involves inference of a function from labeled training data.

(33) Unsupervised learning is a type of machine learning algorithm used to draw inferences from datasets consisting of input data without labeled responses.

(34) Reinforcement learning is an area of machine learning concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward.

2. Best Mode of the Invention

(35) FIG. 1, FIG. 14, FIG. 15, FIG. 16, FIG. 19, FIG. 21 and FIG. 23 capture the best mode contemplated by the inventors, according to the concepts of the present invention.

3. The Generic System Model Assumed

(36) We assume a generic system model:
\tilde{y} = f(\tilde{x}).  (1)

(37) The input vector, x̃, could be considered as consisting of design variables (design targets or requirements). The transformation, f(·), could be a non-linear function of the input, x̃. Engineers transfer the requirements, x̃, into the product, ỹ, through the transformation. The transformation, f(·), may contain reasoning and knowledge on how to make ỹ (on how to design or produce ỹ). To some extent, we treat the making of ỹ from x̃ as a black box. In the case of the design process, the transformation, f(·), may consist of customers, requirements, systems and assemblies. We present artificial intelligence and supervised learning as one of the options for training the system, as explained below.

(38) The input vector, x̃, could capture design criteria, such as the desired weight, width, height and length of an automotive part (the “intended features”). The elements of the product vector, ỹ, could capture performance of the finalized part, or even ideas or options relevant to specific design stages (the “observed features”). The elements of x̃, and even ỹ, could be derived from the 24 categories listed in Table 1.

(39) For clarification, refer to the examples below. It is assumed that the organization adopting the invention has practiced structured capture of the binders (x̃, ỹ) from past projects, for example in SW like the Ecosystem (SteingrimssonYi 2017), (Steingrimsson 2014). The binders from the past projects may, for example, have been archived in an internal database.

(40) This invention offers a scalable solution, depending on the size of the input data. In the case of small (design) databases, we present regression as a suitable tool for determining the system model. For large (design) databases, say, hundreds, thousands or millions of (x̃, ỹ) pairs, we present neural networks as a suitable tool for determining the system model.

(41) 3.1 Multivariate Linear Regression

(42) In case of archived binders of relatively small-to-modest size, or even of moderately large size, we recommend applying a multivariate (multiple) linear regression model. Drawing upon FIG. 1, we model the archived binders, in this case, as
\tilde{y}_{ik} = b_{0k} + \sum_{j=1}^{p} b_{jk} \tilde{x}_{ij} + e_{ik}  (2)
for i ∈ {1, . . . , n} and k ∈ {1, . . . , m}. Here,

(43) ỹ_ik ∈ R is the k-th real-valued response for the i-th observation.

(44) b_0k ∈ R is the regression intercept for the k-th response.

(45) b_jk ∈ R is the j-th predictor's regression slope for the k-th response.

(46) (e_i1, . . . , e_im) ~ N(0_m, Σ) is a multivariate Gaussian error vector.

(47) The archived binders can be stacked up into a matrix and represented as

(48)
\begin{bmatrix} \tilde{y}_{11} & \cdots & \tilde{y}_{1m} \\ \tilde{y}_{21} & \cdots & \tilde{y}_{2m} \\ \tilde{y}_{31} & \cdots & \tilde{y}_{3m} \\ \vdots & & \vdots \\ \tilde{y}_{n1} & \cdots & \tilde{y}_{nm} \end{bmatrix} = \begin{bmatrix} 1 & \tilde{x}_{11} & \tilde{x}_{12} & \cdots & \tilde{x}_{1p} \\ 1 & \tilde{x}_{21} & \tilde{x}_{22} & \cdots & \tilde{x}_{2p} \\ 1 & \tilde{x}_{31} & \tilde{x}_{32} & \cdots & \tilde{x}_{3p} \\ \vdots & & & & \vdots \\ 1 & \tilde{x}_{n1} & \tilde{x}_{n2} & \cdots & \tilde{x}_{np} \end{bmatrix} \begin{bmatrix} b_{01} & \cdots & b_{0m} \\ b_{11} & \cdots & b_{1m} \\ b_{21} & \cdots & b_{2m} \\ \vdots & & \vdots \\ b_{p1} & \cdots & b_{pm} \end{bmatrix} + \begin{bmatrix} e_{11} & \cdots & e_{1m} \\ e_{21} & \cdots & e_{2m} \\ e_{31} & \cdots & e_{3m} \\ \vdots & & \vdots \\ e_{n1} & \cdots & e_{nm} \end{bmatrix}  (3)
\tilde{Y} = \tilde{X} B + E.  (4)

(49) The ordinary least squares problem is
\min_{B \in R^{(p+1) \times m}} \| \tilde{Y} - \tilde{X} B \|^2  (5)
where ∥•∥ denotes the Frobenius norm (Helwig 2017).

(50) The ordinary least squares problem has a solution of the form (Helwig 2017)
\hat{B} = (\tilde{X}^T \tilde{X})^{-1} \tilde{X}^T \tilde{Y}  (6)

(51) A new project will give rise to the input vector x from which the guiding reference y is obtained as
y = \hat{B}^T x.  (7)

(52) Note that the multivariate regression offers a deterministic solution (no convergence problems).
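For illustration only (not part of the claimed method), the least-squares pipeline of (2)-(7) can be sketched in a few lines of Python. The dimensions, seed and synthetic archive below are illustrative assumptions, not values from the patent:

```python
import numpy as np

# Minimal sketch of the multivariate linear regression of (2)-(7).
# Synthetic archive: n = 200 past projects, p = 3 input requirements,
# m = 2 observed output features (all values are illustrative).
rng = np.random.default_rng(0)
n, p, m = 200, 3, 2

X_raw = rng.uniform(0.0, 1.0, size=(n, p))        # input vectors x~
X = np.hstack([np.ones((n, 1)), X_raw])           # prepend intercept column, as in (3)
B_true = rng.normal(size=(p + 1, m))              # "true" coefficients
Y = X @ B_true + 0.01 * rng.normal(size=(n, m))   # outputs y~ with Gaussian error

# Ordinary least squares solution (6); lstsq is numerically preferable
# to forming the inverse of X^T X explicitly.
B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Guiding reference (7) for a new project with input vector x
x_new = np.concatenate([[1.0], rng.uniform(0.0, 1.0, size=p)])
y_ref = B_hat.T @ x_new                           # one predicted value per output feature
```

As the text notes, this solution is deterministic: the estimate follows directly from the archived binders, with no iterative convergence involved.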

(53) 3.2 Other Regression Analyses Available

(54) While multivariate linear regression may be a natural choice when regression analysis is preferred for determining the system model, in particular for continuous and real-valued inputs, there are other options available. These include:

(55) 1. Polynomial regression (WikiPolynomialRegression 2018).

(56) 2. Logistic regression (WikiLogisticRegression 2018).

(57) 3. Multinomial logistic regression (WikiMultinomialLogisticRegression 2018).

(58) 4. Ordinal regression (WikiOrdinalRegression 2018).

(59) 5. Other types of nonlinear regression (PennState 2018), (WikiNonlinearRegression 2018).
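As a hedged sketch of option 1 above, polynomial regression for a single requirement and a single output can be fit with NumPy alone. The data and the quadratic coefficients below are illustrative assumptions:

```python
import numpy as np

# Polynomial regression sketch: fit y ~ c2*x^2 + c1*x + c0 to noisy
# observations of an illustrative quadratic system model.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 2.0, 50)
y = 1.0 + 2.0 * x - 0.5 * x**2 + 0.01 * rng.normal(size=x.size)  # synthetic data

coeffs = np.polyfit(x, y, deg=2)   # returns [c2, c1, c0], highest degree first
y_new = np.polyval(coeffs, 1.5)    # guiding reference at a new input value
```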

(60) 3.3 AI Predictor (Neural Network)

(61) Neural networks are typically preferred over linear regression in cases where the input data size is very large. Neural networks may be used for image classification, for example to detect an event or artifact in a sample of images, each of which may contain approx. 10^6 pixels, each with, say, 64 color values. Neural networks are similarly suitable for identification of events in sampled audio, where the input may occupy a gigantic space of time and frequency. Neural networks may be preferred in the case of a very large set of archived binders.

(62) Neural networks currently used in machine learning include Feed-Forward Neural Networks, Radial Basis Function Neural Networks, Kohonen Self-Organizing Neural Networks, Recurrent Neural Networks, Convolutional Neural Networks and Modular Neural Networks. Other neural networks supported by this invention include Deep Feed Forward Networks, Long Short-Term Memory networks, Gated Recurrent Units, Auto Encoders, Variational Auto Encoders, Denoising Auto Encoders, Sparse Auto Encoders, Markov Chains, Hopfield Networks, Boltzmann Machines, Restricted Boltzmann Machines, Deep Belief Networks, Deep Convolutional Networks, Deconvolutional Networks, Deep Convolutional Inverse Graphics Networks, Generative Adversarial Networks, Liquid State Machines, Extreme Learning Machines, Echo State Networks, Deep Residual Networks, Kohonen Networks, Support Vector Machines and Neural Turing Machines.

(63) Even so, the main emphasis here is on Single and Two-Layer Feed-Forward Neural Networks. We show how Two-Layer Feed-Forward Neural Networks can approximate any system model.

(64) For simplicity, it is assumed that input parameters take on values from a continuous range.

(65) 1. Single-Layer Feed-Forward Neural Network

(66) In a feed-forward neural network, the connections between the nodes do not form a cycle. The simplest kind of neural network is a single-layer perceptron network, which consists of a single layer of output nodes; the inputs are fed directly to the outputs via a series of weights. The sum of the products of the weights and the inputs is calculated in each node, and if the value is above a given threshold (typically 0) the neuron fires and takes on the activated value (typically 1). Otherwise, it takes on the deactivated value (typically −1). Perceptrons can be trained by a simple learning algorithm that is usually referred to as the Delta Rule (see (16)). It calculates the errors between calculated output and sample output data, and uses this to create an adjustment to the weights, thus implementing a form of gradient descent. Single-layer perceptrons are only capable of learning linearly separable patterns.

(67) A common choice for the activation function is the sigmoid (logistic) function:

(68) f(z) = \frac{1}{1 + e^{-z}}.  (8)
With this choice, the single-layer network shown in FIG. 3 is identical to logistic regression.

(69) A single-layer neural network has guaranteed convergence (equivalent to regression).
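The single-layer network of FIG. 3 with the sigmoid activation (8), trained by a delta-rule-style gradient step, can be sketched as below. The data, learning rate and iteration count are illustrative assumptions; with this activation the model is the logistic regression noted above:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))   # activation function (8)

# Illustrative, linearly separable data: label = 1 when x0 + x1 > 0.
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 2))
labels = (X[:, 0] + X[:, 1] > 0).astype(float)

w = np.zeros(2)
b = 0.0
lr = 0.5
for _ in range(500):
    out = sigmoid(X @ w + b)
    err = labels - out               # error between sample and calculated output
    w += lr * (X.T @ err) / len(X)   # delta-rule weight adjustment (gradient descent)
    b += lr * err.mean()

accuracy = float(((sigmoid(X @ w + b) > 0.5) == (labels > 0.5)).mean())
```

Because the pattern is linearly separable, the single-layer network learns it reliably, consistent with the convergence remark above.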

(70) 2. Multi-Layer Feed-Forward Neural Network

(71) It may be recognized that a straightforward application of a single-layer neural network model cannot accommodate all requirements. To handle binary requirements (simple presence or absence), XOR-like conditions, or categorical requirements, a two-layer neural network may be necessary (Duda 2001).

(72) A multi-layer feed-forward neural network, shown in FIG. 5, consists of multiple layers of computational units, interconnected in a feed-forward fashion. Each neuron in one layer has directed connections to the neurons of the subsequent layer. In many applications the units of these networks apply the sigmoid function (8) as the activation function (WikiFFNeuralNet 2018). Similar to the single-layer case, the activation is defined as
a_j^{(i)} = f\left( \sum_k w_{ik} x_{kj} + b_i \right).  (9)

(73) Convergence of a multi-layer neural network involves non-convex optimization, and hence is not guaranteed; the optimization can become stuck in local minima.
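A two-layer feed-forward network in the spirit of FIG. 5, trained by backpropagation on the XOR pattern that a single-layer network cannot learn, can be sketched as follows. The architecture, seed and learning rate are illustrative assumptions; as the text notes, convergence to the global minimum is not guaranteed, so the sketch only checks that training reduces the loss:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))   # activation function (8)

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])            # XOR targets

rng = np.random.default_rng(3)
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)     # hidden layer (cf. eq. (9))
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)     # output layer

def forward(X, W1, b1, W2, b2):
    h = sigmoid(X @ W1 + b1)
    return h, sigmoid(h @ W2 + b2)

_, out = forward(X, W1, b1, W2, b2)
loss_before = float(np.mean((out - y) ** 2))

lr = 1.0
for _ in range(5000):
    h, out = forward(X, W1, b1, W2, b2)
    d_out = (out - y) * out * (1 - out)           # backpropagated output error
    d_h = (d_out @ W2.T) * h * (1 - h)            # hidden-layer error
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(0)
    W1 -= lr * X.T @ d_h;  b1 -= lr * d_h.sum(0)

_, out = forward(X, W1, b1, W2, b2)
loss_after = float(np.mean((out - y) ** 2))
```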

(74) 3.4 Approximation Capabilities of Feed-Forward Neural Networks for Continuous Functions

(75) According to Kolmogorov, any continuous, real-valued function can be modeled in the form of a two-layer neural network. More specifically, Kolmogorov showed in 1957 that any continuous real-valued function f(x_1, x_2, . . . , x_n) defined on [0, 1]^n, with n ≥ 2, can be represented in the form
y = f(x_1, x_2, . . . , x_n) = \sum_{j=1}^{2n+1} g_j\left( \sum_{i=1}^{n} \phi_{ij}(x_i) \right)  (10)
where the g_j's are properly chosen functions of one variable, and the \phi_{ij}'s are continuous, monotonically increasing functions independent of f. FIG. 4 offers a neural network representation of Kolmogorov's theorem (Hassoun 1995).

(76) While users are welcome to use neural networks of three or more layers, we recommend limiting the networks to two layers, for the sake of limiting complexity. Yet, even with two layers, this invention can approximate a generic system model, per Kolmogorov's theorem.

(77) 3.5 Multivariate Linear Regression when the Input Requirements Are Defined in Terms of Ranges

(78) When one or more of the inputs is limited to a single, continuous range,
x_j ∈ [x_(j,min), x_(j,max)],  (11)
the multivariate linear regression problem can be formulated as

(79) min_(B∈R^((p+1)×m), {tilde over (X)}∈Conv(P)⊂R^(p+1)) ‖{tilde over (Y)} − {tilde over (X)}B‖^2.  (12)

(80) Here, Conv(P) ⊂ R^(p+1) is the convex hull defined by the (p+1) elements comprising the vector {tilde over (X)}.

(81) It is important to recognize that when any given element of the input vector {tilde over (X)} is confined to a single, continuous range, the resulting vector subset forms a convex polytope in R^(p+1). FIG. 6 illustrates, for a simple case, that one can travel between any two points in the polytope while staying within the set.

(82) While the optimization problem in (12) may not have a closed-form solution, its convexity means that a solution can be generated using efficient interior-point solvers, with polynomial worst-case complexity (WikiInteriorPoint 2018):
O((p+1)^3.5).  (13)
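A range-constrained least-squares problem in the spirit of (12) can be sketched, for example, with SciPy's bounded least-squares solver (the data and bounds are illustrative; SciPy's `lsq_linear` uses a trust-region reflective method by default, though interior-point solvers apply equally to this convex problem):

```python
import numpy as np
from scipy.optimize import lsq_linear

rng = np.random.default_rng(0)

# Synthetic system: minimize ||A x - y||^2 subject to box bounds on x.
# The box [x_min, x_max] is a convex polytope, so the problem stays convex.
A = rng.normal(size=(50, 4))
x_true = np.array([0.5, -0.2, 0.9, 0.1])
y = A @ x_true + 0.01 * rng.normal(size=50)

# Confine every element of the solution to the range [-0.5, 0.5].
res = lsq_linear(A, y, bounds=(np.full(4, -0.5), np.full(4, 0.5)))
print(res.x)  # all elements lie in [-0.5, 0.5]; the third is pinned at the 0.5 bound
```

Because the true third coefficient (0.9) lies outside the permitted range, the constrained solution sits on the boundary of the feasible set, the situation discussed in the sensitivity analysis of Section 3.7.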
3.6 Assigning Importance Levels (Priorities) to Specific Requirements

(83) The optimization problem in (12) can be extended not only to support ranges, but also priorities assigned to specific requirements:

(84) min_(B∈R^((p+1)×m), {tilde over (X)}∈Conv(P)⊂R^(p+1)) ‖{tilde over (Y)} − {tilde over (X)}{tilde over (W)}B‖^2.  (14)

(85) Here {tilde over (W)} is a p×p diagonal matrix with element w.sub.jj specifying the priority associated with input requirement j, x.sub.j. The requirements with higher priority receive higher weight. The optimization problem is still linear and convex.

(86) 3.7 Sensitivity Analysis

(87) In case it is of interest, the interior-point methods that can be used to solve the optimization problems in (12) and (14) can also inform the designers of the relative contributions of given input requirements to the objective function. The sensitivity to a given input requirement is defined as the derivative of the objective function with respect to that requirement. The sensitivities arise as Lagrange multipliers that are produced as a by-product of the interior-point solvers. Even when the solution already sits on a boundary, these derivatives may inform designers about variables that still can change within the feasible set.

(88) 3.8 Input Requirements Defined in Terms of Categories

(89) Once an input requirement consists of categories, the optimization problem becomes NP-hard, meaning the run time of a solver is no longer guaranteed to be polynomial in the size of the problem. For categorical (discrete) inputs, there may be cases that can be solved quickly, but in general the problem may be subject to exponential worst-case complexity (WikiNpHard 2018).

4. The Generic Processes Assumed

(90) 4.1 Design Process

(91) It is assumed that a classical design process consists of the following stages: Requirement Gathering → Concept Design → Detailed Design → Final Design.

(92) Such a process is modeled in the Ecosystem SW (SteingrimssonYi 2017), (Steingrimsson 2014). The customers, customer requirements and corresponding engineering requirements are defined as a part of the Requirement Gathering and captured in the Product Design Specification. The Concept Design consists of brainstorming, concept design analysis (scoring) and design selection. The Detailed Design may capture detailed analysis of both the overall system and the associated subsystems. The Final Design usually involves preparation for prototype building or production, and may include steps such as testing and requirement validation (SteingrimssonYi 2017), (Steingrimsson 2014).

(93) 4.2 Strike Mission Planning Process

(94) 1. Similarities with the Design Process

(95) Mission and strike planning are complex processes, integrating specific performance characteristics for each platform into a comprehensive mission.

(96) As FIG. 7, Table 3 and Table 4 illustrate, the Navy's tactical aircraft strike planning process bears a high degree of resemblance to the engineering design process presented in (Steingrimsson 2014) and (SteingrimssonYi 2017). Hence, the Ecosystem interface of (Steingrimsson 2014) and (SteingrimssonYi 2017) can be applied to the strike planning process with relatively modest alterations.

(97) TABLE 3. Similarities between the design process and the mission strike planning process.

  Design Process         Strike Planning Cycle
  Requirement Gathering  1. Receive tasking; 2. Task strike teams
  Concept Design         3. Brainstorm rough plan; 4. Brief CAG
  Detailed Design        5. Create detailed plan; 6. Conduct briefings
  Final Design           7. Execute the mission; 8. Gather bomb damage assessment

(98) TABLE 4. Similarities between the Ecosystem software for engineering design and the Navy strike mission planning.

  Ecosystem                                    Navy Strike Mission Planning
  Design process                               Tactical aircraft strike planning process
  Design decision                              Planning decision
  Design project                               Combat search & rescue mission
  Requirement gathering                        Mission planning
  Project sponsor                              National Command Authority, Joint Chiefs of Staff, Commanders in Chief (CAG)
  Supervisor or instructor                     Strike leader
  Design team formation                        Strike team formation
  Ecosystem: High-level SW for design process  JMPS: High-level SW for joint planning

2. Differences
1. The typical Military Personnel Center conducts mission planning from start to finish in an approximately 8-hour window, whereas engineering design projects are usually completed over a period of a few to several months, or even years.
2. In combat, the crew that plans a mission is typically not the crew that executes it, so decisions are sometimes made to enable as much flexibility as possible and to delegate decision authority to the lowest possible level (i.e., the person in the cockpit during the mission).
3. In significant contrast to the goal of a design process, the end result of a mission is usually fairly different from the way it was envisioned in the plan. This is because the enemy has a vote in how the fight takes place, and one is dealing with severely imperfect assumptions that affect the battle space. The input assumptions may be subject to considerable time dependence.
4.3 Surface and Underwater Mission Planning
1. Similarities with the Design Process

(99) The Ecosystem supports the “V” model of systems engineering, shown in FIG. 8, and aligns with the Navy's thrust in the area of Surface Warfare Mission Engineering and Analysis. In addition to providing decision support for all phases of the design process, FIG. 9 and Table 5 show how the Ecosystem can also support each of the Navy mission domains.

(100) TABLE 5. Similarities between the design process and the Navy mission domains.

  Design Process         Navy Mission Domain
  Requirement Gathering  Plan
  Concept Design         Detect
  Detailed Design        Control
  Final Design           Engage; Assess

2. Differences
1. Just as in the case of the strike mission planning process, a Military Personnel Center may conduct mission planning from start to finish in an approximately 8-hour window, whereas engineering design projects may be completed over a period of a few to several months, or even years.
2. Furthermore, in combat, the personnel that plan a mission are typically not the personnel that execute it, so decisions are often made with flexibility in mind.
3. In severe contrast to the goal of a design process, the end result of a mission may differ significantly from the way it was envisioned in the plan. This is because the enemy has a vote in how a fight takes place, and one may be dealing with imperfect assumptions that affect the battle space.
4. Tactics may need to be adapted to account for changes in environmental conditions.
4.4 Retail Planning Process

(101) FIG. 10 and FIG. 11 summarize a retail planning process. Table 6 presents a mapping between the retail planning processing and the design process.

(102) TABLE 6. Mapping between the design process and the retail planning process.

  Design Process         Retail Planning Process
  Requirement Gathering  1. Forecast category sales; 2. Develop an assortment plan
  Concept Design         3. Determine appropriate inventory levels and product availability; 4. Develop a plan for managing the inventory
  Detailed Design        5. Allocate merchandise for the stores; 6. Buy merchandise
  Final Design           7. Monitor and evaluate performance and make adjustments

(103) We apply the big data framework to the retail planning process by harvesting analogies from Table 6, in a similar fashion as we harvested analogies from Table 3 and Table 5 to extend the big data framework from the design process to the mission planning process. To this effect, we define the input and output vectors as follows:
x=[Forecasted_sale_category1_store1,Forecasted_sale_category2_store1, . . . ,Forecasted_sale_category1_store2,Forecasted_sale_category2_store2, . . . ,Forecasted_sale_category1_store3,Forecasted_sale_category2_store3, . . . ]  (15)
y=[Observed_sale_category1_store1,Observed_sale_category2_store1, . . . ,Observed_sale_category1_store2,Observed_sale_category2_store2, . . . ,Observed_sale_category1_store3,Observed_sale_category2_store3, . . . ]  (16)
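The construction of the vectors in (15) and (16) can be sketched as follows (the store names, categories and sales figures are hypothetical, chosen only to illustrate the flattening order):

```python
# Hypothetical forecast/observation data, keyed by (store, category).
stores = ["store1", "store2"]
categories = ["category1", "category2"]

forecast = {("store1", "category1"): 120, ("store1", "category2"): 80,
            ("store2", "category1"): 95,  ("store2", "category2"): 60}
observed = {("store1", "category1"): 115, ("store1", "category2"): 88,
            ("store2", "category1"): 97,  ("store2", "category2"): 55}

# Flatten into the input/output vectors of (15) and (16):
# categories vary fastest within each store.
x = [forecast[(s, c)] for s in stores for c in categories]
y = [observed[(s, c)] for s in stores for c in categories]
print(x)  # [120, 80, 95, 60]
print(y)  # [115, 88, 97, 55]
```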

6. How to Make the Invention

(104) 6.1 Archived Binders

(105) The archived project binders in FIG. 1, FIG. 19, FIG. 21 and FIG. 23 consist of past project binders, and are taken to represent known good designs, known good mission plans, or known good retail plans (projects with all requirements fulfilled). The input vector, {tilde over (x)}, in the ({tilde over (x)}, {tilde over (y)}) duplet could capture design criteria, such as the desired weight, width, height and length of an automotive part (the “intended features”). The elements of the product vector, {tilde over (y)}, could capture performance of the finalized part, or even ideas or options relevant to specific design stages (the “observed features”), as noted above. The duplets are used to train a generic system model, per (3) and (4).

(106) While the composition of the ({tilde over (x)}, {tilde over (y)}) duplets varies between applications, {tilde over (x)} typically represents some type of 'input' (in the case of an engineering application, the complete 'requirement list'), while {tilde over (y)} represents the 'output' of a 'design' or a 'plan'. In the case of mechanical design, {tilde over (y)} could represent a complete parameterized list of the overall assembly from the solid modeling tool of choice. {tilde over (y)} could also contain information related to bills of materials or drawings.

(107) The project binders may contain pointers to pertinent content, based on designer inputs and available information. The input format captures and preserves content and associates with the relevant context. This facilitates storage for future use. Pertinent third-party data is accessed from databases with available context provided. The databases may be owned by the vendor of the SW design tool used, by a customer or by a third party. Designers ultimately choose to consider the information that is most relevant for any given design decision. This arrangement allows designers to leverage digital content management to make more informed design decisions without losing focus of the primary design challenge.

(108) The information developed for the project binders in FIG. 1 consists of pointers to the PDS and design (database) objects. In a programming context, the PDS comprises requirement objects, and the design objects comprise component and assembly objects. Both can have hierarchy imposed. The design data itself is stored in bulk outside the application.

(109) 6.2 New Predictions

(110) For new designs, designers could extract the design vector, x, in FIG. 1 from the new requirements, apply it to the system model, and obtain the guiding design, y, as output. The guiding design, y, could serve as a reference (starting point) for the design of the new product. Such a reference may help improve the fidelity of design decisions. If design decisions cause the product to deviate significantly from the reference, y, explanations are likely warranted.

(111) Similarly, this invention assumes training of an automated mission planning system using requirements, combined with tactical mission plans or asset performance models from past mission planning projects, as shown in FIG. 20 and FIG. 22. When applying requirements from a new mission planning project as input, the trained system offers a guiding plan as an aid to mission designers. This guiding plan can leverage and exploit mission performance data and user feedback, including after action reports, planning decisions, and critiques of system performance.

(112) When designing something specific, like a bolt, one expects narrowly defined requirements and relatively good prediction capabilities. As the system model captures a broader subject, one expects more variations in the model, and worse prediction capability.

(113) 6.3 Identification of Anomalies

(114) Identification of anomalies (significant deviations from the reference prediction) depends on the nature of the data (categorical or continuous). In general, the following methods apply:
1. One can compare the distribution of each variable, using quantiles or another statistical test, to see if the variations are significantly different.
2. One can count occurrences of each label or category and compare.
3. One can employ a distance measure, such as the Mahalanobis distance, and look for big changes.
4. One can apply an absolute difference between new and old data, set a threshold, and report everything that exceeds the threshold.
5. One can apply a multidimensional technique, such as a correlation matrix, principal components or clustering, and look for changes.
6. One can employ statistical/ML models specialized for anomaly detection, including Support Vector Machines, t-distributed Stochastic Neighbor Embedding, Isolation Forests, Peer Group Analysis or Break Point Analysis.
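As an illustration of the Mahalanobis-distance method listed above, anomalies can be flagged against a reference distribution as follows (a minimal sketch; the synthetic reference data, threshold value and names are illustrative):

```python
import numpy as np

def mahalanobis_outliers(reference, new, threshold=3.0):
    """Flag rows of `new` whose Mahalanobis distance from the reference
    distribution exceeds `threshold`."""
    mu = reference.mean(axis=0)
    cov = np.cov(reference, rowvar=False)
    cov_inv = np.linalg.inv(cov)
    diff = new - mu
    # d_i = sqrt(diff_i^T cov_inv diff_i), computed row-wise
    d = np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))
    return d > threshold

rng = np.random.default_rng(1)
reference = rng.normal(0.0, 1.0, size=(500, 3))   # archived "known good" feature vectors
new = np.vstack([rng.normal(0.0, 1.0, size=(5, 3)),
                 np.array([[8.0, 8.0, 8.0]])])    # last row deviates strongly
flags = mahalanobis_outliers(reference, new)
print(flags)  # the last row is flagged as an anomaly
```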
6.4 Database Objects

(115) The database objects suitable for engineering product design, mission or retail planning are defined along with their associated attributes. By defining the databases based on function, four databases with seemingly reasonable attributes are proposed. The database management overhead associated with the proposed architecture is expected to be minimal.

(116) 1. Customer and Requirement Objects

(117) FIG. 12 and FIG. 13 present embodiments of customer and requirement objects from a database containing the PDS objects. Through the PDS, the designer builds up a collection of pointers to pertinent design information objects. It is of key importance to define proper attributes for the object pointers in the PDS database, and to formulate metadata and leading indices (index maps) accordingly. For the PDS database, the object pointers considered pertinent are listed in Table 7 and Table 8. The constraints in Table 8 may be binary and can be relatively easy to verify. The performance requirements typically involve binary thresholds, and are judged according to design performance relative to the threshold. The objectives involve no thresholds, but rather provide optimization considerations for decisions.

(118) TABLE 7. Essential attributes pertinent to the customer objects in the PDS database.

  Attribute   Description
  CustomerID  Unique ID for the customer object
  Name        Organization, Person, Entity
  Type        Internal, External, Other
  Importance  Low, Medium, High

(119) Note that the requirements are not listed in the customer objects; rather, the customers are listed in the requirement objects. This avoidance of duplication and cyclic relationships is consistent with the design philosophy behind relational databases.

(120) The framework for predictive analytics is capable of generating, managing, and presenting content with relevance to the design problem at hand in the databases available. It is assumed that, during the course of a design project, the database continues to grow. If design content is not readily available through a third-party or in-house, designers are apt to define it.

(121) TABLE 8. Essential attributes pertinent to requirement objects in the PDS database.

  Attribute        Description
  RequirementID    Unique ID for the requirement object
  Name             Descriptive name for the Requirement object
  Owner            Key to customer database: Customer[i]
  Importance       Low, Medium, High
  Type             Constraint, Performance or Objective
  Function         Function (e.g., mechanics) that the requirement is addressing
  Characteristics  Key to characteristics database: Characteristic[j] (could be based on the function)
  Units            Key to units database: Units[k]
  Threshold        Value for binary assessment

(122) FIG. 1 shows how the PDS object can be built using pointers to a database, for the purpose of being big data compatible.
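As a sketch of how the customer and requirement objects of Tables 7 and 8 might map onto a relational schema, the following example realizes the Owner attribute as a foreign key and adds an index map supporting fast lookups (only a subset of the attributes is shown; the schema and sample rows are illustrative, not part of the invention):

```python
import sqlite3

# In-memory sketch of the PDS database: Customer and Requirement objects,
# with the requirement -> owner relation modeled as a foreign key, per the
# one-customer-per-requirement assumption of FIG. 13.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE customer (
    CustomerID  INTEGER PRIMARY KEY,
    Name        TEXT NOT NULL,
    Type        TEXT CHECK (Type IN ('Internal', 'External', 'Other')),
    Importance  TEXT CHECK (Importance IN ('Low', 'Medium', 'High'))
);
CREATE TABLE requirement (
    RequirementID  INTEGER PRIMARY KEY,
    Name           TEXT NOT NULL,
    Owner          INTEGER REFERENCES customer(CustomerID),
    Importance     TEXT,
    Type           TEXT CHECK (Type IN ('Constraint', 'Performance', 'Objective')),
    Threshold      REAL
);
-- Index map over requirement names, enabling sub-linear-time lookups.
CREATE INDEX idx_requirement_name ON requirement (Name);
""")

con.execute("INSERT INTO customer VALUES (1, 'Acme Karts', 'External', 'High')")
con.execute("INSERT INTO requirement VALUES (1, 'Max lift weight', 1, 'High', 'Constraint', 180.0)")

# Join requirements back to their owning customer.
row = con.execute("""
    SELECT r.Name, c.Name FROM requirement r JOIN customer c ON r.Owner = c.CustomerID
""").fetchone()
print(row)  # ('Max lift weight', 'Acme Karts')
```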

(123) 2. Assembly and Component Objects

(124) FIG. 14 presents an embodiment of the assembly and component objects from the design database. The assembly objects consist of nested, aggregated subordinate levels, and have authority to define requirements applicable to the subordinates. The component objects consist of individual parts, pieces, or obtainable, self-contained assemblies. In the case of the Design database, the pertinent attributes for the assembly and component objects are listed in Table 10 and Table 9, respectively. The rules in Table 10 specify the governing constraints of aggregated subassemblies and components. It is assumed that the design database complies with standard relational database (schema) formats for big data compatibility.

(125) In FIG. 14, we assume that a component or assembly object can address many requirements, and that a given requirement may appear in many components or assemblies. The relational dependence between the assembly or component objects and the requirement objects is configured accordingly. Again, the database objects for mechanical product design, along with their corresponding attributes, are defined based on function.

(126) TABLE 9. Attributes pertinent to component objects in the design database (SteingrimssonYi 2017).

  Attribute     Description
  ComponentID   Unique ID of the component object
  Name          Descriptive name for the Component object
  Requirements  Key to the PDS database: Requirement[ ]
  Input         Key to database: Flow[ ]
  Output        Key to database: Flow[ ]
  Process       Key to database: Process[ ]
  Dimensions    Nominal and tolerance, in the form of solid model data
  Material      Key to database (for material properties): Material[ ]
  Properties    Description of miscellaneous properties

(127) TABLE 10. Attributes pertinent to assembly objects in the design database (SteingrimssonYi 2017).

  Attribute     Description
  AssemblyID    Unique ID of the assembly object
  Name          Descriptive name for the Assembly object
  Requirements  Key to the PDS database: Requirement[ ]
  Subordinates  Define subassemblies and components
  Input         Key to database: Flow[ ]
  Output        Key to database: Flow[ ]
  Process       Key to database: Process[ ]
  Rules         Key to Rules database: Rules[ ]
3. Mission Planning Objects

(128) We apply the framework for predictive analytics to generate mission plans in near-autonomous mode, given the current work flow for mission planning, by leveraging data, models and standards in the core database for the mission planning system. The framework is capable of generating, managing, and presenting content with relevance to the mission plan at hand in the databases available. We define database objects suitable for mission planning, along with their associated attributes, based on the function desired.

(129) Specifically, for multi-domain, multi-asset mission planning, the key to applying the framework for predictive analytics involves “correctly” collecting data into the database and tagging it. All the domains and assets supported should have a universal index, and the indices should be recorded as a part of the Requirement objects, as shown in Table 11. There may be “local” versions of the database, one for each domain and/or asset. As data is collected into the “local” versions of the database, it may be tagged to indicate which domain or asset it corresponds to. A “global,” centralized database may then aggregate data for all the assets and domains.

(130) TABLE 11. Selected items from requirement objects for mission planning.

  Item                 Explanation
  Goals                The mission goals
  Location             Location of the target
  Target               Target selected
  High-payoff targets  The high-payoff targets
  Domain               To enable multi-vehicle mission planning (universal indexing of the domains)
  Asset                To enable multi-domain mission planning (universal indexing of the assets)
  Security level       To account for different security levels (standardized names of levels)
4. External Databases

(131) As shown in FIG. 1, FIG. 19, FIG. 21 and FIG. 23, the design content aggregated is multi-faceted and covers a broad spectrum of inputs. It not only consists of the project binders from the designers, but also includes existing and previous design projects within an organization, plus the linked-in design files, the outputs from the design tools, material from the industry databases (results from context verification), the configuration scripts, examples, content of sample databases provided, legacy databases for known good designs, search analytics, information on manufacturing procedures, material characteristics, material prices, and parts that can be obtained from elsewhere, etc. The result is a sizable database of useable information available to designers and design organizations.

5. An Overall Product Design

(132) FIG. 15 illustrates how a master assembly object (a Level 0 object), along with the requirements that it addresses, can be represented in relation with sub-assemblies and components (Level 1 objects), along with the requirements that the sub-assemblies or components address. FIG. 14 similarly shows how the Level 1 objects can be represented in relation to corresponding Level 2 objects (together with the requirements that the Level 2 objects address). FIG. 15 further provides an illustration of how the component options and associated requirements, for an overall design (one comprising multiple subsystems), can be programmed into the database, based on engineering knowledge gleaned from prior designs. This knowledge may, for example, reflect the awareness, found in machine design texts, of risks for certain components or uses.

(133) 6.5 Data Annotation

(134) In order to present a classification system suitable for engineering product design, mission and retail planning, we think of the collection of archived project binders (e-design notebooks (SteingrimssonYi 2017), (Steingrimsson 2014)) as books in a library. We assimilate the indexing of the project binders to the cataloging of books in a library. And we compare the meta-data defined for the project binders to the index labels placed on the books. Similar to the index labels helping with identification of books of interest, the meta-data facilitates rapid processing of, and accurate responses to, designer queries. We assume the design content gets tagged, in a similar fashion as Google tags websites, to facilitate queries reflecting the users' needs.

(135) 6.6 More on Data Tagging: Relations between Database Objects

(136) 1. Relational Databases

(137) Big data analysis capabilities can be applied, for example, at organizations with design repositories arranged in the form of relational databases. The core concept, of project binders accessing data in databases holding all the design information, and creating pointers to the pertinent design content, may be adapted to other database structures.

(138) 2. Sample Relations between Database Objects

(139) FIG. 13 illustrates a sample relation between the customer and requirement objects. Here we assume, for simplicity, that a given requirement originates from a single customer. FIG. 14 provides a sample illustration of the relation between component or assembly objects and the requirement objects, respectively. Again, we assume that a component or assembly object can address many requirements, and that a given requirement may appear in many components or assemblies.

(140) 3. Efficient Tagging through Index Maps

(141) The relational databases consist of index maps in addition to the data. The index maps allow data to be found efficiently; they mimic tags on books in a library. The tagging constructs a map of specific key words, enabling efficient search on a portion of the input design.

(142) 4. Note on Data Management

(143) Binders for new design projects are assumed to have the same structure as the binders from past design projects (and are assumed to be archived as such). Note that regardless of which Product Lifecycle Management system a design organization elects to use, the design data needs to be entered only once. The Ecosystem provides the capability for exporting design data into formatted project reports, so content from the exported reports can be reused in progress reports or project presentations. As long as design organizations make sure that each design project gets archived after completion, data management is expected to require relatively minor effort.

(144) 6.7 Querying Engine

(145) In the context of mechanical design, this invention assumes that a user need occurs as a mix of the following four elements: Function, cost, material, and energy. The first step in the analysis of project binders (e-design notebooks) involves automatic understanding of user needs. Here, the querying engine is expected:

(146) 1. To comprehend statements of user need/requirement.

(147) For this purpose, we treat a user need as a query.

(148) 2. To retrieve previous design information, or examples, that yield a good match to the user need.

(149) In the process of doing so, the information retrieval framework presented can provide design teams (workforces) with design information similar to that previously reported and harvested, for the purpose of enhancing design efficiency and efficacy. To deal with these tasks, we propose to adopt an indexing and retrieval method from the field of information retrieval, referred to as Latent Semantic Analysis (LSA).

(150) 1. Latent Semantic Analysis

(151) LSA is an extension of a classic IR model, Salton's Vector Space Model (VSM) (Salton 1975). LSA was developed as an information retrieval technique that discovers hidden semantic structure embedded in documents (Deerwester 1990). It is an unsupervised technique for mapping high-dimensional count vectors to a lower-dimensional representation. In more detail, complex relationships exist between words and the surrounding contexts, such as phrases, statements or documents, in which the words are located. For the discovery of latent semantic relationships (correlations), LSA begins with the creation of a co-occurrence matrix, where the columns represent contexts and the rows represent words or terms. An entry (i, j) in the matrix corresponds to the weight of the word i appearing in the context j. The matrix is then analyzed by applying Singular Value Decomposition to derive the associated hidden semantic structures from the matrix. SVD is a way to factorize a rectangular matrix. For an m-by-n matrix, A, with m>n, the singular value decomposition of the matrix A is the product of three matrices: an m-by-r matrix U, an r-by-r matrix Σ, and the transpose of an n-by-r matrix V, in that order. That is,
A = UΣV^T.  (17)

(152) Here, V^T is the matrix transpose of V, obtained by exchanging V's rows and columns. U and V have orthonormal columns, and Σ is a diagonal matrix. Such a factorization is referred to as the SVD of A. The diagonal elements of Σ are all positive and ordered by decreasing magnitude. The original matrix, A, can be approximated with a smaller matrix, A_k, where A_k is obtained by keeping only the k largest diagonal elements of Σ. By construction, k is the rank of the approximating matrix A_k. By applying the SVD factorization to the matrix A, a context (e.g., a set of statements characterizing user needs) is represented in a much smaller dimension, k, rather than the original high dimension, m. Note that
k ≤ n,  (18)
n << m.  (19)

(153) As a result, a context is represented in a lower dimensional space, rather than in the full, much higher dimension. k is referred to as the dimension of the latent semantic structure of A. A comprehensive overview of LSA can be found in (Dumais 2004).
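The LSA pipeline described above, building a term-context matrix, truncating its SVD, and folding a query into the k-dimensional latent space, can be sketched as follows (the toy vocabulary, documents and query-folding convention are illustrative):

```python
import numpy as np

# Toy term-context matrix A (rows = terms, columns = contexts/documents).
# Entry (i, j) is the count of term i in document j; names are illustrative.
terms = ["lift", "stand", "kart", "winch", "retail", "inventory"]
A = np.array([[2, 1, 0, 0],
              [1, 2, 0, 0],
              [1, 1, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 2, 1],
              [0, 0, 1, 2]], dtype=float)

# SVD per (17): A = U Sigma V^T; keep only the k largest singular values.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]  # rank-k approximation of A

def to_latent(v):
    """Fold a term-space vector into the k-dimensional latent space."""
    return np.diag(1.0 / s[:k]) @ U[:, :k].T @ v

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Treat a user need as a query and rank documents by latent-space similarity.
query = np.array([1, 1, 0, 0, 0, 0], dtype=float)  # "lift stand"
q_lat = to_latent(query)
sims = [cosine(q_lat, to_latent(A[:, j])) for j in range(A.shape[1])]
print(np.argmax(sims))  # a kart-lift document (column 0 or 1) ranks highest
```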

(154) 2. LSA-Based Approach

(155) The goal of the LSA-based approach proposed is to provide designers with access to previous design records that are relevant to the designers' needs. In the vocabulary of Information Retrieval, a designer's need is equivalent to a query.

(156) LSA is adopted as the framework for retrieving designers' needs in this invention, because the method has been proven to be an effective unsupervised algorithm for IR (Landauer 2007). In fact, for the purpose of the querying engine, we are not considering a supervised learning approach, such as neural networks or deep learning, since such an approach requires a very large number of previous e-design examples, or cases, that presently are unavailable.

(157) FIG. 16 depicts an LSA-based approach of retrieving e-design cases that are likely to satisfy a query (i.e., a user need). The LSA-based method predicts the degree of relevance of e-design examples to the query and presents the most relevant previous e-design cases, or examples, to the user.

(158) 3. Complexity

(159) The algorithms for Latent Semantic Analysis and Latent Semantic Indexing are based on Singular Value Decomposition. The time complexity for extracting the SVD of an m×n matrix is bounded by (Trefethen & Bau III 1997)
O(m·n^2) FLOPs.  (20)

(160) Here, m refers to the number of terms, and n to the number of documents.

(161) 4. Example Involving Mechanical Product Design

(162) A simple example is presented here to show how the semantic framework for analysis of users' needs can help with idea generation (brainstorming) in the Concept Design stage of a project involving the design of a reliable, single-operator Go Kart lift stand. This may be a capstone project, where the experience of the designers in the area may be somewhat limited. Therefore, they simply pose the following as input to the querying engine: “We need a reliable, single-operator stand for kart racers”.

(163) The system responds to the stated need by offering a number of ideas or options. Based on what can be retrieved from the databases, or the training data available, the system may offer the following suggestions:

(164) “1. JEGS Multi-Purpose Lift

(165) 2. KartLift BigFoot

(166) 3. KartLift Winch Lift

(167) 4. Electric Super Lift

(168) 5. Go Kart Stand Lift”.

(169) Supplementing the overall process outlined in FIG. 16, FIG. 18 lists intermediate steps elucidating how the engine for latent semantic analysis is able to arrive at this conclusion.

(170) While this example may come across as relatively simple (many mechanical designers already have an idea of what types of lift stands are available), it conveys an application (illustrates the purpose) of the querying engine. More nuanced examples can be crafted, say, around specific standards, policies, material properties or common components. To our knowledge, there is presently no systematic search capability available for helping designers with brainstorming during concept design.

(171) 5. Example Involving Strike Mission Planning

(172) FIG. 20 presents an example showing how the LSA-based approach can be used to retrieve cases that are likely to satisfy an input query from a strike mission planner (i.e., a user need). The LSA-based method predicts the degree of relevance of the archived examples to the query and presents the most relevant previous cases, or examples, to the mission planner.

(173) 6. Example Involving Surface or Underwater Mission Planning

(174) FIG. 22 presents a similar example showing how the LSA-based approach can be used to retrieve cases that are likely to satisfy an input query from a surface or underwater mission planner (i.e., a user need). The LSA-based method predicts the degree of relevance of the archived examples to the query and presents the most relevant previous cases, or examples, to the mission planner.

(175) 7. Note about Practicality

(176) Note that the user does not need to worry about parsing of the input query. The representation step of the querying engine is invisible to the user. The user provides the input query in the form of a sentence, such as the ones in FIG. 18, FIG. 20 or FIG. 22. The sentence may even be provided verbally with the help of a speech recognition system.

(177) 6.7 More on Training for the Predictive Analytics

(178) The predictive analytics framework presented assumes a holistic big data analysis and efficient utilization of a broad spectrum of available information. Through proper database representation of the design content, in the form of composite design objects (Design[t]), and references from the design project journals, one can categorize the data and run various cross-correlations (queries) across projects or within projects. By storing the comprehensive design history in a cloud, and harvesting repositories of known good designs through database queries, one can improve the design decision fidelity for new designs. Access to such repositories may also prove invaluable for the purpose of post-mortem failure analysis.

(179) The big data frameworks in FIG. 1, FIG. 19, FIG. 21 and FIG. 23 can be trained, for example, using the Delta Rule, mentioned above, which is also sometimes referred to as the Widrow and Hoff learning rule, or the Least Mean Square rule (Widrow 1960):

(180) Δw_ijx = −ε ∂E/∂w_ij = ε δ a_ix.  (16)

Here Δw_ijx represents the update applied to the weight at the node (perceptron) between links i and j in a neural network (Widrow 1960). E represents an error function over an entire set of training patterns (i.e., over one iteration, or epoch) (Widrow 1960). ε is the learning rate applied to this gradient descent learning. a_ix denotes the actual activation for node x in output layer i (Widrow 1960).
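The update in (16) can be sketched as a single gradient step for one linear output node, the ADALINE setting of Widrow and Hoff; the weights, learning rate, and data below are illustrative assumptions:

```python
import numpy as np

def delta_rule_step(w, x, target, lr=0.1):
    """One Widrow-Hoff (LMS) update: w <- w - lr * dE/dw, for
    E = 0.5 * (t - a)^2 with linear activation a = w . x,
    so that dE/dw = -(t - a) * x."""
    a = w @ x                   # actual activation of the output node
    delta = target - a          # error term
    return w + lr * delta * x   # gradient-descent update

w = np.zeros(2)
x = np.array([1.0, 0.5])

# Repeated updates drive the activation toward the target.
for _ in range(100):
    w = delta_rule_step(w, x, target=1.0)
print(round(float(w @ x), 3))  # → 1.0
```

Each pass shrinks the error by a constant factor, which is the behavior the Delta Rule relies on when iterating over an epoch of training patterns.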
6.8. Validating the Integrity of Archived Data—The Impact of Incomplete Data
1. Predictive Analytics

(181) We put high emphasis on only training the system model on known good designs (designs qualified as having all requirements fulfilled), even though: 1. It is possible to obtain a guiding reference of some value, even if you relax the constraint that the archived designs be qualified as good; 2. It is of course much easier to get started if you relax the constraint, especially for organizations with large databases of legacy designs; 3. We recommend the neural network solution in the event of a large set of archived binders. In the event of a small to intermediate number of archived binders, we recommend multi-variate linear regression for training the system model.

(182) Once you relax the constraint that the archived designs be of good quality, the quality of the output becomes non-deterministic. In the worst case, the quality of the guiding reference may be impacted significantly. Are 90% of the designs good, or only 10%? Even if the system model were trained on designs of which only 50% had been qualified as good, the quality of the guiding reference would be suspect. We want to take steps to protect against the possibility of “garbage in” producing “garbage out”.

(183) 2. Adopting Designs from a Legacy Database (Whose Designs are Yet to be Qualified as Good)

(184) As a practical path to adoption of the big data framework by companies that have a large database of legacy designs yet to be qualified as good, we recommend incremental re-training.

(185) Even in the case of a large database of legacy designs, we recommend starting out by identifying a (small) set of good designs on which to train the system model. You then incrementally expand the training set, qualifying the new designs from the legacy database either through automatic requirement verification provided by SW like the Ecosystem or by a human, and retrain the system model once the new input has been qualified. In this way, you can expand the training set in a controlled fashion, and refine the system model, while maintaining the quality of the archived designs comprising the training set.
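The expand-qualify-retrain loop can be sketched as follows. The qualification predicate is a placeholder for automatic requirement verification or human review, the data are synthetic, and multi-variate linear regression stands in for the system model, per the recommendation for small to intermediate archives:

```python
import numpy as np

def train(X, Y):
    """Multi-variate linear regression (least squares) as the system model."""
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return B

def qualifies(x, y):
    """Placeholder for qualification (automatic requirement verification
    or human review); here: accept designs whose outputs are non-negative."""
    return bool(np.all(y >= 0))

rng = np.random.default_rng(0)

# Start from a small seed set of known good designs.
X = rng.normal(size=(5, 3))
Y = np.abs(rng.normal(size=(5, 2)))
B = train(X, Y)

# Incrementally adopt legacy designs: qualify first, then retrain.
legacy = [(rng.normal(size=3), rng.normal(size=2)) for _ in range(20)]
for x, y in legacy:
    if qualifies(x, y):
        X = np.vstack([X, x])
        Y = np.vstack([Y, y])
        B = train(X, Y)   # refined system model, on qualified data only

print(X.shape[0], B.shape)
```

Unqualified legacy designs never enter the training set, so the archive backing the system model stays "known good" throughout the expansion.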

(186) 2. Querying Engine

(187) Most search engines are based on concepts similar to those of the querying engine presented (information retrieval). LSA is less sensitive to data scarcity than many supervised machine learning techniques. It is assumed that the LSA returns hits closely matching an input query. With an appropriate threshold set, LSA can tell how many matches there are to the input query. In this sense, the querying engine is capable of gracefully accounting for cases of incomplete data.

(188) 6.9 Accommodations Specific to Incremental or Remedial Design Projects

(189) 1. Single Reference Design

(190) In case of an incremental or remedial design project, based on a single reference design, there is no need for predictive analytics. As opposed to generating a guiding reference, y, through predictive analytics, the original reference is the guiding reference.

(191) 2. (Small) Sub-Set of Reference Designs

(192) The matrix W̃ enables specification of the (small) sub-set of designs to be used as reference in this case. Here, one could still produce the guiding reference as
y = B̂^T x.

(193) But B̂ would be obtained as the solution to (14), where row j of W̃ consists of all ones if and only if the archived vector x̃_j is among the few vectors in the (small) sub-set.
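One way to realize this, under the assumption that (14) amounts to a weighted least-squares fit with W̃ acting as a row selector over the archive, is sketched below; the shapes and data are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(10, 4))   # archived input vectors x~_j (as rows)
Y = rng.normal(size=(10, 2))   # corresponding archived output vectors

# Selector: row j weighted 1 iff x~_j is in the (small) reference sub-set.
subset = [2, 5, 7]
w = np.zeros(10)
w[subset] = 1.0

# Weighted least squares: B minimizes || W^(1/2) (X B - Y) ||,
# which here reduces to fitting only the selected rows.
Wh = np.sqrt(w)[:, None]
B, *_ = np.linalg.lstsq(Wh * X, Wh * Y, rcond=None)

# Guiding reference for a new input: y = B^T x.
x_new = rng.normal(size=4)
y = B.T @ x_new
print(y.shape)  # (2,)
```

Because the unselected rows carry zero weight, the fitted B̂ coincides with the minimum-norm least-squares solution over the sub-set alone.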

(194) 6.10 Accommodations Specific to Mission Planning

(195) 1. Accommodations of Highly Dynamic Environments—Dynamic Re-Planning

(196) Dynamic re-planning can be accounted for relatively easily, in principle: Once you have a new input (a new “requirement” vector), you receive the system output (the guiding mission plan) near instantaneously.
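The near-instantaneous response is simply one forward pass through the already-trained system model. The sketch below assumes the two-layer feed-forward network described earlier, with illustrative (untrained) weights standing in for a trained model:

```python
import numpy as np

rng = np.random.default_rng(2)

# Trained two-layer feed-forward system model (weights illustrative).
W1, b1 = rng.normal(size=(8, 5)), np.zeros(8)   # hidden layer
W2, b2 = rng.normal(size=(3, 8)), np.zeros(3)   # output layer

def guiding_plan(requirement):
    """One forward pass: new 'requirement' vector in, guiding plan out."""
    h = np.tanh(W1 @ requirement + b1)
    return W2 @ h + b2

x_new = rng.normal(size=5)   # new "requirement" vector from the re-plan
y = guiding_plan(x_new)      # guiding mission plan, near instantaneously
print(y.shape)  # (3,)
```

No retraining is needed at decision time; only when new (Requirement, Mission plan) duplets accumulate would an incremental training update be run.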

(197) If things get heated in battle, due to an action of an adversary, and decisions need to be made on the fly, we may hold off presenting the corresponding guiding advisories unless the components of the guiding plan fulfill minimum quality criteria.

(198) Further, as new information becomes available, one can conduct incremental training updates of the system model. The framework for predictive analytics employs supervised learning. The system is trained once in the beginning, based on the archived (Requirement, Mission plan) duplets available. But the longer the trained system is used, and as more information becomes available, one can conduct incremental training updates.
2. Rapid and Continuous Planning when New Information Necessitates Updating

(199) Similarly, rapid and continuous planning can be accounted for relatively easily, in principle: Once you have a new input (a new Requirement vector), you receive the system output (the guiding Mission Plan) near instantaneously.

(200) 3. Multi-Vehicle, Multi-Domain Mission Planning: Generation of Mission Plans in a Near-Autonomous Mode, Given the Current Workflow for Mission Planning

(201) Key to applying the framework for predictive analytics to multi-vehicle, multi-domain mission planning is to “correctly” collect data into the database and tag it. All the vehicles and domains supported can have a universal index, and the indices can be recorded as a part of the “requirement” objects, as shown in Table 11. There may be a “local” version of the database for each vehicle and/or domain. As you collect data into the “local” versions, you may tag the data to indicate which domain or asset it corresponds to. Then you may have a “global”, centralized database for all the assets and domains, with which the “local” versions synchronize their data.
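A minimal sketch of this local/global arrangement, using SQLite in-memory databases and a hypothetical schema (the table layout and universal indices are assumptions, not Table 11):

```python
import sqlite3

SCHEMA = ("CREATE TABLE IF NOT EXISTS reqs "
          "(id INTEGER, vehicle INTEGER, domain INTEGER, body TEXT)")

def make_db():
    con = sqlite3.connect(":memory:")
    con.execute(SCHEMA)
    return con

# "Local" databases per vehicle/domain; tags use a universal index.
local_air, local_sea = make_db(), make_db()
local_air.execute("INSERT INTO reqs VALUES (1, 101, 1, 'strike requirement')")
local_sea.execute("INSERT INTO reqs VALUES (2, 202, 2, 'mine-hunting requirement')")

# "Global", centralized database the local versions synchronize into.
global_db = make_db()
for local in (local_air, local_sea):
    rows = local.execute("SELECT * FROM reqs").fetchall()
    global_db.executemany("INSERT INTO reqs VALUES (?, ?, ?, ?)", rows)

count = global_db.execute("SELECT COUNT(*) FROM reqs").fetchone()[0]
print(count)  # 2
```

Because every record carries its vehicle and domain tags, cross-domain queries against the global database remain straightforward after synchronization.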

(202) 4. Accounting for Different Security Levels

(203) In combat, the crew that plans a mission is typically not the crew that executes it, so decisions are sometimes made to enable as much flexibility as possible, and delegate the decision authority to the lowest possible level (i.e. the person in the cockpit). Our general approach will be to comply with Navy protocols in this regard. As shown in Table 11, the security level can be recorded as a tag in the database for the input “requirements”, in the archived duplets for (Requirement, Mission plan).

(204) One can record the security levels using a standardized scheme.

(205) 6.11 Support for Pertinent Interfaces

(206) 1. Interface with Spark or Hadoop for Big Data Analysis

(207) Regarding the type of interfaces that the system will be able to support (i.e., the system's ability to support both heterogeneous (distributed) and homogeneous architectures), one should note that large, distributed databases (enterprise applications), such as Apache Hadoop or Spark, can be supported through the API interfaces provided by these tools (Spark 2017). Hadoop provides a native Java API to support file system operations (BigData 2017). One can use WebHDFS to interact with the Apache Hadoop file system externally through a more user-friendly REST API (Hadoop 2017). The WebHDFS concept is based on HTTP operations like GET, PUT, POST and DELETE (Hadoop 2017). Operations like OPEN, GETFILESTATUS and LISTSTATUS use HTTP GET, while others like CREATE, MKDIRS, RENAME and SETPERMISSIONS rely on HTTP PUT (Hadoop 2017). The APPEND operation is based on HTTP POST, whereas DELETE uses HTTP DELETE (Hadoop 2017). Authentication can be based on the user.name query parameter (as part of an HTTP query string). If security has been turned on, then the authentication relies on Kerberos (Kerberos 2017).
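These WebHDFS operations map onto plain HTTP requests against the `/webhdfs/v1` endpoint. The sketch below only composes the request URLs; the namenode host, port, and file path are hypothetical, and no cluster is contacted:

```python
# Compose WebHDFS request URLs; host, port, and path are hypothetical.
BASE = "http://namenode.example.com:9870/webhdfs/v1"

def webhdfs_url(path, op, user=None, **params):
    """Build a WebHDFS URL.  The HTTP verb depends on the op:
    e.g. LISTSTATUS/OPEN use GET, MKDIRS/CREATE use PUT, APPEND uses POST."""
    query = {"op": op, **({"user.name": user} if user else {}), **params}
    qs = "&".join(f"{k}={v}" for k, v in query.items())
    return f"{BASE}{path}?{qs}"

url = webhdfs_url("/data/designs", "LISTSTATUS", user="alice")
print(url)
# http://namenode.example.com:9870/webhdfs/v1/data/designs?op=LISTSTATUS&user.name=alice
```

An HTTP client (or `curl`) would then issue the request with the verb appropriate to the operation, with Kerberos negotiation layered on when security is enabled.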

(208) 2. Interface with JMPS System for Strike Mission Planning

(209) The Joint Mission Planning System is a software application that consists of a basic framework together with unique mission planning environment software packages for various platforms (TianWeiYaoping 2016). FIG. 24 outlines one way of integrating the big data framework into JMPS. We are presenting a high-level framework of learning methods that can oversee JMPS. The big data framework can offer advice to JMPS, based on the outcomes and performance data provided.

(210) 3. Interface with the MEDAL System for Surface and Underwater Mission Planning

(211) FIG. 25 shows how the big data framework can offer advice to the Shipboard MEDAL, the CMWC MEDAL and the data warehouse systems, based on outcomes and performance data provided. In current mine warfare mission planning, historical oceanographic data are acquired from the NAVOCEANO database, also listed in FIG. 25. These data, including seafloor characteristics, water column properties, and atmospheric parameters, are input into MEDAL to determine optimal lane spacing and predict operational time lines and risk. Once into the exercise, in situ data are collected and input into MEDAL and the other tactical decision aids used for naval warfare mission planning. When in situ data are entered into MEDAL, the operator can output predictions of best- and worst-case scenarios for lane spacing, to balance operational objectives with clearance time and risk. It is important at this stage of operations that mine warfare personnel be receptive to modifying tactics to fit changes in environmental conditions.

7. How to Use the Invention

(212) 1. For Improving Fidelity of Design Decisions through Comparison with a Reference Forecast

(213) FIG. 1 summarizes this application. The vector y captures the reference forecast. When decisions made during the course of a new design project deviate from the reference, this prompts a case (anomaly) likely worth investigating. The invention addresses harvesting of information from past engineering design projects for the purpose of aiding future design projects.

(214) 2. For Idea Generation: Querying for Solution Options

(215) FIG. 17 and FIG. 18 summarize such an application.

(216) 3. For Querying for Common Components

(217) This invention can be used to query external databases for common components used in a design, as FIG. 1 highlights.

(218) 4. For Querying for Information Related to Standards

(219) This invention can be used to query external databases for engineering standards pertinent to a design, as FIG. 1 highlights.

(220) 5. For Querying for Information Related to Material Properties

(221) This invention can be used to query external databases for the properties of materials considered for use in a design, as FIG. 1 highlights.

(222) 6. For Querying for Information Related to Regulations

(223) This invention can be used to query external databases for regulations that may impact a design.

(224) 7. For Querying for Information Related to Policies

(225) This invention can be used to query external databases for policies that may impact a design.

(226) 8. For Querying for Customer Related Information

(227) This invention may be used to query for information related to customers involved in a design.

(228) 9. For Querying for Information Related to Internal Requirements

(229) This invention may be used to query for information related to internal requirements involved in a design.

(230) 10. For Querying for Information Related to Best Practices

(231) This invention may be used to query for information related to best practices involving a design.

(232) 11. For Querying for Information Related to Previous Solutions

(233) This invention may be used to query for information related to previous design solutions.

(234) 12. For Querying for Information Related to Analogies

(235) This invention may be used to query for information related to analogous solutions.

(236) 13. For Querying for Other Information Related to Detailed or Final Design

(237) This invention may be used to query for other information related to the detailed or final design stages of a design project.

(238) 14. For Design Projects Involving Incremental Changes or Remedial Efforts

(239) This invention can be used on design projects involving incremental changes or remedial efforts, as explained in 6.9 above. The driver may involve cost savings, material change, etc.

(240) 15. For Design to Manufacture, or Simply for Manufacture

(241) The predictive analytics and the querying engine may be used both to design an engineering product and to design the equipment used to manufacture it. In the case of mechanical design in industry, separate departments may be responsible for designing the parts and for designing the molds that make the parts. Decisions made during the design of the molds can have significant cost implications. For example, in the manufacture of a keyboard, do you manufacture the keys as a group or individually? Are you going to use single-shot molding or double-shot? If you misplace the position of the screw, where the plastic flows into the mold, there can be a significant cost penalty.

(242) 16. For Maritime Mine Detection and Neutralization (for Surface or Underwater Mission Planning)

(243) This invention may be used for continuous planning for maritime mine detection and neutralization using unmanned vehicles. Assets involved may include submarines or ships to be protected from the mines. The ocean environment significantly influences mine warfare tactical planning (OceanStudies 2000). An understanding of nearshore environmental variability is important, not only for physical mine placement but also for its impact on mine hunting sensors (OceanStudies 2000). The coastal environment tends to be complex and may lack high-resolution nearshore oceanographic and bathymetric data, particularly in areas of potential conflict (OceanStudies 2000).

(244) 17. For Guiding Near-Autonomous Generation of Strike Mission Plans

(245) For specifics on application of the invention to strike mission planning, refer to FIG. 19 and FIG. 20.

(246) 18. For Automatically Acquiring and Continually Updating Asset Performance Models and Tactical Planning Knowledge, in Order to Improve Decision Support by Automated Mission Planning Systems

(247) For specifics on application of the invention to surface or underwater mission planning, refer to FIG. 21 and FIG. 22. FIG. 25 shows how the invention can fit into the information systems presently used, such as the Mine Warfare Environmental Decision Aid Library.

(248) 19. For Use in Conjunction with Automatic Requirement Verification

(249) The invention can be used in conjunction with automatic requirement verification, such as within the automotive or avionics industries. The invention can help contribute to the making of safe autonomous vehicles, vessels or aircraft.

(250) 20. For Mission Command

(251) The invention can harvest operational data for the purpose of providing predictions, alerts, and recommendations. An autonomous learning system (machine learning) can understand large amounts of data, manage the results, and react to cyber defense, electronic warfare, and even large raid attacks. The big data framework can help enhance human performance in information management and knowledge management during the exercise of Mission Command.

(252) 21. For Retail Planning and Supply Chain Management

(253) For specifics on application of the invention to retail planning and supply chain management, refer to FIG. 23.

(254) 22. Other Applications

(255) Other private-sector applications may include survey and first responder operations. Solutions for multi-vehicle, multi-domain mission planning may benefit companies that deal with parcel delivery, such as Amazon, UPS and FedEx, by generating autonomous mission plans (optimized delivery plans for multiple ground and air vehicles). Another field that may benefit is traffic engineering, providing a dynamic approach to traffic control based on varying traffic conditions.

8. Further Examples of the Invention

(256) Thus, it will be appreciated by those skilled in the art that the present invention is not restricted to the particular preferred embodiments described with reference to the drawings, and that variations may be made therein without departing from the scope of the invention.