SYSTEM AND METHOD FOR VIGOROUS ARTIFICIAL INTELLIGENCE

Abstract

A system and method for predicting a characteristic of an object in an artificial intelligence system. The method includes evaluating the object using a first model to produce a first prediction of a characteristic of the object. The object is evaluated using a second model to produce a second prediction of the characteristic of the object, the second model being dissimilar to the first model. A final prediction of the characteristic of the object is generated as a function of dynamic weightings of the first prediction and the second prediction.

Claims

1. A method of characterizing an object input to an artificial intelligence (AI) system, comprising: evaluating said object using a first model to produce a first prediction of a characteristic of said object; evaluating said object using a second model to produce a second prediction of said characteristic of said object, said second model being dissimilar to said first model; and generating a final prediction of said characteristic of said object as a function of dynamic weightings of said first prediction and said second prediction.

2. The method recited in claim 1, wherein evaluating said object using said first or second model further comprises determining a quality of said first or second prediction, respectively.

3. The method recited in claim 2, wherein said quality comprises a measure of the confidence in said prediction.

4. The method recited in claim 2, wherein said dynamic weightings are a function of said quality.

5. The method recited in claim 1, wherein said dynamic weightings are a function of at least one external input.

6. The method recited in claim 1, wherein said dynamic weightings are a function of at least one predefined rule.

7. The method recited in claim 1, wherein said evaluating said object using first and second models are executed in parallel.

8. The method recited in claim 1, wherein said first model comprises a neural network.

9. The method recited in claim 8, further comprising training said first model using a corpus of data.

10. The method recited in claim 1, wherein one of said first and second models comprises a Fast Fourier Transform.

11. An artificial intelligence (AI) system for characterizing an input object, comprising: at least one processor; and, at least one memory, said at least one memory containing instructions which, when executed by said at least one processor, are operative to: evaluate said object using a first model to produce a first prediction of a characteristic of said object; evaluate said object using a second model to produce a second prediction of said characteristic of said object, said second model being dissimilar to said first model; and, generate a final prediction of said characteristic of said object as a function of dynamic weightings of said first prediction and said second prediction.

12. The AI system recited in claim 11, wherein evaluating said object using said first or second model further comprises determining a quality of said first or second prediction, respectively.

13. The AI system recited in claim 12, wherein said quality comprises a measure of the confidence in said prediction.

14. The AI system recited in claim 12, wherein said dynamic weightings are a function of said quality.

15. The AI system recited in claim 11, wherein said dynamic weightings are a function of at least one external input.

16. The AI system recited in claim 11, wherein said dynamic weightings are a function of at least one predefined rule.

17. The AI system recited in claim 11, wherein the operations of evaluating said object using said first model and evaluating said object using said second model are executed in parallel.

18. The AI system recited in claim 11, wherein said first model comprises a neural network.

19. The AI system recited in claim 18, further comprising the operation of training said first model using a corpus of data.

20. The AI system recited in claim 11, wherein one of said first and second models comprises a Fast Fourier Transform.

21. An artificial intelligence (AI) modulator for characterizing an object, comprising: a processor; and, a memory, said memory containing instructions which, when executed by said processor, are operative to cause said AI modulator to: receive a first evaluation of said object from a first model, said first evaluation comprising a first prediction of a characteristic of said object; receive a second evaluation of said object from a second model, said second evaluation comprising a second prediction of said characteristic of said object, said second model being dissimilar to said first model; and, generate a final prediction of said characteristic of said object as a function of dynamic weightings of said first prediction and said second prediction.

22. The AI modulator recited in claim 21, wherein said first or second evaluation of said object further comprises a quality of said first or second prediction, respectively.

23. The AI modulator recited in claim 22, wherein said quality comprises a measure of the confidence in said prediction.

24. The AI modulator recited in claim 22, wherein said dynamic weightings are a function of said quality.

25. The AI modulator recited in claim 21, wherein said dynamic weightings are a function of at least one external input.

26. The AI modulator recited in claim 21, wherein said dynamic weightings are a function of at least one predefined rule.

27. The AI modulator recited in claim 21, wherein one of said first and second models comprises a neural network.

28. The AI modulator recited in claim 21, wherein one of said first and second models comprises a Fast Fourier Transform.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0110] For a more complete understanding of the present disclosure, reference is now made to the following detailed description taken in conjunction with the accompanying drawings, in which:

[0111] FIG. 1 illustrates a block representation of artificial intelligence, machine learning, and deep learning hierarchy;

[0112] FIG. 2 illustrates a graphical representation of an artificial neural network;

[0113] FIG. 3 illustrates a graphical representation of an artificial neural network for deep learning;

[0114] FIG. 4 illustrates a drawing of a partitioned system made of an obvious amalgam;

[0115] FIG. 5 illustrates a block diagram of a mathematical model for Fourier series as used in evolved artificial intelligence;

[0116] FIG. 6 illustrates a block diagram of a master integrator for blending dissimilar data and artificial intelligence;

[0117] FIG. 7 illustrates a drawing of a postulated response to a face-eye configuration;

[0118] FIG. 8 illustrates a drawing of an exemplary vigorous artificial intelligence diagram;

[0119] FIG. 9 illustrates an exemplary vigorous artificial intelligence diagram showing a plurality of twins;

[0120] FIG. 10 illustrates a drawing of vigorous artificial intelligence news story attribution and a credibility system;

[0121] FIG. 11 illustrates a flow diagram of an embodiment of a method of evaluating an object to predict a characteristic of the object; and

[0122] FIG. 12 illustrates a block diagram of an embodiment of an apparatus for predicting a characteristic of an object.

[0123] Corresponding numerals and symbols in the different figures generally refer to corresponding parts unless otherwise indicated and, in the interest of brevity, may not be described after the first instance.

DETAILED DESCRIPTION

[0124] The making and using of the present exemplary embodiments are discussed in detail below. It should be appreciated, however, that the embodiments provide many applicable inventive concepts that can be embodied in a wide variety of specific contexts. The specific embodiments discussed are merely illustrative of specific ways to make and use the systems, subsystems, and modules for predicting a characteristic of an object. While the principles will be described in the environment of artificial intelligence systems, any environment such as a general modeling system is well within the broad scope of the present disclosure.

[0125] The process introduced herein seeks to achieve advantages by a method and system based on AI segmentation. Avoiding undesirable depth by segmentation can be understood by examining primate brain functions. Neurological functions are segmented. The left and right hemispheres have different primary functions. In ground feeders, one side is tuned to find food, the other to be wary of predators.

[0126] Segmentation in humans is extraordinarily complex, and our understanding is incomplete. A great deal has been written on this which does not need to be repeated here. From a system architecture standpoint, some of the strategy seems to be avoiding undesirable depth. Excessive depth imposes several penalties, including long processing time, bandwidth demands, being poorly explainable, and slower learning.

[0127] An error from the early days of AI was based on a false assumption. There was an idea that our brains are one large neural network. The connector between hemispheres, the corpus callosum, was poorly understood in the 1960s, and its function was unknown in the days of Turing and von Neumann. As the fundamental approaches of AI were established, we believed the corpus callosum was the connection between brain hemispheres. This unfortunate (and limited) paradigm helped prop up the idea of the brain as one big neural network. Consider the following timeline:

[0128] In 1943, neurophysiologist Warren McCulloch and mathematician Walter Pitts wrote a paper on how neurons might work, modeling a simple neural network using electrical circuits.

[0129] In 1949, Donald Hebb wrote The Organization of Behavior, pointing out that neural pathways are strengthened each time they are used (laying the groundwork for weight adjustments).

[0130] In the 1950s, Nathaniel Rochester from the IBM research labs attempted and failed to create NNs in a computer.

[0131] In 1959, Bernard Widrow and Marcian Hoff of Stanford developed ADALINE and MADALINE, the first neural network applied to a practical problem: echoes on phone lines. The system is reportedly still in commercial use.

[0132] In 1962, Widrow and Hoff developed a weight-change learning procedure which eventually influenced the development of deep methods.

[0133] The foundations of today's AI were firmly laid before neuroscience discovered the function of the corpus callosum. It was much later that we discovered the importance of inhibition.

[0134] By the early 21st century, we began to see the corpus callosum was not just a collection of connections. Today, we know it is better understood as a boundary bridge. It allows some traffic to cross between hemispheres and inhibits other traffic. In other cases, it seems to integrate data from both hemispheres.

[0135] Understanding brain scans progressed along the same lines. Early researchers assumed active areas were doing processing. Now we understand that neural activity can be both inhibitory and excitatory; connecting and processing, or, overriding and blocking. Segmentation seems to occur at multiple levels, not just between hemispheres.

[0136] These three functions (excitation, integration, and inhibition) are poorly represented in AI made of obvious amalgams. Primate biological systems are more elegant and subtle. Segmentation allows our brains to process information faster, in different ways. Both system processing speed and system process diversity deserve additional observations.

[0137] Referring to Izhikevich's brain model, recall that he reported one second of brain processing time required 50 days of Beowulf cluster run time. Clearly smaller networks with parsimonious connections are needed to improve speed.

[0138] Considering system diversity, we can understand our world and our own body's senses in several ways at the same time. Like the ground-feeding bird looking for food while watching for predators, we interpret our surroundings and ourselves for different purposes and in different ways simultaneously.

[0139] Better understanding of brain functional segmentation has made the brain more explainable. Today, we have a rough understanding of how we process images and understand speech, and of some subtle differences, such as the difference between words we read, words we speak, words we type, and words we hear. We know the regions of the brain which recognize words (and other symbols) are separate from the parts of the brain which understand their abstract meaning. One brain area knows the word cat. A different area has the rich understanding of the abstractions associated with cats.

[0140] Mapping of brain functions has made our big human neural nets more explainable than many DNNs. It should be clear how functional partitioning of a system would support explainable AI.

[0141] Vigorous AI mimics these biological segmentation principles. For example, consider four different digital twin types, each representing a system with rotating machinery. Imagine we build a system with sensors and processing as follows:

[0142] A stochastic network Fast Fourier Transform (FFT) trained on thermal and acoustic sensors observing the system.

[0143] A failure model based on what Box called mechanistic cause and effect and constrained machine learning (e.g., Kalman filters) predicts system failure (failure twin).

[0144] A model (also mechanistic with constrained machine learning) built from the design team's engineering models, represents how the machine should perform (performance twin).

[0145] An as-built model represents the constituent parts that make up the system, and that can be compared to other similar systems.

These four systems can operate with reasonable speed, and with parsimonious sensor installations. They will each reflect a view of reality. In some ways this mimics the genetic performance of a four-way cross. But, unlike selective breeding, we are not constrained by a fixed ratio of contribution. Most so called F2 hybrids include a fixed ratio of genetic contribution from four pure breeds. But we can dynamically change the contribution of our four twins, for different functions, and under different conditions. In biological neural processing, such dynamic shifts seem to be plentiful.
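By way of non-limiting illustration, the dynamic contribution of the twins described above can be sketched in a few lines of Python. This is a sketch under stated assumptions, not a definitive implementation: the names `Prediction`, `dynamic_weights`, and `final_prediction`, and the `mode_bias` multipliers, are hypothetical and do not appear in this disclosure. The sketch assumes each component reports a prediction together with a quality (confidence) value, and that the weights are simply proportional to that quality, scaled by an optional bias chosen for the current operating condition.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    source: str     # which twin or model produced the estimate (hypothetical label)
    value: float    # e.g. estimated probability of failure
    quality: float  # confidence reported by the model, assumed to lie in [0, 1]


def dynamic_weights(predictions, mode_bias=None):
    """Return normalized weights proportional to quality times a mode bias."""
    mode_bias = mode_bias or {}
    raw = {p.source: p.quality * mode_bias.get(p.source, 1.0) for p in predictions}
    total = sum(raw.values())
    if total == 0:
        # No component reports any confidence: fall back to equal weights.
        return {p.source: 1.0 / len(predictions) for p in predictions}
    return {source: w / total for source, w in raw.items()}


def final_prediction(predictions, mode_bias=None):
    """Blend the component predictions using the current dynamic weights."""
    weights = dynamic_weights(predictions, mode_bias)
    return sum(weights[p.source] * p.value for p in predictions)
```

With two components of equal quality the blend is the plain mean of their values. Supplying a larger `mode_bias` for one component shifts the result toward that component without retraining either model, which is one simple reading of a contribution that changes under different conditions.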

[0146] The four-way cross example can be applied to a system for monitoring and managing rotating equipment. Vibrations from various system components can be processed by an FFT-based network as illustrated in FIG. 5, in accordance with an embodiment, as well as by the other components described above. In this example, the stochastic network FFT and performance twin can both suggest something is wrong by comparing the observed state to a learned normal state. These are near real-time anomaly detectors. But they may be prone to false alarms. For example, induced vibration from an external source might mimic vibration from within the rotating system. Their combination is likely to be more explainable than the FFT, which only knows something is off, and faster than the performance twin, which might have time constants in its ML set to avoid false alarms, or to limit edge system processing needs. In fact, when we use them as something other than an OA, we can intentionally set them up with different response times and false alarm rates, to create a hybrid which outperforms either model alone.
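As a purely illustrative sketch of comparing an observed state to a learned normal state in the frequency domain, the following Python code flags frequency bins whose amplitude departs from a baseline. Several simplifications are assumed here and are not taken from this disclosure: a naive discrete Fourier transform written with the standard library stands in for an FFT, the learned normal state is reduced to a single baseline recording, and the function names and the `ratio` and `floor` parameters are hypothetical.

```python
import cmath
import math


def amplitude_spectrum(samples):
    """Naive DFT amplitude spectrum for bins 0..n/2 (a stand-in for an FFT).

    The 2/n scaling recovers the sine amplitude for interior bins only.
    """
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                  for i, x in enumerate(samples))
        spectrum.append(2.0 * abs(acc) / n)
    return spectrum


def anomalous_bins(observed, baseline, ratio=1.5, floor=0.1):
    """Return the bins whose observed amplitude exceeds the baseline margin."""
    obs = amplitude_spectrum(observed)
    ref = amplitude_spectrum(baseline)
    return [k for k, (o, r) in enumerate(zip(obs, ref)) if o > ratio * r + floor]


def tone(bin_index, amplitude, n=32):
    """Synthetic vibration: one sine wave centered exactly on a DFT bin."""
    return [amplitude * math.sin(2 * math.pi * bin_index * i / n) for i in range(n)]


normal = tone(3, 1.0)                                        # learned normal state
disturbed = [a + b for a, b in zip(normal, tone(7, 0.5))]    # extra vibration
```

Calling `anomalous_bins(disturbed, normal)` reports only the bin of the added component. Such a detector knows that something is off at a given frequency but not why, which is the role the text assigns to the FFT-based component.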

[0147] The failure twin can suggest something will be wrong long before the DNN or a performance twin, under many circumstances. And, the failure twin can say why a failure is predicted. It is explainable. When the long-term failure prediction is added to the system, we have an even more powerful hybrid. We achieve more vigor because attention can be focused toward likely failures before the other system components can detect them. This allows dynamically setting the thresholds in the system components to levels which might otherwise create an unacceptable level of false alarms.
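The dynamic setting of thresholds described above may be sketched as follows. This is a minimal sketch under assumptions not found in this disclosure: the failure twin is taken to report a single failure probability, the detector threshold is interpolated linearly between a relaxed value and a sensitive value, and the names `adaptive_threshold` and `alarm` are hypothetical.

```python
def adaptive_threshold(base, floor, failure_probability):
    """Lower the alarm threshold as the predicted failure probability rises.

    base  : threshold used when no failure is anticipated
    floor : most sensitive threshold, used when failure is considered certain
    Linear interpolation between the two is an illustrative assumption.
    """
    if not 0.0 <= failure_probability <= 1.0:
        raise ValueError("failure_probability must lie in [0, 1]")
    if floor > base:
        raise ValueError("floor must not exceed base")
    return base - (base - floor) * failure_probability


def alarm(reading, base, floor, failure_probability):
    """Raise an alarm when the reading exceeds the current adaptive threshold."""
    return reading > adaptive_threshold(base, floor, failure_probability)
```

For example, with `base=10.0` and `floor=4.0`, a reading of 8.0 raises no alarm while the failure twin predicts no failure, but does raise one once the predicted failure probability reaches 0.5 and the threshold has dropped to 7.0. The more sensitive setting is only in force while a failure is anticipated, which limits the false alarms it would otherwise generate.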

[0148] Both digital twin models can operate with simulated (virtual) sensors where none exist in the physical world. Both can feed the FFT with data which might not otherwise exist because of missing sensors.

[0149] Because the twins described in this example are mechanistic, they can encode vast amounts of information efficiently. We can write the equations of even complex orbital mechanics quite briefly. The training data set to create an equivalent DNN is vast and the DNN would be huge. We avoid this complexity by partitioning mechanistic understanding to the twin subsystems. Both twins are mechanistic, so they are more or less immune to data bias.

[0150] The system with rotating machinery is an example. It illustrates how hybridization of these four twins can be far more powerful and reliable than reliance on a single model. It also illustrates how assigning different kinds of AI and ML provides improved explainable (transparent) predictions and prescriptions. It shows how we might extend the performance of an AI which can only predict problems and generate highly targeted prescriptions. It shows how we can be frugal in our system's demands for sensors, processing and bandwidth, while exceeding the performance of systems which are greedy and fragile.

[0151] The Vigorous AI described in this example blends machine learning and AI, with the knowledge of engineers encoded into the twins or other models. It provides partition of functionality. It preserves many types of non-linear and non-continuous representations which would be infeasible in a pure AI approach, even if the AI was made of many NNs and tree structures.

[0152] VAI enables ensembles without forced voting or averaging. Redundancy is another desirable attribute of VAI. Most random forests are ensembles, but because of the Six Limitations, they inherit the weaknesses already described. Most current AI ensembles are blended by averaging or selecting from among ensemble elements. Averaging is said to have won several hackathons with gradient boosted random forests. Ensemble blending methods include several variants, and several weighting functions, resulting in several different names. Perhaps because of the history of using deep AI for classification, these voting or averaging methods arrive at a deterministic result; either it's a cat, or not.

[0153] Because classifiers are based on the crisp linear logic of our left hemispheres, they are vulnerable to a problem/joke Pearl likes to tell his AI and Computer Science students:

[0154] Input:

[0155] 1. If the grass is wet, then it rained.

[0156] 2. If we break this bottle, the grass will get wet.

[0157] Output: If we break this bottle, then it rained.

Neurological studies show that right hemisphere damage makes humans prone to say the same kind of silly things as Pearl's hypothetical AI. The right hemisphere can deal with multiple alternatives and ambiguity better than the left. The left hemisphere masters the familiar and would posit Pearl's first statement. The right knows the second one is silly. Together they are more powerful than either on its own. This is part of the reason normal humans get the humor of Groucho Marx.
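The contrast can be made concrete with a short, hedged sketch. The first function chains the two rules exactly as a crisp rule engine would and arrives at the silly conclusion. The second treats the same situation probabilistically, under a toy model assumed only for illustration: the grass is wet exactly when it rained or the bottle was broken, rain and bottle-breaking are independent, and the prior probabilities 0.2 and 0.1 are arbitrary. None of these names or numbers come from this disclosure.

```python
from itertools import product


def chain(facts, rules):
    """Crisp forward chaining: apply 'if premise then conclusion' until stable."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if premise in derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived


def p_rain_given(evidence, p_rain=0.2, p_bottle=0.1):
    """P(rain | evidence) by enumeration; grass is wet iff rain or broken bottle."""
    numerator = denominator = 0.0
    for rain, bottle in product([True, False], repeat=2):
        world = {"rain": rain, "bottle": bottle, "wet": rain or bottle}
        if any(world[name] != value for name, value in evidence.items()):
            continue  # this world contradicts the evidence
        p = (p_rain if rain else 1 - p_rain) * (p_bottle if bottle else 1 - p_bottle)
        denominator += p
        if rain:
            numerator += p
    return numerator / denominator


rules = [("grass_wet", "rained"), ("bottle_broken", "grass_wet")]
```

Here `chain({"bottle_broken"}, rules)` contains `"rained"`, reproducing the joke. In the probabilistic model, wet grass alone raises the probability of rain well above its prior, but once the broken bottle is known the probability of rain falls back to the prior, because the bottle already explains the wet grass.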

[0158] Humans are comfortable with ambiguity and even contradiction. This is one of the characteristics of intelligence higher than simplistic DNNs.

[0159] As described above, using VAI principles, we can construct a system with redundant systems which are highly dissimilar. We can easily create a vector of results, and a wide range of system actions or responses. To improve transparency, we can meaningfully compare the system component findings.
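One hedged way to picture a vector of results that is compared rather than voted on is sketched below. The function name `compare_findings`, the component labels, and the `tolerance` value are hypothetical, and the sketch assumes each component reports a single number on a common scale.

```python
def compare_findings(findings, tolerance=0.25):
    """Keep every component's finding and describe how far apart they are.

    Nothing is averaged or voted away; the caller receives the full vector
    plus a flag saying whether the components materially disagree.
    """
    if not findings:
        raise ValueError("at least one finding is required")
    lowest = min(findings, key=findings.get)
    highest = max(findings, key=findings.get)
    spread = findings[highest] - findings[lowest]
    return {
        "findings": dict(findings),
        "lowest": lowest,
        "highest": highest,
        "spread": spread,
        "contested": spread > tolerance,
    }


report = compare_findings({
    "fft": 0.9,
    "performance_twin": 0.7,
    "failure_twin": 0.2,
    "as_built": 0.3,
})
```

In this example the report is marked as contested and names the components holding the extreme views, so a downstream controller or a human reviewer can see the disagreement instead of a single blended score.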

[0160] As described above, the exemplary VAI system can create the four separate abstractions and call each a digital twin. We don't need to sacrifice any of the insights we can glean from each of them. And, because of segmentation, we can think of them as a sparse matrix. Building out the entire matrix (even if it was possible considering the Six Limitations) would be a huge NN in a deep AI. But with a VAI system implementation, we have reasonable hope of building the system described on a limited edge computing platform. This is a very robust and dynamic form of redundancy, not easily achieved by OAs, but natural for the VAI.

[0161] VAI naturally accommodates the control of multiple components and abstractions by means of a Master Model Integrator (MMI). As shown in FIG. 6, MMI is a means to integrate disparate components into a hybrid system, whether a VAI, or simply a hybrid simulation system. An MMI system is an integrating model system with native probabilistic processing. Lone Star's TruNavigator and AnalyticsOS are examples of such native stochastic modeling environments.

[0162] To integrate deterministic models, their vectorized results, embodied in SLURPs or similar system constructs, are used in the integrating model system. When a full system of systems has been represented, optimization elements of the system are activated. As an example, a non-linear optimizer such as that used in Evolved AI can be operated. Other stochastic methods can be used, including stochastic gradient descent (SGD).

[0163] Because the relationships in each trial set are preserved, the optimal inputs can be determined, while at the same time integrating several different legacy systems represented in their SLURPs (or other such system construct).
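The value of preserving the relationships within each trial can be illustrated with a small sketch. It assumes, for illustration only, that each legacy model exposes its results as a vector with one entry per trial, and that entry i of every vector belongs to the same trial. An exhaustive comparison of a few candidate inputs stands in for the non-linear or stochastic optimizers mentioned above. The function name `best_option` and the sample figures are hypothetical.

```python
from statistics import mean


def best_option(trial_sets, combine):
    """Pick the option with the best mean outcome across aligned trials.

    trial_sets maps an option name to a tuple of result vectors, one vector
    per legacy model. The vectors are combined trial by trial, so whatever
    relationship exists between the models within a trial is preserved.
    """
    scores = {}
    for option, vectors in trial_sets.items():
        per_trial = [combine(*row) for row in zip(*vectors)]
        scores[option] = mean(per_trial)
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical outputs of two legacy models over the same three trials:
# (production volume per trial, price per trial)
trial_sets = {
    "low_rate": ([100, 100, 100], [50, 60, 70]),
    "high_rate": ([150, 90, 120], [50, 60, 70]),
}
choice, scores = best_option(trial_sets, lambda volume, price: volume * price)
```

Because the vectors are multiplied row by row, a trial with high volume is always paired with the price from that same trial. Combining summary averages from each model separately would discard that pairing.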

[0164] The MMI can function in a number of useful ways. For oil field optimization, MMI can integrate large systems such as reservoir estimation systems, gathering system flow management and prediction systems, oil price futures economic models, artificial lift modeling systems, and others. Some of these system components are likely to represent very large sunk investments. MMI creates a viable integration system without the need to reinvest or replace successful legacy systems. For VAI, MMI provides a means to optimally integrate highly dissimilar system components, with the result being a Vigorous AI which optimally integrates system elements which might not otherwise be easily blended, much less optimized.

[0165] Preservation of uncertainty is another feature of VAI. Most current deep AI performs probabilistic feature detection of some sort. But the shape and span of uncertainty is lost, and yes/no binary results are the typical results. A critical aspect of this loss is treatment of truth as a binary attribute. AI classifiers mimic the left hemisphere and attempt to determine if something is true/not. This can be thought of in terms of classical probability of detection and false alarm, the intersection of signal processing and Bayesian decision making. But perhaps more importantly, it can be thought of in terms of considering contrasting and even contradictory findings.

[0166] Current deep AI successfully avoids Orwellian doublethink. Doublethink holds contradictory and even opposing beliefs without cognitive dissonance. But it sacrifices the uncertainty around a person in a cat suit. Is that a picture of a cat? Well, yes, it is, but no it isn't. This sacrifice is at the heart of current AI's inability to tell the difference between good natured humor, sarcasm, and falsehood. It explains why current AI can't get Groucho Marx. With VAI there is no need for this sacrifice. The full span of uncertainty is easily preserved by stochastic methods in the mechanistic system components. Thresholds for classification, alerts and system state transitions can be adaptive. VAI also preserves competing representations of truth.

[0167] The preservation of both types of uncertainty is immensely powerful. In biological systems we see this use of uncertainty for mode shifting and threshold adaptation. A prey species will shift to several different behaviors depending on the odds that a stimulus represents a predatory threat. White-tailed deer will freeze under some conditions, and leap away under slightly different conditions. The presence of another deer, alerting with white tail held high, will shift the odds of freezing or fleeing.

[0168] In most prey species, the right hemisphere seems to be constantly looking for danger. Constantly looking for uncertain trouble is perhaps the most important form of uncertainty to be preserved for many VAI applications. In the case of GPS spoofing, or navigation failures, VAI can mimic the right hemisphere by constantly asking is the GPS really working? And asking, when was the last time GPS was provably correct?

[0169] Preservation of uncertainty has applications to both internal VAI processing, and to VAI interactions with other systems, and with humans. Military research has long focused on dealing with uncertainty, while recognizing its constant presence. Clausewitz spoke of the fog of war but the concept predates even Sun Tzu. A former Secretary of Defense often reminded subordinates, the first reports are always wrong. Recent work by Professor Sibel Adali at Rensselaer Polytechnic Institute and Dr. Jin-Hee Cho of the U.S. Army Research Laboratory shows that preserving uncertainty, even when a great deal of doubt exists, is helpful for commanders making critical decisions.

[0170] Preservation of uncertainty, and the option to provide a representation of it, is critical in highly ambiguous circumstances. It allows making better decisions faster. Bayesian adaptation is an example of benefits of preserving uncertainty. These methods are richer than simply noting a threshold was (or was not) crossed the last time we tested. Preservation of uncertainty is also critical in high dimensional spaces, which VAI is well suited to address.
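As one hedged illustration of Bayesian adaptation, the sketch below keeps a full belief about how often a subsystem (for example, a navigation source) passes an independent check, rather than remembering only the most recent pass or fail. It assumes a Beta distribution over the pass rate with a uniform prior. The class name `BetaBelief` and this choice of model are illustrative assumptions, not part of this disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaBelief:
    """Belief about a pass rate, held as a Beta(alpha, beta) distribution."""
    alpha: float = 1.0  # pseudo-count of checks that passed (1.0 = uniform prior)
    beta: float = 1.0   # pseudo-count of checks that failed

    def update(self, passed):
        """Return the posterior belief after observing one more check."""
        if passed:
            return BetaBelief(self.alpha + 1.0, self.beta)
        return BetaBelief(self.alpha, self.beta + 1.0)

    @property
    def mean(self):
        return self.alpha / (self.alpha + self.beta)

    @property
    def variance(self):
        total = self.alpha + self.beta
        return (self.alpha * self.beta) / (total * total * (total + 1.0))


belief = BetaBelief()
for passed in [True, True, True, False]:
    belief = belief.update(passed)
```

After three passes and one failure the belief has a mean of about 0.67 and a smaller variance than the prior. Both numbers remain available to downstream logic, so a threshold or mode change can depend on how uncertain the estimate still is, not only on whether the last check passed.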

[0171] Mechanistic methods are particularly useful for VAI. If we have adequate engineering information to know that speeds above a certain RPM exhibit non-linear bearing wear, compared to a different wear profile at lower speeds, then we should not expect it to be useful to attempt to train a deep AI, whether NN or tree. The curse of dimensionality will deny us a rich data set in the non-linear failure region, and it is likely our fitting will result in an AI more linear than the runaway failure mode.

[0172] Mechanistic methods can easily deal with T4C violations, if business rules or laws of physics or other constructs define the system and its behavior. Use of mechanistic models which incorporate uncertainty provide the flexibility to address problems which might otherwise be intractable for deep AI methods. In particular, mechanistic methods deal well with non-linear, non-continuous, and non-monotonic functions. We simply don't need an AI to learn repeatedly the temperature at which electrical insulation fails. But the chemical breakdown of insulation is controlled by a highly non-linear exponential function, something difficult to train a DNN to learn with limited observations of that particular failure.

[0173] Mechanistic methods also deal with multiple types of uncertainty. It is very natural to create a mechanistic model to deal with problems like the Monty Hall problem, even though the person programming the model might be prone to misunderstand it. Good mechanistic models teach us things we didn't understand. This is a contrast to AI classifiers which often depend on the limits of human understanding.
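The Monty Hall problem mentioned above lends itself to a compact mechanistic model. The sketch below encodes only the rules of the game (the car is placed uniformly at random, the contestant picks uniformly at random, and the host opens a door that hides no car and was not picked, choosing at random when two such doors exist) and enumerates every case exactly. The function name `win_probability` is hypothetical, and the three-door standard form of the game is assumed.

```python
from fractions import Fraction
from itertools import product

DOORS = (0, 1, 2)


def win_probability(switch):
    """Exact probability of winning the car, with or without switching."""
    wins = Fraction(0)
    for car, pick in product(DOORS, repeat=2):
        weight = Fraction(1, 9)  # car position and first pick are uniform
        # The host may open any door that is neither the pick nor the car.
        openable = [d for d in DOORS if d != pick and d != car]
        for opened in openable:
            branch = weight / len(openable)
            if switch:
                # Switch to the one door that is neither picked nor opened.
                final = next(d for d in DOORS if d not in (pick, opened))
            else:
                final = pick
            if final == car:
                wins += branch
    return wins
```

The model returns 2/3 for switching and 1/3 for staying. The person writing it only has to state the rules correctly; the counterintuitive answer falls out of the enumeration, which is the sense in which a good mechanistic model can teach its author something.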

[0174] Mechanistic algorithms, even if stochastic, may be derided as hard coded. Primates arrive with some hard coding at birth. All species know how to breathe and eat. A Whitetail fawn can walk. Mammals can nurse. Hard coding is not antithetical to intelligence. Rather, it protects and preserves the allocation of flexible and adaptive processing where learning is needed.

[0175] Brute force deep AI methods waste resources (both data and processes) by rejecting mechanistic methods. Because mechanistic models can be implemented in systems quickly and can offload those areas where deep AI is truly needed, we gain significant reductions in data collection, data cleaning, model training and other limitations of pure AI and OA approaches.

[0176] On the other hand, mechanistic methods are greatly enhanced by using the strengths of AI when uncertainty is preserved. The three previous principles (Avoiding undesirable depth by segmentation, Ensembles without forced voting or averaging (redundancy) and Preservation of uncertainty) create even more vigor when appropriate mechanistic methods are employed. It should be clear, for example, that the dimensionality reduction from segmentation is enhanced by use of mechanistic models.

[0177] Hard overrides are more feasible in VAI systems than pure deep AI systems. Because VAI segments a system into understandable components, intervention based on those components is more natural. As a result, mode changes and safeguards are more easily devised. This is due to:

[0178] improved explainability of segmented systems over massive, deep AI

[0179] improved explainability of mechanistic systems over pure AI

[0180] sophisticated override logic enabled by preservation of uncertainty characteristics

[0181] comparison of multiple dimensions of ensemble results

System mode (or state) logic and other forms of hard overrides allow VAI to exhibit highly non-linear behavior with rich adaptation. Such behavior is difficult or impossible to achieve with traditional AI or OA.
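A hard override can be pictured as a small, testable rule layer that sits above whatever the blended models propose. The following is a minimal sketch under assumptions not found in this disclosure: the function name `supervise`, the rule format (a name, a predicate on the system state, and a forced action), and the example temperature and speed limits are all hypothetical.

```python
def supervise(proposed_action, state, overrides):
    """Apply hard overrides, in priority order, to a proposed action.

    overrides is a list of (name, applies, forced_action) tuples. The first
    rule whose predicate holds for the current state replaces the proposal.
    Returns the action to take and the name of the rule that fired, if any.
    """
    for name, applies, forced_action in overrides:
        if applies(state):
            return forced_action, name
    return proposed_action, None


# Hypothetical limits for a rotating machine, listed from highest priority down.
overrides = [
    ("overtemperature", lambda s: s["temp_c"] > 120.0, "shutdown"),
    ("overspeed", lambda s: s["rpm"] > 3600.0, "reduce_speed"),
]
```

For example, `supervise("continue", {"temp_c": 130.0, "rpm": 3000.0}, overrides)` returns the forced shutdown together with the name of the rule that fired, while a state within limits passes the proposal through unchanged. Because each rule is an explicit predicate, it can be tested in isolation, and the returned rule name supports the explainability the text attributes to segmented systems.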

[0182] The societal risks from pure AI or OA making life altering choices are lessened by this attribute of VAI systems. For example, a mechanistic human performance model representing the experience of a parole officer could estimate the span of uncertainty around the officer's judgements and recommendations. Officers with a great deal of experience, and whose knowledge is current in the topics relevant to a case, might be deemed less prone to bias or error than a less proven officer. A VAI combining the mechanistic experience model, estimating the risks of error and bias by the officer, along with a deep AI risk model, could be supervised by hard overrides testing for statistical bias in the officer's case load or in other statistical pools. The logic could shift from alerting a supervisor to a plausible statistical bias, to an urgent warning depending on the elements of the VAI.

[0183] This approach has analogues to previous examples. To prevent the hypothetical fishing AI from draining the lake, we would summarize the applicable fishing laws. To prevent distraction from GPS or other navigation and timing controls, we would add some safety checks to the supervisor.

[0184] Non-linear mode controls are a final attribute of VAI described in this paper. In biological systems, the elements of the system are integrated into control mechanisms separate from, and in ways more richly complex than hard overrides.

[0185] A skilled pilot shifts focus to a few factors while landing an aircraft. This shift of attention varies from other flight regimes. But it also varies with the landing. Low visibility, buffeting cross winds, precipitation, and other factors will cause critical changes in behavior and focus. Even though the pilot has a checklist to guide the correct typical behavior, a skilled pilot's experience changes the allocation of effort and attention.

[0186] Primates have interconnected cognitive, limbic, and endocrine systems. These are richly connected in highly non-linear ways we do not fully grasp. The sense that something is wrong might come from some combination of right hemisphere (unspoken) processes. It could seem more urgent if sensory information (balance, noise) were unfamiliar. This combination could trigger an adrenaline response, shifting many physiological systems. This chain of events creates options, such as fight or flight without committing to a predetermined path. The chain starts a response with good prospects (best Bayesian prior) and modifies the path as information is added. The fight or flight decision might further be modulated by factors such as those shown in FIG. 7. The defense of the vulnerable (right side of FIG. 7) might change the choice to fight and make flight an unethical choice.

[0187] This is far different from brittle AI, which is like the left hemisphere of the brain. With nothing like the right hemisphere to insist on other options, it is difficult for DNNs either to represent or to preserve uncertainty. Even tree methods will struggle when uncertainty is outside the training deck which pruned and tuned the tree.

[0188] Primate intelligence displays phenomena which some neuroscientists call modulation. The prefrontal cortex may override the impulses of the posterior brain. One part of the left hemisphere may energize its complement on the right side when a stimulus is outside the familiar patterns the left side prefers. Modulation is roughly the same as what we mean by non-linear controls, and in primates it is particularly important when dealing with uncertainty. Thus, the preservation of uncertainty is necessary for rich non-linear controls.

[0189] VAI can integrate subsystems to form a system of systems. The subsystems control each other in highly discontinuous and non-linear ways. Because traditional deep AI has underlying linearity, this is more difficult and less reliable than a VAI made up of segmented functions, some of which are mechanistic, and can be constructed to any form we choose, not constrained by T4C. Like primate brain modulation, this form of control permits alternating dominance where each VAI component can take precedence (whether a hard override or not). The four VAI systems/subsystems described in the example above can alternate in their dominance, just as the brain may alternate from left to right and from posterior to prefrontal as a decision is formed.

[0190] This dynamic, non-linear control makes traditional deep AI functions far more useful and powerful than might be possible in an OA. Just as the left side of the brain alone is brittle, stubborn and greedy, a VAI approach can augment deep AI with mechanistic systems that are less prone to these problems.

[0191] To summarize, this teaching has explained six principles for VAI systems and methods:

[0192] 1. Avoiding undesirable depth by segmentation. Decomposition allows faster execution, improved exploitability, permits component reuse, and makes VAI control a natural and feasible system feature.

[0193] 2. Ensembles without forced voting or averaging (redundancy). Ensemble methods are powerful because they represent rich diversity. VAI seeks to preserve these multiple system states, even when they seem contradictory.

[0194] 3. Preservation of uncertainty. Uncertainty is critical in VAI because it improves control functions, and because it provides human decision makers with context for aided choices. Further, it allows rich representation of uncertainty both in probabilistic terms, and in terms of alternate, even competing results.

[0195] 4. Mechanistic methods. Cause and effect-based system components are less prone to seduction by falsified data and often provide parsimony (compactness). This has the effect of directly reducing important risks of current AI methods, while preserving rich hyper-dimensionality. Because there are a small number of such models which are needed repetitively to represent modern economies, systems, processes and living, these can be reused over many VAI systems, without the need for extensive hand programming, the typical objection raised by AI cargo cultists. For example, a rich mechanistic model of electric motors, ubiquitous in modern civilization, can be generated and reused in large numbers, without resorting to training a NN on these devices. Those skilled in the art will see how many topics can be supported by such compact, and reliable models, without the need for brute force AI, and without the risks of seduction faced by such AI.

[0196] 5. Hard overrides. To improve safety, legal compliance, and to enable ethical AI, hard, testable controls are part of the VAI controller.

[0197] 6. Non-linear mode controls. To improve robustness and deal with rare but important events, highly non-linear controls are part of the VAI system control component resulting in robustness not achievable with current AI and ML methods.

Taken together the last two principles (hard overrides, and non-linear control) provide a means to create VAI systems which embody the other four principles.

[0198] Three examples help to illustrate system implementation.

[0199] Example 1: An MMI controller provides a means of optimal control for a VAI composed of arbitrary system components. These components might include direct hardware sensors, DNNs, mechanistic models and other components. The MMI can be set to provide optimum control and response, even to novel conditions which had not been available for training purposes.

[0200] Example 2: Using MMI methods, a DNN can be trained to respond to system components. The DNN's responses can be classified as correct and incorrect for training purposes.

[0201] Example 3: A hybrid controller composed of hard-coded rules, a DNN, and an MMI can provide a segmented controller to oversee the segmented system components.

[0202] Turning now to FIG. 8, a simplified example of VAI is presented, in accordance with an embodiment, wherein:

[0203] A traditional NN (upper left) has been trained with data.

[0204] An unconstrained NN includes FFT functionality (upper right).

[0205] An algorithm mimics the human tendency to protect or defend based on face and eye characteristics (lower left).

[0206] The VAI Modulator amplifies or inhibits the flow of data (from IoT, lower right) and modulates the weight given to results generated by the VAI components in order to generate results (lower center).

The traditional NN has all the benefits associated with that technology. It does not depend on known cause-effect relationships. The unconstrained NN with FFT functionality has all the benefits associated with that technology; it is unconstrained by the limitations of UAT. It responds only to signals matching the FFT. The VAI Modulator can choose to ignore input data, such as high-frequency sounds from predator-faced entities, or give weight to high-frequency sounds from vulnerable-faced entities. Thus, the FFT results will be preempted in one instance and amplified in another.
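As a minimal sketch of this modulation, and not a definitive implementation, the following assumes that the face-based algorithm emits a class label and that the frequency-domain component is summarized by the fraction of spectral energy above a cutoff. A naive discrete Fourier transform stands in for the FFT-capable network of FIG. 8; the names `high_freq_fraction`, `modulate`, and the gain values are hypothetical.

```python
import cmath
import math

def high_freq_fraction(samples, sample_rate, cutoff_hz):
    """Fraction of non-DC spectral energy at or above cutoff_hz.
    Uses a naive DFT for clarity; an FFT would give the same result faster."""
    n = len(samples)
    total = high = 0.0
    for k in range(1, n // 2 + 1):  # skip the DC bin
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(samples))
        energy = abs(coeff) ** 2
        total += energy
        if k * sample_rate / n >= cutoff_hz:
            high += energy
    return high / total if total else 0.0

# Hypothetical gains: inhibit for predator faces, amplify for vulnerable faces.
GAINS = {"predator": 0.0, "vulnerable": 2.0}

def modulate(face_class, hf_fraction):
    """Scale the frequency-domain result by a gain chosen from the face class."""
    gain = GAINS.get(face_class, 1.0)  # unknown classes pass through unchanged
    return min(1.0, gain * hf_fraction)
```

The same frequency-domain result is thus preempted in one context and amplified in another, without retraining either component.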

[0207] No traditional AI could match the performance of the exemplary system shown in FIG. 8. Training a traditional NN to perform the functions of an FFT would require massive data sets and is vulnerable to noise, among other types of fragility. The VAI modulator differs from the obvious amalgam methods used in traditional AI. Rather than simply passing information from one stage to the next as shown in FIG. 4, the interaction and data flows are modulated. Both amplification and attenuation are performed. Timing and sequencing are implicit to the VAI modulator, which, like the corpus callosum, actively controls information flow to influence the outcome of the intelligence.

[0208] Turning now to FIG. 9, four types of digital twins (a plurality of twins) are represented, in accordance with an embodiment. Each of the four has different attributes, as described in U.S. patent application Ser. No. 16/270,338, by Roemerman, entitled System and Method That Characterizes an Object Employing Virtual Representations Thereof, which is incorporated herein by reference. The as-built twin represents configuration, and changes in configuration over time. The performance twin is a representation of how the item should be expected to perform. If live performance varies, this might suggest intervention is needed. The failure twin predicts failure and remaining useful life. It may also prescribe specific interventions. The FFT Evolved AI Twin can mimic frequency domain attributes, in contrast to the time domain representations of the other three twins. It will be apparent to those skilled in the art that other twins might be included in this example, such as deep neural network trained twins. NN twins could provide alternative versions of both the performance twin and the mechanistic twin. An example of NN performance twins can be seen in U.S. Pat. No. 10,430,531 by Haye, entitled Model-based System Monitoring, which is incorporated herein by reference. Thus, the plurality shown here could easily have many more elements, as will be apparent to those skilled in the art.

[0209] The Vigorous AI Modulator (center) moderates the information exchange between items and provides oversight of the system, as a whole. It determines which predictions and prescriptions are published to the system users (top center).

[0210] Internet of Things (IoT) and other input/output data flow is shown on the far right. This represents data flow available to the system, including data which the system observes but does not control. This data interacts with all the other elements shown in FIG. 9, but interconnections are omitted for the sake of clarity. Not all data need be controlled by the VAI Modulator. Some direct information flow may be permitted.

[0211] The as-built twin (lower right) is a database of the configuration of each item to be twinned. It is used to configure the other three twins, indicated by the three thin dashed lines. For example, if the item to be twinned can be configured with different types of electric motors, the as-built twin data will reflect motor characteristics such as line voltage, single or multiple phase, whether the motor has variable frequency drive control, target RPM, maximum allowable temperature, or any of several other attributes which will be apparent to those skilled in the art.

[0212] The performance twin (upper left) predicts how the item should perform in the real world. If a plurality of performance twins were employed, each of those twins would provide such predictions. While FIG. 9 shows a mechanistic twin, an NN twin could be employed instead of, or in addition to, the mechanistic twin. Differences between observed performance and predicted performance are a key attribute used by the VAI Modulator. If a plurality of performance twins were employed, the differences between these twins, and their differences compared to the real world, would further be used by the VAI Modulator.
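The two kinds of difference described in this paragraph can be sketched as follows. The function name, the dictionary of twin names, and the tolerance parameter are assumptions made for illustration only.

```python
from statistics import pstdev

def twin_differences(observed, predictions, tolerance):
    """Compare one observed value with predictions from several performance twins.

    observed    -- measured real-world value
    predictions -- dict mapping a twin name to its predicted value
    tolerance   -- hypothetical limit on acceptable |observed - predicted|
    """
    # Difference between the real world and each twin.
    residuals = {name: observed - value for name, value in predictions.items()}
    # Disagreement among the twins themselves (population standard deviation).
    values = list(predictions.values())
    disagreement = pstdev(values) if len(values) > 1 else 0.0
    # Twins whose residual exceeds the tolerance are kept visible, not averaged away.
    flagged = sorted(name for name, r in residuals.items() if abs(r) > tolerance)
    return residuals, disagreement, flagged
```

For example, `twin_differences(72.0, {"mechanistic": 70.0, "nn": 74.5}, 2.2)` reports a residual for each twin, a disagreement of 2.25 between the twins, and flags only the NN twin.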

[0213] The FFT Evolved AI Twin (upper right) provides a frequency domain representation of the item.

[0214] The Vigorous AI Modulator provides control functions and oversight. For example, if the as-built twin shows a change in configuration, but IoT data I/O shows the system has not been taken offline, the configuration change might be challenged, or the configuration update delayed, pending down time and validation that the configuration has been updated. Thus, the VAI can accommodate the contradiction of a supposed configuration update which is not consistent with other abstractions in the plurality of twins.

[0215] Meanwhile, the contradiction can be used to create anticipation. If the configuration twin shows component recalls have been issued, while the failure twin is predicting imminent end-of-life for the same component, the VAI might issue a prescription to prioritize intervention, and further alert maintainers that while the configuration twin shows a change has been made, the actual update seems not yet to have been accomplished. The VAI is anticipating the need for more urgent attention. This is just one example, and other similar benefits from the VAI will be understood by those skilled in the art.
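The oversight logic of the two preceding paragraphs can be expressed as a small set of explicit rules. This is a sketch under stated assumptions: the `TwinState` fields and the action names are hypothetical and merely restate the conditions described above.

```python
from dataclasses import dataclass

@dataclass
class TwinState:
    config_changed: bool    # as-built twin reports a configuration change
    taken_offline: bool     # IoT data shows the item was taken offline
    recall_issued: bool     # configuration data shows a component recall
    failure_imminent: bool  # failure twin predicts imminent end-of-life

def modulator_actions(state):
    """Return the oversight actions implied by the current twin states."""
    actions = []
    # A configuration change without down time contradicts the IoT evidence.
    contradiction = state.config_changed and not state.taken_offline
    if contradiction:
        actions.append("challenge_configuration_update")
    # A recalled component that is also predicted to fail deserves priority.
    if state.recall_issued and state.failure_imminent:
        actions.append("prioritize_intervention")
        if contradiction:
            actions.append("alert_update_not_yet_accomplished")
    return actions
```

The contradiction is retained as a named condition and reused by a later rule, rather than being resolved and discarded.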

[0216] As previously mentioned, not all information flow need be under the control of the VAI Modulator. Humans exhibit a variety of reflexes, as do other higher forms of life; these are sometimes classified as hereditary, instinctive, or otherwise innate. These reflexes do not normally operate under the higher forms of cognitive processing.

[0217] Avoiding using the VAI Modulator as a choke point provides faster reactions for some critical functions and avoids processing burden for mundane tasks. Higher forms of life do not require use of higher cognitive functions for such simple acts as chewing, swallowing, blinking, stepping, or standing (depending on the species). Those skilled in the art will see this is a significant system architectural advantage compared to brute force AI, with large scale NNs. This is an example of how VAI directly addresses the curse of dimensionality, and other AI processing challenges.
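A reflex path that bypasses the modulator can be sketched as a simple dispatch. The signal format, the `route` function, and the handler names in the usage note are hypothetical.

```python
def route(signal, reflexes, modulator):
    """Send a signal down a direct reflex path when one is registered for its
    kind; otherwise hand it to the modulator for fuller processing."""
    handler = reflexes.get(signal["kind"])
    if handler is not None:
        return "reflex", handler(signal)
    return "modulator", modulator(signal)
```

For instance, with `reflexes = {"overtemp": lambda s: "shutdown"}`, an over-temperature signal would be handled immediately, while any other signal kind would still reach the modulator.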

[0218] FIG. 9 helps to illustrate how a VAI can mimic the systems architecture of higher forms of intelligent life by partitioning, modulation, and prioritization. Further, it helps illustrate that multiple forms of processing, from NNs to Evolved AI to mechanistic models, can be blended to create hybrid vigor, with attributes superior to any of the individual constituents. Those skilled in the art will appreciate this is exemplary, and many other configurations will be obvious.

[0219] Turning now to FIG. 10, an exemplary system to validate the attribution of news information, and to assign a credibility score is presented, in accordance with an embodiment. This exemplary system illustrates how VAI can blend uncertainty, ambiguity, and multiple types of AI to produce a responsive and trustworthy assessment of live news feeds from the internet.

[0220] In the upper left, a corpus of historic news has been compiled. This would include stories from mainly straight news sources such as United Press International, the Wall Street Journal, etc. In addition, slanted news sources are included: state-controlled press such as RT Press (Russian press), advocate press such as National Review, Mother Jones, etc., and satire press such as The Onion and Babylon Bee.

[0221] The corpus of historic news and attribution are used to train two AI systems which classify each story in two dimensions; attribution and topic. The AI systems are an FFT based Evolved AI (upper center), and a Natural Language Processing Deep Neural Network (lower center). These two systems provide a redundant input to the VAI Modulator (lower right), when live News Feed information (lower left) is presented to the AI.

[0222] A corpus of judged reliability by topic and source (upper right) also feeds the VAI Modulator. A panel of human experts is tasked to assess the reliability of news sources on a variety of topics, matching the topic classifications used to train the two AIs. A given judge might assign high credibility on sports to Sports Illustrated but less credibility to RT Press and zero credibility to The Onion. Publications by advocacy groups and industry associations provide further classes of news sources to be judged, which represent both a degree of expertise, and a degree of bias.

[0223] The panel of judges' inputs provides a spread of uncertainty representing source reliability (a statistical distribution). This distribution reflects both the span of credibility of the source, and the perception of credibility across the population of judges. The judges can be selected to ensure diversity of expert knowledge, political persuasion, religious belief, gender balance, cultural perspectives, and ethical viewpoint. Thus, the risk of unethical bias can be greatly reduced. At the same time, using this diverse group to create a distribution of judgements preserves contradictory viewpoints and honest disagreement.
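One way to retain the judges' inputs as a distribution, rather than collapsing them to a single score, is sketched below. The 0-to-1 rating scale, the function name, and the returned field names are assumptions for illustration.

```python
from statistics import mean, stdev

def reliability_distribution(scores):
    """Summarize the judges' credibility ratings for one (source, topic) pair.

    scores -- ratings on an assumed 0.0 to 1.0 scale, one per judge
    """
    return {
        "scores": sorted(scores),  # raw judgements are kept, not discarded
        "mean": mean(scores),
        "spread": stdev(scores) if len(scores) > 1 else 0.0,
        "min": min(scores),
        "max": max(scores),
    }
```

A wide spread signals honest disagreement among the judges, which downstream rules can treat differently from a narrow spread around the same mean.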

[0224] The corpus of judged reliability drives a rule-based system, similar to the expert systems of early AI, within the VAI Modulator. The Modulator compares the topic and attribution results from the two AIs to create a composite topic and attribution result. If the two AIs disagree on topic or on attribution, this indicates a risk in the system's ability to confidently classify a news story. Agreement indicates less risk in classification. Classification results (topic and attribution) are compared with the attribution in the news feed. If byline attribution fails to match with the AI classifier attribution, this also indicates risk in the attribution.
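The comparisons described in this paragraph can be sketched as explicit, explainable rules. The tuple layout and the risk labels are hypothetical, and treating a byline as mismatched only when it matches neither classifier is an assumption; the disclosure does not specify how a partial match is handled.

```python
def assess_story(fft_result, nlp_result, byline):
    """Return the list of classification risks for one news story.

    fft_result, nlp_result -- (topic, attribution) pairs from the two AIs
    byline                 -- attribution claimed in the live news feed
    """
    risks = []
    if fft_result[0] != nlp_result[0]:
        risks.append("classifiers_disagree_on_topic")
    if fft_result[1] != nlp_result[1]:
        risks.append("classifiers_disagree_on_attribution")
    # Assumption: the byline is flagged only if it matches neither classifier.
    if byline not in (fft_result[1], nlp_result[1]):
        risks.append("byline_does_not_match_classifiers")
    return risks
```

An empty list indicates agreement on every check; each entry otherwise names the specific reason confidence is reduced.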

[0225] The rule-based system avoids human biases such as confirmation bias. A story which confirms the prejudices of a human reader is unlikely to mislead the system presented in FIG. 10. Further, the rules represented in the VAI modulator can be crafted to ensure ethical responses. Ambiguity will be treated consistently, regardless of the topic, for example.

[0226] It will be clear to those skilled in the art that the system presented in FIG. 10 can provide explainable, fast assessments of news stories with reduced risk of bias, prejudice and unethical behavior. Further, it will be clear to those skilled in the art that this system resists deception and false attribution and can provide an assessment of confidence in the source and the content of a story, which would otherwise be impractical.

[0227] The exemplary systems shown in FIGS. 8, 9, and 10 serve to illustrate how the Vigorous AI system architecture disclosed herein can be adapted to a wide range of applications. Like older AI methods, VAI has broad fields of use, ranging from natural language processing to machine control, maintenance planning, and many others.

[0228] Turning now to FIG. 11, illustrated is a flow diagram of an embodiment of a method 1100 of evaluating an object to predict a characteristic of the object. The method 1100 may be employable in an artificial intelligence system to predict the characteristic of the object. The method 1100 is operable on a processor such as a microprocessor coupled to a memory. The method 1100 begins at a start step or module 1105.

[0229] At a step or module 1110, a first model is trained using a corpus of data.

[0230] At a step or module 1120, an object is evaluated using the first model to produce a first prediction of a characteristic of the object.

[0231] In an embodiment, the evaluating the object using the first model includes producing a quality of the first prediction. In an embodiment, the quality of the first prediction includes a confidence thereof.

[0232] At a step or module 1130, the object is evaluated using a second model to produce a second prediction of the characteristic of the object, the second model being dissimilar to the first model.

[0233] At a step or module 1140, a final prediction of the characteristic of the object is generated as a function of dynamic weightings of the first prediction and the second prediction.

[0234] In an embodiment, the dynamic weightings are a function of at least one external input. In an embodiment, the dynamic weightings are a function of predefined rules. In an embodiment, the dynamic weightings are constructed to ensure explainability.
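A minimal sketch of steps 1120 through 1140 follows, assuming each model returns a prediction together with a confidence between 0 and 1. The confidence-weighted combination shown is only one possible function of the dynamic weightings; the names `final_prediction`, `external_gain`, and `rules` are hypothetical. The two models are evaluated in parallel, as in one described embodiment.

```python
from concurrent.futures import ThreadPoolExecutor

def final_prediction(obj, first_model, second_model,
                     external_gain=(1.0, 1.0), rules=()):
    """Combine two dissimilar models using dynamically computed weights.

    Each model is a callable returning (prediction, confidence).
    external_gain scales each weight from an external input.
    Each rule is a callable (obj, result1, result2, weights) -> new weights.
    """
    # Evaluate both models in parallel.
    with ThreadPoolExecutor(max_workers=2) as pool:
        future1 = pool.submit(first_model, obj)
        future2 = pool.submit(second_model, obj)
        (p1, c1), (p2, c2) = future1.result(), future2.result()

    # Weights start from prediction quality, then external inputs adjust them.
    w1, w2 = c1 * external_gain[0], c2 * external_gain[1]
    # Predefined rules may override the weights (for example, a hard override).
    for rule in rules:
        w1, w2 = rule(obj, (p1, c1), (p2, c2), (w1, w2))

    total = w1 + w2
    if total == 0:
        raise ValueError("all weights are zero; no prediction can be formed")
    # The normalized weights are returned so the result can be explained.
    return (w1 * p1 + w2 * p2) / total, (w1 / total, w2 / total)
```

Returning the normalized weights alongside the prediction is a design choice made here to support explainability: a reviewer can see how much each model contributed to a given result.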

[0235] The method 1100 terminates at end step or module 1150.

[0236] The first model can be, as an example, a Fast Fourier Transform trained on a stochastic network. Another example model can be a failure model, a mechanistic cause and effect model such as a Kalman filter, or an as-built model representing constituent parts that make up a system. The second model, being dissimilar to the first, can be, as a further example, a digital failure twin of the first model. Knowledge of an engineer can be encoded into one or more of the digital twins or into the other model. A traditional neural network trained with data, such as data from a newsfeed, can be functional for the first model.

[0237] The predicted characteristic of the system can include, as examples, an estimation of a natural resource reservoir capacity, a binary quality that may be true or not, or a quality of human judgment. A confidence of the predicted characteristic can be produced. A mechanistic cause and effect model can incorporate uncertainties.
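As an illustration of a mechanistic model that carries its own uncertainty, one cycle of a scalar Kalman filter is sketched below. The constant-state process model and the parameter names are simplifying assumptions; a practical model would encode the actual cause and effect dynamics of the system.

```python
def kalman_step(x, p, z, q, r):
    """One predict/update cycle of a scalar Kalman filter (constant-state model).

    x, p -- prior state estimate and its variance
    z    -- new measurement
    q, r -- process and measurement noise variances
    Returns the updated estimate and its variance.
    """
    p_pred = p + q                 # predict: state unchanged, uncertainty grows
    gain = p_pred / (p_pred + r)   # Kalman gain: trust placed in the measurement
    x_new = x + gain * (z - x)     # update the estimate toward the measurement
    p_new = (1 - gain) * p_pred    # updated variance is reduced by the measurement
    return x_new, p_new
```

The returned variance can serve as the quality, or confidence, of the prediction when the model's output is weighted against that of a dissimilar model.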

[0238] Turning now to FIG. 12, illustrated is a block diagram of an embodiment of an apparatus 1200 for predicting a characteristic of an object. The apparatus 1200 is configured to perform functions described hereinabove of predicting the characteristic of the object. The apparatus 1200 includes a processor (or processing circuitry) 1210, a memory 1220 and a communication interface 1230 such as a graphical user interface.

[0239] The functionality of the apparatus 1200 may be provided by the processor 1210 executing instructions stored on a computer-readable medium, such as the memory 1220 shown in FIG. 12. Alternative embodiments of the apparatus 1200 may include additional components (such as the interfaces, devices and circuits) beyond those shown in FIG. 12 that may be responsible for providing certain aspects of the device's functionality, including any of the functionality to support the solution described herein.

[0240] The processor 1210 (or processors), which may be implemented with one or a plurality of processing devices, perform functions associated with its operation including, without limitation, performing the prediction of the characteristic of the object. The processor 1210 may be of any type suitable to the local application environment, and may include one or more of general-purpose computers, special purpose computers, microprocessors, digital signal processors (DSPs), field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), and processors based on a multi-core processor architecture, as non-limiting examples.

[0241] The processor 1210 may include, without limitation, application processing circuitry. In some embodiments, the application processing circuitry may be on separate chipsets. In alternative embodiments, part or all of the application processing circuitry may be combined into one chipset, and other application circuitry may be on a separate chipset. In still alternative embodiments, part or all of the application processing circuitry may be on the same chipset, and other application processing circuitry may be on a separate chipset. In yet other alternative embodiments, part or all of the application processing circuitry may be combined in the same chipset.

[0242] The memory 1220 (or memories) may be one or more memories and of any type suitable to the local application environment, and may be implemented using any suitable volatile or nonvolatile data storage technology such as a semiconductor-based memory device, a magnetic memory device and system, an optical memory device and system, fixed memory and removable memory. The programs stored in the memory 1220 may include program instructions or computer program code that, when executed by an associated processor, enable the respective apparatus 1200 to perform its intended tasks. Of course, the memory 1220 may form a data buffer for data transmitted to and from the same. Exemplary embodiments of the system, subsystems, and modules as described herein may be implemented, at least in part, by computer software executable by the processor 1210, or by hardware, or by combinations thereof.

[0243] The communication interface 1230 modulates information for transmission by the respective apparatus 1200 to another apparatus. The respective communication interface 1230 is also configured to receive information from another processor for further processing. The communication interface 1230 can support duplex operation for the respective other processor 1200.

[0244] The Vigorous AI approach introduced herein solves several important problems:

[0245] VAI can be controlled to ensure ethical behavior;

[0246] VAI can be tested to prevent unethical behavior;

[0247] VAI provides explainable, transparent system function;

[0248] VAI can greatly reduce the curse of dimensionality;

[0249] VAI can perform at higher levels of confidence with less training and input data;

[0250] VAI can incorporate multiple abstractions of reality and tolerate ambiguity; and,

[0251] VAI can incorporate multiple abstractions of reality and accommodate contradictions.

[0252] A partitioned approach has been introduced for Artificial Intelligence to provide vigor such as for predicting a characteristic of an object. The systems and methods described herein use principles from higher order human intelligence. These include functional partitioning to constrain AI functions to limited or segmented roles, allowing the functions to operate with a degree of independence (including allowing for disagreement) and further includes functional combinations which include integration, inhibition, and excitation. The VAI approach can be used to create a system and method which: more reliably converges compared to existing art; can resist a wide range of subtle errors including human foibles; is easier to explain compared to existing art; allows representations of highly non-linear and non-continuous functions; copes with dimensionality, requiring less data than existing art, and is less likely to require dimensionality reduction; accommodates uncertainty and even contradictions; and, naturally resists data-driven contamination, both accidental and malicious.

[0253] As described hereinabove, the exemplary embodiments provide both a method and corresponding apparatus consisting of various modules providing functionality for performing the steps of the method. The modules may be implemented as hardware (embodied in one or more chips including an integrated circuit such as an application specific integrated circuit), or may be implemented as software or firmware for execution by a processor. In particular, in the case of firmware or software, the exemplary embodiments can be provided as a computer program product including a computer readable storage medium embodying computer program code (i.e., software or firmware) thereon for execution by the computer processor. The computer readable storage medium may be non-transitory (e.g., magnetic disks; optical disks; read only memory; flash memory devices; phase-change memory) or transitory (e.g., electrical, optical, acoustical or other forms of propagated signals-such as carrier waves, infrared signals, digital signals, etc.). The coupling of a processor and other components is typically through one or more busses or bridges (also termed bus controllers). The storage device and signals carrying digital traffic respectively represent one or more non-transitory or transitory computer readable storage medium. Thus, the storage device of a given electronic device typically stores code and/or data for execution on the set of one or more processors of that electronic device such as a controller.

[0254] Although the embodiments and their advantages have been described in detail, it should be understood that various changes, substitutions, and alterations can be made herein without departing from the spirit and scope thereof as defined by the appended claims. For example, many of the features and functions discussed above can be implemented in software, hardware, or firmware, or a combination thereof. Also, many of the features, functions, and steps of operating the same may be reordered, omitted, added, etc., and still fall within the broad scope of the various embodiments.

[0255] Moreover, the scope of the various embodiments is not intended to be limited to the particular embodiments of the process, machine, manufacture, composition of matter, means, methods and steps described in the specification. As one of ordinary skill in the art will readily appreciate from the disclosure, processes, machines, manufacture, compositions of matter, means, methods, or steps, presently existing or later to be developed, that perform substantially the same function or achieve substantially the same result as the corresponding embodiments described herein may be utilized as well. Accordingly, the appended claims are intended to include within their scope such processes, machines, manufacture, compositions of matter, means, methods, or steps.


CPC classification (all in section PHYSICS): G06F17/142; G06N20/00; G06N3/084; G06F7/023; G06N3/08; G06F17/18; G06F2207/4824; G06F11/3452; G06N3/045; G06F7/22; G06N3/048; G06N3/042; G06F30/27; G06N7/01; G06F17/16; G06N5/01; G06F2111/10; G06N3/105; G06F17/11

International classification (section PHYSICS): G06N3/04; G06N3/08
