SYSTEM AND METHOD FOR DATA PROCESSING AND COMPUTATION

20230237340 · 2023-07-27

    Abstract

    A data processing device and a computer-implemented method are configured to execute in parallel a data hub process (6) comprising at least a segmentation sub-process (61), which segments input data into data segments, and at least one keying sub-process (62), which provides keys to the data segments creating keyed data segments, wherein the data hub process (6) stores the keyed data segments in a shared memory device (4) as shared keyed data segments, and a plurality of processes in the form of computation modules (7), wherein each computation module (7) is configured to access the at least one shared memory device (4) to look for module-specific data segments, which are shared keyed data segments that are keyed with at least one key which is specific for at least one of the computation modules (7), to execute a machine learning method on the module-specific data segments, said machine learning method comprising data interpretation and classification methods using at least one pre-trained neuronal network (71), and to output the result of the executed machine learning method to the shared memory device (4) or another computation module.

    Claims

    1. A data processing device, comprising: at least one first interface for receiving input data; at least one second interface for outputting output data; at least one shared memory device into which data can be written and from which data can be read; and at least one computing device to which the at least one first interface, the at least one second interface and the at least one shared memory device are connected, and which is configured to: receive input data from the at least one first interface; send output data to the at least one second interface; and read data from and write data into the at least one shared memory device; wherein the at least one computing device is configured to execute in parallel a plurality of processes, said plurality of processes comprising at least: at least one data hub process receiving input data from the at least one first interface and/or the at least one shared memory device and comprising at least one keying sub-process which provides keys to data segments of the input data, creating keyed data segments, wherein the at least one data hub process stores the keyed data segments in the at least one shared memory device as shared keyed data segments; and a plurality of processes in the form of computation modules, wherein each computation module is configured to: access the at least one shared memory device to look for module-specific data segments, which are shared keyed data segments that are keyed with at least one key which is specific for at least one of the computation modules; execute a machine learning method on the module-specific data segments, said machine learning method comprising data interpretation and classification methods using at least one artificial neuronal network; and output the result of the executed machine learning method to at least one of the at least one shared memory device and at least one other computation module.

    2. The data processing device of claim 1, wherein at least part of the plurality of computation modules is formed by computation modules having a hierarchical vertical structure with layers and/or at least part of the plurality of computation modules is formed into a horizontal structure by way of computational groups.

    3. The data processing device of claim 1, wherein at least one routing process is provided which directs output provided by at least one of the computation modules to at least one other computation module and/or the shared memory device.

    4. The data processing device of claim 1, wherein the at least one data hub process comprises at least one segmentation sub-process which segments input data into data segments and keeps information on which shared keyed data segments were segmented from the same input data.

    5. The data processing device of claim 4, wherein the at least one data hub process stores the keyed data segments in the at least one shared memory device as shared keyed data segments and keeps information on which shared keyed data segments were segmented from the same input data by using a machine learning technique, preferably neuronal networks.

    6. The data processing device of claim 1, wherein the data processing device is configured to repeatedly check the weights of synapses of neuronal networks of at least part of, preferably all of, the plurality of computation modules to make sure they do not diverge.

    7. The data processing device of claim 1, wherein at least part of the plurality of computation modules is configured to represent categorical constructions, preferably chosen from a group comprising at least: object, morphism, functor, commutative diagrams, non-commuting morphisms or functors, natural transformation, pullback, pushforward, projective limit, inductive limit, sub-object classifier.

    8. The data processing device of claim 7, wherein the data processing device is configured to do unsupervised learning by using commutative diagrams to determine unknown objects and/or morphisms.

    9. The data processing device of claim 7, wherein the data processing device is configured to create a sense of orientation in space and/or time by using non-commuting morphisms or functors.

    10. The data processing device of claim 7, wherein a random signal generator is configured to input random signals to at least some of the artificial neurons of at least one of the neuronal networks of at least some of the computation modules and wherein it is preferably provided that the random signals are used to create new concepts, in particular preferably by using projective limits.

    11. The data processing device of claim 7, wherein random signals of the random signal generator are inputted to at least some of the artificial neurons of at least one of the neuronal networks of a group of computation modules representing a projective limit to generate random data sets which are used to test hypotheses and to approximately simulate the universal quantifier of natural logic.

    12. The data processing device of claim 7, wherein the data processing device is configured to attribute the same natural language description to parts of different images showing the same object.

    13. The data processing device of claim 7, wherein the data processing device is configured: to do supervised and unsupervised learning; and to use new concepts created by using random signals in supervised and unsupervised learning.

    14. A computer-implemented method for processing data, comprising: running at least one computing device which receives input data, outputs output data and writes data into and reads data out from at least one shared memory device, wherein the at least one computing device executes in parallel a plurality of processes, said plurality of processes comprising: at least one data hub process receiving input data and comprising at least one keying sub-process which provides keys to data segments of the input data, creating keyed data segments, wherein the at least one data hub process stores the keyed data segments in the at least one shared memory device as shared keyed data segments; and a plurality of processes in the form of computation modules, wherein each computation module: accesses the at least one shared memory device to look for module-specific data segments, which are shared keyed data segments that are keyed with at least one key which is specific for at least one of the computation modules; executes a machine learning method on the module-specific data segments if a module-specific data segment is present, said machine learning method comprising data interpretation and classification methods using at least one artificial neuronal network, and runs idle if no module-specific data segment is present; and outputs the result of the executed machine learning method to at least one of the at least one shared memory device and at least one other computation module.

    15. The method of claim 14, wherein at least part of the plurality of computation modules is formed by computation modules having a hierarchical vertical structure with layers and/or at least part of the plurality of computation modules is formed into a horizontal structure by way of computational groups.

    16. The method of claim 14, wherein at least one routing process is provided which directs output provided by at least one of the computation modules to at least one other computation module and/or the shared memory device.

    17. The method of claim 14, wherein the at least one data hub process comprises at least one segmentation sub-process which segments input data into data segments and keeps information on which shared keyed data segments were segmented from the same input data.

    18. The method of claim 17, wherein the at least one data hub process stores the keyed data segments in the at least one shared memory device as shared keyed data segments and keeps information on which shared keyed data segments were segmented from the same input data by using a machine learning technique, preferably neuronal networks.

    19. The method of claim 14, wherein the weights of synapses of neuronal networks of at least part of, preferably all of, the plurality of computation modules are repeatedly checked to make sure they do not diverge.

    20. The method of claim 14, wherein at least part of the plurality of computation modules represent categorical constructions, preferably chosen from a group comprising at least: object, morphism, functor, commutative diagrams, non-commuting morphisms or functors, natural transformation, pullback, pushforward, projective limit, inductive limit, sub-object classifier.

    21. The method of claim 20, wherein unsupervised learning is done by using commutative diagrams to determine unknown objects and/or morphisms.

    22. The method of claim 20, wherein a sense of orientation in space and/or time is created by using non-commuting morphisms or functors.

    23. The method of claim 20, wherein a random signal generator inputs random signals to at least some of the artificial neurons of at least one of the neuronal networks of at least some of the computation modules and wherein it is preferably provided that the random signals are used to create new concepts, in particular preferably by using projective limits.

    24. The method of claim 20, wherein random signals of the random signal generator are inputted to at least some of the artificial neurons of at least one of the neuronal networks of a group of computation modules representing a projective limit to generate random data sets which are used to test hypotheses and to approximately simulate the universal quantifier of natural logic.

    25. The method of claim 20, wherein the same natural language description is attributed to parts of different images showing the same object.

    26. The method of claim 20, wherein: supervised and unsupervised learning is done; and new concepts created by using random signals are used in supervised and unsupervised learning.

    27. A computer program which, when the program is executed by a data processing device, causes the data processing device to be configured according to claim 1.

    Description

    BRIEF DESCRIPTION OF DRAWINGS

    [0168] The Figures show schematic views of:

    [0169] FIG. 1: a data processing device according to an embodiment of the invention

    [0170] FIG. 2: the internal structure of the computing device and interactions between its components and other components of the data processing device

    [0171] FIG. 3: the internal structure of computation modules and interactions between their components and other components of the data processing device

    [0172] FIG. 4: the internal structure of a data hub process.

    [0173] FIG. 5: steps according to an embodiment of the invention

    [0174] FIG. 6: computation modules representing categorical constructions

    [0175] FIG. 7: an example involving structure recognition

    [0176] FIG. 8: a detail regarding the example of FIG. 7

    [0177] FIG. 9: a detail regarding the example of FIG. 7

    [0178] FIG. 10: a detail regarding the example of FIG. 7

    [0179] FIG. 11: an example involving data processing

    [0180] FIG. 12: the example of FIG. 11 using categorical constructions

    [0181] FIG. 13: an example showing a single artificial neuron having a receptor for a random signal

    [0182] FIG. 14: an example of a neuronal network having a plurality of neurons as shown in FIG. 13

    [0183] FIG. 15: a correspondence between computational modules and a categorical construct

    [0184] FIG. 16: different phases in the operation of an inventive data processing device

    [0185] FIG. 17: a possible vertical hierarchical organization of a computation module

    [0186] FIG. 18: an example of using the categorical construct “pullback” to define a concept for the data processing device to act upon

    [0187] FIG. 19A: an example involving unsupervised learning by using categorical constructs

    [0188] FIG. 19B: an example involving unsupervised learning by using categorical constructs

    [0189] FIG. 19C: an example involving unsupervised learning by using categorical constructs

    [0190] FIG. 20A: an example involving analysis of a combination of data types

    [0191] FIG. 20B: an example involving analysis of a combination of data types

    [0192] FIG. 21: another example involving analysis of natural language

    [0193] FIG. 22A: an example showing an approximate definition of the universal quantifier in natural logic

    [0194] FIG. 22B: an example showing an approximate definition of the existential quantifier in natural logic

    [0195] FIG. 23: an example how the data processing device can construct a sense of orientation in time

    [0196] FIG. 24: an example how the data processing device can construct a sense of orientation in space

    [0197] FIG. 25A: an example how a sub-object classifier can be constructed

    [0198] FIG. 25B: an example how a sub-object classifier can be used

    [0199] It should be noted that the number of components shown in the Figures is to be understood as exemplary and not limiting. In particular with respect to the computation modules 7 it is to be assumed that in reality there will be many more instantiations than shown in the Figures. Dashed lines show at least some of the interactions between components of the data processing device 1 but, possibly, not all of the interactions. It should also be noted that graphical representations of entities such as computation modules 7, or images of objects shown in conjunction with such entities (e.g., geometrical bodies), are drawn for better understanding of the invention but, with respect to the data processing device 1, are entities encoded in computer code and instantiated during runtime (technical point of view) or categorical representations (information point of view).

    [0200] There is a difference between a physical point of view of the data processing device 1 and an information point of view. With respect to the former point of view, the plurality of computation modules 7 can be viewed as a matrix (or a higher-dimensional tensor) in which each individual computation module 7 is addressed by an index, e.g., C_k,l. With respect to the latter point of view, categorical constructs are present which are represented by one or more computation modules 7. By way of example, a category comprising 1000 objects and/or morphisms might be represented by a matrix of, e.g., 50×4 computation modules 7. In other words, a 1:1 correspondence between a computation module 7 and a categorical construct does not need to exist and, in most embodiments, will not exist.

    [0201] FIG. 1 shows an embodiment of a data processing device 1 comprising:

    [0202] at least one first interface 2 for receiving input data ID

    [0203] at least one second interface 3 for outputting output data OD

    [0204] at least one shared memory device 4 into which data can be written and from which data can be read

    [0205] at least one computing device 5 to which the at least one first interface 2 and the at least one second interface 3 and the at least one shared memory device 4 are connected and which is configured to:

    [0206] receive input data ID from the at least one first interface 2

    [0207] send output data OD to the at least one second interface 3

    [0208] read data from and write data into the at least one shared memory device 4

    [0209] FIG. 2 shows a plurality of computation modules 7 which in some embodiments are organized into logical computational groups 16 (which could be organized into logical meta-groups, but this is not shown) and which interact with at least one data hub process 6 via a shared memory device 4. Input data ID is inputted via the at least one first interface 2 into the shared memory device 4 and/or the at least one data hub process 6. Output data OD is outputted from the shared memory device 4 and/or the at least one data hub process 6 via the at least one second interface 3.
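
    The interaction just described can be sketched in a few lines of Python. This is a minimal illustrative sketch only: the class and method names (`SharedMemoryDevice`, `DataHubProcess`, `write`, `take_by_key`, `key_for`) are invented for illustration and do not appear in the patent, and the keying rule is a trivial stand-in for a trained keying network.

```python
# Illustrative sketch: a data hub process keys incoming data segments and
# writes them into a shared store; computation modules later fetch only the
# segments carrying their module-specific keys. All names are hypothetical.
class SharedMemoryDevice:
    def __init__(self):
        self._store = []  # list of (key, segment) pairs

    def write(self, key, segment):
        self._store.append((key, segment))

    def take_by_key(self, wanted_keys):
        """Remove and return all shared keyed data segments for the given keys."""
        hits = [(k, s) for (k, s) in self._store if k in wanted_keys]
        self._store = [(k, s) for (k, s) in self._store if k not in wanted_keys]
        return hits


class DataHubProcess:
    def __init__(self, memory):
        self.memory = memory

    def ingest(self, segments):
        # keying sub-process: assign a key to each data segment and store
        # the keyed data segment in the shared memory device
        for segment in segments:
            self.memory.write(self.key_for(segment), segment)

    def key_for(self, segment):
        # trivial stand-in for a trained keying network (e.g., an ART network)
        return segment["kind"]
```

    A computation module trained for a given key would then call `take_by_key` with its module-specific keys and process only the segments it receives.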

    [0210] FIG. 3 shows the internal structure of computation modules 7 for an embodiment in which the computation modules 7 are provided with, e.g., six different layers I, II, III, IV, V, VI (the number of layers could be different for different computation modules 7). Steps of analysing data using such a structure are also shown in FIG. 17. It can also be seen that a routing process 28 is present (in this embodiment separate from the data hub process 6 although in some embodiments it can form part of it) which knows which computation module 7 has to be connected with which other component of the data processing device 1.

    [0211] In some embodiments layer I might be configured to process module-specific keyed data segments KS_i obtained from the shared memory 4 or the data hub process 6, such as a target vector. This layer can prepare data to be better suited for processing by the at least one neuronal network 71, e.g., by topological down transformation, as is known in the art.

    [0212] In some embodiments layer II and/or III might be configured to process data obtained from layer I and, possibly, from other computation modules 7, e.g., via neuronal networks 71 (by way of example ANNs are shown). These are the layers where machine learning takes place to cognitively process data during data analysis. In some embodiments, these layers can also receive information from other computation modules 7, e.g., from layers V or VI of these other computation modules 7.

    [0213] In some embodiments layer IV might be configured to comprise at least one neuronal network 71 which, however, is not used for cognitive data processing but to transform data from the data hub process 6 or the shared memory 4 (such as an input vector) for layers II and III, e.g., by topological down transformation.

    [0214] In some embodiments layer V and/or VI might be configured to comprise neuronal networks 71 which can be used to learn whether information represented by data is better suited to be processed in a different computation module 7 and can send this data accordingly to the data hub process 6 (preferably via the routing process 28) and/or the shared memory device 4 and/or at least one other computation module 7 where this data can be inputted, e.g., in layers II or III.

    [0215] FIG. 4 shows the internal structure of one of possibly several data hub processes 6 for an embodiment in which:

    [0216] input data ID is segmented into data segments S_1, …, S_7 by one of possibly several segmentation sub-processes 61

    [0217] keys K_1, …, K_7 are determined by one of possibly several keying sub-processes 62 (in some embodiments at least one ART network might be used for that purpose)

    [0218] the keys K_1, …, K_7 are assigned to the data segments S_1, …, S_7 to create keyed data segments KS_1, …, KS_7 by one of possibly several keying sub-processes 62

    [0219] the keyed data segments KS_1, …, KS_7 are written into the shared memory device 4

    [0220] an optional at least one routing process 28, here as a sub-process, directs output provided by at least one of the computation modules 7 to at least one other computation module 7, the at least one routing process 28 accessing the shared memory device 4

    [0221] FIG. 5 shows possible steps carried out by at least one data hub process 6 and at least one computation module 7:

    [0222] input data ID is captured via the at least one first interface 2

    [0223] keys K_i are determined by one of possibly several keying sub-processes 62

    [0224] input data ID is segmented into data segments S_i by one of possibly several segmentation sub-processes 61

    [0225] keyed data segments KS_i are created by one of possibly several keying sub-processes 62

    [0226] the keyed data segments KS_i are provided to the shared memory device 4

    [0227] the computation modules 7 repeatedly check the shared memory device 4 for module-specific keyed data segments KS_i

    [0228] the computation modules 7 load their module-specific keyed data segments KS_i if any are present, otherwise they stay idle

    [0229] the computation modules 7 start data analysis on the module-specific keyed data segments KS_i

    [0230] the computation modules 7 provide their output to the shared memory device 4 and/or at least one data hub process 6 and/or at least one other computation module 7
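
    The module-side steps of FIG. 5 (check, load or idle, analyse, output) can be sketched as a polling loop. This is an illustrative sketch under stated assumptions: the shared memory device is modeled as a plain dictionary, the `analyse` callable stands in for the machine learning method, and all names are invented for illustration.

```python
import time


def run_computation_module(shared, module_keys, analyse, cycles=10):
    """Illustrative polling loop of one computation module (FIG. 5 steps).

    Repeatedly checks the shared store for module-specific keyed segments,
    stays idle in cycles where none are present, otherwise analyses the
    segments and writes the results back.
    """
    for _ in range(cycles):
        mine = [item for item in shared["segments"] if item["key"] in module_keys]
        if not mine:
            time.sleep(0)  # stay idle this cycle
            continue
        for item in mine:
            shared["segments"].remove(item)
            result = analyse(item["data"])  # stand-in for the ML method
            shared["results"].append({"key": item["key"], "result": result})
```

    Several such loops running in parallel processes, each with different `module_keys`, reproduce the idle-until-keyed-data behavior of the plurality of computation modules 7.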

    [0231] FIG. 6 shows how categorical constructs can be represented by the computation modules 7 and their interactions in some embodiments. It should be noted that the number of computation modules 7 per computational group 16 can be different between computational groups 16 and that the representation of categorical constructions by computation modules 7 in no way relies on the presence of computational groups 16 or the internal structure of computation modules 7.

    [0232] In some embodiments different computational groups 16 may represent different categories 𝒜, ℬ, 𝒞, 𝒟, wherein each computation module 7 represents an object A_i, B_i, C_i, D_i or a morphism a_i, b_i, c_i, d_i, and other computational groups 16 may represent functors ℱ_1, ℱ_2 between different categories, e.g., ℱ_1: 𝒜 → 𝒞 and ℱ_2: ℬ → 𝒟, such that ℱ_1(A_i) = C_i, ℱ_2(B_i) = D_i for the objects of the categories and ℱ_1(a_i) = c_i, ℱ_2(b_i) = d_i for the morphisms of the categories.
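
    A functor of this kind can be sketched as two lookup tables, one acting on objects and one on morphisms. The dict encoding below is an illustrative assumption (the patent realizes functors as computational groups of trained modules, not as dictionaries); the object and morphism names follow the Figures.

```python
# Illustrative dict-based sketch of a functor F: A -> C, mapping objects
# to objects and morphisms to morphisms while preserving sources/targets.
object_map = {"A1": "C1", "A2": "C2", "A3": "C3"}  # F on objects
morphism_map = {"a1": "c1"}                        # F on morphisms

# source and target objects of the morphisms in category A
source = {"a1": "A3"}
target = {"a1": "A2"}


def apply_functor(morphism):
    """Map a morphism with its endpoints: a1: A3 -> A2 becomes c1: C3 -> C2."""
    return (morphism_map[morphism],
            object_map[source[morphism]],
            object_map[target[morphism]])
```

    The point of the construction, as in the embodiments, is that knowledge encoded on the source side (category 𝒜) transfers to the target side (category 𝒞) without re-training.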

    [0233] Different examples of more complex categorical constructs, such as the projective limit lim← A_i or natural transformations, and their possible uses have already been discussed above, and further examples will be discussed with respect to the following Figures.

    [0234] It is an advantage of those embodiments of the present invention comprising categorical constructions that concepts which have been learned by computation modules 7 in a supervised way can be used by the data processing device 1 to learn related concepts in an, at least partially, unsupervised way.

    [0235] FIG. 7 shows an example where a number of computation modules 7 is configured to do structure recognition in order to enable them to recognize geometrical objects in the shape of tetrahedrons, octahedrons or boxes, irrespective of color, rotational state or possible deformations of the geometrical objects. It could be arranged that the data processing device 1 causes a robot 20 to remove some geometrical objects from the conveyor belt 9 but not others by providing output data OD via the second interface 3 in the form of robot commands to the robot 20.

    [0236] Different objects (tetrahedron 17, octahedron 18 and box 19) are placed on a conveyor belt 9 which transports them past an image capturing device 8 (here in the form of an optical camera) which is connected to the first interface 2 to provide a video stream or a series of images as input data ID which can be loaded by the data hub process 6. The input data ID is segmented and keys are created as described above. In the present example it is supposed that the segmentation sub-process 61 has been trained according to the art to recognize the presence of individual objects in the input data ID and to create data segments S_1, S_2, S_3 (without recognizing the type of object) and the keying sub-process 62 has been trained according to the art to create keys K_1, K_2, K_3 for the different objects such that the data hub process 6 can create keyed data segments KS_1, KS_2, KS_3 and provide them to the shared memory device 4.

    [0237] Turning to FIG. 8, a number of computation modules 7 representing a category 𝒜 is shown. The number of computation modules 7 is understood to be symbolic; in reality it will often be larger than the four computation modules 7 shown. A first computation module 7 represents an object A_1 and is trained to repeatedly access the shared memory device 4 looking for keyed data segments KS_1 representing objects. Although the computation modules 7 of this group are specifically trained to analyse tetrahedrons 17, it will load all keyed data segments KS_1, KS_2, KS_3 which are keyed as representing objects. In case during analysis it finds that a loaded keyed data segment KS_2, KS_3 does not represent a tetrahedron 17, it can return this keyed data segment KS_2, KS_3 to the shared memory device 4 with the additional information "not a tetrahedron 17" so that it will not be loaded by a computation module 7 of this group again. Once a keyed data segment KS_1 has been loaded by the computation module 7 representing object A_1, analysis begins. This computation module 7 has been trained to recognize tetrahedrons 17 irrespective of the color of the object (symbolized by shading), orientation of the object or possible deformation. As an output it creates data representing A_1 = "tetrahedron", as symbolized by the box showing a tetrahedron 17 without shading and provided with the additional information "TETRA". This output can either be sent directly to other computation modules 7 of this group or can be stored in the shared memory device 4. Here it is assumed that it is stored in the shared memory device 4 and the computation module 7 representing object A_2 loads this information. The computation module 7 representing object A_2 has been trained to recognize that the tetrahedron 17 is in a rotational state (with respect to a normalized state represented by object A_3) and outputs this information as A_2 = "TETRA, ROT, α, β, γ". However, it should be noted that this computation module 7 does not necessarily encode the rotation group SO(3), since it is not necessary for the computation module 7 to know the exact values of α, β, γ. A further computation module 7 has been trained to receive as inputs A_2 and A_3, to recognize the rotational state of tetrahedron 17 by comparing these two inputs, and to output this information, which can be understood as representing the morphism a_1: A_3 → A_2 as "TETRA, ROT, α, β, γ".

    [0238] Of course other types of transformations than rotations could be represented, such as translations, reflections, etc. It is to be understood that in some embodiments the morphism a_1 might be composed of several morphisms a_1 = a_11 ∘ … ∘ a_1k, wherein each morphism is encoded by one or several computation modules 7, e.g., of three morphisms a_11, a_12, a_13 wherein each morphism encodes a rotation about a single axis or a translation along a single direction.
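
    The composite a_1 = a_11 ∘ a_12 ∘ a_13 can be sketched by representing each elementary morphism as a function and composing them right to left. This is an illustrative sketch only: the simple coordinate maps below stand in for trained computation modules, and all names are chosen for illustration.

```python
from functools import reduce


def compose(*morphisms):
    """Right-to-left composition, matching the notation a11 ∘ a12 ∘ a13."""
    return reduce(lambda f, g: lambda x: f(g(x)), morphisms)


# Illustrative elementary morphisms on 3D points (stand-ins for modules):
a11 = lambda p: (p[0] + 1, p[1], p[2])  # translation along x
a12 = lambda p: (p[0], -p[2], p[1])     # 90-degree rotation about the x axis
a13 = lambda p: (p[0], p[1], p[2] + 2)  # translation along z

a1 = compose(a11, a12, a13)             # a1 = a11 ∘ a12 ∘ a13
```

    Applying `a1` to a point first translates along z (a_13), then rotates about x (a_12), then translates along x (a_11), mirroring how a composite morphism is evaluated from right to left.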

    [0239] The group of computation modules 7 of FIG. 9 is the same as the one shown in FIG. 8. Using the categorical construct of a functor ℱ, the objects and morphisms of category 𝒜 can be mapped to objects and morphisms of category 𝒞 which, in this example, represents octahedrons 18. In this way it is not necessary to train the computation modules 7 of category 𝒞 once training of the computation modules 7 of category 𝒜 is completed, because all necessary concepts are mapped by the functor ℱ from category 𝒜 to category 𝒞, resulting in FIG. 10 (of course, the same can be done by a different functor with respect to a category representing boxes). In this example the functor ℱ has been learned by comparing the rotational states of different geometrical objects, namely tetrahedrons 17 and boxes 19, after these rotational states had been learned.

    [0240] FIG. 10 shows a number of computation modules 7 representing a category 𝒞. The number of computation modules 7 is understood to be symbolic; in reality it will often be larger than the four computation modules 7 shown. A first computation module 7 represents an object C_1 and is trained to repeatedly access the shared memory device 4 looking for keyed data segments KS_1 representing objects. Although the computation modules 7 of this group are specifically trained to analyse octahedrons 18, it will load all keyed data segments KS_1, KS_2, KS_3 which are keyed as representing objects. In case during analysis it finds that a loaded keyed data segment KS_1, KS_3 does not represent an octahedron 18, it can return this keyed data segment KS_1, KS_3 to the shared memory device 4 with the additional information "not an octahedron 18" so that it will not be loaded by a computation module 7 of this group again. Once a keyed data segment KS_2 has been loaded by the computation module 7 representing object C_1, analysis begins. This computation module 7 has been trained to recognize octahedrons 18 irrespective of the color of the object (symbolized by shading), orientation of the object or possible deformation. As an output it creates data representing C_1 = "octahedron", as symbolized by the box showing an octahedron 18 without shading and provided with the additional information "OCTO". This output can either be sent directly to other computation modules 7 of this group or can be stored in the shared memory device 4. Here it is assumed that it is stored in the shared memory device 4 and the computation module 7 representing object C_2 loads this information. The computation module 7 representing object C_2 has been trained to recognize that the octahedron 18 is in a rotational state (with respect to a normalized state represented by object C_3) and outputs this information as C_2 = "OCTO, ROT, α, β, γ". A further computation module 7 has been trained to receive as inputs C_2 and C_3, to recognize the rotational state of octahedron 18 by comparing these two inputs, and to output this information, which can be understood as representing the morphism c_1: C_3 → C_2 as "OCTO, ROT, α, β, γ".

    [0241] FIG. 11 shows how, in some embodiments, the data processing device 1 can analyse complex data by making use of different computation modules 7 which are each trained to recognize specific data. Some data X is inputted to the routing process 28 (or a different structure such as a sufficiently complex arrangement of computation modules 7) which sends this data to different computation modules 7. Each computation module 7 checks whether it knows (at least part of) the data X by checking whether A_i forms part of data X (here represented by the mathematical symbol for "being a subset of"). If the answer is "yes", it reports this answer back to sub-process 63. If the answer is "no", it can report this answer back to sub-process 63 or, in a preferred embodiment at least with respect to some computation modules 7, sends the data (segment) to at least one other computation module 7 (which can, e.g., form part of a category that might be better suited to recognize this data). By way of example, data X might represent some geometrical object such as an octahedron or (part of) a sentence such as "The cat gives birth to a baby".

    [0242] In the first example, the computation modules 7 of a first category custom-character might represent objects A.sub.i that represent geometrical objects in the form of differently deformed or rotated tetrahedrons, while the computation modules 7 of a second category custom-character might represent objects C.sub.i in the form of differently deformed or rotated octahedrons. The computation modules 7 of the first category custom-character will not be able to recognize data X in the form of an octahedron (since they know tetrahedrons) and will either give this information to the routing process 28 or, as shown in this Figure, can send this data X to computation modules 7 of the second category custom-character which will be able to recognize the data X.

    [0243] In the second example, the computation modules 7 of a first category custom-character might represent objects A.sub.i that represent nouns (e.g., "cat", "birth") or verbs (e.g., "give") referring to a first topic (e.g., "cats"), while the computation modules 7 of a second category custom-character might represent objects C.sub.i that represent nouns (e.g., "dog", "birth") or verbs (e.g., "give") referring to a second topic (e.g., "dogs"). The computation modules 7 of the first category custom-character will be able to recognize data X in the form of a sentence concerning "cats" and will give this information to the routing process 28 or could send this data X to computation modules 7 of a different category for further processing.

    [0244] In preferred embodiments, the data processing device 1 is enabled to create new concepts itself (cf. FIG. 13) by inputting a random signal RANDOM to at least one layer of the neuronal network(s) 71 of a computation module 7 such that the integrated inputs of the neurons 21, which are used by an activation function σ of the known kind to determine whether a certain neuron 21 will fire or not, are modified. In this way, a neuronal network 71 which is inputted with information regarding a geometrical object will base its computation not on the inputted information alone but on the inputted information as altered by the random signal. In FIG. 11 this is shown by the signal line denoted "RANDOM". In some embodiments, if a hierarchically structured computation module 7 is used, this random signal RANDOM could be provided to the at least one neuronal network 71 present in layer III.
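As a minimal sketch of this mechanism (with an invented step activation and Gaussian noise as stand-ins for σ and the random signal RANDOM), the perturbation of the integrated neuron inputs might look as follows:

```python
import random

# Sketch: the integrated input z of each neuron in a layer is perturbed by a
# random signal before the activation function decides whether it fires.

def sigma(z):
    return 1.0 if z >= 0.0 else 0.0          # step activation (illustrative)

def layer(inputs, weights, rnd=None):
    out = []
    for w_row in weights:
        z = sum(w * x for w, x in zip(w_row, inputs))   # integration
        if rnd is not None:
            z += rnd()                        # RANDOM signal alters the input
        out.append(sigma(z))
    return out

random.seed(0)
plain = layer([1.0, -2.0], [[0.5, 0.5], [1.0, 0.0]])
noisy = layer([1.0, -2.0], [[0.5, 0.5], [1.0, 0.0]],
              rnd=lambda: random.gauss(0, 2))
print(plain)  # → [0.0, 1.0]
```

With the noise term, the firing pattern can deviate from the purely input-driven one, which is what allows new concepts to emerge.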

    [0245] FIG. 12 shows how the projective limit can be used for the process described in FIG. 11, e.g., by the routing process 28 of the data hub process 6 and/or by individual computation modules 7 and/or groups 16 of computation modules 7: data X which is to be interpreted is inputted to a computation module 7 (depending on the complexity of the data it will, in practice, often have to be a group 16 of computation modules 7) which represents the projective limit of the data X, the data X being interpreted as consisting of a sequence of data segments

    [00010] A.sub.1 --a.sub.1--> A.sub.2 --a.sub.2--> . . . --a.sub.n=k-1--> A.sub.n=k.

    The projective limit is the object

    [00011] ←lim A.sub.i

    together with projection morphisms π.sub.i, which means that the sequence A.sub.n=i, . . . , A.sub.n=k is projected onto its ith member A.sub.n=i. The data processing device 1 can remember how the data X was segmented, e.g., by use of the projection morphisms π.sub.i and the morphisms a.sub.i. Although not shown in FIG. 12, input of random signals RANDOM could also be present.
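A toy illustration of keeping the projections π.sub.i alongside the segmented data, with invented helper names, could look like this:

```python
# Sketch: a "projective limit" of segments is modelled as the full sequence
# together with projections onto each member, so the device can remember how
# the data X was segmented.

def make_projective_limit(segments):
    limit = tuple(segments)                   # stands in for ←lim A_i
    projections = {i: (lambda s=limit, k=i: s[k])  # π_i projects onto member i
                   for i in range(len(limit))}
    return limit, projections

limit, pi = make_projective_limit(["A1", "A2", "A3"])
print(pi[0]())  # → 'A1'
```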

    [0246] FIG. 13 shows a single artificial neuron 21 of an artificial neuronal network 71. The artificial neuron 21 (in the following in short: "neuron 21") has at least one synapse 24 (usually a plurality of synapses 24) for obtaining a signal and at least one axon for sending a signal (in some embodiments a single axon can have a plurality of branchings 25). Usually, each neuron 21 obtains a plurality of signals from other neurons 21 or an input interface of the neuronal network 71 via a plurality of synapses 24 and sends a single signal to a plurality of other neurons 21 or an output interface of the neuronal network 71. A neuron body is arranged between the synapse(s) 24 and the axon and comprises at least an integration function 22 for integrating the obtained signals according to the art and an activation function 23 to decide whether a signal is sent by this neuron 21. Any activation function 23 of the art can be used, such as a step function or a sigmoid function. As known in the art, the signals obtained via the synapses 24 can be weighted by weight factors w. These can be provided by a weight storage 26 which might form part of a single computation module 7 or could be configured separately from the computation modules 7 and could provide individual weights w to a plurality (or possibly all) of the neuronal networks 71 of the computation modules 7. These weights w can be obtained as known in the art, e.g., during a training phase by modifying a pre-given set of weights w such that a desired result is given by the neuronal network 71 with a desired accuracy.

    [0247] In some embodiments the neuron body can comprise a receptor 29 for obtaining a random signal RANDOM which is generated outside of the neuronal network 71 (and, preferably, outside of the computation module 7). This random signal RANDOM can be used in connection with the autonomous creation of new concepts by the data processing device 1.
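Putting paragraphs [0246] and [0247] together, a single neuron 21 with weights w, integration function 22, activation function 23 and a receptor 29 for the random signal RANDOM can be sketched as follows (the class and all names are illustrative, not part of the disclosure):

```python
# Sketch of the artificial neuron 21 of FIG. 13: weighted synaptic inputs are
# integrated and passed, together with an optional externally generated
# random signal (receptor 29), to the activation function.

class Neuron:
    def __init__(self, weights, activation):
        self.weights = weights          # provided, e.g., by a weight storage 26
        self.activation = activation    # activation function 23

    def fire(self, signals, random_signal=0.0):
        integrated = sum(w * s for w, s in zip(self.weights, signals))  # 22
        return self.activation(integrated + random_signal)  # receptor 29 input

step = lambda z: 1 if z >= 0 else 0
n = Neuron([0.8, -0.4], step)
print(n.fire([1.0, 1.0]))                        # 0.8 - 0.4 = 0.4 → 1
print(n.fire([1.0, 1.0], random_signal=-1.0))    # 0.4 - 1.0 < 0 → 0
```

The second call shows how the random signal can flip the firing decision of an otherwise identical input.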

    [0248] The neurons 21 of a neuronal network 71 can be arranged in layers L.sub.1, L.sub.2, L.sub.3 (which are not to be confused with the layers I-VI of a computation module 7 if the computation module 7 has a hierarchical architecture).

    [0249] In some embodiments, the layers L.sub.1, L.sub.2, L.sub.3 will not be fully connected.

    [0250] FIG. 14 shows three layers L.sub.1, L.sub.2, L.sub.3 of neurons 21 which form part of a neuronal network 71. Not all of the connections between the neurons 21 are shown. Some of the neurons 21 are provided with a receptor 29 for obtaining a random signal RANDOM.

    [0251] FIG. 15 shows, by way of example, how a plurality of computation modules 7 (the chosen number of four is an example only) C.sub.11, C.sub.12, C.sub.21, C.sub.22 which form part of a tensor (here a 2×2 matrix) is used to represent a single category ε and how, in the information-point-of-view, this category is connected to a base or index category custom-character via a functor ϕ (ε can be viewed as a fibred category), while in the physical-point-of-view the four computation modules 7 are connected via the routing process 28 to the data hub process 6. The routing process 28 and/or the data hub process 6 know where the information provided by the computation modules 7 has to be sent to.

    [0252] FIG. 16 shows that although, approximately speaking, different phases can be thought to be present in the operation of an embodiment of a data processing device 1 according to the invention, at least some of these phases can be thought of as temporally overlapping or as being present in a cyclic way:

    [0253] A first phase is denoted as “Configuration”. In this phase the basic structures of the data processing device 1 are configured such as the presence of the data hub process 6, the presence of the computation modules 7, configuration of categorical structures, configuration of auxiliary processes and the like.

    [0254] Once this first phase is finished the data processing device 1 can start with supervised training. It is not necessary that this training is done as known in the art (by providing training data to the neuronal networks and adjusting weights until a desired result is achieved with a desired accuracy), although this can be done. According to the invention it is also possible (additionally or alternatively) that the data processing device 1 receives input data ID, e.g., by way of a sensor or by accessing an external database, analyses the input data ID using the computation modules 7 and checks back with an external teacher, e.g., a human operator or an external database or the like, whether the results of the analysis are satisfactory and/or useful. If so, supervised learning is successful; otherwise, another learning loop can be done.
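The check-back loop described in this paragraph can be sketched as follows; `analyse` and `teacher` are hypothetical stand-ins for the computation modules 7 and for the human operator or external database:

```python
# Sketch of the supervised-training loop of paragraph [0254]: analyse the
# input data, ask an external teacher whether the result is satisfactory,
# and run another loop if not.

def supervised_learning(analyse, teacher, input_data, max_loops=10):
    for loop in range(1, max_loops + 1):
        result = analyse(input_data)
        if teacher(result):          # check back with the external teacher
            return result, loop
        # in a real device, weights w would be adjusted here before retrying
    return None, max_loops

attempts = iter(["triangle", "box", "octahedron"])   # simulated analyses
result, loops = supervised_learning(lambda _: next(attempts),
                                    lambda r: r == "octahedron",
                                    input_data="sensor frame")
print(result, loops)  # → octahedron 3
```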

    [0255] In addition to this supervised learning, unsupervised learning is started by the data processing device 1 in the above-described way using categorical constructs such as objects, morphisms, commutative diagrams, functors, natural transformations, pullbacks, pushouts, projective limits, . . . .

    [0256] In addition to the phases of supervised and unsupervised learning, once a certain level of knowledge has been achieved by the data processing device 1, the creation of new concepts, i.e., thinking, can be done using random signal RANDOM inputs as described above. Once it has been checked that a new concept makes sense and/or is useful (i.e., is logically correct and/or is useful for data analysis) this new concept can be used in supervised and unsupervised learning processes such that there can be a loop (which can be used during the whole operation of the data processing device 1) between learning (unsupervised and/or supervised) and thinking.

    [0257] FIG. 17 shows an embodiment in which at least some of the computation modules 7 have a vertical hierarchical organization with, e.g., six layers I-VI. Arrows show the flow of information.

    [0258] Layer I is configured to process module-specific keyed data segments obtained from the shared memory 4. This layer can prepare data to be better suited for processing by the at least one neuronal network 71, e.g., by topological down transformation. This data can comprise, e.g., a target vector for the neuronal networks 71 in layers II and III.

    [0259] Layers II and III can comprise at least one neuronal network 71 each, each of which processes data obtained from layer I and, possibly, from other computation modules 7. These are the layers where machine learning can take place to process data during data analysis in a cognitive way using well-known neuronal networks such as general ANNs or more specific ANNs like MfNNs, LSTMs, . . . (here synaptic weights w are modified during training to learn pictures, words, . . . ). In some embodiments, these layers can also receive information from at least one other computation module 7, e.g., from layers V or VI of the at least one other computation module 7. In some embodiments, layer III contains at least one neuronal network 71 which receives random signals RANDOM as described above.

    [0260] Layer IV can comprise at least one neuronal network 71 which, however, is not used for cognitive data processing but to transform data for layers II and III, e.g., by topological down transformation. This data can comprise, e.g., an input vector for the neuronal networks 71 in layers II and III.

    [0261] In layers V and VI neuronal networks 71 can be present which can be used to learn whether information represented by data is better suited to be processed in a different computation module 7 and can be used to send this data accordingly to the data hub process 6 and/or the shared memory 4 and/or routing processes 28 and/or directly to another computation module 7 where this data can be inputted, e.g., in layers II or III.
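The flow of information through the six layers can be illustrated, very roughly, as a pipeline of stand-in functions; the concrete behaviour of each function is invented for this sketch and does not reflect actual layer implementations:

```python
# Sketch of the vertical organization of FIG. 17 as a pipeline:
# layer I prepares the keyed segment, layer IV transforms data for the
# cognitive layers II/III, and layers V/VI decide whether to reroute.

def layer_I(segment):            # prepare module-specific keyed data segment
    return segment.lower()

def layer_IV(data):              # stand-in for topological down transformation
    return sorted(set(data))

def layers_II_III(vector):       # stand-in for cognitive processing by ANNs
    return "known" if "o" in vector else "unknown"

def layers_V_VI(result):         # reroute data better suited elsewhere
    return result if result == "known" else "send to other module"

out = layers_V_VI(layers_II_III(layer_IV(layer_I("OCTO"))))
print(out)  # → known
```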

    [0262] FIG. 18 shows an example of using the categorical construct "pullback" to define a concept for the data processing device 1 controlling a robot 20 shown in FIG. 7 to act upon (categorical object A is the pullback of C→D←B, i.e., C×.sub.DB, which is denoted by the small custom-character placed to the lower right of A):

    [0263] Categorical object X represents “a geometrical object that has a discernable geometric shape in the form of a box is to be grabbed by the robot”.

    [0264] Categorical object A represents “a geometrical object which is to be grabbed by the robot”.

    [0265] Categorical object B represents “a discernable shape in the form of a box”.

    [0266] Categorical object C represents “a geometrical object with a discernible shape”.

    [0267] Categorical object D represents “a discernible shape”.

    [0268] Functor ϕ.sub.1 represents “has as discernible shape”.

    [0269] Functor ϕ.sub.2 represents “is”.

    [0270] Functor ϕ.sub.3 represents “has”.

    [0271] Functor ϕ.sub.4 represents “is”.

    [0272] Functor Ψ.sub.1 represents “is an object which is”.

    [0273] Functor Ψ.sub.2 represents “is”.

    [0274] Functor Ψ.sub.3 represents “has as the geometrical object's shape”.

    [0275] The diagram formed by categorical objects A, B, C, D is commutative, which is denoted by the arrow custom-character. In category theory it can be proven that functor Ψ.sub.1 is unique. In other words, there is an unambiguous assignment of the command represented by X to the pullback represented by A which, in turn, is connected to categorical objects C, B, D. During processing of the data provided by the video capturing device 8 it can be checked by the different computation modules 7, or computational groups 16 of computation modules 7, which represent categorical objects C, B, D, whether any of the data can be interpreted as representing one or more of these categorical objects. In case all of these categorical objects are present in the processed data (i.e., all of the following can be ascertained by processing the data: "a shape can be discerned", "the shape can be discerned with respect to a geometrical object", "the shape is in the form of a box") it can be concluded that the command represented by X is to be executed with the effect that out of all possible geometrical objects which might be arranged on the conveyor belt 9 only those are to be grabbed by the robot 20 for which a shape is discernible and whose shape is found to be a box 19.
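The decision rule of this paragraph, that the command X is executed only when all categorical objects of the pullback are found in the processed data, can be sketched as a simple predicate check; the detection labels are illustrative stand-ins for computation-module outputs:

```python
# Sketch of paragraph [0275]: the robot grabs an object only if a shape is
# discernible (object D), it belongs to a geometrical object (object C), and
# the shape is a box (object B) - i.e., the pullback A = C ×_D B is realized.

def should_grab(detections):
    has_discernible_shape = "shape discerned" in detections          # D
    shape_of_object = "shape of geometrical object" in detections    # C
    shape_is_box = "shape is a box" in detections                    # B
    return has_discernible_shape and shape_of_object and shape_is_box

print(should_grab({"shape discerned", "shape of geometrical object",
                   "shape is a box"}))            # → True
print(should_grab({"shape discerned"}))           # → False
```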

    [0276] FIGS. 19A,B,C show examples involving unsupervised learning to learn new concepts by using categorical constructs.

    [0277] By way of example, FIG. 19A shows a commutative diagram (as denoted by custom-character). If A.sub.1 represents the person "Anne", A.sub.2 represents "a school", A.sub.3 represents "a student", a.sub.1 represents "attends" and a.sub.2 represents "is attended by", then the data processing device can learn the concept that "Anne" is "a student" because a.sub.2∘a.sub.1=a.sub.3 gives: "A school is attended by students"∘"Anne attends a school"="Anne is a student.", i.e., a.sub.3="is".
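The composition a.sub.2∘a.sub.1=a.sub.3 from FIG. 19A can be illustrated with dictionaries standing in for the morphisms; this is a deliberately simplified model, not the device's actual representation:

```python
# Sketch of FIG. 19A: composing "attends" with "is attended by" yields "is",
# letting the device learn "Anne is a student".

a1 = {"Anne": "a school"}          # a_1: A_1 → A_2, "attends"
a2 = {"a school": "a student"}     # a_2: A_2 → A_3, "is attended by"

def compose(g, f):
    # (g ∘ f)(x) = g(f(x)) wherever both morphisms are defined
    return {x: g[f[x]] for x in f if f[x] in g}

a3 = compose(a2, a1)               # a_3: A_1 → A_3, "is"
print(a3)  # → {'Anne': 'a student'}
```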

    [0278] The example of FIG. 19B shows an analysis of natural language using the categorical construction of a pullback (as denoted by custom-character) where the knowledge of "cats eat meat" and "dogs eat meat", represented by the commutative diagram shown (categorical objects A.sub.2, A.sub.3 represent "cats" and "dogs", respectively, and the morphisms a.sub.2, a.sub.4 represent "eat"), has as pullback C "dogs and cats eat" (morphisms a.sub.1 and a.sub.3 are projections) which can then be abstracted, e.g., to mammals. Upon checking by the data processing device 1 involving a human operator or an external database, this generalization would be found to be incorrect because not all mammals eat meat, and the connections between them would have to be retrained.

    [0279] It is known in category theory that pullbacks can be added by joining the commutative diagrams representing them.

    [0280] Suppose that, in the example of FIG. 19C, custom-character represents the category of “box-shaped objects” and category custom-character represents the category of “tetrahedrons” such that A.sub.1, A.sub.2 are two boxes which are connected to each other by a rotation represented by morphism f and B.sub.1, B.sub.2 are two tetrahedrons. In other words, the data processing device 1 has learned that a box which has been rotated is still the same box. Using functor custom-character(custom-character: A.sub.1.fwdarw.B.sub.1, A.sub.2.fwdarw.B.sub.2, f.fwdarw.g) this concept can be mapped to the category custom-character of “tetrahedrons” meaning it is not necessary for the data processing device 1 to re-learn the concept of “rotation of a geometric object” as represented by g in the category of “tetrahedrons”.
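The transport of the rotation concept along the functor in FIG. 19C can be sketched as follows, with invented mapping tables for the object part and the morphism part of the functor:

```python
# Sketch of FIG. 19C: a functor F maps the learned concept "rotation f of a
# box" in the category of box-shaped objects to the corresponding rotation g
# in the category of tetrahedrons, so the concept need not be re-learned.

F_objects = {"A1": "B1", "A2": "B2"}   # object part of functor F
F_morphisms = {"f": "g"}               # morphism part of functor F

def transport(morphism, source, target):
    # map a morphism together with its endpoints along the functor
    return (F_morphisms[morphism], F_objects[source], F_objects[target])

print(transport("f", "A1", "A2"))  # → ('g', 'B1', 'B2')
```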

    [0281] FIG. 20A shows an example involving analysis of a combination of data types in the form of objects in images which are provided with image-specific descriptions given in natural language. Another example would be images provided with audio. Another example, involving the same data type, would be the combination of images. The depiction of FIG. 20A relates to an information-point-of-view. Two fibred categories, ε with base category custom-character and custom-character with base category custom-character, are used to represent an image (e.g., a cat being shown in a house) and a description given in natural language and relating to the image of the cat in isolation (this can be done, e.g., by teaching a computation module 7 or a plurality of computation modules 7 to recognize cats), i.e., irrespective of the fact that in the image the cat is located in a house (e.g., a cat is a mammal, it eats fish and meat, . . . ), respectively. Both the image and the description have, for themselves, unique identifiers, e.g., in the form of keys or addresses or, as shown, by base categories. The data processing device 1 can be trained to learn that a certain description is specific to a certain image such that, in this sense, they belong together. This fact can be learned by functors between the index categories.

    [0282] FIG. 20B shows an example where it is important that one and the same description has to be specific to different images. For human beings it is intuitively clear that, e.g., a cat which sits on a tree, jumps down from the tree and enters a house is always the same living being. For the neuronal networks 71 that are used to cognitively analyse information in the computation modules 7 this is per se not clear and must be taught in a supervised way. Once the data processing device 1 has learned that a cat that has moved is still the same object, it can learn in an unsupervised way (without the need for a random signal, using only commutativity) that the same description is to be associated to two different images, wherein one of the images shows the cat in the house and the other image shows the same cat in a tree. This is shown by having both categories "cat in house" and "cat in tree" point to the same base category custom-character. The dashed arrow shows the unsupervisedly learned connection between "cat in tree" and the category "cat description". Therefore, the data processing device 1, in some embodiments, is configured to attribute the same natural language description to parts of different images showing the same object.

    [0283] FIG. 21 shows in a combined information-point-of-view and physical-point-of-view a projective limit

    [00012] ←lim C.sub.i

    which is represented by a plurality of computation modules 7 C.sub.1, C.sub.2, . . . , C.sub.n which can be used for generation of concepts in language. A random signal generator 27 is coupled to receptors 29 of neuronal networks 71 (which have already been trained with respect to cats and dogs) of the computation modules 7 to create new language concepts such as "Human eats dog.", "Dog eats cat.", "Cat eats cat." and so on. A group of computation modules 7 which have been trained to recognize information comprising "dogs" and "cats" can load these sentences and analyse them, e.g., by breaking the sentence "Dog eats cat" down into its components "dog", "eats" and "cat". As shown in the information-point-of-view this sentence can be analysed by using a trained functor custom-character representing the verb "eats" between a category D.sub.1 representing dogs and a category D.sub.2 representing cats. In the physical-point-of-view these correspond to a plurality of fibred categories A.sub.1, A.sub.2, A.sub.3 with base categories I.sub.1, I.sub.2, I.sub.3. In order to check internally whether this sentence is already known, a different plurality of computation modules 7 E.sub.1, E.sub.2, . . . , E.sub.m which represent an inductive limit

    [00013] →lim E.sub.i

    can be used to analyse the sentence as a whole. If the sentence is not found internally, it can be analysed by another group of computation modules 7 representing another projective limit (not shown) which realizes that it does not know whether this concept makes sense. Therefore, the data processing device 1 will ask a human operator or an external database whether this concept makes sense. If the external feedback is "not true", this concept will be deleted.

    [0284] FIG. 22A shows a different use of a projective limit

    [00014] ←lim C.sub.i

    which is inputted with random signals RANDOM, namely how to approximately represent the universal quantifier ∀ (the projective limit

    [00015] ←lim C.sub.i

    is represented by computation modules 7 C.sub.1, C.sub.2, . . . , C.sub.n).

    [0285] Random signals RANDOM are inputted by a random signal generator 27 and are used to generate new concepts in the form of test data. Of course, it is impossible to exactly represent a quantifier like ∀, which must hold true for an infinite number of elements, in a finite system. Therefore, infinity is simulated by inputting the random signals RANDOM to stochastically create ever new test data (e.g., sets of test data like n-tuples (x.sub.1, x.sub.2, . . . , x.sub.n), (x′.sub.1, x′.sub.2, . . . , x′.sub.n), (x″.sub.1, x″.sub.2, . . . , x″.sub.n)) which, approximately, can be thought of as having the same effect as if there were an infinite number of test data from which elements can be chosen. In this sense, the randomly (stochastically) generated test data can be thought of as simulating the universal quantifier ∀ in the following sense: Suppose the computation modules 7 representing the projective limit

    [00016] ←lim C.sub.i

    have learned some facts, e.g., regarding prime numbers, which they use to formulate a hypothesis (e.g., for all natural numbers n there is a larger natural number m which is prime). Then, using a multitude of test data which is stochastically generated, they can check whether the hypothesis is true with respect to a given predicate, e.g., whether it is true that for each natural number of the test data there is a larger natural number which is prime. Of course, this is not a mathematical proof in the traditional sense. Rather the reasoning is that if a hypothesis is checked for a very large number of test data and holds true for each of the test data, it might as well be considered true for all possible data. Another example would be "All humans are mortal". Test data would include information regarding a plurality of humans and the data processing device 1 would check for each of the humans whether the human is dead.
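The stochastic approximation of ∀ can be illustrated with the prime-number hypothesis from this paragraph; the search window and the sample size are arbitrary choices for the sketch, and, as the text notes, this is not a mathematical proof:

```python
import random

# Sketch of paragraph [0285]: the universal quantifier ∀ is approximated by
# checking a hypothesis against a large amount of stochastically generated
# test data - here, "for every natural number n there is a larger prime m".

def is_prime(m):
    return m > 1 and all(m % d for d in range(2, int(m ** 0.5) + 1))

def hypothesis(n):
    # there is a prime m with n < m <= 2n + 1 (window chosen for the sketch)
    return any(is_prime(m) for m in range(n + 1, 2 * n + 2))

random.seed(1)
test_data = [random.randint(1, 10_000) for _ in range(1_000)]   # simulated ∀
print(all(hypothesis(n) for n in test_data))  # → True
```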

    [0286] In this way, unsupervised learning can take place, in some embodiments even without checking with an external reference such as a human operator or an external database. Checking can be done, e.g., using other computation modules 7 (E.sub.1, E.sub.2, . . . , E.sub.n), in particular computation modules 7 which, together, represent an inductive limit

    [00017] →lim E.sub.i

    which can be viewed as an existential quantifier custom-character of natural logic (cf. FIG. 22B). Once a hypothesis has been checked, it can be sent for representation to another group of computation modules 7 to free the computation modules 7 C.sub.1, C.sub.2, . . . , C.sub.n for checking other hypotheses.

    [0287] FIG. 23 and FIG. 24 show examples of how the data processing device 1 can construct a sense of orientation in time and/or space, not by providing a coordinate system for time and/or space but by encoding temporal and spatial relationships in non-commuting morphisms or functors.

    [0288] In FIG. 23 three events E.sub.1, E.sub.2, E.sub.3 (e.g., frames in a movie or inputs to a sensor) which happen during different time spans are shown by way of example. Data relating to these events is inputted into the data processing device 1 and is represented by, e.g., a category A.sub.1, A.sub.2, A.sub.3 for each of the three events E.sub.1, E.sub.2, E.sub.3. In some embodiments, events which are inputted one after another within a pre-determinable time span, e.g., between 0.1 seconds and 0.5 seconds, are represented in a connected way by connecting the base categories I.sub.1, I.sub.2, I.sub.3 of the categories A.sub.1, A.sub.2, A.sub.3 by non-commuting functors ϕ.sub.1, ϕ.sub.2 (i.e., ϕ.sub.1∘ϕ.sub.2≠ϕ.sub.2∘ϕ.sub.1). Alternatively, the data processing device 1 could be explicitly instructed to connect these events. Either way, causal relationships and temporal concepts such as "earlier" and "later" can be encoded, e.g., by functors between the categories A.sub.1, A.sub.2, A.sub.3.
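That ϕ.sub.1∘ϕ.sub.2≠ϕ.sub.2∘ϕ.sub.1 can carry ordering information is easy to illustrate with matrices standing in for the functors; the matrix encoding is an assumption of this sketch, not one prescribed by the disclosure:

```python
# Sketch: non-commuting functors encode temporal order. Two 2x2 matrices are
# used as stand-ins; composing them in different orders gives different
# results, so "E1 before E2" differs from "E2 before E1".

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

phi1 = [[0, 1], [0, 0]]   # illustrative encoding of "E1 happened before E2"
phi2 = [[0, 0], [1, 0]]   # illustrative encoding of "E2 happened before E3"

print(matmul(phi1, phi2) != matmul(phi2, phi1))  # → True: order matters
```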

    [0289] Also, temporal sentences like "I will go to school tomorrow." can be analysed using the concept shown in FIG. 23 if the data processing device 1 has been trained to recognize that "will" and "tomorrow" imply that there is a present and a future. This time ordering can be represented by categories which are connected via their bases as described above.

    [0290] In FIG. 24 an image is depicted showing a ground on which sits a house having a single level covered by a roof, and a balloon floating at one side of the house. The data processing device 1 analyses the image and extracts the different objects, i.e., "ground", "level of house", "roof" and "balloon". The spatial relationships between these objects are encoded by non-commuting functors ϕ.sub.1, ϕ.sub.2 between the base categories I.sub.1, I.sub.2, I.sub.3, I.sub.4 of the categories A.sub.1, A.sub.2, A.sub.3, G. They are non-commuting because, e.g., the house stands on the ground and not the other way round. Therefore, in the example of FIG. 24, the further to the right a category is, the higher the object represented by that category is arranged in the image.

    [0291] In this example it can be seen that a base category can itself also be a fibred category having a base category (I.sub.4), which in this example is used to encode that the balloon is to one side of the house.

    [0292] FIG. 25A shows the construction of a sub-object classifier for a category custom-character, which is, e.g., a topos. It is an object Ω together with a morphism t: 1.fwdarw.Ω from the terminal object 1 to object Ω with the property that for any monomorphism f: I.fwdarw.J in custom-character there exists a unique morphism char(f): J.fwdarw.Ω (called characteristic function) such that the diagram shown in FIG. 25A is a pullback.

    [0293] FIG. 25B shows an example in which this concept can be used to define whether a certain geometric object, here a cube or box, belongs to the sub-category of polyhedrons (which forms a sub-category of all geometric shapes, as symbolized by the presence of a ball). To encode this fact in a categorical way the sub-object classifier Ω (here a two-element set) together with the morphism t: 1.fwdarw.Ω from the terminal object 1 (here a singleton set in the category of sets) to Ω is used. The cube or box is sent by char(f) to the element 1 (or "true") while the ball is sent to the element 0 (or "false").
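The two-element sub-object classifier of FIG. 25B can be sketched as follows; the set of polyhedrons used here is an illustrative stand-in:

```python
# Sketch of FIGS. 25A/B: the sub-object classifier Ω = {0, 1} with the
# characteristic function char(f) sending members of the sub-category of
# polyhedrons to 1 ("true") and everything else, e.g. a ball, to 0 ("false").

POLYHEDRONS = {"cube", "box", "tetrahedron", "octahedron"}

def char_f(shape):
    # char(f): J → Ω, the unique morphism making the square a pullback
    return 1 if shape in POLYHEDRONS else 0

print(char_f("box"))   # → 1
print(char_f("ball"))  # → 0
```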

    REFERENCE SIGNS LIST

    [0294] 1 data processing device
    [0295] 2 first interface
    [0296] 3 second interface
    [0297] 4 shared memory device
    [0298] 5 computing device
    [0299] 6 data hub process
    [0300] 61 segmentation sub-process
    [0301] 62 keying sub-process
    [0302] 7 computation module
    [0303] 71 neuronal network
    [0304] 8 video capturing device
    [0305] 9 conveyor belt
    [0306] 10 layer I
    [0307] 11 layer II
    [0308] 12 layer III
    [0309] 13 layer IV
    [0310] 14 layer V
    [0311] 15 layer VI
    [0312] 16 computational group
    [0313] 17 tetrahedron
    [0314] 18 octahedron
    [0315] 19 box
    [0316] 20 robot
    [0317] 21 artificial neuron
    [0318] 22 integration function
    [0319] 23 activation function
    [0320] 24 synapse of artificial neuron
    [0321] 25 branching of axon of artificial neuron
    [0322] 26 weight storage
    [0323] 27 random signal generator
    [0324] 28 routing process
    [0325] 29 receptor for random signal
    [0326] ID input data
    [0327] OD output data
    [0328] K.sub.i ith key
    [0329] S.sub.i ith data segment
    [0330] KS.sub.i ith keyed data segment
    [0331] L.sub.i ith layer of neuronal network
    [0332] RANDOM random signal

    [00018] ←lim C.sub.i projective limit
    [00019] →lim C.sub.i inductive limit
    [0333] Ω sub-object classifier
    [0334] ∀ universal quantifier
    [0335] custom-character existential quantifier
    [0336] custom-character pullback
    [0337] custom-character commutative diagram