Industrial data verification using secure, distributed ledger
11582042 · 2023-02-14
Assignee
Inventors
- Benjamin Edward Beckmann (Niskayuna, NY, US)
- Anilkumar Vadali (Niskayuna, NY, US)
- Lalit Keshav Mestha (North Colonie, NY, US)
- Daniel Francis Holzhauer (Santa Clarita, CA, US)
- John William Carbone (Ballston Spa, NY, US)
CPC classification
G06F16/27
PHYSICS
Y04S40/20
GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
H04L9/3239
ELECTRICITY
H04L9/3297
ELECTRICITY
International classification
G06F16/27
PHYSICS
H04L9/32
ELECTRICITY
Abstract
A verification platform may include a data connection to receive a stream of industrial asset data, including a subset of the industrial asset data, from industrial asset sensors. The verification platform may store the subset of industrial asset data into a data store, the subset of industrial asset data being marked as invalid, and record a hash value associated with a compressed representation of the subset of industrial asset data combined with metadata in a secure, distributed ledger (e.g., associated with blockchain technology). The verification platform may then receive a transaction identifier from the secure, distributed ledger and mark the subset of industrial asset data in the data store as being valid after using the transaction identifier to verify that the recorded hash value matches a hash value of an independently created version of the compressed representation of the subset of industrial asset data combined with metadata.
Claims
1. A system to facilitate industrial data verification, comprising: a verification platform comprising a verification client and a verification server, the verification platform including: a data connection of the verification client, the data connection configured to receive a stream of industrial asset data, including a subset of the industrial asset data, from industrial asset sensors, and at least one verification platform computer hardware processor coupled to the data connection and adapted to: store the subset of the industrial asset data into a data store, the subset of the industrial asset data being marked as invalid; generate, using a verification client processor of the verification client, a first Patricia-Merkle trie comprising the subset of the industrial asset data and associated metadata, wherein the associated metadata includes at least a pseudo identifier; determine, using the verification client processor, a first hash value of the first Patricia-Merkle trie; record the first hash value in a secure, distributed ledger; receive a transaction identifier from the secure, distributed ledger; and mark, using a verification server processor of the verification server, the subset of the industrial asset data in the data store as valid after using the transaction identifier to verify that the recorded first hash value matches a second hash value of an independently created second Patricia-Merkle trie comprising the subset of industrial asset data and associated metadata, wherein the independently created second Patricia-Merkle trie and the second hash value of the independently created second Patricia-Merkle trie are generated by the verification server independent of the verification client and the first Patricia-Merkle trie and the first hash value of the first Patricia-Merkle trie are generated by the verification client independent of the verification server; wherein the data store is adapted to provide information marked as valid to a consuming platform.
2. The system of claim 1, wherein each node of the trie is associated with at least a portion of the subset of the industrial data and the associated metadata.
3. The system of claim 2, wherein the trie metadata comprises a Patricia-Merkle trie.
4. The system of claim 1, wherein the associated metadata includes at least one of: (i) a time stamp, (ii) a unique client identifier, and (iii) data shape information.
5. The system of claim 1, wherein the verification platform is associated with at least one of: (i) a single network cloud-hosted topology, (ii) a multiple network cloud-hosted topology, and (iii) a participant hosted intranet environment.
6. The system of claim 1, wherein the industrial asset sensors are associated with at least one of: (i) an engine, (ii) an aircraft, (iii) a locomotive, (iv) power generation, and (v) a wind turbine.
7. The system of claim 1, wherein the secure, distributed ledger comprises blockchain technology.
8. A method associated with industrial data verification, comprising: receiving, at a computer processor of a verification platform, a stream of industrial asset data, including a subset of the industrial asset data, from industrial asset sensors, the verification platform comprising a verification client and a verification server; storing, by the verification platform, the subset of the industrial asset data into a data store, the subset of the industrial asset data marked as invalid in the data store; generating, by a verification client processor of the verification client, a first Patricia-Merkle trie comprising the subset of the industrial asset data and associated metadata, wherein the associated metadata includes at least a pseudo identifier; determining, using the verification client processor, a first hash value of the first Patricia-Merkle trie; recording, by the verification platform, the first hash value of the first Patricia-Merkle trie in a secure, distributed ledger; receiving, at the verification platform, a transaction identifier from the secure, distributed ledger; and marking, using a verification server processor of the verification server, the subset of the industrial asset data in the data store as valid after using the transaction identifier to verify, at the verification platform, that the first recorded hash value matches a second hash value associated with an independently created second Patricia-Merkle trie comprising the subset of the industrial asset data and the associated metadata, wherein the independently created second Patricia-Merkle trie and the second hash value of the independently created second Patricia-Merkle trie are generated by the verification server independent of the verification client; wherein the data store is adapted to provide information marked as valid to a consuming platform.
9. The method of claim 8, wherein the trie comprises a Patricia-Merkle trie.
10. The method of claim 8, wherein the associated metadata comprises at least one of: (i) a time stamp, (ii) a unique client identifier, and (iii) data shape information.
11. The method of claim 8, wherein the secure, distributed ledger comprises blockchain technology.
12. A system to facilitate industrial data verification, comprising: a verification client, including: a data connection to receive a stream of industrial asset data, including a subset of the industrial asset data, from industrial asset sensors, and a verification client computer hardware processor coupled to the data connection and adapted to: create a first Patricia-Merkle trie comprising the subset of the industrial asset data and associated metadata, determine a hash value of the first Patricia-Merkle trie, receive a pseudo identifier from a verification engine, and transmit the subset of the industrial asset data to a verification server along with the associated metadata, the verification engine, including: a verification engine computer processor adapted to: receive the hash value from the verification client, transmit the pseudo identifier to the verification client, record the hash value in a secure, distributed ledger, receive a transaction identifier from the secure, distributed ledger, and transmit the pseudo identifier and the transaction identifier to the verification server, the verification server, including: a verification server computer processor adapted to: receive the subset of the industrial asset data and the associated metadata from the verification client, receive the pseudo identifier and the transaction identifier from the verification engine, store the subset of the industrial asset data into a data store, the subset of the industrial asset data marked as invalid in the data store, independently create a second Patricia-Merkle trie comprising the subset of the industrial asset data and the associated metadata, the second Patricia-Merkle trie generated independently of the verification client and the first Patricia-Merkle trie, determine an independent hash value associated with the second Patricia-Merkle trie, retrieve the hash value from the secure, distributed ledger, and mark the subset of the industrial asset data in the data store as valid after verifying that the recorded hash value matches the independent hash value associated with the second Patricia-Merkle trie, and the data store, wherein the data store is adapted to provide information marked as valid to a consuming platform.
13. The system of claim 12, wherein the associated metadata includes at least one of: (i) the pseudo identifier, (ii) a time stamp, (iii) a unique client identifier, and (iv) data shape information.
14. The system of claim 12, wherein the verification client is associated with at least one of: (i) a single network cloud-hosted topology, (ii) a multiple network cloud-hosted topology, and (iii) a participant hosted intranet environment.
15. The system of claim 12, wherein the industrial asset sensors are associated with at least one of: (i) an engine, (ii) an aircraft, (iii) a locomotive, (iv) power generation, and (v) a wind turbine.
16. The system of claim 12, wherein the secure, distributed ledger comprises blockchain technology.
17. The system of claim 12, further comprising: an industrial asset item including the industrial asset sensors to generate the stream of industrial asset data, including the subset of the industrial asset data.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
DETAILED DESCRIPTION
(22) In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of embodiments. However, it will be understood by those of ordinary skill in the art that the embodiments may be practiced without these specific details. In other instances, well-known methods, procedures, components and circuits have not been described in detail so as not to obscure the embodiments.
(23) One or more specific embodiments of the present invention will be described below. In an effort to provide a concise description of these embodiments, all features of an actual implementation may not be described in the specification. It should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another. Moreover, it should be appreciated that such a development effort might be complex and time consuming, but would nevertheless be a routine undertaking of design, fabrication, and manufacture for those of ordinary skill having the benefit of this disclosure.
(24) It may generally be desirable to efficiently and accurately facilitate industrial data verification.
(25) According to some embodiments, the data store 160 stores electronic records defining the received stream of industrial data 120. According to some embodiments, the verification platform 150 and/or other elements of the system may then record information about various transactions using the secure, distributed ledger 190 (e.g., via a blockchain verification process). For example, the verification platform 150 might record a date and time, hash value, etc. via the secure, distributed ledger 190 in accordance with any of the embodiments described herein. According to some embodiments, the distributed ledger might be associated with the HYPERLEDGER® blockchain verification system. Note that the verification platform 150 could be completely de-centralized and/or might be associated with a third party, such as a vendor that performs a service for an enterprise.
(26) The verification platform 150 might be, for example, associated with a Personal Computer (“PC”), laptop computer, a tablet computer, a smartphone, an enterprise server, a server farm, and/or a database or similar storage devices. According to some embodiments, an “automated” verification platform 150 may automatically verify industrial data. As used herein, the term “automated” may refer to, for example, actions that can be performed with little (or no) intervention by a human.
(27) As used herein, devices, including those associated with the verification platform 150 and any other device described herein, may exchange information via any communication network which may be one or more of a Local Area Network (“LAN”), a Metropolitan Area Network (“MAN”), a Wide Area Network (“WAN”), a proprietary network, a Public Switched Telephone Network (“PSTN”), a Wireless Application Protocol (“WAP”) network, a Bluetooth network, a wireless LAN network, and/or an Internet Protocol (“IP”) network such as the Internet, an intranet, or an extranet. Note that any devices described herein may communicate via one or more such communication networks.
(28) The verification platform 150 may store information into and/or retrieve information from data stores. The data stores might, for example, store electronic records representing industrial asset sensor data, operational data, etc. The data stores may be locally stored or reside remote from the verification platform 150. Although a single verification platform 150 is shown in
(29) In this way, the system 100 may efficiently and accurately facilitate industrial data verification. For example,
(30) At 210, a computer processor of a verification platform may receive a stream of industrial asset data, including a subset of the industrial asset data (e.g., a “packet” of data), from industrial asset sensors. Note that the verification platform might be associated with a single network cloud-hosted topology, a multiple network cloud-hosted topology, a participant hosted intranet environment, etc. Moreover, the industrial asset item might be associated with, by way of example only, an engine, an aircraft, a locomotive, power generation, a wind turbine, etc. At 220, the verification platform may store the subset of industrial asset data into a data store, the subset of industrial asset data being marked as invalid.
(31) At 230, the verification platform may record a hash value associated with a compressed representation of the subset of industrial asset data combined with metadata in a secure, distributed ledger. Although other types of compressed representations of data might be used, according to some embodiments the compressed representation of the subset of industrial data combined with “metadata” is a trie. Note that the metadata might include, for example, a pseudo identifier, a time stamp, a unique client identifier, data shape information (e.g., the depth and/or width of the data), etc.
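As a rough illustration of the step at 230, the following Python sketch combines a packet of sensor readings with the kinds of metadata listed above and derives a single hash value. The field names, the sample readings, the use of canonical JSON in place of a trie, and the choice of SHA-256 are all assumptions made for illustration only; the description does not prescribe any of them.

```python
import hashlib
import json
import uuid
from datetime import datetime, timezone


def data_shape(rows):
    """Return (depth, width) of a rectangular packet: row count and values per row."""
    depth = len(rows)
    width = len(rows[0]) if rows else 0
    return depth, width


def build_metadata(rows, client_id, pseudo_id, timestamp=None):
    """Assemble the metadata named in the text (all field names are hypothetical)."""
    depth, width = data_shape(rows)
    return {
        "pseudo_id": pseudo_id,
        "time_stamp": timestamp or datetime.now(timezone.utc).isoformat(),
        "client_id": client_id,
        "shape": {"depth": depth, "width": width},
    }


def fingerprint(rows, metadata):
    """Hash a canonical serialization of the data combined with its metadata.

    Canonical JSON stands in for the compressed representation (e.g., a trie).
    """
    canonical = json.dumps({"data": rows, "metadata": metadata},
                           sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Made-up sensor rows; a fixed time stamp keeps the example reproducible.
packet = [[510.2, 0.93], [511.0, 0.95], [509.8, 0.91]]
meta = build_metadata(packet, client_id="client-A", pseudo_id=str(uuid.uuid4()),
                      timestamp="2023-02-14T00:00:00+00:00")
digest = fingerprint(packet, meta)
```

Because the serialization is canonical, any party holding the same data and metadata can recompute the same digest, which is the property the later comparison against the ledger relies on.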
(32) Referring again to
(33) In this way, a data verification platform may protect and authenticate sensor data output from industrial systems and further ensure that corrupted data does not flow to other important systems. A more detailed description of a process to verify industrial data, utilizing the secure aspects of a distributed ledger (such as blockchain technologies) along with a compression data structure such as a trie, is provided in connection with the system of
(34) The verification client 452 initially establishes a connection with an industrial asset and waits for data to be sent over. Once the verification client 452 receives a packet of data, it utilizes a data structure (e.g., a trie) to store the data. As described with respect to
(35) The verification engine 454 may be initially connected to the verification client 452 and listen for a data packet containing the hash of the trie created by the verification client 452. Once the hash is received, the verification engine 454 sends back a pseudo identifier. The verification engine 454 may then store or record the hash into a secure, distributed ledger 490 at (C1) and receive back a transaction identifier at (C2) that can be used to monitor the stored hash in the ledger 490 (e.g., blockchain). Next, the verification engine 454 closes the connection with the verification client 452 and opens a connection with the verification server 456. Once that connection is open, the verification engine 454 may send the transaction identifier and pseudo identifier to the verification server 456 at (D) and the verification server 456 can utilize both identifiers accordingly.
(36) The verification server 456 may continuously listen to both the verification client 452 and the verification engine 454 waiting for information. First, the verification server 456 may receive the transaction identifier and the pseudo identifier from the verification engine 454 at (D) and store them for future use. The verification server 456 may also receive the data packet that was sent from the verification client 452 at (E) and store it into a data store 460 at (F). At this point, all the data is invalid and is marked as such in the data store (as illustrated by the dashed arrow in
(37) In this way, the system 400 may help ensure that the sensor data received by controllers and operators is indeed anchored in time and has been verified. According to some embodiments, this is achieved through utilizing secure infrastructures, such as blockchains and cryptographically protected compression data structures (e.g., a Patricia-Merkle trie) to safeguard the data. Furthermore, embodiments may let a user know exactly when data has been changed and also help the user respond as soon as possible.
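The exchange among the verification client, verification engine, and verification server described above can be sketched as follows. This is a minimal single-process simulation under stated assumptions: the `Ledger` and `Server` classes, the `trie_hash` helper (a JSON-plus-SHA-256 stand-in for a trie root hash), and the `run_once` driver are all hypothetical names, and a real deployment would use network connections and an actual distributed ledger.

```python
import hashlib
import json
import uuid


def trie_hash(packet, metadata):
    """Stand-in for the root hash of a trie built from the packet plus metadata."""
    blob = json.dumps({"data": packet, "meta": metadata}, sort_keys=True)
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


class Ledger:
    """In-memory stand-in for a secure, distributed ledger."""

    def __init__(self):
        self._records = {}

    def record(self, hash_value):
        tx_id = uuid.uuid4().hex          # transaction identifier returned at (C2)
        self._records[tx_id] = hash_value
        return tx_id

    def lookup(self, tx_id):
        return self._records[tx_id]


class Server:
    """Verification server: stores packets as invalid, then validates them."""

    def __init__(self, ledger):
        self.ledger = ledger
        self.tx_by_pseudo = {}
        self.store = {}                   # pseudo identifier -> stored record

    def on_engine_message(self, pseudo_id, tx_id):           # (D)
        self.tx_by_pseudo[pseudo_id] = tx_id

    def on_client_packet(self, pseudo_id, packet, metadata):  # (E), (F)
        self.store[pseudo_id] = {"data": packet, "meta": metadata, "valid": False}

    def verify(self, pseudo_id):
        record = self.store[pseudo_id]
        recorded = self.ledger.lookup(self.tx_by_pseudo[pseudo_id])
        independent = trie_hash(record["data"], record["meta"])
        record["valid"] = recorded == independent
        return record["valid"]


def run_once(packet, metadata, tamper=False):
    """Play the client and engine roles against one server and return the verdict."""
    ledger = Ledger()
    server = Server(ledger)
    client_hash = trie_hash(packet, metadata)   # client hashes its own trie
    pseudo_id = uuid.uuid4().hex                # engine replies with a pseudo identifier
    tx_id = ledger.record(client_hash)          # engine records the hash, (C1)/(C2)
    server.on_engine_message(pseudo_id, tx_id)
    sent = [list(row) for row in packet]
    if tamper:
        sent[0][0] += 1.0                       # simulate corruption in transit
    server.on_client_packet(pseudo_id, sent, metadata)
    return server.verify(pseudo_id)
```

In this sketch the pseudo identifier is used only to correlate the engine's message with the client's packet at the server; whether it also enters the hashed metadata is left open here.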
(39) At 510, a trie, such as a Patricia-Merkle trie as described with respect to
(40) According to some embodiments, the lossless protection procedure might be associated with a “Merkle tree.”
(41) To implement a “tree authentication” method for a vector of data items $Y = Y_1, Y_2, \ldots, Y_n$, a method is provided to authenticate a randomly chosen $Y_i$. To authenticate the $Y_i$, define the function $H(i, j, Y)$ as follows:
(42) $H(i, i, Y) = F(Y_i)$
$H(i, j, Y) = F\big(H(i, (i+j-1)/2, Y),\ H((i+j+1)/2, j, Y)\big)$
(43) where $F(Y_i)$ is a one-way function. $H(i, j, Y)$ is a one-way function of $Y_i, Y_{i+1}, \ldots, Y_j$, and $H(1, n, Y)$ can be used to authenticate $Y_1$ through $Y_n$. $H(1, n, Y)$ is a one-way function of all the $Y_i$ ($H(1, n, Y)$ might comprise, by way of example only, 100 bits of data). In this way, a receiver may selectively authenticate any “leaf,” $Y_i$, of the binary tree 600 defined with the function $H(i, n, Y)$.
(44) For example, the sequence of recursive calls required to compute the root, H(1, 8, Y) of the binary tree 600 is shown in
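The recursive definition of H(i, j, Y) given above can be sketched in Python as follows. The choice of SHA-256 for the one-way function F, the length-prefixed concatenation of its arguments, and the optional `trace` list are assumptions made for illustration; indices are 1-based to match the text.

```python
import hashlib


def F(*parts):
    """One-way function over one or two byte strings (SHA-256 is an assumption)."""
    h = hashlib.sha256()
    for part in parts:
        h.update(len(part).to_bytes(4, "big"))  # length prefix avoids ambiguity
        h.update(part)
    return h.digest()


def H(i, j, Y, trace=None):
    """Compute H(i, j, Y) for 1-based indices i <= j over the list Y of byte strings."""
    if trace is not None:
        trace.append((i, j))                    # record the order of recursive calls
    if i == j:
        return F(Y[i - 1])                      # leaf: H(i, i, Y) = F(Y_i)
    left_end = (i + j - 1) // 2
    right_start = (i + j + 1) // 2
    return F(H(i, left_end, Y, trace), H(right_start, j, Y, trace))


Y = [f"reading-{k}".encode("utf-8") for k in range(1, 9)]  # eight made-up leaves
calls = []
root = H(1, 8, Y, calls)
```

For eight leaves the root call H(1, 8, Y) first splits into H(1, 4, Y) and H(5, 8, Y), and `calls` records every (i, j) pair visited, which gives one way to inspect the sequence of recursive calls.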
(45) Some embodiments described herein utilize a specific type of Merkle tree referred to as a Practical Algorithm To Retrieve Information Coded In Alphanumeric (“PATRICIA”) trie, or a Patricia-Merkle trie. A Patricia-Merkle trie may provide a cryptographically authenticated data structure that can be used to store all (key, value) bindings. Such tries may be fully deterministic, meaning that a Patricia trie with the same (key, value) bindings is guaranteed to be exactly the same down to the last byte and therefore to have the same root hash. Moreover, a Patricia-Merkle trie may provide O(log(n)) efficiency for inserts, lookups, and deletes. Note that the use of a Patricia-Merkle trie as a method to compress, store, and uniquely identify data as described herein (e.g., instead of a hash table) means that there will not be any key collisions that may corrupt or overwrite existing data. Additionally, the compression properties of the Patricia-Merkle trie and its relatively low time and space complexity may allow for a substantial amount of data to be stored within the trie. Moreover, the system may quickly determine if the data has been corrupted. As a result, the ability to utilize the root node hash of the trie as a fingerprint of the data stored in the trie can help with validation and verification in a relatively quick fashion.
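The determinism and root-hash fingerprint properties described above can be illustrated with a deliberately simplified Merkle-hashed trie. Note that this sketch is not a full Patricia-Merkle trie: it keeps one character per edge and omits path compression, and the node layout, SHA-256, and sample keys are assumptions chosen for brevity.

```python
import hashlib


class Node:
    """One trie node: an optional value plus children keyed by the next character."""

    def __init__(self):
        self.children = {}
        self.value = None


def insert(root, key, value):
    """Store value (bytes) under key (str), creating nodes along the path."""
    node = root
    for ch in key:
        node = node.children.setdefault(ch, Node())
    node.value = value


def node_hash(node):
    """Hash a node from its value and its children's hashes, in sorted edge order."""
    h = hashlib.sha256()
    if node.value is None:
        h.update(b"\x00")
    else:
        h.update(b"\x01" + len(node.value).to_bytes(4, "big") + node.value)
    for ch in sorted(node.children):
        edge = ch.encode("utf-8")
        h.update(len(edge).to_bytes(1, "big") + edge)
        h.update(node_hash(node.children[ch]))
    return h.digest()


def root_hash(bindings):
    """Build a trie from (key, value) pairs and return its root hash as hex."""
    root = Node()
    for key, value in bindings:
        insert(root, key, value)
    return node_hash(root).hex()


# Same bindings in two insertion orders, then one binding altered.
a = root_hash([("temp/1", b"510.2"), ("temp/2", b"511.0"), ("rpm/1", b"3600")])
b = root_hash([("rpm/1", b"3600"), ("temp/2", b"511.0"), ("temp/1", b"510.2")])
c = root_hash([("temp/1", b"510.2"), ("temp/2", b"511.1"), ("rpm/1", b"3600")])
```

Here `a` and `b` are equal because the hash depends only on the stored bindings, not on insertion order, while `c` differs because a single value changed.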
(47) Although some embodiments are described using specific blockchain technologies, note that other approaches could be incorporated. For example, a Chainpoint platform for blockchains might be utilized to allow for the creation of a timestamp proof of the data and verify the existence and integrity of data stored in a blockchain. That is, a verification platform and the Chainpoint proof could be employed as a verification tool, rather than manually checking if the hashes match at a verification server.
(51) Embodiments described herein may comprise a tool that facilitates industrial data verification and may be implemented using any number of different hardware configurations. For example,
(52) The processor 1510 also communicates with a storage device 1530. The storage device 1530 may comprise any appropriate information storage device, including combinations of magnetic storage devices (e.g., a hard disk drive), optical storage devices, mobile telephones, and/or semiconductor memory devices. The storage device 1530 stores a program 1512 and/or network security service tool or application for controlling the processor 1510. The processor 1510 performs instructions of the program 1512, and thereby operates in accordance with any of the embodiments described herein. For example, the processor 1510 may receive a stream of industrial asset data, including a subset of the industrial asset data, from industrial asset sensors. The processor 1510 may store the subset of industrial asset data into a data store 1600, the subset of industrial asset data being marked as invalid, and record a hash value associated with a compressed representation of the subset of industrial asset data combined with metadata in a secure, distributed ledger (e.g., associated with blockchain technology). The processor 1510 may then receive a transaction identifier from the secure, distributed ledger and mark the subset of industrial asset data in the data store 1600 as being valid after using the transaction identifier to verify that the recorded hash value matches a hash value of an independently created version of the compressed representation of the subset of industrial asset data combined with metadata.
(53) The program 1512 may be stored in a compressed, uncompiled and/or encrypted format. The program 1512 may furthermore include other program elements, such as an operating system, a database management system, and/or device drivers used by the processor 1510 to interface with peripheral devices.
(54) As used herein, information may be “received” by or “transmitted” to, for example: (i) the platform 1500 from another device; or (ii) a software application or module within the platform 1500 from another software application, module, or any other source.
(55) In some embodiments (such as shown in
(56) Referring to
(57) The transaction identifier 1602 may be, for example, a unique alphanumeric code identifying a packet of data that has been received from industrial asset sensors (e.g., as part of a larger stream of data). The subset of industrial data 1604 may include the actual values received from the sensors (e.g., temperatures, speeds, power levels, etc.). The date and time 1606 may indicate when the data was generated or received by the system. The validity indication 1608 might indicate that the data is “invalid” (not yet verified) or “valid” (e.g., the hash of an independently created Patricia-Merkle trie matched a hash value recorded in a secure, distributed ledger). The data store 1600 may be configured such that information associated with a validity indication of “valid” may be made available to remote consuming platforms.
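The record layout described above, and the rule that only information marked valid is made available to consuming platforms, can be sketched as follows. The `Record` and `DataStore` names, field names, and sample values are hypothetical and chosen only to mirror the four fields listed in the text.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List


@dataclass
class Record:
    transaction_id: str        # unique code identifying the packet
    data: List[float]          # values received from the sensors
    received_at: datetime      # date and time the data was generated or received
    valid: bool = False        # every packet starts out "invalid" (not yet verified)


class DataStore:
    """In-memory stand-in for the data store."""

    def __init__(self):
        self._records: Dict[str, Record] = {}

    def put(self, record):
        self._records[record.transaction_id] = record

    def mark_valid(self, transaction_id):
        self._records[transaction_id].valid = True

    def for_consumers(self):
        """Return only the records whose validity indication is 'valid'."""
        return [r for r in self._records.values() if r.valid]


store = DataStore()
store.put(Record("TX_0001", [510.2, 511.0], datetime(2023, 2, 14, 12, 0)))
store.put(Record("TX_0002", [3600.0, 3598.5], datetime(2023, 2, 14, 12, 1)))
store.mark_valid("TX_0001")    # e.g., after the hashes were found to match
visible = store.for_consumers()
```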
(58) Although specific hardware and data configurations have been described herein, note that any number of other configurations may be provided in accordance with embodiments of the present invention (e.g., some of the information described herein may be combined or stored in external systems). Similarly, the displays shown and described herein are provided only as examples, and other types of displays and display devices may support any of the embodiments. For example,
(59) Embodiments may be associated with any type of distributed ledger having a de-centralized consensus-based network that supports smart contracts, digital assets, record repositories, and/or cryptographic security. For example,
(60) Thus, some embodiments described herein may have a technical advantage because the system is able to receive data from sensors while also creating the trie from the data received, all inline. As a result, there is no need for the system to wait until all the data is received; rather, it may start constructing the trie as data arrives, without substantial lag. Additionally, embodiments may be blockchain agnostic, meaning that any type of blockchain can be used and the verification platform will still function. For example, when one blockchain is taking a very long time to confirm transactions, another (faster) blockchain may be swapped in to reduce confirmation times. Furthermore, embodiments may be applicable to any situation that needs data verification. That is, the model does not depend on the input of the data or where the input is coming from, and embodiments may read data, determine the shape, create a Patricia-Merkle trie from the data, and continue with the data verification process by validating or invalidating the hash of the trie along with the associated metadata. In other words, there is no data type dependency associated with the embodiments described herein. In addition, embodiments may be deployed within controlled environments such as inside factories or even within industrial equipment to properly verify and authenticate data.
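One way to picture the blockchain-agnostic property is a small interface that any ledger backend could satisfy, so that one backend can be swapped for another without changing the verification logic. The `Ledger` protocol, the two in-memory backends, and `verify_via` below are hypothetical illustrations and do not correspond to any real blockchain client API.

```python
import hashlib
from typing import Dict, Protocol


class Ledger(Protocol):
    """Minimal operations the verification logic needs from any ledger (assumed)."""

    def record(self, hash_value: str) -> str: ...

    def retrieve(self, transaction_id: str) -> str: ...


class InMemoryChain:
    """Toy backend; the prefix only distinguishes one backend from another."""

    def __init__(self, prefix: str) -> None:
        self._prefix = prefix
        self._entries: Dict[str, str] = {}

    def record(self, hash_value: str) -> str:
        transaction_id = f"{self._prefix}-{len(self._entries)}"
        self._entries[transaction_id] = hash_value
        return transaction_id

    def retrieve(self, transaction_id: str) -> str:
        return self._entries[transaction_id]


def verify_via(ledger: Ledger, payload: bytes, received: bytes) -> bool:
    """Record the hash of payload, then check received against the recorded hash."""
    transaction_id = ledger.record(hashlib.sha256(payload).hexdigest())
    return ledger.retrieve(transaction_id) == hashlib.sha256(received).hexdigest()


slow_chain = InMemoryChain("slow")
fast_chain = InMemoryChain("fast")   # swapped in without touching verify_via
```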
(61) Note that the security of an industrial verification system may be enhanced when only certain elements of the system have knowledge of various types of information (e.g., to prevent unauthorized access to a single element from learning every type of information). For example,
(62) As another example,
(63) The following illustrates various additional embodiments of the invention. These do not constitute a definition of all possible embodiments, and those skilled in the art will understand that the present invention is applicable to many other embodiments. Further, although the following embodiments are briefly described for clarity, those skilled in the art will understand how to make any changes, if necessary, to the above-described apparatus and methods to accommodate these and other embodiments and applications.
(64) Some embodiments have been described with respect to information associated with an “industrial asset,” which might include, for example, sensors, actuators, controllers, etc. Moreover, note that embodiments described herein may interact with an automated cyber-security system that monitors one or more industrial assets, including those associated with power generation, Unmanned Aerial Vehicle (“UAV”) fleets, propulsion, healthcare scanners, etc. As another example,
(65) The present invention has been described in terms of several embodiments solely for the purpose of illustration. Persons skilled in the art will recognize from this description that the invention is not limited to the embodiments described, but may be practiced with modifications and alterations limited only by the spirit and scope of the appended claims.