METHOD FOR IDENTIFYING AND PARSING INDUSTRIAL CONTROL PROTOCOL BASED ON INDUSTRIAL GATEWAY
20220206473 · 2022-06-30
Inventors
- Tie Qiu (Tianjin, CN)
- Xiaoyu Jiang (Tianjin, CN)
- Keqiu Li (Tianjin, CN)
- Jiancheng CHI (Tianjin, CN)
- Ning Chen (Tianjin, CN)
Cpc classification
H04L43/0876
ELECTRICITY
G05B19/41885
PHYSICS
H04L67/125
ELECTRICITY
H04L41/145
ELECTRICITY
H04L69/18
ELECTRICITY
H04W4/70
ELECTRICITY
G05B19/4183
PHYSICS
International classification
Abstract
Disclosed is a method for identifying and parsing an industrial control protocol based on an industrial gateway. The industrial gateway captures, through a serial port and a network port, messages sent from a host computer and a lower computer to the industrial gateway, extracts features representing different protocol types and protocol fields from the messages, and identifies and parses the messages based on protocol character features.
Claims
1. A method for identifying and parsing an industrial control protocol based on an industrial gateway, wherein the industrial gateway captures, through a serial port and a network port, messages sent from a host computer and a lower computer to the industrial gateway, extracts features representing different protocol types and protocol fields from the messages, and identifies and parses the messages based on protocol character features, the method including steps comprising:
step 1, establishing, by the industrial gateway, serial communication with a sensing device and socket communication with a client, and determining a source of a current message based on respective amount of data existing in two channels; and if the message originates from the lower computer, proceeding to step 2; or if the message originates from the host computer, proceeding to step 3;
step 2, reading data from the serial port; determining, based on that the message originates from the serial communication, that a communication protocol used to transmit the current message is Modbus-RTU protocol since the serial communication of the industrial gateway currently only supports the Modbus-RTU protocol, and parsing the current message based on a communication format of the Modbus-RTU protocol in a protocol field template library to complete the protocol identification and parsing of the message;
step 3, receiving, by the industrial gateway, the message from the client through the network port, splitting the message by layer according to a five-layer network architecture, obtaining a port number used by a transport layer, and matching a protocol type corresponding to a port based on the port number; if the matching succeeds, the port has a corresponding protocol, proceeding to step 4; or if the matching fails, proceeding to step 5;
step 4, matching a parsing template corresponding to the protocol type in the protocol field template library based on the protocol type identified by the industrial gateway, and parsing the message based on a parsing format corresponding to the parsing template, that is, a field dictionary;
step 5, matching the message with templates in a protocol type template library; if the matching succeeds, identifying the message as a protocol type corresponding to the template, performing step 4; or if the matching fails, proceeding to step 6;
step 6, marking the message as an unknown protocol message, sending a manual intervention request to the gateway, dispatching manual identification and parsing tasks to relevant personnel, and adding identification and parsing results as new templates into the protocol type template library and the protocol field template library; and
step 7, repeating steps 2, 3, 4, 5, and 6 until there is no data in the channels.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0019]
[0020]
DETAILED DESCRIPTION OF THE EMBODIMENTS
[0021] The present disclosure is further described in detail below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described herein are merely intended to explain but not to limit the present disclosure.
[0022] Step 1: The RS-485 and RS-232 serial ports of the industrial gateway are enabled, and socket communication between a client (client side) and the industrial gateway (server side) is established. The data transmission channels of the serial ports and the network ports are monitored simultaneously, and a first-in, first-out (FIFO) policy is adopted so that the message that reaches the industrial gateway first is processed first. The current communication mode of the industrial gateway is determined from the source of that message. If the current communication mode is serial communication, go to step 2; if it is socket communication, go to step 3.
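The source decision of step 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the gateway's implementation: the two channels are modelled as in-memory buffers of (arrival time, payload) entries, and the names `Buffer` and `next_message` are hypothetical.

```python
from collections import deque
from typing import Deque, Optional, Tuple

# Hypothetical stand-ins for the serial and socket receive buffers.
# Each entry is (arrival_time, payload).
Buffer = Deque[Tuple[float, bytes]]


def next_message(serial_buf: Buffer, socket_buf: Buffer) -> Optional[Tuple[str, bytes]]:
    """Return (source, payload) for the earliest pending message, or None if both are empty."""
    candidates = []
    if serial_buf:
        candidates.append((serial_buf[0][0], "serial", serial_buf))
    if socket_buf:
        candidates.append((socket_buf[0][0], "socket", socket_buf))
    if not candidates:
        return None
    # FIFO across both channels: the oldest head-of-buffer entry is served first.
    _, source, buf = min(candidates, key=lambda c: c[0])
    _, payload = buf.popleft()
    return source, payload
```

A result of `"serial"` corresponds to continuing with step 2, and `"socket"` to continuing with step 3.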
[0023] Step 2: The amount of data waiting to be received is obtained from the serial port, messages of the corresponding data amount are read, the channel from which the messages were captured is released, and the next data transmission mode is determined. For a message captured on the serial port, Modbus-RTU is treated as the only industrial control protocol carried over serial communication, since the serial communication of the industrial gateway currently supports only the Modbus-RTU protocol. The protocol type of the captured message can therefore be identified as Modbus-RTU from the communication mode alone. The protocol field template library is searched for the message communication format corresponding to Modbus-RTU, and the message is parsed according to the matched field template.
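As an illustration of parsing a serial message against a Modbus-RTU format, the sketch below splits a frame into the standard Modbus-RTU fields (1-byte address, 1-byte function code, data, and a 2-byte CRC-16 sent low byte first) and checks the CRC. The function names and the dictionary keys are hypothetical; the disclosure does not give the contents of its field template.

```python
from typing import Dict, Union


def crc16_modbus(data: bytes) -> int:
    """CRC-16/MODBUS: initial value 0xFFFF, reflected polynomial 0xA001."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc


def parse_modbus_rtu(frame: bytes) -> Dict[str, Union[int, bytes, bool]]:
    """Split a Modbus-RTU frame into its fields and verify the trailing CRC."""
    if len(frame) < 4:
        raise ValueError("frame too short for Modbus-RTU")
    body, crc_bytes = frame[:-2], frame[-2:]
    received_crc = int.from_bytes(crc_bytes, "little")  # low byte is sent first
    return {
        "address": body[0],
        "function": body[1],
        "data": body[2:],
        "crc": received_crc,
        "crc_ok": received_crc == crc16_modbus(body),
    }
```

For example, `parse_modbus_rtu(bytes.fromhex("010300000001840a"))` describes a request to device address 1 with function code 3 (read holding registers).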
[0024] Step 3: The industrial gateway, acting as the server side, monitors and captures a message sent from a socket client, sends a prompt to the client after the message is successfully received, and then cuts off the communication. The captured message is split in sequence according to the five-layer network architecture (application layer, transport layer, network layer, data link layer, and physical layer), the port number used by TCP/UDP is obtained at the transport layer, and the protocol type corresponding to the port number is looked up in a port dictionary. If the matching succeeds, go to step 4; if the matching fails, go to step 5.
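The port lookup of step 3 can be sketched as below. Both TCP and UDP headers begin with a 16-bit source port followed by a 16-bit destination port in network byte order, which is all the sketch relies on. The dictionary contents are an assumption (commonly used default ports, not a list taken from the disclosure), and `match_protocol_by_port` is a hypothetical name.

```python
import struct
from typing import Optional

# Assumed port dictionary: commonly used default ports, for illustration only.
PORT_DICTIONARY = {
    502: "Modbus-TCP",
    102: "S7comm",
    20000: "DNP3",
    44818: "EtherNet/IP",
}


def match_protocol_by_port(transport_header: bytes) -> Optional[str]:
    """Look up the protocol by destination port, then by source port; None if unmatched."""
    if len(transport_header) < 4:
        raise ValueError("transport header too short to contain port numbers")
    src_port, dst_port = struct.unpack("!HH", transport_header[:4])
    return PORT_DICTIONARY.get(dst_port) or PORT_DICTIONARY.get(src_port)
```

A return value of `None` corresponds to the failed match that leads to step 5.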
[0025] Step 4: The network message is split based on the five-layer architecture. The message content above the application layer is parsed based on the general format of network protocols, and the RAW part of the application layer is parsed separately. A parsing template corresponding to the identified protocol type is matched in the protocol field template library, the longest common subsequence (LCS) is taken as the basis of the similarity measure, and the message is parsed based on the parsing format corresponding to the template having the highest similarity with the message.
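Parsing the RAW application-layer bytes with a field dictionary might look like the sketch below. The template representation (an ordered list of field names and byte lengths) is an assumption, as the disclosure does not specify one. The example template follows the well-known Modbus-TCP layout of an MBAP header followed by the function code and data.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical field template: (field name, length in bytes).
# A length of None means "all remaining bytes" and must come last.
FieldTemplate = List[Tuple[str, Optional[int]]]

MODBUS_TCP_TEMPLATE: FieldTemplate = [
    ("transaction_id", 2),
    ("protocol_id", 2),
    ("length", 2),
    ("unit_id", 1),
    ("function_code", 1),
    ("data", None),
]


def parse_with_template(payload: bytes, template: FieldTemplate) -> Dict[str, bytes]:
    """Slice the payload into named fields in template order."""
    fields: Dict[str, bytes] = {}
    offset = 0
    for name, size in template:
        end = len(payload) if size is None else offset + size
        if end > len(payload):
            raise ValueError(f"payload too short for field {name!r}")
        fields[name] = payload[offset:end]
        offset = end
    return fields
```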
[0026] Specifically, a template matching formula may be in the form of equation (1).
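The equation itself is not reproduced in this text. A form consistent with the definitions that follow, and with the rule in step 4 of selecting the template having the highest similarity, would be the following; it is offered as an assumption, not as the original formula:

```latex
T_{match} = \underset{i \in T}{\arg\max}\; Similarity_{lcs}(i) \tag{1}
```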
[0027] Where $T$ and $T_{match}$ represent a template list and a template matching result respectively; and $Similarity_{lcs}(i)$ represents a similarity measure result between a message and the i-th template in $T$.
[0028] Specifically, a similarity measure formula may be in the form of equation (2).
[0029] Where $t$ and $m$ represent a template and a message respectively; $l_t$ and $l_m$ represent a length of the template and a length of the message respectively; and $l_{LCS}$ represents the length of the LCS between the message $m$ and the template $t$.
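The LCS-based matching described above can be sketched as follows. The LCS length is computed by standard dynamic programming. The normalization used in `similarity_lcs`, namely 2 · l_LCS / (l_t + l_m), is an assumption chosen only because it uses the three quantities defined above and yields a value between 0 and 1. The exact form of equation (2) is not available here, and all function names are hypothetical.

```python
from typing import Sequence, Tuple


def lcs_length(a: bytes, b: bytes) -> int:
    """Length of the longest common subsequence, by row-wise dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity_lcs(template: bytes, message: bytes) -> float:
    """ASSUMED normalization: 2 * l_LCS / (l_t + l_m), a value in [0, 1]."""
    total = len(template) + len(message)
    return 2 * lcs_length(template, message) / total if total else 0.0


def best_template(message: bytes, templates: Sequence[bytes]) -> Tuple[int, float]:
    """Index and score of the template most similar to the message."""
    if not templates:
        raise ValueError("template list is empty")
    scores = [similarity_lcs(t, message) for t in templates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]
```

The cost is proportional to l_t · l_m per template, which is acceptable for short industrial control messages but worth keeping in mind for large template lists.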
[0030] Step 5: The similarity between the message and the templates in the protocol type template library is measured. If the measure result falls within the threshold range, the message is considered to match a template, the message is marked with the protocol type corresponding to that template, and the process goes to step 4; if the measure result does not fall within the threshold range, the matching is considered to have failed, and the process goes to step 6.
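The threshold decision of step 5 can be sketched as below. The similarity measure is passed in as a function so that the sketch does not commit to a particular formula. The default threshold of 0.8 is an arbitrary illustrative value, since the disclosure does not state the threshold range, and `identify_by_template` is a hypothetical name.

```python
from typing import Callable, Mapping, Optional


def identify_by_template(
    message: bytes,
    type_templates: Mapping[str, bytes],
    similarity: Callable[[bytes, bytes], float],
    threshold: float = 0.8,  # assumed value; the source gives no number
) -> Optional[str]:
    """Return the best-matching protocol name, or None if no score reaches the threshold."""
    best_name: Optional[str] = None
    best_score = 0.0
    for name, template in type_templates.items():
        score = similarity(template, message)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None
```

A returned protocol name corresponds to continuing with step 4, and `None` to handing the message over to step 6.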
[0031] Step 6: A message whose protocol type is still not identified is marked as an unknown message, that is, an unknown protocol message or a custom protocol message, and a manual intervention request is sent to the industrial gateway. The message is then analyzed manually: it is first determined whether the message is an error message or an attack message; if yes, the message is discarded; if no, manual identification and parsing are performed, and the results are added as new templates to the protocol type template library and the protocol field template library.
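The bookkeeping around manual intervention in step 6 might be organized as in the sketch below. All class, field, and method names are hypothetical, and the libraries are modelled as plain dictionaries. The human decision itself is outside the code and arrives as a `ManualVerdict`, or as `None` when the reviewer classifies the message as an error or attack message.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class ManualVerdict:
    """Result of manual identification and parsing (hypothetical structure)."""
    protocol: str
    type_template: bytes
    field_template: List[Tuple[str, int]]


@dataclass
class TemplateLibraries:
    type_templates: Dict[str, bytes] = field(default_factory=dict)
    field_templates: Dict[str, List[Tuple[str, int]]] = field(default_factory=dict)
    pending: List[bytes] = field(default_factory=list)

    def request_manual_review(self, message: bytes) -> None:
        """Mark the message as unknown and queue it for manual analysis."""
        self.pending.append(message)

    def resolve(self, message: bytes, verdict: Optional[ManualVerdict]) -> None:
        """Apply the reviewer's decision; a verdict of None discards the message."""
        self.pending.remove(message)
        if verdict is None:
            return  # error or attack message: discard, libraries unchanged
        self.type_templates[verdict.protocol] = verdict.type_template
        self.field_templates[verdict.protocol] = verdict.field_template
```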
[0032] Step 7: After the parsing of a single message is completed, steps 2, 3, 4, 5, and 6 are repeated until there is no data in the two channels within a waiting time range. At that point, it is considered that there is no longer a need to identify and parse a protocol, and the related processes are suspended.
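The stopping rule of step 7 can be sketched as a loop that ends once no message arrives within the waiting time. For brevity, the sketch assumes that both channels feed a single `queue.Queue`. The function name `run_until_idle` and the default waiting time are hypothetical.

```python
import queue
from typing import Callable


def run_until_idle(
    channel: queue.Queue,
    handle: Callable[[bytes], None],
    wait_seconds: float = 1.0,  # assumed waiting time; the source gives no value
) -> int:
    """Process messages until none arrives within wait_seconds; return the number handled."""
    processed = 0
    while True:
        try:
            message = channel.get(timeout=wait_seconds)
        except queue.Empty:
            return processed  # both channels idle: suspend identification and parsing
        handle(message)  # stands in for steps 2 to 6 applied to one message
        processed += 1
```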
[0033] The present disclosure is not limited to the embodiments described above. The above description of the specific embodiments is intended to describe and illustrate the technical solutions of the present disclosure, and the above specific embodiments are only illustrative but not restrictive. Without departing from the ideas of the present disclosure and the protection scope of the claims, those of ordinary skill in the art can make many specific changes under the enlightenment of the present disclosure, and all these changes fall within the protection scope of the present disclosure.