DATA FUSION AND RECONSTRUCTION METHOD FOR FINE CHEMICAL INDUSTRY SAFETY PRODUCTION BASED ON VIRTUAL KNOWLEDGE GRAPH

20230236587 · 2023-07-27

    Abstract

    The present invention provides a data fusion and reconstruction method for fine chemical industry safety production based on a virtual knowledge graph. In view of the characteristics of fine chemical industry safety production data, such as a large amount of structured data, a multi-source heterogeneous database and a strong sequential logic, the present invention innovatively proposes a method of using a virtual knowledge graph to complete the fusion and reconstruction of a traditional database for fine chemical industry. The present invention fuses static structured knowledge in the field of fine chemical industry with a real-time dynamic database for chemical industry safety production in the concept of ontologies for the first time to organize time series data in the form of entities. In addition, the mapping rules of the existing OBDA system are improved based on a data set of the present invention.

    Claims

    1. A data fusion and reconstruction method for fine chemical industry safety production based on a virtual knowledge graph, comprising the following steps:

    step 1: constructing a structured knowledge data set for fine chemical industry safety production
    the structured knowledge data set for fine chemical industry safety production is mainly from the following two aspects:
    (4) dynamically changing real-time database
    the dynamically changing real-time database is mainly composed of a time series data set from a sensor and a shift log set from an operator;
    ① the time series data set from a sensor
    real-time changing monitoring data collected by a sensor is centrally processed by a DCS (Distributed Control System) and stored in a DCS database, and then distributed to other data application systems on top of the DCS database, thus to achieve on-demand access to the monitoring data;
    ② the shift log set from an operator
    the shift log set from an operator comprises three aspects of data: shift taking over situation, current shift situation and shift handing over situation, which are entered into a PMCI database by a person in charge; the three aspects of data includes four kinds of data, i.e., a data record of main detection sites at a shift change moment, an operator's operation record, a material getting in and out record, and a material handing over record;
    (5) statically stored relational data table
    the statically stored relational data table is mainly composed of a main production equipment table, a fine chemicals database, an alarm risk analysis and control measure table, and an SIS interlocking control scheme table;
    ① the main production equipment table comprises equipment, bit numbers, and temperature and pressure ranges of the equipment;
    ② the fine chemicals database comprises a substance identification and classification table, a hazardous chemicals identification table, and a main hazardous chemicals physical and chemical property data table;
    ③ the alarm risk analysis and control measure table is divided into a DCS alarm analysis and control measure set and an SIS alarm analysis and control measure set, mainly describing normal operation values, alarm thresholds and post-alarm processing measures at detection sites;
    ④ the SIS interlocking control scheme table is exported from a safety interlocking system which is a system that can achieve one or more safety functions and is used for monitoring the operation of a production device or individual unit; if a production process exceeds a safe operation range, the safety interlocking system will make the production device or individual unit enter a safe state to ensure the safety thereof; the safety interlocking system is a logic operation set based on PID control, while the SIS interlocking control scheme table is an integration of such control logics and rules, and is used for representing an interconnection relation based on safety production between the equipment and the bit numbers;

    step 2: constructing an OWL2 QL ontology set
    (1) determining ontologies
    an ontology hierarchy with a gradient structure including top-level ontologies and lower-level ontologies is constructed; wherein the top-level ontologies include various real-time dynamic databases or static knowledge data tables; and the lower-level ontologies include non-attribute fields of various structured databases;
    (2) determining ontology relations
    the relations between the top-level ontologies and the lower-level ontologies are as follows: the lower-level ontologies are a subclass of the top-level ontologies and inherit all attributes of the top-level ontologies; relations and attributes of the lower-level ontologies can be inherited by all entities under the lower-level ontologies, and the entities are specifically represented in the data set as records of a dynamic time series database or a static knowledge database at each moment;

    step 3: designing R2RML mapping rules
    under the lower-level ontologies, a specific structured record is taken as an entity, the DCS is taken as a core database to be associated with other databases or data tables, and each monitoring site of the DCS is taken as a primary key; a R2RML mapping language is used to dynamically generate required RDF data according to a user's requirements, then merge the same subjects and objects in the RDF data into graph nodes in a graph view, and finally form a graph structure view; as the process involves only the part of the data that the user needs to access, the method is a partial reconstruction achieved on a source database, rather than a full replication.
    for a large amount of structured data in fine chemical industry safety production, especially time series data generated by continuous iteration, the R2RML language is adopted, and "time constraints" are added on the basis of the original R2RML language, i.e., monitoring data within a certain time period or a time period taking a certain event as a node is invoked according to the user's requirements, and knowledge data of other associated databases is returned to the user;
    direct mapping rules of DM are as follows:
    ① tables of the databases are mapped into RDF classes;
    ② columns in the tables of the databases are mapped into RDF attributes;
    ③ each row in the tables of the databases is mapped into a triple entity, creating an IRI; and
    ④ value of each cell in the tables of the databases is mapped into a literal value; if the value of the cell is corresponding to a foreign key, the value is replaced with the IRI of the resource or entity to which the value of the foreign key is pointed;
    a custom mapping language of R2RML is adopted and improved, and improved mapping rules are as follows:
    ① tables of the databases are mapped into an RDF class of top-level ontologies;
    ② in column fields of the tables of the databases: data of a literal or symbol class is mapped into an RDF class of lower-level ontologies;
    ③ in column fields of the tables of the databases: data of a numeric class is defined as an attribute of primary keys of the row;
    ④ in each row of each field of the tables of the databases: data of a literal or symbol class is defined as an entity;
    ⑤ in each row of each field of the tables of the databases except the DCS database: data of a numeric class is defined as an attribute of primary keys of the row;
    ⑥ data under each site at each moment of the DCS database is taken as an entity;
    ⑦ if a cell is a literal or symbol class of data, and is corresponding to a foreign key of the tables of the other databases, the cell is replaced with the entity to which the value of the foreign key is pointed;
    i.e., one subject mapping and multiple predicate-object mappings; the subject mapping is to generate the subjects of all RDF triples from a logic table, i.e., to select the primary keys as the subjects of the triples; and the predicate-object mappings include a predicate mapping and an object mapping.

    Description

    DESCRIPTION OF DRAWINGS

    [0058] FIG. 1 is an architecture diagram of constructing a virtual knowledge graph for fine chemical industry safety production.

    [0059] FIG. 2 is an ontology hierarchy design diagram of a virtual knowledge graph.

    [0060] FIG. 3 is an example of a virtual graph of a DCS database at a single site and a single moment.

    DETAILED DESCRIPTION

    [0061] Specific embodiments of the present invention are further described below in combination with accompanying drawings and the technical solution.

    [0062] The data used in the present invention is common structured data of the fine chemical industry; however, the problem addressed is production safety in that industry, so not all structured data is relevant. Six data sources, covering real-time dynamically changing structured data and static knowledge data tables, are therefore collected and sorted. The data is organized in the form of a traditional relational database and presented to the user in the form of data table views as shown below.

    TABLE-US-00001 DCS database

    Time                  | TA001 | PA001 | LA001
    Dec. 1, 2021 08:00:00 | 50    | 60    | 70

    TABLE-US-00002 Shift log

    Shift handing over time | Shift handing over situation        | Person handing over the shift | Current shift situation                   | Shift taking over time | Shift taking over situation         | Person taking over the shift
    2021/12/01 08:15:00     | TA001 = 51; PA001 = 61; LA001 = 71  | A                             | Opening valve 1; Charging through port 2  | 2021/12/01 09:15:00    | TA001 = 55; PA001 = 65; LA001 = 75  | B

    TABLE-US-00003 Production equipment table

    Equipment number | Equipment name | Temperature limit | Pressure limit | DCS bit number | DCS bit number measurement substance
    R-101            | Reactor        | 60                | 200            | TA001          | CHE1
                     |                |                   |                | PA001          | CHE1
                     |                |                   |                | LA001          | CHE1

    TABLE-US-00004 Chemicals table

    Name | CAS number | UN number | Melting point | Boiling point | Combustion limit | Ignition temperature
    CHE1 | —          | —         | —             | —             | —                | —

    TABLE-US-00005 Alarm risk analysis & control measure table

    Bit number | Description             | Normal operation value | Low alarm | Fault cause | Fault consequence | Accident treatment
    TA001      | R-101 inlet temperature | 40-100                 | —         | —           | —                 | —

    TABLE-US-00006 SIS data table

    Bit number | LL interlocking value | HH interlocking value | Interlocking result
    LA001      | −10                   | 150                   | TRIP Alarm

    [0063] The core of a fine chemical industry safety production problem includes sensor data real-time monitoring, abnormal alarm, fault tracing and alarm treatment schemes. Based on this, the OWL2 QL language is used for ontology modeling for the first time, and the ontologies are divided into the top-level ontologies and the lower-level ontologies. The table name of each data source serves as a top-level ontology, and the field names below the table name serve as the lower-level ontologies or attributes.
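The split described in paragraph [0063] can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `TopLevelOntology`, the helper `build_hierarchy`, the CamelCase field spellings, and the choice of which fields count as attribute fields are all hypothetical and are not taken from the source.

```python
from dataclasses import dataclass, field


@dataclass
class TopLevelOntology:
    """One data source: the table name is the top-level ontology."""
    name: str
    lower_level: list = field(default_factory=list)  # non-attribute fields
    attributes: list = field(default_factory=list)   # attribute fields


def build_hierarchy(table_name, columns, attribute_fields):
    """Sort the field names of one table into lower-level ontologies or attributes."""
    onto = TopLevelOntology(table_name)
    for col in columns:
        if col in attribute_fields:
            onto.attributes.append(col)
        else:
            onto.lower_level.append(col)
    return onto


# Field names follow the alarm risk analysis & control measure table;
# treating the value-like fields as attributes is an assumption.
alarm = build_hierarchy(
    "AlarmRiskAnalysis",
    ["BitNumber", "Description", "NormalOperationValue", "LowAlarm"],
    attribute_fields={"NormalOperationValue", "LowAlarm"},
)
print(alarm)
```

In an actual OWL2 QL ontology, the lower-level entries would be declared as subclasses of the top-level class; the dataclass only mirrors that hierarchy.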

    [0064] Taking one row of data in the DCS database (i.e., the data of 3 sensors at 1 time point) and the alarm risk analysis & control measure table as an example, the triple representation of the two classes of data is as follows.

    [0065] <http://data.FineChemicalSafetyProduction.com/DCS/2021.12.01.08.00.00> rdf:type ex:TIME .

    [0066] <http://data.FineChemicalSafetyProduction.com/DCS/50> rdf:type ex:TA001 .

    [0067] <http://data.FineChemicalSafetyProduction.com/DCS/60> rdf:type ex:PA001 .

    [0068] <http://data.FineChemicalSafetyProduction.com/DCS/70> rdf:type ex:LA001 .

    [0069] <http://data.FineChemicalSafetyProduction.com/DCS/2021.12.01.08.00.00> ex:TA001is "50" .

    [0070] <http://data.FineChemicalSafetyProduction.com/DCS/2021.12.01.08.00.00> ex:PA001is "60" .

    [0071] <http://data.FineChemicalSafetyProduction.com/DCS/2021.12.01.08.00.00> ex:LA001is "70" .

    [0072] <http://data.FineChemicalSafetyProduction.com/DCS/50> ex:AlarmRiskAnalysis

    [0073] <http://data.FineChemicalSafetyProduction.com/AlarmRiskAnalysis/LA001> .

    [0074] <http://data.FineChemicalSafetyProduction.com/AlarmRiskAnalysis/LA001> rdf:type ex:TagNumber .

    [0075] <http://data.FineChemicalSafetyProduction.com/AlarmRiskAnalysis/LA001> ex:Describe "Inlet Temperature and Pressure" .

    [0076] <http://data.FineChemicalSafetyProduction.com/AlarmRiskAnalysis/LA001> ex:NormalOperatingValue "40-100" .
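As a rough illustration of how one DCS row expands into triples of the kind listed above, the following Python sketch builds them as plain tuples. The function name `dcs_row_to_triples`, the dictionary layout of the row, and the ISO-style timestamp input are assumptions made for the example; the predicate spelling `ex:TA001is` follows the R2RML mapping document in this description.

```python
BASE = "http://data.FineChemicalSafetyProduction.com/DCS/"


def dcs_row_to_triples(row):
    """Expand one DCS row (a timestamp plus sensor readings) into triples."""
    # "2021-12-01 08:00:00" becomes "2021.12.01.08.00.00", as in the example IRIs.
    stamp = row["Time"].replace("-", ".").replace(" ", ".").replace(":", ".")
    subject = f"<{BASE}{stamp}>"
    triples = [(subject, "rdf:type", "ex:TIME")]
    for tag, value in row.items():
        if tag == "Time":
            continue
        # Each reading is typed by its bit number ...
        triples.append((f"<{BASE}{value}>", "rdf:type", f"ex:{tag}"))
        # ... and attached to the time point as a literal.
        triples.append((subject, f"ex:{tag}is", f'"{value}"'))
    return triples


# The single row of the DCS database table.
row = {"Time": "2021-12-01 08:00:00", "TA001": 50, "PA001": 60, "LA001": 70}
triples = dcs_row_to_triples(row)
for s, p, o in triples:
    print(s, p, o, ".")
```

In the method itself, these triples are produced on demand by the mapping rules rather than by application code; the sketch only makes the row-to-triple correspondence explicit.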

    [0077] In order to convert traditional structured time series data and tables into the above RDF triple data, two mapping documents need to be created, i.e., a mapping document for single data tables and a mapping document for linkage of multiple data tables.

    [0078] Taking the DCS database as an example, an R2RML mapping document for single data tables is shown below:

    TABLE-US-00007
    <#TriplesMap1>
        rr:logicalTable <#DcsTableView>;
        # The subject IRI is built from the timestamp column of each row.
        rr:subjectMap [
            rr:template "http://data.FineChemicalSafetyProduction.com/DCS/{TIME}";
            rr:class ex:TIME;
        ];
        # One predicate-object map per sensor column.
        rr:predicateObjectMap [
            rr:predicate ex:TA001is;
            rr:objectMap [ rr:column "TA001" ];
        ];
        rr:predicateObjectMap [
            rr:predicate ex:LA001is;
            rr:objectMap [ rr:column "LA001" ];
        ];
        rr:predicateObjectMap [
            rr:predicate ex:PA001is;
            rr:objectMap [ rr:column "PA001" ];
        ].

    [0079] Taking a linking view of the DCS database and the alarm risk analysis & control measure table as an example, an R2RML mapping document for linkage of multiple data tables is shown below:

    TABLE-US-00008
    <#TriplesMap2>
        rr:predicateObjectMap [
            rr:predicate ex:LA001;
            rr:objectMap [
                rr:parentTriplesMap <#TriplesMap2>;
                rr:joinCondition [
                    rr:child "LA001";
                    rr:parent "LA001";
                ];
            ];
        ].
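The relational join that such a linking map relies on can be illustrated with Python's built-in `sqlite3` module. This is a sketch under stated assumptions, not the mapping engine itself: the column names `BitNumber`, `Description` and `NormalOperationValue`, the unpivoting of the DCS columns into (tag, value) pairs, and the use of the `ex:AlarmRiskAnalysis` predicate as the link are choices made for the example.

```python
import sqlite3

BASE = "http://data.FineChemicalSafetyProduction.com"

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DCS (Time TEXT PRIMARY KEY, TA001 REAL, PA001 REAL, LA001 REAL);
CREATE TABLE AlarmRiskAnalysis (
    BitNumber TEXT PRIMARY KEY, Description TEXT, NormalOperationValue TEXT);
INSERT INTO DCS VALUES ('2021-12-01 08:00:00', 50, 60, 70);
INSERT INTO AlarmRiskAnalysis VALUES ('TA001', 'R-101 inlet temperature', '40-100');
""")

# Unpivot the DCS columns into (Time, Tag, Value) rows, then join each tag
# to its row in the alarm table through the bit number.
LINK_SQL = """
SELECT d.Time, d.Tag, d.Value, a.Description, a.NormalOperationValue
FROM (
    SELECT Time, 'TA001' AS Tag, TA001 AS Value FROM DCS
    UNION ALL SELECT Time, 'PA001', PA001 FROM DCS
    UNION ALL SELECT Time, 'LA001', LA001 FROM DCS
) AS d
JOIN AlarmRiskAnalysis AS a ON a.BitNumber = d.Tag
ORDER BY d.Time, d.Tag
"""
linked = conn.execute(LINK_SQL).fetchall()

# Each joined row yields one linking triple between the two data sources.
link_triples = [
    (f"<{BASE}/DCS/{int(value)}>", "ex:AlarmRiskAnalysis",
     f"<{BASE}/AlarmRiskAnalysis/{tag}>")
    for _time, tag, value, _desc, _normal in linked
]
print(link_triples)
```

Only TA001 has an entry in the sample alarm table, so the inner join returns a single linked row; bit numbers without alarm data simply produce no linking triple.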

    [0080] Then, the structured database for fine chemical industry is started, and the OWL documents of the ontologies and the R2RML mapping documents are connected to an OBDA system through an API. The mapping rules are encoded differently in different OBDA systems. For example, if the Ontop tool is used to access the DCS data of a certain day, a dynamic virtual ontology (i.e., the TA001 data on that day) needs to be added in addition to the above basic mapping rules, and the dynamic mapping rules are as follows:

    [0081] mappingId dcs-today's TA001

    [0082] target: Safety in production/dcs/{TA001} a: dcs-today's TA001.

    [0083] source SELECT TIME, TA001 FROM "DCS"

    [0084] WHERE "TIME" (Time condition screening)
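The "time condition screening" in the source query is left unspecified. One possible form, shown here only as an assumption, is a closed interval covering a single day. The sketch below uses Python's `sqlite3`; the helper name `ta001_for_day` and the ISO-style timestamp format are hypothetical. The 08:15 and 09:15 readings are borrowed from the shift log example for illustration, and the previous-day row is invented sample data.

```python
import sqlite3


def ta001_for_day(conn, day):
    """Return the (TIME, TA001) rows whose timestamp falls on the given day."""
    start, end = f"{day} 00:00:00", f"{day} 23:59:59"
    sql = ('SELECT "TIME", "TA001" FROM "DCS" '
           'WHERE "TIME" BETWEEN ? AND ? ORDER BY "TIME"')
    return conn.execute(sql, (start, end)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "DCS" ("TIME" TEXT PRIMARY KEY, "TA001" REAL)')
conn.executemany('INSERT INTO "DCS" VALUES (?, ?)', [
    ("2021-11-30 23:00:00", 48),  # hypothetical previous-day reading
    ("2021-12-01 08:00:00", 50),  # row from the DCS database table
    ("2021-12-01 08:15:00", 51),  # value noted in the shift log example
    ("2021-12-01 09:15:00", 55),  # value noted in the shift log example
])

rows = ta001_for_day(conn, "2021-12-01")
print(rows)
```

Because the timestamps are stored as ISO-formatted text, the string comparison in `BETWEEN` orders them chronologically; a different timestamp format would need a different condition.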

    [0085] Finally, the triple data satisfying the query is returned as the query result and presented in the form of a virtual view.

    [0086] FIG. 1 is an architecture diagram of constructing a virtual knowledge graph for fine chemical industry safety production. The diagram is divided into three modules, i.e., an underlying data collection module, an OWL ontology design module and an R2RML mapping rule design module. The original underlying data sources are independent of one another; the OWL language and the Protégé ontology development tool are used for ontology modeling, and data with the same meaning is taken as one ontology to fuse the multi-source database. Each row of data in the database is mapped into entities under each ontology by an R2RML mapping sector, and then the primary keys and foreign keys of the entities are selected from the structured database to complete the construction of the mapping rules.

    [0087] FIG. 2 is an ontology hierarchy design diagram of the present invention. By sorting out the database structure designed by the present invention, the ontology hierarchy is divided by a top-down method. The table names of the data tables are extracted to serve as a top-level ontology, the non-attribute fields of the data tables are extracted to serve as a lower-level ontology, and the attribute fields are extracted to serve as the attributes of the entities. When a user needs to access a database on demand, a virtual ontology will be created by the mapping language to organize the data of relevant entities, and will be revoked after the access is ended.
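The create-then-revoke life cycle of the virtual ontology can be loosely compared to a temporary database view that exists only while the user is accessing the data. The sketch below uses Python's `sqlite3` and is an analogy under stated assumptions, not the OBDA system's mechanism: the context manager `virtual_view` and the view name `TodayTA001` are hypothetical.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def virtual_view(conn, name, select_sql):
    """Create a temporary view for the duration of one access, then drop it."""
    conn.execute(f'CREATE TEMP VIEW "{name}" AS {select_sql}')
    try:
        yield name
    finally:
        conn.execute(f'DROP VIEW "{name}"')


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DCS (Time TEXT, TA001 REAL, PA001 REAL, LA001 REAL)")
conn.execute("INSERT INTO DCS VALUES ('2021-12-01 08:00:00', 50, 60, 70)")

# The view is available only inside the with-block.
with virtual_view(conn, "TodayTA001", "SELECT Time, TA001 FROM DCS") as view:
    during = conn.execute(f'SELECT * FROM "{view}"').fetchall()

# After the access ends, no temporary view is left behind.
remaining = conn.execute(
    "SELECT name FROM sqlite_temp_master WHERE type = 'view'"
).fetchall()
print(during, remaining)
```

The source data is never copied: the view only names a query over the DCS table, which mirrors the partial, on-demand reconstruction described for the method.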

    [0088] FIG. 3 is an example of a concept graph constructed based on alarm risk analysis and control measures of 3 monitoring sites and 1 site at 2 time points of the DCS by defining the data as nodes in a virtual view and defining the logical relation between the nodes. The figure shows the logical relation of the virtual knowledge graph, i.e., the association relation between the real-time dynamically changing database data and the static knowledge base data is constructed, and the virtual view which is based on the ontologies and associated with the multi-source database is returned to the user.
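A minimal sketch of how triples could be merged into the nodes and edges of such a virtual view is given below. It assumes that IRI-valued objects become nodes while literal objects are stored as attributes of their subject node; the function name `build_graph_view` and the shortened node identifiers are hypothetical and chosen for readability.

```python
def build_graph_view(triples):
    """Merge identical subjects and objects into nodes and collect the edges."""
    nodes = {}  # node id -> dict of literal attributes
    edges = []  # (subject, predicate, object) links between nodes
    for s, p, o in triples:
        nodes.setdefault(s, {})
        if o.startswith('"'):
            # A literal value becomes an attribute of the subject node.
            nodes[s][p] = o.strip('"')
        else:
            # An IRI or class object becomes (or reuses) a node.
            nodes.setdefault(o, {})
            edges.append((s, p, o))
    return nodes, edges


# Shortened forms of triples from the example in this description.
sample = [
    ("DCS/2021.12.01.08.00.00", "rdf:type", "ex:TIME"),
    ("DCS/2021.12.01.08.00.00", "ex:TA001is", '"50"'),
    ("DCS/50", "rdf:type", "ex:TA001"),
    ("DCS/50", "ex:AlarmRiskAnalysis", "AlarmRiskAnalysis/LA001"),
    ("AlarmRiskAnalysis/LA001", "ex:NormalOperatingValue", '"40-100"'),
]
nodes, edges = build_graph_view(sample)
print(len(nodes), "nodes,", len(edges), "edges")
```

Because a node is created only once per identifier, a value that appears as the object of one triple and the subject of another ends up as a single node, which is what links the dynamic DCS data to the static alarm knowledge in the view.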