SOURCE LOCALIZATION METHOD FOR RUMOR BASED ON FULL-ORDER NEIGHBOR COVERAGE STRATEGY
20230046801 · 2023-02-16
Inventors
Cpc classification
International classification
Abstract
Source localization method for rumor source based on full-order neighbor coverage strategy includes: constructing a network graph according to the user relationship in the actual target area; mapping an actual relationship into the network graph; determining sensors in the network graph, and deploying users corresponding to the sensors as observation users in an actual target area; executing a source inferring strategy when the number of the observation users in the actual target area who have received the rumor reaches an expected scale; calculating source likelihood score of non-sensor nodes in the network graph corresponding to the non-observation users in the actual target area; processing differentially the source likelihood scores; and outputting the non-observation user corresponding to the minimum source likelihood score as the source.
Claims
1. A source localization method for rumor based on full-order neighbor coverage strategy, comprising: (S1) inputting a user relationship database in an actual target area requiring sensor deployment and source localization; (S2) constructing and initializing a network graph G=(V,E) based on the actual target area; wherein after the user relationship database is input, an actual relationship is mapped into the network graph G; V is a set of nodes corresponding to users in the actual target area; E is a set of edges, and the connected edges in the network graph G indicate that corresponding two users know each other in the actual target area; and all nodes in the network graph G are initialized to a susceptible state which means the corresponding users in the actual area do not receive a rumor; (S3) deploying observation users in an actual target area according to the network graph G; wherein sensors are selected in the network graph G with a deployment ratio φ using the full-order neighbor coverage strategy; the full-order neighbor coverage strategy is configured to ensure that there are sensors in each order neighbor of any node in the network graph G; and users in the actual target area corresponding to the deployed sensors in G are marked as the observation users to record time and direction when they receive propagation information of the rumor; (S4) when the number of observation users who have received rumor's propagation information reaches four in the actual target area, performing a source inferring strategy to detect a source of the rumor; (S5) mapping time and direction recorded by the observation users to the network graph G; (S6) locating the source of the rumor by using graph theory with a topological structure of G; wherein initial source likelihood scores of non-sensor nodes in the network graph G are calculated by using a formula combining “minimum infection center” and “time-distance ratio”; (S7) differentially processing the source likelihood scores; wherein for any non-sensor node, if there is a first-order neighbors belonging to sensors who has not received the rumor's propagation information, the source likelihood score of such a non-sensor node is multiplied by a penalty coefficient α to reduce possibility of a corresponding user becoming the source of the rumor; and α is a real number between 1 and 1.1; (S8) traversing all non-sensor nodes, and obtaining nodes with a minimum source likelihood score; and predicting the user in the actual target area related to the nodes with a minimum source likelihood score in G as an original rumor source in real life.
2. The source localization method of claim 1, wherein the step (S3) is performed through steps of: (S31) selecting initially sensors in the network graph G by using the full-order neighbor coverage strategy to ensure that for each node in the network graph G, there is at least one sensor in each order neighbor from a first-order neighbor to an eccentricity-order neighbor of the node, so as to allow the sensors to be widely deployed in the network graph G; (S32) determining whether a ratio of the sensors selected by using the full-order neighbor coverage strategy to the network graph G reaches the deployment ratio φ; if no, further selecting non-sensor nodes in G by using other strategies; and selecting more sensors in G until the deployment ratio φ is reached; and (S33) marking the users who are sensors in the network graph G as the observation users in the actual target area.
3. The source localization method of claim 1, wherein in the step (S6), an initial source likelihood score of non-sensor nodes in the network graph G corresponding to the non-observation users in the actual target area is calculated through the following formula:
4. The source localization method of claim 1, wherein in the step (S7), the source likelihood score is processed through the following formula:
Score.sub.v=Score′.sub.v*Π.sub.j=1.sup.j=neighbor(v)∩O\Ō|a; wherein Score′.sub.v is an initial source likelihood score of a non-sensor node v in the network graph G corresponding to the non-observation user in the actual target area obtained in the step (S6); α is the penalty coefficient configured to increase penalty for nodes in G corresponding to users in the actual target area who are unlikely to be the source of the rumor, and is equal to 1.05; neighbor(v) is a first-order neighbor of a non-sensor v in G; Ō is a set of the sensors corresponding to the observation users with the deployment ratio φ in the actual target area, and the deployment ratio φ is 20%, 30% or 40%; and Ō is a set of sensors corresponding to the observation users in the actual target area who have received the propagation information of the rumor.
5. The source localization method of claim 1, wherein the deployment ratio φ is 20%, 30% or 40%.
6. The source localization method of claim 1, wherein the penalty coefficient α is 1.05.
7. The source localization method of claim 2, wherein in step (S32), the other strategies comprise random selection of sensors and selection of nodes with the highest degree in the network graph G as the sensors.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0040]
[0041]
[0042]
[0043]
[0044]
[0045]
[0046]
DETAILED DESCRIPTION OF EMBODIMENTS
[0047] The disclosure will be described clearly and completely below to make the objectives, technical solutions and technical effects of the present disclosure clearer. Obviously, described below are merely some embodiments of the disclosure, which are not intended to limit the disclosure. Based on the embodiments provided herein, all other embodiments obtained by those skilled in the art without paying creative efforts shall fall within the scope of the present disclosure.
[0048] As shown in
[0049] A user relationship database in an actual target area requiring sensor deployment and source localization is input (step S1).
[0050] A network graph G=(V,E) is constructed and initialized based on the actual target area. After the user relationship database is input, an actual relationship is mapped into the network graph G; V is a set of nodes corresponding to users in the actual target area; E is a set of edges, and the connected edges in the network graph G indicate that corresponding two users know each other in the actual target area; and all nodes in the network graph G are initialized to a susceptible state which means the corresponding users in the actual area do not receive a rumor (step S2).
[0051] Observation users are deployed in the actual target area by applying the graph theory for the network graph G. Sensors are selected in G with a deployment ratio φ using the full-order neighbor coverage strategy, where the full-order neighbor coverage strategy is configured to ensure that there are sensors in each order neighbor of any node in the network graph G; and users in the actual target area corresponding to the deployed sensors in G are marked as the observation users to record time and direction when they receive propagation information of the rumor (step S3). In this embodiment, the deployment ratio φ is 20%, 30% or 40%.
[0052] When the number of observation users who have received rumor's propagation information reaches 4 in the actual target area, performing a source inferring strategy to detect a source of the rumor (step S4).
[0053] Observed information (i.e., time and direction) recorded by the observation users is mapped to the network graph G (step S5).
[0054] Locating the source of the rumor by using graph theory with a topological structure of G; wherein an initial source likelihood scores of non-sensor nodes in the network graph G are calculated by using a formula combining “minimum infection center” and “time-distance ratio” (step S6).
[0055] The source likelihood scores are differentially processed. For any non-sensor node, if there is a first-order neighbors belonging to sensors who has not received the rumor's propagation information, the source likelihood score of such a non-sensor node is multiplied by a penalty coefficient α to reduce possibility of a corresponding user becoming the rumor source; and α is a real number between 1 and 1.1 (step S7). In this embodiment, α is 1.05.
[0056] All non-sensor nodes in the network graph G are traversed, and obtaining nodes with a minimum source likelihood score; and predicting the user in the actual target area related to the nodes with a minimum source likelihood score in G as an original rumor source in real life (step S8).
[0057] In this embodiment, the step (S3) is performed through the following steps. Sensors are initially selected in the network graph G by using the full-order neighbor coverage strategy to ensure that for each node in the network graph G, there is at least one sensor in each order neighbor from a first-order neighbor to an eccentricity-order neighbor of the node, so as to allow the sensors to be widely deployed in the network graph G (step S31). It is determined whether a ratio of the sensors selected by using the full-order neighbor coverage strategy to the network graph G reaches the deployment ratio φ; if no, further selecting non-sensor nodes in G by using other strategies; and selecting more sensors in G until the deployment ratio φ is reached (step S32). The users who are sensors in the network graph G are marked as the observation users in the actual target area (step S33). The other strategies include random selection of sensors and selection of nodes with the highest degree in G as the sensors. In this embodiment, selection strategy of nodes with the highest degree in the network graph G as the sensors is used.
[0058] In the step (S6), an initial source likelihood score of non-sensor nodes in the network graph G corresponding to the non-observation users in the actual target area is calculated through the following formula:
[0059] wherein Score′.sub.v is an initial source likelihood score of a non-sensor node v in the network graph G corresponding to the non-observation user in the actual target area; Ō is a set of sensors in G corresponding to the observation users who have received the propagation information of the rumor in the actual target area; PI represents the number of elements in the Ō, and |Ō| is 4; d.sub.i,v is the shortest distance between a sensor i and a non-sensor node v in G; and t.sub.i represents a relative infection time of the sensor i in G corresponding to the observation user in the actual target area.
[0060] In the step (S7), the source likelihood score is processed through the following formula:
Score.sub.v=Score′.sub.v*Π.sub.j=1.sup.j=neighbor(v)∩O\Ō|a;
[0061] wherein Score′.sub.v is an initial source likelihood score of a non-sensor node v in the network graph G corresponding to the non-observation user in the actual target area obtained in the step (S6); α is the penalty coefficient configured to increase penalty for nodes in G corresponding to users in the actual target area who are unlikely to be the source of the rumor, and is equal to 1.05; neighbor(v) is a first-order neighbor of a non-sensor v in G; Ō is a set of the sensors corresponding to the observation users with the deployment ratio φ in the actual target area, and the deployment ratio φ is 20%, 30% or 40%; and Ō is a set of sensors corresponding to the observation users in the actual target area who have received the propagation information of the rumor.
Table 1 shows a scale of test dataset.
TABLE-US-00001 TABLE 1 Scale of test dataset Number of Number of Average Dataset Nodes Edges Degree Jazz 198 2742 27.7 Facebook 4039 88234 43.7 Twitch-ES 4648 59382 25.55 Wiki-Vote 7115 103689 29.15 Facebook-Large 22470 171002 15.22 GitHub Social 37700 289003 15.33
[0062]
(d) The minimum likelihood score determined by adding a penalty coefficient is the predicted rumor propagation source. The likelihood scores of the nodes Ŝ.sub.1 and Ŝ.sub.2 after adding the penalty coefficient are calculated as Ŝ.sub.1=Ŝ.sub.1*1.05, Ŝ.sub.2=Ŝ.sub.2*1, respectively. Therefore, the rumor propagation source is Ŝ.sub.1 with the minimum likelihood score. Thus, the user corresponding to Ŝ.sub.1 in the actual target area is the rumor propagation source predicted in the embodiment of the present disclosure.
[0063]
[0064]
[0065]
[0066] After the propagation data set is constructed, the corresponding rumor propagation source in the actual target area can be deduced according to the valid information. In order to prove the accuracy and feasibility of the present disclosure, after finding the rumor propagation source, the prediction result needs to be further confirmed.
[0067]
[0068] Thus, the network-graph-based algorithm provided in the present disclosure is a source localization method for a source based on full-order neighbor coverage strategy, which can locate the propagation source in a smaller area by deploying sensors basing the full-order neighbor coverage and effectively using the structural topology information and sensors' propagation information based on the graph theory. Locating the propagation source in a small area can not only improve the prediction accuracy, but also ensure that the loss is minimized by early locating. The algorithm solves the localization problem by the deployment strategy of sensors based full-order neighbor coverage, so less prior information is required. In reality, the algorithm of the present disclosure can be executed without collecting high-cost and high-overhead infection information. At the same time, the heterogeneous propagation model used in the present disclosure is close to the actual scenarios, which makes the algorithm of identifying source in the present disclosure have practical guiding significance. Finally, the propagation model and source localization method in the present disclosure are applied to locate the propagation source of the real-word datasets, and the ability to successfully predict the propagation source is high, which provides a scientific basis for the internet identifying rumor.
[0069] It should be noted that the embodiments described above are only used to illustrate the technical solutions of the present disclosure, but not intended to limit the present disclosure. Although the present disclosure has been described in detail above with reference to the embodiments, it should be understood that those skilled in the art can still make some modifications, replacements and variations based on the content disclosed herein. Understandably, any modifications and replacements made by those skilled in the art without departing from the spirit of the disclosure should fall within the scope of the disclosure defined by the appended claims.