C&C domain name analysis-based botnet detection method, device, apparatus and medium
11374897 · 2022-06-28
CPC classification
G06N7/01
PHYSICS
International classification
G06F15/16
PHYSICS
Abstract
The invention provides a command-and-control (C&C) domain name analysis-based botnet detection method, device, apparatus and medium. The method includes an information acquisition step where DNS logs are acquired; a domain name analysis step where C&C domain names in the DNS logs are detected and the category of each C&C domain name is determined according to a pre-built domain name analyzer; a botnet determination step where whether a botnet exists is determined according to the C&C domain name and the category of C&C domain name. In the C&C domain name analysis-based botnet detection method, device, apparatus and medium provided by the present invention, by analyzing the domain name system (DNS) logs, the C&C domain name used in the attack activity is extracted for further analysis of the types of parasitic Trojans to thereby lock down the bot that the C&C server has controlled. In addition, the botnet activity trend can be analyzed by analyzing the Poisson parameter of each type of the C&C domain name, so as to form effective suppression measures in time.
Claims
1. A command-and-control domain name analysis-based botnet detection method, comprising: an information acquisition step, where domain name system logs are acquired; a domain name analysis step, where command-and-control domain names in the domain name system logs are detected and a category of each command-and-control domain name is determined according to a pre-built domain name analyzer; a botnet determination step, where whether a botnet exists is determined according to the command-and-control domain name and the category of command-and-control domain name; a data statistical step, where the occurrence frequency of each type of the command-and-control domain name is counted; and a trend determination step, where a botnet activity trend is determined according to the occurrence frequencies of all types of the command-and-control domain names so as to assist in formation of effective suppression measures in time, wherein training the domain name analyzer includes the steps of: cleaning legitimate domain names published by legitimate websites to obtain a legitimate domain name set; generating a command-and-control domain name set by a public domain name generation algorithm and classifying and tagging each domain name in the command-and-control domain name set; statistically analyzing a structure of each domain name in a legitimate domain name set and the command-and-control domain name set; establishing a character probability dictionary for each type of domain names; randomly selecting a given number of legitimate domain names from the legitimate domain name set to obtain a legitimate domain name training sample set; randomly selecting a given number of command-and-control domain names from the command-and-control domain name set to obtain a command-and-control domain name training sample set; and training the domain name analyzer according to the legitimate domain name training sample set, the command-and-control domain name training sample set and a character probability dictionary.
2. The method in claim 1, wherein the trend determination step includes the steps of: substituting the occurrence frequency of each type of the command-and-control domain name into a Poisson distribution probability function to obtain a Poisson parameter of the corresponding category; determining all the Poisson parameters as a measure of the botnet activity; and determining the botnet activity trend according to the measure of the botnet activity.
3. The method in claim 1, wherein: the domain name analyzer is a neural network model based on a cumulative backpropagation algorithm; and the neural network model is provided with a regularization term that comprehensively considers empirical error factors and network complexity factors.
4. The method in claim 3, wherein calculating the neural network model based on the cumulative backpropagation algorithm comprises the steps of: calculating an error target function; describing neural network complexity; estimating model parameters by a cross validation process; and adjusting parameters by stochastic gradient descent to approximate a global minimum solution of the error function.
5. The method in claim 1, wherein the domain name analysis step includes the steps of: extracting a domain name from the domain name system logs; subjecting the domain name to feature extraction; determining whether the domain name is the command-and-control domain name according to the character probability dictionary; subjecting the command-and-control domain name to domain name feature quantification to obtain a classification number of the command-and-control domain name; and determining the category of the command-and-control domain name according to the classification number.
6. A computer apparatus, comprising: at least one processor; at least one memory; and computer program instructions stored in the memory, wherein the method in claim 1 is implemented when the computer program instructions are executed by the processor.
7. A non-transitory computer readable storage medium for storing computer program instructions, wherein the method in claim 1 is implemented when the computer program instructions are executed by a processor.
Description
BRIEF DESCRIPTION OF FIGURES
(1)
(2)
(3)
(4)
(5)
(6)
DETAILED DESCRIPTION OF THE EMBODIMENTS
(7) Hereinafter, the present invention is further illustrated by specific embodiments. However, it should be understood that these embodiments are only used for more detailed and specific explanations and should not be construed as limitations to the present invention in any form.
Example 1
(8) With reference to
(9) In the C&C domain name analysis-based botnet detection method in the example, by analyzing the DNS logs, the C&C domain name used in the attack activity is extracted for further analysis of the types of parasitic Trojans to thereby lock down the bot that the C&C server has controlled. In the example, the format of the DNS logs is specifically shown in Table 1.
(10) TABLE 1 DNS logs

Time | Apparatus IP address | Domain name | Response IP address | TTL
2017-12-12 08:12:15.386 | 192.168.2.14 | mbd.baidu.com | 14.251.177.166 | 55
2017-12-12 08:12:15.889 | 192.168.2.19 | news.ifeng.com | 125.90.47.177 | 55
2017-12-12 08:12:16.231 | 192.168.2.110 | www.78.cn | 183.6.224.102 | 55
2017-12-12 08:12:17.001 | 192.168.2.118 | www.ggspyfmreouxnhqi.com | Null | 0
2017-12-12 08:12:17.653 | 192.168.2.118 | www.wyuhdsdttczd.com | Null | 0
2017-12-12 08:12:17.967 | 192.168.2.118 | mail.pivzovznpssx.com | Null | 0
2017-12-12 08:12:18.862 | 192.168.2.118 | www.swtjyuhuefvl.com | Null | 0
2017-12-12 08:12:19.768 | 192.168.2.118 | www.zrkdvzjhse.com | Null | 0
2017-12-12 08:12:20.662 | 192.168.2.118 | www.wyuhdsdttczd.com | Null | 0
2017-12-12 08:12:21.524 | 192.168.2.19 | www.rauggyguyp.com | 208.100.26.251 | 235
2017-12-12 08:12:22.325 | 192.168.2.118 | www.furiararji.com | Null | 0
2017-12-12 08:12:23.219 | 192.168.2.118 | www.pibqzedhzwt.com | Null | 0
2017-12-12 08:12:24.165 | 192.168.2.118 | www.xjjcditjfkgkihfe.com | Null | 0
2017-12-12 08:12:24.981 | 192.168.2.14 | tech.meituan.com | 103.37.152.63 | 41
2017-12-12 08:12:25.824 | 192.168.2.19 | www.iteblog.com | 123.206.77.132 | 53
2017-12-12 08:12:26.585 | 192.168.2.110 | guanjia.qq.com | 14.215.138.13 | 55
2017-12-12 08:12:27.186 | 192.168.2.118 | en.wikipedia.org | 198.35.26.96 | 51
2017-12-12 08:12:28.115 | 192.168.2.118 | www.johannesbader.ch | 162.254.250.112 | 44
2017-12-12 08:12:29.023 | 192.168.2.14 | us.norton.com | 23.193.116.250 | 53
2017-12-12 08:12:29.829 | 192.168.2.118 | www.swtjyuhuefvl.com | Null | 0
2017-12-12 08:12:30.691 | 192.168.2.110 | spark.apache.org | 195.154.151.36 | 50
2017-12-12 08:12:31.551 | 192.168.2.110 | www.cnblogs.com | 101.37.113.127 | 40
2017-12-12 08:12:32.384 | 192.168.2.14 | blog.csdn.net | 47.95.165.112 | 35
2017-12-12 08:12:33.168 | 192.168.2.19 | baike.baidu.com | 180.149.131.247 | 54
2017-12-12 08:12:34.069 | 192.168.2.118 | www.jsntwyjcv.com | Null | 0
2017-12-12 08:12:35.011 | 192.168.2.118 | app.tanwan.com | 113.96.154.108 | 55
2017-12-12 08:12:35.892 | 192.168.2.110 | www.icbc.com.cn | 14.119.125.23 | 55
2017-12-12 08:12:36.721 | 192.168.2.118 | www.miercn.com | 113.96.154.108 | 55
2017-12-12 08:12:37.259 | 192.168.2.14 | zs.91.com | 125.77.24.228 | 53
2017-12-12 08:12:38.172 | 192.168.2.118 | www.xjjcditjfkgkihfe.com | Null | 0
(11) In the example, domain name analysis is performed according to the logs as shown in Table 1 to obtain a domain name detection result as shown in Table 2, and the C&C domain names in the same category are counted out according to a time sequence in the domain name detection result.
(12) TABLE 2 Domain name detection result

Time | Apparatus IP address | Domain name | Response IP address | TTL | Category
2017-12-12 08:12:17.001 | 192.168.2.118 | www.ggspyfmreouxnhqi.com | null | 0 | banjori
2017-12-12 08:12:17.653 | 192.168.2.118 | www.wyuhdsdttczd.com | null | 0 | banjori
2017-12-12 08:12:17.967 | 192.168.2.118 | mail.pivzovznpssx.com | null | 0 | banjori
2017-12-12 08:12:18.862 | 192.168.2.118 | www.swtjyuhuefvl.com | null | 0 | banjori
2017-12-12 08:12:19.768 | 192.168.2.118 | www.zrkdvzjhse.com | null | 0 | banjori
2017-12-12 08:12:21.524 | 192.168.2.19 | www.rauggyguyp.com | 208.100.26.251 | 235 | banjori
2017-12-12 08:12:20.662 | 192.168.2.118 | www.wyuhdsdttczd.com | null | 0 | banjori
2017-12-12 08:12:22.325 | 192.168.2.118 | www.furiararji.com | null | 0 | banjori
2017-12-12 08:12:23.219 | 192.168.2.118 | www.pibqzedhzwt.com | null | 0 | banjori
2017-12-12 08:12:24.165 | 192.168.2.118 | www.xjjcditjfkgkihfe.com | null | 0 | banjori
2017-12-12 08:12:29.829 | 192.168.2.118 | www.swtjyuhuefvl.com | null | 0 | banjori
2017-12-12 08:12:34.069 | 192.168.2.118 | www.jsntwyjcv.com | null | 0 | banjori
2017-12-12 08:12:38.172 | 192.168.2.118 | www.xjjcditjfkgkihfe.com | null | 0 | banjori
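The per-category counting over time described for the detection result can be illustrated with a minimal sketch. The 10-second interval, the function names and the sample rows (a few timestamps in the shape of Table 2) are assumptions made for illustration only; the example does not prescribe a particular interval length.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical (timestamp, category) pairs in the shape of Table 2.
detections = [
    ("2017-12-12 08:12:17.001", "banjori"),
    ("2017-12-12 08:12:17.653", "banjori"),
    ("2017-12-12 08:12:17.967", "banjori"),
    ("2017-12-12 08:12:18.862", "banjori"),
    ("2017-12-12 08:12:29.829", "banjori"),
]


def interval_start(moment, interval_seconds):
    """Round a timestamp down to the start of its statistical interval."""
    midnight = moment.replace(hour=0, minute=0, second=0, microsecond=0)
    elapsed = int((moment - midnight).total_seconds())
    return midnight + timedelta(seconds=elapsed - elapsed % interval_seconds)


def count_by_interval(rows, interval_seconds=10):
    """Count detected C&C domain names per (interval start, category)."""
    counts = Counter()
    for stamp, category in rows:
        moment = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")
        counts[(interval_start(moment, interval_seconds), category)] += 1
    return counts


if __name__ == "__main__":
    for (start, category), n in sorted(count_by_interval(detections).items()):
        print(start, category, n)
```

The resulting counts per interval are the kind of occurrence frequencies that the later statistical step works with.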
(13) Furthermore, it should be noted that the domain name analyzer in the example can identify 28 types of C&C domain names such as banjori.
(14) Preferably, as shown in
(15) Specifically, the trend determination step S5 includes substituting the occurrence frequency of each type of the C&C domain name into a Poisson distribution probability function to obtain a Poisson parameter of the corresponding category; determining all the Poisson parameters as a measure of botnet activity; and determining the botnet activity trend according to the measure of botnet activity.
(16) In the example, the botmaster cannot register every generated domain name; for economic reasons, only a few generated domain names are registered in advance. In each cycle a bot must therefore issue requests for a series of similar generated C&C domain names until it obtains the IP address of the C&C server and can establish a connection. As a result, compared with a normal host, its behavior pattern has two distinctive features: (1) a bot requests a large number of new C&C domain names, most of which fail to resolve; (2) when multiple parasitic Trojans exist in the network, the domain name request behavior of the bots shows a group characteristic. Since the botmaster has limited server resources, the C&C domain names that resolve successfully usually point to the same IP address.
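Feature (1) above, a host whose queries mostly fail to resolve, can be checked with a short sketch over log records. The record layout, the function names and the `threshold` and `min_queries` cut-offs are hypothetical choices for illustration; the example does not specify any thresholds.

```python
from collections import defaultdict

# (client IP, queried domain, response IP or None when resolution failed);
# a few rows in the shape of Table 1.
records = [
    ("192.168.2.14", "mbd.baidu.com", "14.251.177.166"),
    ("192.168.2.118", "www.ggspyfmreouxnhqi.com", None),
    ("192.168.2.118", "www.wyuhdsdttczd.com", None),
    ("192.168.2.118", "mail.pivzovznpssx.com", None),
    ("192.168.2.118", "en.wikipedia.org", "198.35.26.96"),
    ("192.168.2.19", "news.ifeng.com", "125.90.47.177"),
]


def query_stats(rows):
    """Return {client IP: (total queries, failed queries)}."""
    total = defaultdict(int)
    failed = defaultdict(int)
    for client_ip, _domain, response_ip in rows:
        total[client_ip] += 1
        if response_ip is None:
            failed[client_ip] += 1
    return {ip: (total[ip], failed[ip]) for ip in total}


def failure_ratio_by_host(rows):
    """Share of each host's queries that failed to resolve."""
    return {ip: bad / n for ip, (n, bad) in query_stats(rows).items()}


def suspicious_hosts(rows, threshold=0.5, min_queries=3):
    """Hosts with enough queries and a high failure share (assumed cut-offs)."""
    return sorted(
        ip for ip, (n, bad) in query_stats(rows).items()
        if n >= min_queries and bad / n >= threshold
    )


if __name__ == "__main__":
    print(failure_ratio_by_host(records))
    print(suspicious_hosts(records))
```

A failure ratio alone is only a coarse signal; in the example it is the domain name analyzer, not this ratio, that decides whether a name is a C&C domain name.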
(17) According to queueing theory (the theory of random service systems), the occurrence frequency of C&C server domain names follows a Poisson distribution. Records extracted from the DNS logs are classified by the C&C domain name detection model; the occurrence frequency k of C&C domain names of the same type per unit time is counted and substituted into the Poisson distribution probability function to estimate the Poisson parameter λ for a given period, wherein the Poisson distribution probability function is:
(18) $$P(X = k) = \frac{\lambda^{k}}{k!}\,e^{-\lambda}, \quad k = 0, 1, 2, \ldots$$
(19) In the example, the Poisson parameter is determined as a measure of botnet activity. Table 3 shows a botnet activity trend obtained through analysis.
(20) TABLE 3 Botnet activity trend

Time interval | Average frequency | Poisson parameter | Category
01 | 45 | 45 | Banjori
01 | 87 | 87 | Sisron
01 | 0 | 0 | Qadars
02 | 12 | 12 | Banjori
02 | 0 | 0 | Sisron
02 | 0 | 0 | Qadars
03 | 53 | 53 | Banjori
03 | 89 | 89 | Sisron
03 | 36 | 36 | Qadars
. . . | . . . | . . . | . . .
(21) In Table 3, any unit of time can be used as the statistical time interval, and the average frequency is the number of C&C domain names captured in the current time interval within the period.
(22) Moreover, it should be noted that a bot (identified by its IP address and MAC address) exhibiting regular C&C domain name request behavior can be determined from the DNS logs, and a possible attack target of the botnet can then be analyzed according to the role of the bot, so that targeted suppression measures can be formed in time.
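One plausible reading of how the Poisson parameter is obtained is the standard maximum-likelihood estimate, which for a Poisson distribution is simply the mean of the observed counts. This is consistent with Table 3, where the Poisson parameter equals the average frequency, but the choice of estimator, the function name and the sample counts below are assumptions.

```python
from statistics import mean


def estimate_poisson_lambda(counts_per_unit_time):
    """Maximum-likelihood estimate of the Poisson rate: the sample mean."""
    if not counts_per_unit_time:
        raise ValueError("at least one observed count is required")
    return mean(counts_per_unit_time)


if __name__ == "__main__":
    # Hypothetical counts of one C&C category in three unit-time windows.
    print(estimate_poisson_lambda([40, 50, 45]))
    # With a single window the estimate equals the observed count.
    print(estimate_poisson_lambda([87]))
```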
(23) Preferably, as shown in
(24) In the example, the legitimate domain names published by websites such as Alexa are cleaned to obtain 1495163 legitimate domain names, and C&C domain names are obtained by sampling via a public DGA. It should be noted that a DGA is a domain generation algorithm that an attacker can use to generate pseudorandom character strings as domain names, thereby evading blacklist detection. Pseudorandom means that the character sequence appears random but can be regenerated exactly, since its structure is predetermined. Such algorithms are commonly used in malware and remote control software. In the example, the domain name features are shown in Table 4.
(25) TABLE 4 Domain name feature description

Feature name | Feature description
length | host name character string length
uni-entropy | host name 1-gram character information entropy
uni-probavg | host name 1-gram character average probability
bi-entropy | host name 2-gram character information entropy
bi-probavg | host name 2-gram character average probability
tri-entropy | host name 3-gram character information entropy
tri-probavg | host name 3-gram character average probability
uni-gram-avgrank | host name 1-gram character average sequence
uni-gram-stdrank | host name 1-gram character sequence standard deviation
bi-gram-avgrank | host name 2-gram character average sequence
bi-gram-stdrank | host name 2-gram character sequence standard deviation
tri-gram-avgrank | host name 3-gram character average sequence
tri-gram-stdrank | host name 3-gram character sequence standard deviation
vowel-ratio | vowel ratio
digit-ratio | digit ratio
consonant-ratio | consonant ratio
consec-consonant | consecutive consonant ratio
consec-digit | consecutive digit ratio
top1gram-ratio | host name 1-gram letter probability top 10 ratio
top2gram-ratio | host name 2-gram character combined probability top 100 ratio
top3gram-ratio | host name 3-gram character combined probability top 1000 ratio
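A few of the simpler features in Table 4 can be sketched as follows. The exact definitions are not given in the example, so the choices here are assumptions: the input is a lowercase ASCII host name label without dots, entropy is Shannon entropy in bits, and each ratio is taken over the full label length.

```python
import math
from collections import Counter

VOWELS = set("aeiou")


def shannon_entropy(text):
    """Shannon entropy (bits) of the character distribution of text."""
    counts = Counter(text)
    n = len(text)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def basic_features(host):
    """Compute length, 1-gram entropy and character-class ratios for a label."""
    if not host:
        raise ValueError("host name label must not be empty")
    n = len(host)
    return {
        "length": n,
        "uni-entropy": shannon_entropy(host),
        "vowel-ratio": sum(ch in VOWELS for ch in host) / n,
        "digit-ratio": sum(ch.isdigit() for ch in host) / n,
        "consonant-ratio": sum(
            ch.isalpha() and ch not in VOWELS for ch in host
        ) / n,
    }


if __name__ == "__main__":
    # One legitimate-looking label and one generated-looking label from Table 1.
    for label in ("baidu", "ggspyfmreouxnhqi"):
        print(label, basic_features(label))
```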
(26) In the example, specifically, the domain name analyzer is a neural network model based on a cumulative backpropagation (BP) algorithm, and the neural network model is provided with a regularization term that jointly considers the empirical error and the network complexity. The neural network model based on the cumulative BP algorithm is computed by calculating an error target function; describing the neural network complexity; estimating model parameters by a cross validation process; and adjusting the parameters by stochastic gradient descent to approximate the global minimum solution of the error function. In the example, an n-gram (uni-gram, bi-gram, tri-gram) character probability dictionary is established from the 1495163 legitimate domain names obtained by cleaning. Furthermore, 1000 legitimate domain names, and likewise 1000 domain names of each C&C type, are randomly extracted as the training sample set; the cumulative BP algorithm is employed, with the term describing the neural network complexity added to the error target function.
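The n-gram character probability dictionary can be illustrated with a minimal sketch. The function names, the relative-frequency definition of probability and the two-name corpus are assumptions for illustration; the example builds its dictionary from the full cleaned set of legitimate domain names.

```python
from collections import Counter


def ngrams(text, n):
    """All contiguous character n-grams of text, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_probability_dictionary(names, n):
    """Map each n-gram to its relative frequency over all given names."""
    counts = Counter(gram for name in names for gram in ngrams(name, n))
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()}


def average_probability(name, dictionary, n):
    """Mean dictionary probability of the n-grams of name (unseen grams count 0)."""
    grams = ngrams(name, n)
    if not grams:
        return 0.0
    return sum(dictionary.get(g, 0.0) for g in grams) / len(grams)


if __name__ == "__main__":
    # Tiny hypothetical corpus standing in for the legitimate domain name set.
    bigram_dict = build_probability_dictionary(["baidu", "badoo"], 2)
    print(average_probability("baidu", bigram_dict, 2))
    print(average_probability("zrkdvzjhse", bigram_dict, 2))
```

Under these assumptions a generated-looking name scores a lower average probability than a name resembling the corpus, which is the intuition behind the uni-probavg, bi-probavg and tri-probavg features.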
(27) It should be noted that, in the example, features are extracted according to the character habits of registered domain names, and the regularization term is added during model training with the BP algorithm to trade off the empirical error against the network complexity, so as to effectively control over-fitting.
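The trade-off between empirical error and network complexity can be written as a weighted objective. The form below, E = lam * (mean per-sample error) + (1 - lam) * (sum of squared weights), is a common textbook regularized objective and is only assumed here; the example does not state its exact formula. The names `regularized_error`, `sgd_step`, `lam` and `learning_rate` are hypothetical.

```python
def regularized_error(sample_errors, weights, lam):
    """Weighted sum of mean empirical error and a squared-weight complexity term."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    empirical = sum(sample_errors) / len(sample_errors)
    complexity = sum(w * w for w in weights)
    return lam * empirical + (1.0 - lam) * complexity


def sgd_step(weights, error_gradients, lam, learning_rate):
    """One gradient step on the objective above.

    error_gradients holds the gradient of the mean empirical error with
    respect to each weight; the complexity term contributes 2 * w.
    """
    return [
        w - learning_rate * (lam * g + (1.0 - lam) * 2.0 * w)
        for w, g in zip(weights, error_gradients)
    ]


if __name__ == "__main__":
    print(regularized_error([0.2, 0.4], [1.0, -2.0], lam=0.5))
    print(sgd_step([1.0, -2.0], [0.0, 0.0], lam=0.5, learning_rate=0.1))
```

Averaging the error over all samples before updating matches the accumulated (batch) flavor of BP; the value of `lam` is the kind of parameter that cross validation would be used to estimate.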
(28) Further preferably, as shown in
Example 2
(29) With reference to
(30) In the C&C domain name analysis-based botnet detection method provided in the example of the present invention, by analyzing the DNS logs, the C&C domain name used in the attack activity is extracted for further analysis of the types of parasitic Trojans to thereby lock down the bot that the C&C server has controlled. In the example, the format of the DNS logs is specifically shown in Table 1.
(31) TABLE 1 DNS logs

Time | Apparatus IP address | Domain name | Response IP address | TTL
2017-12-12 08:12:15.386 | 192.168.2.14 | mbd.baidu.com | 14.251.177.166 | 55
2017-12-12 08:12:15.889 | 192.168.2.19 | news.ifeng.com | 125.90.47.177 | 55
2017-12-12 08:12:16.231 | 192.168.2.110 | www.78.cn | 183.6.224.102 | 55
2017-12-12 08:12:17.001 | 192.168.2.118 | www.ggspyfmreouxnhqi.com | Null | 0
2017-12-12 08:12:17.653 | 192.168.2.118 | www.wyuhdsdttczd.com | Null | 0
2017-12-12 08:12:17.967 | 192.168.2.118 | mail.pivzovznpssx.com | Null | 0
2017-12-12 08:12:18.862 | 192.168.2.118 | www.swtjyuhuefvl.com | Null | 0
2017-12-12 08:12:19.768 | 192.168.2.118 | www.zrkdvzjhse.com | Null | 0
2017-12-12 08:12:20.662 | 192.168.2.118 | www.wyuhdsdttczd.com | Null | 0
2017-12-12 08:12:21.524 | 192.168.2.19 | www.rauggyguyp.com | 208.100.26.251 | 235
2017-12-12 08:12:22.325 | 192.168.2.118 | www.furiararji.com | Null | 0
2017-12-12 08:12:23.219 | 192.168.2.118 | www.pibqzedhzwt.com | Null | 0
2017-12-12 08:12:24.165 | 192.168.2.118 | www.xjjcditjfkgkihfe.com | Null | 0
2017-12-12 08:12:24.981 | 192.168.2.14 | tech.meituan.com | 103.37.152.63 | 41
2017-12-12 08:12:25.824 | 192.168.2.19 | www.iteblog.com | 123.206.77.132 | 53
2017-12-12 08:12:26.585 | 192.168.2.110 | guanjia.qq.com | 14.215.138.13 | 55
2017-12-12 08:12:27.186 | 192.168.2.118 | en.wikipedia.org | 198.35.26.96 | 51
2017-12-12 08:12:28.115 | 192.168.2.118 | www.johannesbader.ch | 162.254.250.112 | 44
2017-12-12 08:12:29.023 | 192.168.2.14 | us.norton.com | 23.193.116.250 | 53
2017-12-12 08:12:29.829 | 192.168.2.118 | www.swtjyuhuefvl.com | Null | 0
2017-12-12 08:12:30.691 | 192.168.2.110 | spark.apache.org | 195.154.151.36 | 50
2017-12-12 08:12:31.551 | 192.168.2.110 | www.cnblogs.com | 101.37.113.127 | 40
2017-12-12 08:12:32.384 | 192.168.2.14 | blog.csdn.net | 47.95.165.112 | 35
2017-12-12 08:12:33.168 | 192.168.2.19 | baike.baidu.com | 180.149.131.247 | 54
2017-12-12 08:12:34.069 | 192.168.2.118 | www.jsntwyjcv.com | Null | 0
2017-12-12 08:12:35.011 | 192.168.2.118 | app.tanwan.com | 113.96.154.108 | 55
2017-12-12 08:12:35.892 | 192.168.2.110 | www.icbc.com.cn | 14.119.125.23 | 55
2017-12-12 08:12:36.721 | 192.168.2.118 | www.miercn.com | 113.96.154.108 | 55
2017-12-12 08:12:37.259 | 192.168.2.14 | zs.91.com | 125.77.24.228 | 53
2017-12-12 08:12:38.172 | 192.168.2.118 | www.xjjcditjfkgkihfe.com | Null | 0
(32) In the example, a domain name detection result as shown in Table 2 can be obtained by domain name analysis based on the logs shown in Table 1, and the C&C domain names in the same category are counted out according to a time sequence in the detection result.
(33) TABLE 2 Domain name detection result

Time | Apparatus IP address | Domain name | Response IP address | TTL | Category
2017-12-12 08:12:17.001 | 192.168.2.118 | www.ggspyfmreouxnhqi.com | null | 0 | banjori
2017-12-12 08:12:17.653 | 192.168.2.118 | www.wyuhdsdttczd.com | null | 0 | banjori
2017-12-12 08:12:17.967 | 192.168.2.118 | mail.pivzovznpssx.com | null | 0 | banjori
2017-12-12 08:12:18.862 | 192.168.2.118 | www.swtjyuhuefvl.com | null | 0 | banjori
2017-12-12 08:12:19.768 | 192.168.2.118 | www.zrkdvzjhse.com | null | 0 | banjori
2017-12-12 08:12:21.524 | 192.168.2.19 | www.rauggyguyp.com | 208.100.26.251 | 235 | banjori
2017-12-12 08:12:20.662 | 192.168.2.118 | www.wyuhdsdttczd.com | null | 0 | banjori
2017-12-12 08:12:22.325 | 192.168.2.118 | www.furiararji.com | null | 0 | banjori
2017-12-12 08:12:23.219 | 192.168.2.118 | www.pibqzedhzwt.com | null | 0 | banjori
2017-12-12 08:12:24.165 | 192.168.2.118 | www.xjjcditjfkgkihfe.com | null | 0 | banjori
2017-12-12 08:12:29.829 | 192.168.2.118 | www.swtjyuhuefvl.com | null | 0 | banjori
2017-12-12 08:12:34.069 | 192.168.2.118 | www.jsntwyjcv.com | null | 0 | banjori
2017-12-12 08:12:38.172 | 192.168.2.118 | www.xjjcditjfkgkihfe.com | null | 0 | banjori
(34) Furthermore, it should be noted that the domain name analyzer in the example can identify 28 types of C&C domain names such as banjori.
(35) Preferably, as shown in
(36) Specifically, the trend determination unit 5 is used for substituting the occurrence frequency of the C&C domain names in each category into a Poisson distribution probability function to obtain a Poisson parameter of the corresponding category; determining all the Poisson parameters as a measure of the botnet activity; and determining the botnet activity trend according to the measure of the botnet activity.
(37) In the example, the botmaster cannot register every generated domain name; for economic reasons, only a few generated domain names are registered in advance. In each cycle a bot must therefore issue requests for a series of similar generated C&C domain names until it obtains the IP address of the C&C server and can establish a connection. As a result, compared with a normal host, its behavior pattern has two distinctive features: (1) a bot requests a large number of new C&C domain names, most of which fail to resolve; (2) when multiple parasitic Trojans exist in the network, the domain name request behavior of the bots shows a group characteristic. Since the botmaster has limited server resources, the C&C domain names that resolve successfully usually point to the same IP address.
(38) According to queueing theory (the theory of random service systems), the occurrence frequency of C&C server domain names follows a Poisson distribution. Records extracted from the DNS logs are classified by the C&C domain name detection model; the occurrence frequency k of C&C domain names of the same type per unit time is counted and substituted into the Poisson distribution probability function to estimate the Poisson parameter λ for a given period, wherein the Poisson distribution probability function is:
(39) $$P(X = k) = \frac{\lambda^{k}}{k!}\,e^{-\lambda}, \quad k = 0, 1, 2, \ldots$$
(40) In the example, the Poisson parameter is determined as a measure of botnet activity. Table 3 shows a botnet activity trend obtained through analysis.
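To make the link between an observed frequency k and the Poisson parameter concrete, the sketch below evaluates the standard Poisson probability function P(X = k) = λ^k e^(−λ) / k! and searches a grid of candidate rates for the one that makes an observed count most likely. The grid search and the function names are illustrative assumptions, not the procedure of the example.

```python
import math


def poisson_pmf(k, lam):
    """Probability of observing exactly k events when the rate is lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def most_likely_rate(k, candidates):
    """Candidate rate under which the observed count k is most probable."""
    return max(candidates, key=lambda lam: poisson_pmf(k, lam))


if __name__ == "__main__":
    observed = 12  # e.g. the count of one category in one interval
    print(most_likely_rate(observed, range(1, 31)))
    print(poisson_pmf(observed, 12))
```

For a single observed count the likelihood peaks at a rate equal to that count, which fits the rows of Table 3 where the Poisson parameter matches the frequency.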
(41) TABLE 3 Botnet activity trend

Time interval | Average frequency | Poisson parameter | Category
01 | 45 | 45 | Banjori
01 | 87 | 87 | Sisron
01 | 0 | 0 | Qadars
02 | 12 | 12 | Banjori
02 | 0 | 0 | Sisron
02 | 0 | 0 | Qadars
03 | 53 | 53 | Banjori
03 | 89 | 89 | Sisron
03 | 36 | 36 | Qadars
. . . | . . . | . . . | . . .
(42) In Table 3, any unit of time can be used as the statistical time interval, and the average frequency is the number of C&C domain names captured in the current time interval within the period.
(43) Moreover, it should be noted that a bot (identified by its IP address and MAC address) exhibiting regular C&C domain name request behavior can be determined from the DNS logs, and a possible attack target of the botnet can then be analyzed according to the role of the bot, so that targeted suppression measures can be formed in time.
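How a trend might be read off the per-interval Poisson parameters can be sketched by comparing each interval with the previous one. The values are those listed in Table 3; the comparison rule and the labels "rising", "falling" and "steady" are hypothetical, since the example does not define how the trend is derived from the parameters.

```python
# Poisson parameters per time interval and category, as listed in Table 3.
table3 = {
    "01": {"Banjori": 45, "Sisron": 87, "Qadars": 0},
    "02": {"Banjori": 12, "Sisron": 0, "Qadars": 0},
    "03": {"Banjori": 53, "Sisron": 89, "Qadars": 36},
}


def activity_trend(params_by_interval):
    """Label each (interval, category) by comparing with the previous interval."""
    intervals = sorted(params_by_interval)
    trend = {}
    for previous, current in zip(intervals, intervals[1:]):
        for category, lam in params_by_interval[current].items():
            before = params_by_interval[previous].get(category, 0)
            if lam > before:
                trend[(current, category)] = "rising"
            elif lam < before:
                trend[(current, category)] = "falling"
            else:
                trend[(current, category)] = "steady"
    return trend


if __name__ == "__main__":
    for key, label in sorted(activity_trend(table3).items()):
        print(key, label)
```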
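Locating candidate bots from a detection result can be sketched by grouping the detected C&C requests by requesting address. The rows are in the shape of Table 2 with the time column omitted; the `min_requests` cut-off and the function name are assumptions, and MAC addresses are left out because the logs shown do not carry them.

```python
from collections import defaultdict

# (apparatus IP, detected C&C domain name, category), in the shape of Table 2.
detections = [
    ("192.168.2.118", "www.ggspyfmreouxnhqi.com", "banjori"),
    ("192.168.2.118", "www.wyuhdsdttczd.com", "banjori"),
    ("192.168.2.19", "www.rauggyguyp.com", "banjori"),
    ("192.168.2.118", "www.wyuhdsdttczd.com", "banjori"),
]


def candidate_bots(rows, min_requests=2):
    """Summarize C&C requests per host and keep hosts above an assumed cut-off."""
    per_host = defaultdict(
        lambda: {"requests": 0, "domains": set(), "categories": set()}
    )
    for ip, domain, category in rows:
        info = per_host[ip]
        info["requests"] += 1
        info["domains"].add(domain)
        info["categories"].add(category)
    return {
        ip: info for ip, info in per_host.items()
        if info["requests"] >= min_requests
    }


if __name__ == "__main__":
    for ip, info in candidate_bots(detections).items():
        print(ip, info["requests"], sorted(info["categories"]))
```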
(44) Preferably, as shown in
(45) In the example, the legitimate domain names published by websites such as Alexa are cleaned to obtain 1495163 legitimate domain names, and C&C domain names are obtained by sampling via a public DGA. It should be noted that a DGA is a domain generation algorithm that an attacker can use to generate pseudorandom character strings as domain names, thereby evading blacklist detection. Pseudorandom means that the character sequence appears random but can be regenerated exactly, since its structure is predetermined. Such algorithms are commonly used in malware and remote control software. In the example, the domain name features are shown in Table 4.
(46) TABLE 4 Domain name feature description

Feature name | Feature description
length | host name character string length
uni-entropy | host name 1-gram character information entropy
uni-probavg | host name 1-gram character average probability
bi-entropy | host name 2-gram character information entropy
bi-probavg | host name 2-gram character average probability
tri-entropy | host name 3-gram character information entropy
tri-probavg | host name 3-gram character average probability
uni-gram-avgrank | host name 1-gram character average sequence
uni-gram-stdrank | host name 1-gram character sequence standard deviation
bi-gram-avgrank | host name 2-gram character average sequence
bi-gram-stdrank | host name 2-gram character sequence standard deviation
tri-gram-avgrank | host name 3-gram character average sequence
tri-gram-stdrank | host name 3-gram character sequence standard deviation
vowel-ratio | vowel ratio
digit-ratio | digit ratio
consonant-ratio | consonant ratio
consec-consonant | consecutive consonant ratio
consec-digit | consecutive digit ratio
top1gram-ratio | host name 1-gram letter probability top 10 ratio
top2gram-ratio | host name 2-gram character combined probability top 100 ratio
top3gram-ratio | host name 3-gram character combined probability top 1000 ratio
(47) In the example, specifically, the domain name analyzer is a neural network model based on a cumulative backpropagation (BP) algorithm, and the neural network model is provided with a regularization term that jointly considers the empirical error and the network complexity. The neural network model based on the cumulative BP algorithm is computed by calculating an error target function; describing the neural network complexity; estimating model parameters by a cross validation process; and adjusting the parameters by stochastic gradient descent to approximate the global minimum solution of the error function. In the example, an n-gram (uni-gram, bi-gram, tri-gram) character probability dictionary is established from the 1495163 legitimate domain names obtained by cleaning. Furthermore, 1000 legitimate domain names, and likewise 1000 domain names of each C&C type, are randomly extracted as the training sample set; the cumulative BP algorithm is employed, with the term describing the neural network complexity added to the error target function.
(48) It should be noted that, in the example, features are extracted according to the character habits of registered domain names, and the regularization term is added during model training with the BP algorithm to trade off the empirical error against the network complexity, so as to effectively control over-fitting.
(49) Further preferably, as shown in
Example 3
(50) The C&C domain name analysis-based botnet detection method in the example of the present invention with reference to
(51) The computer apparatus for implementation of the C&C domain name analysis-based botnet detection method may comprise a processor 401 and a memory 402 where computer program instructions are stored.
(52) Specifically, the processor 401 may comprise a central processing unit (CPU), or an application specific integrated circuit (ASIC), or one or more integrated circuits capable of being configured to implement the example of the present invention.
(53) The memory 402 may comprise a mass memory for data or instructions. By way of example and not limitation, the memory 402 may comprise a hard disk drive (HDD), a floppy disk drive, a flash memory, an optical disk, a magneto-optical disk, a magnetic tape, a universal serial bus (USB) drive or a combination of two or more thereof. Where appropriate, the memory 402 may comprise a removable or non-removable (or fixed) medium. Where appropriate, the memory 402 may be arranged inside or outside a data processing device. In a specific example, the memory 402 is a non-volatile solid-state memory. In a specific example, the memory 402 comprises a read-only memory (ROM). Where appropriate, the ROM may be a mask programmable ROM, a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), an electrically alterable ROM (EAROM), a flash memory or a combination of two or more thereof.
(54) The processor 401 reads and runs the computer program instructions stored in the memory 402 to realize any C&C domain name analysis-based botnet detection method in the examples mentioned above.
(55) In one example, the computer apparatus may further comprise a communication interface 403 and a bus 410. As shown in
(56) The communication interface 403 is mainly used for communication between modules, devices, units and/or apparatuses in the examples of the present invention.
(57) The bus 410 comprises hardware, software or both, and couples the parts of the computer apparatus to each other. By way of example and not limitation, the bus may comprise an accelerated graphics port (AGP) or other graphics bus, an enhanced industry standard architecture (EISA) bus, a front side bus (FSB), a HyperTransport (HT) interconnect, an industry standard architecture (ISA) bus, an InfiniBand interconnect, a low pin count (LPC) bus, a memory bus, a micro channel architecture (MCA) bus, a peripheral component interconnect (PCI) bus, a PCI Express (PCIe) bus, a serial advanced technology attachment (SATA) bus, a Video Electronics Standards Association local bus (VLB), other suitable buses or a combination of two or more thereof. Where appropriate, the bus 410 may comprise one or more buses. Although the examples of the present invention describe and show a specific bus, the present invention contemplates any suitable bus or interconnect.
Example 4
(58) Furthermore, the example of the present invention can provide a computer readable storage medium for implementing the C&C domain name analysis-based botnet detection methods in the examples mentioned above. The computer readable storage medium stores computer program instructions which are executed by a processor to implement any C&C domain name analysis-based botnet detection method in the examples mentioned above.
(59) It should be clarified that the present invention is not limited to the specific configurations and processes described hereinabove and shown in the figures. For simplicity, a detailed description of known methods is omitted herein. In the embodiments mentioned above, several specific steps are described and shown as examples. However, the methods/processes of the present invention are not limited to the specific steps described and shown. Those skilled in the art can make various changes, modifications and additions, or change the order of the steps, once they understand the spirit of the present invention.
(60) The functional blocks shown in the structural block diagram described above can be implemented as hardware, software, firmware or a combination thereof. When implemented as hardware, they can be, for example, electronic circuits, application specific integrated circuits (ASIC), suitable firmware, plug-ins, functional cards, etc. When they are implemented as software, the elements of the present invention are programs or code segments for executing the required tasks. The programs or code segments can be stored in a machine readable medium, or transmitted over a transmission medium or communication link via data signals carried in carrier waves. The machine readable medium may comprise any medium capable of storing or transmitting information. Examples of the machine readable medium include electronic circuits, semiconductor memory apparatuses, ROMs, flash memories, erasable ROMs (EROM), floppy disks, CD-ROMs, optical disks, hard disks, optical fiber media, radio frequency (RF) links, etc. The code segments can be downloaded via a computer network such as the Internet or an intranet.
(61) It should also be noted that the exemplary embodiments mentioned in the present invention describe some methods or systems based on a series of steps or devices. However, the present invention is not limited to the sequences of the steps mentioned above, that is, the steps can be performed in the sequences mentioned in the embodiments or sequences different from those in the embodiments, or several steps are performed at the same time.
(62) Hereinabove mentioned are only the specific embodiments of the present invention. Those skilled in the art can clearly understand that no more details about the specific working processes (which can be learnt by reference to the corresponding processes in the foregoing method embodiments) of the systems, modules and units described above are provided herein, for the convenience and conciseness of description. It should be understood that the protection scope of the present invention is not limited thereto, anyone skilled in the art can easily come up with equivalent modifications or replacements within the technical scope disclosed by the present invention, and these modifications or replacements should be covered within the protection scope of the present invention.
(63) Although the present invention has been described to some extent, it is obvious that suitable changes can be made to each condition without departing from the spirit and scope of the present invention. It is understood that the present invention is not limited to the embodiments mentioned above, but is defined by the scope of the claims, which includes equivalent replacements of each element.