Method and System For Augmenting Network Traffic Flow Reports
20170353486 · 2017-12-07
Inventors
Cpc classification
H04L43/08
ELECTRICITY
International classification
Abstract
Methods and systems for augmenting network traffic flow reports with domain name service (“DNS”) information are provided. A networking device system can monitor DNS response traffic through a network and extract domain name records from the response traffic that correspond to domain names submitted in web requests. The extracted domain name records can be provided to a network traffic flow capture system for inclusion in a network traffic flow report.
Claims
1. A method for augmenting network traffic flow data with domain name service (“DNS”) information, involving a networking device having at least one data processor, the method comprising: monitoring, by the at least one data processor, DNS response traffic through a network; extracting, by the at least one data processor, at least one domain name record from the response traffic that corresponds to at least one domain name submitted in at least one web request; and providing, by the at least one data processor, the at least one domain name record for inclusion in the network traffic flow data.
2. The method of claim 1, further comprising storing the extracted at least one domain name record in cache memory.
3. The method of claim 2, wherein the cache memory includes prioritized cache memory.
4. The method of claim 1, wherein the at least one domain name record includes at least one of an ‘A’, an ‘AAAA’, or a ‘CNAME’ record.
5. The method of claim 1, wherein the response traffic is directed to a client device from which the at least one web request is submitted.
6. The method of claim 1, wherein the network traffic flow data includes at least one Internet protocol (“IP”) address corresponding to the at least one domain name record.
7. The method of claim 1, wherein monitoring, extracting, and providing are implemented as an extension to a network traffic flow capture system.
8. A networking device configured to augment network traffic flow data with DNS information, comprising: a communications interface configured to route data to and from at least one client device; and at least one data processor configured to: monitor DNS response traffic through a network; extract at least one domain name record from the response traffic that corresponds to at least one domain name submitted in at least one web request; and provide the at least one domain name record for inclusion in the network traffic flow data.
9. The networking device of claim 8, further comprising storing the extracted at least one domain name record in cache memory.
10. The networking device of claim 9, wherein the cache memory includes prioritized cache memory.
11. The networking device of claim 8, wherein the at least one domain name record includes at least one of an ‘A’, an ‘AAAA’, or a ‘CNAME’ record.
12. The networking device of claim 8, wherein the response traffic is directed to a client device from which the at least one web request is submitted.
13. The networking device of claim 8, wherein the network traffic flow data includes at least one IP address corresponding to the at least one domain name record.
14. The networking device of claim 8, wherein monitoring, extracting, and providing are implemented as an extension to a network traffic flow capture system.
15. A non-transitory computer readable medium for augmenting network traffic flow data with DNS information, the computer readable medium including instructions that, when executed by at least one data processor of a networking device, cause the at least one data processor to: monitor DNS response traffic through a network; extract at least one domain name record from the response traffic that corresponds to at least one domain name submitted in at least one web request; and provide the at least one domain name record for inclusion in the network traffic flow data.
16. The computer readable medium of claim 15, further including instructions that, when executed by the at least one data processor, cause the at least one data processor to store the extracted at least one domain name record in cache memory.
17. The computer readable medium of claim 16, wherein the cache memory includes prioritized cache memory.
18. The computer readable medium of claim 15, wherein the at least one domain name record includes at least one of an ‘A’, an ‘AAAA’, or a ‘CNAME’ record.
19. The computer readable medium of claim 15, wherein the response traffic is directed to a client device from which the at least one web request is submitted.
20. The computer readable medium of claim 15, wherein the network traffic flow data includes at least one IP address corresponding to the at least one domain name record.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0018] The inventive embodiments are described in greater detail hereinafter with reference to the accompanying drawing figures.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
[0025] According to embodiments of the present invention, a system can augment network traffic flow reports (e.g., NetFlow or IPFIX reports) with original DNS query information or context that is determined in real time (e.g., as IPv4 and/or IPv6 connections occur), particularly at the time those queries or connection requests are made.
[0027] Process 350 can include extracting the IP address(es) from the packet (step 352) and analyzing the contents of the packet to determine whether the packet corresponds to a TCP session (step 354). If the packet is for a TCP session, process 350 can include extracting the TCP session parameters (step 356) and determining whether the session is for a newly established connection (step 358). If the session is for a newly established connection, process 350 can include querying the DNS cache(s) with the extracted IP address (step 360). If a result of the query is available (step 362), process 350 can include querying the DNS cache(s) for the result (step 364) and proceeding to B to return to step 316 of process 300. In some embodiments, querying of the DNS cache for result(s) can be repeated, e.g., until the last result is retrieved. If there is no result available at step 362, process 350 can include creating a new entry in one or more network traffic flow reports or data (step 374), for example by adding time information, the IP address, and the DNS name if available, and proceeding to C to return to step 316 of process 300.
[0028] Returning to step 354, if the packet is not for a TCP session, process 350 can include determining or checking the last time the IP address was active (step 368). If the IP address was last active a relatively long time ago (at step 370), process 350 can include closing the record for that IP address if it is open (step 372), proceeding to step 374, and continuing the process therefrom as shown. On the other hand, if the IP address was last active relatively recently (at step 370), process 350 can include updating traffic counters for that IP record (step 378) and determining whether the time of the record is older than a reporting period (step 380). If the time of the record is older than the reporting period, process 350 can include recreating the record (step 382) and proceeding to D to return to step 316 of process 300. If the time of the record is not older than the reporting period, process 350 can proceed to E to return directly to step 316 of process 300.
[0029] Returning to step 358, if the session is not for a newly established connection, process 350 can include determining whether the TCP session is closed (step 376). If the TCP session is closed, process 350 can proceed to step 372; otherwise, the process can proceed to step 378.
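By way of illustration only, the branching of process 350 described above can be sketched in Python as follows. The names `FlowTable`, `FlowRecord`, `IDLE_TIMEOUT`, and `REPORT_PERIOD`, and the two threshold values, are hypothetical and do not come from the described embodiments. The sketch also assumes that a flow entry is written at step 374 in both branches of step 362 (with the DNS name attached when one was found), because the text does not state where the retrieved name is recorded. The method returns the step numbers it took so that the path can be compared with the description.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

IDLE_TIMEOUT = 60.0     # assumed threshold for "a relatively long time ago" (step 370)
REPORT_PERIOD = 300.0   # assumed reporting period in seconds (step 380)


@dataclass
class FlowRecord:
    ip: str
    created: float
    last_seen: float
    dns_name: Optional[str] = None
    packets: int = 0
    is_open: bool = True


class FlowTable:
    """Hypothetical in-memory stand-in for the network traffic flow data."""

    def __init__(self, dns_lookup: Callable[[str], Optional[str]]):
        self.dns_lookup = dns_lookup              # maps an IP address to a name or None
        self.active: Dict[str, FlowRecord] = {}   # current record per IP address
        self.report: List[FlowRecord] = []        # every record created so far

    def new_entry(self, ip: str, now: float,
                  name: Optional[str] = None, packets: int = 1) -> None:
        rec = FlowRecord(ip, now, now, name, packets)
        self.active[ip] = rec
        self.report.append(rec)

    def close(self, ip: str) -> None:
        rec = self.active.get(ip)
        if rec is not None and rec.is_open:
            rec.is_open = False

    def handle_packet(self, ip: str, now: float, is_tcp: bool,
                      new_conn: bool = False, closed: bool = False) -> List[int]:
        """Apply one packet and return the step numbers that were taken."""
        rec = self.active.get(ip)
        if is_tcp and new_conn:                               # steps 354 and 358
            name = self.dns_lookup(ip)                        # steps 360-364
            self.new_entry(ip, now, name)                     # step 374
            return [360, 364, 374] if name else [360, 374]
        if is_tcp:
            should_close = closed                             # step 376
        else:                                                 # steps 368-370
            should_close = rec is None or now - rec.last_seen > IDLE_TIMEOUT
        if should_close or rec is None:
            self.close(ip)                                    # step 372
            self.new_entry(ip, now, rec.dns_name if rec else None)  # step 374
            return [372, 374]
        rec.packets += 1                                      # step 378
        rec.last_seen = now
        if now - rec.created > REPORT_PERIOD:                 # step 380
            self.close(ip)
            self.new_entry(ip, now, rec.dns_name, packets=0)  # step 382
            return [378, 382]
        return [378]
```

For example, `FlowTable({"10.0.226.24": "jenkins.avg-labs.com"}.get)` followed by `handle_packet("10.0.226.24", 0.0, is_tcp=True, new_conn=True)` creates a record that carries the DNS name. A packet for an address with no existing record is treated here like an inactive address, which is a simplification of this sketch.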
[0030] According to various embodiments, the system can be implemented as an algorithm, and more specifically, as an extension to network flow capture software (e.g., NetFlow). The algorithm can (i) enable inspection of DNS answer traffic (e.g., more deeply or with greater focus than other data), (ii) push answer information into a prioritized cache, (iii) mine or “travel” the cache in reverse order to recover the original DNS name information used at or about the time of the requests, and (iv) add the recovered original DNS name information to the network flow report.
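Items (ii) and (iii) above can be illustrated with a minimal Python sketch. It assumes that "prioritized" means the most recently captured answer is consulted first, which the description does not define, and the class name `PrioritizedDnsCache` and its methods are hypothetical. Each answer is stored as a (name, target) pair, where the target is either a CNAME target or an IP address, and the lookup walks the cache from newest to oldest until no further alias is found.

```python
from collections import deque
from typing import Deque, Optional, Tuple


class PrioritizedDnsCache:
    """Bounded cache of DNS answers, searched newest-first (an assumption)."""

    def __init__(self, capacity: int = 1024):
        # Oldest entries fall off automatically once capacity is reached.
        self._entries: Deque[Tuple[str, str]] = deque(maxlen=capacity)

    def push(self, name: str, target: str) -> None:
        """Record one answer: name -> target (CNAME target or IP address)."""
        self._entries.append((name.rstrip(".").lower(), target.rstrip(".").lower()))

    def original_name(self, ip: str) -> Optional[str]:
        """Walk the cache in reverse from an IP address to the first name asked for."""
        current, found, seen = ip.lower(), None, set()
        while current not in seen:          # guards against alias loops
            seen.add(current)
            for name, target in reversed(self._entries):
                if target == current:
                    found = current = name  # step one link back along the chain
                    break
            else:
                break                       # nothing points at `current`; chain ends
        return found
```

With this ordering, when several names resolve to the same address, the name from the most recent answer is returned. Whether that matches the intended prioritization is not stated in the description.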
[0031] An example of a traffic line item from a network flow report augmented with original DNS name information is as follows: 2016-02-26 32:15:32.434 1.030 TCP 192.168.0.1:42343->10.0.226.24 (jenkins.avg-labs.com):80 X XXXXX X. An example of the prioritized DNS cache contents is as follows:
[0032] 1. jenkins.avg-labs.com: apps-build-prod-idc-ams001.mgm.avg.com.
[0033] 2. apps-build-prod-idc-ams001.mgm.avg.com: 10.0.226.24.
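The layout of the example line item can be reproduced with a short Python sketch. The function name `format_flow_line` and its parameters are hypothetical, the trailing fields shown as X in the example are omitted because their meaning is not given, and the hour used in the usage note is an arbitrary valid value.

```python
from datetime import datetime
from typing import Optional


def format_flow_line(start: datetime, duration: float, proto: str,
                     src: str, sport: int, dst: str, dport: int,
                     dns_name: Optional[str] = None) -> str:
    """Render one flow line, adding the DNS name in parentheses when known."""
    stamp = start.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]   # trim to milliseconds
    dst_part = f"{dst} ({dns_name})" if dns_name else dst
    return f"{stamp} {duration:.3f} {proto} {src}:{sport}->{dst_part}:{dport}"
```

Calling `format_flow_line(datetime(2016, 2, 26, 12, 15, 32, 434000), 1.03, "TCP", "192.168.0.1", 42343, "10.0.226.24", 80, "jenkins.avg-labs.com")` yields a line in the same form as the example, and omitting the last argument yields the unaugmented form.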
[0034] According to an exemplary embodiment, the system can generate network traffic flows and link connections (e.g., HTTP connections) revealed by the flows to relevant DNS names at or about the time the connections were made. In certain embodiments, the system can be implemented as a special DNS module that extends an existing flow capturing software application. The module can, for example, be configured to:
[0035] 1. Capture all incoming DNS traffic;
[0036] 2. Extract original web requests and A, AAAA, and CNAME records from DNS replies;
[0037] 3. Organize such data into one or more special caches; and
[0038] 4. Provide an interface to the flow capture software such that the software can quickly recover the appropriate DNS name used in the requested connection.
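The extraction in item 2 of the list above can be sketched in Python using only the standard library. The sketch assumes the DNS payload has already been separated from its UDP or TCP framing and that the "original web requests" correspond to the names in the question section of the reply. The function names `read_name` and `parse_dns_reply` are hypothetical. The wire layout (12-byte header, length-prefixed labels, two-byte compression pointers, and type codes 1, 5, and 28 for A, CNAME, and AAAA) follows the standard DNS message format.

```python
import ipaddress
import struct
from typing import List, Tuple


def read_name(msg: bytes, pos: int) -> Tuple[str, int]:
    """Decode a possibly compressed domain name; return (name, next offset)."""
    labels, next_pos, hops = [], None, 0
    while True:
        length = msg[pos]
        if length & 0xC0 == 0xC0:                   # two-byte compression pointer
            if next_pos is None:
                next_pos = pos + 2                  # parsing resumes after the pointer
            pos = ((length & 0x3F) << 8) | msg[pos + 1]
            hops += 1
            if hops > 32:
                raise ValueError("too many compression pointers")
            continue
        pos += 1
        if length == 0:                             # root label ends the name
            break
        labels.append(msg[pos:pos + length].decode("ascii"))
        pos += length
    return ".".join(labels), (next_pos if next_pos is not None else pos)


def parse_dns_reply(msg: bytes) -> Tuple[List[str], List[Tuple[str, str, str]]]:
    """Return (queried names, [(owner, type, value)]) for A, AAAA, and CNAME answers."""
    _id, flags, qdcount, ancount, _ns, _ar = struct.unpack("!6H", msg[:12])
    if not flags & 0x8000:                          # QR bit clear means a query
        raise ValueError("not a DNS response")
    pos, questions, records = 12, [], []
    for _ in range(qdcount):
        name, pos = read_name(msg, pos)
        questions.append(name)
        pos += 4                                    # skip QTYPE and QCLASS
    for _ in range(ancount):
        owner, pos = read_name(msg, pos)
        rtype, _rclass, _ttl, rdlen = struct.unpack("!HHIH", msg[pos:pos + 10])
        pos += 10
        rdata = msg[pos:pos + rdlen]
        if rtype == 1 and rdlen == 4:
            records.append((owner, "A", str(ipaddress.ip_address(rdata))))
        elif rtype == 28 and rdlen == 16:
            records.append((owner, "AAAA", str(ipaddress.ip_address(rdata))))
        elif rtype == 5:
            target, _ = read_name(msg, pos)         # CNAME target may be compressed
            records.append((owner, "CNAME", target))
        pos += rdlen                                # other record types are skipped
    return questions, records
```

Each returned (owner, type, value) tuple could then be handed to whatever cache the module maintains. Malformed or truncated packets raise an exception in this sketch rather than being handled gracefully.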
[0041] An example of a network flow report (e.g., augmented according to one or more of the processes shown in
[0042] It should be understood that the steps shown in processes 300, 350, 500, and 600 are merely illustrative and that existing steps may be modified or omitted, additional steps may be added, and the order of certain steps may be altered.
[0043] Accordingly, embodiments of the present invention advantageously provide network flows that include the original requested DNS names for some or all of the reported connection requests. This enables network analysis personnel, automation tools, or the like to optimize network bandwidth (e.g., for individual users) and identify network security issues. It is to be appreciated that, in certain embodiments, the augmented network flow reports can be useful for detecting malicious programs, such as unauthorized smartphone apps. The novel system described herein, including the supplementation of network flows with DNS names from cache, can overcome the disadvantages of existing DNS caching solutions, which do not effect grouping by individual hosts.
[0044] It should be understood that the foregoing subject matter may be embodied as devices, systems, methods and/or computer program products. Accordingly, some or all of the subject matter may be embodied in hardware and/or in software (including firmware, resident software, micro-code, state machines, gate arrays, etc.). Moreover, the subject matter may take the form of a computer program product on a computer-usable or computer-readable storage medium having computer-usable or computer-readable program code embodied in the medium for use by or in connection with an instruction execution system. A computer-usable or computer-readable medium may be any medium that can contain, store, communicate, propagate or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
[0045] The computer-usable or computer-readable medium may be, for example, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium. Computer-readable media may comprise computer storage media and communication media.
[0046] Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Computer storage media includes RAM, ROM, EEPROM, flash memory or other memory technology that can be used to store information and that can be accessed by an instruction execution system.
[0047] Communication media typically embodies computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media (wired or wireless). A modulated data signal can be defined as a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal.
[0048] When the subject matter is embodied in the general context of computer-executable instructions, the embodiment may comprise program modules, executed by one or more systems, computers, or other devices. Generally, program modules include routines, programs, objects, components, data structures and the like, which perform particular tasks or implement particular abstract data types. Typically, the functionality of the program modules may be combined or distributed as desired in various embodiments.
[0049] Those of ordinary skill in the art will understand that the term “Internet” used herein refers to a collection of computer networks (public and/or private) that are linked together by a set of standard protocols (such as TCP/IP and HTTP) to form a global, distributed network. While this term is intended to refer to what is now commonly known as the Internet, it is also intended to encompass variations that may be made in the future, including changes and additions to existing protocols.
[0050] It will thus be seen that the objects set forth above, among those made apparent from the preceding description and the accompanying drawings, are efficiently attained and, since certain changes can be made in carrying out the above methods and in the constructions set forth for the systems without departing from the spirit and scope of the invention, it is intended that all matter contained in the above description and shown in the accompanying drawings shall be interpreted as illustrative and not in a limiting sense.
[0051] It is also to be understood that the following claims are intended to cover all of the generic and specific features of the invention herein described, and all statements of the scope of the invention, which, as a matter of language, might be said to fall therebetween.