Data streaming scheduler for dual chipset architectures that includes a high performance chipset and a low performance chipset
10003542 · 2018-06-19
Assignee
Inventors
- Murilo Opsfelder Araújo (Limeira, BR)
- Rafael Camarda Silva Folco (Santa Bárbara d'Oeste, BR)
- Breno Henrique Leitão (Jd Guanabara Campinas, BR)
- Tiago Nunes Dos Santos (Araraquara, BR)
CPC classification
G06F1/28
PHYSICS
H04L47/27
ELECTRICITY
G06F1/3293
PHYSICS
International classification
G06F1/28
PHYSICS
G01R11/00
PHYSICS
Abstract
A dual chipset architecture, a method of operating a scheduler for a dual chipset architecture, and a computer program product for operating a scheduler for a dual chipset architecture. In an embodiment, the dual chipset architecture comprises a high performance processor, a low performance processor, and a scheduler for the processors. The scheduler is provided for determining an expected data traffic flow to the chipset, and for selectively enabling the high and low performance processors, based on this expected data flow, ahead of this expected data flow reaching the chipset. In one embodiment, a specified data traffic indicator is associated with the expected data traffic flow, and the scheduler uses this specified data traffic indicator to determine the expected data traffic flow. In an embodiment, this specified data traffic indicator is a value for a defined window size for the expected data flow.
Claims
1. A method of operating a scheduler for a dual chipset hardware architecture comprising, a high performance hardware processor unit having a first operating speed, and a low performance hardware processor unit having a second operating speed slower than the first operating speed, the method comprising: analyzing incoming data traffic flow to the dual chipset, and monitoring the incoming data traffic flow for an indication of an expected data traffic flow to the dual chipset; and enabling the high and low performance processor units selectively at different times based on, and ahead of, the expected data traffic flow to the dual chipset at said different times, including based on the expected data traffic flow to the dual chipset at first times, enabling only one of the high and low performance processor units at the first times to process the expected data flow to the dual chipset at the first times, and based on the expected data traffic flow to the dual chipset at second times, enabling both of the high and low performance processor units at the second times to process the expected data traffic flow, based on said expected data flow to the dual chipset at the second time.
2. The method according to claim 1, wherein a specified data traffic indicator is associated with the expected data traffic flow, and the monitoring the incoming data traffic flow for an indication of an expected data traffic flow to the dual chipset includes using the specified data traffic indicator to determine the expected data traffic flow.
3. The method according to claim 2, wherein the expected data traffic flow includes a plurality of data flows through a specified connection, and a respective one data flow indicator value is associated with each of the plurality of data flows, each of the data indicator values indicating an expected volume of data in the associated data flow, and wherein: the specified data traffic flow indicator is based on the data flow indicator values.
4. The method according to claim 1, wherein the selectively enabling includes: enabling the high performance processor when the expected data traffic flow is above a first threshold; enabling the low performance processor when the expected data traffic flow is below a second threshold, said second threshold being lower than the first threshold; enabling the high performance processor and the low performance processor when the expected data flow is between the first and second thresholds; and disabling the low performance processor when the expected data flow remains between the first and second thresholds for a predetermined length of time.
5. The method according to claim 1, wherein the enabling the high and low performance processor units selectively further includes when the expected data traffic flow falls below a first threshold, disabling the high performance processor unit and enabling the low performance processor unit; and after a set period of time, if the expected data traffic flow is above a second threshold, lower than the first threshold, disabling the low performance processor unit and re-enabling the high performance processor unit.
6. A method of operating a scheduler for a dual chipset hardware architecture comprising, a high performance hardware processor unit having a first operating speed, and a low performance hardware processor unit having a second operating speed slower than the first operating speed, the method comprising: analyzing incoming data traffic flow to the dual chipset, and monitoring the incoming data traffic flow for an indication of an expected data traffic flow to the dual chipset; and enabling the high and low performance processor units selectively at different times based on, and ahead of, the expected data traffic flow to the dual chipset at said different times, including enabling only one of the high and low performance processor units at first times to process the expected data flow, and enabling both of the high and low performance processor units at second times to process the expected data traffic flow, based on said expected data flow, ahead of said expected data flow reaching the dual chipset; and wherein: the analyzing incoming data traffic flow to the dual chipset, and monitoring the incoming data traffic flow for an indication of an expected traffic flow includes analyzing the incoming data traffic flow to identify a defined window size for each segment of the analyzed incoming traffic, summing all the window sizes identified over a predetermined time period to obtain a window size sum, determining whether the window size sum meets a first threshold value representative of a high traffic volume, determining whether the window size sum does not meet a second threshold value, less than the first threshold value, representative of a low traffic volume, and determining whether the window size sum is between the first and second threshold values representative of an intermediate traffic volume.
Description
BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS
DETAILED DESCRIPTION
(10) Example methods, apparatuses, and products for operating a dual chipset NIC that includes a high performance media access control chipset and a low performance media access control chipset in accordance with the present invention are described with reference to the accompanying drawings, beginning with
(11) The NIC 102 of
(12) The NIC (102) of
(13) In the example of
(14) In the example of
(15) The high performance media access control chipset 108 includes an offload engine 124. In the example of
(16) With the embodiment of
(17) In the example of
(18) The NIC control module 302 of
(19) Embodiments of the invention provide an efficient scheduler for data streaming to switch between the high performance processor 126 and the power efficient processor 114. Embodiments of the invention use information about incoming data streams to predict the traffic flow, and switch back and forth between the high performance processor and the low performance processor ahead of time based on this prediction, thus reducing the power consumption of the architecture. For example, embodiments of the invention may be used with data streams constructed in accordance with the Transmission Control Protocol (TCP). In TCP, a data stream includes a TCP header that, in turn, includes a field referred to as window size, and the value in this field is an indication of the amount of traffic in the data stream.
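The window size field described above can be read directly from a raw TCP header. The following is a minimal sketch (not part of the patent) that extracts the 16-bit field at its fixed offset; it ignores window scaling (RFC 7323), so it returns the unscaled advertised value.

```python
import struct

def tcp_window_size(tcp_header: bytes) -> int:
    """Return the 16-bit window size field of a raw TCP header.

    The window size occupies bytes 14-15 of the TCP header in
    network byte order. Window scaling (RFC 7323) is ignored here
    for simplicity, so this is the unscaled advertised value.
    """
    (window,) = struct.unpack_from("!H", tcp_header, 14)
    return window

# A minimal 20-byte TCP header whose window size field is 4096.
header = bytes(14) + struct.pack("!H", 4096) + bytes(4)
print(tcp_window_size(header))  # 4096
```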
(20) With the TCP protocol, it is easy to determine whether a TCP connection will grow, that is, whether the amount of data entering through the connection will increase. So, if a connection from one device to another starts to get busy, processing of the data should migrate from the low performance processor to the high performance processor.
(21) In contrast, when the connection from one device to another is not busy, processing of the data should migrate from the high performance processor to the low performance processor. The high performance processor may be disabled, and the low performance processor becomes responsible for the connection.
(22) TCP provides a feature called Window Scaling, and with reference to
(23) A TCP window size having a high value indicates that a high traffic volume is coming. When the traffic volume over a connection is very small, the TCP window size has a low value.
(24) In embodiments of the invention, a chipset analyzes all the incoming traffic and monitors the TCP window size for each TCP segment. This chipset is referred to as the Traffic Sensor (TS).
(25) If the TCP window size of a TCP segment is high, this indicates that a high traffic volume is coming, and then the high performance processors are enabled to process the high workload.
(26) If the Traffic Sensor determines that the TCP window size has decreased substantially, or that traffic over a connection has stopped (FIN packet), then the high performance processor is disabled and the low performance processor is enabled to process the low workload.
(27) In embodiments of the invention, the scheduler works by measuring the sum of all TCP window size values at each time interval.
(28) Every TCP segment carries the TCP window size for the corresponding stream of data. A network interface can carry multiple streams at the same time.
(29) Examples of data streaming applications include: HTTP and web based applications; Database, Virtualization, Cloud, data center workloads; SSL (Secure Socket Layer); Infrastructure services (LDAP, SSH, Telnet); and Financial Services Applications.
(30) In embodiments of the invention, the scheduler measures the sum of all TCP window size values at time intervals (t, t+1, t+2, . . . , t+n). Table 1 below gives an example.
(31) TABLE 1. Measure of TCP window size in time intervals

    Instant t                  Instant t + 1
    Stream 1, window size 3    Stream 3, window size 7
    Stream 2, window size 4    Stream 4, window size 5
    Total: 7                   Total: 12
(32) In the example of Table 1, at instant t the sum of all TCP window size values was 7. At the next instant, t + 1, the sum was 12.
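The per-interval sums of Table 1 can be computed in a few lines. This sketch (stream ids and interval labels are only illustrative) reproduces the totals 7 and 12:

```python
# Streams observed per measurement interval, as in Table 1:
# instant t carries streams 1 and 2, instant t+1 carries streams 3 and 4.
intervals = {
    "t": {1: 3, 2: 4},      # stream id -> advertised TCP window size
    "t + 1": {3: 7, 4: 5},
}

def window_size_sums(intervals):
    """Sum the advertised TCP window sizes in each measurement interval."""
    return {instant: sum(sizes.values()) for instant, sizes in intervals.items()}

print(window_size_sums(intervals))  # {'t': 7, 't + 1': 12}
```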
(33) The TCP window size is a traffic prediction: the TCP segment carries this value to tell the receiver that more packets are expected to arrive. A high window size means that a large amount of data is about to be transmitted to the receiver interface.
(34) Thus, if the sum of all window size values is high, a peak of traffic is about to arrive. Having the high performance processors enabled in advance allows the receiver to process all the incoming packets smoothly.
(37) The area 406 between the soft limit 404 and the hard limit 410 is referred to as a grey zone, where both the high performance processor (running at normal speed) and the low performance processor (running overclocked) can handle the traffic without performance issues.
(38) In an embodiment of the invention, a timer is started when the sum 402 reaches the soft limit. When the timer ends and the sum is still above the hard limit, the high performance processor is enabled again; if the sum is below the hard limit (meaning the traffic prediction has decreased), the overclock mode of the low performance processor is disabled, and that processor runs at its normal speed.
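The soft-limit/hard-limit selection can be sketched as a small decision function. This is an illustrative reading of the described scheme (the labels "high", "low", and "low-overclocked" are hypothetical names, not the patent's terminology), assuming the soft limit lies above the hard limit:

```python
def select_processor(window_sum: int, soft_limit: int, hard_limit: int) -> str:
    """Choose a processor configuration from the predicted traffic.

    Assumes soft_limit > hard_limit: above the soft limit only the
    high performance processor keeps up; below the hard limit the low
    performance processor suffices at normal speed; in the grey zone
    between the two limits the low performance processor can run
    overclocked (or the high performance one at normal speed).
    """
    if window_sum >= soft_limit:
        return "high"
    if window_sum <= hard_limit:
        return "low"
    return "low-overclocked"  # grey zone

print(select_processor(120, soft_limit=100, hard_limit=40))  # high
print(select_processor(30, soft_limit=100, hard_limit=40))   # low
print(select_processor(70, soft_limit=100, hard_limit=40))   # low-overclocked
```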
(41) In this example, after the sum 402 falls below the soft limit, a timeout occurs with the sum still above the hard limit 410. When this timeout occurs, the low performance processor is disabled, the high performance processor is re-enabled, and the processing load is handled by the high performance processor.
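The timeout rule above reduces to a comparison of the last measured sum against the hard limit when the grey-zone timer expires. A minimal sketch (the function name and string labels are hypothetical):

```python
def grey_zone_timeout(final_sum: int, hard_limit: int) -> str:
    """Apply the timeout rule: when the grey-zone timer expires, the
    last measured window size sum decides which processor stays on.
    Still above the hard limit -> re-enable the high performance
    processor; at or below it -> keep the low performance processor.
    """
    return "high" if final_sum > hard_limit else "low"

print(grey_zone_timeout(72, hard_limit=40))  # high
print(grey_zone_timeout(35, hard_limit=40))  # low
```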
(42) The tables of
(43) With reference again to
(44) The NIC control module 302 of
(45) Configuring the NIC 102 to utilize the high performance media access control chipset 108 for data communications operations may be carried out, for example, through the use of an active flag maintained by each media access control chipset in shared memory 116. When the value of the active flag maintained by a particular media access control chipset is set to 1, the media access control chipset may operate as normal, sending packets and processing received packets. When the value of the active flag maintained by a particular media access control chipset is set to 0, however, the media access control chipset may operate in a standby mode and may be configured to refrain from processing received packets, transmitting packets, or performing any other operations in an attempt to facilitate data communications. In such an example, configuring the NIC 102 to utilize the high performance media access control chipset 108 for data communications operations may be carried out by setting the active flag for the high performance media access control chipset 108 to a value of 1 and also setting the active flag for the low performance media access control chipset 118 to a value of 0.
(46) The NIC control module 302 of
(47) Configuring the NIC 102 to utilize the low performance media access control chipset 118 for data communications operations may be carried out, for example, through the use of an active flag maintained by each media access control chipset in shared memory 116. When the value of the active flag maintained by a particular media access control chipset is set to 1, the media access control chipset may operate as normal, sending packets and processing received packets. When the value of the active flag maintained by a particular media access control chipset is set to 0, however, the media access control chipset may operate in a standby mode and may be configured to refrain from processing received packets, transmitting packets, or performing any other operations in an attempt to facilitate data communications. In such an example, configuring the NIC 102 to utilize the low performance media access control chipset 118 for data communications operations may be carried out by setting the active flag for the low performance media access control chipset 118 to a value of 1 and also setting the active flag for the high performance media access control chipset 108 to a value of 0.
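The active-flag scheme can be modeled with a simple mutually exclusive toggle. This sketch only illustrates the flag semantics; the dictionary standing in for shared memory 116 and the chipset names are hypothetical:

```python
# Hypothetical model of the active-flag scheme: each media access
# control chipset keeps a flag in shared memory, where 1 means
# active and 0 means standby.
shared_memory = {"high_perf_mac": 0, "low_perf_mac": 1}

def use_chipset(flags: dict, chipset: str) -> None:
    """Activate one chipset and put every other chipset on standby."""
    for name in flags:
        flags[name] = 1 if name == chipset else 0

use_chipset(shared_memory, "high_perf_mac")
print(shared_memory)  # {'high_perf_mac': 1, 'low_perf_mac': 0}
```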
(48) The NIC control module 302 of
(50) Stored in RAM 168 is an operating system 154. Operating systems useful in computers 152 that include a dual chipset NIC 102 according to embodiments of the present invention include UNIX, Linux, Microsoft XP, AIX, IBM's i5/OS, and others as will occur to those of skill in the art. The operating system 154 in the example of
(51) The computer 152 of
(52) The example computer 152 of
(53) The example computer 152 of
(54) For further explanation,
(55) The example method of
(56) The example method of
(57) The example method of
(58) The example method of
(59) The example method of
(60) Additional details of NIC 102 are disclosed in copending application no. (Attorney Docket CA920130054US1), the entire contents and disclosure of which are hereby incorporated herein by reference.
(61) As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a "circuit," "module" or "system." Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.
(62) Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
(63) A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
(64) Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
(65) Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the C programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
(66) Aspects of the present invention are described above with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
(67) These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
(68) The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
(69) The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
(70) It will be understood from the foregoing description that modifications and changes may be made in various embodiments of the present invention without departing from its true spirit. The descriptions in this specification are for purposes of illustration only and are not to be construed in a limiting sense. The scope of the present invention is limited only by the language of the following claims.