Data centre simulator

09830407 · 2017-11-28

Abstract

The invention provides a computer simulation system for simulating a data centre. The simulation system uses a logical representation of the data centre to perform the simulation. This logical representation includes a plurality of nodes representing devices in the data centre. Each node has an input for applied load and outputs for electrical power drawn and losses in the form of heat output. Each node also has a function for calculating the outputs from the inputs. A first set of connections between the nodes represents electrical power drawn by one device in the data centre from another device in the data centre. A second set of connections between the nodes represents a thermal load applied by one device in the data centre to another device in the data centre. The simulator can be run for a series of different operating conditions to map data centre efficiency, for example, or to assess the impact of different IT devices on the data centre.

Claims

1. A computer simulation system for simulating a data centre, the simulation system comprising: a server; a logical representation of the data centre including: a plurality of nodes representing devices in the data centre, said plurality of nodes comprising at least one node representing one or more IT devices, at least one node representing an element of an electrical power delivery infrastructure of the data centre, and at least one node representing an element of a mechanical heat removal infrastructure of the data centre, each node comprising: a first input for applied load; a first output for total load from the node; a second output for losses; a function for calculating the outputs from the inputs; a first plurality of connections between at least some of the nodes, each of said first plurality of connections connecting an output of one node to the first input of another node and representing electrical power drawn by one device in the data centre from another device in the data centre; a second plurality of connections between at least some of the nodes, each of said second plurality of connections connecting an output of one node to the first input of another node and representing a thermal load applied by one device in the data centre to another device in the data centre; and a simulator framework operable to run a simulation by sequentially applying a series of input states to the simulator and recording an output of the simulator after the application of each input state; wherein the simulator output determines an allocation or attribution of a share of overall data centre energy to each individual node, and wherein the simulator output allocates a cost individually to each node that includes energy, capital, and operational costs, wherein capital costs include capital costs of hardware and installation, and wherein operational costs include operational costs of the infrastructure of the data centre and maintenance costs of hardware based upon a utilised or allocated portion of capacity.

2. The system of claim 1, wherein each input in the series of inputs comprises an applied electrical load and an external temperature and the simulator output comprises a data centre efficiency value.

3. The system of claim 1, wherein each input in the series of inputs comprises a time.

4. The system of claim 1, wherein each input in the series of inputs comprises operating data for at least some of the nodes, at least some of the nodes each comprising at least one additional input for operating data.

5. The system of claim 4, wherein said operating data comprises performance data for the device represented by the node.

6. The system of claim 4, wherein said operating data comprises data representing at least one environmental parameter.

7. The system of claim 1, wherein the simulator output comprises an allocation of the data centre energy consumption to each of the nodes.

8. The system of claim 1, wherein the cost allocated to a node includes a cost accrued to the capital cost of hardware and installation.

9. The system of claim 1, wherein the cost allocated to a node includes a cost accrued to the maintenance cost of hardware.

10. The system of claim 1, wherein the cost allocated to a node includes a cost accrued to the capital and operational costs of the infrastructure of the data centre based upon the utilised portion of capacity.

11. The system of claim 1, wherein the cost allocated to a node includes a cost accrued to the power lost in variable losses in the data centre infrastructure due to the power delivered to the device represented by the node.

12. The system of claim 1, wherein the applied load input for at least one of said nodes is an applied IT workload.

13. The system of claim 1, wherein at least one of the nodes is a device node which represents multiple devices of the same type and function operating as a group.

14. The system of claim 1, wherein said function for calculating the node outputs from the node inputs is selected from the group consisting of: functions using data points for loss or efficiency by one or more variables; parameterized functions for loss or efficiency by one or more variables; functions that simulate control systems for devices in the data center; and distribution or transformation functions.

15. The system of claim 1, wherein the nodes pass data using an extensible data format.

16. The system of claim 15, wherein the data passed between the nodes comprises a range of categories of cost.

17. The system of claim 15, wherein the data passed between the nodes comprises power passed as a vector representing power factor harmonics.

18. The system of claim 15, wherein the data passed between the nodes comprises values for absolute or relative humidity, water mass or water mass rate.

Description

BRIEF DESCRIPTION OF THE DRAWING

(1) Embodiments and optional features of the invention are described below, with reference to the accompanying drawings, in which:

(2) FIG. 1 illustrates the IT power delivery path and losses in a typical data centre;

(3) FIG. 2 shows the change with IT load of data centre input power required to deliver power to a 1 MW IT electrical load;

(4) FIG. 3 shows the change in data centre efficiency as IT electrical load increases from zero to full load;

(5) FIG. 4 shows data centre efficiency against IT load under a modular provisioning scenario;

(6) FIG. 5 shows a plot of DCIE by IT electrical load and external temperature;

(7) FIG. 6 illustrates the scope of the simulator coverage for an embodiment of the present invention and the variability in the operating parameters of a data centre that the simulator can account for;

(8) FIG. 7 schematically illustrates an individual node of the simulation environment of an embodiment of the invention;

(9) FIG. 8 illustrates the way in which multiple nodes of different types can be connected to one another in a simulation of a data centre;

(10) FIG. 9 shows device nodes connected to simulate the power delivery chain of a data centre, the power transfer being illustrated with solid lines;

(11) FIG. 10 shows device nodes connected to simulate the thermal chain of a data centre, the thermal flows being shown in chain link lines;

(12) FIG. 11 shows the device nodes, power delivery connections and thermal connections of FIGS. 9 and 10 merged;

(13) FIG. 12 is a representation of a simple, single data hall data centre;

(14) FIG. 13 is a model of an IT device used in the simulation of an embodiment of the invention;

(15) FIG. 14 shows the relationship between server workload, server power draw (solid block) and server efficiency (line);

(16) FIG. 15 illustrates the manner in which the simulation models IT electrical load other than the IT device(s) being analysed, in order that the IT device is analysed in an operational context;

(17) FIG. 16 illustrates the integration of the IT device and IT electrical model of FIG. 15 into the model of the combined power and thermal chains of FIG. 11;

(18) FIG. 17 is a plot of IT device fixed and variable power drawn against IT workload;

(19) FIG. 18 shows the simulation model of FIG. 16 with additional nodes and connections to enable IT device fixed and variable energy allocation;

(20) FIG. 19 shows the simulation model of FIG. 18 with additional nodes and connections to enable full energy and cost allocation;

(21) FIG. 20 shows the simulation model of FIG. 19 with additional nodes and connections to enable full energy cost allocation with utilisation compensation;

(22) FIG. 21 is a schematic illustration of the software structure of an embodiment of the simulator;

(23) FIG. 22 is a schematic illustration of the software structure of another embodiment of the simulator;

(24) FIGS. 23 to 25 are plots of overall IT device cost and energy use for two comparative scenarios; and

(25) FIG. 26 is a comparative plot of overall (data centre) cost and energy usage for the two scenarios.

DETAILED DESCRIPTION

(26) Data Centre Overview

(27) The exemplary data centre simulator described below is an analysis tool which operates in two basic modes, data centre infrastructure performance and IT device analysis.

(28) Reporting and Analysis

(29) Tools and metrics for the data centre can be broadly categorised as either reporting or analysis.

(30) Reporting Measures and Metrics

(31) Reporting metrics include the Green Grid DCIE (Data Center Infrastructure Efficiency) metric of electrical power transfer efficiency. This metric expresses the efficiency with which the data centre mechanical and electrical plant transfers energy from the building supply to the IT equipment.

(32) $$\mathrm{DCIE} = \frac{\text{IT Equipment Power}}{\text{Total Facility Power}}$$

(33) The DCIE can be measured either at a single point in time or across a time period. A DCIE report for a data centre gives a view of the achieved efficiency under the specific combination of conditions during the measurement period.

(34) Analysis and Diagnostic Tools

(35) While the reporting metric approach is effective in providing initial recognition of a potential efficiency problem, there is more required to define a solution. There is also a requirement for analysis tools to determine why the efficiency is as measured and to assist operators in evaluation and business justification of effective financial and environmental improvements.

(36) The data centre simulator is such an analysis tool, designed to help provide understanding and answers to these questions. The simulator provides insight into both the data centre (building) infrastructure and how this interacts with the IT hardware it supports.

(37) Data Centre Efficiency

(38) The first mode of the simulator tool allows the modelling and analysis of the efficiency of the data centre infrastructure. The output from this stage is provided as a DCIE report of the performance as simulated.

(39) Overview of Data Centre Efficiency

(40) To explain the output of this mode it is necessary to provide a brief overview of the behaviour of the data centre mechanical and electrical infrastructure.

(41) As shown in FIG. 1, power is supplied to the data centre from, typically, a utility feed 101. This power then passes through a set of electrical power conversion, conditioning and distribution devices 102 on the way to the IT equipment 110. Each of these devices exhibits some inefficiency and some of the power is lost. Also consuming power is the mechanical plant of the data centre, mostly the CRAC (Computer Room Air Conditioner) units 103 and the chiller plant 104. Finally there will be ancillary services 105 such as lighting, fire suppression and generator pre-heaters which also consume power.

(42) In FIG. 2 we show a simplified view of the impact of these losses on the data centre utility input power required to deliver power to a 1 MW IT electrical load. As shown, the total power demand at full load is around 205% of the IT electrical load. Of more importance, however, is the fixed overhead of the data centre mechanical and electrical infrastructure. At zero IT electrical load this plant would still draw around 600 kW from the utility. (See the BCS paper Data Centre Efficiency Metrics for a more detailed exploration of this issue, http://www.bcs.org/datacentreenergy)

(43) This fixed overhead means that the data centre efficiency (DCIE) will vary with the IT electrical load in the data centre. As shown in FIG. 3, at full load the DCIE is just under 0.5 but at 20% of the full rated load the DCIE has fallen to 0.23 due to the changing ratio between the fixed and variable power consumption.
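The figures quoted above can be checked with a short calculation. The following is a minimal sketch that assumes a purely linear model (a 600 kW fixed overhead plus a variable overhead proportional to IT load, scaled so that total demand reaches 2,050 kW at 1 MW). The linear form and the function names are illustrative assumptions; the curve in FIG. 2 need not be exactly linear.

```python
FIXED_OVERHEAD_KW = 600.0    # utility draw at zero IT load (from the text)
RATED_IT_LOAD_KW = 1000.0    # 1 MW rated IT electrical load
FULL_LOAD_TOTAL_KW = 2050.0  # around 205% of the IT load at full load

# Assumed linear model: utility kW required per kW of IT load, excluding the
# fixed overhead.
VARIABLE_KW_PER_IT_KW = (FULL_LOAD_TOTAL_KW - FIXED_OVERHEAD_KW) / RATED_IT_LOAD_KW


def facility_power_kw(it_load_kw: float) -> float:
    """Total utility power needed to deliver it_load_kw to the IT equipment."""
    return FIXED_OVERHEAD_KW + VARIABLE_KW_PER_IT_KW * it_load_kw


def dcie(it_load_kw: float) -> float:
    """DCIE = IT equipment power / total facility power."""
    return it_load_kw / facility_power_kw(it_load_kw)


for fraction in (0.2, 0.5, 1.0):
    load_kw = fraction * RATED_IT_LOAD_KW
    print(f"{fraction:.0%} of rated load: DCIE = {dcie(load_kw):.3f}")
```

Under this assumed model the DCIE at full load is just under 0.5, and at 20% load it falls to between 0.22 and 0.23, in line with the values read from FIG. 3.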

(44) This variability of data centre efficiency with load means that we cannot usefully perform analysis or comparison of data centres with measured DCIE values as these reported values do not provide sufficient information to compare data centres or evaluate the impact of any changes.

(45) The electrical load and therefore the achieved efficiency of the data centre will vary with time both as IT equipment is installed and changed in the data centre and, with more modern IT equipment, as the applied IT workload changes. Virtualisation, grid and MAID (Massive Array of Idle Disks, a RAID system which turns off hard disks when not in use) technologies are all allowing for large variations in IT electrical load as they are installed into data centres through their ability to allow devices to idle, sleep or turn off when not required.

(46) Data Centre Efficiency and Modular Provisioning

(47) An example of complex DCIE variability is a more modern, modular data centre design. In this example the data centre mechanical and electrical plant is rolled out in stages: the PDU (Power Distribution Unit; the free-standing large unit, not the power strips in racks), UPS (Uninterruptible Power Supply), CRAC and chiller systems are installed in 200 kW steps of rated IT electrical load up to the 1 MW full capacity. (The 200 kW steps refer to IT electrical load; the actual increments are larger for most devices due to the losses from other devices in the power and cooling systems; these will be 200 kW for the CRAC units and larger for the transformer.) As shown in FIG. 4 we now have a family of DCIE curves. The modular deployment provides substantial efficiency improvements in the early stages of the facility operation, where the facility is at low utilisation, as well as reducing initial capital costs and improving flexibility. The fixed overhead of the data centre is reduced at lower rated capacities through the reduced quantities of mechanical and electrical equipment and their reduced losses. This fixed modular approach is of less value in a facility where the IT equipment can exhibit large variations in electrical load.
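The effect of staged deployment on the fixed overhead can be illustrated with a small sketch. It assumes, purely for illustration, that the 600 kW fixed overhead of the fully built facility is split evenly across five 200 kW modules and that the variable overhead is 1.45 kW per kW of IT load; neither figure is stated per module in the text, and the function name is hypothetical.

```python
MODULE_IT_CAPACITY_KW = 200.0   # rated IT load added by each module (from the text)
MODULES_AT_FULL_BUILD = 5       # 5 x 200 kW = 1 MW full capacity

# Assumption: the fixed overhead of the full facility divides evenly by module.
FIXED_OVERHEAD_PER_MODULE_KW = 600.0 / MODULES_AT_FULL_BUILD
# Assumption: utility kW per kW of IT load, excluding fixed overhead.
VARIABLE_KW_PER_IT_KW = 1.45


def modular_dcie(it_load_kw: float, modules_installed: int) -> float:
    """DCIE for a given IT load with only some of the modules installed."""
    if modules_installed < 1:
        raise ValueError("at least one module must be installed")
    capacity_kw = modules_installed * MODULE_IT_CAPACITY_KW
    if it_load_kw > capacity_kw:
        raise ValueError("IT load exceeds the installed capacity")
    total_kw = (modules_installed * FIXED_OVERHEAD_PER_MODULE_KW
                + VARIABLE_KW_PER_IT_KW * it_load_kw)
    return it_load_kw / total_kw


# One DCIE curve per build stage, evaluated at the same 200 kW IT load.
for modules in range(1, MODULES_AT_FULL_BUILD + 1):
    print(modules, round(modular_dcie(200.0, modules), 3))
```

With these assumed numbers, a 200 kW IT load is served far more efficiently by a single installed module than by the fully built facility, which is the behaviour the family of curves in FIG. 4 describes.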

(48) Data Centre Efficiency and External Temperature

(49) The other major influencing factor on data centre efficiency is the external temperature. The efficiency of the data centre cooling systems is influenced by the external temperature into which they are trying to reject energy as heat.

(50) As shown in FIG. 5 the efficiency of the data centre varies substantially with external temperature. In order to effectively understand the cost and energy efficiency characteristics of a data centre or forecast the impact of changes to the mechanical and electrical or IT systems it is necessary to understand the variation of efficiency with both IT electrical load and external temperature.

(51) Data Centre Simulation

(52) The data centre simulator has been designed as a framework tool to encompass the full range of factors affecting data centre cost and energy performance.

(53) Scope of the Simulator

(54) The data centre is a complex environment which covers a broad range of technical disciplines, skill sets and frequently organisational roles. This has led to the development of a range of component calculators and discipline specific tools which seek to address the energy and cost issues of a modern data centre. The simulator covers the range from the IT workloads applied to the IT devices through the electrical and mechanical systems through to the energy supplies and external climate.

(55) The Need for System Level Simulation

(56) As discussed above, the data centre plant efficiency is not a constant that can be measured and then used for analysis or comparison of data centres. The electrical load applied to the infrastructure by the IT equipment affects the infrastructure efficiency. The efficiency of the data centre is also affected by the external temperature, which varies with the time of day and season. Whilst much legacy IT equipment drew close to its full power irrespective of load and could be viewed from a mechanical and electrical perspective as little more than expensive resistors once installed in the data centre, modern IT equipment is being designed to exhibit a far stronger connection between the applied IT workload and the electrical power draw. As both the IT workload (which drives the IT electrical load) and the external temperature vary with the time of day, we cannot usefully evaluate the efficiency of the data centre by external temperature without considering the variation in IT electrical load due to IT workload at the same time. The development and implementation of mobile virtual machines leading to real grid computing technologies makes this issue more significant.

(57) This interdependence makes it difficult to perform analysis of the data centre or forecast the impact of any changes to any part of the data centre without considering all of the external variables and the whole data centre as a system. With this range of connected external factors, it rapidly becomes difficult to analyse the performance of a data centre design or the impact of any change to an existing facility.

(58) As shown in FIG. 6, the full system simulator described here captures the variability of each coverage area and provides an operating framework in which simulation models of each component can operate, as part of the full system whilst considering only their direct dependencies. The user can specify the external variables such as the cost of power or IT workload by time and the performance of devices such as a UPS by applied load and the simulator will work through the variables and dependencies.

(59) Simulation Approach

(60) The basic approach of the simulator is to create a representation of the data centre using a set of nodes which represent the individual devices. This is then evaluated for a single set of input values and the resulting data retained for that step. The simulator then iterates through the steps required to produce the simulation output requested, applying the data provided for the external variables as required.

(61) More specifically, each element of the data centre system is represented as an individual node. A node contains the logic to simulate that element. A node is provided with device performance data and external variables by the simulator. Nodes are connected together in the simulator using defined interfaces.

(62) Both power (electrical energy) and thermal loads are represented in the energy simulator. These loads may take the form of simple numbers or structured arrays of values to represent complex constructs such as the phase of frequency harmonics making up an electrical power factor.

(63) The simulator treats each node as a black box and is therefore not restricted to simplified continuous functions but is able to incorporate complex, disjoint device behaviours up to and including the simulation of complex control systems.

(64) The simulator is able to represent feedback loops within the infrastructure that make traditional analysis difficult. For example, an air conditioning unit may be powered from a device which is in the area the unit is cooling. As the air conditioning unit handles the thermal load in the area, it draws power; this induces further losses in the power supply device, increasing the thermal losses which the air conditioning unit must deal with, which in turn increases its power consumption, and so on.

(65) These approaches substantially simplify the creation of device specific simulation components by providing a full environment and context for device specific functional representations.

(66) FIG. 7 schematically illustrates an individual node 710 of the simulation environment.

(67) Each node 710 has three basic connections to other nodes within the modelled structure: 1) The applied load 720, be that an electrical, thermal, application workload or other; 2) The losses 730 incurred in handling the applied load 720; and 3) The drawn load 740 resulting from handling the applied load 720.

(68) Each node also has access to any external data it requires to perform its simulation. The external data is supplied by the simulation framework 750. This allows for data which varies with one of the simulation variables such as time.

(69) For example, a node representing an electrical infrastructure element would have an applied load representing the power drawn by the connected devices. The node would then incur some level of losses dependent upon the load, as defined by the node model and the supplied parameters. These losses would be provided at the losses interface and likely collected as thermal losses to be taken to an air conditioning node. In an electrical node the losses are typically summed with the applied load to provide the drawn load. External data might include the capacity of the devices at the electrical node, the required performance data and any variable dependent data, such as the external climate conditions for a chiller plant node.
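The three connections of an electrical node described above can be sketched as a small data structure. This is only an illustrative sketch: the class names, the `evaluate` method and the loss curve are hypothetical and are not taken from the described embodiment.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class NodeResult:
    losses: float      # load dissipated in the node, typically as heat
    drawn_load: float  # load the node draws from the node upstream of it


@dataclass
class ElectricalNode:
    name: str
    # Losses as a function of applied load; supplied as performance data.
    loss_function: Callable[[float], float]

    def evaluate(self, applied_load: float) -> NodeResult:
        """Compute losses, then sum them with the applied load to get the drawn load."""
        losses = self.loss_function(applied_load)
        return NodeResult(losses=losses, drawn_load=applied_load + losses)


# Hypothetical loss curve: 5 kW fixed loss plus 4% of the applied load.
ups = ElectricalNode("UPS", lambda load_kw: 5.0 + 0.04 * load_kw)
result = ups.evaluate(100.0)
print(result)
```

Because the node only sees its applied load and its own performance data, any loss function (tabulated, parameterised or a control system simulation) can be substituted without changing the surrounding framework.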

(70) The nodes used in the data centre simulator can be considered to fall into five basic types:

(71) a. Electrical nodes 810 are used to represent elements of the electrical (power delivery) infrastructure of the data centre;

(72) b. Thermal nodes 820 are used to represent elements of the mechanical (heat removal) infrastructure of the data centre;

(73) c. Load nodes 830 are used to apply loads to devices. This includes electrical & thermal loads applied to the infrastructure as well as workloads applied to IT Devices;

(74) d. Environment nodes 840 act as a source or sink for thermal emissions of the data centre; and

(75) e. Summing nodes 850 are responsible for collating a set of loads, applying a unifying function (possibly the arithmetic sum) and passing the joined loads on to another node. These may also function as splitter nodes and divert varying amounts or proportions of a load to other nodes.
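A summing node and its splitter variant might be sketched as follows, assuming the simplest case where the unifying function is the arithmetic sum and the split is by fixed proportions. The function names are hypothetical.

```python
from typing import List, Sequence


def sum_loads(loads: Sequence[float]) -> float:
    """Collate a set of loads using the arithmetic sum as the unifying function."""
    return float(sum(loads))


def split_load(load: float, proportions: Sequence[float]) -> List[float]:
    """Divert a load to several downstream nodes in the given proportions."""
    total = sum(proportions)
    if total <= 0:
        raise ValueError("proportions must sum to a positive value")
    return [load * p / total for p in proportions]


# Thermal losses from three devices collected for one air conditioning node.
collected_kw = sum_loads([10.0, 2.5, 7.5])
# The same load shared between two downstream nodes in a 1:3 ratio.
shares_kw = split_load(100.0, [1, 3])
```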

(76) FIG. 8 illustrates how these various node types might be connected in a data centre simulator.

(77) Device Nodes

(78) The basic element of simulation is the device node. Each node represents one instance of a type of device, for example an Uninterruptible Power Supply. The node has inputs to provide the performance data for that device as well as the applied load and any other external factors that the device node requires to determine its behaviour such as external temperature for a chiller plant.

(79) The device node also has at least two outputs, load and loss, typically the electrical power drawn and the heat output. Nodes are not aware of time or any other factors which do not directly impact the node; the simulator is responsible for ensuring that all applied parameters are correct for that step in the simulation.

(80) The basic device types represented are:

(81) TABLE 1: Basic device node types

| Device node type | Performance data type | Depends upon |
| --- | --- | --- |
| IT electrical loads | Load source | Time |
| Lighting and other overheads | Load source | Time |
| Uninterruptible Power Supply | Loss by electrical load | Electrical load |
| Power Distribution Units | Loss by electrical load | Electrical load |
| Cabling | Loss by electrical load | Electrical load |
| Transformers | Loss by electrical load | Electrical load |
| Computer Room Air Conditioning units | Electrical load by thermal load | Thermal load |
| Chiller plant components | Electrical load by thermal load | Thermal load and temperature |

Power Chain

(82) The nodes are connected within the simulator to represent the energy paths within the data centre. The first energy path is the electrical power delivery chain formed by the electrical plant of the data centre. An example of this is shown in FIG. 9.

(83) In this simplified example we start with the IT electrical load node 910, which is used by the simulator to apply a load to the data centre. This load source is electrically connected to a PDU node 920. The PDU node has a set of data describing the losses it incurs delivering power. The PDU node adds these losses to the power drawn by the IT electrical load node 910 and passes this load on to the UPS 930, which adds its losses, and so on until we reach the transformer 940 and the overall direct energy use 950 of the data centre electrical system.
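The way each node in the power chain adds its losses and passes the result upstream can be sketched as a simple loop. The loss curves below are hypothetical placeholders, not measured device data, and the function name is an assumption.

```python
from typing import Callable, List, Tuple

LossFunction = Callable[[float], float]


def run_power_chain(
    it_load_kw: float, chain: List[Tuple[str, LossFunction]]
) -> Tuple[float, List[Tuple[str, float, float]]]:
    """Walk from the IT load towards the utility feed, adding losses at each node."""
    load_kw = it_load_kw
    trace = []
    for name, loss_fn in chain:
        losses_kw = loss_fn(load_kw)  # losses depend on the load applied to this node
        load_kw += losses_kw          # drawn load is passed on as the next applied load
        trace.append((name, losses_kw, load_kw))
    return load_kw, trace


# Hypothetical loss curves, ordered from the IT load towards the transformer.
chain = [
    ("PDU", lambda kw: 0.02 * kw),
    ("UPS", lambda kw: 10.0 + 0.05 * kw),
    ("Transformer", lambda kw: 0.01 * kw),
]
total_kw, trace = run_power_chain(1000.0, chain)
for name, losses_kw, drawn_kw in trace:
    print(f"{name}: losses {losses_kw:.2f} kW, drawn {drawn_kw:.2f} kW")
```

The per-node losses recorded in `trace` are the values that would be handed to the thermal chain.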

(84) Thermal Chain

(85) The second major energy path in the data centre infrastructure is the thermal chain formed by the mechanical plant. An example is shown in FIG. 10.

(86) In this, again simplified, example we are dealing with the thermal loads within the data centre. Each of the nodes from the electrical chain that exhibits thermal loss within the cooled area of the data centre is included. The IT electrical loads effectively lose their entire input power as heat, whilst the electrical infrastructure only rejects the node losses as heat. These thermal loads are summed and applied to the CRAC units, which are responsible for removing the heat from the cooled area of the data centre. The CRAC node 1010 has a loss function which expresses the electrical power consumed to deal with a given applied thermal load; this is mostly fan motor power, although in a DX or hybrid system this may also be compressor or pump power. Note that more advanced models of the CRAC unit may also represent the dehumidification losses due to the split between sensible and latent cooling at the working temperatures and humidities, as well as the electrical load of re-humidification where necessary. This loss is then added to the thermal load and applied to the chiller plant 1020. The chiller plant node uses both the external temperature 1030 and the applied thermal load to determine its energy consumption.
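The thermal chain can be sketched in the same style. The CRAC and chiller functions below are hypothetical: a fan power of 10% of the thermal load and a chiller coefficient of performance that falls linearly with external temperature are assumptions chosen only to make the example concrete.

```python
from typing import Dict, Sequence


def crac_power_kw(thermal_kw: float) -> float:
    """Hypothetical CRAC loss function: fan power as 10% of the thermal load."""
    return 0.10 * thermal_kw


def chiller_power_kw(thermal_kw: float, external_temp_c: float) -> float:
    """Hypothetical chiller model: efficiency falls as the external temperature rises."""
    cop = max(2.0, 6.0 - 0.1 * external_temp_c)  # assumed coefficient of performance
    return thermal_kw / cop


def run_thermal_chain(
    it_load_kw: float,
    infrastructure_losses_kw: Sequence[float],
    external_temp_c: float,
) -> Dict[str, float]:
    # IT loads reject all of their input power as heat; the electrical
    # infrastructure rejects only its losses.
    hall_heat_kw = it_load_kw + sum(infrastructure_losses_kw)
    crac_kw = crac_power_kw(hall_heat_kw)
    # The CRAC loss is added to the thermal load applied to the chiller plant.
    chiller_thermal_kw = hall_heat_kw + crac_kw
    chiller_kw = chiller_power_kw(chiller_thermal_kw, external_temp_c)
    return {
        "hall_heat_kw": hall_heat_kw,
        "crac_kw": crac_kw,
        "chiller_thermal_kw": chiller_thermal_kw,
        "chiller_kw": chiller_kw,
    }


mild = run_thermal_chain(1000.0, [20.0, 61.0], external_temp_c=20.0)
hot = run_thermal_chain(1000.0, [20.0, 61.0], external_temp_c=35.0)
```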

(87) Connected Chains and Iteration

(88) The final step in preparing the node model for simulation is to connect the power and thermal chains as shown in FIG. 11.

(89) In addition to the basic chains we also apply the power consumed by the CRAC and Chiller plant nodes of the mechanical plant to the nodes of the electrical plant (represented by connections 1110). One important aspect of this is that the nodes support feedback loops. For example the CRAC units 1010 may be fed from the UPS power feed creating such a loop where the power drawn by the CRAC units 1010 increases the load on the UPS 930, thus increasing its losses and the thermal load applied to the CRAC units 1010, thus their power draw and the load on the UPS etc. Where these loops occur the simulator simply iterates until the loads stabilise and the working result of the system is achieved.
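The iteration over a feedback loop can be sketched as a fixed-point calculation. The example below assumes a single loop in which the CRAC units are fed from the UPS, and uses hypothetical loss functions, a hypothetical convergence tolerance and a hypothetical iteration limit.

```python
from typing import Dict


def ups_losses_kw(load_kw: float) -> float:
    """Hypothetical UPS loss curve: 10 kW fixed plus 5% of the applied load."""
    return 10.0 + 0.05 * load_kw


def crac_power_kw(thermal_kw: float) -> float:
    """Hypothetical CRAC loss function: 10% of the applied thermal load."""
    return 0.10 * thermal_kw


def solve_feedback_loop(
    it_load_kw: float, tolerance_kw: float = 1e-6, max_iterations: int = 100
) -> Dict[str, float]:
    """Iterate the UPS/CRAC loop until the CRAC power draw stabilises."""
    crac_kw = 0.0  # initial guess: no CRAC load on the UPS
    for iteration in range(1, max_iterations + 1):
        ups_load_kw = it_load_kw + crac_kw      # the CRAC units are fed from the UPS
        ups_loss_kw = ups_losses_kw(ups_load_kw)
        thermal_kw = it_load_kw + ups_loss_kw   # heat rejected inside the cooled area
        new_crac_kw = crac_power_kw(thermal_kw)
        if abs(new_crac_kw - crac_kw) < tolerance_kw:
            return {
                "crac_kw": new_crac_kw,
                "ups_loss_kw": ups_loss_kw,
                "iterations": float(iteration),
            }
        crac_kw = new_crac_kw
    raise RuntimeError("loads did not stabilise within the iteration limit")


stable = solve_feedback_loop(1000.0)
print(stable)
```

With these loss functions each pass changes the CRAC load by only a small fraction of the previous change, so the loop settles in a handful of iterations. A real framework would need to decide how to handle loops that do not converge.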

(90) Simulation Steps

(91) With the data centre electrical and mechanical chains connected, the data centre efficiency simulation can be performed. The output of this is the surface plot of DCIE against both IT electrical load and external temperature.

(92) To do this the simulator framework sets up the core simulation of connected nodes, loads the performance data values into those nodes and then sets the first temperature and IT electrical load point as inputs. This produces the first output efficiency data point, which is retained. The IT electrical load is then increased in steps (e.g. steps of 5%) from 0% up to 100% of the rated IT electrical load. This produces an efficiency against load curve for a single temperature similar to that shown in FIG. 3, calculated for 5% steps in electrical load.

(93) The simulator framework then increments the temperature by a requested step size (e.g. 5 °C) and repeats the sweep of load from 0% to 100%, storing the achieved efficiency for this temperature. The temperature is incremented until the upper temperature bound of the simulation is reached and the full grid of DCIE by both IT electrical load and external temperature is complete, producing the surface plot as shown in FIG. 5, calculated for 5% electrical load and 5 °C steps.
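The two nested sweeps can be sketched as follows. The `toy_dcie` function is a hypothetical stand-in for one full evaluation of the connected node model, and the temperature bounds of -10 °C to 40 °C are assumptions, since the bounds of the simulation are not stated here.

```python
from typing import Callable, Dict, Tuple


def toy_dcie(load_fraction: float, external_temp_c: float) -> float:
    """Hypothetical stand-in for evaluating the connected node model once."""
    it_kw = load_fraction * 1000.0
    if it_kw == 0.0:
        return 0.0
    cop = max(2.0, 6.0 - 0.1 * external_temp_c)        # assumed chiller behaviour
    total_kw = 400.0 + 1.1 * it_kw + 1.1 * it_kw / cop  # assumed overheads
    return it_kw / total_kw


def sweep(
    evaluate: Callable[[float, float], float],
    temp_min_c: int,
    temp_max_c: int,
    temp_step_c: int = 5,
    load_step_pct: int = 5,
) -> Dict[Tuple[int, int], float]:
    """Build the DCIE grid keyed by (external temperature in C, load in percent)."""
    grid = {}
    for temp_c in range(temp_min_c, temp_max_c + 1, temp_step_c):
        # For each temperature, sweep the IT electrical load from 0% to 100%.
        for load_pct in range(0, 101, load_step_pct):
            grid[(temp_c, load_pct)] = evaluate(load_pct / 100.0, float(temp_c))
    return grid


grid = sweep(toy_dcie, temp_min_c=-10, temp_max_c=40)
```

Each row of the grid at a fixed temperature corresponds to one efficiency against load curve, and the full grid is the data behind a surface plot of the kind shown in FIG. 5.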

(94) Layout Logical Representation

(95) The simulator uses layouts which are logical representations of the data centre mechanical and electrical infrastructure. These simplified layouts provide an effective approximation of the performance of the data centre with substantially reduced complexity. The simulator core is capable of simulating a very large number of nodes but this provides only limited additional accuracy and becomes very data centre specific.

(96) A simple, single data hall data centre is represented in FIG. 12. This is a simple data centre with UPS protected power for the IT devices and CRAC units only on the data floor. No other area of the data centre is cooled from the main chiller plant.

(97) Multiple Devices and Resilience

(98) In the logical representation layouts only a single node for each device type is given. This does not indicate that there is only one device, for example in the layout in FIG. 12 it is expected that there is more than one UPS but that the way they are deployed allows us to represent them logically as a single node.

(99) When configuring a node the parameters of the individual device are provided, generally the rated capacity and the load loss data. These are then supplemented by information to allow the simulator to understand the operating mode, including the resilience levels. Taking the UPS node in FIG. 12 as an example, the following data might be provided:

The UPS devices are rated at 300 kW each.

There are three UPS in N+1 resilience, providing a rated capacity of 600 kW (300 kW × (3 − 1) = 600 kW).

The +1 UPS is in active load sharing mode and therefore each UPS will receive 1/3 of the applied electrical load.

(100) TABLE 2: Examples of node resilience and capacity data

| | Scenario 1 | Scenario 2 | Scenario 3 |
| --- | --- | --- | --- |
| Rated device capacity | 300 kW | 300 kW | 300 kW |
| Provisioned capacity | 600 kW | 600 kW | 600 kW |
| Resilience level | N + 1 | N + 1 | 2(N + 1) |
| Operating mode | Active load sharing | Standby | Active load sharing |
| Total device count | 3 | 3 | 6 |
| Total device capacity | 900 kW | 900 kW | 1800 kW |
| Active device count | 3 | 2 | 6 |
| Active device capacity | 900 kW | 600 kW | 1800 kW |
| Load at each device | 1/3 | 1/2 | 1/6 |
| Part of device capacity at full provisioned load | 2/3 | 1 | 1/3 |

(101) Table 2 shows three examples of how the UPS capacity at this node might be logically represented. The simulator is not a data centre reliability and maintainability assessment tool and does not need to understand the resilience approach used; instead, the rated capacity of the device group at the node, the number of active devices and the device capacity are sufficient.
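The quantities the simulator needs can be derived from the rated device capacity, the provisioned capacity and the device counts. The sketch below assumes that the applied load is shared equally between the active devices; the class and property names are hypothetical.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class DeviceGroup:
    rated_device_kw: int  # rated capacity of one device
    provisioned_kw: int   # rated capacity of the group as seen by the load
    total_devices: int
    active_devices: int   # devices carrying load in normal operation

    @property
    def total_capacity_kw(self) -> int:
        return self.rated_device_kw * self.total_devices

    @property
    def active_capacity_kw(self) -> int:
        return self.rated_device_kw * self.active_devices

    @property
    def load_share_per_device(self) -> Fraction:
        """Fraction of the applied load carried by each active device (equal sharing assumed)."""
        return Fraction(1, self.active_devices)

    @property
    def device_utilisation_at_full_load(self) -> Fraction:
        """Fraction of each active device's capacity used at full provisioned load."""
        return Fraction(self.provisioned_kw, self.active_capacity_kw)


scenarios = {
    "N+1, active load sharing": DeviceGroup(300, 600, total_devices=3, active_devices=3),
    "N+1, standby": DeviceGroup(300, 600, total_devices=3, active_devices=2),
    "2(N+1), active load sharing": DeviceGroup(300, 600, total_devices=6, active_devices=6),
}
for label, group in scenarios.items():
    print(label, group.load_share_per_device, group.device_utilisation_at_full_load)
```

The per-device utilisation matters because device losses depend on the load applied to each device, so the same provisioned capacity can give different losses under different resilience arrangements.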

(102) Modular Facilities

(103) The number, presence or capacity of devices in the data centre and capacity of the overall data centre may vary with time in the simulator to allow for simulation of modular deployment, removal, migration or replacement of IT, Electrical or Mechanical capacity through the operational lifetime of the building.

(104) Fixed Energy Overhead

(105) The simulator is able to determine the fixed energy consumption overhead of a data centre at any point in time, taking into account external environmental conditions, the configuration and deployment state of the data centre infrastructure and operational management.

(106) This is not possible in an operating facility without substantial disruption to service. The fixed overhead may only otherwise be approximated by regression analysis of energy data which does not provide causal analysis or predictive capability. (See Data centre energy efficiency metrics by Liam Newcombe, a BCS published white paper available at http://www.bcs.org/upload/pdfdata-centre-energy.pdf)

(107) IT Simulation Overview

(108) A second mode of the data centre simulator is to perform an IT simulation. Once a data centre scenario has been created the simulator is able to put IT devices into that data centre and simulate the energy and cost impacts of operating those devices across a specified time period. The output of this simulation is a set of energy and cost data representing the IT device and data centre energy consumption, capital and operational costs.

(109) Whilst there are a number of work streams aimed at building on infrastructure level reporting metrics such as DCIE and creating horizontal metrics that describe the overall efficiency of IT equipment or the data centre system, this is not the approach taken by the simulator. Just as at the data centre infrastructure level, metrics that report the entire data centre do not provide the analysis capability to support change impact assessment or business case generation and cannot support useful or credible chargeback mechanisms.

(110) The key difference in approach is that the simulator is able to take all of the variables that impact the energy use and cost of IT devices in the data centre and provide a vertical view through the IT equipment and data centre stack to provide allocation of the energy use and cost of the IT devices under examination.

(111) Overview of IT Simulation

(112) The simulation of an IT device is conceptually simple; an application workload 1320 is applied by the simulator to a node 1310 which represents the IT device(s) being simulated. This node 1310 has the application load to power draw function for the IT device(s) under simulation and converts the applied workload into a power draw 1330 and heat output 1340, as illustrated in FIG. 13.

(113) This power draw and heat output are then applied to the simulated data centre infrastructure to determine the actual energy use and cost at the data centre supply of the applied workload on the IT device(s).

(114) IT Device Workload to Power and Efficiency

(115) One key point is that IT devices rarely exhibit constant power efficiency with workload. Much like the data centre infrastructure, the achieved efficiency in terms of IT workload by power consumption falls as the IT workload falls as shown in FIG. 14.
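
The IT device node of FIG. 13 and the falling efficiency of FIG. 14 can be illustrated with a small sketch. It assumes a linear workload to power function between an idle and a full load power draw, and assumes that all electrical power drawn leaves the device as heat; both assumptions, and all names and figures used, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ITDeviceNode:
    """Hypothetical IT device node: applied workload in, power and heat out."""
    idle_w: float  # assumed power draw at zero workload
    max_w: float   # assumed power draw at full workload

    def power_draw_w(self, workload: float) -> float:
        # Assumed linear load-to-power function; a real device curve
        # would be supplied as performance data.
        if not 0.0 <= workload <= 1.0:
            raise ValueError("workload must be between 0 and 1")
        return self.idle_w + (self.max_w - self.idle_w) * workload

    def heat_output_w(self, workload: float) -> float:
        # Assumption: all electrical power drawn is rejected as heat.
        return self.power_draw_w(workload)

    def work_per_watt(self, workload: float) -> float:
        # Workload delivered per Watt drawn, used here as the efficiency measure.
        return workload / self.power_draw_w(workload)


server = ITDeviceNode(idle_w=150.0, max_w=300.0)
for workload in (0.25, 0.5, 1.0):
    print(workload, server.power_draw_w(workload), server.work_per_watt(workload))
```

Because the assumed idle power is drawn regardless of workload, the work delivered per Watt is lower at 25% workload than at full workload, which is the behaviour described for FIG. 14.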

(116) This relationship demonstrates that it is not useful to express the efficiency of groups of IT devices in a data centre without considering the applied workload on each of those groups and the resulting efficiency. The complexity of this evaluation is compounded by the response of the data centre to IT electrical loads.

(117) Electrical Load Context

(118) As shown in the discussion above, the response of the data centre to IT electrical load is not linear, therefore before we can simulate the impact upon the data centre of a specific IT device or group of devices it is necessary to apply the full electrical and thermal load of the other IT equipment in the data centre.

(119) This is achieved in the simulator by using the IT Electrical Load node 910 that was used in the DCIE simulation to apply electrical and thermal load to the data centre (FIG. 15).

(120) IT Device and IT Electrical Load Applied to the Data Centre

(121) The simulator nodes for the IT device and the IT electrical load are connected to the power and thermal simulation chains already established for data centre simulation as shown in FIG. 16.

(122) Allocation of Energy

(123) A key part of the operation of the simulator is the allocation mechanism developed to effectively represent the data centre.

(124) Current approaches to energy accounting and charge back metrics are simplistic and frequently ineffective. These approaches typically use either: the power (energy) consumption of a device; or the space or power and cooling capacity allocated to a device, rack, area or room, as proxies for the device energy consumption and cost.

(125) These are ineffective and create perverse incentives driving sub optimal behaviour. This failure to effectively understand and represent costs can have significant impacts upon the overall performance of a data centre, either wholly owned or service provider.

(126) The simulator is able to determine the share of both load and allocated capacity at each node in the chain, allowing for much more effective cost allocation than the simplistic approaches currently in use. For example if a server is allocated 100 W and is fed by an Uninterruptible Power Supply with 10% losses, this allocation may become 110 W at the main Transformer feeding the server. The same loss factoring takes place for drawn power.
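
The loss factoring in this example can be sketched as a walk up the power chain. The sketch assumes, as the 100 W to 110 W figures imply, that each device's losses are expressed as a fraction of the load it delivers downstream; the function name and the second loss figure in the usage line are hypothetical.

```python
def upstream_load_w(load_w: float, loss_fractions: list[float]) -> float:
    """Factor a downstream load up through a chain of lossy devices.

    loss_fractions lists the losses of each device from the load towards
    the supply, each as a fraction of the load that device delivers
    (assumed convention, matching the 100 W -> 110 W example).
    """
    for loss in loss_fractions:
        load_w *= 1.0 + loss
    return load_w


# A server allocated 100 W behind a UPS with 10% losses.
print(round(upstream_load_w(100.0, [0.10]), 6))

# Hypothetical longer chain: the same UPS behind a transformer with 2% losses.
print(round(upstream_load_w(100.0, [0.10, 0.02]), 6))
```

The same function applies to allocated capacity and to drawn power, since the text applies the same loss factoring to both.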

(127) This system level analysis of allocation and consumption allows for far greater detail and accuracy in the allocation of costs than traditional methods.

(128) A core concept of the simulator is that it understands and implements both fixed and variable costs (energy and financial) and how these are incurred by logical or physical devices within the data centre. Fixed energy consumption and fixed financial costs, such as amortised capital, are allocated to devices based upon their allocation of data centre resources. Variable energy consumption and variable financial costs, such as the cost of energy consumed, are allocated to devices based upon their consumption of resources.

(129) The data centre simulator has established and implemented a set of basic rules to allocate a fair and reasonable share of the data centre energy consumption to the simulated IT devices (see the BCS whitepaper Data centre energy efficiency metrics for a more detailed exploration of fixed and variable energy and cost allocation, http://www.bcs.org/datacentreenergy, the content of which is incorporated herein by reference). One basic tenet of these rules is to accrue fixed and variable loads and costs separately. These fixed and proportional energy and financial costs for the data centre are directly analogous to the normal finance concepts of fixed and variable cost, and we will use them in a similar way to understand the real energy and cost behaviour of the data centre and how that impacts the cost and energy use of operating IT equipment within the data centre.

(130) Fixed and Variable

(131) Simulation of the data centre infrastructure has demonstrated the impact on efficiency of the fixed load that the data centre exhibits at any combination of external temperature and infrastructure deployment. This fixed overhead means that metering all of the IT devices and applying a ratio of the IT power to the overall facility power fails to properly factor this fixed energy cost and is not useful as an allocation mechanism or chargeback metric.

(132) In allocating the cost of an office building, the rental and service cost of a desk space would be accrued irrespective of whether the employee used the desk or what work they performed; this is a share of the fixed cost. The variable costs might include the energy used by a desktop PC and the telephone bill incurred by the employee whilst working at the desk.

(133) In a data centre when an IT device is installed, power and cooling capacity is allocated to that device. In most data centres, once this capacity is provisioned it cannot be used for another device. Once all of the available capacity of the data centre is allocated no more IT devices can be installed. The simulator uses this provisioned power to determine the share of the data centre fixed energy use that should be allocated to that IT device. This is carried out at every step of the simulation, taking into account the full state of the data centre and external temperature.

(134) The simulator is also able to determine the marginal energy use of the data centre due to the IT device energy use. This comprises both the energy used by the IT device itself and the additional energy used by both the power and thermal infrastructure to deliver that power to the device and remove the resulting heat. As before, this includes iteration of loops such as UPS fed CRAC units.
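
A loop such as a UPS fed CRAC unit can be resolved by repeated evaluation until the loads stop changing. The sketch below is one possible illustration and not the simulator's method: it assumes the UPS sits in the cooled space, that UPS losses are a fixed fraction of the load delivered, and that the CRAC draws its heat load divided by a constant coefficient of performance. All names and figures are hypothetical.

```python
def solve_ups_crac_loop(it_load_kw: float, ups_loss_fraction: float,
                        crac_cop: float, tolerance_kw: float = 1e-9,
                        max_iterations: int = 100) -> dict[str, float]:
    """Iterate the UPS -> heat -> CRAC -> UPS loop to a steady state."""
    crac_power_kw = 0.0  # initial guess: no cooling power drawn yet
    for _ in range(max_iterations):
        ups_output_kw = it_load_kw + crac_power_kw        # UPS feeds IT and CRAC
        ups_loss_kw = ups_loss_fraction * ups_output_kw   # assumed proportional loss
        heat_kw = it_load_kw + ups_loss_kw                # heat the CRAC must remove
        updated_kw = heat_kw / crac_cop                   # assumed constant COP
        if abs(updated_kw - crac_power_kw) < tolerance_kw:
            ups_output_kw = it_load_kw + updated_kw
            return {
                "crac_power_kw": updated_kw,
                "ups_output_kw": ups_output_kw,
                "ups_loss_kw": ups_loss_fraction * ups_output_kw,
            }
        crac_power_kw = updated_kw
    raise RuntimeError("UPS/CRAC loop did not converge")


loop = solve_ups_crac_loop(it_load_kw=100.0, ups_loss_fraction=0.10, crac_cop=4.0)
print({name: round(value, 3) for name, value in loop.items()})
```

With these assumed figures each pass changes the CRAC power by a small fraction of the previous change, so the iteration settles in a handful of steps.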

(135) As illustrated in FIG. 17, a sum of the fixed and variable power draw provides a fair and reasonable representation of the total energy cost of the IT device in the data centre.
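
The sum described above can be sketched as follows. The sketch assumes that the fixed overhead is shared in proportion to provisioned power over total data centre capacity, and that the marginal infrastructure energy can be summarised as a single overhead per Watt of device draw; in the simulator that marginal figure would come from evaluating the power and thermal chains. Names and numbers are hypothetical.

```python
def allocated_power_w(provisioned_w: float, dc_capacity_w: float,
                      dc_fixed_w: float, device_draw_w: float,
                      marginal_overhead_per_w: float) -> dict[str, float]:
    """Split a device's allocated power into fixed and variable parts."""
    # Fixed part: share of the data centre fixed load, by provisioned power.
    fixed_w = dc_fixed_w * provisioned_w / dc_capacity_w
    # Variable part: the device draw plus the extra infrastructure power it
    # causes (assumed here to be a simple per-Watt overhead).
    variable_w = device_draw_w * (1.0 + marginal_overhead_per_w)
    return {"fixed_w": fixed_w, "variable_w": variable_w,
            "total_w": fixed_w + variable_w}


# Hypothetical step: a server provisioned 1 kW in a 1 MW facility with a
# 200 kW fixed load, currently drawing 600 W with 0.25 W overhead per Watt.
allocation = allocated_power_w(1_000.0, 1_000_000.0, 200_000.0, 600.0, 0.25)
print(allocation)
```

In a time-stepped run this calculation would be repeated at every step, since both the fixed load and the marginal overhead depend on the state of the data centre and the external temperature.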

(136) FIG. 18 shows additional nodes in the simulator to analyse power provisioning and energy allocation to determine IT device energy usage, based on the chosen simulation options. These additional nodes are a power provisioning node 1810 and an energy allocation node 1820 that calculate an IT device energy use 1830 for given simulation options 1840.

(137) Cost Allocation

(138) To provide useful output the data centre simulator reports both energy consumption and cost for each scenario. To determine the cost of each scenario the simulator includes the capital and maintenance cost of the IT device(s), the capital cost of the data centre facility and the cost of energy supplied to the data centre.

(139) FIG. 19 shows a further development of the simulator structure to include nodes to calculate an allocation of the energy and facility costs to give the IT device costs. Specifically, an allocation mechanism node 1910 (which has an input of facility capital costs 1915) and facility costs node 1920, in combination with a device energy costs node 1930 (which has an energy costs input 1940), are used to calculate IT device costs 1950.

(140) The simulator is able to model the costs of the data centre in substantial detail using the same basic structure of nodes, connections and performance data as in the energy analysis. This includes repeated accrual of partial costs where feedback loops exist.

(141) The simulator is able to ensure that all injected cost is allocated and accounted for.

(142) Arbitrary expressions of capital and operational cost characteristics may be applied to any node in the data centre; these may be related to the configuration of the device or the applied loads.

(143) The simulator is able to accrue costs to each node or applied load through the simulated system thus providing detailed and accurate analysis of the cost of delivering all or part of the data centre service.

(144) IT Device Cost

(145) The capital cost and annual maintenance cost of the IT device are entered as parameters of the scenario. The capital costs are amortised over the specified device lifetime or write-down period whilst the maintenance costs are accrued throughout the duration of the scenario at their frequency of occurrence.

(146) Facility Capital Costs

(147) The capital cost of the data centre mechanical and electrical plant is represented as a cost per Watt of data centre infrastructure. This is amortised over the stated design lifetime or write-down period of the device to provide a time sensitive cost per Watt of infrastructure and then accrued through simulation time based on the power provisioned to the IT devices.

(148) Facility Space Costs

(149) The capital cost of the remainder of the data centre building may be represented as a cost per unit of usable IT space. This may then be amortised over the stated design lifetime or write-down period of the device to provide a time sensitive cost per unit space of building and then accrued through simulation time based on the space provisioned to the IT devices.

(150) Energy Costs

(151) Energy cost data is used hourly with the device and total energy data to provide energy cost output for the device and the overall facility.

(152) Compensating for Utilisation

(153) The simulator is able to vary the allocation and accrual of energy and cost to a device based upon the accounting preferences of the user and the level of utilisation of the data centre. For example, if the amortised capital cost of the data centre infrastructure is 0.10 per Watt month and a server is allocated 1 kW, then it would accrue 100 per month in amortised infrastructure cost. If the data centre capacity is only 50% allocated, i.e. it is half empty, this may still be a valid allocation for the user, accompanied by 50% of the amortised capital cost shown as unallocated. Alternatively, the simulator can compensate for the utilisation of data centre capacity at that point in time; at 50% utilisation the server would accrue 200 for that month.
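
The two accounting options in this example can be written out as a short sketch. The function and parameter names are hypothetical; the figures are those of the example, in whatever currency unit the cost per Watt month is expressed.

```python
def monthly_infrastructure_cost(cost_per_watt_month: float, allocated_w: float,
                                utilisation: float = 1.0,
                                compensate: bool = False) -> float:
    """Amortised infrastructure cost accrued by one device in one month.

    utilisation is the allocated fraction of data centre capacity. When
    compensate is True, the cost of unallocated capacity is spread over
    the devices that are present; otherwise it is left unallocated.
    """
    if not 0.0 < utilisation <= 1.0:
        raise ValueError("utilisation must be in (0, 1]")
    base = cost_per_watt_month * allocated_w
    return base / utilisation if compensate else base


# 0.10 per Watt month, server allocated 1 kW, data centre 50% allocated.
print(monthly_infrastructure_cost(0.10, 1000.0, utilisation=0.5))
print(monthly_infrastructure_cost(0.10, 1000.0, utilisation=0.5, compensate=True))
```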

(154) More specifically, the simulator is able to allocate the IT device energy and costs in the manner described above.

(155) Additional node(s), such as the energy allocation node 2010 illustrated in FIG. 20, can be implemented in the simulator to manage this mode of cost allocation, based on inputs of IT provisioned power 2020 and data centre capacity 2030.

(156) Time in IT Simulations

(157) Whilst the data centre simulations step through a range of applied electrical loads and external temperatures, the IT simulation steps through time. This allows the simulator to ensure that the correct value(s) for each of the external variables is applied to each time step, and so to provide useful analysis of the impact of devices such as cooling economisers. These are most likely to be working overnight, when the IT workload and thus power draw may be low and the cost of power is at its minimum. Conversely, in the middle of the day, when external temperature, IT workload and power draw are highest and the cost of power is high, the economiser may not be providing any benefit.

(158) Time Steps

(159) The basic units of time used within the simulator are the day and the hour. By default the simulator steps through 24 hours for each day, using the appropriate values from the supplied data, and evaluates the state of the data centre, energy consumption and cost for each hour. The costs and energy consumption of the hours are summed to provide a set of daily values.

(160) Simulation Months

(161) The default units of time for a simulation are months; the simulator will simulate one full day of each specified type for each month of the simulation and multiply the values to achieve a total cost for the month. Multiple types of day may be specified, for example to account for variability in user workload between weekdays and weekends.
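
The hourly, daily and monthly accumulation described above can be sketched as follows. The sketch assumes that each day type is multiplied by the number of days of that type in the month; the data layout, names and figures are hypothetical.

```python
def simulate_day(energy_kwh: list[float], price_per_kwh: list[float]) -> tuple[float, float]:
    """Sum 24 hourly steps into a daily (energy, cost) pair."""
    if len(energy_kwh) != 24 or len(price_per_kwh) != 24:
        raise ValueError("expected 24 hourly values")
    daily_energy = sum(energy_kwh)
    daily_cost = sum(e * p for e, p in zip(energy_kwh, price_per_kwh))
    return daily_energy, daily_cost


def simulate_month(day_types: dict[str, tuple[list[float], list[float], int]]) -> tuple[float, float]:
    """Simulate one day of each type and scale by its count in the month."""
    month_energy = 0.0
    month_cost = 0.0
    for energy_kwh, price_per_kwh, days_in_month in day_types.values():
        daily_energy, daily_cost = simulate_day(energy_kwh, price_per_kwh)
        month_energy += daily_energy * days_in_month
        month_cost += daily_cost * days_in_month
    return month_energy, month_cost


# Hypothetical month with flat hourly profiles: 22 weekdays and 8 weekend days.
month = simulate_month({
    "weekday": ([10.0] * 24, [0.5] * 24, 22),
    "weekend": ([5.0] * 24, [0.5] * 24, 8),
})
print(month)
```

In practice the hourly lists would carry the time variant data for that month, such as external temperature driven energy use and the hourly power cost.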

(162) Time Variant Data

(163) To iterate successfully through a simulation the simulator requires data which is time variant:

(164) TABLE 3 Time variant data

Data Type                    Varies Monthly   Varies Hourly
External Temperature         Yes              Yes
Power Cost                   Yes              Yes
Total data centre capacity   Yes
M&E device provisioning      Yes
Lighting and Other loads     Yes              Yes
IT provisioned power         Yes
IT workload                  Yes              Yes
Other IT electrical load     Yes              Yes

Software Structure

(165) Embodiments of the simulator are implemented in software executable, for example, on a general purpose computer. In some embodiments, the software is executed on a server computer accessible remotely over a network via a browser interface. For example, the simulator may execute on a server accessible from a client device over the Internet using an Internet browser application installed on the client device.

(166) The software structure of an embodiment is described below, with reference to FIGS. 21 and 22.

(167) The software can be broadly broken down into five major components: the core simulator, the data formats, the charting module, the Web user interface and an alternative user access interface.

(168) User Interface

(169) A web user interface may be used to enable use of the tool without the need to download and install software onto a user machine. This UI also provides a mechanism for users to report an implemented carbon saving by reporting the two scenarios describing the saving and the assistance provided by the tool.

(170) Charting Module

(171) To provide more visually compelling graphs of the simulator output data in the web user interface, a charting module is used to produce the characteristic stacked bar charts and surface plot representations of the data (as seen in FIGS. 5 and 23 to 26 for example).

(172) Data Input/Output

(173) The simulator uses a set of data formats for input and output. There is a relatively small set of data formats which describe the specific performance of each device to the simulator node representing each device, a format for simulation output and a format for description of the data centre layout. These are provided to the simulator as XML schemas as this is a broadly recognised platform independent and portable standard.

(174) XML Interface

(175) The XML data formats are supported by input/output interfaces and interpreters.

(176) Simulator

(177) Core Engine

(178) The Open Source Core Engine is the underlying environment which allows the simulation. This implements the functional environment within which the data centre component nodes operate.

(179) Data Centre Components

(180) The Data Centre Components are a set of nodes which represent the individual data centre components in the simulation.

(181) Simulation and Results API

(182) The Simulation and Results API provides the ability to set up, execute and collect the results of a simulation. The Template Functions assist in establishing the simulation model, the Analysis Functions iterate through the parameters of the simulation, varying external variables such as workload and ambient temperature and collating the results.
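
The role of the Analysis Functions can be illustrated with a simple parameter sweep. The sketch below is not the simulator's API: the function names are hypothetical, and the facility model is a deliberately crude placeholder standing in for a full evaluation of the node network.

```python
from itertools import product


def facility_power_kw(it_load_kw: float, ambient_c: float) -> float:
    """Placeholder facility model (assumed, not from the source)."""
    fixed_kw = 50.0
    # Assumed: cooling overhead is lower when the external air is cool.
    cooling_factor = 0.2 if ambient_c <= 15.0 else 0.4
    return fixed_kw + it_load_kw * (1.0 + cooling_factor)


def sweep(model, it_loads_kw, ambient_temps_c):
    """Evaluate the model at every combination of inputs and collate results."""
    results = []
    for it_load_kw, ambient_c in product(it_loads_kw, ambient_temps_c):
        total_kw = model(it_load_kw, ambient_c)
        results.append({
            "it_load_kw": it_load_kw,
            "ambient_c": ambient_c,
            "facility_kw": total_kw,
            "dcie": it_load_kw / total_kw,  # IT power over total facility power
        })
    return results


table = sweep(facility_power_kw, it_loads_kw=[100.0, 400.0], ambient_temps_c=[10.0, 30.0])
for row in table:
    print(row)
```

Each row of the collated output corresponds to one applied input state, which is the form of data a surface plot of efficiency against load and temperature would be drawn from.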

(183) Alternative User Interface (FIG. 22)

(184) As an alternative to the web UI, but still enabling users to interact with and receive results from the simulator in an effective and predictable manner, an XML interface may be provided to take the place of the calls made by the web UI.

(185) Data Input/Output

(186) A full set of data input and output XML formats can be made available. This can take the place, for example, of form entered data in the Web UI. The input/output interface is expanded from the web UI version to handle all of these formats. It may be a superset of the web UI capability.

(187) Constructor Data

(188) The data centre logical layout within the simulation is represented by a constructor. This carries the information required to create and connect the Data Centre Components within the Core Engine for simulation. This is a complex process which is supported by a specific XML data format representing the layout. This data format is interpreted by the XML Meta Language Interpreter. The simulator can employ a simplified logical layout of the facility (i.e. not incorporating the full complexity of the complete M&E installation). Indeed, the simulation can be implemented with anything from a single node to every component of the data centre dependent on the requirements for the simulation.
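
How a constructor might turn a layout description into nodes and connections can be sketched with Python's standard XML parser. The element and attribute names below are invented for illustration and do not reproduce the simulator's actual XML data format; the layout is a simplified one with three nodes.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout format: nodes plus power and thermal connections.
LAYOUT_XML = """
<datacentre>
  <node id="it" type="ITLoad"/>
  <node id="ups" type="UPS"/>
  <node id="crac" type="CRAC"/>
  <connection kind="power" from="it" to="ups"/>
  <connection kind="thermal" from="it" to="crac"/>
  <connection kind="power" from="crac" to="ups"/>
</datacentre>
"""


def build_layout(xml_text: str):
    """Parse a layout into a node table and two connection lists."""
    root = ET.fromstring(xml_text.strip())
    nodes = {n.get("id"): n.get("type") for n in root.findall("node")}
    connections = {"power": [], "thermal": []}
    for c in root.findall("connection"):
        kind, source, target = c.get("kind"), c.get("from"), c.get("to")
        if kind not in connections:
            raise ValueError(f"unknown connection kind: {kind}")
        if source not in nodes or target not in nodes:
            raise ValueError(f"connection refers to unknown node: {source} -> {target}")
        # Each connection links an output of one node to the input of another.
        connections[kind].append((source, target))
    return nodes, connections


nodes, connections = build_layout(LAYOUT_XML)
print(nodes)
print(connections)
```

Keeping the power and thermal connections in separate lists mirrors the two sets of connections in the logical representation, and the same parser would accept anything from a single node to a detailed plant layout.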

(189) Applications of the Simulator

(190) The simulator can be put to use in many and various applications, some examples of which will already be apparent from the discussion above. Some other possible applications are noted below.

(191) Determining Energy and Cost Impact of Logical Devices

(192) The simulator is capable, through system level simulation, of determining the energy or cost impact of logical devices in the data centre as well as physical devices.

(193) For example, it is not possible to install a power meter for a virtual server but it is possible to simulate the load on the physical server to determine the impact of the virtual server and thus the accrued impact at data centre level.

(194) What If Analysis

(195) The simulator is able to perform a very broad range of what if analysis.

(196) Output data from such what if simulations can be used to determine:

(197) Likely returns on capital investments;

(198) Service delivery costs and their relation to service revenue;

(199) Sensitivity to external factors such as energy cost; and

(200) Optimal strategies for capacity build out and customer pricing.

(201) System Level Analysis of Capacity

(202) The simulator is able to effectively load test data centre designs before or after construction to validate the provisioned and actual device capacity of the data centre under a range of operational modes including degraded operating modes testing system redundancy.

(203) This can be used to analyse both worst case scenarios and to provide capacity curves against other variables for the data centre. A facility may well be able to support a greater IT electrical load at a lower external temperature than its design rating. Dependent upon the operational approach it may be appropriate for the operator to exploit this capacity.

(204) Operational Decision Support

(205) The simulator is well suited to operational decision support in situations such as: whether to shut down plant equipment whose capacity is not currently required; where and when to place a device or workload in a data centre or group of data centres; and what price(s) to accept for services dependent upon the marginal cost of delivery.
Billing

(206) The level of analysis provided by the simulator allows for effective allocation and charge back of workload, device, device group, area or whole data centre costs.

(207) Multi Party Analysis

(208) The simulator facilitates the analysis of data centre energy and cost performance with masking of detailed data where there are multiple parties involved. For example, a data centre operator providing service to an IT equipment operator. In this case the simulator could be used to determine the financial cost and revenue to the data centre operator whilst only showing the IT equipment operator the revenue and allocated utility energy for carbon accounting purposes.

(209) Early Stage Evaluation of Technology or Products

(210) The simulator can be used to evaluate early stage technology at a pre prototype phase. A number of technology development scenarios can be tested against a number of operating data centre scenarios to assess the overall benefits available from the technology. This allows for substantial time and cost acceleration of the technology through the disposal of options which had been considered to be promising prior to systems level analysis.

(211) Scenario Comparison

(212) Having created a data centre and executed an IT simulation within that data centre the simulator can be used to perform scenario comparison.

(213) One example of such a comparison, to illustrate the principle, is a pre/post virtualisation comparison. When virtualising there is frequently a requirement to forecast the business case to justify the change in policy or the capital cost. This can be difficult as the consolidation ratio is not an effective proxy for the cost saving due to: Increased capital cost of the higher spec servers used for virtualisation; Higher per server power consumption of the higher spec servers; Higher per server power consumption of the servers due to higher workloads, particularly when comparing new, Energy Star compliant devices; Higher per server amortised capital cost of the data centre power and cooling infrastructure; Possible changes in utility power cost; and Possible changes in utilisation of the data centre.

(214) The data centre simulator is able to take all of these variables into account and provide an effective forecast of the benefits of a virtualisation program.

(215) The first step in the comparison is to create a pre virtualisation scenario as a baseline for comparison. In this example the company plans to deploy a further 100 commodity 1U servers under the existing one application per server policy. The comparison will be over a 4 year period.

(216) The simulation can then be run and the cost and energy outputs for the 4 year simulation viewed. FIG. 23 shows exemplary results.

(217) The next step is to create the post virtualisation scenario for comparison of cost and energy. Our consolidation will be from the 100 commodity 1U servers down to 15 commodity 4U servers which are of higher specification and cost.

(218) The simulation can then be run again and the cost and energy outputs for the 4 year simulation viewed, this time based on the post virtualisation scenario. FIG. 24 shows exemplary results.

(219) The substantial reduction in overall cost and energy consumption is clear from comparison of the pre virtualisation graphs in FIG. 23 with the post virtualisation results in FIG. 24.

(220) While it is clear that the post virtualisation scenario offers savings compared to the pre virtualisation scenario it is useful to be able to directly compare the cost and energy consumption of the two scenarios. FIG. 25 shows a side by side comparison of the overall IT device(s) cost and energy use. This comparison shows more directly the difference in the scenario output graphs.

(221) The comparisons described so far have been the energy and cost allocated to the IT device. However, a key comparison for the creation of a business case is the impact on the overall energy use and cost of the data centre.

(222) The graphs in FIG. 26 show the overall costs and energy use of the whole data centre over the simulation period. The Amortised Data Centre Capital Cost is the full amortised cost for the facility rather than the part allocated to the IT devices. The Other Energy segment on the bar chart represents all of the data centre energy use not allocated to the simulated IT devices.

(223) While the invention has been described in conjunction with exemplary embodiments, many equivalent modifications and variations will be apparent to those skilled in the art when given this disclosure. Accordingly, the exemplary embodiments of the invention set forth above are considered to be illustrative and not limiting. Various changes to the described embodiments may be made without departing from the spirit and scope of the invention.