Combined sort and aggregation
11169999 · 2021-11-09
Assignee
Inventors
Cpc classification
International classification
Abstract
Innovative techniques are disclosed for performing a combined sort and aggregation operation involving eagerly performing aggregation while sorting. The techniques described herein enable an aggregation and a group-by operation to be performed using an amount of system memory that is far less than the system memory required to store all the data records being processed while minimizing the need to spill data to disk. This combined sort and aggregation operation has better performance than conventional techniques since system memory is used more efficiently. In certain embodiments, a combined sort and aggregation operation is disclosed that enables the efficient sorting and aggregation of data records when the desired aggregation function is composable (such as SUM, COUNT, MIN and MAX aggregate operations).
Claims
1. A method comprising: receiving, by a data processing system comprising one or more processors and associated memory, a request to apply an aggregate function upon input data records for a particular field according to a group-by clause, wherein the group-by clause is based upon a category or key, and wherein the category or key is a value in another field of the input data records; reading, by the data processing system, a record of the input data records; creating, by the data processing system, a first run in a first portion of the memory for the record; determining, by the data processing system, the first run can be merged with a second run in a second portion of the memory based on both the first run and the second run comprising data records with a same count of the category or key; in response to determining the first run can be merged with the second run, merging, by eagerly performing aggregation while sorting, the first run with the second run to create a third run; releasing, by the data processing system, the first portion of the memory occupied by the first run and the second portion of the memory occupied by the second run; determining, by the data processing system, no existing runs in the memory can be merged based on the count of the category or key; determining, by the data processing system, whether a number of the existing runs in the memory is greater than one, wherein the existing runs include the third run; in response to determining the number of existing runs in the memory is not greater than one, outputting, by the data processing system, a final aggregate result from the third run, wherein the final aggregate result represents final aggregate values for different categories or keys for the input data records; and in response to determining the number of existing runs in the memory is greater than one, merging, by the data processing system, all existing runs irrespective of the count of the category or key into a single run, and outputting, by the data processing system, the final aggregate result from the single run, wherein the final aggregate result represents the final aggregate values for different categories or keys for the input data records.
2. The method of claim 1, wherein the aggregate function is a composable aggregate function.
3. The method of claim 1, wherein the aggregate function is a SUM function or a COUNT function.
4. The method of claim 1, wherein the aggregate function is a MIN function or a MAX function.
5. The method of claim 1, wherein the eagerly performing aggregation while sorting comprises: (i) determining an aggregate value for each value in the category or key for the first run and the second run; (ii) storing the aggregate value in the third run, and (iii) sorting records of the third run based on the aggregate value determined for the category or key.
6. The method of claim 1, further comprising determining, by the data processing system, all records of the input data records are processed, wherein the determining whether the number of the existing runs in the memory is greater than one is performed in response to determining all the records of the input data records are processed.
7. A non-transitory computer-readable medium containing instructions that, when executed by a processor, causes the processor to: receive a request to apply an aggregate function upon input data records for a particular field according to a group-by clause, wherein the group-by clause is based upon a category or key, and wherein the category or key is a value in another field of the input data records; read a record of the input data records; create a first run in a first portion of a memory for the record; determine the first run can be merged with a second run in a second portion of the memory based on both the first run and the second run comprising data records with a same count of the category or key; in response to determining the first run can be merged with the second run, merge, by eagerly performing aggregation while sorting, the first run with the second run to create a third run; release the first portion of the memory occupied by the first run and the second portion of the memory occupied by the second run; determine no existing runs in the memory can be merged based on the count of the category or key; determine whether a number of the existing runs in the memory is greater than one, wherein the existing runs include the third run; in response to determining the number of existing runs in the memory is not greater than one, output a final aggregate result from the third run, wherein the final aggregate result represents final aggregate values for different categories or keys for the input data records; and in response to determining the number of existing runs in the memory is greater than one, merge all existing runs irrespective of the count of the category or key into a single run, and output the final aggregate result from the single run, wherein the final aggregate result represents the final aggregate values for different categories or keys for the input data records.
8. The non-transitory computer-readable medium of claim 7, wherein the aggregate function is a composable aggregate function.
9. The non-transitory computer-readable medium of claim 7, wherein the aggregate function is a SUM function or a COUNT function.
10. The non-transitory computer readable medium of claim 7, wherein the aggregate function is a MIN function or a MAX function.
11. The non-transitory computer readable medium of claim 7, wherein the eagerly performing aggregation while sorting comprises: (i) determining an aggregate value for each value in the category or key for the first run and the second run; (ii) storing the aggregate value in the third run, and (iii) sorting records of the third run based on the aggregate value determined for the category or key.
12. The non-transitory computer readable medium of claim 7, wherein the processor is further caused to determine all records of the input data records are processed, wherein the determining whether the number of the existing runs in the memory is greater than one is performed in response to determining all the records of the input data records are processed.
13. A data processing system comprising: one or more processors and associated memory, wherein the one or more processors are configured to: receive a request to apply an aggregate function upon input data records for a particular field according to a group-by clause, wherein the group-by clause is based upon a category or key, and wherein the category or key is a value in another field of the input data records; read a record of the input data records; create a first run in a first portion of the memory for the record; determine the first run can be merged with a second run in a second portion of the memory based on both the first run and the second run comprising data records with a same count of the category or key; in response to determining the first run can be merged with the second run, merge, by eagerly performing aggregation while sorting, the first run with the second run to create a third run; release the first portion of the memory occupied by the first run and the second portion of the memory occupied by the second run; determine no existing runs in the memory can be merged based on the count of the category or key; determine whether a number of the existing runs in the memory is greater than one, wherein the existing runs include the third run; in response to determining the number of existing runs in the memory is not greater than one, output a final aggregate result from the third run, wherein the final aggregate result represents final aggregate values for different categories or keys for the input data records; and in response to determining the number of existing runs in the memory is greater than one, merge all existing runs irrespective of the count of the category or key into a single run, and output the final aggregate result from the single run, wherein the final aggregate result represents the final aggregate values for different categories or keys for the input data records.
14. The data processing system of claim 13, wherein the aggregate function is a composable aggregate function.
15. The data processing system of claim 13, wherein the aggregate function is a SUM function or a COUNT function.
16. The data processing system of claim 13, wherein the aggregate function is a MIN function or a MAX function.
17. The data processing system of claim 13, wherein the eagerly performing aggregation while sorting comprises: (i) determining an aggregate value for each value in the category or key for the first run and the second run; (ii) storing the aggregate value in the third run, and (iii) sorting records of the third run based on the aggregate value determined for the category or key.
18. The data processing system of claim 13, wherein the one or more processor are further configured to determine all records of the input data records are processed, wherein the determining whether the number of the existing runs in the memory is greater than one is performed in response to determining all the records of the input data records are processed.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1)
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
DETAILED DESCRIPTION
(10) In the following description, for the purposes of explanation, specific details are set forth in order to provide a thorough understanding of embodiments of the invention. However, it will be apparent that various embodiments may be practiced without these specific details. The figures and description are not intended to be restrictive. The word “exemplary” is used herein to mean “serving as an example, instance, or illustration.” Any embodiment or design described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other embodiments or designs.
(11) Systems depicted in some of the figures may be provided in various configurations. In certain embodiments, the systems may be configured as a distributed system where one or more components of the system are distributed across one or more networks in a cloud computing system. In certain embodiments, the systems may be configured to operate in virtual or non-virtual environments.
(12) As an example, consider an aggregation operation that is requested to be performed on data records. However, this aggregation operation may require sorting to also be performed. For instance, if the aggregation is to be performed on data records belonging to certain categories (e.g., having a certain parameter), then the data records may have to be sorted based on category.
(13) For instance, there may be a total of N input data records being processed. An aggregate function AF may be performed that involves a “group by key/category K” clause, where K can take one of the following values {K1, K2, . . . , Km}. Thus, data records for K1 will be aggregated, data records for K2 will be aggregated, and so forth. An example query demonstrating this scenario would be: Select Aggregate_Func(Col) from Table T where < > group by (K).
(14) In traditional approaches for sort and aggregation, all of the data records would be transferred to memory to be fully sorted to group the data records by key/category K (e.g., by the group-by key), followed by a full scan of the sorted data to perform the aggregation. However, this approach can be inefficient when there is a large number of data records, since the cost of performing the full sort is high. In most cases, the data does not fit into the memory available on the computing machine, and sorted batches of data records have to be written out to disk (e.g., external storage) before finally merging them to reconstruct the fully sorted result. Since it is slower to write and read from disk than memory, the additional time required can be substantial.
(15) The performance and efficiency of sort and aggregate operations being applied to the data records is generally constrained by the computing resources available for performing these operations, and further by how the sort and aggregate operations are designed to utilize those computing resources. In the traditional approach, the requirement that the data records are fully sorted (e.g., to group the data records based on certain criteria) before aggregation can be performed (e.g., on the sorted groups of data records) results in the speed at which a data processing system sorts and aggregates data records to heavily depend on the underlying computing resources available to a data processing system.
(16) However, changing the algorithm behind the sort and aggregation can change the speed at which a data processing system sorts and aggregates data records, reducing the dependence on the underlying computing resources available to a data processing system. The present disclosure relates generally to techniques for a combined sort and aggregation operation used in the processing of data records, and more particularly to a combined sort and aggregation operation that enables the efficient sorting and aggregation of data records when the desired aggregation function is composable (such as SUM, COUNT, MIN and MAX). Various inventive embodiments are described herein, including methods, systems, non-transitory computer-readable storage media storing programs, code, or instructions executable by one or more processors, and the like.
(17) The operation to be performed may involve applying an aggregate function “AF” to a set of data records (“N” records) and grouping the results by a group-by key identified in a group-by clause. The AF function may be applied to a particular field of the input data records. The function “AF” may be a composable function such as SUM function, COUNT function, MIN function, MAX function, and the like. The group-by key (“K”) specified in the group-by clause may identify another specific field of the data records, where the specific field may take one of a set of values or categories {K1, K2, . . . Km}. In one example, the operation to be performed may be specified in a query such as:
(18) Select AF(Field_to_be_aggregated) from Table T where < > group by (K).
(19) The result for the above query includes aggregated results determined for each value of K.
(20) The combined sort and aggregation operation embodiments described in this disclosure provide better performance than conventional techniques requiring a full sort of all the data records to be processed followed by application of the aggregate function. Such prior art techniques require at least an amount of system memory that is at least equal to the memory needed to store all the data records being processed. Per the embodiments described in the disclosure, the amount of memory needed to perform the aggregate and group-by operation is significantly less than the amount of memory needed to store all of the data records (e.g., the total size of all the data records). For example, in a traditional approach, for N data records being processed, all the N data records are loaded into system memory, then sorted, and then aggregated. Thus, the memory requirement would be equal to the total size (e.g., number of data records*size of data records) of the data records. By performing the sort and aggregation eagerly and incrementally as disclosed herein, the amount of system memory needed is a lot less than the total size of all the data records.
(21)
(22) As depicted in
(23) Information related to an operating system and applications or processes executed by processors 102 may be loaded in system memory 110. For example, one or more applications and processes executed by data processing system 100 may be loaded into system memory 110. For example, as depicted in
(24) Processor(s) 102 may be configured to execute or run instructions (e.g., code, code objects) for implementing the functions performed by data processing system 100. These functions may include database-related tasks, sorting functions, and the like. Processors 102 may include single core processors or multicore processors. Processors 102 may execute one or more virtual machines.
(25) As depicted in
(26)
(27) At block 202, the processing may be initiated when a data processing system receives a request to apply an aggregate function to multiple data records and to group results by values of a group-by key in a particular field of the of the data records (e.g., key or category K). In certain embodiments, the data processing system will carry out the aggregate function and determine a final result by incrementally loading data records to be processed in memory at a time and eagerly performing aggregation while sorting to determine the final result. This is performed using an amount of memory that is less than an amount of memory needed for storing all the data records being processed. For example, the system memory needed for performing the operation is less than the memory needed for storing all the data records being processed. All the data records being processed may not even fit in the system memory at a time.
(28) The processing may be performed by breaking and storing the data records into runs in memory. At block 204, the data processing system may divide the data records into runs in memory. For example, for sake of simplicity, a first run of records may be stored in memory comprising a first subset of records, and a second run of records may be stored in memory comprising a second subset of records. The total number of records in the first and second runs will be less than the plurality of records (e.g., there will be remaining records in the plurality of records that still need to be processed).
(29) For example, as shown in
(30) At block 206, the records in each run are processed by eagerly performing aggregation while sorting. For example, if there are two runs, the records in each run will be processed by eagerly aggregating while sorting. For each run, the processing in 206 may include, at 208, sorting the records in the run based on the values in the records of the particular group-by key field. For example, if the group-by key is the State field, then the records will be sorted by the values that the State field takes, e.g., California, Oregon, Washington. For example, as depicted in
(31) Further, as part of 206, after the sorting, at 210, the sorted data within each run is scanned and aggregated by the group-by keys. For example, as depicted in
(32) At block 212, the data processing system will merge the existing runs in memory (e.g., a first run and a second run), whose records have been sorted and aggregated, in pair-wise manner (i.e., two at a time) to create a new merged run. In doing this, the data processing system will aggregate the values in the runs being merged based upon the group-by key value. For example, as shown in
(33) The processing depicted in blocks 204, 206 (including 208 and 210), and 212 may be repeated for new data records received until all the data records have been processed. In certain embodiments, if the entire data set does not fit into system memory or if there are runs stored to disk, a run from system memory may be written to disk before proceeding with processing of new data. The new data may then be processed according to 204, 206 (including 208 and 210), and 212.
(34) At block 214, after determining that all the data records have been processed (i.e., there is no further data to process), all the sorted and aggregated and/or merged runs (some of which may be spilled to disk) are merged until there is a single merged run representing the final result.
(35) At block 216, the data processing system may output the result from the single remaining run in memory as a result response to the request received in 202. The single remaining run in memory contains the aggregate result of processing all the data records according to the group-by key.
(36) The approach depicted in
(37)
(38) The processing depicted in
(39) Select AggregateFunc(F) from Table T where < > group by (K).
(40) Examples of “AggregateFunc( ): MAX, MIN, SUM, COUNT
(41) At block 302, the data processing system will read a single input data record or receive a data record to be processed as input. In some embodiments, the data record may be read from or received from an input run of data records.
(42) At block 304, the data processing system will create a new run in memory for that read input data record.
(43) At block 306, the data processing system will determine if any two existing runs in memory can be merged. In certain embodiments, two runs are eligible for merging if they both contain data records with the same count of category K. For example, a run that has data records belonging to two different categories (e.g., two different values in the particular field specified in the group-by clause) associated with the group by key will have a count of two. If there is another run in memory that also has data records belonging to two different categories, then the two runs are eligible to be merged. This will further be explained in the example described below. If this is the first time going through the flowchart, then the data processing system will have just added the first run to memory, so there will not be any merging.
(44) If it is determined in block 306 that there exists at least two runs that can be merged, then in block 310 the two runs are merged to create a single new run by eagerly performing aggregation while sorting. As part of the processing performed in 310, the records within each run are sorted, and then at block 312, the data processing system creates a new merged run to store the results of the merge operation. As part of the processing performed in 310, an aggregate value for each category/value in field K is determined based upon the two runs being merged by eagerly aggregating while sorting. The aggregate value calculated for each value of K is then stored in the new run created for storing the results of the merging. The records are also sorted by the category or value of K in the merged run. Accordingly, in order to merge two runs, for each category K among all of the data records of the two runs, the data processing system will determine the aggregate value for all the data records for that category of K (in the two runs) by eagerly aggregating and sorting.
(45) Then, at block 314, the data processing system will free up the memory for the two runs that were merged, thereby releasing the memory occupied by the runs. Processing then continues with 306, wherein, another determination is made if there are any two runs in memory that can be merged. In this manner, the data processing system continues to merge runs in memory until there are no two existing runs that can be merged.
(46) Referring back to 306, if it is determined in 306 there are no two runs existing in memory that can be merged, then at block 316, the data processing system will determine if all the input data records have been processed. If it is determined in 316 that all the records have not been processed, then processing continues with block 318 where the data processing system reads the next input record, and then processing continues with 304, as described above.
(47) If it is determined in block 316 that all the input data records have been processed, then at block 320, the data processing system checks if the number of existing runs in memory is greater than one. If it is determined in 320 that there are multiple runs existing in memory, then at block 322, the data processing system will merge all the existing runs in memory until there is only one run remaining. The merging in 322 is performed irrespective of the count of K in the runs.
(48) Once there is only one run in memory, either after performing the processing in 322 or upon determining that there is only one run in memory in 320, at block 324 the data processing will output the final aggregate results from the one remaining run. The aggregate values in this final run represent the final aggregate values for the different categories of K for the input records that were processed.
(49) The processing performed according to the flowchart in
(50) Select SUM(Field #1) from Table T where < > group by STATE field.
(51) Thus, in this query is an aggregate function: SUM to be performed on values of Field #1.
(52) For this example, it is assumed that the records are received or processed in the following order: Washington: 250; California: 1000 Oregon: 500 California: 500 Washington: 150 Oregon: 200 Oregon: 100 California: 800 Oregon: 500 California: 500 and Washington: 250. It is assumed that the STATE field in the records can take the following values={California, Oregon, Washington}. The sort order is California, Oregon, Washington. Each record may have multiple fields in addition to field containing the values to be aggregated (i.e., Field #1) and the STATE field. In certain embodiments, as part of the processing, only the two fields may be stored for each record and the other fields ignored.
(53) TABLE-US-00001 TABLE A Example for flowchart depicted in FIG. 3 State of Runs before State of Rows after performing Action Action performed performing action — Read Washington: 250 and RUN1: Washington: 250 create new run for the read record. RUN1: Washington: 250 No runs that can be merged, RUN1: Washington: 250 so read next record. RUN1: Washington: 250 Read California: 1000 and RUN1: Washington: 250 create new run for the read RUN2: California: 1000 record. RUN1: Washington: 250 Since the counts for RUN1 MR1: California: 1000 RUN2: California: 1000 (cnt = 1 .fwdarw. Washington) and Washington: 250 RUN2 (cnt = 1 .fwdarw. California) are the same, merge the runs to create a new run (MR1), where the merging includes eagerly aggregating while sorting. Release RUN1 and RUN2. MR1: California: 1000 Read Oregon: 500 and create MR1: California: 1000 Washington: 250 new run for the read record. Washington: 250 RUN1: Oregon: 500 MR1: California: 1000 Since the counts for MR1 MR1: California: 1000 Washington: 250 (cnt = 2 .fwdarw. California, Washington: 250 RUN1: Oregon: 500 Washington) and RUN1 RUN1: Oregon: 500 (cnt = 1 .fwdarw. Oregon) are NOT the same, do not merge the runs. MR1: California: 1000 Read California: 500 and MR1: California: 1000 Washington: 250 create new run for the read Washington: 250 RUN1: Oregon: 500 record. RUN1: Oregon: 500 RUN2: California: 500 MR1: California: 1000 Since the counts for RUN1 MR1: California: 1000 Washington: 250 (cnt = 1 .fwdarw. Oregon) and Washington: 250 RUN1: Oregon: 500 RUN2 (cnt = 1 .fwdarw. California) MR2: California: 500 RUN2: California: 500 are the same, merge the runs Oregon: 500 to create a new run (MR2), where the merging includes eagerly aggregating while sorting. Release RUN1 and RUN2. MR1: California: 1000 Since the counts for MR1 (cnt = MR3: California: 1500 Washington: 250 2 .fwdarw. California, Oregon: 500 MR2: California: 500 Washington) and MR2 (cnt = Washington: 250 Oregon: 500 2 .fwdarw. California, Oregon) are the same, merge the runs to create a new run (MR3), where the merging includes eagerly aggregating while sorting. Release MR1 and MR2. MR3: California: 1500 Read Washington: 150 and MR3: California: 1500 Oregon: 500 create new run for the read Oregon: 500 Washington: 250 record. Washington: 250 RUN1: Washington: 150 MR3: California: 1500 Since the counts for MR3 MR3: California: 1500 Oregon: 500 (cnt = 3 .fwdarw. California, Oregon, Oregon: 500 Washington: 250 Washington) and RUN1 Washington: 250 RUN1: Washington: 150 (cnt = 1 .fwdarw. Washington) are RUN1: Washington: 150 NOT the same, do not merge the runs. MR3: California: 1500 Read Oregon: 200 and create MR3: California: 1500 Oregon: 500 new run for the read record. Oregon: 500 Washington: 250 Washington: 250 RUN1: Washington: 150 RUN1: Washington: 150 RUN2: Oregon: 200 MR3: California: 1500 Since the counts for RUN1 MR3: California: 1500 Oregon: 500 (cnt = 1 .fwdarw. Washington) and Oregon: 500 Washington: 250 RUN2 (cnt = 1 .fwdarw. Oregon) Washington: 250 RUN1: Washington: 150 are the same, merge the runs MR4: Oregon: 200 RUN2: Oregon: 200 to create a new run (MR4), Washington: 150 where the merging includes eagerly aggregating while sorting. Release RUN1 and RUN2. MR3: California: 1500 Since the counts for MR3 MR3: California: 1500 Oregon: 500 (cnt = 3 .fwdarw. California, Oregon, Oregon: 500 Washington: 250 Washington) and RUN1 Washington: 250 MR4: Oregon: 200 (cnt = 1 .fwdarw. Washington) are MR4: Oregon: 200 Washington: 150 NOT the same, do not merge Washington: 150 the runs. MR3: California: 1500 Read Oregon: 100 and create MR3: California: 1500 Oregon: 500 new run for the read record. Oregon: 500 Washington: 250 Washington: 250 MR4: Oregon: 200 MR4: Oregon: 200 Washington: 150 Washington: 150 RUN1: Oregon: 100 MR3: California: 1500 Since the counts for MR3 MR3: California: 1500 Oregon: 500 (cnt = 3 .fwdarw. California, Oregon, Oregon: 500 Washington: 250 Washington), MR4 (cnt = 2 .fwdarw. Washington: 250 MR4: Oregon: 200 Oregon, Washington) and MR4: Oregon: 200 Washington: 150 RUN1 (cnt = 1 .fwdarw. Oregon) are Washington: 150 RUN1: Oregon: 100 NOT the same, do not merge RUN1: Oregon: 100 the runs. MR3: California: 1500 Read California: 800 and MR3: California: 1500 Oregon: 500 create new run for the read Oregon: 500 Washington: 250 record. Washington: 250 MR4: Oregon: 200 MR4: Oregon: 200 Washington: 150 Washington: 150 RUN1: Oregon: 100 RUN1: Oregon: 100 RUN2: California: 800 MR3: California: 1500 Since the counts for RUN1 MR3: California: 1500 Oregon: 500 (cnt = 1 .fwdarw. Oregon) and Oregon: 500 Washington: 250 RUN2 (cnt = 1 .fwdarw. California) Washington: 250 MR4: Oregon: 200 are the same, merge the runs MR4: Oregon: 200 Washington: 150 to create a new run (MR5), Washington: 150 RUN1: Oregon: 100 where the merging includes MR5: California: 800 RUN2: California: 800 eagerly aggregating while Oregon: 100 sorting. Release RUN1 and RUN2. MR3: California: 1500 Since the counts for MR4 (cnt = MR3: California: 1500 Oregon: 500 2 .fwdarw. Oregon, Washington) Oregon: 500 Washington: 250 and MR5 (cnt = 2 .fwdarw. Washington: 250 MR4: Oregon: 200 California, Oregon) are the MR6: California: 800 Washington: 150 same, merge the runs to Oregon: 300 MR5: California: 800 create a new run (MR6), Washington: 150 Oregon: 100 where the merging includes eagerly aggregating while sorting. Release MR4 and MR5. MR3: California: 1500 Since the counts for MR3 (cnt = MR7: California: 2300 Oregon: 500 3 .fwdarw. California, Oregon, Oregon: 800 Washington: 250 Washington) and MR6 (cnt = Washington: 400 MR6: California: 800 3 .fwdarw. California, Oregon, Oregon: 300 Washington) are the same, Washington: 150 merge the runs to create a new run (MR7), where the merging includes eagerly aggregating while sorting. Release MR3 and MR6. MR7: California: 2300 Read Oregon: 500 and create MR7: California: 2300 Oregon: 800 new run for the read record. Oregon: 800 Washington: 400 Washington: 400 RUN1: Oregon: 500 MR7: California: 2300 Since the counts for MR7 MR7: California: 2300 Oregon: 800 (cnt = 3 .fwdarw. California, Oregon, Oregon: 800 Washington: 400 Washington) and RUN1 Washington: 400 RUN1: Oregon: 500 (cnt = 1 .fwdarw. Oregon) are NOT RUN1: Oregon: 500 the same, do not merge the runs. MR7: California: 2300 Read California: 500 and MR7: California: 2300 Oregon: 800 create new run for the read Oregon: 800 Washington: 400 record. Washington: 400 RUN1: Oregon: 500 RUN1: Oregon: 500 RUN2: California: 500 MR7: California: 2300 Since the counts for RUN1 MR7: California: 2300 Oregon: 800 (cnt = 1 .fwdarw. Oregon) and Oregon: 800 Washington: 400 RUN2 (cnt = 1 .fwdarw. California) Washington: 400 RUN1: Oregon: 500 are the same, merge the runs MR8: California: 500 RUN2: California: 500 to create a new run (MR8), Oregon: 500 where the merging includes eagerly aggregating while sorting. Release RUN1 and RUN2. MR7: California: 2300 Since the counts for MR7 MR7: California: 2300 Oregon: 800 (cnt = 3 .fwdarw. California, Oregon, Oregon: 800 Washington: 400 Washington) and MR8 (cnt = Washington: 400 MR8: California: 500 2 .fwdarw. California, Oregon) are MR8: California: 500 Oregon: 500 NOT the same, do not merge Oregon: 500 the runs. MR7: California: 2300 Read Washington: 250 and MR7: California: 2300 Oregon: 800 create new run for the read Oregon: 800 Washington: 400 record. Washington: 400 MR8: California: 500 MR8: California: 500 Oregon: 500 Oregon: 500 RUN1: Washington: 250 MR7: California: 2300 Since the counts for MR7 MR7: California: 2300 Oregon: 800 (cnt = 3 .fwdarw. California, Oregon, Oregon: 800 Washington: 400 Washington), MR8 (cnt = 2 .fwdarw. Washington: 400 MR8: California: 500 California, Oregon) and MR8: California: 500 Oregon: 500 RUN1 (cnt = 1 .fwdarw. Oregon: 500 RUN1: Washington: 250 Washington) are NOT the RUN1: Washington: 250 same, do not merge the runs. All records processed (i.e., no more records to read. MR7: California: 2300 Start merging existing runs in MR7: California: 2300 Oregon: 800 memory until there is only Oregon: 800 Washington: 400 one remaining run. Merge Washington: 400 MR8: California: 500 MR8 and RUN1, to create a MR9: California: 500 Oregon: 500 new run (MR9), where the Oregon: 500 RUN1: Washington: 250 merging includes eagerly Washington: 250 aggregating while sorting. Release MR8 and RUN1. MR7: California: 2300 Merge MR7 and MR9, to MR10: California: 2800 Oregon: 800 create a new run (MR10), Oregon: 1300 Washington: 400 where the merging includes Washington: 650 MR9: California: 500 eagerly aggregating while Oregon: 500 sorting. Release MR7 and Washington: 250 MR9. MR10: California: 2800 Since MR10 is the last MR10: California: 2800 Oregon: 1300 remaining run and no more Oregon: 1300 Washington: 650 records to process, output the Washington: 650 aggregate values from MR10 (California: 2800; Oregon: 1300; Washington: 650) as the result of the query.
(54) At the beginning, the data processing system will not have any runs in memory. The data processing system will read the set of records, record-by-record. For instance, the data processing system will first read (e.g., at block 302) the record of Washington:250 and create a new run (e.g., RUN1) for the read record (e.g., at block 304). At this point RUN1 will contain Washington:250.
(55) The data processing system may check to see if any runs can be merged (e.g., at block 306). Since there is only one run in memory, no runs can be merged (e.g., go to block 316). The data processing system may determine that not all the records have been processed, which causes the data processing system to go on to read the next record (e.g., block 318), which is California:1000. The data processing system will create a new run (e.g., RUN2) for this record, and RUN2 will contain California:1000 (e.g., block 304). This will result in two runs in memory, RUN1 and RUN2.
(56) At this point, the data processing system will check to see if two existing runs in memory can be merged. Since the memory only contains RUN1 and RUN2, the data processing system will check to see if the counts for RUN1 and RUN2 are the same (e.g., block 306). RUN1 only contains records of one category (e.g., Washington), so the count is one. Similarly, RUN2 only contains records of one category (e.g., California), so the count is also one. Since the runs have the same count, they can be merged. The data processing system merges the runs (e.g., block 310) to create a new merged run (e.g., MR1). This merging includes eagerly aggregating while sorting, and results in MR1 (e.g., block 312) containing California:1000 and Washington:250. The memory for RUN1 and RUN2 is released since it is no longer needed (e.g., block 314).
(57) At this point, the data processing system will check to see if there are any runs that can be merged (e.g., go back to block 306). There is only one run (MR1) in memory, so the data processing system will check to see if all records are processed (e.g., block 316). Since there are still records left, the system will read the next record, Oregon:500, (e.g., block 318) and create a new run (e.g., RUN1) for the read record (e.g., block 304). This results in the memory having two runs, MR1 and RUN1 containing Oregon:500.
(58) At this point, the data processing system will check to see if there are runs that can be merged (e.g., block 306). To do this, the data processing system will compare the counts for MR1 and RUN1 to see if they are the same. MR1 contains records of two categories (e.g., California and Washington), so the count is two. RUN1 only contains records of one category (e.g., Oregon), so the count is also one. Since the runs have different counts, they cannot be merged. Thus, the data processing system will go on to check if all records have been processed (e.g., block 316), for which the answer is no.
(59) This results in the data processing system going on to read the next record, California:500, (e.g., block 318) and creating a new run (e.g., RUN2) for the read record (e.g., block 304). At this point, the memory contains three runs: MR1, RUN1, and RUN2 now containing California:500.
(60) Once again, the data processing system will check to see if there are any runs that can be merged (e.g., block 306). RUN2 has a count of one since it contains records of one category (e.g., California), which allows it to be merged with RUN1 which also has a count of one. These two runs are merged to create a new merged run (e.g., MR2), and the merging process includes eagerly aggregating while sorting (e.g., block 310 and block 312). The memory for RUN1 and RUN2 is released (e.g., block 314), and the memory now only includes MR1 and MR2. MR1 contains the records California:1000 and Washington:250, and MR2 contains the records California:500 and Oregon:500.
(61) The data processing system will then check to see if there are any runs that can be merged by comparing the counts of existing runs (e.g., by going back to block 306). MR1 (California, Washington) and MR2 (California, Oregon) both have a run count of two, so they can be merged to create a new merged run (e.g., MR3). This merging (e.g., block 310 and block 312) includes eagerly aggregating while sorting. MR3 will contain the records California:1500, Oregon:500, and Washington:250. Notice that the value for the record for California from MR1 and the value for the record for California from MR2 have been combined into a single record with an aggregate value (e.g., 1000+500=1500). The memory for MR1 and MR2 is then released (e.g., block 314).
(62) At this point, the data processing system will check to see if there are any runs that can be merged (e.g., block 306). The memory only contains MR3, so the data processing system will check to see if all the records have been processed (e.g., block 316) before reading the next record, Washington:150, (e.g., block 318) and create a new run (e.g., RUN1) for the record (e.g., block 304). Since the count of RUN1 is one and MR3 is three, they cannot be merged (e.g., block 306). Thus, the data processing system will determine that all the records have not been processed before reading the next data record, Oregon:200, (e.g., block 318) and creating a new run (e.g., RUN2) for that record (e.g., block 304). At this point, the memory will contain MR3, RUN1, and RUN2.
(63) The data processing system will again check to see if any runs can be merged (e.g., block 306). Since the counts for RUN1 (Washington) and RUN2 (Oregon) are both one, the data processing system will merge the runs (e.g., block 310) to create a new run (MR4), (e.g., block 312) where the merging includes eagerly aggregating while sorting. The memory for RUN1 and RUN2 will then be released (e.g., block 314). At this point, the memory will contain MR3: {California:1500; Oregon:500; Washington:250} and MR4: {Oregon:200; Washington:150}.
(64) The data processing system will check to see if any runs can be merged (e.g., block 306). Since the count for MR3 is three (California, Oregon, Washington) and MR4 is two (Oregon, Washington) and are not the same, the runs cannot be merged. Thus, the data processing system will determine that there are more records to process (e.g., block 316) and read the next record, Oregon:100, (e.g., block 318) and create a new run (RUN1) for the read record (e.g., block 304).
(65) The data processing system will check to see if any runs can be merged (e.g., block 306). Since the counts for MR3, MR4, and RUN1 (Oregon) are not the same, none of the runs can be merged. The data processing system will determine if there are more records to process (e.g., block 316) and read the next record, California:800, (e.g., block 318) and create a new run for the read record (e.g., block 304). At this point, the memory contains four runs: MR3: {California:1500; Oregon:500; Washington:250}, MR4: {Oregon:200; Washington:150}, RUN1: {Oregon:100}, and RUN2: {California:800}.
(66) The data processing system will check to see if any runs can be merged (e.g., block 306). Since the counts for RUN1 (Oregon) and RUN2 (California) are the same, the data processing system will merge the runs (e.g., block 310) to create a new merged run (e.g., MR5). The merging includes eagerly aggregating while sorting. The memory for RUN1 and RUN2 will be released (e.g., block 314). At this point, the memory will contain MR3: {California:1500; Oregon:500; Washington:250}, MR4: {Oregon:200; Washington:150} and MR5: {California:800; Oregon:100}.
(67) The data processing system will then check to see if any runs can be merged (e.g., block 306). Since the count for MR4 is two (Oregon, Washington) and MR5 is two (California, Oregon) and are the same, MR4 and MR5 can be merged (e.g., block 310 and block 312) to create a new merged run (MR6). The merging process includes eagerly aggregating while sorting. The memory for MR4 and MR5 will be released (e.g., block 314). At this point, the memory will include MR3 and MR6: {California:800; Oregon:300; Washington:150}.
(68) The data processing system will check to see if any runs can be merged (e.g., block 306). Since the counts for MR3 is three (California, Oregon, Washington) and MR6 is three (California, Oregon, Washington) and are the same, the data processing system will merge the runs (e.g., block 310 and block 312) to create a new run (MR7), where the merging includes eagerly aggregating while sorting. The data processing system will release the memory for MR3 and MR6 (e.g., block 314). MR7: {California:2300; Oregon:800; Washington:400}.
(69) The data processing system will again check to see if any runs can be merged (e.g., block 306). Since there is only one run in the memory, the data processing system will determine there are still records to process (e.g., block 316) and read the next record, Oregon:500 (e.g., block 318) and create new run for the read record (e.g., block 304). At this point the memory will contain MR7 and RUN1{Oregon:500}.
(70) The data processing system will check to see if any runs can be merged (e.g., block 306). Since the counts for MR7 is three (California, Oregon, Washington) and RUN1 is one (Oregon), the data processing system does not merge the runs. Instead, the data processing system will determine there are still records to process (e.g., block 316) and read the next record, California:500, (e.g., block 318) and create new run for the read record (e.g., block 304). At this point the memory will contain MR7, RUN1, and RUN2{California:500}.
(71) The data processing system will then check to see if any runs can be merged (e.g., block 306). Since the counts for RUN1 (Oregon) and RUN2 (California) are the same, the data processing system can merge the runs (e.g., block 310 and block 312) to create a new run (MR8:{California:500; Oregon:500}), where the merging includes eagerly aggregating while sorting. The memory for RUN1 and RUN2 will be released (e.g., block 314).
(72) The data processing system will check to see if any runs can be merged (e.g., block 306). Since the counts for MR7 (California, Oregon, Washington) and MR8 (California, Oregon) are not the same, the system does not merge the runs. Instead, the data processing system will determine there are still records to process (e.g., block 316) and read the next record, Washington:250, (e.g., block 318) and create new run for the read record (e.g., block 304). The memory at this point will hold MR7: {California:2300; Oregon:800; Washington:400} and MR8:{California:500; Oregon:500}, and RUN1: {Washington:250}.
(73) The data processing system will check to see if any runs can be merged (e.g., block 306). Since the counts for MR7 is three (California, Oregon, Washington), MR8 is two (California, Oregon) and RUN1 is one (Washington), none of the runs can be merged. At this point, the data processing system may determine that all the records have been processed (e.g., block 316). In this case, the data processing system checks to see if the number of runs in memory is greater than one (e.g., block 320). If this is the case, as in this example, the data processing system starts merging existing runs in memory until there is only one remaining run (e.g., block 322). In some embodiments, the data processing system may prioritize the pair-wise merging of runs of similar size at this stage. Since the memory contains MR7: {California:2300; Oregon:800; Washington:400} and MR8:{California:500; Oregon:500}, and RUN1: {Washington:250}, the data processing system will merge MR8 and RUN1, to create a new run (MR9:{California:500; Oregon:500; Washington:250}), where the merging includes eagerly aggregating while sorting. The memory for MR8 and RUN1 is released. Then the data processing system merges MR7 and MR9, to create a new run (MR10:{California:2800; Oregon:1300; Washington:650}), where the merging includes eagerly aggregating while sorting. The data processing system releases the memory for MR7 and MR9.
(74) At this point, there is only one remaining run, MR10, in memory. With no more records to process, the data processing system will read MR10 in order to output the aggregate values (e.g., block 324), which are now {California:2800; Oregon:1300; Washington:650}, as the result of the query.
(75)
(76) In some embodiments, at block 502, the data processing system may access or receive a collection of data records. In some of such embodiments, the collection of data records may be stored in one or more databases. In certain embodiments, the data records in the collection may include key/value pairs. For example, the collection may be a collection of key/value pairs, with an example of one of the key/value pairs being “California” as the key and “500” as the corresponding value. A single data record may have more than one key and/or more than one value.
(77) In certain embodiments, at block 504, the data processing system may split the collection of data records into chunks or runs. In some embodiments, the data records may be read incrementally, record-by-record, such that each data record is initially stored by itself in its own chunk or run.
(78) In certain embodiments, at blocks 506, 508, and 510, the data processing system may merge two chunks or runs together by sorting the data records within the two chunks based on a particular value or key, such as a group-by key. The specific key used in the sorting operation may be provided by a user of the system (e.g., in the query), or it may be determined automatically by the system based on the needs/desires of the user. Aggregation is eagerly performed while the data records are sorted, and a merged run is produced that contains the aggregate values of the records.
(79) At block 510, runs may be merged together to merge intermediate aggregate values until a single run is remaining, which will contain final aggregate values associated with the processing of all the data records. There may be various methods and techniques for merging runs (some of which will contain intermediate aggregate result records). For example, in certain embodiments, the runs may be greedily merged in a pair-wise fashion. In other words, as runs are obtained and it is determined that they can be merged, they may be merged into a singular merged run in real-time (this can be analogized to individually checking each of a plurality of bank accounts and adding that balance value to an ongoing-tally of the total balance across all bank accounts).
(80) For instance, the aggregate function could be a SUM function that, when applied, provides a sum of all the values associated with the “California” data records and also a sum of all the values associated with the “Oregon” data records. Accordingly, the final resulting run will contain a record that sums up all the values associated with the “California” data records in all the data records, and will also contain a record that sums up all the values associated with the “Oregon” data records.
(81) It should be noted that different aggregate functions used may produce a single value in different ways. For example, the SUM aggregation takes the sum of the column values corresponding to the various categories or keys. The COUNT aggregation counts every occurrence of a record of a certain category or key, and can be thought of conceptually as adding together a “1” for each occurrence (i.e., maybe implemented by a SUM function that adds a “1” for each occurrence). The MIN aggregation takes the minimum of the column values corresponding to the various keys, while the MAX aggregation takes the maximum of the column values corresponding to the various keys. An example diagram is provided below:
(82) TABLE-US-00002 Initial Value(s) Function Sum Column Value Sum Count 1 (indicates occurrence) Sum Min Column Value Min Max Column Value Max
(83)
(84) A collection of data records 602 may be broken up into chunks 604 or runs. Thus, chunks 604 may represent non-overlapping subsets of the collection 602 so that data records in a particular chunk are not duplicated in another chunk. This may involve reading the data records incrementally, record-by-record, such that each record initially is stored in a chunk. The system may sort chunks 604 to obtain sorted chunks 606, where the sorting is based upon the values of the group-by key of the records in each chunk. At the same time, the system may eagerly perform aggregation while sorting in order to obtain the intermediate aggregate result records 608 obtained from sorting and aggregating the chunks 604. All of the intermediate aggregate results 608 may be combined into the final result 610 based on the aggregation function.
(85) The teachings described herein can be used in various different settings and contexts. In one such example, the teachings may be used by an analytics server that is configured to receive a large amount of data records and has to perform various types of analyses on the records, where the analyses involve applying an aggregate function along with a “group by” clause to the data records. For example, Oracle Corporation® provides an Analytics Server that acts as a query and analysis engine for multiple Oracle Business Intelligence (BI) offerings, such as Oracle Data Visualization Desktop. Oracle Data Visualization Desktop's user interface is a tool called Visual Analyzer, which allows business users to visualize and explore their data using rich visualizations. The Visual Analyzer analyses data and provides business users with various visualizations (e.g., graphs, pie charts, etc.) that allow business users to better interpret and comprehend the mountains of data that has been collected in order to draw inferences and arrive at conclusions. Once the Visual Analyzer is provided user inputs laying out the parameters of the analysis to be performed, the Visual Analyzer generates and issues SQL statements to Oracle BI Analytics Server. The Server processes these SQL statements and returns the relevant data, which is then presented by Visual Analyzer to the business users using various forms of visualization, e.g. pie charts, line graphs, etc. The execution of the SQL statements by the Server can generate a large amount of the data records that have to be processed by applying an aggregate function and a “group by” clause to the data records before the server can return the relevant data to the Visual Analyzer, depending on the needs of the business user. The teachings describe in the disclosure may be used by the Server to apply such an aggregate function and a “group by” clause in an efficient manner.
(86) The combined sort and aggregation operation embodiments described in this disclosure provide better performance than conventional techniques requiring a full sort of all the data records to be processed followed by application of the aggregate function. Such prior art techniques requires at least an amount of system memory that is at least equal to the memory needed to store all the data records being processed. Per the embodiments described in the disclosure, the amount of memory needed to perform the aggregate and group-by operation is significantly less than the amount of memory needed to store all of the data records (e.g., the total size of all the data records). For example, in a traditional approach, for N data records being processed, all the N data records are loaded into system memory, then sorted, and then aggregated. Thus, the memory requirement would be equal to the total size (e.g., number of data records*size of data records) of the data records. By performing the sort and aggregation eagerly and incrementally as disclosed herein, the amount of system memory needed is a lot less than the total size of all the data records.
(87) Another advantage of this technique is improved efficiency and less time needed to perform the aggregate and group-by operation. In traditional approaches, the time needed to perform the aggregation operation would be on the order (N log N) (e.g., dependent on the logarithm of the total number of data records). However, by performing the sort and aggregation operations according to certain embodiments disclosed herein, the time needed to perform the aggregation is of the order (N log m), where m is the total number of different categories of K specified in the group-by clause. Since m is typically a lot smaller than N, this translates to significant savings in processing time over conventional techniques.
(88)
(89) In various embodiments, server 712 may be adapted to run one or more services or software applications that enable the memory management techniques described herein.
(90) In certain embodiments, server 712 may also provide other services or software applications that can include non-virtual and virtual environments. In some embodiments, these services may be offered as web-based or cloud services, such as under a Software as a Service (SaaS) model to the users of client computing devices 702, 704, 706, and/or 708. Users operating client computing devices 702, 704, 706, and/or 708 may in turn utilize one or more client applications to interact with server 712 to utilize the services provided by these components.
(91) In the configuration depicted in
(92) Users may use client computing devices 702, 704, 706, and/or 708 to execute one or more applications, which may generate one or more storage requests that may then be serviced in accordance with the teachings of this disclosure. A client device may provide an interface that enables a user of the client device to interact with the client device. The client device may also output information to the user via this interface. Although
(93) The client devices may include various types of computing systems such as portable handheld devices, general purpose computers such as personal computers and laptops, workstation computers, wearable devices, gaming systems, thin clients, various messaging devices, sensors or other sensing devices, and the like. These computing devices may run various types and versions of software applications and operating systems (e.g., Microsoft Windows®, Apple Macintosh®, UNIX® or UNIX-like operating systems, Linux or Linux-like operating systems such as Google Chrome™ OS) including various mobile operating systems (e.g., Microsoft Windows Mobile®, iOS®, Windows Phone®, Android™, BlackBerry®, Palm OS®). Portable handheld devices may include cellular phones, smartphones, (e.g., an iPhone®), tablets (e.g., iPad®), personal digital assistants (PDAs), and the like. Wearable devices may include Google Glass® head mounted display, and other devices. Gaming systems may include various handheld gaming devices, Internet-enabled gaming devices (e.g., a Microsoft Xbox® gaming console with or without a Kinect® gesture input device, Sony PlayStation® system, various gaming systems provided by Nintendo®, and others), and the like. The client devices may be capable of executing various different applications such as various Internet-related apps, communication applications (e.g., E-mail applications, short message service (SMS) applications) and may use various communication protocols.
(94) Network(s) 710 may be any type of network familiar to those skilled in the art that can support data communications using any of a variety of available protocols, including without limitation TCP/IP (transmission control protocol/Internet protocol), SNA (systems network architecture), IPX (Internet packet exchange), AppleTalk®, and the like. Merely by way of example, network(s) 710 can be a local area network (LAN), networks based on Ethernet, Token-Ring, a wide-area network (WAN), the Internet, a virtual network, a virtual private network (VPN), an intranet, an extranet, a public switched telephone network (PSTN), an infra-red network, a wireless network (e.g., a network operating under any of the Institute of Electrical and Electronics (IEEE) 1002.11 suite of protocols, Bluetooth®, and/or any other wireless protocol), and/or any combination of these and/or other networks.
(95) Server 712 may be composed of one or more general purpose computers, specialized server computers (including, by way of example, PC (personal computer) servers, UNIX® servers, mid-range servers, mainframe computers, rack-mounted servers, etc.), server farms, server clusters, or any other appropriate arrangement and/or combination. Server 712 can include one or more virtual machines running virtual operating systems, or other computing architectures involving virtualization such as one or more flexible pools of logical storage devices that can be virtualized to maintain virtual storage devices for the server. In various embodiments, server 712 may be adapted to run one or more services or software applications that provide the functionality described in the foregoing disclosure.
(96) The computing systems in server 712 may run one or more operating systems including any of those discussed above, as well as any commercially available server operating system. Server 712 may also run any of a variety of additional server applications and/or mid-tier applications, including HTTP (hypertext transport protocol) servers, FTP (file transfer protocol) servers, CGI (common gateway interface) servers, JAVA servers, database servers, and the like. Exemplary database servers include without limitation those commercially available from Oracle®, Microsoft®, Sybase®, IBM® (International Business Machines), and the like.
(97) In some implementations, server 712 may include one or more applications to analyze and consolidate data feeds and/or event updates received from users of client computing devices 702, 704, 706, and 708. As an example, data feeds and/or event updates may include, but are not limited to, Twitter® feeds, Facebook® updates or real-time updates received from one or more third party information sources and continuous data streams, which may include real-time events related to sensor data applications, financial tickers, network performance measuring tools (e.g., network monitoring and traffic management applications), clickstream analysis tools, automobile traffic monitoring, and the like. Server 712 may also include one or more applications to display the data feeds and/or real-time events via one or more display devices of client computing devices 702, 704, 706, and 708.
(98) Distributed system 700 may also include one or more data repositories 714, 716. These data repositories may be used to store data and other information in certain embodiments. Data repositories 714, 716 may be of different types. In certain embodiments, a data repository used by server 712 may be a database, for example, a relational database, such as databases provided by Oracle Corporation® and other vendors. One or more of these databases may be adapted to enable storage, update, and retrieval of data to and from the database in response to SQL-formatted commands.
(99) In certain embodiments, one or more of data repositories 714, 716 may also be used by applications to store application data. The data repositories used by applications may be of different types such as, for example, a key-value store repository, an object store repository, or a general storage repository supported by a file system.
(100) In certain embodiments, the memory management-related functionalities described in this disclosure may be offered as services via a cloud environment.
(101) Network(s) 810 may facilitate communication and exchange of data between clients 804, 806, and 808 and cloud infrastructure system 802. Network(s) 810 may include one or more networks. The networks may be of the same or different types. Network(s) 810 may support one or more communication protocols, including wired and/or wireless protocols, for facilitating the communications.
(102) The embodiment depicted in
(103) The term cloud service is generally used to refer to a service that is made available to users on demand and via a communication network such as the Internet by systems (e.g., cloud infrastructure system 802) of a service provider. Typically, in a public cloud environment, servers and systems that make up the cloud service provider's system are different from the customer's own on-premise servers and systems. The cloud service provider's systems are managed by the cloud service provider. Customers can thus avail themselves of cloud services provided by a cloud service provider without having to purchase separate licenses, support, or hardware and software resources for the services. For example, a cloud service provider's system may host an application, and a user may, via the Internet, on demand, order and use the application without the user having to buy infrastructure resources for executing the application. Cloud services are designed to provide easy, scalable access to applications, resources and services. Several providers offer cloud services. For example, several cloud services are offered by Oracle Corporation® of Redwood Shores, Calif., such as middleware services, database services, Java cloud services, and others.
(104) In certain embodiments, cloud infrastructure system 802 may provide one or more cloud services using different models such as under a Software as a Service (SaaS) model, a Platform as a Service (PaaS) model, an Infrastructure as a Service (IaaS) model, and others, including hybrid service models. Cloud infrastructure system 802 may include a suite of applications, middleware, databases, and other resources that enable provision of the various cloud services.
(105) A SaaS model enables an application or software to be delivered to a customer over a communication network like the Internet, as a service, without the customer having to buy the hardware or software for the underlying application. For example, a SaaS model may be used to provide customers access to on-demand applications that are hosted by cloud infrastructure system 802. Examples of SaaS services provided by Oracle Corporation® include, without limitation, various services for human resources/capital management, customer relationship management (CRM), enterprise resource planning (ERP), supply chain management (SCM), enterprise performance management (EPM), analytics services, social applications, and others.
(106) An IaaS model is generally used to provide infrastructure resources (e.g., servers, storage, hardware and networking resources) to a customer as a cloud service to provide elastic compute and storage capabilities. Various IaaS services are provided by Oracle Corporation®.
(107) A PaaS model is generally used to provide, as a service, platform and environment resources that enable customers to develop, run, and manage applications and services without the customer having to procure, build, or maintain such resources. Examples of PaaS services provided by Oracle Corporation® include, without limitation, Oracle Java Cloud Service (JCS), Oracle Database Cloud Service (DBCS), data management cloud service, various application development solutions services, and others.
(108) Cloud services are generally provided on an on-demand self-service basis, subscription-based, elastically scalable, reliable, highly available, and secure manner. For example, a customer, via a subscription order, may order one or more services provided by cloud infrastructure system 802. Cloud infrastructure system 802 then performs processing to provide the services requested in the customer's subscription order. Cloud infrastructure system 802 may be configured to provide one or even multiple cloud services.
(109) Cloud infrastructure system 802 may provide the cloud services via different deployment models. In a public cloud model, cloud infrastructure system 802 may be owned by a third party cloud services provider and the cloud services are offered to any general public customer, where the customer can be an individual or an enterprise. In certain other embodiments, under a private cloud model, cloud infrastructure system 802 may be operated within an organization (e.g., within an enterprise organization) and services provided to customers that are within the organization. For example, the customers may be various departments of an enterprise such as the Human Resources department, the Payroll department, etc. or even individuals within the enterprise. In certain other embodiments, under a community cloud model, the cloud infrastructure system 802 and the services provided may be shared by several organizations in a related community. Various other models such as hybrids of the above mentioned models may also be used.
(110) Client computing devices 804, 806, and 808 may be of different types (such as devices 702, 704, 706, and 708 depicted in
(111) In some embodiments, the processing performed by cloud infrastructure system 802 for providing services may involve big data analysis. This analysis may involve using, analyzing, and manipulating and sorting large data sets to detect and visualize various trends, behaviors, relationships, etc. within the data. This analysis may be performed by one or more processors, possibly processing the data in parallel, performing simulations using the data, and the like. The data used for this analysis may include structured data (e.g., data stored in a database or structured according to a structured model) and/or unstructured data (e.g., data blobs (binary large objects)).
(112) As depicted in the embodiment in
(113) In certain embodiments, to facilitate efficient provisioning of these resources for supporting the various cloud services provided by cloud infrastructure system 802 for different customers, the resources may be bundled into sets of resources or resource modules (also referred to as “pods”). Each resource module or pod may comprise a pre-integrated and optimized combination of resources of one or more types. In certain embodiments, different pods may be pre-provisioned for different types of cloud services. For example, a first set of pods may be provisioned for a database service, a second set of pods, which may include a different combination of resources than a pod in the first set of pods, may be provisioned for Java service, and the like. For some services, the resources allocated for provisioning the services may be shared between the services.
(114) Cloud infrastructure system 802 may itself internally use services 832 that are shared by different components of cloud infrastructure system 802 and which facilitate the provisioning of services by cloud infrastructure system 802. These internal shared services may include, without limitation, a security and identity service, an integration service, an enterprise repository service, an enterprise manager service, a virus scanning and white list service, a high availability, backup and recovery service, service for enabling cloud support, an email service, a notification service, a file transfer service, and the like.
(115) Cloud infrastructure system 802 may comprise multiple subsystems. These subsystems may be implemented in software, or hardware, or combinations thereof. As depicted in
(116) In certain embodiments, such as the embodiment depicted in
(117) Once properly validated, OMS 820 may then invoke the order provisioning subsystem (OPS) 824 that is configured to provision resources for the order including processing, memory, and networking resources. The provisioning may include allocating resources for the order and configuring the resources to facilitate the service requested by the customer order. The manner in which resources are provisioned for an order and the type of the provisioned resources may depend upon the type of cloud service that has been ordered by the customer. For example, according to one workflow, OPS 824 may be configured to determine the particular cloud service being requested and identify a number of pods that may have been pre-configured for that particular cloud service. The number of pods that are allocated for an order may depend upon the size/amount/level/scope of the requested service. For example, the number of pods to be allocated may be determined based upon the number of users to be supported by the service, the duration of time for which the service is being requested, and the like. The allocated pods may then be customized for the particular requesting customer for providing the requested service.
(118) Cloud infrastructure system 802 may send a response or notification 844 to the requesting customer to indicate when the requested service is now ready for use. In some instances, information (e.g., a link) may be sent to the customer that enables the customer to start using and availing the benefits of the requested services.
(119) Cloud infrastructure system 802 may provide services to multiple customers. For each customer, cloud infrastructure system 802 is responsible for managing information related to one or more subscription orders received from the customer, maintaining customer data related to the orders, and providing the requested services to the customer. Cloud infrastructure system 802 may also collect usage statistics regarding a customer's use of subscribed services. For example, statistics may be collected for the amount of storage used, the amount of data transferred, the number of users, and the amount of system up time and system down time, and the like. This usage information may be used to bill the customer. Billing may be done, for example, on a monthly cycle.
(120) Cloud infrastructure system 802 may provide services to multiple customers in parallel. Cloud infrastructure system 802 may store information for these customers, including possibly proprietary information. In certain embodiments, cloud infrastructure system 802 comprises an identity management subsystem (IMS) 828 that is configured to manage customers information and provide the separation of the managed information such that information related to one customer is not accessible by another customer. IMS 828 may be configured to provide various security-related services such as identity services, such as information access management, authentication and authorization services, services for managing customer identities and roles and related capabilities, and the like.
(121)
(122) Bus subsystem 902 provides a mechanism for letting the various components and subsystems of computer system 900 communicate with each other as intended. Although bus subsystem 902 is shown schematically as a single bus, alternative embodiments of the bus subsystem may utilize multiple buses. Bus subsystem 902 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, a local bus using any of a variety of bus architectures, and the like. For example, such architectures may include an Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus, which can be implemented as a Mezzanine bus manufactured to the IEEE P1386.1 standard, and the like.
(123) Processing subsystem 904 controls the operation of computer system 900 and may comprise one or more processors, application specific integrated circuits (ASICs), or field programmable gate arrays (FPGAs). The processors may include be single core or multicore processors. The processing resources of computer system 900 can be organized into one or more processing units 932, 934, etc. A processing unit may include one or more processors, one or more cores from the same or different processors, a combination of cores and processors, or other combinations of cores and processors. In some embodiments, processing subsystem 904 can include one or more special purpose co-processors such as graphics processors, digital signal processors (DSPs), or the like. In some embodiments, some or all of the processing units of processing subsystem 904 can be implemented using customized circuits, such as application specific integrated circuits (ASICs), or field programmable gate arrays (FPGAs).
(124) In some embodiments, the processing units in processing subsystem 904 can execute instructions stored in system memory 910 or on computer readable storage media 922. In various embodiments, the processing units can execute a variety of programs or code instructions and can maintain multiple concurrently executing programs or processes. At any given time, some or all of the program code to be executed can be resident in system memory 910 and/or on computer-readable storage media 922 including potentially on one or more storage devices. Through suitable programming, processing subsystem 904 can provide various functionalities described above. In instances where computer system 900 is executing one or more virtual machines, one or more processing units may be allocated to each virtual machine.
(125) In certain embodiments, a processing acceleration unit 906 may optionally be provided for performing customized processing or for off-loading some of the processing performed by processing subsystem 904 so as to accelerate the overall processing performed by computer system 900.
(126) I/O subsystem 908 may include devices and mechanisms for inputting information to computer system 900 and/or for outputting information from or via computer system 900. In general, use of the term input device is intended to include all possible types of devices and mechanisms for inputting information to computer system 900. User interface input devices may include, for example, a keyboard, pointing devices such as a mouse or trackball, a touchpad or touch screen incorporated into a display, a scroll wheel, a click wheel, a dial, a button, a switch, a keypad, audio input devices with voice command recognition systems, microphones, and other types of input devices. User interface input devices may also include motion sensing and/or gesture recognition devices such as the Microsoft Kinect® motion sensor that enables users to control and interact with an input device, the Microsoft Xbox® 360 game controller, devices that provide an interface for receiving input using gestures and spoken commands. User interface input devices may also include eye gesture recognition devices such as the Google Glass® blink detector that detects eye activity (e.g., “blinking” while taking pictures and/or making a menu selection) from users and transforms the eye gestures as inputs to an input device (e.g., Google) Glass®. Additionally, user interface input devices may include voice recognition sensing devices that enable users to interact with voice recognition systems (e.g., Siri® navigator) through voice commands.
(127) Other examples of user interface input devices include, without limitation, three dimensional (3D) mice, joysticks or pointing sticks, gamepads and graphic tablets, and audio/visual devices such as speakers, digital cameras, digital camcorders, portable media players, webcams, image scanners, fingerprint scanners, barcode reader 3D scanners, 3D printers, laser rangefinders, and eye gaze tracking devices. Additionally, user interface input devices may include, for example, medical imaging input devices such as computed tomography, magnetic resonance imaging, position emission tomography, and medical ultrasonography devices. User interface input devices may also include, for example, audio input devices such as MIDI keyboards, digital musical instruments and the like.
(128) In general, use of the term output device is intended to include all possible types of devices and mechanisms for outputting information from computer system 900 to a user or other computer. User interface output devices may include a display subsystem, indicator lights, or non-visual displays such as audio output devices, etc. The display subsystem may be a cathode ray tube (CRT), a flat-panel device, such as that using a liquid crystal display (LCD) or plasma display, a projection device, a touch screen, and the like. For example, user interface output devices may include, without limitation, a variety of display devices that visually convey text, graphics and audio/video information such as monitors, printers, speakers, headphones, automotive navigation systems, plotters, voice output devices, and modems.
(129) Storage subsystem 918 provides a repository or data store for storing information and data that is used by computer system 900. Storage subsystem 918 provides a tangible non-transitory computer-readable storage medium for storing the basic programming and data constructs that provide the functionality of some embodiments. Storage subsystem 918 may store software (e.g., programs, code modules, instructions) that when executed by processing subsystem 904 provides the functionality described above. The software may be executed by one or more processing units of processing subsystem 904. Storage subsystem 918 may also provide a repository for storing data used in accordance with the teachings of this disclosure.
(130) Storage subsystem 918 may include one or more non-transitory memory devices, including volatile and non-volatile memory devices. As shown in
(131) By way of example, and not limitation, as depicted in
(132) Computer-readable storage media 922 may store programming and data constructs that provide the functionality of some embodiments. Computer-readable media 922 may provide storage of computer-readable instructions, data structures, program modules, and other data for computer system 900. Software (programs, code modules, instructions) that, when executed by processing subsystem 904 provides the functionality described above, may be stored in storage subsystem 918. By way of example, computer-readable storage media 922 may include non-volatile memory such as a hard disk drive, a magnetic disk drive, an optical disk drive such as a CD ROM, DVD, a Blu-Ray® disk, or other optical media. Computer-readable storage media 922 may include, but is not limited to, Zip® drives, flash memory cards, universal serial bus (USB) flash drives, secure digital (SD) cards, DVD disks, digital video tape, and the like. Computer-readable storage media 922 may also include, solid-state drives (SSD) based on non-volatile memory such as flash-memory based SSDs, enterprise flash drives, solid state ROM, and the like, SSDs based on volatile memory such as solid state RAM, dynamic RAM, static RAM, DRAM-based SSDs, magnetoresistive RAM (MRAM) SSDs, and hybrid SSDs that use a combination of DRAM and flash memory based SSDs.
(133) In certain embodiments, storage subsystem 918 may also include a computer-readable storage media reader 920 that can further be connected to computer-readable storage media 922. Reader 920 may receive and be configured to read data from a memory device such as a disk, a flash drive, etc.
(134) In certain embodiments, computer system 900 may support virtualization technologies, including but not limited to virtualization of processing and memory resources. For example, computer system 900 may provide support for executing one or more virtual machines. In certain embodiments, computer system 900 may execute a program such as a hypervisor that facilitated the configuring and managing of the virtual machines. Each virtual machine may be allocated memory, compute (e.g., processors, cores), I/O, and networking resources. Each virtual machine generally runs independently of the other virtual machines. A virtual machine typically runs its own operating system, which may be the same as or different from the operating systems executed by other virtual machines executed by computer system 900. Accordingly, multiple operating systems may potentially be run concurrently by computer system 900.
(135) Communications subsystem 924 provides an interface to other computer systems and networks. Communications subsystem 924 serves as an interface for receiving data from and transmitting data to other systems from computer system 900. For example, communications subsystem 924 may enable computer system 900 to establish a communication channel to one or more client devices via the Internet for receiving and sending information from and to the client devices.
(136) Communication subsystem 924 may support both wired and/or wireless communication protocols. For example, in certain embodiments, communications subsystem 924 may include radio frequency (RF) transceiver components for accessing wireless voice and/or data networks (e.g., using cellular telephone technology, advanced data network technology, such as 3G, 4G or EDGE (enhanced data rates for global evolution), WiFi (IEEE 802.XX family standards, or other mobile communication technologies, or any combination thereof), global positioning system (GPS) receiver components, and/or other components. In some embodiments communications subsystem 924 can provide wired network connectivity (e.g., Ethernet) in addition to or instead of a wireless interface.
(137) Communication subsystem 924 can receive and transmit data in various forms. For example, in some embodiments, in addition to other forms, communications subsystem 924 may receive input communications in the form of structured and/or unstructured data feeds 926, event streams 928, event updates 930, and the like. For example, communications subsystem 924 may be configured to receive (or send) data feeds 926 in real-time from users of social media networks and/or other communication services such as Twitter® feeds, Facebook® updates, web feeds such as Rich Site Summary (RSS) feeds, and/or real-time updates from one or more third party information sources.
(138) In certain embodiments, communications subsystem 924 may be configured to receive data in the form of continuous data streams, which may include event streams 928 of real-time events and/or event updates 930, that may be continuous or unbounded in nature with no explicit end. Examples of applications that generate continuous data may include, for example, sensor data applications, financial tickers, network performance measuring tools (e.g. network monitoring and traffic management applications), clickstream analysis tools, automobile traffic monitoring, and the like.
(139) Communications subsystem 924 may also be configured to communicate data from computer system 900 to other computer systems or networks. The data may be communicated in various different forms such as structured and/or unstructured data feeds 926, event streams 928, event updates 930, and the like to one or more databases that may be in communication with one or more streaming data source computers coupled to computer system 900.
(140) Computer system 900 can be one of various types, including a handheld portable device (e.g., an iPhone® cellular phone, an iPad® computing tablet, a PDA), a wearable device (e.g., a Google Glass® head mounted display), a personal computer, a workstation, a mainframe, a kiosk, a server rack, or any other data processing system. Due to the ever-changing nature of computers and networks, the description of computer system 900 depicted in
(141) Although specific embodiments have been described, various modifications, alterations, alternative constructions, and equivalents are possible. Embodiments are not restricted to operation within certain specific data processing environments, but are free to operate within a plurality of data processing environments. Additionally, although certain embodiments have been described using a particular series of transactions and steps, it should be apparent to those skilled in the art that this is not intended to be limiting. Although some flowcharts describe operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be rearranged. A process may have additional steps not included in the figure. Various features and aspects of the above-described embodiments may be used individually or jointly.
(142) Further, while certain embodiments have been described using a particular combination of hardware and software, it should be recognized that other combinations of hardware and software are also possible. Certain embodiments may be implemented only in hardware, or only in software, or using combinations thereof. The various processes described herein can be implemented on the same processor or different processors in any combination.
(143) Where devices, systems, components or modules are described as being configured to perform certain operations or functions, such configuration can be accomplished, for example, by designing electronic circuits to perform the operation, by programming programmable electronic circuits (such as microprocessors) to perform the operation such as by executing computer instructions or code, or processors or cores programmed to execute code or instructions stored on a non-transitory memory medium, or any combination thereof. Processes can communicate using a variety of techniques including but not limited to conventional techniques for inter-process communications, and different pairs of processes may use different techniques, or the same pair of processes may use different techniques at different times.
(144) Specific details are given in this disclosure to provide a thorough understanding of the embodiments. However, embodiments may be practiced without these specific details. For example, well-known circuits, processes, algorithms, structures, and techniques have been shown without unnecessary detail in order to avoid obscuring the embodiments. This description provides example embodiments only, and is not intended to limit the scope, applicability, or configuration of other embodiments. Rather, the preceding description of the embodiments will provide those skilled in the art with an enabling description for implementing various embodiments. Various changes may be made in the function and arrangement of elements.
(145) The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. It will, however, be evident that additions, subtractions, deletions, and other modifications and changes may be made thereunto without departing from the broader spirit and scope as set forth in the claims. Thus, although specific embodiments have been described, these are not intended to be limiting. Various modifications and equivalents are within the scope of the following claims.