OUTPUT DEVICE, DATA STRUCTURE, OUTPUT METHOD, AND OUTPUT PROGRAM
20180004869 · 2018-01-04
Assignee
Inventors
Cpc classification
G06N7/01
PHYSICS
G06F11/3006
PHYSICS
G06F9/50
PHYSICS
International classification
G06F11/34
PHYSICS
Abstract
An output device 10 is provided with an output unit 11 for outputting, on the basis of job feature information indicating the features of the job of a distributed processing system, estimation model application information that is information in a format suitable for an estimation model that estimates the amount of computer resources required for processing a task constituting the job. The estimation model application information may include word-containing information having binary information that indicates whether or not a character string indicated by the character string information included in the job feature information includes a prescribed word. The estimation model application information may include numerical inversion label information having, as string label information, a value derived by converting, by a prescribed function, the numeric value indicated by the numerical information included in the job feature information.
Claims
1. An output device comprising an output unit which outputs estimation model application information on the basis of job feature information indicating the features of the job of a distributed processing system, estimation model application information that is information in a format suitable for an estimation model that estimates the amount of computer resources required for processing a task constituting the job.
2. The output device according to claim 1, wherein the estimation model application information includes word-containing information having binary information that indicates whether or not a character string indicated by the character string information included in the job feature information includes a prescribed word.
3. The output device according to claim 1, wherein the estimation model application information includes numerical inversion label information having, as string label information, a value derived by converting, by a prescribed function, the numeric value indicated by the numerical information included in the job feature information.
4. The output device according to claim 1, comprising a form conversion unit which outputs the estimation model application information output from the estimation model, in a same format as the job feature information corresponding to the estimation model application information.
5. The output device according to claim 1, comprising a computer resources estimation unit which estimates the amount of computer resources required for processing the task included in the job corresponding to the job feature information in the distributed processing system, by feeding the estimation model application information output from the output unit on the basis of the job feature information into the estimation model.
6.-8. (canceled)
9. An output method comprising outputting estimation model application information on the basis of job feature information indicating the features of the job of a distributed processing system, estimation model application information that is information in a format suitable for an estimation model that estimates the amount of computer resources required for processing a task constituting the job.
10. A non-transitory computer-readable recording medium having recorded therein an output program for causing a computer to execute an output process of outputting estimation model application information on the basis of job feature information indicating the features of the job of a distributed processing system, estimation model application information that is information in a format suitable for an estimation model that estimates the amount of computer resources required for processing a task constituting the job.
11. The output device according to claim 2, wherein the estimation model application information includes numerical inversion label information having, as string label information, a value derived by converting, by a prescribed function, the numeric value indicated by the numerical information included in the job feature information.
12. The output device according to claim 2, comprising a form conversion unit which outputs the estimation model application information output from the estimation model, in a same format as the job feature information corresponding to the estimation model application information.
13. The output device according to claim 3, comprising a form conversion unit which outputs the estimation model application information output from the estimation model, in a same format as the job feature information corresponding to the estimation model application information.
14. The output device according to claim 11, comprising a form conversion unit which outputs the estimation model application information output from the estimation model, in a same format as the job feature information corresponding to the estimation model application information.
15. The output device according to claim 2, comprising a computer resources estimation unit which estimates the amount of computer resources required for processing the task included in the job corresponding to the job feature information in the distributed processing system, by feeding the estimation model application information output from the output unit on the basis of the job feature information into the estimation model.
16. The output device according to claim 3, comprising a computer resources estimation unit which estimates the amount of computer resources required for processing the task included in the job corresponding to the job feature information in the distributed processing system, by feeding the estimation model application information output from the output unit on the basis of the job feature information into the estimation model.
17. The output device according to claim 4, comprising a computer resources estimation unit which estimates the amount of computer resources required for processing the task included in the job corresponding to the job feature information in the distributed processing system, by feeding the estimation model application information output from the output unit on the basis of the job feature information into the estimation model.
18. The output device according to claim 11, comprising a computer resources estimation unit which estimates the amount of computer resources required for processing the task included in the job corresponding to the job feature information in the distributed processing system, by feeding the estimation model application information output from the output unit on the basis of the job feature information into the estimation model.
19. The output device according to claim 12, comprising a computer resources estimation unit which estimates the amount of computer resources required for processing the task included in the job corresponding to the job feature information in the distributed processing system, by feeding the estimation model application information output from the output unit on the basis of the job feature information into the estimation model.
20. The output device according to claim 13, comprising a computer resources estimation unit which estimates the amount of computer resources required for processing the task included in the job corresponding to the job feature information in the distributed processing system, by feeding the estimation model application information output from the output unit on the basis of the job feature information into the estimation model.
21. The output device according to claim 14, comprising a computer resources estimation unit which estimates the amount of computer resources required for processing the task included in the job corresponding to the job feature information in the distributed processing system, by feeding the estimation model application information output from the output unit on the basis of the job feature information into the estimation model.
Description
BRIEF DESCRIPTION OF DRAWINGS
[0035]
[0036]
[0037]
[0038]
[0039]
[0040]
[0041]
[0042]
[0043]
[0044]
[0045]
[0046]
[0047]
[0048]
[0049]
[0050]
[0051]
[0052]
[0053]
[0054]
[0055]
DESCRIPTION OF EMBODIMENT
Exemplary Embodiment 1
[0056] [Structure]
[0057] The following describes an exemplary embodiment of the present invention with reference to drawings.
[0058] The computer resources usage estimation device 100 depicted in
[0059] The input data conversion unit 101 has a function of converting job feature information included in the input data used for the generation of an estimation model into estimation model application information which is information in a format suitable for the estimation model to be generated, and outputting data including the estimation model application information.
[0060] As depicted in
[0061]
[0062]
[0063] The task identifier corresponds to an identification symbol of job feature information. The word candidate indicates whether or not a predetermined word is included. In
[0064] For example, consider the case of indicating that a character string which is job feature information A corresponding to task identifier Task1 includes word α1. To indicate that word α1 is included, binary information True (true) is set in the word candidate “job feature information A includes word α1?” in the word-containing information of Task1. The word-containing information of Task1 indicates that job feature information A includes word α1.
[0065] Consider the case of indicating that a character string which is job feature information B corresponding to task identifier Task2 does not include word βn. To indicate that word βn is not included, binary information False (false) is set in the word candidate “job feature information B includes word βn?” in the word-containing information of Task2. The word-containing information of Task2 indicates that job feature information B does not include word βn.
[0066]
[0067] The task identifier corresponds to an identification symbol of numerical information. The numerical information corresponds to numerical job feature information. In
[0068] For example, consider the case of indicating that the label information of numerical information A corresponding to task identifier Task1 is 8. To indicate that the label information of numerical information A is 8, character string information “8” is set in the label information “label information of numerical information A” in the numerical inversion label information of Task1. The numerical inversion label information of Task1 indicates that the label information of numerical information A is 8.
[0069] Consider the case of indicating that the label information of numerical information B corresponding to task identifier Task2 is 0. To indicate that the label information of numerical information B is 0, character string information “0” is set in the label information “label information of numerical information B” in the numerical inversion label information of Task2. The numerical inversion label information of Task2 indicates that the label information of numerical information B is 0.
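The two kinds of estimation model application information described above can be pictured as simple keyed records. The following is a minimal sketch only; the class names `WordContainingInfo` and `NumericalInversionLabelInfo` and their field names are illustrative assumptions and do not appear in the original description.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class WordContainingInfo:
    """Hypothetical record: one row of word-containing information."""
    task_id: str
    # word candidate name -> binary information (True if the word is included)
    candidates: Dict[str, bool] = field(default_factory=dict)


@dataclass
class NumericalInversionLabelInfo:
    """Hypothetical record: one row of numerical inversion label information."""
    task_id: str
    # label information name -> label stored as character string information
    labels: Dict[str, str] = field(default_factory=dict)


# The four cases walked through in the text above.
task1_words = WordContainingInfo(
    "Task1", {"job feature information A includes word α1?": True})
task2_words = WordContainingInfo(
    "Task2", {"job feature information B includes word βn?": False})
task1_labels = NumericalInversionLabelInfo(
    "Task1", {"label information of numerical information A": "8"})
task2_labels = NumericalInversionLabelInfo(
    "Task2", {"label information of numerical information B": "0"})
```

Note that the label values are kept as strings ("8", "0") rather than integers, matching the statement that character string information is set in the label information.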
[0070] The computer resources usage estimation model generation unit 102 has a function of receiving the data output from the input data conversion unit 101 and generating the estimation model. As depicted in
[0071] The computer resources usage estimation unit 103 has a function of estimating the computer resources usage of a task whose feature has not been recognized yet, using the received estimation model. The computer resources usage estimation unit 103 may output an estimate of an index relating to process execution, such as processing time, other than the computer resources usage.
[0072] Although the computer resources usage estimation device 100 in this exemplary embodiment estimates computer resources usage, the computer resources usage estimation device 100 may estimate a value other than computer resources usage. For example, the computer resources usage estimation device 100 may estimate task processing time in the distributed processing system. Any value estimated by the computer resources usage estimation device 100 in this exemplary embodiment is expected to have improved estimation accuracy.
[0073] The computer resources usage estimation device 100 in this exemplary embodiment is, for example, realized by a central processing unit (CPU) that executes processes according to a program stored in a storage medium. In other words, the input data conversion unit 101, the computer resources usage estimation model generation unit 102, and the computer resources usage estimation unit 103 are, for example, realized by a CPU that executes processes according to program control.
[0074] Each unit in the computer resources usage estimation device 100 may be realized by a hardware circuit.
[0075] [Operation]
[0076] The following describes the operation of the input data conversion unit 101 in this exemplary embodiment, with reference to
[0077] The following first describes how the input data conversion unit 101 in this exemplary embodiment generates, on the basis of the job feature information, word-containing information for a job name, which is one type of job feature information. The word-containing information indicates whether or not each word of the word group constituting the job name is included. The description is given with reference to
[0078]
[0079]
[0080] When the job feature information depicted in
[0081] When forming the word-containing information, the input data conversion unit 101 generates each word candidate name by, for example, prefixing the identifier of the generation source information. The input data conversion unit 101 may generate each word candidate name by any other method, as long as the generated name is uniquely identifiable.
[0082] The job name with the task number “1” in the job feature information depicted in
[0083] In detail, the input data conversion unit 101 prefixes “Jobname” to each of the words “Cluster”, “Iterator”, “running”, “iteration”, “3”, “over”, “priorPath”, “kmeans”, “46”, and “clusters-2” present in the job name with the task number “1”, to generate the word candidate name.
[0084] The input data conversion unit 101 also prefixes “Jobname” to each of the words “5”, “106”, and “clusters-4” not present in the job name with the task number “1” and only present in the job name with the task number “2”, to generate the word candidate name. The input data conversion unit 101 forms the word-containing information by the word candidate group indicating the generated names.
[0085] The input data conversion unit 101 generates the same number of pieces of word-containing information as the number of input pieces of job feature information. The input data conversion unit 101 sets the task number of the input job feature information as the task number of the generated word-containing information corresponding to the job feature information.
[0086] The input data conversion unit 101 then sets all word candidates in each generated piece of word-containing information to False, as an initialization process (step S102).
[0087] Following this, the input data conversion unit 101 divides the job name of the input job feature information into words (step S104). For example, the job name with the task number “1” is divided into the words “Cluster”, “Iterator”, “running”, “iteration”, “3”, “over”, “priorPath”, “kmeans”, “46”, and “clusters-2”.
[0088] The delimiter or delimiter character when the input data conversion unit 101 divides the job name into words is, for example, set by the user, the system, or the like. Alternatively, the input data conversion unit 101 may hold the delimiter or delimiter character beforehand.
[0089] The input data conversion unit 101 then sets the word candidates in the word-containing information corresponding to the divided words, to True (step S106). The binary information “True” indicates that the set word candidate is included in the job name. The input data conversion unit 101 sets True for the number of divided words (step S107).
[0090] For example, in the case of the job feature information of the task number “1”, True is set in each of the word candidates “Jobname-Cluster”, “Jobname-Iterator”, “Jobname-running”, “Jobname-iteration”, “Jobname-3”, “Jobname-over”, “Jobname-priorPath”, “Jobname-kmeans”, “Jobname-46”, and “Jobname-clusters-2”, for which the corresponding words are present. Meanwhile, each of the word candidates “Jobname-5”, “Jobname-106”, and “Jobname-clusters-4”, for which the corresponding words are not present, remains set to False.
[0091] The input data conversion unit 101 may set information other than True in the corresponding word candidate, as long as it is clear that the word candidate is included in the job name. For example, the input data conversion unit 101 may set the numerical value 1 in the corresponding word candidate, instead of True. In the case of setting the numerical value 1, the input data conversion unit 101 sets the numerical value 0 in each word candidate instead of False in the initialization process of step S102.
[0092] As a result of the input data conversion unit 101 setting True for the number of divided words (the determination condition in step S107 is met), the word-containing information corresponding to the input job feature information is generated. The input data conversion unit 101 repeatedly performs the process of steps S103 to S108 for the number of input pieces of job feature information.
[0093] After generating the word-containing information for the number of input pieces of job feature information (the determination condition in step S108 is met), the input data conversion unit 101 ends the generation process.
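The generation process described above (initialization in step S102, word division in step S104, and setting True in step S106) can be sketched as follows. This is a minimal illustration, not the implementation of the input data conversion unit 101. The function name `build_word_containing_info`, the two job name strings, and the choice of whitespace as the delimiter are assumptions made for the example; only the resulting word lists are taken from the description.

```python
from typing import Dict, List, Optional


def build_word_containing_info(
    job_names: Dict[str, str],
    prefix: str = "Jobname",
    delimiter: Optional[str] = None,  # None means "split on whitespace" (assumption)
    true_value: object = True,
    false_value: object = False,
) -> Dict[str, Dict[str, object]]:
    # Divide every job name into words (step S104), dropping empty fragments.
    split_names: Dict[str, List[str]] = {
        task: [w for w in name.split(delimiter) if w]
        for task, name in job_names.items()
    }

    # Form the word candidate group: every word seen in any job name,
    # named by prefixing the identifier of the generation source information.
    candidates: List[str] = []
    for words in split_names.values():
        for word in words:
            candidate = f"{prefix}-{word}"
            if candidate not in candidates:
                candidates.append(candidate)

    info: Dict[str, Dict[str, object]] = {}
    for task, words in split_names.items():
        # Initialization: every word candidate starts as False (step S102).
        row = {candidate: false_value for candidate in candidates}
        # Set True for each word present in this job name (step S106).
        for word in words:
            row[f"{prefix}-{word}"] = true_value
        info[task] = row
    return info


# Hypothetical job names chosen so that splitting yields the word lists in the text.
job_names = {
    "1": "Cluster Iterator running iteration 3 over priorPath kmeans 46 clusters-2",
    "2": "Cluster Iterator running iteration 5 over priorPath kmeans 106 clusters-4",
}
info = build_word_containing_info(job_names)
# Variant that uses the numerical values 1 and 0 instead of True and False.
numeric_info = build_word_containing_info(job_names, true_value=1, false_value=0)
```

One row is produced per input piece of job feature information, and every row has the same 13 word candidates, so the rows can be fed to an estimation model as fixed-width records.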
[0094] The following describes the effect of the information obtained as a result of the conversion as depicted in
[0095] By referencing the word-containing information corresponding to the set of tasks that differ in the relationship between task feature information and the amount of computer resources, the computer resources usage estimation unit 103 can classify the tasks included in the task set depending on whether or not a predetermined word set is included.
[0096] For example, the task corresponding to each piece of task feature information depicted in
[0097] The following next describes how the input data conversion unit 101 in this exemplary embodiment generates, on the basis of the job feature information, word-containing information for a program class name, which is one type of job feature information. The word-containing information indicates whether or not each word of the word group constituting the class name is included. The description is given with reference to
[0098]
[0099]
[0100] When the job feature information depicted in
[0101] When forming the word-containing information, the input data conversion unit 101 generates each word candidate name by, for example, prefixing the identifier of the generation source information. The input data conversion unit 101 may generate each word candidate name by any other method, as long as the generated name is uniquely identifiable.
[0102] The class name with the task number “1” in the job feature information depicted in
[0103] In detail, the input data conversion unit 101 prefixes “Class” to each of the words “org”, “apache”, “mahout”, “clustering”, “iterator”, and “CIMapper” present in the class name with the task number “1”, to generate the word candidate name.
[0104] The input data conversion unit 101 also prefixes “Class” to each of the words “cf”, “taste”, “hadoop”, “item”, and “ItemIDIndexMapper” not present in the class name with the task number “1” and only present in the class name with the task number “2”, to generate the word candidate name. The input data conversion unit 101 forms the word-containing information by the word candidate group indicating the generated names.
[0105] The input data conversion unit 101 generates the same number of pieces of word-containing information as the number of input pieces of job feature information. The input data conversion unit 101 sets the task number of the input job feature information as the task number of the generated word-containing information corresponding to the job feature information.
[0106] The input data conversion unit 101 then sets all word candidates in each generated piece of word-containing information to False, as an initialization process (step S112).
[0107] Following this, the input data conversion unit 101 divides the program class name of the input job feature information into words (step S114). For example, the class name with the task number “1” is divided into the words “org”, “apache”, “mahout”, “clustering”, “iterator”, and “CIMapper”.
[0108] The delimiter or delimiter character used when the input data conversion unit 101 divides the class name into words is, for example, set by the user, the system, or the like. Alternatively, the input data conversion unit 101 may hold the delimiter or delimiter character beforehand.
[0109] The input data conversion unit 101 then sets the word candidates in the word-containing information corresponding to the divided words, to True (step S116). The binary information “True” indicates that the set word candidate is included in the class name. The input data conversion unit 101 sets True for the number of divided words (step S117).
[0110] For example, in the case of the job feature information of the task number “1”, True is set in each of the word candidates “Class-org”, “Class-apache”, “Class-mahout”, “Class-clustering”, “Class-iterator”, and “Class-CIMapper”, for which the corresponding words are present. Meanwhile, each of the word candidates “Class-cf”, “Class-taste”, “Class-hadoop”, “Class-item”, and “Class-ItemIDIndexMapper”, for which the corresponding words are not present, remains set to False.
[0111] The input data conversion unit 101 may set information other than True in the corresponding word candidate, as long as it is clear that the word candidate is included in the program class name. For example, the input data conversion unit 101 may set the numerical value 1 in the corresponding word candidate, instead of True. In the case of setting the numerical value 1, the input data conversion unit 101 sets the numerical value 0 in each word candidate instead of False in the initialization process of step S112.
[0112] As a result of the input data conversion unit 101 setting True for the number of divided words (the determination condition in step S117 is met), the word-containing information corresponding to the input job feature information is generated. The input data conversion unit 101 repeatedly performs the process of steps S113 to S118 for the number of input pieces of job feature information.
[0113] After generating the word-containing information for the number of input pieces of job feature information (the determination condition in step S118 is met), the input data conversion unit 101 ends the generation process.
[0114] The following describes the effect of the information obtained as a result of the conversion as depicted in
[0115] By referencing the word-containing information corresponding to the set of tasks that differ in the relationship between task feature information and the amount of computer resources, the computer resources usage estimation unit 103 can classify the tasks included in the task set depending on whether or not a predetermined word set is included.
[0116] For example, the task corresponding to each piece of task feature information depicted in
[0117] Even if the computer resources usage estimation unit 103 has not recognized beforehand that the task executes the program implemented by Apache Mahout, the computer resources usage estimation unit 103 can recognize the tendency of the implementation of Apache Mahout by extracting the task group corresponding to the word-containing information whose word candidate “Class-mahout” is True in
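The extraction of a task group by a word candidate such as “Class-mahout” can be sketched as below. This is an illustration under stated assumptions: the class name strings for task numbers “1” and “2” are reconstructed from the word lists in the description on the assumption that “.” is the delimiter, the class name for task number “3” is a purely hypothetical non-Mahout example, and the function name `class_word_flags` is not from the original.

```python
from typing import Dict, List


def class_word_flags(
    class_name: str,
    vocabulary: List[str],
    prefix: str = "Class",
    delimiter: str = ".",  # assumed delimiter for program class names
) -> Dict[str, bool]:
    # Words actually present in this class name.
    present = set(class_name.split(delimiter))
    # One binary flag per word candidate in the shared word candidate group.
    return {f"{prefix}-{word}": (word in present) for word in vocabulary}


class_names = {
    "1": "org.apache.mahout.clustering.iterator.CIMapper",
    "2": "org.apache.mahout.cf.taste.hadoop.item.ItemIDIndexMapper",
    "3": "com.example.wordcount.TokenizerMapper",  # hypothetical, not in the source
}

# Word candidate group: every word that appears in any class name.
vocabulary = sorted({w for name in class_names.values() for w in name.split(".")})

info = {task: class_word_flags(name, vocabulary) for task, name in class_names.items()}

# Extract the task group whose word candidate "Class-mahout" is True.
mahout_tasks = [task for task, flags in info.items() if flags["Class-mahout"]]
```

No prior knowledge that a task runs Apache Mahout code is needed here; the group falls out of the binary flags alone, which is the effect the paragraph above describes.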
[0118] The operation of the input data conversion unit 101 in this exemplary embodiment generating numerical inversion label information on the basis of one type of job feature information that includes an observation value during program execution and an option numerical value designated during program execution is described next, with reference to
[0119]
[0120]
[0121]
[0122] When the job feature information depicted in
[0123] When forming the numerical inversion label information, the input data conversion unit 101 generates each label information name by, for example, prefixing the identifier of the generation source information. The input data conversion unit 101 may generate each label information name by any other method, as long as the generated name is uniquely identifiable.
[0124] The input data conversion unit 101 may set the job feature information whose value has been replaced, as the numerical inversion label information. The numerical inversion label information depicted in
[0125] Following this, the input data conversion unit 101 converts value v included in the job feature information into value v′ using function f (step S124). Function f used when the input data conversion unit 101 converts the value is, for example, set by the user, the system, or the like. Alternatively, the input data conversion unit 101 may hold function f beforehand.
[0126] The input data conversion unit 101 may use any mathematical function as function f. Function f used for the conversion into the value depicted in
[0127] The input data conversion unit 101 then sets the label information of the numerical inversion label information corresponding to value v, to converted value v′ (step S125). The input data conversion unit 101 performs the value conversion and the converted value setting for the number of conversion target values included in the job feature information (step S126).
[0128] For example, in the case of the job feature information of the task number “1” depicted in
[0129] For example, in the case of the numerical inversion label information of the task number “1” depicted in
[0130] As a result of the input data conversion unit 101 performing the value conversion and the converted value setting for the number of conversion target values included in the job feature information (the determination condition in step S126 is met), the numerical inversion label information corresponding to the input job feature information is generated. The input data conversion unit 101 repeatedly performs the process of steps S122 to S127 for the number of input pieces of job feature information.
[0131] After generating the numerical inversion label information for the number of input pieces of job feature information (the determination condition in step S127 is met), the input data conversion unit 101 ends the generation process.
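The conversion of value v into value v′ by function f (step S124) and the setting of the label information (step S125) can be sketched as follows. The concrete function f used in the original is not recoverable from this text, so the example assumes a base-2 logarithmic bucketing purely for illustration. The feature names `input_bytes` and `num_reducers`, the sample values, the “Label” prefix, and the function name `build_numerical_inversion_labels` are likewise hypothetical.

```python
import math
from typing import Dict


def f(v: float) -> int:
    """Assumed example of function f: floor(log2(v)), with values below 1 mapped to 0."""
    return int(math.floor(math.log2(v))) if v >= 1 else 0


def build_numerical_inversion_labels(
    job_features: Dict[str, Dict[str, float]],
    prefix: str = "Label",
) -> Dict[str, Dict[str, str]]:
    result: Dict[str, Dict[str, str]] = {}
    # Repeat for the number of input pieces of job feature information.
    for task, values in job_features.items():
        labels: Dict[str, str] = {}
        # Repeat for the number of conversion target values (step S126).
        for name, v in values.items():
            v_converted = f(v)                                # step S124
            labels[f"{prefix}-{name}"] = str(v_converted)     # step S125, stored as a string
        result[task] = labels
    return result


# Hypothetical numerical job feature information.
job_features = {
    "1": {"input_bytes": 300.0, "num_reducers": 1.0},
    "2": {"input_bytes": 70000.0, "num_reducers": 4.0},
}
labels = build_numerical_inversion_labels(job_features)
```

Because the labels are stored as character strings, a downstream algorithm treats them as categories rather than as magnitudes, which is the point of the conversion.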
[0132] The following describes the effect of the information obtained as a result of the conversion as depicted in
[0133] Accordingly, in the case of using the numerical inversion label information depicted in
[0134] For example, the naive Bayes algorithm handles input data as discrete values. When handling numerical information which is a continuous quantity, the naive Bayes algorithm interprets all values as discontinuous discrete values.
[0135] Interpreting continuous values as discontinuous discrete values is not an operation that the naive Bayes algorithm is designed to perform. When the information is interpreted in this way, the naive Bayes algorithm is prone to overfitting or similar problems in the estimation process. Overfitting or the like degrades the accuracy of the estimates of the amount of computer resources produced by the naive Bayes algorithm.
[0136] The numerical inversion label information output from the input data conversion unit 101 in this exemplary embodiment includes the numerical value converted from a continuous quantity to a discrete quantity by function f, as label information. In the case where the numerical inversion label information including the label information is the input data, the computer resources usage estimation unit 103 can use an algorithm, such as the naive Bayes algorithm, that can only handle discrete values. The possibility that the computer resources usage estimation unit 103 can accurately estimate the amount of computer resources required for task processing using the naive Bayes algorithm is thus increased.
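The difference between treating raw continuous values as discrete categories and treating converted labels as categories can be illustrated with a simple count. This is a sketch only: the observed values are invented for the example, and function f is again assumed to be a base-2 logarithmic bucketing, which is not necessarily the function used in the original.

```python
import math
from collections import Counter


def f(v: float) -> str:
    """Assumed example of function f, returning the label as a character string."""
    return str(int(math.floor(math.log2(v))))


# Hypothetical observed values of one numerical job feature.
observed = [1030.0, 1100.5, 1999.9, 4100.0, 4500.25, 7900.0]

# Treating each raw value as its own discrete category: every category has one sample.
raw_counts = Counter(str(v) for v in observed)

# Treating the converted label as the category: nearby values share a category.
label_counts = Counter(f(v) for v in observed)
```

With the raw values, an algorithm that only handles discrete values sees six categories with one observation each and has nothing to generalize from. With the labels, the same six observations fall into two categories of three observations each.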
[0137] By adjusting function f, the input data conversion unit 101 can convert the distribution of the input data into another distribution. The conversion of the data distribution increases the possibility that the computer resources usage estimation unit 103 can classify data more clearly.
[0138] According to this exemplary embodiment, the amount of computer resources required for task processing in the distributed processing system is estimated accurately. By receiving the information output from the input data conversion unit 101 as input, the computer resources usage estimation model generation unit 102 can easily classify, for each estimation algorithm, the determinant of the format of the function for computing the amount of computer resources. The classification of the determinant for each estimation algorithm corresponds to extracting the task group whose word candidate “Jobname-kmeans” is True or extracting the task group whose word candidate “Class-mahout” is True as mentioned above.
[0139] By receiving the classified determinant as input for generating the algorithm that estimates the amount of computer resources, the computer resources usage estimation model generation unit 102 can generate a function in a format close to the value distribution in task processing. The computer resources usage estimation unit 103 can enhance estimation accuracy by estimating computer resources usage using this function generated by the computer resources usage estimation model generation unit 102.
Exemplary Embodiment 2
[0140] [Structure]
[0141] The following describes Exemplary Embodiment 2 of the present invention with reference to drawings.
[0142] As depicted in
[0143] The estimate reverse conversion unit 104 has a function of reversely converting the value output from the computer resources usage estimation unit 103, into a computer resources usage estimate. The estimate reverse conversion unit 104 is, for example, realized by a CPU that executes processes according to program control.
[0144] In this exemplary embodiment, the computer resources usage estimation model generation unit 102 receives the data output from the input data conversion unit 101, and generates the estimation model. The computer resources usage estimation unit 103 receives the data output from the input data conversion unit 101, and outputs, in the same format as the received data, the value of computer resources usage of a task whose feature has not been recognized yet.
[0145] The estimate reverse conversion unit 104 converts the value indicating the computer resources usage estimate output from the computer resources usage estimation unit 103 into numerical information indicating the computer resources usage estimate, and outputs the numerical information. The use of the computer resources usage estimation device 100 in this exemplary embodiment enables the user, the distributed processing system scheduler, etc. to estimate the amount of computer resources required for task processing.
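The reverse conversion from a label back to numerical information can be sketched as follows. The description does not state how the estimate reverse conversion unit 104 performs this step, so the example is an assumption throughout: function f is taken to be a base-2 logarithmic bucketing, and the reverse conversion returns the arithmetic midpoint of the bucket that the label stands for. The function name `reverse_convert` is hypothetical.

```python
import math


def f(v: float) -> str:
    """Assumed forward conversion: label k means 2**k <= v < 2**(k + 1)."""
    return str(int(math.floor(math.log2(v))))


def reverse_convert(label: str) -> float:
    """Assumed reverse conversion: midpoint of the bucket [2**k, 2**(k + 1))."""
    k = int(label)
    lower = 2.0 ** k
    upper = 2.0 ** (k + 1)
    return (lower + upper) / 2.0


# An estimate output in label form is turned back into a numeric estimate.
estimated_label = "8"
estimated_usage = reverse_convert(estimated_label)

# Converting the numeric estimate forward again yields the same label.
round_trip_label = f(estimated_usage)
```

Because several input values map to one label, the reverse conversion can only return a representative value for the bucket, not the original value; other representatives, such as the lower bound or the geometric mean, would be equally valid choices.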
[0146] [Operation]
[0147] The following describes the operation of the input data conversion unit 101 and the operation of the estimate reverse conversion unit 104 in this exemplary embodiment, with reference to
[0148] The operation of the input data conversion unit 101 in this exemplary embodiment generating numerical inversion label information on the basis of one type of job feature information that includes computer resources usage observed during program execution is described first, with reference to
[0149]
[0150]
[0151] When the job feature information depicted in
[0152] When forming the numerical inversion label information, the input data conversion unit 101 generates each label information name by, for example, prefixing the identifier of the generation source information. The input data conversion unit 101 may generate each label information name by any other method, as long as the generated name is uniquely identifiable.
[0153] The input data conversion unit 101 may set the job feature information whose value has been replaced, as the numerical inversion label information. The numerical inversion label information depicted in
[0154] Following this, the input data conversion unit 101 converts value v included in the job feature information into value v′ using function f (step S204). Function f used when the input data conversion unit 101 converts the value is, for example, set by the user, the system, or the like. Alternatively, the input data conversion unit 101 may hold function f beforehand.
[0155] The input data conversion unit 101 may use any mathematical function as function f. Function f used for the conversion into the value depicted in
[0156] The input data conversion unit 101 then sets the label information of the numerical inversion label information corresponding to value v, to converted value v′ (step S205). The input data conversion unit 101 performs the value conversion and the converted value setting for the number of conversion target values included in the job feature information (step S206).
[0157] For example, in the case of the job feature information of the task number “1” depicted in
[0158] As a result of the input data conversion unit 101 performing the value conversion and the converted value setting for the number of conversion target values included in the job feature information (the determination condition in step S206 is met), the numerical inversion label information corresponding to the input job feature information is generated. The input data conversion unit 101 repeatedly performs the process of steps S202 to S207 for the number of input pieces of job feature information.
[0159] After generating the numerical inversion label information for the number of input pieces of job feature information (the determination condition in step S207 is met), the input data conversion unit 101 ends the generation process.
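The generation process of steps S202 to S207 can be sketched as follows. This is only an illustration under stated assumptions: the choice of function f (the rounded base-2 logarithm), the "label_" prefix, and the field names task and memory_mb are hypothetical and are not taken from the drawings.

```python
import math
from typing import Callable, Dict, List


def f_log2_bucket(v: float) -> int:
    """Hypothetical function f: round the base-2 logarithm of a positive value."""
    return round(math.log2(v))


def to_numerical_inversion_labels(
    records: List[Dict[str, object]],
    targets: List[str],
    f: Callable[[float], int] = f_log2_bucket,
    prefix: str = "label_",  # assumed naming rule; any unique name would do
) -> List[Dict[str, object]]:
    """Replace each conversion target value v with the string label of v' = f(v)."""
    converted = []
    for record in records:            # repeated per piece of job feature information
        out = dict(record)            # copy so the input is left untouched
        for name in targets:          # repeated per conversion target value
            v = out.pop(name)
            v_prime = f(v)            # value conversion (corresponds to step S204)
            out[prefix + name] = str(v_prime)  # label setting (corresponds to step S205)
        converted.append(out)
    return converted


# Hypothetical job feature information with observed memory usage in MB.
job_features = [
    {"task": 1, "memory_mb": 1024},
    {"task": 2, "memory_mb": 2000},
]
labels = to_numerical_inversion_labels(job_features, ["memory_mb"])
```

The converted values are stored as strings so that a learning algorithm treats them as class labels rather than as continuous quantities.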
[0160] The input data conversion unit 101 outputs the generated numerical inversion label information to the computer resources usage estimation model generation unit 102 including the machine learning algorithm and the like. The computer resources usage estimation model generation unit 102 generates an estimation model for computing a memory usage estimate, using the received numerical inversion label information.
[0161] The operation of the estimate reverse conversion unit 104 in this exemplary embodiment reversely converting the output value of the estimation algorithm into estimated computer resources usage is described next, with reference to
[0162]
[0163] As depicted in
[0164]
[0165] As depicted in
[0166] The following describes the operation of the estimate reverse conversion unit 104 generating the estimated memory usage information depicted in
[0167] The estimate reverse conversion unit 104 feeds output value p′ included in the numerical inversion label information output from the estimation model, to inverse function f⁻¹ of function f used in the conversion target value conversion process in step S204 in
[0168] The estimate reverse conversion unit 104 repeatedly performs the process of step S211 for the number of input pieces of numerical inversion label information. After generating estimated memory usage information for the number of input pieces of numerical inversion label information, the estimate reverse conversion unit 104 ends the process.
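The reverse conversion can be sketched in the same way. The sketch assumes, purely for illustration, that function f is the rounded base-2 logarithm, so that the inverse maps label p′ to 2 raised to the power p′; the function names are hypothetical.

```python
import math
from typing import List


def f(v: float) -> int:
    """Assumed function f: round the base-2 logarithm of a positive value."""
    return round(math.log2(v))


def f_inverse(p_prime: int) -> float:
    """Inverse of the assumed f: return the representative value 2 ** p'."""
    return 2.0 ** p_prime


def reverse_convert(output_labels: List[str]) -> List[float]:
    """Turn string labels output by the estimation model into numeric estimates."""
    estimates = []
    for label in output_labels:       # repeated per piece of label information
        p_prime = int(label)          # the label is a character string
        estimates.append(f_inverse(p_prime))  # corresponds to step S211
    return estimates


estimated_memory_mb = reverse_convert(["10", "11"])
```

Because the assumed f rounds its result, the inverse recovers a representative value for each label, not the exact value that was originally observed.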
[0169] Thus, the computer resources usage estimation device 100 in this exemplary embodiment can convert the character string included in the numerical inversion label information output from the estimation model, into a computer resources usage estimate which is numerical information. By using the converted estimate, the distributed processing system can process the task faster or more efficiently. The use of the estimate increases the possibility that the amount of computer resources assigned to the process can be kept to the minimum required quantity.
[0170] For example, suppose the user configures every process in the distributed processing system to use 2 GB of memory. With this setting, a computer with 4 GB of memory can execute two processes in parallel. In the case where the memory actually used by a process is 1 GB, however, the setting means that 2 GB of memory is unnecessarily assigned on the computer.
[0171] If it is possible to estimate that the memory required for the process is 1 GB, the user can configure the distributed processing system to assign four processes at once to a computer with 4 GB of memory. By executing the four processes in parallel, the distributed processing system can process the job at double speed, as compared with the aforementioned setting. Moreover, the unnecessary assignment of 2 GB of memory is avoided, which contributes to higher computer resources use efficiency than the aforementioned setting.
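The arithmetic of this example can be checked with a short sketch. It assumes whole-gigabyte sizes, processes of equal duration, and ideal parallel scaling; the function names are hypothetical.

```python
def parallel_processes(node_memory_gb: int, assigned_per_process_gb: int) -> int:
    """Number of processes that fit on one computer at the same time."""
    return node_memory_gb // assigned_per_process_gb


def wasted_memory_gb(node_memory_gb: int, assigned_gb: int, used_gb: int) -> int:
    """Memory assigned to running processes but never used by them."""
    return parallel_processes(node_memory_gb, assigned_gb) * (assigned_gb - used_gb)


NODE_GB = 4      # memory of the computer
FIXED_GB = 2     # uniform setting chosen by the user
ESTIMATE_GB = 1  # estimated memory actually required per process

with_fixed_setting = parallel_processes(NODE_GB, FIXED_GB)
with_estimate = parallel_processes(NODE_GB, ESTIMATE_GB)
speedup = with_estimate / with_fixed_setting
wasted_fixed = wasted_memory_gb(NODE_GB, FIXED_GB, ESTIMATE_GB)
wasted_estimate = wasted_memory_gb(NODE_GB, ESTIMATE_GB, ESTIMATE_GB)
```

Under these assumptions the fixed setting runs two processes and leaves 2 GB unused, while the estimate-based setting runs four processes with no unused assignment, which gives the doubling described above.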
[0172] The following describes the effect of the information obtained as a result of the conversion as depicted in
[0173] The numerical inversion label information depicted in
[0174] Accordingly, in the case of using the numerical inversion label information depicted in
[0175] For example, the naive Bayes algorithm handles discrete values as an estimation target. When handling numerical information which is a continuous quantity as an estimation target, the naive Bayes algorithm interprets all values as discontinuous discrete values.
[0176] Interpreting continuous quantities as discontinuous discrete values is not an operation that the naive Bayes algorithm is designed to perform. When it interprets the information in this way, the naive Bayes algorithm is prone to overfitting or the like in the estimation process. Overfitting or the like degrades the accuracy of the estimate of the amount of computer resources by the naive Bayes algorithm.
[0177] The numerical inversion label information output from the input data conversion unit 101 in this exemplary embodiment includes the numerical value converted from a continuous quantity to a discrete quantity by function f, as the label information. In the case where the numerical inversion label information including the label information is an estimation target, the computer resources usage estimation unit 103 can use an algorithm, such as the naive Bayes algorithm, that can only handle discrete values as estimates. The possibility that the computer resources usage estimation unit 103 can accurately estimate the amount of computer resources required for task processing using the naive Bayes algorithm is thus increased.
[0178] By adjusting function f, the computer resources usage estimation device 100 can obtain an estimate of appropriate resolution. For example, by using a logarithmic function as function f, the computer resources usage estimation device 100 can estimate a large value without being affected by slight changes in that value. This increases the possibility that the amount of computer resources is estimated to an appropriate degree in conformity with the status of the distributed processing system.
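The effect of adjusting function f on resolution can be illustrated with a sketch in which the base of the logarithm is the adjustable parameter. The rounding rule, the bases 2 and 1.1, and the sample values are assumptions chosen for illustration.

```python
import math
from typing import Callable, List


def make_log_label(base: float) -> Callable[[float], int]:
    """Build a hypothetical function f that rounds the logarithm to the given base."""
    def f(v: float) -> int:
        return round(math.log(v, base))
    return f


# Hypothetical memory usage observations in MB.
usage_mb: List[float] = [1000, 1010, 1100, 2000, 4100]

coarse_f = make_log_label(2.0)   # larger base: coarser resolution
fine_f = make_log_label(1.1)     # base closer to 1: finer resolution

coarse_labels = [coarse_f(v) for v in usage_mb]
fine_labels = [fine_f(v) for v in usage_mb]

# Fewer distinct labels means fewer classes for the estimation model.
coarse_classes = len(set(coarse_labels))
fine_classes = len(set(fine_labels))
```

With the coarse function, the values 1000, 1010, and 1100 share one label, so a slight change in a large value does not change the estimation target. A base closer to 1 separates nearby values into more classes.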
[0179] The following describes an overview of the present invention.
[0180] With such a structure, the output device can provide information in a format suitable for a model that estimates the amount of computer resources required for task processing in a distributed processing system.
[0181] The estimation model application information may include word-containing information having binary information that indicates whether or not a character string indicated by the character string information included in the job feature information includes a prescribed word.
[0182] With such a structure, the output device can provide information indicating whether or not a job name or a class name includes a prescribed word.
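Word-containing information of this kind can be sketched as follows. The sketch assumes case-sensitive substring matching and a "contains_" naming rule; the job name and the prescribed words are hypothetical examples.

```python
from typing import Dict, List


def word_containing_info(text: str, prescribed_words: List[str]) -> Dict[str, int]:
    """Return 1 or 0 per prescribed word, depending on whether text contains it."""
    return {
        "contains_" + word: int(word in text)  # assumed rule: substring match
        for word in prescribed_words
    }


# Hypothetical job name taken from job feature information.
job_name = "DailySalesAggregation"
info = word_containing_info(job_name, ["Sales", "Sort"])
```

Each binary value can be passed to the estimation model as a separate feature in place of the raw character string.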
[0183] The estimation model application information may include numerical inversion label information having, as string label information, a value derived by converting, by a prescribed function, the numeric value indicated by the numerical information included in the job feature information.
[0184] With such a structure, the output device can provide information including string label information that can be easily handled by the estimation model.
[0185] The output device 10 may include a form conversion unit (for example, the estimate reverse conversion unit 104) for outputting the estimation model application information output from the estimation model, in the same format as the job feature information corresponding to the estimation model application information.
[0186] With such a structure, the output device can provide information of computer resources usage in a format desired by the user.
[0187] The output device 10 may include a computer resources estimation unit (for example, the computer resources usage estimation unit 103) for estimating the amount of computer resources required for processing the task included in the job corresponding to the job feature information in the distributed processing system, by feeding the estimation model application information output from the output unit 11 on the basis of the job feature information into the estimation model.
[0188] With such a structure, the output device can estimate computer resources usage on the basis of estimation model application information.
[0189] The output device 10 may include a computer resources estimation model generation unit (for example, the computer resources usage estimation model generation unit 102) for generating the estimation model for estimating the amount of computer resources required for processing the task included in the job corresponding to the job feature information in the distributed processing system, using the estimation model application information output from the output unit 11 on the basis of the job feature information.
[0190] With such a structure, the output device can generate a computer resources usage estimation model on the basis of estimation model application information.
[0191]
[0192] With such a structure, the data structure can provide information in a format suitable for a model that estimates the amount of computer resources required for task processing in a distributed processing system.
[0193] The estimation model application information may include word-containing information having binary information that indicates whether or not a character string indicated by the character string information included in the job feature information includes a prescribed word.
[0194] With such a structure, the data structure can provide information indicating whether or not a job name or a class name includes a prescribed word.
[0195] The estimation model application information may include numerical inversion label information having, as string label information, a value derived by converting, by a prescribed function, the numeric value indicated by the numerical information included in the job feature information.
[0196] With such a structure, the data structure can provide information including string label information that can be easily handled by the estimation model.
[0197] Although the present invention has been described with reference to the above exemplary embodiments and examples, the present invention is not limited to the above exemplary embodiments and examples. Various changes understandable by those skilled in the art can be made to the structures and details of the present invention within the scope of the present invention.
[0198] This application claims priority based on Japanese Patent Application No. 2015-010492 filed on Jan. 22, 2015, the disclosure of which is incorporated herein in its entirety.
REFERENCE SIGNS LIST
[0199] 10 output device
[0200] 11 output unit
[0201] 100 computer resources usage estimation device
[0202] 101 input data conversion unit
[0203] 102 computer resources usage estimation model generation unit
[0204] 103 computer resources usage estimation unit
[0205] 104 estimate reverse conversion unit