STORAGE MEDIUM, MODEL REDUCTION APPARATUS, AND MODEL REDUCTION METHOD
20230162035 · 2023-05-25
CPC classification: G06N3/082; G06N3/043
Abstract
A non-transitory computer-readable storage medium storing a model reduction program that causes at least one computer to execute a process, the process includes identifying, as a deletion target, a first neuron that does not connect to an input layer in a neural network; identifying, as a deletion target, a second neuron that does not connect to an output layer in the neural network; combining a bias of the first neuron with a bias of a third neuron connected to the first neuron on an output side; and deleting the first neuron and the second neuron from the neural network.
Claims
1. A non-transitory computer-readable storage medium storing a model reduction program that causes at least one computer to execute a process, the process comprising: identifying, as a deletion target, a first neuron that does not connect to an input layer in a neural network; identifying, as a deletion target, a second neuron that does not connect to an output layer in the neural network; combining a bias of the first neuron with a bias of a third neuron connected to the first neuron on an output side; and deleting the first neuron and the second neuron from the neural network.
2. The non-transitory computer-readable storage medium according to claim 1, wherein the identifying the first neuron includes correcting a weight on the output side of the first neuron to 0, the identifying the second neuron includes correcting a weight on an input side of the second neuron to 0, and wherein the process further comprises deleting, from the neural network, a neuron all weights of which on an input side and on an output side are 0.
3. The non-transitory computer-readable storage medium according to claim 2, wherein the process further comprises: identifying the first neuron as the deletion target in a forward search from the input layer toward the output layer in the neural network; and identifying the second neuron as the deletion target in a backward search from the output layer toward the input layer in the neural network.
4. The non-transitory computer-readable storage medium according to claim 2, wherein the identifying the first neuron and the identifying the second neuron includes correcting corresponding elements to 0 in a parameter table in which a weight between connected neurons is stored in an element of a matrix in which one of the connected neurons is assigned to a row and another of the connected neurons is assigned to a column.
5. The non-transitory computer-readable storage medium according to claim 4, wherein the deleting the neuron all the weights of which on the input side and on the output side are 0 includes deleting, in the parameter table, a row and a column that correspond to the weight of the neuron that is the deletion target.
6. The non-transitory computer-readable storage medium according to claim 1, wherein the combining includes adding, to the bias of the third neuron, a value obtained by multiplying the bias of the first neuron by a weight between the first neuron and the third neuron.
7. The non-transitory computer-readable storage medium according to claim 1, wherein the combining includes adding, to the bias of the third neuron, a value obtained by multiplying, by a weight between the first neuron and the third neuron, a value obtained by applying an activation function of the first neuron to the bias of the first neuron.
8. A model reduction apparatus comprising: one or more memories; and one or more processors coupled to the one or more memories, the one or more processors configured to: identify, as a deletion target, a first neuron that does not connect to an input layer in a neural network, identify, as a deletion target, a second neuron that does not connect to an output layer in the neural network, combine a bias of the first neuron with a bias of a third neuron connected to the first neuron on an output side, and delete the first neuron and the second neuron from the neural network.
9. The model reduction apparatus according to claim 8, wherein the one or more processors are further configured to: correct a weight on the output side of the first neuron to 0, correct a weight on an input side of the second neuron to 0, and delete a neuron all weights of which on an input side and on an output side are 0 from the neural network.
10. The model reduction apparatus according to claim 9, wherein the one or more processors are further configured to: identify the first neuron as the deletion target in a forward search from the input layer toward the output layer in the neural network, and identify the second neuron as the deletion target in a backward search from the output layer toward the input layer in the neural network.
11. The model reduction apparatus according to claim 9, wherein the one or more processors are further configured to correct corresponding elements to 0 in a parameter table in which a weight between connected neurons is stored in an element of a matrix in which one of the connected neurons is assigned to a row and another of the connected neurons is assigned to a column.
12. The model reduction apparatus according to claim 11, wherein the one or more processors are further configured to delete, in the parameter table, a row and a column that correspond to the weight of the neuron that is the deletion target.
13. A model reduction method for a computer to execute a process comprising: identifying, as a deletion target, a first neuron that does not connect to an input layer in a neural network; identifying, as a deletion target, a second neuron that does not connect to an output layer in the neural network; combining a bias of the first neuron with a bias of a third neuron connected to the first neuron on an output side; and deleting the first neuron and the second neuron from the neural network.
14. The model reduction method according to claim 13, wherein the identifying the first neuron includes correcting a weight on the output side of the first neuron to 0, the identifying the second neuron includes correcting a weight on an input side of the second neuron to 0, and wherein the process further comprises deleting, from the neural network, a neuron all weights of which on an input side and on an output side are 0.
15. The model reduction method according to claim 14, wherein the process further comprises: identifying the first neuron as the deletion target in a forward search from the input layer toward the output layer in the neural network; and identifying the second neuron as the deletion target in a backward search from the output layer toward the input layer in the neural network.
16. The model reduction method according to claim 14, wherein the identifying the first neuron and the identifying the second neuron includes correcting corresponding elements to 0 in a parameter table in which a weight between connected neurons is stored in an element of a matrix in which one of the connected neurons is assigned to a row and another of the connected neurons is assigned to a column.
17. The model reduction method according to claim 16, wherein the deleting the neuron all the weights of which on the input side and on the output side are 0 includes deleting, in the parameter table, a row and a column that correspond to the weight of the neuron that is the deletion target.
Description
BRIEF DESCRIPTION OF DRAWINGS
DESCRIPTION OF EMBODIMENTS
[0028] When only parameters whose degree of influence is small are deleted, as in the related-art model size-reduction technique, useless parameters may remain because of the configuration of the network. In this case, the calculation efficiency of inference with the generated model decreases. Moreover, when parameters whose influence is small are simply deleted, information useful for inference may be lost, and the accuracy of the model after the deletion of the parameters may degrade.
[0029] In one aspect, an object of the disclosed technique is to improve an effect of size reduction of a machine learning model while suppressing degradation in accuracy of the machine learning model.
[0030] In the one aspect, the effect of the size reduction of the machine learning model may be improved while degradation in the accuracy of the machine learning model is suppressed.
[0031] Hereinafter, an example of an embodiment according to the disclosed technique will be described with reference to the drawings.
[0032] As illustrated in
[0033] An example of the existing model size-reduction technique will be described with reference to
[0034] As illustrated in
[0035] Each neuron also has a bias as a parameter. For example, in a case where the value y output from a neuron is calculated by a simple linear function (y = ax + b), b is the bias term, x is the value output from a neuron in the previous stage, and a is the weight between the neuron in the previous stage and the target neuron. The bias is a constant that is obtained as a result of machine learning and does not depend on the input. If the weight between a neuron that has no input (for example, the neuron I described above) and a neuron coupled to it on the output side is simply deleted, the means of conveying the bias information of the input-less neuron to the neuron on the output side is lost. As a result, information useful for inference may be lost, and the accuracy of the model after the size reduction may degrade.
[0036] Thus, according to the present embodiment, the size of a model is reduced by deleting the parameters so that the effect of the model size reduction may be improved while suppressing degradation in the accuracy of the model. Hereinafter, a functional configuration of the model reduction apparatus 10 according to the present embodiment will be described in detail. Hereinafter, as illustrated in
[0037] As illustrated in
[0038] The correction unit 12 obtains parameter tables input to the model reduction apparatus 10.
[0039] In the neural network, the correction unit 12 identifies, as deletion targets, first neurons without a coupling from the input layer and second neurons without a coupling to the output layer. Then, in the parameter tables, the correction unit 12 corrects the output weights of the first neurons to 0 and the input weights of the second neurons to 0.
[0040] For example, as illustrated in
[0041] Similarly, as illustrated in
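The forward and backward identification described above reduces to finding all-zero rows and all-zero columns of the parameter tables. The following is an illustrative Python sketch, not the patented implementation; the table layout (one row per neuron of a layer, holding its incoming weights followed by a trailing bias column) and all names are assumptions for illustration.

```python
def zero_input_neurons(table):
    """Indices of this layer's neurons whose input weights are all 0
    (rows of the table in which every weight is 0)."""
    return [i for i, row in enumerate(table) if all(w == 0 for w in row[:-1])]

def zero_output_neurons(next_table):
    """Indices of previous-layer neurons whose output weights are all 0
    (columns of the NEXT layer's table in which every weight is 0)."""
    n_cols = len(next_table[0]) - 1              # exclude the bias column
    return [j for j in range(n_cols) if all(row[j] == 0 for row in next_table)]

# Example: 3 neurons in layer n-1 feeding 2 neurons in layer n.
W_n = [[0.0, 0.7, 0.0, 0.1],    # weights + bias of neuron 0 in layer n
       [0.0, 0.2, 0.5, -0.3]]   # weights + bias of neuron 1 in layer n
print(zero_output_neurons(W_n))  # → [0]: neuron 0 of layer n-1 feeds nothing
```

A neuron found by both checks has no path from the input layer or no path to the output layer and so is a deletion candidate.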
[0042] For the first neurons identified as the deletion targets in the forward search, the correction unit 12 notifies the compensation unit 14 so that the compensation unit 14 executes a process of compensating for the biases of the identified neurons.
[0043] Based on the notification from the correction unit 12, the compensation unit 14 compensates for biases of the first neurons as the deletion targets by combining the biases of the first neurons with biases of third neurons coupled to the first neurons on the output side. For example, the compensation unit 14 combines the biases by adding values obtained by multiplying the biases of the first neurons by the weights between the first neurons and the third neurons to the biases of the third neurons.
[0044] For example, a case is described in which the neuron I the bias of which is b.sub.I is identified as the deletion-target first neuron. As illustrated in a one-dot chain line portion of an upper section of
b_L ← b_L + w_{I,L}^(3)·b_I,   b_M ← b_M + w_{I,M}^(3)·b_I
[0045] The deletion unit 16 deletes the identified deletion-target neurons from the neural network. The input weights and the output weights of the deletion-target neurons are all 0 in the parameter tables. For example, the deletion unit 16 deletes rows and columns corresponding to the weights of the deletion-target neurons in the parameter tables. For example, in a case where the neuron i in the (n−1)th layer is the deletion target, the deletion unit 16 deletes the row of the neuron i all the weights of which are 0 in the parameter table of the (n−1)th layer and the column of the neuron i all the weights of which are 0 in the parameter table of the nth layer.
[0046] For example, as illustrated in a left section of
[0047] The model reduction apparatus 10 may be realized by using, for example, a computer 40 illustrated in
[0048] The storage unit 43 may be realized by using a hard disk drive (HDD), a solid-state drive (SSD), a flash memory, or the like. The storage unit 43 serving as a storage medium stores a model reduction program 50 for causing the computer 40 to function as the model reduction apparatus 10. The model reduction program 50 includes a correction process 52, a compensation process 54, and a deletion process 56.
[0049] The CPU 41 reads the model reduction program 50 from the storage unit 43, loads the read model reduction program 50 on the memory 42, and sequentially executes the processes included in the model reduction program 50. The CPU 41 executes the correction process 52 to operate as the correction unit 12 illustrated in
[0050] The functions realized by the model reduction program 50 may instead be realized by, for example, a semiconductor integrated circuit, in more detail, an application-specific integrated circuit (ASIC) or the like.
[0051] Next, operations of the model reduction apparatus 10 according to the present embodiment will be described. When parameter tables which represent a neural network and from which a subset of parameters has been deleted by using the existing model size-reduction technique are input to the model reduction apparatus 10, a model reduction process illustrated in
[0052] In step S10, the correction unit 12 obtains parameter tables input to the model reduction apparatus 10. Next, in step S20, the correction unit 12 executes a forward weight correction process, identifies the first neurons without a coupling from the input layer as the deletion targets, and corrects the output weights of the first neurons to 0 in the parameter tables. In so doing, the compensation unit 14 executes the process of compensating for the biases of the first neurons. Next, in step S40, the correction unit 12 executes a backward weight correction process, identifies the second neurons without a coupling to the output layer as the deletion targets, and corrects the input weights of the second neurons to 0 in the parameter tables. Next, in step S60, the deletion unit 16 executes a deletion process to delete the deletion-target neurons from the neural network. Hereinafter, each of the forward weight correction process, the backward weight correction process, and the deletion process will be described in detail.
[0053] First, the forward weight correction process will be described with reference to
[0054] In step S21, the correction unit 12 sets a variable n that identifies a hierarchical layer to be processed in the neural network to 2. Next, in step S22, the correction unit 12 determines whether n exceeds N representing the number of hierarchical layers of the neural network. In a case where n does not exceed N, the process proceeds to step S23.
[0055] In step S23, the correction unit 12 obtains a list {c_i} of the neurons all the input weights of which are 0 in the (n−1)th layer. The number of a neuron in the (n−1)th layer is represented by i, where i = 1, 2, . . . , I_{n−1} (I_{n−1} is the number of neurons in the (n−1)th layer). The numbers of the neurons all the input weights of which are 0 out of the neurons in the (n−1)th layer are represented by c_i. For example, the correction unit 12 adds, to the list, the numbers of the neurons corresponding to the rows in which all the weights are 0 in the parameter table of the (n−1)th layer and obtains {c_i}.
[0056] Next, in step S24, the correction unit 12 sets i to 1. Next, in step S25, the correction unit 12 determines whether i exceeds the maximum value C_{n−1} of the numbers of the neurons included in the list {c_i}. In a case where i does not exceed C_{n−1}, the process moves to step S26. In step S26, the correction unit 12 sets j to 1. The number of a neuron in the nth layer is represented by j, where j = 1, 2, . . . , J_n (J_n is the number of neurons in the nth layer). Next, in step S27, the correction unit 12 determines whether j exceeds J_n. In a case where j does not exceed J_n, the process moves to step S28.
[0057] In step S28, the compensation unit 14 compensates for the bias of the ith neuron in the (n−1)th layer by combining it with the bias of the jth neuron in the nth layer. For example, the compensation unit 14 calculates the bias of the jth neuron in the nth layer as b_j ← b_j + w_{c_i,j}^(n)·b_i and updates the value in the bias column of the row corresponding to the jth neuron in the parameter table of the nth layer. Next, in step S29, the correction unit 12 deletes the output weight from the ith neuron in the (n−1)th layer to the jth neuron in the nth layer. For example, the correction unit 12 corrects the weight w_{c_i,j}^(n) stored in the parameter table of the nth layer to 0. Thus, both the input weight and the output weight of the ith neuron in the (n−1)th layer are 0.
[0058] Next, in step S30, the correction unit 12 increments j by one, and the process returns to step S27. In a case where j exceeds J.sub.n in step S27, the process moves to step S31. In step S31, the correction unit 12 increments i by one, and the process returns to step S25. In a case where i exceeds C.sub.n−1 in step S25, the process moves to step S32. In step S32, the correction unit 12 increments n by one, and the process returns to step S22. In step S22, in a case where n exceeds N, the forward weight correction process ends, and the processing returns to the model reduction process (
[0059] In a case where there is no coupling relationship between the ith neuron in the (n−1)th layer and the jth neuron in the nth layer, the processing in steps S28 and S29 described above is skipped. In a case where i is not included in the list {c.sub.i}, for example, in a case where any of the input weights of the ith neuron in the (n−1)th layer is not 0, the processing in steps S27 to S30 described above is skipped. Then, in step S31 described above, the correction unit 12 may increment i by one, and the process may return to step S25.
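Steps S21 to S32 can be condensed into a short sketch. This is an illustrative Python rendering under assumptions stated in the comments (list-of-rows tables with a trailing bias column; `tables[0]` and `tables[1]` unused), not the patented implementation.

```python
def forward_correct(tables):
    """Sketch of the forward weight correction (steps S21-S32): fold the
    biases of input-less neurons into downstream biases, then zero their
    output weights.

    Assumed layout: tables[n] is the parameter table of layer n -- one row
    per neuron of layer n, holding its incoming weights from layer n-1 and
    then its bias in the last column. tables[0]/tables[1] are None because
    the input layer has no incoming weights.
    """
    N = len(tables) - 1                          # number of layers
    for n in range(2, N + 1):
        prev, cur = tables[n - 1], tables[n]
        if prev is None:                         # layer 1 is the input layer
            continue
        # neurons of layer n-1 whose input weights are all 0 (the list {c_i})
        dead = [i for i, row in enumerate(prev) if all(w == 0 for w in row[:-1])]
        for i in dead:
            b_i = prev[i][-1]
            for row in cur:                      # each neuron j of layer n
                row[-1] += row[i] * b_i          # b_j <- b_j + w_{c_i,j} * b_i
                row[i] = 0.0                     # zero neuron i's output weight
    return tables

# Example: layer-2 neuron 0 has no inputs but bias 0.5; its downstream
# weight is 2.0, so 2.0 * 0.5 is folded into the layer-3 bias and the
# weight itself is zeroed.
tables = [None, None,
          [[0.0, 0.0, 0.5],      # layer 2, neuron 0 (no inputs, bias 0.5)
           [0.3, 0.4, 0.1]],     # layer 2, neuron 1
          [[2.0, 1.0, 0.2]]]     # layer 3, neuron 0 (weights + bias)
forward_correct(tables)
print(tables[3])                 # layer-3 bias is now 0.2 + 2.0*0.5 = 1.2
```

Because the folded bias is added before the weight is zeroed, the constant contribution of the deleted neuron is preserved exactly, which is the point of the compensation in step S28.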
[0060] Next, the backward weight correction process will be described with reference to
[0061] In step S41, the correction unit 12 sets the variable n that identifies a hierarchical layer to be processed in the neural network to N−1. Next, in step S42, the correction unit 12 determines whether n is smaller than two. In a case where n is greater than or equal to two, the process moves to step S43.
[0062] In step S43, the correction unit 12 obtains a list {c_j} of the neurons all the output weights of which are 0 in the nth layer. The numbers of the neurons all the output weights of which are 0 out of the neurons in the nth layer are represented by c_j. For example, the correction unit 12 adds, to the list, the numbers of the neurons corresponding to the columns in which all the weights are 0 in the parameter table of the (n+1)th layer and obtains {c_j}.
[0063] Next, in step S44, the correction unit 12 sets j to 1. Next, in step S45, the correction unit 12 determines whether j exceeds the maximum value C_n of the numbers of the neurons included in the list {c_j}. In a case where j does not exceed C_n, the process moves to step S46. In step S46, the correction unit 12 sets i to 1. Next, in step S47, the correction unit 12 determines whether i exceeds I_{n−1}. In a case where i does not exceed I_{n−1}, the process moves to step S49.
[0064] In step S49, the correction unit 12 deletes the input weight from the ith neuron in the (n−1)th layer to the jth neuron in the nth layer. For example, the correction unit 12 corrects the weight w_{i,c_j}^(n) stored in the parameter table of the nth layer to 0. Thus, both the input weight and the output weight of the jth neuron in the nth layer are 0.
[0065] Next, in step S50, the correction unit 12 increments i by one, and the process returns to step S47. In a case where i exceeds I.sub.n−1 in step S47, the process moves to step S51. In step S51, the correction unit 12 increments j by one, and the process returns to step S45. In a case where j exceeds C.sub.n in step S45, the process moves to step S52. In step S52, the correction unit 12 decrements n by one, and the process returns to step S42. In step S42, in a case where n becomes smaller than two, the backward weight correction process ends, and the processing returns to the model reduction process (
[0066] In a case where there is no coupling relationship between the ith neuron in the (n−1)th layer and the jth neuron in the nth layer, the processing in step S49 described above is skipped. In a case where j is not included in the list {c.sub.j}, for example, in a case where any of the output weights of the jth neuron in the nth layer is not 0, the processing in steps S47 to S50 described above is skipped. Then, in step S51 described above, the correction unit 12 may increment j by one, and the process may return to step S45.
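Steps S41 to S52 admit a similarly compact sketch. As before, this is an illustrative rendering under the assumed list-of-rows table layout, not the patented implementation; no bias compensation is needed here because an output-less neuron contributes nothing downstream.

```python
def backward_correct(tables):
    """Sketch of the backward weight correction (steps S41-S52): walking
    from the layer before the output layer back to layer 2, any neuron of
    layer n whose output weights (its column in layer (n+1)'s table) are
    all 0 has its own input weights (its row in layer n's table) zeroed.

    Assumed layout: tables[n] has one row per neuron of layer n, holding
    incoming weights and then the bias in the last column.
    """
    N = len(tables) - 1
    for n in range(N - 1, 1, -1):                # layers N-1 down to 2
        cur, nxt = tables[n], tables[n + 1]
        for j in range(len(cur)):                # each neuron j of layer n
            if all(row[j] == 0 for row in nxt):  # the list {c_j}: no output
                cur[j][:-1] = [0.0] * (len(cur[j]) - 1)
    return tables

# Example: layer-2 neuron 0 feeds only a zero weight, so its own input
# weights are zeroed, which makes it removable by the deletion process.
tables = [None, None,
          [[0.5, 0.0, 0.1], [0.3, 0.4, 0.2]],   # layer 2
          [[0.0, 2.0, 0.3]]]                    # layer 3: weight from neuron 0 is 0
backward_correct(tables)
print(tables[2])                                # → [[0.0, 0.0, 0.1], [0.3, 0.4, 0.2]]
```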
[0067] Next, the deletion process will be described with reference to
[0068] In step S61, the deletion unit 16 sets the variable n that identifies a hierarchical layer to be processed in the neural network to 2. Next, in step S62, the deletion unit 16 determines whether n exceeds N representing the number of hierarchical layers of the neural network. In a case where n does not exceed N, the process proceeds to step S63.
[0069] In step S63, the deletion unit 16 obtains the list {c_i} of the neurons all the input weights of which are 0 out of the neurons in the (n−1)th layer. For example, the deletion unit 16 adds, to the list, the numbers of the neurons corresponding to the rows in which all the weights are 0 in the parameter table of the (n−1)th layer and obtains {c_i}. Next, in step S64, the deletion unit 16 obtains a list {d_i} of the neurons all the output weights of which are 0 out of the neurons in the (n−1)th layer. For example, the deletion unit 16 adds, to the list, the numbers of the neurons corresponding to the columns in which all the weights are 0 in the parameter table of the nth layer and obtains {d_i}.
[0070] Next, in step S65, the deletion unit 16 obtains a list {e_i} that includes the elements shared between the list {c_i} and the list {d_i}. For example, the list {e_i} stores the numbers of the neurons all the input weights and all the output weights of which are 0 out of the neurons in the (n−1)th layer. Next, in step S66, the deletion unit 16 obtains a list {f_i} as the difference set between a list that includes all the numbers of the neurons in the (n−1)th layer and the list {e_i}. For example, the list {f_i} stores the numbers of the neurons that are not deletion targets out of the neurons in the (n−1)th layer.
[0071] Next, in step S67, the deletion unit 16 updates the parameter table of the (n−1)th layer so that the weight w_{h,f_i}^(n−1) becomes w_{h,i′}^(n−1) and updates the parameter table of the nth layer so that the weight w_{f_i,j}^(n) becomes w_{i′,j}^(n). Here, h is the number (h = 1, 2, . . . ) of a neuron in the (n−2)th layer, and i′ is a number newly assigned, as 1, 2, . . . , to the numbers included in {f_i}. Thus, for example, in a case where {f_i} = {1, 3}, the third row of the parameter table of the (n−1)th layer becomes the second row of the parameter table after the deletion, and the third column of the parameter table of the nth layer becomes the second column of the parameter table after the deletion. For example, the rows of the parameter table of the (n−1)th layer and the columns of the parameter table of the nth layer that correspond to the neurons of the numbers included in the list {e_i} are deleted.
[0072] Next, in step S68, the deletion unit 16 increments n by one, and the process returns to step S62. In step S62, in a case where n exceeds N, the deletion process ends, and the processing returns to the model reduction process (
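The set bookkeeping of steps S61 to S68 ({c_i}, {d_i}, their intersection {e_i}, and the surviving set {f_i}) can be sketched as follows. This is an illustrative rendering under the assumed list-of-rows layout, not the patented implementation; only hidden layers are examined here, so input and output neurons are never deleted.

```python
def delete_dead_neurons(tables):
    """Sketch of the deletion process (steps S61-S68): a hidden neuron whose
    input AND output weights are all 0 is removed by dropping its row from
    its own layer's table and its column from the next layer's table.

    Assumed layout: tables[n] has one row per neuron of layer n, holding
    incoming weights and then the bias in the last column.
    """
    N = len(tables) - 1
    for n in range(2, N):                                       # hidden layers
        cur, nxt = tables[n], tables[n + 1]
        zero_in = {i for i, row in enumerate(cur)
                   if all(w == 0 for w in row[:-1])}            # the list {c_i}
        zero_out = {i for i in range(len(cur))
                    if all(row[i] == 0 for row in nxt)}         # the list {d_i}
        dead = zero_in & zero_out                               # the list {e_i}
        keep = [i for i in range(len(cur)) if i not in dead]    # the list {f_i}
        tables[n] = [cur[i] for i in keep]                      # drop rows
        tables[n + 1] = [[row[i] for i in keep] + [row[-1]]     # drop columns,
                         for row in nxt]                        # keep the bias
    return tables

# Example: after the corrections, layer-2 neuron 0 has all-zero weights on
# both sides, so its row (layer 2) and its column (layer 3) are removed.
tables = [None, None,
          [[0.0, 0.0, 0.1], [0.3, 0.4, 0.2]],   # layer 2
          [[0.0, 2.0, 0.3]]]                    # layer 3
delete_dead_neurons(tables)
print(tables[2], tables[3])                     # → [[0.3, 0.4, 0.2]] [[2.0, 0.3]]
```

Rebuilding the lists with surviving indices renumbers the remaining neurons consecutively, mirroring the i′ renumbering described in step S67.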
[0073] As described above, in the neural network, the model reduction apparatus according to the present embodiment identifies, as the deletion targets, first neurons without a coupling from the input layer and second neurons without a coupling to the output layer. The model reduction apparatus compensates for the biases of the first neurons by combining the biases of the first neurons with the biases of the third neurons coupled to the first neurons on the output side. The model reduction apparatus deletes the identified deletion-target neurons from the neural network. Thus, the effect of the size reduction of the machine learning model may be improved while suppressing degradation in the accuracy of the machine learning model.
[0074] As the process of compensating for the biases of the first neurons, the case is described in which the values obtained by multiplying the biases of the first neurons by the weights between the first neurons and the third neurons are added to the biases of the third neurons according to the above-described embodiment. However, this is not limiting. For example, a value obtained by applying the activation function of a first neuron to the bias of the first neuron may be multiplied by the weight between the first neuron and the third neuron, and the product may be added to the bias of the third neuron. In this case, the model reduction apparatus obtains, for example, a layer information table and a function table as illustrated in
b_j ← b_j + w_{i,j}·f(b_i)
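The intuition behind this variant can be shown in a few lines. Since the deleted neuron receives no input, its output is the constant obtained by applying its activation f to its bias, and that constant is what gets folded downstream. The function name and the ReLU example below are illustrative assumptions, not part of the patent.

```python
def combine_bias_with_activation(b_j, w_ij, b_i, f):
    """b_j <- b_j + w_ij * f(b_i): the deleted neuron's constant output is
    its activation applied to its bias, since it has no other input."""
    return b_j + w_ij * f(b_i)

relu = lambda x: max(0.0, x)

# With ReLU, a negative bias contributes nothing downstream; a positive one does.
print(combine_bias_with_activation(0.2, 2.0, -0.5, relu))  # → 0.2
print(combine_bias_with_activation(0.2, 2.0, 0.5, relu))   # → 1.2
```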
[0075] The above-described embodiment may also be applied to a neural network having a configuration including a convolution layer. In this case, the model reduction apparatus obtains, for example, a layer information table and parameter tables as illustrated in
[0076] In the case of the parameter table of the convolution layer, the model reduction apparatus identifies, as the neurons the input weights and the output weights of which are 0, the neurons corresponding to rows or columns in which all the weights including the weights of the elements of the filter are 0. For example, in the case of a left section of
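The convolution-layer variant of the zero check can be sketched as follows. Here a "weight" between two channels is an entire filter, so a channel is a deletion candidate only when every element of every one of its filters is 0. The nested-list layout (table[j][i] is the 2-D filter from input channel i to output channel j) is an assumption for illustration, not the patented data structure.

```python
def filter_is_zero(kernel):
    """True when every element of a 2-D filter is 0."""
    return all(w == 0 for row in kernel for w in row)

def zero_filter_channels(table):
    """Channels all of whose filters (one per input channel) are entirely 0."""
    return [j for j, filters in enumerate(table)
            if all(filter_is_zero(k) for k in filters)]

# table[j][i] is the 2x2 filter from input channel i to output channel j.
table = [
    [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]],  # channel 0: dead
    [[[0.1, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.2]]],  # channel 1: live
]
print(zero_filter_channels(table))  # → [0]
```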
[0077] The case is described where the parameter table in which a subset of parameters has been deleted by using the existing model size-reduction technique is input to the model reduction apparatus according to the above-described embodiment. However, a parameter table before the model size reduction may be input. In this case, the model reduction apparatus may also have the function of the existing model size-reduction technique.
[0078] An example of the relationship between the model size-reduction rate and accuracy in the case where the disclosed technique is applied is described. Here, VGG-19-BN of VGGNet having a layer configuration as illustrated in
[0079] Regarding the reduced data size in each layer of the neural network in the above example,
[0080] Although a form is described in which the model reduction program is stored (installed) in advance in the storage unit according to the above embodiment, this is not limiting. The program according to the disclosed technique may be provided in a form in which the program is stored in a storage medium such as a compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD)-ROM, or a Universal Serial Bus (USB) memory.
[0081] Regarding the above-described embodiment, the following appendices are further disclosed.
[0082] All examples and conditional language provided herein are intended for the pedagogical purposes of aiding the reader in understanding the invention and the concepts contributed by the inventor to further the art, and are not to be construed as limitations to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although one or more embodiments of the present invention have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.