INFORMATION PROCESSING DEVICE
20230078893 · 2023-03-16
Inventors
- Masahiro Mori (Nisshin-shi Aichi-ken, JP)
- Hiroaki Takada (Nagoya-shi Aichi-ken, JP)
- Shinya Honda (Nagoya-shi Aichi-ken, JP)
Abstract
An information processing device used for a convolutional neural network includes a processor configured to acquire input data and process the input data by using a convolution layer that executes convolution processing and a pooling layer that executes pooling processing, in which the processor is configured to divide the acquired input data into processing areas having an overlapping area in which processing areas overlap and a non-overlapping area in which processing areas do not overlap, and the processor is configured to, when the processor executes processing of the input data in the processing area, execute the convolution processing or the pooling processing in the non-overlapping area, and execute the processing by reusing a processing result of the convolution processing or a processing result of the pooling processing in the overlapping area.
Claims
1. An information processing device used for a convolutional neural network, the information processing device comprising a processor configured to acquire input data, and process the input data by using a convolution layer that executes convolution processing and a pooling layer that executes pooling processing, wherein: the processor is configured to divide the acquired input data into processing areas having an overlapping area in which processing areas overlap and a non-overlapping area in which processing areas do not overlap; and the processor is configured to, when the processor executes processing of the input data in the processing area, execute the convolution processing or the pooling processing in the non-overlapping area, and execute the processing by reusing a processing result of the convolution processing or a processing result of the pooling processing in the overlapping area.
2. The information processing device according to claim 1, wherein: the input data is time-series data; and the processor is configured to divide the time-series data into the processing areas at a fixed interval, and divide the processing area to have the overlapping area and the non-overlapping area.
3. The information processing device according to claim 1, wherein: the processor is configured to execute the processing of the input data by using a plurality of processing layers including the convolution layer and the pooling layer in a preceding stage of a fully connected layer; the processor is configured to sequentially execute the convolution processing or the pooling processing from a first layer to a final layer of the processing layers during processing of a first cycle; and the processor is configured to, during processing of second and subsequent cycles of the processing, execute the convolution processing or the pooling processing in the non-overlapping area of a previous cycle and a current cycle from the first layer to the final layer, and execute the processing of the input data by reusing a processing result of the convolution processing in the previous cycle or a processing result of the pooling processing in the previous cycle in the overlapping area of the previous cycle and the current cycle.
4. The information processing device according to claim 3, wherein: the final layer creates output data to be input to the fully connected layer; the processor is configured to create the output data by reusing the processing result of the convolution processing in the previous cycle or the processing result of the pooling processing in the previous cycle in the overlapping area of the previous cycle and the current cycle; the processor is configured to create the output data by sequentially executing the convolution processing or the pooling processing from the first layer to the final layer in the non-overlapping area of the previous cycle and the current cycle; and the processor is configured to input the output data to the fully connected layer when all the output data in the processing area of the current cycle is created in the final layer.
5. The information processing device according to claim 4, wherein the processor is configured to sequentially execute the processing when data that is processable by a kernel is prepared in the non-overlapping area of the previous cycle and the current cycle.
6. The information processing device according to claim 1, wherein the processor is mounted in a vehicle.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0019] Features, advantages, and technical and industrial significance of exemplary embodiments of the present disclosure will be described below with reference to the accompanying drawings, in which like signs denote like elements, and wherein:
[0020]
[0021]
[0022]
[0023]
[0024]
[0025]
DETAILED DESCRIPTION OF EMBODIMENTS
[0026] Hereinafter, embodiments of the present disclosure will be described in detail with reference to the drawings. The same or corresponding parts in the drawings are designated by the same reference numerals, and the description thereof will not be repeated.
[0027]
[0028] The storage device 30 is configured to include, for example, a read only memory (ROM) and a random access memory (RAM). The storage device 30 stores a program and the like executed by the processor 20. The communication device 40 is configured to allow bidirectional communication between an external device and the processor 20.
[0029] The processor 20 includes a data acquisition unit 21, a processing unit 23, and an output unit 25. The processor 20 functions as the data acquisition unit 21, the processing unit 23, and the output unit 25 by executing a program stored in the storage device 30. The processor 20 may include a buffer that is used while the data (input data) received from the data acquisition unit 21 is processed by using the CNN, or may use the storage device 30 as the buffer.
[0030] The data acquisition unit 21 acquires time-series data 100 detected by various sensors 50 or created based on values detected by the various sensors 50. The time-series data 100 may be time-series data related to the vehicle V, for example, the motion state of the vehicle V (front-rear acceleration, lateral acceleration, vehicle speed, and the like) or the rotation speed and the exhaust temperature of the internal combustion engine E. The data acquisition unit 21 acquires the time-series data 100 at a predetermined cycle, and outputs the acquired time-series data 100 to the processing unit 23.
[0031] The processing unit 23 processes the time-series data 100 (input data) received from the data acquisition unit 21 by using the CNN, and outputs the identification result (inference result) with respect to the input data to the output unit 25.
[0032]
[0033] The fully connected layer 235 includes an input layer, an intermediate layer, and an output layer. The input layer is constituted with a plurality of units. The output of the pooling layer 234 converted into one dimension is input to each unit.
[0034] The intermediate layer is constituted with a plurality of layers. Although
[0035] The output layer is constituted with one or more units. The number of units in the output layer can be changed as appropriate. Each unit in the output layer is connected to each unit of the final layer of the intermediate layer. Each unit of the output layer receives the output value from each unit of the final layer of the intermediate layer, multiplies each output value by a weight, and sums the multiplication results. The sum is input to a predetermined activation function (for example, a ramp function or a sigmoid function). The output value of the activation function indicates, for example, a probability.
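As an illustrative sketch only, and not part of the disclosure, the operation of one output-layer unit described above might be written as follows. The names `sigmoid` and `output_unit` and the example values are hypothetical, and the sigmoid function is simply chosen here as the activation function.

```python
import math

def sigmoid(x):
    # Sigmoid activation: maps any real value into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def output_unit(inputs, weights):
    # Multiply each incoming output value by its weight and sum the products.
    total = sum(w * x for w, x in zip(weights, inputs))
    # The sum is passed to the activation function; the result can be read
    # as, for example, a probability.
    return sigmoid(total)

# Hypothetical output values of the last intermediate layer and their weights.
probability = output_unit([0.5, -1.0, 2.0], [0.4, 0.3, 0.05])
```

Because the weighted sum in this example is approximately zero, the unit outputs a value close to 0.5.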
[0036] Generally, when the input data is the time-series data 100, the processing using the CNN divides the time-series data 100 (input data) acquired by the data acquisition unit 21 at a fixed interval (fixed cycle), treats each divided section as a processing area (operation area) of the CNN, and executes the first processing (processing of the convolution layer in the present embodiment) on that processing area.
[0037] As described above, when the processing area is divided at the fixed interval T and processing of the CNN is executed, in a case where the portion of the time-series data 100 that well represents the feature straddles the boundary between the first processing area 1 and the second processing area 2, there is a concern that the feature of the time-series data 100 cannot be inferred with good precision.
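The concern described above can be illustrated with a minimal sketch. The function name `split_fixed`, the interval of 4 samples, and the position of the feature are assumptions made for illustration; sample indices stand in for sensor values.

```python
def split_fixed(data, interval):
    # Cut the series into consecutive processing areas that do not overlap.
    return [data[i:i + interval]
            for i in range(0, len(data) - interval + 1, interval)]

samples = list(range(12))          # sample indices stand in for sensor values
areas = split_fixed(samples, 4)    # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]

# A hypothetical feature spanning samples 3 and 4 lies across the boundary
# between the first and second processing areas, so no single area holds it.
feature = {3, 4}
covered = any(feature <= set(area) for area in areas)   # False
```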
[0038]
[0039] Although
[0040] As described above, allowing the processing areas to overlap ensures that the area containing the portion of the time-series data 100 that well represents the feature is reliably covered, and increases the execution frequency of inference by the CNN, so that the feature of the time-series data 100 can be inferred with good precision. However, when inference by the CNN is executed separately for each processing area, the processing amount (operation amount) increases.
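A minimal sketch of overlapping processing areas, under assumed values, shows both effects: the feature is covered, and the amount of input handled grows. The function name `split_overlapping` and the parameter `hop` (the spacing between the starts of consecutive processing areas) are hypothetical names introduced for this example.

```python
def split_overlapping(data, interval, hop):
    # Start a new processing area every `hop` samples. Each area is `interval`
    # samples long, so consecutive areas share interval - hop samples.
    return [data[i:i + interval]
            for i in range(0, len(data) - interval + 1, hop)]

samples = list(range(12))
areas = split_overlapping(samples, interval=4, hop=2)   # areas start at 0, 2, 4, 6, 8

# A hypothetical feature spanning samples 3 and 4 now falls inside one area.
feature = {3, 4}
covered = any(feature <= set(area) for area in areas)   # True (area [2, 3, 4, 5])

# If every area is processed from scratch, 20 input samples are handled,
# compared with 12 when the areas do not overlap.
total_inputs = sum(len(area) for area in areas)
```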
[0041]
[0042] When the data acquisition unit 21 acquires the input data (the time-series data 100) and the time-series data 100 of the first processing area 1 is prepared, the first layer (the convolution layer 231) starts convolution processing by using a first layer kernel (filter) 231f. For example, a product-sum operation by using the first layer kernel 231f is executed while the first layer kernel 231f is sequentially slid, and the processing result (processing data) is stored in a first layer buffer 231b. When the processing of the first layer (the convolution layer 231) is completed, the processing of the second layer (the pooling layer 232) is executed.
[0043] In the second layer (the pooling layer 232), a second layer kernel (window) 232c is sequentially slid with respect to the processing data stored in the first layer buffer 231b to execute MAX pooling, and the processing result (processing data) is stored in a second layer buffer 232b. When the processing of the second layer (the pooling layer 232) is completed, the processing of the third layer (the convolution layer 233) is executed.
[0044] The processing of the third layer (the convolution layer 233) and the final layer (the pooling layer 234) is executed in the same manner as described above. Convolution processing by using a third layer kernel (filter) 233f is executed with respect to the processing data stored in the second layer buffer 232b, and the processing result is stored in a third layer buffer 233b. Further, MAX pooling by using a final layer kernel (window) 234c is executed with respect to the processing data stored in the third layer buffer 233b, and the processing result is stored in a final layer buffer 234b. Then, when the processing of the final layer (the pooling layer 234) is completed, the processing data stored in the final layer buffer 234b is input to the fully connected layer 235.
[0045] When the processing from the first layer (the convolution layer 231) to the final layer (the pooling layer 234) is completed in the first processing area 1, the same processing is executed in the second processing area 2. As described above, in related art, inference by the CNN is repeatedly executed sequentially for each processing area.
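The related-art flow described above, in which each layer runs only after the preceding layer has finished the whole processing area, might be sketched as follows. The kernel values, the pooling window of size 2 with a slide amount of 2, the absence of padding, and all function names are assumptions for illustration.

```python
def conv1d(xs, kernel, stride=1):
    # Slide the kernel over xs and take the product-sum at each position.
    k = len(kernel)
    return [sum(kernel[j] * xs[i + j] for j in range(k))
            for i in range(0, len(xs) - k + 1, stride)]

def max_pool1d(xs, size, stride):
    # Slide the window over xs and keep the maximum at each position.
    return [max(xs[i:i + size]) for i in range(0, len(xs) - size + 1, stride)]

def cnn_features(area, kernel1, kernel3):
    buffer1 = conv1d(area, kernel1)          # first layer (convolution)
    buffer2 = max_pool1d(buffer1, 2, 2)      # second layer (MAX pooling)
    buffer3 = conv1d(buffer2, kernel3)       # third layer (convolution)
    return max_pool1d(buffer3, 2, 2)         # final layer (MAX pooling)

# One processing area of 16 samples; the result would go to the fully connected layer.
features = cnn_features(list(range(16)), [1, 0, -1], [1, 1])   # [-4, -4, -4]
```

In the related art, `cnn_features` would be called again from scratch for every processing area, including the samples that the area shares with its predecessor.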
[0046]
[0047] In the present embodiment, in the processing of the first processing area 1, the first layer (the convolution layer 231) starts convolution processing by using the first layer kernel (filter) 231f as soon as the data acquisition unit 21 has acquired enough of the input data (the time-series data 100) for a product-sum operation using the first layer kernel 231f. For example, at first, when as many samples of the time-series data 100 as the size of the first layer kernel 231f (or more) are prepared, the product-sum operation is executed, and the processing result (processing data) is stored in the first layer buffer 231b. In the next and subsequent processing (operation), each time the time-series data 100 corresponding to the slide amount of the first layer kernel 231f is added, so that the product-sum operation using the first layer kernel 231f is again possible, the product-sum operation is executed, and the processing result (processing data) is stored in the first layer buffer 231b. In this way, in the first layer (the convolution layer 231), the product-sum operation is executed sequentially whenever the time-series data 100 needed for it is prepared, and the processing result (processing data) is stored in the first layer buffer 231b.
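One possible way to express this "execute as soon as the kernel has enough data" behavior is sketched below. The class name `StreamingConv1D`, its `push` method, and the kernel values are hypothetical, and the sketch assumes the slide amount does not exceed the kernel size.

```python
class StreamingConv1D:
    def __init__(self, kernel, stride=1):
        self.kernel = list(kernel)
        self.stride = stride      # slide amount; assumed <= kernel size
        self.pending = []         # samples still needed by future kernel positions

    def push(self, sample):
        # Accept one new sample and return the outputs it makes possible.
        self.pending.append(sample)
        outputs = []
        while len(self.pending) >= len(self.kernel):
            # Enough data is prepared: execute the product-sum operation.
            outputs.append(sum(w * x for w, x in zip(self.kernel, self.pending)))
            # Slide the kernel by discarding the samples it has moved past.
            del self.pending[:self.stride]
        return outputs

layer1 = StreamingConv1D([1, 0, -1])
first_layer_buffer = []
for sample in [5, 3, 4, 1]:
    first_layer_buffer.extend(layer1.push(sample))
# Nothing is produced for the first two samples; the third and fourth samples
# each complete one kernel position, giving [1, 2].
```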
[0048] In the second layer (the pooling layer 232), when the number of items of processing data (the processing result of the first layer) stored in the first layer buffer 231b is sufficient for MAX pooling using the second layer kernel (window) 232c, the pooling processing is executed. For example, at first, when as many items of processing data as the size of the second layer kernel 232c (or more) are prepared in the first layer buffer 231b, MAX pooling is executed, and the processing result (processing data) is stored in the second layer buffer 232b. In the next and subsequent processing (operation), each time processing data corresponding to the slide amount of the second layer kernel 232c is added to the first layer buffer 231b, so that MAX pooling using the second layer kernel 232c is again possible, MAX pooling is executed, and the processing result (processing data) is stored in the second layer buffer 232b. In this way, in the second layer (the pooling layer 232) as well, the pooling processing is executed sequentially whenever the processing data needed for MAX pooling using the second layer kernel 232c is prepared in the first layer buffer 231b, and the processing result (processing data) is stored in the second layer buffer 232b.
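The same idea applied to MAX pooling could look like the following sketch. The class name `StreamingMaxPool1D`, the window size of 2, and the slide amount of 2 are assumptions for illustration.

```python
class StreamingMaxPool1D:
    def __init__(self, size, stride):
        self.size = size          # window size
        self.stride = stride      # slide amount; assumed <= window size
        self.pending = []         # first-layer results not yet fully pooled

    def push(self, value):
        # Accept one new first-layer result and return the pooled outputs
        # that it makes possible.
        self.pending.append(value)
        outputs = []
        while len(self.pending) >= self.size:
            # A full window is prepared: execute MAX pooling.
            outputs.append(max(self.pending[:self.size]))
            # Slide the window by the slide amount.
            del self.pending[:self.stride]
        return outputs

layer2 = StreamingMaxPool1D(size=2, stride=2)
second_layer_buffer = []
for value in [1, 4, 2, 3]:
    second_layer_buffer.extend(layer2.push(value))
# One pooled value appears after every second input, giving [4, 3].
```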
[0049] The processing of the third layer (the convolution layer 233) and the final layer (the pooling layer 234) is also executed in the same manner as described above. When the processing data stored in the second layer buffer 232b is in a state that allows convolution processing using the third layer kernel (filter) 233f, the convolution processing is sequentially executed, and the processing result is stored in the third layer buffer 233b. Further, when the processing data stored in the third layer buffer 233b is in a state that allows MAX pooling using the final layer kernel 234c, the pooling processing is sequentially executed, and the processing result is stored in the final layer buffer 234b. Then, when the processing of the final layer (the pooling layer 234) in the first processing area 1 is completed, the processing data stored in the final layer buffer 234b is input to the fully connected layer 235.
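The four layers described above can be chained so that each layer fires as soon as the preceding buffer holds enough data. The following is a minimal sketch under assumed kernel values, window sizes, and slide amounts; `StreamingLayer`, `conv_op`, and `push_through` are hypothetical names.

```python
class StreamingLayer:
    # Applies `op` to the oldest `size` buffered items, then slides by `stride`.
    def __init__(self, op, size, stride):
        self.op, self.size, self.stride = op, size, stride
        self.pending = []

    def push(self, item):
        self.pending.append(item)
        outputs = []
        while len(self.pending) >= self.size:
            outputs.append(self.op(self.pending[:self.size]))
            del self.pending[:self.stride]
        return outputs

def conv_op(kernel):
    # Product-sum of the kernel with one window of data.
    return lambda xs: sum(w * x for w, x in zip(kernel, xs))

def push_through(layers, sample):
    # Feed one input sample into the first layer and pass on whatever each
    # layer is able to produce; returns the new final-layer outputs.
    items = [sample]
    for layer in layers:
        items = [y for x in items for y in layer.push(x)]
    return items

layers = [
    StreamingLayer(conv_op([1, 0, -1]), size=3, stride=1),   # first layer
    StreamingLayer(max, size=2, stride=2),                   # second layer
    StreamingLayer(conv_op([1, 1]), size=2, stride=1),       # third layer
    StreamingLayer(max, size=2, stride=2),                   # final layer
]

final_layer_buffer = []
for sample in range(16):          # one processing area of 16 samples
    final_layer_buffer.extend(push_through(layers, sample))
# final_layer_buffer is [-4, -4, -4] once the last sample has arrived.
```

With these assumed parameters, the final-layer results are complete as soon as the last sample of the processing area has been pushed, so no layer has to wait for the whole area before it starts.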
[0050] In the non-overlapping area of the second processing area 2 (the second processing area following the overlapping area shown by diagonal lines in
[0051] In the second processing area 2, the processing result of the first processing area 1 is reused in the overlapping area of the first processing area 1 and the second processing area 2 shown by the diagonal lines. That is, in the second processing area 2, the processing of the first layer (the convolution layer 231) to the final layer (the pooling layer 234) is not executed for the time-series data 100 in the overlapping area. Instead, the processing result (processing data) that was obtained from the time-series data 100 in the overlapping area during the processing of the first processing area 1 and stored in the final layer buffer 234b is combined with the processing result (processing data) for the time-series data 100 in the non-overlapping area. When the processing of the second processing area 2 is completed, the final layer buffer 234b therefore holds both the processing result (processing data) obtained in the first processing area 1 from the time-series data 100 in the overlapping area and the processing result (processing data) for the time-series data 100 in the non-overlapping area, and these processing results (processing data) are input to the fully connected layer 235.
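The reuse described above might be sketched as follows. This is one possible reading and not the implementation of the disclosure. It assumes processing areas of 16 samples that start every 8 samples, no padding, layer slide amounts of 1, 2, 1, and 2 (so the start spacing is a multiple of the combined slide amount of 4), and intermediate buffers that are kept between processing areas. All names are hypothetical. `recompute_area` is a related-art style reference used only to check the result.

```python
class StreamingLayer:
    # Applies `op` to the oldest `size` buffered items, then slides by `stride`.
    def __init__(self, op, size, stride):
        self.op, self.size, self.stride = op, size, stride
        self.pending = []

    def push(self, item):
        self.pending.append(item)
        outputs = []
        while len(self.pending) >= self.size:
            outputs.append(self.op(self.pending[:self.size]))
            del self.pending[:self.stride]
        return outputs

def areas_with_reuse(stream, n_out, n_new):
    layers = [
        StreamingLayer(lambda xs: xs[0] - xs[2], 3, 1),   # first layer, kernel [1, 0, -1]
        StreamingLayer(max, 2, 2),                        # second layer, MAX pooling
        StreamingLayer(lambda xs: xs[0] + xs[1], 2, 1),   # third layer, kernel [1, 1]
        StreamingLayer(max, 2, 2),                        # final layer, MAX pooling
    ]
    final_layer_buffer = []
    for sample in stream:
        items = [sample]
        for layer in layers:
            items = [y for x in items for y in layer.push(x)]
        final_layer_buffer.extend(items)
        while len(final_layer_buffer) >= n_out:
            # All output data of the current area is ready: hand it to the
            # fully connected layer (represented here by yielding it).
            yield list(final_layer_buffer[:n_out])
            # Keep the results that the next area shares; drop the rest.
            del final_layer_buffer[:n_new]

def recompute_area(area):
    # Reference: run all four layers over one complete area from scratch.
    c1 = [area[i] - area[i + 2] for i in range(len(area) - 2)]
    p1 = [max(c1[i:i + 2]) for i in range(0, len(c1) - 1, 2)]
    c2 = [p1[i] + p1[i + 1] for i in range(len(p1) - 1)]
    return [max(c2[i:i + 2]) for i in range(0, len(c2) - 1, 2)]

data = [(7 * i * i) % 11 for i in range(32)]            # arbitrary test signal
reused = list(areas_with_reuse(data, n_out=3, n_new=2))
recomputed = [recompute_area(data[s:s + 16]) for s in (0, 8, 16)]
same = reused == recomputed
```

In this sketch each processing area yields three final-layer results, of which one is carried over from the previous area and two are newly computed. The number carried over depends on the kernel sizes and slide amounts, so it is generally smaller than the fraction of input samples that overlap.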
[0052] The processing of the third processing area 3 and subsequent areas is executed in the same manner as the processing in the second processing area 2, and the processing result of the second processing area 2 is reused in the overlapping area of the second processing area 2 and the third processing area 3. In
[0053] In the present embodiment, when the time-series data 100 (input data) acquired by the data acquisition unit 21 is divided at the fixed interval T and the processing areas are set, the processing areas are allowed to overlap, and the overlapping area is set. Setting the overlapping area ensures that the area containing the portion of the time-series data 100 that well represents the feature is reliably covered, and increases the execution frequency of inference by the CNN, so that the feature of the time-series data 100 can be inferred with good precision.
[0054] In the present embodiment, when the processing in the processing area is executed, the processing result is reused in the overlapping area. That is, the processing result of the overlapping area of the previous cycle is output as the processing result of the overlapping area of the current cycle. As a result, the processing (operation) amount of the CNN can be reduced.
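As a rough illustration of the reduction, the number of input samples that the first layer has to handle can be compared with and without reuse. The function name, the area length of 16 samples, and the start spacing of 8 samples are assumptions, and this count is only a proxy for the actual operation amount.

```python
def first_layer_inputs(interval, hop, n_areas, reuse):
    # interval: samples per processing area
    # hop:      spacing between the starts of consecutive areas
    if reuse:
        # The first area is processed in full; each later area only adds
        # the samples of its non-overlapping area.
        return interval + (n_areas - 1) * hop
    # Without reuse, every area is processed from its first sample.
    return interval * n_areas

without_reuse = first_layer_inputs(16, 8, 3, reuse=False)   # 48
with_reuse = first_layer_inputs(16, 8, 3, reuse=True)       # 32
```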
[0055] In the present embodiment, in the processing of the first processing area 1 that is the first cycle, the processing (operation) is sequentially executed from the first layer (the convolution layer 231) to the final layer (the pooling layer 234). During the processing of the second processing area 2 and subsequent areas, which is the processing of the second cycle and subsequent cycles, the processing (operation) is sequentially executed from the first layer (the convolution layer 231) to the final layer (the pooling layer 234) in the non-overlapping area of the previous cycle and the current cycle, and the processing result of the previous cycle is reused in the overlapping area of the previous cycle and the current cycle. As a result, it is possible to sequentially execute the processing from the first layer to the final layer with respect to the time-series data 100 that is the input data, and there is no waiting time for the processing, so that the processing time can be shortened.
[0056] In the present embodiment, the final layer (the pooling layer 234) includes the final layer buffer 234b that stores the output data (processing data) input to the fully connected layer 235. The final layer buffer 234b stores the processing result (processing data) in the previous cycle in the overlapping area of the previous cycle and the current cycle, and stores the processing result (processing data) obtained by sequentially executing the processing from the first layer to the final layer in the non-overlapping area of the previous cycle and the current cycle. Then, when the processing in the processing area of the current cycle is completed, the processing data of the overlapping area and the processing data of the non-overlapping area stored in the final layer buffer 234b are input to the fully connected layer 235. As a result, the processing result (processing data) in the overlapping area is stored in the final layer buffer 234b that stores the output data (processing data) input to the fully connected layer 235 and reused, so that the processing amount (operation amount) can be reduced.
[0057] In the present embodiment, in the non-overlapping area, the processing of each layer is executed when the data that can be processed by the corresponding kernel is prepared, that is, when the number of items of the time-series data 100, the processing data stored in the first layer buffer 231b, the processing data stored in the second layer buffer 232b, or the processing data stored in the third layer buffer 233b corresponds to the size or the slide amount of the corresponding kernel. As a result, since the processing can be executed without waiting for all the data in the non-overlapping area to be prepared, the processing time can be shortened.
[0058] The embodiments disclosed this time should be considered to be exemplary and not restrictive in all respects. The scope of the present disclosure is set forth by the claims rather than the description of the embodiments, and is intended to include all modifications within the meaning and scope of the claims.