INFORMATION PROCESSING DEVICE

Abstract

An information processing device used for a convolutional neural network includes a processor configured to acquire input data and process the input data by using a convolution layer that executes convolution processing and a pooling layer that executes pooling processing, in which the processor is configured to divide the acquired input data into processing areas having an overlapping area in which processing areas overlap and a non-overlapping area in which processing areas do not overlap, and the processor is configured to, when the processor executes processing of the input data in the processing area, execute the convolution processing or the pooling processing in the non-overlapping area, and execute the processing by reusing a processing result of the convolution processing or a processing result of the pooling processing in the overlapping area.

Claims

1. An information processing device used for a convolutional neural network, the information processing device comprising a processor configured to acquire input data, and process the input data by using a convolution layer that executes convolution processing and a pooling layer that executes pooling processing, wherein: the processor is configured to divide the acquired input data into processing areas having an overlapping area in which processing areas overlap and a non-overlapping area in which processing areas do not overlap; and the processor is configured to, when the processor executes processing of the input data in the processing area, execute the convolution processing or the pooling processing in the non-overlapping area, and execute the processing by reusing a processing result of the convolution processing or a processing result of the pooling processing in the overlapping area.

2. The information processing device according to claim 1, wherein: the input data is time-series data; and the processor is configured to divide the time-series data into the processing areas at a fixed interval, and divide the processing area to have the overlapping area and the non-overlapping area.

3. The information processing device according to claim 1, wherein: the processor is configured to execute the processing of the input data by using a plurality of processing layers including the convolution layer and the pooling layer in a preceding stage of a fully connected layer; the processor is configured to sequentially execute the convolution processing or the pooling processing from a first layer to a final layer of the processing layers during processing of a first cycle; and the processor is configured to, during processing of second and subsequent cycles of the processing, execute the convolution processing or the pooling processing in the non-overlapping area of a previous cycle and a current cycle from the first layer to the final layer, and execute the processing of the input data by reusing a processing result of the convolution processing in the previous cycle or a processing result of the pooling processing in the previous cycle in the overlapping area of the previous cycle and the current cycle.

4. The information processing device according to claim 3, wherein: the final layer creates output data to be input to the fully connected layer; the processor is configured to create the output data by reusing the processing result of the convolution processing in the previous cycle or the processing result of the pooling processing in the previous cycle in the overlapping area of the previous cycle and the current cycle; the processor is configured to create the output data by sequentially executing the convolution processing or the pooling processing from the first layer to the final layer in the non-overlapping area of the previous cycle and the current cycle; and the processor is configured to input the output data to the fully connected layer when all the output data in the processing area of the current cycle is created in the final layer.

5. The information processing device according to claim 4, wherein the processor is configured to sequentially execute the processing when data that is processable by a kernel is prepared in the non-overlapping area of the previous cycle and the current cycle.

6. The information processing device according to claim 1, wherein the processor is mounted in a vehicle.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0019] Features, advantages, and technical and industrial significance of exemplary embodiments of the present disclosure will be described below with reference to the accompanying drawings, in which like signs denote like elements, and wherein:

[0020] FIG. 1 is a diagram showing a configuration of an information processing device according to the present embodiment;

[0021] FIG. 2 is a diagram for illustrating a detailed configuration of a processing unit;

[0022] FIG. 3 is a graph illustrating a method of dividing input data in related art;

[0023] FIG. 4 is a graph illustrating a method of dividing input data in the present embodiment;

[0024] FIG. 5 is a diagram schematically illustrating a CNN processing (operation) in related art; and

[0025] FIG. 6 is a diagram schematically illustrating the CNN processing (operation) in the present embodiment.

DETAILED DESCRIPTION OF EMBODIMENTS

[0026] Hereinafter, embodiments of the present disclosure will be described in detail with reference to the drawings. The same or corresponding parts in the drawings are designated by the same reference numerals, and the description thereof will not be repeated.

[0027] FIG. 1 is a diagram showing a configuration of an information processing device 10 according to the present embodiment. The information processing device 10 according to the present embodiment is mounted in a vehicle V. The vehicle V includes an internal combustion engine E, a transmission M, a differential gear G, and a drive wheel D. The vehicle V may be an electrified vehicle provided with an electric motor. The information processing device 10 executes inference by a convolutional neural network (CNN) (classifies input data and infers the result), and outputs the result. The information processing device 10 includes a processor 20, a storage device 30, and a communication device 40.

[0028] The storage device 30 is configured to include, for example, a read only memory (ROM) and a random access memory (RAM). The storage device 30 stores a program and the like executed by the processor 20. The communication device 40 is configured to allow bidirectional communication between an external device and the processor 20.

[0029] The processor 20 includes a data acquisition unit 21, a processing unit 23, and an output unit 25. The processor 20 functions as the data acquisition unit 21, the processing unit 23, and the output unit 25 by executing a program stored in the storage device 30. The processor 20 may include a buffer that is used while the data (input data) received from the data acquisition unit 21 is processed by using the CNN, and may use the storage device 30 as the buffer.

[0030] The data acquisition unit 21 acquires time-series data 100 detected by various sensors 50 or created based on values detected by the various sensors 50. The time-series data 100 may be, for example, the motion state of the vehicle V (front-rear acceleration, lateral acceleration, vehicle speed, and the like), the rotation speed and the exhaust temperature of the internal combustion engine E, and the like, and may be time-series data related to the vehicle V. The data acquisition unit 21 acquires the time-series data 100 at a predetermined cycle, and outputs the acquired time-series data 100 to the processing unit 23.

[0031] The processing unit 23 processes the time-series data 100 (input data) received from the data acquisition unit 21 by using the CNN, and outputs the identification result (inference result) with respect to the input data to the output unit 25.

[0032] FIG. 2 is a diagram for illustrating a detailed configuration of the processing unit 23. The processing unit 23 includes convolution layers 231, 233, pooling layers 232, 234, and a fully connected layer 235. The convolution layers 231, 233 and the pooling layers 232, 234 extract features from the input data. In the convolution layers 231, 233, convolution processing using a kernel (filter) of a predetermined size is executed. In the pooling layers 232, 234, pooling processing that compresses the convolution result is executed by using a kernel (window) of a predetermined size. In the present embodiment, MAX pooling is executed. Although FIG. 2 shows an example in which the two convolution layers 231, 233 and the two pooling layers 232, 234 are included in the processing unit 23, the number of the processing layers (the number of convolution layers and the number of pooling layers) can be changed as appropriate.
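The convolution processing and the MAX pooling described above can be illustrated by a minimal sketch for one-dimensional data. The function names conv1d and max_pool1d, the absence of padding, and the stride parameters are assumptions made only for illustration and are not taken from the embodiment.

```python
from typing import List, Sequence


def conv1d(x: Sequence[float], kernel: Sequence[float], stride: int = 1) -> List[float]:
    """Slide the kernel (filter) over x and take a product-sum at each position.

    Assumption: no padding, so only positions where the kernel fits are used.
    """
    k = len(kernel)
    return [
        sum(x[i + j] * kernel[j] for j in range(k))
        for i in range(0, len(x) - k + 1, stride)
    ]


def max_pool1d(x: Sequence[float], size: int, stride: int) -> List[float]:
    """Slide a window of the given size over x and keep the maximum (MAX pooling)."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, stride)]


# One convolution layer followed by one pooling layer.
conv_out = conv1d([1, 2, 3, 4, 5, 6], [1, 0, -1])
pool_out = max_pool1d([1, 3, 2, 5, 4, 0], size=2, stride=2)
```

Stacking these two functions twice would correspond to the four processing layers of FIG. 2 under the same assumptions.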

[0033] The fully connected layer 235 includes an input layer, an intermediate layer, and an output layer. The input layer is constituted with a plurality of units. The output of the pooling layer 234 converted into one dimension is input to each unit.

[0034] The intermediate layer is constituted with a plurality of layers. Although FIG. 2 shows a case where the number of layers of the intermediate layer is two, the number of layers of the intermediate layer can be changed as appropriate. Each layer of the intermediate layer is constituted with a plurality of units. Each unit is connected to each unit in the previous layer and each unit in the next layer. Each unit multiplies each output value from each unit in the previous layer by a weight and integrates (sums) the multiplication results. Next, each unit adds (or subtracts) a predetermined bias to (or from) the integration result, inputs the addition result (or subtraction result) into a predetermined activation function (for example, a ramp function or a sigmoid function), and outputs the output value of the activation function to each unit of the next layer.
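The operation of one unit described above can be sketched as follows. The names unit_output, ramp, and sigmoid are hypothetical, and the weights and bias are arbitrary example values rather than values from the embodiment.

```python
import math
from typing import Callable, Sequence


def ramp(z: float) -> float:
    """Ramp function: passes positive values and clips negative values to zero."""
    return max(0.0, z)


def sigmoid(z: float) -> float:
    """Sigmoid function: maps any real value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def unit_output(
    inputs: Sequence[float],
    weights: Sequence[float],
    bias: float,
    activation: Callable[[float], float] = ramp,
) -> float:
    """Multiply each input by its weight, sum the products, add the bias,
    and apply the activation function."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Example values chosen only for illustration.
y_ramp = unit_output([1.0, 2.0], [0.5, -1.0], bias=0.25)
y_sigmoid = unit_output([1.0, 2.0], [0.5, 0.25], bias=-1.0, activation=sigmoid)
```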

[0035] The output layer is constituted with one or more units. The number of units in the output layer can be changed as appropriate. Each unit in the output layer is connected to each unit of the final layer of the intermediate layer. Each unit of the output layer receives the output value from each unit of the final layer of the intermediate layer, multiplies each output value by a weight, and integrates (sums) the multiplication results. The integration result is input to a predetermined activation function (for example, a ramp function or a sigmoid function). The output value of the activation function indicates, for example, a probability.

[0036] Generally, when the input data is the time-series data 100, in the processing using the CNN, the time-series data 100 (input data) acquired by the data acquisition unit 21 is divided at a fixed interval (fixed cycle), and the first processing (processing of the convolution layer in the present embodiment) is executed with each divided section as a processing area (operation area) of the CNN. FIG. 3 is a graph illustrating a method of dividing input data in related art. As shown in FIG. 3, the time-series data 100 is divided by a fixed interval (fixed cycle) T, processing is executed with the time-series data 100 from time t1s to time t1e as a first processing area 1, and processing is executed with the time-series data 100 from time t2s (the same time as time t1e) to time t2e as a second processing area 2. The same applies to a third processing area 3 and a fourth processing area 4.

[0037] As described above, when the processing area is divided at the fixed interval T and processing of the CNN is executed, in a case where the portion of the time-series data 100 that well represents a feature straddles the boundary between the first processing area 1 and the second processing area 2, that portion is split across the two processing areas, and there is a concern that the feature of the time-series data 100 cannot be inferred with good precision.

[0038] FIG. 4 is a graph illustrating a method of dividing input data in the present embodiment. In the present embodiment, when the time-series data 100 is divided at the fixed interval T and the processing area is set as in related art, the processing areas are allowed to overlap and an overlapping area is set. As shown in FIG. 4, start time t2s of the second processing area 2 is set within the first processing area 1 from time t1s to time t1e, and the overlapping of the time-series data 100 from time t2s to time t1e is allowed. As a result, the time-series data 100 from time t2s to time t1e shown by the diagonal lines in FIG. 4 is an overlapping area of the first processing area 1 and the second processing area 2. Further, the time-series data 100 from time t1e to time t2e is a non-overlapping area in the second processing area 2. Similarly, by setting start time t3s of the third processing area 3 within the second processing area 2 from time t2s to time t2e and allowing overlapping of the time-series data 100 from time t3s to time t2e, the time-series data 100 from time t3s to time t2e is an overlapping area of the second processing area 2 and the third processing area 3. After that, by similarly dividing the input data (the time-series data 100), an overlapping area of the processing area in the current cycle and the processing area in the previous cycle can be set.

[0039] Although FIG. 4 describes that the processing area is divided by time, the time-series data 100 is generated at each predetermined data collection interval. Dividing the time-series data 100 by the fixed interval (fixed cycle) T and setting a processing area are practically the same as setting the consecutive predetermined number of the time-series data 100 as processing areas. When the processing area of the consecutive predetermined number of the time-series data 100 is referred to as an “input window”, the overlapping area can be set by sliding the input window by the set number.
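The sliding of the input window can be sketched as index arithmetic over the samples of the time-series data. The function names window_bounds and split_window, and the representation of a processing area as a (start, end) pair with an exclusive end, are assumptions made only for illustration.

```python
from typing import List, Optional, Tuple

Area = Tuple[int, int]  # (start index, end index), end exclusive


def window_bounds(n_samples: int, window: int, step: int) -> List[Area]:
    """Processing areas obtained by sliding an input window of `window`
    samples by `step` samples. With step == window the areas do not
    overlap (related art); with step < window they overlap."""
    return [(s, s + window) for s in range(0, n_samples - window + 1, step)]


def split_window(prev: Area, cur: Area) -> Tuple[Optional[Area], Area]:
    """Split the current area into the part shared with the previous area
    (overlapping area, or None) and the part that is new (non-overlapping area)."""
    overlap = (cur[0], min(prev[1], cur[1])) if cur[0] < prev[1] else None
    new = (max(prev[1], cur[0]), cur[1])
    return overlap, new


areas = window_bounds(n_samples=20, window=8, step=4)
overlap, new = split_window(areas[0], areas[1])
```

With a window of 8 samples slid by 4 samples, each processing area shares its first 4 samples with the previous processing area, and only the last 4 samples are new.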

[0040] As described above, by allowing the processing areas to overlap, the portion of the time-series data 100 that well represents a feature can be reliably covered by at least one processing area, and the execution frequency of inference by the CNN can be increased. It is therefore possible to infer the feature of the time-series data 100 with good precision. However, when inference by the CNN is executed from the beginning for each processing area, there arises a problem that the processing amount (operation amount) increases.

[0041] FIG. 5 is a diagram schematically illustrating the CNN processing (operation) in related art. In FIG. 5, the first layer of the processing layer is the convolution layer 231, the second layer is the pooling layer 232, the third layer is the convolution layer 233, and the final layer (fourth layer) is the pooling layer 234. The pooling layer 234, which is the final layer, creates output data to be input to the fully connected layer 235.

[0042] When the data acquisition unit 21 acquires the input data (the time-series data 100) and the time-series data 100 of the first processing area 1 is prepared, the first layer (the convolution layer 231) starts convolution processing by using a first layer kernel (filter) 231f. For example, a product-sum operation by using the first layer kernel 231f is executed while the first layer kernel 231f is sequentially slid, and the processing result (processing data) is stored in a first layer buffer 231b. When the processing of the first layer (the convolution layer 231) is completed, the processing of the second layer (the pooling layer 232) is executed.

[0043] In the second layer (the pooling layer 232), a second layer kernel (window) 232c is sequentially slid with respect to the processing data stored in the first layer buffer 231b to execute MAX pooling, and the processing result (processing data) is stored in a second layer buffer 232b. When the processing of the second layer (the pooling layer 232) is completed, the processing of the third layer (the convolution layer 233) is executed.

[0044] The processing of the third layer (the convolution layer 233) and the final layer (the pooling layer 234) is executed in the same manner as described above. Convolution processing by using a third layer kernel (filter) 233f is executed with respect to the processing data stored in the second layer buffer 232b, and the processing result is stored in a third layer buffer 233b. Further, MAX pooling by using a final layer kernel (window) 234c is executed with respect to the processing data stored in the third layer buffer 233b, and the processing result is stored in a final layer buffer 234b. Then, when the processing of the final layer (the pooling layer 234) is completed, the processing data stored in the final layer buffer 234b is input to the fully connected layer 235.

[0045] When the processing from the first layer (the convolution layer 231) to the final layer (the pooling layer 234) is completed in the first processing area 1, the same processing is executed in the second processing area 2. As described above, in related art, inference by the CNN is repeatedly executed sequentially for each processing area.

[0046] FIG. 6 is a diagram schematically illustrating the CNN processing (operation) in the present embodiment. In FIG. 6, as in FIG. 5, the first layer of the processing layer is the convolution layer 231, the second layer is the pooling layer 232, the third layer is the convolution layer 233, and the final layer (fourth layer) is the pooling layer 234. The pooling layer 234, which is the final layer, creates output data to be input to the fully connected layer 235.

[0047] In the present embodiment, in the processing of the first processing area 1, when the data acquisition unit 21 acquires the input data (the time-series data 100) in the first layer (the convolution layer 231), and the time-series data 100 capable of the product-sum operation using the first layer kernel (filter) 231f is prepared, convolution processing is started by using the first layer kernel 231f. For example, at first, when the same number (or the same number or more) of the time-series data 100 as the size of the first layer kernel 231f is prepared, the product-sum operation is executed, and the processing result (processing data) is stored in the first layer buffer 231b. In the next and subsequent processing (operation), when the time-series data 100 corresponding to the slide amount of the first layer kernel 231f is added, and the time-series data 100 capable of the product-sum operation using the first layer kernel 231f is prepared, the product-sum operation is executed, and the processing result (processing data) is stored in the first layer buffer 231b. As described above, in the first layer (the convolution layer 231), when the time-series data 100 capable of the product-sum operation using the first layer kernel 231f is prepared, the product-sum operation is sequentially executed, and the processing result (processing data) is stored in the first layer buffer 231b.

[0048] In the second layer (the pooling layer 232), when the number of the processing data (the processing result of the first layer) stored in the first layer buffer 231b reaches the number that allows MAX pooling using the second layer kernel (window) 232c, the pooling processing is executed. For example, at first, when the same number (or the same number or more) of the processing data as the size of the second layer kernel 232c is prepared in the first layer buffer 231b, MAX pooling is executed, and the processing result (processing data) is stored in the second layer buffer 232b. In the next and subsequent processing (operation), when the processing data corresponding to the slide amount of the second layer kernel 232c is added to the first layer buffer 231b, and the processing data capable of MAX pooling using the second layer kernel 232c is prepared, MAX pooling is executed, and the processing result (processing data) is stored in the second layer buffer 232b. As described above, even in the second layer (the pooling layer 232), when the processing data capable of MAX pooling using the second layer kernel 232c is prepared in the first layer buffer 231b, the pooling processing is sequentially executed, and the processing result (processing data) is stored in the second layer buffer 232b.

[0049] The processing of the third layer (the convolution layer 233) and the final layer (the pooling layer 234) is also executed in the same manner as described above. When the processing data stored in the second layer buffer 232b is in a state capable of convolution processing using the third layer kernel (filter) 233f, the convolution processing is sequentially executed, and the processing result is stored in the third layer buffer 233b. Further, when the processing data stored in the third layer buffer 233b is in a state capable of MAX pooling using the final layer kernel (window) 234c, the pooling processing is sequentially executed, and the processing result is stored in the final layer buffer 234b. Then, when the processing of the final layer (the pooling layer 234) in the first processing area 1 is completed, the processing data stored in the final layer buffer 234b is input to the fully connected layer 235.
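The sequential execution described in the preceding paragraphs, in which each layer operates as soon as its kernel has enough data, can be sketched as a chain of small buffers. The class StreamingLayer and the function run_pipeline are hypothetical names, and the sketch assumes that the slide amount of each kernel does not exceed the kernel size.

```python
from typing import Callable, Iterable, List, Sequence


class StreamingLayer:
    """One processing layer that fires whenever its buffer holds enough data.

    op     : function applied to `size` buffered values (product-sum or max)
    size   : kernel (filter or window) size
    stride : slide amount of the kernel; this sketch assumes stride <= size
    """

    def __init__(self, op: Callable[[Sequence[float]], float], size: int, stride: int):
        if not 1 <= stride <= size:
            raise ValueError("this sketch assumes 1 <= stride <= size")
        self.op, self.size, self.stride = op, size, stride
        self.buffer: List[float] = []

    def push(self, value: float) -> List[float]:
        """Add one value; return a one-element list if the kernel could be applied."""
        self.buffer.append(value)
        if len(self.buffer) < self.size:
            return []  # not enough data for the kernel yet
        result = self.op(self.buffer)
        del self.buffer[:self.stride]  # slide the kernel forward
        return [result]


def run_pipeline(layers: List[StreamingLayer], samples: Iterable[float]) -> List[float]:
    """Feed samples one by one; each layer passes results on as soon as they exist."""
    outputs: List[float] = []
    for sample in samples:
        values = [sample]
        for layer in layers:
            produced: List[float] = []
            for v in values:
                produced.extend(layer.push(v))
            values = produced
        outputs.extend(values)
    return outputs


kernel = [1.0, 0.0, -1.0]  # example filter, chosen only for illustration
conv = StreamingLayer(lambda w: sum(a * b for a, b in zip(w, kernel)), size=3, stride=1)
pool = StreamingLayer(max, size=2, stride=2)
features = run_pipeline([conv, pool], [1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0])
```

In this sketch the list returned by run_pipeline plays the role of the final layer buffer. No layer waits for the whole processing area to be acquired before it starts.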

[0050] In the non-overlapping area of the second processing area 2 (the part of the second processing area 2 following the overlapping area shown by diagonal lines in FIG. 6), the same processing as the processing in the first processing area 1 described above is executed with respect to the time-series data 100 in the non-overlapping area. As a result, the processing of the non-overlapping area of the second processing area 2 is executed consecutively after the processing of the first processing area 1, and the processing result (processing data) with respect to the time-series data 100 in the non-overlapping area is stored in the final layer buffer 234b.

[0051] In the second processing area 2, the processing result of the first processing area 1 is reused in the overlapping area of the first processing area 1 and the second processing area 2 shown by the diagonal lines. That is, in the second processing area 2, the processing of the first layer (the convolution layer 231) to the final layer (the pooling layer 234) is not executed again with respect to the time-series data 100 in the overlapping area. Instead, the processing result (processing data) that was processed (operated) in the first processing area 1 by using the time-series data 100 in the overlapping area and stored in the final layer buffer 234b is combined with the processing result (processing data) with respect to the time-series data 100 in the non-overlapping area. When the processing of the second processing area 2 is completed, both the processing result (processing data) processed (operated) in the first processing area 1 by using the time-series data 100 in the overlapping area and the processing result (processing data) with respect to the time-series data 100 in the non-overlapping area are stored in the final layer buffer 234b, and these processing results (processing data) are input to the fully connected layer 235.
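The reuse of the final layer buffer can be sketched for a reduced case with one convolution layer (slide amount 1) followed by one MAX pooling layer whose slide amount equals its window size. All function names are hypothetical. The sketch additionally assumes that the slide amount of the input window is a multiple of the pooling size, so that results computed in the previous cycle line up with the current processing area; this alignment condition is an assumption of the sketch and is not stated in the embodiment.

```python
from collections import deque
from typing import List, Sequence


def extract(x: Sequence[float], kernel: Sequence[float], pool: int) -> List[float]:
    """Convolution (slide amount 1, no padding) followed by MAX pooling."""
    k = len(kernel)
    conv = [sum(x[i + j] * kernel[j] for j in range(k)) for i in range(len(x) - k + 1)]
    return [max(conv[i:i + pool]) for i in range(0, len(conv) - pool + 1, pool)]


def windows_recompute(x, window, step, kernel, pool) -> List[List[float]]:
    """Related-art style: every processing area is processed from scratch."""
    return [extract(x[s:s + window], kernel, pool)
            for s in range(0, len(x) - window + 1, step)]


def final_output(x, j: int, kernel, pool: int) -> float:
    """Final-layer value number j, counted from the start of the whole stream."""
    base = j * pool
    return max(sum(x[base + m + i] * kernel[i] for i in range(len(kernel)))
               for m in range(pool))


def windows_reuse(x, window, step, kernel, pool) -> List[List[float]]:
    """Compute only final-layer values not produced in a previous cycle and
    keep earlier values in a fixed-length final layer buffer."""
    if step % pool:
        raise ValueError("sketch assumes step is a multiple of the pooling size")
    per_window = ((window - len(kernel) + 1) - pool) // pool + 1
    final_buffer = deque(maxlen=per_window)  # oldest values drop out automatically
    next_j = 0
    results = []
    for start in range(0, len(x) - window + 1, step):
        last_j = start // pool + per_window  # exclusive index of values needed
        while next_j < last_j:               # non-overlapping part only
            final_buffer.append(final_output(x, next_j, kernel, pool))
            next_j += 1
        results.append(list(final_buffer))   # passed to the fully connected layer
    return results


data = [(i * 7) % 5 for i in range(16)]      # arbitrary example samples
reused = windows_reuse(data, window=8, step=4, kernel=[1, 0, -1], pool=2)
recomputed = windows_recompute(data, window=8, step=4, kernel=[1, 0, -1], pool=2)
```

Under these assumptions both functions return the same values for every processing area, while windows_reuse computes each final-layer value only once.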

[0052] The processing of the third processing area 3 and subsequent areas is executed in the same manner as the processing in the second processing area 2, and the processing result of the second processing area 2 is reused in the overlapping area of the second processing area 2 and the third processing area 3. In FIG. 6, there is an area in which the first processing area 1, the second processing area 2, and the third processing area 3 overlap, and in that area, the processing result of the first processing area 1 is reused for the third processing area 3.

[0053] In the present embodiment, when the time-series data 100 (input data) acquired by the data acquisition unit 21 is divided at the fixed interval T and the processing area is set, the processing areas are allowed to overlap, and the overlapping area is set. By setting the overlapping area, the portion of the time-series data 100 that well represents a feature can be reliably covered, and the execution frequency of inference by the CNN can be increased. It is therefore possible to infer the feature of the time-series data 100 with good precision.

[0054] In the present embodiment, when the processing in the processing area is executed, the processing result is reused in the overlapping area. That is, the processing result of the overlapping area of the previous cycle is output as the processing result of the overlapping area of the current cycle. As a result, the processing (operation) amount of the CNN can be reduced.
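The reduction in the processing amount can be illustrated by counting product-sum terms in the first convolution layer alone. The function first_layer_macs is a hypothetical helper. It assumes a slide amount of 1 for the kernel, no padding, and an overlap of at least the kernel size minus one samples, and it ignores all later layers.

```python
from typing import Tuple


def first_layer_macs(window: int, step: int, n_windows: int, kernel_size: int) -> Tuple[int, int]:
    """Return (recompute, reuse) counts of multiply-accumulate terms.

    recompute: every processing area is convolved from scratch.
    reuse    : each kernel position over the covered samples is computed once.
    """
    per_window_positions = window - kernel_size + 1
    recompute = n_windows * per_window_positions * kernel_size

    covered_samples = window + (n_windows - 1) * step
    reuse = (covered_samples - kernel_size + 1) * kernel_size
    return recompute, reuse


recompute, reuse = first_layer_macs(window=8, step=4, n_windows=3, kernel_size=3)
```

The gap between the two counts grows as the number of processing areas increases or as the slide amount of the input window becomes smaller relative to the window.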

[0055] In the present embodiment, in the processing of the first processing area 1 that is the first cycle, the processing (operation) is sequentially executed from the first layer (the convolution layer 231) to the final layer (the pooling layer 234). During the processing of the second processing area 2 and subsequent areas, which is the processing of the second cycle and subsequent cycles, in the non-overlapping area of the previous cycle and the current cycle, the processing (operation) is sequentially executed from the first layer (the convolution layer 231) to the final layer (the pooling layer 234), and in the overlapping area of the previous cycle and the current cycle, the processing result of the previous cycle is reused. As a result, it is possible to sequentially execute the processing from the first layer to the final layer with respect to the time-series data 100 that is the input data, and there is no waiting time for the processing, so that the processing time can be shortened.

[0056] In the present embodiment, the final layer (the pooling layer 234) includes the final layer buffer 234b that stores the output data (processing data) input to the fully connected layer 235. The final layer buffer 234b stores the processing result (processing data) in the previous cycle in the overlapping area of the previous cycle and the current cycle, and stores the processing result (processing data) obtained by sequentially executing the processing from the first layer to the final layer in the non-overlapping area of the previous cycle and the current cycle. Then, when the processing in the processing area of the current cycle is completed, the processing data of the overlapping area and the processing data of the non-overlapping area stored in the final layer buffer 234b are input to the fully connected layer 235. As a result, the processing result (processing data) in the overlapping area is stored in the final layer buffer 234b that stores the output data (processing data) input to the fully connected layer 235 and reused, so that the processing amount (operation amount) can be reduced.

[0057] In the present embodiment, in the non-overlapping area, the processing of each layer is executed as soon as data that can be processed by the corresponding kernel is prepared, that is, when the number of the time-series data 100, the processing data stored in the first layer buffer 231b, the processing data stored in the second layer buffer 232b, or the processing data stored in the third layer buffer 233b reaches the number corresponding to the size of the corresponding kernel or its slide amount. As a result, since the processing can be executed without waiting for all the data in the non-overlapping area to be prepared, the processing time can be shortened.

[0058] The embodiments disclosed this time should be considered to be exemplary and not restrictive in all respects. The scope of the present disclosure is set forth by the claims rather than the description of the embodiments, and is intended to include all modifications within the meaning and scope of the claims.

CPC classification: G06F9/345, G06N3/0464, G06F17/153, G06N3/063 (section G, PHYSICS)

International classification: G06F9/345 (section G, PHYSICS)
