INFORMATION PROCESSING APPARATUS, OUTPUT METHOD, AND STORAGE MEDIUM
20220222973 · 2022-07-14
Assignee
Inventors
CPC classification
G06V10/765 (PHYSICS)
G06V20/52 (PHYSICS)
G06V40/23 (PHYSICS)
G06V40/25 (PHYSICS)
International classification
Abstract
An information processing apparatus includes processors configured to detect a plurality of movements of an object from a moving image, generate a first timing that indicates a first movement included in the plurality of movements is detected in the moving image for each of a plurality of time units of the moving image, acquire second timings that indicate the first movement within each of at least one of patterns including the first movement, the second timings indicating when movements occur for each of a plurality of time units of a time period, obtain a plurality of first similarity values by calculating a first similarity value between the moving image and each of the patterns based on the first timing and each of the second timings, and specify a candidate pattern from the patterns based on the plurality of first similarity values.
Claims
1. An information processing apparatus comprising: one or more memories configured to store a plurality of patterns for recognition of movement within at least one moving image; and one or more processors coupled to the one or more memories and the one or more processors configured to: detect a plurality of movements of an object from a moving image, generate a first timing that indicates a first movement included in the plurality of movements is detected in the moving image for each of a plurality of time units of the moving image, acquire second timings that indicate the first movement within each of at least one of patterns of the plurality of patterns including the first movement, the second timings indicating when movements occur for each of a plurality of time units of time period, obtain a plurality of first similarity values by calculating a first similarity value between the moving image and each of the patterns based on the first timing and each of the second timings, and specify a candidate pattern from the patterns based on the plurality of first similarity values.
2. The information processing apparatus according to claim 1, wherein the one or more processors are further configured to recognize a combined movement that includes the first movement and a second movement from the moving image, wherein the combined movement is associated with the patterns.
3. The information processing apparatus according to claim 1, wherein each of the plurality of time units is a frame, wherein the one or more processors further configured to: acquire a first ratio of a number of frames in which the first movement is detected within the first timing to a total number of frames included in the first timing, acquire a plurality of second ratios, each of the plurality of the second ratios being a ratio of a number of frames in which the first movement occurs within each of the second timings to a total number of frames included in each of the second timings, and acquire differences between the first ratio and each of the plurality of the second ratios.
4. The information processing apparatus according to claim 1, wherein the one or more processors are further configured to generate the first timing by Dynamic Time Warping method.
5. The information processing apparatus according to claim 1, wherein the object is a human.
6. The information processing apparatus according to claim 1, wherein the plurality of patterns are classified into a first rule group and a second rule group, wherein the one or more processors further configured to: specify a first candidate rule from the first rule group based on the plurality of first similarity values, and specify a second candidate rule from the second rule group based on the plurality of first similarity values.
7. The information processing apparatus according to claim 2, wherein the one or more processors are further configured to: generate a third timing that indicates the second movement included in the plurality of movements is detected in the moving image for each of the plurality of time units, acquire fourth timings that indicate the second movement within each of at least one of patterns of the plurality of patterns including the second movement, the fourth timings indicating when movements occur for each of the plurality of time units of the time period, obtain a plurality of second similarity values by calculating a second similarity value between the moving image and each of the patterns based on the third timing and each of the fourth timings, calculate a plurality of total similarity values by weighting the plurality of first similarity values and the plurality of second similarity values based on an occurrence of the first movement and an occurrence of the second movement in each of the patterns, and specify the candidate pattern based on the plurality of total similarity values.
8. The information processing apparatus according to claim 1, wherein the one or more processors are further configured to developing, from the candidate pattern, a new recognition rule for artificial intelligence of a combined movement of the object within a plurality of moving images, the combined movement includes at least a plurality of movements.
9. An output method for a computer to execute a process comprising: detecting a plurality of movements of an object from a moving image; generating a first timing that indicates a first movement included in the plurality of movements is detected in the moving image for each of a plurality of time units of the moving image; acquiring second timings that indicate the first movement within each of at least one of patterns including the first movement, the second timings indicating when movements occur for each of a plurality of time units of time period, the patterns are included in a plurality of patterns for recognition of movement within at least one moving image; obtaining a plurality of first similarity values by calculating a first similarity value between the moving image and each of the patterns based on the first timing and each of the second timings; and specifying a candidate pattern from the patterns based on the plurality of first similarity values.
10. The output method according to claim 9, wherein the process further comprising recognizing a combined movement that includes the first movement and a second movement from the moving image, wherein the combined movement is associated with the patterns.
11. The output method according to claim 9, wherein each of the plurality of time units is a frame, and the obtaining the plurality of first similarity values includes: acquiring a first ratio of a number of frames in which the first movement is detected within the first timing to a total number of frames included in the first timing; acquiring a plurality of second ratios, each of the plurality of the second ratios being a ratio of a number of frames in which the first movement occurs within each of the plurality of pieces of the second timings to a total number of frames included in each of the plurality of pieces of the second timings; and acquiring differences between the first ratio and each of the plurality of the second ratios.
12. The output method according to claim 10, wherein the generating the first timing includes generating the first timing by Dynamic Time Warping method.
13. The output method according to claim 9, wherein the object is a human.
14. The output method according to claim 9, wherein the plurality of patterns are classified into a first rule group and a second rule group, wherein the process further comprising: specify a first candidate rule from the first rule group based on the plurality of first similarity values; and specify a second candidate rule from the second rule group based on the plurality of first similarity values.
15. The output method according to claim 10, wherein the process further comprising: generating a third timing that indicates the second movement included in the plurality of movements is detected in the moving image for each of the plurality of time units; acquiring fourth timings that indicate the second movement within each of at least one of patterns of the plurality of patterns including the second movement, the fourth timings indicating when movements occur for each of the plurality of time units of the time period; obtaining a plurality of second similarity values by calculating a second similarity value between the moving image and each of the patterns based on the third timing and each of the fourth timings; calculating a plurality of total similarity values by weighting the plurality of first similarity values and the plurality of second similarity values based on an occurrence of the first movement and an occurrence of the second movement in each of the patterns; and specifying the candidate pattern based on the plurality of total similarity values.
16. The output method according to claim 9, wherein the process further comprising: developing, from the candidate pattern, a new recognition rule for artificial intelligence of a combined movement of the object within a plurality of moving images, the combined movement includes at least a plurality of movements.
17. A non-transitory computer-readable storage medium storing an output program that causes at least one computer to execute a process, the process comprising: detecting a plurality of movements of an object from a moving image; generating a first timing that indicates a first movement included in the plurality of movements is detected in the moving image for each of a plurality of time units of the moving image; acquiring second timings that indicate the first movement within each of at least one of patterns including the first movement, the second timings indicating when movements occur for each of a plurality of time units of time period, the patterns are included in a plurality of patterns for recognition of movement within at least one moving image; obtaining a plurality of first similarity values by calculating a first similarity value between the moving image and each of the patterns based on the first timing and each of the second timings; and specifying a candidate pattern from the patterns based on the plurality of first similarity values.
Description
DESCRIPTION OF EMBODIMENTS
[0023] For example, in a case where a recognition model that detects an action of a recognition object is generated by deep learning or the like, a large amount of moving image data for learning is needed for each action to be recognized. In addition, for example, it may take time or it may be difficult to collect moving image data for learning, and it may be difficult to generate a recognition model that recognizes an action of a recognition object.
[0024] Incidentally, for example, an action of a person is generated from a combination of basic movements of a person, such as walking, shaking the head, and reaching out a hand. Thus, for example, it is conceivable that a recognition model that recognizes various basic movements of a person is created in advance, and a rule for recognizing complicated actions of a person such as a suspicious action and a purchase action is described for a combination of the basic movements, to detect an action. In addition, by defining the rule for the combination of the basic movements in this way, it becomes possible to recognize an action of a recognition object by using the rule without preparing a large number of moving images in which actions of the recognition object are captured.
[0025] However, know-how and experience are needed to generate a rule that combines basic movements to recognize a target action. In addition, generating a rule for each of the various actions to be recognized takes manpower and cost. Therefore, when a new rule for recognizing an action is created, it is preferable that a rule created in the past can be diverted (reused).
[0026] In one aspect, it is an object of an embodiment to specify, among existing rules, a rule for recognizing an action similar to an action captured in a moving image.
[0027] It is possible to specify, among existing rules, a rule for recognizing an action similar to an action captured in a moving image.
[0028] Hereinafter, several embodiments will be described in detail with reference to the drawings. Note that corresponding elements in a plurality of drawings are denoted by the same reference sign.
[0029]
[0030] When receiving moving image data, the information processing apparatus 101 analyzes the received moving image and detects a recognition object ((1) in
[0031] Subsequently, the information processing apparatus 101 recognizes a basic movement from the recognition object captured in the moving image ((2) in
[0032] Subsequently, the information processing apparatus 101 recognizes whether the recognition object has taken an action corresponding to a rule on the basis of whether the basic movement of the recognition object detected from the moving image conforms to the rule ((3) in
[0033] In this way, by defining a rule for detecting an action of a recognition object by using a pattern of basic movements, it is possible to detect an action of the recognition object by using the rule without preparing a large number of moving images for learning in which actions of the recognition object are captured. Thus, for example, even in a case where a system that detects an action of a recognition object is introduced, it is possible to easily introduce the system without trouble of preparing learning data, and the like.
[0034] However, know-how and experience are needed to generate such a rule. In addition, generating a rule for each of the various actions to be recognized takes manpower and cost. Therefore, when a new rule for detecting an action is created, it is preferable that a rule created in the past can be diverted. For example, when a rule generated in the past for recognizing an action similar to the action to be recognized can be diverted, the labor involved in generating a rule may be reduced.
[0035] As one method of specifying a rule for detecting an action similar to an action of a recognition object from the rules generated in the past, in one example, it is conceivable to execute retrieval with a keyword on the rules generated in the past. For example, it is assumed that metadata such as a name given to data of a rule includes a keyword related to an action to be detected by the rule. In this case, there is a possibility that a rule for detecting an action similar to an action of a recognition object may be specified by executing the retrieval with a character string or the like representing an action to be recognized.
[0036] However, in practice, information registered in the metadata or the like may vary from person to person. As one example, even a rule for recognizing the same action may be titled “screw fastening”, or may be titled “process 1-A” or the like. Alternatively, even when an action is the same “screw fastening” action, in practice, a basic movement that characterizes the action may differ depending on a fixing position where a screw is fastened, or the like. Thus, it may be difficult to specify a rule suitable for diversion by the retrieval with a keyword.
[0037] Furthermore, as another method, it is also conceivable to retrieve a moving image similar to a moving image in which the action to be recognized is captured, extract the rule created from the retrieved moving image, and divert that rule. However, current similar-moving-image retrieval relies on features such as color and the number of persons captured in the moving image, as described in Kimura, Shogo et al., “Construction of Similar Moving Image Retrieval System with Similar Reason Presentation Function”, 49th Programming Symposium, pp. 97-106, January 2008, and it is difficult to expect that a moving image in which a similar action is captured will be properly retrieved. Therefore, it is desired to provide a technique for specifying, from existing rules, a rule for recognizing an action similar to an action captured in a moving image to be recognized.
[0038] In the embodiment described below, the information processing apparatus 101 stores, in a storage device, a plurality of rules for defining patterns of basic movements for detecting an action of an object in association with first time-series information representing detection timing of a plurality of basic movements included in the patterns of basic movements in time series. Note that each of the rules may be, for example, a rule for detecting a different action. Then, in a case where a moving image for which a new rule is to be created is input, the information processing apparatus 101 executes detection of a basic movement from the moving image and detects at least one basic movement. Subsequently, the information processing apparatus 101 evaluates, for each of a plurality of rules, a degree of similarity between the rule and the moving image on the basis of the first time-series information associated with the rule and second time-series information representing detection timing of the at least one basic movement of the moving image in time series. Then, on the basis of the degree of similarity, the information processing apparatus 101 outputs a candidate rule to be a candidate for diversion from the plurality of rules.
[0039] In this way, by evaluating the degree of similarity between the rule and the moving image on the basis of the first time-series information associated with the rule and the second time-series information regarding the at least one basic movement detected from the moving image, it is possible to efficiently specify a similar rule that is likely to be diverted. Hereinafter, the embodiment will be described in more detail.
[0040]
[0041] As described above, in the embodiment, rules created in the past are associated with recognition results of basic movements of actions recognized by the rules, and accumulated in the rule information 300. Hereinafter, the rule information 300 in which information regarding the rules is accumulated will be described by taking a graph database (DB) having a graph structure as an example.
[0042]
[0043]
[0044]
[0045]
[0046] Note that the class definitions of the basic movement recognition result 301 indicated in
[0047]
[0048] The refer indicates that “S was created with reference to O”, as indicated in
[0049] The generate indicates that “S generates O”, as indicated in
[0050] The source indicates that “S is information indicating a part of O”, as indicated in
[0051] Note that the definitions of the predicates indicated in
[0052] As described above, in one embodiment, the existing rules 302 are accumulated as the rule information 300 in association with recognition results of basic movements included in actions recognized by the rules by using the graph structure.
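The graph-structured accumulation described above may be pictured as a list of subject-predicate-object triples. The following Python code is a minimal sketch only: the node identifiers, the `Triple` type, the `objects_of` helper, and the particular direction in which each predicate (refer, generate, source) connects the rule 302, the basic movement recognition result 301, and the action detection period 303 are assumptions introduced for illustration and are not taken from the embodiment.

```python
from collections import namedtuple

# Hypothetical triple type: "subject --predicate--> obj".
Triple = namedtuple("Triple", ["subject", "predicate", "obj"])

# Hypothetical rule information. All identifiers below are illustrative.
RULE_INFORMATION = [
    # Assumed: the rule was created with reference to a recognition result.
    Triple("rule:take_product", "refer", "result:video_001"),
    # Assumed: applying the rule generates an action detection period.
    Triple("rule:take_product", "generate", "period:video_001"),
    # Assumed: the detection period indicates a part of the recognition result.
    Triple("period:video_001", "source", "result:video_001"),
]


def objects_of(triples, subject, predicate):
    """Return every object linked from `subject` by `predicate`."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]
```

Under these assumptions, `objects_of(RULE_INFORMATION, "rule:take_product", "refer")` would look up the recognition results associated with one rule, which is the kind of lookup needed when a rule is compared with an input moving image.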
[0053] Subsequently, specification of a candidate rule from the existing rules for a moving image in which an action of a recognition object is captured according to the embodiment will be described.
[0054] In Step 801 (hereinafter, Step is described as “S”, and denoted as, for example, S801), the control unit 201 of the information processing apparatus 101 receives input of moving image data in which an action for which a rule is to be created is captured.
[0055] In S802, the control unit 201 executes recognition of a basic movement for the input moving image. For example, the control unit 201 may recognize the basic movement from the moving image by using a recognition model machine-learned by deep learning or the like so as to recognize a basic movement to be recognized. As described above, the basic movement may be, for example, a basic movement taken by the object, and in one example, may include a movement of each part obtained by dividing the body of the object into parts for each joint. Furthermore, examples of the basic movement may include movements that the object often takes in various situations, such as walking, running, throwing, grasping, kicking, jumping, and eating. Note that a recognition result obtained by executing the recognition of the basic movement for the input moving image may be referred to as, for example, the second time-series information.
[0056] In S803, the control unit 201 selects one unprocessed rule 302 from the rules 302 of the rule information 300.
[0057] In S804, the control unit 201 acquires the basic movement recognition result 301 associated with the selected rule 302 from the rule information 300.
[0058] In S805, the control unit 201 evaluates a degree of similarity between the selected rule 302 and the input moving image in the basic movements. For example, the control unit 201 may evaluate a degree of similarity between the basic movement recognition result 301 associated with the selected rule 302 and a recognition result of the basic movement detected from the input moving image. Hereinafter, an example of the evaluation of the degree of similarity in the basic movements according to one embodiment will be described with reference to
[0059] [Example of Evaluation of Degree of Similarity in Basic Movements]
[0060]
[0061] Information regarding a basic movement of the rule 302 for detecting the action of taking a product from a shelf may be acquired from, for example, the body property of the rule 302. In one example, the rule 302 for detecting the action of taking a product from a shelf may be defined as a rule for detecting a basic movement: walking, then a basic movement: turning the right hand forward. Note that the definition of the rule is exemplary, and the rule 302 for detecting the action of taking a product from a shelf may be defined by another pattern of basic movements.
[0062] Furthermore, in
[0063] Moreover, in
[0064] As described above, for example, the control unit 201 may acquire the information indicated in
[0065] Subsequently, with reference to
[0066] Furthermore, in
[0067] As described above, for example, the control unit 201 may acquire the information indicated in
[0068] Then, by using the information indicated in
[0069] Note that the length of a period in which an action to be detected is detected may differ depending on the moving image. For example, in the selected rule 302 in
[0070]
[0071] In this case, corresponding series of the same length may be obtained by using the dynamic time warping method. The dynamic time warping method is, for example, an algorithm that computes the distance between every pair of points of two time series (an all-pairs comparison) and, after obtaining all the distances, finds the alignment path along which the accumulated distance between the two time series is the smallest. In the obtained corresponding series, every piece of data of the selected rule is associated with a piece of data of the moving image, and vice versa.
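As a minimal sketch of the alignment step described above (not the embodiment's actual implementation), the following Python code applies dynamic time warping to two per-frame detection series of different lengths and returns corresponding series of equal length. The function name `dtw_align`, the use of the absolute difference as the local distance, and the tie-breaking order during backtracking are assumptions introduced for illustration.

```python
def dtw_align(a, b):
    """Align two numeric sequences with dynamic time warping.

    Returns two equal-length lists in which every element of `a` and
    every element of `b` is paired with at least one element of the other.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j]: minimal accumulated distance aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # assumed local distance
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    # Backtrack from the end along the cheapest predecessors.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return [a[p] for p, _ in path], [b[q] for _, q in path]
```

For example, calling `dtw_align([0, 1, 1, 0], [0, 0, 1, 1, 1, 0])` stretches the shorter series so that both outputs have the same number of entries, after which a frame-by-frame similarity can be computed.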
[0072] Then, the control unit 201 calculates a degree of similarity between the corresponding series. For example, the control unit 201 may use a Jaccard index of the corresponding series as the degree of similarity. The Jaccard index may be obtained by, for example, the following equation.
Jaccard index = (number of frames in which both series are 1) / (number of frames in which at least one series is 1)
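The Jaccard index above may be computed directly from two corresponding series of equal length. The following Python code is a minimal sketch; the function name `jaccard_index`, the 0/1 encoding of "detected" per frame, and the choice to return 0.0 when neither series contains a 1 are assumptions made for illustration.

```python
def jaccard_index(rule_series, video_series):
    """Jaccard index of two equal-length per-frame 0/1 series."""
    if len(rule_series) != len(video_series):
        raise ValueError("series must have the same length")
    pairs = list(zip(rule_series, video_series))
    # Frames in which the movement is detected in both series.
    both = sum(1 for r, v in pairs if r == 1 and v == 1)
    # Frames in which the movement is detected in at least one series.
    either = sum(1 for r, v in pairs if r == 1 or v == 1)
    if either == 0:
        return 0.0  # assumed convention when nothing is detected
    return both / either
```

With the series `[0, 1, 1, 1, 0]` and `[0, 0, 1, 1, 1]`, two frames are 1 in both series and four frames are 1 in at least one, so the index is 2/4 = 0.5.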
[0073] As illustrated in
[0074] Note that the degree of similarity according to the embodiment is not limited to the Jaccard index, and may be another degree of similarity. For example, in another embodiment, a Dice index, a Simpson index, or the like may be used. Furthermore, for example, in a case where the basic movement recognition result 301 is represented by a vector, a cosine degree of similarity or the like may be adopted.
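For comparison, the alternative measures named above may be sketched as follows. The formulas used are the standard definitions (Dice: twice the overlap divided by the sum of the sizes; Simpson, also called the overlap coefficient: the overlap divided by the smaller size; cosine: the dot product divided by the product of the norms). The function names and the handling of all-zero series are assumptions introduced for illustration.

```python
import math


def _counts(x, y):
    """Return (overlap, size of x, size of y) for two 0/1 series."""
    overlap = sum(1 for p, q in zip(x, y) if p == 1 and q == 1)
    return overlap, sum(x), sum(y)


def dice_index(x, y):
    overlap, nx, ny = _counts(x, y)
    return 0.0 if nx + ny == 0 else 2 * overlap / (nx + ny)


def simpson_index(x, y):
    overlap, nx, ny = _counts(x, y)
    smaller = min(nx, ny)
    return 0.0 if smaller == 0 else overlap / smaller


def cosine_similarity(x, y):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(p * q for p, q in zip(x, y))
    norm = math.sqrt(sum(p * p for p in x)) * math.sqrt(sum(q * q for q in y))
    return 0.0 if norm == 0 else dot / norm
```

For the series `[0, 1, 1, 1, 0]` and `[0, 0, 1, 1, 1]`, each series has three frames set to 1 and two of them overlap, so all three measures evaluate to 2/3.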
[0075] For example, as described above, the control unit 201 may evaluate the degree of similarity between the recognition results of the corresponding basic movement between the selected rule 302 and the input moving image.
[0076] In S806, the control unit 201 evaluates a degree of similarity to the rule 302. For example, in a case where degrees of similarity between a corresponding plurality of basic movements are obtained between the rule 302 and the input moving image in S805, the control unit 201 may further obtain a representative degree of similarity that represents the degrees of similarity between the corresponding plurality of basic movements. For example, between the rule 302 illustrated in
[0077] In one example, the control unit 201 may use an average value of the degrees of similarity obtained for the recognition results of the corresponding basic movements as the representative degree of similarity. For example, it is assumed that a degree of similarity between the recognition result of the basic movement: walking associated with the rule 302 in
[0078] Furthermore, in another embodiment, a weighted average may also be used to acquire the representative degree of similarity. For example, weighting may be performed according to an appearance frequency of a basic movement in the rule information 300.
[0079] For example, it is assumed that 100 rules 302 are registered in the rule information 300. Furthermore, it is assumed that, among these 100 rules 302, the number of rules 302 in which walking is registered as a basic movement used for detection of an action is 50. On the other hand, it is assumed that, among these 100 rules 302, the number of rules 302 in which turning the right hand forward is registered as a basic movement used for the detection of an action is 10. In this case, the appearance frequency of the basic movement: turning the right hand forward is lower than that of the basic movement: walking, and the basic movement: turning the right hand forward is a rare basic movement in the rule information 300. In addition, a basic movement that appears infrequently in the rule information 300 may be more important in the detection of an action by the rule 302, or may more strongly characterize the rule 302, than a basic movement that appears frequently. Thus, in one embodiment, the lower the appearance frequency of a basic movement in the rule information 300, the more strongly the control unit 201 may reflect the degree of similarity obtained for the recognition result of that basic movement in the representative degree of similarity.
[0080] For example, in the example described above, there are 100 rules 302 in the rule information 300, and 50 rules 302 among them include walking as a basic movement of interest. Thus, a weighting coefficient of 2 may be obtained with 100/50=2. Similarly, there are 100 rules 302 in the rule information 300, and 10 rules 302 among them include turning the right hand forward as a basic movement of interest. Thus, a weighting coefficient of 10 may be obtained with 100/10=10. Then, the control unit 201 may use the obtained weighting coefficients to calculate a weighted average such that (2*0.9417+10*0.7808)/(2+10)=0.8076, and acquire the representative degree of similarity.
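The weighting in this worked example may be sketched in Python as follows. The function names `frequency_weights` and `weighted_similarity` and the dictionary keys are hypothetical; the numbers are those given in the example above.

```python
def frequency_weights(total_rules, rules_containing):
    """Weight each basic movement by total_rules / (rules that contain it)."""
    return {movement: total_rules / count
            for movement, count in rules_containing.items()}


def weighted_similarity(similarities, weights):
    """Weighted average of per-movement degrees of similarity."""
    total_weight = sum(weights[m] for m in similarities)
    return sum(weights[m] * s for m, s in similarities.items()) / total_weight


# Values from the example: 100 rules, 50 with walking, 10 with the hand movement.
weights = frequency_weights(100, {"walking": 50, "right_hand_forward": 10})
score = weighted_similarity(
    {"walking": 0.9417, "right_hand_forward": 0.7808}, weights)
print(round(score, 4))  # 0.8076
```

Because the rarer movement receives the larger weight (10 versus 2), the result lies closer to 0.7808 than the unweighted average of the two values would.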
[0081] In addition, in the processing of S806, the control unit 201 may use the obtained representative degree of similarity as the degree of similarity between the selected rule 302 and the input moving image.
[0082] In S807, the control unit 201 determines whether or not there is an unprocessed rule 302 in the rule information 300. In a case where there is an unprocessed rule 302 in the rule information 300 (YES in S807), the flow returns to S803, and the control unit 201 selects the unprocessed rule 302 and repeats the processing. On the other hand, in a case where there is no unprocessed rule 302 in the rule information 300 (NO in S807), the flow proceeds to S808.
[0083] In S808, the control unit 201 specifies and outputs a candidate rule on the basis of the degrees of similarity. For example, the control unit 201 may rearrange the rules 302 in the rule information 300 such that a rule 302 with a high degree of similarity is arranged higher than a rule 302 with a low degree of similarity, and output information indicating the rules 302 as candidate rules. Furthermore, in another example, the control unit 201 may output information indicating a predetermined number of rules 302 with a high degree of similarity as candidate rules.
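The two output options described for S808 may be sketched as a single sort followed by an optional cut-off. The function name `rank_candidate_rules`, the `top_k` parameter, and the rule identifiers in the usage note are assumptions made for illustration.

```python
def rank_candidate_rules(similarity_by_rule, top_k=None):
    """Sort rules by degree of similarity, highest first.

    similarity_by_rule: dict mapping a rule identifier to its degree of
    similarity with the input moving image.
    top_k: if given, only the top_k most similar rules are returned.
    """
    ranked = sorted(similarity_by_rule.items(),
                    key=lambda item: item[1],
                    reverse=True)
    return ranked if top_k is None else ranked[:top_k]
```

For instance, `rank_candidate_rules({"rule_a": 0.31, "rule_b": 0.81, "rule_c": 0.55}, top_k=2)` would return the entries for `rule_b` and `rule_c`, in that order.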
[0084] Furthermore, when outputting the candidate rule, the control unit 201 may output a moving image specified by a moving image property of the basic movement recognition result 301 corresponding to the candidate rule. With this configuration, a user may watch the moving image corresponding to the candidate rule, and may easily confirm whether the output candidate rule is suitable for diversion.
[0085] Furthermore, for example, the rules 302 accumulated in the rule information 300 may be classified into a plurality of groups in advance according to a type of an action to be detected, or the like. In this case, the control unit 201 may output, in S808, a predetermined number of rules 302 with the highest degrees of similarity for each group. The grouping of the rules 302 may be executed, for example, on the basis of the degrees of similarity. In one example, the control unit 201 evaluates the degrees of similarity between the rules 302 included in the rule information 300. Then, the control unit 201 may classify the rules 302 in the rule information 300 into a plurality of groups by placing rules 302 whose mutual degree of similarity is equal to or higher than a predetermined value in the same group. Alternatively, a user may execute the grouping of the rules 302 in advance such that rules 302 that are similar to each other are in the same group.
[0086] Then, by performing the grouping in this way and outputting the rule 302 for each group, it is possible to suppress a plurality of substantially the same rules 302 from being specified as candidate rules. For example, in a case where it is desired to retrieve a rule 302 for detecting an action similar to an action captured in a moving image, it may be desirable to specify a rule 302 with a high degree of similarity among rules 302 for detecting not only similar actions but also various actions. By performing the grouping and specifying a candidate rule from the rule 302 for each group, the rule 302 for various actions may be specified as the candidate rule.
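The grouping and per-group output described above may be sketched as follows. This is one possible reading only: rules are merged into a group whenever a pair reaches the threshold (a single-linkage style grouping using a union-find structure), which is an assumption, as are the function names `group_rules` and `top_per_group` and their parameters.

```python
def group_rules(rule_ids, pair_similarity, threshold):
    """Group rules whose pairwise similarity is at least `threshold`.

    pair_similarity: callable taking two rule identifiers and returning
    their degree of similarity (hypothetical interface).
    """
    parent = {r: r for r in rule_ids}

    def find(r):
        # Follow parent links to the group representative.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    for i, a in enumerate(rule_ids):
        for b in rule_ids[i + 1:]:
            if pair_similarity(a, b) >= threshold:
                parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for r in rule_ids:
        groups.setdefault(find(r), []).append(r)
    return list(groups.values())


def top_per_group(groups, video_similarity, n=1):
    """Pick the n rules most similar to the moving image from each group."""
    selected = []
    for members in groups:
        ranked = sorted(members,
                        key=lambda r: video_similarity[r],
                        reverse=True)
        selected.extend(ranked[:n])
    return selected
```

Selecting from each group separately keeps near-duplicate rules from filling the candidate list, so that rules for different kinds of actions can appear among the candidates.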
[0087] As described above, according to the embodiment, when a moving image in which an action desired to be recognized is captured is prepared, a rule 302 focusing on a basic movement that characterizes the action may be specified as a candidate rule. In addition, in one example, when an error due to imaging conditions or the like such as an angle of a subject included in the moving image to be recognized and image quality of the imaging device is adjusted by parameter fitting, the control unit 201 may start detecting an action of a recognition object from the moving image by using the candidate rule. Alternatively, a user may edit the candidate rule to generate a new rule 302 suitable for the moving image. In this case as well, by diverting the candidate rule, the new rule 302 may be created on the basis of the rule 302 in which the basic movement of interest or the like is specified, so that a creation cost of the rule 302 may be reduced.
[0088] (Modification)
[0089] Subsequently, a modification will be described. For example, one rule 302 may be applied to a plurality of moving images. In this case, for example, the basic movement recognition result 301 and the action detection period 303 may be acquired from each of the moving images and registered in the rule information 300.
[0090]
[0091] It is assumed that, in this way, one rule 302 is applied to the basic movement recognition results 301 of the plurality of moving images and the plurality of action detection periods 303 is generated. In this case as well, by evaluating, for each of the basic movement recognition results 301, a degree of similarity with a recognition result of a basic movement in an input moving image, and acquiring a representative degree of similarity representing the plurality of degrees of similarity, the control unit 201 may evaluate a degree of similarity between the input moving image and the rule 302.
[0092]
[0093] Subsequent processing from S1301 to S1305 may correspond to, for example, the processing from S801 to S805, and the control unit 201 may execute the processing similar to the processing from S801 to S805.
[0094] In S1306, the control unit 201 determines whether or not there is an unprocessed basic movement recognition result 301 associated with the selected rule 302. Then, in a case where there is an unprocessed basic movement recognition result 301 (YES in S1306), the flow returns to S1304, and the processing is repeated for the unprocessed basic movement recognition result 301. On the other hand, in a case where there is no unprocessed basic movement recognition result 301 (NO in S1306), the flow proceeds to S1307.
[0095] In S1307, the control unit 201 evaluates a degree of similarity of the selected rule 302. For example, when there is one basic movement recognition result 301 associated with the selected rule 302, the control unit 201 may obtain a representative degree of similarity representing degrees of similarity obtained for corresponding basic movements, and use the representative degree of similarity as the degree of similarity of the rule 302. On the other hand, in a case where there is a plurality of basic movement recognition results 301 associated with the selected rule 302, a degree of similarity is obtained for each basic movement of each basic movement recognition result 301. In this case, the control unit 201 obtains, for each basic movement recognition result 301, a representative degree of similarity representing degrees of similarity of corresponding basic movements. In addition, the control unit 201 may obtain a further representative degree of similarity representing the representative degrees of similarity obtained for the basic movement recognition results 301, and use this further representative degree of similarity as the degree of similarity between the moving image and the rule 302. Note that the representative degree of similarity may be, for example, a statistical value of the plurality of degrees of similarity that it represents, such as an average value, a median value, a minimum value, or a maximum value.
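The two-stage aggregation of S1307 can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function name rule_similarity, the dictionary layout, and the default choice of an average within each basic movement recognition result 301 and a maximum across results are assumptions. Any of the statistical values named above could be substituted.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence

Aggregator = Callable[[Sequence[float]], float]


def rule_similarity(
    per_result: List[Dict[str, float]],
    within: Aggregator = mean,
    across: Aggregator = max,
) -> float:
    """Two-stage representative degree of similarity for one rule.

    per_result has one dict per basic movement recognition result,
    mapping a basic movement name to its degree of similarity with the
    input moving image. All names here are hypothetical.
    """
    if not per_result:
        raise ValueError("at least one recognition result is required")
    # Stage 1: one representative value per recognition result.
    representatives = [within(list(sims.values())) for sims in per_result]
    # Stage 2: one representative value across all recognition results.
    return across(representatives)


per_result = [
    {"walk": 1.0, "right_hand_forward": 0.5},
    {"walk": 1.0, "right_hand_forward": 0.0},
]
score = rule_similarity(per_result)                     # mean, then max
strict_score = rule_similarity(per_result, across=min)  # mean, then min
```

With a single recognition result the second stage has only one value to aggregate, so the function reduces to the single-result case described above.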
[0096] Subsequent processing of S1308 and S1309 may correspond to, for example, the processing of S807 and S808, and the control unit 201 may execute the processing similar to the processing of S807 and S808.
[0097] As described above, for example, it is assumed that the rule 302 is applied to the basic movement recognition results 301 of a plurality of moving images. In this case as well, on the basis of a plurality of pieces of first time-series information associated with the rule and second time-series information corresponding to a basic movement detected from a moving image, a degree of similarity between the rule and the moving image may be evaluated and a candidate rule may be output.
[0098] Furthermore, as described in the modification, by evaluating degrees of similarity with the plurality of basic movement recognition results 301, it becomes possible to specify a wide range of rules 302 as candidate rules. For example, it is assumed that a moving image in which an action of walking and turning the right hand forward is captured is input. In this case, a degree of similarity of a rule 302 including the action of walking and turning the right hand forward is evaluated as high.
[0099] Furthermore, for example, it is assumed that the rule information 300 includes a rule 302 for walking and turning one hand forward. In this rule 302, the hand turned forward may be a right hand or a left hand, and as long as an action of walking and turning one hand forward is captured, this rule 302 is satisfied. However, for example, it is assumed that the rule information 300 includes, as a basic movement recognition result 301 associated with this rule 302, only a basic movement recognition result 301 of a moving image in which an action of walking and turning the left hand forward is captured. In this case, since the input moving image is the moving image in which the basic movement of turning the right hand forward is captured, a degree of similarity is evaluated as low for the basic movement of turning the left hand forward. As a result, a degree of similarity between the input moving image and the rule 302 for walking and turning one hand forward is also evaluated as low.
[0100] On the other hand, for example, it is assumed that both the basic movement recognition result 301 of the moving image in which the action of walking and turning the left hand forward is captured and the basic movement recognition result 301 of the moving image in which the action of walking and turning the right hand forward is captured are associated with the rule 302. In this case, a degree of similarity between the input moving image and the basic movement recognition result 301 of the moving image in which the action of walking and turning the right hand forward is captured is evaluated as high, and accordingly, a representative degree of similarity representing the plurality of basic movement recognition results 301 may also be evaluated as high. As a result, a degree of similarity between the rule 302 for walking and turning one hand forward and the input moving image may be evaluated as high, and the rule 302 for walking and turning one hand forward may be specified as a candidate rule.
[0101] In this way, the rule 302 may be described to allow a plurality of basic movements, such as turning one hand forward. By associating a plurality of basic movement recognition results 301 with the rule 302 so as to cover these various descriptions, when the rule 302 to be evaluated matches any one of the basic movement recognition results 301, the rule 302 may be highly evaluated. As a result, it becomes possible to specify a wide range of rules 302 corresponding to the input moving image on the basis of degrees of similarity. Note that, in another embodiment, for basic movements described in parallel in the rule 302, the control unit 201 may use the maximum degree of similarity among degrees of similarity of the plurality of basic movements described in parallel as a representative degree of similarity representing the plurality of basic movements described in parallel.
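The handling of basic movements described in parallel can be sketched as follows. This is only an illustration under stated assumptions: the representation of a rule as a list of slots, each holding alternative basic movement names, the function name similarity_with_alternatives, and the use of an average to combine the slots are hypothetical. Only the use of the maximum among parallel basic movements comes from the description above.

```python
from statistics import mean
from typing import Dict, List


def similarity_with_alternatives(
    slots: List[List[str]], similarity: Dict[str, float]
) -> float:
    """Score a rule whose slots may list alternative basic movements.

    Each slot is a list of basic movement names described in parallel.
    The slot score is the maximum similarity among its alternatives.
    Combining slot scores by an average is an assumption of this sketch.
    """
    slot_scores = [
        max(similarity[name] for name in alternatives)
        for alternatives in slots
    ]
    return mean(slot_scores)


# Hypothetical rule "walking and turning one hand forward".
slots = [["walk"], ["left_hand_forward", "right_hand_forward"]]
similarity = {
    "walk": 1.0,
    "left_hand_forward": 0.0,
    "right_hand_forward": 0.5,
}
score = similarity_with_alternatives(slots, similarity)
```

Because the second slot takes the larger of the two hand scores, the low score for the left hand does not pull the rule down when the right hand matches.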
[0102] Although the embodiments have been described above as examples, the embodiment is not limited to these embodiments. For example, the operation flows described above are exemplary, and the embodiment is not limited to this. If possible, the operation flows may be executed by changing the order of processing or may additionally include further processing, or a part of processing may be omitted. For example, in the past execution of the operation flows in
[0103] Furthermore, a recognition result recorded in the basic movement recognition result 301 associated with the rule 302 in the embodiment described above may be, for example, only information regarding a recognition result for a basic movement used in a pattern of basic movements defined in the rule 302. With this configuration, a storage capacity needed for accumulation of the basic movement recognition results 301 may be reduced. However, the embodiment is not limited to this, and the basic movement recognition result 301 may include information regarding a recognition result for another basic movement.
[0104] Furthermore, the processing of evaluating the degree of similarity between the basic movements in S805 and S1305 may also be executed only for basic movements included in the rule 302. For example, the control unit 201 may evaluate a degree of similarity between a part of time-series information corresponding to a plurality of basic movements of the rule in the second time-series information corresponding to at least one basic movement detected from the moving image and the first time-series information associated with the rule. Furthermore, the detection of the basic movement from the input moving image may be executed only for basic movements registered in the rule 302 of the rule information 300. With this configuration, a processing amount may be reduced.
[0105] Furthermore, a basic movement of interest for the rule 302 may not be detected in the input moving image. In this case, the control unit 201 may evaluate a degree of similarity between the rule 302 and the basic movement by using a recognition result in which the basic movement is not detected. Alternatively, the control unit 201 may not evaluate a degree of similarity for a basic movement that is not detected in the input moving image among basic movements included in the rule 302, and may evaluate a degree of similarity between the rule 302 and the input moving image by using a degree of similarity evaluated for another basic movement.
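The two alternatives for an undetected basic movement can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the timelines are assumed to be equal-length lists of 0 and 1 per time unit, timeline_agreement is a stand-in measure chosen only for this sketch (the fraction of time units on which two timelines agree), and the names score_rule and skip_undetected are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Optional


def timeline_agreement(a: List[int], b: List[int]) -> float:
    """Stand-in measure: fraction of time units where two timelines agree."""
    if len(a) != len(b) or not a:
        raise ValueError("timelines must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def score_rule(
    rule_timelines: Dict[str, List[int]],
    detected: Dict[str, List[int]],
    skip_undetected: bool,
) -> Optional[float]:
    """Average per-movement agreement between a rule and an input video.

    A movement missing from `detected` is either skipped or compared
    against an all-zero timeline ("not detected in any time unit").
    """
    scores = []
    for name, rule_timeline in rule_timelines.items():
        observed = detected.get(name)
        if observed is None:
            if skip_undetected:
                continue
            observed = [0] * len(rule_timeline)
        scores.append(timeline_agreement(rule_timeline, observed))
    # None signals that no basic movement could be evaluated at all.
    return mean(scores) if scores else None


rule = {"walk": [1, 1, 1, 1], "right_hand_forward": [0, 1, 1, 0]}
detected = {"walk": [1, 1, 1, 1]}  # right_hand_forward was not detected
skipped = score_rule(rule, detected, skip_undetected=True)
penalized = score_rule(rule, detected, skip_undetected=False)
```

Skipping the undetected movement scores the rule only on what was observed, whereas the all-zero comparison lowers the score in proportion to how much of the rule relies on the missing movement.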
[0106] Furthermore, in the embodiment described above, three classes of the basic movement recognition result 301, the rule 302, and the action detection period 303 are defined as the classes of the rule information 300, but the embodiment is not limited to this. For example, in another embodiment, the action detection period 303 may not be included. Alternatively, the information regarding the action detection period 303 may be appropriately generated by the control unit 201 by applying the rule 302 to the basic movement recognition result 301. For example, the control unit 201 may specify a section in which a basic movement to be detected is detected at a predetermined frequency or more as a section in which the basic movement is detected. In addition, the control unit 201 may integrate sections in which a plurality of basic movements included in a pattern of basic movements defined in the rule 302 is detected, and use the integrated sections as an action detection period. Alternatively, in another embodiment, the basic movement recognition result 301 may be recorded in the rule information 300 so that a range of the moving image of the basic movement recognition result 301 is the action detection period 303.
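Generating an action detection period from a basic movement recognition result can be sketched as follows. This is an illustration under stated assumptions only: the per-time-unit 0/1 flags, the sliding window with a minimum detection ratio as the "predetermined frequency", and the names active_mask, to_periods, and action_periods are hypothetical. Whether sections are integrated by requiring all basic movements or any of them is not specified above, so the sketch exposes that choice as the combine parameter.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def active_mask(flags: List[int], window: int, min_ratio: float) -> List[bool]:
    """Mark every time unit covered by a window whose detection ratio
    reaches min_ratio."""
    mask = [False] * len(flags)
    for start in range(len(flags) - window + 1):
        if sum(flags[start:start + window]) / window >= min_ratio:
            for i in range(start, start + window):
                mask[i] = True
    return mask


def to_periods(mask: List[bool]) -> List[Tuple[int, int]]:
    """Convert a boolean mask into inclusive (start, end) index pairs."""
    periods: List[Tuple[int, int]] = []
    start = None
    for i, on in enumerate(mask):
        if on and start is None:
            start = i
        elif not on and start is not None:
            periods.append((start, i - 1))
            start = None
    if start is not None:
        periods.append((start, len(mask) - 1))
    return periods


def action_periods(
    movements: Dict[str, List[int]],
    window: int,
    min_ratio: float,
    combine: Callable[[Iterable[bool]], bool] = all,
) -> List[Tuple[int, int]]:
    """Integrate per-movement sections into action detection periods."""
    masks = [active_mask(f, window, min_ratio) for f in movements.values()]
    combined = [combine(column) for column in zip(*masks)]
    return to_periods(combined)


movements = {
    "walk": [1, 1, 0, 1, 1, 1, 0, 0],
    "right_hand_forward": [0, 0, 1, 1, 1, 1, 0, 0],
}
overlap = action_periods(movements, window=2, min_ratio=1.0)
union = action_periods(movements, window=2, min_ratio=1.0, combine=any)
```

Passing combine=all keeps only time units in which every basic movement of the pattern is active, while combine=any merges the sections of all basic movements into one span.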
[0107] Note that, in the embodiment described above, for example, in the processing of S801 and S802 and S1301 and S1302, the control unit 201 of the information processing apparatus 101 operates as the detection unit 211. Furthermore, in the processing of S806 and S1307, the control unit 201 of the information processing apparatus 101 operates as, for example, the evaluation unit 212. In the processing of S808 and S1309, the control unit 201 of the information processing apparatus 101 operates as, for example, the output unit 213.
[0108]
[0109] The processor 1401 may be, for example, a single processor, a multiprocessor, or a multicore processor. The processor 1401 uses the memory 1402 to execute, for example, a program describing procedures of the operation flows described above, so that some or all of the functions of the control unit 201 described above are provided. For example, the processor 1401 of the information processing apparatus 101 operates as the detection unit 211, the evaluation unit 212, and the output unit 213 by reading and executing a program stored in the storage device 1403.
[0110] The memory 1402 is, for example, a semiconductor memory, and may include a RAM region and a ROM region. The storage device 1403 is, for example, a hard disk, a semiconductor memory such as a flash memory, or an external storage device. Note that RAM is an abbreviation for random access memory. Furthermore, ROM is an abbreviation for read only memory.
[0111] The reading device 1404 accesses a removable storage medium 1405 according to an instruction from the processor 1401. The removable storage medium 1405 is achieved by, for example, a semiconductor device, a medium to and from which information is input and output by magnetic action, or a medium to and from which information is input and output by optical action.
[0112] Note that the semiconductor device is, for example, a universal serial bus (USB) memory. Furthermore, the medium to and from which information is input and output by magnetic action is, for example, a magnetic disk. The medium to and from which information is input and output by optical action is, for example, a CD-ROM, a DVD, or a Blu-ray Disc (Blu-ray is a registered trademark). CD is an abbreviation for compact disc. DVD is an abbreviation for digital versatile disc.
[0113] The storage unit 202 described above includes, for example, the memory 1402, the storage device 1403, and the removable storage medium 1405. For example, the storage device 1403 of the information processing apparatus 101 stores the basic movement recognition result 301, the rule 302, and the action detection period 303 of the rule information 300.
[0114] The communication interface 1406 communicates with another device, for example, according to an instruction from the processor 1401. For example, the information processing apparatus 101 may receive moving image data from the imaging device 102 via the communication interface 1406. The communication interface 1406 is one example of the communication unit 203 described above.
[0115] The input/output interface 1407 is, for example, an interface to an input device and an output device. The input device is, for example, a device such as a keyboard, a mouse, or a touch panel that receives an instruction from a user. The output device is, for example, a display device such as a display or an audio device such as a speaker.
[0116] Each program according to the embodiment is provided to the information processing apparatus 101 in the following forms, for example.
[0117] (1) Installed in the storage device 1403 in advance.
[0118] (2) Provided by the removable storage medium 1405.
[0119] (3) Provided from a server such as a program server.
[0120] Note that the hardware configuration of the computer 1400 for achieving the information processing apparatus 101 described with reference to
[0121] Several embodiments have been described above. However, the embodiment is not limited to the embodiments described above, and it should be understood that the embodiment includes various modifications and alternatives of the embodiments described above. For example, it would be understood that various embodiments may be embodied by modifying components without departing from the spirit and scope of the embodiments. Furthermore, it would be understood that various embodiments may be implemented by appropriately combining a plurality of components disclosed in the embodiments described above. Moreover, a person skilled in the art would understand that various embodiments may be implemented by deleting some components from all the components indicated in the embodiments or by adding some components to the components indicated in the embodiments.
[0122] All examples and conditional language provided herein are intended for the pedagogical purposes of aiding the reader in understanding the invention and the concepts contributed by the inventor to further the art, and are not to be construed as limitations to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although one or more embodiments of the present invention have been described in detail, it should be understood that various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.