OCR TARGET AREA POSITION ACQUISITION SYSTEM, COMPUTER-READABLE NON-TRANSITORY RECORDING MEDIUM STORING OCR TARGET AREA POSITION ACQUISITION PROGRAM, HARD COPY, HARD COPY GENERATION SYSTEM, AND COMPUTER-READABLE NON-TRANSITORY RECORDING MEDIUM STORING HARD COPY GENERATION PROGRAM
20220301326 · 2022-09-22
Cpc classification
G06V30/1463
PHYSICS
G06V30/20
PHYSICS
Abstract
An OCR system acquires the position of an image code in a document image, acquires data indicated by the image code, and acquires the position of a handwriting input field in the document image on the basis of the position of the image code in the acquired document image, the position of the image code in the document included in the acquired data, and the position of the handwriting input field in the document included in the acquired data.
Claims
1. An OCR target area position acquisition system comprising: an image code position acquiring unit that acquires a position of an image code in a document image, the document image being an image of a document to which data is added by the image code; a data acquiring unit that acquires the data indicated by the image code; and an OCR target area position acquiring unit that acquires a position of an OCR target area in the document image, the OCR target area being an area, in the document image, to be subjected to OCR processing, wherein, the data added to the document by the image code includes: image code position data including a position of the image code in the document; and OCR target area position data including a position of the OCR target area in the document, and the OCR target area position acquiring unit acquires a position of the OCR target area in the document image based on: a position of the image code in the document image, the image code being acquired by the image code position acquiring unit; a position of the image code in the document, the image code being included in the image code position data acquired by the data acquiring unit; and a position of the OCR target area in the document, the OCR target area being included in the OCR target area position data acquired by the data acquiring unit.
2. A computer-readable non-transitory storage medium storing an OCR target area position acquisition program that causes a computer to realize: an image code position acquiring unit that acquires a position of an image code in a document image, the document image being an image of a document to which data is added by the image code; a data acquiring unit that acquires the data indicated by the image code; and an OCR target area position acquiring unit that acquires a position of an OCR target area in a document image, the OCR target area being an area, in the document image, to be subjected to OCR processing, wherein, the data added to the document by the image code includes: image code position data including a position of the image code in the document; and OCR target area position data including a position of the OCR target area in the document, and the OCR target area position acquiring unit acquires a position of the OCR target area in the document image based on: a position of the image code in the document image, the image code being acquired by the image code position acquiring unit; a position of the image code in the document, the image code being included in the image code position data acquired by the data acquiring unit; and a position of the OCR target area in the document, the OCR target area being included in the OCR target area position data acquired by the data acquiring unit.
3. A hard copy that is an actual document to which data is added by an image code, the data added to the document by the image code comprising: image code position data including a position of the image code in the document; and OCR target area position data including a position of an OCR target area in the document, the OCR target area being an area, in an image of the document, to be subjected to OCR processing.
4. A hard copy generation system comprising: a hard copy generating unit that generates a hard copy, the hard copy being an actual document to which data is added by an image code, wherein the data added to the document by the image code comprises: image code position data including a position of the image code in the document; and OCR target area position data including a position of an OCR target area in the document, the OCR target area being an area, in the image of the document, to be subjected to OCR processing.
5. A computer-readable non-transitory storage medium storing a hard copy generation program that causes a computer to realize: a hard copy generating unit that generates a hard copy, the hard copy being an actual document to which data is added by an image code, wherein the data added to the document by the image code comprises: image code position data including a position of the image code in the document; and OCR target area position data including a position of an OCR target area in the document, the OCR target area being an area, in the image of the document, to be subjected to OCR processing.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0009]
[0010]
[0011]
[0012]
[0013]
[0014]
[0015]
[0016]
[0017]
[0018]
[0019]
DETAILED DESCRIPTION
[0020] Embodiments of the disclosure will now be described with reference to the accompanying drawings.
[0021] The configuration of a system according to an embodiment of the disclosure will now be described.
[0022]
[0023] As illustrated in
[0024] The OCR system 20 may include a single computer or a plurality of computers.
[0025] The OCR system 20 and the MFP 30 can communicate with each other over a network, such as a local area network (LAN) or the Internet, or directly through a wired or wireless connection without a network. Similarly, the OCR system 20 and the user terminal 40 can communicate with each other over a network, such as a LAN or the Internet, or directly through a wired or wireless connection without a network.
[0026]
[0027] The OCR system 20 illustrated in
[0028] The storage unit 24 stores an OCR program 24a for executing OCR processing. The OCR program 24a, for example, may be installed in the OCR system 20 at a manufacturing stage of the OCR system 20, may be additionally installed in the OCR system 20 from an external storage medium, such as a universal serial bus (USB) memory, or may be additionally installed in the OCR system 20 from the network.
[0029] The control unit 25 includes, for example, a central processing unit (CPU), a read-only memory (ROM) storing programs and various kinds of data, and a random access memory (RAM) that is a memory used as a work area for the CPU of the control unit 25. The CPU of the control unit 25 executes a program stored in the storage unit 24 or the ROM of the control unit 25.
[0030] The control unit 25 executes the OCR program 24a to implement a hard copy generating unit 25a that generates a hard copy to which data is added by image codes, such as one-dimensional or two-dimensional codes. Thus, the OCR system 20 and the OCR program 24a constitute a hard copy generation system and a hard copy generation program of the disclosure, respectively.
[0031] The control unit 25 executes the OCR program 24a to implement an image code position acquiring unit 25b, a data acquiring unit 25c, and an OCR target area position acquiring unit 25d. The image code position acquiring unit 25b acquires the positions of the image codes in a document image. The data acquiring unit 25c acquires the data indicated by the image codes. The OCR target area position acquiring unit 25d acquires the position, in the document image, of a handwriting input field, which is an OCR target area subjected to OCR processing in the document image. Thus, the OCR system 20 and the OCR program 24a constitute an OCR target area position acquisition system and an OCR target area position acquisition program of the disclosure, respectively.
[0032] The control unit 25 executes the OCR program 24a to implement an OCR processing unit 25e that extracts information from the handwriting input field in the document image through OCR processing.
[0033]
[0034] The MFP 30 as illustrated in
[0035] The control unit 38 includes, for example, a CPU, a ROM storing programs and various kinds of data, and a RAM serving as a memory used as a work area of the CPU of the control unit 38. The CPU of the control unit 38 executes the programs stored in the storage unit 37 or the ROM of the control unit 38.
[0036]
[0037] The user terminal 40 illustrated in
[0038] The control unit 45 includes, for example, a CPU, a ROM storing programs and various kinds of data, and a RAM serving as a memory used as a work area of the CPU of the control unit 45. The CPU of the control unit 45 executes the programs stored in the storage unit 44 or the ROM of the control unit 45.
[0039] The operation of the system 10 will now be explained.
[0040] The operation of the OCR system 20 when the MFP 30 is made to print a document will now be explained.
[0041]
[0042] A user (hereinafter, referred to as “operator”) of the OCR system 20 can instruct the OCR system 20 to create a document through the operation unit 41 of the user terminal 40. Thus, as illustrated in
[0043]
[0044] The document 50 illustrated in
[0045] After the OCR system 20 has been instructed to create a document, the operator can instruct the OCR system 20 to add data to the document by image codes through the operation unit 41 of the user terminal 40. Thus, after step S101, the hard copy generating unit 25a of the OCR system 20 adds image codes corresponding to the data in accordance with the instructions from the operator to the document created in step S101, as illustrated in
[0046]
[0047] As illustrated in
[0048]
[0049] As illustrated in
[0050] The auto-indexing data 61 may include, for example, identification information and a value for each piece of data. The value of the auto-indexing data 61 may be, for example, any piece of text 51. For example, “Data:CarLavel=Vehicle Number” indicates that the data value of the identification information “CarLavel” is “Vehicle Number.”
[0051] The handwriting input field data 62 may include, for example, identification information and the position and size in the document 50 for each handwriting input field. The handwriting input field data 62 includes the position of the handwriting input field as the OCR target area in the document 50, and constitutes the OCR target area position data of the disclosure. For some handwriting input fields, the type of characters used may be included in the handwriting input field data 62. For example, “InputArea:Name=(49,53,182,8), hint:[a-z0-9]” indicates that the upper left corner of the handwriting input field whose identification information is “Name” is located 49 to the right and 53 down from the upper left corner of the document 50, that the size of the handwriting input field is 182 in the left-right direction and 8 in the top-bottom direction, and that only lowercase letters and numbers are used in the handwriting input field.
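The record format in the example above can be read mechanically. The following is a minimal sketch in Python, assuming the layout `InputArea:<id>=(<x>,<y>,<width>,<height>)` with an optional `, hint:<character class>` suffix, which is inferred from the single example only. The names `InputArea` and `parse_input_area` are hypothetical and do not appear in the disclosure.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed record layout, inferred from the one example in the text:
#   InputArea:<id>=(<x>,<y>,<width>,<height>)[, hint:<character class>]
_INPUT_AREA = re.compile(
    r"^InputArea:(?P<id>[^=]+)=\((?P<x>\d+),(?P<y>\d+),(?P<w>\d+),(?P<h>\d+)\)"
    r"(?:,\s*hint:(?P<hint>.+))?$"
)


@dataclass(frozen=True)
class InputArea:
    """One handwriting input field, in document coordinates."""
    field_id: str
    x: int          # offset to the right of the document's upper left corner
    y: int          # offset down from the document's upper left corner
    width: int      # extent in the left-right direction
    height: int     # extent in the top-bottom direction
    hint: Optional[str] = None  # allowed character class, if given


def parse_input_area(record: str) -> InputArea:
    """Parse one InputArea record; raise ValueError if it does not match."""
    m = _INPUT_AREA.match(record.strip())
    if m is None:
        raise ValueError(f"not an InputArea record: {record!r}")
    return InputArea(
        field_id=m.group("id"),
        x=int(m.group("x")),
        y=int(m.group("y")),
        width=int(m.group("w")),
        height=int(m.group("h")),
        hint=m.group("hint"),
    )


area = parse_input_area("InputArea:Name=(49,53,182,8), hint:[a-z0-9]")
print(area)
```

The hint is kept as an uninterpreted string here, since the text does not say how the OCR processing consumes it.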
[0052] The image code position data 63 may include, for example, the position, in the document 50, of the upper left corner, the upper right corner, and the lower left corner of the corresponding image code 55. Note that the image code position data 63 may be data indicating only the position of a specific image code 55, such as the position of the leftmost image code 55, when multiple image codes 55 are added to the document 50.
[0053] The text data 64 may include the position of each text in the document 50.
[0054] The guideline data 65 may include the position of each guideline in the document 50. For example, “Line:(5,17)-(5,134)” indicates a guideline connecting the position 5 to the right and 17 down from the upper left corner of the document 50 with the position 5 to the right and 134 down from the upper left corner of the document 50.
[0055] The image data 66 may include, for example, identification information and the position in the document 50 for each image. For example, “Image:xx=(218,8,8,8)” indicates that the position, in the document 50, of the upper left corner of the image of which the identification information is “xx” is 218 to the right and 8 down from the upper left corner of the document 50 and that the size of the image is 8 in the left-right direction and 8 in the top-bottom direction.
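The auto-indexing, guideline, and image records quoted above follow a similar textual pattern. As an illustration only, the sketch below sorts a list of such records by kind. The exact grammar is an assumption drawn from the three quoted examples, and `parse_records` is a hypothetical helper name.

```python
import re
from typing import Dict, List, Tuple

Point = Tuple[int, int]
Rect = Tuple[int, int, int, int]  # (x, y, width, height)

# Assumed grammars, each inferred from one example in the text.
_DATA = re.compile(r"^Data:([^=]+)=(.*)$")
_LINE = re.compile(r"^Line:\((\d+),(\d+)\)-\((\d+),(\d+)\)$")
_IMAGE = re.compile(r"^Image:([^=]+)=\((\d+),(\d+),(\d+),(\d+)\)$")


def parse_records(records: List[str]) -> dict:
    """Group decoded records into auto-indexing data, guidelines, and images."""
    data: Dict[str, str] = {}
    lines: List[Tuple[Point, Point]] = []
    images: Dict[str, Rect] = {}
    other: List[str] = []  # records of kinds this sketch does not handle

    for raw in records:
        rec = raw.strip()
        m = _DATA.match(rec)
        if m:
            data[m.group(1)] = m.group(2)
            continue
        m = _LINE.match(rec)
        if m:
            x0, y0, x1, y1 = (int(v) for v in m.groups())
            lines.append(((x0, y0), (x1, y1)))
            continue
        m = _IMAGE.match(rec)
        if m:
            x, y, w, h = (int(v) for v in m.groups()[1:])
            images[m.group(1)] = (x, y, w, h)
            continue
        other.append(rec)

    return {"data": data, "lines": lines, "images": images, "other": other}


parsed = parse_records([
    "Data:CarLavel=Vehicle Number",
    "Line:(5,17)-(5,134)",
    "Image:xx=(218,8,8,8)",
])
print(parsed)
```

Unrecognized records are collected rather than rejected, so that a payload mixing several record kinds can still be processed.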
[0056] After step S102, the hard copy generating unit 25a of the OCR system 20 instructs the MFP 30 to print the document to which the image codes have been added in step S102 via the communication unit 23 of the OCR system 20, as illustrated in
[0057] After step S103, the hard copy generating unit 25a ends the operation illustrated in
[0058] In the above, the operator issues various instructions to the OCR system 20 via the operation unit 41 of the user terminal 40. In place of the operation unit 41 of the user terminal 40, the operator may issue various instructions to the OCR system 20 via the operation unit 31 of the MFP 30.
[0059] The operator distributes the hard copies printed through the operation illustrated in
[0060] The operation of the OCR system 20 when information is extracted from a document image will now be explained.
[0061]
[0062] The operator can place a hard copy returned by a recipient into the scanner 34 of the MFP 30 and instruct the MFP 30, for example via the operation unit 31 of the MFP 30, to extract information from the hard copy. When the extraction of information from the hard copy is instructed, the control unit 38 of the MFP 30 reads, with the scanner 34, a document image from the hard copy placed in the scanner 34, and instructs the OCR system 20, via the communication unit 36 of the MFP 30, to extract information from the document image. Here, the instruction for extracting information from the document image (hereinafter referred to as “information extraction instruction”) includes the document image that is the target of this information extraction instruction. When the control unit 25 of the OCR system 20 receives the information extraction instruction via the communication unit 23, the control unit 25 executes the operation illustrated in
[0063] As illustrated in
[0064] After step S121, the data acquiring unit 25c acquires various kinds of information indicated by the image code contained in the document image that is the target of the information extraction instruction (step S122).
[0065] After step S122, the OCR target area position acquiring unit 25d calculates a transformation matrix M for the transformation of the hard copy to the document image that is the target of the information extraction instruction (step S123).
[0066] The method of calculating the transformation matrix M in step S123 will now be explained.
[0067]
[0068] In the hard copy 70 illustrated in
[0069] In a document image 80 illustrated in
[0070] Assuming that the coordinate of P0 in the left-right direction is P0x and the coordinate of P0 in the top-bottom direction is P0y, the coordinates of P0 can be expressed as in Equations 1. Similarly, the coordinates of P1, P2, P0′, P1′ and P2′ can be expressed as in Equations 1. Note that the coordinates in Equations 1 are expressed in a homogeneous coordinate system used in affine transformation.
\[
\begin{aligned}
P_0 &= (P_{0x},\, P_{0y},\, 1) \\
P_1 &= (P_{1x},\, P_{1y},\, 1) \\
P_2 &= (P_{2x},\, P_{2y},\, 1) \\
P_0' &= (P_{0x}',\, P_{0y}',\, 1) \\
P_1' &= (P_{1x}',\, P_{1y}',\, 1) \\
P_2' &= (P_{2x}',\, P_{2y}',\, 1)
\end{aligned}
\qquad \text{[Equations 1]}
\]
[0071] Assuming that the transformation from the handwriting input field 71 and the image code 72 in the hard copy 70 illustrated in
[0072] Here, the coordinates P0′, P1′, and P2′ can be expressed as in Equations 3 by using P0, P1, P2, and M.
\[
\begin{aligned}
P_0' &= M P_0 \\
P_1' &= M P_1 \\
P_2' &= M P_2
\end{aligned}
\qquad \text{[Equations 3]}
\]
[0073] Based on Equations 1 to 3, Equations 4 are established.
\[
\begin{aligned}
P_{0x}' &= aP_{0x} + bP_{0y} + c \\
P_{0y}' &= dP_{0x} + eP_{0y} + f \\
P_{1x}' &= aP_{1x} + bP_{1y} + c \\
P_{1y}' &= dP_{1x} + eP_{1y} + f \\
P_{2x}' &= aP_{2x} + bP_{2y} + c \\
P_{2y}' &= dP_{2x} + eP_{2y} + f
\end{aligned}
\qquad \text{[Equations 4]}
\]
[0075] By combining Equations 4 by x and y, Equations 5 are obtained.
[0076] Based on Equations 5, Equation 6 is obtained.
[0077] Based on Equation 6, Equation 7 is obtained. In Equation 7, −1 at the upper right corner of the matrix represents an inverse matrix. For Equation 7, a case where the inverse matrix does not exist is not considered.
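The equation images for Equations 5 to 7 are not reproduced in this text. One reconstruction that is consistent with Equations 4 is sketched below. It is offered as an assumption, not as the published form. The symbol A is introduced here only for convenience and denotes the matrix whose rows are the homogeneous positions of the image code in the document.

\[
A = \begin{pmatrix} P_{0x} & P_{0y} & 1 \\ P_{1x} & P_{1y} & 1 \\ P_{2x} & P_{2y} & 1 \end{pmatrix}
\]

Grouping Equations 4 by x and by y would give a pair of systems of the following form (compare Equations 5):

\[
\begin{pmatrix} P_{0x}' \\ P_{1x}' \\ P_{2x}' \end{pmatrix} = A \begin{pmatrix} a \\ b \\ c \end{pmatrix},
\qquad
\begin{pmatrix} P_{0y}' \\ P_{1y}' \\ P_{2y}' \end{pmatrix} = A \begin{pmatrix} d \\ e \\ f \end{pmatrix}
\]

Placing the two systems side by side would give a single matrix equation (compare Equation 6):

\[
\begin{pmatrix} P_{0x}' & P_{0y}' \\ P_{1x}' & P_{1y}' \\ P_{2x}' & P_{2y}' \end{pmatrix}
= A \begin{pmatrix} a & d \\ b & e \\ c & f \end{pmatrix}
\]

Multiplying both sides from the left by the inverse of A would then isolate the unknown coefficients (compare Equation 7):

\[
\begin{pmatrix} a & d \\ b & e \\ c & f \end{pmatrix}
= A^{-1} \begin{pmatrix} P_{0x}' & P_{0y}' \\ P_{1x}' & P_{1y}' \\ P_{2x}' & P_{2y}' \end{pmatrix}
\]

Under this reconstruction, the inverse of A exists exactly when the three positions P0, P1, and P2 do not lie on a single straight line.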
[0078] Based on Equations 2 and 7, the transformation matrix M can be expressed as Equation 8. In Equation 8, T at the upper right corner of the matrix represents a transposed matrix. In Equation 8, −1 at the upper right corner of the matrix represents an inverse matrix. For Equation 8, a case where the inverse matrix does not exist is not considered. Here, P0x, P0y, P1x, P1y, P2x, and P2y are indicated by the image code position data acquired in step S122. P0x′, P0y′, P1x′, P1y′, P2x′, and P2y′ are acquired in step S121.
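The calculation of the transformation matrix M from the three position pairs can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation of the disclosure: M is assumed to have the rows (a, b, c), (d, e, f), and (0, 0, 1) implied by Equations 4, and the six coefficients are solved with Cramer's rule rather than with the explicit inverse matrix described above. The function names `det3` and `solve_affine` and the sample coordinates are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Matrix = List[List[float]]


def det3(m: Sequence[Sequence[float]]) -> float:
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve_affine(src: Sequence[Point], dst: Sequence[Point]) -> Matrix:
    """Return the 3x3 matrix M such that dst[i] = M * (src[i].x, src[i].y, 1).

    src: three positions of the image code in the document (P0, P1, P2).
    dst: the same three positions found in the document image (P0', P1', P2').
    """
    if len(src) != 3 or len(dst) != 3:
        raise ValueError("exactly three point pairs are required")
    A = [[float(x), float(y), 1.0] for x, y in src]
    d = det3(A)
    if abs(d) < 1e-12:
        # Collinear source points: the inverse matrix does not exist.
        raise ValueError("source points are collinear")

    rows: Matrix = []
    for axis in (0, 1):  # 0 solves for (a, b, c), 1 solves for (d, e, f)
        target = [float(p[axis]) for p in dst]
        coeffs = []
        for col in range(3):
            # Cramer's rule: replace one column of A with the target vector.
            replaced = [row[:] for row in A]
            for r in range(3):
                replaced[r][col] = target[r]
            coeffs.append(det3(replaced) / d)
        rows.append(coeffs)
    rows.append([0.0, 0.0, 1.0])
    return rows


# Made-up example: the image is the document scaled by 2 and shifted by (5, 7).
doc_corners = [(10, 10), (30, 10), (10, 30)]    # upper left, upper right, lower left
image_corners = [(25, 27), (65, 27), (25, 67)]
M = solve_affine(doc_corners, image_corners)
print(M)
```

Three point pairs determine the six unknowns exactly, so no least-squares fitting is involved in this sketch.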
[0079] As illustrated in
[0080] In other words, the positions Q0′, Q1′, and Q2′ of the handwriting input field in the document image that is the target of the information extraction instruction can be expressed as Equations 9 by using the transformation matrix M calculated in step S123 and positions Q0, Q1, and Q2 of the handwriting input field in the hard copy. Here, the coordinates of Q0, Q1, and Q2 are indicated by the handwriting input field data acquired in step S122.
\[
\begin{aligned}
Q_0' &= M Q_0 \\
Q_1' &= M Q_1 \\
Q_2' &= M Q_2
\end{aligned}
\qquad \text{[Equations 9]}
\]
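Equations 9 amount to one matrix-vector product per position. The Python sketch below applies a given M to a handwriting input field described by its position and size. It assumes that Q0, Q1, and Q2 are the upper left, upper right, and lower left corners of the field, mirroring the corners used for the image code. The text available here does not confirm that choice. The sample matrix and the helper names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def apply_affine(M: Sequence[Sequence[float]], p: Point) -> Point:
    """Compute M * (x, y, 1) and return the resulting (x', y')."""
    x, y = p
    return (
        M[0][0] * x + M[0][1] * y + M[0][2],
        M[1][0] * x + M[1][1] * y + M[1][2],
    )


def field_corners(x: float, y: float, width: float, height: float) -> List[Point]:
    """Assumed corner choice: Q0 upper left, Q1 upper right, Q2 lower left."""
    return [(x, y), (x + width, y), (x, y + height)]


def locate_field(M: Sequence[Sequence[float]],
                 rect: Tuple[float, float, float, float]) -> List[Point]:
    """Map a field given as (x, y, width, height) in the document into the image."""
    return [apply_affine(M, q) for q in field_corners(*rect)]


# Made-up transformation: scale by 2, then shift by (5, 7).
M = [[2.0, 0.0, 5.0],
     [0.0, 2.0, 7.0],
     [0.0, 0.0, 1.0]]

# Field taken from the "InputArea:Name=(49,53,182,8)" example.
corners = locate_field(M, (49, 53, 182, 8))
print(corners)
```

Because an affine transformation maps parallelograms to parallelograms, the fourth corner of the field in the document image can be derived as Q1′ + Q2′ − Q0′ if it is needed.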
[0081] As illustrated in
[0082] After step S125, the OCR processing unit 25e saves the information extracted in step S125 in the storage unit 24 (step S126). Here, the OCR processing unit 25e may store at least one piece of data acquired in step S122 together with the information extracted in step S125. For example, the OCR processing unit 25e can save the text data, the guideline data, and the image data acquired in step S122 together with the information extracted in step S125, so that the hard copy can be reproduced from the saved data. The OCR processing unit 25e may select the storage destination for the information in step S126 in accordance with the information indicated by the auto-indexing data acquired in step S122.
[0083] After step S126, the control unit 25 ends the operation illustrated in
[0084] As explained above, the OCR system 20 acquires the position of the handwriting input field in the document image (steps S123 and S124) on the basis of the position of the image code in the document image, the position of the image code in the document included in the image code position data indicated by the image code in the document image, and the position of the handwriting input field in the document included in the handwriting input field data indicated by the image code in the document image. Thus, the OCR system 20 can specify a handwriting input field as an OCR target area in the document image with high precision and, as a result, can improve the accuracy of OCR processing. Thus, the OCR system 20 can streamline data entry operations, for example, for inputting information handwritten on a document into a computer as data.
[0085] Since the image code position data including the positions of the image codes in the document and the handwriting input field data including the positions of the handwriting input fields in the document are added to the hard copy 70 in the form of image codes, the OCR system 20 can specify the handwriting input fields in the document image with high precision. As a result, the OCR system 20 can increase the accuracy of the OCR processing.
[0086] The image code position data including the positions of the image codes in the document and the handwriting input field data including the positions of the handwriting input fields in the document are added to each hard copy in the form of image codes. Thus, even when information is continuously extracted from multiple document images generated from different types of hard copies, such as hard copies having different layouts, the OCR system 20 can specify the handwriting input fields in each document image with high precision.
[0087] Since the OCR system 20 generates a hard copy provided with image codes indicating the image code position data including the positions of the image codes in the document and the handwriting input field data including the positions of the handwriting input fields in the document (steps S101 to S103), a hard copy that can increase accuracy of the OCR processing can be generated.
[0088] In the above, the method of acquiring a document image by an MFP is described as a method of reading a document image from a hard copy by a scanner. However, the method of acquiring a document image by the MFP may be a method other than the method of reading a document image from a hard copy by a scanner. For example, the MFP may acquire a document image by receiving a document image through the fax communication unit.
[0089] In the above, an MFP is used as an example of an image reading apparatus. However, the image reading apparatus may be any apparatus other than an MFP with or without a scanner. The image reading apparatus may be, for example, a dedicated scanner. For example, the image reading apparatus may be an apparatus equipped with a camera that captures an image of a hard copy and generates a document image. The image reading apparatus may be, for example, a portable terminal. The document image generated from a hard copy by an apparatus including a camera is more likely to be shifted in position relative to an ideal document image when compared with a document image generated from a hard copy by an apparatus including a scanner. Thus, the disclosure is more likely to be needed when a document image is generated from a hard copy by an apparatus including a camera, as compared to when a document image is generated from a hard copy by an apparatus including a scanner.
[0090] In the above, the OCR system and the image reading apparatus are provided separately. However, for example, the OCR system may be built into the image reading apparatus.
[0091] In the above, the OCR system and the user terminal are provided separately. However, for example, the OCR system may be built into the user terminal.
[0092] In the above, the OCR system performs the OCR processing. However, the OCR system may request an external service, such as a cloud service, to perform the OCR processing.
[0093] The disclosure may be adopted, for example, by an enterprise implementing enterprise content management (ECM), robotic process automation (RPA), or the like.